Processing XML prolog by SQL Server XML functions - sql-server

I have a large database table with an XML column. Each value is a document like the one below:
<?int-dov version="1.0" encoding="UTF-8" standalone="no"?>
<ds:datastoreItem ds:itemID="{F8484AF4-73BF-45CA-A524-0D796F244F37}" xmlns:ds="http://schemas.openxmlformats.org/officeDocument/2006/customXml"><ds:schemaRefs><ds:schemaRef ds:uri="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"/></ds:schemaRefs></ds:datastoreItem>
I'm looking for a function or other fast way to fetch the value of the standalone attribute in a T-SQL query. When I run the query below:
select XmlContent.query('@standalone') from XmlDocuments
I get this error message:
Msg 2390, Level 16, State 1, Line 4
XQuery [XmlDocuments.XmlContent.query()]: Top-level attribute nodes are not supported
I would appreciate a solution to this problem.

You can use the processing-instruction() function to get that.
SELECT @xml.value('./processing-instruction("int-dov")[1]','nvarchar(max)')
Result
version="1.0" encoding="UTF-8" standalone="no"
If you want to get just the standalone part, the only way I've found is to construct an XML node from it:
SELECT CAST(
N'<x ' +
@xml.value('./processing-instruction("int-dov")[1]','nvarchar(max)') +
N' />' AS xml).value('x[1]/@standalone','nvarchar(10)')
Result
no

Just to complement @Charlieface's answer. All credit goes to him.
SQL
DECLARE @xml XML =
N'<?int-dov version="1.0" encoding="UTF-8" standalone="no"?>
<ds:datastoreItem ds:itemID="{F8484AF4-73BF-45CA-A524-0D796F244F37}"
xmlns:ds="http://schemas.openxmlformats.org/officeDocument/2006/customXml">
<ds:schemaRefs>
<ds:schemaRef ds:uri="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"/>
</ds:schemaRefs>
</ds:datastoreItem>';
SELECT col.value('x[1]/@standalone','nvarchar(10)') AS [standalone]
, col.value('x[1]/@encoding','nvarchar(10)') AS [encoding]
, col.value('x[1]/@version','nvarchar(10)') AS [version]
FROM (VALUES(CAST(N'<x ' +
@xml.value('./processing-instruction("int-dov")[1]','nvarchar(max)') +
N' />' AS xml))
) AS tab(col);
Output
+------------+----------+---------+
| standalone | encoding | version |
+------------+----------+---------+
| no | UTF-8 | 1.0 |
+------------+----------+---------+
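The question asks about a table column rather than a variable. A minimal sketch of the same technique applied per row is shown here, assuming a table shaped like the XmlDocuments table from the question; the table variable, the Id column, and the aliases d and p are illustrative and not part of the original posts.

```sql
-- Hypothetical stand-in for the XmlDocuments table from the question.
DECLARE @XmlDocuments TABLE (Id INT IDENTITY(1,1) PRIMARY KEY, XmlContent XML);

INSERT INTO @XmlDocuments (XmlContent)
VALUES (N'<?int-dov version="1.0" encoding="UTF-8" standalone="no"?><root/>');

-- Per row: read the processing instruction's text, wrap it as the
-- attribute list of a dummy <x/> element, then read @standalone from it.
SELECT d.Id,
       p.attrs.value('(/x/@standalone)[1]', 'nvarchar(10)') AS [standalone]
FROM @XmlDocuments AS d
CROSS APPLY (VALUES (CAST(N'<x ' +
        d.XmlContent.value('(/processing-instruction("int-dov"))[1]', 'nvarchar(max)') +
        N' />' AS XML))) AS p(attrs);
```

Rows without the processing instruction yield NULL, because concatenating with NULL gives NULL before the cast.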

Related

How do I convert SQL to XML?

I am trying to output SQL as XML to match the exact format as the following
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009">
<Prorates>
<Prorate>
<OrderTypeId>1</OrderTypeId>
<DeliveryDate>2015-10-12T00:00:00+02:00</DeliveryDate>
<DivNo>632</DivNo>
<ProrateUnit>1</ProrateUnit>
<ProrateProducts>
<ProrateProduct ProductCode="8467">
<ProrateItems>
<ProrateItem StoreNo="1">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="2">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="3">
<Quantity>5</Quantity>
</ProrateItem>
</ProrateItems>
</ProrateProduct>
</ProrateProducts>
</Prorate>
</Prorates>
</ProrateImport>
Here is my query:
SELECT
OrderTypeID,
DeliveryDate, DivNo,
ProrateUnit,
(SELECT
ProductOrder [@ProductCode],
(SELECT
ProrateItem [@StoreNo],
CAST(Quantity AS INT) [Quantity]
FROM
##Result2 T3
WHERE
T3.DivNo = T2.DivNo
AND T3.DivNo = T1.DivNo
AND T3.DeliveryDate = T2.DeliveryDate
AND T3.DeliveryDate = T1.DeliveryDate
AND T3.ProductOrder = t2.ProductOrder
FOR XML PATH('ProrateItem'), TYPE, ROOT('ProrateItems')
)
FROM
##Result2 T2
WHERE
T2.DivNo = T1.DivNo
AND T2.DeliveryDate = T1.DeliveryDate
FOR XML PATH('ProrateProduct'), TYPE, ROOT('ProrateProducts')
)
FROM
##Result2 T1
GROUP BY
OrderTypeID, DeliveryDate, DivNo, ProrateUnit
FOR XML PATH('Prorate'), TYPE, ROOT('Prorates')
How do I add the following and have the date in ProrateImport/20151009 change to the current date?
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009">
This is the first time I have used XML.
I'm not sure I understand. Did you create the first XML yourself, and you just need to add the last part?
DECLARE @XMLHEADER nvarchar(max)
SET @XMLHEADER = '<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/'+convert(varchar(8),getdate(),112)+'"
>'
select @XMLHEADER
And then you just need to add the rest of your output from your select statement.
There are several problems:
How to introduce namespaces?
How to introduce namespaces dynamically
How to add a <?xml ?> directive
two-leveled root (<ProrateImport><Prorate>)
namespaces
You have to use WITH XMLNAMESPACES to introduce a namespace to your query.
Hint: the naked xmlns is introduced by DEFAULT, the xsi namespace will be introduced automatically by using ELEMENTS XSINIL:
WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema' AS xsd
,DEFAULT 'http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009')
SELECT 1 AS Dummy
FOR XML PATH('rowElement'), ELEMENTS XSINIL, ROOT('root')
The result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
Note: The namespaces must be stated literally. No computations, no variables!
dynamic namespaces
This is impossible out of the box, but you can build the statement dynamically and run it with EXEC. Just create exactly the statement shown above:
DECLARE @cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')';
PRINT @cmd
EXEC(@cmd);
the result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
directive
The directive cannot be introduced into XML. SQL-Server will omit any <?xml ?> directive! This can only be done on string level:
DECLARE @cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT(
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')) AS MyResult';
CREATE TABLE #resultTable(MyXmlAsString VARCHAR(MAX))
INSERT INTO #resultTable(MyXmlAsString)
EXEC(@cmd);
SELECT '<?xml version="1.0" encoding="utf-8"?>' + MyXmlAsString
FROM #resultTable;
The result
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
two-leveled root
You can nest two FOR XML statements to achieve this:
WITH XMLNAMESPACES(DEFAULT 'blah')
SELECT
(
SELECT 1 AS Dummy
FOR XML PATH('rowElement'),ROOT('innerRoot'),TYPE
)
FOR XML PATH('outerRoot');
But the annoying part is that each sub-select introduces the namespaces again. This is not wrong, but very annoying, and it is a well-known Microsoft Connect issue. The result:
<outerRoot xmlns="blah">
<innerRoot xmlns="blah"> <!--Here's the second xmlns! -->
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</innerRoot>
</outerRoot>
Your solution
Having explained all this, I'd suggest creating the XML without any namespace or declaration (which is what you are doing already), then converting the result to NVARCHAR(MAX) and adding the header and the closing footer on string level. This is ugly, but in your case it is the only way.
Hint: You will not be able to store the final result in a native XML type in SQL Server without losing the directive.
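The string-level approach suggested above could look roughly like the following sketch. The inner SELECT is only a placeholder for the asker's Prorate query, and the variable names @body and @ns are assumptions made for illustration.

```sql
-- Placeholder body: replace the inner SELECT with the real Prorate query.
DECLARE @body NVARCHAR(MAX) =
    CAST((SELECT 1 AS Dummy
          FOR XML PATH('Prorate'), ROOT('Prorates'), TYPE) AS NVARCHAR(MAX));

-- Namespace URI ending in today's date as yyyymmdd (style 112).
DECLARE @ns NVARCHAR(200) =
    N'http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/'
    + CONVERT(NVARCHAR(8), GETDATE(), 112);

-- Header, body and closing footer are glued together as plain text.
SELECT N'<?xml version="1.0" encoding="utf-8"?>'
     + N'<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
     + N' xmlns:xsd="http://www.w3.org/2001/XMLSchema"'
     + N' xmlns="' + @ns + N'">'
     + @body
     + N'</ProrateImport>' AS FinalXml;
```

The result must stay a string; casting it back to the XML type would drop the declaration again.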

SQL Server 2008: Null Return in Dynamic XML Query

I have a set of dynamic queries which return XML as varchars, see below.
Example query:
set @sqlstr = 'Select ''<?xml version="1.0" encoding="windows-1252" ?>'' + ''<File_Name><Location>'' + (Select a,b,c from table for xml path(''Row'')) + ''</Location></File_name>'''
exec(@sqlstr)
This works a treat until the select a,b,c ... query is NULL. Then I don't receive the outside elements as you'd expect like:
<?xml version="1.0" encoding="windows-1252"><File_Name><Location><Row></Row></Location></File_name>
All I receive is NULL
After a bit of Googling I found that the cause is that concatenating anything with a NULL result yields NULL for the whole expression. However, I cannot find a solution that gives me the result I'd expect.
I've tried the following (which is not to say I tried them correctly):
IsNull(Exec(@sqlstring),'*blank elements*')
XSINIL (doesn't seem to work in dynamic queries)
@result = exec(@sqlstring) then isnull and select
Anyone have a better solution? (preferably small due to multiple such queries)
I think you need something like this:
set @sqlstr = 'Select ''<?xml version="1.0" encoding="windows-1252" ?><File_Name><Location>'' + (Select IsNull(a, '''') as a, IsNull(b, '''') as b, IsNull(c, '''') as c from table for xml path(''Row'')) + ''</Location></File_name>'''
exec(@sqlstr)
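If the inner SELECT returns no rows at all, the FOR XML subquery itself evaluates to NULL and per-column IsNull calls never run. One possible workaround, sketched here, is to wrap the whole subquery in ISNULL. The query against sys.objects with WHERE 1 = 0 is only a stand-in for an empty result, and the fallback text '<Row></Row>' is an assumption about the desired output.

```sql
DECLARE @sqlstr NVARCHAR(MAX) = N'
SELECT ''<?xml version="1.0" encoding="windows-1252" ?><File_Name><Location>''
     + ISNULL((SELECT name FROM sys.objects WHERE 1 = 0 FOR XML PATH(''Row'')),
              ''<Row></Row>'')
     + ''</Location></File_Name>'' AS Result;';

-- The subquery is empty, so ISNULL supplies the fallback and the
-- outer elements survive the concatenation.
EXEC (@sqlstr);
```

Because the guard sits inside the dynamic string, each of the similar queries only needs the one ISNULL wrapper.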

SQL Server FOR XML PATH: Set xml-declaration or processing instruction "xml-stylesheet" on top

I want to set a processing instruction to include a stylesheet on top of an XML:
The same issue was with the xml-declaration (e.g. <?xml version="1.0" encoding="utf-8"?>)
Desired result:
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
My research brought me to node test syntax and processing-instruction().
This
SELECT 'type="text/xsl" href="stylesheet.xsl"' AS [processing-instruction(xml-stylesheet)]
,'Test' AS Test
,'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
produces this:
<TestPath>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
All hints I found tell me to convert the XML to VARCHAR, concatenate it "manually" and convert it back to XML. But this is - how to say - ugly?
This works obviously:
SELECT CAST(
'<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>' AS XML);
Is there a chance to solve this?
There is another way, which needs two steps but doesn't require you to treat the XML as a string anywhere in the process:
declare @result XML =
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
)
set @result.modify('
insert <?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
before /*[1]
')
The XQuery expression passed to modify() function tells SQL Server to insert the processing instruction node before the root element of the XML.
UPDATE:
I found another alternative based on the thread "Merge the two xml fragments into one?". I personally prefer this way:
SELECT CONVERT(XML, '<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>'),
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
)
FOR XML PATH('')
As it came out, har07's great answer does not work with an XML-declaration. The only way I could find was this:
DECLARE @ExistingXML XML=
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath'),TYPE
);
DECLARE @XmlWithDeclaration NVARCHAR(MAX)=
(
SELECT N'<?xml version="1.0" encoding="UTF-8"?>'
+
CAST(@ExistingXML AS NVARCHAR(MAX))
);
SELECT @XmlWithDeclaration;
You must stay on string level after this step: any conversion to real XML will either give an error (when the encoding is other than UTF-16) or will omit this xml-declaration.
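A small sketch of the "omit" case described above, assuming a VARCHAR string that carries a UTF-8 declaration; the variable name @WithDeclaration is illustrative.

```sql
DECLARE @WithDeclaration VARCHAR(MAX) =
    '<?xml version="1.0" encoding="UTF-8"?><TestPath><Test>Test</Test></TestPath>';

-- As a string, the declaration is still part of the value.
SELECT @WithDeclaration AS AsString;

-- Cast to the native XML type, the declaration is not kept.
SELECT CAST(@WithDeclaration AS XML) AS AsXml;
```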

Parsing XML with T-SQL in two separate tables

I'm using SQL Server 2012 and trying to parse an XML document into 2 separate tables in my database. Normally 1 table would be enough, but not in this instance. My XML is structured as follows (I can't change its structure; I receive it like that):
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<podjetje id="" storitev="" uporabnik="" ts="" opis_storitve="">
<izdelki>
<izdelek st="1">
<izdelekID>ID</izdelekID>
<ean>EAN CODE</ean>
<izdelekIme>PRODUCT NAME</izdelekIme>
<url>WEBSITE</url>
<kratkiopis>SHORT DESCRIPTION</kratkiopis>
<opis>DESCRIPTION</opis>
<dodatneLastnosti>ATTRIBUTES</dodatneLastnosti>
<slikaVelika>BIG PICTURE URL</slikaVelika>
<dodatneSlike>
<dodatnaSlika1>EXTRA IMAGE URL</dodatnaSlika1>
<dodatnaSlika2>EXTRA IMAGE URL2</dodatnaSlika2>
<dodatnaSlika3>EXTRA IMAGE URL3</dodatnaSlika3>
</dodatneSlike>
</izdelek>
</izdelki>
</podjetje>
To insert this XML into a table I use a bulk insert:
SET @SQLString = 'INSERT INTO tmpImport(XmlCol)
SELECT *
FROM OPENROWSET(BULK ''' + @ImportFileName + ''', SINGLE_BLOB, ERRORFILE = ''' + @BulkLoadFilePath + ''') AS x '
EXECUTE (@SQLString)
I can handle most of the data without any problems. I ran into problems when I got to the node "dodatneSlike". The idea is that each article has some pictures. The main picture is in the node "slikaVelika" and I can insert it into my table. There are extra pictures in the child nodes of "dodatneSlike". These cause me problems, because I have to insert the extra pictures into a separate table (inserting the picture from "slikaVelika" there as well would help, but I think I can get around it if that's not possible). The table is nothing special, just the article ID from node "izdelekID" and the pictures from "dodatneSlike".
The problem is, I never know how many nodes ("dodatnaSlika1", "dodatnaSlika2",...) there will be. There might be 1, 10, 0....
So my question is how do I get the values from "dodatnaSlika" nodes?
Try to use the native XQuery support in SQL Server! Much easier than the clunky old OPENROWSET stuff....
You can try something like this:
DECLARE @input XML = '<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<podjetje id="" storitev="" uporabnik="" ts="" opis_storitve="">
<izdelki>
<izdelek st="1">
<izdelekID>ID</izdelekID>
<ean>EAN CODE</ean>
<izdelekIme>PRODUCT NAME</izdelekIme>
<url>WEBSITE</url>
<kratkiopis>SHORT DESCRIPTION</kratkiopis>
<opis>DESCRIPTION</opis>
<dodatneLastnosti>ATTRIBUTES</dodatneLastnosti>
<slikaVelika>BIG PICTURE URL</slikaVelika>
<dodatneSlike>
<dodatnaSlika1>EXTRA IMAGE URL</dodatnaSlika1>
<dodatnaSlika2>EXTRA IMAGE URL2</dodatnaSlika2>
<dodatnaSlika3>EXTRA IMAGE URL3</dodatnaSlika3>
</dodatneSlike>
</izdelek>
</izdelki>
</podjetje>'
SELECT
izdelek_st = @input.value('(/podjetje/izdelki/izdelek/@st)[1]', 'int'),
izdelekID = @input.value('(/podjetje/izdelki/izdelek/izdelekID)[1]', 'varchar(50)'),
ean = @input.value('(/podjetje/izdelki/izdelek/ean)[1]', 'varchar(50)'),
XC.value('local-name(.)', 'varchar(50)'),
XC.value('(.)[1]', 'varchar(50)')
FROM
@input.nodes('/podjetje/izdelki/izdelek/dodatneSlike/*') AS XT(XC)
This will give you all the subnodes under <dodatneSlike> - no matter how many there are - and it gives you both the node name, as well as the node value.
Update: assuming you have multiple <izdelek> nodes, you could use this query instead, which reads every value relative to the current <izdelek> node:
SELECT
izdelek_st = XC1.value('@st', 'int'),
izdelekID = XC1.value('(izdelekID)[1]', 'varchar(50)'),
ean = XC1.value('(ean)[1]', 'varchar(50)'),
XC2.value('local-name(.)', 'varchar(50)'),
XC2.value('(.)[1]', 'varchar(50)')
FROM
@input.nodes('/podjetje/izdelki/izdelek') AS XT1(XC1)
CROSS APPLY
XC1.nodes('dodatneSlike/*') AS XT2(XC2)

Generate XML comments with SQL FOR XML statement

Background: I am generating pieces of a much larger XML document (HL7 CDA documents) using SQL FOR XML queries. Following convention, we need to include section comments before this XML node so that when the nodes are reassembled into the larger document, they are easier to read.
Here is a sample of the expected output:
<!--
********************************************************
Past Medical History section
********************************************************
-->
<component>
<section>
<code code="10153-2" codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC"/>
<title>Past Medical History</title>
<text>
<list>
<item>COPD - 1998</item>
<item>Dehydration - 2001</item>
<item>Myocardial infarction - 2003</item>
</list>
</text>
</section>
</component>
Here is the SQL FOR XML statement that I have constructed to render the above XML:
SELECT '10153-2' AS [section/code/@code], '2.16.840.1.113883.6.1' AS [section/code/@codeSystem], 'LOINC' AS [section/code/@codeSystemName],
'Past Medical History' AS [section/title],
(SELECT [Incident] + ' - ' + [IncidentYear] as "item"
FROM [tblSummaryPastMedicalHistory] AS PMH
WHERE ([PMH].[Incident] IS NOT NULL) AND ([PMH].[PatientUnitNumber] = [PatientEncounter].[PatientUnitNumber])
FOR XML PATH('list'), TYPE
) as "section/text"
FROM tblPatientEncounter AS PatientEncounter
WHERE (PatientEncounterNumber = 6)
FOR XML PATH('component'), TYPE
While I can insert the comments from the controlling function that reassembles these XML snippets into the main document, our goal is to have the comments be generated with the output to avoid document construction errors.
I've tried a few things, but am having trouble producing the comments with the SELECT statement. I've tried a simple string, but have not been able to get the syntax for the line breaks. Any suggestions?
Example:
SELECT [EmployeeKey]
,[ParentEmployeeKey]
,[FirstName]
,[LastName]
,[MiddleName]
,[DepartmentName] AS "comment()"
FROM [AdventureWorksDW2008].[dbo].[DimEmployee]
FOR XML PATH('Employee'),ROOT('Employees')
produces:
<Employees>
<Employee>
<EmployeeKey>1</EmployeeKey>
<ParentEmployeeKey>18</ParentEmployeeKey>
<FirstName>Guy</FirstName>
<LastName>Gilbert</LastName>
<MiddleName>R</MiddleName>
<!--Production-->
</Employee>
<Employee>
<EmployeeKey>2</EmployeeKey>
<ParentEmployeeKey>7</ParentEmployeeKey>
<FirstName>Kevin</FirstName>
<LastName>Brown</LastName>
<MiddleName>F</MiddleName>
<!--Marketing-->
</Employee>
</Employees>
Just an alternative that also works:
select cast('<!-- comment -->' as xml)
This may be the only viable approach if you're using FOR XML EXPLICIT, which doesn't support the [comment()] column alias notation of the answer by John Saunders. For example:
select
1 [tag],
null [parent],
(select cast('<!-- test -->' as xml)) [x!1],
2 [x!1!b]
for xml explicit, type
The above produces:
<x b="2"><!-- test --></x>
If the comment is dynamic, just concatenate it like this:
select cast('<!--' + column_name + '-->' as xml) from table_name
