SQL Server 2019 FOR XML nested nodes preserving CDATA - sql-server

I have to build this payload
<?xml version="1.0" encoding="utf-8"?>
<shipment>
<software>
<application>MYRTL</application>
<version>1.0</version>
</software>
<security>
<customer>X00000</customer>
<user>X00000</user>
<password>password1</password>
<langid>IT</langid>
</security>
<consignment action="I" cashondeliver="N" international="N" insurance="N">
<labelType>T</labelType>
<senderAccId>200200</senderAccId>
<consignmenttype>T</consignmenttype>
<actualweight>00008000</actualweight>
<actualvolume>0000018</actualvolume>
<totalpackages>2</totalpackages>
<packagetype>C</packagetype>
<division>D</division>
<product>N</product>
<insurancevalue>0000000000000</insurancevalue>
<insurancecurrency>EUR</insurancecurrency>
<reference><![CDATA[22X000223]]></reference>
<collectiondate>20220818</collectiondate>
<termsofpayment>S</termsofpayment>
<systemcode>RL</systemcode>
<systemversion>1.0</systemversion>
<codfvalue>0000000000000</codfvalue>
<codfcurrency>EUR</codfcurrency>
<goodsdesc><![CDATA[Bread, Butter & Puré]]></goodsdesc>
<addresses>
<address>
<addressType>S</addressType>
<vatno>123456789123</vatno>
<addrline1><![CDATA[Via Mondovì, n° 23]]></addrline1>
<postcode><![CDATA[20125]]></postcode>
<phone1><![CDATA[345]]></phone1>
<phone2><![CDATA[3456345]]></phone2>
<name><![CDATA[Jack & Joe srl]]></name>
<country><![CDATA[IT]]></country>
<town><![CDATA[Arquà Polesine]]></town>
<province><![CDATA[RO]]></province>
<email><![CDATA[mail@jack_and_joe.it]]></email>
</address>
<address>
<addressType>C</addressType>
<addrline1><![CDATA[12° Reggimento Granatieri, 14]]></addrline1>
<postcode><![CDATA[00195]]></postcode>
<phone1><![CDATA[321]]></phone1>
<phone2><![CDATA[3214321]]></phone2>
<name><![CDATA[Giosuè Rossë]]></name>
<country><![CDATA[IT]]></country>
<town><![CDATA[Gambolo']]></town>
<province><![CDATA[TV]]></province>
<email><![CDATA[mario@rossi.it]]></email>
</address>
<address>
<addressType>R</addressType>
<addrline1><![CDATA[Hauptstraße 13]]></addrline1>
<postcode><![CDATA[34100]]></postcode>
<phone1><![CDATA[333]]></phone1>
<phone2><![CDATA[333444555]]></phone2>
<name><![CDATA[Noè Giassù]]></name>
<country><![CDATA[IT]]></country>
<town><![CDATA[Völs am Schlern]]></town>
<province><![CDATA[BZ]]></province>
<email><![CDATA[mail@noe.it]]></email>
</address>
</addresses>
<collectiontrg>
<priopntime>0900</priopntime>
<priclotime>1200</priclotime>
<secopntime>1400</secopntime>
<secclotime>1800</secclotime>
<availabilitytime>1600</availabilitytime>
<pickupdate>18.08.2022</pickupdate>
<pickuptime>1600</pickuptime>
<pickupdays>1</pickupdays>
<pickupinstr><![CDATA[Test Shipment ===> DO NOT COLLECT <===]]></pickupinstr>
</collectiontrg>
<dimensions itemaction="I">
<itemsequenceno>1</itemsequenceno>
<itemtype>C</itemtype>
<itemreference><![CDATA[22X0002223_1]]></itemreference>
<volume>0000009</volume>
<weight>00003000</weight>
<length>030000</length>
<heigh>010000</heigh>
<width>030000</width>
<quantity>1</quantity>
</dimensions>
<dimensions itemaction="I">
<itemsequenceno>2</itemsequenceno>
<itemtype>C</itemtype>
<itemreference><![CDATA[22X0002223_2]]></itemreference>
<volume>0000009</volume>
<weight>00005000</weight>
<length>030000</length>
<heigh>010000</heigh>
<width>030000</width>
<quantity>1</quantity>
</dimensions>
</consignment>
</shipment>
I had the bad idea to use T-SQL, since all the data is in a SQL Server database.
I thought it would be quite easy, and at first it was: all it required was nesting some FOR XML PATH, TYPE subqueries.
Problems arose once I considered that some fields could contain non-standard characters, so it was better to wrap them in CDATA sections.
I ran into several problems, because it appears that the only way to preserve CDATA is FOR XML EXPLICIT, which seems to be deprecated.
It was also very difficult to find documentation.
Fortunately I found this post, which helped me work backwards:
So I built a stored procedure that uses the FOR XML EXPLICIT format:
SELECT 1 AS Tag,
NULL AS Parent,
'MYRTL' AS 'software!1!application!element',
'1.0' AS 'software!1!version!element',
NULL AS 'security!2!customer!element',
...
NULL AS 'security!2!langid!element',
NULL AS 'consignment!3!action',
...
NULL AS 'consignment!3!goodsdesc!CDATA',
NULL AS 'addresses!4!address',
NULL AS 'address!5!addressType!element',
...
NULL AS 'address!5!town!CDATA',
...
NULL AS 'collectiontrg!9!priopntime!element',
...
NULL AS 'collectiontrg!9!pickupdate!element'
UNION ALL
SELECT 2 AS Tag,
NULL AS Parent,
...
UNION ALL
SELECT 3 AS Tag,
NULL AS Parent,
...
UNION ALL
SELECT 9 AS Tag,
3 AS Parent,
...
FOR XML EXPLICIT, ROOT('shipment')
It seems to be working well, although I think there has to be a better way to build it.
Now I have a further issue that I do not know how to solve. I could solve it with a dynamic query, but I would rather avoid that.
The new issue is that the node shipment.consignment.addresses.address where addressType=='C'
has to be omitted if it contains the same values as shipment.consignment.addresses.address where addressType=='S'.
Furthermore, the node shipment.consignment.collectiontrg has to appear only if the variable pickupDate is not null.
Is there a way to avoid the dynamic query?
Is there a better way to build this query?
Thanks
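One possible way to avoid dynamic SQL, sketched under assumptions rather than taken from the original procedure: in FOR XML EXPLICIT every UNION ALL branch supplies the rows for one tag, so a branch that returns no rows should produce no element. A plain WHERE on that branch may therefore be enough to make a node conditional. The sketch below is reduced to three tags; the variable @pickupDate and the tag numbers are illustrative only, and the CDATA element gets its own tag with an unnamed cdata column, which is the form the FOR XML EXPLICIT documentation describes.

```sql
-- Hypothetical, reduced universal table: consignment (tag 1),
-- collectiontrg (tag 2), pickupinstr as CDATA (tag 3).
DECLARE @pickupDate date = '20220818';  -- set to NULL to drop <collectiontrg>

SELECT 1    AS Tag,
       NULL AS Parent,
       'T'  AS [consignment!1!labelType!element],
       CAST(NULL AS char(10))      AS [collectiontrg!2!pickupdate!element],
       CAST(NULL AS nvarchar(100)) AS [pickupinstr!3!!cdata]
UNION ALL
SELECT 2, 1, NULL,
       CONVERT(char(10), @pickupDate, 104),  -- style 104 = dd.mm.yyyy
       NULL
WHERE  @pickupDate IS NOT NULL               -- no row, no <collectiontrg>
UNION ALL
SELECT 3, 2, NULL, NULL,
       N'Test Shipment ===> DO NOT COLLECT <==='
WHERE  @pickupDate IS NOT NULL               -- child rows must vanish with the parent
ORDER BY Tag
FOR XML EXPLICIT, ROOT('shipment');
```

The same idea would apply to the branch that emits the address with addressType 'C': give it a WHERE that compares its columns with those of the 'S' address, so the branch returns no row when they match.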

Related

Limit xml-namespaces to only the main root

I have this query
WITH XMLNAMESPACES(DEFAULT 'https://tribunet.hacienda.go.cr /docs/esquemas/2017/v4.2/facturaElectronica'
,'http://www.w3.org/2001/XMLSchema' AS xsd
,'http://www.w3.org/2001/XMLSchema-instance' AS xsi)
SELECT 1 AS [id]
,0 AS [pass]
,(
/*Others*/
SELECT
OT.OTH_MESSAGE as Others
FROM [crdx_COREDev1].[dbo].[OTH_OTHERS] as OT
where
OT.OTH_ID=E.OTH_ID
FOR XML PATH('Others'), TYPE
)
,0 AS [CONSECUTIVE]
FOR XML PATH('FE');
This generates this XML
<FE xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2 /facturaElectronica"> <- CHANGE 2
<id>1</id>
<pass>0</pass>
<CONSECUTIVE>0</CONSECUTIVE>
<Others xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2 /facturaElectronica">
<MESSAGE>MESSAGE</MESSAGE>
</Others>
</FE>
Now my question: I would like only <FE> to show the namespaces, but, as you can see in the XML, those declarations also appear in <Others>. How can I limit this to <FE>?
This is an annoying and well-known issue. It occurs whenever you use namespaces together with nested sub-queries in FOR XML queries.
There was a Connect issue open for more than 10 years, until it disappeared recently.
It is important to mention that these repeated namespace declarations are not wrong; they just bloat your XML. They can, however, collide with (too) strict schema validations.
There is no good solution, just workarounds:
Create the inner XML without the namespace and add the wrapping node on a string basis, or
Create the namespaces as normal attributes (but not named xmlns) and use REPLACE to change the names.
Both workarounds need a conversion to NVARCHAR(MAX) and back to XML.
I really have no idea why this was implemented this way...
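A rough sketch of the second workaround, under assumptions: the placeholder attribute names xmlnsDummy, xmlnsDummy_xsd and xmlnsDummy_xsi are made up for illustration, and the literal 'MESSAGE' stands in for the real sub-query on OTH_OTHERS. The inner XML is created without any namespace, the outer XML is created as a string with ordinary attributes, and REPLACE turns the placeholders into real namespace declarations before the cast back to XML.

```sql
-- Inner XML is built without any namespace declaration.
DECLARE @inner xml =
(
    SELECT 'MESSAGE' AS [MESSAGE]
    FOR XML PATH('Others'), TYPE
);

-- Outer XML as a string (no TYPE): the namespaces are ordinary
-- attributes with placeholder names (hypothetical names).
DECLARE @s nvarchar(max) =
(
    SELECT 'https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2/facturaElectronica' AS [@xmlnsDummy]
          ,'http://www.w3.org/2001/XMLSchema'          AS [@xmlnsDummy_xsd]
          ,'http://www.w3.org/2001/XMLSchema-instance' AS [@xmlnsDummy_xsi]
          ,1 AS [id]
          ,0 AS [pass]
          ,0 AS [CONSECUTIVE]
          ,@inner AS [*]   -- inlined as-is
    FOR XML PATH('FE')
);

-- Rename the placeholders (prefixed ones first), then cast back to XML.
SELECT CAST(REPLACE(REPLACE(@s, N'xmlnsDummy_', N'xmlns:'),
                    N'xmlnsDummy', N'xmlns') AS xml) AS Result;
```

With this approach the declarations should appear on <FE> only, because the nested element never sees WITH XMLNAMESPACES. The price is the round trip through NVARCHAR(MAX) mentioned above.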
Attention:
xmlns="https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2 /facturaElectronica">
You are using namespace URLs with blanks. This is not allowed...

Add Extra information to query output in SQL Server

I have this query in SQL Server that generates XML, and I would like to add a couple of details.
This is what I have:
<FE>
<id>1</id>
<pass>0</pass>
<CONSECUTIVE>0</CONSECUTIVE>
<DetailLine>
<Article>Book</Article>
<Currency>USD</Currency>
<Price>10</Price>
<Total>10</Total>
</DetailLine>
....
</FE>
and I would like it to generate this output
<?xml version="1.0" encoding="utf-8"?> <- CHANGE 1
<FE xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2/facturaElectronica"> <- CHANGE 2
<id>1</id>
<pass>0</pass>
<CONSECUTIVE>0</CONSECUTIVE>
<DetailLine>
<Article>Book</Article>
<Currency>USD</Currency>
<Price>10</Price>
<Total>10</Total>
</DetailLine>
</EndLine> <--add this CHANGE 3
....
</FE>
Your question is not entirely clear. Change 3 in particular is actually wrong.
But one thing after the other:
change 1: This is not supported at the XML level!
Any XML declaration <?xml blah ?> will be omitted, as XML within SQL Server is UTF-16 in any case (to be exact: UCS-2).
Real UTF-8 was not supported by SQL Server at all until SQL Server 2019 introduced UTF-8 collations. Many people think that the VARCHAR type is UTF-8, but without such a collation this is wrong: it is extended ASCII.
You can concatenate strings. Thus you can convert the XML to a string and prepend any other string you like, but this is probably no longer well-formed XML.
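The string concatenation mentioned for change 1 could look roughly like this. It is only a sketch: the variable names are illustrative, and the result is a plain NVARCHAR(MAX) string, so whatever writes it to a file or sends it over the wire has to encode it as UTF-8 for the declaration to be true.

```sql
-- Build the XML as usual.
DECLARE @x xml =
(
    SELECT 1 AS [id], 0 AS [pass], 0 AS [CONSECUTIVE]
    FOR XML PATH('FE'), TYPE
);

-- Prepend the declaration on string level. The result is a string,
-- not the xml type, and is not meant to be cast back to xml.
SELECT N'<?xml version="1.0" encoding="utf-8"?>'
       + CAST(@x AS nvarchar(max)) AS XmlText;
```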
change 2: This is easy, read about WITH XMLNAMESPACES
Try this
WITH XMLNAMESPACES(DEFAULT 'https://tribunet.hacienda.go.cr/docs/esquemas/2017/v4.2/facturaElectronica'
,'http://www.w3.org/2001/XMLSchema' AS xsd
,'http://www.w3.org/2001/XMLSchema-instance' AS xsi)
SELECT 1 AS [id]
,0 AS [pass]
,0 AS [CONSECUTIVE]
FOR XML PATH('FE');
change 3: No idea what you need. Your example shows a closing </EndLine> tag, but this does not help me understand your issue. In any case, a closing tag without the matching opening tag is not well-formed.

Query XML value in sql

I need to get some information from XML in SQL Server 2008, but I cannot even get a basic attribute from it. All the samples that I tried failed. The table name is Item, and the xml column name is Data.
Simplified xml looks like this:
<AnchoredXml xmlns="urn:schema:Microsoft.Rtc.Management.ScopeFramework.2008" SchemaWriteVersion="2">
<Key ScopeClass="Global">
<SchemaId Namespace="urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008" ElementName="Topology" />
<AuthorityId Class="Host" InstanceId="00000000-0000-0000-0000-000000000000" />
</Key>
<Dictionary Count="1">
<Item>
<Key />
<Value Signature="a3502dd0-8c16-4023-9eea-30ea1c7a3a2b">
<Topology xmlns="urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008">
<Services>
<Service RoleVersion="1" ServiceVersion="6" Type="Microsoft.Rtc.Management.Deploy.Internal.ServiceRoles.FileStoreService">
<ServiceId SiteId="1" RoleName="FileStore" Instance="1" />
<DependsOn />
<InstalledOn>
<ClusterId SiteId="1" Number="1" />
</InstalledOn>
<Ports xmlns="urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008" />
<FileStoreService xmlns="urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008" ShareName="lyncShare" />
</Service>
</Services>
</Topology>
</Value>
</Item>
</Dictionary>
</AnchoredXml>
I need to read the information in AnchoredXml/Key/SchemaId/@Namespace to select the right XML (there are more rows). The sample XML above is the right one. After that I need to find the right service with
Type="Microsoft.Rtc.Management.Deploy.Internal.ServiceRoles.FileStoreService"
which contains the FileStoreService/@ShareName that I need.
I've tried to print the Namespace attribute for a start, but no sample code is working.
A few tries:
SELECT c.p.value('(@Namespace)[1]', 'varchar(50)') as 'Nmspace'
FROM Item
CROSS APPLY Data.nodes('/AnchoredXml/Key/SchemaId') c(p)
returns an empty result set
SELECT Data.value('(/AnchoredXml/Key/SchemaId/@Namespace)[1]', 'varchar(50)')
FROM Item
returns NULL for all rows
SELECT
It.Data.exist('/AnchoredXml/Key/SchemaId[@Namespace="Microsoft.Rtc.Management.Deploy.Topology.2008"]')
FROM [xds].[dbo].[Item] AS It
returns 0 for all rows, also without the quotes ("")
A working sample that reads at least one attribute would probably be sufficient, and I would figure out the rest.
Could you please help me find the errors in my queries, or maybe identify some other problem?
Thanks
Thanks
You're ignoring all the XML namespaces in your XML document! You need to pay attention to those and respect them!
There are XML namespaces on:
the root node <AnchoredXml>
(XML namespace: urn:schema:Microsoft.Rtc.Management.ScopeFramework.2008)
the subnode <Topology>
(XML ns: urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008)
the subnode <FileStoreService>
(XML ns: urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008)
Try this:
-- respect the XML namespaces!!
;WITH XMLNAMESPACES(DEFAULT 'urn:schema:Microsoft.Rtc.Management.ScopeFramework.2008',
'urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008' AS t,
'urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008' AS fss)
SELECT
ShareName = Data.value('(/AnchoredXml/Dictionary/Item/Value/t:Topology/t:Services/t:Service/fss:FileStoreService/@ShareName)[1]', 'varchar(50)')
FROM
dbo.Item
In my case, this returns:
ShareName
-----------
lyncShare
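The question also asked to pick the right row by SchemaId/@Namespace and the right service by its Type attribute. A possible sketch of both filters follows. The variable @data is an assumption that stands in for the Data column, and it holds a trimmed copy of the sample XML; note that the attribute value to compare against includes the urn:schema: prefix.

```sql
DECLARE @data xml = N'
<AnchoredXml xmlns="urn:schema:Microsoft.Rtc.Management.ScopeFramework.2008" SchemaWriteVersion="2">
  <Key ScopeClass="Global">
    <SchemaId Namespace="urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008" ElementName="Topology" />
  </Key>
  <Dictionary Count="1"><Item><Key /><Value>
    <Topology xmlns="urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008"><Services>
      <Service Type="Microsoft.Rtc.Management.Deploy.Internal.ServiceRoles.FileStoreService">
        <FileStoreService xmlns="urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008" ShareName="lyncShare" />
      </Service>
    </Services></Topology>
  </Value></Item></Dictionary>
</AnchoredXml>';

WITH XMLNAMESPACES(DEFAULT 'urn:schema:Microsoft.Rtc.Management.ScopeFramework.2008',
                   'urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008' AS t,
                   'urn:schema:Microsoft.Rtc.Management.Deploy.ServiceRoles.2008' AS fss)
SELECT
    -- the predicate on @Type selects the FileStoreService role only
    ShareName = @data.value('(/AnchoredXml/Dictionary/Item/Value/t:Topology/t:Services
                  /t:Service[@Type="Microsoft.Rtc.Management.Deploy.Internal.ServiceRoles.FileStoreService"]
                  /fss:FileStoreService/@ShareName)[1]', 'varchar(50)')
WHERE
    -- exist() keeps only documents whose SchemaId points at the Topology schema
    @data.exist('/AnchoredXml/Key/SchemaId[@Namespace="urn:schema:Microsoft.Rtc.Management.Deploy.Topology.2008"]') = 1;
```

Against the real table, the same expressions would be applied to the Data column with FROM dbo.Item instead of the variable.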

SQL Server 2005 Xquery namespaces

I'm trying to get some values out of an Xml Datatype. The data looks like:
<Individual xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FirstName xmlns="http://nswcc.org.au/BusinessEntities.Crm">Lirria</FirstName>
<LastName xmlns="http://nswcc.org.au/BusinessEntities.Crm">Latimore</LastName>
</Individual>
Note the presence of the xmlns in the elements FirstName and LastName - this is added when we create the xml by serializing a c# business object. Anyway it seems that the presence of this namespace in the elements is causing XQuery expressions to fail, such as:
SELECT MyTable.value('(//Individual/LastName)[1]','nvarchar(100)') AS FirstName
This returns null. But when I strip out the namespace from the elements in the xml (e.g. using a Replace T-SQL statement), the above returns a value. However there must be a better way - is there a way of making this query work i.e. without updating the xml first?
Thanks
John Davies
You need to properly name the element you want to select. See Adding Namespaces Using WITH XMLNAMESPACES. Here is an example using your XML:
declare @x xml;
set @x = N'<Individual
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<FirstName xmlns="http://nswcc.org.au/BusinessEntities.Crm">Lirria</FirstName>
<LastName xmlns="http://nswcc.org.au/BusinessEntities.Crm">Latimore</LastName>
</Individual>';
with xmlnamespaces (N'http://nswcc.org.au/BusinessEntities.Crm' as crm)
select @x.value(N'(//Individual/crm:LastName)[1]',N'nvarchar(100)') AS FirstName
The * wildcard will also allow you to select the element without enforcing the explicit namespace. Remus' answer is the way to go, but this may assist others having namespace issues:
select @x.value(N'(//Individual/*:LastName)[1]',N'nvarchar(100)')

Modify XML in SQL server to add a root node

To give some background to this problem first, I am rewriting some code that currently loops through some xml, doing an insert to a table at the end of each loop - replacing with a single sp that takes an xml parameter and does the insert in one go, 'shredding' the xml into a table.
The main shred has been done successfully, but currently one of the columns is used to store the entire node. I have been able to work out the query necessary for this (almost), but it misses out the root part of the node. I have come to the conclusion that my query is as good as I can get it, and I am now looking for a way to do an update statement afterwards to get the root node back in there.
So my xml is of the form;
<xml>
<Items>
<Item>
<node1>...</node1><node2>..</node2>.....<noden>...</noden>
</Item>
<Item>
<node1>...</node1><node2>..</node2>.....<noden>...</noden>
</Item>
<Item>
<node1>...</node1><node2>..</node2>.....<noden>...</noden>
</Item>
......
</Items>
</xml>
So the basic shredding puts the value from node1 into column1, node2 into column2 etc. The insert statement looks something like;
INSERT INTO mytable (col1, col2,...etc.....,wholenodecolumn)
SELECT
doc.col.value('node1[1]', 'int') column1,
doc.col.value('node2[1]', 'varchar(50)') column2,
....etc......,
doc.col.query('*') --this is the query for getting the whole node
FROM @xml.nodes('//Items/Item') doc(col)
The XML that ends up in wholenodecolumn is of the form:
<node1>...</node1><node2>..</node2>.....<noden>...</noden>
but I need it to be of the form
<Item><node1>...</node1><node2>..</node2>.....<noden>...</noden></Item>
There is existing code (a lot of it) that depends on the xml in this column being of the correct form.
So can someone maybe see how to modify the doc.col.query('*') to get the desired result?
Anyway, I gave up on modifying the query and tried to think of other ways to accomplish the end result. What I am now looking at is an update after the insert, something like:
update mytable set wholenodecolumn.modify('insert <Item> as first before * ')
If I could do this along with
.modify('insert </Item> as last after * ')
that would be fine, but doing one at a time isn't an option, as the XML is then invalid:
XQuery [mytable.wholenodecolumn.modify()]: Expected end tag 'Item'
I don't know whether doing both together is possible; I've tried various syntaxes and can't get it to work.
Any other approaches to the problem are also gratefully received.
I believe you can specify the root node name by using the FOR clause.
For example:
select top 1 *
from HumanResources.Department
for XML AUTO, ROOT('RootNodeName')
Take a look at Books Online for more details:
http://msdn.microsoft.com/en-us/library/ms190922.aspx
Answering my own question here! This follows on from the comments on one of the other attempted answers, where I said:
I am currently looking into FLWOR XQuery constructs in the query.
col.query('for $item in * return <Item> {$item} </Item>') is almost there, but puts <Item> around each node, rather than around all the nodes.
I was almost there with the syntax; a small tweak has given me what I needed:
doc.col.query('<Item> { for $item in * return $item } </Item>')
Thank you to everyone who helped. I have further related issues now, but I'll post them as separate questions.
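For anyone who wants to try the wrapping idea in isolation, here is a small sketch with made-up sample data (the variable @xml and its two items are assumptions for illustration). It shows the element constructor around the child nodes and, as a possibly simpler alternative, query('.'), which returns the context node itself including its <Item> element.

```sql
DECLARE @xml xml = N'
<xml>
  <Items>
    <Item><node1>1</node1><node2>a</node2></Item>
    <Item><node1>2</node1><node2>b</node2></Item>
  </Items>
</xml>';

SELECT doc.col.value('node1[1]', 'int')         AS column1,
       doc.col.value('node2[1]', 'varchar(50)') AS column2,
       -- constructor: wraps all child nodes in one new <Item> element
       doc.col.query('<Item>{ * }</Item>')      AS wrapped,
       -- alternative: the context node as a whole
       doc.col.query('.')                       AS wholenode
FROM @xml.nodes('/xml/Items/Item') AS doc(col);
```

Note that XQuery element constructors are case-sensitive, so the opening and closing tag names have to match exactly.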
Couldn't you just add the '<Item>' / '</Item>' as fixed texts in your select? Something like:
Select
'<Item>',
doc.col.value('node1[1]', 'int') column1,
doc.col.value('node2[1]', 'varchar(50)') column2,
....etc......,
doc.col.query('*'), --this is the query for getting the whole node
'</Item>'
FROM @xml.nodes('//Items/Item') doc(col)
Marc
