How to replace XML queries in SQL Server? - sql-server

I have a question about restructuring an XML column. Here is a small example of what I would like to change.
<i><recipes n="0" />
<CGrecipes cg="4" r0="302053" r1="302084" r2="302049" r3="302068" />
<HArecipes ha="4" r0="302103" r1="302083" r2="302050" r3="302087" />
<KHrecipes kh="10" r0="302100" r1="302090" r2="302078" r3="302074"
r4="302094" r5="302082" r6="302066" r7="302051"
r8="302086" r9="302070" />
<KHNrecipes khn="10" r0="302102" r1="302089" r2="302056" r3="302077"
r4="302052" r5="302069" r6="302081" r7="302073"
r8="302093" r9="302085" />
<IMITARrecipes imitar="2" r0="302110" r1="302057" />
<MAUSERrecipes mauser="1" r0="302106" />
<SVDrecipes svd="1" r0="302059" />
<BLASERrecipes blaser="2" r0="302105" r1="302060" />
<SIGSAUERrecipes sigsauer="2" r0="302109" r1="302061" />
<HONEYBADGERrecipes honeybadger="1" r0="302062" />
<SMALLBACKPACKrecipes smallbackpack="1" r0="302095" />
<MEDIUMBACKPACKrecipes mediumbackpack="1" r0="302096" />
<MILITARYBACKPACKrecipes militarybackpack="1" r0="302097" />
<LARGEBACKPACKrecipes largebackpack="1" r0="302098" />
<TEDDYBACKPACKrecipes teddybackpack="1" r0="302099" />
<ALICEBACKPACKrecipes alicebackpack="1" r0="302112" />
<M82recipes m82="1" r0="302107" />
<AWMrecipes awm="1" r0="302108" />
<B93Rrecipes b93r="1" r0="302111" />
</i>
And I want to change it with a script to:
<i><recipes r="0" r0="302053" r1="302084" r2="302049" r3="302068" r4="302103" r5="302083" r6="302050" r7="302087" r8="302100" r9="302090" r10="302078" r11="302074" r12="302094" r13="302082" r14="302066" r15="302051" r16="302086" r17="302070" r18="302102" r19="302089" r20="302056" r21="302077" r22="302052" r23="302069" r24="302081" r25="302073" r26="302093" r27="302085" r28="302110" r29="302057" r30="302106" r31="302059" r32="302105" r33="302060" r34="302109" r35="302061" r36="302062" r37="302095" r38="302096" r39="302097" r40="302098" r41="302099" r42="302112" r43="302107" r44="302108" r45="302111" /></i>
I would like to get help and suggestions!

Well, it couldn't be simpler (literally, I think), thanks to T-SQL's very limited implementation of XQuery and its general hostility to dynamic XML of any kind. Let @xml be a variable containing your XML (if it's a column, add a FROM clause as required).
SELECT CONVERT(XML, REPLACE((
    SELECT @xml.value('/i[1]/recipes[1]/@n', 'int') AS [@r], '' AS [@marker]
    FOR XML PATH('recipes'), ROOT('i')
), 'marker=""', (
    SELECT ' ' + REPLACE(REPLACE('r#="$v"',
        '#', ROW_NUMBER() OVER (ORDER BY (SELECT 1)) - 1),
        '$v', t.value('text()[1]', 'int'))
    FROM (
        SELECT @xml.query('
            for $a in //*/@*[substring(local-name(),1,1)="r"]
            return <r>{string($a)}</r>
        ') AS a
    ) _ CROSS APPLY a.nodes('r') AS x(t)
    FOR XML PATH('')
)))
From inner to outer: we unpack all the r* attributes into elements, attach a row number to them, then fold the result back into XML by lamely concatenating strings. For a finale, we transform the n attribute of recipes into r and substitute our string concatenation into the outer element.
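To make the first two steps easier to follow, here is a minimal sketch of the inner query on its own, run against a shortened sample. The variable contents and the column names new_index and recipe_id are illustrative assumptions, not part of the original answer.

```sql
-- Shortened sample document (only two recipe groups, values taken from the question).
DECLARE @xml XML = N'<i><recipes n="0" />
<CGrecipes cg="2" r0="302053" r1="302084" />
<HArecipes ha="2" r0="302103" r1="302083" /></i>';

-- Step 1: turn every attribute whose name starts with "r" into an <r> element.
-- Step 2: shred those elements into rows and number them from 0.
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) - 1 AS new_index,
       x.t.value('text()[1]', 'int')               AS recipe_id
FROM (
    SELECT @xml.query('
        for $a in //*/@*[substring(local-name(),1,1)="r"]
        return <r>{string($a)}</r>
    ') AS a
) AS src
CROSS APPLY src.a.nodes('r') AS x(t);
```

Note that ORDER BY (SELECT 1) does not formally promise document order. It tends to follow it in practice, but treat the numbering as best effort.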
Why is this code so terrible? Because the data model is terrible (and because SQL Server's implementation of XQuery is quite limited, omitting most of the advanced features that could simplify this). It's an abuse of XML in every way. Consider changing the attributes into child elements. Don't use concatenated element names like ALICEBACKPACKrecipes; generalize them to recipes name='ALICEBACKPACK' or suchlike. Think static names and repeating content:
<i>
<recipes name="" value="0"></recipes>
<recipes name="cg" value="4">
<r>302053</r>
<r>302084</r>
...
</recipes>
...
<recipes name="ALICEBACKPACK" value="1">
<r>302112</r>
</recipes>
...
</i>
This is far easier to query and process for anything that isn't a fully fledged programming language.
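As a rough illustration of that claim, the following sketch queries a document in the suggested shape. The variable @recipes and the column aliases are assumptions made for the example; the values come from the question.

```sql
-- Hypothetical sample in the redesigned shape.
DECLARE @recipes XML = N'<i>
  <recipes name="" value="0" />
  <recipes name="cg" value="4">
    <r>302053</r><r>302084</r><r>302049</r><r>302068</r>
  </recipes>
  <recipes name="ALICEBACKPACK" value="1">
    <r>302112</r>
  </recipes>
</i>';

-- One row per <r> element, together with the name of the group it belongs to.
SELECT g.rec.value('@name', 'varchar(50)')  AS recipe_group,
       r.item.value('(./text())[1]', 'int') AS recipe_id
FROM @recipes.nodes('/i/recipes') AS g(rec)
CROSS APPLY g.rec.nodes('r') AS r(item);
```

With static element names, a single pair of nodes() calls covers every group, and no string manipulation is needed.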

Related

Retrieve associated value from next node for each tag

I have the following XML:
<Envelope format="ProceedoOrderTransaction2.1">
<Sender>SENDER</Sender>
<Receiver>RECEIVER</Receiver>
<EnvelopeID>xxxxx</EnvelopeID>
<Date>2021-05-06</Date>
<Time>11:59:46</Time>
<NumberOfOrder>1</NumberOfOrder>
<Order>
<Header>
<OrderNumber>POXXXXX</OrderNumber>
</Header>
<Lines>
<Line>
<LineNumber>1</LineNumber>
<ItemName>Ipsum Lorum</ItemName>
<SupplierArticleNumber>999999</SupplierArticleNumber>
<UnitPrice vatRate="25.0">50</UnitPrice>
<UnitPriceBasis>1</UnitPriceBasis>
<OrderedQuantity unit="Styck">200</OrderedQuantity>
<AdditionalItemProperty Key="ARTIKELNUMMER" Description="Unik ordermärkning (artikelnummer):" />
<Value>999999</Value>
<AdditionalItemProperty Key="BESKRIVNING" Description="Kort beskrivning:" />
<Value>Ipsum Lorum</Value>
<AdditionalItemProperty Key="BSKRIVNING" Description="Beskrivning:" />
<Value>Ipsum Lorum</Value>
<AdditionalItemProperty Key="ENHET" Description="Enhet:" />
<Value>Styck</Value>
<AdditionalItemProperty Key="KVANTITET" Description="Kvantitet:" />
<Value>200</Value>
<AdditionalItemProperty Key="PRIS" Description="Pris/Enhet (ex. moms):" />
<Value>50</Value>
<AdditionalItemProperty Key="VALUTA" Description="Valuta:" />
<Value>SEK</Value>
<Accounting>
<AccountingLine amount="10000">
<AccountingValue dimensionPosition="001" dimensionExternalID="ACCOUNT">xxx</AccountingValue>
<AccountingValue dimensionPosition="002" dimensionExternalID="F1">Ipsum Lorum</AccountingValue>
<AccountingValue dimensionPosition="005" dimensionExternalID="F3">1</AccountingValue>
<AccountingValue dimensionPosition="010" dimensionExternalID="F2">9999</AccountingValue>
</AccountingLine>
</Accounting>
</Line>
</Lines>
</Order>
</Envelope>
I am able to parse all values into a table structure except for one, which I cannot reliably associate with its tag. I correctly get one row per AdditionalItemProperty and can read its Key and Description attributes (for example BESKRIVNING and Kort beskrivning:), but I can't find a reasonable way to get the text between the <Value> </Value> tags that belongs to each property. For the key ARTIKELNUMMER, for instance, the associated value is 999999, and it sits on the same hierarchy level (insane, I know) as the AdditionalItemProperty it belongs to. The logic seems to be that the value for an AdditionalItemProperty is the Value element that immediately follows it.
I am using SQL Server 2019. I have gotten this far:
-- Purchaseorderrowattributes
select top(10)
    i.value(N'(./Header/OrderNumber/text())[1]', 'nvarchar(30)') as OrderNumber,
    ap.value(N'(../LineNumber/text())[1]', 'nvarchar(30)') as LineNumber,
    ap.value(N'(@Description)', 'nvarchar(50)') property_description
from
    load.proceedo_orders t
outer apply
    t.xml_data.nodes('Envelope/Order') AS i(i)
outer apply
    i.nodes('Lines/Line/AdditionalItemProperty') as ap(ap)
where
    file_name = @filename
Which produces the following output:
OrderNumber LineNumber property_description
--------------------------------------------
PO170006416 1 Antal timmar
PO170006416 1 Beskrivning
PO170006416 1 Kompetensområde
PO170006416 1 Roll
PO170006416 1 Ordernummer
PO170006416 1 Timpris
I can't find a way to add the value for each property correctly. Since the Value elements always appear in the same order as the AdditionalItemProperty elements, my idea was to compute the position of each AdditionalItemProperty with a row number and then put that row number into the positional predicate in
ap.value(N'(../Value/text())[1]', 'nvarchar(50)') property_description
but SQL Server throws an error saying that the argument has to be a string literal.
To be clear, this is roughly what I tried:
ap.value(CONCAT( N'(../Value/text())[', CAST(ROWNUMBER as varchar) ,']'), 'nvarchar(50)') property_description
SQL Server implements a subset of XQuery 1.0.
You can make use of the >> Node Comparison operator to find the Value sibling node with code similar to the following:
-- Purchaseorderrowattributes
select top(10)
    i.value(N'(./Header/OrderNumber/text())[1]', 'nvarchar(30)') as OrderNumber
    ,ap.value(N'(../LineNumber/text())[1]', 'nvarchar(30)') as LineNumber
    ,ap.value(N'(@Description)', 'nvarchar(50)') property_description
    ,ap.query('
        let $property := .
        return (../Value[. >> $property][1])/text()
    ').value('.[1]', 'varchar(20)') as property_value
from load.proceedo_orders t
outer apply t.xml_data.nodes('Envelope/Order') AS i(i)
outer apply i.nodes('Lines/Line/AdditionalItemProperty') as ap(ap)
where file_name = @filename
So what's going on here?
let $property := . creates a reference to the current AdditionalItemProperty node.
../Value[. >> $property] ascends to the parent node, Line, and descends again to find the Value nodes that come after the referenced AdditionalItemProperty node in document order, with [1] selecting the first of those nodes.
See section 3.5.3, Node Comparisons, of the XQuery 1.0 specification for a little more detail.
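The same technique can be tried in isolation on a cut-down Line element. This is only a sketch: the variable @line and the column aliases are assumptions, and the sample keeps just two of the properties from the question.

```sql
-- Reduced sample: two properties, each followed by its sibling <Value>.
DECLARE @line XML = N'<Line>
  <LineNumber>1</LineNumber>
  <AdditionalItemProperty Key="ENHET" Description="Enhet:" />
  <Value>Styck</Value>
  <AdditionalItemProperty Key="VALUTA" Description="Valuta:" />
  <Value>SEK</Value>
</Line>';

SELECT p.ap.value('@Key', 'nvarchar(50)') AS property_key,
       -- First <Value> sibling that follows the current property in document order.
       p.ap.query('
           let $property := .
           return (../Value[. >> $property][1])/text()
       ').value('.', 'nvarchar(50)')      AS property_value
FROM @line.nodes('/Line/AdditionalItemProperty') AS p(ap);
```

Each property should be paired with the Value element directly after it (ENHET with Styck, VALUTA with SEK), which is the association the question asks for.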

Parse an XML value with a colon in SQL Server

I need help to parse XML in SQL Server. I need to get "d1p1:Val2" value and concatenation of values for "d2p1:string".
<FirstData xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
xmlns:d1p1="http://XXXXXX" xmlns="http://YYYYYY" i:type="d1p1:StaticInfo">
<Timestamp>0</Timestamp>
<ActionResult i:nil="true" />
<d1p1:Val1 xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
<d2p1:string>1</d2p1:string>
<d2p1:string>2</d2p1:string>
<d2p1:string>3</d2p1:string>
<d2p1:string>4</d2p1:string>
</d1p1:Val1>
<d1p1:Val2>false</d1p1:Val2>
</FirstData>
Your question is not very clear, but this might help you:
DECLARE @xml XML=
N'<FirstData xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:d1p1="http://XXXXXX" xmlns="http://YYYYYY" i:type="d1p1:StaticInfo">
<Timestamp>0</Timestamp>
<ActionResult i:nil="true" />
<d1p1:Val1 xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
<d2p1:string>1</d2p1:string>
<d2p1:string>2</d2p1:string>
<d2p1:string>3</d2p1:string>
<d2p1:string>4</d2p1:string>
</d1p1:Val1>
<d1p1:Val2>false</d1p1:Val2>
</FirstData>';
WITH XMLNAMESPACES(DEFAULT 'http://YYYYYY'
,'http://XXXXXX' AS d1p1
,'http://schemas.microsoft.com/2003/10/Serialization/Arrays' AS d2p1)
SELECT @xml.value(N'(/FirstData/d1p1:Val2/text())[1]','bit') AS D1P1_Val2
,@xml.query(N'data(/FirstData/d1p1:Val1/d2p1:string/text())').value(N'text()[1]',N'nvarchar(max)') AS AllStrings;
The result
D1P1_Val2 AllStrings
0 1 2 3 4
This is the - not recommended - minimal query:
SELECT @xml.value(N'(//*:Val2)[1]','bit') AS D1P1_Val2
,@xml.query(N'data(//*:string)').value(N'.',N'nvarchar(max)') AS AllStrings;
Try something like this (I can't test it myself; Instructions stands in for your XML column or variable):
SELECT Instructions.query('
    declare namespace d1p1="http://XXXXXX";
    declare namespace d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays";
    concat(string((//d1p1:Val2)[1]), " ", string((//d2p1:string)[1]))
')
I think you just have to elaborate a bit on this.

Adding node to SQL Server XML is silently failing

Trying to figure out why this snippet does not work.
DECLARE @changeStatus XML, @changeSet XML
SELECT TOP 1 @changeSet = ChangeSet
FROM MyTable
SET @changeStatus = '<change id="' + CAST(NEWID() AS NVARCHAR(36)) + '" by="' + @authorizingUserName + '" byAccountId="' + CAST(@authorizingUserId AS NVARCHAR(36)) + '" when="' + CONVERT(NVARCHAR(50),GETUTCDATE(),127) + 'Z">'
+ '<property id="Status" name="Status" old="' + @status + '" new="Closed" />'
+ '<collections />'
+ '</change>'
-- THIS DOES NOT ERROR OUT AND DOES NOT DO ANYTHING!!
SET @changeSet.modify('insert sql:variable("@changeStatus") as last into (/changes)[1]')
The overall structure of the XML is:
<changes>
<change id="" by="" byAccountId="" when="">
<property />
<collections />
</change>
</changes>
When I run the script and check @changeSet before and after processing, the two are identical, i.e. @changeStatus was never added to the XML contained in @changeSet.
If the original was:
<changes>
<change id="change01" by="" byAccountId="" when="">
<property ... />
<collections />
</change>
<change id="change02" by="" byAccountId="" when="">
<property ... />
<collections />
</change>
</changes>
I expected to see:
<changes>
<change id="change01" by="" byAccountId="" when="">
<property ... />
<collections />
</change>
<change id="change02" by="" byAccountId="" when="">
<property ... />
<collections />
</change>
<change id="82ECB3C5-D3BA-4CD2-B62C-89C083E4BAA1" by="me@mydomain.com" byAccountId="1E910737-D78C-E711-9C04-00090FFE0001" when="2018-01-17T00:12:33.700Z">
<property id="Status" name="Status" old="In Review" new="Closed" />
<collections />
</change>
</changes>
Does anyone see what might be wrong?
You should never build XML at string level!
This has various side effects:
If one of the variables is NULL, the whole string will be NULL.
Try SELECT '<x>' + NULL + '</x>';
XML is not just text with some fancy extras. Using T-SQL's FOR XML will reliably create valid XML. Now imagine one of your strings contains forbidden characters...
Try SELECT '<x>' + 'The arrow "<-" might appear in normal text' + '</x>';
and try to cast the result to XML.
Many values are not stored in a readable format but in some binary format. Depending on your system's settings, the conversion of such types (numbers, date and time, BLOBs) will vary. FOR XML reliably writes these values in the correct format, so they can be read back. If you format them yourself, you can write anything into your XML, but another system might fail to read it.
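The first two pitfalls can be reproduced in a few lines. This is a minimal sketch under assumptions: the variables @author and @text are made up for the example, TRY_CAST needs SQL Server 2012 or later, and the default CONCAT_NULL_YIELDS_NULL setting is assumed.

```sql
DECLARE @author NVARCHAR(50) = NULL;            -- hypothetical missing value
DECLARE @text   NVARCHAR(50) = N'R&D <draft>';  -- contains characters that XML reserves

-- String concatenation: a single NULL operand turns the whole expression into NULL.
SELECT N'<x by="' + @author + N'" />' AS concatenated;

-- The unescaped text is not well-formed XML; TRY_CAST returns NULL instead of raising an error.
SELECT TRY_CAST(N'<x>' + @text + N'</x>' AS XML) AS unescaped_cast;

-- FOR XML escapes the content and simply leaves out the NULL attribute.
SELECT @author AS [@by], @text AS [text()]
FOR XML PATH('x');
```

The last statement is the pattern the answer recommends: the engine handles escaping and NULLs, so the result is always well-formed.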
Try it this way:
DECLARE @changeSet XML = N'<changes/>';
DECLARE @changeStatus XML;
SET @changeStatus=
(
    SELECT NEWID() AS [@id]
          ,'SomeAuthor' AS [@by]
          ,'SomeAuthorId' AS [@byAccountId]
          ,GETUTCDATE() AS [@when]
          ,'Status' AS [property/@id]
          ,'Status' AS [property/@name]
          ,'In Review' AS [property/@old]
          ,'Closed' AS [property/@new]
          ,'' AS [collections]
    FOR XML PATH('change')
);
--your code works fine
SET @changeSet.modify(N'insert sql:variable("@changeStatus") as last into (/changes)[1]');
--you see the first change integrated
SELECT @changeSet;
--Now let us create one more element
SET @changeStatus=
(
    SELECT NEWID() AS [@id]
          ,'Another' AS [@by]
          ,'OneMore' AS [@byAccountId]
          ,GETUTCDATE() AS [@when]
          ,'Status' AS [property/@id]
          ,'Status' AS [property/@name]
          ,'In Review' AS [property/@old]
          ,'Closed' AS [property/@new]
          ,'' AS [collections]
    FOR XML PATH('change')
);
--and insert it
SET @changeSet.modify(N'insert sql:variable("@changeStatus") as last into (/changes)[1]');
--yeah, worked!
SELECT @changeSet;
The query you posted works in general. I ran the following test:
http://rextester.com/RNX55555
However, if any of the injected parameters is NULL, the change will not be inserted and no error will be thrown, because you are then trying to insert NULL, i.e. nothing.
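A short sketch of that behaviour, plus one possible guard. The NULL assignment stands in for a fragment built from a NULL input, and the error message text is a made-up example.

```sql
DECLARE @changeSet XML = N'<changes />';
DECLARE @changeStatus XML = NULL;  -- stands in for a fragment built from a NULL input

-- As described above, this is expected to run without error and insert nothing.
SET @changeSet.modify('insert sql:variable("@changeStatus") as last into (/changes)[1]');
SELECT @changeSet AS after_null_insert;

-- Hypothetical guard: report the problem instead of silently skipping the insert.
IF @changeStatus IS NULL
    RAISERROR('changeStatus is NULL; nothing would be inserted', 16, 1);
```

In real code the NULL check would go before the modify() call, so the problem surfaces where it originates.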

Querying XML within SQL Server with namespace

I'm trying to query some XML data that I was sent. I have followed various examples, but can't even address any of the elements beneath the root. I'm not sure what I'm missing. I've tried simply pulling the xml beneath /KitchenSupply/StoreInfo and nothing. Is the namespace the culprit for me not being able to return anything?
declare @myxmlinv as xml =
'<?xml version="1.0" encoding="utf-8"?>
<KitchenSupply xmlns="http://www.starstandards.org/STAR" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="KitchenSupply.xsd">
<StoreInfo>
<RCPT>
<SubTaskCode>ZZZ</SubTaskCode>
<SubComponentCode>BS1</SubComponentCode>
<StoreNumber>2241</StoreNumber>
<USRegionNumber>SE</USRegionNumber>
</RCPT>
<SentDateTime>02/04/2015</SentDateTime>
<IntId>ABC1234587</IntId>
<Destinations>
<DestinationFormatType>KS_XML_v3</DestinationFormatType>
</Destinations>
</StoreInfo>
<KitchenOrder>
<KsRecord SalesTransactionPayType="CC" SalesTransactionType="Closed">
<ReceiptAmt RcptTotalAmt="0.00" CustRcptTotalAmt="25.22" CCPostDate="01/07/2015" InvoiceDate="01/05/2015" CustName="JOHN SMITH" CustNo="13998" RcptNo="78476221" />
<KsCosts>
<SaleAmts UnitCost="15.00" TxblAmt="15.00" PayType="Cust" />
<SalesInfo JobStatus="F" JobNo="1" ItemCode="HT093" ItemDesc="Hand Towel">
<EmpInfo EmpRate="16.00" EmpCommPct="1.2" EmpName="DOUG ROGERS" EmpNo="998331" />
</SalesInfo>
</KsCosts>
</KsRecord>
<CustomerRecord>
<ContactInfo LastName="SMITH" FreqFlag="Y">
<Address Zip="90210" State="CA" City="BEV" Addr1="123 MAIN ST" Type="Business" />
<Email MailTo="FAKE@USA.COM" />
<Phone Num="1235551212" Type="H" />
<Phone Num="1235551213" Type="B" />
</ContactInfo>
</CustomerRecord>
</KitchenOrder>
<KitchenOrder>
<KsRecord SalesTransactionPayType="CC" SalesTransactionType="Closed">
<ReceiptAmt RcptTotalAmt="0.00" CustRcptTotalAmt="5.71" CCPostDate="01/08/2015" InvoiceDate="01/07/2015" CustName="SARAH BALDWIN" CustNo="14421" RcptNo="78476242" />
<KsCosts>
<SaleAmts UnitCost="2.00" TxblAmt="2.00" PayType="Cust" />
<SalesInfo JobStatus="F" JobNo="1" ItemCode="HS044" ItemDesc="Hand Soap">
<EmpInfo EmpRate="16.00" EmpCommPct="1.2" EmpName="DOUG ROGERS" EmpNo="998331" />
</SalesInfo>
</KsCosts>
</KsRecord>
<CustomerRecord>
<ContactInfo LastName="BALDWIN" FreqFlag="N">
<Address Zip="90210" State="CA" City="BEV" Addr1="123 VINE ST" Type="Home" />
<Email MailTo="FAKESARAH@USA.COM" />
<Phone Num="1235555512" Type="H" />
<Phone Num="1235556613" Type="M" />
</ContactInfo>
</CustomerRecord>
</KitchenOrder>
</KitchenSupply>';
declare @myxmlinv_table as table (
    row_id tinyint,
    inv_xml xml
);
insert into @myxmlinv_table(row_id,inv_xml) values('1',@myxmlinv);
Display the XML document and pull the IntId column: (doesn't work)
select i.row_id, i.inv_xml, i.inv_xml.value('(/KitchenSupply/StoreInfo/IntId)[1]','varchar(255)') as data_description
from @myxmlinv_table i
Attempt to use XMLNAMESPACES to display the XML document at the StoreInfo level: (also doesn't work)
WITH XMLNAMESPACES ('http://www.starstandards.org/STAR' as ks)
SELECT
XmlCol.query('/ks:KitchenSupply/StoreInfo')
FROM T;
Ideally, I'd like to use the nodes to extract all the data out and in to separate tables for querying.
KitchenOrders all within one table, CustomerRecord within another, etc.
Any ideas?
You need to use something like this:
-- define the XML namespace as the "default" so you don't have to
-- prefix each and every XPath element with the XML namespace prefix
WITH XMLNAMESPACES( DEFAULT 'http://www.starstandards.org/STAR')
SELECT
    -- reach into the <KsRecord> subnode under <KitchenOrder>
    ReceiptNo = XC.value('(KsRecord/ReceiptAmt/@RcptNo)[1]', 'int'),
    CustName = XC.value('(KsRecord/ReceiptAmt/@CustName)[1]', 'varchar(25)'),
    -- reach into the <CustomerRecord> subnode under <KitchenOrder>
    CustomerLastName = XC.value('(CustomerRecord/ContactInfo/@LastName)[1]', 'varchar(50)'),
    CustomerEmail = XC.value('(CustomerRecord/ContactInfo/Email/@MailTo)[1]', 'varchar(50)')
FROM
    -- get a "virtual" table of XML fragments, one for each <KitchenOrder> node
    @myxmlinv.nodes('/KitchenSupply/KitchenOrder') AS XT(XC)
Yes, it's the namespace. To get SQL Server to ignore the namespace you can use local-name:
SELECT *
FROM @myxmlinv_table
WHERE inv_xml.exist('//*[local-name()="KitchenSupply"]') = 1
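The attempt in the question declared a prefix but applied it only to the root element. As a sketch of two further options, the query below prefixes every step and, alternatively, uses the *: namespace wildcard. The variable @doc and the column aliases are assumptions; the document is a trimmed copy of the one in the question.

```sql
-- Trimmed sample with the same default namespace as the original document.
DECLARE @doc XML = N'<KitchenSupply xmlns="http://www.starstandards.org/STAR">
  <StoreInfo><IntId>ABC1234587</IntId></StoreInfo>
</KitchenSupply>';

WITH XMLNAMESPACES ('http://www.starstandards.org/STAR' AS ks)
SELECT
    -- Every element inherits the default namespace, so every step needs the prefix.
    @doc.value('(/ks:KitchenSupply/ks:StoreInfo/ks:IntId/text())[1]', 'varchar(255)') AS with_prefix,
    -- The wildcard matches the local name in any namespace.
    @doc.value('(/*:KitchenSupply/*:StoreInfo/*:IntId/text())[1]', 'varchar(255)') AS with_wildcard;
```

Declaring the namespace is the more precise choice; the wildcard is convenient for quick inspection but also matches same-named elements from other namespaces.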

How to parse XML to columns

I want to parse this Xml to get the following result. The name of the table is SchoolRecord
Name Answer
School name 87f6c8bf-cafc-40fb-a082-ca9d5bfaf1e0
Course 2f23e1cb-181e-4af2-a9ec-3dd68530d1d5
Father NULL
Mother NULL
I am using SQL Server 2012. Here's what I have tried but it didn't work
1.
Select
S.userdefinedxml.value('(/ControlGroup/UserDefinedControls/Control/Name)[1]','varchar(max)' ) as Name,
S.userdefinedxml.value('(/ControlGroup/UserDefinedControls/Control/Answer)[1]','varchar(max)' ) as Answer
From SchoolRecord S
2.
Select
S.userdefinedxml.value('(School_Data/ControlGroup/UserDefinedControls/Control/Name)[1]','varchar(max)' ) as Name,
S.userdefinedxml.value('(School_Data/ControlGroup/UserDefinedControls/Control/Answer)[1]','varchar(max)' ) as Answer
From SchoolRecord S
3.
Select
S.userdefinedxml.value('(Data/School_Data/ControlGroup/UserDefinedControls/Control/Name)[1]','varchar(max)' ) as Name,
S.userdefinedxml.value('(Data/School_Data/ControlGroup/UserDefinedControls/Control/Answer)[1]','varchar(max)' ) as Answer
From SchoolRecord S
My Results
Name Answer
NULL NULL
NULL NULL
NULL NULL
NULL NULL
My XML :
<data>
<School_Data>
<ControlGroup>
<UserDefinedControls>
<Control>
<ControlType>FIND</ControlType>
<Name>School name</Name>
<Answer>87f6c8bf-cafc-40fb-a082-ca9d5bfaf1e0</Answer>
</Control>
</UserDefinedControls>
<UserDefinedControls>
<Control>
<ControlType>FIND</ControlType>
<Name>Course</Name>
<Answer>2f23e1cb-181e-4af2-a9ec-3dd68530d1d5</Answer>
</Control>
</UserDefinedControls>
<UserDefinedControls>
<Control>
<ControlType>FIND</ControlType>
<Name>Father</Name>
<Answer />
</Control>
</UserDefinedControls>
<UserDefinedControls>
<Control>
<ControlType>FIND</ControlType>
<Name>Mother</Name>
<Answer />
</Control>
</UserDefinedControls>
</ControlGroup>
</School_Data>
</data>
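As Yuck mentioned, element and attribute names are case-sensitive, so the Data in the third attempt does not match the root element data. The attempts also return only a single value per row because of the [1]; to get one row per control, shred the XML with nodes(). Below is an example that returns the name and answer from each control:
SELECT
    Control.value('(Name/text())[1]','varchar(max)' ) as Name,
    -- text() yields NULL for the empty <Answer /> elements rather than an empty string
    Control.value('(Answer/text())[1]','varchar(max)' ) as Answer
FROM dbo.SchoolRecord S
CROSS APPLY S.userdefinedxml.nodes('/data/School_Data/ControlGroup/UserDefinedControls/Control') AS UserDefinedControls(Control);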
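Because the desired result shows NULL for Father and Mother, it is worth seeing how value() treats an empty element. A minimal sketch, with @c as a made-up variable holding one control from the question:

```sql
DECLARE @c XML = N'<Control><Name>Father</Name><Answer /></Control>';

SELECT
    -- Addressing the element itself returns its string value, which is empty here.
    @c.value('(/Control/Answer)[1]', 'varchar(max)')        AS element_value,
    -- Addressing the text node returns NULL, because <Answer /> has no text node.
    @c.value('(/Control/Answer/text())[1]', 'varchar(max)') AS text_node_value;
```

Addressing text() is therefore the variant to use when an empty answer should come back as NULL.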

Resources