Reading xml data in SQL - sql-server

I am reading data from XML in SQL Server.
Here is my XML:
Declare @MainXml XML =
'<?xml version="1.0" encoding="utf-8"?>
<result>
<details>
<admin>
<code>555</code>
</admin>
<claimhistory>
<claim id="1" number="100">
<account>Closed</account>
</claim>
<claim id="2" number="200">
<account>Closed</account>
</claim>
</claimhistory>
</details>
</result>'
Reading data like this:
select
C.X.value('(admin/code)[1]', 'varchar(max)') as Code,
A.X.value('@id', 'varchar(max)') as Id,
A.X.value('@number', 'varchar(max)') as No,
A.X.value('(account)[1]', 'varchar(max)') as Status
from
@MainXml.nodes('result/details') as C(X)
cross apply
C.X.nodes('claimhistory/claim') as A(X)
This is returning:
Code Id No Status
---------------------
555 1 100 Closed
555 2 200 Closed
The stored procedure contains the code above.
A table-valued variable holding an id and a name is passed to the stored procedure as input:
Declare @dtValue As [dbo].[DataTableDetails]
Insert Into @dtValue(Requested_Id, Name) Values(1, 'Tim');
Insert Into @dtValue(Requested_Id, Name) Values(2, 'Joe');
I want to add these names to the select query by matching the id in the XML against the input table.
Expected output -
Code Id No Status Name
----------------------------
555 1 100 Closed Tim
555 2 200 Closed Joe
Currently, after inserting the selected records from the XML, I run an update query. But the table contains over a million records, so this is now hurting performance.
Any suggestions would be appreciated.
Edited:
I tried a join (added to the end of the select query):
Select
C.X.value('(admin/code)[1]', 'varchar(max)') as Code,
A.X.value('@id', 'varchar(max)') as Id,
A.X.value('@number', 'varchar(max)') as No,
A.X.value('(account)[1]', 'varchar(max)') as Status,
CA.Name
from
@MainXml.nodes('result/details') as C(X)
cross apply
C.X.nodes('claimhistory/claim') as A(X)
join
@dtValue CA on CA.Requested_Id = A.X.value('@id', 'varchar(max)')

I'd recommend refactoring the way you're selecting from the XML like so:
select
C.X.value('(../../admin/code)[1]', 'varchar(max)') as Code,
C.X.value('@id', 'varchar(max)') as Id,
C.X.value('@number', 'varchar(max)') as No,
C.X.value('(account)[1]', 'varchar(max)') as Status,
dt.Name
from
@MainXml.nodes('result/details/claimhistory/claim') as C(X)
INNER JOIN @dtValue dt
ON dt.Requested_Id = C.X.value('(@id)[1]', 'int')
You don't actually want to CROSS APPLY the child nodes, you want them to be the primary part you're selecting from (i.e. one row per claim element) - it's then easy enough to select based on the grandparent node to get the Code value, and then you can properly INNER JOIN your table variable.
Full sample:
Declare @MainXml XML =
'<?xml version="1.0" encoding="utf-8"?>
<result>
<details>
<admin>
<code>555</code>
</admin>
<claimhistory>
<claim id="1" number="100">
<account>Closed</account>
</claim>
<claim id="2" number="200">
<account>Closed</account>
</claim>
</claimhistory>
</details>
</result>'
DECLARE @dtValue TABLE (Requested_Id int, Name varchar(10))
Insert Into @dtValue(Requested_Id, Name) Values(1, 'Tim'), (2, 'Joe');
select
C.X.value('(../../admin/code)[1]', 'varchar(max)') as Code,
C.X.value('@id', 'varchar(max)') as Id,
C.X.value('@number', 'varchar(max)') as No,
C.X.value('(account)[1]', 'varchar(max)') as Status,
dt.Name
from
@MainXml.nodes('result/details/claimhistory/claim') as C(X)
INNER JOIN @dtValue dt
ON dt.Requested_Id = C.X.value('(@id)[1]', 'int')
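As a variation on the query above: if the XML is known to hold a single details element, the non-repeating code can be read once into a variable, so the query does not have to walk back up the tree (../../) for every claim row. This is only a sketch under that single-details assumption. The @Code variable is a name introduced here for illustration, and the XML declaration is left out to keep the literal short.

```sql
DECLARE @MainXml XML =
'<result>
  <details>
    <admin><code>555</code></admin>
    <claimhistory>
      <claim id="1" number="100"><account>Closed</account></claim>
      <claim id="2" number="200"><account>Closed</account></claim>
    </claimhistory>
  </details>
</result>';

DECLARE @dtValue TABLE (Requested_Id int, Name varchar(10));
INSERT INTO @dtValue (Requested_Id, Name) VALUES (1, 'Tim'), (2, 'Joe');

-- Assumption: exactly one <details> element, so the code is read once up front.
-- @Code is a hypothetical helper variable, not part of the original query.
DECLARE @Code varchar(20) =
    @MainXml.value('(/result/details/admin/code/text())[1]', 'varchar(20)');

SELECT
    @Code                                           AS Code,
    C.X.value('@id', 'int')                         AS Id,
    C.X.value('@number', 'int')                     AS [No],
    C.X.value('(account/text())[1]', 'varchar(20)') AS Status,
    dt.Name
FROM @MainXml.nodes('/result/details/claimhistory/claim') AS C(X)
INNER JOIN @dtValue AS dt
    ON dt.Requested_Id = C.X.value('@id', 'int');
```

If the document can contain several details elements, each with its own code, this shortcut does not apply and the parent-axis version is the one to use.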

Related

Split XML field into multiple delimited values - SQL

I have some XML content in a single field; I want to split each xml field in multiple rows.
The XML is something like that:
<env>
<id>id1<\id>
<DailyProperties>
<date>01/01/2022<\date>
<value>1<\value>
<\DailyProperties>
<DailyProperties>
<date>05/05/2022<\date>
<value>2<\value>
<\DailyProperties>
<\env>
I want to put everything in a table as:
ID DATE VALUE
id1 01/01/2022 1
id1 05/05/2022 2
For now I managed to parse the xml value, and I have found something online to get a string into multiple rows (like this), but my string should have some kind of delimiter. I did this:
SELECT
ID,
XMLDATA.X.query('/env/DailyProperties/date').value('.', 'varchar(100)') as r_date,
XMLDATA.X.query('/env/DailyProperties/value').value('.', 'varchar(100)') as r_value
from tableX
outer apply xmlData.nodes('.') as XMLDATA(X)
WHERE ID = 'id1'
but I get all values without a delimiter, as such:
01/10/202202/10/202203/10/202204/10/202205/10/202206/10/202207/10/202208/10/202209/10/202210/10/2022
Or, as in my example:
ID R_DATE R_VALUE
id01 01/01/202205/05/2022 12
I have found that XQuery has a last() function that returns the last value parsed; in my XML example it would return only 05/05/2022, so there should be something similar for adding a delimiter. The number of rows can vary, since the number of days for which I have a value varies.
Please try the following solution.
I had to fix your XML to make it well-formed.
SQL
DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, xmldata XML);
INSERT INTO @tbl (xmldata) VALUES
(N'<env>
<id>id1</id>
<DailyProperties>
<date>01/01/2022</date>
<value>1</value>
</DailyProperties>
<DailyProperties>
<date>05/05/2022</date>
<value>2</value>
</DailyProperties>
</env>');
SELECT p.value('(id/text())[1]','VARCHAR(20)') AS id
, c.value('(date/text())[1]','VARCHAR(10)') AS [date]
, c.value('(value/text())[1]','INT') AS [value]
FROM @tbl
CROSS APPLY xmldata.nodes('/env') AS t1(p)
OUTER APPLY t1.p.nodes('DailyProperties') AS t2(c);
Output
id  date       value
--- ---------- -----
id1 01/01/2022 1
id1 05/05/2022 2
Yitzhak beat me to it by 2 min. Nonetheless, here's what I have:
--==== XML Data:
DECLARE @xml XML =
'<env>
<id>id1</id>
<DailyProperties>
<date>01/01/2022</date>
<value>1</value>
</DailyProperties>
<DailyProperties>
<date>05/05/2022</date>
<value>2</value>
</DailyProperties>
</env>';
--==== Solution:
SELECT
ID = ff2.xx.value('(text())[1]','varchar(20)'),
[Date] = ff.xx.value('(date/text())[1]', 'date'),
[Value] = ff.xx.value('(value/text())[1]', 'int')
FROM (VALUES(@xml)) AS f(X)
CROSS APPLY f.X.nodes('env/DailyProperties') AS ff(xx)
CROSS APPLY f.X.nodes('env/id') AS ff2(xx);
Returns:
ID Date Value
-------------------- ---------- -----------
id1 2022-01-01 1
id1 2022-05-05 2

SQL Server XML Processing: Join different Nodes based on ID

I am trying to query XML with SQL. Suppose I have the following XML.
<xml>
<dataSetData>
<text>ABC</text>
</dataSetData>
<generalData>
<id>123</id>
<text>text data</text>
</generalData>
<generalData>
<id>456</id>
<text>text data 2</text>
</generalData>
<specialData>
<id>123</id>
<text>special data text</text>
</specialData>
<specialData>
<id>456</id>
<text>special data text 2</text>
</specialData>
</xml>
I want to write a SELECT query that returns 2 rows as follows:
DataSetData | GeneralDataID | GeneralDataText | SpecialDataTest
ABC | 123 | text data | special data text
ABC | 456 | text data 2 | special data text 2
My current approach is as follows:
SELECT
dataset.nodes.value('(dataSetData/text)[1]', 'nvarchar(500)'),
general.nodes.value('(generalData/text)[1]', 'nvarchar(500)'),
special.nodes.value('(specialData/text)[1]', 'nvarchar(500)'),
FROM @MyXML.nodes('xml') AS dataset(nodes)
OUTER APPLY @MyXML.nodes('xml/generalData') AS general(nodes)
OUTER APPLY @MyXML.nodes('xml/specialData') AS special(nodes)
WHERE
general.nodes.value('(generalData/text/id)[1]', 'nvarchar(500)') = special.nodes.value('(specialData/text/id)[1]', 'nvarchar(500)')
What I do not like here is that I have to use OUTER APPLY twice and that I have to use the WHERE clause to JOIN the correct elements.
My question therefore is: Is it possible to construct the query in a way where I do not have to use the WHERE clause in such a way, because I am pretty sure that this affects performance very negatively if files become larger.
Shouldn't it be possible to JOIN the correct nodes (that is, the corresponding generalData and specialData nodes) with some XPATH statement?
Your XPath expressions are completely off.
Please try the following. It is pretty efficient. You can test its performance with a large XML.
SQL
-- DDL and sample data population, start
DECLARE @xml XML =
N'<xml>
<dataSetData>
<text>ABC</text>
</dataSetData>
<generalData>
<id>123</id>
<text>text data</text>
</generalData>
<generalData>
<id>456</id>
<text>text data 2</text>
</generalData>
<specialData>
<id>123</id>
<text>special data text</text>
</specialData>
<specialData>
<id>456</id>
<text>special data text 2</text>
</specialData>
</xml>';
-- DDL and sample data population, end
SELECT c.value('(dataSetData/text/text())[1]', 'VARCHAR(20)') AS DataSetData
, g.value('(id/text())[1]', 'INT') AS GeneralDataID
, g.value('(text/text())[1]', 'VARCHAR(30)') AS GeneralDataText
, sp.value('(id/text())[1]', 'INT') AS SpecialDataID
, sp.value('(text/text())[1]', 'VARCHAR(30)') AS SpecialDataTest
FROM @xml.nodes('/xml') AS t(c)
OUTER APPLY c.nodes('generalData') AS general(g)
OUTER APPLY c.nodes('specialData') AS special(sp)
WHERE g.value('(id/text())[1]', 'INT') = sp.value('(id/text())[1]', 'INT');
Output
+-------------+---------------+-----------------+---------------+---------------------+
| DataSetData | GeneralDataID | GeneralDataText | SpecialDataID | SpecialDataTest |
+-------------+---------------+-----------------+---------------+---------------------+
| ABC | 123 | text data | 123 | special data text |
| ABC | 456 | text data 2 | 456 | special data text 2 |
+-------------+---------------+-----------------+---------------+---------------------+
I want to suggest one more solution:
DECLARE @xml XML=
N'<xml>
<dataSetData>
<text>ABC</text>
</dataSetData>
<generalData>
<id>123</id>
<text>text data</text>
</generalData>
<generalData>
<id>456</id>
<text>text data 2</text>
</generalData>
<specialData>
<id>123</id>
<text>special data text</text>
</specialData>
<specialData>
<id>456</id>
<text>special data text 2</text>
</specialData>
</xml>';
--The query
SELECT @xml.value('(/xml/dataSetData/text/text())[1]','varchar(100)')
,B.*
,@xml.value('(/xml/specialData[(id/text())[1] cast as xs:int? = sql:column("B.General_Id")]/text/text())[1]','varchar(100)') AS Special_Text
FROM @xml.nodes('/xml/generalData') A(gd)
CROSS APPLY(SELECT A.gd.value('(id/text())[1]','int') AS General_Id
,A.gd.value('(text/text())[1]','varchar(100)') AS General_Text) B;
The idea in short:
We can read the <dataSetData>, as it is not repeating, directly from the variable.
We can use .nodes() to get a derived set of all <generalData> entries.
Now the magic trick: I use APPLY to get the values from the XML as regular columns into the result set.
This trick allows now to use sql:column() in order to build a XQuery predicate to find the corresponding <specialData>.
One more approach with FLWOR
You might try this:
SELECT @xml.query
('
<xml>
{
for $i in distinct-values(/xml/generalData/id/text())
return
<combined dsd="{/xml/dataSetData/text/text()}"
id="{$i}"
gd="{/xml/generalData[id=$i]/text/text()}"
sd="{/xml/specialData[id=$i]/text/text()}"/>
}
</xml>
');
The result
<xml>
<combined dsd="ABC" id="123" gd="text data" sd="special data text" />
<combined dsd="ABC" id="456" gd="text data 2" sd="special data text 2" />
</xml>
The idea in short:
With the help of distinct-values() we get a list of all id values in your XML
we can iterate this and pick the corresponding values
We return the result as a re-structured XML
Now you can use .nodes('/xml/combined') against this new XML and retrieve all values easily.
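The two steps described above (re-structure with FLWOR, then shred with .nodes()) can be sketched as one small batch. The @combined variable and the output column names are introduced here for illustration only; the FLWOR body is the same as in the query above.

```sql
DECLARE @xml XML =
N'<xml>
  <dataSetData><text>ABC</text></dataSetData>
  <generalData><id>123</id><text>text data</text></generalData>
  <generalData><id>456</id><text>text data 2</text></generalData>
  <specialData><id>123</id><text>special data text</text></specialData>
  <specialData><id>456</id><text>special data text 2</text></specialData>
</xml>';

-- Step 1: build the re-structured XML (one <combined> element per id).
-- @combined is a hypothetical variable name used only in this sketch.
DECLARE @combined XML = @xml.query('
<xml>
{
  for $i in distinct-values(/xml/generalData/id/text())
  return
    <combined dsd="{/xml/dataSetData/text/text()}"
              id="{$i}"
              gd="{/xml/generalData[id=$i]/text/text()}"
              sd="{/xml/specialData[id=$i]/text/text()}"/>
}
</xml>');

-- Step 2: shred the attributes of each <combined> element into columns.
SELECT c.value('@dsd', 'varchar(100)') AS DataSetData
      ,c.value('@id',  'int')          AS GeneralDataID
      ,c.value('@gd',  'varchar(100)') AS GeneralDataText
      ,c.value('@sd',  'varchar(100)') AS SpecialDataText
FROM @combined.nodes('/xml/combined') AS t(c);
```

Reading attributes off a flat element list keeps the final SELECT simple, at the cost of building an intermediate XML value first.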
Performance test
I just want to add a performance test:
CREATE TABLE dbo.TestXml(TheXml XML);
INSERT INTO dbo.TestXml VALUES
(
(
SELECT 'blah1' AS [dataSetData/text]
,(SELECT o.[object_id] AS [id]
,o.[name] AS [text]
FROM sys.objects o
FOR XML PATH('generalData'),TYPE)
,(SELECT o.[object_id] AS [id]
,o.create_date AS [text]
FROM sys.objects o
FOR XML PATH('specialData'),TYPE)
FOR XML PATH('xml'),TYPE
)
)
,(
(
SELECT 'blah2' AS [dataSetData/text]
,(SELECT o.[object_id] AS [id]
,o.[name] AS [text]
FROM sys.objects o
FOR XML PATH('generalData'),TYPE)
,(SELECT o.[object_id] AS [id]
,o.create_date AS [text]
FROM sys.objects o
FOR XML PATH('specialData'),TYPE)
FOR XML PATH('xml'),TYPE
)
)
,(
(
SELECT 'blah3' AS [dataSetData/text]
,(SELECT o.[object_id] AS [id]
,o.[name] AS [text]
FROM sys.objects o
FOR XML PATH('generalData'),TYPE)
,(SELECT o.[object_id] AS [id]
,o.create_date AS [text]
FROM sys.objects o
FOR XML PATH('specialData'),TYPE)
FOR XML PATH('xml'),TYPE
)
);
GO
--just a dummy call to avoid *first call bias*
SELECT x.query('.') FROM dbo.TestXml
CROSS APPLY TheXml.nodes('/xml//*') A(x)
GO
DECLARE @t DATETIME2=SYSUTCDATETIME();
--My first approach
SELECT TheXml.value('(/xml/dataSetData/text/text())[1]','varchar(100)') AS DataSetValue
,B.*
,TheXml.value('(/xml/specialData[(id/text())[1] cast as xs:int? = sql:column("B.General_Id")]/text/text())[1]','varchar(100)') AS Special_Text
INTO dbo.testResult1
FROM dbo.TestXml
CROSS APPLY TheXml.nodes('/xml/generalData') A(gd)
CROSS APPLY(SELECT A.gd.value('(id/text())[1]','int') AS General_Id
,A.gd.value('(text/text())[1]','varchar(100)') AS General_Text) B;
SELECT DATEDIFF(MILLISECOND,@t,SYSUTCDATETIME());
GO
DECLARE @t DATETIME2=SYSUTCDATETIME();
--My second approach
SELECT B.c.value('@dsd','varchar(100)') AS dsd
,B.c.value('@id','int') AS id
,B.c.value('@gd','varchar(100)') AS gd
,B.c.value('@sd','varchar(100)') AS sd
INTO dbo.TestResult2
FROM dbo.TestXml
CROSS APPLY (SELECT TheXml.query
('
<xml>
{
for $i in distinct-values(/xml/generalData/id/text())
return
<combined dsd="{/xml/dataSetData/text/text()}"
id="{$i}"
gd="{/xml/generalData[id=$i]/text/text()}"
sd="{/xml/specialData[id=$i]/text/text()}"/>
}
</xml>
') AS ResultXml) A
CROSS APPLY A.ResultXml.nodes('/xml/combined') B(c)
SELECT DATEDIFF(MILLISECOND,@t,SYSUTCDATETIME());
GO
DECLARE @t DATETIME2=SYSUTCDATETIME();
--Yitzhak's approach
SELECT c.value('(dataSetData/text/text())[1]', 'VARCHAR(20)') AS DataSetData
, g.value('(id/text())[1]', 'INT') AS GeneralDataID
, g.value('(text/text())[1]', 'VARCHAR(30)') AS GeneralDataText
, sp.value('(id/text())[1]', 'INT') AS SpecialDataID
, sp.value('(text/text())[1]', 'VARCHAR(30)') AS SpecialDataTest
INTO dbo.TestResult3
FROM dbo.TestXml
CROSS APPLY TheXml.nodes('/xml') AS t(c)
OUTER APPLY c.nodes('generalData') AS general(g)
OUTER APPLY c.nodes('specialData') AS special(sp)
WHERE g.value('(id/text())[1]', 'INT') = sp.value('(id/text())[1]', 'INT');
SELECT DATEDIFF(MILLISECOND,@t,SYSUTCDATETIME());
GO
SELECT * FROM TestResult1;
SELECT * FROM TestResult2;
SELECT * FROM TestResult3;
GO
--careful with real data!
DROP TABLE testResult1
DROP TABLE testResult2
DROP TABLE testResult3
DROP TABLE dbo.TestXml;
The result is clearly pointing against XQuery. (Someone might say so sad! now :-) ).
The predicate approach is by far the slowest (4700ms). The FLWOR approach is on rank 2 (1200ms) and the winner is - tatatataaaaa - Yitzhak's approach (400ms, by factor ~10!).
Which solution is best for you will depend on the actual data (count of elements per XML, count of XMLs and so on). But visual elegance is, regrettably, not the only parameter for this choice :-)
Sorry to add this as another answer, but I don't want to add to the other answer. It's big enough already :-)
A combination of Yitzhak's and mine is even faster:
--This is the additional code to be placed into the performance comparison
DECLARE @t DATETIME2=SYSUTCDATETIME();
SELECT TheXml.value('(/xml/dataSetData/text/text())[1]', 'VARCHAR(20)') AS DataSetData
,B.*
, sp.value('(id/text())[1]', 'INT') AS SpecialDataID
, sp.value('(text/text())[1]', 'VARCHAR(30)') AS SpecialDataTest
INTO dbo.TestResult4
FROM dbo.TestXml
CROSS APPLY TheXml.nodes('/xml/generalData') AS A(g)
CROSS APPLY(SELECT g.value('(id/text())[1]', 'INT') AS GeneralDataID
, g.value('(text/text())[1]', 'VARCHAR(30)') AS GeneralDataText) B
OUTER APPLY TheXml.nodes('/xml/specialData[id=sql:column("B.GeneralDataID")]') AS special(sp);
SELECT DATEDIFF(MILLISECOND,@t,SYSUTCDATETIME());
The idea in short:
We read the <dataSetData> directly (no repetition)
We use APPLY .nodes() to get all <generalData>
We use APPLY SELECT to fetch the values of <generalData> elements as real columns.
We use another APPLY .nodes() to fetch the corresponding <specialData> elements
One advantage of this solution: If there might be more than one special-data entry per general-data element, this would work too.
This is now the fastest in my test (~300ms).
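To illustrate the point about several special-data entries per general-data element, here is a reduced sketch of the same pattern against a small variable instead of the test table. The sample values ("special A", "special B") are made up for this illustration. The intent is that id 123 is returned once per matching specialData element, while id 456, which has no match, is still kept by the OUTER APPLY.

```sql
-- Hypothetical sample: id 123 has two <specialData> entries, id 456 has none.
DECLARE @xml XML =
N'<xml>
  <generalData><id>123</id><text>text data</text></generalData>
  <generalData><id>456</id><text>text data 2</text></generalData>
  <specialData><id>123</id><text>special A</text></specialData>
  <specialData><id>123</id><text>special B</text></specialData>
</xml>';

SELECT B.*
      ,sp.value('(text/text())[1]', 'varchar(30)') AS SpecialDataText
FROM @xml.nodes('/xml/generalData') AS A(g)
-- Materialise the <generalData> values as real columns first ...
CROSS APPLY (SELECT A.g.value('(id/text())[1]',   'int')         AS GeneralDataID
                   ,A.g.value('(text/text())[1]', 'varchar(30)') AS GeneralDataText) AS B
-- ... so that sql:column() can use the id inside the XQuery predicate.
OUTER APPLY @xml.nodes('/xml/specialData[id=sql:column("B.GeneralDataID")]') AS special(sp);
```

Switching OUTER APPLY to CROSS APPLY would drop general-data rows that have no special-data counterpart.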

Executing a SELECT statement on a cast(text) column that has XML

I am trying to retrieve all values in the XML that match the values defined in the WHERE clause, but I am only retrieving the first record and not the subsequent records listed in the IN operator. I need to CAST a text column to XML and then retrieve the records, but I am not able to make this work. Any help/direction would be appreciated.
Here is the XML:
<Payment>
<CoverageCd>COLL</CoverageCd>
<LossTypeCd>COLL</LossTypeCd>
<ClaimStatusCd>C</ClaimStatusCd>
<LossPaymentAmt>14596</LossPaymentAmt>
</Payment>
<Payment>
<CoverageCd>LIAB</CoverageCd>
<LossTypeCd>PD</LossTypeCd>
<ClaimStatusCd>C</ClaimStatusCd>
<LossPaymentAmt>3480</LossPaymentAmt>
</Payment>
Here is my SQL code:
SELECT
ad.AplusDataSysID,
CAST(ad.xmlAplus AS XML).value('(/ISO/PassportSvcRs/Reports/Report/ReportData/ISO/PassportSvcRs/PassportInqRs/Match/Claim/Payment/LossTypeCd)[1]','varchar(max)') AS LossTypeCode
FROM
[dbo].[AUT_Policy] p
INNER JOIN
[dbo].[IP_Policy] ip ON p.PolicySysID = ip.Aut_PolicyID
INNER JOIN
[dbo].[AUT_AplusData] ad ON ip.PolicySysID = ad.PolicySysID
WHERE
CAST(ad.xmlAplus AS XML).value('(/ISO/PassportSvcRs/Reports/Report/ReportData/ISO/PassportSvcRs/PassportInqRs/Match/Claim/Payment/LossTypeCd)[1]', 'VARCHAR(MAX)') IN ('BI','PD','COLL','COMP','PIP','UM','MEDPY','TOWL','RENT','OTHR');
Here is my SQL result:
Here is what the SQL result should look like:
It would appear that the XML nodes method is what you need.
-- Sample data
DECLARE @AUT_AplusData TABLE (AplusDataSysID INT, xmlAplus TEXT);
INSERT @AUT_AplusData VALUES (1,
'<Payment>
<CoverageCd>COLL</CoverageCd>
<LossTypeCd>COLL</LossTypeCd>
<ClaimStatusCd>C</ClaimStatusCd>
<LossPaymentAmt>14596</LossPaymentAmt>
</Payment>
<Payment>
<CoverageCd>LIAB</CoverageCd>
<LossTypeCd>PD</LossTypeCd>
<ClaimStatusCd>C</ClaimStatusCd>
<LossPaymentAmt>3480</LossPaymentAmt>
</Payment>');
-- Solution
SELECT
AplusDataSysID = ad.AplusDataSysID,
LossTypeCd = pay.loss.value('(LossTypeCd/text())[1]', 'varchar(8000)')
FROM @AUT_AplusData AS ad
CROSS APPLY (VALUES(CAST(ad.xmlAplus AS XML))) AS x(xmlAplus)
CROSS APPLY x.xmlAplus.nodes('/Payment') AS pay(loss);
Returns:
AplusDataSysID LossTypeCd
---------------- ---------------
1 COLL
1 PD
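The question also filters on a list of loss type codes. A minimal sketch of how that filter might be combined with the nodes() approach above follows; it assumes the simplified /Payment path from the sample data rather than the full path in the question, and the third payment with code GLASS is a made-up row added only to show something being filtered out.

```sql
-- Sample data (the GLASS payment is hypothetical, added for this sketch).
DECLARE @AUT_AplusData TABLE (AplusDataSysID INT, xmlAplus TEXT);
INSERT @AUT_AplusData VALUES (1,
'<Payment><CoverageCd>COLL</CoverageCd><LossTypeCd>COLL</LossTypeCd></Payment>
<Payment><CoverageCd>LIAB</CoverageCd><LossTypeCd>PD</LossTypeCd></Payment>
<Payment><CoverageCd>COMP</CoverageCd><LossTypeCd>GLASS</LossTypeCd></Payment>');

SELECT
  ad.AplusDataSysID,
  v.LossTypeCd
FROM @AUT_AplusData AS ad
-- Cast the text column to XML once per row.
CROSS APPLY (VALUES(CAST(ad.xmlAplus AS XML))) AS x(xmlAplus)
-- One row per <Payment> element.
CROSS APPLY x.xmlAplus.nodes('/Payment') AS pay(loss)
-- Extract the code once so SELECT and WHERE can share it.
CROSS APPLY (VALUES(pay.loss.value('(LossTypeCd/text())[1]', 'varchar(10)'))) AS v(LossTypeCd)
WHERE v.LossTypeCd IN ('BI','PD','COLL','COMP','PIP','UM','MEDPY','TOWL','RENT','OTHR');
```

Because the filter is applied per Payment row rather than to the first Payment of each document, every matching payment is returned, not just the first one.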

Converting XML data to SQL table

I have the following XML data with me. I need to convert this to SQL table.
<SalesDetails>
<Customer Name="Johny" DateofBirth="1990-01-02T00:00:00">
<OrderInfo>
<OrderDate>1993-02-03T00:00:00</OrderDate>
<OrderAmount>1000</OrderAmount>
</OrderInfo>
</Customer>
</SalesDetails>
Can anyone help me with a SQL query that gives the above XML file as output?
In my initial attempt, I created two tables, #T1 and #T2, and inserted different values into them. I then queried them as:
SELECT
(SELECT * FROM #T1 FOR XML RAW('Sales') , TYPE),
(SELECT * FROM #T2 FOR XML PATH('OrderInfo') , TYPE)
FOR XML PATH('') , ROOT('SalesDetails')
But I need the output in the first XML format based on SQL tables and corresponding joins. That is, when the name of a customer is displayed, his corresponding order information needs to be displayed. I do not want it in a grouped format.
Sorry, in my first attempt I completely misread your question and thought you wanted to get the data out of your XML. This is an approach to create such an XML from table data:
DECLARE @cust TABLE(ID INT, CustomerName VARCHAR(100),DateOfBirth DATE);
INSERT INTO @cust VALUES(1,'Jonny','1990-01-02T00:00:00')
,(2,'Jimmy','1980-01-02T00:00:00');
DECLARE @ord TABLE(ID INT,CustomerID INT,OrderDate DATE, OrderAmount INT);
INSERT INTO @ord VALUES(1,1,'1993-02-03T00:00:00',1000)
,(2,1,'1994-02-03T00:00:00',500)
,(3,2,'1994-02-03T00:00:00',200);
SELECT c.CustomerName AS [@Name]
,c.DateOfBirth AS [@DateofBirth]
,(
SELECT o.OrderDate
,o.OrderAmount
FROM @ord AS o
WHERE o.CustomerID=c.ID
FOR XML PATH('OrderInfo'),TYPE
)
FROM @cust AS c
FOR XML PATH('Customer'),ROOT('SalesDetails')
And this is the created XML
<SalesDetails>
<Customer Name="Jonny" DateofBirth="1990-01-02">
<OrderInfo>
<OrderDate>1993-02-03</OrderDate>
<OrderAmount>1000</OrderAmount>
</OrderInfo>
<OrderInfo>
<OrderDate>1994-02-03</OrderDate>
<OrderAmount>500</OrderAmount>
</OrderInfo>
</Customer>
<Customer Name="Jimmy" DateofBirth="1980-01-02">
<OrderInfo>
<OrderDate>1994-02-03</OrderDate>
<OrderAmount>200</OrderAmount>
</OrderInfo>
</Customer>
</SalesDetails>
In case you want to read your XML, I have left this appended.
You can retrieve all the information like this:
The generated index columns are IDs you can use to insert this into relational tables. The problem with your XML is that the information about your target tables is missing. But the rest should be easy for you.
Btw: I declared some more similar nodes to make the relational structure visible.
DECLARE @x XML=
'<SalesDetails>
<Customer Name="Johny" DateofBirth="1990-01-02T00:00:00">
<OrderInfo>
<OrderDate>1993-02-03T00:00:00</OrderDate>
<OrderAmount>1000</OrderAmount>
</OrderInfo>
<OrderInfo>
<OrderDate>1994-02-03T00:00:00</OrderDate>
<OrderAmount>500</OrderAmount>
</OrderInfo>
</Customer>
<Customer Name="Jimmy" DateofBirth="1980-01-02T00:00:00">
<OrderInfo>
<OrderDate>1994-02-03T00:00:00</OrderDate>
<OrderAmount>200</OrderAmount>
</OrderInfo>
<OrderInfo>
<OrderDate>1993-02-03T00:00:00</OrderDate>
<OrderAmount>100</OrderAmount>
</OrderInfo>
</Customer>
</SalesDetails>';
WITH CustomerNodes AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS CustomerIndex
,Customer.value('@Name','varchar(max)') AS CustomerName
,Customer.value('@DateofBirth','date') AS CustomerDateOfBirth
,One.Customer.query('.') AS CustomerNode
FROM @x.nodes('SalesDetails/Customer') AS One(Customer)
)
SELECT cn.*
,ROW_NUMBER() OVER(PARTITION BY cn.CustomerIndex ORDER BY (SELECT NULL)) AS OrderIndex
,OrderInfo.value('OrderDate[1]','date') AS OrderDate
,OrderInfo.value('OrderAmount[1]','int') AS OrderAmount
FROM CustomerNodes AS cn
CROSS APPLY cn.CustomerNode.nodes('Customer/OrderInfo') As The(OrderInfo)
The result:
Customer Order
ID Name DateOfBirth ID OrderDate OrderAmount
1 Johny 1990-01-02 1 1993-02-03 1000
1 Johny 1990-01-02 2 1994-02-03 500
2 Jimmy 1980-01-02 1 1994-02-03 200
2 Jimmy 1980-01-02 2 1993-02-03 100

Parsing XML content with absent elements in SQL Server 2012

I need to parse XML below into table of transactions for the customers for each day. The XML file is coming from external service which is not controlled by me.
The problem is when a customer doesn't have a transaction for the day, I am not able to view it in my table. How do I get to see that customer had zero transactions?
declare @xml xml =
'<root>
<customers>
<customer id="777">
<orders>
<order currency="USD" id="888" date="2014-06-18">
<transactions>
<transaction id="998">
<date>2014-08-01</date>
<itemid>10001</itemid>
<amount>745.96</amount>
</transaction>
</transactions>
</order>
</orders>
</customer>
<customer id="778">
<orders>
<order id="999" />
</orders>
</customer>
</customers>
</root>'
My transformation query is like this:
select
newid() ID,
ltrim(rtrim(B.C.value('@id', 'nvarchar(50)'))) CUSTOMER_ID,
ltrim(rtrim(K.C.value('@id', 'nvarchar(450)'))) ACCOUNT_ID,
ltrim(rtrim(K.C.value('@date', 'datetime'))) DATE_PLACED,
ltrim(rtrim(K.C.value('@currency', 'nvarchar(50)'))) CURRENCY,
ltrim(rtrim(T.C.value('@id', 'nvarchar(50)'))) TRANSACTION_ID,
ltrim(rtrim(T.C.value('date[1]', 'datetime'))) TRANSACTION_DATE,
ltrim(rtrim(T.C.value('itemid[1]', 'nvarchar(50)'))) TRANSACTION_ITEMID,
ltrim(rtrim(T.C.value('amount[1]', 'money'))) TRANSACTION_BANK_CODE
from
@xml.nodes('/root/customers/customer') as B(C)
outer apply B.C.nodes('/root/customers/customer/orders/order') as K(C)
outer apply K.C.nodes('/root/customers/customer/orders/order/transactions/transaction') as T(C)
where
ltrim(rtrim(b.c.value('@id', 'nvarchar(50)'))) = ltrim(rtrim(k.c.value('../../@id', 'nvarchar(50)')))
and
(
(
t.c.value('../../@id','nvarchar(50)') is not null
and
ltrim(rtrim(k.c.value('@id','nvarchar(50)'))) = ltrim(rtrim(t.c.value('../../@id','nvarchar(50)')))
)
or (ltrim(rtrim(t.c.value('../../@id','nvarchar(50)'))) is null)
)
Thank you in advance!
You should not do the cross apply against the full xpath from root. Begin with where you are instead and remove the where clause.
select newid() as ID,
B.C.value('@id', 'nvarchar(50)') as CUSTOMER_ID,
K.C.value('@id', 'varchar(50)') as ACCOUNT_ID,
T.C.value('@id', 'nvarchar(50)') as TRANSACTION_ID
from @xml.nodes('/root/customers/customer') as B(C)
outer apply B.C.nodes('orders/order') as K(C)
outer apply K.C.nodes('transactions/transaction') as T(C)
Result
ID CUSTOMER_ID ACCOUNT_ID TRANSACTION_ID
------------------------------------ ----------- ---------- --------------
767FCA17-578A-495E-9EFA-75E3509B2BD2 777 888 998
59965290-EB7C-429B-AA5F-97EED0EB35BD 778 999 NULL
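The same relative-path pattern can be extended to the remaining columns from the question. The following is a sketch rather than a drop-in replacement: it reads dates and the amount as typed values instead of wrapping everything in ltrim(rtrim(...)), and the TRANSACTION_AMOUNT alias is a name chosen here for illustration. Attributes and elements that are absent (as for customer 778) simply come back as NULL.

```sql
DECLARE @xml XML =
'<root>
  <customers>
    <customer id="777">
      <orders>
        <order currency="USD" id="888" date="2014-06-18">
          <transactions>
            <transaction id="998">
              <date>2014-08-01</date>
              <itemid>10001</itemid>
              <amount>745.96</amount>
            </transaction>
          </transactions>
        </order>
      </orders>
    </customer>
    <customer id="778">
      <orders>
        <order id="999" />
      </orders>
    </customer>
  </customers>
</root>';

SELECT newid()                                            AS ID,
       B.C.value('@id', 'nvarchar(50)')                   AS CUSTOMER_ID,
       K.C.value('@id', 'nvarchar(450)')                  AS ACCOUNT_ID,
       K.C.value('@date', 'date')                         AS DATE_PLACED,
       K.C.value('@currency', 'nvarchar(50)')             AS CURRENCY,
       T.C.value('@id', 'nvarchar(50)')                   AS TRANSACTION_ID,
       T.C.value('(date/text())[1]', 'date')              AS TRANSACTION_DATE,
       T.C.value('(itemid/text())[1]', 'nvarchar(50)')    AS TRANSACTION_ITEMID,
       T.C.value('(amount/text())[1]', 'money')           AS TRANSACTION_AMOUNT
FROM @xml.nodes('/root/customers/customer') AS B(C)
-- Relative paths: each APPLY starts from the current node, not from the root.
OUTER APPLY B.C.nodes('orders/order') AS K(C)
OUTER APPLY K.C.nodes('transactions/transaction') AS T(C);
```

Because each APPLY only looks below its own context node, no WHERE clause is needed to re-correlate customers, orders and transactions.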

Resources