Convert 1-to-n XML column to tabular data - sql-server

I have a table on MS SQL server that holds information about reports in XML format. The table consists of two fields: the first has the business key, the second the entire report in XML format.
These reports include several pictures each. The XML holds information about these pictures, such as their filename, taken date, etc. I want to extract this information into a table, where every record holds information about exactly one photo. I've found ways to do this that come very close, but the problem I keep running into is that I need to create several records in this table for every record in my source table. How can I make this work?
The business key needs to be in the final table as well. This business key can be found in the XML data, but there is also a separate field in the source table (as mentioned before) where it can be found. The content of the XML column could look similar to this:
<Report>
<ReportKey>0000001</ReportKey>
[...]
<Photos>
<Photo>
<Filename>1.jpg</Filename>
<Date>01-01-2015</Date>
</Photo>
<Photo>
<Filename>2.jpg</Filename>
<Date>01-01-2016</Date>
</Photo>
[...]
</Photos>
[...]
</Report>
I want the final table to look like this:
+---------+----------+------------+
| Key | Filename | Date |
+---------+----------+------------+
| 0000001 | 1.jpg | 01-01-2015 |
| 0000001 | 2.jpg | 01-01-2016 |
+---------+----------+------------+

This is not an answer, but important enough not to end up in a comment:
Be very careful with date formats. I do not know how your XML is generated, but a date within XML should be ISO 8601 (yyyy-mm-dd or yyyy-mm-ddThh:mm:ss).
Your format is culture-dependent!
Try this:
set language french;
declare @xml as xml ='<x><Date>08-03-2015</Date></x>';
select @xml.value('(/x/Date)[1]','datetime');
set language english;
select @xml.value('(/x/Date)[1]','datetime');
Do you see that the results differ?
Now try to set the date to the 13th of March. There's even a conversion exception!
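If the XML cannot be changed to ISO 8601, one possible workaround is to read the value as text and convert it with an explicit style, so the session language no longer matters. This is only a sketch under the assumption that the source dates are day-first (dd-mm-yyyy); the question's samples do not settle that, and the variable and column alias are made up for illustration:

```sql
DECLARE @xml xml = N'<x><Date>13-03-2015</Date></x>';

SET LANGUAGE english;  -- default date order is mdy, which would misread this value

-- Read the element as plain text, then convert with an explicit style.
-- Style 105 means dd-mm-yyyy and does not depend on the session language.
SELECT CONVERT(date, @xml.value('(/x/Date)[1]', 'varchar(10)'), 105) AS PhotoDate;
```

On SQL Server 2012 or later, TRY_CONVERT with the same arguments returns NULL instead of raising an error for values that do not fit the assumed format.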

According to the comments, the OP needs an approach that reads the XML from table rows, and the existing answer does not cover that.
You might try this:
CREATE TABLE #YourTable(BusinessKey VARCHAR(10),ReportData XML);
INSERT INTO #YourTable VALUES
('0000001','<Report>
<ReportKey>0000001</ReportKey>
<Photos>
<Photo>
<Filename>1.jpg</Filename>
<Date>2015-01-01</Date>
</Photo>
<Photo>
<Filename>2.jpg</Filename>
<Date>2016-05-13</Date>
</Photo>
</Photos>
</Report>')
,('0000002','<Report>
<ReportKey>0000002</ReportKey>
<Photos>
<Photo>
<Filename>3.jpg</Filename>
<Date>2015-04-19</Date>
</Photo>
<Photo>
<Filename>4.jpg</Filename>
<Date>2016-12-10</Date>
</Photo>
</Photos>
</Report>');
SELECT BusinessKey AS Table_Key
,ReportData.value('(/Report/ReportKey)[1]','varchar(10)') AS XML_Key
,Photo.value('Filename[1]','varchar(max)') AS Photo_Filename
,Photo.value('Date[1]','date') AS Photo_Date
FROM #YourTable
CROSS APPLY ReportData.nodes('/Report/Photos/Photo') AS A(Photo);
GO
DROP TABLE #YourTable;

Maybe I misunderstood the question. However, try this.
create table t (
[Key] int,
[Filename] nvarchar(max),
[Date] date
)
declare @xml as xml = '<Report>
<ReportKey>0000001</ReportKey>
<Photos>
<Photo>
<Filename>1.jpg</Filename>
<Date>01-01-2015</Date>
</Photo>
<Photo>
<Filename>2.jpg</Filename>
<Date>01-01-2016</Date>
</Photo>
</Photos>
</Report>'
insert into t ([Key], [Filename], [Date])
select n.value('ReportKey[1]', 'int')
, x.value('Filename[1]', 'nvarchar(max)')
, x.value('Date[1]', 'date')
from @xml.nodes('Report') as r(n)
cross apply r.n.nodes('Photos/Photo') as t(x)
select * from t

Related

Generate XML from a table

I want to generate an XML file from a table, but my XML element names differ from the column names in the table.
How can I do that?
I want to generate something like this:
<Car>
<Brand>Ford</Brand>
<Color>Blue</Color>
</Car>
How can I specify the child nodes in my XML file?
This is covered in the FOR XML documentation, especially in the Using PATH Mode examples.
create table dbo.ThisIsNotTheTableYouAreLookingFor (
NorIsThis varchar(10),
TheColumn varchar(10)
);
insert dbo.ThisIsNotTheTableYouAreLookingFor (NorIsThis, TheColumn)
values ('Ford', 'Blue');
select
NorIsThis as Brand,
TheColumn as Color
from dbo.ThisIsNotTheTableYouAreLookingFor
for xml path('Car');
<Car>
<Brand>Ford</Brand>
<Color>Blue</Color>
</Car>

How to add a new line when a specific text appears in XML data?

I'm passing XML to a stored procedure for insertion. The XML contains some pieces of information, such as the product specification, which is a string.
Here is a sample of what the XML looks like:
<?xml version="1.0"?>
<Details>
<item Unit="PilotApp.DataAccessObject.DTO.Unit"
PSASysCommon=""
ProductModel="PilotApp.DataAccessObject.DTO.ProductModel"
Product="PilotSmithApp.DataAccessObject.DTO.Product"
SpecTag="62793f05-25ab-41b5-a081-f6c542f1f7cd"
Rate="100" UnitCode="1" Qty="1"
ProductSpec="Pilot Cone Blender Model No. Pilot PCB - 10 , volume of vessel -30 Ltr , handling capacity per batch by weight - 10 Kg and by volume - 20 Ltr. with motor - 0.25 HP/3 ph. Crompton make or equivalent , feeding door , discharge butterfly valve and safety guard .Material of construction of contact stainless steel (AISI) 304 and frame in carbon steel . Purpose : For blending dry powder and granules"
ProductModelID="10c0b51b-7799-4597-a4af-7c3fd431353b"
ProductID="15745d53-8219-431e-a0e3-0d319abf132d"
EnquiryID="00f9436c-ed2a-442c-b333-16348b0d8c33"
ID="e6812788-e67e-4874-bf80-87b39579a837"/>
</Details>
The product specification contains a Purpose section. I want to insert it as a new line, or display it on a new line, and I want to do this using T-SQL.
Here is the code that inserts the XML into a temp table:
DECLARE @temp TABLE(
ID UNIQUEIDENTIFIER,
EnquiryID UNIQUEIDENTIFIER,
ProductID UNIQUEIDENTIFIER,
ProductModelID UNIQUEIDENTIFIER,
ProductSpec NVARCHAR(MAX),
Qty DECIMAL(18,2),
Rate DECIMAL(18,2),
UnitCode INT,
SpecTag UNIQUEIDENTIFIER,
IsProcessed bit,
tmpID UNIQUEIDENTIFIER
);
------------parse from xml to temptable ----
INSERT INTO @temp(ID,EnquiryID,ProductID,ProductModelID,
ProductSpec,Qty,Rate,UnitCode,SpecTag,IsProcessed,tmpID)
SELECT T.ID,T.EnquiryID,T.ProductID,T.ProductModelID,replace(replace(replace(replace(T.ProductSpec,'&quot;','"'),'&amp;','&'),'&lt;','<'),'&gt;','>') AS ProductSpec,
T.Qty,T.Rate,T.UnitCode,
-----modified on 14-May-2018 added field SpecTag in EnquiryDetail by Thomson
CASE WHEN T.SpecTag=CAST(CAST(0 AS BINARY) AS UNIQUEIDENTIFIER) THEN NEWID() ELSE T.SpecTag END,
T.IsProcessed,T.tmpID FROM
(SELECT [xmlData].[Col].value('./@ID', 'UNIQUEIDENTIFIER') as ID,
[xmlData].[Col].value('./@EnquiryID', 'UNIQUEIDENTIFIER') as EnquiryID,
[xmlData].[Col].value('./@ProductID', 'UNIQUEIDENTIFIER') as ProductID,
[xmlData].[Col].value('./@ProductModelID', 'UNIQUEIDENTIFIER') as ProductModelID,
[xmlData].[Col].value('./@ProductSpec', 'NVARCHAR(MAX)') as ProductSpec,
[xmlData].[Col].value('./@Qty','DECIMAL(18,2)')as Qty,
[xmlData].[Col].value('./@Rate','DECIMAL(18,2)')as Rate,
[xmlData].[Col].value('./@UnitCode','INT')as UnitCode,
[xmlData].[Col].value('./@SpecTag','UNIQUEIDENTIFIER') AS SpecTag,
0 as IsProcessed,
newid() as tmpID
from @DetailXML.nodes('/Details/item') as [xmlData]([Col])) T
Here are the steps to solve this problem:
1. Get the whole string from the ProductSpec XML attribute as a column named "ProductSpec".
2. Get the substring of ProductSpec that starts at "Purpose" as a new column named "ProductSpecPurpose".
3. Prepend char(10) or char(13), as needed, to the string you extracted, e.g. char(10) + ProductSpecPurpose.
4. Merge the text that precedes "Purpose" in the column from step 1 with the column from step 2.
5. Save it.
PS: I did not write the solution out directly so that you can try different SQL functions and learn more; I believe in learning by ourselves rather than spoon-feeding. Give it a try, and if you are not able to figure it out, leave a comment and I will write the whole SQL answer.
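The steps above could be sketched roughly as follows. This is not the answerer's own code, only a minimal illustration; it assumes the marker text is exactly 'Purpose :' and uses a shortened sample value in a hypothetical @ProductSpec variable instead of the parsed column:

```sql
-- Shortened sample value; in the real procedure this would be the parsed ProductSpec column
DECLARE @ProductSpec nvarchar(max) =
    N'Material of construction of contact stainless steel (AISI) 304 and frame in carbon steel . Purpose : For blending dry powder and granules';

-- Position of the marker; CHARINDEX returns 0 when the marker is absent
DECLARE @pos int = CHARINDEX(N'Purpose :', @ProductSpec);

SELECT CASE
           WHEN @pos > 1
               -- Text before the marker, then CR+LF, then the marker and the rest
               THEN RTRIM(LEFT(@ProductSpec, @pos - 1))
                    + NCHAR(13) + NCHAR(10)
                    + SUBSTRING(@ProductSpec, @pos, LEN(@ProductSpec))
           ELSE @ProductSpec   -- no marker (or marker at the very start): leave unchanged
       END AS ProductSpecWithBreak;
```

Note that the grid view in Management Studio does not display line breaks inside a cell; switching to "Results to Text" or reading the value from the client application shows the break.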
Try this to find how to extract the data nested within your XML:
DECLARE @xml XML=
'<?xml version="1.0"?>
<Details>
<item Unit="PilotApp.DataAccessObject.DTO.Unit"
PSASysCommon=""
ProductModel="PilotApp.DataAccessObject.DTO.ProductModel"
Product="PilotSmithApp.DataAccessObject.DTO.Product"
SpecTag="62793f05-25ab-41b5-a081-f6c542f1f7cd"
Rate="100" UnitCode="1" Qty="1"
ProductSpec="Pilot Cone Blender Model No. Pilot PCB - 10 , volume of vessel -30 Ltr , handling capacity per batch by weight - 10 Kg and by volume - 20 Ltr. with motor - 0.25 HP/3 ph. Crompton make or equivalent , feeding door , discharge butterfly valve and safety guard .Material of construction of contact stainless steel (AISI) 304 and frame in carbon steel . Purpose : For blending dry powder and granules"
ProductModelID="10c0b51b-7799-4597-a4af-7c3fd431353b"
ProductID="15745d53-8219-431e-a0e3-0d319abf132d"
EnquiryID="00f9436c-ed2a-442c-b333-16348b0d8c33"
ID="e6812788-e67e-4874-bf80-87b39579a837"/>
</Details>';
SELECT itm.value('@Unit','nvarchar(max)') AS Unit
,itm.value('@PSASysCommon','nvarchar(max)') AS PSASysCommon
,itm.value('@Product','nvarchar(max)') AS Product
,itm.value('@SpecTag','uniqueidentifier') AS SpecTag
,itm.value('@Rate','int') AS Rate
,itm.value('@UnitCode','int') AS UnitCode
,itm.value('@Qty','int') AS Qty
,itm.value('@ProductSpec','nvarchar(max)') AS ProductSpec
,itm.value('@ProductModelID','uniqueidentifier') AS ProductModelID
,itm.value('@ProductID','uniqueidentifier') AS ProductID
,itm.value('@ID','uniqueidentifier') AS ID
FROM @xml.nodes('/Details/item') A(itm);
My approach assumes that there might be several <item> elements within <Details>.
Just some explanation: the <item> element is a self-closing element with all data placed in attributes. This form is very easy to query. Good for you...
Btw: it would be best to avoid the <?xml blah?> declaration altogether. Within SQL Server this declaration is useless and can interfere with encodings...
UPDATE
An enhanced query to parse the spec in lines and extract the Purpose:
SELECT itm.value('@Unit','nvarchar(max)') AS Unit
,itm.value('@PSASysCommon','nvarchar(max)') AS PSASysCommon
,itm.value('@Product','nvarchar(max)') AS Product
,itm.value('@SpecTag','uniqueidentifier') AS SpecTag
,itm.value('@Rate','int') AS Rate
,itm.value('@UnitCode','int') AS UnitCode
,itm.value('@Qty','int') AS Qty
,itm.value('@ProductSpec','nvarchar(max)') AS ProductSpec
,itm.value('@ProductModelID','uniqueidentifier') AS ProductModelID
,itm.value('@ProductID','uniqueidentifier') AS ProductID
,itm.value('@ID','uniqueidentifier') AS ID
,LTRIM(RTRIM(ProductSpecLine.value('text()[1]','nvarchar(max)'))) AS ProductSpecLine_Text
,Purpose
FROM @xml.nodes('/Details/item') A(itm)
OUTER APPLY(SELECT CAST('<x>' + REPLACE((SELECT itm.value('@ProductSpec','nvarchar(max)') AS [*] FOR XML PATH('')),',','</x><x>') + '</x>' AS XML)) B(x)
OUTER APPLY B.x.nodes('/x') C(ProductSpecLine)
OUTER APPLY (SELECT CASE WHEN CHARINDEX('Purpose : ',ProductSpecLine.value('text()[1]','nvarchar(max)'))>0
THEN SUBSTRING(ProductSpecLine.value('text()[1]','nvarchar(max)'),CHARINDEX('Purpose : ',ProductSpecLine.value('text()[1]','nvarchar(max)')),1000)
END) D(Purpose);

Find if a list contains several values - SQL Server Xquery

I am storing XML data in a table called BikeTable. The XML data comes from an object that is serialized using the .NET serializer.
BikeTable looks like this:
Id - UniqueIdentifier
XmlData - XML
The XML stored in the XmlData column looks like this :
Record 1 :
<Bike>
<Material>
<Cage>EIECH</Cage>
<Mpn>B258-C436-B001</
</Material>
<Roles>
<string>Race</string>
<string>Mountain</string>
<string>City</string>
</Roles>
</Bike>
Record 2 :
<Bike>
<Material>
<Cage>ABCDE</Cage>
<Mpn>B258-C436-B001</Mpn>
</Material>
<Roles>
<string>Race</string>
</Roles>
</Bike>
I want to be able to find the records in my table that contain, for example, Race and Mountain.
For example, if I want the Ids of the records that contain 'Road' and 'Mountain', the only way I found is this:
select Id
from BikeTable
where XmlData.exist('/Bike/Roles/string[contains(., "Road")]') = 1
or XmlData.exist('/Bike/Roles/string[contains(., "Mountain")]') = 1
I don't like this option because it forces me to generate the query text whenever I want to find records that match one or several roles.
Roles can contain an unlimited number of values, and I need to be able to find the records that match one or more values.
Ex: records containing Race, records containing Race or Mountain, records containing City, records containing City and Mountain, etc.
Is there any way to know if a list contains several values?
Yes, you can. This is a bit of a guess though, as you say you want to do a SELECT *; something that is impossible to provide any data for without the DDL of the table. Thus, instead, I've returned the Cage and Mpn of the Bike:
CREATE TABLE BikeTable (xmlData xml);
--The close tag for Mpn was missing in your sample data; I assume it wasn't meant to be
INSERT INTO BikeTable
VALUES('<Bike>
<Material>
<Cage>EIECH</Cage>
<Mpn>B258-C436-B001</Mpn>
</Material>
<Roles>
<string>Race</string>
<string>Mountain</string>
<string>City</string>
</Roles>
</Bike>')
GO
WITH Bikes AS (
SELECT B.Material.value('(Cage/text())[1]','varchar(15)') AS Cage, --Data Type guessed
B.Material.value('(Mpn/text())[1]','varchar(15)') AS Mpn, --Data Type guessed
BR.String.value('(./text())[1]','varchar(15)') AS String --Data Type guessed
FROM BikeTable BT
CROSS APPLY BT.xmlData.nodes('/Bike/Material') B(Material)
CROSS APPLY BT.xmlData.nodes('/Bike/Roles/string') BR(String))
SELECT Cage, Mpn
FROM Bikes
GROUP BY Cage, Mpn
HAVING COUNT(String) > 1;
GO
DROP TABLE BikeTable;
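The query above returns bikes that have more than one role, whatever those roles are. One possible way to search for a specific set of roles without generating the query text is to put the wanted roles into a table variable and join it against the shredded <string> nodes. This is only a sketch: it assumes the table has the Id column described in the question, and @BikeTable, @Wanted, RoleName and the shortened sample rows are made up for illustration:

```sql
DECLARE @BikeTable TABLE (Id uniqueidentifier NOT NULL DEFAULT NEWID(), XmlData xml);
INSERT INTO @BikeTable (XmlData) VALUES
 (N'<Bike><Roles><string>Race</string><string>Mountain</string><string>City</string></Roles></Bike>')
,(N'<Bike><Roles><string>Race</string></Roles></Bike>');

-- The roles to search for; fill this from a parameter list as needed
DECLARE @Wanted TABLE (RoleName varchar(15) NOT NULL PRIMARY KEY);
INSERT INTO @Wanted (RoleName) VALUES ('Race'), ('Mountain');

SELECT BT.Id
FROM @BikeTable BT
CROSS APPLY BT.XmlData.nodes('/Bike/Roles/string') AS R(RoleNode)
INNER JOIN @Wanted W
        ON W.RoleName = R.RoleNode.value('(./text())[1]', 'varchar(15)')
GROUP BY BT.Id
-- Keep only bikes that matched every wanted role
HAVING COUNT(DISTINCT W.RoleName) = (SELECT COUNT(*) FROM @Wanted);
```

Dropping the HAVING clause turns this into an "any of the wanted roles" search. Unlike contains() in the question, the join compares the full role name, so 'Race' would not match a role such as 'Racetrack'.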

SSIS Export table with different types of columns into flat file

I'm working on a SSIS Package.
I have a table as below:
Table Name: Employee_table
EmployeID EmployeeName EmployeeDataXML
==============================================
1 Mark <Age>32</Age><Role>Manager</Role>
2 Albert <Age>31</Age><Role>Staff</Role>
==============================================
This table has to be exported into a flat file named Employeedata.dat.
The content of the file should look like this:
<EmployeeID>1</EmployeeID><EmployeeName>Mark</EmployeeName><EmployeeDataXML><Age>32</Age><Role>Manager</Role></EmployeeDataXML>
<EmployeeID>2</EmployeeID><EmployeeName>Albert</EmployeeName><EmployeeDataXML><Age>31</Age><Role>Staff</Role></EmployeeDataXML>
Basically, the EmployeeID and EmployeeName columns are not in XML format, but when the export happens they should be wrapped in XML too.
Can someone tell me the best way to do this?
Do I need to use a transformation here?
Is there a control or task that is readily available?
Could a simple SQL SELECT statement solve this?
Please guide.
Yes, a simple SELECT using FOR XML PATH should take care of this:
DECLARE @TestData TABLE
(
EmployeID INT NOT NULL,
EmployeeName NVARCHAR(50) NOT NULL,
EmployeeDataXML XML
);
INSERT INTO @TestData (EmployeID, EmployeeName, EmployeeDataXML)
VALUES (1, N'Mark', N'<Age>32</Age><Role>Manager</Role>');
INSERT INTO @TestData (EmployeID, EmployeeName, EmployeeDataXML)
VALUES (2, N'Albert', N'<Age>31</Age><Role>Staff</Role>');
SELECT EmployeID, EmployeeName, EmployeeDataXML
FROM @TestData
FOR XML PATH(N'Employee');
produces the following:
<Employee>
<EmployeID>1</EmployeID>
<EmployeeName>Mark</EmployeeName>
<EmployeeDataXML>
<Age>32</Age>
<Role>Manager</Role>
</EmployeeDataXML>
</Employee>
<Employee>
<EmployeID>2</EmployeID>
<EmployeeName>Albert</EmployeeName>
<EmployeeDataXML>
<Age>31</Age>
<Role>Staff</Role>
</EmployeeDataXML>
</Employee>
You didn't have the parent <Employee> element shown in the sample output, but I don't think the file would be usable without some element wrapping the field elements into a "row".
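If the file really must contain one unwrapped line per employee, as in the question's sample, a correlated FOR XML PATH('') subquery could build one XML fragment per row. This is a sketch under that assumption; the Line column name and the EmployeeID alias (the table column is spelled EmployeID) are chosen here for illustration:

```sql
DECLARE @TestData TABLE
(
    EmployeID INT NOT NULL,
    EmployeeName NVARCHAR(50) NOT NULL,
    EmployeeDataXML XML
);
INSERT INTO @TestData (EmployeID, EmployeeName, EmployeeDataXML)
VALUES (1, N'Mark',   N'<Age>32</Age><Role>Manager</Role>'),
       (2, N'Albert', N'<Age>31</Age><Role>Staff</Role>');

-- One result row per employee; each row holds that employee's fragment as text
SELECT CAST((SELECT T.EmployeID AS EmployeeID,
                    T.EmployeeName,
                    T.EmployeeDataXML
             FOR XML PATH(''), TYPE) AS NVARCHAR(MAX)) AS Line
FROM @TestData T
ORDER BY T.EmployeID;
```

Used as the source query of an SSIS data flow, the single Line column could then be written to the flat file with one row per line.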

SQL to populate values in xml list

In an SQL Server sproc I need to generate xml using data originating from two different tables. In my example below, the patient number for type EPI comes from one table and the patient number for type MRN comes from another table. To create the xml I am using a UNION to combine the records from two distinct select statements and then using 'FOR XML PATH'. Is there a different way - such as using two select sub-queries without using UNION?
<Patients>
<Patient>
<Number>1234</Number>
<NumberType>EPI</NumberType>
</Patient>
<Patient>
<Number>5678</Number>
<NumberType>MRN</NumberType>
</Patient>
</Patients>
Thanks in advance.
If I understood your answer to my question, you are not really joining the tables on PatientId, you are just creating a list of all the data from both tables, and you don't need to group the records by patient.
Yes, UNION is the easiest way to accomplish a single list.
However, since you want to output xml, there is an alternate that can be done without UNION, per your question:
Assuming you have two tables that might look something like this:
CREATE TABLE SrcA (PatientId int, NumberA int, TypeA varchar(16));
CREATE TABLE SrcB (PatientId int, NumberB int, TypeB varchar(16));
with sample values like this (note how each table has one record not in the other):
INSERT INTO SrcA VALUES(100, 1234, 'EPI'), (200, 2222, 'EPI'), (400, 4444, 'EPI');
INSERT INTO SrcB VALUES(100, 5678, 'MRN'), (200, 2121, 'MRN'), (300, 3131, 'MRN');
Then the following query:
SELECT
(SELECT SA.NumberA AS Number, SA.TypeA AS NumberType WHERE SA.NumberA IS NOT NULL FOR XML PATH('Patient'), TYPE),
(SELECT SB.NumberB AS Number, SB.TypeB AS NumberType WHERE SB.NumberB IS NOT NULL FOR XML PATH('Patient'), TYPE)
FROM SrcA SA
FULL OUTER JOIN SrcB SB ON SA.PatientId = SB.PatientId
FOR XML PATH(''), ROOT('Patients')
will produce:
<Patients>
<Patient>
<Number>1234</Number>
<NumberType>EPI</NumberType>
</Patient>
<Patient>
<Number>5678</Number>
<NumberType>MRN</NumberType>
</Patient>
<Patient>
<Number>2222</Number>
<NumberType>EPI</NumberType>
</Patient>
<Patient>
<Number>2121</Number>
<NumberType>MRN</NumberType>
</Patient>
<Patient>
<Number>4444</Number>
<NumberType>EPI</NumberType>
</Patient>
<Patient>
<Number>3131</Number>
<NumberType>MRN</NumberType>
</Patient>
</Patients>

Resources