t-sql to create XML file using FOR XML Path, multilevel issues - sql-server

I am creating an XML file to Upload to a 3rd party product. The file must begin with specific file and source information level and then it is followed with the specific data requirements/levels of EVENT and CREW members for those events.
I can create the initial level with the file/source information, and I have the data requirements exactly as they should be, but I cannot get them together in the same file inside the "ROOT" level without either the initial level repeating before each EVENT level or an extra EVENT level appearing as if they were nested. I've also managed to get a result with a ROW level that I did not define and with the tags escaped to &lt; and &gt; instead of < >. I've done a good bit of research and tried a union method, sub-selects and nesting methods, as well as many combinations of FOR XML PATH, AUTO and EXPLICIT, with and without ELEMENTS. I've learned a lot, but I'm just not finding the right combination for the results I need.
The first example is the layout that is required. The second is one of the examples that is most common for my efforts, followed by the SQL that created it.
What it should be (FILEINFO level only once, only one EVENT level for each EVENT):
<ROOT>
<FILEINFO>
<SOURCE_ID>P</SOURCE_ID>
</FILEINFO>
<EVENT>
<DATE>2019-09-24T08:00:00</DATE>
<NO>1</NO>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-24T14:14:00</DATE_TIME_STAMP>
<CREW>
<LAST_NAME>DOE</LAST_NAME>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-24T14:14:00</DATE_TIME_STAMP>
</CREW>
</EVENT>
<EVENT>
<DATE>2019-09-16T12:30:00</DATE>
<NO>1</NO>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T18:20:00</DATE_TIME_STAMP>
<CREW>
<LAST_NAME>DOE</LAST_NAME>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T18:20:00</DATE_TIME_STAMP>
</CREW>
</EVENT>
</ROOT>
What I'm getting:
<ROOT>
<EVENT>
<FILEINFO>
<SOURCE_ID>P</SOURCE_ID>
</FILEINFO>
<EVENT>
<DATE>2019-09-16T08:00:00</DATE>
<NO>1</NO>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T15:12:00</DATE_TIME_STAMP>
<CREW>
<LAST_NAME>DOE</LAST_NAME>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T15:12:00</DATE_TIME_STAMP>
</CREW>
</EVENT>
</EVENT>
<EVENT>
<FILEINFO>
<SOURCE_ID>P</SOURCE_ID>
</FILEINFO>
<EVENT>
<DATE>2019-09-16T08:00:00</DATE>
<NO>1</NO>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T15:12:00</DATE_TIME_STAMP>
<CREW>
<LAST_NAME>DOE</LAST_NAME>
<DEL_FLAG>false</DEL_FLAG>
<DATE_TIME_STAMP>2019-09-16T15:12:00</DATE_TIME_STAMP>
</CREW>
</EVENT>
</EVENT>
... ...
The most recent/simplest attempt that creates the above:
SELECT
(SELECT SOURCE_ID FROM (select 'P' as SOURCE_ID) FILEINFO ) AS 'FILEINFO/SOURCE_ID'
,[DATE] AS 'EVENT/DATE'
,[NO] AS 'EVENT/NO'
,[DEL_FLAG] AS 'EVENT/DEL_FLAG'
,[DATE_TIME_STAMP] AS 'EVENT/DATE_TIME_STAMP'
,'DOE' as 'EVENT/CREW/LAST_NAME'
,[DEL_FLAG2] as 'EVENT/CREW/DEL_FLAG'
,[DATE_TIME_STAMP3] as 'EVENT/CREW/DATE_TIME_STAMP'
FROM [dbo].XMLForFILEExport x
FOR XML path('EVENT'), elements, ROOT('ROOT') ;

This is easy: just use a sub-select and treat it like a normal column.
This simple SELECT returns the single <FILEINFO>:
SELECT 'P' AS [FILEINFO/SOURCE_ID]
FOR XML PATH(''),ROOT('ROOT');
Note that I used an empty PATH('') but set the ROOT().
This is the result:
<ROOT>
<FILEINFO>
<SOURCE_ID>P</SOURCE_ID>
</FILEINFO>
</ROOT>
Now we can start to add your events. First I need a mockup table to simulate your issue:
DECLARE @mockupEventTable TABLE(ID INT IDENTITY,[NO] INT, [DATE] DATETIME, EventText VARCHAR(100));
INSERT INTO @mockupEventTable VALUES(1,'20190916','Event 1')
,(2,'20190917','Event 2');
--The query
SELECT 'P' AS [FILEINFO/SOURCE_ID]
,(
SELECT e.[DATE]
,e.[NO]
,e.EventText
,'Doe' AS [CREW/LASTNAME]
FROM @mockupEventTable e
FOR XML PATH('EVENT'),TYPE
) AS [*]
FOR XML PATH(''),ROOT('ROOT');
The result
<ROOT>
<FILEINFO>
<SOURCE_ID>P</SOURCE_ID>
</FILEINFO>
<EVENT>
<DATE>2019-09-16T00:00:00</DATE>
<NO>1</NO>
<EventText>Event 1</EventText>
<CREW>
<LASTNAME>Doe</LASTNAME>
</CREW>
</EVENT>
<EVENT>
<DATE>2019-09-17T00:00:00</DATE>
<NO>2</NO>
<EventText>Event 2</EventText>
<CREW>
<LASTNAME>Doe</LASTNAME>
</CREW>
</EVENT>
</ROOT>
You can see that the sub-select creates the inner XML just as you need it. We have to specify ,TYPE in order to get this as typed XML. Try the same without it: you will get the XML escaped, as if it were simple text...
And I specify AS [*] (the same as AS [node()]) to indicate that the XML "column" has no name of its own but should be inserted as is. This is not mandatory (try it without), but it makes things more readable...
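To make the effect of TYPE visible, here is a minimal sketch that runs the same nested query twice. The table variable @ev and its two rows are made up for illustration and are not part of the question.

```sql
-- @ev is a hypothetical stand-in for the real event table
DECLARE @ev TABLE ([NO] INT, [DATE] DATETIME);
INSERT INTO @ev VALUES (1, '20190916');
INSERT INTO @ev VALUES (2, '20190917');

-- With TYPE: the sub-select returns typed XML, so the outer FOR XML
-- embeds it as real <EVENT> elements below <ROOT>
SELECT 'P' AS [FILEINFO/SOURCE_ID]
      ,(SELECT e.[DATE], e.[NO]
        FROM @ev e
        FOR XML PATH('EVENT'), TYPE) AS [*]
FOR XML PATH(''), ROOT('ROOT');

-- Without TYPE: the sub-select returns NVARCHAR(MAX), so the outer FOR XML
-- treats it as text and escapes the angle brackets (&lt;EVENT&gt;...)
SELECT 'P' AS [FILEINFO/SOURCE_ID]
      ,(SELECT e.[DATE], e.[NO]
        FROM @ev e
        FOR XML PATH('EVENT')) AS [*]
FOR XML PATH(''), ROOT('ROOT');
```

The second result is most likely what produced the &lt; and &gt; output mentioned in the question.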

That's because you already specified the PATH "EVENT". You can also remove the EVENT prefix from the column names, e.g. 'EVENT/CREW/DATE_TIME_STAMP' can just be 'CREW/DATE_TIME_STAMP'.
To achieve what is required, you can generate the XML with the EVENT elements and then insert the FILEINFO element.
DECLARE @x xml;
SET @x=(SELECT
[DATE] AS 'DATE'
,[NO] AS 'NO'
,[DEL_FLAG] AS 'DEL_FLAG'
,[DATE_TIME_STAMP] AS 'DATE_TIME_STAMP'
,'DOE' as 'CREW/LAST_NAME'
,[DEL_FLAG2] as 'CREW/DEL_FLAG'
,[DATE_TIME_STAMP3] as 'CREW/DATE_TIME_STAMP'
FROM [dbo].XMLForFILEExport x
FOR XML path('EVENT'), elements, ROOT('ROOT'));
--Insert the FILEINFO element as the first child of ROOT
SET @x.modify('
insert <FILEINFO><SOURCE_ID>P</SOURCE_ID></FILEINFO>
as first
into (/ROOT)[1]');
--Return the final XML
SELECT @x;

Related

Insert XML child node to SQL table

I've got an XML file like this and I'm working with SQL 2014 SP2
<?xml version='1.0' encoding='UTF-8'?>
<gwl>
<version>123456789</version>
<entities>
<entity id="1" version="123456789">
<name>xxxxx</name>
<listId>0</listId>
<listCode>Oxxx</listCode>
<entityType>08</entityType>
<createdDate>03/03/1993</createdDate>
<lastUpdateDate>05/06/2011</lastUpdateDate>
<source>src</source>
<OriginalSource>o_src</OriginalSource>
<aliases>
<alias category="STRONG" type="Alias">USCJSC</alias>
<alias category="WEAK" type="Alias">'OSKOAO'</alias>
</aliases>
<programs>
<program type="21">prog</program>
</programs>
<sdfs>
<sdf name="OriginalID">9876</sdf>
</sdfs>
<addresses>
<address>
<address1>1141, SYA-KAYA STR.</address1>
<country>RU</country>
<postalCode>1234</postalCode>
</address>
<address>
<address1>90, MARATA UL.</address1>
<country>RU</country>
<postalCode>1919</postalCode>
</address>
</addresses>
<otherIds>
<childId>737606</childId>
<childId>737607</childId>
</otherIds>
</entity>
</entities>
</gwl>
I made a script to insert data from the XML into a SQL table. How can I insert the child nodes into a table? I think I should replicate the row for each new child node, but I don't know the best way to proceed.
Here is my SQL code
DECLARE @InputXML XML
SELECT @InputXML = CAST(x AS XML)
FROM OPENROWSET(BULK 'C:\MyFiles\sample.XML', SINGLE_BLOB) AS T(x)
SELECT
product.value('(@id)[1]', 'NVARCHAR(10)') id,
product.value('(@version)[1]', 'NVARCHAR(14)') [version],
product.value('(name[1])', 'NVARCHAR(255)') name,
product.value('(listId[1])', 'NVARCHAR(9)')listId,
product.value('(listCode[1])', 'NVARCHAR(10)')listCode,
product.value('(entityType[1])', 'NVARCHAR(2)')entityType,
product.value('(createdDate[1])', 'NVARCHAR(10)')createdDate,
product.value('(lastUpdateDate[1])', 'NVARCHAR(10)')lastUpdateDate,
product.value('(source[1])', 'NVARCHAR(15)')source,
product.value('(OriginalSource[1])', 'NVARCHAR(50)')OriginalSource,
product.value('(aliases[1])', 'NVARCHAR(50)')aliases,
product.value('(programs[1])', 'NVARCHAR(50)')programs,
product.value('(sdfs[1])', 'NVARCHAR(500)')sdfs,
product.value('(addresses[1])', 'NVARCHAR(50)')addresses,
product.value('(otherIds[1])', 'NVARCHAR(50)')otherIDs
FROM @InputXML.nodes('gwl/entities/entity') AS X(product)
You have a lot of different children here...
Just to show the principles:
DECLARE @xml XML=
N'<gwl>
<version>123456789</version>
<entities>
<entity id="1" version="123456789">
<name>xxxxx</name>
<listId>0</listId>
<listCode>Oxxx</listCode>
<entityType>08</entityType>
<createdDate>03/03/1993</createdDate>
<lastUpdateDate>05/06/2011</lastUpdateDate>
<source>src</source>
<OriginalSource>o_src</OriginalSource>
<aliases>
<alias category="STRONG" type="Alias">USCJSC</alias>
<alias category="WEAK" type="Alias">''OSKOAO''</alias>
</aliases>
<programs>
<program type="21">prog</program>
</programs>
<sdfs>
<sdf name="OriginalID">9876</sdf>
</sdfs>
<addresses>
<address>
<address1>1141, SYA-KAYA STR.</address1>
<country>RU</country>
<postalCode>1234</postalCode>
</address>
<address>
<address1>90, MARATA UL.</address1>
<country>RU</country>
<postalCode>1919</postalCode>
</address>
</addresses>
<otherIds>
<childId>737606</childId>
<childId>737607</childId>
</otherIds>
</entity>
</entities>
</gwl>';
--The query will fetch some values from several places.
--It should be easy to get the rest yourself...
SELECT @xml.value('(/gwl/version/text())[1]','bigint') AS [version]
,A.ent.value('(name/text())[1]','nvarchar(max)') AS [Entity_Name]
,A.ent.value('(listId/text())[1]','int') AS Entity_ListId
--more columns taken from A.ent
,B.als.value('@category','nvarchar(max)') AS Alias_Category
,B.als.value('text()[1]','nvarchar(max)') AS Alias_Content
--similar for programs and sdfs
,E.addr.value('(address1/text())[1]','nvarchar(max)') AS Address_Address1
,E.addr.value('(country/text())[1]','nvarchar(max)') AS Address_Country
--and so on
FROM @xml.nodes('/gwl/entities/entity') A(ent)
OUTER APPLY A.ent.nodes('aliases/alias') B(als)
OUTER APPLY A.ent.nodes('programs/program') C(prg)
OUTER APPLY A.ent.nodes('sdfs/sdf') D(sdfs)
OUTER APPLY A.ent.nodes('addresses/address') E(addr)
OUTER APPLY A.ent.nodes('otherIds/childId') F(ids);
The idea in short:
We read non-repeating values (e.g. version) from the XML variable directly.
We use .nodes() to return repeating elements as derived sets.
We can use a cascade of .nodes() to dive deeper into repeating child elements by using a relative XPath (no / at the beginning).
You have two approaches:
Read the XML as above into a staging table (simply by adding INTO #tmpTable before FROM) and proceed from there (this will need one SELECT ... GROUP BY for each type of child).
Create one SELECT per type of child, using only one of the APPLY lines, and shift the data into specific child tables.
I would tend towards the first one.
It allows you to do some cleaning, generate IDs and check business rules before you shift the data into the target tables.
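A minimal sketch of the staging-table approach, assuming a cut-down version of the sample XML. The temp table #staging and the output column names are illustrative choices, not something from the question.

```sql
-- Cut-down, hypothetical sample with one entity, two aliases and two addresses
DECLARE @xml XML = N'<gwl><entities><entity id="1">
  <name>xxxxx</name>
  <aliases><alias category="STRONG">USCJSC</alias><alias category="WEAK">OSKOAO</alias></aliases>
  <addresses><address><country>RU</country><postalCode>1234</postalCode></address>
             <address><country>RU</country><postalCode>1919</postalCode></address></addresses>
</entity></entities></gwl>';

-- Step 1: shred everything into one staging table.
-- Each combination of alias and address produces its own row.
SELECT A.ent.value('@id','int')                         AS EntityId
      ,A.ent.value('(name/text())[1]','nvarchar(255)')  AS EntityName
      ,B.als.value('@category','nvarchar(50)')          AS AliasCategory
      ,B.als.value('text()[1]','nvarchar(255)')         AS AliasContent
      ,E.addr.value('(postalCode/text())[1]','nvarchar(20)') AS PostalCode
INTO #staging
FROM @xml.nodes('/gwl/entities/entity') A(ent)
OUTER APPLY A.ent.nodes('aliases/alias') B(als)
OUTER APPLY A.ent.nodes('addresses/address') E(addr);

-- Step 2: one de-duplicating SELECT per type of child
SELECT EntityId, AliasCategory, AliasContent
FROM #staging
GROUP BY EntityId, AliasCategory, AliasContent;

SELECT EntityId, PostalCode
FROM #staging
GROUP BY EntityId, PostalCode;

DROP TABLE #staging;
```

One caveat with this sketch: GROUP BY also collapses children that are legitimately identical (two equal aliases on the same entity), so if that can happen the second approach, one SELECT per child type, is the safer choice.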

T-SQL XML: <eof> encountered when trying to query child nodes by wildcard

We have an in-house piece of software that works with loosely-defined XML files. I'm trying to extract the child nodes from this step in T-SQL. I'm able to extract the parent node, but I keep getting <eof> syntax errors whenever I query the children.
The XML file looks roughly like this:
<?xml version="1.0"?>
<root>
<steps>
<step>
<steptypeX attribute="somevalue">
<child1>Value</child1>
<child2>Value</child2>
</steptypeX>
</step>
</steps>
</root>
I'm using the following T-SQL:
select
doc.col.query('/child*') --If I use '.' or '*' here I can get the children as XML, but I want the values contained within the nodes on separate rows
from @xmldoc.nodes('/root/steps/step/steptypeX') doc(col)
where doc.col.value('@attribute', 'nvarchar(max)') = 'somevalue'
The error message I'm getting is not clear:
XQuery [query()]: Syntax error near '<eof>'
As far as I can tell, the nodes do exist and I haven't left any XQuery instructions with trailing slashes. I can't really tell what I'm doing wrong here.
If I understand your intention correctly you can use child::*:
DECLARE @xmldoc XML =
N'<?xml version="1.0"?>
<root>
<steps>
<step>
<steptypeX attribute="somevalue">
<child1>Value</child1>
<child2>Value</child2>
</steptypeX>
</step>
</steps>
</root>';
SELECT
doc.col.value('text()[1]', 'nvarchar(max)')
FROM @xmldoc.nodes('/root/steps/step/steptypeX/child::*') doc(col)
WHERE doc.col.value('../@attribute', 'nvarchar(max)') = 'somevalue';
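The original error comes from '/child*': an XPath name test is either a full name or a bare *, so a partial wildcard does not parse. As a variation on the query above, here is a sketch that also returns each child's element name and moves the attribute filter into the path. The column aliases NodeName and NodeValue are my own choice.

```sql
DECLARE @xmldoc XML = N'<root><steps><step>
  <steptypeX attribute="somevalue"><child1>Value</child1><child2>Value</child2></steptypeX>
</step></steps></root>';

-- One row per child element of the matching steptypeX;
-- * is shorthand for child::*
SELECT doc.col.value('local-name(.)', 'nvarchar(128)') AS NodeName
      ,doc.col.value('text()[1]', 'nvarchar(max)')     AS NodeValue
FROM @xmldoc.nodes('/root/steps/step/steptypeX[@attribute="somevalue"]/*') doc(col);
```

Filtering inside the path avoids the backward step (../@attribute) for every row.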

Combine and modify XML in TSQL

Using SQL Server 2005, is it possible to combine XML and add an attribute at same time?
Unfortunately, due to project restrictions, I need a SQL Server 2005 solution.
Consider the following, where I need to combine XML from multiple rows within a new <root> element...
; WITH [TestTable] AS (
SELECT 7 AS [PkId], CAST('<data><id>11</id><id>12</id></data>' AS XML) AS [Data]
UNION ALL
SELECT 12, CAST('<data><id>22</id></data>' AS XML)
UNION ALL
SELECT 43, CAST('<data><id>33</id></data>' AS XML)
)
SELECT (
SELECT XMLDATA as [*]
FROM (
SELECT [Data] AS [*]
FROM [TestTable]
FOR XML PATH(''), TYPE
) AS DATA(XMLDATA)
FOR XML PATH('root')
)
This produces the desired output of...
<root>
<data><id>11</id><id>12</id></data>
<data><id>22</id></data>
<data><id>33</id></data>
</root>
But what I need to do, if possible, is add an attribute to the existing data element in each of the rows with the PkId value. The desired output would then look like this...
<root>
<data pkid="7"><id>11</id><id>12</id></data>
<data pkid="12"><id>22</id></data>
<data pkid="43"><id>33</id></data>
</root>
My gut feeling is that this is going to be impossible without the use of a cursor, but if anybody knows a way of doing it I'd love to hear it.
At the request of @MattA, here is an example of some random data in the table...
[PkId] [UserId] [SubmittedDate] [Data]
1 1 2015-03-24 12:34:56 '<data><id>1</id><id>2</id></data>'
2 1 2015-03-23 09:15:52 '<data><id>3</id></data>'
3 2 2015-03-22 16:01:23 '<data><id>4</id><id>5</id></data>'
4 1 2015-03-21 13:45:34 '<data><id>6</id></data>'
Please note, that to make the question easier, I stated that I needed the PkId column as the attribute to the data. This is not actually the case - instead I need the [SubmittedDate] column to be used. I apologise if this caused confusion.
Using UserId=1 as a filter, the XML I would like from the above would be...
<root>
<data submitteddate="2015-03-24T12:34:56"><id>1</id><id>2</id></data>
<data submitteddate="2015-03-23T09:15:52"><id>3</id></data>
<data submitteddate="2015-03-21T13:45:34"><id>6</id></data>
</root>
The date would be formatted using the 126 date format available from CONVERT.
Here's the quick answer for you. XML does support "modify", but shredding on a small data set like this works quite well too.
Code
--The existing XML
--Declared and assigned separately, because SQL Server 2005 does not allow DECLARE with an initial value
DECLARE @XML XML;
SET @XML = '<root>
<data><id>11</id></data>
<data><id>22</id></data>
<data><id>33</id></data>
</root>';
--XML Shredded Back to a table
;WITH
ShreddedXML AS (
SELECT
ID = FieldAlias.value('(id)[1]','int')
FROM
@XML.nodes('/root/data') AS TableAlias(FieldAlias)
), ArbitraryPKGenerator AS (
SELECT CURRENT_TIMESTAMP AS PKid,
ID
FROM ShreddedXML
)
SELECT A.PKId AS "@PKid",
A.ID AS "id"
FROM ArbitraryPKGenerator AS A
FOR XML PATH('data'), ROOT('root')
And the XML
<root>
<data PKid="2015-03-24T09:44:55.770">
<id>11</id>
</data>
<data PKid="2015-03-24T09:44:55.770">
<id>22</id>
</data>
<data PKid="2015-03-24T09:44:55.770">
<id>33</id>
</data>
</root>
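The answer above rebuilds the XML from shredded values. A different technique, sketched here and not taken from that answer, keeps the stored XML and re-wraps it: the children of each stored <data> element are pulled out with .query() and PATH('data') creates a new <data> element that carries the attribute. The CTE below is a hypothetical stand-in for the real table, built with UNION ALL so that it also parses on SQL Server 2005.

```sql
;WITH [TestTable] AS (
    SELECT 1 AS [PkId], 1 AS [UserId],
           CAST('20150324 12:34:56' AS DATETIME) AS [SubmittedDate],
           CAST('<data><id>1</id><id>2</id></data>' AS XML) AS [Data]
    UNION ALL
    SELECT 2, 1, CAST('20150323 09:15:52' AS DATETIME),
           CAST('<data><id>3</id></data>' AS XML)
    UNION ALL
    SELECT 3, 2, CAST('20150322 16:01:23' AS DATETIME),
           CAST('<data><id>4</id><id>5</id></data>' AS XML)
)
SELECT [SubmittedDate]          AS [@submitteddate] -- attribute on the new <data> element
      ,[Data].query('/data/*')  AS [*]              -- children of the stored <data>, inserted as is
FROM [TestTable]
WHERE [UserId] = 1
ORDER BY [SubmittedDate] DESC
FOR XML PATH('data'), ROOT('root');
```

FOR XML writes DATETIME values in ISO 8601 form, so an explicit CONVERT with style 126 should not be needed here, and no cursor is involved. The attribute column has to come before the element content in the SELECT list.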

SQL - Read an XML node from a table field

I am using SQL Server 2008. I have a field called RequestParameters in one of my SQL table called Requests with XML data. An example would be:
<RequestParameters xmlns="http://schemas.datacontract.org/2004/07/My.Name.Space" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="1">
<Data z:Id="2" i:type="CheckoutRequest">
<UserGuid>7ec38c44-5aa6-49e6-9fc7-25e9028f2148</UserGuid>
<DefaultData i:nil="true" />
</Data>
</RequestParameters>
I ultimately want to retrieve the value of UserGuid. For that, I am doing this:
SELECT RequestParameters.value('(/RequestParameters/Data/UserGuid)[0]', 'uniqueidentifier') as UserGuid
FROM Requests
However, the results I am seeing are all NULL. What am I doing wrong?
You have to specify the default namespace and use [1] instead of [0].
WITH XMLNAMESPACES(default 'http://schemas.datacontract.org/2004/07/My.Name.Space')
SELECT RequestParameters.value('(/RequestParameters/Data/UserGuid)[1]', 'uniqueidentifier') as UserGuid
FROM Requests;
A self-contained version to test with:
DECLARE @XML xml;
SET @XML = '<RequestParameters xmlns="http://schemas.datacontract.org/2004/07/My.Name.Space" xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="1">
<Data z:Id="2" i:type="CheckoutRequest">
<UserGuid>7ec38c44-5aa6-49e6-9fc7-25e9028f2148</UserGuid>
<DefaultData i:nil="true" />
</Data>
</RequestParameters>';
WITH XMLNAMESPACES(DEFAULT 'http://schemas.datacontract.org/2004/07/My.Name.Space')
SELECT @XML.value('(/RequestParameters/Data/UserGuid)[1]', 'uniqueidentifier') AS UserGuid;
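If WITH XMLNAMESPACES is awkward to use (for example inside a larger statement), the namespace can also be handled inside the XQuery expression itself. The sketch below uses a shortened copy of the sample XML and shows two variants: declaring the default element namespace in the XQuery prolog, and matching any namespace with the *: wildcard.

```sql
DECLARE @XML xml;
SET @XML = '<RequestParameters xmlns="http://schemas.datacontract.org/2004/07/My.Name.Space"><Data><UserGuid>7ec38c44-5aa6-49e6-9fc7-25e9028f2148</UserGuid></Data></RequestParameters>';

-- Variant 1: declare the default element namespace in the XQuery prolog
SELECT @XML.value('declare default element namespace "http://schemas.datacontract.org/2004/07/My.Name.Space";
                   (/RequestParameters/Data/UserGuid)[1]', 'uniqueidentifier') AS UserGuid;

-- Variant 2: namespace wildcard, matches the element names in any namespace
SELECT @XML.value('(/*:RequestParameters/*:Data/*:UserGuid)[1]', 'uniqueidentifier') AS UserGuid;
```

The wildcard form is shorter but less strict, so the explicit declaration is the better fit when the namespace is known.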

Microsoft SQL Server xml data

This site has a technique to pass xml data around in Microsoft SQL Server:
DECLARE @productIds xml
SET @productIds ='<Products><id>3</id><id>6</id><id>15</id></Products>'
SELECT
ParamValues.ID.value('.','VARCHAR(20)')
FROM @productIds.nodes('/Products/id') as ParamValues(ID)
But what is the syntax if I add another field?
The following does NOT work:
DECLARE @productIds xml
SET @productIds ='<Products><id>3</id><descr>Three</descr><id>6</id><descr>six</descr><id>15</id><descr>Fifteen</descr></Products>'
SELECT
ParamValues.ID.value('.','VARCHAR(20)')
,ParamValues.descr.value('.','VARCHAR(20)')
FROM @productIds.nodes('/Products/id') as ParamValues(ID)
Note: Maybe I've constructed my xml wrong.
You need to use something like:
SELECT
ParamValues.ID.value('(id)[1]','VARCHAR(20)'),
ParamValues.ID.value('(descr)[1]','VARCHAR(20)')
FROM
@productIds.nodes('/Products') as ParamValues(ID)
That FROM statement there defines something like a "virtual table" called ParamValues.ID - you need to select the <Products> node into that virtual table and then access the properties inside it.
Furthermore, your XML structure is very badly chosen:
<Products>
<id>3</id>
<descr>Three</descr>
<id>6</id>
<descr>six</descr>
<id>15</id>
<descr>Fifteen</descr>
</Products>
You won't be able to select the individual pairs of id/descr - you should use something more like:
<Products>
<Product>
<id>3</id>
<descr>Three</descr>
</Product>
<Product>
<id>6</id>
<descr>six</descr>
</Product>
<Product>
<id>15</id>
<descr>Fifteen</descr>
</Product>
</Products>
Then you could retrieve all items using this SQL XML query:
SELECT
ParamValues.ID.value('(id)[1]','VARCHAR(20)') AS 'ID',
ParamValues.ID.value('(descr)[1]','VARCHAR(20)') AS 'Description'
FROM
@productIds.nodes('/Products/Product') as ParamValues(ID)
ID Description
3 Three
6 six
15 Fifteen
You must wrap each pair of id and descr in one parent node, for example Row. Then you can access each pair like this:
DECLARE @productIds xml
SET @productIds ='<Products><Row><id>3</id><descr>Three</descr></Row><Row><id>6</id><descr>six</descr></Row><Row><id>15</id><descr>Fifteen</descr></Row></Products>'
SELECT
ParamValues.Row.query('id').value('.','VARCHAR(20)'),
ParamValues.Row.query('descr').value('.','VARCHAR(20)')
FROM @productIds.nodes('/Products/Row') as ParamValues(Row)
