Extract XML header information into SQL server - sql-server

My process involves getting a large XML file on a daily basis.
I have developed an SSIS package (2008 r2) which first gets rid of the multiple namespaces via a XSLT and then imports data into 40 tables (due to its complexity) by using the XML source object.
Here is the watered down version of a test xml file
<?xml version="1.0" encoding="UTF-8"?>
<s:Test xmlns:s="http://###.##.com/xml"
<sequence>62</sequence>
<generated>2015-04-28T00:59:38</generated>
<report_date>2015-04-27</report_date>
<orders>
<order>
</order>
</orders>
My question is: The XML source imports all the Orders with its nested attributes. How do I extract the 'report_date' and 'generated' from the header?
Any help would be much appreciated.
Thanks
SD

You can use XML method value() passing proper XPath/XQuery expression as parameter. For demo, consider the following table and data :
CREATE TABLE MyTable (id int, MyXmlColumn XML)
DECLARE #data XML = '<?xml version="1.0" encoding="UTF-8"?>
<s:Test xmlns:s="http://###.##.com/xml">
<sequence>62</sequence>
<generated>2015-04-28T00:59:38</generated>
<report_date>2015-04-27</report_date>
<orders>
<order>
</order>
</orders>
</s:Test>'
INSERT INTO Mytable VALUES(1,#data)
You can use the following query to get generated and report_date data :
SELECT
t.MyXmlColumn.value('(/*/generated)[1]','datetime') as generated
, t.MyXmlColumn.value('(/*/report_date)[1]','date') as report_date
FROM Mytable t
SQL Fiddle Demo
output :
generated report_date
----------------------- -----------
2015-04-28 00:59:38.000 2015-04-27

Related

How do I pull a value from column with XML data

Hello I have a table called KNXRUNHISTORY and I am trying to pull data from a column called RUNSUMMARYTXT.
The data stored in this column is in xml format.
I am trying to pull the values for these two fields
(entry key="Link1.#OutputType#) and
(entry key="Link6.#SourceRecordsProcessed#)
Any help would be appreciated as I have banging my head against the wall on this one.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM
http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>Comment</comment>
<entry key="#InterfaceErrorCount#">0</entry>
<entry key="Link3.#LinkElapsedTime#">0:00:21</entry>
<entry key="Link1.#OutputType#">4</entry>
<entry key="Link6.#SourceRecordsProcessed#">148</entry>
The general idea is to use xpath query to xml data. But in your case there are some pitfalls.
1. Your data aren't xml data type. Most probably data type of the column in varchar(max) or nvarchar(max). It would be OK but
2. You have <!DOCTYPE declaration which requires conversion with with style equals to 2. Encoding must be utf-16 (yours is utf-8). So encoding in the xml declaration must be fixed or just removed.
Here is working example.
declare #tbl table (id int, prop nvarchar(max))
insert #tbl(id,prop) values(1,
N'<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>Comment</comment>
<entry key="#InterfaceErrorCount#">0</entry>
<entry key="Link3.#LinkElapsedTime#">0:00:21</entry>
<entry key="Link1.#OutputType#">4</entry>
<entry key="Link6.#SourceRecordsProcessed#">148</entry></properties>')
;with cte as(
select id, convert(xml,
replace( prop,'<?xml version="1.0" encoding="UTF-8"?>',''),--sanitize data
2 --Enable limited internal DTD subset processing
) x --and then convert to xml data type
from #tbl
)
select t.v.value('entry[#key="Link1.#OutputType#"][1]','int') outputtype,
t.v.value('entry[#key="Link6.#SourceRecordsProcessed#"][1]','int') srcrecprocessed
from cte cross apply x.nodes('properties') t(v) -- use xpath query

How do I convert SQL to XML?

I am trying to output SQL as XML to match the exact format as the following
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-
sued.com/Logistics/Shipping/ProrateImport/20151009">
<Prorates>
<Prorate>
<OrderTypeId>1</OrderTypeId>
<DeliveryDate>2015-10-12T00:00:00+02:00</DeliveryDate>
<DivNo>632</DivNo>
<ProrateUnit>1</ProrateUnit>
<ProrateProducts>
<ProrateProduct ProductCode="8467">
<ProrateItems>
<ProrateItem StoreNo="1">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="2">
<Quantity>5</Quantity>
</ProrateItem>
<ProrateItem StoreNo="3">
<Quantity>5</Quantity>
</ProrateItem>
</ProrateItems>
</ProrateProduct>
</ProrateProducts>
</Prorate>
</Prorates>
</ProrateImport>
Here is my query:
SELECT
OrderTypeID,
DeliveryDate, DivNo,
ProrateUnit,
(SELECT
ProductOrder [#ProductCode],
(SELECT
ProrateItem [#StoreNo],
CAST(Quantity AS INT) [Quantity]
FROM
##Result2 T3
WHERE
T3.DivNo = T2.DivNo
AND T3.DivNo = T1.DivNo
AND T3.DeliveryDate = T2.DeliveryDate
AND T3.DeliveryDate = T1.DeliveryDate
AND T3.ProductOrder = t2.ProductOrder
FOR XML PATH('ProrateItem'), TYPE, ROOT('ProrateItems')
)
FROM
##Result2 T2
WHERE
T2.DivNo = T1.DivNo
AND T2.DeliveryDate = T1.DeliveryDate
FOR XML PATH('ProrateProduct'), TYPE, ROOT('ProrateProducts')
)
FROM
##Result2 T1
GROUP BY
OrderTypeID, DeliveryDate, DivNo, ProrateUnit
FOR XML PATH('Prorate'), TYPE, ROOT('Prorates')
How do I add in the Following and have the ProrateImport/20151009" change to the current date?
<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-
sued.com/Logistics/Shipping/ProrateImport/20151009">
This is my first time I have used XML
Im not sure i understand. Did you create the first XML yourself and just need to add the last script?
DECLARE #XMLHEADER nvarchar(max)
SET #XMLHEADER = '<?xml version="1.0" encoding="utf-8"?>
<ProrateImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/'+convert(varchar(8),getdate(),112)+'"
>'
select #xmlheader
And then you just need to add the rest of your output from your select statement.
There are several problems:
How to introduce namespaces?
How to introduce namespaces dynamically
How to add a <?xml ?> directive
two-leveled root (<ProrateImport><Prorate>)
namespaces
You have to use WITH XMLNAMESSPACES to introduce a namespace to your query.
Hint: the naked xmlns is introduced by DEFAULT, the xsi namespace will be introduced automatically by using ELEMENTS XSINIL:
WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema' AS xsd
,DEFAULT 'http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009')
SELECT 1 AS Dummy
FOR XML PATH('rowElement'), ELEMENTS XSINIL, ROOT('root')
The result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20151009"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
Note: The namespaces must be stated literally. No computations, no variables!
dynamic namespaces
This is - out of the box - impossible. But you might use dynamically created SQL and use EXEC to get your result. Just create exactly the statement as above
DECLARE #cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')';
PRINT #cmd
EXEC(#cmd);
the result
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
directive
The directive cannot be introduced into XML. SQL-Server will omit any <?xml ?> directive! This can only be done on string level:
DECLARE #cmd VARCHAR(MAX)=
'
WITH XMLNAMESPACES(''http://www.w3.org/2001/XMLSchema'' AS xsd
,DEFAULT ''http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/' + CONVERT(VARCHAR(8),GETDATE(),112) + ''')
SELECT(
SELECT 1 AS Dummy
FOR XML PATH(''rowElement''), ELEMENTS XSINIL, ROOT(''root'')) AS MyResult';
CREATE TABLE #resultTable(MyXmlAsString VARCHAR(MAX))
INSERT INTO #resultTable(MyXmlAsString)
EXEC(#cmd);
SELECT '<?xml version="1.0" encoding="utf-8"?>' + MyXmlAsString
FROM #resultTable;
The result
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schema.aldi-sued.com/Logistics/Shipping/ProrateImport/20171019"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</root>
two-leveled root
You can nest two FOR XML statements to achieve this:
WITH XMLNAMESPACES(DEFAULT 'blah')
SELECT
(
SELECT 1 AS Dummy
FOR XML PATH('rowElement'),ROOT('innerRoot'),TYPE
)
FOR XML PATH('outerRoot');
But the annoying part is, that namespaces are introduced by each sub-select over and over. Not wrong but very annoying! A well known Microsoft connect issue. Please sign in and vote for it! The result:
<outerRoot xmlns="blah">
<innerRoot xmlns="blah"> <!--Here's the second xmlns! -->
<rowElement>
<Dummy>1</Dummy>
</rowElement>
</innerRoot>
</outerRoot>
Your solution
After explained all this I'd suggest to create the XML without any namespace or declaration (what you are doing already!), then convert the result to NVARCHAR(MAX) and add the header and the closing footer on string level. This is ugly, but in your case the only way.
Hint: You will not be able to store the final result in a native XML type in SQL Server without loosing the directive.

How to get data from XML Column that contain xml namespace (SQL Server 2005)

I google a lot and got no luck.
I can't retrieve data from XML column which data came from web service using sp_OAGetProperty.
the XML Column contain..
<ArrayOfCustomerInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/">
<Customer CustCode="001">
<CustName>John</CustName>
<Queues>
<Q>
<No>10</No>
<Line>1</Line>
</Q>
</Queues>
</Customer>
</ArrayOfCustomerInfo>
I got NULL when I execute following statement
(but works fine if I remove all XML namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/")
SELECT a.b.value('#CustCode','varchar(4)') AS Code
,a.b.value('CustName[1]','varchar(20)') AS Name
,c.d.value('No[1]','int') AS QNo
,c.d.value('(Line)[1]','int') AS QLine
FROM PGHRMS_Employees x
CROSS APPLY x.data.nodes('/ArrayOfCustomerInfo/Customer') AS a(b)
CROSS APPLY a.b.nodes('Queues/Q') AS c(d)
please give me some advice. I've to achieve with SQL SERVER :(
If anyone want to reproduce it, I pasted script at : http://pastebin.com/ueZGidyL
Thank you in advance !!!
Try this:
;WITH XMLNAMESPACES(DEFAULT 'http://tempuri.org/')
SELECT
Code = XC1.value('#CustCode', 'varchar(4)'),
Name = XC1.value('CustName[1]', 'varchar(20)'),
QNo = XC2.value('No[1]', 'int') ,
QLine = XC2.value('(Line)[1]','int')
FROM
PGHRMS_Employees
CROSS APPLY
XmlContent.nodes('/ArrayOfCustomerInfo/Customer') AS XT1(XC1)
CROSS APPLY
XC1.nodes('Queues/Q') AS XT2(XC2)
With the WITH XMLNAMESPACES construct, you can define some XML namespaces to be used by the following T-SQL statement - default or prefixed namespaces alike.

Bulk Import of XML Into Existing Tables

I am new to XML and SQL Server and am trying import an XML file into SQL Server 2010. I have 14 tables that I would like to parse the data into. All 14 table names are listed in the XML as nodes (I think) I found some example code that worked with the simple example XML, but my XML seems a little more complicated and may not be structured optimally; unfortunately, I can't change that. As a basic attempt, I tried to insert the data into just one field of one existing table (SILVX_SN16000), but the Message pane shows "(0 rows(s) affected). Thanks in advance for looking at this.
USE TEST
Declare #xml XML
Select #xml =
CONVERT(XML,bulkcolumn,2) FROM OPENROWSET(BULK 'C:\Users\Kevin_S\Documents \SilvxInSightImport.xml',SINGLE_BLOB) AS X
SET ARITHABORT ON
Insert into [SILVX_SN16000]
(
md_group
)
Select
P.value('MD_GROUP[1]','NVARCHAR(255)') AS md_group
From #xml.nodes('/TableData/Row') PropertyFeed(P)
Here is a much-shortened (rows removed) version of my XML:
<?xml version="1.0" ?>
<SilvxInSightImport Version="1.0" Host="uslsss17" Date="14-09-14_20-40-02">
<Tables Count="14">
<Table Name="SN16000">
<TableSchema>
<Column><COLUMN_NAME>PARENT_HPKEY</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>MD_GROUP</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>PKEY</COLUMN_NAME><DATA_TYPE>NUMBER</DATA_TYPE></Column>
<Column><COLUMN_NAME>S_STATE</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>NAME</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>ROUTER_ID</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>IP_ADDR</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
</TableSchema>
<TableData>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>161888</PKEY><NAME>UODEDTM010</NAME><ROUTER_ID>10.41.32.129</ROUTER_ID> <IP_ADDR>10.41.32.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>278599</PKEY><NAME>UODEETM010</NAME><ROUTER_ID>10.41.4.129</ROUTER_ID> <IP_ADDR>10.41.4.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>183583</PKEY><NAME>UODEGRM010</NAME><ROUTER_ID>10.41.76.129</ROUTER_ID> <IP_ADDR>10.41.76.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
NT_HPKEY>100</PARENT_HPKEY><PKEY>811003</PKEY><NAME>UODWTIN010</NAME> <ROUTER_ID>10.27.36.130</ROUTER_ID><IP_ADDR>10.27.36.130</IP_ADDR><S_STATE>IS-NR</S_STATE> </Row>
</TableData>
</Table>
</Tables>
</SilvxInSightImport>
The xPath in .nodes() must specify the whole path to the Row nodes so you should start with SilvxInSightImport and work your way down to Row.
/SilvxInSightImport/Tables/Table/TableData/Row
In your case you have multiple table nodes, one for each table and I assume you only need one table at a time. You can use a predicate on the table name in the .nodes() xPath expression.
/SilvxInSightImport/Tables/Table[#Name = "SN16000"]/TableData/Row
Your whole query for SN16000 should look something like this.
select T.X.value('(MD_GROUP/text())[1]', 'varchar(20)') as MD_GROUP,
T.X.value('(PARENT_HPKEY/text())[1]', 'int') as PARENT_HPKEY,
T.X.value('(PKEY/text())[1]', 'int') as PKEY,
T.X.value('(NAME/text())[1]', 'varchar(20)') as NAME,
T.X.value('(ROUTER_ID/text())[1]', 'varchar(20)') as ROUTER_ID,
T.X.value('(IP_ADDR/text())[1]', 'varchar(20)') as IP_ADDR,
T.X.value('(S_STATE/text())[1]', 'varchar(20)') as S_STATE
from #XML.nodes('/SilvxInSightImport/Tables/Table[#Name = "SN16000"]/TableData/Row') as T(X)
You have to sort out the data types used for each column.
SQL Fiddle

Parsing XML with XMLNS in SQL Server 2008 R2

I'm fairly new to querying XML datatypes. We receive XMLs from partners and one such partner sends us XMLs like this:
DECLARE #ResultData XML = '<outGoing xmlns="urn:testsystems-com:HH.2015.Services.Telephony.OutGoing">
<customer>
<ID>158</ID>
</customer>
</outGoing>'
In this example, I would like to pull only the ID out of the XML, but it seems the xmlns is preventing me from getting anything inside the XML:
SELECT cust.value('(ID)[1]', 'VARCHAR(40)') as 'CustomerID'
FROM #ResultData.nodes('/outGoing/customer') as t(cust)
returns NUll, but if I manually remove the XMLNS from the XML I get 158.
I've experimented with WITH XMLNAMESPACES to see if I could use that, but I'm obviously missing something. Since these XMLs will be coming in automatically, I would like to be able to parse the XML, but right now I'm stuck.
That should work:
DECLARE #ResultData XML = '<outGoing xmlns="urn:testsystems-com:HH.2015.Services.Telephony.OutGoing">
<customer>
<ID>158</ID>
</customer>
</outGoing>'
;WITH XMLNAMESPACES(DEFAULT 'urn:testsystems-com:HH.2015.Services.Telephony.OutGoing')
SELECT
#ResultData.value('(/outGoing/customer/ID)[1]', 'int')
or to use your approach:
;WITH XMLNAMESPACES(DEFAULT 'urn:testsystems-com:HH.2015.Services.Telephony.OutGoing')
SELECT
CustomerID = cust.value('(ID)[1]', 'INT')
FROM
#ResultData.nodes('/outGoing/customer') as t(cust)
This will return 158 as its value.
I've used WITH XMLNAMESPACES(DEFAULT .....) since this is the only XML namespace in play, and it's defined at the top-level node - so it applies to every node in the XML structure.

Resources