Trying to query XML Data - node has a space in it - sql-server

I am trying to learn how to work with xml files and data in SQL Server and I'm trying to query an xml file but nothing is returned.
Here is the xml data:
<?xml version="1.0" encoding="UTF-8"?>
<Report xmlns="AdmissionsByPCP" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="AdmissionsByPCP" xsi:schemaLocation="AdmissionsByPCP http://10.xxx.x.xx/ReportServer_NameofReportServer?%2FHl%20C%20Syst%20Reports%2health%2FAdmissBy&rs%3ACommand=Render&rs%3AFormat=XML&rs%3ASessionID=h0iz5ijxgt2vdl45g3pjfs45&rc%3ASchema=True">
<Tablix2>
<Details_Collection>
<Details PCPCarrier="DoctorsName">
<Subreport1>
<Report Name="PCPAdmitSubReport">
<Tablix5 Textbox5="79">
<Details_Collection>
<Details Textbox37="Discharge Dx Code: ICDCode" Textbox89="Admit Dx Code: ICDCode" LOS="4" DischargeDate="07/10/2017" AdmitDate="07/06/2017" Hospital="Hospital Name" MemberName="Name" DOB="1/1/2019" AdmissionType="Inpatient" MemberNo="12345" Auth="321*I" Status="Close" AdmissionID="00001" LobName="Medicare" CarrierName="CarrierName"/>
</Details></Details_Collection></Tablix5></Report></Subreport1></Details></Details_Collection></Tablix2></Report>
Here is the query I'm using:
Declare @XMLData as XML
Set @XMLData=(
Select bulkcolumn
FROM OPENROWSET (Bulk '\Directory\AdmissionsByPCP.xml',
Single_Blob) a)
Select
@XMLData.value('(/Root/Report/Tablix2/Detail_Collections/DetailsPCPCarrier) [1]', 'varchar(max)') PCP
The query returns null and I don't know why. Is it because there is a space in the node (<Details PCPCarrier>) and if so how do I work around that?

You have misunderstood how XML works. This is the node you are looking for:
<Details PCPCarrier="DoctorsName">
This is not a node called Details PCPCarrier; it is a node called Details with an attribute called PCPCarrier.
So the XPath to select it would be:
/Root/Report/Tablix2/Detail_Collections/Details
Or, if you want to specifically filter by the PCPCarrier attribute existing:
/Root/Report/Tablix2/Detail_Collections/Details[@PCPCarrier]
Or, to get the value of the attribute itself:
/Root/Report/Tablix2/Detail_Collections/Details/@PCPCarrier

IMSoP pointed me in the right direction and I figured out the rest myself.
I also needed to add this:
With XMLNAMESPACES (Default 'AdmissionsByPCP')
So the query looks like this:
Declare @XMLData as XML
Set @XMLData=(
Select *
FROM OPENROWSET (Bulk '\\Directory\AdmissionsByPCP.xml',
Single_Clob) a );
With XMLNAMESPACES (Default 'AdmissionsByPCP')
Select
@XMLData.value('(/Report/Tablix2/Details_Collection/Details/@PCPCarrier)[1]', 'varchar(max)')
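For anyone who wants to try this without the file, here is a minimal sketch that uses an inline XML variable instead of OPENROWSET. The variable name @x and the trimmed-down document are assumptions for illustration; only the element and attribute names come from the question.

```sql
-- Inline stand-in for the report file (structure trimmed from the question).
DECLARE @x XML = N'<Report xmlns="AdmissionsByPCP" Name="AdmissionsByPCP">
  <Tablix2>
    <Details_Collection>
      <Details PCPCarrier="DoctorsName" />
    </Details_Collection>
  </Tablix2>
</Report>';

-- No namespace declared: the path matches nothing, so value() returns NULL.
SELECT @x.value('(/Report/Tablix2/Details_Collection/Details/@PCPCarrier)[1]', 'varchar(100)') AS NoNamespace;

-- Default namespace declared: the same path now reaches the attribute.
WITH XMLNAMESPACES (DEFAULT 'AdmissionsByPCP')
SELECT @x.value('(/Report/Tablix2/Details_Collection/Details/@PCPCarrier)[1]', 'varchar(100)') AS WithNamespace;
```

The attribute itself is not in any namespace, which is why @PCPCarrier needs no prefix in either query; only the element steps depend on the declaration.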

Related

Slow XML import with SQL server

I have an XML file with a size of 1GB.
I use the following code to load the data into sql server.
DECLARE @xmlvar XML
SELECT @xmlvar = BulkColumn
FROM OPENROWSET(BULK 'C:\Data\demo.xml', SINGLE_BLOB) x;
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT
zs.value(N'(../@versionCode)', 'VARCHAR(100)') as versionCode,
zs.value(N'(@Start)', 'VARCHAR(50)') as Start_date,
zs.value(N'(@End)', 'VARCHAR(50)') as End_date
into testtbl
FROM @xmlvar.nodes('/ns4:Dataview1/ns4:Content/ns4:gen') A(zs);
The query has now been running for more than 2 hours and has not finished.
I have tested the query with a smaller version of the XML file and that works.
Any tips on improving the loading speed?
Thank you.
Update XML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns4:Dataview1 xmlns="ux:no::ehe:v5:actual:aver" xmlns:ns4="ux:no:ehe:v5:move">
<ns4:Content versionCode="16000">
<ns4:gen start="1961-07-01" end="1961-07-01">
</ns4:gen>
<ns4:gen start="2017-09-19">
</ns4:gen>
<ns4:gen start="1961-07-02" end="2016-09-30">
</ns4:gen>
<ns4:gen start="2016-10-01" end="2017-09-18">
</ns4:gen>
</ns4:Content>
</ns4:Dataview1>
(1) As @Stu already pointed out, loading the XML file into a single-row table first will speed up the loading significantly.
(2) It is not a good idea to traverse up the XML tree in XPath expressions, like here:
c.value('../@versionCode', 'VARCHAR(100)') as versionCode
But the XML structure was not shared in the question, so it is impossible to suggest anything concrete.
The second CROSS APPLY simulates the one-to-many relationship in the XML hierarchy.
Check it out below.
SQL
CREATE TABLE tbl (
ID INT IDENTITY(1, 1) PRIMARY KEY,
XmlColumn XML
);
INSERT INTO tbl(XmlColumn)
SELECT * FROM OPENROWSET(BULK N'C:\Data\demo.xml', SINGLE_BLOB) AS x;
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT c.value('@versionCode', 'VARCHAR(100)') as versionCode,
x.value('@start', 'DATE') as Start_date,
x.value('@end', 'DATE') as End_date
INTO dbo.testtbl
FROM tbl
CROSS APPLY XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
CROSS APPLY t1.c.nodes('ns4:gen') AS t2(x);
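The parent/child pattern can be tried on a small inline document before pointing it at the 1GB file. This is only a sketch: the variable @doc and the shortened XML are assumptions, and the unused default namespace from the question is left out for brevity.

```sql
-- Small inline sample modelled on the XML from the question.
DECLARE @doc XML = N'<ns4:Dataview1 xmlns:ns4="ux:no:ehe:v5:move">
  <ns4:Content versionCode="16000">
    <ns4:gen start="1961-07-01" end="1961-07-01" />
    <ns4:gen start="2017-09-19" />
  </ns4:Content>
</ns4:Dataview1>';

-- The outer nodes() call fixes the parent once; the CROSS APPLY walks its children,
-- so the parent attribute is read from c instead of via the ../ axis.
WITH XMLNAMESPACES ('ux:no:ehe:v5:move' AS ns4)
SELECT c.value('@versionCode', 'VARCHAR(100)') AS versionCode,
       g.value('@start', 'DATE') AS Start_date,
       g.value('@end', 'DATE') AS End_date   -- NULL when the attribute is absent
FROM @doc.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
CROSS APPLY t1.c.nodes('ns4:gen') AS t2(g);
```

Attribute names are case-sensitive, so @start and @end must match the lowercase names in the file.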
In my opinion, it is better to use an SSIS package for importing XML files.
It has a component named "XML Source" for loading an XML file.
There is a useful article at: https://www.sqlshack.com/import-xml-documents-into-sql-server-tables-using-ssis-packages/

Loading XML file into SQL Server table is not working

I have an XML file that I am trying to load into SQL Server, but when I run the script it does not return any rows.
<root>
<DeviceRecord xmlns="http://www.archer-tech.com/">
<IP>137.52</IP>
<FQDN>sdcww00</FQDN>
<NetBios_Name></NetBios_Name>
<Operating_System>Microsoft Windows Vista</Operating_System>
<Mac_Address></Mac_Address>
<Confidence_Level>65
</Confidence_Level>
</DeviceRecord>
<DeviceRecord xmlns="http://www.archer-tech.com/">
<IP>155.37.51</IP>
<FQDN>ww00048</FQDN>
<NetBios_Name></NetBios_Name>
<Operating_System>Microsoft Windows Vista</Operating_System>
<Mac_Address></Mac_Address>
<Confidence_Level>65
</Confidence_Level>
</DeviceRecord>
</root>
SQL Script
declare @xmldata as xml
set @xmldata= (SELECT CONVERT(XML, BulkColumn) AS BulkColumn
FROM OPENROWSET(BULK 'C:\Users\ag03536\Documents\New folder\updated.xml', SINGLE_BLOB)as X)
SELECT
x.Rec.query('./DeviceRecord').value('.','varchar(120)')
,x.Rec.query('./IP').value('.','varchar(20)')
,x.Rec.query('./FQDN').value('.','varchar(20)')
FROM @xmldata.nodes('./root') as x(rec)
First, check whether the XML is read properly. Run this after reading your XML into the variable:
SELECT @xmldata;
Second, all your values live in a default namespace. You have to declare it:
WITH XMLNAMESPACES(DEFAULT 'http://www.archer-tech.com/')
Third, your query should probably read all nested <DeviceRecord> entries, so you need .nodes() down to that level. The full query should be something like this:
WITH XMLNAMESPACES(DEFAULT 'http://www.archer-tech.com/')
SELECT
x.Rec.value('(IP/text())[1]','varchar(20)') AS DevRec_ID
,x.Rec.value('(FQDN/text())[1]','varchar(20)') AS DevRec_FQDN
--The rest should be the same approach...
FROM @xmldata.nodes('/*:root/DeviceRecord') as x(rec)
EDIT: Your <root> node is not part of the default namespace, so I used a wildcard (*:root).
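An alternative to the wildcard, sketched here rather than taken from the answer, is to bind the namespace to a prefix instead of making it the default. Then root can be addressed without a namespace and only the namespaced elements carry the prefix. The prefix d and the inline sample are assumptions for illustration.

```sql
-- Inline copy of the structure from the question, shortened to two columns.
DECLARE @xmldata XML = N'<root>
  <DeviceRecord xmlns="http://www.archer-tech.com/">
    <IP>137.52</IP>
    <FQDN>sdcww00</FQDN>
  </DeviceRecord>
  <DeviceRecord xmlns="http://www.archer-tech.com/">
    <IP>155.37.51</IP>
    <FQDN>ww00048</FQDN>
  </DeviceRecord>
</root>';

-- root has no namespace, so it stays unprefixed; everything below it uses d:.
WITH XMLNAMESPACES ('http://www.archer-tech.com/' AS d)
SELECT r.value('(d:IP/text())[1]', 'varchar(20)') AS IP,
       r.value('(d:FQDN/text())[1]', 'varchar(20)') AS FQDN
FROM @xmldata.nodes('/root/d:DeviceRecord') AS x(r);
```

This returns one row per DeviceRecord. Which form to prefer is a matter of taste; the prefixed form is more explicit about which elements are namespaced.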

SQL Server FOR XML PATH: Set xml-declaration or processing instruction "xml-stylesheet" on top

I want to set a processing instruction to include a stylesheet on top of an XML:
The same issue was with the xml-declaration (e.g. <?xml version="1.0" encoding="utf-8"?>)
Desired result:
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
My research brought me to node test syntax and processing-instruction().
This
SELECT 'type="text/xsl" href="stylesheet.xsl"' AS [processing-instruction(xml-stylesheet)]
,'Test' AS Test
,'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
produces this:
<TestPath>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>
All hints I found tell me to convert the XML to VARCHAR, concatenate it "manually" and convert it back to XML. But this is - how to say - ugly?
This works obviously:
SELECT CAST(
'<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<TestPath>
<Test>Test</Test>
<SomeMore>SomeMore</SomeMore>
</TestPath>' AS XML);
Is there a chance to solve this?
There is another way, which needs two steps but does not require treating the XML as a string anywhere in the process:
declare @result XML =
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
)
set @result.modify('
insert <?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
before /*[1]
')
The XQuery expression passed to modify() function tells SQL Server to insert the processing instruction node before the root element of the XML.
UPDATE:
I found another alternative based on the following thread: "Merge the two xml fragments into one?". I personally prefer this way:
SELECT CONVERT(XML, '<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>'),
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath')
)
FOR XML PATH('')
As it turned out, har07's great answer does not work with an XML declaration. The only way I could find was this:
DECLARE @ExistingXML XML=
(
SELECT
'Test' AS Test,
'SomeMore' AS SomeMore
FOR XML PATH('TestPath'),TYPE
);
DECLARE @XmlWithDeclaration NVARCHAR(MAX)=
(
SELECT N'<?xml version="1.0" encoding="UTF-8"?>'
+
CAST(@ExistingXml AS NVARCHAR(MAX))
);
SELECT @XmlWithDeclaration;
You must stay on the string side after this step. Any conversion back to real XML will either give an error (when the encoding is other than UTF-16) or will omit the XML declaration.
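The second case (the declaration being dropped) can be seen with a short sketch. The variable name @WithDeclaration is an assumption; the declared encoding is UTF-16 here because NVARCHAR is UTF-16, which is the case where the cast succeeds.

```sql
-- NVARCHAR is UTF-16 in SQL Server, so the declared encoding matches the string.
DECLARE @WithDeclaration NVARCHAR(MAX) =
    N'<?xml version="1.0" encoding="UTF-16"?><TestPath><Test>Test</Test></TestPath>';

-- The cast succeeds, but the xml type does not store the declaration,
-- so the result starts directly with <TestPath>.
SELECT CAST(@WithDeclaration AS XML) AS CastBack;
```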

How to get data from XML Column that contain xml namespace (SQL Server 2005)

I googled a lot and had no luck.
I can't retrieve data from an XML column whose data came from a web service via sp_OAGetProperty.
The XML column contains:
<ArrayOfCustomerInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/">
<Customer CustCode="001">
<CustName>John</CustName>
<Queues>
<Q>
<No>10</No>
<Line>1</Line>
</Q>
</Queues>
</Customer>
</ArrayOfCustomerInfo>
I get NULL when I execute the following statement
(but works fine if I remove all XML namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://tempuri.org/")
SELECT a.b.value('@CustCode','varchar(4)') AS Code
,a.b.value('CustName[1]','varchar(20)') AS Name
,c.d.value('No[1]','int') AS QNo
,c.d.value('(Line)[1]','int') AS QLine
FROM PGHRMS_Employees x
CROSS APPLY x.data.nodes('/ArrayOfCustomerInfo/Customer') AS a(b)
CROSS APPLY a.b.nodes('Queues/Q') AS c(d)
Please give me some advice. I have to achieve this with SQL Server :(
If anyone wants to reproduce it, I pasted the script at: http://pastebin.com/ueZGidyL
Thank you in advance !!!
Try this:
;WITH XMLNAMESPACES(DEFAULT 'http://tempuri.org/')
SELECT
Code = XC1.value('@CustCode', 'varchar(4)'),
Name = XC1.value('CustName[1]', 'varchar(20)'),
QNo = XC2.value('No[1]', 'int') ,
QLine = XC2.value('(Line)[1]','int')
FROM
PGHRMS_Employees
CROSS APPLY
XmlContent.nodes('/ArrayOfCustomerInfo/Customer') AS XT1(XC1)
CROSS APPLY
XC1.nodes('Queues/Q') AS XT2(XC2)
With the WITH XMLNAMESPACES construct, you can define some XML namespaces to be used by the following T-SQL statement - default or prefixed namespaces alike.
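As a sketch of the prefixed form, here is the same query against an inline variable, with the namespace bound to a prefix instead of being declared as the default. The prefix t, the variable @data and the shortened document are assumptions; DECLARE and SET are kept separate so the snippet also suits SQL Server 2005.

```sql
DECLARE @data XML;
SET @data = N'<ArrayOfCustomerInfo xmlns="http://tempuri.org/">
  <Customer CustCode="001">
    <CustName>John</CustName>
    <Queues><Q><No>10</No><Line>1</Line></Q></Queues>
  </Customer>
</ArrayOfCustomerInfo>';

-- Every element step carries the t: prefix; the CustCode attribute is not namespaced.
WITH XMLNAMESPACES ('http://tempuri.org/' AS t)
SELECT c.value('@CustCode', 'varchar(4)') AS Code,
       c.value('(t:CustName/text())[1]', 'varchar(20)') AS Name,
       q.value('(t:No/text())[1]', 'int') AS QNo,
       q.value('(t:Line/text())[1]', 'int') AS QLine
FROM @data.nodes('/t:ArrayOfCustomerInfo/t:Customer') AS a(c)
CROSS APPLY a.c.nodes('t:Queues/t:Q') AS b(q);
```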

Bulk Import of XML Into Existing Tables

I am new to XML and SQL Server and am trying to import an XML file into SQL Server 2010. I have 14 tables that I would like to parse the data into. All 14 table names are listed in the XML as nodes (I think). I found some example code that worked with a simple example XML, but my XML seems a little more complicated and may not be structured optimally; unfortunately, I can't change that. As a basic attempt, I tried to insert the data into just one field of one existing table (SILVX_SN16000), but the Message pane shows "(0 row(s) affected)". Thanks in advance for looking at this.
USE TEST
Declare @xml XML
Select @xml =
CONVERT(XML,bulkcolumn,2) FROM OPENROWSET(BULK 'C:\Users\Kevin_S\Documents \SilvxInSightImport.xml',SINGLE_BLOB) AS X
SET ARITHABORT ON
Insert into [SILVX_SN16000]
(
md_group
)
Select
P.value('MD_GROUP[1]','NVARCHAR(255)') AS md_group
From @xml.nodes('/TableData/Row') PropertyFeed(P)
Here is a much-shortened (rows removed) version of my XML:
<?xml version="1.0" ?>
<SilvxInSightImport Version="1.0" Host="uslsss17" Date="14-09-14_20-40-02">
<Tables Count="14">
<Table Name="SN16000">
<TableSchema>
<Column><COLUMN_NAME>PARENT_HPKEY</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>MD_GROUP</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>PKEY</COLUMN_NAME><DATA_TYPE>NUMBER</DATA_TYPE></Column>
<Column><COLUMN_NAME>S_STATE</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>NAME</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>ROUTER_ID</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
<Column><COLUMN_NAME>IP_ADDR</COLUMN_NAME><DATA_TYPE>VARCHAR2</DATA_TYPE></Column>
</TableSchema>
<TableData>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>161888</PKEY><NAME>UODEDTM010</NAME><ROUTER_ID>10.41.32.129</ROUTER_ID> <IP_ADDR>10.41.32.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>278599</PKEY><NAME>UODEETM010</NAME><ROUTER_ID>10.41.4.129</ROUTER_ID> <IP_ADDR>10.41.4.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
<Row><MD_GROUP>100.120.25162</MD_GROUP><PARENT_HPKEY>100</PARENT_HPKEY> <PKEY>183583</PKEY><NAME>UODEGRM010</NAME><ROUTER_ID>10.41.76.129</ROUTER_ID> <IP_ADDR>10.41.76.129</IP_ADDR><S_STATE>IS-NR</S_STATE></Row>
NT_HPKEY>100</PARENT_HPKEY><PKEY>811003</PKEY><NAME>UODWTIN010</NAME> <ROUTER_ID>10.27.36.130</ROUTER_ID><IP_ADDR>10.27.36.130</IP_ADDR><S_STATE>IS-NR</S_STATE> </Row>
</TableData>
</Table>
</Tables>
</SilvxInSightImport>
The xPath in .nodes() must specify the whole path to the Row nodes so you should start with SilvxInSightImport and work your way down to Row.
/SilvxInSightImport/Tables/Table/TableData/Row
In your case you have multiple table nodes, one for each table and I assume you only need one table at a time. You can use a predicate on the table name in the .nodes() xPath expression.
/SilvxInSightImport/Tables/Table[@Name = "SN16000"]/TableData/Row
Your whole query for SN16000 should look something like this.
select T.X.value('(MD_GROUP/text())[1]', 'varchar(20)') as MD_GROUP,
T.X.value('(PARENT_HPKEY/text())[1]', 'int') as PARENT_HPKEY,
T.X.value('(PKEY/text())[1]', 'int') as PKEY,
T.X.value('(NAME/text())[1]', 'varchar(20)') as NAME,
T.X.value('(ROUTER_ID/text())[1]', 'varchar(20)') as ROUTER_ID,
T.X.value('(IP_ADDR/text())[1]', 'varchar(20)') as IP_ADDR,
T.X.value('(S_STATE/text())[1]', 'varchar(20)') as S_STATE
from @XML.nodes('/SilvxInSightImport/Tables/Table[@Name = "SN16000"]/TableData/Row') as T(X)
You have to sort out the data types used for each column.
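Since there are 14 tables, one possible variation (not part of the answer above) is to pass the table name in through sql:variable() instead of hard-coding it in the path. In this sketch the variable @table_name, the second table "OTHER" and the shortened rows are assumptions for illustration.

```sql
-- Shortened inline sample; the "OTHER" table is hypothetical and only shows the filter working.
DECLARE @xml XML = N'<SilvxInSightImport Version="1.0">
  <Tables Count="2">
    <Table Name="SN16000">
      <TableData>
        <Row><MD_GROUP>100.120.25162</MD_GROUP><PKEY>161888</PKEY></Row>
        <Row><MD_GROUP>100.120.25162</MD_GROUP><PKEY>278599</PKEY></Row>
      </TableData>
    </Table>
    <Table Name="OTHER">
      <TableData>
        <Row><MD_GROUP>ignored</MD_GROUP><PKEY>1</PKEY></Row>
      </TableData>
    </Table>
  </Tables>
</SilvxInSightImport>';

DECLARE @table_name VARCHAR(50) = 'SN16000';

-- sql:variable() exposes the T-SQL variable inside the XQuery predicate,
-- so only rows under the matching Table element are returned.
SELECT T.X.value('(MD_GROUP/text())[1]', 'varchar(20)') AS MD_GROUP,
       T.X.value('(PKEY/text())[1]', 'int') AS PKEY
FROM @xml.nodes('/SilvxInSightImport/Tables/Table[@Name = sql:variable("@table_name")]/TableData/Row') AS T(X);
```

The column list still has to be written per table, so this only saves retyping the path.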
