Background: I am updating scripts in PowerShell that routinely export large amounts of data from a non-MS database to SQL Server on a different host.
On the export side, I have chosen the .NET System.Data.Dataset object as the format for the data. The transfer file is created using the WriteXml method with the WriteSchema option. This approach supports multiple tables and retains database schema information for the receiving server all in a single file.
Per request, a basic DataSet file might be:
<?xml version="1.0" standalone="yes"?>
<NewDataSet>
<xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
<xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element name="table1">
<xs:complexType>
<xs:sequence>
<xs:element name="col1" type="xs:string" minOccurs="0" />
<xs:element name="col2" type="xs:string" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="table2">
<xs:complexType>
<xs:sequence>
<xs:element name="col1" type="xs:string" minOccurs="0" />
<xs:element name="col2" type="xs:string" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>
<table1>
<col1>tkshrq</col1>
<col2>6krrtq</col2>
</table1>
<table1>
<col1>k60stu</col1>
<col2>sqnhp9</col2>
</table1>
<table2>
<col1>6k1thw</col1>
<col2>n2ocgz</col2>
</table2>
<table2>
<col1>26kmw5</col1>
<col2>ym3iwd</col2>
</table2>
</NewDataSet>
On the receiving side, I have an import script that utilizes Write-SqlTableData to bulk load tables from the DataSet file into temporary tables and then run a stored procedure to provide transaction isolation while data moves to the "live" tables.
I am hoping to find a method to directly access the DataSet file from within T-SQL so the import can be done by a single stored procedure.
I am aware how to set up linked servers for flat, "rowset" files (CSVs, DataTable, etc) and query them with OPENROWSET. But I have not been successful in accessing the multi-tabled DataSet file.
I am not interested in changing the transfer file format. It has several desired features and I'd rather deal with a zillion temporary tables than wrangle a zillion transfer files.
I am also aware of third party XML ODBC providers for SQL Server. But third party software is not permissible in this instance.
Please try the following solution.
It is using T-SQL and XQuery methods .nodes() and .value().
I saved your XML as 'e:\Temp\NewDataSet.xml' file.
SQL Server XML data type can hold up to 2GB size wise.
If the performance of the suggested method is not that good, depending on the volume of the data, it is possible to load the entire XML file into a temporary table with one row and one column.
SQL
DECLARE @tbl1 TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(50), col2 VARCHAR(50));
DECLARE @tbl2 TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(50), col2 VARCHAR(50));
DECLARE @xml XML;
SELECT @xml = XmlDoc
FROM OPENROWSET (BULK N'e:\Temp\NewDataSet.xml', SINGLE_BLOB, CODEPAGE='65001') AS Tab(XmlDoc);
INSERT INTO @tbl1 (col1, col2)
SELECT c.value('(col1/text())[1]', 'VARCHAR(50)') AS col1
, c.value('(col2/text())[1]','VARCHAR(50)') AS col2
FROM @xml.nodes('/NewDataSet/table1') AS t(c);
INSERT INTO @tbl2 (col1, col2)
SELECT c.value('(col1/text())[1]', 'VARCHAR(50)') AS col1
, c.value('(col2/text())[1]','VARCHAR(50)') AS col2
FROM @xml.nodes('/NewDataSet/table2') AS t(c);
-- test
SELECT * FROM @tbl1;
SELECT * FROM @tbl2;
Output
Table1
+----+--------+--------+
| ID | col1 | col2 |
+----+--------+--------+
| 1 | tkshrq | 6krrtq |
| 2 | k60stu | sqnhp9 |
+----+--------+--------+
Table2
+----+--------+--------+
| ID | col1 | col2 |
+----+--------+--------+
| 1 | 6k1thw | n2ocgz |
| 2 | 26kmw5 | ym3iwd |
+----+--------+--------+
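The single-row staging idea mentioned earlier in this answer could look roughly like the following. This is only a sketch: the temp table name #XmlStaging and the alias src are illustrative, and the file path is the same one assumed above.

```sql
-- Hypothetical staging table: one row, one XML column holding the whole file.
CREATE TABLE #XmlStaging (XmlDoc XML NOT NULL);

-- Read the file once as a binary blob and convert it to the XML type.
INSERT INTO #XmlStaging (XmlDoc)
SELECT CONVERT(XML, BulkColumn)
FROM OPENROWSET(BULK N'e:\Temp\NewDataSet.xml', SINGLE_BLOB) AS src;

-- Shred the table1 rows straight from the staged document.
SELECT c.value('(col1/text())[1]', 'VARCHAR(50)') AS col1
     , c.value('(col2/text())[1]', 'VARCHAR(50)') AS col2
FROM #XmlStaging AS s
CROSS APPLY s.XmlDoc.nodes('/NewDataSet/table1') AS t(c);

DROP TABLE #XmlStaging;
```

The same CROSS APPLY pattern, pointed at '/NewDataSet/table2', would cover the second table.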
I'm trying to write SQL code that dynamically converts the rows of a table to an XML format resembling the one shown below, but so far I have not found a way to do so.
Table sample:
key name age
---------------
1 Anakin 23
2 jill 40
XML
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="key" type="xs:int" minOccurs="0" />
<xs:element name="name" type="xs:string" minOccurs="0" />
<xs:element name="age" type="xs:int" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
Any suggestions or reading material that could help?
As stated in the comments, you can get simple element output with FOR XML PATH('person').
To generate both a schema and XML that conforms to it, start here:
create table person
([key] int,
[name] varchar(50),
[age] int);
go
insert into person values (1, 'Anakin', 23),(2, 'Jill', 40);
go
select * from person FOR XML AUTO, ELEMENTS, XMLSCHEMA;
If you require explicit element names or schema names you'll need to swap out the XML AUTO with XML EXPLICIT and provide your own schema document.
More info on inline XSD here: https://learn.microsoft.com/en-us/sql/relational-databases/xml/generate-an-inline-xsd-schema?view=sql-server-2017
More info on XML EXPLICIT here: https://learn.microsoft.com/en-us/sql/relational-databases/xml/use-explicit-mode-with-for-xml?view=sql-server-2017
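For the simple element output mentioned at the start of this answer, a minimal sketch might be the following. It uses a table variable with the same columns as the person table so that it runs on its own; the root element name 'people' is an arbitrary choice for illustration.

```sql
-- Same shape as the person table above, held in a table variable.
DECLARE @person TABLE ([key] INT, [name] VARCHAR(50), [age] INT);
INSERT INTO @person VALUES (1, 'Anakin', 23), (2, 'Jill', 40);

-- PATH('person') emits one <person> element per row, with one child
-- element per column; ROOT wraps all rows in a single top-level element.
SELECT [key], [name], [age]
FROM @person
FOR XML PATH('person'), ROOT('people');
```

This produces data only. It does not emit the xs:element schema fragment, which is what the XMLSCHEMA option is for.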
I have a schema with an xs:date element that is defined so that it may contain a date or be empty.
But when I try to query this element, I get an error:
"XQuery [value()]: 'value()' requires a singleton (or empty sequence), found operand of type 'xs:date *'"
Any suggestions?
Steps to reproduce
create xml schema collection dbo.[test] AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="PACKAGE" >
<xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element name="CUSTOMER">
<xs:complexType>
<xs:sequence>
<xs:element name="BIRTHDAY" >
<xs:annotation>
<xs:documentation>Date of Birth</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:restriction>
<xs:simpleType>
<xs:list itemType="xs:date" />
</xs:simpleType>
<xs:minLength value="0" />
<xs:maxLength value="1" />
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>';
go
declare @xml xml(dbo.[test]);
set @xml =
'<PACKAGE>
<CUSTOMER>
<BIRTHDAY></BIRTHDAY>
</CUSTOMER>
<CUSTOMER>
<BIRTHDAY>2010-01-01</BIRTHDAY>
</CUSTOMER>
</PACKAGE>'
select
BIRTHDAY = t.cust.value('(BIRTHDAY)[1]', 'date')
FROM @xml.nodes('/PACKAGE/CUSTOMER') as t(cust)
go
drop xml schema collection dbo.[test]
You can use the data Function (XQuery)
select
BIRTHDAY = t.cust.value('data(BIRTHDAY)[1]', 'date')
FROM @xml.nodes('/PACKAGE/CUSTOMER') as t(cust)
The function value() in SQL Server needs a single value and BIRTHDAY is defined as a list of dates. (BIRTHDAY)[1] will give you the first list of dates. data(BIRTHDAY)[1] will give you the first date in the list of dates stored in BIRTHDAY.
Got it! (another way)
select
BIRTHDAY = nullif(t.cust.query('BIRTHDAY').value('(BIRTHDAY)[1]', 'date'), '1900-01-01')
from @xml.nodes('/PACKAGE/CUSTOMER') as t(cust)
Results:
BIRTHDAY
----------
NULL
2010-01-01
I have a file that is structured like so:
<?xml version="1.0" encoding="UTF-8"?>
<EventSchedule>
<Event Uid="2" Type="Main Event">
<IsFixed>True</IsFixed>
<EventKind>MainEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
<Event>
...and so on
</Event>
</EventSchedule>
I'm not entirely sure whether it is valid XML. I need to import it into SQL Server, but nothing I have tried seems to work.
Please could anyone point me in the right direction either with some example code or a recommendation on which method to use?
I'd ideally like to get the raw data into a flat table, along the lines of:
Name | Type | Duration | EventKind
The Muppets | TV_Show | 00:30:00 | MainEvent
Finally, this is coming from fairly large files, and I will need to import them regularly.
Thanks, pugu
Try this:
DECLARE @XML XML = '<EventSchedule>
<Event Uid="2" Type="Main Event">
<IsFixed>True</IsFixed>
<EventKind>MainEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
<Event Uid="3" Type="Secondary Event">
<IsFixed>True</IsFixed>
<EventKind>SecondaryEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets II"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
</EventSchedule>'
SELECT
EventUID = Events.value('@Uid', 'int'),
EventType = Events.value('@Type', 'varchar(20)'),
EventIsFixed = Events.value('(IsFixed)[1]', 'varchar(20)'),
EventKind = Events.value('(EventKind)[1]', 'varchar(20)')
FROM
@XML.nodes('/EventSchedule/Event') AS XTbl(Events)
Gives me an output of:
And of course, you can easily do an
INSERT INTO dbo.YourTable(EventUID, EventType, EventIsFixed, EventKind)
SELECT
......
to insert that data into a relational table.
Update: assuming you have your XML in files - you can use this code to load the XML file into an XML variable in SQL Server:
DECLARE @XmlFile XML
SELECT @XmlFile = BulkColumn
FROM OPENROWSET(BULK 'path-to-your-XML-file', SINGLE_BLOB) x;
and then use the above code snippet to parse the XML.
Update #2: if you need the parameters, too - use this XQuery statement:
SELECT
EventUID = Events.value('@Uid', 'int'),
EventType = Events.value('@Type', 'varchar(20)'),
EventIsFixed = Events.value('(IsFixed)[1]', 'varchar(20)'),
EventKind = Events.value('(EventKind)[1]', 'varchar(20)'),
ParameterType = Events.value('(Fields/Parameter[@Name="Type"]/@Value)[1]', 'varchar(20)'),
ParameterName = Events.value('(Fields/Parameter[@Name="Name"]/@Value)[1]', 'varchar(20)'),
ParameterDuration = Events.value('(Fields/Parameter[@Name="Duration"]/@Value)[1]', 'varchar(20)')
FROM
@XML.nodes('/EventSchedule/Event') AS XTbl(Events)
Results in:
You do it by creating a destination table, then a schema mapping file that maps the xml elements to table columns.
Yours might look a bit like this:
create table event (
Type nvarchar(50),
Name nvarchar(50),
Duration nvarchar(50))
and this:
<?xml version="1.0" ?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
xmlns:dt="urn:schemas-microsoft-com:xml:datatypes"
xmlns:sql="urn:schemas-microsoft-com:xml-sql" >
<ElementType name="Type" dt:type="string" />
<ElementType name="Name" dt:type="string" />
<ElementType name="Duration" dt:type="string" />
<ElementType name="EventSchedule" sql:is-constant="1">
<element type="Event" />
</ElementType>
<ElementType name="Event" sql:relation="Event">
<element type="Type" sql:field="Type" />
<element type="Name" sql:field="Name" />
<element type="Duration" sql:field="Duration" />
</ElementType>
</Schema>
Then you can load your XML into your table using the XML bulk loader.
http://support.microsoft.com/kb/316005
If you need to do it without XML variable (from string in table-valued function)
SELECT
--myTempTable.XmlCol.value('.', 'varchar(36)') AS val
myTempTable.XmlCol.query('./ID').value('.', 'varchar(36)') AS ID
,myTempTable.XmlCol.query('./Name').value('.', 'nvarchar(MAX)') AS Name
,myTempTable.XmlCol.query('./RFC').value('.', 'nvarchar(MAX)') AS RFC
,myTempTable.XmlCol.query('./Text').value('.', 'nvarchar(MAX)') AS Text
,myTempTable.XmlCol.query('./Desc').value('.', 'nvarchar(MAX)') AS Description
--,myTempTable.XmlCol.value('(Desc)[1]', 'nvarchar(MAX)') AS DescMeth2
FROM
(
SELECT
CAST('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<data-set>
<record>
<ID>1</ID>
<Name>A</Name>
<RFC>RFC 1035[1]</RFC>
<Text>Address record</Text>
<Desc>Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host, but it is also used for DNSBLs, storing subnet masks in RFC 1101, etc.</Desc>
</record>
<record>
<ID>2</ID>
<Name>NS</Name>
<RFC>RFC 1035[1]</RFC>
<Text>Name server record</Text>
<Desc>Delegates a DNS zone to use the given authoritative name servers</Desc>
</record>
</data-set>
' AS xml) AS RawXml
) AS b
--CROSS APPLY b.RawXml.nodes('//record/ID') myTempTable(XmlCol);
CROSS APPLY b.RawXml.nodes('//record') myTempTable(XmlCol);
Or from file:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[tfu_RPT_SEL_XmlData]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [dbo].[tfu_RPT_SEL_XmlData]
GO
CREATE FUNCTION [dbo].[tfu_RPT_SEL_XmlData]
(
@in_language varchar(10)
,@in_reportingDate datetime
)
RETURNS TABLE
AS
RETURN
(
SELECT
--myTempTable.XmlCol.value('.', 'varchar(36)') AS val
myTempTable.XmlCol.query('./ID').value('.', 'varchar(36)') AS ID
,myTempTable.XmlCol.query('./Name').value('.', 'nvarchar(MAX)') AS Name
,myTempTable.XmlCol.query('./RFC').value('.', 'nvarchar(MAX)') AS RFC
,myTempTable.XmlCol.query('./Text').value('.', 'nvarchar(MAX)') AS Text
,myTempTable.XmlCol.query('./Desc').value('.', 'nvarchar(MAX)') AS Description
FROM
(
SELECT CONVERT(XML, BulkColumn) AS RawXml
FROM OPENROWSET(BULK 'D:\username\Desktop\MyData.xml', SINGLE_BLOB) AS MandatoryRowSetName
) AS b
CROSS APPLY b.RawXml.nodes('//record') myTempTable(XmlCol)
)
GO
SELECT * FROM tfu_RPT_SEL_XmlData('DE', CURRENT_TIMESTAMP);
e.g.
DECLARE @bla varchar(MAX)
SET @bla = 'BED40DFC-F468-46DD-8017-00EF2FA3E4A4,64B59FC5-3F4D-4B0E-9A48-01F3D4F220B0,A611A108-97CA-42F3-A2E1-057165339719,E72D95EA-578F-45FC-88E5-075F66FD726C'
-- http://stackoverflow.com/questions/14712864/how-to-query-values-from-xml-nodes
SELECT
x.XmlCol.value('.', 'varchar(36)') AS val
FROM
(
SELECT
CAST('<e>' + REPLACE(@bla, ',', '</e><e>') + '</e>' AS xml) AS RawXml
) AS b
CROSS APPLY b.RawXml.nodes('e') x(XmlCol);
So you can have a function like
SELECT * FROM MyTable
WHERE UID IN
(
SELECT
x.XmlCol.value('.', 'varchar(36)') AS val
FROM
(
SELECT
CAST('<e>' + REPLACE(@bla, ',', '</e><e>') + '</e>' AS xml) AS RawXml
) AS b
CROSS APPLY b.RawXml.nodes('e') x(XmlCol)
)
If you're trying to import your XML as a "pure" XML field, you should create a table like this (with as many other fields as you want):
CREATE TABLE [dbo].[TableXML](
[ID] [int] IDENTITY(1,1) NOT NULL,
[XmlContent] [xml] NOT NULL -- specify [xml] type
)
Then you can easily insert your XML as a string:
INSERT INTO [dbo].[TableXML]
([XmlContent])
VALUES
('<?xml version="1.0" encoding="UTF-8"?>
<EventSchedule>
<Event Uid="2" Type="Main Event">
<IsFixed>True</IsFixed>
<EventKind>MainEvent</EventKind>
<Fields>
<Parameter Name="Type" Value="TV_Show"/>
<Parameter Name="Name" Value="The Muppets"/>
<Parameter Name="Duration" Value="00:30:00"/>
</Fields>
</Event>
</EventSchedule>')
Then, to query it, start with the MSDN documentation on T-SQL XML.
If you prefer to store it as a string, use varchar(max) in place of the [xml] column type, with the same insert. But if you want to query it easily, I suggest the [xml] type. With the flat string approach you face a lot of work unless you write application code to parse it and store the result in a flat table.
A good approach could be to store the XML in a compact table and expose a VIEW that retrieves the data as flat columns.
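The storage-table-plus-view idea could be sketched as follows. The view name dbo.vEventFlat and the varchar(50) lengths are assumptions made for illustration; dbo.TableXML and its XmlContent column are the ones created above.

```sql
-- Hypothetical view that flattens each <Event> stored in dbo.TableXML.
CREATE VIEW dbo.vEventFlat
AS
SELECT t.ID
     , ev.value('(Fields/Parameter[@Name="Name"]/@Value)[1]', 'varchar(50)') AS [Name]
     , ev.value('(Fields/Parameter[@Name="Type"]/@Value)[1]', 'varchar(50)') AS [Type]
     , ev.value('(Fields/Parameter[@Name="Duration"]/@Value)[1]', 'varchar(50)') AS Duration
     , ev.value('(EventKind/text())[1]', 'varchar(50)') AS EventKind
FROM dbo.TableXML AS t
CROSS APPLY t.XmlContent.nodes('/EventSchedule/Event') AS x(ev);
GO

-- Callers then see ordinary flat columns.
SELECT [Name], [Type], Duration, EventKind
FROM dbo.vEventFlat;
```

The XML is shredded each time the view is queried, so for large documents it may be worth testing whether materializing the result into a regular table performs better.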
How do I load the XML data below into SQL Server?
<?xml version="1.0" encoding="utf-8"?>
<DataTable xmlns="SmarttraceWS">
<xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:MainDataTable="ActivityRecords" msdata:UseCurrentLocale="true">
<xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element name="ActivityRecords">
<xs:complexType>
<xs:sequence>
<xs:element name="ReferenceID" type="xs:long" minOccurs="0" />
<xs:element name="IMEI" type="xs:string" minOccurs="0" />
<xs:element name="Asset" type="xs:string" minOccurs="0" />
<xs:element name="Driver" type="xs:string" minOccurs="0" />
<xs:element name="DateTime" type="xs:string" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<DocumentElement xmlns="">
<ActivityRecords diffgr:id="ActivityRecords1" msdata:rowOrder="0">
<ReferenceID>2620443016</ReferenceID>
<IMEI>013795001360346</IMEI>
<Asset>L-93745</Asset>
<Driver>N/A</Driver>
<DateTime>2019-10-14 12:00:35</DateTime>
</ActivityRecords>
</DocumentElement>
</diffgr:diffgram>
</DataTable>
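One possible way to shred this document with the same .nodes() and .value() technique is sketched below. It is untested against the real feed. The prefixes ws and diffgr are arbitrary names chosen here, the inline schema is left out of the sample for brevity, and the column types are guesses based on the xs: types in that schema.

```sql
-- Trimmed copy of the document above (inline schema omitted).
DECLARE @xml XML = N'<DataTable xmlns="SmarttraceWS">
  <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
                   xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <DocumentElement xmlns="">
      <ActivityRecords diffgr:id="ActivityRecords1" msdata:rowOrder="0">
        <ReferenceID>2620443016</ReferenceID>
        <IMEI>013795001360346</IMEI>
        <Asset>L-93745</Asset>
        <Driver>N/A</Driver>
        <DateTime>2019-10-14 12:00:35</DateTime>
      </ActivityRecords>
    </DocumentElement>
  </diffgr:diffgram>
</DataTable>';

-- The root and the diffgram wrapper live in namespaces, so the path must
-- qualify them; DocumentElement resets to no namespace via xmlns="".
WITH XMLNAMESPACES ('SmarttraceWS' AS ws,
                    'urn:schemas-microsoft-com:xml-diffgram-v1' AS diffgr)
SELECT r.value('(ReferenceID/text())[1]', 'BIGINT')      AS ReferenceID
     , r.value('(IMEI/text())[1]',        'VARCHAR(20)') AS IMEI
     , r.value('(Asset/text())[1]',       'VARCHAR(50)') AS Asset
     , r.value('(Driver/text())[1]',      'VARCHAR(50)') AS Driver
     , r.value('(DateTime/text())[1]',    'VARCHAR(19)') AS [DateTime]
FROM @xml.nodes('/ws:DataTable/diffgr:diffgram/DocumentElement/ActivityRecords') AS t(r);
```

DateTime is read as text here because the schema declares it as xs:string; it can be converted to a date type afterwards if the format is reliable.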
Introduction
I'm trying to query an XML column in SQL Server 2008, but I get an error I can't fix.
This is the schema I use:
CREATE XML SCHEMA COLLECTION PublicationSchema AS '
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:import namespace="http://www.w3.org/XML/1998/namespace"
schemaLocation="xml.xsd"/>
<xs:element name="publication">
<xs:complexType>
<xs:sequence>
<xs:element ref="metadata"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="meta">
<xs:complexType>
<xs:attributeGroup ref="attlist-meta"/>
</xs:complexType>
</xs:element>
<xs:attributeGroup name="attlist-meta">
<xs:attribute name="name" use="required"/>
<xs:attribute name="content"/>
<xs:attribute name="scheme"/>
</xs:attributeGroup>
<xs:element name="metadata">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" ref="meta"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>'
GO
I create a table with an XML column using the schema:
create table test (content XML(PublicationSchema))
I insert some data:
insert into test values(N'<?xml version="1.0" encoding="UTF-16"?>
<publication>
<metadata>
<meta name="type" content="plan" scheme="city"/>
<meta name="statistics" content="second" scheme="informationtype"/>
</metadata>
</publication>')
Problem
When I execute a query:
select * from test
where Content.exist('/publication/metadata/meta[@name] = "type"') = 1
I get this error:
Msg 2213, Level 16, State 1, Line 3
XQuery [test.content.exist()]: Cannot atomize/apply data()
on expression that contains type 'meta' within inferred
type 'element(meta,#anonymous) *'
Question
Does anyone know what I can do to fix this query?
You have a syntax error in your exist function. You need to have the comparison between the brackets.
select *
from test
where Content.exist('/publication/metadata/meta[@name = "type"]') = 1
This would work just fine with the XML you have if it were not for your schema. Applying that schema gives the error you referred to in a comment, because you have no data type for the attribute name.
You have two options to fix this: alter the schema to include data types, or rewrite the query above so that SQL Server treats the attribute as not being part of the schema.
Specifying a data type for name would look like this.
<xs:attributeGroup name="attlist-meta">
<xs:attribute name="name" use="required" type="xs:string"/>
<xs:attribute name="content"/>
<xs:attribute name="scheme"/>
</xs:attributeGroup>
If you can not modify the schema you can use this query instead.
select *
from test
where Content.query('/publication/metadata/meta').exist('/*[@name = "type"]') = 1