I'm trying to insert an UTF-8 csv file into the MSSQL.
The file contains scandiniavian, russian and chinese characters which do not get properly represented by the insert. e.g. Brøndy is being represented as Br├©ndy. Furthermore, this string takes up a length of 7 spaces instead of 6.
First Try
Table Definition
CREATE TABLE [PSAP].[staging].[ADRC]
(
[MANDT] [NVARCHAR] (12) COLLATE Latin1_General_CI_AS NULL
, [CITY1] [NVARCHAR] (80) COLLATE Latin1_General_CI_AS NULL
)
XML Format File
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"' />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR='";"' />
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR='";"' />
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR='"' />
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR='\n' />
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="CLIENT" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="ADRC" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
Bulk Insert Statement
BULK INSERT [PSAP].[staging].[ADRC]
FROM 'E:\sourceData\ADRC.csv' WITH (FIRSTROW = 2, FORMATFILE = 'E:\sourceData\ADRC.csv.xml' )
Data Example
"CLIENT";"CITY1"
"600";"Brøndy"
"600";"武戏"
I am trying to extract data from a big xml file using OpenXML and BulkColumn and then save it into a new table called badges.
I am also executing a select statement to show the content of the table.
The file is stored locally. The file uses the attribute-centric mapping and has tens of thousands of rows.
The code I am using is:
CREATE TABLE dbo.badges (
Id int,
Name NVARCHAR(1000),
Date date,
Class smallint,
TagBased nvarchar(10),
);
DECLARE #XMLDoc XML;
DECLARE #XMLDocID INT;
SELECT #XMLDoc = BulkColumn
FROM OPENROWSET(BULK 'C:\Users\Zuhair\Desktop\Badges.xml', SINGLE_BLOB);
EXEC sys.sp_xml_preparedocument #XMLDocID OUTPUT, #XMLDoc;
SELECT Id, Name, Date, Class, TagBased
FROM OPENXML(#XMLDocID, '/badges/row')
WITH (Id int 'Id',
Name NVARCHAR(1000) 'Name',
Date date 'Date',
Class smallint 'Class',
TagBased nvarchar(10) 'TagBased');
INSERT INTO dbo.badges (Id, Name, Date, Class, TagBased)
SELECT *
FROM OPENXML(#XMLDocID, '/badges/row')
WITH (Id int 'Id',
Name NVARCHAR(1000) 'Name',
Date date 'Date',
Class smallint 'Class',
TagBased nvarchar(10) 'TagBased');
exec sp_xml_removedocument #XMLDocID;
However, when I execute the above code I get the following result:
Here is a sample of the XML data that I am using:
<badges>
<row Id="1" UserId="2" Name="Autobiographer" Date="2010-08-11T18:25:03.937" Class="3" TagBased="False" />
<row Id="2" UserId="3" Name="Autobiographer" Date="2010-08-11T18:25:03.997" Class="3" TagBased="False" />
<row Id="3" UserId="4" Name="Autobiographer" Date="2010-08-11T18:25:04.107" Class="3" TagBased="False" />
<row Id="4" UserId="22" Name="Autobiographer" Date="2010-08-11T19:35:05.283" Class="3" TagBased="False" />
<row Id="5" UserId="33" Name="Autobiographer" Date="2010-08-11T19:35:05.330" Class="3" TagBased="False" />
<row Id="6" UserId="27" Name="Autobiographer" Date="2010-08-11T19:40:05.490" Class="3" TagBased="False" />
...
</badges>
Why am I getting this result rather than a table that has the desired data?
The usage of FROM OPENXML with the related procedures to prepare and remove a document are out-dated and should not be used any more.
Try this:
DECLARE #xml XML =
'<badges>
<row Id="1" UserId="2" Name="Autobiographer" Date="2010-08-11T18:25:03.937" Class="3" TagBased="False" />
<row Id="2" UserId="3" Name="Autobiographer" Date="2010-08-11T18:25:03.997" Class="3" TagBased="False" />
<row Id="3" UserId="4" Name="Autobiographer" Date="2010-08-11T18:25:04.107" Class="3" TagBased="False" />
<row Id="4" UserId="22" Name="Autobiographer" Date="2010-08-11T19:35:05.283" Class="3" TagBased="False" />
<row Id="5" UserId="33" Name="Autobiographer" Date="2010-08-11T19:35:05.330" Class="3" TagBased="False" />
<row Id="6" UserId="27" Name="Autobiographer" Date="2010-08-11T19:40:05.490" Class="3" TagBased="False" />
</badges>';
SELECT r.value('#Id','int') AS Id
,r.value('#UserId','int') AS UserId
,r.value('#Name','varchar(max)') AS Name
,r.value('#Date','datetime') AS [Date]
,r.value('#Class','int') AS Class
,r.value('#TagBased','bit') AS TagBased
FROM #xml.nodes('/badges/row') AS A(r)
UPDATE The full (minimal) code
DECLARE #XMLDoc XML;
SELECT #XMLDoc = BulkColumn
FROM OPENROWSET(BULK 'C:\Users\Zuhair\Desktop\Badges.xml', SINGLE_BLOB) AS x;
SELECT r.value('#Id','int') AS Id
,r.value('#UserId','int') AS UserId
,r.value('#Name','varchar(max)') AS Name
,r.value('#Date','datetime') AS [Date]
,r.value('#Class','int') AS Class
,r.value('#TagBased','bit') AS TagBased
FROM #XMLDoc.nodes('/badges/row') AS A(r)
First and foremost, I like Shnugo's answer and I believe that is the path you should follow.
For your specific question, the reason you got all the NULL is because you are extracting data from the ATTRIBUTE and you forgot all the #.
Try the code below:
DECLARE #XMLDoc XML =
'<badges>
<row Id="1" UserId="2" Name="Autobiographer" Date="2010-08-11T18:25:03.937" Class="3" TagBased="False" />
<row Id="2" UserId="3" Name="Autobiographer" Date="2010-08-11T18:25:03.997" Class="3" TagBased="False" />
<row Id="3" UserId="4" Name="Autobiographer" Date="2010-08-11T18:25:04.107" Class="3" TagBased="False" />
<row Id="4" UserId="22" Name="Autobiographer" Date="2010-08-11T19:35:05.283" Class="3" TagBased="False" />
<row Id="5" UserId="33" Name="Autobiographer" Date="2010-08-11T19:35:05.330" Class="3" TagBased="False" />
<row Id="6" UserId="27" Name="Autobiographer" Date="2010-08-11T19:40:05.490" Class="3" TagBased="False" />
</badges>';
DECLARE #XMLDocID INT;
EXEC sys.sp_xml_preparedocument #XMLDocID OUTPUT, #XMLDoc;
SELECT Id, Name, Date, Class, TagBased
FROM OPENXML(#XMLDocID, '/badges/row')
WITH (Id int '#Id',
Name NVARCHAR(1000) '#Name',
Date date '#Date',
Class smallint '#Class',
TagBased nvarchar(10) '#TagBased');
exec sp_xml_removedocument #XMLDocID;
How to import below XML data into SQL Server table with three columns?
<dataset>
<metadata>
<item name="NAME_LAST" type="xs:string" length="62" />
<item name="NAME_FIRST" type="xs:string" length="62" />
<item name="NAME_MIDDLE" type="xs:string" length="32" />
</metadata>
<data>
<row>
<value>SMITH</value>
<value>MARY</value>
<value>N</value>
</row>
<row>
<value>SMITH2</value>
<value>MARY2</value>
<value>N2</value>
</row>
</data>
</dataset>
Try this:
DECLARE #input XML = '<dataset>
<metadata>
<item name="NAME_LAST" type="xs:string" length="62" />
<item name="NAME_FIRST" type="xs:string" length="62" />
<item name="NAME_MIDDLE" type="xs:string" length="32" />
</metadata>
<data>
<row>
<value>SMITH</value>
<value>MARY</value>
<value>N</value>
</row>
<row>
<value>SMITH2</value>
<value>MARY2</value>
<value>N2</value>
</row>
</data>
</dataset>'
INSERT INTO dbo.YourTable(ColName, ColFirstName, ColOther)
SELECT
Name = XCol.value('(value)[1]','varchar(25)'),
FirstName = XCol.value('(value)[2]','varchar(25)'),
OtherValue = XCol.value('(value)[3]','varchar(25)')
FROM
#input.nodes('/dataset/data/row') AS XTbl(XCol)
Insert XML Data into sql Server table
Declare #retValue1 varchar(50);
Declare #XmlStr XML;
SET #XmlStr='<Customers>
<customer>
<ID>111589</ID>
<FirstName>name1</FirstName>
<LastName>Lname1</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>12345</ID>
<FirstName>name2</FirstName>
<LastName>Lname2</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>14567</ID>
<FirstName>name3</FirstName>
<LastName>Lname3</LastName>
<Company>DEF</Company>
</customer>
</Customers>';
#retValue='Failed';
INSERT INTO [test_xmlinsert](
[id],
[firstName],
[lastName],
[company]
)
SELECT
COALESCE([Table].[Column].value('ID[1]', 'int'),0) as 'ID',
[Table].[Column].value('FirstName [1]', 'varchar(50)') as ' FirstName ',
[Table].[Column].value(' LastName[1]', 'varchar(50)') as ' LastName',
[Table].[Column].value(' Company [1]', 'varchar(50)') as ' Company'
FROM #XmlStr.nodes('/ Customers / customer') as [Table]([Column])
IF(##ROWCOUNT > 0 )
SET #retValue='SUCCESS';
I have an XML Query like this:
<ChangeSet xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Change DateTime="2011-12-02T09:01:58.3615661-08:00" UserId="3123">
<Table ChangeType="Insert" Name="EVNT_LN_AFF">
<Keys>
<Key FieldName="DIR_CD" Value="NB" />
<Key FieldName="LN_ID" Value="A" />
<Key FieldName="EVNT_ID" Value="10T000289" />
</Keys>
<ChangedFields>
<Field FieldName="DIR_CD" Previous="" Current="NB" />
<Field FieldName="LN_ID" Previous="" Current="A" />
<Field FieldName="EVNT_ID" Previous="" Current="10T000289" />
<Field FieldName="UD_DTTM" Previous="" Current="12/2/2011 9:01:59 AM" />
<Field FieldName="UD_USER_ID" Previous="" Current="3123" />
</ChangedFields>
</Table>
(The query goes on)
Now I want to use a statement like this:
SELECT TOP 1000 [CHG_LOG_ID]
, [EVNT_ID]
, [DATA_XML_TXT]
, [UD_DTTM]
FROM [MY_PROJ].[dbo].[EVNT_CHG_LOG]
WHERE DATA_XML_TXT.value('(/ChangeSet/Change/Table/ChangedFields/UD_USER_ID)[0]','varchar(50)') like '%3123%'
But when I execute the query, I don't get any results.
I tested the following XQuery, and it should give you what you need:
SELECT TOP 1000 [CHG_LOG_ID]
, [EVNT_ID]
, [DATA_XML_TXT]
, [UD_DTTM]
FROM [MY_PROJ].[dbo].[EVNT_CHG_LOG]
WHERE DATA_XML_TXT.value('(/ChangeSet/Change/Table/ChangedFields/Field[#FieldName="UD_USER_ID"]/#Current)[1]','varchar(50)') like '%3123%'
Note: Indexing for XQuery starts at 1 instead of 0
I'm writing a stored procedure to process XML data uploaded by the user:
<People>
<Person Id="1" FirstName="..." LastName="..." />
<Person Id="2" FirstName="..." LastName="..." />
<Person Id="3" FirstName="..." LastName="..." />
<Person Id="4" FirstName="..." LastName="..." />
<Person Id="5" FirstName="..." LastName="..." />
</People>
I would like to use a schema to make sure that the entities are valid, but I don't want the entire process to fail just because of one invalid entity. Instead, I would like to log all invalid entities to a table and process the valid entities as normal.
Is there a recommended way to do this?
A pure SQL approach would be:
Create a schema collection that defines <Person>:
CREATE XML SCHEMA COLLECTION [dbo].[testtest] AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Person">
<xs:complexType>
<xs:attribute name="Id" type="xs:int" use="required"/>
<xs:attribute name="FirstName" type="xs:string" use="required"/>
<xs:attribute name="LastName" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
'
(one-time operation)
Have an XML query that selects each <Person> node from <People> as a separate row.
Declare a cursor on that query and select each row into an untyped xml variable. After the select, try to assign to a typed xml variable from within a try-catch block.
Resulting code would look like:
declare #source xml = N'
<People>
<Person Id="1" FirstName="..." LastName="..." />
<Person Id="2" FirstName="..." LastName="..." />
<Person Id="f" FirstName="..." LastName="..." />
<Person Id="4" FirstName="..." LastName="..." />
<Person Id="5" FirstName="..." LastName="..." />
</People>';
declare foo cursor
local
forward_only
read_only
for
select t.p.query('.')
from #source.nodes('People/Person') as t(p)
;
declare #x xml (dbo.testtest);
declare #x_raw xml;
open foo;
fetch next from foo into #x_raw;
while ##fetch_status = 0
begin
begin try
set #x = #x_raw;
print cast(#x_raw as nvarchar(max)) + ': OK';
end try
begin catch
print cast(#x_raw as nvarchar(max)) + ': FAILED';
end catch;
fetch next from foo into #x_raw;
end;
close foo;
deallocate foo;
Result:
<Person Id="1" FirstName="..." LastName="..."/>: OK
<Person Id="2" FirstName="..." LastName="..."/>: OK
<Person Id="f" FirstName="..." LastName="..."/>: FAILED
<Person Id="4" FirstName="..." LastName="..."/>: OK
<Person Id="5" FirstName="..." LastName="..."/>: OK
A simpler option is to create a CLR stored procedure that would parse XML in a .NET language.