SQL Server BULK INSERT fixed length char data - sql-server

I use SQL Server 2008 and have a table with 5 char typed columns.
CREATE TABLE [dbo].[deviceDataBulk](
[f1] [char](9) NULL,
[f2] [char](5) NULL,
[f3] [char](7) NULL,
[f4] [char](7) NULL,
[f5] [char](6) NULL)
I also have a bcp format file ;
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="9" COLLATION="Turkish_CI_AS"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="5" COLLATION="Turkish_CI_AS"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="7" COLLATION="Turkish_CI_AS"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="7" COLLATION="Turkish_CI_AS"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="6" COLLATION="Turkish_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="f1" NULLABLE="YES" xsi:type="SQLCHAR"/>
<COLUMN SOURCE="2" NAME="f2" NULLABLE="YES" xsi:type="SQLCHAR"/>
<COLUMN SOURCE="3" NAME="f3" NULLABLE="YES" xsi:type="SQLCHAR"/>
<COLUMN SOURCE="4" NAME="f4" NULLABLE="YES" xsi:type="SQLCHAR"/>
<COLUMN SOURCE="5" NAME="f5" NULLABLE="YES" xsi:type="SQLCHAR"/>
</ROW>
My data file contains fixed-length char data with no field terminators, so a full line is 34 characters long.
My problem is that fields 4 and 5 may not be present in every row: a line in that file may be only 21 or 28 characters long.
Field 5 never appears without field 4.
The possible layouts for a line in the text file are:
f1 f2 f3 f4 f5
f1 f2 f3 f4
f1 f2 f3
I couldn't insert this file with BULK INSERT. I want BULK INSERT to insert NULLs for the missing fields: if it reaches the end of a line, it should just insert NULLs for the rest of the fields.

How about a two-step approach? First load the data into a staging table as "big rows", then use a second query to split the raw lines into their corresponding fields and handle the case of missing f5 and/or f4 columns.
It would look more or less like this (untested!):
CREATE TABLE [dbo].[deviceDataBulk_staging](
[rowid] int IDENTITY(1 , 1) PRIMARY KEY,
[raw] [varchar](34) NOT NULL)
GO
BULK INSERT [deviceDataBulk_staging]
FROM '<your file>'
-- not sure if you really need a format-file here,
-- simply make sure to pass the correct line-separator if it is 'exotic'.
GO
INSERT [deviceDataBulk] (f1, f2, f3, f4, f5)
SELECT f1 = SUBSTRING([raw], 1, 9),
f2 = SUBSTRING([raw], 10, 5),
f3 = SUBSTRING([raw], 15, 7),
-- a 21-character line has no f4; a 28-character line has no f5
f4 = (CASE WHEN LEN([raw]) < 22 THEN NULL ELSE SUBSTRING([raw], 22, 7) END),
f5 = (CASE WHEN LEN([raw]) < 29 THEN NULL ELSE SUBSTRING([raw], 29, 6) END)
FROM [deviceDataBulk_staging]
ORDER BY [rowid]
The Staging file would then look like :
The [rowid] is there to keep the order identical to the order originally in the file, you might not need it but IMHO the overhead is minimal and MSSQL isn't too keen on HEAP tables anyway so having it there is "A good thing [Tm]"
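The splitting rule in the second step can be checked outside SQL Server. The following is a minimal Python sketch, not part of the original answer: the function name split_fixed and the sample lines are made up for illustration, and the "missing field" test is applied uniformly to every field rather than only to f4 and f5.

```python
from typing import List, Optional

# Field widths copied from the deviceDataBulk definition (f1..f5).
WIDTHS = (9, 5, 7, 7, 6)

def split_fixed(line: str, widths=WIDTHS) -> List[Optional[str]]:
    """Slice one raw line into fields; fields past the end of the line become None."""
    fields: List[Optional[str]] = []
    start = 0
    for width in widths:
        # Same test as LEN([raw]) < 22 / < 29 in the T-SQL: a field is
        # missing when the line ends before the field's first character.
        fields.append(line[start:start + width] if len(line) > start else None)
        start += width
    return fields

# Hypothetical sample lines of 21, 28 and 34 characters.
short_row = split_fixed("A" * 9 + "B" * 5 + "C" * 7)
medium_row = split_fixed("A" * 9 + "B" * 5 + "C" * 7 + "D" * 7)
full_row = split_fixed("A" * 9 + "B" * 5 + "C" * 7 + "D" * 7 + "E" * 6)
```

Here None plays the role of NULL: short_row ends with two None values and medium_row with one.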

Related

Parse, filter nested XML in TSQL

I have a table (table1) which has a column containing XML data.
I need to parse that XML and create rows of data from the child elements of the element -
The output needs to be something like
TestID Sequence ParentSequence ExtID ExtName
-1 1 -1 1 ABC
-1 2 -1 1 DEF
-1 2 -1 1 GHI
But I am getting an empty result set with every other method I tried.
I have focused on accessing Sequence as rest will follow the same process.
Not sure why this does not work. Any help in this regard is appreciated. Thank you. The SQL I have tried is after the XML(commented text is the options I have tried)
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmlfield NVARCHAR(MAX));
INSERT INTO @tbl (xmlfield) VALUES
(N'<OBJECT CLASS="Test1" ID="-1" FULL="FULL" VERSION="1">
<SUBTYPE NAME="SubType1">
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">1</FIELD>
<FIELD NAME="ParentSequence">-1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">ABC</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">2</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">DEF</FIELD>
<FIELD NAME="__ExtendedData"><OBJECT
CLASS="Meet123" ID="-1" FULL="FULL"
VERSION="1"><FIELD
NAME="OrderDetailID">-1</FIELD><FIELD
NAME="OrderID">-1</FIELD><FIELD
NAME="Sequence">0</FIELD><FIELD
NAME="AttendeeID">123</FIELD><FIELD NAME="
AttendeeID _Name">Test, Mark/I H 6</FIELD><FIELD
NAME="ShowList">1</FIELD><FIELD
NAME="BdgeName">Mark</FIELD><FIELD
NAME="BadgeCompanyName">I H 6</FIELD>
</OBJECT></FIELD>
</OBJECT>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">3</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">GHI</FIELD>
</OBJECT>
</SUBTYPE>
<SUBTYPE NAME="SubType2"/>
<SUBTYPE NAME="SubType3"/>
</OBJECT>');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT ID, TRY_CAST(xmlfield AS XML) AS cartxml
FROM @tbl
)
SELECT ID
, c.value('(FIELD[@NAME="TestID"]/text())[1]', 'INT') AS TestID
, c.value('(FIELD[@NAME="Sequence"]/text())[1]', 'INT') AS [Sequence]
, c.value('(FIELD[@NAME="ParentSequence"]/text())[1]', 'INT') AS ParentSequence
, c.value('(FIELD[@NAME="ExtID"]/text())[1]', 'INT') AS ExtID
, c.value('(FIELD[@NAME="ExtName"]/text())[1]', 'VARCHAR(20)') AS ExtName
,c1.value('(FIELD[@NAME="AttendeeID"]/text())[1]', 'VARCHAR(20)') AS AttendeeId,
,c1.value('(FIELD[@NAME="AttendeeID_Name"]/text())[1]', 'VARCHAR(20)') AS AttendeeName,
FROM src As T
CROSS APPLY cartxml.nodes('/OBJECT/SUBTYPE/OBJECT[@ID="-1"]') as t2(c)
OUTER APPLY cartxml.nodes('/OBJECT/SUBTYPE/OBJECT[@ID="-1"]/FIELD[@NAME="__ExtendedData"]') as t3(c1)
Please try the following solution.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xmlfield NVARCHAR(MAX));
INSERT INTO @tbl (xmlfield) VALUES
(N'<OBJECT CLASS="Test1" ID="-1" FULL="FULL" VERSION="1">
<SUBTYPE NAME="SubType1">
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">1</FIELD>
<FIELD NAME="ParentSequence">-1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">ABC</FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">2</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">DEF</FIELD>
<FIELD NAME="__ExtendedData"><OBJECT
CLASS="Meet123" ID="-1" FULL="FULL"
VERSION="1"><FIELD
NAME="OrderDetailID">-1</FIELD><FIELD
NAME="OrderID">-1</FIELD><FIELD
NAME="Sequence">0</FIELD><FIELD
NAME="AttendeeID">123</FIELD><FIELD NAME="AttendeeID_Name">Test, Mark/I H 6</FIELD><FIELD
NAME="ShowList">1</FIELD><FIELD
NAME="BdgeName">Mark</FIELD><FIELD
NAME="BadgeCompanyName">I H 6</FIELD>
</OBJECT></FIELD>
</OBJECT>
<OBJECT NAME="SubType111" ID="-1">
<FIELD NAME="TestID">-1</FIELD>
<FIELD NAME="Sequence">3</FIELD>
<FIELD NAME="ParentSequence">1</FIELD>
<FIELD NAME="ExtID">-1</FIELD>
<FIELD NAME="ExtName">GHI</FIELD>
</OBJECT>
</SUBTYPE>
<SUBTYPE NAME="SubType2"/>
<SUBTYPE NAME="SubType3"/>
</OBJECT>');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT ID, TRY_CAST(xmlfield AS XML) AS cartxml
FROM @tbl
)
SELECT ID
, c.value('(FIELD[@NAME="TestID"]/text())[1]', 'INT') AS TestID
, c.value('(FIELD[@NAME="Sequence"]/text())[1]', 'INT') AS [Sequence]
, c.value('(FIELD[@NAME="ParentSequence"]/text())[1]', 'INT') AS ParentSequence
, c.value('(FIELD[@NAME="ExtID"]/text())[1]', 'INT') AS ExtID
, c.value('(FIELD[@NAME="ExtName"]/text())[1]', 'VARCHAR(20)') AS ExtName
, w.value('(OBJECT/FIELD[@NAME="AttendeeID"]/text())[1]', 'VARCHAR(20)') AS AttendeeID
, w.value('(OBJECT/FIELD[@NAME="AttendeeID_Name"]/text())[1]', 'VARCHAR(20)') AS AttendeeID_Name
FROM rs AS t
CROSS APPLY cartxml.nodes('/OBJECT/SUBTYPE/OBJECT[@ID="-1"]') as t1(c)
CROSS APPLY (VALUES(TRY_CAST(c.query('FIELD[@NAME="__ExtendedData"]').value('.','NVARCHAR(MAX)') AS XML))) AS t2(w)
WHERE w.exist('/OBJECT[@CLASS="Meet123"]') = 1;
Output
ID TestID Sequence ParentSequence ExtID ExtName AttendeeID AttendeeID_Name
-- ------ -------- -------------- ----- ------- ---------- ----------------
1  -1     2        1              -1    DEF     123        Test, Mark/I H 6

Reading XML in SQL Server of the specified XML node's "name" and get its "value"

How can I read the following XML in a SQL Server database:
<Row>
<Keys>
<Key>
<Name>NAME_2</Name>
<Data>22</Data>
</Key>
<Key>
<Name>NAME_3</Name>
<Data>33</Data>
</Key>
<Key>
<Name>NAME_1</Name>
<Data>98</Data>
</Key>
</Keys>
</Row>
I want to select from that XML and get only one row with columns:
NAME_1, NAME_2, NAME_3.
That's why I need something which would let me to find Keys/Key/Name with the value: NAME_1 and return its Keys/Key/Data, and so on ...
Expected resultset (1 ROW):
NAME_1 NAME_2 NAME_3
-----------------------
98 22 33
One more important thing. Those values NAME_1, NAME_2, NAME_3. I am expecting them. That's why I need to query for them and return their values for a row.
This was my approach with expected names:
DECLARE @xml XML=
'<Row>
<Keys>
<Key>
<Name>NAME_2</Name>
<Data>22</Data>
</Key>
<Key>
<Name>NAME_3</Name>
<Data>33</Data>
</Key>
<Key>
<Name>NAME_1</Name>
<Data>98</Data>
</Key>
</Keys>
</Row>';
--Most explicit (recommended) version:
SELECT @xml.value('(/Row/Keys/Key[(Name/text())[1]="NAME_1"]/Data/text())[1]','int') AS NAME_1
,@xml.value('(/Row/Keys/Key[(Name/text())[1]="NAME_2"]/Data/text())[1]','int') AS NAME_2
,@xml.value('(/Row/Keys/Key[(Name/text())[1]="NAME_3"]/Data/text())[1]','int') AS NAME_3;
--The same, but less explicit (and therefore not recommended)
SELECT @xml.value('(//Key[Name="NAME_1"]/Data)[1]','int') AS NAME_1
,@xml.value('(//Key[Name="NAME_2"]/Data)[1]','int') AS NAME_2
,@xml.value('(//Key[Name="NAME_3"]/Data)[1]','int') AS NAME_3;
The idea in short:
We fetch each value directly from the XML.
We pick the <Key> with an XQuery predicate that asks for the element whose <Name> has the given string as its content (= text() node).
We dive into the <Data> node below that <Key>, fetch its content and return it as int.
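The predicate-based lookup described above can also be tried outside SQL Server. This is a rough Python sketch using the standard library's ElementTree, whose limited XPath subset accepts the same kind of [Name='...'] predicate. The helper name data_for is an assumption made for illustration, and ElementTree is not the XQuery engine SQL Server uses.

```python
import xml.etree.ElementTree as ET

XML = """<Row>
  <Keys>
    <Key><Name>NAME_2</Name><Data>22</Data></Key>
    <Key><Name>NAME_3</Name><Data>33</Data></Key>
    <Key><Name>NAME_1</Name><Data>98</Data></Key>
  </Keys>
</Row>"""

def data_for(root, name):
    """Return the <Data> value of the <Key> whose <Name> equals name, or None."""
    # [Name='...'] keeps only the <Key> elements with a matching <Name> child.
    node = root.find(f"./Keys/Key[Name='{name}']/Data")
    return int(node.text) if node is not None else None

root = ET.fromstring(XML)
# One "row" with a column per expected name, regardless of document order.
row = {name: data_for(root, name) for name in ("NAME_1", "NAME_2", "NAME_3")}
```

A name that is absent from the document yields None, comparable to the NULL that .value() returns for an empty sequence.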
Assuming you know the column names will always be NAME_1, NAME_2, NAME_3, etc, you can do the following without having to resort to dynamic SQL.
DECLARE @xml xml =
'<Row>
<Keys>
<Key>
<Name>NAME_2</Name>
<Data>22</Data>
</Key>
<Key>
<Name>NAME_3</Name>
<Data>33</Data>
</Key>
<Key>
<Name>NAME_1</Name>
<Data>98</Data>
</Key>
</Keys>
</Row>';
SELECT
*
FROM (
SELECT
x.f.value( 'Data[1]', 'int' ) AS [Data],
x.f.value( 'Name[1]', 'varchar(50)' ) AS [Name]
FROM @xml.nodes( '//Row/Keys/Key' ) x( f )
) AS d
PIVOT (
MAX( [Data] )
FOR [Name] IN ( [NAME_1], [NAME_2], [NAME_3] )
) AS x;
Returns
+--------+--------+--------+
| NAME_1 | NAME_2 | NAME_3 |
+--------+--------+--------+
| 98 | 22 | 33 |
+--------+--------+--------+

Generate xml from ms sql with head and item tables

I'd like to generate an xml from MS SQL with a structure like this (invoice head info, then invoice items info):
<?xml version="1.0" encoding="ISO8859-2"?>
<Data HD="1" View="InvoiceGen">
<Row Table="InvoiceHead">
<InvoiceNumber>630506</InvoiceNumber>
<CustomerId>1432</CustomerId>
</Row>
<Row Table="InvoiceItem">
<ItemNumber>B52</ItemNumber>
<Price>320</Price>
<Tax>30</Tax>
</Row>
<Row>
<ItemNumber>B53</ItemNumber>
<Price>330</Price>
<Tax>32</Tax>
</Row>
<Row Table="InvoiceHead">
<InvoiceNumber>630626</InvoiceNumber>
<CustomerId>1556</CustomerId>
</Row>
<Row Table="InvoiceItem">
<ItemNumber>B5</ItemNumber>
<Price>500</Price>
<Tax>55</Tax>
</Row>
<Row>
<ItemNumber>B5</ItemNumber>
<Price>200</Price>
<Tax>20</Tax>
</Row>
<Row>
<ItemNumber>B18</ItemNumber>
<Price>180</Price>
<Tax>16</Tax>
</Row>
</Data>
I have an invoice head table and an invoice item table (InvoiceNumber makes the connection between the two):
InvoiceHead (InvoiceNumber,CustomerId)
InvoiceItem (InvoiceNumber,ItemNumber,Price,Tax)
I have already created a table with the combined data, with the same structure as in the desired xml:
InvoiceGen(InvoiceNumber,CustomerId,ItemNumber,Price,Tax)
In this table, after a head row there are all the rows with the item information connected to the invoice head. (Just like in the xml)
The content of this InvoiceGen table is:
InvoiceNumber CustomerId ItemNumber Price Tax
630506 1432 null null null
630506 1432 B52 320 30
630506 1432 B53 330 32
630626 1556 null null null
630626 1556 B5 500 55
630626 1556 B6 200 20
630626 1556 B18 180 16
I'm not sure if this table can help me, but it looked like a good idea.
Can anyone help me to create an xml like the above?
Thank you!
This basically gets you what you want. Note, however, that 'B18' is sorted before 'B5' and 'B6' because your values are varchars. In a varchar comparison the character '1' has a lower value than '5', and so '18' counts as "lower" than '5':
WITH VTE AS(
SELECT *
FROM (VALUES (630506,1432,NULL,NULL,NULL),
(630506,1432,'B52',320,30),
(630506,1432,'B53',330,32),
(630626,1556,NULL,NULL,NULL),
(630626,1556,'B5',500 ,55),
(630626,1556,'B6',200,20),
(630626,1556,'B18',180,16)) V(InvoiceNumber,CustomerID,ItemNumber,Price,Tax))
SELECT 1 AS [@HD],
'InvoiceGen' AS [@View],
(SELECT CASE WHEN ItemNumber IS NULL THEN 'InvoiceHead' ELSE 'InvoiceItem' END AS [@Table],
CASE WHEN ItemNumber IS NULL THEN InvoiceNumber END AS InvoiceNumber,
CASE WHEN ItemNumber IS NULL THEN CustomerID END AS CustomerID,
ItemNumber,
Price,
Tax
FROM VTE V
ORDER BY V.InvoiceNumber,
V.CustomerID,
CASE WHEN V.ItemNumber IS NULL THEN 0 ELSE 1 END,
V.ItemNumber
FOR XML PATH('Row'),TYPE)
FOR XML PATH('Data');
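The ordering caveat about 'B18', 'B5' and 'B6' is easy to reproduce. The snippet below is a small Python sketch rather than T-SQL: Python compares strings by code point, not by a SQL Server collation, but for these three values the result is the same. The natural_key helper is hypothetical and assumes every item number is letters followed by digits.

```python
import re

items = ["B5", "B6", "B18"]

# Character-by-character comparison: 'B' ties, then '1' < '5',
# so "B18" sorts ahead of "B5" and "B6".
text_order = sorted(items)

def natural_key(item):
    """Split an item number into (letters, integer) so the digits compare numerically."""
    prefix, number = re.match(r"([A-Za-z]*)(\d+)$", item).groups()
    return (prefix, int(number))

# Sorting on the split key gives the order a person would expect.
natural_order = sorted(items, key=natural_key)
```

If the numeric order matters in the XML, the same idea would have to be expressed in the ORDER BY, for instance by sorting on the numeric part of ItemNumber.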

How do I get the child node values AND parent node values in SQL XPath

My data (passed into a parameter (@Data XML) in a Stored Procedure) looks like this:
<Records>
<Record id="1">
<Data>
<FirstName>John</FirstName>
<LastName>Doe</LastName>
</Data>
<Result>
<StatusId>3</StatusId>
<ErrorCodes>
<Item>4</Item>
<Item>23</Item>
<Item>19</Item>
</ErrorCodes>
</Result>
</Record>
<Record id="2">
<Data>
<FirstName>Fred</FirstName>
<LastName>Blog</LastName>
</Data>
<Result>
<StatusId>2</StatusId>
<ErrorCodes>
<Item>1</Item>
<Item>3</Item>
</ErrorCodes>
</Result>
</Record>
</Records>
I want to select the Record id and the Error Codes, like this:
id Item
----------
1 4
1 23
1 19
2 1
2 3
The order of data doesn't matter.
The following gets me the Error Codes, but not the Record id:
SELECT Data.value('.', 'int') as ErrorCode
FROM @Data.nodes('/Records/Record/Result/ErrorCodes/*') AS data(Data)
This expression should get you parent-of-parent-of-parent element:
Data.query('../../..')
...so try something like this (untested)...
SELECT
-- value() requires a singleton, so the parent path is wrapped in (...)[1]
id = Data.value('(../../../@id)[1]', 'int'),
item = Data.value('.', 'int')
FROM @Data.nodes('/Records/Record/Result/ErrorCodes/*') AS data(Data)
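To check which (id, Item) pairs the query should return, the same document can be walked in Python. This is only a sketch. ElementTree keeps no parent pointer on an element, so instead of climbing with ../../.. it starts at each Record and descends to the Items, which is the top-down counterpart of the parent navigation above.

```python
import xml.etree.ElementTree as ET

DATA = """<Records>
  <Record id="1">
    <Data><FirstName>John</FirstName><LastName>Doe</LastName></Data>
    <Result>
      <StatusId>3</StatusId>
      <ErrorCodes><Item>4</Item><Item>23</Item><Item>19</Item></ErrorCodes>
    </Result>
  </Record>
  <Record id="2">
    <Data><FirstName>Fred</FirstName><LastName>Blog</LastName></Data>
    <Result>
      <StatusId>2</StatusId>
      <ErrorCodes><Item>1</Item><Item>3</Item></ErrorCodes>
    </Result>
  </Record>
</Records>"""

root = ET.fromstring(DATA)

# Outer loop: one Record at a time (gives the id attribute).
# Inner loop: every Item under that Record's ErrorCodes.
rows = [
    (int(record.get("id")), int(item.text))
    for record in root.findall("Record")
    for item in record.findall("Result/ErrorCodes/Item")
]
```

In T-SQL the top-down shape corresponds to shredding /Records/Record first and then applying a second nodes() call for the Item elements.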

Pivot XML using XQuery and filter on attribute

Given the following XML (in an SQL column field called 'xfield'):
<data>
<section>
<item id="A">
<number>987</number>
</item>
<item id="B">
<number>654</number>
</item>
<item id="C">
<number>321</number>
</item>
</section>
<section>
<item id="A">
<number>123</number>
</item>
<item id="B">
<number>456</number>
</item>
<item id="C">
<number>789</number>
</item>
</section>
</data>
How do you obtain the following table structure (with A, B & C as the column names):
A | B | C
987|654|321
123|456|789
Using SQL XQuery, I'm trying this (not surprisingly, it's invalid):
SELECT
data.value('(./section/item[@ID = "A"]/number/[1])', 'int') as A,
data.value('(./section/item[@ID = "B"]/number/[1])', 'int') as B,
data.value('(./section/item[@ID = "C"]/number/[1])', 'int') as C
FROM Table CROSS APPLY [xfield].nodes('/data') t(data)
You're nearly there.
You need to use nodes() to shred the xml into the rows you want to work with - here, you want a resultset row for each section element, so shred with
nodes('/data/section')
Once you've done that, you just need to make your xpath [1] syntactically correct (and relative to the section nodes you will be 'in'):
data.value('(item[@id = "A"]/number)[1]', 'int') as A,
data.value('(item[@id = "B"]/number)[1]', 'int') as B,
data.value('(item[@id = "C"]/number)[1]', 'int') as C
And voila:
A B C
----------- ----------- -----------
987 654 321
123 456 789
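The two fixes (one row per section, and an attribute test that matches the lowercase id) can be illustrated with a short Python sketch. It uses ElementTree's XPath subset rather than SQL Server's XQuery, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

XML = """<data>
  <section>
    <item id="A"><number>987</number></item>
    <item id="B"><number>654</number></item>
    <item id="C"><number>321</number></item>
  </section>
  <section>
    <item id="A"><number>123</number></item>
    <item id="B"><number>456</number></item>
    <item id="C"><number>789</number></item>
  </section>
</data>"""

root = ET.fromstring(XML)

# One output row per <section>, like nodes('/data/section').
# Within a section, pick the <item> by its id attribute, relative to that section.
rows = [
    tuple(int(section.find(f"item[@id='{col}']/number").text) for col in "ABC")
    for section in root.findall("section")
]

# Attribute names are case-sensitive: @ID does not match id="A".
wrong_case = root.find("section/item[@ID='A']")
```

rows holds one tuple per section in document order, and wrong_case is None because no item carries an attribute spelled ID.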
