Bulk insert using format file - database

My table named 'dictionary' has two columns named 'column1' and 'column2'. Both columns are of type INT and accept NULL values. I want to insert into column2 only, from a text file, using a bulk load with a bcp format file. My format file looks like this:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="24"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="column2" xsi:type="SQLINT"/>
</ROW>
</BCPFORMAT>
and my BULK INSERT statement is:
BULK INSERT dictionary
FROM 'C:\Users\jka\Desktop\n.txt'
WITH
(
FIELDTERMINATOR = '\n',
ROWTERMINATOR = '\n',
FORMATFILE = 'path to my format file.xml'
)
But it doesn't work. How can I solve this?
N.B.:
My text file looks like this:
123
456
4101
......
Edit, one more question:
I can fill one column with this technique, but how can I then fill the other column from another text file, again starting from the first row?

Assuming that your format file is correct, I believe you need to drop FIELDTERMINATOR and ROWTERMINATOR from your BULK INSERT statement:
BULK INSERT dictionary
FROM 'C:\Users\jka\Desktop\n.txt'
WITH (FORMATFILE = 'path to my format file.xml')
Also make sure that:
the input file's encoding is correct. In your case it should most likely be ANSI, not UTF-8 or Unicode.
the row terminator (the terminator of the second field in your format file) is actually \r\n and not \n.
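Both checks can be done outside SQL Server before retrying the load. A minimal Python sketch, assuming the data file is small enough to read into memory; `inspect_data_file` is a hypothetical helper name, and the line-ending counts are only meaningful for single-byte encodings such as ANSI, or for UTF-8:

```python
import codecs
from pathlib import Path

# Byte-order marks that would indicate the file is not plain ANSI.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]

def inspect_data_file(path):
    """Return the detected BOM (or None) and counts of CRLF and bare LF line endings."""
    raw = Path(path).read_bytes()
    bom = next((name for mark, name in _BOMS if raw.startswith(mark)), None)
    crlf = raw.count(b"\r\n")
    bare_lf = raw.count(b"\n") - crlf  # LF bytes that are not part of a CRLF pair
    return {"bom": bom, "crlf": crlf, "bare_lf": bare_lf}
```

If `bare_lf` is non-zero, at least some lines end in \n only, and a "\r\n" terminator in the format file will not match them. A non-None `bom` suggests the file was saved as UTF-8 or UTF-16 rather than ANSI. A file without a BOM can still be UTF-8, so a None result is not proof of ANSI.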
UPDATE: Since you need to skip the first column:
With an XML format file, there is no way to skip a column when you are importing directly into a table by using a BULK INSERT statement. To achieve the desired result and still use an XML format file, you need to use OPENROWSET(BULK...) and provide an explicit list of columns in the select list and in the target table.
So to insert data into column2 only, use:
INSERT INTO dictionary(column2)
SELECT column2
FROM OPENROWSET(BULK 'C:\temp\infile1.txt',
FORMATFILE='C:\temp\bulkfmt.xml') as t1;
If your data file has only one field, your format file can look like this:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="C1" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="24"/>
</RECORD>
<ROW>
<COLUMN SOURCE="C1" NAME="column2" xsi:type="SQLINT"/>
</ROW>
</BCPFORMAT>

Your data file contains one field, so your format file should reflect that
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\r\n"/>
</RECORD>
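On the follow-up question about filling the other column from a second file, starting again at the first row: BULK INSERT appends new rows rather than updating existing ones, so loading a second one-field file would not line its values up with the rows already inserted. One possible workaround, sketched here in Python under the assumption that both files have the same number of lines and contain plain ASCII digits, is to merge them into one two-field file and load both columns in a single pass. `merge_columns` and its parameters are hypothetical names:

```python
def merge_columns(first_path, second_path, out_path):
    """Write 'a,b' rows pairing line i of the first file with line i of the second."""
    with open(first_path, encoding="ascii") as f:
        first = f.read().splitlines()
    with open(second_path, encoding="ascii") as f:
        second = f.read().splitlines()
    if len(first) != len(second):
        raise ValueError("files have different numbers of lines")
    # newline="" stops Python from translating the explicit \r\n written below.
    with open(out_path, "w", encoding="ascii", newline="") as out:
        for a, b in zip(first, second):
            out.write(f"{a},{b}\r\n")
    return len(first)
```

The merged file has two fields per row, separated by "," and terminated by "\r\n", which is the layout the two-FIELD format file in the question describes. Loading it would then need both column1 and column2 mapped in the ROW element.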

Related

How to load UTF-8 CSV files using Bulk Insert and an XML Format file in SQL Server 2017

After much trying, I have found that since SQL Server 2017 (2016?), loading UTF-8 encoded CSV files through BULK INSERT is possible with the options CODEPAGE = 65001 and DATAFILETYPE = 'Char', as explained in some other questions.
What doesn't seem to work is doing the same with an XML format file. I have tried this both with the CODEPAGE and DATAFILETYPE options and with these options omitted, using the simplest possible dataset: one row, one column, containing some text with a non-ASCII UTF-8 character.
This is the XML Formatfile I am using.
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="STREET" xsi:type="NCharTerm" TERMINATOR="\r\n" MAX_LENGTH="1000" COLLATION="Latin1_General_CS_AS_WS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="STREET" NAME="STREET" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
Even though the source data only contains some text with one special character, the end result looks like this: 慊潫ⵢ瑓晥慦⵮瑓慲鿃⁳㐱
When using xsi:type="CharTerm" instead of xsi:type="NCharTerm" the result looks like this: ...-Straßs ...
Am I doing something wrong, or has UTF-8 support not been properly implemented for XML format files?
After playing around with this, I have found the solution.
Notes
This works with or without a BOM; its presence is irrelevant.
The culprit was using the COLLATION parameter in the XML file. Omitting it solved the encoding problem. I have an intuitive sense as to why this is the case, but maybe someone with more insight could explain in the comments...
The DATAFILETYPE = 'char' option doesn't seem necessary.
In the XML file, the xsi:type for the field needs to be CharTerm, not NCharTerm.
This works with \r\n, \n, or \r. As long as you set the TERMINATOR correctly, this works. No \n\0 variations required (this would even break functionality, since this is not UTF-16 or UCS-2).
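The garbled output in the question is consistent with the UTF-8 bytes being read two at a time as UTF-16 code units, which is what a wide-character field type such as NCharTerm implies. A small Python sketch of that byte-level mismatch follows. It illustrates the encoding mix-up only, not SQL Server internals, and the helper names are hypothetical:

```python
def read_as_utf16le(text):
    """Encode text as UTF-8, then decode those same bytes as UTF-16 LE."""
    return text.encode("utf-8").decode("utf-16-le", errors="replace")

def read_as_cp1252(text):
    """Encode text as UTF-8, then decode those same bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")

# Two ASCII bytes collapse into one CJK-range code unit: 'J' (0x4A), 'a' (0x61) -> U+614A.
pair = read_as_utf16le("Ja")

# The two UTF-8 bytes of 'ß' (0xC3 0x9F) become the single code unit U+9FC3.
eszett_wide = read_as_utf16le("ß")

# Read byte by byte with a single-byte code page, the same 'ß' becomes two characters.
eszett_ansi = read_as_cp1252("ß")
```

Here U+9FC3 appears to be the 鿃 visible in the question's output, which fits the reading that the file was valid UTF-8 all along and only the field type was wrong.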
Below you can find a proof-of-concept for easy reuse...
data.txt
ß
ß
ß
Table
CREATE TABLE [dbo].[TEST](
TEST [nvarchar](500) NULL
)
formatfile.xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="20"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="TEST" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
Bulk insert
bulk insert TEST..TEST
from 'data.txt'
with (formatfile = 'formatfile.xml', CODEPAGE = 65001)
Change your terminator to TERMINATOR="\r\0\n\0". You have to account for the extra bytes when using NCharTerm.

BULK LOAD DATA CONVERSION ERROR for csv file

I am trying to import a .csv file, but I am getting "BULK LOAD DATA CONVERSION ERROR" for the last column. The file looks like:
"123456","123","001","0.00"
I have tried the row terminator below:
ROW TERMINATOR = "\"\r\n"
Nothing is working. Any ideas on what is causing this record to have this error? Thanks
As in the example below, remove the quotes from your CSV and use "\r\n" as the terminator.
Always use an XML format file when doing a bulk insert. It provides several advantages, such as validation of data files.
The format file maps the fields of the data file to the columns of the table. You can use a non-XML or XML format file to bulk import data when using a bcp command, or a BULK INSERT or INSERT ... SELECT * FROM OPENROWSET(BULK...) Transact-SQL statement.
Given your input file, suppose you have the following table:
CREATE TABLE myTestFormatFiles (
Col1 smallint,
Col2 nvarchar(50),
Col3 nvarchar(50),
Col4 nvarchar(50)
);
Your sample data file would be as follows:
10,Field2,Field3,Field4
15,Field2,Field3,Field4
46,Field2,Field3,Field4
58,Field2,Field3,Field4
A sample XML format file would be:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Col1" xsi:type="SQLSMALLINT"/>
<COLUMN SOURCE="2" NAME="Col2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="Col3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="Col4" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
If you are unfamiliar with format files, check XML Format Files (SQL Server).
Example is illustrated here
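The quote-removal step suggested in this answer can be done with a small preprocessing script before the load. A minimal Python sketch, assuming the file is ANSI (cp1252) and that no field contains a comma; `strip_quotes` and its parameters are hypothetical names:

```python
import csv

def strip_quotes(src_path, dst_path, encoding="cp1252"):
    """Rewrite a quoted CSV without quotes; return the number of rows written."""
    rows = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        # QUOTE_NONE with no escapechar makes the writer raise csv.Error when a
        # field contains a character that would need quoting (such as a comma),
        # instead of silently producing a row with the wrong number of fields.
        writer = csv.writer(dst, quoting=csv.QUOTE_NONE, lineterminator="\r\n")
        for row in csv.reader(src):
            writer.writerow(row)
            rows += 1
    return rows
```

Applied to the row from the question, the output line is 123456,123,001,0.00 followed by \r\n, which matches a format file with "," field terminators and a "\r\n" terminator on the last field.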

MS SQL Server, Bulk Insert failing insert file in UTF-16 BE

I have a problem with Bulk Insert on MS SQL Server 2012. Input file is saved in UTF-16 BE.
BULK INSERT Positions
FROM 'C:\DEV\Test\seq.filename.csv'
WITH
(
DATAFILETYPE = 'widechar',
FORMATFILE = 'C:\DEV\Test\Format.xml'
);
Format file:
<?xml version="1.0" encoding="utf-16"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="ActionCode" xsi:type="NCharFixed" LENGTH="4" />
<FIELD ID="T1" xsi:type="NCharFixed" LENGTH="2" />
<FIELD ID="ReofferCounter" xsi:type="NCharFixed" LENGTH="6" />
<FIELD ID="T1" xsi:type="NCharFixed" LENGTH="2" />
... other fields....
</RECORD>
<ROW>
<COLUMN SOURCE="ActionCode" NAME="DT" xsi:type="SQLNCHAR" LENGTH="255" />
<COLUMN SOURCE="ReofferCounter" NAME="NO" xsi:type="SQLNCHAR" LENGTH="255" />
</ROW>
</BCPFORMAT>
Input file sample:
02|+00|... other cols....
02|+00|... other cols....
I have two problems:
1) If the input file is encoded as UTF-16 BE, I get only Chinese characters instead of numbers.
2) If I convert the input file to UTF-16 LE, I see the correct characters, but the character data is shifted one character to the left, as if the BOM was parsed (and counted as one character) but not transferred to the output (which I do not want).
Questions:
1) Is there a way to import a file in UTF-16 BE without converting it to LE?
2) What causes the shift, and how can I avoid it?
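This thread has no answer here. It seems likely that DATAFILETYPE = 'widechar' expects little-endian UTF-16, in which case converting beforehand may be unavoidable. It also seems plausible that with NCharFixed fields a leading BOM occupies the first character position, which would match the one-character shift described. Under those two assumptions, a conversion step that also drops the BOM could be sketched in Python as follows, for a file that fits in memory. `utf16be_to_utf16le` is a hypothetical helper name:

```python
import codecs

def utf16be_to_utf16le(src_path, dst_path):
    """Convert a UTF-16 BE file to UTF-16 LE without a BOM; return the character count."""
    with open(src_path, "rb") as f:
        raw = f.read()
    # Drop a leading big-endian BOM so it does not occupy a fixed-width position.
    if raw.startswith(codecs.BOM_UTF16_BE):
        raw = raw[len(codecs.BOM_UTF16_BE):]
    text = raw.decode("utf-16-be")
    with open(dst_path, "wb") as f:
        # The explicit "-le" codec writes no BOM of its own.
        f.write(text.encode("utf-16-le"))
    return len(text)
```

If the shift disappears after loading the converted file, the BOM was the cause. If it persists, the field lengths in the format file are the next thing to check.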

Skip Column in OPENROWSET (BULK)

Trying to bulk insert lots of rows into a table.
My SQL statement:
INSERT INTO [NCAATreasureHunt-dev].dbo.CatalinaCodes(Code)
SELECT (Code)
FROM OPENROWSET(BULK 'C:\Users\Administrator\Desktop\NCAATreasureHunt\10RDM.TXT',
FORMATFILE='C:\Users\Administrator\Desktop\NCAATreasureHunt\formatfile.xml') as t1;
10RDM.TXT:
DJKF61TGN7
Q9TVM16Z6Z
X44T4169FN
JQ2PT1ZXZK
C7NW71QPNG
SFJRR1FWKZ
TYZJW1ZPFY
9MR3M1J3N5
QJ6R217JTK
TVJVW19TYT
formatfile.xml
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="C1" xsi:type="CharTerm" TERMINATOR="\r\n"/>
</RECORD>
<ROW>
<COLUMN SOURCE="C1" NAME="Code" xsi:type="SQLNVARCHAR" />
</ROW>
</BCPFORMAT>
This is the error I'm getting:
Cannot insert the value NULL into column 'Claimed', column does not allow nulls. INSERT fails.
I'm trying to skip the Claimed column. What am I doing wrong in my format file?
See if this answer helps.
With an XML format file, you cannot skip a column when you are
importing directly into a table by using a bcp command or a BULK
INSERT statement. However, you can import into all but the last column
of a table. If you have to skip any but the last column, you must
create a view of the target table that contains only the columns
contained in the data file. Then, you can bulk import data from that
file into the view.
To use an XML format file to skip a table column by using
OPENROWSET(BULK...), you have to provide explicit list of columns in
the select list and also in the target table, as follows:
INSERT ... SELECT FROM OPENROWSET(BULK...)

Bulk insert fixed width fields

How do you specify field lengths with the Bulk Insert command?
Example: If I had a file named c:\Temp\TableA.txt and it contained:
123ABC
456DEF
And I had a table such as:
use tempdb
CREATE TABLE TABLEA(
Field1 char(3),
Field2 char(3)
)
BULK INSERT TableA FROM 'C:\Temp\TableA.txt'
SELECT * FROM TableA
Then how would I specify the lengths for Field1 and Field2?
I think you need to define a format file
e.g.
BULK INSERT TableA FROM 'C:\Temp\TableA.txt'
WITH (FORMATFILE = 'C:\Temp\Format.xml')
SELECT * FROM TableA
For that to work, though, you need a Format File, obviously.
See here for general info about creating one:
Creating a Format File
At a guess, from looking at the Schema, something like this might do it:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="3"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="3"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Field1" xsi:type="SQLCHAR" LENGTH="3"/>
<COLUMN SOURCE="2" NAME="Field2" xsi:type="SQLCHAR" LENGTH="3"/>
</ROW>
</BCPFORMAT>
You'd want to use a format file with your BULK INSERT. Something like:
9.0
2
1 SQLCHAR 0 03 "" 1 Field1 ""
2 SQLCHAR 0 03 "\r\n" 2 Field2 ""
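One detail worth checking with fixed-width layouts is the line break. Each record in the sample file is 6 characters plus, presumably, a \r\n, and some field has to consume those two extra bytes (the non-XML format file above does this with the "\r\n" terminator on the second field). A short Python sketch of the byte arithmetic, where `split_fixed` is a hypothetical helper and the \r\n line ending is an assumption about the file:

```python
def split_fixed(record, widths):
    """Cut a record into consecutive fields of the given widths; return (fields, leftover)."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields, record[pos:]

# One line of TableA.txt as it would sit on disk with Windows line endings.
fields, leftover = split_fixed(b"123ABC\r\n", [3, 3])
```

The leftover bytes are the line break. If nothing in the format file accounts for them, they would be read as the start of the next record.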
