I have the following lines in a text file delimited by "|". I only want to retrieve the Surname and First_Name and write them into a table.
Released_Date|Label|Type|Id|FormId|Title|Surname|First_Name|Middle_Name
25/07/2014|XCS|CDE|V000011|F000011|Miss|Dālwó|Cabĉver|Ann
25/07/2014|XCS|CDE|V000011|F000011|Miss|Rtyālwó|sabĉper|Joanne
I created this XML format file to retrieve only the surname and first name:
<?xml version="1.0"?>
<BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\n"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\n"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Surname"/>
<COLUMN SOURCE="2" NAME="First_Name"/>
</ROW>
</BCPFORMAT>
And I created the stored procedure to read it:
ALTER PROC dbo.ImportTextFile
AS
BULK INSERT test FROM 'C:\Program Files\Data Import.txt'
WITH
(
FIELDTERMINATOR ='|',
ROWTERMINATOR ='\n',
FIRSTROW =2,
FORMATFILE = 'C:\Program Files\cabcolumns.xml'
);
There are no errors, but the whole row from the text file gets inserted into the two columns of the table, whereas I want only Surname and First_Name. I'm not sure what I am doing wrong. The DDL of the table is given below. Please help.
CREATE TABLE [dbo].[test](
[Surname] [nvarchar](4000) COLLATE SQL_Latin1_General_CP1253_CI_AI NULL,
[First_Name] [nvarchar](4000) COLLATE SQL_Latin1_General_CP1253_CI_AI NULL
) ON [PRIMARY]
I think the issue is the terminator in the XML file and the numbering of the source columns. Both FIELD elements are terminated by "\n", so each one reads a whole line rather than a single "|"-delimited value.
As a first test, you could change the field terminator on a small sample of the data (just to see whether the terminator itself is the problem), updating all the configuration files accordingly.
Once the terminator is ruled out, the documentation has an example of how to skip columns when importing data (notice the field IDs): every field of the file is declared in <RECORD>, and only the wanted ones are mapped in <ROW>.
<?xml version="1.0"?>
<BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR="|"/>
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\n"/>
</RECORD>
<ROW>
<COLUMN SOURCE="7" NAME="Surname"/>
<COLUMN SOURCE="8" NAME="First_Name"/>
</ROW>
</BCPFORMAT>
Then, to import:
ALTER PROC dbo.ImportTextFile
AS
BULK INSERT test FROM 'C:\Program Files\Data Import.txt'
WITH (FIRSTROW = 2, FORMATFILE = 'C:\Program Files\cabcolumns.xml', LASTROW = 3);
Setting the last row explicitly avoids problems if the last line of the file is empty or the end of the data is not detected correctly. LASTROW = 3 fits the two-row sample above; adjust it to the size of your file.
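As a quick sanity check outside SQL Server, the mapping that such a format file describes (nine "|"-terminated fields, of which only fields 7 and 8 are kept) can be sketched in Python. The function name pick_fields is hypothetical, and the rows are the sample rows from the question; this only illustrates the idea and says nothing about how BULK INSERT is implemented.

```python
def pick_fields(line, wanted=(7, 8), delimiter="|"):
    """Return the 1-based fields listed in `wanted` from one delimited line."""
    # Drop the row terminator, then split on the field terminator.
    fields = line.rstrip("\r\n").split(delimiter)
    return tuple(fields[i - 1] for i in wanted)


rows = [
    "25/07/2014|XCS|CDE|V000011|F000011|Miss|Dālwó|Cabĉver|Ann",
    "25/07/2014|XCS|CDE|V000011|F000011|Miss|Rtyālwó|sabĉper|Joanne",
]

for row in rows:
    # Prints the (Surname, First_Name) pair for each sample row.
    print(pick_fields(row))
```

If the pairs printed here are not the ones you expect, the field numbering in the format file is probably off as well.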
I want to bulk insert a big pile of data into SQL Server, so I need a format file (I'm not inserting values into all columns).
But following the linked documentation and using the bcp AdventureWorks2012.HumanResources.Department format nul -c -x -f Department-c..xml –t, -T command, I get an error pointing at the -t, part: a ParentContainsErrorException saying that arguments are missing.
What's wrong with the above?
You need to specify the full path where the XML file should be written.
This works for me:
DECLARE @str VARCHAR(1000)
SET @str = 'bcp AdventureWorks2014.HumanResources.Department format nul -c -x -f D:\Stack\Department-c.xml -t, -T'
EXEC xp_cmdshell @str
GO
It gives this result:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="24"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="DepartmentID" xsi:type="SQLSMALLINT"/>
<COLUMN SOURCE="2" NAME="Name" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="GroupName" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="ModifiedDate" xsi:type="SQLDATETIME"/>
</ROW>
</BCPFORMAT>
I am trying to bulk import data into SQL Server 2016, but because of two-byte characters (such as Ü, Ä, etc.) I am facing a problem with wrapping fields.
The source is a fixed-length, Unicode (UTF-8) text file containing special (wide) characters.
This is a sample part of the file:
ABS525                0128211024200
ABS526                0128211024200
ABS527                0128211024200
ABS528                0128211024200
ABS529                0128211024200
Ölrücklaufleitung     0128211037390
Ölzu- und Ölrücklaufle0128211037390
Ölzulaufleitung       0128211037390
field lengths are: 22 - 4 - 3 - 5 - 1
I tried every way:
- import wizard in Management Studio,
- SSDT import,
- bulk import,
- openrowset,
- bcp command line
None of them worked. More precisely, they all work as long as the row contains no special characters.
This is my bulk insert code:
BULK INSERT [tecdoc2].[dbo].[211]
FROM 'C:\Users\Administrator\Desktop\D_TAF24\211yeni.0128'
WITH (MAXERRORS=50, CODEPAGE = '65001', DATAFILETYPE = 'widechar', FORMATFILE = 'C:\Users\Administrator\Desktop\BCP_Formats\a211.xml')
This is my format file (here, I tried a lot of combinations):
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="22" />
<FIELD ID="2" xsi:type="CharFixed" LENGTH="4" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="3" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="5" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ArtNr" xsi:type="SQLNVARCHAR" LENGTH="22" />
<COLUMN SOURCE="2" NAME="DLNr" xsi:type="SQLNCHAR" />
<COLUMN SOURCE="3" NAME="SA" xsi:type="SQLNCHAR" />
<COLUMN SOURCE="4" NAME="GenArtNr" xsi:type="SQLNCHAR" />
<COLUMN SOURCE="5" NAME="Losch-Flag" xsi:type="SQLNCHAR" />
</ROW>
</BCPFORMAT>
All columns in the SQL table are nvarchar (with the specified lengths; I have tried many variations here: double the specified lengths, 'max', etc.).
Would you have any advice? I would appreciate it.
With Kind Regards,
Murat
This is exactly the problem I am having, but with OPENROWSET. If the file is delimited, it works fine.
The only way around this issue that I have found is to import the whole row into a single nvarchar column that is big enough and parse it in the database. That works fine, but it is a royal pain in the bottom.
If you change your format file to be:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="35" />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="RowData" xsi:type="SQLNVARCHAR" LENGTH="35"/>
</ROW>
</BCPFORMAT>
Then your import query can be:
INSERT INTO [tecdoc2].[dbo].[211]
(
ArtNr
,DLNr
,SA
,GenArtNr
,[Losch-Flag]
)
SELECT SUBSTRING(src.RowData, 1, 22) AS ArtNr -- SUBSTRING positions are 1-based
,SUBSTRING(src.RowData, 23, 4) AS DLNr
,SUBSTRING(src.RowData, 27, 3) AS SA
,SUBSTRING(src.RowData, 30, 5) AS GenArtNr
,SUBSTRING(src.RowData, 35, 1) AS [Losch-Flag]
FROM OPENROWSET ( BULK 'C:\Users\Administrator\Desktop\D_TAF24\211yeni.0128'
,FORMATFILE = 'C:\Users\Administrator\Desktop\BCP_Formats\a211.xml'
,CODEPAGE = '65001' -- UTF-8
,FIRSTROW = 1
) AS src
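To see why rows with special characters misbehave when fields are cut by size, while cutting by character position (as SUBSTRING does) holds up, here is a small Python sketch. It assumes the file is UTF-8 encoded, as the question states; the helper name split_fixed is hypothetical, and the row is taken from the sample above.

```python
WIDTHS = (22, 4, 3, 5, 1)  # field lengths from the question, in characters


def split_fixed(row, widths=WIDTHS):
    """Slice one row into fields by character count."""
    fields, pos = [], 0
    for width in widths:
        fields.append(row[pos:pos + width])
        pos += width
    return fields


row = "Ölzu- und Ölrücklaufle0128211037390"

# Each of Ö, Ö and ü takes two bytes in UTF-8, so the byte count is
# larger than the character count for this row.
print(len(row), "characters")
print(len(row.encode("utf-8")), "bytes")

# Slicing by characters keeps the fields aligned.
print(split_fixed(row))
```

A reader that counts bytes would run past the 22-character boundary on such rows, which matches the symptom that only rows containing special characters break.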
BULK INSERT [Alldlyinventory]
FROM 'C:\Users\Admin\Documents\2NobleEstates\DATA\Download\Output\test.txt'
WITH (FORMATFILE = 'C:\SQL Data\FormatFiles\test.xml');
Format file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="8" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="7" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="4" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="DAY_NUMBER" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="2" NAME="LCBO_NO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="LOCATION_NUMBER" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="LISTING_STATUS" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="QTY_ON_HAND" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
But I am getting the following error on SQL Server 2014:
Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
It's a fixed width import.
Sample txt:
2016032803170570371L 000000014
2016032803367430371L 000000013
2016032803403800371L 000000036
2016032804007540371L 000000015
Looking at your sample text file, it looks like you have a row terminator that is carriage return ({CR}) + linefeed ({LF}).
You can inspect this by opening the text file with a text editor that can show special symbols. I can recommend Notepad++ which is free and good for this purpose (Menu View>Show Symbol>Show All Characters).
If the row terminator is indeed {CR}{LF}, you should use xsi:type="CharTerm" along with a TERMINATOR="\r\n" attribute for the last <FIELD> in the <RECORD> element:
<RECORD>
...
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
You can find more information on fixed-field imports in the documentation: XML Format Files (SQL Server), section "Importing fixed-length or fixed-width fields".
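If you would rather check the row terminator programmatically than in an editor, a minimal Python sketch along these lines counts the line endings in the raw bytes. The function name count_line_endings is hypothetical, and the sample bytes below assume CRLF endings purely for illustration; read your real file in binary mode to find out what it actually contains.

```python
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, bare LF, and bare CR terminators in raw file bytes."""
    crlf = data.count(b"\r\n")
    return Counter({
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,   # LF not preceded by CR
        "CR": data.count(b"\r") - crlf,   # CR not followed by LF
    })


# Two rows from the question, with CRLF endings assumed for the demo.
sample = (
    b"2016032803170570371L 000000014\r\n"
    b"2016032803367430371L 000000013\r\n"
)
print(count_line_endings(sample))

# For a real file (the path is a placeholder):
# with open("test.txt", "rb") as f:
#     print(count_line_endings(f.read()))
```

A non-zero CRLF count with zero bare LF suggests "\r\n" as the terminator; the reverse suggests "\n".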
I am trying to import a .csv file, but I am getting "BULK LOAD DATA CONVERSION ERROR" for the last column. The file looks like this:
"123456","123","001","0.00"
I have tried the row terminator below:
ROW TERMINATOR = "\"\r\n"
Nothing is working. Any ideas on what is causing this record to have this error? Thanks
As in the example below, remove the quotes from your CSV and use "\r\n" as the terminator.
Always use an XML format file when doing a bulk insert. It provides several advantages, such as validation of data files.
The format file maps the fields of the data file to the columns of the table. You can use a non-XML or XML format file to bulk import data when using a bcp command or a BULK INSERT or INSERT ... SELECT * FROM OPENROWSET(BULK...) Transact-SQL command.
Considering the input file you gave, suppose you have a table like this:
CREATE TABLE myTestFormatFiles (
Col1 smallint,
Col2 nvarchar(50),
Col3 nvarchar(50),
Col4 nvarchar(50)
);
Your sample data file will be as follows:
10,Field2,Field3,Field4
15,Field2,Field3,Field4
46,Field2,Field3,Field4
58,Field2,Field3,Field4
A sample XML format file will be:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Col1" xsi:type="SQLSMALLINT"/>
<COLUMN SOURCE="2" NAME="Col2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="Col3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="Col4" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
If you are unfamiliar with format files, check XML Format Files (SQL Server).
Example is illustrated here
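Removing the quotes by hand is tedious for large files. One possible way to do it, sketched here in Python with the standard csv module, is to read the quoted file and write it back with minimal quoting. The function name unquote_csv is hypothetical, and the input line is the sample from the question. Note that a field which itself contains a comma would still be written with quotes, so this only helps when the values are free of commas.

```python
import csv
import io


def unquote_csv(text: str) -> str:
    """Re-write quoted CSV text without quotes around plain fields."""
    out = io.StringIO()
    # QUOTE_MINIMAL quotes a field only when it contains a delimiter,
    # a quote character, or a line break.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    for record in csv.reader(io.StringIO(text)):
        writer.writerow(record)
    return out.getvalue()


print(repr(unquote_csv('"123456","123","001","0.00"\r\n')))
```

The rewritten file can then be loaded with "," as the field terminator and "\r\n" as the terminator of the last field.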
SQL: bulk insert
bulk insert TESTING
from 'D:\Testing.csv'
with
( FIRSTROW=2,
DATAFILETYPE='char',
FIELDTERMINATOR=',',
ROWTERMINATOR = '\n',
FORMATFILE = 'D:\Testing.xml');
XML: format file
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="Address1" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="Address2" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="Address3" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="Address4" xsi:type="CharTerm" TERMINATOR='\n' />
</RECORD>
<ROW>
<COLUMN SOURCE="Address1" NAME="COLUMN1" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="Address2" NAME="COLUMN2" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="Address3" NAME="COLUMN3" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="Address4" NAME="COLUMN4" xsi:type="SQLVARYCHAR" />
</ROW>
</BCPFORMAT>
The CSV file that I used contains addresses. I created the SQL table before running the bulk insert. It has four columns for the address.
Testing.csv
"Address1","Address2","Address3","Address4"
"Lot 180, Street 19, "," Oakland Park, "," Kuala Lumpur, "," Selangor"
I want each of the four values to land in its own column of that table. When I try to use the XML format file in the bulk insert, I receive the following error message:
Bulk load: An unexpected end of file was encountered in the data file.
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK"
for linked server "(null)".