Bulk Insert support for Unicode separator - sql-server

I am using Azure Data Factory to archive data from Azure SQL DB to Azure Blob Storage, and BULK INSERT to retrieve the data.
I am using the below as row and column separators.
Column delimiter: \u0001
Row delimiter: \u0003
My BULK INSERT statement is below.
BULK INSERT mytable FROM 'MyPath/file.txt'
WITH (DATA_SOURCE = 'MySource', FIELDTERMINATOR = '\u0001', ROWTERMINATOR = '\u0003');
I am getting the below error:
Msg 4866, Level 16, State 1, Line 41
The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
The documentation says Unicode is supported for FIELDTERMINATOR and ROWTERMINATOR, so what could be the issue?

It seems Unicode escape sequences are not fully supported for BULK INSERT. Per the documentation, only the t, n, r, and 0 characters work with the backslash escape character to produce a control character.
Link: https://learn.microsoft.com/en-us/sql/relational-databases/import-export/specify-field-and-row-terminators-sql-server?view=azuresqldb-current
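One possible workaround, as a rough sketch (untested here): since the \uNNNN escapes are not expanded, build the statement with dynamic SQL and embed the actual control characters via CHAR(). The names mytable, MyPath/file.txt, and MySource are carried over from the question.

DECLARE @fs char(1) = CHAR(1);  -- field terminator: the literal \u0001 byte
DECLARE @rs char(1) = CHAR(3);  -- row terminator: the literal \u0003 byte
DECLARE @sql nvarchar(max) =
      N'BULK INSERT mytable FROM ''MyPath/file.txt'' '
    + N'WITH (DATA_SOURCE = ''MySource'', '
    + N'FIELDTERMINATOR = ''' + @fs + N''', '
    + N'ROWTERMINATOR = ''' + @rs + N''');';
EXEC sp_executesql @sql;

The same documentation page also shows hex notation for the row terminator (e.g. ROWTERMINATOR = '0x0a'), so ROWTERMINATOR = '0x03' may be worth trying as well.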

Related

Bulk insert CSV file from Azure blob storage to SQL managed instance

I have a CSV file on Azure Blob Storage. It has 4 columns, no headers, and one blank row at the start. I am inserting the CSV file into a SQL Managed Instance with BULK INSERT, but the database table has 5 columns; the CSV file has no 5th column.
Therefore it is throwing this error:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 5 (uId2)
I want to insert the 4 columns from the CSV file into the table and have the 5th column in the table be NULL.
I am using this code:
BULK INSERT testing
FROM 'test.csv'
WITH
(
    DATA_SOURCE = 'BULKTEST',
    FIELDTERMINATOR = ',',
    FIRSTROW = 0,
    CODEPAGE = '65001',
    ROWTERMINATOR = '0x0a'
);
I want that 5th column to be NULL in the database table, given that the CSV file has only 4 columns.
Sorry, we can't achieve that with BULK INSERT, and I know of no other way, in my experience.
Azure SQL Managed Instance is also not supported as a dataset in Data Factory Data Flow; otherwise, we could use a Data Flow derived column to create the new column and map it to the Azure SQL database.
The best way is to edit your CSV file: just add the new column to your CSV files, as in the example below.
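For instance (hypothetical sample values), appending an empty fifth field to every row, i.e. a trailing comma, gives BULK INSERT the five fields it expects, and the empty value lands in uId2 as NULL (or the column default):
1001,foo,2020-01-01,42,
1002,bar,2020-01-02,43,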
Hope this helps.

Loading CSV File into SQL Server database

I'm working in the Visual Studio database feature. I've got two tables and I need to load a .csv file into them. I broke the .csv file out across my first and second tables. I'm trying a bulk insert:
BULK INSERT Course
FROM 'E:\CourseInfo.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',', -- CSV field delimiter
    ROWTERMINATOR = '\n',  -- shifts control to the next row
    TABLOCK
)
Everything seems right to me but I receive an error saying:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 3 (ID).
Here is a snippet of my .CSV file being used.
CourseID,CourseTitle,ID,
AC107,Accounting I,1,
AC107,Accounting I,2,
AC110,Payroll Accounting,3,
AC212,Taxation I,4,
AC212,Taxation I,5,
What is meant by a mismatch or invalid char? I've tried removing all the values for column 'ID', but that still produced the same error. I had ID set to auto-increment by marking it "Is Identity", and have tried it both ways, set to true and false, with the same error.
Possible cause: I laid a huge egg here; I think I have to normalize this first, because there are multiple rows for the same class with different IDs, and that isn't right.
To answer my own question: I had my PK set to ID, and the error was showing up for duplicate rows where there was duplicate data. I had to remove the duplicate CourseID and CourseTitle rows prior to populating my ID column with values.
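A minimal sketch of that cleanup (the CourseStaging table and its column types are hypothetical, and it assumes Course.ID is the identity column): bulk load everything into a keyless staging table, then insert each distinct course once and let the identity generate ID.

-- Hypothetical staging table matching the CSV layout.
CREATE TABLE CourseStaging
(
    CourseID    varchar(10),
    CourseTitle varchar(100),
    ID          int,
    Filler      varchar(1)   -- absorbs the trailing comma on each CSV row
);

BULK INSERT CourseStaging
FROM 'E:\CourseInfo.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK
);

-- Insert each course once; the identity column generates ID.
INSERT INTO Course (CourseID, CourseTitle)
SELECT DISTINCT CourseID, CourseTitle
FROM CourseStaging;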

Bulk Insert Formatting Issue from CSV File

I am doing a bulk insert from a CSV file.
One of my columns contains values with a colon, such as 36:21.0. For every row in this column I am getting the following error:
"Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 11, column 3 (MyColumnName)."
Does anyone know a workaround to this so that I will be able to bulk insert the columns that have a colon in the data along with the rest of my columns?
Here is my query if you are interested:
BULK INSERT dbo.[PropertyDefinition]
FROM '//MY CSV FILE PATH HERE'
WITH
(
    FIRSTROW = 2,
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
Your query is correct.
I don't think the colon is causing the problem, because neither the field terminator nor the row terminator includes a colon.
This problem is usually caused by a data type mismatch between the file and the table.
Just make sure that the data type you have declared for column 3 matches the data in the file at row 11, column 3.
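To see the mismatch in isolation (a hedged illustration; the question does not show the target column's actual type), you can test how the offending value converts. TRY_CONVERT requires SQL Server 2012 or later.

-- '36:21.0' is not a valid numeric or time literal, so a non-character
-- target column rejects it; a character column accepts it unchanged.
SELECT
    TRY_CONVERT(decimal(10, 2), '36:21.0') AS as_decimal,  -- NULL: conversion fails
    TRY_CONVERT(time,           '36:21.0') AS as_time,     -- NULL: there is no 36th hour
    CONVERT(varchar(10),        '36:21.0') AS as_varchar;  -- '36:21.0'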

Bulk insert breaks when adding DATAFILETYPE='widenative'

I have this little SQL script to import a semicolon-separated file into a specific table of my database:
BULK INSERT foo_bar
FROM 'C:\Users\JohnDoe\projects\foo\ftp-data-importator\bar.txt'
WITH
(
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n',
FIRSTROW = 2,
MAXERRORS = 100000,
ERRORFILE = 'c:\temp\foobar_bulk_log.txt'
)
GO
And it's working like a charm.
The only problem is that some special Unicode characters like ó or é are not being inserted with the file's encoding respected.
So I added the following line inside the WITH clause's parentheses:
DATAFILETYPE = 'widenative'
Instead of respecting the encoding, it breaks the whole execution and gives me the following error:
Msg 4866, Level 16, State 5, Line 5
The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
Msg 7301, Level 16, State 2, Line 5
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
Where is the problem?
Instead of DATAFILETYPE, try using CODEPAGE = '1252'.
Try specifying widechar instead of widenative. Your original statement is using character mode, not native BCP format. Also, ensure the source file is Unicode (UTF-16, not UTF-8), as in the sketch below.
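Put together, that suggestion would look like this (a sketch; it assumes the source file really is saved as UTF-16):

BULK INSERT foo_bar
FROM 'C:\Users\JohnDoe\projects\foo\ftp-data-importator\bar.txt'
WITH
(
    DATAFILETYPE = 'widechar',  -- wide character mode, not widenative
    FIELDTERMINATOR = ';',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2,
    MAXERRORS = 100000,
    ERRORFILE = 'c:\temp\foobar_bulk_log.txt'
)
GO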

SQL Server bulk insert ROWTERMINATOR failed

I have a CSV like this:
"F","003","abc""X","1","50A1","name","Z5AA1A005C","70008","","A1ZZZZ17","","","","","","""X","2","50A1","name","Z5AA1A005C","70007","","A1ZZZZ17","","","","","","""X","3","50A1","name","Z5AA1A005C","70000","","A1ZZZZ17","","","","","",""
I need to bulk insert it into table A, starting from the 2nd row:
BULK INSERT A FROM 'c:\csvtest.csv'
WITH
(
    FIELDTERMINATOR = '","',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2,
    DATAFILETYPE = 'widenative'
)
The problem is that the insert fails, showing this error:
Msg 4866, Level 16, State 8, Line 15
The bulk load failed. The column is too long in the data file for row 1, column 15. Verify that the field terminator and row terminator are specified correctly.
Msg 7301, Level 16, State 2, Line 15
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
I have tried ROWTERMINATOR values of '0x0a', '\n', '\r\n', and 'char(10)', but nothing works.
Although it will only be inserting data from row 2, row 1 still needs to be in the correct format, as I'm pretty sure SQL Server performs a 'pre-validation' pass against the schema to ensure the data has half a chance of getting to the database. Row 1 fails this 'pre-validation' because it does not provide all the columns required by the table schema.
Try opening the file in Notepad to check its line structure, then save it again.
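If you are on SQL Server 2017 or later (an assumption; the question does not state a version), BULK INSERT's native CSV mode is a cleaner sketch than the '","' field-terminator trick, since FIELDQUOTE handles the quoted fields:

BULK INSERT A FROM 'c:\csvtest.csv'
WITH
(
    FORMAT = 'CSV',          -- native CSV parsing, SQL Server 2017+
    FIELDQUOTE = '"',        -- strips the surrounding double quotes
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
)

Even then, row 1 must still parse as a valid record, so padding the header row to the full column count, as described above, may still be necessary.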
