I am using the BULK INSERT command to insert 4,910,496 records, as below:
BULK INSERT [MyTable] FROM 'F:\mydata.dat' WITH (DATAFILETYPE = 'widechar', FORMATFILE = 'F:\myformat.XML', MAXERRORS = 2147483647, ROWS_PER_BATCH = 4910496, TABLOCK);
However, from time to time, the insertion will fail and I will get the following error:
Error: 802, Severity: 17, State: 20. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
I checked online and found https://blog.sqlauthority.com/2018/08/16/sql-server-error-17300-the-error-is-printed-in-terse-mode-because-there-was-error-during-formatting/ , which says this error occurs because SQL Server is out of memory.
So, my question is: how can I prevent SQL Server from running out of memory during a bulk insert operation?
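One mitigation I am considering (I'm not sure it is the right one) is to break the load into smaller committed batches with BATCHSIZE instead of presenting the whole file as a single batch, so SQL Server can release resources as it goes. A sketch of what I mean, reusing the paths from the command above; the batch size is just an arbitrary starting value:
-- Sketch only: commit every 500,000 rows rather than holding the whole
-- 4.9M-row load as one batch, which may reduce the memory needed at any
-- one time. BATCHSIZE here stands in for the single ROWS_PER_BATCH batch above.
BULK INSERT [MyTable]
FROM 'F:\mydata.dat'
WITH (
    DATAFILETYPE = 'widechar',
    FORMATFILE   = 'F:\myformat.XML',
    MAXERRORS    = 2147483647,
    BATCHSIZE    = 500000,
    TABLOCK
);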
Related
We recently had an identifier column move from int values to bigint values. An ETL process which loads these values was not updated. That process is using SQL bulk insert, and we are seeing incorrect values in the destination table. I would have expected a hard failure.
Please note this process has been running successfully for a long time before this issue.
Can anyone tell me what the heck SQL Server is doing here!? I know how to fix the situation, but I'm trying to better understand it for the data cleanup effort that I'll need to complete, as well as just for the fact that it looks like black magic!
I've been able to re-create this issue in SQL Server 2017 and 2019.
Simplified Example CSV file contents:
310067463717
310067463718
310067463719
Example SQL:
create table #t (t int)
bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char',
fieldterminator = '|'
)
select * from #t
Resulting data:
829818405
829818406
829818407
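One pattern I did notice: each resulting value matches the low 32 bits of the corresponding source value, i.e. the source modulo 2^32, which can be checked directly:
-- Each loaded int equals the source bigint modulo 2^32 (its low 32 bits):
-- the three expressions below return 829818405, 829818406 and 829818407.
declare @two32 bigint = 4294967296   -- 2^32
select cast(cast(310067463717 as bigint) % @two32 as int),
       cast(cast(310067463718 as bigint) % @two32 as int),
       cast(cast(310067463719 as bigint) % @two32 as int)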
Interestingly, I tried with smaller values, and I do see an error:
Example CSV file contents (2147483647 is the largest int value for SQL Server):
310067463717
310067463718
310067463719
2147483647
2147483648
Running the same SQL code as above, I get an error for one row:
Msg 4867, Level 16, State 1, Line 4
Bulk load data conversion error (overflow) for row 5, column 1 (t).
The resulting data looks like this:
829818405
829818406
829818407
2147483647
I also tried just now with a much higher value, 31006746371945654, and that threw the same overflow error as 2147483648.
And last, I did confirm that if I create the table with the column defined as bigint, the data inserted is correct.
create table #t (t bigint)
bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char',
fieldterminator = '|'
)
select * from #t
Resulting data:
2147483647
2147483648
310067463717
310067463718
310067463719
I have an R script that combines years of FFIEC Bank Call Report schedules into flat files--one for each schedule--then writes each schedule to a tab-delimited, non-quoted flat file suitable for bulk inserting into SQL Server. Then I run this bulk insert command:
bulk insert CI from 'e:\CI.txt' with (firstrow = 2, rowterminator = '0x0a', fieldterminator = '\t')
The bulk insert will run for a while then quit, with this error message:
Msg 7301, Level 16, State 2, Line 4
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
I've searched here for answers and the most common problem seems to be the rowterminator argument. I know that the files I've created have a line feed without a carriage return, so '0x0a' is the correct argument (but I tried '\n' and it didn't work).
Interestingly, I tried setting the fieldterminator to gibberish just to see what happened and I got the expected error message:
The bulk load failed. The column is too long in the data file for row 1, column 1.
So that tells me that SQL Server has access to the file and is indeed starting to insert it.
Also, I did a manual import (right-click on the database, Tasks -> Import Data) and SQL Server swallowed the file without a hitch. That tells me the layout of the table is fine, and so is the file?
Is it possible there's something at the end of the file that's confusing the bulk insert? I looked in a hex editor and it ends with data followed by 0A (the hex code for a line feed).
I'm stumped and open to any possibilities!
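One idea I haven't tried yet is to bisect the file with the standard FIRSTROW/LASTROW options, to see whether a particular region (for example, everything except the tail) triggers the failure. A sketch, with made-up row numbers:
-- Diagnostic sketch: load only part of the file; halve the lastrow value
-- until the offending region is isolated.
bulk insert CI from 'e:\CI.txt'
with (firstrow = 2, lastrow = 100000,
      rowterminator = '0x0a', fieldterminator = '\t')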
I'm trying to insert a lot of data with SQL Server Management Studio. This is what I do:
I open my file containing a lot of SQL inserts: data.sql
I execute it (F5)
I get a lot of these:
(1 row(s) affected)
and some of these:
Msg 8152, Level 16, State 13, Line 26
String or binary data would be truncated.
The statement has been terminated.
Question: how do I get the error line number? Line 26 doesn't seem to be the correct error line number...
This is something that has annoyed SQL Server developers for years. Finally, with SQL Server 2017 CU12 and trace flag 460, you get a better error message, like:
Msg 2628, Level 16, State 6, Procedure ProcedureName, Line Linenumber
String or binary data would be truncated in table '%.*ls', column '%.*ls'. Truncated value: '%.*ls'.
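If you are on such a build, the flag can be turned on like this (assuming you have the required permissions; the database-scoped option is SQL Server 2019 and later):
-- Requires sysadmin; session-level by default, add -1 for server-wide.
DBCC TRACEON (460);

-- SQL Server 2019+ equivalent, per database:
ALTER DATABASE SCOPED CONFIGURATION SET VERBOSE_TRUNCATION_WARNINGS = ON;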
A method to get around this in the meantime is to add a PRINT statement after each insert. Then, among the rows-affected messages, you will see whatever you printed.
...
insert into table1
select...
print 'table1 insert complete'
insert into table2
select...
print 'table2 insert complete'
This isn't going to tell you which column, but it will narrow it down to the offending insert. You can also add SET NOCOUNT ON at the top of your script if you don't want the rows-affected messages printed.
One more addition: if you really are using BULK INSERT and weren't just using the term generically, you can specify an ERRORFILE. This will log the row(s) that caused the error(s) in your BULK INSERT command. It's important to know that, by default, BULK INSERT will still complete if there are 10 errors or fewer; you can override this by specifying MAXERRORS in your BULK INSERT command.
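A sketch of what that could look like; the table name, paths, and terminators here are placeholders, not taken from any question above:
-- Rows that fail conversion are skipped and written out via ERRORFILE
-- (SQL Server also creates a companion control file describing the errors),
-- and MAXERRORS raises the default limit of 10 allowed errors to 100.
BULK INSERT dbo.StagingTable
FROM 'c:\temp\load.txt'
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '0x0a',
    ERRORFILE       = 'c:\temp\load_errors.log',
    MAXERRORS       = 100
);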
I have a CSV like this:
"F","003","abc""X","1","50A1","name","Z5AA1A005C","70008","","A1ZZZZ17","","","","","","""X","2","50A1","name","Z5AA1A005C","70007","","A1ZZZZ17","","","","","","""X","3","50A1","name","Z5AA1A005C","70000","","A1ZZZZ17","","","","","",""
I need to bulk insert it into table A, starting from the 2nd row:
BULK INSERT A FROM 'c:\csvtest.csv'
WITH
(
FIELDTERMINATOR='","',
ROWTERMINATOR='0x0a',
FIRSTROW = 2,
DATAFILETYPE = 'widenative'
)
The problem is that when I run the insert, it fails with this error:
Msg 4866, Level 16, State 8, Line 15
The bulk load failed. The column is too long in the data file for row 1, column 15. Verify that the field terminator and row terminator are specified correctly.
Msg 7301, Level 16, State 2, Line 15
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
I have tried ROWTERMINATOR values of '0x0a', '\n', '\r\n', and 'char(10)', but nothing works.
Although it will only be inserting data from row 2, row 1 still needs to be in the correct format, as I'm pretty sure SQL Server performs a 'pre-validation' pass against the schema to ensure the data has a chance of reaching the database. Row 1 fails this pre-validation because it does not provide all the columns required by the table schema.
Try opening the file in Notepad, checking its line structure, and saving it again.
If I run the following query in SQL Server 2000 Query Analyzer:
BULK INSERT OurTable
FROM 'c:\OurTable.txt'
WITH (CODEPAGE = 'RAW', DATAFILETYPE = 'char', FIELDTERMINATOR = '\t', ROWS_PER_BATCH = 10000, TABLOCK)
On a text file that conforms to OurTable's schema for 40 lines but then changes format for the last 20 lines (let's say the last 20 lines have fewer fields), I receive an error. However, the first 40 lines are committed to the table. Is there something about the way I'm calling BULK INSERT that makes it non-transactional, or do I need to do something explicit to force it to roll back on failure?
BULK INSERT acts as a series of individual INSERT statements and thus, if the job fails, it doesn't roll back all of the committed inserts.
It can, however, be placed within a transaction so you could do something like this:
BEGIN TRANSACTION
BEGIN TRY
BULK INSERT OurTable
FROM 'c:\OurTable.txt'
WITH (CODEPAGE = 'RAW', DATAFILETYPE = 'char', FIELDTERMINATOR = '\t',
ROWS_PER_BATCH = 10000, TABLOCK)
COMMIT TRANSACTION
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION
END CATCH
You can roll back the inserts. To do that, we first need to understand two things:
BATCHSIZE: the number of rows to be inserted per transaction. The default is the entire data file, so the whole file is loaded as a single transaction.
Say you have a text file with 10 rows, and rows 7 and 8 contain some invalid details. When you BULK INSERT the file, with or without specifying a batch size, 8 out of 10 rows get inserted into the table; the invalid rows, i.e. the 7th and 8th, fail and are not inserted.
This happens because the default MAXERRORS count is 10.
As per MSDN:
MAXERRORS:
Specifies the maximum number of syntax errors allowed in the data before the bulk-import operation is canceled. Each row that cannot be imported by the bulk-import operation is ignored and counted as one error. If max_errors is not specified, the default is 10.
So, in order to fail the whole load when even a single row is invalid, set MAXERRORS = 0, so that the first error cancels the operation. The BATCHSIZE also matters here: if you specify a BATCHSIZE and the invalid row falls inside a particular batch, only that batch is rolled back, not the entire data set.
So be careful while choosing this option.
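A minimal sketch of that idea; the table and file names are placeholders of mine:
-- With no BATCHSIZE, the whole file is loaded as a single batch, and
-- MAXERRORS = 0 cancels the operation at the first conversion/syntax error,
-- so the batch is rolled back and no rows remain committed.
BULK INSERT dbo.TargetTable
FROM 'c:\data\rows.txt'
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\n',
    MAXERRORS       = 0
);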
Hope this solves the issue.
As stated in the BATCHSIZE definition for BULK INSERT in MSDN Library (http://msdn.microsoft.com/en-us/library/ms188365(v=sql.105).aspx) :
"If this fails, SQL Server commits or rolls back the transaction for every batch..."
In conclusion, it is not necessary to add transactionality to BULK INSERT yourself.
Try putting it inside a user-defined transaction and see what happens. It should actually roll back as you described.
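A minimal sketch of that suggestion for SQL Server 2000, where TRY/CATCH is not available, using the @@ERROR idiom of that era:
-- Run the load inside an explicit transaction and check @@ERROR afterwards.
-- The @@TRANCOUNT checks guard against errors that have already terminated
-- the batch or the transaction.
BEGIN TRANSACTION

BULK INSERT OurTable
FROM 'c:\OurTable.txt'
WITH (CODEPAGE = 'RAW', DATAFILETYPE = 'char', FIELDTERMINATOR = '\t',
      ROWS_PER_BATCH = 10000, TABLOCK)

IF @@ERROR <> 0 AND @@TRANCOUNT > 0
    ROLLBACK TRANSACTION
ELSE IF @@TRANCOUNT > 0
    COMMIT TRANSACTION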