Bulk load data conversion error while importing DATETIME data - sql-server

I found a few posts on this topic on StackOverflow, but none seem to solve my problem.
I am trying to set up bulk imports for SQL Server 2008 Express, and it is failing to import datetime values. The issue seems so basic that I must be missing something very simple, and I'm hoping someone else can catch the problem.
The Problem
I am importing into this table:
CREATE TABLE [dbo].[BulkTest](
[ReportDate] [datetime] NOT NULL
)
This is my format file (BulkTest.fmt):
10.0
1
1 SQLDATETIME 0 0 "\r\n" 1 ReportDate ""
This is the data being imported (BulkTest.tab):
ReportDate
2010-12-31
2011-01-31
This is the import statement:
BULK INSERT dbo.BulkTest
FROM 'Q:\...\BulkTest.tab'
WITH (
CHECK_CONSTRAINTS,
TABLOCK,
FORMATFILE='Q:\...\BulkTest.fmt',
FIRSTROW=1,
DATAFILETYPE='char'
);
These are the errors:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (ReportDate).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (ReportDate).
Msg 4832, Level 16, State 1, Line 1
I Have Tried
Changing the date format, including 12/31/2010, 31/12/2010, 20101231, 2010-12-31 00:00:00, and various other formats.
Adding/removing/changing bulk insert statement options DATAFILETYPE, TABLOCK, CHECK_CONSTRAINTS.
Changing the delimiter and the field size in the format file (though per MSDN the field size should not matter).
Running SET DATEFORMAT ymd.
Checking the imported file with a hex editor to make sure that it really does contain 8-bit characters and not Unicode; it contains exactly what is shown above, in ASCII.
Any ideas?

This is the format that works in my code:
08/17/2000 16:32:32
The producer happens to be .NET, using ToString(DateTimeFormatInfo.InvariantInfo).

Why do you need a format file for this? What happens when you don't use it?
BULK INSERT dbo.BulkTest
FROM 'Q:\...\BulkTest.tab'
WITH (
CHECK_CONSTRAINTS,
TABLOCK,
FIRSTROW=1,
DATAFILETYPE='char',
ROWTERMINATOR='\r\n'
);

Sorry I'm late to the party, but I hope this can help some other poor soul.
Anyway, I ran into the same error when using a format file.
Instead of type SQLDATETIME, I used SQLCHAR with the field length (2015-01-01 = length 10):
11.0
1
1 SQLCHAR 0 10 "\r\n" 1 my_date ""
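Applied back to the original question, the whole fix might look like this (a sketch reusing the original paths; FIRSTROW=2 skips the "ReportDate" header line, which would otherwise be read as data). The format file describes the field as SQLCHAR and lets SQL Server convert the character data to DATETIME server-side during the load:
10.0
1
1 SQLCHAR 0 10 "\r\n" 1 ReportDate ""
and the matching statement:
BULK INSERT dbo.BulkTest
FROM 'Q:\...\BulkTest.tab'
WITH (
CHECK_CONSTRAINTS,
TABLOCK,
FORMATFILE='Q:\...\BulkTest.fmt',
FIRSTROW=2
);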

Related

T-SQL BULK INSERT type mismatch

I am trying to do a simple BULK INSERT from a large CSV file to a table. The table and the file have matching columns. This is my code:
BULK INSERT myTable
FROM 'G:\Tests\mySource.csv'
WITH (
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
-- ROWTERMINATOR = '0x0a',
BATCHSIZE = 1000,
MAXERRORS = 2
)
GO
As you can see, I have tried row terminators \n and 0x0a (and a bunch more), but I keep getting a type mismatch error:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 18 (createdAt).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 18 (createdAt).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 18 (createdAt).
Msg 4865, Level 16, State 1, Line 1
Cannot bulk load because the maximum number of errors (2) was exceeded.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Column createdAt is of type datetime:
CREATE TABLE [dbo].[myTable]
(
...
[createdAt] [datetime] NULL,
...
)
These are the values of the createdAt column as taken from the first three rows:
2020-08-22 13:51:57
2020-08-22 14:13:13
2020-08-22 14:16:23
I also tried with a different date format, as suggested. I also tried changing the column type to DATETIME2(n):
2020-08-22T13:51:57
2020-08-22T14:13:13
2020-08-22T14:16:23
I have no idea what else to review.
I would appreciate any help.
Thanks!
There are many formats of string literals that SQL Server supports for conversion to dates & times - see the MSDN Books Online pages on CAST and CONVERT. Most of those formats are dependent on what settings you have - therefore, they might work sometimes - and sometimes not. And the DATETIME datatype in particular is notoriously picky about which formats of string literals work - and which others (most) don't.... DATETIME2(n) is much more forgiving and less picky to deal with!
The way to solve this is to use the (slightly adapted) ISO-8601 date format that is supported by SQL Server - this format works always - regardless of your SQL Server language and dateformat settings.
The ISO-8601 format supported by SQL Server comes in two flavors:
YYYYMMDD for just dates (no time portion); note here: no dashes! That's very important! YYYY-MM-DD is NOT independent of the dateformat settings in your SQL Server and will NOT work in all situations!
or:
YYYY-MM-DDTHH:MM:SS for dates and times - note here: this format has dashes (but they can be omitted), and a fixed T as delimiter between the date and time portion of your DATETIME.
This is valid for SQL Server 2000 and newer.
If you use SQL Server 2008 or newer and the DATE datatype (only DATE - not DATETIME!), then you can indeed also use the YYYY-MM-DD format and that will work, too, with any settings in your SQL Server.
Don't ask me why this whole topic is so tricky and somewhat confusing - that's just the way it is. But with the YYYYMMDD format, you should be fine for any version of SQL Server and for any language and dateformat setting in your SQL Server.
The recommendation for SQL Server 2008 and newer is to use DATE if you only need the date portion, and DATETIME2(n) when you need both date and time. You should try to start phasing out the DATETIME datatype if ever possible.
In your case, I'd try one of two things:
if you can - use DATETIME2(n) instead of DATETIME as your column's datatype - that alone might solve all your problems
if you can't use DATETIME2(n) - try to use 2020-08-22T13:51:57 instead of 2020-08-22 13:51:57 for specifying your date & time in the CSV import file.
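The settings-dependence described above is easy to demonstrate for yourself; this quick sketch runs in any scratch database:
SET DATEFORMAT dmy;
SELECT CAST('2020-08-22' AS datetime);          -- fails under dmy: read as year-day-month, month 22 is out of range
SELECT CAST('20200822' AS datetime);            -- works: unseparated ISO format
SELECT CAST('2020-08-22T13:51:57' AS datetime); -- works: ISO-8601 with the fixed T delimiter
SELECT CAST('2020-08-22' AS datetime2);         -- works: DATETIME2 parses this regardless of DATEFORMAT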

BULK INSERT issues with SQL Server 2012 and higher releases

We work with a 3rd party and they provide us files that are basically a dump from their DB. Our company supports SQL Server 2012 as well as SQL Server 2014 and up. I need to BULK INSERT these files and have ONE set of files work for any client.
They provide us the files, from a UNIX system, as UTF-8 encoded. I am aware that SQL Server 2012 doesn't support UTF-8. From reading on here, I have gone the route of converting those files to UTF-16 (using TextPad 8). In total there are about 22 files.
I use the following syntax:
BULK INSERT database.dbo.tablename
FROM '\\server\filename.txt'
WITH (FIRSTROW =2, ROWTERMINATOR ='0x0a')
That of course works for all the files on the SQL Server 2014 box.
ONE file of the 22 does NOT work for SQL Server 2012 and I cannot figure out what is wrong. That particular file goes into a table defined this way:
CREATE TABLE [dbo].[Map]
(
termid int NOT NULL,
mapguid char(22) NOT NULL,
mapsequence int NOT NULL,
conceptguid char(22) NOT NULL,
mapdefnguid char(22) NOT NULL,
mapquality int NULL,
CONSTRAINT [PK_Map]
PRIMARY KEY CLUSTERED ([termid] ASC, [mapguid] ASC, [mapsequence] ASC)
) ON [PRIMARY];
This is what the sample data looks like
termid mapguid mapsequence conceptguid mapdefnguid mapquality
260724 Nm9T2QFFs67xk2/zCgEDHw 0 AExH2wEce5u4wbhnqf4ZgQ TDMQWQE6UQdXAoATCgECyQ
172288 AW8L6AEj+br0hsZ3CgEBig 0 BgCTWgDjf6OlTk1oCwsLDQ AUKoDQEjn6KrxIAJCgEBmw
377707 PtArUQE7q1ajeoiRCgEDAQ 0 ACSYtQDsdrQtN1h2qf79/w TDMQWQE6UsYdrYAbCgECeg
Tab is the column separator, and LF is the row terminator character.
This is the error I get:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (termid).
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 1 (termid).
I've searched for that error on Google (and here) and have seen that you may get it if something is actually specified as literally 'NULL' instead of being blank.
I've even gone so far as to create my own file and I still get the same errors. In my own file, I actually populate the last row, thinking maybe that was causing issues, but the error seems to indicate it doesn't like something with the very first column.
Can anyone help me with some suggestions please?
I don't know if this is REALLY an answer, but somehow, the file imports fine with utf-8 encoding, which doesn't make a lot of sense to me since SQL 2012 isn't supposed to support that. I looked at the data in the table and it appears to be fine, so I don't really have an explanation there.
I then converted the file to utf-16 and re-ran the process and started getting the above errors again, so...shrug
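One thing worth checking in this scenario: when the data file genuinely is UTF-16, BULK INSERT has to be told so via DATAFILETYPE = 'widechar'; otherwise the byte-order mark and the interleaved zero bytes land in the first column and break the int conversion. A hedged sketch (I can't verify this against a 2012 box; note '\n' rather than the hex form, so the terminator is widened along with the rest of the file):
BULK INSERT database.dbo.tablename
FROM '\\server\filename.txt'
WITH (FIRSTROW = 2, ROWTERMINATOR = '\n', DATAFILETYPE = 'widechar');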

Bulk Load Data Conversion Error - Can't Find Answer

For some reason I keep receiving the following error when trying to bulk insert a CSV file into SQL Express:
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 2, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 3, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 4, column 75 (Delta_SM_RR).
... etc.
I have been attempting to insert this column as both decimal and numeric, and keep receiving this same error (if I take out this column, the same error appears for the subsequent column).
Please see below for an example of the data, all data points within this column contain decimals and are all rounded after the third decimal point:
Delta_SM_RR
168.64
146.17
95.07
79.85
60.52
61.03
-4.11
-59.57
1563.09
354.36
114.78
253.46
451.5
Any sort of help or advice would be greatly appreciated as it seems that a number of people of SO have come across this issue. Also, if anyone knows of another automated way to load a CSV into SSMS, that would be a great help as well.
Edits:
Create Table Example_Table
(
[Col_1] varchar(255),
[Col_2] numeric(10,5),
[Col_3] numeric(10,5),
[Col_4] numeric(10,5),
[Col_5] date,
[Delta_SM_RR] numeric(10,5)
)
GO
BULK INSERT
Example_Table
FROM 'C:\pathway\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
Table Schema - This is a standalone table (further calculations and additional tables are built off of this single table, however at the time of bulk insert it is the only table)
It's likely that your data has an error in it. That is, there is a character or value that can't be converted explicitly to NUMERIC or DECIMAL. One way to check this and fix it is to:
Change [Delta_SM_RR] numeric(10,5) to [Delta_SM_RR] nvarchar(256)
Run the bulk insert
Find your error row: select * from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
Fix the data at the source, or delete from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
The last two statements return/delete rows where there is something other than a digit, period, or hyphen.
For your date column you can follow the same logic above, by changing the column to VARCHAR, and then find your error by using ISDATE() to find the ones which can't be converted.
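For instance, a sketch of that check after staging [Col_5] as nvarchar:
SELECT * FROM Example_Table
WHERE [Col_5] IS NOT NULL AND ISDATE([Col_5]) = 0; -- rows whose dates won't convert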
I'll bet anything there is some weird character in your data set. Open your data set in Notepad++ and view the data. Any aberration should become apparent very quickly! The problem is coming from Col75 and it's affecting the first several rows, and thus everything that comes after that also fails to load.
Make sure that .csv is not using text qualifiers and that none of your fields in the .csv have a comma inside the desired value.
I am struggling with this issue right now. The issue is that I have a 68 column report I am trying to import.
Column 17 is a "Description" column that has a double quote text qualifier on top of the comma delimitation.
Bulk insert with a comma field terminator won't recognize the double-quote text qualifier and will munge all of the data to the right of the offending column.
It looks like, to overcome this, you need to create a .fmt file to instruct BULK INSERT which columns it needs to treat as simply delimited, and which columns it needs to treat as delimited and qualified (see this answer).
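A minimal sketch of such a format file, assuming a three-column layout where only the middle column is quoted (the column names and lengths here are invented): the first field's terminator consumes the comma plus the opening quote, and the second field's terminator consumes the closing quote plus the comma:
11.0
3
1 SQLCHAR 0 100 ",\"" 1 col_before ""
2 SQLCHAR 0 500 "\"," 2 description ""
3 SQLCHAR 0 100 "\r\n" 3 col_after ""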

BULK INSERT from CSV into SQL Server causes error

I've got the simple table in CSV format:
999,"01/01/2001","01/01/2001","7777777","company","channel","01/01/2001"
990,"01/01/2001","01/01/2001","767676","hhh","tender","01/01/2001"
3838,"01/01/2001","01/01/2001","888","jhkh","jhkjh","01/01/2001"
08987,"01/01/2001","01/01/2001","888888","hkjhjkhv","jhgjh","01/01/2001"
8987,"01/01/2001","01/01/2001","9999","jghg","hjghg","01/01/2001"
jhkjhj,"01/01/2001","01/01/2001","9999","01.01.2001","hjhh","01/01/2001"
090009,"","","77777","","","01/01/2001"
980989,"01/01/2001","01/01/2001","888","","jhkh","01/01/2001"
0000,"01/01/2001","01/01/2001","99999","jhjh","","01/01/2001"
92929,"01/01/2001","01/01/2001","222","","","01/01/2001"
I'm trying to import that data into SQL Server using BULK INSERT (Transact-SQL)
set dateformat DMY;
BULK INSERT Oracleload
FROM '\\Mac\Home\Desktop\Test\T_DOGOVOR.csv'
WITH
(FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
KEEPNULLS);
I get the following error:
Msg 4864, Level 16, State 1, Line 4
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 2 (date_begin)....
Maybe something is wrong with the date format. What script do I need to write to fix that error?
Please help.
Thanks in advance.
Neither BULK INSERT nor bcp can (properly) handle CSV files, especially if they have (correctly) quoted fields. Alternatives are SSIS or PowerShell.
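For what it's worth, SQL Server 2017 and later added native CSV support to BULK INSERT, which does understand quoted fields; a sketch against the statement above:
BULK INSERT Oracleload
FROM '\\Mac\Home\Desktop\Test\T_DOGOVOR.csv'
WITH (FORMAT = 'CSV', FIELDQUOTE = '"', ROWTERMINATOR = '\n');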
I always look at the data in Notepad++ to see if there are weird or non-printable characters, like a stray line break or something. For this case, it seems like you could open the file in Notepad (if you don't have Notepad++), do a find-replace of " to nothing, save the file, and re-do the bulk load.
This record:
jhkjhj,"01/01/2001","01/01/2001","9999","01.01.2001","hjhh","01/01/2001"
The first column has a numeric type of some kind. You can't put the jhkjhj value into that field.
Additionally, some records have empty values ("") in date fields. These are likely to be interpreted as empty strings, rather than null dates, and not convert properly.
But the error refers to "row 1, column 2". That's this value:
"01/01/2001"
Again, the import is interpreting this as a string, rather than a date. I suspect it's trying to import the quotes (") instead of just using them as separators.
You might try bulk loading into a special holding table and then re-importing from there. Alternatively, you can change how the data is exported, or write a program to pre-clean it: strip the quotes from fields that shouldn't have them, and isolate records whose data won't insert into an exception file and report on them.
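A sketch of the holding-table route, using TRY_CONVERT (SQL Server 2012+) so bad values surface as NULLs instead of aborting the load. Only the first three columns are shown, and the staging column names other than date_begin are guesses:
CREATE TABLE OracleloadStaging (
id varchar(50),
date_begin varchar(50),
date_end varchar(50)
-- remaining columns, all varchar
);
-- BULK INSERT into the staging table as before, then:
INSERT INTO Oracleload (id, date_begin, date_end)
SELECT TRY_CONVERT(int, REPLACE(id, '"', '')),
TRY_CONVERT(date, NULLIF(REPLACE(date_begin, '"', ''), ''), 103), -- style 103 = dd/mm/yyyy
TRY_CONVERT(date, NULLIF(REPLACE(date_end, '"', ''), ''), 103)    -- NULLIF turns "" into NULL dates
FROM OracleloadStaging;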

Using Default constraint during BULK INSERT where data conversion error occurs

I'm attempting to use BULK INSERT to insert 3rd-party reports. One of the columns holds call duration using the TIME data type; however, if for some reason there is no call time, the report labels it as N/A, which results in errors during the BULK INSERT:
Msg 4864, Level 16, State 1, Line 20
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 9, column 11 (call_duration).
The column in question, call_duration, has a default constraint of '00:00:00', which I would like the system to use if/when an error is flagged in that column during the BULK INSERT (or in any column, for that matter, that has a default constraint).
UPDATE: Here's my BULK INSERT statement:
BULK INSERT dbo.TempYellowPages
FROM 'Z:\YP.txt'
WITH (
FIRSTROW=2,
FIELDTERMINATOR='\t',
ROWTERMINATOR='\n',
MAXERRORS = 99
)
I'm looking to use the default constraint of the columns within the TempYellowPages table when there's an issue with the data. I can't use CONVERT (to my knowledge) as the data isn't in a source table, it's coming directly from a file. Here's an example of some of the fields the file could have:
Date Time Caller Name Caller Number Call Duration
9/2/2015 4:03:18 PM John Smith (555) 444-1115 0:04:38
9/2/2015 10:53:09 AM Thomas Bush (555) 444-1115 N/A
9/2/2015 10:26:28 AM Burt Fenimore (555) 444-1115 0:05:53
Convert the call_duration column to varchar in the source query, then use the CONVERT function. Be aware of how SQL Server stores '00:00:00' (maybe as '1900-01-01'), and change the report condition to show 'N/A' according to that '1900...' value.
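A sketch of that idea as a staging-table workflow. The staging table name and the trimmed-down column list are hypothetical, and note that this hardcodes the default value in the query; BULK INSERT falls back to a column default only for empty fields, never for values like N/A that fail conversion:
CREATE TABLE dbo.TempYellowPagesStaging (
call_date varchar(30),
caller_name varchar(100),
caller_number varchar(30),
call_duration varchar(20) -- 'N/A' loads fine as plain text
);
-- BULK INSERT into the staging table with the same options, then:
INSERT INTO dbo.TempYellowPages (call_date, caller_name, caller_number, call_duration)
SELECT TRY_CONVERT(datetime, call_date),
caller_name,
caller_number,
COALESCE(TRY_CONVERT(time(0), NULLIF(call_duration, 'N/A')), '00:00:00')
FROM dbo.TempYellowPagesStaging;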
