Bulk Load Data Conversion Error - Can't Find Answer - sql-server

For some reason I keep receiving the following error when trying to bulk insert a CSV file into SQL Express:
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 2, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 3, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 4, column 75 (Delta_SM_RR).
... etc.
I have been attempting to insert this column as both decimal and numeric, and keep receiving this same error (if I take out this column, the same error appears for the subsequent column).
Please see below for an example of the data; all data points within this column are decimals rounded to at most three decimal places:
Delta_SM_RR
168.64
146.17
95.07
79.85
60.52
61.03
-4.11
-59.57
1563.09
354.36
114.78
253.46
451.5
Any sort of help or advice would be greatly appreciated, as it seems that a number of people on SO have come across this issue. Also, if anyone knows of another automated way to load a CSV into SQL Server, that would be a great help as well.
Edits:
Create Table Example_Table
(
[Col_1] varchar(255),
[Col_2] numeric(10,5),
[Col_3] numeric(10,5),
[Col_4] numeric(10,5),
[Col_5] date,
[Delta_SM_RR] numeric(10,5)
)
GO
BULK INSERT
Example_Table
FROM 'C:\pathway\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
Table Schema - This is a standalone table (further calculations and additional tables are built off of this single table; however, at the time of bulk insert it is the only table)

It's likely that your data has an error in it; that is, there is a character or value that can't be explicitly converted to NUMERIC or DECIMAL. One way to check and fix this is to:
1. Change [Delta_SM_RR] numeric(10,5) to [Delta_SM_RR] nvarchar(256)
2. Run the bulk insert
3. Find your error rows: select * from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
4. Fix the data at the source, or delete the bad rows: delete from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
The last two statements return/delete rows where there is something other than a digit, period, or hyphen.
For your date column you can follow the same logic: change the column to VARCHAR, then use ISDATE() to find the values which can't be converted, as in the sketch below.
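A minimal sketch of that check, reusing Example_Table from above and assuming [Col_5] was loaded as varchar:
-- returns the rows whose [Col_5] text can't be converted to a date
select * from Example_Table where ISDATE([Col_5]) = 0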

I'll bet anything there is some weird character in your data set. Open your data set in Notepad++ and view the data; any aberration should become apparent very quickly! The problem is coming from column 75, and it's affecting the first several rows, so everything that comes after also fails to load.

Make sure the .csv is not using text qualifiers and that none of the fields in the .csv have a comma inside the desired value.
I am struggling with this right now: I have a 68-column report I am trying to import.
Column 17 is a "Description" column that has a double quote text qualifier on top of the comma delimitation.
Bulk insert with a comma field terminator won't recognize the double-quote text qualifier and will munge all of the data to the right of the offending column.
It looks like, to overcome this, you need to create a .fmt file to instruct BULK INSERT which columns it needs to treat as simply delimited, and which columns it needs to treat as delimited and qualified (see this answer).
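For illustration, a hedged sketch of what such a non-XML format file can look like, for a three-column file where only the middle column is double-quoted; the table name, column names, and the 12.0 version header here are assumptions, so adjust them to your schema and server version:
/* report.fmt -- field 1 is terminated by ," and field 2 by ",
   so the qualifier quotes are consumed as part of the terminators:

12.0
3
1  SQLCHAR  0  0  ",\""     1  Col_A        ""
2  SQLCHAR  0  0  "\","     2  Description  ""
3  SQLCHAR  0  0  "\r\n"    3  Col_C        ""
*/
BULK INSERT dbo.Report
FROM 'C:\pathway\file.csv'
WITH
(
FORMATFILE = 'C:\pathway\report.fmt',
FIRSTROW = 2
);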

Related

BULK INSERT from CSV into SQL Server causes error

I've got the simple table in CSV format:
999,"01/01/2001","01/01/2001","7777777","company","channel","01/01/2001"
990,"01/01/2001","01/01/2001","767676","hhh","tender","01/01/2001"
3838,"01/01/2001","01/01/2001","888","jhkh","jhkjh","01/01/2001"
08987,"01/01/2001","01/01/2001","888888","hkjhjkhv","jhgjh","01/01/2001"
8987,"01/01/2001","01/01/2001","9999","jghg","hjghg","01/01/2001"
jhkjhj,"01/01/2001","01/01/2001","9999","01.01.2001","hjhh","01/01/2001"
090009,"","","77777","","","01/01/2001"
980989,"01/01/2001","01/01/2001","888","","jhkh","01/01/2001"
0000,"01/01/2001","01/01/2001","99999","jhjh","","01/01/2001"
92929,"01/01/2001","01/01/2001","222","","","01/01/2001"
I'm trying to import that data into SQL Server using BULK INSERT (Transact-SQL)
set dateformat DMY;
BULK INSERT Oracleload
FROM '\\Mac\Home\Desktop\Test\T_DOGOVOR.csv'
WITH
(FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
KEEPNULLS);
In the output I get the following error:
Msg 4864, Level 16, State 1, Line 4
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 2 (date_begin)....
Maybe something is wrong with the date format. But what script do I need to write to fix that error?
Please help.
Thanks in advance.
Neither BULK INSERT nor bcp can (properly) handle CSV files, especially if they (correctly) contain " quotes. Alternatives are SSIS or PowerShell.
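For what it's worth, on SQL Server 2017 and later BULK INSERT did gain a native CSV mode that understands quoted fields. A minimal sketch, reusing the table and path from the question:
BULK INSERT Oracleload
FROM '\\Mac\Home\Desktop\Test\T_DOGOVOR.csv'
WITH
(
FORMAT = 'CSV', -- SQL Server 2017+ only
FIELDQUOTE = '"', -- treat the double quotes as text qualifiers
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);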
I always look at the data in Notepad++ to see if there are weird or non-printable characters, like a line break or something. For this case, it seems like you can open the file in Notepad (if you don't have Notepad++), do a find-and-replace of " with nothing, save the file, and re-run the bulk load.
This record:
jhkjhj,"01/01/2001","01/01/2001","9999","01.01.2001","hjhh","01/01/2001"
The first column has a numeric type of some kind. You can't put the jhkjhj value into that field.
Additionally, some records have empty values ("") in date fields. These are likely to be interpreted as empty strings, rather than null dates, and will not convert properly.
But the error refers to "row 1, column 2". That's this value:
"01/01/2001"
Again, the import is interpreting this as a string, rather than a date. I suspect it's trying to import the quotes (") instead of just using them as separators.
You might try bulk loading into a special holding table and then re-importing from there. Alternatively, you can change how the data is exported, or write a program to pre-clean it: strip the quotes from fields that shouldn't have them, and isolate records whose data won't insert into an exception file and report on them.
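A minimal sketch of the holding-table approach: everything is loaded as plain text first, then cleaned and converted. Apart from Oracleload and date_begin (named in the error), the staging table and column names here are assumptions:
CREATE TABLE OracleloadStaging
(
id varchar(50),
date_begin varchar(50),
date_end varchar(50),
contract_no varchar(50),
company varchar(255),
channel varchar(255),
date_reg varchar(50)
);

BULK INSERT OracleloadStaging
FROM '\\Mac\Home\Desktop\Test\T_DOGOVOR.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', KEEPNULLS);

-- Strip the quotes and convert; TRY_CONVERT (SQL Server 2012+) returns NULL
-- instead of failing on values such as 'jhkjhj' or '01.01.2001'.
INSERT INTO Oracleload
SELECT TRY_CONVERT(int, REPLACE(id, '"', '')),
TRY_CONVERT(date, REPLACE(date_begin, '"', ''), 103),
TRY_CONVERT(date, REPLACE(date_end, '"', ''), 103),
REPLACE(contract_no, '"', ''),
REPLACE(company, '"', ''),
REPLACE(channel, '"', ''),
TRY_CONVERT(date, REPLACE(date_reg, '"', ''), 103)
FROM OracleloadStaging;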

Using Default constraint during BULK INSERT where data conversion error occurs

I'm attempting to use BULK INSERT to insert 3rd-party reports. One of the columns holds call duration using the TIME data type; however, if for some reason there is no call time, the report labels it as N/A, which results in errors during the BULK INSERT:
Msg 4864, Level 16, State 1, Line 20
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 9, column 11 (call_duration).
The column in question, call_duration, has a default constraint of '00:00:00', which I would like the system to use if/when an error is flagged during the BULK INSERT in that column (or any column, for that matter, with a default constraint).
UPDATE: Here's my BULK INSERT statement:
BULK INSERT dbo.TempYellowPages
FROM 'Z:\YP.txt'
WITH (
FIRSTROW=2,
FIELDTERMINATOR='\t',
ROWTERMINATOR='\n',
MAXERRORS = 99
)
I'm looking to use the default constraint of the columns within the TempYellowPages table when there's an issue with the data. I can't use CONVERT (to my knowledge) as the data isn't in a source table; it's coming directly from a file. Here's an example of some of the fields the file could have:
Date Time Caller Name Caller Number Call Duration
9/2/2015 4:03:18 PM John Smith (555) 444-1115 0:04:38
9/2/2015 10:53:09 AM Thomas Bush (555) 444-1115 N/A
9/2/2015 10:26:28 AM Burt Fenimore (555) 444-1115 0:05:53
Import the call_duration column as VARCHAR, then use the CONVERT function when moving the data into the final table. Be aware of how SQL Server interprets '00:00:00' (possibly as '1900-01-01'), and change the report condition that shows 'N/A' according to that '1900...' value.
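A minimal sketch of that idea, assuming call_duration is staged as VARCHAR in TempYellowPages; the final table dbo.YellowPages and the other column names are assumptions:
-- 'N/A' fails to convert, TRY_CONVERT returns NULL, and ISNULL then
-- substitutes the same value as the column's default constraint.
INSERT INTO dbo.YellowPages (call_date, caller_name, caller_number, call_duration)
SELECT call_date,
caller_name,
caller_number,
ISNULL(TRY_CONVERT(time(0), call_duration), '00:00:00')
FROM dbo.TempYellowPages;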

Loading CSV File into SQL Server database

Working in the Visual Studio database feature, I've got two tables and I need to load a .csv file into them. I broke the .csv file out across my first and second tables. I'm trying a bulk insert:
BULK INSERT Course
FROM 'E:\CourseInfo.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
TABLOCK
)
Everything seems right to me but I receive an error saying:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 3 (ID).
Here is a snippet of my .CSV file being used.
CourseID,CourseTitle,ID,
AC107,Accounting I,1,
AC107,Accounting I,2,
AC110,Payroll Accounting,3,
AC212,Taxation I,4,
AC212,Taxation I,5,
What is meant by mismatch or invalid char? I've tried removing all the values for column 'ID', but that still produced the same error. I had ID set to auto-increment by setting "isEntity", and have tried it both ways, set to true and false; still the same error.
Possible error: I laid a huge egg; I think I have to normalize this first, because there are multiples of the same class with different IDs, and that isn't right.
To answer my own question: I had my PK set to ID, and the error was showing up for duplicate rows where there was duplicate data. I have to remove duplicate CourseID and CourseTitle rows prior to populating my ID column with values, as sketched below.
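A hedged sketch of that dedup step, assuming the Course table from the question and keeping the lowest ID of each duplicate pair:
;WITH d AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CourseID, CourseTitle ORDER BY ID) AS rn
FROM Course
)
DELETE FROM d
WHERE rn > 1; -- removes every duplicate beyond the first occurrence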

Handling embedded new lines when creating/selecting External Tables in SQL Data Warehouse

In SQL Data Warehouse (editors, please don't change this; it is the actual name, see: here) I have a JobCandidate_ext external table that looks like this.
CREATE EXTERNAL TABLE [HumanResources].[JobCandidate_ext](
[JobCandidateID] int,
[BusinessEntityID] int,
[Resume] Varchar(8000),
[ModifiedDate] Datetime
)
WITH (
LOCATION='/[HumanResources].[JobCandidate]/data.txt',
DATA_SOURCE=AzureStorage,
FILE_FORMAT=TextFile)
GO
The column [Resume] was an XML type in SQL Server but in SQL Data Warehouse XML types should be converted to varchar(8000) as described here.
I am using a flat file data.txt to export the data to a blob and then create an external table from it.
The [Resume] column has carriage returns in it (as expected from an XML file), and so when you run a SELECT * FROM [HumanResources].[JobCandidate_ext] you get an error. In this case:
Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 2 rows processed.
(/[HumanResources].[JobCandidate]/data.txt)Column ordinal: 0, Expected data type: INT, Offending value: some text .... (Column Conversion Error), Error: Error converting data type NVARCHAR to INT.
I know that I cannot configure a row delimiter when creating external tables as described here.
The row delimiter must be UTF-8 and supported by Hadoop’s LineRecordReader. The row delimiter must be either '\r', '\n', or '\r\n'. These are not user-configurable.
And if you try to put quotes on each column field you get this error while selecting rows from the external table: No closing string delimiter.
Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.
(/[HumanResources].[JobCandidate]/data.txt)Column ordinal: 2, Expected data type: VARCHAR(8000) collate SQL_Latin1_General_CP1_CI_AS, Offending value: 'ShaiBassli (Tokenization failed), Error: No closing string delimiter.
Is there a way to get around this issue?
Today, PolyBase does not allow for row or field delimiters inside fields, i.e. it does not allow you to escape these characters. As Greg pointed out, you can vote for this functionality here: https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/10600132-polybase-allow-line-ends-within-qualified-text-f
To work around this limitation, you can either pre-process the data (using sed or tr, for example) to replace the unwanted characters before reading it with PolyBase, or you can switch to other PolyBase-supported file formats (RCFile/ORC/Parquet) to avoid dealing with row and field delimiters completely.

fetching master table data, getting error

I want to get text values from a master table corresponding to a string (which is a comma-separated string of the master table's id column) stored in another table.
I am trying:
select maritialtype from tblmastermaritialstatus where MaritalStatusId in (select MaritalStatusId from tblPartnerBasicDetail where userid = 1)
MaritalStatusId in tblPartnerBasicDetail is a string like 1,2,3
I am getting this error:
Msg 245, Level 16, State 1, Line 1 Conversion failed when converting
the varchar value '1,2,3' to data type tinyint.
How do I resolve it?
Comma-separated nvarchar data is not the same as comma-separated integers.
You are doing something similar to:
WHERE 1 IN ('1,2,3')
1 is an integer; '1,2,3' is a single string (which cannot be implicitly converted to an integer). Therefore you are getting an error.
I would recommend normalising your data so that there is no need for comma-separated values.
In the long run this will save you a lot of issues.
However, if you wish to stick with CSV, you may find this article helpful:
http://www.nigelrivett.net/SQLTsql/InCsvStringParameter.html
Check the fn_ParseCSVString part specifically; a sketch of an alternative follows below.
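Alternatively, on SQL Server 2016 and later the built-in STRING_SPLIT function can do the parsing without a custom function; a minimal sketch, assuming the table and column names from the question:
select m.maritialtype
from tblmastermaritialstatus as m
where m.MaritalStatusId in
(
select convert(tinyint, s.value) -- each split value converted back to tinyint
from tblPartnerBasicDetail as p
cross apply string_split(p.MaritalStatusId, ',') as s
where p.userid = 1
);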
