Using Default constraint during BULK INSERT where data conversion error occurs - sql-server

I'm attempting to use BULK INSERT to load third-party reports. One of the columns holds the call duration as a TIME data type; however, when the provider has no call time, the report labels it N/A, which causes errors during the BULK INSERT:
Msg 4864, Level 16, State 1, Line 20
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 9, column 11 (call_duration).
The column in question, call_duration, has a default constraint of '00:00:00', which I would like the system to use whenever an error is flagged in that column during the BULK INSERT (or in any other column that has a default constraint).
UPDATE: Here's my BULK INSERT statement:
BULK INSERT dbo.TempYellowPages
FROM 'Z:\YP.txt'
WITH (
FIRSTROW=2,
FIELDTERMINATOR='\t',
ROWTERMINATOR='\n',
MAXERRORS = 99
)
I'm looking to use the default constraint of the columns within the TempYellowPages table when there's an issue with the data. I can't use CONVERT (to my knowledge) as the data isn't in a source table, it's coming directly from a file. Here's an example of some of the fields the file could have:
Date Time Caller Name Caller Number Call Duration
9/2/2015 4:03:18 PM John Smith (555) 444-1115 0:04:38
9/2/2015 10:53:09 AM Thomas Bush (555) 444-1115 N/A
9/2/2015 10:26:28 AM Burt Fenimore (555) 444-1115 0:05:53

Convert the "call_duration" column to varchar in the source query using the CONVERT function. Be aware of how SQL Server interprets '00:00:00' (possibly as '1900-01-01'), and change the report condition to show 'N/A' for that '1900...' value.
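One way to read the suggestion above, sketched under assumptions: load call_duration as text (for example into a staging table whose column is varchar), then convert it afterwards and fall back to '00:00:00' when the text is not a valid time. The table variable and sample values below are hypothetical stand-ins, and TRY_CONVERT assumes SQL Server 2012 or later.

```sql
-- Hypothetical stand-in for a staging table that BULK INSERT would load as plain text.
DECLARE @Staging TABLE (call_duration varchar(20));

INSERT INTO @Staging (call_duration)
VALUES ('0:04:38'), ('N/A'), ('0:05:53');

-- TRY_CONVERT returns NULL instead of raising an error when the text is not a valid time,
-- so COALESCE can substitute the value the default constraint would have supplied.
SELECT call_duration AS raw_value,
       COALESCE(TRY_CONVERT(time(0), call_duration),
                CAST('00:00:00' AS time(0))) AS call_duration_clean
FROM @Staging;
```

The same SELECT could feed an INSERT into dbo.TempYellowPages once the file has been loaded into the text-typed staging table.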

Related

Select text column from table in SQL Server stored procedure

I am having difficulty figuring this out. I have an incident table that contains columns id, comments, incidentdate, and incidentdescID. There are 10 years worth of data in this table. I wrote a stored procedure to extract the last 4 years worth of data but I am running into the following error.
Msg 8152, Level 16, State 10, Line 27
String or binary data would be truncated.
So when I change the date range for the incident to be between 2015 to 2016 I am not getting an error. Then when I change it to be between 2017-2018 I am still not getting an error. But when I change it to be between 2016-2017 I get the error. Also when I comment out the comments column, I do not get an error no matter what date range I put.
So I was thinking there might be a special character in the Comments column which is a text column in the Incident table. If that is the case how would I be able to select that column but remove the special characters in the stored procedure without making changes to the table?
If you suspect that your "Comments" column is the culprit, you can search it for junk values. I got this error once and fixed it by replacing CHAR(10) and CHAR(13) with blanks.
-- The CAST is needed because REPLACE does not accept the text data type
SELECT REPLACE(REPLACE(CAST(tbl.comments AS varchar(max)), CHAR(10), '*JUNK*'), CHAR(13), '*JUNK*') AS CleanedComments
FROM [your table name] tbl
Copy the query result into any editor and search for records containing the *JUNK* marker.
This typically happens when you import data from Excel files or from source tables with the NVARCHAR data type while your destination is a CSV or accepts only VARCHAR.
If that is your case, you simply need to apply the REPLACE function to the affected column(s) in your procedure.
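As a rough sketch of that last step, the query below lists only the rows that contain a carriage return or line feed and shows the cleaned value next to the original length. The table variable and its rows are hypothetical; in the real procedure the Incident table would take its place.

```sql
-- Hypothetical sample data standing in for the Incident table.
DECLARE @Incident TABLE (id int, comments varchar(max));

INSERT INTO @Incident (id, comments)
VALUES (1, 'Clean comment'),
       (2, 'Line one' + CHAR(13) + CHAR(10) + 'line two');

-- Return only rows that contain CR or LF, with those characters stripped or replaced.
SELECT id,
       REPLACE(REPLACE(comments, CHAR(13), ''), CHAR(10), ' ') AS CleanedComments,
       LEN(comments) AS CommentLength
FROM @Incident
WHERE comments LIKE '%' + CHAR(10) + '%'
   OR comments LIKE '%' + CHAR(13) + '%';
```

Because error 8152 is about truncation, comparing CommentLength against the size of the destination column may also help narrow down the offending rows.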

Bulk Load Data Conversion Error - Can't Find Answer

For some reason I keep receiving the following error when trying to bulk insert a CSV file into SQL Express:
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 2, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 3, column 75 (Delta_SM_RR).
Msg 4864, Level 16, State 1, Line 89
Bulk load data conversion error (type mismatch or invalid character for the
specified codepage) for row 4, column 75 (Delta_SM_RR).
... etc.
I have been attempting to insert this column as both decimal and numeric, and keep receiving this same error (if I take out this column, the same error appears for the subsequent column).
Please see below for an example of the data, all data points within this column contain decimals and are all rounded after the third decimal point:
Delta_SM_RR
168.64
146.17
95.07
79.85
60.52
61.03
-4.11
-59.57
1563.09
354.36
114.78
253.46
451.5
Any sort of help or advice would be greatly appreciated, as it seems that a number of people on SO have come across this issue. Also, if anyone knows of another automated way to load a CSV into SSMS, that would be a great help as well.
Edits:
Create Table Example_Table
(
[Col_1] varchar(255),
[Col_2] numeric(10,5),
[Col_3] numeric(10,5),
[Col_4] numeric(10,5),
[Col_5] date,
[Delta_SM_RR] numeric(10,5),
)
GO
BULK INSERT
Example_Table
FROM 'C:\pathway\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
Table Schema - This is a standalone table (further calculations and additional tables are built off of this single table, however at the time of bulk insert it is the only table)
It's likely that your data has an error in it; that is, there is a character or value that can't be converted to NUMERIC or DECIMAL. One way to check this and fix it is to:
1. Change [Delta_SM_RR] numeric(10,5) to [Delta_SM_RR] nvarchar(256)
2. Run the bulk insert
3. Find your error rows: select * from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
4. Fix the data at the source, or delete from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
The last two statements return or delete rows that contain something other than a digit, period, or hyphen.
For your date column you can follow the same logic: change the column to VARCHAR, and then use ISDATE() to find the values that can't be converted.
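A variation on the steps above, offered as a sketch and not as the answerer's method: once the columns are loaded as text, TRY_CONVERT (SQL Server 2012 or later) can flag any value that would fail conversion to the intended type. The table variable and sample rows are hypothetical.

```sql
-- Hypothetical text-typed copy of two columns from Example_Table.
DECLARE @Raw TABLE (Col_5 varchar(50), Delta_SM_RR nvarchar(256));

INSERT INTO @Raw (Col_5, Delta_SM_RR)
VALUES ('2015-09-02', N'168.64'),
       ('2015-09-02', N'N/A'),
       ('not a date', N'-4.11');

-- Rows are returned when a non-NULL text value cannot be converted to the target type.
SELECT Col_5, Delta_SM_RR
FROM @Raw
WHERE (Delta_SM_RR IS NOT NULL AND TRY_CONVERT(numeric(10,5), Delta_SM_RR) IS NULL)
   OR (Col_5 IS NOT NULL AND TRY_CONVERT(date, Col_5) IS NULL);
```

Unlike the LIKE pattern, this also catches values made only of allowed characters that are still not valid numbers, such as '1.2.3' or a lone '-'.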
I'll bet anything there is some weird character in your data set. Open your data set in Notepad++ and view the data. Any aberration should become apparent very quickly! The problem is coming from Col75 and it's affecting the first several rows, and thus everything that comes after that also fails to load.
Make sure that .csv is not using text qualifiers and that none of your fields in the .csv have a comma inside the desired value.
I am struggling with this issue right now. The issue is that I have a 68 column report I am trying to import.
Column 17 is a "Description" column that has a double quote text qualifier on top of the comma delimitation.
Bulk insert with a comma field terminator won't recognize the double-quote text qualifier, and it munges all of the data to the right of the offending column.
It looks like to overcome this, you need to create a .fmt file to instruct the Bulk Insert which columns it needs to treat as simple delimited, and which columns it needs to treat as delimited and qualified (see this answer).
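As an alternative to a format file (a different technique from the one described above), SQL Server 2017 and later accept FORMAT = 'CSV' and FIELDQUOTE options on BULK INSERT, which handle quoted fields that contain commas. The sketch below reuses the table name and path from the question earlier in this thread and assumes a server version that supports these options.

```sql
-- Assumes SQL Server 2017 (14.x) or later; earlier versions reject FORMAT and FIELDQUOTE.
BULK INSERT Example_Table
FROM 'C:\pathway\file.csv'
WITH
(
    FORMAT = 'CSV',          -- parse the file as quoted CSV
    FIELDQUOTE = '"',        -- character used as the text qualifier
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2             -- skip the header row
);
```

On older versions, the format-file approach described above remains the option to try.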

Loading CSV File into SQL Server database

Working in the Visual Studio database feature. I've got two tables and I need to load a .csv file into them. I broke out the .csv file into my first and 2nd table. I'm trying a bulk insert
BULK INSERT Course
FROM 'E:\CourseInfo.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
TABLOCK
)
Everything seems right to me but I receive an error saying:
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 3 (ID).
Here is a snippet of my .CSV file being used.
CourseID,CourseTitle,ID,
AC107,Accounting I,1,
AC107,Accounting I,2,
AC110,Payroll Accounting,3,
AC212,Taxation I,4,
AC212,Taxation I,5,
What is meant by mismatch or invalid char? I've tried removing all the values for column 'ID' but that still rendered the same error. I had ID set to auto-increment setting it as "isEntity" but have tried both ways with it set to true and false, still same error.
Possible error: Laid a huge egg, think I have to normalize this prior to doing this because there are multiples of the same class with different ID and that isn't right.
To answer my own question: I had my PK set to ID, and the error was showing up for rows that contained duplicate data. I have to remove the duplicate CourseID and CourseTitle rows before populating my ID column with values.
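A minimal sketch of that de-duplication step, assuming the CSV is first loaded into a staging table without the ID column. The table variable and column sizes are hypothetical; the rows mirror the snippet from the question.

```sql
-- Hypothetical staging data taken from the CSV snippet, without the ID column.
DECLARE @CourseStaging TABLE (CourseID varchar(10), CourseTitle varchar(100));

INSERT INTO @CourseStaging (CourseID, CourseTitle)
VALUES ('AC107', 'Accounting I'),
       ('AC107', 'Accounting I'),
       ('AC110', 'Payroll Accounting'),
       ('AC212', 'Taxation I'),
       ('AC212', 'Taxation I');

-- DISTINCT collapses the repeated course rows into one row per course.
SELECT DISTINCT CourseID, CourseTitle
FROM @CourseStaging
ORDER BY CourseID;
```

The result could then be inserted into Course, leaving an identity ID column to generate its own values.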

How to query for rows containing <Unable to read data> in a column?

I have a SQL table in which some columns, when viewed in SQL Server Manager, contain <Unable to read data>. Does anyone know how to query for <Unable to read data>? I can individually modify the data in this column with update table set column = NULL where key = 'value', but how can I find whether additional rows exist with this bad data?
I would recommend against replacing the data. There is nothing wrong with it; it is just that SSMS cannot display it properly in the Edit panel. The data in the database itself is perfectly fine, from your description.
This script shows the problem:
create table test (id int not null identity(1,1) primary key,
large_value numeric(38,0));
go
insert into test (large_value) values (1);
insert into test (large_value) values (12345678901234567890123456789012345678);
insert into test (large_value) values (1234567890123456789012345678901234567);
insert into test (large_value) values (123456789012345678901234567890123456);
insert into test (large_value) values (12345678901234567890123456789012345);
insert into test (large_value) values (1234567890123456789012345678901234);
insert into test (large_value) values (123456789012345678901234567890123);
insert into test (large_value) values (12345678901234567890123456789012);
insert into test (large_value) values (1234567890123456789012345678901);
insert into test (large_value) values (123456789012345678901234567890);
insert into test (large_value) values (12345678901234567890123456789);
insert into test (large_value) values (NULL);
go
select * from test;
go
The SELECT will work fine, but Edit Top 200 Rows in Object Explorer will not: it shows <Unable to read data> for the affected values.
There is a Connect Item for this issue. SSMS 2012 still exhibits the same problem.
If we look at the Numeric and Decimal details we'll see that the problem occurs at a weird boundary, at precision 29 which is actually not a SQL Server boundary (precision 28 is):
Precision    Storage bytes
1-9          5
10-19        9
20-28        13
29-38        17
If we check the .Net (SSMS is a managed application) decimal precision table we can see quickly where the crux of the issue is: Precision is 28-29 significant digits. So the .Net decimal type cannot map high precision (>29) SQL Server numeric/decimal types.
This will affect not only SSMS display, but your applications as well. Specialized applications like SSIS will use high precisions representation like DT_NUMERIC:
DT_NUMERIC An exact numeric value with a fixed precision and scale.
This data type is a 16-byte unsigned integer with a separate sign, a
scale of 0 - 38, and a maximum precision of 38.
Now back to your problem: you can discover invalid entries by simply looking at the value. Knowing that the C# representation can accommodate values between approximately (-7.9 x 10^28 to 7.9 x 10^28) / (10^(0 to 28)) (the range depends on the scale), you can search for values outside that range on each column (the actual bounds to search between will depend on the column scale). But that raises the question 'what to replace the data with?'.
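As a sketch of such a range search for a scale-0 column: the bound used below, 79228162514264337593543950335 (2^96 - 1), is the largest value a .NET System.Decimal can hold. The table variable is a hypothetical stand-in for the test table, and the assumption that SSMS fails exactly for values beyond this bound follows from the reasoning above and has not been verified here.

```sql
-- Hypothetical stand-in for the test table, with a few of the values from the script.
DECLARE @test TABLE (id int IDENTITY(1,1) PRIMARY KEY, large_value numeric(38,0));

INSERT INTO @test (large_value)
VALUES (1),
       (12345678901234567890123456789012345678),
       (12345678901234567890123456789),
       (NULL);

-- Rows whose magnitude exceeds the .NET decimal maximum for scale 0.
-- For a column with scale s, divide the bound by 10^s.
SELECT id, large_value
FROM @test
WHERE ABS(large_value) > 79228162514264337593543950335;
```

NULL values are never returned by this predicate, which matches the intent, since NULL displays without trouble.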
I would recommend instead using dedicated tools for import export, tools that are capable of handling high precision numeric values. SSIS is the obvious candidate. But even the modest bcp.exe would also fit the bill.
BTW, if your values are actually incorrect (i.e. true corruption), then I would recommend running DBCC CHECKTABLE (...) WITH DATA_PURITY:
DATA_PURITY
Causes DBCC CHECKDB to check the database for column values that are not valid or out-of-range. For example, DBCC CHECKDB detects
columns with date and time values that are larger than or less than
the acceptable range for the datetime data type; or decimal or
approximate-numeric data type columns with scale or precision values
that are not valid.
For databases created in SQL Server 2005 and later, column-value integrity checks are enabled by default and do not require the
DATA_PURITY option. For databases upgraded from earlier versions of
SQL Server, column-value checks are not enabled by default until DBCC
CHECKDB WITH DATA_PURITY has been run error free on the database.
After this, DBCC CHECKDB checks column-value integrity by default.
Q: How can this issue arise for a datetime column?
use tempdb;
go
create table test(d datetime)
insert into test (d) values (getdate())
select %%physloc%%, * from test;
-- Row is on page 0x9100000001000000
dbcc traceon(3604,-1);
dbcc page(2,1,145,3);
Memory Dump #0x000000003FA1A060
0000000000000000: 10000c00 75f9ff00 6aa00000 010000 ....uùÿ.j .....
Slot 0 Column 1 Offset 0x4 Length 8 Length (physical) 8
dbcc writepage(2,1,145, 100, 8, 0xFFFFFFFFFFFFFFFF)
dbcc checktable('test') with data_purity;
Msg 2570, Level 16, State 3, Line 2 Page (1:145), slot 0 in object ID
837578022, index ID 0, partition ID 2882303763115671552, alloc unit ID
2882303763120062464 (type "In-row data"). Column "d" value is out of
range for data type "datetime". Update column to a legal value.
As suggested above, these errors usually occur when precision and scale are not preserved. If you're comfortable with SSIS, you can use it to find the rows that are corrupt. Taking the values that Martin Smith created:
CREATE TABLE T(ID int ,C DECIMAL(38,0));
INSERT INTO T VALUES(1,9999999999999999999999999999999999999)
The above table reproduces the error. Here the first column represents the primary key. I inserted around 1000 rows, of which a few were corrupted values. The SSIS package design is as follows.
In the Data Conversion component, I took the column C, which had the errors, and tried to cast it to Decimal(38,0). Since a conversion or truncation error will occur, I redirected the error rows to an OLE DB Command, which updates the table and sets the column to NULL:
Update T
Set C=NULL
where ID=?
The values of C and ID are directed to the OLE DB Command. If there is no error, I just insert the row into a table (there is actually no need to do this). This will work if you have a primary key column in your table.
If the error is in a datetime column, a SQL query can be written to verify the format of the datetime values. Please go through the MSDN link for valid datetime values.
Select * from YourTable where ISDATE(Col)!=1
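I think you can fetch the data with a cursor. Try a cursor query such as the one below:
-- Declare one variable per column with matching types first: @Column1, @Column2, ...
DECLARE VerifyCursor CURSOR FOR
SELECT *
FROM MyTable
OPEN VerifyCursor
WHILE 1=1 BEGIN
    BEGIN TRY
        FETCH NEXT FROM VerifyCursor INTO @Column1, @Column2, ...
        -- Copy the row only when the fetch actually returned one
        IF (@@FETCH_STATUS = 0)
            INSERT INTO #MyTable2(Column1, Column2,...)
            VALUES (@Column1, @Column2, ...)
    END TRY
    BEGIN CATCH
        -- Rows that cannot be read or inserted end up here and are skipped
    END CATCH
    IF (@@FETCH_STATUS<>0) BREAK
END
CLOSE VerifyCursor
DEALLOCATE VerifyCursor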
Replacing the bad data is simple with an update:
UPDATE table SET column = NULL WHERE key_column = 'Some value'

Data migration from MySQL to HSQL

I am working on migrating data from MySQL to HSQL.
In the MySQL data file, there are plenty of records where date values are set to '0000-00-00', and the HSQL database throws the error below:
"data exception: invalid datetime format / Error Code: -3407 / State:
22007"
for all such records.
I would like to know what the optimum solution for this problem could be.
Thanks in advance
HSQLDB follows the SQL Standard and allows valid dates only. A date such as '0001-01-01' would be a good candidate for the default value.
Regardless of the method used for data inserts, the '0000-00-00' strings should be corrected before insert. One way of doing this is to use a default value for the target column with DEFAULT DATE'0001-01-01' and replace the string in the INSERT statement with the keyword DEFAULT. For example:
CREATE TABLE MYTABLE ( C1 INT, C2 DATE DEFAULT DATE'0001-01-01')
INSERT INTO MYTABLE VALUES 1, DEFAULT
INSERT INTO MYTABLE VALUES 3, '2010-08-14'

Resources