We recently had an identifier column move from int values to bigint values. An ETL process which loads these values was not updated. That process is using SQL bulk insert, and we are seeing incorrect values in the destination table. I would have expected a hard failure.
Please note this process has been running successfully for a long time before this issue.
Can anyone tell me what the heck SQL Server is doing here!? I know how to fix the situation, but I'm trying to better understand it for the data cleanup effort that I'll need to complete, as well as just for the fact that it looks like black magic!
I've been able to re-create this issue in SQL Server 2017 and 2019.
Simplified Example CSV file contents:
310067463717
310067463718
310067463719
Example SQL:
create table #t (t int)
bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char',
fieldterminator = '|'
)
select * from #t
Resulting data:
829818405
829818406
829818407
Interestingly, I tried with smaller values, and I do see an error:
Example CSV file contents (2147483647 is the largest int value for SQL Server):
310067463717
310067463718
310067463719
2147483647
2147483648
Running the same SQL code as above, I get an error for one row:
Msg 4867, Level 16, State 1, Line 4
Bulk load data conversion error (overflow) for row 5, column 1 (t).
The resulting data looks like this:
829818405
829818406
829818407
2147483647
I also tried just now with a much higher value, 31006746371945654, and that threw the same overflow error as 2147483648.
And last, I did confirm that if I create the table with the column defined as bigint, the data inserted is correct.
create table #t (t bigint)
bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char',
fieldterminator = '|'
)
select * from #t
Resulting data:
2147483647
2147483648
310067463717
310067463718
310067463719
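For reference, the numbers above are consistent with only the low 32 bits of each value being kept: 310067463717 modulo 2^32 is 829818405, and so on. The two values that raised the overflow error (2147483648 and 31006746371945654) both have low 32 bits of 2^31 or more, which does not fit in a positive int. This is only an observation about the arithmetic, not documented BULK INSERT behavior. A minimal sketch to reproduce the check (the variable names and column aliases are made up for illustration):

```sql
-- Arithmetic check only; assumes nothing about BULK INSERT internals.
DECLARE @two_32 bigint = 4294967296;  -- 2^32
DECLARE @two_31 bigint = 2147483648;  -- 2^31, one past the largest int

SELECT x.original_value,
       x.original_value % @two_32 AS low_32_bits,
       CASE WHEN x.original_value % @two_32 >= @two_31
            THEN 'low 32 bits do not fit in a positive int'
            ELSE 'low 32 bits fit in int'
       END AS low_32_bits_status
FROM (VALUES (310067463717),
             (310067463718),
             (310067463719),
             (2147483647),
             (2147483648),
             (31006746371945654)) AS v(n)
CROSS APPLY (SELECT CAST(v.n AS bigint) AS original_value) AS x;
```

For the first three rows, low_32_bits matches the incorrect values that ended up in the int column.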
I recently stumbled upon a blog post about a stored procedure called Recover_Deleted_Data_Proc.sql that can apparently recover your deleted data from the transaction log file.
There is nothing new under the sun: we are going to use fn_dblog.
STEPS TO REPRODUCE
We are first going to create the table:
--Create Table
CREATE TABLE [Test_Table]
(
[Col_image] image,
[Col_text] text,
[Col_uniqueidentifier] uniqueidentifier,
[Col_tinyint] tinyint,
[Col_smallint] smallint,
[Col_int] int,
[Col_smalldatetime] smalldatetime,
[Col_real] real,
[Col_money] money,
[Col_datetime] datetime,
[Col_float] float,
[Col_Int_sql_variant] sql_variant,
[Col_numeric_sql_variant] sql_variant,
[Col_varchar_sql_variant] sql_variant,
[Col_uniqueidentifier_sql_variant] sql_variant,
[Col_Date_sql_variant] sql_variant,
[Col_varbinary_sql_variant] sql_variant,
[Col_ntext] ntext,
[Col_bit] bit,
[Col_decimal] decimal(18,4),
[Col_numeric] numeric(18,4),
[Col_smallmoney] smallmoney,
[Col_bigint] bigint,
[Col_varbinary] varbinary(Max),
[Col_varchar] varchar(Max),
[Col_binary] binary(8),
[Col_char] char,
[Col_timestamp] timestamp,
[Col_nvarchar] nvarchar(Max),
[Col_nchar] nchar,
[Col_xml] xml,
[Col_sysname] sysname
)
And we then insert data into it:
--Insert data into it
INSERT INTO [Test_Table]
([Col_image]
,[Col_text]
,[Col_uniqueidentifier]
,[Col_tinyint]
,[Col_smallint]
,[Col_int]
,[Col_smalldatetime]
,[Col_real]
,[Col_money]
,[Col_datetime]
,[Col_float]
,[Col_Int_sql_variant]
,[Col_numeric_sql_variant]
,[Col_varchar_sql_variant]
,[Col_uniqueidentifier_sql_variant]
,[Col_Date_sql_variant]
,[Col_varbinary_sql_variant]
,[Col_ntext]
,[Col_bit]
,[Col_decimal]
,[Col_numeric]
,[Col_smallmoney]
,[Col_bigint]
,[Col_varbinary]
,[Col_varchar]
,[Col_binary]
,[Col_char]
,[Col_nvarchar]
,[Col_nchar]
,[Col_xml]
,[Col_sysname])
VALUES
(CONVERT(IMAGE,REPLICATE('A',4000))
,REPLICATE('B',8000)
,NEWID()
,10
,20
,3000
,GETDATE()
,4000
,5000
,getdate()+15
,66666.6666
,777777
,88888.8888
,REPLICATE('C',8000)
,newid()
,getdate()+30
,CONVERT(VARBINARY(8000),REPLICATE('D',8000))
,REPLICATE('E',4000)
,1
,99999.9999
,10101.1111
,1100
,123456
,CONVERT(VARBINARY(MAX),REPLICATE('F',8000))
,REPLICATE('G',8000)
,0x4646464
,'H'
,REPLICATE('I',4000)
,'J'
,CONVERT(XML,REPLICATE('K',4000))
,REPLICATE('L',100)
)
GO
We are now going to verify if the data are there:
--Verify the data
SELECT * FROM Test_Table
At this point we need to create the stored procedure. I couldn't paste it here because it's too long, but you can download it from the same blog post, which has a link to a Box file.
If the script gives you errors like this:
Msg 50000, Level 16, State 1, Procedure Recover_Deleted_Data_Proc, Line 22 [Batch Start Line 700] The compatibility level should be equal to or greater SQL SERVER 2005 (90)
Msg 50000, Level 16, State 1, Procedure Recover_Deleted_Data_Proc, Line 22 [Batch Start Line 705] The compatibility level should be equal to or greater SQL SERVER 2005 (90)
it is because you have to comment out lines 701 to 708.
Cool, let's now delete the data from that table:
--Delete the data
DELETE FROM Test_Table
And confirm that the data were deleted:
--Verify the data
SELECT * FROM Test_Table
And here is the last step: we need to try to recover the data using the freshly installed stored procedure.
The author instructs us to use one of these two commands (don't forget to replace 'test' with the name of your database):
--Recover the deleted data without date range
EXEC Recover_Deleted_Data_Proc 'test', 'dbo.Test_Table'
or
--Recover the deleted data with date range
EXEC Recover_Deleted_Data_Proc 'test', 'dbo.Test_Table', '2012-06-01', '2012-06-30'
But the problem is that both return this error:
(8 rows affected)
(2 rows affected)
(64 rows affected)
(2 rows affected)
(1 row affected)
(1 row affected)
(1 row affected)
(1 row affected)
(1 row affected)
(1 row affected)
Msg 245, Level 16, State 1, Procedure Recover_Deleted_Data_Proc, Line 485 [Batch Start Line 112]
Conversion failed when converting the varchar value '0x41-->01 ; 0001' to data type int.
If I right click on the stored procedure and I click "Modify", I don't see anything particularly fishy at Line 485.
Any idea why this stored procedure is not working?
What is the conversion mentioned?
The code is 10 years old and was written with the assumption that a [PAGE ID] would only ever be expressed as a pair of integers, e.g. 0001:00000138 - however, as you have learned, sometimes that is expressed differently, like 0x41-->01 ; 0001:00000138.
You can fix that problem by adding this inside the cursor:
IF @ConsolidatedPageID LIKE '0x%-->%;%'
BEGIN
  SET @ConsolidatedPageID = LTRIM(SUBSTRING(@ConsolidatedPageID,
    CHARINDEX(';', @ConsolidatedPageID) + 1, 8000));
END
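To see what that check does in isolation, here is a stand-alone sketch that runs the same expression against the sample value from above. The varchar(8000) declaration is an assumption made for this demo; the procedure may declare the variable differently.

```sql
-- Stand-alone demo; the variable type is assumed, not taken from the procedure.
DECLARE @ConsolidatedPageID varchar(8000) = '0x41-->01 ; 0001:00000138';

-- Strip everything up to and including the semicolon, then trim the
-- leading space, leaving only the file:page pair.
IF @ConsolidatedPageID LIKE '0x%-->%;%'
BEGIN
  SET @ConsolidatedPageID = LTRIM(SUBSTRING(@ConsolidatedPageID,
    CHARINDEX(';', @ConsolidatedPageID) + 1, 8000));
END

SELECT @ConsolidatedPageID AS PageID;  -- 0001:00000138
```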
But then your next problem is that when you saved the procedure from the Box file, it probably changed '†' to some wacky ? character. When I fixed that (using N'†' of course, since Unicode string literals should always have the N prefix), I still got these error messages:
Msg 537, Level 16, State 3, Procedure Recover_Deleted_Data_Proc, Line 525
Invalid length parameter passed to the LEFT or SUBSTRING function.
Msg 9420, Level 16, State 1, Procedure Recover_Deleted_Data_Proc, Line 651
XML parsing: line 1, character 2, illegal xml character
After 15 minutes of trying to reverse engineer this spaghetti, I gave up. If you need to recover data you deleted, restore a backup. If you don't have a backup, well, that's why we take backups. The fragile scripts people try to create to compensate for not taking backups are exactly why log recovery vendors charge the big bucks.
As an aside, the compatibility level error message is a red herring, totally misleading as the logic is currently written, and completely irrelevant to the problem. But it can be solved if, right before this:
IF ISNULL(@Compatibility_Level,0)<=80
BEGIN
RAISERROR('The compatibility level should ... blah blah',16,1)
RETURN
END
You add this:
IF DB_ID(@Database_Name) IS NULL
BEGIN
  RAISERROR(N'Database %s does not exist.',11,1,@Database_name);
  RETURN;
END
Or simply don't run those two example calls at the end of the script, since they depend on you having a database called test, which clearly you do not.
String or binary data would be truncated. The statement has been terminated.
System.Data.SqlClient.SqlException (0x80131904): String or binary data would be truncated
This exception is thrown when the C# code (the model) tries to save a record in which the string passed for a column is longer than the size defined for that column in the SQL Server table.
To fix this error, you only need to alter the column in the SQL Server table using a SQL script.
Increasing the size of the column is enough; there is no need to redeploy the application to the PROD/TEST environment.
Please refer to the sample below.
CREATE TABLE MyTable(Num INT, Column1 VARCHAR(3))
INSERT INTO MyTable VALUES (1, 'test')
Look at Column1: its size is 3, but the given value has length 4, so you get the error.
To fix the error:
You should pass a string value whose length is less than or equal to the column size, i.e. 3 characters, like below.
INSERT INTO MyTable VALUES (1, 'tes')
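Alternatively, if the longer values are legitimate, the column itself can be widened, as mentioned at the start. A minimal sketch, assuming a new length of 10 (pick whatever matches your data) and that the column should stay nullable:

```sql
-- Widen Column1 so that 'test' (4 characters) fits.
-- VARCHAR(10) is an assumed size chosen for this example.
-- NULL is stated explicitly so the column's nullability does not change by accident.
ALTER TABLE MyTable ALTER COLUMN Column1 VARCHAR(10) NULL;

-- The original statement no longer hits the truncation error.
INSERT INTO MyTable VALUES (1, 'test');
```

Note that ALTER COLUMN can be rejected when other objects depend on the column (an index or constraint, for instance), so check for those first on a real table.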
If you want to suppress this error,
you can set the ansi_warnings option below to OFF:
SET ansi_warnings OFF
If ansi_warnings is OFF, the error is suppressed: whatever fits in the column is inserted and the rest is truncated.
INSERT INTO MyTable VALUES (1, 'test')
The string 'tes' would be stored in your table and it won't return any error.
I'm working on SQL Server 2008.
I delete all the data from a table and then try to insert values into it. Here's the code:
TRUNCATE TABLE [dbo].[STRAT_tmp_StratMain]
INSERT INTO [dbo].[STRAT_tmp_StratMain] ([FileNum])
SELECT [dbo].[STRAT_tmp_Customer].[NumericFileNumber]
FROM [dbo].[STRAT_tmp_Customer];
FileNum in STRAT_tmp_StratMain is a float, is indexed, and can't be null.
NumericFileNumber is a float and is nullable, but it is never actually null and contains no duplicates (each row has a unique number).
The table STRAT_tmp_StratMain contains many more fields, but all of them are nullable and have default values.
When I try to run this query I get the error:
Msg 8152, Level 16, State 4, Line 1
String or binary data would be truncated. The statement has been terminated.
I tried also to do simply:
INSERT INTO [dbo].[STRAT_tmp_StratMain] ([FileNum]) Values (1);
Still get the same error.
Any ideas?
Thanks,
Ilan
I am not able to reproduce your issue. When I run this code on SQL Server 2008, I get no error:
DECLARE @tt TABLE (FileNum float NOT NULL);
INSERT INTO @tt (FileNum) VALUES (1);
Check the default constraints on all the columns in your target table and make sure none of them would try to insert a string value that would be truncated by the datatype limitations of the column.
example: SomeColumn varchar(1) DEFAULT 'Hello'
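One way to review those defaults is to query the catalog views. This is a rough sketch rather than a complete diagnostic; the column aliases are made up, and the table name is the one from the question.

```sql
-- Lists every default constraint on the target table together with the
-- column it belongs to, so each default can be compared with the column size.
SELECT c.name        AS column_name,
       t.name        AS data_type,
       c.max_length  AS max_length_bytes,   -- bytes, not characters; -1 means (max)
       dc.name       AS default_name,
       dc.definition AS default_definition
FROM sys.default_constraints AS dc
JOIN sys.columns AS c
  ON c.object_id = dc.parent_object_id
 AND c.column_id = dc.parent_column_id
JOIN sys.types AS t
  ON t.user_type_id = c.user_type_id
WHERE dc.parent_object_id = OBJECT_ID(N'dbo.STRAT_tmp_StratMain');
```

Any row where default_definition is a string literal longer than the column allows is a candidate for the truncation error.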
This is because the data you are trying to insert does not fit in the field: for example, the field has a defined length of (say) 10 or 50 characters, but the data you are trying to insert is longer than that.
My table :
log_id bigint
old_value xml
new_value xml
module varchar(50)
reference_id bigint
[transaction] varchar(100)
transaction_status varchar(10)
stack_trace ntext
modified_on datetime
modified_by bigint
Insert Query :
INSERT INTO [dbo].[audit_log]
([old_value],[new_value],[module],[reference_id],[transaction]
,[transaction_status],[stack_trace],[modified_on],[modified_by])
VALUES
('asdf','asdf','Subscriber',4,'_transaction',
'_transaction_status','_stack_trace',getdate(),555)
Error :
Msg 8152, Level 16, State 14, Line 1
String or binary data would be truncated.
The statement has been terminated.
Why is that ???
You're trying to write more data than a specific column can store. Check the sizes of the data you're trying to insert against the sizes of each of the fields.
In this case transaction_status is a varchar(10) and you're trying to store 19 characters to it.
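If it is not obvious which column is the culprit, a quick comparison like the following can help. It is only a sketch: it assumes the table lives in the dbo schema as in the INSERT above, and the column aliases in the second query are made up.

```sql
-- Declared lengths of the character-typed columns in the target table.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'audit_log'
  AND CHARACTER_MAXIMUM_LENGTH IS NOT NULL;

-- Lengths of the string literals used in the INSERT, for comparison.
SELECT LEN('Subscriber')          AS module_len,
       LEN('_transaction')        AS transaction_len,
       LEN('_transaction_status') AS transaction_status_len;
```

Comparing the two result sets shows transaction_status_len at 19 against a declared length of 10.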
This type of error generally occurs when you try to store more characters than the column allows in the database table, as in this case:
You specify
transaction_status varchar(10)
but you are actually trying to store
_transaction_status
which contains 19 characters.
That's why you get this error in this code.
This error is usually encountered when inserting a record in a table where one of the columns is a VARCHAR or CHAR data type and the length of the value being inserted is longer than the length of the column.
I am not satisfied with how Microsoft decided to report this with such a "dry" message, without any pointer to where to look for the answer.