Which column is being truncated? [duplicate] - sql-server

The year is 2010.
SQL Server licenses are not cheap.
And yet, this error still does not indicate the row or the column or the value that produced the problem. Hell, it can't even tell you whether it was "string" or "binary" data.
Am I missing something?

A quick-and-dirty way of fixing these is to select the rows into a new physical table like so:
SELECT * INTO dbo.MyNewTable FROM <the rest of the offending query goes here>
...and then compare the schema of this table to the schema of the table into which the INSERT was previously going - and look for the larger column(s).

I realize that this is an old one. Here's a small piece of code that I use that helps.
It returns a table of the maximum value length found in each column of the table you're selecting from. You can then compare those maximums against the defined widths of the destination columns and figure out which ones are causing the issue. Then it's just a simple query to clean up the data or exclude it.
DECLARE @col NVARCHAR(128);
DECLARE @sql NVARCHAR(MAX);
CREATE TABLE ##temp (colname NVARCHAR(128), maxVal INT);
DECLARE oloop CURSOR FOR
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'SOURCETABLENAME' AND TABLE_SCHEMA = 'dbo';
OPEN oloop;
FETCH NEXT FROM oloop INTO @col;
WHILE (@@FETCH_STATUS = 0)
BEGIN
-- build one statement per column; QUOTENAME protects unusual column names
SET @sql = '
DECLARE @val INT;
SELECT @val = MAX(LEN(' + QUOTENAME(@col) + ')) FROM dbo.SOURCETABLENAME;
INSERT INTO ##temp
( colname, maxVal )
VALUES ( N''' + @col + ''', -- colname - nvarchar(128)
@val -- maxVal - int
)';
EXEC(@sql);
FETCH NEXT FROM oloop INTO @col;
END
CLOSE oloop;
DEALLOCATE oloop;
SELECT * FROM ##temp;
DROP TABLE ##temp;

Another way here is to use binary search.
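Comment out half of the columns in your code and try again. If the error persists, the culprit is among the columns still in the query, so comment out half of those and try again. If the error goes away, the culprit is in the half you just commented out, so restore it and split that half instead. You will narrow down your search to just one or two columns in the end.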

You could check the length of each inserted value with an if condition, and if the value needs more width than the current column width, truncate the value and throw a custom error.
That should work if you just need to identify which is the field causing the problem. I don't know if there's any better way to do this though.

Recommend you vote for the enhancement request on Microsoft's site. It's been active for 6 years now so who knows if Microsoft will ever do anything about it, but at least you can be a squeaky wheel: Microsoft Connect

For string truncation, I came up with the following solution to find the max lengths of all of the columns:
1) Select all of the data to a temporary table (supply column names where needed), e.g.
SELECT col1
,col2
,col3_4 = col3 + '-' + col4
INTO #temp
FROM <the rest of your source query goes here>;
2) Run the following SQL Statement in the same connection (adjust the temporary table name if needed):
DECLARE @table VARCHAR(MAX) = '#temp'; -- change this to your temp table name
DECLARE @select VARCHAR(MAX) = '';
DECLARE @prefix VARCHAR(256) = 'MAX(LEN(';
DECLARE @suffix VARCHAR(256) = ')) AS max_';
DECLARE @nl CHAR(2) = CHAR(13) + CHAR(10);
SELECT @select = @select + @prefix + name + @suffix + name + @nl + ','
FROM tempdb.sys.columns
WHERE object_id = object_id('tempdb..' + @table);
SELECT @select = 'SELECT ' + @select + '0' + @nl + 'FROM ' + @table;
EXEC(@select);
It will return a result set with the column names prefixed with 'max_' and show the max length of each column.
Once you identify the faulty column you can run other select statements to find extra long rows and adjust your code/data as needed.

I can't think of a good way really.
I once spent a lot of time debugging a very informative "Division by zero" message.
Usually you comment out various pieces of output code to find the one causing problems.
Then you take the piece you found and make it return a value that indicates there's a problem instead of the actual value (in your case, replace the string output with LEN(the output)). Then manually compare that to the length of the column you're inserting it into.

From the line number in the error message, you should be able to identify the INSERT query that is causing the error. Turn it into a SELECT query and add AND LEN(your_expression_or_column_here) > CONSTANT_COL_INT_LEN for each of the string columns in your query. The output will give you the bad rows.
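A minimal sketch of that LEN filter, assuming a hypothetical staging table dbo.StagingCustomers that feeds a destination column CustomerName defined as varchar(50). The table name, column name and width are placeholders, not taken from the original post:

```sql
-- Hypothetical names: dbo.StagingCustomers is the source of the failing INSERT,
-- and the destination column is assumed to be varchar(50).
SELECT s.*
FROM dbo.StagingCustomers AS s
WHERE LEN(s.CustomerName) > 50;  -- rows whose value would not fit
```

Note that LEN ignores trailing spaces; DATALENGTH returns the size in bytes instead, which may be the better test when the source and destination types differ.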

Technically, there isn't a row to point to because SQL Server didn't write the data to the table. I typically just capture the trace, run it in Query Analyzer (unless the problem is already obvious from the trace, which it may be in this case), and quickly debug from there with the age-old "modify my UPDATE to a SELECT" method. Doesn't it really just break down to one of two things:
a) Your column definition is wrong, and the width needs to be changed
b) Your column definition is right, and the app needs to be more defensive
?

The best thing that worked for me was to put the rows first into a temporary table using select .... into #temptable
Then I took the max length of each column in that temp table. eg. select max(len(jobid)) as Jobid, ....
and then compared that to the source table field definition.
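To get the defined widths for that comparison, one option is to read them from INFORMATION_SCHEMA.COLUMNS. A sketch, assuming the destination table is called dbo.TargetTable (a placeholder name):

```sql
-- Hypothetical destination table name; lists the declared width of each
-- character or binary column so it can be compared with the MAX(LEN(...)) results.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'TargetTable'
  AND CHARACTER_MAXIMUM_LENGTH IS NOT NULL
ORDER BY ORDINAL_POSITION;
```

CHARACTER_MAXIMUM_LENGTH is -1 for the (max) types, so those columns can be ruled out straight away.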

Related

Is there a simple(r) way to REPLACE a character across all columns in one table in SQL Server?

There are ~10 different subquestions that could be answered here, but the main question is in the title. TLDR version: I have a table like the example below and I want to replace all double quote marks across the whole table. Is there a simple way to do this?
My solution using cursor seems fairly straightforward. I know there's some CURSOR hatred in the SQL Server community (bad runtime?). At what point (num rows and/or num columns) would CURSOR stink at this?
Create Reproducible Example Table
DROP TABLE IF EXISTS #example;
CREATE TABLE #example (
NumCol INT
,CharCol NVARCHAR(20)
,DateCol NVARCHAR(100)
);
INSERT INTO #example VALUES
(1, '"commas, terrible"', '"2021-01-01 20:15:57,2021:04-08 19:40:50"'),
(2, '"loadsrc,.txt"', '2020-01-01 00:00:05'),
(3, '".txt,from.csv"','1/8/2021 10:14')
Right now, my identified solutions are:
1) Manually update each column: UPDATE X SET CharCol = REPLACE(CharCol, '"',''). Horribly annoying to do for more than 2 columns, IMO.
2) Use a CURSOR to update (similar to the annoyingly complicated-looking solution at "SQL Server - SQL Replace on all columns in all tables across an entire DB").
REPLACE character using CURSOR
This gets a little convoluted with all the cursor-related script, but seems to work well otherwise.
-- declare variable to store colnames, cursor to filter through list, string for dynamic sql code
DECLARE @colname VARCHAR(10)
,@sql VARCHAR(MAX)
,@namecursor CURSOR;
-- run cursor and set colnames and update table
-- (#colnames is a table holding the column names to clean; its creation is not shown here)
SET @namecursor = CURSOR FOR SELECT ColName FROM #colnames
OPEN @namecursor;
FETCH NEXT FROM @namecursor INTO @colname;
WHILE (@@FETCH_STATUS <> -1) -- alt: WHILE @@FETCH_STATUS = 0
BEGIN;
SET @sql = 'UPDATE #example SET '+@colname+' = REPLACE('+@colname+', ''"'','''')'
EXEC(@sql); -- parentheses VERY important: EXEC(sql-as-string) NOT EXEC storedprocedure
FETCH NEXT FROM @namecursor INTO @colname;
END;
CLOSE @namecursor;
DEALLOCATE @namecursor;
GO
-- see results
SELECT * FROM #example
Subquestion: While I've seen it in our database elsewhere, for this particular example I'm opening a .csv file in Excel and exporting it as tab delimited. Is there a way to change the settings to export without the double quotes? If I remember correctly, BULK INSERT doesn't have a way to handle that or a way to handle importing a csv file with extra commas.
And yes, I'm going to pretend that I'm fine that there's a list of datetimes in the date column (necessitating varchar data type).
Why not just dynamically build the SQL?
Presumably it's a one-time task you'd be doing so just run the below for your table, paste into SSMS and run. But if not you could build an automated process to execute it - better of course to properly sanitize when inserting the data though!
select
'update <table> set ' +
String_Agg(QuoteName(COLUMN_NAME) + '=Replace(' + QuoteName(column_name) + ',''"'','''')',',')
from INFORMATION_SCHEMA.COLUMNS
where table_name='<table>' and TABLE_SCHEMA='<schema>' and data_type in ('varchar','nvarchar')
example DB<>Fiddle
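If the generated statement should be executed automatically rather than pasted into SSMS, the same query can feed sys.sp_executesql. A sketch, assuming a permanent table named dbo.Example (a placeholder) and SQL Server 2017 or later for STRING_AGG:

```sql
-- Hypothetical table dbo.Example; builds one UPDATE covering every
-- varchar/nvarchar column, prints it for inspection, then runs it.
DECLARE @sql nvarchar(max);

SELECT @sql = N'UPDATE dbo.Example SET '
    + STRING_AGG(
          CAST(QUOTENAME(COLUMN_NAME) + N' = REPLACE(' + QUOTENAME(COLUMN_NAME)
               + N', ''"'', '''')' AS nvarchar(max)),
          N', ')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = N'dbo'
  AND TABLE_NAME = N'Example'
  AND DATA_TYPE IN (N'varchar', N'nvarchar');

PRINT @sql;                      -- check the statement first
EXEC sys.sp_executesql @sql;
```

The cast to nvarchar(max) is there because STRING_AGG otherwise returns a length-limited type, which could fail on a table with many columns.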
You might try this approach, not fast, but easy to type (or generate).
SELECT NumCol = y.value('(NumCol/text())[1]','int')
,CharCol = y.value('(CharCol/text())[1]','nvarchar(100)')
,DateCol = y.value('(DateCol/text())[1]','nvarchar(100)')
FROM #example e
CROSS APPLY(SELECT e.* FOR XML PATH('')) A(x)
CROSS APPLY(SELECT CAST(REPLACE(A.x,'"','') AS XML)) B(y);
The idea in short:
The first APPLY will transform all columns to a root-less XML.
Without using ,TYPE this will be of type nvarchar(max) implicitly
The second APPLY will first replace any " in the whole text (which is one row actually) and cast this to XML.
The SELECT uses .value to fetch the values type-safe from the XML.
Update: Just add INTO dbo.SomeNotExistingTableName right before FROM to create a new table with this data. This looks better than updating the existing table (might be a #-table too). I'd see this as a staging environment...
Good luck, messy data is always a pain in the neck :-)

Adding a new column to a table getting the column name from another table

I'm using Microsoft SQL server management studio.
I would like to add a new column to a table (altertable1), and name that column using the data from a cell (Date) of another table (stattable1).
DECLARE @Data nvarchar(20)
SELECT @Data = Date
FROM stattable1
WHERE Adat=1
DECLARE @sql nvarchar(1000)
SET @sql = 'ALTER TABLE altertable1 ADD ' + @Data + ' nvarchar(20)'
EXEC (@sql)
Executing this, I get the following error and can't find out why:
"Incorrect syntax near '2021'."
The stattable1 looks like this:
Date |Adat
2021-09-08 |1
2021-09-08 is a value generated daily with:
CONVERT(date,GETDATE())
As Larnu said in the comments, this may not be your main problem, but if you do want to do this, wrap the name in [ ], which is needed for a column name that starts with a number.
Like this:
SET @sql = 'ALTER TABLE altertable1 ADD [' + @Data + '] nvarchar(20)'
And of course, naming columns by date or year is not best practice.
The problem with your overall design is that you seem to be adding a column to the table every day. A table is not a spreadsheet and you should be storing data for each day in a row, not in a separate column. If your reports need to look that way, there are many ways to pivot the data so that you can handle that at presentation time without creating impossible-to-maintain technical debt in your database.
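As a rough illustration of the row-per-day design, here is a sketch with made-up names (dbo.DailyStat, ItemId, StatDate and StatValue are all hypothetical, since the original post does not show what the daily columns hold):

```sql
-- Hypothetical table: one row per item per day instead of one column per day.
CREATE TABLE dbo.DailyStat
(
    ItemId    int          NOT NULL,
    StatDate  date         NOT NULL,
    StatValue nvarchar(20) NULL,
    CONSTRAINT PK_DailyStat PRIMARY KEY (ItemId, StatDate)
);

-- Today's value becomes a new row; no ALTER TABLE and no dynamic SQL needed.
INSERT INTO dbo.DailyStat (ItemId, StatDate, StatValue)
VALUES (1, CONVERT(date, GETDATE()), N'some value');
```

A report that needs dates across the top can then pivot these rows at presentation time.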
The problem with your current code is that 2021-09-08 is not a valid column name, both because it starts with a number, and because it contains dashes. Even if you use a more language-friendly form like YYYYMMDD (see this article to see what I mean), it still starts with a number.
The best solution to the local problem is to not name columns that way. If you must, the proper way to escape it is to use QUOTENAME() (and not just manually slap [ and ] on either side):
DECLARE @Data nvarchar(20), @sql nvarchar(max);
SELECT @Data = Date
FROM dbo.stattable1
WHERE Adat = 1;
SET @sql = N'ALTER TABLE dbo.altertable1
ADD ' + QUOTENAME(@Data) + N' nvarchar(20);';
PRINT @sql;
--EXEC sys.sp_executesql @sql;
This also shows how you can debug the statement by printing it before you execute it, instead of trying to decipher an error message that came from a string you can't inspect.
Some other points to consider:
if you're declaring a string as nvarchar, and especially when dealing with SQL Server metadata, always use the N prefix on any literals you define.
always reference user tables with two-part names.
always end statements with statement terminators.
generally prefer sys.sp_executesql over EXEC().
some advice on dynamic SQL:
Protecting Yourself from SQL Injection - Part 1
Protecting Yourself from SQL Injection - Part 2

Need help/explanation on how to properly add parameter to dynamic SQL Query

I'm searching through databases, and I reached out to StackOverflow for some help in learning how to run dynamic SQL across them.
The answer I got was extremely useful, but it didn't come with an explanation of what was happening exactly or why. Now I'm trying to add another parameter to the code and I'm having trouble adding it.
I would like someone to help me correct the parameter so it's input correctly, and explain what is going on in the dynamic SQL statement.
Let's get to the problem with the query. Everything works great, as it was corrected by another StackOverflow poster. But now I need to add a condition that the 'ControlID' column must start with the letter Q.
With the way I've put it in now I'm getting an error that states the column 'ControlID' does not exist. But when I check to see if ControlID is the correct column name
select * FROM [EDDS1111111].[EDDSDBO].[Document] where ControlID like 'Q%'
I do get results. So it's not an invalid column name.
I designed this input of 'ControlID' to be similar to the way artifact ID was added earlier in the code so I'm confused as to why I'm getting this error.
-- this is used to add line breaks to make code easier to read
DECLARE @NewLine AS NVARCHAR(MAX) = CHAR(10)
-- to hold your dynamic SQL for all rows/inserts at once
DECLARE @sql NVARCHAR(MAX) = N'';
-- create temp table to insert your dynamic SQL results into
IF OBJECT_ID('tempdb..#DatabaseSizes', 'U') IS NOT NULL
DROP TABLE #DatabaseSizes;
create table #DatabaseSizes(
controlid nvarchar(128),
fileSize DECIMAL (10,6),
extractedTextSize DECIMAL(10,6)
)
SELECT @sql = @sql + N'' +
'select SUM(fileSize)/1024/1024/1024 as fileSize,
SUM(extractedTextSize)/1024/1024 as extractedTextSize ' + @NewLine +
'FROM [EDDS' + CAST(ArtifactID as nvarchar(128)) + '].[EDDSDBO].[Document] ed' + @NewLine +
'where ed.CreatedDate >= (select CONVERT(varchar,dateadd(d,-(day(getdate())),getdate()),106)) and ed.controlid = '+Cast(Controlid as nvarchar(128))+'%' + @NewLine + @NewLine
FROM edds.eddsdbo.[Case]
WHERE name like '%Review%' and (StatusCodeArtifactID = '1780779' or StatusCodeArtifactID = '1034288')
--controlid always needs to begin with a Q
-- for testing/validating
PRINT @sql
INSERT INTO #DatabaseSizes (
controlid,
fileSize,
extractedTextSize
)
-- executes all the dynamic SQL we just generated
EXEC sys.sp_executesql @SQL;
As I've put the code so far I'd expect for the controlID to equal the current control ID where it would begin with a Q.
I tried to copy the part where it was done correctly so I'm a little confused. Any help to increase my understanding is greatly appreciated.
Thank you for your time
There is not a column for ControlID within edds.eddsdbo.[case]. But it does exist within all databases located within FROM [EDDS' + CAST(ArtifactID as nvarchar(128)) + '].[EDDSDBO].[Document]. CreatedDate also does not exist within the .[Case] tables but does exist within the .Document databases. That is why I put the search for ControlID next to the where statement for the .Document section of the query
This is what was returned when I try to run the code.
Invalid column name 'Controlid'.
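One likely reading of the error: Cast(Controlid as nvarchar(128)) sits outside the quoted string, so it is evaluated against edds.eddsdbo.[Case] while the string is being built, and that table has no such column. ArtifactID works there because it really is a column of [Case]. A hedged sketch of where the filter would go instead, written as literal text inside the generated statement (the CreatedDate filter is left out for brevity, and the database and column names are simply those from the question):

```sql
DECLARE @NewLine nvarchar(2) = NCHAR(13) + NCHAR(10);
DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql
    + N'SELECT SUM(ed.fileSize)/1024/1024/1024 AS fileSize,' + @NewLine
    + N'       SUM(ed.extractedTextSize)/1024/1024 AS extractedTextSize' + @NewLine
    + N'FROM ' + QUOTENAME(N'EDDS' + CAST(ArtifactID AS nvarchar(128)))
    + N'.[EDDSDBO].[Document] AS ed' + @NewLine
    -- ControlID belongs to [Document], so it is named inside the string;
    -- the doubled quotes become LIKE 'Q%' in the generated statement.
    + N'WHERE ed.ControlID LIKE ''Q%'';' + @NewLine + @NewLine
FROM edds.eddsdbo.[Case]
WHERE name LIKE '%Review%'
  AND StatusCodeArtifactID IN ('1780779', '1034288');

PRINT @sql;  -- inspect the generated statements before executing them
```

If the output is then captured with INSERT ... EXEC, the column list of the INSERT has to match the two columns each generated SELECT returns.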

Removing cursor t-sql

I have a trigger in mssql in which I want to compare each column from the inserted table with the deleted table to check if the value has changed...
If the value has changed I want to insert the column name into a temp table.
My code until now:
declare columnCursor CURSOR FOR
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MyTable' AND TABLE_SCHEMA='dbo'
--save inserted and deleted into temp tables
select * into #row1 from Inserted
select * into #row2 from Deleted
declare @tmp table(column_name nvarchar(max))
declare @column nvarchar(50)
OPEN columnCursor
FETCH NEXT FROM columnCursor INTO @column
while @@FETCH_STATUS=0 begin
declare @out bit
declare @sql nvarchar(max) = N'
select @out = case when r1.'+@column+' <> r2.'+@column+' then 1 else 0 end
from #row1 r1
left join #row2 r2 on r1.sys_volgnr=r2.sys_volgnr'
exec sp_executesql @sql,N'@out bit OUTPUT', @out=@out OUTPUT
if( @out = 1 ) begin
insert into @tmp VALUES(@column)
end
FETCH NEXT FROM columnCursor INTO @column
end
CLOSE columnCursor;
DEALLOCATE columnCursor;
Is there an easier way to accomplish this?
Yes, there is.
You can use the COLUMNS_UPDATED function to determine the columns that had actually changed values, though it's not a very friendly function in terms of code readability.
Read this article from Microsoft support called Proper Use of the COLUMNS_UPDATED() Function to see what I mean.
I came across an article called A More Performant Alternative To COLUMNS_UPDATED(); perhaps it can help you or at least inspire you.
I will note that you should resist the temptation to use the UPDATE() function, as it may return true even if no data was changed.
Here is the relevant part from its MSDN page:
UPDATE() returns TRUE regardless of whether an INSERT or UPDATE attempt is successful.
Looks like you're trying to build a dynamic solution, which might be useful if you expect to change often (=new columns to be added etc). You could do something like this (in pseudo-code)
Build the dynamic SQL from the metadata views (INFORMATION_SCHEMA.COLUMNS) for the column names:
insert into table ...
select
function_to_split_by_comma (
case when I.col1 <> U.col1 then 'col1,' else '' end +
case when I.col2 <> U.col2 then 'col2,' else '' end +
...
)
where
I.key_column1 = U.key_column1 ...
The names (col1, col2) should come from the metadata query, with one case expression generated per column, plus a fixed SQL part at the beginning. You'll also need to figure out how to join inserted and deleted, which requires the primary key.
For splitting the data into rows, you can use for example the delimited_split_8k by Jeff Moden (http://www.sqlservercentral.com/articles/Tally+Table/72993/).
Also as Damien pointed out, there can be more than one row in the inserted / deleted tables.
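For a fixed list of columns, the same idea can be written without a cursor or dynamic SQL. The following is only a sketch of a different, set-based technique (unpivoting with CROSS APPLY and comparing with EXCEPT), not the method from either answer. dbo.MyTable and the key sys_volgnr come from the question; col1, col2 and the logging table dbo.ChangeLog are hypothetical:

```sql
-- Hypothetical trigger; col1/col2 and dbo.ChangeLog(sys_volgnr, column_name)
-- are placeholders. Assumes the key sys_volgnr itself is never updated.
CREATE TRIGGER dbo.trg_MyTable_LogChanges
ON dbo.MyTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.ChangeLog (sys_volgnr, column_name)
    SELECT i.sys_volgnr, c.column_name
    FROM inserted AS i
    JOIN deleted AS d
        ON d.sys_volgnr = i.sys_volgnr
    -- one (name, new, old) row per column of interest
    CROSS APPLY (VALUES
        (N'col1', CAST(i.col1 AS nvarchar(4000)), CAST(d.col1 AS nvarchar(4000))),
        (N'col2', CAST(i.col2 AS nvarchar(4000)), CAST(d.col2 AS nvarchar(4000)))
    ) AS c (column_name, new_value, old_value)
    -- EXCEPT treats two NULLs as equal, so this is a NULL-safe "changed" test
    WHERE EXISTS (SELECT c.new_value EXCEPT SELECT c.old_value);
END;
```

Because it joins inserted to deleted on the key, it copes with statements that update several rows at once.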

Inserting a string of form "GUID1, GUID2, GUID3 ..." into an IN statement in TSQL

I've got a stored procedure in my database, that looks like this
ALTER PROCEDURE [dbo].[GetCountingAnalysisResults]
@RespondentFilters varchar
AS
BEGIN
@RespondentFilters = '''8ec94bed-fed6-4627-8d45-21619331d82a, 114c61f2-8935-4755-b4e9-4a598a51cc7f'''
DECLARE @SQL nvarchar(600)
SET @SQL =
'SELECT *
FROM Answer
WHERE Answer.RespondentId IN ('+@RespondentFilters+'''))
GROUP BY ChosenOptionId'
exec sp_executesql @SQL
END
It compiles and executes, but somehow it doesn't give me good results, just like the IN statement wasn't working. Please, if anybody know the solution to this problem, help me.
You should definitely look at splitting the list of GUIDs into a table and joining against that table. You should be able to find plenty of examples online for a table-valued function that splits an input string into a table.
Otherwise, your stored procedure is vulnerable to SQL injection. Consider the following value for @RespondentFilters:
@RespondentFilters = '''''); SELECT * FROM User; /*'
Your query would be more secure parsing (i.e. validating) the parameter values and joining:
SELECT *
FROM Answer
WHERE Answer.RespondentId IN (SELECT [Item] FROM dbo.ParseList(@RespondentFilters))
GROUP BY ChosenOptionId
or
SELECT *
FROM Answer
INNER JOIN dbo.ParseList(@RespondentFilters) Filter ON Filter.Item = Answer.RespondentId
GROUP BY ChosenOptionId
It's slightly more efficient as well, since you aren't dealing with dynamic SQL (sp_executesql will cache query plans, but I'm not sure if it will accurately identify your query as a parameterized query since it has a variable list of items in the IN clause).
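On SQL Server 2016 and later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can stand in for a hand-written dbo.ParseList. A sketch under some assumptions: RespondentId is taken to be a uniqueidentifier column, and the select list is reduced to a count per ChosenOptionId because the question does not say which aggregates it needs:

```sql
-- Plain comma-separated list; no embedded quotes are needed with this approach.
DECLARE @RespondentFilters varchar(max) =
    '8ec94bed-fed6-4627-8d45-21619331d82a, 114c61f2-8935-4755-b4e9-4a598a51cc7f';

SELECT a.ChosenOptionId, COUNT(*) AS AnswerCount
FROM dbo.Answer AS a
JOIN (
    -- TRY_CONVERT yields NULL for anything that is not a valid GUID,
    -- so malformed items simply match no rows.
    SELECT TRY_CONVERT(uniqueidentifier, LTRIM(RTRIM(s.value))) AS RespondentId
    FROM STRING_SPLIT(@RespondentFilters, ',') AS s
) AS f
    ON f.RespondentId = a.RespondentId
GROUP BY a.ChosenOptionId;
```

It is also worth checking the parameter declaration: a parameter declared as plain varchar with no length holds a single character, so the list would be cut down to its first character before the query ever runs.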
You need single quotes around each GUID in the list
@RespondentFilters = '''8ec94bed-fed6-4627-8d45-21619331d82a'', ''114c61f2-8935-4755-b4e9-4a598a51cc7f'''
It looks like you don't have closing quotes around your @RespondentFilters '8ec94bed-fed6-4627-8d45-21619331d82a, 114c61f2-8935-4755-b4e9-4a598a51cc7f'
Since GUIDs do a string compare, that's not going to work.
Your best bet is to use some code to split the list out into multiple values.
Something like this:
-- This would be the input parameter of the stored procedure, if you want to do it that way, or a UDF
declare @string varchar(500)
set @string = 'ABC,DEF,GHIJK,LMNOPQRS,T,UV,WXY,Z'
declare @pos int
declare @piece varchar(500)
-- Need to tack a delimiter onto the end of the input string if one doesn't exist
if right(rtrim(@string),1) <> ','
set @string = @string + ','
set @pos = patindex('%,%' , @string)
while @pos <> 0
begin
set @piece = left(@string, @pos - 1)
-- You have a piece of data, so insert it, print it, do whatever you want to with it.
print cast(@piece as varchar(500))
set @string = stuff(@string, 1, @pos, '')
set @pos = patindex('%,%' , @string)
end
Code stolen from Raymond Lewallen
I think you need quotes inside the string too. Try:
@RespondentFilters = '''8ec94bed-fed6-4627-8d45-21619331d82a'',''114c61f2-8935-4755-b4e9-4a598a51cc7f'''
You could also consider parsing @RespondentFilters into a temporary table.
Thank you all for your answers. They all helped a lot. I've dealt with the problem by writing a split function, and it works fine. It's a little more overhead than what I could have done, but you know, the deadline is hiding around the corner :)

Resources