SQL Delete clears the table instead of erroring - sql-server

I have a piece of SQL which (you would think) wouldn't compile, but which instead deletes all rows from the target table.
Consider this setup:
create table TableA (ColumnA varchar(200));
create table TableB (ColumnB varchar(200));
insert TableA values ('A'),('B'),('C');
insert TableB values ('A');
Then the following sql:
--Returns all rows from TableA
select * from TableA;
--Does not error (ColumnA does not exist on TableB)
delete TableA where ColumnA in (select ColumnA from TableB)
--No Rows are returned
select * from TableA;
The delete statement above causes all rows to be removed from TableA, rather than erroring that ColumnA doesn't exist in TableB.
There's a SQL Fiddle demonstrating this here: http://www.sqlfiddle.com/#!3/9d883/6
It seems that the ColumnA from TableA is being picked up, but I expected it to be "out of scope".
Why is this?

That works as expected, because ColumnA in the inner query is correlated with the outer query.
This commonly used correlated query pattern is valid:
DELETE TableA WHERE NOT EXISTS (select * from TableB where TableB.ID=TableA.ID)
It removes TableA entries that don't have a dependent record in TableB.
It shows that you can reference TableA columns in a correlated query. In your query
delete TableA where ColumnA in (select ColumnA from TableB)
TableB has no ColumnA, so the name resolves to TableA.ColumnA from the outer query. The inner query therefore produces
one row for each record in TableB
one column for each row, whose value is ColumnA from the outer query
So every TableA row finds its own ColumnA value in the subquery result (as long as TableB has at least one row and ColumnA is not NULL), and the DELETE goes through.
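To make the name resolution visible, the statement from the question can be written with the column qualified explicitly. This is only a sketch against the TableA/TableB setup from the question. The second statement is an assumed equivalent, added purely for illustration.

```sql
-- Equivalent to the original statement: the unqualified ColumnA
-- inside the subquery binds to the current outer TableA row.
delete TableA
where ColumnA in (select TableA.ColumnA from TableB);

-- The same predicate spelled out: a row qualifies when its own ColumnA
-- is not NULL and TableB contains at least one row.
delete TableA
where ColumnA is not null
  and exists (select 1 from TableB);
```

Either form shows that TableB contributes only its row count here, never any of its values.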

While I understand the confusion, it is behaving as it should. ColumnA is still "in scope". In fact you could join on it in your subquery if you wanted. The brackets don't limit the scope, but from a readability standpoint I can see the confusion that it creates.
This is another example of why it's a good idea to always prefix your column names with the table name (or alias).
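As a sketch of that advice, using the tables from the question: once the column is qualified with an alias, the mistake can no longer be resolved against the outer table. The second statement assumes that ColumnB was the column actually intended, which the question does not state.

```sql
-- Fails with "Invalid column name 'ColumnA'": alias b refers to TableB,
-- which has no such column, so the outer table is never considered.
delete a
from TableA a
where a.ColumnA in (select b.ColumnA from TableB b);

-- Assumed intent: compare against TableB.ColumnB instead.
delete a
from TableA a
where a.ColumnA in (select b.ColumnB from TableB b);
```

Run the two statements separately, since the invalid column in the first one prevents the whole batch from compiling.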

Related

SQL Server : DELETE FROM table FROM table

I keep coming across this DELETE FROM FROM syntax in SQL Server, and having to remind myself what it does.
DELETE FROM tbl
FROM #tbl
INNER JOIN tbl ON fk = pk AND DATEDIFF(day, #tbl.date, tbl.Date) = 0
EDIT: To make most of the comments and suggested answers make sense, the original question had this query:
DELETE FROM tbl
FROM tbl2
As far as I understand, you would use a structure like this where you are restricting which rows to delete from the first table based on the results of the from query. But to do that you need to have a correlation between the two.
In your example there is no correlation, so it is effectively a cross join, which means "for every row in tbl2, delete every row in tbl". In other words, it will delete every row in the first table (as long as tbl2 is not empty).
Here is an example:
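-- #t1 and #t2 are temp tables, so they are created with CREATE TABLE
create table #t1 (A int, B int);
insert #t1 values (15, 9)
,(30, 10)
,(60, 11)
,(70, 12)
,(80, 13)
,(90, 15);
create table #t2 (A int, B int);
insert #t2 values (15, 9)
,(30, 10)
,(60, 11);
-- No join condition: every row of #t1 pairs with every row of #t2
delete from #t1 from #t2;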
The result is an empty #t1.
On the other hand this would delete just the matching rows:
delete from #t1 from #t2 t2 join #t1 t1 on t1.A=t2.A
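One cautious way to check such a statement, sketched here against the same #t1 and #t2 tables, is to run the second FROM clause as a SELECT first and look at which rows come back. The alias-as-target form of the delete is shown as an alternative spelling, not as the original author's recommendation.

```sql
-- Preview: the rows of #t1 that the join would match.
select t1.*
from #t2 t2
join #t1 t1 on t1.A = t2.A;

-- The same join, naming the alias as the delete target.
delete t1
from #t2 t2
join #t1 t1 on t1.A = t2.A;
```

If the preview returns more rows than expected (for example, all of them), the join condition is missing or wrong.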
I haven't seen this anywhere before. The documentation of DELETE tells us:
FROM table_source Specifies an additional FROM clause. This
Transact-SQL extension to DELETE allows specifying data from
<table_source> and deleting the corresponding rows from the table in
the first FROM clause.
This extension, specifying a join, can be used instead of a subquery
in the WHERE clause to identify rows to be removed.
Later in the same document we find
D. Using joins and subqueries to data in one table to delete rows in
another table The following examples show two ways to delete rows in
one table based on data in another table. In both examples, rows from
the SalesPersonQuotaHistory table in the AdventureWorks2012 database
are deleted based on the year-to-date sales stored in the SalesPerson
table. The first DELETE statement shows the ISO-compatible subquery
solution, and the second DELETE statement shows the Transact-SQL FROM
extension to join the two tables.
It gives these examples to demonstrate the difference:
-- SQL-2003 Standard subquery
DELETE FROM Sales.SalesPersonQuotaHistory
WHERE BusinessEntityID IN
(SELECT BusinessEntityID
FROM Sales.SalesPerson
WHERE SalesYTD > 2500000.00);
-- Transact-SQL extension
DELETE FROM Sales.SalesPersonQuotaHistory
FROM Sales.SalesPersonQuotaHistory AS spqh
INNER JOIN Sales.SalesPerson AS sp
ON spqh.BusinessEntityID = sp.BusinessEntityID
WHERE sp.SalesYTD > 2500000.00;
The second FROM mentions the same table in this case. This is a weird way to get something similar to an updatable CTE or a derived table.
In the third sample in section D the documentation states clearly:
-- No need to mention target table more than once.
DELETE spqh
FROM
Sales.SalesPersonQuotaHistory AS spqh
INNER JOIN Sales.SalesPerson AS sp
ON spqh.BusinessEntityID = sp.BusinessEntityID
WHERE sp.SalesYTD > 2500000.00;
So I get the impression, the sole reason for this was to use the real table's name as the DELETE's target instead of an alias.

String or binary data would be truncated error in SQL server. How to know the column name throwing this error

I have an INSERT query that inserts data using a SELECT with several joins between tables.
When I run that query, it fails with the error "String or binary data would be truncated".
I am trying to insert thousands of rows and multiple columns into that table.
So it is not practical to inspect all the data by eye and see which values are causing this error.
Is there any specific way to identify which column is throwing this error, or which specific record failed to insert and caused it?
I found one article on this:
RareSQL
But that applies when rows are inserted one by one with explicit values.
I am inserting multiple rows at the same time using SELECT statements.
E.g.,
INSERT INTO TABLE1 (COLUMN1, COLUMN2, ...) SELECT COLUMN1, COLUMN2, ... FROM TABLE2 JOIN TABLE3
Also, in my case, I have multiple INSERT and UPDATE statements, and I am not even sure which statement is throwing this error.
You can do a selection like this:
select TABLE2.ID, TABLE3.ID, TABLE1.COLUMN1, TABLE1.COLUMN2, ...
FROM TABLE2
JOIN TABLE3
ON TABLE2.JOINCOLUMN1 = TABLE3.JOINCOLUMN2
LEFT JOIN TABLE1
ON TABLE1.COLUMN1 = TABLE2.COLUMN1 and TABLE1.COLUMN2 = TABLE2.COLUMN2 and ...
WHERE TABLE1.ID IS NULL
The first join reproduces the selection you have been using for the insert and the second join is a left join, which will yield null values for TABLE1 if a row having the exact column values you wanted to insert does not exist. You can apply this logic to your other queries, which were not given in the question.
You might just have to do it the hard way. To make it a little simpler, you can do this
Temporarily remove the insert command from the query, so you are getting a result set out of it. You might need to give some of the columns aliases if they don't come with one. Then wrap that select query as a subquery, and test likely columns (nvarchars, etc) like this
Select top 5 len(Col1), *
from (Select col1, col2, ... your query (without insert) here) A
Order by 1 desc
This will sort the rows with the largest values in the specified column first and just return the rows with the top 5 values - enough to see if you've got a big problem or just one or two rows with an issue. You can quickly change which column you're checking simply by changing the column name in the len(Col1) part of the first line.
If the subquery takes a long time to run, create a temp table with the same columns but with large string sizes (such as varchar(max)) so there are no errors. Then you can do the insert just once into that table and run your tests against it instead of rerunning the subquery many times.
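A minimal sketch of that staging-table idea follows. MyDataSource, Col1, Col2 and #Staging are placeholder names, not objects from the question, and the casts assume the source columns are character data.

```sql
-- Placeholder names: MyDataSource, Col1, Col2, #Staging.
-- Widen every string column so the load itself cannot truncate.
select cast(Col1 as nvarchar(max)) as Col1,
       cast(Col2 as nvarchar(max)) as Col2
into #Staging
from MyDataSource;

-- Longest value per column, to compare with the target column sizes.
select max(len(Col1)) as MaxLenCol1,
       max(len(Col2)) as MaxLenCol2
from #Staging;
```

Note that LEN ignores trailing spaces, so DATALENGTH (which returns bytes) may be the better check if trailing whitespace is suspected.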
From this answer,
you can use a temp table and compare it with the target table.
For example, this
Insert into dbo.MyTable (columns)
Select columns
from MyDataSource ;
becomes this:
Select columns
into #T
from MyDataSource;
-- Restrict each side to its own table before joining, so that
-- columns of unrelated tables are not reported as mismatches
select *
from (select * from tempdb.sys.columns
where object_id = Object_ID(N'tempdb..#T')) as TempCols
full outer join (select * from MyDb.sys.columns
where object_id = Object_ID(N'MyDb.dbo.MyTable')) as RealCols
on TempCols.name = RealCols.name
where TempCols.name is null -- no match for real target name
or RealCols.name is null -- no match for temp target name
or RealCols.system_type_id != TempCols.system_type_id
or RealCols.max_length < TempCols.max_length ;

T-SQL column that indicates one or more records exist in a separate table

I want a query that selects all records from tableA and no other records. However, I want my query to include a column that indicates that 1 or more records exist in tableB.
LEFT OUTER JOIN tableA to tableB doesn't work because if there are 2 records in tableB that relate to a record in tableA I get 2 records in the result set. I only want 1.
RIGHT OUTER JOIN doesn't work because my query returns all of the records in tableB that do not match to any records in tableA. I do not want to get records from tableB that do not match at least 1 record in tableA.
INNER JOIN also fails because I do not get all of the records in tableA; only those that contain a matching record in tableB.
It's as if I need a query like this:
SELECT tableA.ID, IF EXISTS row in tableB THEN 1 ELSE 0
FROM tableA <some sort of join> tableB on tableA.ID = tableB.FKtoTableA
Because the goal is merely to test for existence, we highly suggest using the EXISTS clause inline:
SELECT A.*
, CASE
WHEN EXISTS (
SELECT 1
FROM TableB B
WHERE B.Id = A.Id
) THEN 1
ELSE 0
END
FROM TableA A
Not only is this typically going to be faster than a solution that employs a LEFT JOIN + IS NOT NULL or a COUNT, it also has the added benefit of having semantics that agree with your problem statement.
You could use a subquery:
select
tableA.*,
(select count(*) from tableB where tableA.ID=tableB.ID) as 'Count in TableB'
from tableA
You could wrap a conditional or case statement around the subquery to give you a more Boolean value if you wanted.
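For instance, a CASE expression around the same subquery could look like the sketch below. The column alias ExistsInTableB is made up for illustration.

```sql
select
    tableA.*,
    case
        when (select count(*) from tableB where tableB.ID = tableA.ID) > 0 then 1
        else 0
    end as ExistsInTableB  -- hypothetical column name
from tableA;
```

This returns exactly one row per tableA record, with 1 or 0 in the extra column.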
You could use left join and only pull in aggregate data from b:
select a.id, cast(count(b.id) as bit) from a left join b on a.id = b.id group by a.id;
Converting to bit promotes any nonzero value to 1.

TSQL Copy New Contents to Archive Table Only

I have an ArchiveTable into which I want to periodically copy any new records from OriginalTable. This is something I thought might work.
INSERT INTO OriginalTable
SELECT *
FROM ArchiveTable
WHERE NOT EXISTS (SELECT *
FROM OriginalTable ot
INNER JOIN ArchiveTable at ON ot.email = at.email)
Simply doing something like..
INSERT INTO ArchiveTable
SELECT * FROM OriginalTable
Of course, only works for the initial copy.
Your current query:
INSERT INTO OriginalTable
SELECT * FROM ArchiveTable
WHERE NOT EXISTS
(SELECT * FROM OriginalTable ot
INNER JOIN ArchiveTable at
ON ot.email = at.email)
Uses an EXISTS subquery that isn't related to the outer query. So it's saying, "if no row exists in the original table that has the same email as any row in the archive table, then insert everything in the archive table into the Original table."
Probably not what you want. You probably want to insert the specific rows that do not already exist in the original table. So you would want to correlate the subquery to the outer query:
INSERT INTO OriginalTable
SELECT * FROM ArchiveTable at
WHERE NOT EXISTS
(SELECT * FROM OriginalTable ot
WHERE ot.email = at.email)
This query says, "insert into the original table, any rows in the archive table where I don't already have the Email in the Original table".
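Note that the queries above keep the direction used in the question's attempt (ArchiveTable into OriginalTable). If the goal is the one stated at the top of the question, copying new rows from OriginalTable into ArchiveTable, the same correlated pattern could be sketched as follows, assuming both tables have the same column layout and that email identifies a record. The aliases src and arc are arbitrary.

```sql
-- Assumption: identical column order in both tables; email identifies a row.
INSERT INTO ArchiveTable
SELECT src.*
FROM OriginalTable AS src
WHERE NOT EXISTS
    (SELECT 1
     FROM ArchiveTable AS arc
     WHERE arc.email = src.email);
```

Run periodically, this only adds rows whose email is not yet present in ArchiveTable.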

T-SQL Deletes all rows from a table when subquery is malformed [duplicate]

I ran across an issue today where a subquery was bad and the result was all rows from the parent table were deleted.
TableA
ID,
Text,
GUID
TableB
ID,
TableAID,
Text
delete from TableB
where id in (
select TableAID
from TableA
where GUID = 'fdjkhflafdhf'
)
If you run the subquery by itself you get an error since the column (TableAID) doesn't exist in Table A. If you run the full query - it deletes all records from table B without an error.
I also tried the following queries which removed 0 records (expected)
delete from TableB where id in (null)
delete from TableB where id in (select null)
Can someone explain to me why this is occurring when the query is malformed? Why does it seem to evaluate to true?
Note: This was tested on SQL Server 2008 R2
As TableAID doesn't exist in TableA, the query is using the column from TableB. Therefore the query is the same as:
delete from TableB
where id in (
select TableB.TableAID
from TableA
where GUID = 'fdjkhflafdhf'
)
So in essence it's doing:
delete from TableB
where id in (TableAID)
If you are using subqueries, it's best to qualify your column references with their table names. The following WILL raise an error:
delete from TableB
where id in (
select TableA.TableAID
from TableA
where TableA.GUID = 'fdjkhflafdhf'
)
Furthermore I would use an alias so that we know which query we are referring to:
delete from TableB
where id in (
select a.TableAID
from TableA a
where a.GUID = 'fdjkhflafdhf'
)
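As an extra safety net while testing a delete like this, the statement can be run inside a transaction and rolled back after checking how many rows it touched. The sketch below assumes the intended match was TableB.TableAID against TableA.ID, which the question does not confirm.

```sql
begin transaction;

-- Assumed intent: remove TableB rows that belong to the matching TableA row.
delete from TableB
where TableAID in (
    select a.ID
    from TableA a
    where a.GUID = 'fdjkhflafdhf'
);

-- @@ROWCOUNT reports the number of rows affected by the previous statement.
select @@rowcount as RowsDeleted;

-- Undo the delete while testing; use COMMIT once the count looks right.
rollback transaction;
```

A row count equal to the size of the whole table is a strong hint that the subquery is not filtering the way it was meant to.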

Resources