T-SQL CAST only failing when within IN clause


I have the following two tables (in SQL Server 2017):
Parts
    PartCode INT
    IsActive BIT
    ...other fields...
Products
    PartCode VARCHAR(6)
    ...other fields...
In the Products table, PartCode is either a 6-digit number or a letter followed by 5 digits. The Parts table is only concerned with those parts that have 6-digit numeric part codes.
This query successfully returns the list of numeric part codes from the Products table:
SELECT CAST(PartCode AS INT) FROM Products WHERE ISNUMERIC(PartCode)=1
However, once I embed it within an IN, like this:
UPDATE Parts
SET IsActive=0
WHERE PartCode NOT IN (SELECT CAST(PartCode AS INT)
FROM Products
WHERE ISNUMERIC(PartCode)=1)
it fails with "Conversion failed when converting the varchar value 'K12345' to data type int."
I am aware of the odd behaviors of ISNUMERIC (unexpected values that it returns 1 for), but in this case, SELECT ISNUMERIC('K12345') is 0 as expected.
Since ISNUMERIC properly excluded the K12345 value on the SELECT, why did it attempt to cast it with the UPDATE? It should have been excluded from the result set (as it was when running the SELECT by itself) and thus not need to be converted. Why does placing the SELECT within an IN make it behave differently?

Other than the obvious "fix the database", you can avoid most problems by using EXISTS().
UPDATE Parts
SET IsActive=0
WHERE NOT EXISTS (
SELECT *
FROM Products
WHERE Products.PartCode = CAST(Parts.PartCode AS VARCHAR(6))
)
This way you're converting an integer to a string, and it always works. It also continues to allow use of any index on Products.PartCode.

Put the check in the select term:
UPDATE Parts
SET IsActive=0
WHERE PartCode NOT IN (
SELECT
CASE
WHEN ISNUMERIC(PartCode)=1 THEN CAST(PartCode AS INT)
ELSE -1
END
FROM Products)
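Your version fails because SQL Server does not guarantee that the WHERE clause is applied before the expressions in the select list are evaluated. Once the subquery is embedded in the IN, the optimizer is free to pick a plan that runs the CAST on rows the ISNUMERIC filter would later discard. Putting the check in a CASE guards each conversion individually. Note that the -1 placeholder assumes no real part code is -1.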
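On SQL Server 2012 and later (the question mentions 2017), TRY_CAST is another way to guard the conversion: it returns NULL instead of raising an error when a value cannot be converted. A minimal sketch, reusing the Parts and Products tables from the question; the aliases p and pr are only illustrative:

```sql
-- Deactivate parts that have no numeric match in Products.
-- TRY_CAST yields NULL for values such as 'K12345', and NULL = p.PartCode
-- is never true, so those rows are simply ignored by the comparison.
UPDATE p
SET IsActive = 0
FROM Parts AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM Products AS pr
    WHERE TRY_CAST(pr.PartCode AS INT) = p.PartCode
);
```

NOT EXISTS is used here rather than NOT IN because a NULL inside a NOT IN list makes the predicate unknown for every non-matching row, so nothing would be updated. Unlike ISNUMERIC, TRY_CAST tests the exact conversion being performed, so values such as '1e5', for which ISNUMERIC returns 1, do not cause an error.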

Related

I am running a query using fetch and offset but confused with the behaviour of same

I am confused, because an empty set is still a set.
When I execute a query like this:
SELECT FirstName
,LastName
,EmailPromotion
FROM person.Person
ORDER BY LastName desc
OFFSET 0 ROW FETCH FIRST 0 ROW ONLY
it gives an error:
The number of rows provided for a FETCH clause must be greater then zero.
Likewise, when I execute a query like this:
DECLARE @n AS BIGINT = 0;
SELECT FirstName
,LastName
,EmailPromotion
FROM person.Person
ORDER BY LastName desc
OFFSET @n ROW FETCH FIRST @n ROW ONLY
it again gives the same error:
The number of rows provided for a FETCH clause must be greater then zero.
But when I execute a query like this, it gives no error and rightfully returns an empty set:
DECLARE @n AS BIGINT = 0;
SELECT FirstName
,LastName
,EmailPromotion
FROM person.Person
ORDER BY LastName desc
OFFSET 0 ROW FETCH FIRST @n ROW ONLY
Why is that? Can someone explain this behaviour, please?

There are many issues with SQL that make it clear that, whilst it's inspired by set-based logic, it's not rigorously set-based. This would appear to be the case here where, per the documentation, it's incorrect to use a zero value for FETCH.
Other examples where SQL falls down with respect to empty sets include not allowing tables with no columns (which can be interesting as degenerate cases) as well as not allowing keys to be declared with no columns (where the table should contain 0 or exactly 1 row) [1].
So whilst we can say that empty sets are interesting, that doesn't necessarily mean that SQL is going to help you in generating them. The case you've found where it does seem to allow it to happen would appear to be more of a case of "tricking the optimizer" than something you should rely upon in production code.
[1] IIRC, in The Third Manifesto Date and Darwen refer to these as some of SQL's "nullogical" errors. Other set-based issues include the fact that tables and result sets are allowed to have duplicate rows and so may be bags rather than sets.
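If a zero page size can legitimately reach the query, one way to avoid relying on FETCH accepting a zero-valued variable is to branch explicitly. A minimal sketch, assuming hypothetical variables @offset and @n for the rows to skip and the page size; TOP (0) is used for the empty case because TOP does accept zero:

```sql
DECLARE @offset BIGINT = 0;  -- hypothetical: rows to skip
DECLARE @n      BIGINT = 0;  -- hypothetical: page size, may be zero

IF @n > 0
    SELECT FirstName, LastName, EmailPromotion
    FROM person.Person
    ORDER BY LastName DESC
    OFFSET @offset ROWS FETCH NEXT @n ROWS ONLY;
ELSE
    -- Same column list, no rows.
    SELECT TOP (0) FirstName, LastName, EmailPromotion
    FROM person.Person;
```

Either branch returns a result set with the same three columns, so calling code does not need a special case for the empty page.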

Because the page size you are passing is zero, there are no records for FETCH to return. If you pass a non-zero page size, the issue will be resolved.

Using "CAST" operator in sql with where and IN clause not working

The right query that works fine is
select * from Pretty_Txns where Send_Customer in ('1000000000164774783','1000000000253252111')
But my data comes from outside SQL, from a Python application, in the format below and with a varying number of values, hence the use of the IN clause, as it can be easily parameterised:
(1000000000164774783,1000000000253252111)
So I am trying to use CAST to make life simple:
select * from Pretty_Txns where cast (Send_Customer as numeric) in(1000000000164774783,1000000000253252111)
But it fails with:
Arithmetic overflow error converting varchar to data type numeric.
select * from Pretty_Txns where cast (Send_Customer as bigint) in(1000000000164774783,1000000000253252111)
Error converting data type varchar to bigint.

select * from Pretty_Txns where cast (Send_Customer as numeric(38)) in(1000000000164774783,1000000000253252111) --default total digits is 18 if not specified
or
select * from Pretty_Txns where cast (Send_Customer as bigint) in(1000000000164774783,1000000000253252111)
Updated:
First, check whether every value in Send_Customer is numeric.
This will give you all the records that contain only digits:
SELECT Send_Customer FROM yourTable WHERE Send_Customer NOT LIKE '%[^0-9]%'
Or you could run the following two queries and compare the results:
1:
select Count(*)
from (
SELECT Send_Customer
FROM yourTable
WHERE Send_Customer NOT LIKE '%[^0-9]%'
) as ABC
2:
select count(Send_Customer)
from yourtable
Compare the first count with the total you have in the table. If they do not match, some values must contain non-numeric characters, and the conversion from varchar (I assume the column is varchar here) to numeric cannot succeed until you have taken care of those records, for example by removing the offending characters with REPLACE or STUFF. That changes the data, though, so the result may not be accurate in your case.

You could also cast the individual values in the IN clause to varchar if there were only a few, instead of casting the DB field. That way, you wouldn't get a conversion error if send_customer contained some non-numeric data.
Just adding this in case anyone else comes across it in Google.
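A minimal sketch of two ways to avoid the conversion error, reusing the table and column names from the question. The VARCHAR(38) length is an assumption, since the declared length of Send_Customer is not given, and TRY_CAST requires SQL Server 2012 or later:

```sql
-- Option 1: convert the incoming numbers to strings and leave the column alone,
-- so non-numeric values in Send_Customer can never trigger a conversion error.
SELECT *
FROM Pretty_Txns
WHERE Send_Customer IN (CAST(1000000000164774783 AS VARCHAR(38)),
                        CAST(1000000000253252111 AS VARCHAR(38)));

-- Option 2: convert the column, but let unconvertible values become NULL
-- (and therefore never match) instead of raising an error.
SELECT *
FROM Pretty_Txns
WHERE TRY_CAST(Send_Customer AS BIGINT) IN (1000000000164774783,
                                            1000000000253252111);
```

Option 1 keeps the comparison on the raw column, which lets SQL Server use an index on Send_Customer if one exists. It matches on the stored text, however, so values stored with leading zeros would be missed.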

How can this expression reach the NULL expression?

I'm trying to randomly populate a column with values from another table using this statement:
UPDATE dbo.SERVICE_TICKET
SET Vehicle_Type = (SELECT TOP 1 [text]
FROM dbo.vehicle_typ
WHERE id = abs(checksum(NewID()))%21)
It seems to work fine, however the value NULL is inserted into the column. How can I get rid of the NULL and only insert the values from the table?

This can happen when you don't have an appropriate index on the ID column of your vehicle_typ table. Here's a smaller query that exhibits the same problem:
create table T (ID int null)
insert into T(ID) values (0),(1),(2),(3)
select top 1 * from T where ID = abs(checksum(NewID()))%3
Because there's no index on T, what happens is that SQL Server performs a table scan and then, for each row, attempts to satisfy the where clause. Which means that, for each row it evaluates abs(checksum(NewID()))%3 anew. You'll only get a result if, by chance, that expression produces, say, 1 when it's evaluated for the row with ID 1.
If possible (I don't know your table structure), I would first populate a column in SERVICE_TICKET with a random number between 0 and 20 and then perform this update using the already generated number. Otherwise, with the current query structure, you're always relying on SQL Server being clever enough to evaluate abs(checksum(NewID()))%21 only once for each outer row, which it may not always do (as you've already found out).
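A minimal sketch of that two-step approach. The RandomTypeId column is hypothetical, and the join assumes vehicle_typ has exactly one row for each id from 0 to 20, as the %21 in the question suggests:

```sql
-- Step 0: add a scratch column (hypothetical name) to hold the random key.
ALTER TABLE dbo.SERVICE_TICKET ADD RandomTypeId INT NULL;
GO  -- batch separator (SSMS/sqlcmd) so the new column is visible below

-- Step 1: store one random number per row. NEWID() is evaluated for each row,
-- and taking the modulo before ABS keeps the intermediate value small.
UPDATE dbo.SERVICE_TICKET
SET RandomTypeId = ABS(CHECKSUM(NEWID()) % 21);

-- Step 2: look up the text using the stored number, which no longer changes
-- between evaluations.
UPDATE st
SET Vehicle_Type = vt.[text]
FROM dbo.SERVICE_TICKET AS st
INNER JOIN dbo.vehicle_typ AS vt
    ON vt.id = st.RandomTypeId;
```

Because the random value is materialised before the lookup, every ticket whose stored number has a matching id receives a non-NULL value. The scratch column can be dropped afterwards.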

@Damien_The_Unbeliever explained why your query fails.
My first variant was not correct, because I didn't understand the problem in full.
You want to set each row in SERVICE_TICKET to a different random value from vehicle_typ.
To fix it, simply order by a random number rather than comparing a random number with ID, like this (the number of rows in vehicle_typ does not matter, as long as there is at least one):
WITH
CTE
AS
(
SELECT
dbo.SERVICE_TICKET.Vehicle_Type,
CA.[text]
FROM
dbo.SERVICE_TICKET
CROSS APPLY
(
SELECT TOP 1 [text]
FROM dbo.vehicle_typ
ORDER BY NewID()
) AS CA
)
UPDATE CTE
SET Vehicle_Type = [text];
First we define a Common Table Expression, which you can think of as a temporary named result set. For each row in SERVICE_TICKET we pick one random row from vehicle_typ using CROSS APPLY. Then we UPDATE the original table, through the CTE, with the chosen rows.

SQL Server 2005 SELECT TOP 1 from VIEW returns LAST row

I have a view that may contain more than one row, looking like this:
[rate] | [vendorID]
8374 1234
6523 4321
5234 9374
In a SPROC, I need to set a param equal to the value of the first column from the first row of the view. Something like this:
DECLARE @rate int;
SET @rate = (select top 1 rate from vendor_view where vendorID = 123)
SELECT @rate
But this ALWAYS returns the LAST row of the view.
In fact, if I simply run the subselect by itself, I only get the last row.
With 3 rows in the view, TOP 2 returns the FIRST and THIRD rows in order. With 4 rows, it's returning the top 3 in order. Yet still top 1 is returning the last.
DERP?!?
This works..
DECLARE @rate int;
CREATE TABLE #temp (vRate int)
INSERT INTO #temp (vRate) (select rate from vendor_view where vendorID = 123)
SET @rate = (select top 1 vRate from #temp)
SELECT @rate
DROP TABLE #temp
.. but can someone tell me why the first behaves so fudgely and how to do what I want? As explained in the comments, there is no meaningful column by which I can do an order by. Can I force the order in which rows are inserted to be the order in which they are returned?
[EDIT] I've also noticed that: select top 1 rate from ([view definition select]) also returns the correct values time and again.[/EDIT]

That is by design.
If you don't specify how the query should be sorted, the database is free to return the records in any order that is convenient. There is no natural order for a table that is used as default sort order.
What the order will actually be depends on how the query is planned, so you can't even rely on the same query giving a consistent result over time, as the database will gather statistics about the data and may change how the query is planned based on that.
To get the record that you expect, you simply have to specify how you want them sorted, for example:
select top 1 rate
from vendor_view
where vendorID = 123
order by rate

I ran into this problem on a query that had worked for years. We upgraded SQL Server and all of a sudden, an unordered select top 1 was not returning the final record in a table. We simply added an order by to the select.
My understanding is that, if no ORDER BY is provided, SQL Server will generally return the results in the order of the clustered index or of whatever index the engine picks. But this is not a guarantee of any particular order.
If you don't have something to order off of, you need to add it. Either add a date inserted column and default it to GETDATE() or add an identity column. It won't help you historically, but it addresses the issue going forward.
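A minimal sketch of the identity-column option. The base table name dbo.vendor_rates and the column name row_id are hypothetical, since the definition of vendor_view is not shown, and the view would have to be changed to expose the new column:

```sql
-- Hypothetical base table behind vendor_view; the real name is not given.
ALTER TABLE dbo.vendor_rates
    ADD row_id INT IDENTITY(1, 1) NOT NULL;
GO

-- Once the view exposes row_id, "first" has a defined meaning.
DECLARE @rate INT;
SET @rate = (SELECT TOP (1) rate
             FROM vendor_view
             WHERE vendorID = 123
             ORDER BY row_id);
SELECT @rate;
```

Existing rows are numbered in an arbitrary order when the column is added, which is why this only fixes the ordering for rows inserted afterwards.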

While it doesn't necessarily make sense that the results of the query should be consistent, in this particular instance they are so we decided to leave it 'as is'. Ultimately it would be best to add a column, but this was not an option. The application this belongs to is slated to be discontinued sometime soon and the database server will not be upgraded from SQL 2005. I don't necessarily like this outcome, but it is what it is: until it breaks it shall not be fixed. :-x

How does sql server choose values in an update statement where there are multiple options?

I have an update statement in SQL server where there are four possible values that can be assigned based on the join. It appears that SQL has an algorithm for choosing one value over another, and I'm not sure how that algorithm works.
As an example, say there is a table called Source with two columns (Match and Data) structured as below:
(The match column contains only 1's, the Data column increments by 1 for every row)
Match  Data
-----  ----
1      1
1      2
1      3
1      4
That table will update another table called Destination with the same two columns structured as below:
Match  Data
-----  ----
1      NULL
If you update the Data column in Destination in the following way:
UPDATE
Destination
SET
Data = Source.Data
FROM
Destination
INNER JOIN
Source
ON
Destination.Match = Source.Match
there will be four possible values that Destination.Data could be set to after this query is run. I've found that changing the indexes on Source has an impact on what Destination is set to, and it appears that SQL Server just updates the Destination table with the first matching value it finds.
Is that accurate? Is it possible that SQL Server is updating the Destination with every possible value sequentially and I end up with the same kind of result as if it were updating with the first value it finds? It seems to be possibly problematic that it will seemingly randomly choose one row to update, as opposed to throwing an error when presented with this situation.
Thank you.
P.S. I apologize for the poor formatting. Hopefully, the intent is clear.

It assigns each of the matching values to Data. Which one you end up with after the query depends on the order in which the results are returned (that is, which one is assigned last).
Since there's no ORDER BY clause, you're left with whatever order Sql Server comes up with. That will normally follow the physical order of the records on disk, and that in turn typically follows the clustered index for a table. But this order isn't set in stone, particularly when joins are involved. If a join matches on a column with an index other than the clustered index, it may well order the results based on that index instead. In the end, unless you give it an ORDER BY clause, Sql Server will return the results in whatever order it thinks it can do fastest.
You can play with this by turning your update query into a select query, so you can see the results. Notice which record comes first and which record comes last in the source table for each record of the destination table. Compare that with the results of your update query. Then play with your indexes again and check the results once more to see what you get.
Of course, it can be tricky here because UPDATE statements are not allowed to use an ORDER BY clause, so regardless of what you find, you should really write the join so it matches the destination table 1:1. You may find the APPLY operator useful in achieving this goal, and you can use it to effectively JOIN to another table and guarantee the join only matches one record.
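A minimal sketch of the APPLY approach, using the Source and Destination tables from the question. Picking the highest Data value is an arbitrary choice made for illustration; any ORDER BY that identifies exactly one source row would do:

```sql
-- For each Destination row, CROSS APPLY runs the inner query once and
-- TOP (1) with ORDER BY limits it to a single, well-defined Source row.
UPDATE d
SET Data = s.Data
FROM Destination AS d
CROSS APPLY (
    SELECT TOP (1) src.Data
    FROM Source AS src
    WHERE src.Match = d.Match
    ORDER BY src.Data DESC
) AS s;
```

With CROSS APPLY, Destination rows that have no match in Source are left untouched.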

The choice is not deterministic and it can be any of the source rows.
You can try
DECLARE @Source TABLE(Match INT, Data INT);
INSERT INTO @Source
VALUES
(1, 1),
(1, 2),
(1, 3),
(1, 4);
DECLARE @Destination TABLE(Match INT, Data INT);
INSERT INTO @Destination
VALUES
(1, NULL);
UPDATE Destination
SET Data = Source.Data
FROM @Destination Destination
INNER JOIN @Source Source
ON Destination.Match = Source.Match;
SELECT *
FROM @Destination;
And look at the actual execution plan. I see the following.
The output columns from @Destination are Bmk1000, Match. Bmk1000 is an internal row identifier (used here due to lack of clustered index in this example) and would be different for each row emitted from @Destination (if there was more than one).
The single row is then joined onto the four matching rows in @Source and the resultant four rows are passed into a stream aggregate.
The stream aggregate groups by Bmk1000 and collapses the multiple matching rows down to one. The operation performed by this aggregate is ANY(@Source.[Data]).
The ANY aggregate is an internal aggregate function not available in TSQL itself. No guarantees are made about which of the four source rows will be chosen.
Finally the single row per group feeds into the UPDATE operator to update the row with whatever value the ANY aggregate returned.
If you want deterministic results then you can use an aggregate function yourself...
WITH GroupedSource AS
(
SELECT Match,
MAX(Data) AS Data
FROM @Source
GROUP BY Match
)
UPDATE Destination
SET Data = Source.Data
FROM @Destination Destination
INNER JOIN GroupedSource Source
ON Destination.Match = Source.Match;
Or use ROW_NUMBER...
WITH RankedSource AS
(
SELECT Match,
Data,
ROW_NUMBER() OVER (PARTITION BY Match ORDER BY Data DESC) AS RN
FROM @Source
)
UPDATE Destination
SET Data = Source.Data
FROM @Destination Destination
INNER JOIN RankedSource Source
ON Destination.Match = Source.Match
WHERE RN = 1;
The latter form is generally more useful as in the event you need to set multiple columns this will ensure that all values used are from the same source row. In order to be deterministic the combination of partition by and order by columns should be unique.
