SQL Server select-where statement issue - sql-server

I am using SQL Server 2008 Enterprise on Windows Server 2008 Enterprise. I have a question about T-SQL in SQL Server 2008. A select-where statement can be written in two different forms:
(1) select where foo between [some value] and [some other value],
(2) select where foo >= [some value] and foo <= [some other value].
I am not sure whether between-and is always the same as using the <= and >= operators.
BTW: are they always the same, even for different types of data (e.g. comparing numeric values, comparing string values)? I would appreciate it if anyone could point to documentation that shows whether they are always the same, so that I can learn more from it.
thanks in advance,
George

Yes, they are always the same. The entry in Books Online for BETWEEN says:
"BETWEEN returns TRUE if the value of test_expression is greater than or equal to the value of begin_expression and less than or equal to the value of end_expression."
You can see this easily by looking at the execution plans. BETWEEN doesn't even appear in the plan text: it has been replaced with >= and <=, and no distinction is made between the two forms.
SELECT * FROM master.dbo.spt_values
WHERE number between 1 and 3 /*Numeric*/
SELECT * FROM master.dbo.spt_values
WHERE name between 'a' and 'b' /*String*/
select * from sys.objects
WHERE create_date between GETDATE() and GETDATE()+100 /*Date*/
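The equivalence can also be checked mechanically. The sketch below is only an illustration in Python, using the standard-library sqlite3 module as a stand-in engine (an assumption made for convenience: it is not SQL Server, and the table vals and its rows are made up). It compares the rows returned by each form for a numeric and a string column.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (number INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO vals VALUES (?, ?)",
    [(0, "a"), (1, "apple"), (2, "b"), (3, "banana"), (4, "c")],
)

def matching_numbers(where_clause):
    """Return the 'number' values of the rows matching the given WHERE clause."""
    sql = f"SELECT number FROM vals WHERE {where_clause} ORDER BY number"
    return [row[0] for row in conn.execute(sql)]

# Numeric comparison: both forms include the two end points.
numeric_between = matching_numbers("number BETWEEN 1 AND 3")
numeric_ops = matching_numbers("number >= 1 AND number <= 3")

# String comparison: again both forms select the same rows.
string_between = matching_numbers("name BETWEEN 'a' AND 'b'")
string_ops = matching_numbers("name >= 'a' AND name <= 'b'")

print(numeric_between == numeric_ops, string_between == string_ops)
```

On SQL Server itself, string comparisons follow the collation in effect, but that applies to both forms equally.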

Related

Microsoft SQL Server: Error with Group By

I'm new to Microsoft SQL Server 2014. I run this SQL code:
SELECT TOP(10) 'DBSG' as seek_entity, *
FROM DBSG..PM00200
and get this result:
Next, I want to find out total line items for that entity with code below.
WITH vw_pm00200_all AS
(
SELECT TOP(10)
'DBSG' as seek_entity, *
FROM
DBSG..PM00200
)
SELECT
seek_entity,
COUNT(*) AS total
FROM
vw_pm00200_all
GROUP BY
1
Sadly, I get this error. I have no idea why it failed.
Msg 164, Level 15, State 1, Line 9
Each GROUP BY expression must contain at least one column that is not an outer reference.
Lastly, please advise: is Microsoft SQL Server based on Transact-SQL?
It looks like you are running into this problem here: Each GROUP BY expression must contain at least one column that is not an outer reference
As the answer there points out, grouping by a constant literal is pointless, as it is the same for all rows. Note that T-SQL treats the 1 in GROUP BY 1 as the constant 1, not as a column position. COUNT(*) without a GROUP BY will return the same result as COUNT(*) grouped by a constant.
If this is just test code and you plan on using a CASE expression (with different values) in place of the string literal, you may have better luck.
Yes, T-SQL is Microsoft SQL Server's flavor of SQL.

Why does "= ALL (subquery)" evaluate to true if the subquery returns no results?

I would expect "= ALL (subquery)" to evaluate to false if the subquery returns no results.
However in testing I find that not to be the case:
--put one record in #Orders
SELECT 1 AS 'OrderID'
INTO #Orders;
--put one record in #OrderLines
SELECT
1 AS 'OrderID'
,1 AS 'OrderLineID'
,3 AS 'Quantity'
INTO #OrderLines;
--as expected this returns the record in #Orders
SELECT *
FROM #Orders
WHERE 3 = ALL
(
SELECT Quantity
FROM #OrderLines
);
--now delete the record in #OrderLines
DELETE FROM #OrderLines;
--this still returns the record from #Orders even though the subquery returns no results
SELECT *
FROM #Orders
WHERE 3 = ALL
(
SELECT Quantity
FROM #OrderLines
);
Execution plan for the final select statement: https://www.brentozar.com/pastetheplan/?id=H1jQ2YgIK
Tested on:
Microsoft SQL Server 2017 (RTM-CU20) (KB4541283) - 14.0.3294.2 (X64)
Microsoft SQL Server 2017 (RTM-CU25) (KB5003830) - 14.0.3401.7 (X64)
When searching I find unofficial sources which say that "= ALL (subquery)" evaluates to true if the subquery returns no results:
"The ALL must be preceded by the comparison operators and evaluates to TRUE if the query returns no rows" https://dotnettutorials.net/lesson/all-operator-sql-server/
"The ALL must be preceded by the comparison operators and evaluates to TRUE if the query returns no rows" https://www.w3resource.com/sql/special-operators/sql_all.php
But I don't see anything in the official documentation (https://learn.microsoft.com/en-us/sql/t-sql/language-elements/all-transact-sql?view=sql-server-ver15) that supports that idea, in fact it would seem to dispute it: "ALL requires the scalar_expression to compare positively to every value that is returned by the subquery"
Questions
1. Is it expected behavior in SQL Server to evaluate ALL as true if the subquery returns no results?
2. If the answer to #1 is "yes":
   a. Is it documented somewhere?
   b. What is the explanation for that behavior? In the code example above, 3 does not compare positively with no results, so it seems highly unintuitive that the query should return results.
Thanks for any assistance and insight.
Paraphrasing the documentation:
... scalar_expression = ALL (subquery) would evaluate as FALSE if some of the values of the subquery don't meet the criteria of the expression.
It's subtle, but the intention seems to be to return false if some values do not satisfy the condition, and true otherwise. In the edge case where there are no values, there are no values that fail the condition, so it returns true.
The "problem" causing the perhaps surprising result is the word "some", which implies existence. If no values exist, there can't be "some" values that are false, so it's true.
You could say it's based on double negative logic where the edge case happens to fall in the unexpected half of the result.
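The same vacuous-truth rule shows up in most programming languages. As a loose analogy (Python rather than T-SQL, with hypothetical helpers equals_all and equals_any standing in for "= ALL (subquery)" and "= ANY (subquery)", and ignoring SQL's NULL handling), a universal check over an empty collection is true because there is no counterexample:

```python
def equals_all(scalar, values):
    """Model of `scalar = ALL (subquery)`: true unless some value differs."""
    return all(scalar == v for v in values)

def equals_any(scalar, values):
    """Model of `scalar = ANY (subquery)`: true only if some value matches."""
    return any(scalar == v for v in values)

one_matching_row = equals_all(3, [3])   # every value equals 3
one_mismatch = equals_all(3, [3, 4])    # 4 is a counterexample
no_rows_all = equals_all(3, [])         # no counterexample exists
no_rows_any = equals_any(3, [])         # no matching value exists

print(one_matching_row, one_mismatch, no_rows_all, no_rows_any)
```

If the intent is "there is at least one row, and every row matches", that has to be stated as two conditions: an existence check combined with the ALL check.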
As a side note, I have written a huge amount of SQL in my career and never used this keyword, nor seen it used.
Recommendation: Do not use.

SQL Server Determining Hard Coded Date as Larger When It's Not?

An old employee left a massive query behind that I've been debugging and it appears that the issue has come down to SQL Server itself determining a comparison differently than what I would have expected.
I have a table with a column col1 containing the value 20191215 as a datetime.
The part in question is similar to the following:
select case when col1 > '01/01/2020' then 1 else 0 end
This statement is returning 1, suggesting that '12/15/2019' is larger than '01/01/2020'.
I do not need assistance correcting the query, as I have already changed it to avoid the comparison the previous employee was using. I am simply curious as to why SQL Server would evaluate this as I have described.
I understand that this is not the typical way SQL Server would store dates; would the issue simply be the formatting of the dates?
Current SQL Server version is: SQL Server 2014 SP3 CU3.
SQL Fiddle link that shows the same results
Please note that the link does not contain an exact replica of my case
Edit: Included additional info relevant to actual query.
It is a string comparison, not a date comparison:
select case when '12/15/2019' > '01/01/2020' then 1 else 0 end
vs
select case when CAST('12/15/2019' AS DATE) > CAST('01/01/2020' AS DATE) then 1 else 0 end
db<>fiddle demo
"I am simply curious as to why SQL Server would evaluate this as I have described."
'12/15/2019' is a string literal. SQL Server does not know you want to treat it as a date unless you explicitly express your intention.
"I have a table with a column col1 containing the value 20191216"
If you are comparing with a column, then the data type of the column matters, along with the data type precedence rules.
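The difference between comparing text and comparing dates is not specific to SQL Server. As a small illustration in Python (not T-SQL; the format string "%m/%d/%Y" is an assumption about how the two literals are meant to be read), the same two values order differently depending on whether they are compared as strings or as parsed dates:

```python
from datetime import datetime

older = "12/15/2019"
newer = "01/01/2020"

# Compared as text, the strings are ordered character by character,
# so "1..." sorts after "0..." regardless of the year.
compared_as_text = older > newer

# Compared as dates, the year is taken into account first.
date_format = "%m/%d/%Y"  # assumed month/day/year layout
compared_as_dates = (
    datetime.strptime(older, date_format) > datetime.strptime(newer, date_format)
)

print(compared_as_text, compared_as_dates)
```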

Surprising behavior of re-assigning a variable for every row in a select

This is the simplest example with which I could reproduce the issue. As such it looks a little bit contrived, but bear with me.
declare @t table(number int)
insert into @t values (1), (2)
declare @sum bigint = 0
select @sum = @sum + number
from (select top 2 number from @t order by number) subquery
order by number desc
select @sum
Here's the query on the data explorer.
I would expect this to return 3, the sum of the values in the table @t. Instead, however, it returns 1.
Doing any of the following will cause the query to correctly return 3:
make @t.number and @sum have the same type (by making @sum an int or @t.number a bigint).
removing the outer order by
removing the inner order by
making both order bys sort in the same direction by adding desc to the inner one or removing it from the outer one
removing the subquery (i.e. just selecting from #t)
None of these things strike me as something that should change the behavior of this query.
Swapping the sort orders (descending in the subquery, ascending on the outside) will make the query return 2 instead of 1.
A similar thing happens with strings instead of numbers, so this isn't constrained to int and bigint.
This happens both with SQL Server 2014 and 2016, or to be precise
Microsoft SQL Server 2014 - 12.0.2000.8 (X64)
Feb 20 2014 20:04:26
Copyright (c) Microsoft Corporation
Developer Edition (64-bit) on Windows NT 6.3 <X64> (Build 10586: )
and
Microsoft SQL Server 2016 (RTM-CU1) (KB3164674) - 13.0.2149.0 (X64)
Jul 11 2016 22:05:22
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: )
(the latter being the data explorer).
What's going on?
The answer seems to be that you are relying on undocumented behaviour which changed in SQL Server 2012.
Per the documentation:
https://msdn.microsoft.com/en-us/library/ms187330.aspx
SELECT @local_variable is typically used to return a single value into the variable. However, when expression is the name of a column, it can return multiple values. If the SELECT statement returns more than one value, the variable is assigned the last value that is returned.
It is not documented what happens if the destination variable (to be assigned to) is part of the source expression. It seems this behaviour has changed. In earlier versions the variable would be assigned once for each row, but that doesn't seem to occur any more.
This is most visible for a lot of functions where the "group concat" trick ceased to work:
SELECT @sentence = @sentence + ' ' + word from SENTENCE_WORDS order by position
These generally have to be replaced by the XML concatenation trick.
set @sentence = (
select word as "text()", ' ' as "text()"
from SENTENCE_WORDS
order by position
for xml path(''), root('root'), type
).value('(/root)[1]', 'nvarchar(max)')
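To illustrate why the result can change, here is a rough Python model (not T-SQL, and not a description of the engine's internals) of the two behaviours described above: updating the variable once per row versus a single assignment from the last row returned. The function names are made up for the sketch.

```python
def accumulate_per_row(rows, start=0):
    """The variable is re-assigned once for every row, so values add up."""
    total = start
    for number in rows:
        total = total + number
    return total

def assign_from_last_row(rows, start=0):
    """The expression is evaluated only for the last row returned."""
    total = start
    if rows:
        total = start + rows[-1]
    return total

rows_descending = sorted([1, 2], reverse=True)  # outer ORDER BY ... DESC
rows_ascending = sorted([1, 2])                 # outer ORDER BY ... ASC

per_row_result = accumulate_per_row(rows_descending)
last_row_desc = assign_from_last_row(rows_descending)
last_row_asc = assign_from_last_row(rows_ascending)

print(per_row_result, last_row_desc, last_row_asc)
```

Under this model the last-row-only strategy yields 1 for a descending outer sort and 2 for an ascending one, which matches the results reported in the question. An explicit aggregate such as SUM does not depend on either strategy.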
Remove the second ORDER BY (i.e. "order by number desc").
You are using an undocumented feature of T-SQL (I believe it's called row concatenation?) which is not guaranteed to work in future versions of SQL Server. It's a little hacky, but very useful nonetheless! As you've discovered, it breaks when you use the ORDER BY clause. This is a known issue with row concatenation.

SQL Server Convert Data

I'm having trouble trying to clean a database because SQL Server doesn't differentiate '2¹59' from '2159', but when I try to convert it to INT it obviously returns an error.
In this case I need to replace every non-numeric value with NULL.
Can someone help please? (I'm using Sql Server 2008)
Since SQL Server 2012 there is a new function called TRY_PARSE.
If you use it, values that cannot be parsed as int are automatically converted to NULL.
select TRY_PARSE('2¹59' as int)
The output of the above query will be NULL.
You can use a different collation to change the way the strings are compared:
select
case when N'2¹59' = N'2159' collate Latin1_General_BIN then 1 else 0 end
This will select 0 as you'd expect.
More importantly, since MS SQL understands Unicode properly, you can do this:
select cast(N'2¹59' as varchar)
which will give you '2159' - properly replacing the "broken" digits.
If you have no other option, you could also build a helper table to handle indexing the string (just a single column with numbers 1..1000 for example), and do something like this:
exists
(
select 1 from [Numbers]
where
[Numbers].[Index] < len([Value]) + 1
and
unicode(substring([Value], [Numbers].[Index], 1)) > 127
)
Needless to say, this is going to be rather slow. For simple integers, though, this can work as a decent validation: for example, simply use (unicode(substring([Value], [Numbers].[Index], 1)) not between 48 and 57) and ([Numbers].[Index] <> 1 or substring([Value], 1, 1) <> '-').
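The per-character check can also be sketched outside the database. The following is a minimal Python illustration (not T-SQL; to_int_or_none is a hypothetical helper, and None plays the role of NULL) of the rule described above: every character must be an ASCII digit, with an optional leading minus sign.

```python
def to_int_or_none(value):
    """Return the integer for plain ASCII digit strings, otherwise None."""
    # Allow a single leading minus sign, as in the validation above.
    digits = value[1:] if value.startswith("-") else value
    # Compare against the ASCII range explicitly: str.isdigit() would
    # accept characters such as the superscript one.
    if digits and all("0" <= ch <= "9" for ch in digits):
        return int(value)
    return None

clean = to_int_or_none("2159")
broken = to_int_or_none("2¹59")
negative = to_int_or_none("-42")
empty = to_int_or_none("")

print(clean, broken, negative, empty)
```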
