Is SQL Server's double checking needed here? - sql-server

IF #insertedValue IS NOT NULL AND #insertedValue > 0
This logic is in a trigger.
The value comes from a deleted or inserted row (doesn't matter).
2 questions :
Do I need to check both conditions? (I want all value > 0, value in db can be nullable)
Does SQL Server check the expression in the order I wrote it ?

1) Actually, no, since if the #insertedValue is NULL, the expression #insertedValue > 0 will evaulate to false. (Actually, as Martin Smith points out in his comment, it will evaluate to a special value "unknown", which when forced to a Boolean result on its own collapses to false - examples: unknown AND true = unknown which is forced to false, unknown OR true = true.) But you're relying on comparison behaviour with NULL values. A single step equivalent method, BTW, would be:
IF ISNULL(#insertedValue, 0) > 0
IMHO, you're better sticking with the explicit NULL check for clarity if nothing else.
2) Since the query will be optimised before execution, there is absolutely no guarantee of order of execution or short circuiting of the AND operator.
Combining the two - if the double check is truly unnecessary, then it will probably be optimised out before execution anyway, but your SQL code will be more maintainable in my view if you make this explicit.

You can use COALESCE => Returns the first nonnull expression among its arguments.
Now you can make the query more flexible, by increasing the column limits and again you need to check the Greater Then Zero condition. Important point to note down here is you have the option to check values in multiple columns.
declare #val int
set #val = COALESCE( null, 1, 10 )
if(#val>0)
select 'fine'
else
select 'not fine'

Related

SQL Server CHOOSE() function behaving unexpectedly with RAND() function

I've encountered an interesting SQL server behaviour while trying to generate random values in T-sql using RAND and CHOOSE functions.
My goal was to try to return one of two given values using RAND() as rng. Pretty easy right?
For those of you who don't know it, CHOOSE function accepts in an index number(int) along with a collection of values and returns a value at specified index. Pretty straightforward.
At first attempt my SQL looked like this:
select choose(ceiling((rand()*2)) ,'a','b')
To my surprise, this expression returned one of three values: null, 'a' or 'b'. Since I didn't expect the null value i started digging. RAND() function returns a float in range from 0(included) to 1 (excluded). Since I'm multiplying it by 2, it should return values anywhere in range from 0(included) to 2 (excluded). Therefore after use of CEILING function final value should be one of: 0,1,2. After realising that i extended the value list by 'c' to check whether that'd be perhaps returned. I also checked the docs page of CEILING and learnt that:
Return values have the same type as numeric_expression.
I assumed the CEILINGfunction returned int, but in this case would mean that the value is implicitly cast to int before being used in CHOOSE, which sure enough is stated on the docs page:
If the provided index value has a numeric data type other than int,
then the value is implicitly converted to an integer.
Just in case I added an explicit cast. My SQL query looks like this now:
select choose(cast(ceiling((rand()*2)) as int) ,'a','b','c')
However, the result set didn't change. To check which values cause the problem I tried generating the value beforehand and selecting it alongside the CHOOSE result. It looked like this:
declare #int int = cast(ceiling((rand()*2)) as int)
select #int,choose( #int,'a','b','c')
Interestingly enough, now the result set changed to (1,a), (2,b) which was my original goal. After delving deeper in the CHOOSE docs page and some testing i learned that 'null' is returned in one of two cases:
Given index is a null
Given index is out of range
In this case that would mean that index value when generated inside the SELECT statement is either 0 or above 2/3 (I'm assuming that negative numbers are not possible here and CHOOSE function indexes from 1). As I've stated before 0 should be one of possibilities of:
ceiling((rand()*2))
,but for some reason it's never 0 (at least when i tried it 1 million+ times like this)
set nocount on
declare #test table(ceiling_rand int)
declare #counter int = 0
while #counter<1000000
begin
insert into #test
select ceiling((rand()*2))
set #counter=#counter+1
end
select distinct ceiling_rand from #test
Therefore I assume that the value generated in SELECT is greater than 2/3 or NULL. Why would it be like this only when generated in SELECT statement? Perhaps order of resolving CAST, CELING or RAND inside SELECT is different than it would seem? It's true I've only tried it a limited number of times, but at this point the chances of it being a statistical fluctuation are extremely small. Is it somehow a floating-point error? I truly am stumbled and looking forward to any explanation.
TL;DR: When generating a random number inside a SELECT statement result set of possible values is different then when it's generated before the SELECT statement.
Cheers,
NFSU
EDIT: Formatting
You can see what's going on if you look at the execution plan.
SET SHOWPLAN_TEXT ON
GO
SELECT (select choose(ceiling((rand()*2)) ,'a','b'))
Returns
|--Constant Scan(VALUES:((CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(1) THEN 'a' ELSE CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(2) THEN 'b' ELSE NULL END END)))
The CHOOSE is expanded out to
SELECT CASE
WHEN ceiling(( rand() * 2 )) = 1 THEN 'a'
ELSE
CASE
WHEN ceiling(( rand() * 2 )) = 2 THEN 'b'
ELSE NULL
END
END
and rand() is referenced twice. Each evaluation can return a different result.
You will get the same problem with the below rewrite being expanded out too
SELECT CASE ceiling(( rand() * 2 ))
WHEN 1 THEN 'a'
WHEN 2 THEN 'b'
END
Avoid CASE for this and any of its variants.
One method would be
SELECT JSON_VALUE ( '["a", "b"]' , CONCAT('$[', FLOOR(rand()*2) ,']') )

Is <> 0 condition will Consider Null & 0 as same in SQL

I need to compare variable in case condition in stored procedure.
case when #column <> 0 then ...
ELSE 0 END
but whenever am getting #column as NULL, then the above script returning as 0.
will sql consider both 0 & NULL as same.
Am using sql server 2014.
Thanks All
No. SQL considers NULL as "I have no idea". Comparing anything with "I have no idea" results in an answer of "I totally have no idea". Look:
- How high is John?
- I have no idea.
- What is two centimeters higher than John?
- I have no idea.
Even comparison between two NULL values is not true: if I have no idea how tall John is and if I also have no idea how tall Jack is, I can't conclude that John is equally tall as Jack (and I can't conclude that John is not equally tall as Jack). The only sensible answer is... "I have no idea".
The way to test for NULL is with IS operator, which specifically exists for this scenario (e.g. #column IS NULL, or #column IS NOT NULL).
So NULL is not equal to 0, nor is it NOT equal to 0. The result of NULL <> 0 is NULL. However, NULL is falsy where conditionals are concerned, so CASE thinks you should get the ELSE branch any time #column is NULL.
In case if you want to execute the then part of case if the column value is null, then modify your condition to check for nulls also
CASE WHEN (#column <> 0 OR #column IS NULL) then ...
ELSE 0 END

CASE Statement SQL: Priority in cases?

I have a general question for when you are using a CASE statement in SQL (Server 2008), and more than one of your WHEN conditions are true but the resulting flag is to be different.
This is hypothetical example but may be transferable when applying checks across multiple columns to classify data in rows. The output of the code below is dependant on how the cases are ordered, as both are true.
DECLARE #TESTSTRING varchar(5)
SET #TESTSTRING = 'hello'
SELECT CASE
WHEN #TESTSTRING = 'hello' THEN '0'
WHEN #TESTSTRING <> 'hi' THEN '1'
ELSE 'N/A'
END AS [Output]
In general, would it be considered bad practice to create flags in this way? Would a WHERE, OR statement be better?
Case statements are guaranteed to be evaluated in the order they are written. The first matching value is used. So, for your example, the value 0 would be returned.
This is clearly described in the documentation:
Searched CASE expression:
Evaluates, in the order specified, Boolean_expression for each WHEN clause.
Returns result_expression of the first Boolean_expression that evaluates to TRUE.
If no Boolean_expression evaluates to TRUE, the Database Engine returns the else_result_expression if an ELSE clause is specified, or
a NULL value if no ELSE clause is specified.
As for whether this is good or bad practice, I would lean on the side of neutrality. This is ANSI behavior so you can depend on it, and in some cases it is quite useful:
select (case when val < 10 then 'Less than 10'
when val < 100 then 'Between 10 and 100'
when val < 1000 then 'Between 100 and 1000'
else 'More than 1000' -- or NULL
end) as MyGroup
To conclude further - SQL will stop reading the rest of the of the case/when statement when one of the WHEN clauses is TRUE. Example:
SELECT
CASE
WHEN 3 = 3 THEN 3
WHEN 4 = 4 THEN 4
ELSE NULL
END AS test
This statement returns 3 since this is the first WHEN clause to return a TRUE, even though the following statement is also a TRUE.

Why does SUM(...) on an empty recordset return NULL instead of 0?

I understand why null + 1 or (1 + null) returns null: null means "unknown value", and if a value is unknown, its successor is unknown as well. The same is true for most other operations involving null.[*]
However, I don't understand why the following happens:
SELECT SUM(someNotNullableIntegerField) FROM someTable WHERE 1=0
This query returns null. Why? There are no unknown values involved here! The WHERE clause returns zero records, and the sum of an empty set of values is 0.[**] Note that the set is not unknown, it is known to be empty.
I know that I can work around this behaviour by using ISNULL or COALESCE, but I'm trying to understand why this behaviour, which appears counter-intuitive to me, was chosen.
Any insights as to why this makes sense?
[*] with some notable exceptions such as null OR true, where obviously true is the right result since the unknown value simply does not matter.
[**] just like the product of an empty set of values is 1. Mathematically speaking, if I were to extend $(Z, +)$ to $(Z union {null}, +)$, the obvious choice for the identity element would still be 0, not null, since x + 0 = x but x + null = null.
The ANSI-SQL-Standard defines the result of the SUM of an empty set as NULL. Why they did this, I cannot tell, but at least the behavior should be consistent across all database engines.
Reference: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt on page 126:
b) If AVG, MAX, MIN, or SUM is specified, then
Case:
i) If TXA is empty, then the result is the null value.
TXA is the operative resultset from the selected column.
When you mean empty table you mean a table with only NULL values, That's why we will get NULL as output for aggregate functions. You can consider this as by design for SQL Server.
Example 1
CREATE TABLE testSUMNulls
(
ID TINYINT
)
GO
INSERT INTO testSUMNulls (ID) VALUES (NULL),(NULL),(NULL),(NULL)
SELECT SUM(ID) FROM testSUMNulls
Example 2
CREATE TABLE testSumEmptyTable
(
ID TINYINT
)
GO
SELECT SUM(ID) Sums FROM testSumEmptyTable
In both the examples you will NULL as output..

Postgres NOT in array

I'm using Postgres' native array type, and trying to find the records where the ID is not in the array recipient IDs.
I can find where they are IN:
SELECT COUNT(*) FROM messages WHERE (3 = ANY (recipient_ids))
But this doesn't work:
SELECT COUNT(*) FROM messages WHERE (3 != ANY (recipient_ids))
SELECT COUNT(*) FROM messages WHERE (3 = NOT ANY (recipient_ids))
What's the right way to test for this condition?
SELECT COUNT(*) FROM "messages" WHERE NOT (3 = ANY (recipient_ids))
You can always negate WHERE (condition) with WHERE NOT (condition)
You could turn it around a bit and say "3 is not equal to all the IDs":
where 3 != all (recipient_ids)
From the fine manual:
9.21.4. ALL (array)
expression operator ALL (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ALL is "true" if all comparisons yield true (including the case where the array has zero elements). The result is "false" if any false result is found.
Beware of NULLs
Both ALL:
(some_value != ALL(some_array))
And ANY:
NOT (some_value = ANY(some_array))
Would work as long as some_array is not null. If the array might be null, then you must account for it with coalesce(), e.g.
(some_value != ALL(coalesce(some_array, array[]::int[])))
Or
NOT (some_value = ANY(coalesce(some_array, array[]::int[])))
From the docs:
If the array expression yields a null array, the result of ANY will be null
If the array expression yields a null array, the result of ALL will be null
Augmenting the ALL/ANY Answers
I prefer all solutions that use all or any to achieve the result, appreciating the additional notes (e.g. about NULLs). As another augementation, here is a way to think about those operators.
You can think about them as short-circuit operators:
all(array) goes through all the values in the array, comparing each to the reference value using the provided operator. As soon as a comparison yields false, the process ends with false, otherwise true. (Comparable to short-circuit logical and.)
any(array) goes through all the values in the array, comparing each to the reference value using the provided operator. As soon as a comparison yields true, the process ends with true, otherwise false. (Comparable to short-circuit logical or.)
This is why 3 <> any('{1,2,3}') does not yield the desired result: The process compares 3 with 1 for inequality, which is true, and immediately returns true. A single value in the array different from 3 is enough to make the entire condition true. The 3 in the last array position is prob. never used.
3 <> all('{1,2,3}') on the other hand makes sure all values are not equal 3. It will run through all comparisons that yield true up to an element that yields false (the last in this case), to return false as the overall result. This is what the OP wants.
an update:
as of postgres 9.3,
you can use NOT in tandem with the #> (contains operator) to achieve this as well.
IE.
SELECT COUNT(*) FROM "messages" WHERE NOT recipient_ids #> ARRAY[3];
not (3 = any(recipient_ids))?
Note that the ANY/ALL operators will not work with array indexes. If indexes are in mind:
SELECT COUNT(*) FROM "messages" WHERE 3 && recipient_ids
and the negative:
SELECT COUNT(*) FROM "messages" WHERE NOT (3 && recipient_ids)
An index can then be created like:
CREATE INDEX recipient_ids_idx on tableName USING GIN(recipient_ids)
Use the following query
select id from Example where NOT (id = ANY ('{1, 2}'))

Resources