Math operations with NULL in tsql - sql-server

Why
select 1 - NULL
returns NULL instead of 1?
That is not what I expected.

Null: A value of NULL indicates that the value is unknown. A value of NULL is different from an empty or zero value. No two null values are equal. Comparisons between two null values, or between a NULL and any other value, return unknown because the value of each NULL is unknown.
If you perform any arithmetic operation with NULL, the whole expression evaluates to NULL. To handle NULL, use the ISNULL() or COALESCE() function, like this:
select 1 - isnull(NULL,0) as result

Simply because NULL is not 0.
If it helps, consider NULL as a synonym for "unknown", and then it'll make perfect sense - the result of 1 minus an unknown number can only give an unknown result.
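The rule above can be checked with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server, and it uses COALESCE rather than ISNULL because SQLite has no two-argument ISNULL function; for these two expressions the NULL handling should be the same.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")

# Arithmetic with NULL yields NULL, which Python receives as None.
plain = conn.execute("SELECT 1 - NULL").fetchone()[0]

# COALESCE replaces the NULL with 0 before the subtraction runs.
handled = conn.execute("SELECT 1 - COALESCE(NULL, 0)").fetchone()[0]

print(plain, handled)  # None 1
conn.close()
```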

Here are some references on how NULL behaves differently:
NULL can be thought of as UNKNOWN (docs).
Arithmetic operations with NULL result in NULL (wiki).
SUM() operation ignores NULL instead of returning UNKNOWN (docs).
String concatenation with NULL results in NULL (wiki and comment by @Zohar).
Boolean comparisons with NULL use three-value logic (wiki).
Where clauses with NULL should use IS, because boolean comparisons result in UNKNOWN not TRUE (docs).
To determine whether an expression is NULL, use IS NULL or IS NOT NULL instead of comparison operators (such as = or !=). Comparison operators return UNKNOWN when either or both arguments are NULL.
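As a rough illustration of the last two points, the sketch below compares NULL with = and with IS NULL. It assumes SQLite via Python's sqlite3 module as a stand-in engine; SQLite reports the UNKNOWN result of a comparison as NULL.

```python
import sqlite3

# Assumption: SQLite is used only to illustrate three-valued logic.
conn = sqlite3.connect(":memory:")

# A comparison operator with a NULL operand is UNKNOWN (None in Python).
equals = conn.execute("SELECT NULL = NULL").fetchone()[0]
not_equals = conn.execute("SELECT NULL <> NULL").fetchone()[0]

# IS NULL is a two-valued test and returns a definite answer (1 = true).
is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

print(equals, not_equals, is_null)  # None None 1
conn.close()
```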

Related

SQL SERVER and NULL values on Equal operator

I have the following condition in a WHERE clause: COLUMN_1 <> 'O'
But the rows that contain NULL in COLUMN_1 are not taken into consideration; it is as if they contained the value 'O'.
Why is that?
Thanks in advance.
Because NULL is neither equal nor unequal to anything, including NULL. NULL <> 'O' evaluates to UNKNOWN, which is specifically not TRUE.
If you want to evaluate with NULL values you need to use IS NULL:
WHERE (COLUMN_1 <> 'O' OR COLUMN_1 IS NULL)
This is also documented on both not-equal operator articles:
Not Equal To (Transact SQL) - traditional
Compares two expressions (a comparison operator). When you compare nonnull expressions, the result is TRUE if the left operand is not equal to the right operand; otherwise, the result is FALSE. If either or both operands are NULL, see the topic SET ANSI_NULLS (Transact-SQL).
Not Equal To (Transact SQL) - exclamation
Tests whether one expression is not equal to another expression (a comparison operator). If either or both operands are NULL, NULL is returned. Functions the same as the <> (Not Equal To) comparison operator.
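To see the effect on a WHERE clause, here is a minimal sketch. The table name t and its three rows are made up for illustration, and SQLite (through Python's sqlite3 module) is assumed as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with one 'O' row, one other value, and one NULL.
conn.execute("CREATE TABLE t (COLUMN_1 TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("O",), ("X",), (None,)])

# The NULL row is dropped because NULL <> 'O' is UNKNOWN, not TRUE.
naive = conn.execute(
    "SELECT COUNT(*) FROM t WHERE COLUMN_1 <> 'O'"
).fetchone()[0]

# Adding IS NULL explicitly brings the NULL row back.
with_nulls = conn.execute(
    "SELECT COUNT(*) FROM t WHERE (COLUMN_1 <> 'O' OR COLUMN_1 IS NULL)"
).fetchone()[0]

print(naive, with_nulls)  # 1 2
conn.close()
```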

Syntax error when assigning column value using CASE WHEN in Computed Column Formula

I'm trying to use the following CASE WHEN in a computed column definition, but it shows a syntax error.
[Password_Last_Changed] [datetime] AS
SELECT CASE
WHEN ([SUA_History1_Date] IS NOT NULL) then [SUA_History1_Date]
WHEN ([SUA_History1_Date] IS NULL) then [SUA_History2_Date]
WHEN ([SUA_History2_Date] IS NULL) then [SUA_History3_Date]
WHEN ([SUA_History3_Date] IS NULL) then [SUA_History4_Date]
WHEN ([SUA_History4_Date] IS NULL) then [SUA_History5_Date]
ELSE NULL
END
Not sure what went wrong. If there is a better approach for this logic, please let me know.
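The syntax error comes from the definition itself: a computed column takes only an expression after AS, so drop both the [datetime] data type and the SELECT keyword.
COALESCE is a better option than ISNULL or CASE..WHEN for this problem because it accepts any number of arguments and returns the first one that is not NULL. Be aware that the input values for the COALESCE expression can be evaluated multiple times.
You can also use NULLIF to check the conditions in those NULL valued columns.
An untyped NULL passed to ISNULL is converted to the int data type, whereas for COALESCE you must provide a data type. ISNULL takes only 2 parameters, whereas COALESCE takes a variable number of parameters.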
You can use COALESCE:
COALESCE([SUA_History1_Date],[SUA_History2_Date],[SUA_History3_Date],[SUA_History4_Date],[SUA_History5_Date])
Evaluates the arguments in order and returns the current value of the first expression that initially does not evaluate to NULL.
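The fall-through behaviour of COALESCE can be sketched as follows. The table name accounts and the sample dates are hypothetical, only three of the five history columns are used to keep it short, and SQLite (via Python's sqlite3 module) is assumed in place of SQL Server, so this shows the expression rather than the computed column itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; column names are borrowed from the question.
conn.execute(
    "CREATE TABLE accounts ("
    "SUA_History1_Date TEXT, SUA_History2_Date TEXT, SUA_History3_Date TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (None, "2020-02-01", "2019-01-01"),  # first non-NULL is History2
        (None, None, None),                  # nothing to fall back on
    ],
)

# COALESCE returns the first argument that is not NULL, or NULL if all are.
rows = conn.execute(
    "SELECT COALESCE(SUA_History1_Date, SUA_History2_Date, SUA_History3_Date) "
    "FROM accounts ORDER BY rowid"
).fetchall()

print(rows)  # [('2020-02-01',), (None,)]
conn.close()
```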

About Empty/Blank database field in SQL Server

I am quite confused about empty, NULL and NOT NULL database field values.
I have a table where several fields contain empty/blank data, some contain NULL data, and some contain actual data.
I am confused about the difference between empty/blank and NULL data.
Please help me.
Thanks
Ravik
Empty or blank usually refers to an empty string value, length = 0 characters. Null means no value, not even an empty one.
NULL is a database indicator that specifies that the value in the database is missing or indeterminate. Empty strings are zero-length varchar strings.
They may or may not be the same when you select values. SQL Server Management Studio, for instance, exports NULL values as "NULL" instead of a blank string.
NULLs follow some very basic comparison rules that many find counterintuitive. Any comparison operation involving NULL (except IS NULL) evaluates to UNKNOWN, which a WHERE clause treats the same as false. In particular, neither of the following two conditions is ever satisfied:
where NULL = NULL
where NULL <> NULL
This applies to columns that have a NULL value as well, so neither of these is satisfied when val contains a NULL value:
where val = val
where val <> val
Blank/empty string is just a value for a variable length string that has a length of 0, usually represented as ''. The following does return true:
where '' = ''
where val = val -- given that val is ''
The one complication is that some databases treat NULL values as zero-length strings. Oracle comes to mind. However, this is not ANSI-compliant behavior.
NULL is different from an empty value in that with NULL, you don't know for sure if the value is empty or not. This has real measurable impacts in your database. Here are some examples:
The LEN() of an empty varchar field is 0. The LEN() of a NULL field is still NULL... you don't know how large the field is.
Two empty fields compared to themselves ('' = '') are true. You don't know if two NULL fields are equal or not, so (NULL = NULL) results in NULL.
The COALESCE('', ... ) of an empty field results in the empty field. The COALESCE(NULL, ... ) of a NULL field falls through to the next value.
There are many more examples like this. NULL does not mean empty. NULL means "I don't know."
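The three examples above can be tried in one query. This sketch assumes SQLite via Python's sqlite3 module, where the length function is spelled LENGTH() rather than SQL Server's LEN(); the 'fallback' literal is just an illustrative second argument.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server (LENGTH instead of LEN).
conn = sqlite3.connect(":memory:")

row = conn.execute(
    "SELECT LENGTH(''), LENGTH(NULL), "       # 0 versus NULL
    "'' = '', NULL = NULL, "                  # true (1) versus NULL
    "COALESCE('', 'fallback'), "              # the empty string is kept
    "COALESCE(NULL, 'fallback')"              # NULL falls through
).fetchone()

print(row)  # (0, None, 1, None, '', 'fallback')
conn.close()
```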

Why does SUM(...) on an empty recordset return NULL instead of 0?

I understand why null + 1 or (1 + null) returns null: null means "unknown value", and if a value is unknown, its successor is unknown as well. The same is true for most other operations involving null.[*]
However, I don't understand why the following happens:
SELECT SUM(someNotNullableIntegerField) FROM someTable WHERE 1=0
This query returns null. Why? There are no unknown values involved here! The WHERE clause returns zero records, and the sum of an empty set of values is 0.[**] Note that the set is not unknown, it is known to be empty.
I know that I can work around this behaviour by using ISNULL or COALESCE, but I'm trying to understand why this behaviour, which appears counter-intuitive to me, was chosen.
Any insights as to why this makes sense?
[*] with some notable exceptions such as null OR true, where obviously true is the right result since the unknown value simply does not matter.
[**] just like the product of an empty set of values is 1. Mathematically speaking, if I were to extend $(\mathbb{Z}, +)$ to $(\mathbb{Z} \cup \{\text{null}\}, +)$, the obvious choice for the identity element would still be 0, not null, since x + 0 = x but x + null = null.
The ANSI-SQL-Standard defines the result of the SUM of an empty set as NULL. Why they did this, I cannot tell, but at least the behavior should be consistent across all database engines.
Reference: http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt on page 126:
b) If AVG, MAX, MIN, or SUM is specified, then
Case:
i) If TXA is empty, then the result is the null value.
TXA is the operative resultset from the selected column.
If by an empty table you mean a table that contains only NULL values, aggregate functions return NULL for it, just as they do for a table with no rows at all. You can consider this behavior to be by design in SQL Server.
Example 1
CREATE TABLE testSUMNulls
(
ID TINYINT
)
GO
INSERT INTO testSUMNulls (ID) VALUES (NULL),(NULL),(NULL),(NULL)
SELECT SUM(ID) FROM testSUMNulls
Example 2
CREATE TABLE testSumEmptyTable
(
ID TINYINT
)
GO
SELECT SUM(ID) Sums FROM testSumEmptyTable
In both examples you will get NULL as output.
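For comparison, here is a small sketch of the query from the question together with the COALESCE workaround. It assumes SQLite via Python's sqlite3 module rather than SQL Server; the table and column names are the placeholders used in the question, and the single inserted row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Placeholder names taken from the question; the row value is arbitrary.
conn.execute(
    "CREATE TABLE someTable (someNotNullableIntegerField INTEGER NOT NULL)"
)
conn.execute("INSERT INTO someTable VALUES (5)")

# SUM over zero rows is NULL, even though the column is NOT NULL.
empty_sum = conn.execute(
    "SELECT SUM(someNotNullableIntegerField) FROM someTable WHERE 1=0"
).fetchone()[0]

# Wrapping the aggregate in COALESCE turns that NULL into 0.
safe_sum = conn.execute(
    "SELECT COALESCE(SUM(someNotNullableIntegerField), 0) "
    "FROM someTable WHERE 1=0"
).fetchone()[0]

# COUNT is the exception: it returns 0 for an empty set.
empty_count = conn.execute(
    "SELECT COUNT(*) FROM someTable WHERE 1=0"
).fetchone()[0]

print(empty_sum, safe_sum, empty_count)  # None 0 0
conn.close()
```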

Postgres NOT in array

I'm using Postgres' native array type, and trying to find the records where the ID is not in the array recipient IDs.
I can find where they are IN:
SELECT COUNT(*) FROM messages WHERE (3 = ANY (recipient_ids))
But this doesn't work:
SELECT COUNT(*) FROM messages WHERE (3 != ANY (recipient_ids))
SELECT COUNT(*) FROM messages WHERE (3 = NOT ANY (recipient_ids))
What's the right way to test for this condition?
SELECT COUNT(*) FROM "messages" WHERE NOT (3 = ANY (recipient_ids))
You can always negate WHERE (condition) with WHERE NOT (condition)
You could turn it around a bit and say "3 is not equal to all the IDs":
where 3 != all (recipient_ids)
From the fine manual:
9.21.4. ALL (array)
expression operator ALL (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ALL is "true" if all comparisons yield true (including the case where the array has zero elements). The result is "false" if any false result is found.
Beware of NULLs
Both ALL:
(some_value != ALL(some_array))
And ANY:
NOT (some_value = ANY(some_array))
Would work as long as some_array is not null. If the array might be null, then you must account for it with coalesce(), e.g.
(some_value != ALL(coalesce(some_array, array[]::int[])))
Or
NOT (some_value = ANY(coalesce(some_array, array[]::int[])))
From the docs:
If the array expression yields a null array, the result of ANY will be null
If the array expression yields a null array, the result of ALL will be null
Augmenting the ALL/ANY Answers
I prefer the solutions that use all or any to achieve the result, and I appreciate the additional notes (e.g. about NULLs). As another augmentation, here is a way to think about those operators.
You can think about them as short-circuit operators:
all(array) goes through all the values in the array, comparing each to the reference value using the provided operator. As soon as a comparison yields false, the process ends with false, otherwise true. (Comparable to short-circuit logical and.)
any(array) goes through all the values in the array, comparing each to the reference value using the provided operator. As soon as a comparison yields true, the process ends with true, otherwise false. (Comparable to short-circuit logical or.)
This is why 3 <> any('{1,2,3}') does not yield the desired result: The process compares 3 with 1 for inequality, which is true, and immediately returns true. A single value in the array different from 3 is enough to make the entire condition true. The 3 in the last array position is probably never examined.
3 <> all('{1,2,3}') on the other hand makes sure that all values are not equal to 3. It will run through all comparisons that yield true up to an element that yields false (the last in this case), to return false as the overall result. This is what the OP wants.
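The short-circuit reading of the two operators can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not how Postgres implements them: the function names ne_any and ne_all are made up, a NULL array is represented by None, and NULL elements inside the array are not modelled.

```python
from typing import Optional, Sequence


def ne_any(value: int, array: Optional[Sequence[int]]) -> Optional[bool]:
    """Model of `value <> ANY(array)`; hypothetical helper."""
    if array is None:
        return None  # a NULL array makes the whole result NULL
    # any() stops at the first element that differs from value.
    return any(value != element for element in array)


def ne_all(value: int, array: Optional[Sequence[int]]) -> Optional[bool]:
    """Model of `value <> ALL(array)`; hypothetical helper."""
    if array is None:
        return None  # a NULL array makes the whole result NULL
    # all() stops at the first element equal to value; empty arrays give True.
    return all(value != element for element in array)


print(ne_any(3, [1, 2, 3]))  # True: 3 <> 1 already decides it
print(ne_all(3, [1, 2, 3]))  # False: the last element equals 3
print(ne_all(3, []))         # True: no comparison can fail
print(ne_all(3, None))       # None: result is NULL for a NULL array
```

The None case is why the coalesce(some_array, array[]::int[]) guard matters: without it, a NULL array yields NULL and the row is filtered out.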
An update: as of Postgres 9.3, you can use NOT in tandem with the @> (contains) operator to achieve this as well, i.e.
SELECT COUNT(*) FROM "messages" WHERE NOT recipient_ids @> ARRAY[3];
not (3 = any(recipient_ids))?
Note that the ANY/ALL operators will not work with array indexes. If you want an index to be used, use the overlap operator (&&), which takes an array on both sides:
SELECT COUNT(*) FROM "messages" WHERE ARRAY[3] && recipient_ids
and the negative:
SELECT COUNT(*) FROM "messages" WHERE NOT (ARRAY[3] && recipient_ids)
An index can then be created like:
CREATE INDEX recipient_ids_idx on tableName USING GIN(recipient_ids)
Use the following query
select id from Example where NOT (id = ANY ('{1, 2}'))

Resources