This might be a very basic question, but I just came across it while writing a query.
Why can't SQL Server convert a check for NULL to BIT? I was thinking about something like this:
DECLARE @someVariable INT = NULL;
-- Do something
SELECT CONVERT(BIT, (@someVariable IS NULL))
The expected outcome would then be either 1 or 0.
Use case:
SELECT CONVERT(BIT, (CASE WHEN @someVariable IS NULL THEN 1 ELSE 0 END))
Or use IIF (a little more readable than CASE):
CONVERT(BIT, IIF(@x IS NULL, 1, 0))
Not a direct cast:
select cast(isnull(@someVariable, 1) as bit)
In SQL (the language), NULLs are not considered data values. They represent a missing/unknown state. Quoting from Wikipedia's article on SQL NULL:
SQL null is a state (unknown) and not a value. This usage is quite different from most programming languages, where null means not assigned to a particular instance.
This means that any comparison against that UNKNOWN value can only be UNKNOWN itself. Even comparing two NULLs can't return true: if both values are unknown, how can we say that they are equal or not?
IS NULL and IS NOT NULL are predicates that can be used in conditional expressions. That means that they don't return a value themselves. Therefore, they can't be "cast" to a bit, or treated as a boolean.
Basic SQL comparison operators always return Unknown when comparing anything with Null, so the SQL standard provides for two special Null-specific comparison predicates. The IS NULL and IS NOT NULL predicates (which use a postfix syntax) test whether data is, or is not, Null.
Any other way of treating nulls is a vendor-specific extension.
Finally, BIT is not a boolean type, it's just a single-bit number. An optional BOOLEAN type was introduced in SQL:1999, but only PostgreSQL implements it correctly, i.e. having TRUE, FALSE or UNKNOWN values.
Without a BOOLEAN type you can't really calculate the result of a conditional expression like A AND B or x IS NULL. You can only use functions like NULLIF or COALESCE to replace the NULL value with something else.
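For instance, a minimal sketch of that replacement approach, reusing the variable from the question (the column alias is just illustrative):

DECLARE @someVariable INT = NULL;

-- COALESCE swaps the NULL for a concrete value (0 here) before any conversion happens,
-- so the CONVERT works on an actual value rather than on a predicate
SELECT CONVERT(BIT, COALESCE(@someVariable, 0)) AS BitValue;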
Why not? What is a good replacement for this functionality which was removed in newer versions of SQL server?
I use the following pattern all over the place:
select * from ContactReport
order by
case ContactDate when null then 'a' else 'b' end, ContactMethod
I could change it to this, but it would be slower:
select * from ContactReport
order by
case ContactDate when null then 'a'+ContactMethod else 'b'+ContactMethod end
In some instances I have very large tables and this would be a problem. I could add some dummy columns, e.g. Dummy1, Dummy2, Dummy3, with value 1 to every table where I need this, just so I can order by it...
Any better ideas?
You have misunderstood what the error is telling you. It is perfectly fine to have constants in branches of a CASE expression; it's just that the whole expression is not allowed to resolve to a constant.
case ContactDate when null then 'a' else 'b' end
is incorrect. With ANSI defaults enabled it will always evaluate to 'b', because nothing is equal to NULL. SQL Server can tell this at compile time and constant-fold it to 'b'. It is meaningless to order by a value that is the same in all rows, hence the error.
This error is not new; it occurs at least as far back as SQL Server 2008. That is a good thing, because otherwise you would just silently get wrong results, without the rows with NULL dates ordered first as desired.
Your expression does work with ANSI_NULLS off (and compiles OK in the ORDER BY clause) but I don't recommend using that.
Use the following expression instead.
case when ContactDate is null then 'a' else 'b' end
Or, this can be shortened to
IIF(ContactDate is null, 'a', 'b')
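Applied to the query in the question, that gives something like:

select * from ContactReport
order by
    case when ContactDate is null then 'a' else 'b' end, ContactMethod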
I am encountering this strange behavior (okay, maybe not strange, but beyond my understanding) when using ISNULL. It all sums up to this:
isnull(left(cast(null as varchar),1),0) gives 0
isnull(left(cast(null as varchar),1),-1) gives *
I would like to know the reason behind this behavior.
Although I got a workaround here:
select isnull(cast(left(cast(null as varchar),1) as varchar),-1)
The reason for the asterisk is due to an overflow error. left(cast(null as varchar),1) would return a varchar(1). Within the ISNULL the value -1 would be implicitly converted to a varchar(1), and a negative number cannot be represented with a single character, hence why an '*' is displayed.
If you change it to LEFT(..., 2) then you get a result:
SELECT ISNULL(LEFT(CAST(NULL AS varchar),2),-1);
On a different note, see Bad habits to kick: declaring VARCHAR without (length).
You could use COALESCE:
SELECT COALESCE(left(cast(null as varchar),1),-1);
DBFiddle Demo
ISNULL infers the data type from its first argument; COALESCE infers it from the argument with the higher data type precedence.
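A side-by-side sketch of that difference, using the expression from the question (the column aliases are mine):

-- ISNULL: return type is varchar(1), taken from the first argument,
-- so -1 cannot be represented in one character and shows as '*'
SELECT ISNULL(LEFT(CAST(NULL AS varchar), 1), -1) AS isnull_result;

-- COALESCE: int has higher precedence, so the varchar side is converted to int
-- and the result is simply -1
SELECT COALESCE(LEFT(CAST(NULL AS varchar), 1), -1) AS coalesce_result;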
In Access you can choose the column data type "Yes/No", and while entering data it offers you the options "Yes" and "No". I don't see anything like that in SQL Server Management Studio. I've searched around and seen that "bit" is the type, but when I create a bit column and go to add data it appears as a normal column where I type a value. Should I just enter either 0 or 1 myself?
Also, is 0 true or false?
SQL Server doesn't have a boolean data type. The closest approximation is the bit. But that is a numeric type, not a boolean type. In addition, it only supports 2 values - 0 or 1 (and one non-value, NULL).
However, SQL (standard SQL, as well as the T-SQL dialect) describes three-valued logic: TRUE, FALSE and UNKNOWN. So bit isn't actually the best fit if you need all 3 states.
When using it, you cannot use the value directly in an IF statement, for example:
IF CONVERT(bit, 0)
BEGIN
print 'Ok'
END
would not parse and would end in an error. So you would need to write it as below:
IF CONVERT(bit, 0) = 0
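So the full, working form of the earlier snippet would be:

IF CONVERT(bit, 0) = 0
BEGIN
    print 'Ok'
END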
In MS SQL bit is equivalent to a boolean.
https://msdn.microsoft.com/en-us/library/ms177603.aspx
Here you can read more about it.
1 would be the equivalent of Yes
0 would be the equivalent of No
NULL would be the equivalent of Undefined (if that exists in Access)
In SQL Server the equivalent of a boolean datatype is bit. A bit can take the values 0 (false) or 1 (true). If you want to set a default value for your bit field when creating the table, you can write:
...
myBoolean bit DEFAULT 1,
..
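For example, a minimal table definition along those lines might look like this (the table and column names are just illustrative):

CREATE TABLE dbo.Contact
(
    ContactId int IDENTITY(1,1) PRIMARY KEY,
    IsActive  bit NOT NULL DEFAULT 1   -- 1 = yes/true, 0 = no/false
);

-- A row inserted without specifying IsActive gets the default of 1
INSERT INTO dbo.Contact DEFAULT VALUES;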
Why do built-in boolean functions behave differently on NULL input?
For example - this query:
select 'ISDATE(null)' as function_call,
ISDATE(null) as result union all
select 'ISNUMERIC(null)' as function_call,
ISNUMERIC(null) as result union all
select 'IS_MEMBER(null)' as function_call,
IS_MEMBER(null) as result union all
select 'IS_SRVROLEMEMBER(null, null)' as function_call,
IS_SRVROLEMEMBER(null, null) as result
gives us:
function_call result
---------------------------- -----------
ISDATE(null) 0
ISNUMERIC(null) 0
IS_MEMBER(null) NULL
IS_SRVROLEMEMBER(null, null) NULL
So it seems that ISDATE and ISNUMERIC behave according to boolean logic, but IS_MEMBER and IS_SRVROLEMEMBER behave according to three-valued logic.
Shouldn't all boolean functions behave the same on NULL input? What does the ANSI SQL standard say about that?
Thanks
Re ANSI standards, the latter two functions have nothing to do with ANSI SQL; they're MSSQL-specific security functions. Which is not to suggest they don't have analogues in other DBMSs, just that they're not typical scalar functions and not part of the standard.
In fact, a search on this reasonably authoritative O'Reilly page about ANSI standard functions for the term "Boolean" returns no results. One may infer from this that there is no ANSI approach to such scalar functions' handling of NULLs.
Three-valued logic is required in those functions to allow NULL to signify that an input is not valid; e.g. refer to the Remarks section of MSDN's IS_MEMBER() documentation.
(This form of NULL return is not to be confused with, e.g., aggregate functions that may return a value, or NULL if one of their inputs was NULL.)
There's nothing stopping you "wrapping" those functions to behave like the others, if that's what you really need, e.g. ISNULL(IS_MEMBER(someValueFromATable), 0).
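A sketch of that wrapping (the built-in database role name is just an example argument):

-- NULL from IS_MEMBER (invalid input or not determinable) is coerced to 0, i.e. "not a member"
SELECT ISNULL(IS_MEMBER('db_datareader'), 0) AS IsReader;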
The former two functions return a meaningful Boolean value, as you've found.
ISDATE(null), for example, returns false because null is not a "valid date, time, or datetime value" (MSDN, my emphasis on value).
In the case where NULL is interpreted to mean "unknown", it would be semantically meaningful for ISDATE() etc. to return "unknown" when the input is unknown, but it is not programmatically practical; the need to "convert" the result (from all these boolean functions) from three-state to boolean logic is completely redundant when we already have a separate, type-non-specific test in ISNULL().
Comparing the two types of functions you've identified: the return for the former ones shouldn't be NULL in this case, because while NULL isn't a date, it is still certainly a valid piece of data that can be properly examined by those functions.
I don't find it odd that these security functions (IS_SRVROLEMEMBER) would behave differently from system/datatype functions (ISNUMERIC) since these security functions are essentially queries and the results can change depending upon who is querying. The meanings of the return values, including nulls, are spelled out very well for all of these in the MSDN Documentation.
More concretely, for the arguments to ISNUMERIC and ISDATE, you can easily test ahead of time whether the argument is null or not, so I'm not sure that returning a null is necessary, or has much practicality.
For the arguments to the security functions, you may have non-null arguments, but the functions have been built in a helpful manner to return null in cases where the arguments aren't valid, not found, or you don't have the permissions to know the answer.
Much of this can certainly be seen as subjective, thus I have voted to close this question, however interesting it may be.
I've just come across an interesting scenario on how NULL is handled in T-SQL (and possibly other forms of SQL). The issue is pretty well described and answered by this question, and I've illustrated it below:
-- SET ANSI_NULLS ON -- Toggle this between ON/OFF to see how it changes behaviour
DECLARE @VAR1 DATETIME
DECLARE @VAR2 DATETIME
SET @VAR1 = (SELECT CURRENT_TIMESTAMP)
SET @VAR2 = (SELECT NULL)
-- This will return 1 when ansi_nulls is off and nothing when ansi_nulls is on
SELECT 1 WHERE @VAR1 != @VAR2
DECLARE @TstTable TABLE (
    COL1 DATETIME,
    COL2 DATETIME)
INSERT INTO @TstTable
SELECT @VAR1, @VAR1
UNION
SELECT @VAR1, NULL
-- This won't ever return a value irrespective of the ansi_nulls setting
SELECT * FROM @TstTable WHERE COL1 != COL2
This situation led me to question my understanding of null representations, specifically within SQL. I've always understood null to mean that it has no value. This seems to be an incorrect assumption given the first paragraph of this page. It states (my emphasis... I could quite easily just highlight the whole paragraph though):
A value of NULL indicates the value is unknown. A value of NULL is
different from an empty or zero value. No two null values are equal.
Comparisons between two null values, or between a NULL and any other
value, return unknown because the value of each NULL is unknown.
Does this hold true for T-SQL variable comparisons also? It certainly does for my SELECT 1 WHERE @VAR1 != @VAR2 example above, but I don't understand why NULL in this instance is considered "UNKNOWN" and not empty/uninitialised/nothing etc. I know SET ANSI_NULLS OFF changes how this works, but that setting is deprecated and will be removed in a future version.
Can someone offer a good explanation as to why NULL in T-SQL refers to an unknown value rather than an uninitialised value? If so, can you extend your answer to show why T-SQL variables with a NULL value are also considered to be unknown?
In SQL, we're interested in storing facts in tables (a.k.a relations).
What Codd asked for was:
Rule 3: Systematic treatment of null values:
The DBMS must allow each field to remain null (or empty). Specifically, it must support a representation of "missing information and inapplicable information" that is systematic, distinct from all regular values (for example, "distinct from zero or any other number", in the case of numeric values), and independent of data type. It is also implied that such representations must be manipulated by the DBMS in a systematic way.
What we've ended up with is three-valued logic (as @zmbq stated). Why is it this way?
We have two items that we're trying to compare for equality. Are they equal? Well, it turns out that we don't (yet) know what item 1 is, and we don't (yet) know what item 2 is (both are NULL). They might be equal. They might be unequal. It would be equally wrong to answer the equality comparison with either TRUE or FALSE. So we answer UNKNOWN.
In other languages, null is usually used with pointers (or references in languages without pointers, though notably not C++ references, which cannot be null) to indicate that the pointer does not, at this time, point to anything.
Welcome to Three Valued Logic, where everything can be true, false or unknown.
The value of null == null is not true, and it's not false; it's unknown...
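You can see this directly (a small sketch, with ANSI_NULLS at its default ON setting):

DECLARE @a INT = NULL, @b INT = NULL

SELECT 'equal'   WHERE @a = @b       -- no row: the comparison evaluates to UNKNOWN, not TRUE
SELECT 'unequal' WHERE @a <> @b      -- no row: also UNKNOWN
SELECT 'is null' WHERE @a IS NULL    -- one row: the IS NULL predicate evaluates to TRUE here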
but I don't understand why NULL in this instance is considered "UNKNOWN" and not
empty/uninitialised/nothing
What is there not to understand? It is like that BECAUSE IT WAS DEFINED LIKE THAT. Someone had the idea it should be like that, and it was put into the standard.
Yes, this is a little recursive, but quite often design decisions run like that.
This has more to do with arithmetic. Adding up 20 values where one of them is NULL gives NULL: how else would you treat it than as unknown? C# etc. react with an exception, but that gets in your way when doing statistical analysis. Unknown values have to turn everything they come into contact with into unknown, and no unknown is ever the same as another.
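A minimal illustration of that propagation (a sketch):

SELECT 10 + 20 + NULL AS total               -- NULL: arithmetic with an unknown operand is unknown
SELECT 10 + 20 + COALESCE(NULL, 0) AS total  -- 30: replace the unknown value first, as suggested earlier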