SQL with unknown purpose

I have to change some SQL queries (SQL Server 2005) done by another person and in that code I see often the following construction:
SELECT fieldA, SUM(CASE fieldB WHEN null THEN 0 ELSE fieldB END) as AliasName FROM ...
I don't understand the CASE expression: as far as I know, NULL cannot be matched by a simple CASE (the comparison fieldB = NULL is never true), so I think the code above does the same as:
SELECT fieldA, SUM(fieldB) as AliasName FROM ...
I have also run some tests and have not seen any differences in the results. Am I missing something, or can I replace the first statement with the shorter one?
UPDATE
Only for completeness, because it's not mentioned in the answers: the first statement returns the same result as the second. The CASE construction used does not replace NULLs with zeros and can therefore be omitted. If the purpose of the original SQL was to make sure that NULL is never returned, COALESCE or the ISNULL function can be used (as stated in the answers).
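To make the point about the simple CASE concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; the table t and its rows are made up for illustration), since the CASE and SUM rules involved are standard SQL.

```python
import sqlite3

# SQLite stands in for SQL Server here; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (fieldA TEXT, fieldB INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", None), ("a", 2)],
)

# A simple CASE compares with '=', and 'fieldB = NULL' is never true,
# so the WHEN NULL branch never fires and ELSE returns fieldB unchanged.
simple_case = conn.execute(
    "SELECT CASE fieldB WHEN NULL THEN 0 ELSE fieldB END FROM t ORDER BY rowid"
).fetchall()

# A searched CASE with IS NULL is the form that actually detects NULL.
searched_case = conn.execute(
    "SELECT CASE WHEN fieldB IS NULL THEN 0 ELSE fieldB END FROM t ORDER BY rowid"
).fetchall()

# SUM skips NULLs anyway, so both aggregates agree for this group.
sums = conn.execute(
    "SELECT SUM(CASE fieldB WHEN NULL THEN 0 ELSE fieldB END), SUM(fieldB) FROM t"
).fetchone()

print(simple_case)    # the NULL row stays NULL
print(searched_case)  # the NULL row becomes 0
print(sums)
```

The simple CASE leaves the NULL row untouched, which is why removing it does not change the sums.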

The output of your second statement will contain nulls (when aggregating records that only have null values for fieldB). If you don't mind that, you're ok.
If you want zeros in your output rather than null values, use this:
select fieldA, sum(isnull(fieldB, 0)) as AliasName from ...
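A small sketch of the difference, again using Python's sqlite3 as a stand-in for SQL Server (an assumption). COALESCE is used in place of ISNULL because SQLite has no two-argument ISNULL; the table and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (fieldA TEXT, fieldB INTEGER)")
# Group 'a' has one real value and one NULL; group 'b' has only NULLs.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", None), ("b", None)],
)

# Plain SUM: a group containing only NULLs aggregates to NULL.
plain = conn.execute(
    "SELECT fieldA, SUM(fieldB) FROM t GROUP BY fieldA ORDER BY fieldA"
).fetchall()

# Replacing NULL with 0 before summing yields 0 for that group instead.
guarded = conn.execute(
    "SELECT fieldA, SUM(COALESCE(fieldB, 0)) FROM t GROUP BY fieldA ORDER BY fieldA"
).fetchall()

print(plain)
print(guarded)
```

Only the all-NULL group differs between the two queries; groups with at least one non-NULL value give the same sum either way.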

You would achieve this more readably with
SELECT fieldA, SUM(COALESCE(fieldB, 0)) as AliasName FROM ...

Related

How to merge columns in SQL of the same table

I have two date columns.
Sometimes they both have dates (which will always be the same in both columns), and sometimes one is empty and the other has a date value.
So, instead of two columns, I am trying to get one column.
If one is empty, it should take the date value from the other column; if both have values (which will always be the same), it can take the value from either column.
I have tried UNION, but it's not giving me the desired result.

SQL Server has a couple different options for this scenario. You can use COALESCE, ISNULL, or a CASE statement.
Based on the information you provided I would use COALESCE. It offers several benefits over ISNULL and is very simple to implement. A CASE statement seems like overkill for what you are trying to do. Check out the link above for more info on each solution.
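As a rough illustration of the COALESCE approach, here is a sketch run through Python's sqlite3 (a stand-in for SQL Server; the reservations table, its column names, and the rows are assumptions made up for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER, date1 TEXT, date2 TEXT)")
conn.executemany(
    "INSERT INTO reservations VALUES (?, ?, ?)",
    [
        (1, "2016-04-02", "2016-04-02"),  # both set (and equal)
        (2, None, "2016-05-10"),          # only the second column set
        (3, "2016-06-01", None),          # only the first column set
    ],
)

# COALESCE returns its first non-NULL argument, so each row yields
# whichever of the two dates is present.
merged = conn.execute(
    "SELECT id, COALESCE(date1, date2) AS merged_date "
    "FROM reservations ORDER BY id"
).fetchall()

print(merged)
```

If both columns could be NULL at once, the merged value would be NULL as well; a third fallback argument can be added to COALESCE in that case.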

Welcome to Stack Overflow!
You need Coalesce
Also, in the future, you should put sample data and metadata in text in your question, rather than as attachments.

You could use the ISNULL function if it is SQL Server:
SELECT ISNULL(ReturnDate,RepartureDate) as dateAct FROM AviationReservation_dev

UPDATE tableName
SET Date1Column = ISNULL(Date1Column, Date2Column);
Context: the syntax is ISNULL ( check_expression , replacement_value ). If the first argument is not NULL, it is returned; otherwise the replacement value is returned.
After the update, delete the other column.

It seems there is no case where both columns are empty. Under that condition, you can do something like this:
SELECT
CASE
WHEN column1 IS NULL THEN column2
WHEN column2 IS NULL THEN column1
ELSE column1 -- both are set and equal, so either column works
END
FROM ...

SQL Server : Select ... From with FULL JOIN with a Default value for a column

I created a table, tblNewParts with 3 columns:
NewCustPart
AddedDate
Handled
and I am trying to FULL JOIN it to an existing table, tblPartsWorkedOn.
tblNewParts is defined to have Handled defaulted to 'N'...
SELECT *
FROM dbo.tblPartsWorkedOn AS BASE
FULL JOIN dbo.tblNewParts AS ADDON ON BASE.[CustPN] = ADDON.[NewCustPart]
WHERE ADDON.[Handled] IS NULL
ORDER BY [CustPN] DESC
And I want the field [Handled] to come back as 'N' instead of NULL when I run the query. The problem is that when there aren't any records in the new table, I get NULL's instead of 'N's.
I saw a SELECT CASE WHEN col1 IS NULL THEN defaultval ELSE col1 END as a mostly suitable answer from here. I am wondering if this will work in this instance, and how would I write that in T-SQL for SQL Server 2012? I need all of the columns from both tables, rather than just the one.
I'm making this a question, rather than a comment on the cited link, so as to not obscure the original link's question.
Thank you for helping!

Name the column (alias.column_name) in select statement and use ISNULL(alias.column,'N').
Thanks
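A sketch of that suggestion, with loud assumptions: it runs on Python's sqlite3 instead of SQL Server, uses a LEFT JOIN rather than a FULL JOIN to stay portable across SQLite versions, uses COALESCE where T-SQL would also accept ISNULL, and the table and column names are simplified stand-ins for the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts_worked_on (cust_pn TEXT);
    CREATE TABLE new_parts (new_cust_part TEXT, handled TEXT DEFAULT 'N');
    INSERT INTO parts_worked_on VALUES ('10012A'), ('10013B');
    INSERT INTO new_parts (new_cust_part, handled) VALUES ('10013B', 'Y');
""")

# The column DEFAULT only applies when a row is inserted. A base row with
# no match in new_parts still gets NULL for handled from the outer join.
raw = conn.execute("""
    SELECT base.cust_pn, addon.handled
    FROM parts_worked_on AS base
    LEFT JOIN new_parts AS addon ON base.cust_pn = addon.new_cust_part
    ORDER BY base.cust_pn
""").fetchall()

# Naming the column explicitly and wrapping it supplies the default at
# query time, while base.* still returns every base column.
defaulted = conn.execute("""
    SELECT base.*, COALESCE(addon.handled, 'N') AS handled
    FROM parts_worked_on AS base
    LEFT JOIN new_parts AS addon ON base.cust_pn = addon.new_cust_part
    ORDER BY base.cust_pn
""").fetchall()

print(raw)
print(defaulted)
```

The aliased expression keeps the name handled, so the result set has the value in the expected column rather than in an extra unnamed one.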

After many iterations I found the answer, it's kind of bulky but here it is anyway. Synopsis:
Yes, the CASE statement does work, but it gives the output as an unnamed column. Also, in this instance to get all of the original columns AND the corrected column, I had to use SELECT *, CASE...END as [ColumnName].
But, here is the better solution, as it will place the information into the correct column, rather than adding a column to the end of the table and calling that column 'Unnamed Column'.
Select [ID], [Seq], [Shipped], [InternalPN], [CustPN], [Line], [Status],
CASE WHEN ADDON.[NewCustPart] IS NULL THEN BASE.[CustPN] ELSE
ADDON.[NewCustPart] END as [NewCustPart],
GetDate() as [AddedDate],
CASE WHEN ADDON.[Handled] IS NULL THEN 'N' ELSE ADDON.[Handled] END as [Handled]
from dbo.tblPartsWorkedOn as BASE
full join dbo.tblNewParts as AddOn ON Base.[CustPN] = AddOn.NewCustPart
where AddOn.Handled = 'N' or AddOn.Handled is null
order by [NewCustPart] desc
This sql code places the [CustPN] into [NewCustPart] if it's null, it puts a 'N' into the field [Handled] if it's null and it assigns the date to the [AddedDate] field. It also only returns records that have not been handled, so that you get the ones that need to be looked at; and it orders the resulting output by the [NewCustPart] field value.
Resulting Output looks something like this: (I shortened the DateTime for the output here.)
[ID] [SEQ] [Shipped] [InternalPN] [CustPN] [Status] [NewCustPart] [AddedDate] [Handled]
1 12 N 10012A 10012A UP 10012A 04/02/2016 N
...
Rather than with the nulls:
[ID] [SEQ] [Shipped] [InternalPN] [CustPN] [Status] [NewCustPart] [AddedDate] [Handled]
1 12 N 10012A 10012A UP NULL NULL NULL
...
I'm leaving this up, and just answering it rather than deleting it, because I am fairly sure that someone else will eventually ask this same question. I think that lots of examples showing how and why something is done, is a very helpful thing to have as not everything can be generalized. Just some thoughts and I hope that this helps someone else!

COUNT(NULL) and the IN clause

This is more of a curious question. I know this question seems like an odd ball but I use null when checking for data because I'm not concerned what data is there but only IF data is there. I believe the following scenario only occurs in SQL Server.
When I want to see if a record exists I'll use:
IF(EXISTS(SELECT null FROM Table1 WHERE Criteria IN (1, 2)))
The following code also works:
IF((SELECT COUNT(null) FROM Table1 WHERE Criteria = 1) = 2)
But this doesn't work:
IF((SELECT COUNT(null) FROM Table1 WHERE Criteria IN (1,2)) = 2)
and get this error:
Operand data type NULL is invalid for count operator.
Why is the third statement any different because of the IN clause?
Here is a SQL Fiddle of what I'm talking about:
http://sqlfiddle.com/#!6/6d7db/8
I also narrowed it down: it only happens when there are multiple items in the IN clause.

It seems to be something about the query optimizer.
In the first two queries (from your fiddle), the count(null) seems to be converted to COUNT(*) as you can see in the execution plan.
In the second query, IN with only one value is optimized to =, resulting in exactly the same query as above.
With IN (1,2) the query fails. It's the same if you use COUNT(1): It's converted to COUNT(*) where the query can only return one row, but stays COUNT(1) in the third.
Another sidenote: The effect only works with a real table. If you use a table variable, all three statements throw the error.
The bottom line should probably be: count(null) is wrong (as Heinzi explained), it just may slip through the optimizer in very rare circumstances.

COUNT(null), the short form of COUNT(ALL null), simply does not make sense. Let's have a look at the definition of COUNT (emphasis mine):
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.
COUNT(DISTINCT expression) evaluates expression for each row in a group and returns the number of unique, nonnull values.
Thus, COUNT(ALL someExpressionThatYieldsNull) would always return 0, no matter how many records are matched by your WHERE clause. Obviously, that makes it utterly unsuitable for counting rows. COUNT(*) would be correct here.
I am quite surprised that your second example works at all; you might have stumbled upon a bug here. Trying the following in MSSQL 2012 (SQLFiddle):
SELECT COUNT(NULL) FROM someTable;
yields the following error:
Operand data type NULL is invalid for count operator.
which makes perfect sense.
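The three COUNT forms quoted above can be compared side by side. This is a sketch on Python's sqlite3 rather than SQL Server (an assumption), with a hypothetical table; it counts a nullable column instead of a literal NULL, since the literal is what SQL Server rejects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (criteria INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "x"), (2, None), (2, "x"), (3, "y")],
)

# Three rows match the filter; one of them has a NULL note.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(note), COUNT(DISTINCT note) "
    "FROM t WHERE criteria IN (1, 2)"
).fetchone()

# For a pure existence check, EXISTS needs no counting at all.
exists_flag = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM t WHERE criteria IN (1, 2))"
).fetchone()[0]

print(counts)
print(exists_flag)
```

COUNT(*) counts rows, COUNT(note) counts non-NULL values, and COUNT(DISTINCT note) counts unique non-NULL values, so only COUNT(*) is suitable for checking how many rows matched.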

NOT IN subquery fails when there are NULL-valued results

Sorry guys, I had no idea how to phrase this one, but I have the following in a where clause:
person_id not in (
SELECT distinct person_id
FROM protocol_application_log_devl pal
WHERE pal.set_id = #set_id
)
When the subquery returns no results, my whole select fails to return anything. To work around this, I replaced person_id in the subquery with isnull(person_id, '00000000-0000-0000-0000-000000000000').
It seems to work, but is there a better way to solve this?

It is better to use NOT EXISTS anyway:
WHERE NOT EXISTS(
SELECT 1 FROM protocol_application_log_devl pal
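WHERE pal.person_id = t.person_id -- t is a placeholder for the outer table's alias; an unqualified person_id would bind to pal itself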
AND pal.set_id = #set_id
)
Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?
A pattern I see quite a bit, and wish that I didn't, is NOT IN. When I see this pattern, I cringe. But not for performance reasons – after all, it creates a decent enough plan in this case.
The main problem is that the results can be surprising if the target column is NULLable (SQL Server processes this as a left anti semi join, but can't reliably tell you if a NULL on the right side is equal to – or not equal to – the reference on the left side). Also, optimization can behave differently if the column is NULLable, even if it doesn't actually contain any NULL values.
Instead of NOT IN, use a correlated NOT EXISTS for this query pattern. Always. Other methods may rival it in terms of performance, when all other variables are the same, but all of the other methods introduce either performance problems or other challenges.

While I support Tim's answer as being correct-in-practice (NOT IN is not appropriate here), this is an interesting case noted in the IN / NOT IN documentation:
Caution: Any null values returned by subquery or expression that are compared to test_expression using IN or NOT IN return UNKNOWN. Using null values together with IN or NOT IN can produce unexpected results [1].
This is why the isnull "fixes" the problem - it masks any such NULL values and avoids the unexpected behavior. With that in mind, the following approach would also work (but please heed the advice about not using NOT IN to begin with):
person_id not in (
SELECT distinct person_id
FROM protocol_application_log_devl pal
WHERE pal.set_id = #set_id
AND person_id IS NOT NULL -- guard here
)
However, a NULL person_id is suspicious and might indicate other issues.
[1] Here is the proof:
select case when 1 not in (2) then 1 else 0 end as r1,
case when 1 not in (2, NULL) then 1 else 0 end as r2
-- r1: 1, r2: 0
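The same effect can be reproduced against tables. The sketch below uses Python's sqlite3 in place of SQL Server (an assumption), and the people and app_log tables, their rows, and the set_id value 10 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (person_id INTEGER);
    CREATE TABLE app_log (person_id INTEGER, set_id INTEGER);
    INSERT INTO people VALUES (1), (2), (3);
    INSERT INTO app_log VALUES (1, 10), (NULL, 10);
""")

# NOT IN against a list containing NULL is never true: for person 2,
# '2 NOT IN (1, NULL)' evaluates to UNKNOWN, so every row is filtered out.
not_in = conn.execute(
    "SELECT person_id FROM people "
    "WHERE person_id NOT IN (SELECT person_id FROM app_log WHERE set_id = ?) "
    "ORDER BY person_id",
    (10,),
).fetchall()

# Filtering NULLs out of the subquery restores the expected rows.
not_in_guarded = conn.execute(
    "SELECT person_id FROM people "
    "WHERE person_id NOT IN (SELECT person_id FROM app_log "
    "                        WHERE set_id = ? AND person_id IS NOT NULL) "
    "ORDER BY person_id",
    (10,),
).fetchall()

# A correlated NOT EXISTS is unaffected by the NULL row.
not_exists = conn.execute(
    "SELECT p.person_id FROM people AS p "
    "WHERE NOT EXISTS (SELECT 1 FROM app_log AS l "
    "                  WHERE l.person_id = p.person_id AND l.set_id = ?) "
    "ORDER BY p.person_id",
    (10,),
).fetchall()

print(not_in, not_in_guarded, not_exists)
```

Note that the NOT EXISTS version qualifies both sides of the comparison with table aliases, so the inner and outer person_id cannot be confused.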

I just replaced the null value with an empty value using the ISNULL function, as in the example below. It solved my issue.
where isnull(UserId,'') not in (select UserID from users where ...)

This should work (NVL is Oracle syntax; on SQL Server the equivalent function is ISNULL):
nvl(person_id, '') not in (
SELECT distinct person_id
FROM protocol_application_log_devl pal
WHERE pal.set_id = #set_id
)

SQL Server ORDER BY date and nulls last

I am trying to order by date. I want the most recent dates coming in first. That's easy enough, but there are many records that are null and those come before any records that have a date.
I have tried a few things with no success:
ORDER BY ISNULL(Next_Contact_Date, 0)
ORDER BY ISNULL(Next_Contact_Date, 999999999)
ORDER BY coalesce(Next_Contact_Date, 99/99/9999)
How can I order by date and have the nulls come in last? The data type is smalldatetime.

smalldatetime has range up to June 6, 2079 so you can use
ORDER BY ISNULL(Next_Contact_Date, '2079-06-05T23:59:00')
if no legitimate records will have that date.
If this is not an assumption you fancy relying on, a more robust option is sorting on two columns:
ORDER BY CASE WHEN Next_Contact_Date IS NULL THEN 1 ELSE 0 END, Next_Contact_Date
Neither of the above suggestions can use an index to avoid a sort, however, and they give similar-looking plans.
One other possibility if such an index exists is
SELECT 1 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NOT NULL
UNION ALL
SELECT 2 AS Grp, Next_Contact_Date
FROM T
WHERE Next_Contact_Date IS NULL
ORDER BY Grp, Next_Contact_Date

According to Itzik Ben-Gan, author of T-SQL Fundamentals for MS SQL Server 2012, "By default, SQL Server sorts NULL marks before non-NULL values. To get NULL marks to sort last, you can use a CASE expression that returns 1 when the" Next_Contact_Date column is NULL, "and 0 when it is not NULL. Non-NULL marks get 0 back from the expression; therefore, they sort before NULL marks (which get 1). This CASE expression is used as the first sort column." The Next_Contact_Date column "should be specified as the second sort column. This way, non-NULL marks sort correctly among themselves." Here is the solution query for your example for MS SQL Server 2012 (and SQL Server 2014):
ORDER BY
CASE
WHEN Next_Contact_Date IS NULL THEN 1
ELSE 0
END, Next_Contact_Date;
Equivalent code using IIF syntax:
ORDER BY
IIF(Next_Contact_Date IS NULL, 1, 0),
Next_Contact_Date;
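Here is a quick sketch of the two-key sort, run through Python's sqlite3 as a stand-in for SQL Server (an assumption; like SQL Server, SQLite sorts NULLs first in ascending order). The contacts table and its rows are made up, dates are stored as ISO strings, and name is added as a final sort key only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, next_contact_date TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("a", None), ("b", "2016-01-05"), ("c", "2016-03-01"), ("d", None)],
)

def names(order_by):
    """Return the names in the order produced by the given ORDER BY clause."""
    rows = conn.execute(f"SELECT name FROM contacts ORDER BY {order_by}")
    return [r[0] for r in rows]

# Default ascending sort: NULLs come first.
default_order = names("next_contact_date, name")

# The CASE key is 0 for dated rows and 1 for NULLs, pushing NULLs last.
nulls_last = names(
    "CASE WHEN next_contact_date IS NULL THEN 1 ELSE 0 END, "
    "next_contact_date, name"
)

# Same key, but most recent date first, as the question asks.
recent_first = names(
    "CASE WHEN next_contact_date IS NULL THEN 1 ELSE 0 END, "
    "next_contact_date DESC, name"
)

print(default_order, nulls_last, recent_first)
```

The CASE key only decides which block a row falls into; the second key orders the dated rows among themselves, so its direction can be changed independently.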

order by -cast([Next_Contact_Date] as bigint) desc

If your SQL doesn't support NULLS FIRST or NULLS LAST, the simplest way to do this is to use the value IS NULL expression:
ORDER BY Next_Contact_Date IS NULL, Next_Contact_Date
to put the nulls at the end (NULLS LAST) or
ORDER BY Next_Contact_Date IS NOT NULL, Next_Contact_Date
to put the nulls at the front. This doesn't require knowing the type of the column and is easier to read than the CASE expression.
EDIT: Alas, while this works in other SQL implementations like PostgreSQL and MySQL, it doesn't work in MS SQL Server. I didn't have a SQL Server to test against and relied on Microsoft's documentation and testing with other SQL implementations. According to Microsoft, value IS NULL is an expression that should be usable just like any other expression. And ORDER BY is supposed to take expressions just like any other statement that takes an expression. But it doesn't actually work.
The best solution for SQL Server therefore appears to be the CASE expression.

A bit late, but maybe someone finds it useful.
For me, ISNULL was out of the question due to the table scan. UNION ALL would have required me to repeat a complex query, and since I was selecting only the TOP X, it would not have been very efficient.
If you are able to change the table design, you can:
Add another field, just for sorting, such as Next_Contact_Date_Sort.
Create a trigger that fills that field with a large (or small) value, depending on what you need:
CREATE TRIGGER FILL_SORTABLE_DATE ON YOUR_TABLE AFTER INSERT,UPDATE AS
BEGIN
SET NOCOUNT ON;
IF (update(Next_Contact_Date)) BEGIN
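-- The sentinel below assumes a smalldatetime-compatible sort column ('2079-06-06' is the smalldatetime maximum).
-- An unquoted 99/99/9999 would be evaluated as integer division and yield 0.
UPDATE YOUR_TABLE
SET Next_Contact_Date_Sort = IIF(YOUR_TABLE.Next_Contact_Date IS NULL, '2079-06-06', YOUR_TABLE.Next_Contact_Date)
FROM inserted i
WHERE YOUR_TABLE.key1 = i.key1 AND YOUR_TABLE.key2 = i.key2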
END
END

Use desc and multiply by -1 if necessary. Example for ascending int ordering with nulls last:
select *
from
(select null v union all select 1 v union all select 2 v) t
order by -t.v desc
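The negation trick can be checked with a few rows. This sketch uses Python's sqlite3 as a stand-in (an assumption); it relies on NULL being treated as the lowest value when sorting, which holds for both SQLite and SQL Server. The table t is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (1,), (2,)])

# Negating reverses the order of the numbers while NULL stays NULL;
# sorting the negated key descending gives ascending values, NULLs last.
ascending_nulls_last = [
    r[0] for r in conn.execute("SELECT v FROM t ORDER BY -v DESC")
]

# A plain descending sort already puts NULLs last, because NULL is the
# lowest value.
descending_nulls_last = [
    r[0] for r in conn.execute("SELECT v FROM t ORDER BY v DESC")
]

print(ascending_nulls_last)
print(descending_nulls_last)
```

The trick only applies to types that can be negated, which is why date columns need a cast or the CASE key instead.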

I know this is old but this is what worked for me
Order by Isnull(Date,'12/31/9999')

I think I found a way to show nulls in the end and still be able to use indexes for sorting.
The idea is super simple: create a computed column based on the existing column, and put an index on it.
ALTER TABLE dbo.Users
ADD [FirstNameNullLast]
AS (case when [FirstName] IS NOT NULL AND (ltrim(rtrim([FirstName]))<>N'' OR [FirstName] IS NULL) then [FirstName] else N'ZZZZZZZZZZ' end) PERSISTED
So, we are creating a persisted computed column in which all blank and NULL values are replaced by 'ZZZZZZZZZZ'. This means that if we sort on that column, all the NULL or blank values appear at the end.
Now we can use it in our new index.
Like this:
CREATE NONCLUSTERED INDEX [IX_Users_FirstNameNullLast] ON [dbo].[Users]
(
[FirstNameNullLast] ASC
)
So, this is an ordinary nonclustered index. We can change it however we want, e.g. include extra columns, increase the number of indexed columns, change the sort order, etc.

I know this is an old thread, but in SQL Server NULLs always sort lower than non-NULL values, so it's only necessary to order by DESC.
In your case, ORDER BY Next_Contact_Date DESC should be enough.
Source: order by with nulls- LearnSql
