SQL case in select query - sql-server

I have following query
Select
a,b,c,
case totalcount
when 0 then 0
else abcd/totalcount
end AS 'd',
case totalcount
when 0 then 0
else defg/totalcount
end AS 'e'
From
Table1
In this query, I have same case statement in select query...Can i make it into single select statement.
"totalcount" is some value... abcd and defg are two column values from table1. If totalcount is zero, i dont want to divide, else I wish to divide to get an average value of abcd and defg.

No, since you seem to be requiring two output columns in your desired resultset... Each Case statement can generate at most one output value.
It doesn't matter how many columns the case statement needs to access to do it's job, or how many different values it has to select from (how many When... Then... expressions), it just matters how many columns you are trying to generate in the final resultset.

At first view, I would say no, since you seem to use the exact same condition leading to different results.
Besides, this looks odd to me, why would you need two SELECT CASE for only one condition? This makes no sense.
Could you be more specific or give a real-world example of what you're trying to ask, with "real" data so that we might better answer your question?
EDIT #1
Given that:
Yes...it's two different fields....and the valud i need is calculated one...which is Value = a/b. But i need to check if b is zero or not
I would still answer no, as if they are two different fields, and you want both results in your result set, then you will need to write two CASE statements, one for each of the fields you need to verify whether it is zero-valued or not. Keep in mind that one CASE statement is equivalent to one single column only. So, if you need to check for a zero value, you are required to check for both independently.
By the way, sorry for my misunderstanding, that must be a language barrier, unfortunately. =) But my requirement for "real" data example holds, since this might light us all up with some other related solutions, we never know! =)
EDIT #2
Considering you have to check for a value of 0, I would perhaps rewrite my code as follows, for the sake of readability:
select a, b, c
, case when condition1 <> 0 then abcd / condition1 else 0 end as Result1
, case when condition2 <> 0 then defg / condition2 else 0 end as Result2
from Table1
In my opinion, it is leaner and swifter in the code, and it is easier to understand what is the intention. But after all, the result would be the same! No offense made here! =)
EDIT #3
After having read your question edit:
"totalcount" is some value... abcd and defg are two column values from table1. If totalcount is zero, i dont want to divide, else I wish to divide to get an average value of abcd and defg.
I say, if it is the average that you're after, why not just use the AVG function which will deal with it internaly, without having to care about zero values?
select a, b, c
, AVG(abcd) as Result1
, AVG(defg) as Result2
from Table1
Plus, considering having no record, the average can only be 0!
I don't know about your database engine, but if it is Oracle or SQL Server, and I think DB2 as well, they support this function.

You need to elaborate more - what is the condition? With what you have entered so far, it looks like you are just using the result of the expression, making the CASE statement redundant.
For example, see my inline comments:
SELECT a,b,c,
case condition
when 0 then 0 --this is unnecessary
else abcd/condition, --this is just using the result of the condition?
case condition
when 0 then 0 --this is unnecessary
else defg/condition --this is just using the result of the condition?
from Table1
so you could just refactor this as:
SELECT a
,b
,c
,expression
,expression
from Table1
if you actually meant the else abcd/condition as evaluate another condition or use a default value then you need to expand on that so we can answer accurately.
EDIT: if you are looking to avoid a divide by zero and you only want to evaluate condition once, then you need to do it outside of the SELECT and use the variable inside the SELECT. I wouldn't be concerned about the performance impact of evaluating it multiple times, if the condition doesn't change then its value should be cached by the query execution engine. Although it could look ugly and unmaintainable, in which case you could possibly factor condition into a function.

If all you're trying to do is avoid dividing by zero, and if it's not critical that an attempt to divide by zero actually produces zero, then you can do the following:
Select
a,b,c,
abcd / NULLIF(totalcount,0) AS 'd',
defg / NULLIF(totalcount,0) AS 'e'
From
Table1
This transforms an attempt to divide by zero into an attempt to divide by NULL, which always produces NULL.
If you need zero and you know that abcd and defg will never be NULL, then you can wrap the whole thing in an ISNULL(...,0) to force it to be zero. If abcd or defg might legitimately be null and you need the zero output if totalcount is zero, then I don't know any other option than using a CASE statement, sorry.
Moreover, since it appears totalcount is an expression, for maintainability you could compute it in a subquery, such as:
Select
a,b,c,
abcd / NULLIF(totalcount,0) AS 'd',
defg / NULLIF(totalcount,0) AS 'e'
From
(
Select
a,b,c,
abcd,
defg,
(... expression goes here ...) AS totalcount
From
Table1
) Data
But if it's a simple expression this may be overkill.

Technically this is possible. I'm not suggesting it's a good idea though!
;with Table1 as
(
select 1 as a, 2 as b, 3 as c, 0 as condition, 1234 as abcd, 4567 AS defg UNION ALL
select 1 as a, 2 as b, 3 as c, 1 as condition, 1234.0 as abcd, 4567 AS defg
)
,
cte AS
(
Select
a,b,c,
case condition
when 0 then CAST(CAST(0 as decimal(18, 6)) as binary(9)) + CAST(CAST(0 as decimal(18, 6)) as binary(9))
else CAST(CAST(abcd/condition as decimal(18, 6)) as binary(9)) + CAST(CAST(defg/condition as decimal(18, 6)) as binary(9))
end AS d
From
Table1
)
Select
a,
b,
c,
cast(substring(d,1,9) as decimal(18, 6)) as d,
cast(substring(d,10,9) as decimal(18, 6)) as e
From cte

Related

Snowflake: Trouble getting numbers to return from a PIVOT function

I am moving a query from SQL Server to Snowflake. Part of the query creates a pivot table. The pivot table part works fine (I have run it in isolation, and it pulls numbers I expect).
However, the following parts of the query rely on the pivot table- and those parts fail. Some of the fields return as a string-type. I believe that the problem is Snowflake is having issues converting string data to numeric data. I have tried CAST, TRY_TO_DOUBLE/NUMBER, but these just pull up 0.
I will put the code down below, and I appreciate any insight as to what I can do!
CREATE OR REPLACE TEMP TABLE ATTR_PIVOT_MONTHLY_RATES AS (
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership')) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE('Member Adds')) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE('Member Attrition')) AS Member_Attrition,
((ZEROIFNULL(CAST('Starting Membership' AS FLOAT))
+ ZEROIFNULL(CAST('Member Adds' AS FLOAT))
+ ZEROIFNULL(CAST('Member Attrition' AS FLOAT)))-ZEROIFNULL(CAST('Starting Membership' AS FLOAT)))
/ZEROIFNULL(CAST('Starting Membership' AS FLOAT)) AS "% Change"
FROM
(SELECT * FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL) PT)
I realize this is a VERY big question with a lot of moving parts... So my main question is: How can I successfully change the data type to numeric value, so that hopefully the formulas work in the second half of the query?
Thank you so much for reading through it all!
EDITED FOR SHORTENING THE QUERY WITH UNNEEDED SYNTAX
CAST(), TRY_TO_DOUBLE(), TRY_TO_NUMBER(). I have also put the fields (Starting Membership, Member Adds) in single and double quotation marks.
Unless you are quoting your field names in this post just to highlight them for some reason, the way you've written this query would indicate that you are trying to cast a string value to a number.
For example:
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership'))
This is simply trying to cast a string literal value of Starting Membership to a double. This will always be NULL. And then your ZEROIFNULL() function is turning your NULL into a 0 (zero).
Without seeing the rest of your query that defines the column names, I can't provide you with a correction, but try using field names, not quoted string values, in your query and see if that gives you what you need.
You first mistake is all your single quoted columns names are being treated as strings/text/char
example your inner select:
with ATTR_PIVOT(id, studentname) as (
select * from values
(1, 'student_a'),
(1, 'student_b'),
(1, 'student_c'),
(2, 'student_z'),
(2, 'student_a')
)
SELECT *
FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL
there is no "starting membership" column and we get all the rows..
ID
STUDENTNAME
1
student_a
1
student_b
1
student_c
2
student_z
2
student_a
So you need to change 'Starting Membership' -> "Starting Membership" etc,etc,etc
As Mike mentioned, the 0 results is because the TRY_TO_DOUBLE always fails, and thus the null is always turned to zero.
now, with real "string" values, in real named columns:
with ATTR_PIVOT(Market, Coverage_Mo, "Starting Membership", "Member Adds", "Member Attrition") as (
select * from values
(1, 10 ,'student_a', '23', '150' )
)
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership")) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE("Member Adds")) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE("Member Attrition")) AS Member_Attrition
FROM ATTR_PIVOT
WHERE "Starting Membership" IS NOT NULL
we get what we would expect:
MARKET
COVERAGE_MO
STARTING_MEMBERSHIP
MEMBER_ADDS
MEMBER_ATTRITION
1
10
0
23
150

SQL Server CHOOSE() function behaving unexpectedly with RAND() function

I've encountered an interesting SQL server behaviour while trying to generate random values in T-sql using RAND and CHOOSE functions.
My goal was to try to return one of two given values using RAND() as rng. Pretty easy right?
For those of you who don't know it, CHOOSE function accepts in an index number(int) along with a collection of values and returns a value at specified index. Pretty straightforward.
At first attempt my SQL looked like this:
select choose(ceiling((rand()*2)) ,'a','b')
To my surprise, this expression returned one of three values: null, 'a' or 'b'. Since I didn't expect the null value i started digging. RAND() function returns a float in range from 0(included) to 1 (excluded). Since I'm multiplying it by 2, it should return values anywhere in range from 0(included) to 2 (excluded). Therefore after use of CEILING function final value should be one of: 0,1,2. After realising that i extended the value list by 'c' to check whether that'd be perhaps returned. I also checked the docs page of CEILING and learnt that:
Return values have the same type as numeric_expression.
I assumed the CEILINGfunction returned int, but in this case would mean that the value is implicitly cast to int before being used in CHOOSE, which sure enough is stated on the docs page:
If the provided index value has a numeric data type other than int,
then the value is implicitly converted to an integer.
Just in case I added an explicit cast. My SQL query looks like this now:
select choose(cast(ceiling((rand()*2)) as int) ,'a','b','c')
However, the result set didn't change. To check which values cause the problem I tried generating the value beforehand and selecting it alongside the CHOOSE result. It looked like this:
declare #int int = cast(ceiling((rand()*2)) as int)
select #int,choose( #int,'a','b','c')
Interestingly enough, now the result set changed to (1,a), (2,b) which was my original goal. After delving deeper in the CHOOSE docs page and some testing i learned that 'null' is returned in one of two cases:
Given index is a null
Given index is out of range
In this case that would mean that index value when generated inside the SELECT statement is either 0 or above 2/3 (I'm assuming that negative numbers are not possible here and CHOOSE function indexes from 1). As I've stated before 0 should be one of possibilities of:
ceiling((rand()*2))
,but for some reason it's never 0 (at least when i tried it 1 million+ times like this)
set nocount on
declare #test table(ceiling_rand int)
declare #counter int = 0
while #counter<1000000
begin
insert into #test
select ceiling((rand()*2))
set #counter=#counter+1
end
select distinct ceiling_rand from #test
Therefore I assume that the value generated in SELECT is greater than 2/3 or NULL. Why would it be like this only when generated in SELECT statement? Perhaps order of resolving CAST, CELING or RAND inside SELECT is different than it would seem? It's true I've only tried it a limited number of times, but at this point the chances of it being a statistical fluctuation are extremely small. Is it somehow a floating-point error? I truly am stumbled and looking forward to any explanation.
TL;DR: When generating a random number inside a SELECT statement result set of possible values is different then when it's generated before the SELECT statement.
Cheers,
NFSU
EDIT: Formatting
You can see what's going on if you look at the execution plan.
SET SHOWPLAN_TEXT ON
GO
SELECT (select choose(ceiling((rand()*2)) ,'a','b'))
Returns
|--Constant Scan(VALUES:((CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(1) THEN 'a' ELSE CASE WHEN CONVERT_IMPLICIT(int,ceiling(rand()*(2.0000000000000000e+000)),0)=(2) THEN 'b' ELSE NULL END END)))
The CHOOSE is expanded out to
SELECT CASE
WHEN ceiling(( rand() * 2 )) = 1 THEN 'a'
ELSE
CASE
WHEN ceiling(( rand() * 2 )) = 2 THEN 'b'
ELSE NULL
END
END
and rand() is referenced twice. Each evaluation can return a different result.
You will get the same problem with the below rewrite being expanded out too
SELECT CASE ceiling(( rand() * 2 ))
WHEN 1 THEN 'a'
WHEN 2 THEN 'b'
END
Avoid CASE for this and any of its variants.
One method would be
SELECT JSON_VALUE ( '["a", "b"]' , CONCAT('$[', FLOOR(rand()*2) ,']') )

How does the outer WHERE clause affect the way nested query is executed?

Let's say I have a table lines
b | a
-----------
17 7000
17 0
18 6000
18 0
19 5000
19 2500
I want to get positive values of a function: (a1 - a2) \ (b2 - b1) for all elements in cartesian product of lines with different b's. (If you are interested this will result in intersections of lines y1 = b1*x + a1 and y2 = b2*x + a2)
I wrote query1 for that cause
SELECT temp.point FROM
(SELECT DISTINCT ((l1.a - l2.a) / (l2.b - l1.b)) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
WHERE temp.point > 0
It throws a "division by zero" error. I tried the same query without the WHERE clause (query2) and it works just fine
SELECT temp.point FROM
(SELECT DISTINCT ((l1.a - l2.a) / (l2.b - l1.b)) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
as well as the variation with the defined SQL function (query3)
CREATE FUNCTION get_point(#a1 DECIMAL(18, 4), #a2 DECIMAL(18, 4), #b1 INT, #b2 INT)
RETURNS DECIMAL(18, 4)
WITH EXECUTE AS CALLER
AS
BEGIN
RETURN (SELECT (#a1 - #a2) / (#b2 - #b1))
END
GO
SELECT temp.point FROM
(SELECT DISTINCT dbo.get_point(l1.a, l2.a, l1.b, l2.b) AS point
FROM lines AS l1
CROSS JOIN lines AS l2
WHERE l1.b != l2.b
) AS temp
WHERE temp.point > 0
I have an intuitive assumption that the outer SELECT shouldn't affect the way nested SELECT is executed (or at least shouldn't break it). Even if it is not true that wouldn't explain why query3 works when query1 doesn't.
Could someone explain the principle behind this? That would be much appreciated.
If you want to guarantee that the query will always work, you'd need to wrap your calculation in something like a case statement
case when l2.b - l1.b = 0
then null
else (l1.a - l2.a) / (l2.b - l1.b)
end
Technically, the optimizer is perfectly free to evaluate conditions in whatever order it expects will be more efficient. The optimizer is free to evaluate the division before the where clause that filters out rows where the divisor would be 0. It is also free to evaluate the where clause first. Your different queries have different query plans which result in different behavior.
Realistically, though, even though a particular query might have a "good" query plan today, there is no guarantee that the optimizer won't decide in a day, a month, or a year to change the query plan to something that would throw a division by 0 error. I suppose you could decide to use a bunch of hints/ plan guides to force a particular plan with a particular behavior to be used. But that tends to be the sort of thing that bites you in the hind quarters later. Wrapping the calculation in a case (or otherwise preventing the division by 0 error) will be much safer and easier to explain to the next developer.

"if, then, else" in SQLite

Without using custom functions, is it possible in SQLite to do the following. I have two tables, which are linked via common id numbers. In the second table, there are two variables. What I would like to do is be able to return a list of results, consisting of: the row id, and NULL if all instances of those two variables (and there may be more than two) are NULL, 1 if they are all 0 and 2 if one or more is 1.
What I have right now is as follows:
SELECT
a.aid,
(SELECT count(*) from W3S19 b WHERE a.aid=b.aid) as num,
(SELECT count(*) FROM W3S19 c WHERE a.aid=c.aid AND H110 IS NULL AND H112 IS NULL) as num_null,
(SELECT count(*) FROM W3S19 d WHERE a.aid=d.aid AND (H110=1 or H112=1)) AS num_yes
FROM W3 a
So what this requires is to step through each result as follows (rough Python pseudocode):
if row['num_yes'] > 0:
out[aid] = 2
elif row['num_null'] == row['num']:
out[aid] = 'NULL'
else:
out[aid] = 1
Is there an easier way? Thanks!
Use CASE...WHEN, e.g.
CASE x WHEN w1 THEN r1 WHEN w2 THEN r2 ELSE r3 END
Read more from SQLite syntax manual (go to section "The CASE expression").
There's another way, for numeric values, which might be easier for certain specific cases.
It's based on the fact that boolean values is 1 or 0, "if condition" gives a boolean result:
(this will work only for "or" condition, depends on the usage)
SELECT (w1=TRUE)*r1 + (w2=TRUE)*r2 + ...
of course #evan's answer is the general-purpose, correct answer

SQL Server: sort a column numerically if possible, otherwise alpha

I am working with a table that comes from an external source, and cannot be "cleaned". There is a column which an nvarchar(20) and contains an integer about 95% of the time, but occasionally contains an alpha. I want to use something like
select * from sch.tbl order by cast(shouldBeANumber as integer)
but this throws an error on the odd "3A" or "D" or "SUPERCEDED" value.
Is there a way to say "sort it like a number if you can, otherwise just sort by string"? I know there is some sloppiness in that statement, but that is basically what I want.
Lets say for example the values were
7,1,5A,SUPERCEDED,2,5,SECTION
I would be happy if these were sorted in any of the following ways (because I really only need to work with the numeric ones)
1,2,5,7,5A,SECTION,SUPERCEDED
1,2,5,5A,7,SECTION,SUPERCEDED
SECTION,SUPERCEDED,1,2,5,5A,7
5A,SECTION,SUPERCEDED,1,2,5,7
I really only need to work with the
numeric ones
this will give you only the numeric ones, sorted properly:
SELECT
*
FROM YourTable
WHERE ISNUMERIC(YourColumn)=1
ORDER BY YourColumn
select
*
from
sch.tbl
order by
case isnumeric(shouldBeANumber)
when 1 then cast(shouldBeANumber as integer)
else 0
end
Provided that your numbers are not more than 100 characters long:
WITH chars AS
(
SELECT 1 AS c
UNION ALL
SELECT c + 1
FROM chars
WHERE c <= 99
),
rows AS
(
SELECT '1,2,5,7,5A,SECTION,SUPERCEDED' AS mynum
UNION ALL
SELECT '1,2,5,5A,7,SECTION,SUPERCEDED'
UNION ALL
SELECT 'SECTION,SUPERCEDED,1,2,5,5A,7'
UNION ALL
SELECT '5A,SECTION,SUPERCEDED,1,2,5,7'
)
SELECT rows.*
FROM rows
ORDER BY
(
SELECT SUBSTRING(mynum, c, 1) AS [text()]
FROM chars
WHERE SUBSTRING(mynum, c, 1) BETWEEN '0' AND '9'
FOR XML PATH('')
) DESC
SELECT
(CASE ISNUMERIC(shouldBeANumber)
WHEN 1 THEN
RIGHT(CONCAT('00000000',shouldBeANumber), 8)
ELSE
shouoldBeANumber) AS stringSortSafeAlpha
ORDEER BY
stringSortSafeAlpha
This will add leading zeros to all shouldBeANumber values that truly are numbers and leave all remaining values alone. This way, when you sort, you can use an alpha sort but still get the correct values (with an alpha sort, "100" would be less than "50", but if you change "50" to "050", it works fine). Note, for this example, I added 8 leading zeros, but you only need enough leading zeros to cover the largest possible integer in your column.

Resources