What is the difference between my 2 SUM(CASE()) examples? - sql-server

So I have 2 queries, 1 works like I would expect and the other one doesn't. Here's the one that works like I expect, it's a SUMIF using a CASE statement:
SELECT
PartNo,
SUM(ActualPcsGood) AS Pcs,
SUM(CASE WHEN Status = 'Current' THEN ActualPcsGood END) AS [Current],
SUM(CASE WHEN Status = 'Pending' THEN ActualPcsGood END) AS [Pending],
SUM(CASE WHEN Status = 'Future' THEN ActualPcsGood END) AS [Future],
SUM(CASE WHEN Status = 'Finished' THEN ActualPcsGood END) AS [Finished]
FROM OrderRouting
WHERE PartNo LIKE '20004%'
GROUP BY PartNo;
Output:
Now I have this other query that is confusing me, here's the code:
SELECT
JobNo,
UnitPrice,
SUM(CASE WHEN JobNo LIKE '10426%' THEN UnitPrice END) AS [OrderTotal]
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%'
GROUP BY JobNo, UnitPrice;
Output:
My question is why is the 3rd column exactly the same as the second one? It's my intention that the third column is supposed to total the entire thing, meaning that the value for the 3rd column would be exactly the same for all rows. Why is it not? What is the major difference between my 2 examples?

This isn't tested but here are some ideas:
select dtl.JobNo, dtl.UnitPrice, tot.UnitPrice SumPrice, (dtl.UnitPrice/tot.UnitPrice)*100 pctTot
from
(SELECT
JobNo,
UnitPrice
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%') dtl
cross join
(SELECT sum(UnitPrice) unitPrice
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%') tot
OR
SELECT
JobNo,
UnitPrice,
(select sum(UnitPrice) from OrderDet where UnitPrice > 0 and JobNo like '10426%') totPrice
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%';
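Another option, not mentioned above, is a windowed aggregate: SUM(...) OVER () computes the total over all filtered rows and repeats it on every row, so no cross join or subquery is needed. This is only a sketch. The table variable and its rows are made-up sample data standing in for OrderDet, and the multi-row VALUES list assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample rows standing in for OrderDet
DECLARE @OrderDet TABLE (JobNo varchar(20), UnitPrice decimal(10, 2));
INSERT INTO @OrderDet (JobNo, UnitPrice)
VALUES ('10426-01', 10.00), ('10426-02', 25.00), ('10426-03', 15.00), ('20000-01', 99.00);

SELECT
    JobNo,
    UnitPrice,
    -- An empty OVER () treats the whole filtered result set as one window
    SUM(UnitPrice) OVER () AS OrderTotal,
    UnitPrice * 100.0 / SUM(UnitPrice) OVER () AS PctOfTotal
FROM @OrderDet
WHERE UnitPrice > 0
  AND JobNo LIKE '10426%';
-- OrderTotal is 50.00 on each of the three returned rows
```

Because there is no GROUP BY, every detail row is kept and the total simply rides along with it.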

At first glance everything looks fine. My guess is that, in this record set, each JobNo and UnitPrice combination matches only one record.
Add a COUNT(*) to see how many records are being summed in each group.
The other thought is that you have a quantity field on the record, in which case your CASE expression should really be:
SUM(CASE WHEN JobNo LIKE '10426%' THEN UnitPrice * Quantity END) AS [OrderTotal]
Hope that helps.

GROUP BY collapses the rows that share the same values in the grouping columns and computes each aggregate once per group. Your grouping columns are JobNo and UnitPrice. Because every JobNo, UnitPrice pair in your data is unique, each group holds a single row, so it's hard to see what it's doing. Try adding a duplicate JobNo, UnitPrice row to your data source so you can see what it's actually doing.
I'm not sure why you would want to show the sum total in this way though. I would use a ROLLUP, which would show the total at the bottom.
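Both suggestions can be tried together on a few made-up rows. In this sketch the table variable and its data are hypothetical. The first two rows deliberately share the same JobNo and UnitPrice, and the GROUP BY ROLLUP (...) syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample rows; the first two share the same JobNo and UnitPrice
DECLARE @OrderDet TABLE (JobNo varchar(20), UnitPrice decimal(10, 2));
INSERT INTO @OrderDet (JobNo, UnitPrice)
VALUES ('10426-01', 10.00), ('10426-01', 10.00), ('10426-02', 25.00);

SELECT
    JobNo,
    UnitPrice,
    COUNT(*)       AS RowsInGroup,  -- 2 for the duplicated pair, 1 for the other detail group
    SUM(UnitPrice) AS OrderTotal    -- 20.00 for the duplicated pair
FROM @OrderDet
WHERE UnitPrice > 0
  AND JobNo LIKE '10426%'
GROUP BY ROLLUP (JobNo, UnitPrice);
-- ROLLUP adds one subtotal row per JobNo (UnitPrice is NULL) and one
-- grand-total row (both columns NULL, OrderTotal = 45.00)
```

Only on the duplicated pair does OrderTotal differ from UnitPrice, which is why the original data made the SUM look like it was doing nothing.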

Your first query groups by PartNo only, so each SUM with a CASE expression is evaluated once for each unique PartNo.
Your second query, however, groups by JobNo and UnitPrice, so the SUM runs for each group of JobNo and UnitPrice, which here is only a single row. Hence the result is the same as UnitPrice. Assuming each JobNo has a single unit price, try the query below. You don't need the CASE inside SUM, as the WHERE clause already takes care of the filtering.
SELECT
JobNo,
MIN(UnitPrice) as UnitPrice,
SUM(UnitPrice) AS [OrderTotal]
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%'
GROUP BY JobNo;

But why are you adding a CASE when you are already filtering on the '10426%' condition?
Your query will return only those records whose JobNo starts with '10426'. Grouping by both JobNo and UnitPrice gives you one row for each distinct unit price of the same job. The query below should be enough. If a job has several different unit prices, they cannot all be shown as UnitPrice in a single row.
SELECT
JobNo,
SUM(UnitPrice) AS [OrderTotal]
FROM OrderDet
WHERE UnitPrice > 0
AND JobNo LIKE '10426%'
GROUP BY JobNo;

Related

SQL In Select Statement show values from the same table but different records

I have a "Students" table with two columns "UserID" and "Name".
Next I have a table named "TestResults" with three columns, UserID, TestID, and TestScore.
I would like to run a single query that shows for each User, on ONE row, their test scores, for tests that have the TestID equal to 1A or 2A.
What approach is the best, I'm wondering if Pivot is the best way or is there another that is more advisable. Thanks.
Guessing on your comment, you can use conditional aggregation with max and case to get the results on a single row:
select s.userid, s.name,
max(case when t.testid = '1A' then t.testscore end) as [1ascore],
max(case when t.testid = '2A' then t.testscore end) as [2ascore]
from students s
join testresults t on s.userid = t.userid
group by s.userid, s.name
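Since the question asks about PIVOT, here is a sketch of the PIVOT form for comparison. It should give the same shape of result as the conditional aggregation. The table variables, sample rows, and the Score1A / Score2A aliases are made up for illustration.

```sql
-- Hypothetical sample data shaped like the tables described in the question
DECLARE @Students TABLE (UserID int, Name varchar(50));
DECLARE @TestResults TABLE (UserID int, TestID varchar(10), TestScore int);
INSERT INTO @Students VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO @TestResults VALUES (1, '1A', 90), (1, '2A', 85), (2, '1A', 70), (2, '3B', 60);

SELECT UserID, Name, [1A] AS Score1A, [2A] AS Score2A
FROM (
    -- Only the columns needed for grouping, spreading, and aggregating
    SELECT s.UserID, s.Name, t.TestID, t.TestScore
    FROM @Students s
    JOIN @TestResults t ON s.UserID = t.UserID
) AS src
PIVOT (
    MAX(TestScore) FOR TestID IN ([1A], [2A])
) AS p;
-- Bob has no 2A result, so his Score2A is NULL
```

With only two fixed test IDs, the CASE version is arguably easier to read and extend. PIVOT mainly saves typing when the column list grows.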
Try this -
SELECT
UserID,
TestScore
FROM
TestResults
WHERE
(TestID = '1A')
OR (TestID = '2A')

Querying in SQL-

I have a table named "Results" like below:
I'd like to count the personnel who have been completely scored, meaning those who have no zero in the Score column. For example, based on the uploaded picture, only the person with ID 1004 should be counted, so the outcome should be one.
I used this code:
select Count(PrsID) from results
where Score <> 0
group by PrsID
But that doesn't help, because a person with even one non-zero score is still counted!
Thanks in advance.
If I understand you correctly, I think this is what you want:
SELECT COUNT(DISTINCT(PrsID))
FROM results
WHERE PrsID NOT IN (SELECT DISTINCT PrsID FROM results WHERE score = 0)
If a person's min score is greater than zero, then they should be counted.
select count(1)
from (
select PrsID
from results
group by PrsID
having min(Score) > 0) as results
You can use conditional aggregation as below:
Select PrsId from results
group by PrsId
having sum(case when score = 0 then 1 else 0 end) = 0
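To get the single number the question asks for, the grouped query can be wrapped in an outer COUNT(*). This is a sketch on made-up rows, since the original picture is not available. The data is arranged so that only PrsID 1004 has no zero score, and the HAVING condition keeps people whose count of zero scores is exactly 0.

```sql
-- Hypothetical sample rows; only PrsID 1004 has no zero score
DECLARE @results TABLE (PrsID int, Score int);
INSERT INTO @results (PrsID, Score)
VALUES (1001, 0), (1001, 5), (1002, 0), (1003, 3), (1003, 0), (1004, 4), (1004, 2);

SELECT COUNT(*) AS FullyScored
FROM (
    SELECT PrsID
    FROM @results
    GROUP BY PrsID
    -- Keep a person only when none of their rows has Score = 0
    HAVING SUM(CASE WHEN Score = 0 THEN 1 ELSE 0 END) = 0
) AS scored;
-- FullyScored = 1
```

Unlike a MIN(Score) > 0 test, this condition does not depend on scores being non-negative.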
I think you want something like:
select PrsID, Count(1) from results
where Score = 0
group by PrsID
This one returns the count of people who have no zero score:
SELECT
COUNT(t1.PrsID)
FROM
( SELECT
PrsID,
MIN(Score) AS Score -- lowest score per person
FROM results
GROUP BY PrsID
) AS t1
WHERE t1.Score <> 0;

Query returns no results

I am running a query that counts emails sent by customers, based on their subject.
DECLARE @LastMonthNo varchar(2)
DECLARE @LastMYear varchar(4)
SET @LastMonthNo = DATEPART(m,DATEADD(m,-1,GetDate()))
SET @LastMYear = DATEPART(yyyy,DATEADD(m,-1,GetDate()));
SELECT
CustID, CustName, CustEmail,
ISNULL(SUM(CASE WHEN EmailSubject LIKE 'KeyWord' THEN 1 END),0) AS TotalEmail
FROM
TableEmails
WHERE
DATEPART(M, DATESENT) = @LastMonthNo
AND DATEPART(YYYY, DATESENT) = @LastMYear
GROUP BY CustID, CustName, CustEmail
For some customers, the query returns no results. I do not mean NULL, I mean there is no record at all. However, I need to identify those customers.
What can I do to get the query to generate some sort of results? A 0 would be perfect.
Try something like this..
SELECT CustID, CustName, CustEmail,
SUM(CASE WHEN EmailSubject LIKE 'KeyWord'
AND DATEPART(M,DATESENT)=@LastMonthNo
AND DATEPART(YYYY,DATESENT)=@LastMYear
THEN 1 ELSE 0 END) AS TotalEmail
FROM TableEmails
GROUP BY CustID, CustName, CustEmail
What is the difference?
The WHERE clause is evaluated before GROUP BY. So, with your query, customers with no emails in that month are filtered out before the grouping happens, and they never get a row. If you move the date condition into the CASE expression, it is checked on each record in the table regardless of date, so every customer in the table still gets a group, with 0 when nothing matches. Hope that makes sense.
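The effect is easy to see on two made-up rows. In this sketch the table variable, its data, and the @Start / @End variables are all hypothetical, and only CustID is kept for brevity. It also swaps the DATEPART comparisons for a date range covering last month, which is an assumption about what the filter is meant to express. The inline DECLARE initialisers assume SQL Server 2008 or later.

```sql
-- Hypothetical sample rows: customer 1 mailed last month, customer 2 six months ago
DECLARE @TableEmails TABLE (CustID int, EmailSubject varchar(100), DateSent datetime);
INSERT INTO @TableEmails (CustID, EmailSubject, DateSent)
VALUES (1, 'KeyWord', DATEADD(month, -1, GETDATE())),
       (2, 'KeyWord', DATEADD(month, -6, GETDATE()));

-- First day of last month and first day of this month
DECLARE @Start datetime = DATEADD(month, DATEDIFF(month, 0, GETDATE()) - 1, 0);
DECLARE @End   datetime = DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0);

-- Date filter in WHERE: customer 2 is removed before grouping, so only one row comes back
SELECT CustID, COUNT(*) AS TotalEmail
FROM @TableEmails
WHERE EmailSubject LIKE 'KeyWord'
  AND DateSent >= @Start AND DateSent < @End
GROUP BY CustID;

-- Date filter inside CASE: both customers keep a row, customer 2 with TotalEmail = 0
SELECT CustID,
       SUM(CASE WHEN EmailSubject LIKE 'KeyWord'
                 AND DateSent >= @Start AND DateSent < @End
                THEN 1 ELSE 0 END) AS TotalEmail
FROM @TableEmails
GROUP BY CustID;
```

A customer with no rows at all in the table would still be missing from both results. Covering that case would need a separate customer list to join from.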

Same column multiple times in query

Question for SQL Server experts. In the query below I would like to have an additional column which also SUMs Quantity, but does so based on a different Requirement Type. I have tried a few ideas (CASE, and adding a subquery in the select list) but all return far too many results. What I would like to see is MATERIAL, MATERIAL_DESCRIPTION, SIZE_LITERAL, SUM OF QUANTITY WHEN REQUIREMENT TYPE = 'PB', SUM OF QUANTITY WHEN REQUIREMENT TYPE = '01'. I'm not sure how to add the quantity twice on two different conditions. Thanks in advance.
SELECT MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL,
SUM(QUANTITY) AS 'SUM_FCST'
FROM VW_MRP_ALLOCATION
WHERE REQUIREMENT_CATEGORY = 'A60381002'
AND MATERIAL_AVAILABILITY_DATE >= GETDATE()
AND REQUIREMENT_TYPE ='PB'
GROUP BY MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL
ORDER BY MATERIAL,
SIZE_LITERAL
You would use a CASE expression inside of the SUM(). This is conditional aggregation:
SELECT MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL,
SUM(case when REQUIREMENT_TYPE ='PB' then QUANTITY else 0 end) AS 'SUM_FCST_PB',
SUM(case when REQUIREMENT_TYPE ='01' then QUANTITY else 0 end) AS 'SUM_FCST_01'
FROM VW_MRP_ALLOCATION
WHERE REQUIREMENT_CATEGORY = 'A60381002'
AND MATERIAL_AVAILABILITY_DATE >= GETDATE()
GROUP BY MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL
ORDER BY MATERIAL,
SIZE_LITERAL
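A quick way to check the two-column version is to run it on a handful of made-up rows. In this sketch the table variable and its data are hypothetical stand-ins for VW_MRP_ALLOCATION, with the description, category, and date columns left out. The extra IN filter is an optional addition that skips rows neither column would use.

```sql
-- Hypothetical sample rows standing in for VW_MRP_ALLOCATION
DECLARE @Alloc TABLE (
    MATERIAL varchar(20), SIZE_LITERAL varchar(10),
    REQUIREMENT_TYPE varchar(2), QUANTITY int
);
INSERT INTO @Alloc VALUES
    ('M1', 'S', 'PB', 10), ('M1', 'S', '01', 4), ('M1', 'S', 'PB', 5),
    ('M1', 'S', 'ZZ', 100), ('M2', 'L', '01', 7);

SELECT MATERIAL,
       SIZE_LITERAL,
       SUM(CASE WHEN REQUIREMENT_TYPE = 'PB' THEN QUANTITY ELSE 0 END) AS SUM_FCST_PB,
       SUM(CASE WHEN REQUIREMENT_TYPE = '01' THEN QUANTITY ELSE 0 END) AS SUM_FCST_01
FROM @Alloc
WHERE REQUIREMENT_TYPE IN ('PB', '01')  -- optional: drops rows of any other type
GROUP BY MATERIAL, SIZE_LITERAL
ORDER BY MATERIAL, SIZE_LITERAL;
-- M1 / S: SUM_FCST_PB = 15, SUM_FCST_01 = 4
-- M2 / L: SUM_FCST_PB = 0,  SUM_FCST_01 = 7
```

Note that the IN filter also removes any group that has only other requirement types. Leave it out if those groups should still appear with zeros.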
You can use case inside sum as below:
SELECT MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL,
SUM(CASE WHEN [Requirement_Type] = 'PB' then QUANTITY else 0 end) AS 'SUM_FCST'
FROM VW_MRP_ALLOCATION
WHERE REQUIREMENT_CATEGORY = 'A60381002'
AND MATERIAL_AVAILABILITY_DATE >= GETDATE()
AND REQUIREMENT_TYPE ='PB'
GROUP BY MATERIAL,
MATERIAL_DESCRIPTION,
SIZE_LITERAL
ORDER BY MATERIAL,
SIZE_LITERAL

SSRS 2008 R2 - evaluating running total only on change of group

I have a report where I capture patient information, some of which is stored in the patient table and some of which is stored in the observations table. Taking date of birth as my example, if I count all the records for which the DOB has been supplied, I get significantly more than the total number of patients, because of the join to the observations table. How do I evaluate the running total only once for each group?
Edit: some sample data over at http://sqlfiddle.com/#!3/27b91/1/0. If I count birthdates from that query, I want 2 as the answer; same for race and ethnicity.
The following may or may not be the right approach for your specific situation, but it can be a useful technique to have at your disposal.
You can add some code to your select statement to help yourself answer questions like these 'downstream' (either via added criteria or via SSRS). See this modification of your SQL Fiddle:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank
from Person
join Obs on Person.pId = Obs.pId
) rankedData
The rowRank field will create a group-relative ranking number, which may or may not be useful to you downstream. The countableRow field will be either 1 or 0 such that each group will have one and only one row with a 1 in it. Doing SUM(countableRow) will give you the proper number of groups in your data.
Now, you can extend this functionality (if you wish) by dumping out actual field values instead of a constant scalar like 1 in the first row of each group. So, if you had CASE rowRank WHEN 1 THEN dateOfBirth ELSE NULL END AS countableDOB, you could then, for example, get the total number of people with each distinct birthday using just this dataset.
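As a sketch of how those two flag columns behave when aggregated, the query below sums them on a few made-up rows. The table variables and data are hypothetical and much smaller than the Fiddle schema, and the date type assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: two people, three observations in total
DECLARE @Person TABLE (pid int, dateOfBirth date);
DECLARE @Obs TABLE (pid int, obsName varchar(20), obsDate date);
INSERT INTO @Person VALUES (1, '1980-01-01'), (2, '1975-06-15');
INSERT INTO @Obs VALUES (1, 'HEIGHT', '2013-01-01'), (1, 'WEIGHT', '2013-01-02'),
                        (2, 'HEIGHT', '2013-02-01');

SELECT SUM(countableRow)   AS PatientCount,     -- 2, although the join yields 3 rows
       COUNT(countableDOB) AS PatientsWithDob   -- COUNT ignores the NULL filler rows
FROM (
    SELECT p.pid,
           -- 1 on the first row of each person, 0 on the rest
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY p.pid ORDER BY o.obsDate) = 1
                THEN 1 ELSE 0 END AS countableRow,
           -- the birth date on the first row of each person, NULL on the rest
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY p.pid ORDER BY o.obsDate) = 1
                THEN p.dateOfBirth END AS countableDOB
    FROM @Person p
    JOIN @Obs o ON p.pid = o.pid
) AS rankedData;
```

In a report, the same sums would be done by the consuming layer rather than by an outer query. The outer SELECT here only stands in for that step.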
Of course, you can do all those things using methods like @Russell's with SQL anyway, so this would be most relevant with specific downstream requirements that may not match your situation.
EDIT
Obviously the countableRow field there isn't a one-size-fits-all solution to the types of queries you want. I have added a few more examples of the PARTITION BY strategy to another SQL Fiddle:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow,
valueRank, CASE valueRank WHEN 1 THEN 1 ELSE 0 END AS valueCount,
dobRank, CASE WHEN dobRank = 1 AND dateOfBirth IS NOT NULL THEN 1 ELSE 0 END AS dobCount
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank,
ROW_NUMBER() OVER (PARTITION BY Obs.obsName, Obs.obsValue ORDER BY Obs.obsDate) AS valueRank,
ROW_NUMBER() OVER (PARTITION BY Person.dateOfBirth ORDER BY Person.pid) AS dobRank
from Person
join Obs on Person.pId = Obs.pId
) rankedData
Lest anyone misunderstand me as suggesting this is always appropriate, it obviously isn't. This isn't a better way of getting specific answers than running additional SQL queries. What it allows you to do is encode enough information in a single result set that the consuming code can answer such questions directly. That's where it can come in handy.
SECOND EDIT
Since you were wondering whether you can do this if race data is stored in more than one place, the answer is, absolutely. I have revised the code from my previous SQL Fiddle, which is now available in a new one:
select pid, firstName, lastName, dateOfBirth, obsName, obsValue, obsDate,
rowRank, CASE rowRank WHEN 1 THEN 1 ELSE 0 END AS countableRow,
valueRank, CASE valueRank WHEN 1 THEN 1 ELSE 0 END AS valueCount,
dobRank, CASE WHEN dobRank = 1 AND dateOfBirth IS NOT NULL THEN 1 ELSE 0 END AS dobCount,
raceRank, CASE WHEN raceRank = 1 AND (race IS NOT NULL OR obsName = 'RACE') THEN 1 ELSE 0 END AS raceCount
from
(
select Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth, Person.[race]
, Obs.obsName, Obs.obsValue, Obs.obsDate,
ROW_NUMBER() OVER (PARTITION BY Person.pid, Person.firstName, Person.lastName, Person.dateOfBirth ORDER BY Obs.obsDate) AS rowRank,
ROW_NUMBER() OVER (PARTITION BY Obs.obsName, Obs.obsValue ORDER BY Obs.obsDate) AS valueRank,
ROW_NUMBER() OVER (PARTITION BY Person.dateOfBirth ORDER BY Person.pid) AS dobRank,
ROW_NUMBER() OVER (PARTITION BY ISNULL(Person.race, CASE Obs.obsName WHEN 'RACE' THEN Obs.obsValue ELSE NULL END) ORDER BY Person.pid) AS raceRank
from Person
left join Obs on Person.pId = Obs.pId
) rankedData
As you can see, in the new Fiddle, this properly counts the number of Races as 3, with 2 being in the Obs table and the third being in the Person table. The trick is that PARTITION BY can contain expressions, not just raw column output. Note that I changed the join to a left join here, and that we need to use a CASE to only include obsValue WHERE obsName is 'RACE'. It is a little complicated, but not overwhelmingly so, and it handles even fairly complex cases gracefully.
It turned out that Jeroen's pointer to RunningValue was more on-target than I thought. I was able to get the results I wanted with the following code:
=RunningValue(Iif(Not IsNothing(Fields!DATEOFBIRTH.Value)
, Fields!PATIENTID.Value
, Nothing)
, CountDistinct
, Nothing
)
Thanks particularly to Dominic P, whose technique I'll keep in mind for next time.
This will only pull one record per patient, unless they reported different DOBs:
SELECT P.FOO,
P.BAR,
(etc.),
O.DOB
FROM Patients P
INNER JOIN Observations O
ON P.PatientID = O.PatientID
GROUP BY P.FOO, P.BAR, (P.etc), O.DOB
