Count unique IDs with Case - sql-server

I have this query (using SQL Server 2008):
SELECT
ISNULL(tt.strType,'Random'),
COUNT(DISTINCT t.intTestID) as Qt_Tests,
COUNT(CASE WHEN sd.intResult=4 THEN 1 ELSE NULL END) as Positives
FROM TB_Test t
LEFT JOIN TB_Sample s ON t.intTestID=s.intTestID
LEFT JOIN TB_Sample_Drug sd ON s.intSampleID=sd.intSampleID
LEFT JOIN TB_Employee e ON t.intEmployeeID=e.intEmployeeID
LEFT JOIN TB_Test_Type tt ON t.intTestTypeID=tt.intTestTypeID
WHERE
s.dtmCollection BETWEEN '2013-06-01 00:00' AND '2013-08-31 23:59'
AND e.intCompanyID = 91
GROUP BY
tt.strType
The thing is that each sample has four records in TB_Sample_Drug, which represent the drugs tested on that sample. A sample can be positive for anywhere from one to four drugs at the same time.
What I need to show in the third column is just whether the sample was positive, no matter for how many drugs.
I can't find a way to do that, because I need the CASE WHEN to count all the rows with intResult = 4, but only once per unique intSampleID.
Thanks in advance!

If I understand correctly, the easiest way is to use max() instead of count():
SELECT ISNULL(tt.strType,'Random'),
COUNT(DISTINCT t.intTestID) as Qt_Tests,
max(CASE WHEN sd.intResult = 4 THEN 1 ELSE NULL END) as HasPositives
. . .
EDIT:
If you want the number of samples with a positive result, then use count(distinct) with a case statement.
SELECT ISNULL(tt.strType,'Random'),
COUNT(DISTINCT t.intTestID) as Qt_Tests,
count(distinct CASE WHEN sd.intResult = 4 THEN s.intSampleID end) as NumPositiveSamples
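The difference between counting positive rows and counting positive samples can be checked on a tiny data set. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of SQL Server, a simplified sample_drug table, and a hypothetical intDrugID column; only the COUNT(DISTINCT CASE ...) pattern comes from the answer above.

```python
import sqlite3

# In-memory database with a simplified stand-in for TB_Sample_Drug.
# intDrugID is a hypothetical column added only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample_drug (intSampleID INTEGER, intDrugID INTEGER, intResult INTEGER);
    INSERT INTO sample_drug VALUES
        (1, 1, 4), (1, 2, 4), (1, 3, 0), (1, 4, 0),  -- sample 1: positive for two drugs
        (2, 1, 0), (2, 2, 0), (2, 3, 0), (2, 4, 0),  -- sample 2: negative
        (3, 1, 4), (3, 2, 0), (3, 3, 0), (3, 4, 0);  -- sample 3: positive for one drug
""")

positive_rows, positive_samples = conn.execute("""
    SELECT COUNT(CASE WHEN intResult = 4 THEN 1 END),
           COUNT(DISTINCT CASE WHEN intResult = 4 THEN intSampleID END)
    FROM sample_drug
""").fetchone()

# The plain COUNT counts every positive drug row; COUNT(DISTINCT ...) counts
# each positive sample once, because the CASE yields NULL for negative rows.
print(positive_rows, positive_samples)
```

With this data the first count sees three positive rows, while the second sees two positive samples.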

Instead of this:
COUNT(CASE WHEN sd.intResult=4 THEN 1 ELSE NULL END) as Positives
try this:
SUM(CASE WHEN sd.intResult=4 THEN 1 ELSE 0 END) as Positives

This should work if I understood correctly that you want the Positives column to reflect whether a sample has any row in TB_Sample_Drug marked with intResult = 4 (counted as 1) or not (NULL, not counted):
SELECT
ISNULL(tt.strType,'Random'),
COUNT(DISTINCT t.intTestID) as Qt_Tests,
COUNT(CASE WHEN sd.Positives > 0 THEN 1 ELSE NULL END) as Positives
FROM TB_Test t
LEFT JOIN TB_Sample s ON t.intTestID=s.intTestID
LEFT JOIN TB_Employee e ON t.intEmployeeID=e.intEmployeeID
LEFT JOIN TB_Test_Type tt ON t.intTestTypeID=tt.intTestTypeID
CROSS APPLY (select count(*) as Positives from TB_Sample_Drug sd where s.intSampleID=sd.intSampleID and sd.intResult=4) sd
WHERE
s.dtmCollection BETWEEN '2013-06-01 00:00' AND '2013-08-31 23:59'
AND e.intCompanyID = 91
GROUP BY
tt.strType

Related

How to store result of CASE expression into (temporary/not) variable and use it to calculate concurrently inside the same SELECT statement

In other programming languages we can do this:
A = 5
B = A + 7
Does anyone know how to do this in SQL? Something like this:
SELECT
(CASE WHEN (CASE WHEN t.[Id] IS NULL THEN 0 ELSE 1 END) = 1 THEN a.[123]
ELSE a.[432] END) AS 'Points',
#'Points' + 5 AS 'FinalPoints' -- Is there any way to do this?
FROM DeliveryOrder AS do
LEFT JOIN Transaction AS t ON t.DOId = do.Id
LEFT JOIN Amount AS a ON do.Id = a.Id
I would really appreciate any ideas on how to do this.
P.S. I know I can use a subquery to simulate this, but in my case each column has many CASE..WHEN expressions, with further CASE expressions nested inside them, so combining all of them into a single subquery makes the query terrible to read :(
You can't do exactly what you want. Aliases defined in the select clause cannot be reused in the same clause, so you need to either repeat the expression, or use a derived table (subquery or cte).
Your nested case seems overkill - I think that your code can be simplified as:
SELECT
CASE WHEN t.[Id] IS NULL THEN 10 ELSE 5 END AS points,
CASE WHEN t.[Id] IS NULL THEN 15 ELSE 10 END finalPoints
FROM DeliveryOrder AS do
LEFT JOIN [Transaction] AS t ON t.DOId = do.Id
Or using a subquery:
SELECT
points,
points + 5 finalPoints
FROM (
SELECT CASE WHEN t.[Id] IS NULL THEN 10 ELSE 5 END AS points
FROM DeliveryOrder AS do
LEFT JOIN [Transaction] AS t ON t.DOId = do.Id
) t
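The derived-table approach can be tried end to end on a toy schema. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) rather than SQL Server, a hypothetical table name Txn in place of Transaction, and the same illustrative point values 10 and 5 used in the answer above.

```python
import sqlite3

# Toy schema: order 1 has a transaction, order 2 has none.
# "Txn" is a hypothetical stand-in for the Transaction table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DeliveryOrder (Id INTEGER PRIMARY KEY);
    CREATE TABLE Txn (Id INTEGER PRIMARY KEY, DOId INTEGER);
    INSERT INTO DeliveryOrder VALUES (1), (2);
    INSERT INTO Txn VALUES (10, 1);
""")

# The inner query names the CASE result once; the outer query can then
# reuse that name as an ordinary column.
rows = conn.execute("""
    SELECT Id, points, points + 5 AS finalPoints
    FROM (
        SELECT d.Id AS Id,
               CASE WHEN t.Id IS NULL THEN 10 ELSE 5 END AS points
        FROM DeliveryOrder AS d
        LEFT JOIN Txn AS t ON t.DOId = d.Id
    ) AS derived
    ORDER BY Id
""").fetchall()

for row in rows:
    print(row)
```

Each long CASE expression is written once in the inner query, so the outer SELECT stays short even when many columns depend on the same intermediate value.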
Another way is to use CROSS APPLY:
SELECT
p.Points
,p.Points+5 'FinalPoints'
FROM DeliveryOrder AS do
LEFT JOIN dbo.[Transaction] AS t ON t.DOId = do.Id
cross apply (
select CASE WHEN T.Id IS NOT NULL THEN 5 ELSE 10 END AS 'Points'
) p
Using a subquery:
SELECT Points, Points + 5 AS FinalPoints
FROM
(
SELECT CASE WHEN (CASE WHEN T.Id IS NULL THEN 0 ELSE 1 END) = 1 THEN 5 ELSE 10 END Points
FROM DeliveryOrder AS DO
LEFT JOIN [Transaction] AS T ON DO.Id = T.DOId
) T
You could get rid of the nested CASE expression and directly use
CASE WHEN T.Id IS NOT NULL THEN 5 ELSE 10 END Points

Aggregate function returns incorrect value when joining more tables

I am getting the Total sale using the following query.
SELECT SUM([B].[TotalSale])
FROM [dbo].[BookingDetail] [BF] WITH (READPAST)
INNER JOIN [dbo].[Booking] [B] WITH (READPAST) ON [B].[BookingDetailID] = [BF].[ID]
WHERE [BF].[MarketID] = '2'
I want to add another column to get the Gross Sale.
For that I have to make a join with another table called AirTraveler.
But once I add a new table to the query
SELECT
SUM([B].[TotalSale]) ,
SUM(CASE WHEN [B].[TravelSectorID] = 3 AND [B].[BookingStatusID] IN (16, 20, 22, 23) THEN COALESCE([B].[TotalSale], 0.0)
WHEN ([B].[TravelSectorID] = 1 AND [B].[IsDomestic] = 1 AND CONVERT(varchar, [AT].[FareDetails].query('string(/AirFareInfo[1]/PT[1])')) = 'FlightAndHotel') THEN [AT].[TotalSale]
ELSE 0 END) AS [GrossSale]
FROM [dbo].[BookingDetail] [BF] WITH (READPAST)
INNER JOIN [dbo].[Booking] [B] WITH (READPAST) ON [B].[BookingDetailID] = [BF].[ID]
LEFT OUTER JOIN [dbo].[AirTraveler] [AT] WITH(READPAST) ON [B].[ID] = [AT].[BookingID]
WHERE [BF].[MarketID] = '2'
it gives an incorrect result for [TotalSale]. The aggregate functions return wrong values because there may be multiple AirTraveler rows per Booking ID, which is correct for the data. What can I do to solve the aggregate function problem?
I am actually stuck.
I am using SQL Server.
Thanks in advance.
Not tested, so check it against your data, but when joining to a lower-level table causes rows of a header table to be double counted, you can pre-aggregate the lower-level table before joining it:
SELECT
SUM([B].[TotalSale]) ,
SUM(CASE WHEN [B].[TravelSectorID] = 3
AND [B].[BookingStatusID] IN (16, 20, 22, 23)
THEN COALESCE([B].[TotalSale], 0.0)
WHEN [B].[TravelSectorID] = 1 AND [B].[IsDomestic] = 1
THEN [AT].[TotalSale]
ELSE 0 END) AS [GrossSale]
FROM [dbo].[BookingDetail] [BF] WITH (READPAST)
INNER JOIN [dbo].[Booking] [B] WITH (READPAST) ON [B].[BookingDetailID] = [BF].[ID]
LEFT OUTER JOIN
(
-- one row per booking, so the join can no longer multiply Booking rows
SELECT BookingID, SUM(CASE WHEN
CONVERT(varchar(50), [FareDetails].query('string(/AirFareInfo[1]/PT[1])'))
= 'FlightAndHotel' THEN [TotalSale] ELSE 0 END) AS TotalSale
FROM [dbo].[AirTraveler] [AT] WITH(READPAST)
GROUP BY BookingID
) [AT]
ON [B].[ID] = [AT].[BookingID]
WHERE [BF].[MarketID] = '2'
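I also gave your varchar cast an explicit size. Without one, CAST and CONVERT default to varchar(30) (the default is 1 only in declarations), so your comparison would still have worked, but an explicit length avoids silent truncation.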
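The double-counting effect and the pre-aggregation fix can be reproduced on a few rows. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of SQL Server and heavily simplified Booking and AirTraveler tables; the XML condition and the other filters from the question are left out.

```python
import sqlite3

# Booking 1 has three travelers, booking 2 has none (simplified tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Booking (ID INTEGER PRIMARY KEY, TotalSale REAL);
    CREATE TABLE AirTraveler (BookingID INTEGER, TotalSale REAL);
    INSERT INTO Booking VALUES (1, 100.0), (2, 50.0);
    INSERT INTO AirTraveler VALUES (1, 30.0), (1, 20.0), (1, 10.0);
""")

# Naive join: booking 1 is repeated once per traveler before SUM runs.
(naive_total,) = conn.execute("""
    SELECT SUM(b.TotalSale)
    FROM Booking AS b
    LEFT JOIN AirTraveler AS trav ON trav.BookingID = b.ID
""").fetchone()

# Pre-aggregated join: the derived table has at most one row per booking.
fixed_total, traveler_total = conn.execute("""
    SELECT SUM(b.TotalSale), SUM(trav.TotalSale)
    FROM Booking AS b
    LEFT JOIN (
        SELECT BookingID, SUM(TotalSale) AS TotalSale
        FROM AirTraveler
        GROUP BY BookingID
    ) AS trav ON trav.BookingID = b.ID
""").fetchone()

print(naive_total, fixed_total, traveler_total)
```

The naive query counts booking 1 three times (350 in total), while the pre-aggregated query returns the real booking total of 150 and still sums all traveler amounts (60).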

How can I convert T-SQL written with CASE WHEN to PIVOT or window functions

How can I convert the following T-SQL, which is written with CASE WHEN, to use PIVOT or window functions?
SELECT
T.TaskID,
SUM(CASE WHEN T.LogDate<'2016-02-04' AND T.TaskStatusID=2 THEN ISNULL(DA_CHILD.Value,0)*(T.DoneScore/100) ELSE 0 END) PreAmount,
SUM(CASE WHEN T.LogDate>='2016-02-04' AND T.LogDate<='2017-02-04' AND T.TaskStatusID=2 THEN ISNULL(DA_CHILD.Value,0)*(T.DoneScore/100) ELSE 0 END) CurAmount,
SUM(CASE WHEN T.LogDate>'2017-02-04' AND T.TaskStatusID=2 THEN ISNULL(DA_CHILD.Value,0)*(T.DoneScore/100) ELSE 0 END) AfterAmount
FROM
NetTasks$ T
INNER JOIN NetDeviceActions DA ON DA.DeviceActionID=T.DeviceActionID
INNER JOIN NetActionParents AP ON AP.ParentID=DA.ActionID
INNER JOIN NetDeviceActions DA_CHILD ON DA_CHILD.ActionID=AP.ChildID AND
DA_CHILD.DeviceID=DA.DeviceID AND
DA_CHILD.ContractInfoID=DA.ContractInfoID
WHERE
T.ParentTaskID = 0 AND
T.FinishDate<='2017-01-07' AND
DA.ContractInfoID=15
GROUP BY
T.TaskID, T.DoneScore,T.FinishDate
This is my result:
TaskID  PreAmount  CurAmount  AfterAmount
686170  0          0          0
655768  NULL       0          0
734520  0          NULL       0
682661  0          NULL       0
Here is the first transformation that you have to put your data through:
Please test this for errors first
SELECT
T.TaskID,
CASE
WHEN T.LogDate<'2016-02-04' AND T.TaskStatusID=2
THEN 'PreAmount'
WHEN T.LogDate>='2016-02-04' AND T.LogDate<='2017-02-04'
AND T.TaskStatusID=2 THEN 'CurAmount'
WHEN T.LogDate>'2017-02-04' AND T.TaskStatusID=2 THEN 'AfterAmount'
END As Type,
ISNULL(DA_CHILD.Value,0)*(T.DoneScore/100) Amount
FROM
NetTasks$ T
INNER JOIN NetDeviceActions DA ON DA.DeviceActionID=T.DeviceActionID
INNER JOIN NetActionParents AP ON AP.ParentID=DA.ActionID
INNER JOIN NetDeviceActions DA_CHILD ON DA_CHILD.ActionID=AP.ChildID
AND DA_CHILD.DeviceID=DA.DeviceID
AND DA_CHILD.ContractInfoID=DA.ContractInfoID
WHERE
T.ParentTaskID = 0 AND
T.FinishDate<='2017-01-07' AND
DA.ContractInfoID=15
Now if you take a look at this page:
https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
The query above takes the place of (<SELECT query that produces the data>)
So when we follow those instructions and look at some examples, we come up with this:
SELECT
TaskID,
[PreAmount],[CurAmount],[AfterAmount]
FROM (
SELECT
T.TaskID,
CASE
WHEN T.LogDate<'2016-02-04' AND T.TaskStatusID=2
THEN 'PreAmount'
WHEN T.LogDate>='2016-02-04' AND T.LogDate<='2017-02-04'
AND T.TaskStatusID=2 THEN 'CurAmount'
WHEN T.LogDate>'2017-02-04' AND T.TaskStatusID=2 THEN 'AfterAmount'
END As Type,
ISNULL(DA_CHILD.Value,0)*(T.DoneScore/100) Amount
FROM
NetTasks$ T
INNER JOIN NetDeviceActions DA ON DA.DeviceActionID=T.DeviceActionID
INNER JOIN NetActionParents AP ON AP.ParentID=DA.ActionID
INNER JOIN NetDeviceActions DA_CHILD ON DA_CHILD.ActionID=AP.ChildID
AND DA_CHILD.DeviceID=DA.DeviceID
AND DA_CHILD.ContractInfoID=DA.ContractInfoID
WHERE
T.ParentTaskID = 0 AND
T.FinishDate<='2017-01-07' AND
DA.ContractInfoID=15
) As SRC
PIVOT
(
SUM(Amount)
FOR Type IN ([PreAmount],[CurAmount],[AfterAmount])
) As PivotTable
Here are all the problems with this:
You are not going to see any performance improvement here
Your code is more complicated and less maintainable
Problems with original query:
it's full of hard coded dates
SUMing a rate usually gives you a nonsense number. Usually you sum numerator and denominator first then divide. Maybe that's your real problem
P.S. A window function makes even less sense here; I can't think of a sensible way to do that.

TSQL Partitioning to COUNT() only subset of rows

I am working on a query that, for every month and year in a sales table, returns a SUM() of the total items ordered, a count of the distinct accounts ordering items, and a couple of SUM() breakdowns by product type. See the query below for an example:
SELECT
YearReported,
MonthReported,
COUNT(DISTINCT DiamondId) as Accounts,
SUM(Quantity) as TotalUnitsOrdered,
SUM(CASE P.ProductType WHEN 1 THEN Quantity ELSE 0 END) as MonthliesOrdered,
SUM(CASE P.ProductType WHEN 1 THEN 0 ELSE Quantity END) as TPBOrdered
FROM
RetailOrders R WITH (NOLOCK)
LEFT JOIN
Products P WITH (NOLOCK) ON R.ProductId = P.ProductId
GROUP BY
YearReported, MonthReported
The problem I am facing now is that I also need to get the count of distinct accounts broken out based on another field in the dataset. For example:
SELECT
YearReported,
MonthReported,
COUNT(DISTINCT DiamondId) as Accounts,
SUM(Quantity) as TotalUnitsOrdered,
SUM(CASE P.ProductType WHEN 1 THEN Quantity ELSE 0 END) as MonthliesOrdered,
SUM(CASE P.ProductType WHEN 1 THEN 0 ELSE Quantity END) as TPBOrdered,
SUM(CASE IsInitial WHEN 1 THEN Quantity ELSE 0 END) as InitialOrders,
SUM(CASE IsInitial WHEN 0 THEN Quantity ELSE 0 END) as Reorders,
COUNT(/*DISTINCT DiamondId WHERE IsInitial = 1 */) as InitialOrderAccounts
FROM
RetailOrders R WITH (NOLOCK)
LEFT JOIN
Products P WITH (NOLOCK) ON R.ProductId = P.ProductId
GROUP BY
YearReported, MonthReported
Obviously we would replace the commented out section in the last SUM with something that would not throw an error. I just added that for illustrative purposes.
I feel like this can be done using the partition methods in SQL, but I have to admit that I am just not very good with them and can't figure out how to do this. The MS documentation online for partitions really just made my head hurt after reading it last night.
EDIT: I mistakenly had the last aggregate function as SUM; I meant for it to be a COUNT().
So to help clarify, COUNT(DISTINCT DiamondId) will give me back a count of all unique DiamondId values in the set, but I also need a COUNT() of all unique DiamondId values in the set whose corresponding IsInitial flag is set to 1.
Just a matter of nulling the ones that don't qualify:
count(distinct
case
when IsInitial = 1 then DiamondId
/* else null */
end
)

SQL not doing the join correctly

I have a SQL statement with several JOIN conditions. It works fine for all of them except the last one. The code is below:
SELECT
A.EMPL_CTG,
B.DESCR AS PrName,
SUM(A.CURRENT_COMPRATE) AS SALARY_COST_BUDGET,
SUM(A.BUDGET_AMT) AS BUDGET_AMT,
SUM(A.BUDGET_AMT)*100/SUM(A.CURRENT_COMPRATE) AS MERIT_GOAL,
SUM(C.FACTOR_XSALARY) AS X_Programp,
SUM(A.FACTOR_XSALARY) AS X_Program,
COUNT(A.EMPLID) AS EMPL_CNT,
COUNT(D.EMPLID),
SUM(CASE WHEN A.PROMOTION_SECTION = 'Y' THEN 1 ELSE 0 END) AS PRMCNT,
SUM(CASE WHEN A.EXCEPT_IND = 'Y' THEN 1 ELSE 0 END) AS EXPCNT,
(SUM(CASE WHEN A.PROMOTION_SECTION = 'Y' THEN 1 ELSE 0 END)+SUM(CASE WHEN A.EXCEPT_IND = 'Y' THEN 1 ELSE 0 END))*100/(COUNT(A.EMPLID)) AS PEpercent
FROM
EMP_DTL A INNER JOIN EMPL_CTG_L1 B ON A.EMPL_CTG = B.EMPL_CTG
INNER JOIN
ECM_PRYR_VW C ON A.EMPLID=C.EMPLID
INNER JOIN ECM_INELIG D on D.EMPL_CTG=A.EMPL_CTG and D.YEAR=YEAR(getdate())
WHERE
A.YEAR=YEAR(getdate())
AND B.EFF_STATUS='A'
GROUP BY
A.EMPL_CTG,
B.DESCR
ORDER BY B.DESCR
COUNT(D.EMPLID) returns the same value as COUNT(A.EMPLID), but I need the count of EMPLIDs from table D in the join condition. Any help?
COUNT() (and also the other GROUP BY aggregate functions) doesn't process only the rows from one of the tables.
They work on all the rows produced by the JOIN. If the JOIN without GROUP BY produces 42 rows, then COUNT(*) and COUNT(1) return 42, while COUNT(A.EMPLID) and COUNT(D.EMPLID) return the number of non-NULL values in those columns.
To get the number of rows extracted from one of the tables, you should use COUNT(DISTINCT). It ignores NULL values and also the duplicates produced by the JOIN.
Change COUNT(D.EMPLID) to COUNT(DISTINCT D.EMPLID).
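The behaviour described above is easy to observe on a small join. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) instead of SQL Server and cut-down versions of EMP_DTL and ECM_INELIG that keep only the EMPLID and EMPL_CTG columns, with invented sample rows.

```python
import sqlite3

# Three employees in EMP_DTL and two in ECM_INELIG, all in category 'X'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP_DTL (EMPLID INTEGER, EMPL_CTG TEXT);
    CREATE TABLE ECM_INELIG (EMPLID INTEGER, EMPL_CTG TEXT);
    INSERT INTO EMP_DTL VALUES (1, 'X'), (2, 'X'), (3, 'X');
    INSERT INTO ECM_INELIG VALUES (7, 'X'), (8, 'X');
""")

counts = conn.execute("""
    SELECT COUNT(*),
           COUNT(A.EMPLID),
           COUNT(D.EMPLID),
           COUNT(DISTINCT A.EMPLID),
           COUNT(DISTINCT D.EMPLID)
    FROM EMP_DTL AS A
    INNER JOIN ECM_INELIG AS D ON D.EMPL_CTG = A.EMPL_CTG
""").fetchone()

# Joining on the category pairs every A row with every D row (3 x 2 = 6),
# so the plain counts all see 6 rows; only the DISTINCT counts recover
# the per-table numbers.
print(counts)
```

Because the join multiplies rows, any SUM over columns of A in the same query would presumably be inflated in the same way, so it is worth checking those totals as well.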
