Distinct count needed within case when statement

I am writing a query of all people who were screened for smoking status and need a count of unique patients. I am pulling from an encounter table, so the same patient could have been asked multiple times. In my CASE WHEN expression I would like to limit the "THEN ..." result to something like "then count distinct patients", but that gives me an error about aggregates not being allowed within an aggregate. If I remove it, the query no longer produces a total as I wish, and I am told the column needs to be in the GROUP BY clause, which I do not want. To the best of my knowledge, LIMIT is not an option in SQL Server.
,count(case when soc.tobacco_user_c in (1, 2, 4, 5) and dmw.SMOKING_CESS_CNSL_YN ='y' then enc.PAT_ID **Here is where I want a unique count of patients** end) Compliant

You can combine DISTINCT with a CASE expression.
Example
SELECT
COUNT(DISTINCT CASE WHEN tobacco = 1 THEN PAT_ID ELSE NULL END)
...
;
I've abbreviated your example to make it easier to read. COUNT ignores NULLs, so rows that do not match the CASE condition are not included in the final count and there is no need to worry about off-by-one errors.
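To see the difference between counting encounters and counting patients, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the table name enc and its sample rows are made up for illustration. The COUNT(DISTINCT CASE ...) expression itself is written the same way in T-SQL.

```python
import sqlite3

# Hypothetical encounter table: one row per encounter, so a patient can repeat.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enc (PAT_ID INTEGER, tobacco INTEGER)")
conn.executemany(
    "INSERT INTO enc (PAT_ID, tobacco) VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 1), (3, 0), (3, 0), (4, 1), (4, 0)],
)

# The CASE yields PAT_ID for matching rows and NULL otherwise.
# COUNT skips the NULLs; DISTINCT collapses repeated patients.
encounters, patients = conn.execute(
    """
    SELECT
        COUNT(CASE WHEN tobacco = 1 THEN PAT_ID END)          AS encounters,
        COUNT(DISTINCT CASE WHEN tobacco = 1 THEN PAT_ID END) AS patients
    FROM enc
    """
).fetchone()

print(encounters, patients)
```

With these sample rows, the plain COUNT sees four matching encounters while the DISTINCT version sees three patients.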

case when soc.tobacco_user_c in (1, 2, 4, 5) and dmw.SMOKING_CESS_CNSL_YN ='y' then COUNT(DISTINCT enc.PAT_ID) ELSE 0 end Compliant

I ended up creating two subqueries and then doing a SELECT COUNT(DISTINCT ...) on each of the MAX columns in those queries to limit the results to one.

Related

Is there a way you can produce an output like this in T-SQL

I have a column whose values I translate using a CASE expression, and I get numbers like the ones below. There are multiple columns for which I need to produce a result like this; this is just one of them.
How do you produce the output as a whole, like this below?
12 is the count of the numbers, from top to bottom.
49 is the sum.
4.08 is the average, 49/12.
1 is how many 1's there are in the output list above. As you can see, there is only one 1.
8.33% is the percentage, which comes from 1/12 * 100,
and so on. Is there a way to produce this output?
drop table test111
create table test111
(
Q1 nvarchar(max)
);
INSERT INTO TEST111(Q1)
VALUES('Strongly Agree')
,('Agree')
,('Disagree')
,('Strongly Disagree')
,('Strongly Agree')
,('Agree')
,('Disagree')
,('Neutral');
SELECT
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS 'Test Q1'
FROM test111

I have to make a few assumptions here, but it looks like you want to treat an output column like a column in a spreadsheet. You have 12 numbers. You then have a blank "separator" row. Then a row with the number 12 (which is the count of how many numbers you have). Then a row with the number 49, which is the sum of those 12 numbers. Then the 4.08 row, which is roughly the average, and so on.
Some of these outputs can be provided by cube or rollup, but neither is a complete solution.
If you wanted to produce this output directly from TSQL, you would need to have multiple select statements and combine the results of all of those statements using union all. First you would have a select just to get the numbers. Then you would have a second select which outputs a "blank". Then another select which is providing a count. Then another select which is providing a sum. And so on.
You would also no longer be able to output actual numbers, since a "blank" is not a number. Visually it's best represented as an empty string. But now your output column has to be of datatype char or varchar.
You also have to make sure rows come out in the correct order for presentation. So you need a column to order by. You would have to add some kind of ordering column "manually" to each of the select statements, so when you union them all together you can tell SQL in what order the output should be provided.
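The steps above can be sketched as follows. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, it uses SQLite's rowid as the row identifier, and the names scored, out_value and ordering are made up. It covers the blank, count, sum and average rows; the remaining rows would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test111 (Q1 TEXT)")
conn.executemany(
    "INSERT INTO test111 (Q1) VALUES (?)",
    [("Strongly Agree",), ("Agree",), ("Disagree",), ("Strongly Disagree",),
     ("Strongly Agree",), ("Agree",), ("Disagree",), ("Neutral",)],
)

query = """
WITH scored AS (
    SELECT rowid AS rn,
           CASE Q1 WHEN 'Strongly Agree'    THEN 5
                   WHEN 'Agree'             THEN 4
                   WHEN 'Neutral'           THEN 3
                   WHEN 'Disagree'          THEN 2
                   WHEN 'Strongly Disagree' THEN 1
           END AS score
    FROM test111
)
SELECT out_value
FROM (
    -- every branch outputs text, because the blank row is not a number
    SELECT CAST(score AS TEXT) AS out_value, 0 AS ordering, rn FROM scored
    UNION ALL
    SELECT '', 1, 0                                              -- separator
    UNION ALL
    SELECT CAST(COUNT(*) AS TEXT), 2, 0 FROM scored              -- count
    UNION ALL
    SELECT CAST(SUM(score) AS TEXT), 3, 0 FROM scored            -- sum
    UNION ALL
    SELECT CAST(ROUND(AVG(score), 2) AS TEXT), 4, 0 FROM scored  -- average
)
ORDER BY ordering, rn
"""

lines = [row[0] for row in conn.execute(query)]
print("\n".join(lines))
```

Note that the sample table from the question has 8 rows rather than 12, so the summary rows here are 8, 26 and 3.25.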
So the answer to "can it be done?" is technically "yes". But if you think this seems like a whole lot of laborious and inefficient TSQL work, you'd be right.
The real solution here is to change your approach. SQL should not be concerned with "output formatting". What you should do is just return the actual data (your 12 numbers) from SQL, and then do all of the additional presentation (like adding a blank row, adding a count row, etc), in the code of the program that is calling SQL to get that data.
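As a rough sketch of that division of labour, the calling program below fetches only the numbers and builds the presentation rows itself. It assumes Python as the calling language and SQLite (via sqlite3) as a stand-in database; the variable names and the two-decimal formatting are illustrative choices, not requirements from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test111 (Q1 TEXT)")
conn.executemany(
    "INSERT INTO test111 (Q1) VALUES (?)",
    [("Strongly Agree",), ("Agree",), ("Disagree",), ("Strongly Disagree",),
     ("Strongly Agree",), ("Agree",), ("Disagree",), ("Neutral",)],
)

# SQL returns only the data: one number per row.
scores = [row[0] for row in conn.execute(
    """
    SELECT CASE Q1 WHEN 'Strongly Agree'    THEN 5
                   WHEN 'Agree'             THEN 4
                   WHEN 'Neutral'           THEN 3
                   WHEN 'Disagree'          THEN 2
                   WHEN 'Strongly Disagree' THEN 1
           END
    FROM test111
    ORDER BY rowid
    """
)]

# The presentation layer adds the blank row and the summary rows.
total = len(scores)
ones = scores.count(1)
report = [str(s) for s in scores] + [
    "",                              # separator
    str(total),                      # count
    str(sum(scores)),                # sum
    f"{sum(scores) / total:.2f}",    # average
    str(ones),                       # how many 1's
    f"{100 * ones / total:.2f}%",    # share of 1's
]
print("\n".join(report))
```

The query stays trivial, the result column keeps a numeric type, and changing the layout later means touching only the reporting code.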

I must say, this is one of the strangest T-SQL requirements I've seen, and is really best left to the presentation layer.
It is possible using GROUPING SETS though. We can use it to get an extra rollup row that aggregates the whole table.
Once you have the rollup, you need to unpivot the totalled row (identified by GROUPING() = 1) to get your final result. We can do this using CROSS APPLY.
This is impossible without a row-identifier. I have added ROW_NUMBER, but any primary or unique key will do.
WITH YourTable AS (
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rn,
CASE WHEN [Q1] = 'Strongly Agree' THEN 5
WHEN [Q1] = 'Agree' THEN 4
WHEN [Q1] = 'Neutral' THEN 3
WHEN [Q1] = 'Disagree' THEN 2
WHEN [Q1] = 'Strongly Disagree' THEN 1
END AS TestQ1
FROM test111
),
RolledUp AS (
SELECT
rn,
TestQ1,
grouping = GROUPING(TestQ1),
count = COUNT(*),
sum = SUM(TestQ1),
avg = AVG(TestQ1 * 1.0),
one = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END),
onePct = COUNT(CASE WHEN TestQ1 = 1 THEN 1 END) * 1.0 / COUNT(*)
FROM YourTable
GROUP BY GROUPING SETS(
(rn, TestQ1),
()
)
)
SELECT v.TestQ1
FROM RolledUp r
CROSS APPLY (
SELECT r.TestQ1, 0 AS ordering
WHERE r.grouping = 0
UNION ALL
SELECT v.value, v.ordering
FROM (VALUES
(NULL , 1),
(r.count , 2),
(r.sum , 3),
(r.avg , 4),
(r.one , 5),
(r.onePct, 6)
) v(value, ordering)
WHERE r.grouping = 1
) v
ORDER BY
v.ordering,
r.rn;

Use count and case together using SQL

I want a query that returns 3 columns: how many type-A blood patients there are in each patient set,
how many type-B blood patients there are, and how many countries there are based on the patients.
Each patient is identified by a unique ID, so patientID is what I'm counting. Each state is just a state abbreviation, and blood type is just letters.
There are also sets of patients. A set is just a group of patients lumped together, so a bunch of patientIDs; the set IDs are unique as well, like the patientIDs.
So far I have something like this. I don't want to use SUM, because that would add the patientID numbers together; I should be using COUNT. Is there a way to count using a CASE expression? Or is there a better way to accomplish what I want?
select distinct PTID,
select count (patientID CASE WHEN bloodtype = 'A') as totalAbloodtype,
select count (patientID CASE WHEN bloodtype = 'AB') as totalABbloodtype,
select count (distinct countrycode) as totalcountriesinset
from patientsinfo
and PTID is not null
group by PTID

You can use SUM with a CASE expression. You don't need DISTINCT together with GROUP BY, and your query has an abundance of SELECTs but is missing its WHERE, so as written it is just a syntax error.
Obviously, with no sample data or desired results I cannot test this, but the idea is:
select PTID,
sum (CASE WHEN bloodtype = 'A' then 1 else 0 end) as totalAbloodtype,
sum (CASE WHEN bloodtype = 'AB' then 1 else 0 end) as totalABbloodtype,
count (distinct countrycode) as totalcountriesinset
from patientsinfo
where PTID is not null
group by PTID
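For what it's worth, the COUNT form the question asked about works too: COUNT over a CASE with no ELSE only counts the rows where the CASE returns a non-NULL value. Below is a small sketch showing both forms side by side. It assumes SQLite through Python's sqlite3 module as a stand-in engine, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patientsinfo "
    "(PTID INTEGER, patientID INTEGER, bloodtype TEXT, countrycode TEXT)"
)
conn.executemany(
    "INSERT INTO patientsinfo VALUES (?, ?, ?, ?)",
    [
        (1, 101, "A", "US"),
        (1, 102, "AB", "US"),
        (1, 103, "A", "CA"),
        (2, 201, "AB", "MX"),
        (2, 202, "O", "MX"),
        (None, 301, "A", "US"),   # no set: filtered out by the WHERE
    ],
)

result = conn.execute(
    """
    SELECT PTID,
           -- SUM adds 1 per matching row, not the patientID values
           SUM(CASE WHEN bloodtype = 'A' THEN 1 ELSE 0 END)    AS totalAbloodtype,
           -- COUNT skips the NULLs produced for non-matching rows
           COUNT(CASE WHEN bloodtype = 'AB' THEN patientID END) AS totalABbloodtype,
           COUNT(DISTINCT countrycode)                          AS totalcountriesinset
    FROM patientsinfo
    WHERE PTID IS NOT NULL
    GROUP BY PTID
    ORDER BY PTID
    """
).fetchall()

print(result)
```

One difference to keep in mind: for a set with no matching rows, the SUM form returns 0 because of the ELSE 0, and the COUNT form also returns 0, but a SUM over a CASE without ELSE would return NULL.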

T-SQL Selecting TOP 1 In A Query With Aggregates/Groups

I'm still fairly new to SQL. This is a stripped-down version of the query I'm trying to run. The query is supposed to find customers with more than 3 cases and display either the top 1 case or all cases, while still showing the overall count of cases per customer in each row in addition to the case numbers.
The TOP 1 subquery approach didn't work but is there another way to get the results I need? Hope that makes sense.
Here's the code:
SELECT t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
,COUNT(t1.CaseNo) AS CasesCount
,(SELECT TOP 1 t1.CaseNo)
FROM MainDatabase t1
INNER JOIN CustomerDatabase t2
ON t1.StoreID = t2.StoreID
WHERE t1.SubmittedDate >= '01/01/2017' AND t1.SubmittedDate <= '05/31/2017'
GROUP BY t1.StoreID, t1.CustomerID, t2.LastName, t2.FirstName
HAVING COUNT (t1.CaseNo) >= 3
ORDER BY t1.StoreID, t1.PatronID
I would like it to look something like this, either one row with just the most recent case and detail or several rows showing all details of each case in addition to the store id, customer id, last name, first name, and case count.

For these I usually like to make a temp table of aggregates:
DROP TABLE IF EXISTS #tmp;
CREATE TABLE #tmp (
CustomerID int NOT NULL DEFAULT 0,
case_count int NOT NULL DEFAULT 0,
case_max int NOT NULL DEFAULT 0
);
-- one row per customer: number of cases and highest case number
INSERT INTO #tmp
(CustomerID, case_count, case_max)
SELECT t1.CustomerID, COUNT(t1.CaseNo), MAX(t1.CaseNo)
FROM MainDatabase t1
GROUP BY t1.CustomerID;
Then you can join this #tmp table back to any other table on which you want to display the number of cases or the max case number. You can limit it to customers that have more than 3 cases with WHERE case_count > 3.
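Here is a minimal sketch of the join-back step, so every case row carries the per-customer totals. It makes several assumptions: SQLite via Python's sqlite3 stands in for SQL Server (so the temp table is named tmp rather than #tmp), the sample rows are invented, the threshold is >= 3 to match the HAVING clause in the question, and MAX(CaseNo) is treated as the most recent case, which only holds if case numbers increase over time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MainDatabase (StoreID INTEGER, CustomerID INTEGER, CaseNo INTEGER)"
)
conn.executemany(
    "INSERT INTO MainDatabase VALUES (?, ?, ?)",
    [
        (1, 10, 1001), (1, 10, 1002), (1, 10, 1003),                 # 3 cases
        (1, 11, 1004),                                               # 1 case
        (2, 12, 1005), (2, 12, 1006), (2, 12, 1007), (2, 12, 1008),  # 4 cases
    ],
)

# Step 1: aggregate once per customer into a temp table.
conn.execute(
    """
    CREATE TEMP TABLE tmp AS
    SELECT CustomerID,
           COUNT(CaseNo) AS case_count,
           MAX(CaseNo)   AS case_max
    FROM MainDatabase
    GROUP BY CustomerID
    """
)

# Step 2: join the aggregates back so each detail row shows the totals.
rows = conn.execute(
    """
    SELECT m.StoreID, m.CustomerID, m.CaseNo, t.case_count, t.case_max
    FROM MainDatabase AS m
    JOIN tmp AS t ON t.CustomerID = m.CustomerID
    WHERE t.case_count >= 3
    ORDER BY m.StoreID, m.CustomerID, m.CaseNo
    """
).fetchall()

for row in rows:
    print(row)
```

To show only one row per customer instead of all cases, the join condition could additionally require m.CaseNo = t.case_max.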

How to select Second Last Row in mySql?

I want to retrieve the 2nd last row result and I have seen this question:
How can I retrieve second last row?
but it uses ORDER BY, which does not work in my case because the Emp_Number column also contains the number of rows and a date/time stamp, and these get mixed in with the data if I use ORDER BY.
The rows 22 and 23 contain the total number of rows (excluding row 21 and 22) and the time and day it got entered respectively.
I used this query which returns the required result 21 but if this number increases it will cause an error.
SELECT TOP 1 *
FROM(
SELECT TOP 2 *
FROM DAT_History
ORDER BY Emp_Number ASC
) t
ORDER BY Emp_Number desc
Is there any way to get the 2nd last row value without using the Order By function?

There is no guarantee that the count will be returned in the one-but-last row, as there is no definite order defined. Even if those records were written in the correct order, the engine is free to return the records in any order, unless you specify an order by clause. But apparently you don't have a column to put in that clause to reproduce the intended order.
I propose these solutions:
1. Return the minimum of those values that represent positive integers
select min(Emp_Number * 1)
from DAT_history
where Emp_Number not regexp '[^0-9]'
This will obviously fail when the count is larger than the smallest employee number. But judging by the sample data, that would take a number of records that is probably not expected...
2. Count the records, ignoring the 2 aggregated records
select count(*)-2
from DAT_history
3. Relying on correct order without order by
As explained at the start, you cannot rely on the order, but if for some reason you still want to rely on this, you can use a variable to number the rows in a sub query, and then pick out the one that has been attributed the one-but-last number:
select Emp_Number * 1
from (select Emp_Number,
@rn := @rn + 1 rn
from DAT_history,
(select @rn := 0) init
) numbered
where rn = @rn - 1
The * 1 is added to convert the text to a number data type.
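The first two solutions can be tried out with a small sketch like the one below. It is an approximation under assumptions: it runs on SQLite via Python's sqlite3 rather than MySQL, so the regexp test is replaced by a GLOB pattern with the same intent (keep only values made of digits), and the sample rows are invented to mimic the layout described in the question (employee numbers, then a count row, then a timestamp row).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DAT_history (Emp_Number TEXT)")
conn.executemany(
    "INSERT INTO DAT_history (Emp_Number) VALUES (?)",
    [
        ("9888001",), ("9888002",), ("9888003",),   # employee numbers
        ("3",),                                     # stored row count
        ("2017-05-31 10:15:00",),                   # stored timestamp
    ],
)

# Solution 1: smallest value that consists only of digits.
# NOT GLOB '*[^0-9]*' rejects any value containing a non-digit character.
(min_numeric,) = conn.execute(
    """
    SELECT MIN(CAST(Emp_Number AS INTEGER))
    FROM DAT_history
    WHERE Emp_Number NOT GLOB '*[^0-9]*'
    """
).fetchone()

# Solution 2: count all rows and subtract the two summary rows.
(counted,) = conn.execute("SELECT COUNT(*) - 2 FROM DAT_history").fetchone()

print(min_numeric, counted)
```

Neither query depends on the physical order of the rows, which is the point of both solutions.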

This is not a perfect solution. I am making some assumptions for this. Check if this could work for you.
;WITH cte
AS (SELECT emp_number,
Row_number()
OVER (
ORDER BY emp_number ASC) AS rn
FROM dat_history
WHERE Isdate(emp_number) = 0) --Omit date entries
SELECT emp_number
FROM cte
WHERE rn = 1 -- select the minimum entry, assuming it would be the count and assuming count might not exceed the emp number range of 9888000

SQL SELECT Query

I have a very simple table of businesses with a column DisplayBiz = varchar(1) that is either Y or N. I want a script that extracts data from the database, first all the "Y" rows and then all the "N" rows, for a total of ten, and I want them ordered by business name.
Is there a way to do this? I am assuming it would be something like this:
SELECT TOP 10 MemberID,
BizName
ORDER BY BizType
but this doesn't take into consideration the DisplayBiz column
Any ideas?
Many thanks..!

You can add more than one column in the ORDER BY clause:
-- ...
ORDER BY DisplayBiz DESC, BizType
Which would put Y rows first, then N rows.

This will get the first 10 BizNames, in alphabetical order, that have a 'Y' for DisplayBiz. If there are fewer than 10, it will start over at A for those with 'N'...
SELECT TOP 10 MemberID, BizName, DisplayBiz
FROM dbo.[table] -- placeholder name; brackets are needed because TABLE is a reserved word
ORDER BY
CASE WHEN DisplayBiz = 'Y' THEN 1 ELSE 2 END,
BizName;
You could also use:
ORDER BY
DisplayBiz DESC,
BizName;
But I prefer the CASE. While it is more code, you're not relying on the fact that 'Y' happens to sort after 'N' alphabetically. It seems more proper to be explicit.
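A quick way to check the ordering is a sketch like the following. It assumes SQLite through Python's sqlite3 module, so LIMIT takes the place of T-SQL's TOP, and the table name businesses and its rows are made up. The limit is 4 instead of 10 only to keep the sample small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE businesses (MemberID INTEGER, BizName TEXT, DisplayBiz TEXT)"
)
conn.executemany(
    "INSERT INTO businesses VALUES (?, ?, ?)",
    [
        (1, "Zeta Cafe", "Y"),
        (2, "Alpha Autos", "N"),
        (3, "Beta Books", "Y"),
        (4, "Gamma Gym", "N"),
        (5, "Delta Deli", "Y"),
    ],
)

# 'Y' rows sort first (key 1), everything else second (key 2);
# within each group the rows are ordered by business name.
rows = conn.execute(
    """
    SELECT BizName, DisplayBiz
    FROM businesses
    ORDER BY CASE WHEN DisplayBiz = 'Y' THEN 1 ELSE 2 END,
             BizName
    LIMIT 4
    """
).fetchall()

for row in rows:
    print(row)
```

The three 'Y' businesses come out alphabetically, and the remaining slot is filled by the alphabetically first 'N' business.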