Summarize Data for Report using T-SQL Case Statement - sql-server

I want to create a simple summary report in Reporting Services using age, account and age group as follows:
SELECT AGE,COUNT(ACCOUNT)AS TOTALCASES,
'AGEGRP' =CASE WHEN AGE <=5 THEN 'AGE 0 TO 5'
WHEN AGE >=6 THEN 'AGE 6 AND OLDER'
END
FROM MAIN
GROUP BY 'AGEGRP'
When I run this in SQL Server Management Studio, I receive error message:
Msg 164, Level 15, State 1, Line 1 Each GROUP BY expression must contain
at least one column that is not an outer reference.
Can someone suggest a way to produce summarized data, counting account number, summarizing by age 0 to 5 and age 6 and older?

You can't have AGE in the select list if you group by AGEGRP.
Try:
DECLARE @YourTable table (age int, account int)
insert into @YourTable values (1,40)
insert into @YourTable values (2,40)
insert into @YourTable values (3,40)
insert into @YourTable values (4,40)
insert into @YourTable values (5,40)
insert into @YourTable values (6,40)
insert into @YourTable values (7,40)
insert into @YourTable values (8,40)
SELECT
COUNT(ACCOUNT) AS TOTALCASES, AGEGRP
FROM (SELECT
AGE,ACCOUNT, CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
WHEN AGE >=6 THEN 'AGE 6 AND OLDER'
END AS AGEGRP
FROM @YourTable
)dt
GROUP BY AGEGRP
OUTPUT:
TOTALCASES AGEGRP
----------- ---------------
5 AGE 0 TO 5
3 AGE 6 AND OLDER
(2 row(s) affected)

Either you do an inner query, like KM shows, or you repeat the expression you want to group by:
SELECT
-- AGE is left out: it is neither aggregated nor part of the GROUP BY
COUNT(ACCOUNT) AS TOTALCASES,
CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
ELSE 'AGE 6 AND OLDER'
END AS AGEGRP
FROM
MAIN
GROUP BY
CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
ELSE 'AGE 6 AND OLDER'
END
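As a quick check of the "repeat the expression" approach, here is a minimal sketch that runs the same grouping on KM's sample ages. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (the CASE expression in GROUP BY is standard SQL, but T-SQL was not run here), and the lowercase table and column names are only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (age INTEGER, account INTEGER)")
# Same sample data as KM's answer: ages 1 through 8, one account each.
conn.executemany("INSERT INTO main VALUES (?, ?)", [(age, 40) for age in range(1, 9)])

query = """
SELECT CASE WHEN age <= 5 THEN 'AGE 0 TO 5' ELSE 'AGE 6 AND OLDER' END AS agegrp,
       COUNT(account) AS totalcases
FROM main
GROUP BY CASE WHEN age <= 5 THEN 'AGE 0 TO 5' ELSE 'AGE 6 AND OLDER' END
ORDER BY agegrp
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('AGE 0 TO 5', 5), ('AGE 6 AND OLDER', 3)]
```

The select list contains only the grouped expression and an aggregate, which is why no raw AGE column appears in the result.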

It's impossible to have AGE in the final result set. I feel you are mixing two requests together. Taking KM's solution, you can have either the inner result, or the outer result without the AGE column.
EDIT: KM just edited his reply, so I will too :) Anyway, I was referring to the following two results:
select age, (case ... end) as agegroup from main
select agegroup, count(*) as cases from (select age, (case ... end) as agegroup from main) t group by agegroup

Related

TSQL get COUNT of rows that are missing from right table

There was one other similar answer, but it is 2 pages long and my requirement doesn't need that. I have 2 tables, tableA and tableB, and I need to find the counts of rows that are present in tableA but not present in tableB, or whose updated_on in tableB is not today's date.
My tables:
tableA:
release_id book_name release_begin_date
----------------------------------------------------
1122 midsummer 2016-01-01
1123 fool's errand 2016-06-01
1124 midsummer 2016-04-01
1125 fool's errand 2016-08-01
tableB:
release_id book_name updated_on
-----------------------------------------
1122 midsummer 2016-08-17
1123 fool's errand 2016-08-16
Expected result: each book is missing one release id in tableB, so count_of_missing is 1 for both. In addition, the existing tableB row for fool's errand has an updated_on date of yesterday rather than today, so it must be counted in count_of_not_updated.
book_name count_of_missing count_of_not_updated
-------------------------------------------------------
midsummer 1 0
fool's errand 1 1
Note: Even though fool's errand is present in tableB, I need to show it in count_of_missing because its updated_on date is yesterday and not today. I know it has to be a combination of a left join and something else, but the kicker here is not only getting the missing rows from the left table but at the same time checking whether the updated_on value is today's date and, if not, counting that row in count_of_not_updated.
select sum(case when b.release_id is null then 1 else 0 end) as noReleaseID
, sum(case when datediff(d, b.updated_on, getdate()) > 0 then 1 else 0 end) as releaseDateNotToday
, a.release_id
from tableA a
left outer join tableB b on a.release_id = b.release_id
Group by a.release_id
This example applies SUM to a CASE expression to count the rows where the condition is true. Note that the current code assumes, as in your example, that you want to count every old updated_on date in tableB. More steps would be required if a book has multiple old rows in tableB and you only want to compare against the most recent one.
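To show how the SUM-over-CASE idea lines up with the per-book output the question asks for, here is a minimal sketch grouped by book_name instead of release_id. It assumes Python's built-in sqlite3 as a stand-in for SQL Server, and it passes "today" in as a fixed parameter (a hypothetical value, 2016-08-17) instead of calling GETDATE(), so the result is repeatable.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (release_id INTEGER, book_name TEXT, release_begin_date TEXT);
CREATE TABLE tableB (release_id INTEGER, book_name TEXT, updated_on TEXT);
INSERT INTO tableA VALUES
  (1122, 'midsummer', '2016-01-01'),
  (1123, 'fool''s errand', '2016-06-01'),
  (1124, 'midsummer', '2016-04-01'),
  (1125, 'fool''s errand', '2016-08-01');
INSERT INTO tableB VALUES
  (1122, 'midsummer', '2016-08-17'),
  (1123, 'fool''s errand', '2016-08-16');
""")

# Fixed "today" for the example (assumption); ISO dates compare correctly as text.
today = "2016-08-17"
query = """
SELECT a.book_name,
       SUM(CASE WHEN b.release_id IS NULL THEN 1 ELSE 0 END) AS count_of_missing,
       SUM(CASE WHEN b.updated_on < ? THEN 1 ELSE 0 END) AS count_of_not_updated
FROM tableA AS a
LEFT JOIN tableB AS b ON a.release_id = b.release_id
GROUP BY a.book_name
ORDER BY a.book_name
"""
report = conn.execute(query, (today,)).fetchall()
print(report)  # [("fool's errand", 1, 1), ('midsummer', 1, 0)]
```

Rows with no match in tableB have a NULL updated_on, so the second CASE falls through to 0 for them and they are only counted as missing.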
Try this
DECLARE @tableA TABLE (release_id INT, book_name NVARCHAR(50), release_begin_date DATETIME)
DECLARE @tableB TABLE (release_id INT, book_name NVARCHAR(50), updated_on DATETIME)
INSERT INTO @tableA
VALUES
(1122, 'midsummer', '2016-01-01'),
(1123, 'fool''s errand', '2016-06-01'),
(1124, 'midsummer', '2016-04-01'),
(1125, 'fool''s errand', '2016-08-01')
INSERT INTO @tableB
VALUES
(1122, 'midsummer', '2016-08-17'),
(1123, 'fool''s errand', '2016-08-16')
;WITH TmpTableA
AS
(
SELECT
book_name,
COUNT(1) CountOfTableA
FROM
@tableA
GROUP BY
book_name
), TmpTableB
AS
(
SELECT
book_name,
COUNT(1) CountOfTableB,
SUM(CASE WHEN CONVERT(VARCHAR(11), updated_on, 112) = CONVERT(VARCHAR(11), GETDATE(), 112) THEN 0 ELSE 1 END) count_of_not_updated
FROM
@tableB
GROUP BY
book_name
)
SELECT
A.book_name ,
A.CountOfTableA - ISNULL(B.CountOfTableB, 0) AS count_of_missing,
ISNULL(B.count_of_not_updated, 0) AS count_of_not_updated
FROM
TmpTableA A LEFT JOIN
TmpTableB B ON A.book_name = B.book_name
Result:
book_name count_of_missing count_of_not_updated
-------------------- ---------------- --------------------
fool's errand 1 1
midsummer 1 1

Group by count once

My data is like below:
ClassId ClassName StudentId Subject SubjectId
-----------------------------------------------------
1 ESL 12 English 20
1 ESL 13 Science 30
1 ESL 12 Social 40
1 ESL 12 Maths 50
Required output: parameters are Subject column values
ClassId ClassName TotalStudents SubjectIds
-----------------------------------------------
1 ESL 2 20, 40, 50, 30
When one student takes multiple subjects, that student should be counted only once. In the data above, student id 12 takes multiple subjects but counts once, so TotalStudents is 2 (1 for student id 12 and 1 for student id 13).
I am not looking for how to display subjectIds column value in comma separated string.
Thanks in advance
Use COUNT(DISTINCT StudentId) for the student count, then STUFF with FOR XML PATH to combine the subject ids:
declare @temp table
(ClassId int,ClassName nvarchar(max),StudentId int,Subject nvarchar(max), SubjectId int)
insert into @temp values (1,'ESL',12,'English' , 20 )
insert into @temp values (1,'ESL',13,'Science' , 30 )
insert into @temp values (1,'ESL',12,'Social ' , 40 )
insert into @temp values (1,'ESL',12,'Maths ' , 50 )
select t.ClassId,t.ClassName,COUNT(DISTINCT t.StudentId) CNT,
STUFF( (SELECT ',' + CAST(t1.SubjectId AS NVARCHAR(10))
FROM @temp t1
WHERE t1.ClassId = t.ClassId -- correlate the subquery on the grouped column
FOR XML PATH('')),
1, 1, '') SubjectIdS
from @temp t
GROUP BY t.ClassId,t.ClassName
DISTINCT can be applied inside aggregate functions.
SELECT COUNT(DISTINCT column_name) FROM table_name;
If you don't need to display the SubjectIds, then you need to use a GROUP BY clause to group the resultset by ClassId and ClassName.
SELECT ClassId, ClassName, COUNT(distinct StudentId) as TotalStudents
FROM MyTable
GROUP BY ClassId, ClassName
See this example at SqlFiddle
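For a runnable check of the COUNT(DISTINCT ...) behaviour, here is a minimal sketch on the question's four rows. It assumes Python's built-in sqlite3 as a stand-in for SQL Server. MyTable is the placeholder name from the answer above, and the TotalRows column is added here only to contrast with the distinct count.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ClassId INTEGER, ClassName TEXT, "
    "StudentId INTEGER, Subject TEXT, SubjectId INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "ESL", 12, "English", 20),
        (1, "ESL", 13, "Science", 30),
        (1, "ESL", 12, "Social", 40),
        (1, "ESL", 12, "Maths", 50),
    ],
)

query = """
SELECT ClassId, ClassName,
       COUNT(DISTINCT StudentId) AS TotalStudents,  -- each student once
       COUNT(*) AS TotalRows                        -- every enrolment row
FROM MyTable
GROUP BY ClassId, ClassName
"""
summary = conn.execute(query).fetchall()
print(summary)  # [(1, 'ESL', 2, 4)]
```

Student 12 appears in three rows but contributes only once to TotalStudents, which is the behaviour the question asks for.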

SQL - Query log from Users table

Based on the following example (a "QueryLog" table that stores interactions between users and two different products, N and R):
Id Date UserID Product
--------------------------------------------------
0 2013-06-09 14:50:24.000 100 N
1 2013-06-09 15:27:23.000 100 N
2 2013-06-09 15:29:23.000 100 N
3 2013-06-17 15:31:23.000 100 N
4 2013-06-17 15:32:23.000 100 N
5 2014-05-19 15:30:23.000 250 N
6 2014-07-19 15:27:23.000 250 N
7 2014-07-19 15:27:23.000 333 R
8 2014-08-19 15:27:23.000 333 R
Expected results :
Count
-----
1
(Only UserID 250 is inside my criteria)
If a user interacts 10 times with the product within a single month, that user does not meet my criteria.
To summarize, I am looking for:
The number of distinct users who interacted with product N in more than one month (whatever the number of interactions the user had during any single month).
This is the code I've tried:
select distinct v.UserID, v.mois , v.annee
from
(select c.UserID , c. mois, c.annee, COUNT(c.UserID) as frequence
from
(
SELECT
datepart(month,[DATE]) as mois,
datepart(YEAR,[DATE]) as annee ,
Username,
UserID,
Product
FROM QueryLog
where Product = 'N'
) c
group by c.UserID, c.annee, c.mois
) v
group by v.UserID, v.mois, v.annee
try this:
DECLARE @YourTable table (Id int, [Date] datetime, UserID int, Product char(1))
INSERT INTO @YourTable VALUES (0,'2013-06-09 14:50:24',100 ,'N')
,(1,'2013-06-09 15:27:23',100 ,'N')
,(2,'2013-06-09 15:29:23',100 ,'N')
,(3,'2013-06-17 15:31:23',100 ,'N')
,(4,'2013-06-17 15:32:23',100 ,'N')
,(5,'2014-05-19 15:30:23',250 ,'N')
,(6,'2014-07-19 15:27:23',250 ,'N')
,(7,'2014-07-19 15:27:23',333 ,'R')
,(8,'2014-08-19 15:27:23',333 ,'R')
;WITH MultiMonthUsers AS
(
select
UserID
FROM (select
UserID
FROM @YourTable
WHERE product='N'
GROUP BY UserID, YEAR([Date]),MONTH([Date])
)dt2
GROUP BY UserID
HAVING COUNT(*)>1
)
SELECT COUNT(*) FROM MultiMonthUsers
Depending on the number of rows and the available indexes, this may run slowly. Wrapping [Date] in YEAR() and MONTH() keeps an index on [Date] from being used for the grouping.
I think this will do it, but I need a better dataset to test with:
SELECT COUNT(*)
FROM (
--roll all month/user records into single row
SELECT UserID, datediff(month, 0, [date]) As MonthGroup
FROM QueryLog
WHERE Product='N'
GROUP BY datediff(month, 0, [date]), UserId
) t
-- look for users with multiple rows
GROUP BY UserID
HAVING COUNT(UserID) > 1
Seems like there should be a way to roll this up further, to avoid the need for the nested select.
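To check the "one row per user per month, then keep users with more than one row" idea against the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for SQL Server, so the month key is built with SQLite's strftime instead of YEAR()/MONTH() or DATEDIFF. The column name LogDate and the aliases per_month and multi_month are hypothetical names chosen for this example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QueryLog (Id INTEGER, LogDate TEXT, UserID INTEGER, Product TEXT)"
)
conn.executemany(
    "INSERT INTO QueryLog VALUES (?, ?, ?, ?)",
    [
        (0, "2013-06-09 14:50:24", 100, "N"),
        (1, "2013-06-09 15:27:23", 100, "N"),
        (2, "2013-06-09 15:29:23", 100, "N"),
        (3, "2013-06-17 15:31:23", 100, "N"),
        (4, "2013-06-17 15:32:23", 100, "N"),
        (5, "2014-05-19 15:30:23", 250, "N"),
        (6, "2014-07-19 15:27:23", 250, "N"),
        (7, "2014-07-19 15:27:23", 333, "R"),
        (8, "2014-08-19 15:27:23", 333, "R"),
    ],
)

query = """
SELECT COUNT(*)
FROM (
    SELECT UserID
    FROM (
        -- collapse all interactions of a user within one month into one row
        SELECT DISTINCT UserID, strftime('%Y-%m', LogDate) AS month_key
        FROM QueryLog
        WHERE Product = 'N'
    ) AS per_month
    GROUP BY UserID
    HAVING COUNT(*) > 1   -- user was active in more than one month
) AS multi_month
"""
multi_month_users = conn.execute(query).fetchone()[0]
print(multi_month_users)  # 1 (only UserID 250 qualifies)
```

The outermost COUNT(*) is what turns the list of qualifying users into the single number the question expects.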

How SQL Server iterates over rows when computing several aggregate functions in one select query

I have SQL Server 2012
and this query:
SELECT ManagerId,
SUM(CASE WHEN SoldInDay < 30 THEN 1 ELSE 0 END) as badSoldDays,
SUM(CASE WHEN Category = 'PC' THEN 1 ELSE 0 END) as DaysWithSoldPc
FROM SomeTable
GROUP BY ProductId
Some Table Definition
ManagerId | SoldInDay | Category
1 50 PC
1 20 Laptop
2 30 PC
3 40 Laptop
So, question is:
Does this mean that SQL Server will iterate over all rows twice, so that each aggregate function runs in a separate pass over all rows in the table? Or is it much smarter than that?
Doesn't matter what I want to get by this query, it's my dream.
(It appears that your question was addressed by a comment, but for completeness, I've provided an official answer here.)
First, your example SQL will not run. You are including ManagerId in the field list but not the GROUP BY. You will get an error akin to this:
Msg 8120, Level 16, State 1, Line 9 Column '@SomeTable.ManagerID' is
invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
Assuming you meant "ManagerId" instead of "ProductId" in the GROUP BY clause, I reproduced your situation and reviewed the execution plans. The plan for your query showed only one "Stream Aggregate" operator. You can force it to run over the table twice by separating the aggregations into two different common table expressions (CTEs) and JOINing them back together. In that case, you see two Stream Aggregate operators (one for each run through the table).
Here is the code to generate the execution plans:
DECLARE @SomeTable TABLE
(
ManagerId int,
SoldInDay int,
Category varchar(50)
);
INSERT INTO @SomeTable (ManagerId, SoldInDay, Category) VALUES (1, 50, 'PC');
INSERT INTO @SomeTable (ManagerId, SoldInDay, Category) VALUES (1, 20, 'Laptop');
INSERT INTO @SomeTable (ManagerId, SoldInDay, Category) VALUES (2, 30, 'PC');
INSERT INTO @SomeTable (ManagerId, SoldInDay, Category) VALUES (3, 40, 'Laptop');
/*
This produces an error:
Msg 8120, Level 16, State 1, Line 9
Column '@SomeTable.ManagerID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
SELECT
ManagerId,
SUM(CASE WHEN SoldInDay < 30 THEN 1 ELSE 0 END) as badSoldDays,
SUM(CASE WHEN Category = 'PC' THEN 1 ELSE 0 END) as DaysWithSoldPc
FROM @SomeTable
GROUP BY ProductId;
*/
SELECT
ManagerId,
SUM(CASE WHEN SoldInDay < 30 THEN 1 ELSE 0 END) as BadSoldDays,
SUM(CASE WHEN Category = 'PC' THEN 1 ELSE 0 END) as DaysWithSoldPc
FROM @SomeTable
GROUP BY ManagerId;
WITH DaysWithSoldPcTable AS
(
SELECT
ManagerId,
SUM(CASE WHEN Category = 'PC' THEN 1 ELSE 0 END) as DaysWithSoldPc
FROM @SomeTable
GROUP BY ManagerId
), BadSoldDaysTable AS
(
SELECT
ManagerId,
SUM(CASE WHEN SoldInDay < 30 THEN 1 ELSE 0 END) as BadSoldDays
FROM @SomeTable
GROUP BY ManagerId
)
SELECT
DaysWithSoldPcTable.ManagerId,
DaysWithSoldPcTable.DaysWithSoldPc,
BadSoldDaysTable.BadSoldDays
FROM DaysWithSoldPcTable
JOIN BadSoldDaysTable
ON DaysWithSoldPcTable.ManagerId = BadSoldDaysTable.ManagerId;
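As a purely conceptual illustration of why one aggregate operator can serve several SUM(CASE ...) columns, the sketch below computes both counters in a single loop over the rows. It is plain Python under the assumption that each row simply updates every accumulator for its group. It is not a description of SQL Server's actual internals.

```python
from collections import defaultdict

# Same four rows as the sample table: (ManagerId, SoldInDay, Category).
rows = [(1, 50, "PC"), (1, 20, "Laptop"), (2, 30, "PC"), (3, 40, "Laptop")]

# One set of accumulators per group key.
totals = defaultdict(lambda: {"BadSoldDays": 0, "DaysWithSoldPc": 0})

# A single pass: every row updates all aggregates of its group at once.
for manager_id, sold_in_day, category in rows:
    acc = totals[manager_id]
    acc["BadSoldDays"] += 1 if sold_in_day < 30 else 0
    acc["DaysWithSoldPc"] += 1 if category == "PC" else 0

result = dict(totals)
print(result)
```

Splitting the work into two CTEs, as in the second query above, corresponds to running a loop like this twice, once per counter.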

Subtraction for two rows

I have the following table named table1 in SQL Server.
id value
1 10
2 100
3 20
4 40
5 50
When I execute the following query it gives me 110, which is expected:
SELECT SUM(value) from table1 where id in (1,2)
What I want is the opposite of SUM, meaning the output should be 90 or -90.
I know this can be done by writing the following query:
select ((SELECT value from table1 where id in (1)) - (SELECT value from table1 where id in (2)) )
Is there a simpler way to do this (something like the SUM function)?
Fiddle demo using Sum() with Case:
Declare @SubId int =1
--To get -90 or +90 change the value of @SubId from 1 to 2
Select Sum(Case When Id = @SubId Then value Else -1*Value End) Total
From Table1
Where Id in (1,2);
Depending on whether you want result to be non-negative or non-positive, you can switch MIN and MAX in the following statement:
select max(value) - min(value)
from table1
where id in (1,2)
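Both suggestions can be checked side by side with a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for SQL Server, and the helper name signed_sum is hypothetical, introduced only for this example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 10), (2, 100), (3, 20), (4, 40), (5, 50)],
)

def signed_sum(sub_id):
    """Add the value of row sub_id and subtract the other selected row."""
    query = """
    SELECT SUM(CASE WHEN id = ? THEN value ELSE -value END)
    FROM table1
    WHERE id IN (1, 2)
    """
    return conn.execute(query, (sub_id,)).fetchone()[0]

minus_ninety = signed_sum(1)  # 10 - 100
plus_ninety = signed_sum(2)   # 100 - 10

# MAX - MIN gives the non-negative difference regardless of row order.
spread = conn.execute(
    "SELECT MAX(value) - MIN(value) FROM table1 WHERE id IN (1, 2)"
).fetchone()[0]

print(minus_ninety, plus_ninety, spread)  # -90 90 90
```

The SUM with CASE variant lets you pick the sign by choosing which id is added, while MAX minus MIN only works as a subtraction when exactly two rows are selected.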

Resources