How to get the desired output dynamically? - sql-server

We have two student tables, one with student details and one with marks. We need to calculate the percentage of marks obtained by each student and group the students by percentage range and gender. An image is attached for further understanding.
Adding my trial_code:
create table student_details (id int primary key, student_name varchar(20), student_gender varchar(15))
insert into student_details values(1,'test','Male'),(2,'test2','Female')
select * from student_details
create table student_marks(id int primary key, student_id int foreign key references student_details(id)
,[subject] varchar(20)
,marks int)
insert into student_marks values(1,1,'English',25),(2,2,'English',65)
,(3,1,'Social',25),(4,2,'Social',65)
,(5,1,'Bio',25),(6,2,'Bio',65)
,(7,1,'Maths',25),(8,2,'Maths',65)
,(9,1,'Science',25),(10,2,'Science',65)
select avg(marks) from student_marks
group by student_id
select sum(marks) from student_marks
group by id
create or alter proc student_avg_marks
as
begin
declare @column varchar(50)
select case when avg(sm.marks)<25 then 'Below 25%'
when (avg(sm.marks)between 25 and 50) then '25% - 50%'
when (avg(sm.marks)between 50 and 75) then '50% - 75%'
when avg(sm.marks)<25 then 'Above 75%'end as avg_marks,
count(student_id) as [total no. of students],
sum(case when sd.student_gender='Male' then 1 else 0 end),
sum(case when sd.student_gender='Female' then 1 else 0 end)
from student_details as sd inner join
student_marks as sm on sd.id=sm.student_id
group by sm.student_id
end
exec student_avg_marks
The output should look something like the image. Even if no records fall into a range, the avg_marks column should still list every range.
The final output should count the number of students, but the result displays the number of subjects.
If there are 3 students, of whom 2 are male and 1 is female, the output should show that, and an average of exactly 50 should fall in the '50% - 75%' range.

Hello, please test this:
CREATE OR ALTER PROC student_avg_marks
AS
BEGIN
-- declare @column varchar(50)
SELECT V.avg_marks, ISNULL(T2.[total no. of students], 0) AS [No. of Students], ISNULL(T2.male, 0) AS Male, ISNULL(T2.female, 0) AS Female
FROM (VALUES ('Below 25%'), ('25% - 50%'), ('50% - 75%'), ('Above 75%')) V (avg_marks)
LEFT JOIN (
select case when avg(sm.marks) < 25 then 'Below 25%'
when avg(sm.marks) between 25 and 50 then '25% - 50%'
when avg(sm.marks) between 50 and 75 then '50% - 75%'
when avg(sm.marks) > 75 then 'Above 75%'end as avg_marks,
COUNT(DISTINCT sm.student_id) as [total no. of students],
COUNT(DISTINCT case when sd.student_gender='Male' then sm.student_id end) AS male,
COUNT(DISTINCT case when sd.student_gender='Female' then sm.student_id end) AS female
from student_details as sd inner join
student_marks as sm on sd.id=sm.student_id
group by sd.student_gender) AS T2 ON V.avg_marks = T2.avg_marks
GROUP BY V.avg_marks, T2.[total no. of students], T2.male, T2.female
END
If we execute the procedure:
GO
EXECUTE student_avg_marks

Use a SELECT over a VALUES list containing your desired ranges, and LEFT JOIN the per-student results to it:
create or alter proc student_avg_marks
as
begin
-- one output row per range; ranges with no students show 0
SELECT V.avg_marks, COUNT(T2.student_id) AS [total no. of students], ISNULL(SUM(T2.male), 0) AS male, ISNULL(SUM(T2.female), 0) AS female
FROM (values('Below 25%') ,('25% - 50%') ,('50% - 75%'),('Above 75%')) V (avg_marks)
LEFT JOIN (
-- one row per student, bucketed by that student's average
select sm.student_id,
case when avg(sm.marks)<25 then 'Below 25%'
when avg(sm.marks)<50 then '25% - 50%'
when avg(sm.marks)<=75 then '50% - 75%'
else 'Above 75%' end as avg_marks,
case when sd.student_gender='Male' then 1 else 0 end AS male,
case when sd.student_gender='Female' then 1 else 0 end AS female
from student_details as sd inner join
student_marks as sm on sd.id=sm.student_id
group by sm.student_id ,sd.student_gender) AS T2 ON V.avg_marks = T2.avg_marks
GROUP BY V.avg_marks
end
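The approach in the answers above (a fixed list of ranges, LEFT JOINed to the aggregated marks) can also be written with common table expressions. The following is only a sketch against the question's `student_details` and `student_marks` tables. It assumes marks are out of 100, that an average of exactly 50 belongs in '50% - 75%' as the question asks, and that exactly 75 stays in '50% - 75%'. The names `per_student`, `bucketed`, `range_id` and `range_label` are made up for illustration.

```sql
WITH per_student AS (
    -- One row per student: average over that student's subjects.
    -- Multiplying by 1.0 avoids integer truncation of the average.
    SELECT sd.id, sd.student_gender, AVG(sm.marks * 1.0) AS avg_marks
    FROM student_details AS sd
    INNER JOIN student_marks AS sm ON sm.student_id = sd.id
    GROUP BY sd.id, sd.student_gender
),
bucketed AS (
    -- Assign each student to exactly one range.
    SELECT id, student_gender,
           CASE WHEN avg_marks < 25  THEN 1
                WHEN avg_marks < 50  THEN 2
                WHEN avg_marks <= 75 THEN 3
                ELSE 4 END AS range_id
    FROM per_student
)
SELECT r.range_label AS avg_marks,
       COUNT(b.id) AS [No. of Students],
       COUNT(CASE WHEN b.student_gender = 'Male'   THEN 1 END) AS Male,
       COUNT(CASE WHEN b.student_gender = 'Female' THEN 1 END) AS Female
FROM (VALUES (1, 'Below 25%'),
             (2, '25% - 50%'),
             (3, '50% - 75%'),
             (4, 'Above 75%')) AS r (range_id, range_label)
LEFT JOIN bucketed AS b ON b.range_id = r.range_id
GROUP BY r.range_id, r.range_label
ORDER BY r.range_id;
```

Because the students are averaged before they are counted, each student contributes one row regardless of how many subjects they have, and the LEFT JOIN keeps ranges with no students in the output with counts of 0. If this is placed inside the stored procedure after another statement, that preceding statement needs a terminating semicolon before `WITH`.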

Related

T-SQL: get only rows which are after a row that meets some condition

I have a table of article logs. I need to get all the articles that have only one log. If an article has more than one log and any of them has StatusId = 103, I need to fetch only the rows after that log; otherwise I need all the logs. So from the following dataset I want to get only the rows with Id 1383 and 284653.
Id      Article  Version  StatusId  AddedDate
1383    1481703  0        42        2011-11-25 09:23:42.000
284645  435545   1        41        2021-11-02 18:29:42.000
284650  435545   2        41        2021-11-02 18:34:58.000
284651  435545   2        103       2021-11-02 18:34:58.000
284653  435545   3        41        2021-11-02 18:38:33.000
Any ideas on how to handle this properly? Thanks in advance.
You can use window functions here. A combination of a running COUNT and a windowed COUNT will do the trick.
The benefit of using window functions rather than self-joins is that the base table is scanned only once.
SELECT
Id,
Article,
Version,
StatusId,
AddedDate
FROM (
SELECT *,
HasPrev103 = COUNT(CASE WHEN StatusId = 103 THEN 1 END) OVER
(PARTITION BY Article ORDER BY AddedDate, Id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), -- Id breaks ties between rows with the same AddedDate
Has103 = COUNT(CASE WHEN StatusId = 103 THEN 1 END) OVER (PARTITION BY Article),
Count = COUNT(*) OVER (PARTITION BY Article)
FROM YourTable t
) t
WHERE (Has103 > 0 AND HasPrev103 > 0) OR Count = 1;
CREATE TABLE #Article (
Id int NOT NULL PRIMARY KEY,
Article int NOT NULL,
Version int NOT NULL,
StatusId int NOT NULL,
DateAdded datetime NOT NULL
)
INSERT INTO #Article (Id, Article, Version, StatusId, DateAdded)
VALUES
(1383, 1481703, 0, 42, '2011-11-25 09:23:42.000'),
(284645, 435545, 1, 41 , '2021-11-02 18:29:42.000'),
(284650, 435545, 2, 41 , '2021-11-02 18:34:58.000'),
(284651, 435545, 2, 103, '2021-11-02 18:34:58.000'),
(284653, 435545, 3, 41 , '2021-11-02 18:38:33.000')
SELECT *
FROM #Article a
LEFT JOIN (
-- Get articles that appear only once.
SELECT Article
FROM #Article
GROUP BY Article
HAVING COUNT(*) = 1
) AS o
ON a.Article = o.Article
LEFT JOIN (
-- Get the 103s and their corresponding date.
SELECT Article, DateAdded
FROM #Article
WHERE StatusId = 103
) AS s
ON a.Article = s.Article AND s.DateAdded < a.DateAdded
WHERE o.Article IS NOT NULL OR (s.Article IS NOT NULL AND a.DateAdded > s.DateAdded)
DROP TABLE #Article

Multiple select queries using while loop in a single table? Is it Possible?

I have 2 tables. Table A has Date, ISBN (for Book), Demand(demand for that date). Table B has Date, ISBN (for Book), and SalesRank.
The sample data is as follows:
The DailyBookFile table has 150k records for each date, going back to 2010 (i.e. 150k * 365 days * 8 years rows). The same goes for the SalesRank table, which has about 500k records for each date.
DailyBookFile
Date Isbn13 CurrentModifiedDemandTotal
20180122 9780955153075 13
20180122 9780805863567 9
20180122 9781138779396 1
20180122 9780029001516 9
20180122 9780470614150 42
SalesRank
importdate ISBN13 SalesRank
20180122 9780029001516 69499
20180122 9780470614150 52879
20180122 9780805863567 832429
20180122 9780955153075 44528
20180122 9781138779396 926435
Required Output
Date Avg_Rank Book_Group
20180122 385154 Elite
20180121 351545 Elite
20180120 201545 Elite
I want to get the Top 200 CurrentModifiedDemand for each day, and take the average Rank.
I am unable to work out a solution as I am new to SQL.
I started with getting the Top 200 CurrentModifiedDemand for yesterday and get the Avg Rank over last year.
SELECT DBF.Filedate AS [Date],
AVG(AMA.SalesRank) AS Avg_Rank,
'Elite' AS Book_Group
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] AS AMA ON (DBF.Isbn13 = AMA.ISBN13
AND DBF.FileDate = AMA.importdate)
WHERE DBF.Isbn13 IN (SELECT TOP 200 Isbn13
FROM [ODS].[wholesale].[DailyBookFile]
WHERE FileDate = 20180122
AND CAST(CurrentModifiedDemandTotal AS int) > 200)
AND DBF.Filedate > 20170101
GROUP BY DBF.Filedate;
But the result is not what I want. So, now I want the ISBN for the Top 200 CurrentModifiedDemand for each day and their avg rank. I tried with this.
DECLARE @i int;
SET @i = 20180122;
WHILE (SELECT DISTINCT(DBF.Filedate)
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
WHERE DBF.Filedate = @i) IS NOT NULL
BEGIN
SELECT DBF.Filedate AS [Date],
AVG(AMA.SalesRank) AS Avg_Rank,
'Elite' AS Book_Group
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] as AMA ON DBF.Isbn13 = AMA.ISBN13
AND DBF.FileDate = AMA.importdate
WHERE DBF.Isbn13 in (SELECT TOP 200 Isbn13
FROM [ODS].[wholesale].[DailyBookFile]
WHERE FileDate = @i
AND CAST (CurrentModifiedDemandTotal AS int) > 500)
AND DBF.Filedate = @i
GROUP BY DBF.Filedate;
SET @i = @i+1;
END
In this I am getting one select query result in each window. Is there any way to have the result in a single table?
P.S. The list of top 200 books every day will change according to the CurrentModifiedDemand. I want to take their avg. sales rank for that day.
Instead of selecting immediately in each iteration of the loop, you can insert the rows into a temp table (or a table variable) and select everything after the loop finishes:
IF OBJECT_ID('tempdb..#books') IS NOT NULL
BEGIN
DROP TABLE #books
END
CREATE TABLE #books (
[Date] INT,
[Avg_Rank] FLOAT,
[Book_Group] VARCHAR(512)
);
DECLARE @i int;
SET @i = 20180122;
BEGIN TRY
WHILE (SELECT DISTINCT(DBF.Filedate)
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
WHERE DBF.Filedate = @i) IS NOT NULL
BEGIN
INSERT INTO #books (
[Date],
[Avg_Rank],
[Book_Group]
)
SELECT DBF.Filedate AS [Date],
AVG(AMA.SalesRank) AS Avg_Rank,
'Elite' AS Book_Group
FROM [ODS].[wholesale].[DailyBookFile] AS DBF
INNER JOIN [ODS].[MarketplaceMonitor].[SalesRank] as AMA ON DBF.Isbn13 = AMA.ISBN13
AND DBF.FileDate = AMA.importdate
WHERE DBF.Isbn13 in (SELECT TOP 200 Isbn13
FROM [ODS].[wholesale].[DailyBookFile]
WHERE FileDate = @i
AND CAST (CurrentModifiedDemandTotal AS int) > 500)
AND DBF.Filedate = @i
GROUP BY DBF.Filedate;
SET @i = @i+1;
END
END TRY
BEGIN CATCH
IF OBJECT_ID('tempdb..#books') IS NOT NULL
BEGIN
DROP TABLE #books
END
END CATCH
SELECT *
FROM #books
DROP TABLE #books
Using a table variable would yield simpler code, but with large amounts of data table variables start to lose out to temp tables on performance. I'm not sure where the cut-off is, but in my experience changing a table variable to a temp table has given significant performance gains at 10,000+ rows. For small row counts the opposite might apply.
This avoids a costly WHILE loop, and I believe achieves your goal:
CREATE TABLE #DailyBookFile ([Date] date,
Isbn13 bigint,
CurrentModifiedDemandTotal tinyint);
INSERT INTO #DailyBookFile
VALUES ('20180122',9780955153075,13),
('20180122',9780805863567,9 ),
('20180122',9781138779396,1 ),
('20180122',9780029001516,9 ),
('20180122',9780470614150,42);
CREATE TABLE #SalesRank (importdate date,
ISBN13 bigint,
SalesRank int);
INSERT INTO #SalesRank
VALUES ('20180122',9780029001516,69499 ),
('20180122',9780470614150,52879 ),
('20180122',9780805863567,832429),
('20180122',9780955153075,44528 ),
('20180122',9781138779396,926435);
GO
WITH Ranks AS(
SELECT SR.*,
RANK() OVER (PARTITION By SR.importdate ORDER BY SR.SalesRank) AS Ranking
FROM #SalesRank SR
JOIN #DailyBookFile DBF ON SR.ISBN13 = DBF.Isbn13
AND SR.importdate = DBF.[Date])
SELECT importdate AS [Date],
AVG(SalesRank) AS Avg_rank,
'Elite' AS Book_Group
FROM Ranks
WHERE Ranking <= 200
GROUP BY importdate;
GO
DROP TABLE #DailyBookFile;
DROP TABLE #SalesRank;
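The question asks for the 200 books with the highest demand on each day, so a set-based variant could rank on CurrentModifiedDemandTotal rather than on the sales rank. The following is a sketch under assumptions: it uses table variables filled with the question's five sample rows, `@TopN` and `DemandRank` are illustrative names, and books without a sales rank for the day are dropped by the inner join before ranking.

```sql
DECLARE @TopN int = 200;  -- how many top-demand books to keep per day

DECLARE @DailyBookFile TABLE ([Date] date, Isbn13 bigint, CurrentModifiedDemandTotal int);
INSERT INTO @DailyBookFile VALUES
('20180122', 9780955153075, 13),
('20180122', 9780805863567, 9),
('20180122', 9781138779396, 1),
('20180122', 9780029001516, 9),
('20180122', 9780470614150, 42);

DECLARE @SalesRank TABLE (importdate date, ISBN13 bigint, SalesRank int);
INSERT INTO @SalesRank VALUES
('20180122', 9780029001516, 69499),
('20180122', 9780470614150, 52879),
('20180122', 9780805863567, 832429),
('20180122', 9780955153075, 44528),
('20180122', 9781138779396, 926435);

WITH Ranked AS (
    -- Number the books within each day, highest demand first.
    SELECT DBF.[Date],
           SR.SalesRank,
           ROW_NUMBER() OVER (PARTITION BY DBF.[Date]
                              ORDER BY DBF.CurrentModifiedDemandTotal DESC) AS DemandRank
    FROM @DailyBookFile AS DBF
    INNER JOIN @SalesRank AS SR
            ON SR.ISBN13 = DBF.Isbn13
           AND SR.importdate = DBF.[Date]
)
SELECT [Date],
       AVG(SalesRank) AS Avg_Rank,  -- integer average, since SalesRank is int
       'Elite' AS Book_Group
FROM Ranked
WHERE DemandRank <= @TopN
GROUP BY [Date]
ORDER BY [Date] DESC;
```

ROW_NUMBER breaks ties in demand arbitrarily; RANK or an extra ORDER BY column could be used instead if ties need deterministic handling. All days are handled in one statement, so no loop or temp result table is needed.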

Group by count once

My data is like below:
ClassId ClassName StudentId Subject SubjectId
-----------------------------------------------------
1 ESL 12 English 20
1 ESL 13 Science 30
1 ESL 12 Social 40
1 ESL 12 Maths 50
Required output: parameters are Subject column values
ClassId ClassName TotalStudents SubjectIds
-----------------------------------------------
1 ESL 2 20, 40, 50, 30
When one student takes multiple subjects, that student should be counted only once. In the data above, student id 12 takes multiple subjects, so it is counted only once. The TotalStudents value is 2 (1 from student id 12 and 1 from student id 13).
I am not looking for how to display subjectIds column value in comma separated string.
Thanks in advance
Use COUNT(DISTINCT ...) for the student count, then STUFF with FOR XML PATH to combine the subject ids:
declare @temp table
(ClassId int,ClassName nvarchar(max),StudentId int,Subject nvarchar(max), SubjectId int)
insert into @temp values (1,'ESL',12,'English' , 20 )
insert into @temp values (1,'ESL',13,'Science' , 30 )
insert into @temp values (1,'ESL',12,'Social ' , 40 )
insert into @temp values (1,'ESL',12,'Maths ' , 50 )
select t.ClassId,t.ClassName,COUNT(DISTINCT t.StudentId) CNT,
-- correlate on ClassId so each class lists only its own subject ids
STUFF( (SELECT ',' + CAST(t1.SubjectId AS NVARCHAR(10))
FROM @temp t1
WHERE t1.ClassId = t.ClassId
FOR XML PATH('')),
1, 1, '') SubjectIdS
from @temp t
GROUP BY t.ClassId,t.ClassName
DISTINCT can be applied inside aggregate functions.
SELECT COUNT(DISTINCT column_name) FROM table_name;
If you don't need to display the SubjectIds, then you need to use a GROUP BY clause to group the resultset by ClassId and ClassName.
SELECT ClassId, ClassName, COUNT(distinct StudentId) as TotalStudents
FROM MyTable
GROUP BY ClassId, ClassName
See this example at SqlFiddle

Is it possible to iterate a calculation across distinct values in a column?

I want to iterate a calculation which takes counts of people that fit particular criteria and calculates percentages based on those counts across distinct regions.
My code:
USE Database1;
GO
declare @ShouldRegister as float
declare @Registered as float
SET @ShouldRegister = (SELECT COUNT(*) FROM dbo.TABLE
WHERE field1 in..
AND field2 in..
AND field3 in..
...
)
SET @Registered = (SELECT COUNT(*) FROM dbo.TABLE
WHERE field1 in..
AND field2 in..
AND field3 in..
...
)
SELECT
@ShouldRegister as ShouldRegister
, @Registered as Registered
, cast((@Registered/NULLIF(@ShouldRegister, 0))*100 as decimal(12,8)) as Percentmet
, CAST(100*2.33*(SQRT(@Registered/NULLIF(@ShouldRegister, 0) * (1-(@Registered/NULLIF(@ShouldRegister, 0)))/NULLIF(@ShouldRegister, 0))) as decimal(12,8)) + cast((@Registered/NULLIF(@ShouldRegister, 0))*100 as decimal(12,8)) as AdjPercentmet
The code returns something like this:
ShouldRegister Registered Percentmet adjpercentmet
223587 565 0.25269805 0.27743717
Each person has a region assigned in the "Region" column. The code above calculates across all regions. What I would like to see is:
ShouldRegister Registered Percentmet adjpercentmet Region
223 50 0.12345678 0.12345678 Region1
456 100 0.12345678 0.12345678 Region2
789 456 0.12345678 0.12345678 Region3
My brain wants to do: "For Region in Regions, do (Code)", but I don't think SQL works that way.
Try it this way:
Set Nocount On;
Select t.ShouldRegistered
,t.Registered
,t.Region
----- Guard against a divide-by-zero error when ShouldRegistered is 0 or NULL; 100.0 avoids integer division
,Cast((100.0 * t.Registered / (Case When Isnull(t.ShouldRegistered,0) = 0 Then 1 Else t.ShouldRegistered End)) As Decimal(12,8)) As Percentmet
,[AdjPercentmet condition] As AdjPercentmet
From (
Select tb.Region
,Sum(Case When [tb.Field1.....Condition] And [tb.Field2.....Condition] And [tb.Field3....Condition] Then 1 Else 0 End) As ShouldRegistered
,Sum(Case When [tb.Field1.....Condition] And [tb.Field2.....Condition] And [tb.Field3....Condition] Then 1 Else 0 End) As Registered
From dbo.TABLE As tb With (Nolock)
Group By tb.Region
) As t
Group By t.ShouldRegistered
,t.Registered
,t.Region
Order By t.Region
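The per-region idea above, conditional sums grouped by Region in place of scalar variables, can be sketched end to end as follows. The question's real filters on field1, field2 and field3 are elided, so this example substitutes two hypothetical flag columns, `IsEligible` and `IsRegistered`, and a made-up `@People` table. The percentage and adjusted-percentage formulas follow the ones in the question.

```sql
-- Hypothetical sample data standing in for dbo.TABLE and its elided filters.
DECLARE @People TABLE (Region varchar(20), IsEligible bit, IsRegistered bit);
INSERT INTO @People VALUES
('Region1', 1, 1), ('Region1', 1, 0), ('Region1', 1, 0), ('Region1', 1, 0),
('Region2', 1, 1), ('Region2', 1, 1), ('Region2', 0, 0);

WITH Counts AS (
    -- One row per region; counts are cast to float so later divisions are not integer divisions.
    SELECT Region,
           CAST(SUM(CASE WHEN IsEligible = 1 THEN 1 ELSE 0 END) AS float) AS ShouldRegister,
           CAST(SUM(CASE WHEN IsEligible = 1 AND IsRegistered = 1 THEN 1 ELSE 0 END) AS float) AS Registered
    FROM @People
    GROUP BY Region
)
SELECT c.ShouldRegister,
       c.Registered,
       CAST(100 * r.p AS decimal(12,8)) AS Percentmet,
       CAST(100 * (r.p + 2.33 * SQRT(r.p * (1 - r.p) / NULLIF(c.ShouldRegister, 0))) AS decimal(12,8)) AS AdjPercentmet,
       c.Region
FROM Counts AS c
-- Compute the registered proportion once; NULLIF turns a zero denominator into NULL.
CROSS APPLY (SELECT c.Registered / NULLIF(c.ShouldRegister, 0) AS p) AS r
ORDER BY c.Region;
```

GROUP BY plays the role of the "for Region in Regions" loop: every expression in the SELECT list is evaluated once per region. A region with no eligible people yields NULL percentages rather than a divide-by-zero error.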

Summarize Data for Report using T-SQL Case Statement

I want to create a simple summary report in Reporting Services using age, account and age group as follows:
SELECT AGE,COUNT(ACCOUNT)AS TOTALCASES,
'AGEGRP' =CASE WHEN AGE <=5 THEN 'AGE 0 TO 5'
WHEN AGE >=6 THEN 'AGE 6 AND OLDER'
END
FROM MAIN
GROUP BY 'AGEGRP'
When I run this in SQL Server Management Studio, I receive error message:
Msg 164, Level 15, State 1, Line 1 Each GROUP BY expression must contain
at least one column that is not an outer reference.
Can someone suggest a way to produce summarized data, counting account number, summarizing by age 0 to 5 and age 6 and older?
You can't have "age" in the select list if you group by AGEGRP.
Try:
DECLARE @YourTable table (age int, account int)
insert into @YourTable values (1,40)
insert into @YourTable values (2,40)
insert into @YourTable values (3,40)
insert into @YourTable values (4,40)
insert into @YourTable values (5,40)
insert into @YourTable values (6,40)
insert into @YourTable values (7,40)
insert into @YourTable values (8,40)
SELECT
COUNT(ACCOUNT)AS TOTALCASES, AGEGRP
FROM (SELECT
AGE,ACCOUNT, CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
WHEN AGE >=6 THEN 'AGE 6 AND OLDER'
END AS AGEGRP
FROM @YourTable
)dt
GROUP BY AGEGRP
OUTPUT:
TOTALCASES AGEGRP
----------- ---------------
5 AGE 0 TO 5
3 AGE 6 AND OLDER
(2 row(s) affected)
Either you do an inner query, like KM shows, or you repeat the expression you want to group by:
SELECT
COUNT(ACCOUNT) AS TOTALCASES,
CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
ELSE 'AGE 6 AND OLDER'
END AS AGEGRP
FROM
MAIN
GROUP BY
CASE
WHEN AGE <=5 THEN 'AGE 0 TO 5'
ELSE 'AGE 6 AND OLDER'
END
It's impossible to have AGE in the final result set. I feel you are mixing two requests together. Taking KM's solution, you can have either the inner result, or the outer result without the AGE column.
EDIT: KM just edited his reply, so I did too :) Anyway, I was referring to the following two results:
select age, (case ... end) as agegroup from main
select agegroup, count(*) as cases from (select age, (case ... end) as agegroup from main) t group by agegroup
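A third option, besides the inner query and the repeated CASE expression, is to name the expression once with CROSS APPLY (VALUES ...) and group by that name. This is a sketch only: the table variable `@Main` and its rows are made-up sample data standing in for MAIN.

```sql
-- Hypothetical sample rows standing in for the MAIN table.
DECLARE @Main TABLE (AGE int, ACCOUNT int);
INSERT INTO @Main VALUES (1, 40), (3, 41), (5, 42), (6, 43), (9, 44);

SELECT g.AGEGRP,
       COUNT(m.ACCOUNT) AS TOTALCASES
FROM @Main AS m
-- The CASE expression is written once here and exposed as the column g.AGEGRP.
CROSS APPLY (VALUES (CASE WHEN m.AGE <= 5 THEN 'AGE 0 TO 5'
                          ELSE 'AGE 6 AND OLDER' END)) AS g (AGEGRP)
GROUP BY g.AGEGRP;
```

Note that with ELSE, a NULL age is counted under 'AGE 6 AND OLDER'; an explicit `WHEN m.AGE >= 6` branch would put such rows in a NULL group instead.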
