I was wondering whether a table could be created using COUNT with multiple GROUP BY clauses, or how else this could be done.
The goal is to create a table from our DB showing ethnicity numbers of our students based on year in the program.
Year   Black/AA   Hispanic/Latinx   Asian
2022   #          #                 #
2023   #          #                 #
2024   #          #                 #
But I can't figure out how to make SQL group them all effectively.
The data set looks like this:
Student_id   year   ethnicity_id
1            2022   1
2            2022   3
3            2023   3
4            2023   2
5            2024   3
6            2024   1
Ethnicity_id   Name
1              Black/AA
2              Hispanic/Latinx
3              Asian
You probably want a pivot. The pivot syntax differs depending on your RDBMS. Since you did not say which RDBMS you use, I will post a more generic solution using a CASE expression.
select year,
       sum(case when ethnicity_id = 1 then 1 else 0 end) "Black/AA",
       sum(case when ethnicity_id = 2 then 1 else 0 end) "Hispanic/Latinx",
       sum(case when ethnicity_id = 3 then 1 else 0 end) "Asian"
from student_data
group by year
order by year
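To check the CASE-based pivot against the sample rows from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumption made for the demo, since the question does not name an RDBMS, and the table name student_data is taken from the answer rather than from the question.

```python
import sqlite3

# In-memory database loaded with the six sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_data (student_id INTEGER, year INTEGER, ethnicity_id INTEGER);
    INSERT INTO student_data VALUES
        (1, 2022, 1), (2, 2022, 3), (3, 2023, 3),
        (4, 2023, 2), (5, 2024, 3), (6, 2024, 1);
""")

# One SUM(CASE ...) per ethnicity turns row values into columns.
pivot_sql = """
    SELECT year,
           SUM(CASE WHEN ethnicity_id = 1 THEN 1 ELSE 0 END) AS "Black/AA",
           SUM(CASE WHEN ethnicity_id = 2 THEN 1 ELSE 0 END) AS "Hispanic/Latinx",
           SUM(CASE WHEN ethnicity_id = 3 THEN 1 ELSE 0 END) AS "Asian"
    FROM student_data
    GROUP BY year
    ORDER BY year
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
# (2022, 1, 0, 1)
# (2023, 0, 1, 1)
# (2024, 1, 0, 1)
```

Each new ethnicity needs its own SUM(CASE ...) column, so this approach suits a small, fixed list of categories.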
I'm trying to output what every salesperson has sold in the last six months but what I am using counts all dates and outputs them.
SELECT SalespersonNo, COUNT (SalespersonNo) AS ['CarsSold']
FROM CarForSale
WHERE DateSold > '01/08/2018'
GROUP BY SalespersonNo;
As I said above, it counts sales from all dates instead of what I want, which is to count only the cars sold in the past six months.
These are the results I am getting:
SalespersonNo 'CarsSold'
100001 4
100002 1
100003 1
100004 4
100005 2
100010 1
100011 2
100012 2
100015 1
100017 2
100020 2
I am aiming to get results like this:
SalespersonNo 'CarsSold'
100001 3
100003 1
100004 3
100005 1
100011 2
100015 1
100017 2
100020 1
You probably want to use conditional aggregation:
SELECT SalespersonNo,
COUNT(SalespersonNo) AS [CarsSoldTotal],
COUNT(CASE WHEN DateSold > DATEADD(mm, -6, GETDATE()) THEN 1 END) AS [CarsSold6Month]
FROM CarForSale
WHERE DateSold > '01/08/2018'
GROUP BY SalespersonNo;
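As a rough illustration of how the conditional count behaves, the sketch below runs an equivalent query in SQLite via Python's sqlite3 module. Everything here is an assumption made for the demo: the rows are invented, the dates are stored as ISO text, the reference date is passed in as a fixed parameter instead of GETDATE(), and SQLite's date(..., '-6 months') stands in for DATEADD(mm, -6, GETDATE()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample rows; ISO-formatted text dates compare correctly as strings.
conn.executescript("""
    CREATE TABLE CarForSale (SalespersonNo INTEGER, DateSold TEXT);
    INSERT INTO CarForSale VALUES
        (100001, '2018-03-10'), (100001, '2018-09-05'), (100001, '2018-11-20'),
        (100002, '2018-02-01'),
        (100003, '2018-10-15');
""")

# Hypothetical "today", fixed so the result is reproducible.
today = "2019-01-15"

# COUNT ignores NULLs, so a CASE without ELSE counts only the matching rows.
sql = """
    SELECT SalespersonNo,
           COUNT(*) AS CarsSoldTotal,
           COUNT(CASE WHEN DateSold > date(:today, '-6 months') THEN 1 END) AS CarsSold6Month
    FROM CarForSale
    GROUP BY SalespersonNo
    ORDER BY SalespersonNo
"""
rows = conn.execute(sql, {"today": today}).fetchall()
for row in rows:
    print(row)
# (100001, 3, 2)
# (100002, 1, 0)
# (100003, 1, 1)
```

Salespeople with no recent sales still appear with a count of 0. If they should be left out, as in the desired output, moving the date condition into the WHERE clause would drop them.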
I have the data below, and I applied a ROW_NUMBER(partition by ID order by YEAR) function based on year, which ranks my data as shown below.
I want to bring in the name for every ID based on its latest year. I want NULL only if that is the only data available, and the latest non-NULL name for every other record. But the row number only lets me bring in the most recent name, which could be NULL. How do I query the data below to bring in the most recent non-NULL name?
ID year name rownum
10 2011 abc 1
10 2010 abc 2
11 2011 ghi 1
11 2010 ghi 2
13 2010 NULL 1
13 2009 jkl 2
14 2014 NULL 1
14 2014 mno 2
15 2015 NULL 1
I want to bring in the names jkl and mno for IDs 13 and 14, not NULLs, in my final result. Any suggestions on how to achieve that?
The output I want is below. I want to display the data for row number 1:
10 2011 abc
11 2011 ghi
13 2009 jkl
14 2014 mno
15 2015 NULL
Sort non-null rows ahead of null rows:
select ID, year, name
from (select *,
             row_number() over (partition by ID
                                order by case when name is null then 1 else 0 end, year desc) as RN
      from #t) _
where rn = 1
See also "SQL Server equivalent to Oracle's NULLS FIRST?" and "SQL Server ORDER BY date and nulls last".
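The ordering trick can be tried on the question's sample rows with the sketch below, which uses SQLite through Python's sqlite3 module. This assumes SQLite 3.25 or later for window function support, and a regular table named t stands in for the temp table #t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; t stands in for the temp table #t.
conn.executescript("""
    CREATE TABLE t (ID INTEGER, year INTEGER, name TEXT);
    INSERT INTO t VALUES
        (10, 2011, 'abc'), (10, 2010, 'abc'),
        (11, 2011, 'ghi'), (11, 2010, 'ghi'),
        (13, 2010, NULL),  (13, 2009, 'jkl'),
        (14, 2014, NULL),  (14, 2014, 'mno'),
        (15, 2015, NULL);
""")

# The CASE key sorts non-NULL names first; year DESC breaks ties within each group.
sql = """
    SELECT ID, year, name
    FROM (SELECT *,
                 ROW_NUMBER() OVER (
                     PARTITION BY ID
                     ORDER BY CASE WHEN name IS NULL THEN 1 ELSE 0 END, year DESC
                 ) AS rn
          FROM t) AS ranked
    WHERE rn = 1
    ORDER BY ID
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# (10, 2011, 'abc')
# (11, 2011, 'ghi')
# (13, 2009, 'jkl')
# (14, 2014, 'mno')
# (15, 2015, None)
```

ID 15 keeps its NULL because no non-NULL row exists in its partition, which matches the desired output.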
I have 2 tables with the following data in them:
Company
CompanyId CompanyName
1 Company1
2 Company2
3 Company3
Employees
EmployeeId EmployeeName CompanyId StartDate
1 Employee1 1 12/21/2011
2 Employee2 1 01/20/2012
3 Employee3 2 03/23/2012
4 Employee4 2 07/15/2012
5 Employee5 2 01/20/2013
6 Employee6 3 12/17/2013
Now I want to check how many people were recruited into the team in a specified month and year. I have the storage table as follows:
RecruiterIndicator
CompanyId Year Month EmployeeRecruited
1 2011 12 1
1 2012 1 1
2 2012 3 1
2 2012 7 1
2 2013 1 1
3 2013 12 1
This should be a merge stored procedure that updates the data if a row is already present for the same month, year, and company, and inserts it if not. The loop would start from a particular date, which can be a parameter, and run through the current month.
Please help me with this
Thanks
Vishal
SELECT YEAR(StartDate) AS [Year], MONTH(StartDate) AS [Month], COUNT(*) EmpTotal
FROM Employees
GROUP BY YEAR(StartDate), MONTH(StartDate)
If you want to see the Total Employees by company as well you can do something like this
SELECT YEAR(StartDate) AS [Year], MONTH(StartDate) AS [Month]
,C.CompanyName , COUNT(E.EmployeeId) EmpTotal
FROM Employees E INNER JOIN Company C
ON E.CompanyId = C.CompanyId
GROUP BY YEAR(StartDate), MONTH(StartDate) ,C.CompanyName
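For the "update if present, insert if not" part of the question, here is a minimal sketch of one set-based way to fill the storage table without a loop. It is written for SQLite through Python's sqlite3 module, so it uses SQLite's INSERT ... ON CONFLICT upsert (SQLite 3.24 or later) as a stand-in for a T-SQL MERGE. The primary key on (CompanyId, Year, Month), the ISO date format, the refresh helper, and the from_date parameter are all assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question, with StartDate rewritten as ISO text.
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeId INTEGER, EmployeeName TEXT, CompanyId INTEGER, StartDate TEXT
    );
    INSERT INTO Employees VALUES
        (1, 'Employee1', 1, '2011-12-21'), (2, 'Employee2', 1, '2012-01-20'),
        (3, 'Employee3', 2, '2012-03-23'), (4, 'Employee4', 2, '2012-07-15'),
        (5, 'Employee5', 2, '2013-01-20'), (6, 'Employee6', 3, '2013-12-17');
    -- Assumed key: one row per company, year, and month.
    CREATE TABLE RecruiterIndicator (
        CompanyId INTEGER, Year INTEGER, Month INTEGER, EmployeeRecruited INTEGER,
        PRIMARY KEY (CompanyId, Year, Month)
    );
""")

# Aggregate per company and month, then insert or overwrite the stored count.
upsert_sql = """
    INSERT INTO RecruiterIndicator (CompanyId, Year, Month, EmployeeRecruited)
    SELECT CompanyId,
           CAST(strftime('%Y', StartDate) AS INTEGER),
           CAST(strftime('%m', StartDate) AS INTEGER),
           COUNT(*)
    FROM Employees
    WHERE StartDate >= :from_date
    GROUP BY CompanyId, strftime('%Y', StartDate), strftime('%m', StartDate)
    ON CONFLICT (CompanyId, Year, Month)
    DO UPDATE SET EmployeeRecruited = excluded.EmployeeRecruited
"""


def refresh(from_date):
    """Hypothetical helper: recompute counts for hires on or after from_date."""
    with conn:
        conn.execute(upsert_sql, {"from_date": from_date})


refresh("2011-01-01")
refresh("2011-01-01")  # running it again updates the rows instead of duplicating them

rows = conn.execute(
    "SELECT * FROM RecruiterIndicator ORDER BY CompanyId, Year, Month"
).fetchall()
for row in rows:
    print(row)
# (1, 2011, 12, 1)
# (1, 2012, 1, 1)
# (2, 2012, 3, 1)
# (2, 2012, 7, 1)
# (2, 2013, 1, 1)
# (3, 2013, 12, 1)
```

The result matches the RecruiterIndicator table in the question. Months with no hires get no row here; producing zero rows for them would need a separate list of months to join against.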
I have two tables, Distributors and Orders. I want to get the order counts for each month (INCLUDING 0 counts). I am grouping by CustId, Month, and Year.
NOTE : The client is using SQL 2000 :(
This is what I want
DistID Month Year Orders
------------------------------
1 1 2012 4
1 2 2012 13
1 3 2012 5
2 1 2012 3
2 2 2012 0
2 3 2012 0
3 1 2012 8
3 2 2012 0
3 3 2012 3
4 1 2012 1
4 2 2012 0
4 3 2012 1
5 1 2012 6
5 2 2012 6
5 3 2012 0
This is what I get
DistID Month Year Orders
------------------------------
1 1 2012 4
1 2 2012 13
1 3 2012 5
2 1 2012 3
3 1 2012 8
3 3 2012 3
4 1 2012 1
4 3 2012 1
5 1 2012 6
5 2 2012 6
I know why: it's because there isn't a row in the Orders table for certain months. Is there a way to put a count of 0 if there aren't any rows in the Orders table for that month and year?
Here is what I have so far
SELECT
D.DistID,
DATEPART(MONTH, Order_Date) AS [Month],
DATEPART(YEAR, Order_Date) AS [Year],
SUM(Total_PV) AS TotalPV,
COUNT(D.DistId) AS Orders
FROM Distributor D
LEFT OUTER JOIN Order O ON D.DistID = O.Distributor_ID
WHERE DATEPART(YEAR, Order_Date) > 2005
GROUP BY DistID, DATEPART(MONTH, Order_Date), DATEPART(YEAR, Order_Date)
Thanks for any input
You could create a table containing all months and years, like:
create table MonthList(year int, month int);
If you fill it with all available years, you can then left join:
select o.distributor_id
     , ml.month
     , ml.year
     , sum(o.total_pv) as totalpv
     , count(o.distributor_id) as orders -- count the joined order rows; no alias d exists in this query
from monthlist ml
left join
     [order] o
on   datepart(year, o.order_date) = ml.year
     and datepart(month, o.order_date) = ml.month
where ml.year > 2005
group by
       o.distributor_id
     , ml.month
     , ml.year
There is no need to join in Distributor if you don't use columns from that table.
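If a zero row is needed for every distributor in every month, as in the desired output, one variation is to cross join Distributor with the month list first and then left join the orders onto each pair. The sketch below is a swapped-in variant, not the query above. It runs in SQLite through Python's sqlite3 module, with invented sample rows, ISO text dates, and a table named Orders to avoid the reserved word. The join shape itself (CROSS JOIN followed by LEFT JOIN) is plain SQL, but it has not been tested here on SQL Server 2000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: two distributors, three months, four orders.
conn.executescript("""
    CREATE TABLE Distributor (DistID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (Distributor_ID INTEGER, Order_Date TEXT, Total_PV REAL);
    CREATE TABLE MonthList (year INTEGER, month INTEGER);
    INSERT INTO Distributor VALUES (1), (2);
    INSERT INTO MonthList VALUES (2012, 1), (2012, 2), (2012, 3);
    INSERT INTO Orders VALUES
        (1, '2012-01-05', 10), (1, '2012-01-20', 5), (1, '2012-03-02', 7),
        (2, '2012-02-14', 3);
""")

# The cross join builds every (distributor, month) pair; the left join attaches
# matching orders, and COUNT over the order column yields 0 where none match.
sql = """
    SELECT d.DistID, ml.month, ml.year,
           COUNT(o.Distributor_ID) AS OrderCount
    FROM Distributor d
    CROSS JOIN MonthList ml
    LEFT JOIN Orders o
           ON o.Distributor_ID = d.DistID
          AND CAST(strftime('%Y', o.Order_Date) AS INTEGER) = ml.year
          AND CAST(strftime('%m', o.Order_Date) AS INTEGER) = ml.month
    GROUP BY d.DistID, ml.year, ml.month
    ORDER BY d.DistID, ml.year, ml.month
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# (1, 1, 2012, 2)
# (1, 2, 2012, 0)
# (1, 3, 2012, 1)
# (2, 1, 2012, 0)
# (2, 2, 2012, 1)
# (2, 3, 2012, 0)
```

Counting a column from the orders side matters: COUNT(*) or a count of d.DistID would report 1 for the empty months, because the outer join still produces one row per pair.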
I have a query in MS SQL Server asking for a name and some date-related information, depending on two dates, a start date and an end date.
The problem is, I'm not always getting the same performance. Whenever I request something between these dates:
2010-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is excellent. I get my result within milliseconds. When I request something between these dates, for example,
2011-07-01 00:00:00.000 and
2011-07-21 23:59:59.999
the performance is less than good, taking between 20 and 28 seconds for each query. Note that the date range giving good performance spans more than a year, while the slow one spans 20 days.
Is there any particular reason (maybe related to how DATETIME works) for this?
EDIT: The query,
SELECT ENAME,
SUM(CASE DATE WHEN 0 THEN 1 ELSE 0 END) AS U2,
SUM(CASE DATE WHEN 1 THEN 1 ELSE 0 END) AS B_2_4,
SUM(CASE DATE WHEN 2 THEN 1 ELSE 0 END) AS B_4_8,
SUM(CASE DATE WHEN 3 THEN 1 ELSE 0 END) AS B_8_16,
SUM(CASE DATE WHEN 4 THEN 1 ELSE 0 END) AS B_16_24,
SUM(CASE DATE WHEN 5 THEN 1 ELSE 0 END) AS B_24_48,
SUM(CASE DATE WHEN 6 THEN 1 ELSE 0 END) AS O_48,
SUM(CASE DATE WHEN 7 THEN 1 ELSE 0 END) AS status,
AVG(AVG) AS AVG,
SUM(DATE) AS TOTAL
FROM
(SELECT ENAME,
(CASE
WHEN status = 'Öppet' THEN 7
WHEN DATE < 48 THEN
(CASE WHEN DATE BETWEEN 0 AND 2 THEN 0
WHEN DATE BETWEEN 2 AND 4 THEN 1
WHEN DATE BETWEEN 4 AND 8 THEN 2
WHEN DATE BETWEEN 8 AND 16 THEN 3
WHEN DATE BETWEEN 16 AND 24 THEN 4
WHEN DATE BETWEEN 24 AND 48 THEN 5
ELSE - 1 END)
ELSE 6 END) AS DATE,
DATE AS AVG
FROM
(SELECT DATEDIFF(HOUR, cases.date, status.date) AS DATE,
extern.name AS ENAME,
status.status
FROM
cases INNER JOIN
status ON cases.id = status.caseid
AND status.date =
(SELECT MAX(date) AS Expr1
FROM status AS status_1
WHERE (caseid = cases.id)
GROUP BY caseid) INNER JOIN
extern ON cases.owner = extern.id
WHERE (cases.org = 'Expert')
AND (cases.date BETWEEN '2009-01-15 09:48:25.633'
AND '2011-07-21 09:48:25.633'))
AS derivedtbl_1)
AS derivedtbl_2
GROUP BY ENAME
ORDER BY ENAME
(parts of) The tables:
Extern
-ID (->cases.owner)
-name
Cases
-Owner (->Extern.id)
-id (->status.caseid)
-date (case created at this date)
Status
-caseid (->cases.id)
-Status
-Date (can be multiple, MAX(status.date) gives us date when
status was last changed)
I would have thought this is a statistics issue.
When you select only the most recent dates, these may be under-represented in the statistics, because the threshold that triggers an automatic statistics update has not yet been reached.
See this blog post for an example.