Select rows based on count of child table - sql-server

I have three entities: department, employee, and report. A department has many employees, each of whom has many reports. I want to select the one employee in each department who has the most reports. I have no idea how to even start this query. This question seems very similar, but I can't figure out how to manipulate those answers for what I want.
I have full access to the entire system, so I can make any changes necessary. In the event of a tie, it's safe to arbitrarily pick one of the results.
Department:
ID | Name
----|------
1 | DeptA
2 | DeptB
3 | DeptC
4 | DeptD
Employee:
ID | Name | DeptID
----|------|--------
1 | Joe | 1
2 | John | 1
3 | Emma | 2
4 | Jack | 3
5 | Sven | 3
6 | Axel | 4
7 | Brad | 4
8 | Jane | 4
Report:
ID | EmployeeID
----|------------
1 | 1
2 | 2
3 | 3
4 | 5
5 | 6
6 | 6
7 | 8
Desired result (assuming I queried names only):
Joe OR John (either is acceptable)
Emma
Sven
Axel

How to start this query? Well, get the information about each employee, the department, and the number of reports:
select e.name, e.deptid, count(*) as numreports
from employee e join
report r
on e.id = r.employeeid
group by e.id, e.name, e.deptid;
Now you just want the largest count in each department. I would suggest row_number() or rank() depending on how you want to handle ties: row_number() keeps exactly one employee per department, while rank() keeps every employee tied for the top count:
select er.*
from (select e.name, e.deptid, count(*) as numreports,
row_number() over (partition by e.deptid order by count(*) desc) as seqnum
from employee e join
report r
on e.id = r.employeeid
group by e.id, e.name, e.deptid
) er
where seqnum = 1;
If you want the department name instead of number, you can join that in as well.
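As a minimal sketch of that last point, the department table can be joined inside the subquery. This assumes the tables are named department, employee, and report as in the question; the deptname alias is introduced here only for illustration.

```sql
-- Sketch: same ranking query, with the department name joined in.
-- Assumes tables department(id, name), employee(id, name, deptid),
-- and report(id, employeeid) as described in the question.
select er.deptname, er.name, er.numreports
from (select d.name as deptname, e.name, count(*) as numreports,
             row_number() over (partition by e.deptid
                                order by count(*) desc) as seqnum
      from employee e
      join report r on e.id = r.employeeid
      join department d on d.id = e.deptid
      group by d.name, e.id, e.name, e.deptid
     ) er
where er.seqnum = 1;
```

Because these are inner joins, a department whose employees have no reports at all would not appear in the output.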

Based on your question, the schema can be set up as follows (note that this sample adds two extra reports for Jane, IDs 8 and 9):
SELECT * into #Department FROM(
select 1 ID,'DEPTA' NAME
UNION ALL
select 2,'DEPTB'
UNION ALL
select 3,'DEPTC'
UNION ALL
select 4,'DEPTD')TAB
SELECT * INTO #Employee FROM (
SELECT 1 ID ,'Joe' Name , 1 DeptID
UNION ALL
SELECT 2 , 'John' , 1
UNION ALL
SELECT 3 , 'Emma' ,2
UNION ALL
SELECT 4 ,'Jack' , 3
UNION ALL
SELECT 5 ,'Sven' , 3
UNION ALL
SELECT 6 , 'Axel' , 4
UNION ALL
SELECT 7 ,'Brad' , 4
UNION ALL
SELECT 8 ,'Jane' , 4)AS A
SELECT * INTO #Report FROM(
SELECT 1 ID ,1 EmployeeID
UNION ALL
SELECT 2, 2
UNION ALL
SELECT 3 ,3
UNION ALL
SELECT 4, 5
UNION ALL
SELECT 5, 6
UNION ALL
SELECT 6, 6
UNION ALL
SELECT 7, 8
UNION ALL
SELECT 8, 8
UNION ALL
SELECT 9, 8
)AS A
Then apply DENSE_RANK() to rank the employees within each department by their number of reports (count):
;WITH CTE AS(
select DEP.ID DEP_ID, DEP.NAME DEP,EMP.ID EMP_ID, EMP.Name EMP
,DENSE_RANK() OVER(PARTITION BY DEP.ID ORDER BY COUNT(REP.ID) DESC) REP_RANK
,COUNT(REP.ID) NO_OF_REP FROM #Department DEP
inner join #Employee emp on emp.deptid=dep.id
inner join #report rep on rep.EmployeeID=emp.id
GROUP BY DEP.ID, DEP.NAME ,EMP.ID, EMP.Name
)
SELECT DEP, EMP, NO_OF_REP FROM CTE WHERE REP_RANK=1
Here, in DEPTA, both Joe and John are picked because each has 1 report, which is the maximum count in DEPTA.
The result will be:
+-------+------+-----------+
| DEP | EMP | NO_OF_REP |
+-------+------+-----------+
| DEPTA | Joe | 1 |
| DEPTA | John | 1 |
| DEPTB | Emma | 1 |
| DEPTC | Sven | 1 |
| DEPTD | Jane | 3 |
+-------+------+-----------+
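Since the question says that any one employee may be picked on a tie, a possible variation is to swap DENSE_RANK() for ROW_NUMBER() with a tie-breaker. The sketch below reuses the temp tables defined above; the REP_ROW alias and the choice of EMP.ID as tie-breaker are assumptions made for illustration.

```sql
-- Sketch: return exactly one employee per department.
-- ROW_NUMBER() never repeats a number within a partition, and the
-- secondary sort on EMP.ID makes the choice among ties deterministic.
;WITH CTE AS (
    SELECT DEP.NAME AS DEP, EMP.Name AS EMP, COUNT(REP.ID) AS NO_OF_REP,
           ROW_NUMBER() OVER (PARTITION BY DEP.ID
                              ORDER BY COUNT(REP.ID) DESC, EMP.ID) AS REP_ROW
    FROM #Department DEP
    INNER JOIN #Employee EMP ON EMP.DeptID = DEP.ID
    INNER JOIN #Report REP ON REP.EmployeeID = EMP.ID
    GROUP BY DEP.ID, DEP.NAME, EMP.ID, EMP.Name
)
SELECT DEP, EMP, NO_OF_REP FROM CTE WHERE REP_ROW = 1;
```

With the sample data above, only Joe would be returned for DEPTA, because he has the lower employee ID of the two tied employees.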

Please try the code below. Because it uses RANK(), all employees tied for the highest count in a department are returned:
SELECT D.NAME
FROM (
SELECT C.NAME, RANK() OVER (
PARTITION BY C.DEPTID ORDER BY C.COUNTS DESC
) RNK
FROM (
SELECT A.EMPLOYEEID, B.NAME, COUNT(A.EMPLOYEEID) AS COUNTS, B.DEPTID
FROM DBO.REPORT AS A
JOIN DBO.EMPLOYEE AS B ON A.EMPLOYEEID = B.ID
GROUP BY A.EMPLOYEEID, B.NAME, B.DEPTID
) AS C
) AS D
WHERE D.RNK = 1

Related

SQL Query a range of data and return NULL if the data is NOT exist

I'm trying to populate a line chart from a list of data. My X-axis will be StartTime and my Y-axis will be Total.
Is it possible to query a range of data and, where a value in the range is not in the database, return it as NULL instead of omitting the row? Take the example below:
|StartTime |Qty |
-----------------------
|10 |1 |
|11 |3 |
|12 |2 |
|13 |1 |
|11 |2 |
My expected result, with a WHERE clause restricting StartTime to between 9 and 12:
|StartTime |TOTAL |
-----------------------
|9 |NULL |
|10 |1 |
|11 |5 |
|12 |2 |
Can anyone show me an example of what the query would be? I have no idea at all.
You can have another table with the StartTime values you want to show on the graph. Left join it with your first table and group by StartTime from the new table. Use the sum of Qty for your purpose.
TableTimes:
| StartTime |
--------------
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
Select TT.StartTime, Sum(FT.Qty) As Total From TableTimes TT
Left Join FirstTable FT on TT.StartTime=FT.StartTime
Where TT.StartTime Between 9 and 12
Group by TT.StartTime
You can get around creating a brand new table by using a sub-query or a CTE:
select h.Hour, Sum(i.Qty) as Qty
from ItemsPerHour i
right outer join (
select 1 as Hour union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10 union all
select 11 union all
select 12 union all
select 13 union all
select 14 union all
select 15 union all
select 16 union all
select 17 union all
select 18 union all
select 19 union all
select 20 union all
select 21 union all
select 24
) h
on h.Hour = i.StartTime
group by h.Hour
order by h.Hour;
Using WITH for an hours CTE:
with hours as
(
select 1 as Hour union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10 union all
select 11 union all
select 12 union all
select 13 union all
select 14 union all
select 15 union all
select 16 union all
select 17 union all
select 18 union all
select 19 union all
select 20 union all
select 21 union all
select 24
)
select h.Hour, Sum(i.Qty) as Qty
from ItemsPerHour i
right outer join hours h
on h.Hour = i.StartTime
group by h.Hour
order by h.Hour;
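If typing out every hour is too verbose, one possible alternative in SQL Server is a recursive CTE that generates just the requested range. This is a sketch under the assumption that the data lives in ItemsPerHour(StartTime, Qty) as in the queries above and that the range 9 to 12 from the question is wanted.

```sql
-- Sketch: generate the hours 9..12 with a recursive CTE,
-- then left join the data so that missing hours still appear.
with hours as
(
    select 9 as Hour            -- anchor: first hour of the range
    union all
    select Hour + 1 from hours  -- recursive step: next hour
    where Hour < 12             -- stop at the last hour of the range
)
select h.Hour as StartTime, sum(i.Qty) as Total
from hours h
left join ItemsPerHour i on i.StartTime = h.Hour
group by h.Hour
order by h.Hour;
```

For hour 9, which has no matching rows in the sample data, SUM sees only NULLs and returns NULL, which matches the expected result in the question.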

Rank consecutive null values

I want to rank consecutive NULL values in my records. Every non-NULL record gets rank 1. A NULL value that appears on its own also gets rank 1. But for NULL values that appear consecutively, the rank is 1 for the first record, 2 for the second record, and so on. Here's my code.
CREATE TABLE #my_table
(
id BIGINT IDENTITY PRIMARY KEY
,fruit varchar(100)
);
INSERT INTO #my_table
SELECT 'apple'
UNION ALL SELECT 'apple'
UNION ALL SELECT NULL
UNION ALL SELECT 'pineapple'
UNION ALL SELECT 'banana'
UNION ALL SELECT NULL
UNION ALL SELECT NULL
UNION ALL SELECT 'orange'
select * from #my_table
Intended result
+----+-----------+------+
| id | fruit | rank |
+----+-----------+------+
| 1 | apple | 1 |
| 2 | apple | 1 |
| 3 | NULL | 1 |
| 4 | pineapple | 1 |
| 5 | banana | 1 |
| 6 | NULL | 1 |
| 7 | NULL | 2 |
| 8 | orange | 1 |
+----+-----------+------+
How should I query it?
Please help!
You can use difference of ROW_NUMBER to get the grouping of consecutive NULL values:
WITH Cte AS(
SELECT *,
g = ROW_NUMBER() OVER(ORDER BY id)
- ROW_NUMBER() OVER(PARTITION BY fruit ORDER BY id)
FROM #my_table
)
SELECT
id,
fruit,
CASE
WHEN fruit IS NULL THEN ROW_NUMBER() OVER(PARTITION BY fruit, g ORDER BY id)
ELSE 1
END AS rank
FROM Cte
ORDER BY id;
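To see why the difference of the two ROW_NUMBER values groups consecutive rows, it may help to look at the intermediate columns. The sketch below runs against the same #my_table; the rn_all and rn_fruit aliases are introduced here only for illustration.

```sql
-- Sketch: expose the two row numbers and their difference g.
-- Rows with the same fruit value that are adjacent by id share
-- the same g, because both counters advance by 1 together.
SELECT id, fruit,
       ROW_NUMBER() OVER(ORDER BY id) AS rn_all,
       ROW_NUMBER() OVER(PARTITION BY fruit ORDER BY id) AS rn_fruit,
       ROW_NUMBER() OVER(ORDER BY id)
         - ROW_NUMBER() OVER(PARTITION BY fruit ORDER BY id) AS g
FROM #my_table
ORDER BY id;
```

With the sample data, the lone NULL at id 3 gets g = 2 (3 minus 1), while the NULLs at ids 6 and 7 both get g = 4 (6 minus 2 and 7 minus 3), so partitioning by fruit and g restarts the numbering for each run of NULLs.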
CREATE TABLE #my_table
(
id BIGINT IDENTITY PRIMARY KEY
,fruit varchar(100)
);
INSERT INTO #my_table
SELECT 'apple'
UNION ALL SELECT 'apple'
UNION ALL SELECT NULL
UNION ALL SELECT 'pineapple'
UNION ALL SELECT 'banana'
UNION ALL SELECT NULL
UNION ALL SELECT NULL
UNION ALL SELECT 'orange'
;
WITH REC_CTE (id,fruit,ranks)
AS (
-- Anchor definition
SELECT id,
fruit,
1 as ranks
FROM #my_table
WHERE fruit is not null
-- Recursive definition
UNION ALL
SELECT son.id,
son.fruit,
case when son.fruit is null AND father.fruit is null then
father.ranks + 1
else
1
end as ranks
FROM #my_table son INNER JOIN
REC_CTE father
on son.id = father.id +1
WHERE son.fruit is null
--AND father.fruit is null
)
SELECT * from REC_CTE order by id
DROP TABLE #my_table
The following solution doesn't use recursion (which is limited to 32767 levels, i.e. roughly that many rows depending on the solution), and it uses only two aggregate/ranking functions (SUM and DENSE_RANK):
;WITH Base
AS (
SELECT *, IIF(fruit IS NULL, SUM(IIF(fruit IS NOT NULL, 1, 0)) OVER(ORDER BY id), NULL) AS group_num
FROM #my_table t
)
SELECT *, IIF(fruit IS NULL, DENSE_RANK() OVER(PARTITION BY group_num ORDER BY id), 1) rnk
FROM Base b
ORDER BY id
Results:
id fruit group_num rnk
--- --------- --------- ---
100 apple NULL 1
125 apple NULL 1
150 NULL 2 1
175 pineapple NULL 1
200 banana NULL 1
225 NULL 4 1
250 NULL 4 2
275 orange NULL 1
300 NULL 5 1
325 NULL 5 2
350 NULL 5 3

SQL Server 2012 - Looking for duplicates with differences

In SQL Server 2012, I have a table like this:
Id | AccountID | Accession | Status
----------------------------------------
1 | 1234567 | ABCD | F
2 | 1234567 | ABCD | F
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
5 | 3456789 | CDEF | F
6 | 9876543 | CDEF | A
I need to find rows that have the same Accession and a Status of "F", but a different AccountID.
I need a query that would return:
Id | AccountID | Accession | Status
----------------------------------------
3 | 2345678 | BCDE | F
4 | 8765432 | BCDE | F
1 and 2 wouldn't be returned because they have the same AccountID. 5 and 6 wouldn't be returned because the status on 6 is "A" and not "F".
You could do something like this.
;WITH NonDupAccountIDs AS
(
SELECT AccountID,Accession, Status
FROM MyTable
WHERE Status = 'F'
GROUP BY AccountID,Accession, Status
HAVING COUNT(Id) = 1
)
,DupAccessions AS
(
SELECT Accession
FROM MyTable
WHERE Status = 'F'
GROUP BY Accession
HAVING COUNT(AccountID) > 1
)
select a.AccountID, a.Accession, a.Status
FROM NonDupAccountIDs a
INNER JOIN DupAccessions b
ON a.Accession = b.Accession
Another alternative
Declare @Table table (id int,AccountID varchar(25),Accession varchar(25),Status varchar(25))
Insert into @Table (id , AccountID , Accession , Status) values
(1, 1234567,'ABCD','F'),
(2, 1234567,'ABCD','F'),
(3, 2345678,'BCDE','F'),
(4, 8765432,'BCDE','F'),
(5, 3456789,'CDEF','F'),
(6, 9876543,'CDEF','A')
Select A.*
from @Table A
Join (
Select Accession
From @Table
Where Status='F'
Group By Accession
Having Min(Accession)=Max(Accession)
and count(Distinct AccountID)>1
) B on A.Accession=B.Accession
Returns
id AccountID Accession Status
3 2345678 BCDE F
4 8765432 BCDE F
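Another way to express the same requirement is a correlated EXISTS: keep a row with Status 'F' only if some other 'F' row shares its Accession but has a different AccountID. This is a sketch, assuming the table is called MyTable as in the first answer.

```sql
-- Sketch: a row qualifies when another 'F' row has the same
-- Accession but a different AccountID.
SELECT t.Id, t.AccountID, t.Accession, t.Status
FROM MyTable t
WHERE t.Status = 'F'
  AND EXISTS (SELECT 1
              FROM MyTable o
              WHERE o.Accession = t.Accession
                AND o.Status = 'F'
                AND o.AccountID <> t.AccountID);
```

On the sample data this keeps rows 3 and 4 only: rows 1 and 2 share an AccountID, and row 5 has no other 'F' row with Accession CDEF. Unlike the join above, the outer filter on Status means a non-'F' row is never returned even when its Accession qualifies.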
This works as well. If there are multiple sets of duplicates, it only returns the one with the highest ID.
John Cappelletti had a great solution as well; his returns all duplicated values if any incongruity exists.
I had to add some more data to see what would happen. You should decide how you will treat these occurrences.
select
max(ID) ID,AccountID, Accession
from p where Status = 'F'
group by AccountID, Accession
having
(select count(Accession) from (select max(ID) ID,AccountID, Accession from p where Status = 'F' group by AccountID, Accession) f where f.accession = p.accession)>1
;
SELECT t2.Id, t1.AccountID, t1.Accession, t1.Status
FROM TABLE_NAME t2
INNER JOIN (
SELECT AccountID, Accession, Status
FROM TABLE_NAME
GROUP BY Status, Accession, AccountID
) t1
ON t1.AccountID = t2.AccountID
Might need to play with this but should get you close. Remember to replace TABLE_NAME with your table.

LAG of MIN in SQL Analytic

I have a table containing the employee id, year id, client id, and the number of sales. For example:
--------------------------------------
id_emp | id_year | sales | client id
--------------------------------------
4 | 1 | 14 | 1
4 | 1 | 10 | 2
4 | 2 | 11 | 1
4 | 2 | 17 | 2
For an employee, I want to obtain rows with the minimum sales per year and the minimum sales of the previous year.
One of the queries I tried is the following:
select distinct
id_emp,
id_year,
MIN(sales) OVER(partition by id_emp, id_year) AS min_sales,
LAG(min(sales), 1) OVER(PARTITION BY id_emp, id_year
ORDER BY id_emp, id_year) AS previous
from facts
where id_emp = 4
group by id_emp, id_year, sales;
I get the result:
-------------------------------------
id_emp | id_year | sales | previous
-------------------------------------
4 | 1 | 10 | (null)
4 | 1 | 10 | 10
4 | 2 | 11 | (null)
but I expect to get:
-------------------------------------
id_emp | id_year | sales | previous
-------------------------------------
4 | 1 | 10 | (null)
4 | 2 | 11 | 10
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE EMPLOYEE_SALES ( id_emp, id_year, sales, client_id ) AS
SELECT 4, 1, 14, 1 FROM DUAL
UNION ALL SELECT 4, 1, 10, 2 FROM DUAL
UNION ALL SELECT 4, 2, 11, 1 FROM DUAL
UNION ALL SELECT 4, 2, 17, 2 FROM DUAL;
Query 1:
SELECT ID_EMP,
ID_YEAR,
SALES AS SALES,
LAG( SALES ) OVER ( PARTITION BY ID_EMP ORDER BY ID_YEAR ) AS PREVIOUS
FROM (
SELECT e.*,
ROW_NUMBER() OVER ( PARTITION BY id_emp, id_year ORDER BY sales ) AS RN
FROM EMPLOYEE_SALES e
)
WHERE rn = 1
Query 2:
SELECT ID_EMP,
ID_YEAR,
MIN( SALES ) AS SALES,
LAG( MIN( SALES ) ) OVER ( PARTITION BY ID_EMP ORDER BY ID_YEAR ) AS PREVIOUS
FROM EMPLOYEE_SALES
GROUP BY ID_EMP, ID_YEAR
Results - Both give the same output:
| ID_EMP | ID_YEAR | SALES | PREVIOUS |
|--------|---------|-------|----------|
| 4 | 1 | 10 | (null) |
| 4 | 2 | 11 | 10 |
You mean like this?
select id_emp, id_year, min(sales) as min_sales,
lag(min(sales)) over (partition by id_emp order by id_year) as prev_year_min_sales
from facts
where id_emp = 4
group by id_emp, id_year;
I believe it is because you are using sales column in your group by statement.
Try to remove it and just use
GROUP BY id_emp,id_year
You could get your desired output using ROW_NUMBER() and LAG() analytic functions.
For example,
Table
SQL> SELECT * FROM t;
ID_EMP ID_YEAR SALES CLIENT_ID
---------- ---------- ---------- ----------
4 1 14 1
4 1 10 2
4 2 11 1
4 2 17 2
Query
WITH DATA AS
(SELECT t.*,
row_number() OVER(PARTITION BY id_emp, id_year ORDER BY sales) rn
FROM t
)
SELECT id_emp,
id_year ,
sales ,
lag(sales) over(PARTITION BY id_emp ORDER BY id_year) previous
FROM DATA
WHERE rn =1;
ID_EMP ID_YEAR SALES PREVIOUS
---------- ---------- ---------- ----------
4 1 10
4 2 11 10

SQL Server : select distinct by one column and by another column value

This is a SQL Server table's data
id user_id start_date status_id payment_id
======================================================
2 4 20-nov-11 1 5
3 5 23-nov-11 1 245
4 5 25-nov-11 1 128
5 6 20-nov-11 1 223
6 6 25-nov-11 2 542
7 4 29-nov-11 2 123
8 4 05-jan-12 2 875
I need to get distinct rows by user_id, also ordered by id asc, but with only one row per user_id: the one with the highest start_date.
I need the following output:
id user_id start_date status_id payment_id
======================================================
8 4 05-jan-12 2 875
4 5 25-nov-11 1 128
6 6 25-nov-11 2 542
Please help!
What is SQL query for this?
You can use row_number() in either a sub-query or using CTE.
Subquery Version:
select id, user_id, start_date, status_id, payment_id
from
(
select id, user_id, start_date, status_id, payment_id,
row_number() over(partition by user_id order by start_date desc) rn
from yourtable
) src
where rn = 1
CTE Version:
;with cte as
(
select id, user_id, start_date, status_id, payment_id,
row_number() over(partition by user_id order by start_date desc) rn
from yourtable
)
select id, user_id, start_date, status_id, payment_id
from cte
where rn = 1
Or you can join the table to itself:
select t1.id,
t1.user_id,
t1.start_date,
t1.status_id,
t1.payment_id
from yourtable t1
inner join
(
select user_id, max(start_date) start_date
from yourtable
group by user_id
) t2
on t1.user_id = t2.user_id
and t1.start_date = t2.start_date
All of the queries will produce the same result:
| ID | USER_ID | START_DATE | STATUS_ID | PAYMENT_ID |
---------------------------------------------------------------------------
| 8 | 4 | January, 05 2012 00:00:00+0000 | 2 | 875 |
| 4 | 5 | November, 25 2011 00:00:00+0000 | 1 | 128 |
| 6 | 6 | November, 25 2011 00:00:00+0000 | 2 | 542 |
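One case worth considering is two rows for the same user_id sharing the highest start_date: the self-join would then return both rows, and row_number() would pick one of them arbitrarily. A possible sketch that makes the choice explicit is shown below; using id as the tie-breaker and sorting the output by user_id are assumptions for illustration, not something the question specifies.

```sql
-- Sketch: one row per user_id, latest start_date first;
-- on equal dates the row with the higher id wins.
select id, user_id, start_date, status_id, payment_id
from
(
    select id, user_id, start_date, status_id, payment_id,
           row_number() over(partition by user_id
                             order by start_date desc, id desc) rn
    from yourtable
) src
where rn = 1
order by user_id;
```

The explicit ORDER BY on the outer query also fixes the order of the returned rows, which none of the queries above guarantee on their own.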
Not the best and untested:
select *
from ServersTable
join (
select User_Id, max(Id) as ID
from ServersTable x
where x.start_date = (
select max(start_date)
from ServersTable y
where y.User_Id = x.User_Id
)
group by User_Id) s on ServersTable.Id = s.Id
