I have staff members who are assigned tasks, and I need to find the percentage of tasks each staff member has completed year-to-date, out of those that were assigned to them. If John is assigned 10 tasks and has completed 5, I need to show that John has closed .50 (50%).
I have two tables:
Tasks and Tasks_cstm
Tasks t
| ID | STATUS |Date_Completed|
The statuses are 'In Progress', 'Not Started', 'Completed'
Tasks_cstm tc
| ID_C|Staff_Member|
The tables are joined on t.id = tc.id_c
This returns the number completed:
(select count(*)as Completed from tasks_CSTM tc
join tasks t
on t.id = tc.id_c
where status = 'completed'
group by staff_member_C )
This returns the total number of tasks:
(select count(*)as Total from tasks_CSTM tc
join tasks t
on t.id = tc.id_c
group by staff_member_C )
This is what I've come up with, but it errors: Subquery returned more than 1 value.
select staff_member_c,((select count(*)as Completed from tasks_CSTM tc
join tasks t
on t.id = tc.id_c
where status = 'completed'
group by staff_member_C )/(select count(*)as Total from tasks_CSTM tc
join tasks t
on t.id = tc.id_c
group by staff_member_C ))
from tasks t
join tasks_CSTM tc
on t.id = tc.id_C
group by staff_member_C
Any help is appreciated.
Something like this I think:
select staff_member_c, sum(case when status='completed' then 1.0 end)/count(*) as pctCompleted
from tasks_cstm tc
join tasks t
on t.id = tc.id_c
group by staff_member_c
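You may want "else 0.0" in the case expression: without it, a staff member with no completed tasks gets NULL rather than 0 (in MSSQL too), because SUM over only NULLs returns NULL. You could also use nullif(count(*),0) in the denominator, but that is probably unnecessary in any DBMS, since every group contains at least one row.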
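To see what the missing ELSE does, here is a small sketch that runs both variants side by side. It uses Python's built-in sqlite3 rather than SQL Server, and the sample rows (John, Mary) are made up for illustration; the NULL behaviour of SUM shown here is standard SQL, but treat it as a quick check rather than a T-SQL test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER, status TEXT);
    CREATE TABLE tasks_cstm (id_c INTEGER, staff_member_c TEXT);
    -- Hypothetical sample data: John completed 1 of 2, Mary completed 0 of 1.
    INSERT INTO tasks VALUES (1, 'Completed'), (2, 'In Progress'), (3, 'Not Started');
    INSERT INTO tasks_cstm VALUES (1, 'John'), (2, 'John'), (3, 'Mary');
""")

query = """
    SELECT tc.staff_member_c,
           SUM(CASE WHEN t.status = 'Completed' THEN 1.0 END) / COUNT(*) AS without_else,
           SUM(CASE WHEN t.status = 'Completed' THEN 1.0 ELSE 0.0 END) / COUNT(*) AS with_else
    FROM tasks_cstm tc
    JOIN tasks t ON t.id = tc.id_c
    GROUP BY tc.staff_member_c
    ORDER BY tc.staff_member_c
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # Mary's without_else column comes back as None (NULL)
```

If a NULL percentage is acceptable for staff with nothing completed, the shorter form is fine; otherwise the ELSE branch is the simpler fix.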
There are a couple of issues to grapple with here. One is the "year-to-date" part: with only a Date_Completed column, there's no way to know when a task was assigned or created, so year-to-date figures can't be computed. Setting that part of the question aside, here's a query that should work. The commented-out WHERE clause can easily be adapted to use a Date_Assigned column for year-to-date info.
select
staff_member
, sum(case t.status when 'Completed' then 1.0 else 0 end) [Completed]
, count(*) [Total]
, sum(case t.status when 'Completed' then 1.0 else 0 end) / count(*) [CompletedPercent]
from
tasks t
inner join tasks_cstm tc
on t.id = tc.id_C
--where
-- dateadd(year, datediff(year, 0, Date_Assigned), 0) = dateadd(year, datediff(year, 0, getdate()), 0)
group by
staff_member
And here's the setup code I used to (non-comprehensively) test it out:
create table tasks (ID int, Status varchar(50), Date_Completed date)
create table tasks_cstm (ID_C int, Staff_Member varchar(50))
insert into tasks
select 1, 'Not Started', null
union all
select 2, 'Completed', '2011-04-15'
union all
select 3, 'In Progress', null
insert into tasks_cstm
select 1, 'Cadaeic'
union all
select 2, 'Cadaeic'
union all
select 3, 'Cadaeic'
Resulting in this:
staff_member Completed Total CompletedPercent
------------------- -------------------- ----------- -----------------------
Cadaeic 1.0 3 0.333333
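As a rough illustration of the year-to-date filter, the sketch below assumes a Date_Assigned column exists (it does not in the question's schema, so it is hypothetical) and restricts the calculation to tasks assigned between January 1 and "today". It runs on Python's sqlite3 with ISO date strings; in T-SQL the same idea would be a range predicate on Date_Assigned.

```python
import sqlite3
from datetime import date

def completion_ytd(conn, today):
    """Return {staff_member: completed fraction} for tasks assigned this calendar year."""
    year_start = date(today.year, 1, 1).isoformat()
    rows = conn.execute(
        """
        SELECT tc.staff_member,
               SUM(CASE WHEN t.status = 'Completed' THEN 1.0 ELSE 0.0 END) / COUNT(*)
        FROM tasks t
        JOIN tasks_cstm tc ON t.id = tc.id_c
        WHERE t.date_assigned >= ? AND t.date_assigned <= ?
        GROUP BY tc.staff_member
        """,
        (year_start, today.isoformat()),
    ).fetchall()
    return dict(rows)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- date_assigned is a hypothetical column, stored as ISO text (YYYY-MM-DD)
    CREATE TABLE tasks (id INTEGER, status TEXT, date_assigned TEXT);
    CREATE TABLE tasks_cstm (id_c INTEGER, staff_member TEXT);
    INSERT INTO tasks VALUES
        (1, 'Completed',   '2011-01-10'),
        (2, 'In Progress', '2011-03-02'),
        (3, 'Completed',   '2010-12-20');  -- assigned the previous year, so excluded
    INSERT INTO tasks_cstm VALUES (1, 'Cadaeic'), (2, 'Cadaeic'), (3, 'Cadaeic');
""")
result = completion_ytd(conn, date(2011, 4, 15))
print(result)
```

Comparing the column against a date range, instead of wrapping it in date functions, also leaves an index on that column usable.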
-- Tasks
declare @T table(ID int, Status varchar(20))
-- Tasks_cstm
declare @TC table(ID_C int, Staff_Member varchar(20))
insert into @TC values (1, 'Staff 1')
insert into @TC values (2, 'Staff 2')
insert into @TC values (3, 'Staff 3')
insert into @T values (1, 'Completed')
insert into @T values (1, 'Completed')
insert into @T values (1, 'In Progress')
insert into @T values (2, 'Completed')
insert into @T values (2, 'In Progress')
insert into @T values (3, 'In Progress')
select *
from @TC as TC
cross apply
(select sum(case T.Status when 'Completed' then 1.0 else 0.0 end) / count(*)
from @T as T
where T.ID = TC.ID_C) as C(PercentCompleted)
Result
ID_C Staff_Member PercentCompleted
----------- -------------------- ---------------------------------------
1 Staff 1 0.666666
2 Staff 2 0.500000
3 Staff 3 0.000000
Related
ID Grade Result
1 A Good
1 B Good
1 C Good
1 D Good
2 A Good
2 B Good
3 B Bad
3 C Bad
3 D Bad
.
.
.
Can I check conditions for each ID number? If an ID has both an A and a B, the result is Good; if an ID doesn't have an A and has a D, the result is Bad. I need some kind of procedure or loop to check the condition for every ID and return a value in a new column. I really just need to know whether this can be done in T-SQL. I have a table with hundreds of ID numbers.
Try this:
if object_id('tempdb..#tmp') is not null drop table #tmp
select 1 as ID, 'A' as Grade into #tmp
union select 1, 'B'
union select 1, 'C'
union select 1, 'D'
union select 2, 'A'
union select 2, 'B'
union select 3, 'B'
union select 3, 'C'
union select 3, 'D'
select * from #tmp
select ID, Grade,
case when ID in (select ID from #tmp where Grade in ('A','B') group by ID having count(distinct Grade)=2) then 'Good'
when ID not in (select ID from #tmp where Grade = 'A') and ID in (select ID from #tmp where Grade = 'D') then 'Bad' end as Result
from #tmp
You can do it with SUM() window function:
select t.ID, t.Grade,
case
when t.hasA > 0 and t.hasB > 0 then 'Good'
when t.hasA = 0 and t.hasD > 0 then 'Bad'
-- else ?
end Result
from (
select *,
sum(case when Grade = 'A' then 1 else 0 end) over (partition by ID) hasA,
sum(case when Grade = 'B' then 1 else 0 end) over (partition by ID) hasB,
sum(case when Grade = 'D' then 1 else 0 end) over (partition by ID) hasD
from tablename
) t
See the demo.
Results:
> ID | Grade | Result
> -: | :---- | :-----
> 1 | A | Good
> 1 | B | Good
> 1 | C | Good
> 1 | D | Good
> 2 | A | Good
> 2 | B | Good
> 3 | B | Bad
> 3 | C | Bad
> 3 | D | Bad
You could select the matching ids using GROUP BY and a HAVING clause,
and merge the results using UNION.
The first select checks for both A and B grades; the second checks for a D without an A.
select m1.id, m1.grade, 'Good' Result
from my_table m1
INNER JOIN (
select id
from my_table
where grade in ('A', 'B')
group by id
having count(distinct grade) = 2
) t1 on t1.id = m1.id
UNION
select m2.id, m2.grade, 'Bad'
from my_table m2
INNER JOIN (
select id
from my_table
where grade IN ('A', 'D')
group by id
-- exactly one of A/D is present, and it is D
having count(distinct grade) = 1 and max(grade) = 'D'
) t2 ON t2.id = m2.id
Use aggregation and case:
select id,
(case when sum(case when grade not in ('A', 'B') then 1 else 0 end) = 0
then 'good'
when sum(case when grade = 'A' then 1 else 0 end) = 0 and
sum(case when grade = 'D' then 1 else 0 end) > 0
then 'bad'
else 'middling'
end) as overall
from t
group by id;
Note that I interpreted your first condition as "ids having only 'A's and 'B's."
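For comparison, here is a sketch of the literal reading of the question (an id is Good if it has both an A and a B, Bad if it has a D and no A), using per-id aggregation joined back to the detail rows. It runs on Python's sqlite3 with the question's sample data; the table name grades is an assumption, and ids matching neither rule get NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grades (id INTEGER, grade TEXT);  -- hypothetical table name
    INSERT INTO grades VALUES
        (1,'A'),(1,'B'),(1,'C'),(1,'D'),
        (2,'A'),(2,'B'),
        (3,'B'),(3,'C'),(3,'D');
""")

query = """
    SELECT g.id, g.grade, s.result
    FROM grades g
    JOIN (
        -- One row per id, with 0/1 flags folded into a single CASE
        SELECT id,
               CASE
                   WHEN MAX(CASE WHEN grade = 'A' THEN 1 ELSE 0 END) = 1
                    AND MAX(CASE WHEN grade = 'B' THEN 1 ELSE 0 END) = 1 THEN 'Good'
                   WHEN MAX(CASE WHEN grade = 'A' THEN 1 ELSE 0 END) = 0
                    AND MAX(CASE WHEN grade = 'D' THEN 1 ELSE 0 END) = 1 THEN 'Bad'
               END AS result
        FROM grades
        GROUP BY id
    ) s ON s.id = g.id
    ORDER BY g.id, g.grade
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The inner query is plain conditional aggregation, so it should carry over to T-SQL largely unchanged.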
You can try following:
select a.ID,a.Result
into #tmp2 from
(select ID, 'Good' Result from #yourtable where Grade='A') a
inner join
(select ID, 'Good' Result from #yourtable where Grade='B' ) b
on a.ID=b.ID
select t.ID,t.Grade,ISNULL(t2.Result,'Bad') Result
into #result
from #yourtable t
left join #tmp2 t2
on t.ID=t2.ID
select * from #result
This is one approach using the STRING_AGG function.
Note that STRING_AGG requires SQL Server 2017 or later.
DROP TABLE IF EXISTS TEST
CREATE TABLE TEST (ID INT, GRADE VARCHAR(2))
INSERT INTO TEST VALUES
(1, 'B'),
(1, 'A'),
(1, 'C'),
(1, 'D'),
(2, 'A'),
(2, 'B'),
(3, 'B'),
(3, 'C'),
(3, 'D')
;WITH CTE
AS
(
SELECT ID, STRING_AGG(GRADE,',') WITHIN GROUP (ORDER BY GRADE) AS GRADE_STRING
FROM TEST
GROUP BY ID
)
SELECT TEST.ID, TEST.GRADE,
CASE WHEN GRADE_STRING LIKE '%A%B%' THEN 'GOOD'
WHEN GRADE_STRING LIKE '%D%' AND GRADE_STRING NOT LIKE '%A%' THEN 'BAD'
END AS RESULTS
FROM CTE
LEFT JOIN TEST
ON CTE.ID = TEST.ID
On my Microsoft SQL Server 2016 database I'm trying to determine how many labs (Lab_Space table) have had an assessment (EHS_Assessment_Audit table) done within the last year, grouped by location (Locations table). It's common to have more than one assessment done per year per lab.
Everything I've tried so far results in more "done" counts than labs. For example:
WITH cte AS
(
SELECT DISTINCT
Lab_Space_Id
FROM
EHS_Assessment_Audit
WHERE
Audit_Date >= DATEADD(year, -1, GETDATE())
)
SELECT
l.Site_Name, l.Campus_Name,
COUNT(DISTINCT s.id) Total,
SUM(CASE WHEN a.Lab_Space_ID IS NOT NULL THEN 1 ELSE 0 END) Audited
FROM
Lab_Space s
LEFT OUTER JOIN
cte a ON s.id = a.Lab_Space_Id
JOIN
Locations l ON l.Building_Code = s.Building_Code
GROUP BY
l.Site_Name, l.Campus_Name
ORDER BY
l.Site_Name, l.Campus_Name
The CTE should give me a unique list of labs that have had an assessment done, which I then try to count grouped by location. Instead, the output says, for example, that there are 178 total and 1080 audited for a single site/campus combination.
I think using a CTE in this case is going to be more trouble than it's worth. A subquery is going to be easier to read and modify. For example:
SELECT
l.Site_Name,
l.Campus_Name,
COALESCE(b.NumAudits, 0) as NumTotalAudits,
COALESCE(b.NumLabs, 0) as AuditedLabs
FROM Locations l
LEFT JOIN (
SELECT s.Building_Code, COUNT(*) as NumAudits, COUNT(DISTINCT s.Lab_Space_Id) as NumLabs
FROM Lab_Space s
INNER JOIN EHS_Assessment_Audit a ON s.Lab_Space_Id = a.Lab_Space_Id
WHERE a.Audit_Date >= DATEADD(year, -1, GETDATE())
GROUP BY s.Building_Code
) as b ON l.Building_Code = b.Building_Code
With overly simplistic temp tables and example data:
CREATE TABLE #EHS_Assessment_Audit (Lab_Space_Id int, Audit_Date datetime)
CREATE TABLE #Lab_Space (Lab_Space_Id int, Building_Code int)
CREATE TABLE #Locations (Location_Id int, Building_Code int, Site_Name nvarchar(30), Campus_Name nvarchar(30))
INSERT INTO #Locations VALUES (1, 1, 'Location1', 'Campus1'), (2, 2, 'Location2', 'Campus2')
INSERT INTO #Lab_Space VALUES (1, 1), (2, 1), (3, 2), (4, 2)
INSERT INTO #EHS_Assessment_Audit VALUES (1, '2018-10-11'), (1, '2018-09-11'), (2, '2018-10-11'), (3, '2015-10-11')
SELECT * FROM #Locations
SELECT * FROM #Lab_Space
SELECT * FROM #EHS_Assessment_Audit
SELECT
l.Site_Name,
l.Campus_Name,
COALESCE(b.NumAudits, 0) as NumTotalAudits,
COALESCE(b.NumLabs, 0) as AuditedLabs
FROM #Locations l
LEFT JOIN (
SELECT s.Building_Code, COUNT(*) as NumAudits, COUNT(DISTINCT s.Lab_Space_Id) as NumLabs
FROM #Lab_Space s
INNER JOIN #EHS_Assessment_Audit a ON s.Lab_Space_Id = a.Lab_Space_Id
WHERE a.Audit_Date >= DATEADD(year, -1, GETDATE())
GROUP BY s.Building_Code
) as b ON l.Building_Code = b.Building_Code
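One possible cause of "more audited than total" in the original query, assuming Locations holds several rows per Building_Code, is that the join fans out each lab row: COUNT(DISTINCT s.id) hides the duplicates while the SUM counts every copy. The sketch below reproduces that on Python's sqlite3 with made-up rows and a fixed cutoff date in place of DATEADD/GETDATE, and compares the SUM with a COUNT(DISTINCT ...) on the audited lab id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Locations (Building_Code INTEGER, Site_Name TEXT, Campus_Name TEXT);
    CREATE TABLE Lab_Space (id INTEGER, Building_Code INTEGER);
    CREATE TABLE EHS_Assessment_Audit (Lab_Space_Id INTEGER, Audit_Date TEXT);
    -- Assumption: two Locations rows share Building_Code 1, which causes the fan-out.
    INSERT INTO Locations VALUES
        (1, 'Site1', 'Campus1'), (1, 'Site1', 'Campus1'), (2, 'Site2', 'Campus2');
    INSERT INTO Lab_Space VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO EHS_Assessment_Audit VALUES
        (1, '2018-10-11'), (1, '2018-09-11'), (2, '2018-10-11'), (3, '2015-10-11');
""")

query = """
    WITH audited AS (
        SELECT DISTINCT Lab_Space_Id
        FROM EHS_Assessment_Audit
        WHERE Audit_Date >= ?
    )
    SELECT l.Site_Name, l.Campus_Name,
           COUNT(DISTINCT s.id) AS Total,
           SUM(CASE WHEN a.Lab_Space_Id IS NOT NULL THEN 1 ELSE 0 END) AS AuditedSum,
           COUNT(DISTINCT a.Lab_Space_Id) AS AuditedDistinct
    FROM Lab_Space s
    LEFT JOIN audited a ON s.id = a.Lab_Space_Id
    JOIN Locations l ON l.Building_Code = s.Building_Code
    GROUP BY l.Site_Name, l.Campus_Name
    ORDER BY l.Site_Name, l.Campus_Name
"""
rows = conn.execute(query, ("2018-01-01",)).fetchall()
for row in rows:
    print(row)  # AuditedSum exceeds Total for Site1; AuditedDistinct does not
```

If the real Locations table has one row per building, the inflation comes from somewhere else, so it is worth checking for duplicate Building_Code values first.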
Table 1
RefId Name
----- ----
1 A
2 B
Table 2
RefId Date
----- -----
1 29/03/2018 07:15
1 29/03/2018 07:30
2 29/03/2018 07:35
2 29/03/2018 07:40
I would like the result to be as follows: the RefId and Name from Table 1, plus the max(Date) from Table 2 for that RefId.
1 A 29/03/2018 07:30
2 B 29/03/2018 07:40
Query used
select
table1.refId, table1.name,
(select max(date) from table2)
from
table1, table2
where
table1.refid = table2.refid
group by
table2.refid
I am getting the following error message
Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Use JOIN and the aggregate function MAX with GROUP BY to select the max date for each RefId.
Query
select [t1].[RefId], [t1].[Name], max([t2].[date]) as [date]
from [Table1] [t1]
join [Table2] [t2]
on [t1].[RefId] = [t2].[RefId]
group by [t1].[RefId], [t1].[Name];
'29/03/2018 07:15' is an nvarchar value; to compare dates correctly you need a datetime.
To convert nvarchar to datetime: SELECT CONVERT(datetime, '29/03/2018 07:15', 103)
Answer to your example:
DECLARE @Table1 TABLE(RefId int, Name nvarchar(10));
INSERT INTO @Table1(RefId, Name) VALUES(1, 'A'), (2, 'B');
DECLARE @Table2 TABLE(RefId int, [Date] nvarchar(50));
INSERT INTO @Table2(RefId, [Date])
VALUES
(1, '29/03/2018 07:15'),
(1, '29/03/2018 07:30'),
(2, '29/03/2018 07:35'),
(2, '29/03/2018 07:40');
SELECT t1.RefId, t1.Name, t2.Date
FROM @Table1 AS t1
INNER JOIN
(SELECT RefId, MAX(CONVERT(datetime, [Date], 103)) AS [Date]
FROM @Table2
GROUP BY RefId) AS t2
ON t1.RefId = t2.RefId
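Why the conversion matters can be shown outside the database. In this small Python sketch, the maximum per RefId is taken once on the raw dd/mm/yyyy text and once on parsed datetimes. The '01/04/2018' row is a hypothetical addition to the question's data, included only because the original rows happen to sort the same way either way.

```python
from datetime import datetime

rows = [
    (1, "29/03/2018 07:15"),
    (1, "01/04/2018 07:30"),  # hypothetical later date, added to expose the problem
    (2, "29/03/2018 07:35"),
    (2, "29/03/2018 07:40"),
]

def max_per_ref(rows, key):
    """Return {ref_id: text} keeping, per ref_id, the text with the largest key."""
    best = {}
    for ref_id, text in rows:
        if ref_id not in best or key(text) > key(best[ref_id]):
            best[ref_id] = text
    return best

# Comparing the raw strings sorts by day-of-month first, which is wrong across months.
as_text = max_per_ref(rows, key=lambda s: s)
# Parsing with the day/month/year layout compares real points in time.
as_datetime = max_per_ref(rows, key=lambda s: datetime.strptime(s, "%d/%m/%Y %H:%M"))

print(as_text[1])      # the March value wins as text
print(as_datetime[1])  # the April value wins as a datetime
```

Storing the column as datetime in the first place avoids the conversion on every query.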
I am trying to calculate customer churn based on the activities customers could have taken part in, as opposed to the usual churn by date. We have events that are connected to a specific host; in my example all events are hosted by Alice, but there could be different hosts.
Everyone who follows a specific event should be placed in a category (new, active, churned, or resurrected).
New: The first time a person follows an event from the specific host.
Active: The person follows again (and also followed the previous event from that host).
Churned: The follower had a chance to follow but didn't.
Resurrected: A follower who had churned starts following a previously followed host again.
create table #events (event varchar(50), host varchar(50), date date)
create table #eventFollows (event varchar(50), follower varchar(50))
insert into #events values ('e_1', 'Alice', GETDATE())
insert into #events values ('e_2', 'Alice', GETDATE())
insert into #events values ('e_3', 'Alice', GETDATE())
insert into #events values ('e_4', 'Alice', GETDATE())
insert into #events values ('e_5', 'Alice', GETDATE())
insert into #eventFollows values ('e_1', 'Bob') --new
insert into #eventFollows values ('e_2', 'Bob') --active
--Bob churned
insert into #eventFollows values ('e_4', 'Megan') --new
insert into #eventFollows values ('e_5', 'Bob') --resurrected
insert into #eventFollows values ('e_5', 'Megan') --active
select * from #events
select * from #eventFollows
The expected outcome should be something like this
select 'e_1', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Bob follows Alice event
union all
select 'e_2', 0 as New, 0 as resurrected, 1 as active, 0 as churned --Bob follows the next event that Alice host (considered as Active)
union all
select 'e_3', 0 as New, 0 as resurrected, 0 as active, 1 as churned --Bob churns since he does not follow the next event
union all
select 'e_4', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Megan follows Alice event
union all
select 'e_5', 0 as New, 1 as resurrected, 1 as active, 0 as churned --Second time (active) for Megan and Bob is resurrected
I started with a query like the one below, but the problem is that I don't get the events that the followers did not follow (but could have followed).
select a.event, follower, date,
LAG (a.event,1) over (partition by a.host, ma.follower order by date) as lag,
LEAD (a.event,1) over (partition by a.host, ma.follower order by date) as lead,
LAG (a.event,1) over (partition by a.host order by date) as lagP,
LEAD (a.event,1) over (partition by a.host order by date) as leadP
from #events a left join #eventFollows ma on ma.event = a.event order by host, follower, date
Any ideas?
This may seem a bit of an indirect approach, but it's possible to detect islands by checking for gaps in the numbers:
;with nrsE as
(
select *, ROW_NUMBER() over (order by event) rnrE from #events
), nrs as
(
select f.*,host, rnrE, ROW_NUMBER() over (partition by f.follower, e.host order by f.event ) rnrF
from nrsE e
join #eventFollows f on f.event = e.event
), f as
(
select host, follower, min(rnrE) FirstE, max(rnrE) LastE, ROW_NUMBER() over (partition by follower, host order by rnrE - rnrF) SeqNr
from nrs
group by host, follower, rnrE - rnrF --difference between rnr-Event and rnr-Follower to detect gaps
), stat as --from the result above on there are several options. this example uses getting a 'status' and pivoting on it
(
select e.event, e.host, case when f.FirstE is null then 'No participants' when f.LastE = e.rnrE - 1 then 'Churned' when rnrE = f.FirstE then case when SeqNr = 1 then 'New' else 'Resurrected' end else 'Active' end Status
from nrsE e
left join f on e.rnrE between f.FirstE and f.LastE + 1 and e.host = f.host
)
select p.* from stat pivot(count(Status) for Status in ([New], [Resurrected], [Active], [Churned])) p
The last two steps could be simplified, but getting the 'Status' this way might be reusable in other scenarios.
This matches your desired result
SELECT
X.event, X.host, X.date,
IsNew = SUM(CASE WHEN X.follower IS NOT NULL AND X.FirstFollowerEvent = X.event THEN 1 ELSE 0 END),
IsActive = SUM(CASE WHEN X.lagFollowerEvent = X.lagEvent THEN 1 ELSE 0 END),
IsChurned = SUM(CASE WHEN X.follower IS NULL THEN 1 ELSE 0 END),
IsResurrected = SUM(CASE WHEN X.lagFollowerEvent <> X.lagEvent AND X.FirstFollowerEvent IS NOT NULL THEN 1 ELSE 0 END)
FROM
(
select
a.event, a.host, ma.follower, a.date,
FIRST_VALUE(a.event) over (partition by a.host, ma.follower order by a.date, a.event) as FirstFollowerEvent,
LAG (a.event,1) over (partition by a.host, ma.follower order by a.date, a.event) as lagFollowerEvent,
LAG (a.event,1) over (partition by a.host order by a.date, a.event) as lagEvent
FROM
#events a
LEFT join
#eventFollows ma on a.event = ma.event
) X
GROUP BY
X.event, X.host, X.date
ORDER by
X.event, X.host, X.date
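The category rules from the question can also be checked with a plain walk over each follower's history, which gives a reference result to compare any SQL answer against. This Python sketch is not a translation of either query above; it assumes events are supplied in chronological order per host and that a follower is counted as churned only at the first event they miss, which is what the expected output for e_4 implies.

```python
from collections import Counter

def classify(events, follows):
    """events: list of (event, host) in chronological order.
    follows: set of (event, follower) pairs.
    Returns {event: Counter of 'new'/'active'/'churned'/'resurrected'}."""
    counts = {event: Counter() for event, _host in events}
    followers = sorted({follower for _event, follower in follows})
    hosts = sorted({host for _event, host in events})
    for host in hosts:
        host_events = [event for event, h in events if h == host]
        for follower in followers:
            ever_followed = False
            followed_previous = False
            for event in host_events:
                if (event, follower) in follows:
                    if not ever_followed:
                        status = "new"
                    elif followed_previous:
                        status = "active"
                    else:
                        status = "resurrected"
                    counts[event][status] += 1
                    ever_followed = True
                    followed_previous = True
                else:
                    # Count churn once, at the first event missed after following.
                    if followed_previous:
                        counts[event]["churned"] += 1
                    followed_previous = False
    return counts

events = [("e_1", "Alice"), ("e_2", "Alice"), ("e_3", "Alice"),
          ("e_4", "Alice"), ("e_5", "Alice")]
follows = {("e_1", "Bob"), ("e_2", "Bob"), ("e_4", "Megan"),
           ("e_5", "Bob"), ("e_5", "Megan")}

counts = classify(events, follows)
# Columns in the same order as the question: new, resurrected, active, churned
table = {e: (c["new"], c["resurrected"], c["active"], c["churned"])
         for e, c in counts.items()}
for event, row in table.items():
    print(event, row)
```

Because every event in the sample data shares the same date, any SQL version needs a deterministic tiebreaker (such as the event name) to reproduce this ordering.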
Given the two tables below, is the following possible?
TABLE 1
Id | final | Date
------------------
1 236 02-11-14
2 10 07-01-12
3 58 09-02-10
TABLE 2
Id | final | Date
------------------
1 330 02-11-14
2 5 07-01-12
3 100 09-02-10
Sum the final column of Table 1 and of Table 2, then work out the average of the two sums and return it as another column, Avg_Amount. Then, if Table 2's summed amount (before averaging) is higher than Table 1's, add another column containing 'Tbl2 has the higher amount', and vice versa if Table 1 has the higher amount.
End result Column wise table would look like this :
|tb1_final_amount|tb2_final_amount|Avg_Amount|Top_Score_tbl
304 435 369.5 tb2 has highest score
This is one way (of many) to do this. You can sum up the two tables and use them as derived tables in a query like so:
select
tb1_final_amount,
tb2_final_amount,
(tb1_final_amount+tb2_final_amount)/2.0 as Avg_Amount,
case
when tb1_final_amount < tb2_final_amount then 'tb2 has highest score'
else 'tb1 has highest score'
end as Top_Score_tbl
from
(select SUM(final) as tb1_final_amount from TABLE1) t1,
(select SUM(final) as tb2_final_amount from TABLE2) t2
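As a quick check of the derived-table approach against the figures in the question, the sketch below loads the sample rows into Python's sqlite3 and runs essentially the same query. The extra 'tie' branch is an assumption on my part; the original puts equal totals under tb1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (Id INTEGER, final INTEGER);
    CREATE TABLE TABLE2 (Id INTEGER, final INTEGER);
    INSERT INTO TABLE1 VALUES (1, 236), (2, 10), (3, 58);
    INSERT INTO TABLE2 VALUES (1, 330), (2, 5), (3, 100);
""")

row = conn.execute("""
    SELECT tb1_final_amount,
           tb2_final_amount,
           (tb1_final_amount + tb2_final_amount) / 2.0 AS Avg_Amount,
           CASE
               WHEN tb1_final_amount < tb2_final_amount THEN 'tb2 has highest score'
               WHEN tb1_final_amount > tb2_final_amount THEN 'tb1 has highest score'
               ELSE 'tie'  -- assumption: equal totals get their own label
           END AS Top_Score_tbl
    FROM (SELECT SUM(final) AS tb1_final_amount FROM TABLE1) t1,
         (SELECT SUM(final) AS tb2_final_amount FROM TABLE2) t2
""").fetchone()
print(row)
```

Dividing by 2.0 rather than 2 keeps the average from being truncated by integer division.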
This does the trick!:
--SET UP Table1
CREATE TABLE Table1 (ID INT, final INT, [Date] DATETIME)
INSERT Table1 VALUES (1, 236, '20141102')
INSERT Table1 VALUES (2, 10, '20120107')
INSERT Table1 VALUES (3, 58, '20100209')
--SET UP Table2
CREATE TABLE Table2 (ID INT, final INT, [Date] DATETIME)
INSERT Table2 VALUES (1, 330, '20141102')
INSERT Table2 VALUES (2, 5, '20120107')
INSERT Table2 VALUES (3, 100, '20100209')
-- Query
SELECT
SUM(CASE WHEN t.TableName = 'Table1' THEN T.final
ELSE 0
END) AS tb1_final_amount,
SUM(CASE WHEN t.TableName = 'Table2' THEN T.final
ELSE 0
END) AS tb2_final_amount,
SUM(T.final) / 2.0 AS Avg_Amount, -- average of the two table totals
CASE WHEN SUM(CASE WHEN T.TableName = 'Table1' THEN T.final
ELSE 0
END) > SUM(CASE WHEN T.TableName = 'Table2' THEN T.final
ELSE 0
END) THEN 'tb1 has highest score'
ELSE 'tb2 has highest score'
END AS Top_Score_tbl
FROM
(
SELECT
'Table1' AS TableName,
final
FROM
Table1
UNION ALL
SELECT
'Table2',
final
FROM
Table2
) AS T