Join on max value if join condition not met - sql-server

I am trying to calculate how much vacation time an employee would have based on the amount of time he worked.
employeeVacation -> id, yearsWorked, vacationHours
employee -> empid, StartDate
I can get the number of years the employee worked by using this
datepart(year, getdate()) - datepart(year,StartDate) as yearsWorked
After you work for more than 8 years the vacation time is the same. Here is my vacation table.
id years vacationHours
1 0 40
2 1 40
3 2 40
4 3 80
5 4 80
6 5 80
7 6 80
8 7 120
select e.empid, ev.vacationhours from employee e join employeevacation ev on
ev.yearsWorked = datepart(year, getdate()) - datepart(year,e.StartDate)
Say you have worked for 30 years: there is no matching row, so the join can't return the number of vacation hours. Should I be looking to do a join, or should I just cut my losses and insert years into the vacation table up to something like 100, so that I can join and not worry about it?

SELECT e.empid, ISNULL(ev.vacationhours, 0) AS vacationhours
FROM employee e
LEFT JOIN employeevacation ev ON ev.yearsWorked = (
-- highest tier that does not exceed the employee's years worked
SELECT MAX(ev2.yearsWorked)
FROM employeevacation ev2
WHERE ev2.yearsWorked <= datepart(year, getdate()) - datepart(year, e.StartDate)
)
Basically, you query the highest threshold the employee is eligible for.
You can replace 0 with whatever default value you want, although you should never hit the default if you have an employeevacation row with 0 years.
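To see the threshold lookup in action, here is a minimal, self-contained sketch. The table variables and the three sample employees (start dates 2, 5 and 30 years back) are assumptions made up for illustration; the vacation tiers are the ones from the question.

```sql
-- Hypothetical stand-ins for the real tables, declared as table variables.
DECLARE @employeeVacation TABLE (id INT, yearsWorked INT, vacationHours INT);
DECLARE @employee TABLE (empid INT, StartDate DATE);

INSERT INTO @employeeVacation (id, yearsWorked, vacationHours) VALUES
(1,0,40),(2,1,40),(3,2,40),(4,3,80),(5,4,80),(6,5,80),(7,6,80),(8,7,120);

-- Assumed sample employees: started 2, 5 and 30 years ago.
INSERT INTO @employee (empid, StartDate) VALUES
(1, DATEADD(year, -2,  GETDATE())),
(2, DATEADD(year, -5,  GETDATE())),
(3, DATEADD(year, -30, GETDATE()));

SELECT e.empid, ev.yearsWorked AS matchedTier, ev.vacationHours
FROM @employee e
LEFT JOIN @employeeVacation ev ON ev.yearsWorked = (
    -- highest tier that does not exceed the employee's years worked
    SELECT MAX(ev2.yearsWorked)
    FROM @employeeVacation ev2
    WHERE ev2.yearsWorked <= DATEPART(year, GETDATE()) - DATEPART(year, e.StartDate)
)
ORDER BY e.empid;
```

With this data the query should return 40 hours for employee 1, 80 for employee 2, and 120 for employee 3, because 30 years falls back to the 7-year tier. No padding rows are needed in the vacation table.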

You can do something like this
select e.empid, coalesce(ev.vacationhours, evMax.vacationhours)
from employee e
left join employeevacation ev on ev.yearsWorked = datepart(year, getdate()) - datepart(year,e.StartDate)
join employeevacation evMax on evMax.yearsWorked = 7

Related

Finding A Time When A Value Changed

I am still learning many new things about SQL, such as PARTITION BY and CTEs. I am currently working on a query that I cobbled together from a similar question I found online. However, I cannot get it to work as intended.
The problem is as follows: I have been tasked with showing rank promotions in an organization from the beginning of 2022 to today. I am working with two primary tables, an EMPLOYEES table and a PERIODS table. The PERIODS table captures a snapshot of each employee every month, including their rank at the time. Each of these months is assigned a PeriodID (e.g. Jan 2022 = PeriodID 131). The EMPLOYEES table holds each employee's current rank. Ranks are stored as an int (e.g. 1, 2, 3, with 1 being the lowest rank). It is possible for an employee to rank up more than once in any given month.
I have simplified the query as much as I can for the sake of this problem. The query is as follows:
;WITH x AS
(
SELECT
e.EmployeeID, p.PeriodID, p.RankID,
rn = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY p.PeriodID DESC)
FROM employees e
LEFT JOIN periods p on p.EmployeeID= e.EmployeeID
WHERE p.PeriodID <= 131 AND p.PeriodID >=118 --This is the time range mentioned above
),
rest AS (SELECT * FROM x WHERE rn > 1)
SELECT
main.EmployeeID,
PeriodID = MIN(
CASE
WHEN main.CurrentRankID = Rest.RankID
THEN rest.PeriodID ELSE main.PeriodID
END),
main.RankID, rest.RankID
FROM x AS main LEFT OUTER JOIN rest ON main.EmployeeID = rest.EmployeeID
AND rest.rn >1
LEFT JOIN periods p on p.EmployeeID = e.EmployeeID
WHERE main.rn = 1
AND NOT EXISTS
(
SELECT 1 FROM rest AS rest2
WHERE EmployeeID = rest.EmployeeID
AND rn < rest.rn
AND main.RankID <> rest.RankID
)
and p.PeriodID <= 131 AND p.PeriodID >=118
GROUP BY main.EmployeeID, main.PeriodID, main.RankID, rest.RankID
As mentioned before, this query was borrowed from a similar question and modified for my own use. I imagine the bones of the query are good and I have perhaps messed up a variable somewhere, but I cannot locate the problem line. The end goal is for the query to return a table showing the EmployeeID, the PeriodID, the rank they were promoted from, and the rank they were promoted to, in the month the promotion was earned. Similar to the below.
EmployeeID  PeriodID  PreviousRankID  NewRank
123         131       1               2
123         133       2               3
Instead, my query returns repeating previous/current ranks, and the PeriodIDs seem to be static (as shown below).
EmployeeID  PeriodID  PreviousRankID  NewRank
123         131       1               1
123         131       1               1
I am hoping someone with a greater knowledge base on these functions is able to quickly notice my mistake.
If we assume some example DML/DDL (it's really helpful to provide this with your question):
DECLARE @Employees TABLE (EmployeeID INT IDENTITY, Name VARCHAR(20), RankID INT);
DECLARE @Periods TABLE (PeriodID INT, EmployeeID INT, RankID INT);
INSERT INTO @Employees (Name, RankID) VALUES ('Jonathan', 10),('Christopher', 10),('James', 10),('Jean-Luc', 8);
INSERT INTO @Periods (PeriodID, EmployeeID, RankID) VALUES
(1,1,1),(2,1,1),(3,1,1),(4,1,8 ),(5,1,10),(6,1,10),
(1,2,1),(2,2,1),(3,2,1),(4,2,8 ),(5,2,8 ),(6,2,10),
(1,3,1),(2,3,1),(3,3,7),(4,3,10),(5,3,10),(6,3,10),
(1,4,1),(2,4,1),(3,4,1),(4,4,8 ),(5,4,9 ),(6,4,8 );
Then we can accomplish what I think you're looking for using an OUTER APPLY that aggregates the values based on the current-row values:
SELECT e.EmployeeID, e.Name, e.RankID AS CurrentRank, ap.PeriodID AS ThisPeriod,
p.PeriodID AS LastRankChangePeriodID, p.RankID AS LastRankChangedFrom,
ap.RankID - p.RankID AS LastRankChanged
FROM @Employees e
LEFT OUTER JOIN @Periods ap
ON e.EmployeeID = ap.EmployeeID
OUTER APPLY (
-- latest earlier period in which this employee held a different rank
SELECT EmployeeID, MAX(PeriodID) AS PeriodID
FROM @Periods
WHERE EmployeeID = e.EmployeeID
AND RankID <> ap.RankID
AND PeriodID < ap.PeriodID
GROUP BY EmployeeID
) a
LEFT OUTER JOIN @Periods p
ON a.EmployeeID = p.EmployeeID
AND a.PeriodID = p.PeriodID
ORDER BY e.EmployeeID, ap.PeriodID DESC
Using the correlated subquery we get a view of the data which we can filter using the current-row values, and we aggregate that to return the period we're looking for (where it's before this period, and it's not the same rank). Then it's just a join back to the Periods table to get the values.
You used a LEFT JOIN, so I've preserved that using an OUTER APPLY. If you wanted to filter using it, it would be a CROSS APPLY instead.
EmployeeID  Name         CurrentRank  ThisPeriod  LastRankChangePeriodID  LastRankChangedFrom  LastRankChanged
1           Jonathan     10           6           4                       8                    2
1           Jonathan     10           5           4                       8                    2
1           Jonathan     10           4           3                       1                    7
1           Jonathan     10           3           NULL                    NULL                 NULL
1           Jonathan     10           2           NULL                    NULL                 NULL
1           Jonathan     10           1           NULL                    NULL                 NULL
2           Christopher  10           6           5                       8                    2
2           Christopher  10           5           3                       1                    7
2           Christopher  10           4           3                       1                    7
2           Christopher  10           3           NULL                    NULL                 NULL
2           Christopher  10           2           NULL                    NULL                 NULL
2           Christopher  10           1           NULL                    NULL                 NULL
3           James        10           6           3                       7                    3
3           James        10           5           3                       7                    3
3           James        10           4           3                       7                    3
3           James        10           3           2                       1                    6
3           James        10           2           NULL                    NULL                 NULL
3           James        10           1           NULL                    NULL                 NULL
4           Jean-Luc     8            6           5                       9                    -1
4           Jean-Luc     8            5           4                       8                    1
4           Jean-Luc     8            4           3                       1                    7
4           Jean-Luc     8            3           NULL                    NULL                 NULL
4           Jean-Luc     8            2           NULL                    NULL                 NULL
4           Jean-Luc     8            1           NULL                    NULL                 NULL
Now we can see what the previous change looked like for each period. Currently Jonathan has RankID 10. The last time that was different was in PeriodID 4, when it was 8. The same was true for PeriodID 5. In PeriodID 4 he had RankID 8, and prior to that he had RankID 1. Before that his rank hadn't changed.
Jean-Luc was actually demoted as his last change. I don't know if this is possible within your model.
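If you only want one row per rank change, in the EmployeeID / PeriodID / PreviousRankID / NewRank shape from the question, a different technique may be simpler: LAG (SQL Server 2012 or later). This is a sketch, not the query above; the table variable and the single sample employee are assumptions for illustration.

```sql
-- Hypothetical sample: one employee's monthly snapshots.
DECLARE @Periods TABLE (PeriodID INT, EmployeeID INT, RankID INT);
INSERT INTO @Periods (PeriodID, EmployeeID, RankID) VALUES
(1,1,1),(2,1,1),(3,1,1),(4,1,8),(5,1,10),(6,1,10);

WITH ranked AS (
    SELECT EmployeeID,
           PeriodID,
           RankID AS NewRank,
           -- rank held in the previous snapshot; NULL for the first period
           LAG(RankID) OVER (PARTITION BY EmployeeID ORDER BY PeriodID) AS PreviousRankID
    FROM @Periods
)
SELECT EmployeeID, PeriodID, PreviousRankID, NewRank
FROM ranked
WHERE PreviousRankID <> NewRank   -- keeps only periods where the rank changed
ORDER BY EmployeeID, PeriodID;
```

For this sample it should return two rows: period 4 (1 to 8) and period 5 (8 to 10). Demotions also show up, since the filter only checks that the rank differs; use `PreviousRankID < NewRank` if only promotions should be listed. Because the snapshots are monthly, two promotions inside one month would appear as a single change.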

Choose row that equal to the max value from a query

I want to find out who has the most friends in the app I own, based on transactions: a friend is anyone the user has paid or been paid by.
I can't get the query to show only the users with the maximum number of friends (there may be one or many, and the number can change, so I can't use LIMIT).
;with relationships as
(
select
paid as 'auser',
Member_No as 'afriend'
from Payments$
union all
select
member_no as 'auser',
paid as 'afriend'
from Payments$
),
DistinctRelationships AS (
SELECT DISTINCT *
FROM relationships
)
select
afriend,
count(*) cnt
from DistinctRelationShips
GROUP BY
afriend
order by
count(*) desc
I just can't figure it out, I've tried count, max(count), where = max, nothing worked.
It's a two-column table, "Member_No" and "Paid": Member_No is the member who pays the money, and Paid is the one who receives it.
Member_No  Paid
14         18
17         1
12         20
12         11
20         8
6          3
2          4
9          20
8          10
5          20
14         16
5          2
12         1
14         10
It's from Excel, but I loaded it into sql-server.
It's just a sample, there are 1000 more rows
It seems like you are massively over-complicating this. There is no need for self-joining.
Just unpivot each row so you have both sides of the relationship, then group by one side and count the distinct values of the other side:
SELECT
-- for just the first then SELECT TOP (1)
-- for all that tie for the top place use SELECT TOP (1) WITH TIES
v.Id,
Relationships = COUNT(DISTINCT v.Other),
TotalTransactions = COUNT(*)
FROM Payments$ p
CROSS APPLY (VALUES
(p.Member_No, p.Paid),
(p.Paid, p.Member_No)
) v(Id, Other)
GROUP BY
v.Id
ORDER BY
COUNT(DISTINCT v.Other) DESC;
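As a concrete sketch of the TOP (1) WITH TIES variant mentioned in the comments, here is a self-contained version. The table variable @Payments and its five rows are assumptions chosen so that several users tie for first place.

```sql
-- Hypothetical sample chosen to produce a tie.
DECLARE @Payments TABLE (Member_No INT, Paid INT);
INSERT INTO @Payments (Member_No, Paid) VALUES
(14,18),(14,16),(12,20),(12,11),(20,8);

SELECT TOP (1) WITH TIES          -- keeps every row tied on the ORDER BY value
    v.Id,
    Relationships = COUNT(DISTINCT v.Other)
FROM @Payments p
CROSS APPLY (VALUES
    (p.Member_No, p.Paid),        -- payer's side of the relationship
    (p.Paid, p.Member_No)         -- receiver's side of the relationship
) v(Id, Other)
GROUP BY v.Id
ORDER BY COUNT(DISTINCT v.Other) DESC;
```

Here users 12, 14 and 20 each have two distinct friends, so all three should come back (in no guaranteed order). WITH TIES only works together with ORDER BY, and the tie is decided on the ORDER BY expression alone.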

SQL Server - Apply OrderBy on columns when used CAST()

In my gaming application I have Teams, and each Team can have any number of players. If a player participates in a match, I give him 5 points. Each time the player participates in a match, 5 points are added to his count.
My stored procedure takes TeamId as the input parameter.
Now I want to calculate the total participation points each team has earned by month, but the participation points each player has scored should be added to the last month in which that player played a match.
Let's say Team1 has Player1, and Player1 has played a total of 4 matches: 1 match in 04/2020, 2 matches in 06/2020 and 1 match in 08/2020. For playing 4 matches, Player1 of Team1 got 20 participation points, and the last match Player1 played was in 08/2020, so all 20 points should be added to 08/2020 for Team1.
In the player table each Player has a [TotalMatchesPlayed] value; [TotalMatchesPlayed] * 5 gives the [TotalParticipationPoints] for that player.
This should repeat for all the players in the Team.
SELECT DISTINCT TP.[TeamId], ISNULL(P.[TotalMatchesPlayed], 0) * 5 AS [ParticipationPoints], CAST(MONTH(PA.[ActivityDate]) AS VARCHAR(2)) AS [Month], CAST(YEAR(PA.[ActivityDate]) AS VARCHAR(4)) AS [Year] FROM [TeamPlayers] TP
INNER JOIN dbo.[Player] P
ON TP.[PlayerId] = P.[PlayerId]
INNER JOIN dbo.[PlayerActivity] PA
ON PA.[PlayerId] = P.[PlayerId] AND PA.[ActivityTypeId] = 14
WHERE TP.[TeamId] = 12
My issue with the above query is that the [PlayerActivity] table has a row each time a player participates in a match. I want to take only the latest date and add all the participation points to that month and year, which I am not able to achieve.
I tried adding ORDER BY PA.[ActivityDate] DESC, but that throws an error:
Order by items must appear in the select list if SELECT DISTINCT is specified.
My sample output should be as below:
ParticipationPoints  Month  Year
50                   03     2020
0                    04     2020
20                   05     2020
sample table designs and data in the below link.
http://sqlfiddle.com/#!18/41766/1
Does this work for you:
SELECT
TP.[TeamId]
, SUM(ISNULL(P.[TotalMatchesPlayed], 0)) * 5 AS [ParticipationPoints]
, DATEPART(MONTH,PA.[ActivityDate]) AS [Month]
, DATEPART(YEAR,PA.[ActivityDate]) AS [Year]
FROM [TeamPlayer] TP
INNER JOIN dbo.[Player] P
ON TP.[PlayerId] = P.[PlayerId]
INNER JOIN dbo.[PlayerActivity] PA
ON PA.[PlayerId] = P.[PlayerId] AND PA.[PlayerActivityTyepId] = 14
WHERE TP.[TeamId]=45
GROUP BY TP.[TeamId], DATEPART(MONTH,PA.[ActivityDate]), DATEPART(YEAR,PA.[ActivityDate])
ORDER BY DATEPART(MONTH,PA.[ActivityDate]) DESC, DATEPART(YEAR,PA.[ActivityDate]) DESC
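With SELECT DISTINCT you can only order by expressions that appear in the select list. Your select list contains CAST expressions over [ActivityDate], not the column itself, so use one of those CAST expressions in the ORDER BY: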
SELECT DISTINCT TP.[TeamId], ISNULL(P.[TotalMatchesPlayed], 0) * 5 AS [ParticipationPoints], CAST(MONTH(PA.[ActivityDate]) AS VARCHAR(2)) AS [Month], CAST(YEAR(PA.[ActivityDate]) AS VARCHAR(4)) AS [Year] FROM [TeamPlayers] TP
INNER JOIN dbo.[Player] P
ON TP.[PlayerId] = P.[PlayerId]
INNER JOIN dbo.[PlayerActivity] PA
ON PA.[PlayerId] = P.[PlayerId] AND PA.[ActivityTypeId] = 14
WHERE TP.[TeamId] = 12
ORDER BY CAST(MONTH(PA.[ActivityDate]) AS VARCHAR(2)) desc
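If I read the requirement correctly, each player's points should land in the month of that player's last match. One way to sketch that is to reduce [PlayerActivity] to one row per player with MAX([ActivityDate]) before grouping by month. The table variables and sample rows below are assumptions for illustration; the column names follow the question.

```sql
-- Hypothetical stand-ins for the question's tables.
DECLARE @TeamPlayers TABLE (TeamId INT, PlayerId INT);
DECLARE @Player TABLE (PlayerId INT, TotalMatchesPlayed INT);
DECLARE @PlayerActivity TABLE (PlayerId INT, ActivityTypeId INT, ActivityDate DATE);

INSERT INTO @TeamPlayers VALUES (12, 1), (12, 2);
INSERT INTO @Player VALUES (1, 4), (2, 2);
INSERT INTO @PlayerActivity VALUES
(1, 14, '2020-04-10'), (1, 14, '2020-06-05'), (1, 14, '2020-06-20'), (1, 14, '2020-08-01'),
(2, 14, '2020-03-02'), (2, 14, '2020-03-15');

WITH LastPlayed AS (
    -- one row per player: the date of the most recent match
    SELECT PlayerId, MAX(ActivityDate) AS LastActivityDate
    FROM @PlayerActivity
    WHERE ActivityTypeId = 14
    GROUP BY PlayerId
)
SELECT TP.TeamId,
       SUM(ISNULL(P.TotalMatchesPlayed, 0)) * 5 AS ParticipationPoints,
       MONTH(LP.LastActivityDate) AS [Month],
       YEAR(LP.LastActivityDate)  AS [Year]
FROM @TeamPlayers TP
INNER JOIN @Player P ON TP.PlayerId = P.PlayerId
INNER JOIN LastPlayed LP ON LP.PlayerId = P.PlayerId
WHERE TP.TeamId = 12
GROUP BY TP.TeamId, YEAR(LP.LastActivityDate), MONTH(LP.LastActivityDate)
ORDER BY [Year], [Month];
```

For this sample, player 2's 10 points should land in 03/2020 and player 1's 20 points in 08/2020. Because the query groups instead of using SELECT DISTINCT, the ORDER BY restriction from the error message no longer applies.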

TSQL - DateTime difference between more than two rows

I'm trying to find out how to calculate difference between multiple rows from one simple query. Here it is:
SELECT [DateTime],EmployeeId,ControlPointID,EventTypeID
FROM [Events]
WHERE Day([DateTime]) = 4
AND Month([DateTime]) = 7
AND Year([DateTime]) = 2017
AND EmployeeId = 451
AND ControlPointID IN ( 3, 6 )
AND EventTypeID IN ( 1, 2 )
ORDER BY [DateTime]
Result:
DateTime EmployeeId ControlPointID EventTypeID
2017-07-04 11:32:10.000 451 6 1
2017-07-04 16:07:00.000 451 3 2
2017-07-04 16:42:50.000 451 6 1
2017-07-04 20:04:10.000 451 3 2
I need to calculate difference between [DateTime] in minutes.
EventTypeId = 1 means that the employee enters the building, and EventTypeId = 2 means that the employee leaves. I can calculate the difference between the first enter event and the last leave event; in this case it's 512 minutes. But I have a problem calculating work time when someone enters twice and leaves twice. It should be 477 minutes. The calculation should look like this:
DateDiff = (2017-07-04 16:07:00.000 - 2017-07-04 11:32:10.000) +
(2017-07-04 20:04:10.000 - 2017-07-04 16:42:50.000)
Can you help me figure it out, please ?
Given a building entry, finding the first leave after that entry can be done with cross apply:
select entry.EmployeeId, entry.[DateTime] as EntryDateTime, [exit].[DateTime] as ExitDateTime
from Events entry
cross apply (select top 1 e.[DateTime]
from Events e
where e.EmployeeId = entry.EmployeeId
and e.[DateTime] > entry.[DateTime]
and e.EventTypeId = 2
order by e.[DateTime] asc
) as [exit] -- EXIT is a reserved keyword in T-SQL, so the alias is bracketed
where entry.EventTypeId = 1
At that point you just need the applicable T-SQL function to get the difference in whatever unit you want (e.g. in minutes with datediff(minute, entry.[DateTime], [exit].[DateTime])).
To get the total of all the differences simply sum the differences:
select EmployeeId, sum(mins) as TotalMinutes
from (
select entry.EmployeeId, entry.[DateTime] as EntryDateTime, [exit].[DateTime] as ExitDateTime,
-- column aliases cannot be reused in the same select list, so repeat the source columns
datediff(minute, entry.[DateTime], [exit].[DateTime]) as mins
from Events entry
cross apply (select top 1 e.[DateTime]
from Events e
where e.EmployeeId = entry.EmployeeId
and e.[DateTime] > entry.[DateTime]
and e.EventTypeId = 2
order by e.[DateTime] asc
) as [exit] -- EXIT is a reserved keyword in T-SQL, so the alias is bracketed
where entry.EventTypeId = 1
) as diffs
group by EmployeeId
Edit: added overall summation (with diff on the inside for clarity)
This can be done using the LAG window function. Since SQL Server 2008 does not support it, we need to left join on Row_Number to find the previous entry:
;WITH cte
AS (SELECT Row_number()OVER(Partition by EmployeeID ORDER BY [DateTime]) rn,*
FROM Yourresult)
SELECT a.EmployeeID,
Sum(Datediff(minute, b.[DateTime], a.[DateTime]))
FROM cte a
LEFT JOIN cte b
ON a.EmployeeID = b.EmployeeID
AND a.rn = b.rn + 1
WHERE a.[EventTypeId] = 2
GROUP BY a.EmployeeID
Note: This assumes there are no incorrect punches, as in your sample data.
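On SQL Server 2012 or later, the LAG approach mentioned above could look roughly like this. It is a sketch under the same assumption of clean, alternating punches; the table variable @Events is a stand-in loaded with the four rows from the question.

```sql
-- Stand-in for the Events table, loaded with the question's sample rows.
DECLARE @Events TABLE ([DateTime] DATETIME, EmployeeId INT, ControlPointID INT, EventTypeID INT);
INSERT INTO @Events ([DateTime], EmployeeId, ControlPointID, EventTypeID) VALUES
('2017-07-04T11:32:10', 451, 6, 1),
('2017-07-04T16:07:00', 451, 3, 2),
('2017-07-04T16:42:50', 451, 6, 1),
('2017-07-04T20:04:10', 451, 3, 2);

WITH paired AS (
    SELECT EmployeeId,
           EventTypeID,
           [DateTime],
           -- previous event of the same employee, in time order
           LAG([DateTime])  OVER (PARTITION BY EmployeeId ORDER BY [DateTime]) AS PrevDateTime,
           LAG(EventTypeID) OVER (PARTITION BY EmployeeId ORDER BY [DateTime]) AS PrevEventTypeID
    FROM @Events
)
SELECT EmployeeId,
       SUM(DATEDIFF(minute, PrevDateTime, [DateTime])) AS WorkedMinutes
FROM paired
WHERE EventTypeID = 2          -- a leave event ...
  AND PrevEventTypeID = 1      -- ... directly preceded by an enter event
GROUP BY EmployeeId;
```

For the sample rows this should give 477 minutes for employee 451 (275 + 202), matching the figure in the question. The extra check on PrevEventTypeID skips a leave that follows another leave, but it does not repair missing punches.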

SQL Server: How to get a rolling sum over 3 days for different customers within same table

This is the input table:
Customer_ID Date Amount
1 4/11/2014 20
1 4/13/2014 10
1 4/14/2014 30
1 4/18/2014 25
2 5/15/2014 15
2 6/21/2014 25
2 6/22/2014 35
2 6/23/2014 10
There is information pertaining to multiple customers and I want to get a rolling sum across a 3 day window for each customer.
The solution should be as below:
Customer_ID Date Amount Rolling_3_Day_Sum
1 4/11/2014 20 20
1 4/13/2014 10 30
1 4/14/2014 30 40
1 4/18/2014 25 25
2 5/15/2014 15 15
2 6/21/2014 25 25
2 6/22/2014 35 60
2 6/23/2014 10 70
The biggest issue is that I don't have transactions for every day, so partitioning by row number doesn't work.
The closest example I found on SO was:
SQL Query for 7 Day Rolling Average in SQL Server
but even in that case there were transactions every day, which accommodated the ROW_NUMBER()-based solutions.
The rownumber query is as follows:
select customer_id, Date, Amount,
Rolling_3_day_sum = CASE WHEN ROW_NUMBER() OVER (partition by customer_id ORDER BY Date) > 2
THEN SUM(Amount) OVER (partition by customer_id ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
END
from #tmp_taml9
order by customer_id
I was wondering if there is way to replace "BETWEEN 2 PRECEDING AND CURRENT ROW" by "BETWEEN [DATE - 2] and [DATE]"
One option would be to use a calendar table (or something similar) to get the complete range of dates and left join your table with that and use the row_number based solution.
Another option that might work (not sure about performance) would be to use an apply query like this:
select customer_id, Date, Amount, coalesce(Rolling_3_day_sum, Amount) Rolling_3_day_sum
from #tmp_taml9 t1
cross apply (
select sum(amount) Rolling_3_day_sum
from #tmp_taml9
where Customer_ID = t1.Customer_ID
and datediff(day, date, t1.date) <= 2
and t1.Date >= date
) o
order by customer_id;
I suspect performance might not be great though.
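For reference, here is a self-contained sketch of the apply approach run against the sample data from the question. It assumes the window is the current day plus the two preceding days, which is what the expected output implies, and it uses a plain date-range predicate instead of DATEDIFF. The table variable @tx is a stand-in for the real table.

```sql
-- Stand-in for the input table, loaded with the question's sample rows.
DECLARE @tx TABLE (Customer_ID INT, [Date] DATE, Amount INT);
INSERT INTO @tx (Customer_ID, [Date], Amount) VALUES
(1,'2014-04-11',20),(1,'2014-04-13',10),(1,'2014-04-14',30),(1,'2014-04-18',25),
(2,'2014-05-15',15),(2,'2014-06-21',25),(2,'2014-06-22',35),(2,'2014-06-23',10);

SELECT t1.Customer_ID, t1.[Date], t1.Amount, o.Rolling_3_Day_Sum
FROM @tx t1
CROSS APPLY (
    -- sum this customer's amounts from two days before the row's date up to that date
    SELECT SUM(t2.Amount) AS Rolling_3_Day_Sum
    FROM @tx t2
    WHERE t2.Customer_ID = t1.Customer_ID
      AND t2.[Date] BETWEEN DATEADD(day, -2, t1.[Date]) AND t1.[Date]
) o
ORDER BY t1.Customer_ID, t1.[Date];
```

This should reproduce the Rolling_3_Day_Sum column from the question (20, 30, 40, 25 for customer 1 and 15, 25, 60, 70 for customer 2). Comparing the bare [Date] column against a range also leaves room for an index on (Customer_ID, [Date]) to be used, which may help with the performance concern.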

Resources