I am still learning many new things about SQL such as PARTITION BY and CTEs. I am currently working on a query which I have cobbled together from a similar question I found online. However, I can not seem to get it to work as intended.
The problem is as follows -- I have been tasked to show rank promotions in an organization from the begining of 2022 to today. I am working with 2 primary tables, an EMPLOYEES table and a PERIODS table. This periods table captures a snapshot of any given employee each month - including their rank at the time. Each of these months is also assigned a PeriodID (e.g. Jan 2022 = PeriodID 131). Our EMPLOYEE table holds the employees current rank. These ranks are stored as an int (e.g. 1,2,3 with 1 being lowest rank). It is possible for an employee to rank up more than once in any given month.
I have simplified the used query as much as I can for the sake of this problem. Query follows as:
;WITH x AS
(
SELECT
e.EmployeeID, p.PeriodID, p.RankID,
rn = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY p.PeriodID DESC)
FROM employees e
LEFT JOIN periods p on p.EmployeeID= e.EmployeeID
WHERE p.PeriodID <= 131 AND p.PeriodID >=118 --This is the time range mentioned above
),
rest AS (SELECT * FROM x WHERE rn > 1)
SELECT
main.EmployeeID,
PeriodID = MIN(
CASE
WHEN main.CurrentRankID = Rest.RankID
THEN rest.PeriodID ELSE main.PeriodID
END),
main.RankID, rest.RankID
FROM x AS main LEFT OUTER JOIN rest ON main.EmployeeID = rest.EmployeeID
AND rest.rn >1
LEFT JOIN periods p on p.EmployeeID = e.EmployeeID
WHERE main.rn = 1
AND NOT EXISTS
(
SELECT 1 FROM rest AS rest2
WHERE EmployeeID = rest.EmployeeID
AND rn < rest.rn
AND main.RankID <> rest.RankID
)
and p.PeriodID <= 131 AND p.PeriodID >=118
GROUP BY main.EmployeeID, main.PeriodID, main.RankID, rest.RankID
As mentioned before, this query was borrowed from a similar question and modified for my own use. I imagine the bones of the query is good and maybe I have messed up a variable somewhere but I can not seem to locate the problem line. The end goal is for the query to result in a table showing the EmployeeID, PeriodID, the rank they are being promoted from, and the rank they are being promoted to in the month the promotion was earned. Similar to the below.
EmployeeID
PeriodID
PerviousRankID
NewRank
123
131
1
2
123
133
2
3
Instead, my query is spitting out repeating previous/current ranks and the PeriodIDs seem to be static (such as what is shown below).
EmployeeID
PeriodID
PerviousRankID
NewRank
123
131
1
1
123
131
1
1
I am hoping someone with a greater knowledge base on these functions is able to quickly notice my mistake.
If we assume some example DML/DDL (it's really helpful to provide this with your question):
DECLARE #Employees TABLE (EmployeeID INT IDENTITY, Name VARCHAR(20), RankID INT);
DECLARE #Periods TABLE (PeriodID INT, EmployeeID INT, RankID INT);
INSERT INTO #Employees (Name, RankID) VALUES ('Jonathan', 10),('Christopher', 10),('James', 10),('Jean-Luc', 8);
INSERT INTO #Periods (PeriodID, EmployeeID, RankID) VALUES
(1,1,1),(2,1,1),(3,1,1),(4,1,8 ),(5,1,10),(6,1,10),
(1,2,1),(2,2,1),(3,2,1),(4,2,8 ),(5,2,8 ),(6,2,10),
(1,3,1),(2,3,1),(3,3,7),(4,3,10),(5,3,10),(6,3,10),
(1,4,1),(2,4,1),(3,4,1),(4,4,8 ),(5,4,9 ),(6,4,9 )
Then we can accomplish what I think you're looking for using a OUTER APPLY then aggregates the values based on the current-row values:
SELECT e.EmployeeID, e.Name, e.RankID AS CurrentRank, ap.PeriodID AS ThisPeriod, p.PeriodID AS LastRankChangePeriodID, p.RankID AS LastRankChangedFrom, ap.RankID - p.RankID AS LastRankChanged
FROM #Employees e
LEFT OUTER JOIN #Periods ap
ON e.EmployeeID = ap.EmployeeID
OUTER APPLY (
SELECT EmployeeID, MAX(PeriodID) AS PeriodID
FROM #Periods
WHERE EmployeeID = e.EmployeeID
AND RankID <> ap.RankID
AND PeriodID < ap.PeriodID
GROUP BY EmployeeID
) a
LEFT OUTER JOIN #Periods p
ON a.EmployeeID = p.EmployeeID
AND a.PeriodID = p.PeriodID
ORDER BY e.EmployeeID, ap.PeriodID DESC
Using the correlated subquery we get a view of the data which we can filter using the current-row values, and we aggregate that to return the period we're looking for (where it's before this period, and it's not the same rank). Then it's just a join back to the Periods table to get the values.
You used an LEFT JOIN, so I've preserved that using an OUTER APPLY. If you wanted to filter using it, it would be a CROSS APPLY instead.
EmployeeID
Name
CurrentRank
ThisPeriod
LastRankChangePeriodID
LastRankChangedFrom
LastRankChanged
1
Jonathan
10
6
4
8
2
1
Jonathan
10
5
4
8
2
1
Jonathan
10
4
3
1
7
1
Jonathan
10
3
1
Jonathan
10
2
1
Jonathan
10
1
2
Christopher
10
6
5
8
2
2
Christopher
10
5
3
1
7
2
Christopher
10
4
3
1
7
2
Christopher
10
3
2
Christopher
10
2
2
Christopher
10
1
3
James
10
6
3
7
3
3
James
10
5
3
7
3
3
James
10
4
3
7
3
3
James
10
3
2
1
6
3
James
10
2
3
James
10
1
4
Jean-Luc
8
6
5
9
-1
4
Jean-Luc
8
5
4
8
1
4
Jean-Luc
8
4
3
1
7
4
Jean-Luc
8
3
4
Jean-Luc
8
2
4
Jean-Luc
8
1
Now we can see what the previous change looked like for each period. Currently Jonathan is has RankID 10. Last time that was different was in PeriodID 4 when it was 8. The same was true for PeriodID 5. In PeriodID 4 he had RankID 8, and prior to that he had RankID 1. Before that his Rank hadn't changed.
Jean-Luc was actually demoted as his last change. I don't know if this is possible within your model.
I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer id
sequence id
prior_sequence id
1
1
0
1
2
1
1
3
2
2
4
0
2
5
4
2
6
0
2
7
6
Ideally, I would like to create the following grouping column and yield the following results:
customer id
sequence id
prior sequence id
grouping
1
1
0
1
1
2
1
1
1
3
2
1
2
4
0
2
2
5
4
2
2
6
0
3
2
7
6
3
I've attempted to utilize island gap logic utilizing the ROW_NUMBER() function. However, I have been unsuccessful in doing so. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
/*query that determines starting point for recursion
*
* In this case we want all records with no prior_sequence_id
*/
SELECT
customer_id,
sequence_id,
prior_sequence_id,
/*establish grouping*/
ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
FROM yourtable
WHERE prior_sequence_id = 0
UNION
/*join the recursive CTe back to the table and iterate*/
SELECT
yourtable.customer_id,
yourtable.sequence_id,
yourtable.prior_sequence_id,
reccte.grouping
FROM reccte
INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
See Example Fiddle
I have two tables in SQLITE one table FastData records data at a high rate while the other table SlowData records data at a lower rate. FastData and SlowData share a primary key (PK) that represents time of data capture. As such the two tables could look like:
Fast Data Slow Data
Pk Value1 Pk Value2
2 1 1 1
3 2 4 2
5 3 7 3
6 4
7 5
9 6
I would like to create a Select statement that joins these two tables filling in the SlowData with the previous captured data.
Join Data
Pk Value1 Value2
2 1 1
3 2 1
5 3 2
6 4 2
7 5 3
9 6 3
You may try the following approach which uses row_number to determine the most recent entry as it relates to Pk as the ideal entry for Value2 after performing a left join.
SELECT
Pk,
Value1,
Value2
FROM (
SELECT
f.Pk,
f.Value1,
s.Value2,
ROW_NUMBER() OVER (
PARTITION BY f.Pk, f.Value1
ORDER BY s.Pk DESC
) rn
FROM
fast_data f
LEFT JOIN
slow_data s ON f.Pk >= s.Pk
) t
WHERE rn=1;
Pk
Value1
Value2
2
1
1
3
2
1
5
3
2
6
4
2
7
5
3
9
6
3
View working demo on DB Fiddle
You need a LEFT join of the tables and FIRST_VALUE() window function to pick Value2:
SELECT DISTINCT f.Pk, f.Value1,
FIRST_VALUE(s.Value2) OVER (PARTITION BY f.Pk ORDER BY s.Pk DESC) Value2
FROM FastData f LEFT JOIN SlowData s
ON s.Pk <= f.Pk;
See the demo.
Here is the scenerio, I have a input data and a table table1
Input Data Table1
Customer Id Campaign ID CustomerId CampaignID
1 1 4 2
1 2 6 3
2 3 1 1
1 3 5 5
4 2 9 8
4 4
5 5
I want to query table1 such that it return only those values from the where clause which are not present in table1. So the result will be as below
Result
Customer Id Campaign ID
1 2
2 3
1 3
4 4
5 5
So the query should be something like
select CustomerId, CampaignID from Table1
where Customer Id in (Input data for customer id) and CampaignId in (Input data for campaign id)
. I know this query is not right, but can someone please help.
Is there a way to filter the values given in where clause based on if they are present in table1?
P.S. table1 primary key (CustomerId, CampaignID)
This will work as for your scenario. But it wont show last result record 5,5 since it does not fulfill your need.
select * from input where (cust_id, camp_id) not in (select cust_id, camp_id from table1)
I got stuck something about stored procedures I write a stored that i need to shot three columns of products count like this
SELECT
Count([TPDTN].[ProductName]) as 'Product Count',
[TPDTN].[CategoryID]
FROM
[TPDTN]
LEFT JOIN
[TPDCN] ON [TPDTN].[CategoryID] = [TPDCN].[libDocumentID]
GROUP BY
[TPDTN].[CategoryID], [TPDCN].[libDocumentID]
It shows results like this:
Product Count CategoryID
---------------------------
2 1
9 2
2 3
2 4
1 5
But I don't know how make it show
Product Count CategoryID libDocumentID
-----------------------------------------------
2 1 123456789
9 2 123456789
2 3 123456789
2 4 123456789
1 5 123456789
Producer ID (LibdocumentID) is from other table but when I SELECT [TPDCN].[libDocumentID] the value is NULL
Product Count CategoryID libDocumentID
------------------------------------------------
2 1 NULL
9 2 NULL
2 3 NULL
2 4 NULL
1 5 NULL
How can I solve it? Thank you
Just add it to the select, and if you don't need the NULL you need an INNER JOIN:
SELECT Count([TPDTN].[ProductName]) as 'Product Count',[TPDTN].[CategoryID], [TPDCN].[libDocumentID]
FROM [TPDTN]
inner join [TPDCN]
ON [TPDTN].[CategoryID] = [TPDCN].[libDocumentID]
GROUP BY [TPDTN].[CategoryID],[TPDCN].[libDocumentID]