SQL Server : Join from multiple table references - sql-server

Forgive me for adding yet another JOIN question, but I've been stumped all day and haven't been able to find a good answer for this.
I'm trying to join 4 tables, such that they look like below:
QuarterID ReviewID SaleID PotentialID
1 1 1 1
1 2 2 null
1 3 null 2
1 4 null null
The relevant info from the tables is below
Sale:
QuarterID
ReviewID
IsArchived
Potential:
QuarterID
ReviewID
IsArchived
Quarter:
ID
Review:
ID
We can have multiple Sales and Potentials associated with one Quarter-Review pairing, but only one Sale and one Potential will have IsArchived = 0 for the given Quarter-Review pairing.
SELECT
quarter.id AS QID,
review.id AS RID,
Sales.id AS SID,
Potentials.id AS PID
FROM
dbo.quarter
JOIN
(SELECT *
FROM dbo.sale
WHERE isarchived = 0) AS Sales ON Sales.quarterid = quarter.id
JOIN
(SELECT *
FROM dbo.potential
WHERE isarchived = 0) AS Potentials ON Potentials.quarterid = quarter.id
JOIN
dbo.review ON dbo.review.id = Sales.reviewid
AND dbo.review.id = Potentials.reviewid
ORDER BY
quarter.id, rid
Using the above (there are some unnecessary columns, I know), I've managed to get the joins so that they get the 1st condition (where its all the Sales and Potentials that are in the same Quarter and Review combination, but I also want to see if there is a Quarter/Review combo with only a Sale and no Potential, if there is a Q/R combo with only a Potential and no Sale, and just every Quarter and Review combo, since there are only a few Q/R combos that have both a Sale and Potential, with almost all of the Q/R combos only having a Sale or Potential.
I guess overall the difficulty comes from needing to get the join from two intermediate tables. I can join Quarter, Sale, and Review easily, but having the Potential table joining on the same fields (ReviewID, QuarterID) as Sale is making me only get the AND, and I can't figure out an OR. I've been throwing around ORs for hours trying to get the right sequence without any luck. Help?
--Edit to include sample data--
Quarter
ID
1
2
Review
ID (Other fields, not relevant to join)
1
2
3
4
5
Sale
ID ReviewID QuarterID isArchived (Other fields, not relevant)
1 1 1 0
2 2 1 1
3 2 1 0
4 1 2 0
5 5 1 0
6 5 2 0
Potential
ID ReviewID QuarterID isArchived (Other fields, not relevant)
1 1 1 0
2 3 1 0
3 4 2 1
4 4 2 0
5 5 2 0
Joining the above sample data, I would expect the output to look like:
QuarterID ReviewID SaleID PotentialID
1 1 1 1
1 2 3 null
1 3 null 2
1 4 null null
1 5 5 null
2 1 4 null
2 2 null null
2 3 null null
2 4 null 4
2 5 6 5
But the problem I am having is I am only returning the rows like the first and last row, where there is both a Sale and Potential for a given Quarter/Review combo, and not the ones where one or many may be null.

Not sure if I understood your question correctly (some sample data will help) but I think you mean that you need all the combinations of Quarter and Review and then any related Sale and Potential data for each combination of Quarter and Review. If that is what you need, then try the below query:
SELECT [Quarter].ID AS QID, Review.ID AS RID, Sales.ID AS SID, Potentials.ID AS PID FROM [Quarter]
CROSS JOIN [Review]
LEFT JOIN (SELECT * FROM Sale WHERE IsArchived = 0) Sales ON [Quarter].ID = Sales.QuarterID AND [Review].ID = Sales.ReviewID
LEFT JOIN (SELECT * FROM Potential WHERE IsArchived = 0) Potentials ON [Quarter].ID = Potentials.QuarterID AND [Review].ID = Potentials.ReviewID

Related

Finding A Time When A Value Changed

I am still learning many new things about SQL such as PARTITION BY and CTEs. I am currently working on a query which I have cobbled together from a similar question I found online. However, I can not seem to get it to work as intended.
The problem is as follows -- I have been tasked to show rank promotions in an organization from the begining of 2022 to today. I am working with 2 primary tables, an EMPLOYEES table and a PERIODS table. This periods table captures a snapshot of any given employee each month - including their rank at the time. Each of these months is also assigned a PeriodID (e.g. Jan 2022 = PeriodID 131). Our EMPLOYEE table holds the employees current rank. These ranks are stored as an int (e.g. 1,2,3 with 1 being lowest rank). It is possible for an employee to rank up more than once in any given month.
I have simplified the used query as much as I can for the sake of this problem. Query follows as:
;WITH x AS
(
SELECT
e.EmployeeID, p.PeriodID, p.RankID,
rn = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY p.PeriodID DESC)
FROM employees e
LEFT JOIN periods p on p.EmployeeID= e.EmployeeID
WHERE p.PeriodID <= 131 AND p.PeriodID >=118 --This is the time range mentioned above
),
rest AS (SELECT * FROM x WHERE rn > 1)
SELECT
main.EmployeeID,
PeriodID = MIN(
CASE
WHEN main.CurrentRankID = Rest.RankID
THEN rest.PeriodID ELSE main.PeriodID
END),
main.RankID, rest.RankID
FROM x AS main LEFT OUTER JOIN rest ON main.EmployeeID = rest.EmployeeID
AND rest.rn >1
LEFT JOIN periods p on p.EmployeeID = e.EmployeeID
WHERE main.rn = 1
AND NOT EXISTS
(
SELECT 1 FROM rest AS rest2
WHERE EmployeeID = rest.EmployeeID
AND rn < rest.rn
AND main.RankID <> rest.RankID
)
and p.PeriodID <= 131 AND p.PeriodID >=118
GROUP BY main.EmployeeID, main.PeriodID, main.RankID, rest.RankID
As mentioned before, this query was borrowed from a similar question and modified for my own use. I imagine the bones of the query is good and maybe I have messed up a variable somewhere but I can not seem to locate the problem line. The end goal is for the query to result in a table showing the EmployeeID, PeriodID, the rank they are being promoted from, and the rank they are being promoted to in the month the promotion was earned. Similar to the below.
EmployeeID
PeriodID
PerviousRankID
NewRank
123
131
1
2
123
133
2
3
Instead, my query is spitting out repeating previous/current ranks and the PeriodIDs seem to be static (such as what is shown below).
EmployeeID
PeriodID
PerviousRankID
NewRank
123
131
1
1
123
131
1
1
I am hoping someone with a greater knowledge base on these functions is able to quickly notice my mistake.
If we assume some example DML/DDL (it's really helpful to provide this with your question):
DECLARE #Employees TABLE (EmployeeID INT IDENTITY, Name VARCHAR(20), RankID INT);
DECLARE #Periods TABLE (PeriodID INT, EmployeeID INT, RankID INT);
INSERT INTO #Employees (Name, RankID) VALUES ('Jonathan', 10),('Christopher', 10),('James', 10),('Jean-Luc', 8);
INSERT INTO #Periods (PeriodID, EmployeeID, RankID) VALUES
(1,1,1),(2,1,1),(3,1,1),(4,1,8 ),(5,1,10),(6,1,10),
(1,2,1),(2,2,1),(3,2,1),(4,2,8 ),(5,2,8 ),(6,2,10),
(1,3,1),(2,3,1),(3,3,7),(4,3,10),(5,3,10),(6,3,10),
(1,4,1),(2,4,1),(3,4,1),(4,4,8 ),(5,4,9 ),(6,4,9 )
Then we can accomplish what I think you're looking for using a OUTER APPLY then aggregates the values based on the current-row values:
SELECT e.EmployeeID, e.Name, e.RankID AS CurrentRank, ap.PeriodID AS ThisPeriod, p.PeriodID AS LastRankChangePeriodID, p.RankID AS LastRankChangedFrom, ap.RankID - p.RankID AS LastRankChanged
FROM #Employees e
LEFT OUTER JOIN #Periods ap
ON e.EmployeeID = ap.EmployeeID
OUTER APPLY (
SELECT EmployeeID, MAX(PeriodID) AS PeriodID
FROM #Periods
WHERE EmployeeID = e.EmployeeID
AND RankID <> ap.RankID
AND PeriodID < ap.PeriodID
GROUP BY EmployeeID
) a
LEFT OUTER JOIN #Periods p
ON a.EmployeeID = p.EmployeeID
AND a.PeriodID = p.PeriodID
ORDER BY e.EmployeeID, ap.PeriodID DESC
Using the correlated subquery we get a view of the data which we can filter using the current-row values, and we aggregate that to return the period we're looking for (where it's before this period, and it's not the same rank). Then it's just a join back to the Periods table to get the values.
You used an LEFT JOIN, so I've preserved that using an OUTER APPLY. If you wanted to filter using it, it would be a CROSS APPLY instead.
EmployeeID
Name
CurrentRank
ThisPeriod
LastRankChangePeriodID
LastRankChangedFrom
LastRankChanged
1
Jonathan
10
6
4
8
2
1
Jonathan
10
5
4
8
2
1
Jonathan
10
4
3
1
7
1
Jonathan
10
3
1
Jonathan
10
2
1
Jonathan
10
1
2
Christopher
10
6
5
8
2
2
Christopher
10
5
3
1
7
2
Christopher
10
4
3
1
7
2
Christopher
10
3
2
Christopher
10
2
2
Christopher
10
1
3
James
10
6
3
7
3
3
James
10
5
3
7
3
3
James
10
4
3
7
3
3
James
10
3
2
1
6
3
James
10
2
3
James
10
1
4
Jean-Luc
8
6
5
9
-1
4
Jean-Luc
8
5
4
8
1
4
Jean-Luc
8
4
3
1
7
4
Jean-Luc
8
3
4
Jean-Luc
8
2
4
Jean-Luc
8
1
Now we can see what the previous change looked like for each period. Currently Jonathan is has RankID 10. Last time that was different was in PeriodID 4 when it was 8. The same was true for PeriodID 5. In PeriodID 4 he had RankID 8, and prior to that he had RankID 1. Before that his Rank hadn't changed.
Jean-Luc was actually demoted as his last change. I don't know if this is possible within your model.

Classifying rows into a grouping column that shows the data is related to prior rows

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer id
sequence id
prior_sequence id
1
1
0
1
2
1
1
3
2
2
4
0
2
5
4
2
6
0
2
7
6
Ideally, I would like to create the following grouping column and yield the following results:
customer id
sequence id
prior sequence id
grouping
1
1
0
1
1
2
1
1
1
3
2
1
2
4
0
2
2
5
4
2
2
6
0
3
2
7
6
3
I've attempted to utilize island gap logic utilizing the ROW_NUMBER() function. However, I have been unsuccessful in doing so. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
/*query that determines starting point for recursion
*
* In this case we want all records with no prior_sequence_id
*/
SELECT
customer_id,
sequence_id,
prior_sequence_id,
/*establish grouping*/
ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
FROM yourtable
WHERE prior_sequence_id = 0
UNION
/*join the recursive CTe back to the table and iterate*/
SELECT
yourtable.customer_id,
yourtable.sequence_id,
yourtable.prior_sequence_id,
reccte.grouping
FROM reccte
INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
See Example Fiddle

SQLite Join Tables With Different Primary Key Values

I have two tables in SQLITE one table FastData records data at a high rate while the other table SlowData records data at a lower rate. FastData and SlowData share a primary key (PK) that represents time of data capture. As such the two tables could look like:
Fast Data Slow Data
Pk Value1 Pk Value2
2 1 1 1
3 2 4 2
5 3 7 3
6 4
7 5
9 6
I would like to create a Select statement that joins these two tables filling in the SlowData with the previous captured data.
Join Data
Pk Value1 Value2
2 1 1
3 2 1
5 3 2
6 4 2
7 5 3
9 6 3
You may try the following approach which uses row_number to determine the most recent entry as it relates to Pk as the ideal entry for Value2 after performing a left join.
SELECT
Pk,
Value1,
Value2
FROM (
SELECT
f.Pk,
f.Value1,
s.Value2,
ROW_NUMBER() OVER (
PARTITION BY f.Pk, f.Value1
ORDER BY s.Pk DESC
) rn
FROM
fast_data f
LEFT JOIN
slow_data s ON f.Pk >= s.Pk
) t
WHERE rn=1;
Pk
Value1
Value2
2
1
1
3
2
1
5
3
2
6
4
2
7
5
3
9
6
3
View working demo on DB Fiddle
You need a LEFT join of the tables and FIRST_VALUE() window function to pick Value2:
SELECT DISTINCT f.Pk, f.Value1,
FIRST_VALUE(s.Value2) OVER (PARTITION BY f.Pk ORDER BY s.Pk DESC) Value2
FROM FastData f LEFT JOIN SlowData s
ON s.Pk <= f.Pk;
See the demo.

List combination in where clause

Here is the scenerio, I have a input data and a table table1
Input Data Table1
Customer Id Campaign ID CustomerId CampaignID
1 1 4 2
1 2 6 3
2 3 1 1
1 3 5 5
4 2 9 8
4 4
5 5
I want to query table1 such that it return only those values from the where clause which are not present in table1. So the result will be as below
Result
Customer Id Campaign ID
1 2
2 3
1 3
4 4
5 5
So the query should be something like
select CustomerId, CampaignID from Table1
where Customer Id in (Input data for customer id) and CampaignId in (Input data for campaign id)
. I know this query is not right, but can someone please help.
Is there a way to filter the values given in where clause based on if they are present in table1?
P.S. table1 primary key (CustomerId, CampaignID)
This will work as for your scenario. But it wont show last result record 5,5 since it does not fulfill your need.
select * from input where (cust_id, camp_id) not in (select cust_id, camp_id from table1)

SQL Stored Procedures : Count, Join and group by but it is shown 'NULL'

I got stuck something about stored procedures I write a stored that i need to shot three columns of products count like this
SELECT
Count([TPDTN].[ProductName]) as 'Product Count',
[TPDTN].[CategoryID]
FROM
[TPDTN]
LEFT JOIN
[TPDCN] ON [TPDTN].[CategoryID] = [TPDCN].[libDocumentID]
GROUP BY
[TPDTN].[CategoryID], [TPDCN].[libDocumentID]
It shows results like this:
Product Count CategoryID
---------------------------
2 1
9 2
2 3
2 4
1 5
But I don't know how make it show
Product Count CategoryID libDocumentID
-----------------------------------------------
2 1 123456789
9 2 123456789
2 3 123456789
2 4 123456789
1 5 123456789
Producer ID (LibdocumentID) is from other table but when I SELECT [TPDCN].[libDocumentID] the value is NULL
Product Count CategoryID libDocumentID
------------------------------------------------
2 1 NULL
9 2 NULL
2 3 NULL
2 4 NULL
1 5 NULL
How can I solve it? Thank you
Just add it to the select, and if you don't need the NULL you need an INNER JOIN:
SELECT Count([TPDTN].[ProductName]) as 'Product Count',[TPDTN].[CategoryID], [TPDCN].[libDocumentID]
FROM [TPDTN]
inner join [TPDCN]
ON [TPDTN].[CategoryID] = [TPDCN].[libDocumentID]
GROUP BY [TPDTN].[CategoryID],[TPDCN].[libDocumentID]

Resources