Limited T-SQL Join - sql-server

This should be simple enough, but somehow my brain stopped working.
I have two related tables:
Table 1:
ID (PK), Value1
Table 2:
BatchID, Table1ID (FK to Table 1 ID), Value2
Example data:
Table 1:
ID Value1
1 A
2 B
Table 2:
BatchID Table1ID Value2
1 1 100
2 1 101
3 1 102
1 2 200
2 2 201
Now, for each record in Table 1, I'd like to do a matching record on Table 2, but only the most recent one (batch ID is sequential). Result for the above example would be:
Table1.ID Table1.Value1 Table2.Value2
1 A 102
2 B 201
The problem is simple, how to limit join result with Table2. There were similar questions on SO, but can't find anything like mine. Here's one on MySQL that looks similar:
LIMITing an SQL JOIN
I'm open to any approach, although speed is still the main priority since it will be a big dataset.

WITH Latest AS (
SELECT Table1ID
,MAX(BatchID) AS BatchID
FROM Table2
GROUP BY Table1ID
)
SELECT *
FROM Table1
INNER JOIN Latest
ON Latest.Table1ID = Table1.ID
INNER JOIN Table2
ON Table2.BatchID = Latest.BatchID

SELECT id, value1, value2
FROM (
SELECT t1.id, t2.value1, t2.value2, ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.BatchID DESC) AS rn
FROM table1 t1
JOIN table2 t2
ON t2.table1id = t1.id
) q
WHERE rn = 1

Try
select t1.*,t2.Value2
from(
select Table1ID,max(Value2) as Value2
from [Table 2]
group by Table1ID) t2
join [Table 1] t1 on t2.Table1ID = t1.id

Either GROUP BY or WHERE clause that filters on the most recent:
SELECT * FROM Table1 a
INNER JOIN Table2 b ON (a.id = b.Table1ID)
WHERE NOT EXISTS(
SELECT 1 FROM Table2 c WHERE c.Table1ID = a.id AND c.BatchID > b. BatchID
)

Related

Find matched column records in one table that may be in multiple columns in a second table

I have two tables, Table 1 with multiple columns, name, ID number, address, etc. And Table 2 with columns, ID number 1 and ID number 2 and a few other columns.
I am trying to get a T-SQL query returning all rows in Table 1 with an indicator showing whether the ID number in Table 1 matches either ID_1 or ID_2 in Table 2. The result set would be all columns from Table 1 , plus the indicator “Matched” if the ID number in Table 1 matches either ID_1 or ID_2 in Table 2.
Table 1: ID | Name | Address |
Table 2: ID_1 | ID_2
Result
T1.ID, Name, Address, ("Matched"/"Unmatched") ...
Also, would it be the same to do the opposite, meaning instead of the result including all rows from Table 1 that have a matching ID in ID_1 or ID_2 in Table 2, the result set would include only records from Table 1 where t1.ID = (T2.ID_1 or T2.ID_2)?
SELECT DISTINCT
CASE
WHEN (table1.ID = table2.ID_1 )
THEN 'Matched'
ELSE 'Unmatched'
END AS Status ,
table1.*
FROM
table1
LEFT JOIN
table2 ON table1.ID = table2.ID_1
UNION
SELECT DISTINCT
CASE
WHEN (table1.ID = table2.ID_2)
THEN 'Matched'
ELSE 'Unmatched'
END AS Status,
table1.*
FROM
table1
LEFT JOIN
table2 ON table1.ID = table2.ID_2
I think that a correlated subquery with an exists condition would be a reasonable solution:
select
t1.*,
case when exists (select 1 from table2 t2 where t1.id in (t2.id_1, t2.id_2))
then 'Matched'
else 'Unmatched'
end matched
from table1 t1
And the other way around:
select
t2.*,
case when exists (select 1 from table1 t1 where t1.id in (t2.id_1, t2.id_2))
then 'Matched'
else 'Unmatched'
end matched
from table2 t2
If you want to "align" the rows based on the match for the whole dataset at once, then you might want to try a full join:
select t1.*, t2.*
from table1 t1
full join table2 t2 on t1.id in (t2.id_1, t2.id_2)

Joining Two Tables, getting Aggregate and unique value of one

This may have been answered previously, but I'm having a difficult time describing my issue.
Let's say I have two tables
Table1
User, CalendarID
Joe 1
Joe 2
Joe 3
Sam 4
Bob 1
Jim 2
Jim 3
Table2
CalendarID, CalendarTime
1 2014-08-18 00:00:00.000
2 2015-01-19 00:00:00.000
3 2015-08-24 00:00:00.000
4 2016-01-18 00:00:00.000
What I would like to do is Join the two tables, only getting a single User Name, and Calendar ID based on what is this highest CalendarTime associated with that CalandarID.
So I would like the query to return
User CalendarID
Joe 3
Sam 4
Bob 1
Jim 3
The closest I've managed is
SELECT t1.User, MAX(t2.CalendarTIme) AS CalendarTime
FROM table1 t1
INNER JOIN table2 as t2
ON t1.CalendarID = t2.CalendarID
Group By t1.User
Which gets me the User and CalendarTime that I want, but not the Calendar ID, which is what I really want. Please help.
Closest to your script and pretty straightforward:
SELECT t1.User, t2.*
FROM table1 t1
INNER JOIN table2 as t2
ON t1.CalendarID = t2.CalendarID
WHERE NOT EXISTS
(
SELECT 1 FROM table1 t1_2
INNER JOIN table2 t2_2
ON t2_2.Calendar_ID = t1_2.Calendar_ID
WHERE t1_2.User = t1.User
AND t2_2.CalendarTime > t2.CalendarTime
)
This can be solved for the top N per group:
using top with ties with row_number():
select top 1 with ties
t1.User, t1.CalendarId, t2.CalendarTime
from table1 t1
inner join table2 as t2
on t1.Calendarid = t2.Calendarid
order by row_number() over (partition by t1.User order by t2.CalendarTime desc)
or using common table expression(or a derived table/subquery) with row_number()
;with cte as (
select t1.User, t1.CalendarId, t2.CalendarTime
, rn = row_number() over (partition by t1.User order by t2.CalendarTime desc)
from table1 t1
inner join table2 as t2
on t1.Calendarid = t2.Calendarid
)
select User, CalendarId, CalendarTime
from cte
where rn = 1

Joining Tables with Criteria

How do I create an MSSQL query that joins TableA with TableB using the ID field, however I want it to join on ID record that has the highest value in the Number column?
TableA
ID
1
2
3
4
TableB
ID Number
1 1
1 2
1 3
2 1
3 1
3 2
4 1
4 2
4 3
I would want this as my output
TableJoined
ID Number
1 3
2 1
3 2
4 3
Is there a way to use a LEFT JOIN to achieve this or using max()?
Both. Use aggregation on the left join.
Select t1.id, max(t2.number)
From table1 t1
Left join table2 t2 on t1.id= t2.id
Group by t1.id;
You can query as below:
Select a.Id, Number from #a a join
(
Select top(1) with ties * from #b
order by row_number() over(partition by id order by number desc)
) b on a.id = b.id
Select A.Id, Max(Number) MaxNo from A
join B on A.Id=B.Id
Group by A.Id
create table #a(
id int
)
go
create table #b(
id int,
number int
)
go
insert into #a values(1),(2),(3),(4)
insert into #b values(1,1),(1,2),(1,3),(2,1),(3,1),(3,2),(4,1),(4,2),(4,3)
select #b.id,MAX(number) as maximum
from #b left outer join #a on #b.id=#a.id
group by #b.id

How to get column of table with a self-referencing foreign key?

I have a table structure shown below:
Id Name Sal ManagerId
1 a 5000 2
2 b 7000 3
3 c 6000 1
I need output like this
Id Name Sal Manager
1 a 5000 b
2 b 7000 c
3 c 6000 a
How can i do that?
You have to use a self-JOIN to link one table to itself:
SELECT t1.Id, t1.Name, t1.Sal, t2.ManagerName AS Manager
FROM TableName t1 INNER JOIN TableName t2
ON t1.ManagerID = t2.Id
If ManagerId is nullable you might want to use an OUTER JOIN:
SELECT t1.Id, t1.Name, t1.Sal, COALESCE(t2.ManagerName, '<no manager>') AS Manager
FROM TableName t1 LEFT OUTER JOIN TableName t2
ON t1.ManagerID = t2.Id

inner join and group by

I have two tables with identical definition.
T1:
Name VARCHAR(50)
Qty INT
T2:
Name VARCHAR(50)
Qty INT
This is the data each table has:
T1:
Name Qty
a 1
b 2
c 3
d 4
T2:
Name Qty
a 1
b 3
e 5
f 10
I want to have result which can sum the Qty from both the tables based on Name.
Expected resultset:
Name TotalQty
a 2
b 5
c 3
d 4
e 5
f 10
If am do Left Join or Right Join, it is not going to return me the Name from either of the tables.
What i am thinking is to create a temp table and add these records and just do a SUM aggregate on Qty column but i think there should be a better way to do this.
This is how my query looks like which does not return the expected resultset:
SELECT t1.Name, ISNULL(SUM(t1.Qty + t2.Qty),0) TotalQty
FROM t1
LEFT JOIN t2
ON t1.Name = T2.Name
GROUP BY t1.Name
Can someone please tell me if creating a temp table is OK here or there is a better way to do this?
You can use a full outer join:
SELECT
ISNULL(t1.Name, t2.Name) AS Name,
ISNULL(t1.Qty, 0) + ISNULL(t2.Qty, 0) AS TotalQty
FROM t1
FULL JOIN t2 ON t1.Name = T2.Name
See it working online: sqlfiddle
You can use a UNION ALL to select both tables as one, since they have the same definition. From there, you can nest them as a derived table, and then SUM on that:
SELECT [Name], SUM(Qty) AS TotalQty
FROM (
SELECT [Name], Qty
FROM t1
UNION ALL
SELECT [Name], Qty
FROM t2
) YourDerivedTable
GROUP BY [Name]

Resources