T-SQL: Join three tables and limit to earliest encounter - sql-server

I'm fairly new to SQL and have sunk a whole day into trying to figure out how to do the following. I have 3 tables that look something like this:
Table 1
customer_id department_id start_dts
1 2 2011-07-23 14:30:00
3 1 2011-07-24 10:15:00
3 1 2011-08-18 11:14:00
2 3 2012-05-04 05:45:00
1 3 2010-06-09 15:20:00
Table 2
department_id department_nm
1 a
2 b
3 c
Table 3
customer_id customer_nm
1 betty
2 fred
3 dino
I want to generate a list of the earliest encounter for each department and the associated customer name for the encounter such that it would look something like this (order of the dept doesn't matter):
department_nm customer_nm start_dts
a dino 2011-07-24 10:15:00
b betty 2011-07-23 14:30:00
c betty 2010-06-09 15:20:00
I first attempted to join table 2 to table 1 on department_id, then inner join table 3 on customer_id, using the MIN function on start_dts in the select list, but that gives me each customer's first encounter in each department. I then tried various iterations of nested joins and attempted to use an OVER/PARTITION BY clause to get what I want, but I don't think I understand that concept correctly. Any insight is much appreciated.

;with cte as (
-- rn = 1 marks the earliest row within each department
select t2.department_nm, t3.customer_nm, t1.start_dts,
row_number() over (partition by t1.department_id order by t1.start_dts) rn
from table1 t1
left join table2 t2
on t1.department_id = t2.department_id
left join table3 t3
on t1.customer_id = t3.customer_id
) select department_nm, customer_nm, start_dts from cte where rn=1
I'm not sure I understood your requirement correctly, but it looks like you are trying to do something like this.

Pretty sparse on actual details but something like this is what you are looking for. This has been asked and answered hundreds and hundreds of times.
select department_nm
, customer_nm
, start_dts
from
(
select department_nm
, customer_nm
, start_dts
, ROW_NUMBER() over(partition by t1.department_id order by t1.start_dts) as RowNum
from table1 t1
join table2 t2 on t2.department_id = t1.department_id
join table3 t3 on t3.customer_id = t1.customer_id
) x
where x.RowNum = 1

I feel like CTEs and Window functions are overkill for something like this. The following should work, if I understood correctly.
SELECT department_nm, customer_nm, MIN(start_dts) AS [start_dts]
FROM
(
SELECT department_nm, customer_nm, start_dts
FROM Table1 t1
JOIN Table2 t2 ON t1.department_id = t2.department_id
JOIN Table3 t3 ON t1.customer_id = t3.customer_id
) x
GROUP BY department_nm, customer_nm
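Note that grouping by both department_nm and customer_nm returns one row per customer per department, which is the result the question says it does not want. A window-free alternative is to compute the earliest start_dts per department first and join back to it. The following is only a sketch under the assumption that the sample data from the question is loaded into table variables (the names @table1, @table2, @table3 and the alias first_dts are made up for illustration):

```sql
-- Sample data from the question, in table variables so the script is self-contained.
DECLARE @table1 TABLE (customer_id INT, department_id INT, start_dts DATETIME);
DECLARE @table2 TABLE (department_id INT, department_nm VARCHAR(10));
DECLARE @table3 TABLE (customer_id INT, customer_nm VARCHAR(10));

INSERT INTO @table1 (customer_id, department_id, start_dts) VALUES
    (1, 2, '2011-07-23T14:30:00'),
    (3, 1, '2011-07-24T10:15:00'),
    (3, 1, '2011-08-18T11:14:00'),
    (2, 3, '2012-05-04T05:45:00'),
    (1, 3, '2010-06-09T15:20:00');
INSERT INTO @table2 (department_id, department_nm) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @table3 (customer_id, customer_nm) VALUES (1, 'betty'), (2, 'fred'), (3, 'dino');

-- Step 1 (derived table f): earliest start_dts per department.
-- Step 2: join back to table1 to pick up the customer on that row.
SELECT t2.department_nm, t3.customer_nm, t1.start_dts
FROM @table1 t1
JOIN (
    SELECT department_id, MIN(start_dts) AS first_dts
    FROM @table1
    GROUP BY department_id
) f ON f.department_id = t1.department_id
   AND f.first_dts = t1.start_dts
JOIN @table2 t2 ON t2.department_id = t1.department_id
JOIN @table3 t3 ON t3.customer_id = t1.customer_id
ORDER BY t2.department_nm;
-- Rows for the sample data:
-- a  dino   2011-07-24 10:15:00
-- b  betty  2011-07-23 14:30:00
-- c  betty  2010-06-09 15:20:00
```

Unlike row_number(), this returns more than one row for a department if two encounters share the same earliest start_dts.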

Related

Return sum using aggregate function in SQL Server

I have two tables.
Table1 has following data
Id1 Name Comments
--------------------
1 abc hgdhg
2 xyz mnoph
3 ysdfr jkljk
4 asdf iiuoo
5 pqrs liuoo
Table2 has following data
Id2 Id1 count date
-------------------------------
1 1 18 11/16/2005
2 1 1 11/15/2005
3 1 4 11/25/2005
4 2 4 11/22/2005
5 3 8 11/05/2005
6 3 3 11/30/2005
7 4 2 11/29/2005
8 3 0 11/04/2005
9 2 5 11/02/2005
10 3 9 11/22/2005
11 2 15 11/10/2005
12 5 12 11/19/2005
I want to return output as name, comments, sum of all count since 11/10/2005
I am trying the following query (without the date WHERE condition):
select
Name, Comments, sum(count)
from
Table1 T1
join
Table2 T2 on T1.Id1 = T2.Id1
group by
ID1
But it is throwing error
Name is invalid in the select list because it is not contained in either an aggregate function or the Group by clause.
Can anyone help me with query (with the date where condition)? What's wrong with this?
Thanks in advance
You have to add every selected column that is not inside an aggregate function to the GROUP BY clause, and use WHERE to filter the results:
select Name,
Comments,
sum(count)
from Table1 T1 join Table2 T2 on T1.Id1 = T2.Id1
where T2.[date] >= '11/10/2005'
group by Name, Comments
You can use the query below:
SELECT T1.Name ,
T1.Comments ,
SUM(T2.[count]) AS [count]
FROM Table1 T1
INNER JOIN Table2 T2 ON T1.Id1 = T2.Id1
WHERE CAST(T2.[date] AS DATE) >= CAST('11/10/2005' AS DATE)
GROUP BY T1.Name ,
T1.Comments
Every column in the select list that is not inside an aggregate function must also appear in the GROUP BY clause, otherwise you get this error. To limit the date, use a WHERE clause, as shown below:
select
Name, Comments, sum(count)
from
Table1 T1
join
Table2 T2 on T1.Id1 = T2.Id1
where
date >= '2005-11-10 00:00:00'
group by
Name, Comments
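One detail the queries above gloss over: a literal such as '11/10/2005' is interpreted according to the session's language and DATEFORMAT settings, while the 'YYYYMMDD' form is read the same way regardless of those settings. A minimal sketch against the question's sample data, assuming the date column is of type DATE (the question does not say) and using table variables and the alias TotalCount purely for illustration:

```sql
-- Sample data from the question; column types are assumptions.
DECLARE @Table1 TABLE (Id1 INT, Name VARCHAR(20), Comments VARCHAR(20));
DECLARE @Table2 TABLE (Id2 INT, Id1 INT, [count] INT, [date] DATE);

INSERT INTO @Table1 (Id1, Name, Comments) VALUES
    (1, 'abc', 'hgdhg'), (2, 'xyz', 'mnoph'), (3, 'ysdfr', 'jkljk'),
    (4, 'asdf', 'iiuoo'), (5, 'pqrs', 'liuoo');

INSERT INTO @Table2 (Id2, Id1, [count], [date]) VALUES
    (1, 1, 18, '20051116'), (2, 1, 1, '20051115'), (3, 1, 4, '20051125'),
    (4, 2, 4, '20051122'), (5, 3, 8, '20051105'), (6, 3, 3, '20051130'),
    (7, 4, 2, '20051129'), (8, 3, 0, '20051104'), (9, 2, 5, '20051102'),
    (10, 3, 9, '20051122'), (11, 2, 15, '20051110'), (12, 5, 12, '20051119');

-- Filter first (WHERE), then aggregate per Name/Comments.
SELECT T1.Name, T1.Comments, SUM(T2.[count]) AS TotalCount
FROM @Table1 T1
JOIN @Table2 T2 ON T2.Id1 = T1.Id1
WHERE T2.[date] >= '20051110'   -- unambiguous YYYYMMDD literal
GROUP BY T1.Name, T1.Comments
ORDER BY T1.Name;
-- Rows for the sample data:
-- abc    hgdhg  23
-- asdf   iiuoo   2
-- pqrs   liuoo  12
-- xyz    mnoph  19
-- ysdfr  jkljk  12
```

Grouping by Name and Comments merges rows from different Id1 values that happen to share both; add T1.Id1 to the GROUP BY if that matters for your data.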

Joining Two Tables, getting Aggregate and unique value of one

This may have been answered previously, but I'm having a difficult time describing my issue.
Let's say I have two tables
Table1
User, CalendarID
Joe 1
Joe 2
Joe 3
Sam 4
Bob 1
Jim 2
Jim 3
Table2
CalendarID, CalendarTime
1 2014-08-18 00:00:00.000
2 2015-01-19 00:00:00.000
3 2015-08-24 00:00:00.000
4 2016-01-18 00:00:00.000
What I would like to do is join the two tables, returning a single row per User with the CalendarID that has the highest CalendarTime among that user's CalendarIDs.
So I would like the query to return
User CalendarID
Joe 3
Sam 4
Bob 1
Jim 3
The closest I've managed is
SELECT t1.[User], MAX(t2.CalendarTime) AS CalendarTime
FROM table1 t1
INNER JOIN table2 as t2
ON t1.CalendarID = t2.CalendarID
GROUP BY t1.[User]
Which gets me the User and CalendarTime that I want, but not the Calendar ID, which is what I really want. Please help.
Closest to your script and pretty straightforward:
SELECT t1.[User], t2.*
FROM table1 t1
INNER JOIN table2 as t2
ON t1.CalendarID = t2.CalendarID
WHERE NOT EXISTS
(
-- no later calendar entry exists for the same user
SELECT 1 FROM table1 t1_2
INNER JOIN table2 t2_2
ON t2_2.CalendarID = t1_2.CalendarID
WHERE t1_2.[User] = t1.[User]
AND t2_2.CalendarTime > t2.CalendarTime
)
This is a top-N-per-group problem. It can be solved
using top with ties with row_number():
select top 1 with ties
t1.[User], t1.CalendarId, t2.CalendarTime
from table1 t1
inner join table2 as t2
on t1.Calendarid = t2.Calendarid
order by row_number() over (partition by t1.[User] order by t2.CalendarTime desc)
or using a common table expression (or a derived table/subquery) with row_number():
;with cte as (
select t1.[User], t1.CalendarId, t2.CalendarTime
, rn = row_number() over (partition by t1.[User] order by t2.CalendarTime desc)
from table1 t1
inner join table2 as t2
on t1.Calendarid = t2.Calendarid
)
select [User], CalendarId, CalendarTime
from cte
where rn = 1
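The top 1 with ties form can look surprising, so here is a sketch of why it works, run against the question's sample data in table variables (the variable names are illustrative only). The outer ORDER BY sorts on the row number itself, so every user's latest row has the value 1 and all of them tie for first place; WITH TIES then keeps every tied row instead of just one. Note that User is a reserved word in T-SQL, hence the brackets.

```sql
-- Sample data from the question.
DECLARE @table1 TABLE ([User] VARCHAR(10), CalendarID INT);
DECLARE @table2 TABLE (CalendarID INT, CalendarTime DATETIME);

INSERT INTO @table1 ([User], CalendarID) VALUES
    ('Joe', 1), ('Joe', 2), ('Joe', 3), ('Sam', 4),
    ('Bob', 1), ('Jim', 2), ('Jim', 3);

INSERT INTO @table2 (CalendarID, CalendarTime) VALUES
    (1, '2014-08-18T00:00:00'),
    (2, '2015-01-19T00:00:00'),
    (3, '2015-08-24T00:00:00'),
    (4, '2016-01-18T00:00:00');

-- Each user's newest calendar row gets row number 1.
-- TOP (1) alone would return a single row; WITH TIES returns
-- every row whose ORDER BY value equals that of the first row.
SELECT TOP (1) WITH TIES
       t1.[User], t1.CalendarID, t2.CalendarTime
FROM @table1 t1
INNER JOIN @table2 t2 ON t2.CalendarID = t1.CalendarID
ORDER BY ROW_NUMBER() OVER (PARTITION BY t1.[User] ORDER BY t2.CalendarTime DESC);
-- Four rows for the sample data (in no guaranteed order):
-- Joe 3, Sam 4, Bob 1, Jim 3
```

The trade-off is that the ORDER BY is used up by the trick, so if the final result needs its own ordering, the CTE form is the easier one to extend.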

How to replace some rows in a SELECT query from another SELECT

I have two tables:
T1:
ID Department ATTRIBUTES TEAM
--- ---------- ---------- ------
1 R&D Dress_Code NULL
2 R&D Dress_Code Web
3 R&D Food System
4 R&D Food NULL
5 R&D Color NULL
6 Marketing Food System
T2:
ID VAL
--- ----------
1 Smart
2 Casual
3 Beef
4 Chicken
5 Green
6 Fish
The purpose of T1 is to show all the department attributes.
If the TEAM is null, it is for everyone in that department. Sometimes a team has special settings which override the generic settings.
For example, I want to get the settings as a 'Web' team in R&D.
I can write:
SELECT T1.DEPARTMENT, T1.ATTRIBUTES, T1.TEAM, T2.VAL
FROM T1
LEFT JOIN T2 ON T1.ID = T2.ID
WHERE T1.DEPARTMENT = 'R&D' AND T1.TEAM = 'Web'
This will show one record which says dress code is casual.
But I want the result to be:
ATTRIBUTES VAL
---------- ------
Dress_Code Casual
Food Chicken
Color Green
Similarly for the 'System' team in R&D, the result would be smart dress code, beef, and green color.
I'm thinking first select all R&D results and then replace the rows with the above select results.
I need to write this as a stored procedure.
Any help is much appreciated!
Using a CTE and row_number():
with CTE as(
select
T1.ATTRIBUTES,
T2.VAL,
T1.TEAM,
-- desc sorts NULL last, so a team-specific row wins over the generic one
row_number() over (partition by T1.ATTRIBUTES order by T1.TEAM desc) rn
from T1
inner join T2 on T1.ID = T2.ID
where T1.DEPARTMENT = 'R&D'
AND ( T1.TEAM = 'Web' or T1.TEAM is null )
)
select
ATTRIBUTES ,
VAL
from cte where rn=1
order by val
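-- generic settings for the department (TEAM is NULL)
SELECT
T1.DEPARTMENT,
T1.ATTRIBUTES,
T1.TEAM,
T2.VAL
INTO #temp
FROM T1
INNER JOIN T2 ON T1.ID = T2.ID
WHERE T1.DEPARTMENT = 'R&D' AND T1.TEAM is NULL
-- team-specific overrides
SELECT
T1.ATTRIBUTES,
T2.VAL
INTO #t2
FROM T1
INNER JOIN T2 ON T1.ID = T2.ID
WHERE T1.DEPARTMENT = 'R&D' AND T1.TEAM = 'Web'
-- overwrite the generic values where an override exists
UPDATE t
SET t.VAL=b.VAL
FROM #temp t
join #t2 b on b.ATTRIBUTES=t.ATTRIBUTES
SELECT
DEPARTMENT,
ATTRIBUTES,
TEAM,
VAL
FROM #temp
This approach will also work.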
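Since the question asks for a stored procedure, here is one way the row_number() idea could be wrapped up. This is only a sketch: the procedure name dbo.GetTeamSettings and the parameters @Department and @Team are made up for illustration, the column types are guesses, and the setup batch is meant for a scratch database. The ordering uses an explicit CASE rather than relying on how NULL sorts.

```sql
-- Scratch-database setup mirroring the question's tables (types are assumptions).
CREATE TABLE dbo.T1 (ID INT PRIMARY KEY, DEPARTMENT VARCHAR(50),
                     ATTRIBUTES VARCHAR(50), TEAM VARCHAR(50) NULL);
CREATE TABLE dbo.T2 (ID INT PRIMARY KEY, VAL VARCHAR(50));

INSERT INTO dbo.T1 (ID, DEPARTMENT, ATTRIBUTES, TEAM) VALUES
    (1, 'R&D', 'Dress_Code', NULL), (2, 'R&D', 'Dress_Code', 'Web'),
    (3, 'R&D', 'Food', 'System'),   (4, 'R&D', 'Food', NULL),
    (5, 'R&D', 'Color', NULL),      (6, 'Marketing', 'Food', 'System');
INSERT INTO dbo.T2 (ID, VAL) VALUES
    (1, 'Smart'), (2, 'Casual'), (3, 'Beef'),
    (4, 'Chicken'), (5, 'Green'), (6, 'Fish');
GO

-- Hypothetical procedure: effective settings for one team in one department.
CREATE PROCEDURE dbo.GetTeamSettings
    @Department VARCHAR(50),
    @Team       VARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;

    WITH ranked AS (
        SELECT T1.ATTRIBUTES,
               T2.VAL,
               -- team-specific rows rank ahead of the generic (NULL team) rows
               ROW_NUMBER() OVER (
                   PARTITION BY T1.ATTRIBUTES
                   ORDER BY CASE WHEN T1.TEAM = @Team THEN 0 ELSE 1 END
               ) AS rn
        FROM dbo.T1 AS T1
        INNER JOIN dbo.T2 AS T2 ON T2.ID = T1.ID
        WHERE T1.DEPARTMENT = @Department
          AND (T1.TEAM = @Team OR T1.TEAM IS NULL)
    )
    SELECT ATTRIBUTES, VAL
    FROM ranked
    WHERE rn = 1;
END;
GO

EXEC dbo.GetTeamSettings @Department = 'R&D', @Team = 'Web';
-- Dress_Code Casual, Food Chicken, Color Green
EXEC dbo.GetTeamSettings @Department = 'R&D', @Team = 'System';
-- Dress_Code Smart, Food Beef, Color Green
```

Filtering on the department inside the procedure keeps the Marketing row for the 'System' team from leaking into the R&D result.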

inner join and group by

I have two tables with identical definition.
T1:
Name VARCHAR(50)
Qty INT
T2:
Name VARCHAR(50)
Qty INT
This is the data each table has:
T1:
Name Qty
a 1
b 2
c 3
d 4
T2:
Name Qty
a 1
b 3
e 5
f 10
I want to have result which can sum the Qty from both the tables based on Name.
Expected resultset:
Name TotalQty
a 2
b 5
c 3
d 4
e 5
f 10
If I do a LEFT JOIN or a RIGHT JOIN, it does not return the names that exist in only one of the tables.
What I am thinking is to create a temp table, add these records to it, and just do a SUM aggregate on the Qty column, but I think there should be a better way to do this.
This is my query, which does not return the expected result set:
SELECT t1.Name, ISNULL(SUM(t1.Qty + t2.Qty),0) TotalQty
FROM t1
LEFT JOIN t2
ON t1.Name = T2.Name
GROUP BY t1.Name
Can someone please tell me if creating a temp table is OK here or there is a better way to do this?
You can use a full outer join:
SELECT
ISNULL(t1.Name, t2.Name) AS Name,
ISNULL(t1.Qty, 0) + ISNULL(t2.Qty, 0) AS TotalQty
FROM t1
FULL JOIN t2 ON t1.Name = T2.Name
See it working online: sqlfiddle
You can use a UNION ALL to select both tables as one, since they have the same definition. From there, you can nest them as a derived table, and then SUM on that:
SELECT [Name], SUM(Qty) AS TotalQty
FROM (
SELECT [Name], Qty
FROM t1
UNION ALL
SELECT [Name], Qty
FROM t2
) YourDerivedTable
GROUP BY [Name]
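As for why the query in the question falls short: the LEFT JOIN drops names that exist only in t2 (e and f), and for names that exist only in t1 the expression t1.Qty + t2.Qty is NULL, so the SUM is NULL and ISNULL turns it into 0. A quick sketch that shows both behaviours side by side, using the question's data in table variables (the variable names are illustrative):

```sql
-- Sample data from the question.
DECLARE @t1 TABLE (Name VARCHAR(50), Qty INT);
DECLARE @t2 TABLE (Name VARCHAR(50), Qty INT);

INSERT INTO @t1 (Name, Qty) VALUES ('a', 1), ('b', 2), ('c', 3), ('d', 4);
INSERT INTO @t2 (Name, Qty) VALUES ('a', 1), ('b', 3), ('e', 5), ('f', 10);

-- The question's query: c and d come back as 0, e and f are missing.
SELECT t1.Name, ISNULL(SUM(t1.Qty + t2.Qty), 0) AS TotalQty
FROM @t1 t1
LEFT JOIN @t2 t2 ON t1.Name = t2.Name
GROUP BY t1.Name
ORDER BY t1.Name;
-- a 2, b 5, c 0, d 0

-- UNION ALL stacks both tables first, so no name is lost
-- and no NULL ever enters the addition.
SELECT u.Name, SUM(u.Qty) AS TotalQty
FROM (
    SELECT Name, Qty FROM @t1
    UNION ALL
    SELECT Name, Qty FROM @t2
) u
GROUP BY u.Name
ORDER BY u.Name;
-- a 2, b 5, c 3, d 4, e 5, f 10
```

The UNION ALL form also copes with a name appearing several times in the same table, whereas the FULL JOIN form without a GROUP BY assumes each name appears at most once per table.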

Limited T-SQL Join

This should be simple enough, but somehow my brain stopped working.
I have two related tables:
Table 1:
ID (PK), Value1
Table 2:
BatchID, Table1ID (FK to Table 1 ID), Value2
Example data:
Table 1:
ID Value1
1 A
2 B
Table 2:
BatchID Table1ID Value2
1 1 100
2 1 101
3 1 102
1 2 200
2 2 201
Now, for each record in Table 1, I'd like to do a matching record on Table 2, but only the most recent one (batch ID is sequential). Result for the above example would be:
Table1.ID Table1.Value1 Table2.Value2
1 A 102
2 B 201
The problem is simple, how to limit join result with Table2. There were similar questions on SO, but can't find anything like mine. Here's one on MySQL that looks similar:
LIMITing an SQL JOIN
I'm open to any approach, although speed is still the main priority since it will be a big dataset.
WITH Latest AS (
-- highest BatchID per Table1 row
SELECT Table1ID
,MAX(BatchID) AS BatchID
FROM Table2
GROUP BY Table1ID
)
SELECT *
FROM Table1
INNER JOIN Latest
ON Latest.Table1ID = Table1.ID
INNER JOIN Table2
ON Table2.Table1ID = Latest.Table1ID
AND Table2.BatchID = Latest.BatchID
SELECT id, value1, value2
FROM (
SELECT t1.id, t1.value1, t2.value2, ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.BatchID DESC) AS rn
FROM table1 t1
JOIN table2 t2
ON t2.table1id = t1.id
) q
WHERE rn = 1
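Try this (note that it picks the largest Value2 rather than the Value2 of the latest batch, so it only matches when Value2 always increases with BatchID, as in the sample data):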
select t1.*,t2.Value2
from(
select Table1ID,max(Value2) as Value2
from [Table 2]
group by Table1ID) t2
join [Table 1] t1 on t2.Table1ID = t1.id
Either GROUP BY or WHERE clause that filters on the most recent:
SELECT * FROM Table1 a
INNER JOIN Table2 b ON (a.id = b.Table1ID)
WHERE NOT EXISTS(
SELECT 1 FROM Table2 c WHERE c.Table1ID = a.id AND c.BatchID > b.BatchID
)
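Another option that is often suggested for "latest row per parent" in SQL Server is CROSS APPLY with TOP (1). The sketch below runs against the question's sample data in table variables (variable names and the alias latest are illustrative). Whether it beats the other answers on a big dataset is something to measure; an index on Table2 (Table1ID, BatchID) would likely help any of these queries, but that is an assumption about your schema.

```sql
-- Sample data from the question.
DECLARE @Table1 TABLE (ID INT PRIMARY KEY, Value1 CHAR(1));
DECLARE @Table2 TABLE (BatchID INT, Table1ID INT, Value2 INT);

INSERT INTO @Table1 (ID, Value1) VALUES (1, 'A'), (2, 'B');
INSERT INTO @Table2 (BatchID, Table1ID, Value2) VALUES
    (1, 1, 100), (2, 1, 101), (3, 1, 102),
    (1, 2, 200), (2, 2, 201);

-- For each Table1 row, fetch only its highest-BatchID row from Table2.
SELECT t1.ID, t1.Value1, latest.Value2
FROM @Table1 t1
CROSS APPLY (
    SELECT TOP (1) t2.Value2
    FROM @Table2 t2
    WHERE t2.Table1ID = t1.ID
    ORDER BY t2.BatchID DESC
) latest
ORDER BY t1.ID;
-- Rows for the sample data:
-- 1  A  102
-- 2  B  201
```

CROSS APPLY drops Table1 rows that have no match in Table2; use OUTER APPLY if those rows should be kept with a NULL Value2.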
