SQL statement - join based on date - sql-server

I need to write a statement joining two tables based on dates.
Table 1 contains time recording entries.
+----+-----------+--------+---------------+
| ID | Date | UserID | DESC |
+----+-----------+--------+---------------+
| 1 | 1.10.2010 | 5 | did some work |
| 2 | 1.10.2011 | 5 | did more work |
| 3 | 1.10.2012 | 4 | me too |
| 4 | 1.11.2012 | 4 | me too |
+----+-----------+--------+---------------+
Table 2 contains the position of each user in the company. The ValidFrom date is the date at which the user has been or will be promoted.
+----+-----------+--------+------------+
| ID | ValidFrom | UserID | Pos |
+----+-----------+--------+------------+
| 1 | 1.10.2009 | 5 | PM |
| 2 | 1.5.2010 | 5 | Senior PM |
| 3 | 1.10.2010 | 4 | Consultant |
+----+-----------+--------+------------+
I need a query which outputs table one with one added column: the position of the user at the time the entry was made (the Date column).
All date fields are of type date.
I hope someone can help. I tried a lot but can't get it working.

Try this using a subselect in the where clause:
MS SQL Server 2008 Schema Setup:
CREATE TABLE TimeRecord
(
ID INT,
[Date] Date,
UserID INT,
Description VARCHAR(50)
)
INSERT INTO TimeRecord
VALUES (1,'2010-01-10',5,'did some work'),
(2, '2011-01-10',5,'did more work'),
(3, '2012-01-10', 4, 'me too'),
(4, '2012-11-01',4,'me too')
CREATE TABLE UserPosition
(
ID Int,
ValidFrom Date,
UserId INT,
Pos VARCHAR(50)
)
INSERT INTO UserPosition
VALUES (1, '2009-01-10', 5, 'PM'),
(2, '2010-05-01', 5, 'Senior PM'),
(3, '2010-01-10', 4, 'Consultant ')
Query 1:
SELECT TR.ID,
TR.[Date],
TR.UserId,
TR.Description,
UP.Pos
FROM TimeRecord TR
INNER JOIN UserPosition UP
ON UP.UserId = TR.UserId
WHERE UP.ValidFrom = (SELECT MAX(ValidFrom)
FROM UserPosition UP2
WHERE UP2.UserId = UP.UserID AND
UP2.ValidFrom <= TR.[Date])
Results:
| ID | Date | UserId | Description | Pos |
|----|------------|--------|---------------|-------------|
| 1 | 2010-01-10 | 5 | did some work | PM |
| 2 | 2011-01-10 | 5 | did more work | Senior PM |
| 3 | 2012-01-10 | 4 | me too | Consultant |
| 4 | 2012-11-01 | 4 | me too | Consultant |

You can do it using OUTER APPLY:
SELECT ID, [Date], UserID, [DESC], x.Pos
FROM table1 AS t1
OUTER APPLY (
SELECT TOP 1 Pos
FROM table2 AS t2
WHERE t2.UserID = t1.UserID AND t2.ValidFrom <= t1.[Date]
ORDER BY t2.ValidFrom DESC) AS x(Pos)
For every row of table1, the OUTER APPLY operation fetches all table2 rows of the same user whose ValidFrom date is on or before [Date]. These rows are sorted by ValidFrom in descending order, and only the most recent one is returned.
Note: If no match is found by the OUTER APPLY sub-query then a NULL value is returned, meaning that no valid position exists in table2 for the corresponding record in table1.
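To see the NULL case from the note in action, here is a minimal self-contained sketch. The table variables @TimeRecord and @UserPosition and the extra entry with ID 5 are invented for illustration, and the date literals assume the question's values are written as day.month.year.

```sql
-- Hypothetical table variables standing in for table1 and table2.
DECLARE @TimeRecord TABLE (ID INT, [Date] DATE, UserID INT, [DESC] VARCHAR(50));
DECLARE @UserPosition TABLE (ID INT, ValidFrom DATE, UserID INT, Pos VARCHAR(50));

INSERT INTO @TimeRecord VALUES
    (1, '20101001', 5, 'did some work'),
    (2, '20111001', 5, 'did more work'),
    (3, '20121001', 4, 'me too'),
    (4, '20121101', 4, 'me too'),
    (5, '20090101', 5, 'before any position'); -- invented row: predates every ValidFrom

INSERT INTO @UserPosition VALUES
    (1, '20091001', 5, 'PM'),
    (2, '20100501', 5, 'Senior PM'),
    (3, '20101001', 4, 'Consultant');

SELECT t1.ID, t1.[Date], t1.UserID, t1.[DESC], x.Pos
FROM @TimeRecord AS t1
OUTER APPLY (
    -- most recent position that was already valid on the entry date
    SELECT TOP 1 t2.Pos
    FROM @UserPosition AS t2
    WHERE t2.UserID = t1.UserID AND t2.ValidFrom <= t1.[Date]
    ORDER BY t2.ValidFrom DESC) AS x
ORDER BY t1.ID;
-- Entry 5 comes back with Pos = NULL; with CROSS APPLY it would be dropped instead.
```

Whether a NULL row or a missing row is preferable depends on how the report should treat entries that predate the first recorded position.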

This works by using a rank function and subquery. I tested it with some sample data.
select sub.ID, sub.[Date], sub.UserID, sub.Description, sub.Position
from (
    -- rank the candidate positions of each time entry, newest first
    select rank() over (partition by t1.ID order by t2.validfrom desc) as rnk,
           t1.ID as ID, t1.[Date] as [Date], t1.UserID as UserID, t1.Descr as Description,
           t2.pos as Position, t2.validfrom as validfrom
    from temployee t1
    inner join jobs t2 -- replace join tables with your own table names
        on t1.UserID = t2.UserID
       and t2.validfrom <= t1.[Date] -- only positions already in effect on the entry date
) as sub
where sub.rnk = 1

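This query works only when an entry's Date is exactly equal to a ValidFrom date; for every other entry Pos comes back as NULL:
select t1.*, t2.Pos
from Table1 t1
left outer join Table2 t2
    on t1.Date = t2.ValidFrom and t1.UserID = t2.UserID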

Related

Join created table under condition

I am writing a query to join two different tables under a certain condition. The tables look like this
(TABLE 2)
date | deal_code | originator | servicer | random |
-----------------------------------------------------
2011 | 001 | commerzbank | SPV1 | 1 |
2012 | 001 | commerzbank | SPV1 | 12 |
2013 | 001 | commerzbank | SPV1 | 7 |
2013 | 005 | unicredit | SPV2 | 7 |
and another table
(TABLE 1)
date | deal_code | amount |
---------------------------
2011 | 001 | 100 |
2012 | 001 | 100 |
2013 | 001 | 100 |
2013 | 005 | 200 |
I would like to have this as the final result
date | deal_code | amount | originator | servicer | random |
--------------------------------------------------------------
2013 | 001 | 100 | commerzbank | SPV1 | 7 |
2013 | 005 | 200 | unicredit | SPV2 | 7 |
I created the following code
select q1.deal_code, q1.date
from table1 q1
where q1.date = (SELECT MAX(t4.date)
FROM table1 t4
WHERE t4.deal_code = q1.deal_code)
that gives me:
(TABLE 3)
date | deal_code | amount |
---------------------------
2013 | 001 | 100 |
2013 | 005 | 200 |
That is the latest observation for table 1, now I would like to have the originator and servicer information given the deal_code and date. Any suggestion? I hope to have been clear enough. Thanks.
This should do what you are looking for. Please be careful when naming columns. Date is a reserved word and is too ambiguous to be a good name for a column.
declare @Something table
(
    SomeDate int
    , deal_code char(3)
    , originator varchar(20)
    , servicer char(4)
    , random int
)
insert @Something values
(2011, '001', 'commerzbank', 'SPV1', 1)
, (2012, '001', 'commerzbank', 'SPV1', 12)
, (2013, '001', 'commerzbank', 'SPV1', 7)
, (2013, '005', 'unicredit', 'SPV2', 7)
declare @SomethingElse table
(
    SomeDate int
    , deal_code char(3)
    , amount int
)
insert @SomethingElse values
(2011, '001', 100)
, (2012, '001', 100)
, (2013, '001', 100)
, (2013, '005', 200)
select x.SomeDate
    , x.deal_code
    , x.originator
    , x.servicer
    , x.random
    , x.amount
from
(
    select s.SomeDate
        , s.deal_code
        , s.originator
        , s.servicer
        , s.random
        , se.amount
        , RowNum = ROW_NUMBER() over (partition by s.deal_code order by s.SomeDate desc) -- 1 = latest row per deal
    from @Something s
    join @SomethingElse se on se.SomeDate = s.SomeDate and se.deal_code = s.deal_code
) x
where x.RowNum = 1
Looks like this would work:
DECLARE @MaxYear INT;
SELECT @MaxYear = MAX(t1.date) -- qualify the column: both tables have a date column
FROM table1 AS t1
INNER JOIN table2 AS t2
    ON t1.deal_code = t2.deal_code;
SELECT t1.date,
    t1.deal_code,
    t1.amount,
    t2.originator,
    t2.servicer,
    t2.random
FROM table1 AS t1
INNER JOIN table2 AS t2
    ON t1.deal_code = t2.deal_code
    AND t1.date = t2.date -- match the same year in both tables
WHERE t1.date = @MaxYear;
I agree with Sean Lange about the date column name. His method gets around the dependency on the correlated sub-query, but at the heart of things, you really just need to add an INNER JOIN to your existing query in order to get the amount column into your result set.
select
q2.date,
q2.deal_code,
q1.amount,
q2.originator,
q2.servicer,
q2.random
from
table1 q1
join
table2 q2
on q1.date = q2.date
and q1.deal_code = q2.deal_code
where q1.date = (SELECT MAX(t4.date)
FROM table1 t4
WHERE t4.deal_code = q1.deal_code)

SQL Server - How to join with max value from second table and apply specific condition

I have two tables:
Weeks
| WeekID | StartDate |
| 1 | 2016-12-25 |
| 2 | 2017-01-01 |
| 3 | 2017-01-08 |
and Settings
| ID | SettingVal | ApplyFrom |
| 1 | 10 | 2016-06-01 |
| 2 | 13 | 2017-01-01 |
| 3 | 5 | 2017-01-02 |
For each WeekID, I need to select the SettingVal of the row with the latest ApplyFrom, considering only rows where ApplyFrom <= DATEADD(day, 6, StartDate) from table Weeks. For example:
| WeekID | SettingVal |
| 1 | 10 |
| 2 | 5 |
| 3 | 5 |
When I write the following query:
SELECT t1.WeekID, t2.SettingVal
FROM Weeks t1
LEFT OUTER JOIN Settings t2 ON t2.ApplyFrom <= DATEADD(day, 6, t1.StartDate)
it joins one row from first table with multiple rows from second table. How do I join only with a row having MAX(ApplyFrom), and select the SettingVal column I need?
Option 1 - WITH TIES
Select Top 1 with ties
A.WeekID
,B.SettingVal
From Weeks A
Left Join Settings B
on B.ApplyFrom<=DateAdd(DAY,6,A.StartDate)
Order By Row_Number() over (Partition By A.WeekID Order by B.ApplyFrom Desc)
Option 2 - Cross Apply
Select A.WeekID
,B.SettingVal
From Weeks A
Cross Apply (
Select Top 1 SettingVal
From Settings
Where ApplyFrom<=DateAdd(DAY,6,A.StartDate)
Order By ApplyFrom Desc
) B
Both Return
WeekID SettingVal
1 10
2 5
3 5
You can try using a query like this
SELECT
    t1.WeekID,
    t2.SettingVal
FROM Weeks t1
LEFT OUTER JOIN
(
    -- latest ApplyFrom on or before the last day of each week
    SELECT
        t3.WeekID,
        MAX(s.ApplyFrom) ApplyFrom
    FROM Weeks t3
    LEFT OUTER JOIN Settings s ON s.ApplyFrom <= DATEADD(day, 6, t3.StartDate)
    GROUP BY t3.WeekID
) t4
    ON t4.WeekID = t1.WeekID
LEFT OUTER JOIN Settings t2 ON t4.ApplyFrom = t2.ApplyFrom

SQL: How to join two tables by their date ranges

I have a table with a history of assigning an Employee Type to a Work item, as follows:
| WorkItemID | EmployeeTypeID | ValidFrom | ValidTo |
| 1 | 1 | 2017-03-01 12:19:20.000 | 2017-03-05 14:11:20.000 |
| 1 | 1 | 2017-03-10 17:00:20.000 | NULL |
| 1 | 2 | 2017-05-12 12:19:20.000 | 2017-05-29 14:11:20.000 |
| 1 | 2 | 2017-07-01 12:19:20.000 | NULL |
| 2 | 1 | 2017-01-01 15:19:20.000 | 2017-03-01 11:29:20.000 |
| 2 | 1 | 2017-04-03 16:19:20.000 | NULL |
NULL means that there's no End date for the last assignment and it is still valid.
I also have a table with a history of assigning an Employee Type to an Employee:
| EmployeeID | EmployeeTypeID | ValidFrom | ValidTo |
| 1 | 1 | 2017-01-01 12:19:20.000 | 2017-03-05 14:11:20.000 |
| 1 | 2 | 2017-03-05 14:11:20.000 | NULL |
| 2 | 1 | 2016-05-05 15:19:20.000 | 2017-03-01 11:29:20.000 |
| 2 | 2 | 2017-03-01 11:29:20.000 | NULL |
For a given EmployeeID and WorkItemID, I need to select a minimum date within these date ranges where their EmployeeTypeID matched (if there is any).
For example, for EmployeeID = 1 And WorkItemID = 1 the minimum date when their Employeetypes matched is 2017-03-01 (disregard the time part).
How do I write an SQL query to join these two tables correctly and select the desired date?
The following approach appeared to be correct to me:
First, I select the minimum ValidFrom from table 1 among the rows that match table 2 on EmployeeTypeID and whose date ranges overlap:
DECLARE @MinDate1 datetime
DECLARE @MinDate2 datetime
SELECT @MinDate1 =
    (SELECT MIN(t1.ValidFrom)
     FROM Table1 t1
     JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
     WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
       AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
       AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL))
Then I select the minimum ValidFrom from table 2 under the same conditions:
SELECT @MinDate2 =
    (SELECT MIN(t2.ValidFrom)
     FROM Table1 t1
     JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
     WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
       AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
       AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL))
And finally, I select the later of the two, which would be the earliest date on which the two ranges actually overlap and have the same EmployeeTypeID:
SELECT CASE WHEN @MinDate1 > @MinDate2 THEN @MinDate1 ELSE @MinDate2 END AS MinOverlapDate
The output would be:
| MinOverlapDate |
| 2017-03-01 12:19:20.000 |
So it should be something like this:
SELECT MIN(Date)
FROM table1 t1
JOIN table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = givenValue AND t2.EmployeeID = givenValue
But again, if you don't know which table the result should come from, you can't write a query for that.
What you should do is use at least 3 tables, or maybe more:
One table would contain the employee information.
A second would contain the items, jobs, dates, and whatever else is connected to the work.
A third would hold the connection between them (Emp 1 has Work 2, Emp 2 has Work 4, and so on).
You CANNOT have the same values in two tables without knowing which one you want to get the data from!
Or you can put it all into one table.
Columns: WorkItem | EmployeeID | EmployeeType | Date | Date
Actually, my variant still does not work correctly. @MinDate1 and @MinDate2 should be compared per EmployeeTypeID, one matching pair of rows at a time; above, they were computed independently of each other.
Here is the correct variant of solving this problem:
SELECT MIN(CASE WHEN t1.ValidFrom > t2.ValidFrom THEN t1.ValidFrom ELSE t2.ValidFrom END) AS MinOverlapDate
FROM Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
WHERE t1.WorkItemID = 1 AND t2.EmployeeID = 1
AND (t1.ValidFrom <= t2.ValidTo OR t2.ValidTo IS NULL)
AND (t1.ValidTo >= t2.ValidFrom OR t1.ValidTo IS NULL)
Don't use >=, <=, = or BETWEEN when comparing datetime fields if you only care about the date, since all of these operators compare the time part as well. Use DATEDIFF to compare at the smallest interval you need:
select
    Min_Overlap_Per_Section = (select MAX(ValidFrom)
        FROM (VALUES (t1.ValidFrom), (t2.ValidFrom)) as ValidFrom(ValidFrom))
    , Section_From = (select MAX(ValidFrom)
        FROM (VALUES (t1.ValidFrom), (t2.ValidFrom)) as ValidFrom(ValidFrom))
    , Section_To = (select MIN(ValidTo)
        FROM (VALUES (t1.ValidTo), (t2.ValidTo)) as ValidTo(ValidTo))
from Table1 t1
JOIN Table2 t2 ON t1.EmployeeTypeID = t2.EmployeeTypeID
where (
    datediff(day, t1.ValidFrom, t2.ValidTo) >= 0
    or t2.ValidTo IS NULL
)
and (
    datediff(day, t2.ValidFrom, t1.ValidTo) >= 0
    or t1.ValidTo IS NULL
)
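Since the question asks to disregard the time part only in the result, another option is to keep the plain datetime comparisons for the overlap test and cast just the final value to date. The sketch below assumes two table variables, @WorkItemType and @EmployeeType (hypothetical names), loaded with the question's rows for WorkItemID 1 and EmployeeID 1.

```sql
-- Hypothetical table variables holding the question's sample rows.
DECLARE @WorkItemType TABLE (WorkItemID INT, EmployeeTypeID INT, ValidFrom DATETIME, ValidTo DATETIME);
DECLARE @EmployeeType TABLE (EmployeeID INT, EmployeeTypeID INT, ValidFrom DATETIME, ValidTo DATETIME);

INSERT INTO @WorkItemType VALUES
    (1, 1, '2017-03-01T12:19:20', '2017-03-05T14:11:20'),
    (1, 1, '2017-03-10T17:00:20', NULL),
    (1, 2, '2017-05-12T12:19:20', '2017-05-29T14:11:20'),
    (1, 2, '2017-07-01T12:19:20', NULL);

INSERT INTO @EmployeeType VALUES
    (1, 1, '2017-01-01T12:19:20', '2017-03-05T14:11:20'),
    (1, 2, '2017-03-05T14:11:20', NULL);

-- For each overlapping pair the overlap starts at the later ValidFrom;
-- the earliest such start is the answer, reduced to a date at the very end.
SELECT CAST(MIN(CASE WHEN w.ValidFrom > e.ValidFrom THEN w.ValidFrom ELSE e.ValidFrom END) AS DATE) AS MinOverlapDate
FROM @WorkItemType AS w
JOIN @EmployeeType AS e ON e.EmployeeTypeID = w.EmployeeTypeID
WHERE w.WorkItemID = 1 AND e.EmployeeID = 1
  AND (w.ValidFrom <= e.ValidTo OR e.ValidTo IS NULL)
  AND (w.ValidTo >= e.ValidFrom OR w.ValidTo IS NULL);
-- Returns 2017-03-01 for this data.
```

Comparing the raw datetime values also keeps the predicates simple enough for an index on ValidFrom or ValidTo to be usable, which wrapping the columns in DATEDIFF generally prevents.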

How to (Dirty) Pair DateTimes Across Two Tables

I am looking at a SQL Server 2008 Database with two Tables, each with a PK (INT) column and a DateTime column.
There is no explicit relationship between the Tables, except I know the application has a heuristic tendency to insert to the database in pairs, one row into each Table, with DateTimes that seem to never match exactly but are usually pretty close.
I am trying to match back up the PKs in each table by finding the closest matching DateTime in the other table. Each PK can only be used once for this matching.
What is the best way to do this?
EDIT: Sorry, please find at bottom some example input and desired output.
+-------+-------------------------+
| t1.PK | t1.DateTime |
+-------+-------------------------+
| 1 | 2016-08-11 00:11:03.000 |
| 2 | 2016-08-11 00:11:08.000 |
| 3 | 2016-08-11 11:03:00.000 |
| 4 | 2016-08-11 11:08:00.000 |
+-------+-------------------------+
+-------+-------------------------+
| t2.PK | t2.DateTime |
+-------+-------------------------+
| 1 | 2016-08-11 11:02:00.000 |
| 2 | 2016-08-11 00:11:02.000 |
| 3 | 2016-08-11 22:00:00.000 |
| 4 | 2016-08-11 11:07:00.000 |
| 5 | 2016-08-11 00:11:07.000 |
+-------+-------------------------+
+-------+-------+-------------------------+-------------------------+
| t1.PK | t2.PK | t1.DateTime | t2.DateTime |
+-------+-------+-------------------------+-------------------------+
| 1 | 2 | 2016-08-11 00:11:03.000 | 2016-08-11 00:11:02.000 |
| 2 | 5 | 2016-08-11 00:11:08.000 | 2016-08-11 00:11:07.000 |
| 3 | 1 | 2016-08-11 11:03:00.000 | 2016-08-11 11:02:00.000 |
| 4 | 4 | 2016-08-11 11:08:00.000 | 2016-08-11 11:07:00.000 |
+-------+-------+-------------------------+-------------------------+
JOIN to the row with lowest DATEDIFF (in seconds) between t1.DateTime and t2.DateTime.
You can achieve the result you are looking for by cross joining table 1 with table 2 and then getting the difference between the dates in seconds, as per Tab Alleman's suggestion. The next step is to rank each match using the ROW_NUMBER() function. The final step is to select only the rows where Rank = 1.
The following example demonstrates using your example data:
DECLARE @t1 TABLE
(
    ID INT PRIMARY KEY
    ,[DateTime] DATETIME
);
DECLARE @t2 TABLE
(
    ID INT PRIMARY KEY
    ,[DateTime] DATETIME
);
INSERT INTO @t1
(
    ID
    ,[DateTime]
)
VALUES
(1 ,'2016-08-11 00:11:03.000'),
(2 ,'2016-08-11 00:11:08.000'),
(3 ,'2016-08-11 11:03:00.000'),
(4 ,'2016-08-11 11:08:00.000');
INSERT INTO @t2
(
    ID
    ,[DateTime]
)
VALUES
(1, '2016-08-11 11:02:00.000'),
(2, '2016-08-11 00:11:02.000'),
(3, '2016-08-11 22:00:00.000'),
(4, '2016-08-11 11:07:00.000'),
(5, '2016-08-11 00:11:07.000');
WITH CTE_DateDifference
AS
(
    SELECT t1.ID AS T1_ID
        ,t2.ID AS T2_ID
        ,t1.[DateTime] AS T1_DateTime
        ,t2.[DateTime] AS T2_DateTime
        ,ABS(DATEDIFF(SECOND, t1.[DateTime], t2.[DateTime])) AS Duration -- Determine the difference between the dates in seconds.
    FROM @t1 t1
    CROSS JOIN @t2 t2
),CTE_RankDateMatch
AS
(
    SELECT T1_ID
        ,T2_ID
        ,T1_DateTime
        ,T2_DateTime
        ,ROW_NUMBER() OVER (PARTITION BY T1_ID ORDER BY Duration) AS [Rank] -- Rank each match; the row numbers are ordered by the duration between the dates, so rows numbered 1 are the closest match between the two tables.
    FROM CTE_DateDifference
)
-- Finally select out the rows with a Rank equal to 1.
SELECT *
FROM CTE_RankDateMatch
WHERE [Rank] = 1
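Ranking per t1 row alone does not enforce the requirement that each PK be used only once, because two t1 rows could pick the same t2 row. One possible way to guarantee it, sketched here as an assumption rather than as part of the answer above, is to rank in both directions and keep only the pairs that are each other's closest match. The column names RankForT1 and RankForT2 are made up for the example.

```sql
DECLARE @t1 TABLE (ID INT PRIMARY KEY, [DateTime] DATETIME);
DECLARE @t2 TABLE (ID INT PRIMARY KEY, [DateTime] DATETIME);

INSERT INTO @t1 VALUES
    (1, '2016-08-11T00:11:03'), (2, '2016-08-11T00:11:08'),
    (3, '2016-08-11T11:03:00'), (4, '2016-08-11T11:08:00');
INSERT INTO @t2 VALUES
    (1, '2016-08-11T11:02:00'), (2, '2016-08-11T00:11:02'),
    (3, '2016-08-11T22:00:00'), (4, '2016-08-11T11:07:00'),
    (5, '2016-08-11T00:11:07');

WITH Ranked AS
(
    SELECT t1.ID AS T1_ID
          ,t2.ID AS T2_ID
          ,d.Duration
          -- Closest t2 row for each t1 row (IDs break ties deterministically).
          ,ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY d.Duration, t2.ID) AS RankForT1
          -- Closest t1 row for each t2 row.
          ,ROW_NUMBER() OVER (PARTITION BY t2.ID ORDER BY d.Duration, t1.ID) AS RankForT2
    FROM @t1 AS t1
    CROSS JOIN @t2 AS t2
    CROSS APPLY (SELECT ABS(DATEDIFF(SECOND, t1.[DateTime], t2.[DateTime])) AS Duration) AS d
)
SELECT T1_ID, T2_ID, Duration
FROM Ranked
WHERE RankForT1 = 1 AND RankForT2 = 1 -- keep only pairs that pick each other
ORDER BY T1_ID;
-- For the sample data this yields the pairs (1,2), (2,5), (3,1), (4,4).
```

The trade-off is that a row whose closest partner prefers a different row stays unmatched, so a second pass over the leftovers may be needed. The cross join also grows with the product of the two table sizes.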

Split table in two tables plus a link table

I have a table with three columns that contain duplicate values, but no duplicate rows. Now I want to split this table into two tables with unique values plus a link table. I think the problem gets clearer when I show you example tables:
Original:
| ID | Column_1 | Column_2 | Column_3 |
|----|----------|----------|----------|
| 1 | A | 123 | A1 |
| 2 | A | 123 | A2 |
| 3 | B | 234 | A2 |
| 4 | C | 456 | A1 |
Table_1
| ID | Column_1 | Column_2 |
|----|----------|----------|
| 1 | A | 123 |
| 2 | B | 234 |
| 3 | C | 456 |
Table_2
| ID | Column_3 |
|----|----------|
| 1 | A1 |
| 2 | A2 |
Link-Table
| ID | fk1 | fk2 |
|----|-----|-----|
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
| 4 | 3 | 1 |
Table_1 I created like this:
INSERT INTO Table_1(Column_1, Column_2)
SELECT DISTINCT Column_1, Column_2 FROM Original
WHERE Original.Column_1 NOT IN (SELECT Column_1 FROM Table_1)
Table_2 I created in the same way.
The question now is, how to create the Link-Table?
The original table grows continuously, so only new entries should be added.
Do I have to use a Cursor, or is there a better way?
SOLUTION:
MERGE Link_Table AS LT
USING (SELECT DISTINCT T1.ID AS T1ID, T2.ID AS T2ID FROM Original AS O
INNER JOIN Table_1 AS T1 ON T1.Column_1 = O.Column_1
INNER JOIN Table_2 AS T2 ON T2.Column_3 = O.Column_3) AS U
ON LT.fk1 = U.T1ID AND LT.fk2 = U.T2ID
WHEN NOT MATCHED THEN
INSERT (fk1, fk2)
VALUES (U.T1ID, U.T2ID);
You can JOIN all 3 tables to get proper data for link table:
--INSERT INTO [Link-Table]
SELECT t1.ID,
t2.ID
FROM Original o
INNER JOIN Table_1 t1
ON t1.Column_1 = o.Column_1
INNER JOIN Table_2 t2
ON t2.Column_3 = o.Column_3
If your original table will grow, then you need to use MERGE to update/insert new data.
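For an insert-only load, a plain INSERT guarded by NOT EXISTS is a possible alternative to MERGE. The following is a minimal sketch under the assumption that a link row is identified by the pair (fk1, fk2). The table variables, including @Link with an IDENTITY column, are stand-ins invented for the example.

```sql
-- Hypothetical table variables mirroring the question's tables.
DECLARE @Original TABLE (ID INT, Column_1 VARCHAR(10), Column_2 INT, Column_3 VARCHAR(10));
DECLARE @Table_1 TABLE (ID INT, Column_1 VARCHAR(10), Column_2 INT);
DECLARE @Table_2 TABLE (ID INT, Column_3 VARCHAR(10));
DECLARE @Link TABLE (ID INT IDENTITY(1,1), fk1 INT, fk2 INT);

INSERT INTO @Original VALUES (1,'A',123,'A1'), (2,'A',123,'A2'), (3,'B',234,'A2'), (4,'C',456,'A1');
INSERT INTO @Table_1 VALUES (1,'A',123), (2,'B',234), (3,'C',456);
INSERT INTO @Table_2 VALUES (1,'A1'), (2,'A2');

-- Insert only the (fk1, fk2) pairs that are not in the link table yet.
INSERT INTO @Link (fk1, fk2)
SELECT DISTINCT t1.ID, t2.ID
FROM @Original AS o
INNER JOIN @Table_1 AS t1 ON t1.Column_1 = o.Column_1 AND t1.Column_2 = o.Column_2
INNER JOIN @Table_2 AS t2 ON t2.Column_3 = o.Column_3
WHERE NOT EXISTS (SELECT 1
                  FROM @Link AS l
                  WHERE l.fk1 = t1.ID AND l.fk2 = t2.ID);

SELECT fk1, fk2 FROM @Link ORDER BY fk1, fk2;
-- Pairs for the sample data: (1,1), (1,2), (2,2), (3,1).
```

Running the INSERT a second time adds nothing, because every pair already exists. A unique constraint on (fk1, fk2) in the real link table would additionally protect against concurrent loads.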
You have to inner join your Original, Table_1 and Table_2 tables to get the desired result.
Try it like this; it's similar to gofr1's post.
DECLARE @orginal TABLE (
    ID INT
    ,Column_1 VARCHAR(10)
    ,Column_2 INT
    ,Column_3 VARCHAR(10)
)
DECLARE @Table_1 TABLE (
    ID INT
    ,Column_1 VARCHAR(10)
    ,Column_2 INT
)
DECLARE @Table_2 TABLE (
    ID INT
    ,Column_3 VARCHAR(10)
)
Insert into @orginal values
(1,'A',123,'A1')
,(2,'A',123,'A2')
,(3,'B',234,'A2')
,(4,'C',456,'A1')
Insert into @Table_1 values
(1,'A',123)
,(2,'B',234)
,(3,'C',456)
Insert into @Table_2 values
(1,'A1')
,(2,'A2')
SELECT O.ID
    ,T1.ID
    ,T2.ID
FROM @orginal O
INNER JOIN @Table_1 T1 ON T1.Column_1 = O.Column_1
INNER JOIN @Table_2 T2 ON T2.Column_3 = O.Column_3
