Join two tables depending on date sequence - sql-server

In SQL Server 2008, I want to join two tables depending on date sequence. More specifically, I need to left join Payments table to Profiles table by the following rules:
UserId has to be matched.
Every record in Payments matches the record in Profiles with the closest Profiles.CreationDate before Payments.PayDate.
For a simplified example,
Table Payments:
UserId PayDate Amount
1 2012 400
1 2010 500
2 2014 600
Table Profiles:
UserId CreationDate Address
1 2009 NY
1 2015 MD
2 2007 NJ
2 2013 MA
3 2008 TX
Desired Result:
UserId CreationDate PayDate Amount Address
1 2009 2010 500 NY
1 2009 2012 400 NY
2 2013 2014 600 MA
It's guaranteed that a user has at least one Profiles record before he pays. Another restriction is that I am not authorized to write anything into the database.
My idea is to first left join Payments with Profiles, then within the record group matching each (UserId, PayDate) tuple, sort it by CreationDate, then select the last record. But I don't know how to implement it in SQL. Or are there any better ways to do this merge?

Use OUTER APPLY to do this:
SELECT py.UserId,
       cs.CreationDate,
       py.PayDate,
       py.Amount,
       cs.Address
FROM   Payments py
       OUTER APPLY (SELECT TOP 1 *
                    FROM   Profiles pr
                    WHERE  py.UserId = pr.UserId
                           AND py.PayDate > pr.CreationDate
                    ORDER  BY pr.CreationDate DESC) cs
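To check the matching rule against the sample data from the question, here is a minimal sketch in Python with an in-memory SQLite database. SQLite stands in for SQL Server purely for illustration and has no OUTER APPLY, so the sketch swaps in a LEFT JOIN on a correlated MAX(CreationDate) subquery. The function name match_payments_to_profiles is made up for this example.

```python
import sqlite3


def match_payments_to_profiles(conn):
    """Return each payment with the latest profile created before it."""
    sql = """
        SELECT py.UserId, pr.CreationDate, py.PayDate, py.Amount, pr.Address
        FROM Payments AS py
        LEFT JOIN Profiles AS pr
               ON pr.UserId = py.UserId
              -- keep only the closest profile strictly before the payment
              AND pr.CreationDate = (
                    SELECT MAX(p2.CreationDate)
                    FROM Profiles AS p2
                    WHERE p2.UserId = py.UserId
                      AND p2.CreationDate < py.PayDate)
        ORDER BY py.UserId, py.PayDate
    """
    return conn.execute(sql).fetchall()


# Throwaway in-memory database holding the simplified sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Payments (UserId INTEGER, PayDate INTEGER, Amount INTEGER);
    CREATE TABLE Profiles (UserId INTEGER, CreationDate INTEGER, Address TEXT);
    INSERT INTO Payments VALUES (1, 2012, 400), (1, 2010, 500), (2, 2014, 600);
    INSERT INTO Profiles VALUES (1, 2009, 'NY'), (1, 2015, 'MD'),
                                (2, 2007, 'NJ'), (2, 2013, 'MA'),
                                (3, 2008, 'TX');
""")

rows = match_payments_to_profiles(conn)
for row in rows:
    print(row)
```

The query itself only reads from the tables; the CREATE and INSERT statements exist just to set up the demonstration. If two profiles of one user could share the same CreationDate, this variant would return both, whereas TOP 1 returns only one.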

Related

In SQL Server, how do I get the date of an event from a table and the date the previous event happened?

I'm using Microsoft SQL Server 2016 (SP2) (13.0.5026.0, X64). I have a query that joins a people table with a contacts table. The people table has the client detail and the contacts table tells us the last time that client made a contact.
The below query links the tables and gives me the client ID and the date they made contact
SELECT
DP.dim_person_ID AS [Dim_Person_ID],
CAST(FC.CONTACT_DTTM AS date) AS [Contact Date]
FROM
Child_Social.Fact_Contacts FC
INNER JOIN
CHILD_SOCIAL.DIM_Person DP ON FC.dim_person_ID = DP.Dim_Person_ID
If a client made multiple contacts, the query returns a row for each one, and the output looks like this:
Person_ID Contact Date
1 01/01/2023
1 01/10/2022
1 01/07/2022
1 01/04/2022
1 01/01/2022
2 02/01/2023
2 02/10/2022
2 02/07/2022
2 02/04/2022
2 02/01/2022
What I'm trying to do is add a column to the query that shows the previous contact date as well, e.g. return output like this:
Person ID Contact Date Previous Contact Date
1 01/10/2022 01/07/2022
1 01/07/2022 01/04/2022
2 02/10/2022 02/07/2022
2 02/07/2022 02/04/2022
I'm unsure how to create a join/sub query that will calculate a previous episode (rather than a most recent episode or a first episode, just the previous one).
Any help or guidance gratefully received
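One common way to get "the previous row's value" is the LAG window function, which SQL Server has supported since 2012. The sketch below runs the idea in Python against an in-memory SQLite database (window functions need SQLite 3.25 or later). The table is a cut-down, hypothetical version of Fact_Contacts with made-up ISO-formatted dates, and contacts_with_previous is a name invented for this example.

```python
import sqlite3


def contacts_with_previous(conn):
    """Return every contact together with the same person's prior contact."""
    sql = """
        SELECT dim_person_ID,
               CONTACT_DTTM AS contact_date,
               -- LAG looks one row back within the same person, by date
               LAG(CONTACT_DTTM) OVER (
                   PARTITION BY dim_person_ID
                   ORDER BY CONTACT_DTTM) AS previous_contact_date
        FROM Fact_Contacts
        ORDER BY dim_person_ID, CONTACT_DTTM DESC
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample rows; dates are stored as ISO text so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fact_Contacts (dim_person_ID INTEGER, CONTACT_DTTM TEXT);
    INSERT INTO Fact_Contacts VALUES
        (1, '2022-01-01'), (1, '2022-04-01'), (1, '2022-07-01'),
        (2, '2022-01-02'), (2, '2022-04-02');
""")

result = contacts_with_previous(conn)
for row in result:
    print(row)
```

The earliest contact of each person has no predecessor, so its previous_contact_date is NULL (None in Python). Filter those rows out if only contacts with a predecessor are wanted.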

How to accumulate two datetime in two tables as VIEW in SQL Server 2014?

How to query to accumulate two datetime columns in two tables in SQL Server 2014? This is an example for your reference:
Check-In table
InID UserID CheckInTime
---------------------------------
IN-001 1 2018-11-10 08:00:00
IN-002 2 2018-11-15 07:00:00
Check-Out table
OutID UserID CheckOutTime
----------------------------------
OUT-001 1 2018-11-10 12:00:00
OUT-002 2 2018-11-15 14:00:00
Result set (expected)
ResultID UserID InID OutID WorkTimeinHour
--------------------------------------------------------
1 1 IN-001 OUT-001 4
2 2 IN-002 OUT-002 7
Similar to @PSK, I used the STUFF function to strip the "IN-" and "OUT-" prefixes.
But since these calls sit in the JOIN condition, they will hurt performance.
It is better to use a numeric column in both tables instead of string columns carrying the "IN-" and "OUT-" prefixes.
select
i.UserId, i.InID, CheckInTime, o.OutID, CheckOutTime,
dbo.fn_CreateTimeFromSeconds(DATEDIFF(ss, CheckInTime, CheckOutTime)) as TotalTime
from CheckIn i
inner join CheckOut o
on i.UserId = o.UserId and
STUFF (i.InID,1,3,'') = STUFF (o.OutID,1,4,'')
Additionally, I used a custom user-defined function, fn_CreateTimeFromSeconds, to format the duration as HH:MI:SS.
Hope it helps.
For your current scenario, you can try the following.
It assumes that the part of the IN and OUT ids after the "-" is the same for a matching pair of entries.
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ResultID,
       T1.userid,
       T1.inid,
       T2.outid,
       -- start time first, end time second, so the difference is positive
       DATEDIFF(hh, T1.checkintime, T2.checkouttime) AS WorkTimeinHour
FROM checkin T1
INNER JOIN checkout T2
    ON REPLACE(T1.inid, 'IN-', '') = REPLACE(T2.outid, 'OUT-', '')
This query will not perform well on large data sets because REPLACE is used in the JOIN condition. Ideally you should have a single identifier that ties the IN and OUT transactions together.
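As a rough illustration of the "single identifier" suggestion, the sketch below gives both tables a shared integer column so the join needs no string manipulation. The column name ShiftNo and the function work_hours are hypothetical, and SQLite (driven from Python) stands in for SQL Server, so the hour difference is computed from epoch seconds rather than with DATEDIFF.

```python
import sqlite3


def work_hours(conn):
    """Join check-ins to check-outs on a plain integer key."""
    sql = """
        SELECT i.ShiftNo,
               i.UserID,
               -- whole hours elapsed, via integer division of epoch seconds
               (CAST(strftime('%s', o.CheckOutTime) AS INTEGER)
                - CAST(strftime('%s', i.CheckInTime) AS INTEGER)) / 3600
                   AS WorkTimeinHour
        FROM CheckIn AS i
        JOIN CheckOut AS o
          ON o.ShiftNo = i.ShiftNo
         AND o.UserID = i.UserID
        ORDER BY i.ShiftNo
    """
    return conn.execute(sql).fetchall()


# ShiftNo is an assumed column that replaces the 'IN-001' / 'OUT-001' strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CheckIn  (ShiftNo INTEGER, UserID INTEGER, CheckInTime TEXT);
    CREATE TABLE CheckOut (ShiftNo INTEGER, UserID INTEGER, CheckOutTime TEXT);
    INSERT INTO CheckIn  VALUES (1, 1, '2018-11-10 08:00:00'),
                                (2, 2, '2018-11-15 07:00:00');
    INSERT INTO CheckOut VALUES (1, 1, '2018-11-10 12:00:00'),
                                (2, 2, '2018-11-15 14:00:00');
""")

hours = work_hours(conn)
for row in hours:
    print(row)
```

With a plain integer key on both sides, the database can use an ordinary index for the join instead of evaluating a string function on every row.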

Query Most Recent Records in MS Access Based on Date Provided in Form Field

Let me start by noting I have spent a few days searching through S.O. and have not been able to find a solution. I apologize in advance if the solution is very simple, but I am still learning and appreciate any help I can get.
I have a MS Access 2010 Database, and I am trying to create a set of queries to inform other forms and queries. There are two tables: Borrower Contact Info (BC_Info) and Basic Financial Indicators (BF_Indicators). Each month, I review and track key performance metrics of each borrower. I would like to create a query that supplies the most recent record based on a textbox input (Forms![Portfolio_Review Menu]!Text47).
Two considerations have separated this from other posts I have seen in the 'greatest-n-per-group' tag:
Not every borrower will have data for every month.
I need to be able to see back in time, i.e. if it is January 1, 2019 and I want to see the metrics as of July 31, 2017, I want to make sure I am only seeing data from before July 31, 2017 but as close to this date as possible.
Fields are as follows:
BC_Info
- BorrowerName
- PartnerID
BF_Indicators
- Fin_ID
- DateUpdated
The tables are connected by BorrowerName -- which is a unique naming convention used for the primary key of BC_Info.
What I currently have is:
SELECT BCI.BorrowerName, BCI.PartnerID, BFI.Fin_ID, BFI.DateUpdated
FROM ((BC_Info AS BCI
INNER JOIN BF_Indicators AS BFI
ON BFI.BorrowerName = BCI.BorrowerName)
INNER JOIN
(
SELECT Fin_ID, MAX(DateUpdated) AS MAX_DATE
FROM BF_Indicators
WHERE (DateUpdated <= Forms![Portfolio_Review Menu]!Text47 OR
Forms![Portfolio_Review Menu]!Text47 IS NULL)
GROUP BY Fin_ID
) AS Last_BF ON BFI.Fin_ID = Last_BF.Fin_ID AND
BFI.DateUpdated = Last_BF.MAX_DATE);
This gives me the fields I need, and will keep records out that are past the date given in the textbox, but will give all records from before the textbox input -- not just the most recent.
Results (Date Entered is 12/31/2018; MEHN-45543 is only Borrower with information later than 09/30/2018):
BorrowerName PartnerID Fin_ID DateUpdated
MEHN-45543 19 9 12/31/2018
ARYS-7940 5 10 9/30/2018
FINS-21032 12 11 9/30/2018
ELET-00934 9 12 9/30/2018
MEHN-45543 19 18 9/30/2018
Expected Results (Date Entered is 12/31/2018; MEHN-45543 is only Borrower with information later than 09/30/2018):
BorrowerName PartnerID Fin_ID DateUpdated
MEHN-45543 19 9 12/31/2018
ARYS-7940 5 10 9/30/2018
FINS-21032 12 11 9/30/2018
ELET-00934 9 12 9/30/2018
As mentioned, I am planning to use the results of this Query to generate further queries that use aggregated information from the Financial Indicators to determine portfolio quality at the time.
Please let me know if there is any other information I can provide. And again, thank you in advance.
Try joining BC_Info to a query that aggregates BF_Indicators on BorrowerName, not Fin_ID. Tested with literal date value:
SELECT BC_Info.*, MaxDate
FROM BC_Info
INNER JOIN
(SELECT BorrowerName, Max(DateUpdated) AS MaxDate
FROM BF_Indicators WHERE DateUpdated <=#12/31/2018# GROUP BY BorrowerName) AS Q1
ON BC_Info.BorrowerName=Q1.BorrowerName;
If you need to include Fin_ID in the results, then:
SELECT BC_Info.*, Fin_ID, DateUpdated FROM BC_Info
INNER JOIN
(SELECT * FROM BF_Indicators WHERE Fin_ID IN
(SELECT TOP 1 Fin_ID FROM BF_Indicators AS Dupe
WHERE Dupe.BorrowerName=BF_Indicators.BorrowerName AND DateUpdated<=#12/31/2018#
ORDER BY Dupe.DateUpdated DESC)
) AS Q1
ON BC_Info.BorrowerName = Q1.BorrowerName;
If you don't like TOP N, adjust your original query:
SELECT BCI.BorrowerName, BCI.PartnerID, BFI.Fin_ID, BFI.DateUpdated
FROM ((BC_Info AS BCI
INNER JOIN BF_Indicators AS BFI
ON BFI.BorrowerName = BCI.BorrowerName)
INNER JOIN
(
SELECT BorrowerName, MAX(DateUpdated) AS MAX_DATE
FROM BF_Indicators
WHERE (DateUpdated <= #12/31/2018#)
GROUP BY BorrowerName
) AS Last_BF ON BFI.BorrowerName = Last_BF.BorrowerName AND
BFI.DateUpdated = Last_BF.MAX_DATE);
And 1 more to think about:
SELECT BC_Info.PartnerID, BC_Info.BorrowerName, BF_Indicators.Fin_ID, BF_Indicators.DateUpdated
FROM BC_Info RIGHT JOIN BF_Indicators ON BC_Info.BorrowerName = BF_Indicators.BorrowerName
WHERE (((BF_Indicators.DateUpdated)=DMax("DateUpdated","BF_Indicators","BorrowerName='" & [BC_Info].[BorrowerName] & "' AND DateUpdated<=#12/31/2018#")));
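The "aggregate on BorrowerName, not Fin_ID" idea can be tried outside Access with the sketch below, which uses Python and an in-memory SQLite database. The as-of date is passed as a query parameter in place of the form textbox. Dates are stored as ISO text (an assumption made so that string comparison orders them correctly), and latest_indicators is a name invented for this example.

```python
import sqlite3


def latest_indicators(conn, as_of):
    """Return each borrower's most recent indicator row on or before as_of."""
    sql = """
        SELECT bci.BorrowerName, bci.PartnerID, bfi.Fin_ID, bfi.DateUpdated
        FROM BC_Info AS bci
        JOIN BF_Indicators AS bfi
          ON bfi.BorrowerName = bci.BorrowerName
        JOIN (
            -- one row per borrower: the latest date not after the cutoff
            SELECT BorrowerName, MAX(DateUpdated) AS MaxDate
            FROM BF_Indicators
            WHERE DateUpdated <= ?
            GROUP BY BorrowerName
        ) AS last_bf
          ON last_bf.BorrowerName = bfi.BorrowerName
         AND last_bf.MaxDate = bfi.DateUpdated
        ORDER BY bfi.Fin_ID
    """
    return conn.execute(sql, (as_of,)).fetchall()


# Sample rows modelled on the question's result tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BC_Info (BorrowerName TEXT PRIMARY KEY, PartnerID INTEGER);
    CREATE TABLE BF_Indicators (Fin_ID INTEGER, BorrowerName TEXT,
                                DateUpdated TEXT);
    INSERT INTO BC_Info VALUES ('MEHN-45543', 19), ('ARYS-7940', 5),
                               ('FINS-21032', 12), ('ELET-00934', 9);
    INSERT INTO BF_Indicators VALUES
        (9,  'MEHN-45543', '2018-12-31'),
        (10, 'ARYS-7940',  '2018-09-30'),
        (11, 'FINS-21032', '2018-09-30'),
        (12, 'ELET-00934', '2018-09-30'),
        (18, 'MEHN-45543', '2018-09-30');
""")

end_of_year = latest_indicators(conn, '2018-12-31')
end_of_q3 = latest_indicators(conn, '2018-09-30')
print(end_of_year)
print(end_of_q3)
```

Moving the cutoff back to 2018-09-30 makes MEHN-45543 fall back to its earlier record (Fin_ID 18), which is the "see back in time" behaviour the question asks for.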

Query to find records with more than one date

I'm trying to build a query for an MS SQL database that will find records for models with more than one year, but not models with only one.
Let's say I have a car dealership and I have 1 Chevy from 2015 and 2 from 2017; then I would want to find Chevy 2015 1 and Chevy 2017 2. But if I have three Fords from 2018 and only 2018, then I don't want those at all.
I have tweaked groups and joins but I don't get anywhere. So I need a SELECT from the table that does this. I'm leaning toward a pivot table but not sure what to do. Thanks for the help.
MyTable Contents
Model year count
Chevy 2012 1
Chevy 2012 1
Chevy 2015 1
Ford 2018 1
Ford 2018 1
Ford 2018 1
Buick 2017 1
Lexus 2017 1
Lexus 2015 1
Desired Result Set
Chevy 2012 2
Chevy 2015 1
Lexus 2017 1
Lexus 2015 1
because each of these models has 2 different years.
The below query should help you. Need not hardcode model values.
Select T.Model,T.[year] ,count(T.[year])
from T
join (select distinct * from T) S on T.model = S.model and T.year!=S.year
group by T.Model,T.[year]
You need to use the SUM function with GROUP BY in a subquery, because a Model/year pair can appear in several rows of the count column. Then join the result back to the table and use DISTINCT to remove duplicates.
Select distinct t1.*
from (
SELECT Model,[year] ,sum([count]) AS total
FROM T
group by Model,[year]
) t1
inner join T t2 on t1.Model = t2.Model and t1.[year] !=t2.[year]
sqlfiddle:http://sqlfiddle.com/#!18/e8756/55
Note: [table] and [year] are keywords in SQL; avoid using them as column names.
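Another way to express "models with more than one distinct year" is a HAVING COUNT(DISTINCT ...) filter, which avoids the self-join. This is an alternative technique, not the one used in the answers above. The sketch runs it in Python on an in-memory SQLite database loaded with the question's sample rows; SQLite accepts the same bracket quoting for [year] and [count], and models_with_multiple_years is a name invented for this example.

```python
import sqlite3


def models_with_multiple_years(conn):
    """Sum counts per (Model, year) for models seen in more than one year."""
    sql = """
        SELECT Model, [year], SUM([count]) AS total
        FROM T
        WHERE Model IN (
            -- models that occur with at least two different years
            SELECT Model
            FROM T
            GROUP BY Model
            HAVING COUNT(DISTINCT [year]) > 1)
        GROUP BY Model, [year]
        ORDER BY Model, [year]
    """
    return conn.execute(sql).fetchall()


# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (Model TEXT, [year] INTEGER, [count] INTEGER);
    INSERT INTO T VALUES
        ('Chevy', 2012, 1), ('Chevy', 2012, 1), ('Chevy', 2015, 1),
        ('Ford', 2018, 1), ('Ford', 2018, 1), ('Ford', 2018, 1),
        ('Buick', 2017, 1), ('Lexus', 2017, 1), ('Lexus', 2015, 1);
""")

multi_year = models_with_multiple_years(conn)
for row in multi_year:
    print(row)
```

Ford and Buick drop out because each has only one distinct year, while the two Chevy 2012 rows are summed into a single row.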

PostgreSQL - Filter column 2 results based on column 1

Forgive a novice question. I am new to postgresql.
I have a database full of transactional information. My goal is to iterate through each day since the first transaction, and show how many unique users made a purchase on that day, or in the 30 days previous to that day.
So the # of unique users on 02/01/2016 should show all unique users from 01/01/2016 through 02/01/2016. The # of unique users on 02/02/2016 should show all unique users from 01/02/2016 through 02/02/2016.
Here is a fiddle with some sample data: http://sqlfiddle.com/#!15/b3d90/1
The result should be something like this:
December 17 2014 -- 1
December 18 2014 -- 2
December 19 2014 -- 3
...
January 13 2015 -- 16
January 19 2015 -- 15
January 20 2015 -- 15
...
The best I've come up with is the following:
SELECT
to_char(S.created, 'YYYY-MM-DD') AS my_day,
COUNT(DISTINCT
CASE
WHEN S.created > S.created - INTERVAL '30 days'
THEN S.user_id
END)
FROM
transactions S
GROUP BY my_day
ORDER BY my_day;
As you can see, I have no idea how I could reference what exists in column one in order to specify what date range should be included in the filter.
Any help would be much appreciated!
I think if you do a self-join, it would give you the results you seek:
select
t1.created,
count (distinct t2.user_id)
from
transactions t1
join transactions t2 on
t2.created between t1.created - interval '30 days' and t1.created
group by
t1.created
order by
t1.created
That said, I think this is going to do some form of a cartesian join in the background, so for large datasets I doubt it's very efficient. If you run into huge performance problems, there are ways to make this a lot faster... but before you address that, find out if you need to.
-- EDIT 8/20/16 --
In response to your issue with the performance of this... yes, it's a pig. I admit it. I encountered a similar issue here:
PostgreSQL Joining Between Two Values
The same concept for your example is this:
with xtrans as (
select created, created + generate_series(0, 30) as create_range, user_id
from transactions
)
select
t1.created,
count (distinct t2.user_id)
from
transactions t1
join xtrans t2 on
t2.create_range = t1.created
group by
t1.created
order by
t1.created
It's not as easy to follow, but it should yield identical results, only it will be significantly faster because it's not doing the "glorified cross join."
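To see the 30-day window logic on a small data set, here is a sketch of the self-join approach in Python with an in-memory SQLite database. It is not PostgreSQL: SQLite's date(..., '-30 days') replaces the INTERVAL arithmetic. The rows are made up, and rolling_unique_users is a name invented for this example.

```python
import sqlite3


def rolling_unique_users(conn):
    """Count distinct users in the 30 days up to each transaction day."""
    sql = """
        SELECT t1.created, COUNT(DISTINCT t2.user_id) AS unique_users
        FROM (SELECT DISTINCT created FROM transactions) AS t1
        JOIN transactions AS t2
          -- ISO date strings compare correctly as text
          ON t2.created BETWEEN date(t1.created, '-30 days') AND t1.created
        GROUP BY t1.created
        ORDER BY t1.created
    """
    return conn.execute(sql).fetchall()


# Made-up transactions; 'created' holds ISO dates (YYYY-MM-DD).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (user_id TEXT, created TEXT);
    INSERT INTO transactions VALUES
        ('u1', '2016-01-01'),
        ('u2', '2016-01-15'),
        ('u1', '2016-01-20'),
        ('u3', '2016-02-10'),
        ('u3', '2016-03-05');
""")

counts = rolling_unique_users(conn)
for row in counts:
    print(row)
```

On 2016-02-10 the window starts at 2016-01-11, so u1's purchase on 2016-01-01 has aged out but the one on 2016-01-20 still counts. Like the first query above, this only produces rows for days that have at least one transaction.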
