Declare new variable and input date based on rank - sql-server

I want to 'clean' a dataset and declare a new variable, then input a date based on rank.
My dataset looks like this:
+-----+--------------+------------+-------+
| ID | Start_date | End_date | Rank |
+-----+--------------+------------+-------+
| a | May '16 | May '16 | 5 |
| a | Jun '16 | Jul '16 | 4 |
| a | Jul '16 | Aug '16 | 3 |
| a | Aug '16 | NULL | 2 |
| a | Sept '16 | NULL | 1 |
+-----+--------------+------------+-------+
I basically want to copy the start date of rank 1 into the end date of rank 2, or, say, the start date of rank 5 into the end date of rank 6 (always the row whose rank is one lower).
I have written the following to select into a temporary table and rank based on id and date:
SELECT
[Start_Date] as 'start'
,[End_Date] as 'end'
,[Code] as 'code'
,[ID] as 'id'
,rank() over (partition by [id] order by [Start_Date]) as 'rank'
INTO #1
FROM [Table]
ORDER BY [id]
It's the following part that doesn't work ...
DECLARE new_end
BEGIN select [#1].[start] into new_end FROM [#1]
WHERE (
([#1].[rank] = 1)
AND ([#1].[end] IS NULL)
)

Assuming you already have your dataset as provided in your question, this is just a simple self join, no?
declare @t table(ID nvarchar(1), Start_date date, End_date date, [Rank] int);
insert into @t values ('a','20170501','20170501',5),('a','20170601','20170701',4),('a','20170701','20170801',3),('a','20170801',NULL,2),('a','20170901',NULL,1);
select t1.ID
,t1.Start_date
,isnull(t1.End_date,t2.Start_date) as End_date
-- If you *always* want to overwrite the End_Date use this instead:
-- ,t2.Start_date as End_date
,t1.[Rank]
from @t t1
left join @t t2
-- Match on ID as well, so ranks from different IDs are never paired up
on(t1.ID = t2.ID
and t1.[Rank] = t2.[Rank]+1);
Output:
+----+------------+------------+------+
| ID | Start_date | End_date | Rank |
+----+------------+------------+------+
| a | 2017-05-01 | 2017-05-01 | 5 |
| a | 2017-06-01 | 2017-07-01 | 4 |
| a | 2017-07-01 | 2017-08-01 | 3 |
| a | 2017-08-01 | 2017-09-01 | 2 |
| a | 2017-09-01 | NULL | 1 |
+----+------------+------------+------+
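If the rank column exists only to locate the following row, the same result can probably be had without a self join. This is a minimal sketch, assuming SQL Server 2012 or later (needed for LEAD) and the same sample data as above; the table variable name @t is only for illustration.

```sql
declare @t table(ID nvarchar(1), Start_date date, End_date date, [Rank] int);
insert into @t values ('a','20170501','20170501',5),('a','20170601','20170701',4),
                      ('a','20170701','20170801',3),('a','20170801',NULL,2),('a','20170901',NULL,1);

select ID
      ,Start_date
      -- Keep an existing End_date; otherwise borrow the next Start_date for the same ID
      ,isnull(End_date, lead(Start_date) over (partition by ID order by Start_date)) as End_date
      ,[Rank]
from @t
order by [Rank] desc;
```

The row with the latest Start_date has no following row, so its End_date stays NULL, just as in the self-join version.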

Related

Get count of rows after a unique date for each row

I have a 1:1 table (we will call this table 1) which looks like this:
| id | date |
|----|---------------------|
| 1 | 2011-01-02 00:00:00 |
| 2 | 2012-01-02 00:00:00 |
I have a second many:1 table (we will call this table 2) which looks like this:
| id | date |
|----|---------------------|
| 1 | 2011-01-01 00:00:00 |
| 1 | 2011-01-02 00:00:00 |
| 1 | 2011-01-03 00:00:00 |
| 2 | 2011-12-31 00:00:00 |
| 2 | 2012-01-01 00:00:00 |
| 2 | 2012-01-03 00:00:00 |
I would like to left join table 1 and table 2 (table 1 is the left table) on id, but I only want the count of dates from table 2 that are greater than or equal to the date column in table 1.
Thus, the resultant table would look like this:
| id | date | count_dates |
|----|---------------------|-------------|
| 1 | 2011-01-02 00:00:00 | 2 |
| 2 | 2012-01-02 00:00:00 | 1 |
How can I do this using SQL Server?
Don't bother with a left join, just use a sub-query with the appropriate where clause e.g.
select id, [date]
, (select count(*) from dbo.Table2 T2 where T2.id = T1.id and T2.[date] >= T1.[date]) as count_dates
from dbo.Table1 T1;
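For comparison, here is a sketch of the left join the question originally asked about, using table variables (@Table1 and @Table2 are stand-in names) loaded with the sample rows. The date comparison sits in the ON clause rather than in WHERE, so an id with no qualifying rows would still appear, with a count of 0.

```sql
declare @Table1 table (id int, [date] datetime);
declare @Table2 table (id int, [date] datetime);

insert into @Table1 values (1,'20110102'),(2,'20120102');
insert into @Table2 values (1,'20110101'),(1,'20110102'),(1,'20110103'),
                           (2,'20111231'),(2,'20120101'),(2,'20120103');

select T1.id
      ,T1.[date]
      -- count(T2.id) ignores the NULLs produced by unmatched left join rows
      ,count(T2.id) as count_dates
from @Table1 T1
left join @Table2 T2
       on T2.id = T1.id
      and T2.[date] >= T1.[date]
group by T1.id, T1.[date];
```

With the sample data this should give 2 for id 1 and 1 for id 2, matching the desired output in the question.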

Date difference for same ID

I've got a data set similar to
+----+------------+------------+------------+
| ID | Udate | last_code | Ddate |
+----+------------+------------+------------+
| 1 | 05/11/2018 | ACCEPTED | 13/10/2018 |
| 1 | 03/11/2018 | ATTEMPT | 13/10/2018 |
| 1 | 01/11/2018 | INFO | 13/10/2018 |
| 1 | 22/10/2018 | ARRIVED | 13/10/2018 |
| 1 | 15/10/2018 | SENT | 13/10/2018 |
+----+------------+------------+------------+
I'm trying to get the date difference between consecutive codes based on Udate, but for the first date I want the difference between Udate and Ddate.
So I've been trying:
DATEDIFF(DAY,LAG(Udate) OVER (PARTITION BY Shipment_Number ORDER BY Udate), Udate)
to get the difference between dates and it works so far, but I also need the first date difference between Udate and Ddate.
I was thinking about ISNULL()
Also, at the end I need an average of days between codes as well, usually they keep the same pattern. Sample output data:
+----+------------+------------+------------+------------+
| ID | Udate | last_code | Ddate | Difference |
+----+------------+------------+------------+------------+
| 1 | 05/11/2018 | ACCEPTED | 13/10/2018 | 2 |
| 1 | 03/11/2018 | ATTEMPT | 13/10/2018 | 2 |
| 1 | 01/11/2018 | INFO | 13/10/2018 | 10 |
| 1 | 22/10/2018 | ARRIVED | 13/10/2018 | 7 |
| 1 | 15/10/2018 | SENT | 13/10/2018 | 2 |
+----+------------+------------+------------+------------+
Notice that when there is no previous code, the date diff is between Udate and Ddate.
Would appreciate any idea.
Thank you.
Well, ISNULL is the way to go here.
Since you also want the average difference, you can use a common table expression to get the difference, and query it to get the average:
First, Create and populate sample data (Please save us this step in your future questions)
-- This would not be needed if you've used ISO8601 for date strings (yyyy-mm-dd | yyyymmdd)
SET DATEFORMAT DMY;
DECLARE @T AS TABLE
(
    ID int,
    UDate date,
    last_code varchar(10),
    Ddate date
) ;
INSERT INTO @T (ID, Udate, last_code, Ddate) VALUES
(1, '05/11/2018', 'ACCEPTED', '13/10/2018'),
(1, '03/11/2018', 'ATTEMPT' , '13/10/2018'),
(1, '01/11/2018', 'INFO' , '13/10/2018'),
(1, '22/10/2018', 'ARRIVED' , '13/10/2018'),
(1, '15/10/2018', 'SENT' , '13/10/2018');
The CTE:
WITH CTE AS
(
    SELECT ID,
           Udate,
           last_code,
           Ddate,
           DATEDIFF(
               DAY,
               -- Previous Udate for this ID; falls back to Ddate on the first row
               ISNULL(
                   LAG(Udate) OVER(PARTITION BY ID ORDER BY Udate),
                   Ddate
               ),
               UDate
           ) As Difference
    FROM @T
)
The query:
SELECT *, AVG(Difference) OVER(PARTITION BY ID) As AverageDifference
FROM CTE;
Results:
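+----+------------+-----------+------------+------------+-------------------+
| ID | Udate      | last_code | Ddate      | Difference | AverageDifference |
+----+------------+-----------+------------+------------+-------------------+
| 1  | 15.10.2018 | SENT      | 13.10.2018 | 2          | 4                 |
| 1  | 22.10.2018 | ARRIVED   | 13.10.2018 | 7          | 4                 |
| 1  | 01.11.2018 | INFO      | 13.10.2018 | 10         | 4                 |
| 1  | 03.11.2018 | ATTEMPT   | 13.10.2018 | 2          | 4                 |
| 1  | 05.11.2018 | ACCEPTED  | 13.10.2018 | 2          | 4                 |
+----+------------+-----------+------------+------------+-------------------+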
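Note that AverageDifference comes back as 4 because AVG over an int column returns an int. If a fractional average is wanted, one option is to cast before averaging. A minimal sketch under that assumption, using ISO date literals so that SET DATEFORMAT is not needed; the decimal(10,2) precision is an arbitrary choice:

```sql
DECLARE @T AS TABLE (ID int, Udate date, last_code varchar(10), Ddate date);

INSERT INTO @T (ID, Udate, last_code, Ddate) VALUES
(1, '20181105', 'ACCEPTED', '20181013'),
(1, '20181103', 'ATTEMPT' , '20181013'),
(1, '20181101', 'INFO'    , '20181013'),
(1, '20181022', 'ARRIVED' , '20181013'),
(1, '20181015', 'SENT'    , '20181013');

WITH CTE AS
(
    SELECT ID,
           DATEDIFF(DAY,
                    ISNULL(LAG(Udate) OVER(PARTITION BY ID ORDER BY Udate), Ddate),
                    Udate) AS Difference
    FROM @T
)
-- Casting to decimal avoids the integer truncation of AVG(int)
SELECT ID, AVG(CAST(Difference AS decimal(10,2))) AS AverageDifference
FROM CTE
GROUP BY ID;
```

For the sample rows the differences are 2, 7, 10, 2 and 2, so the average should come out at 4.6 rather than 4.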

TSQL: Find Groups of Records in a Sequence of Records

Sorry for the title if you find it incorrect, I really wasn't sure how to name this question. There is probably a term for this type of query/pattern.
I have a sequence of records that need to be ordered by date. The records have a condition (SomeCondition) that I would like to "group" by, to get the earliest start date and latest end date of each group (taking NULLs into account), but I'm unsure how to write the query (if it's even possible). The original records in the table look something like:
-----------------------------------------------------------
| AbcID | XyzID | StartDate | EndDate | SomeCondition |
-----------------------------------------------------------
| 1 | 1 | 2018-01-01 | 2018-03-05 | 1 |
| 2 | 1 | 2018-04-20 | 2018-05-01 | 1 |
| 3 | 1 | 2018-05-02 | 2018-05-15 | 0 |
| 4 | 1 | 2018-06-01 | 2018-07-01 | 1 |
| 5 | 1 | 2018-08-01 | NULL | 1 |
| 6 | 2 | 2018-01-01 | 2018-06-30 | 1 |
| 7 | 2 | 2018-07-01 | 2018-08-31 | 0 |
-----------------------------------------------------------
The result I'm going for would be;
-----------------------------------
| XyzID | StartDate | EndDate |
-----------------------------------
| 1 | 2018-01-01 | 2018-05-01 |
| 1 | 2018-06-01 | NULL |
| 2 | 2018-01-01 | 2018-06-30 |
-----------------------------------
Thanks for any help/insight, even if it's "not possible".
Solving this problem requires you to solve it piece by piece. Here are the steps that I used to do that:
1. Determine when the island begins (when SomeCondition is false)
2. Create an "ID" number for each island (within each XyzID) by summing the number of IslandBegins while considering the records in AbcID order
3. Determine the first and last AbcID within each XyzID/IslandNumber combination where SomeCondition is true
4. Use the previous step as a guide as to what StartDate / EndDate you should get for each record in the result set
Sample Data:
declare @sample_data table
(
    AbcID int
    , XyzID int
    , StartDate date
    , EndDate date
    , SomeCondition bit
)
insert into @sample_data
values (1, 1, '2018-01-01', '2018-03-05', 1)
    , (2, 1, '2018-04-20', '2018-05-01', 1)
    , (3, 1, '2018-05-02', '2018-05-15', 0)
    , (4, 1, '2018-06-01', '2018-07-01', 1)
    , (5, 1, '2018-08-01', NULL, 1)
    , (6, 2, '2018-01-01', '2018-06-30', 1)
    , (7, 2, '2018-07-01', '2018-08-31', 0);
Answer:
The comments in the code show which step each part of the CTE is accomplishing.
with island_bgn as
(
    --Step 1
    select d.AbcID
        , d.XyzID
        , d.StartDate
        , d.EndDate
        , d.SomeCondition
        , case when d.SomeCondition = 0 then 1 else 0 end as IslandBegin
    from @sample_data as d
)
, island_nbr as
(
    --Step 2
    select b.AbcID
        , b.XyzID
        , b.StartDate
        , b.EndDate
        , b.SomeCondition
        , b.IslandBegin
        , sum(b.IslandBegin) over (partition by b.XyzID order by b.AbcID asc) as IslandNumber
    from island_bgn as b
)
, prelim as
(
    --Step 3
    select n.XyzID
        , n.IslandNumber
        , min(n.AbcID) as AbcIDMin
        , max(n.AbcID) as AbcIDMax
    from island_nbr as n
    where 1=1
    and n.SomeCondition = 1
    group by n.XyzID
        , n.IslandNumber
)
--Step 4
select p.XyzID
    , a.StartDate
    , b.EndDate
from prelim as p
inner join @sample_data as a on p.AbcIDMin = a.AbcID
inner join @sample_data as b on p.AbcIDMax = b.AbcID
order by p.XyzID
    , a.StartDate
    , b.EndDate
Results:
+-------+------------+------------+
| XyzID | StartDate | EndDate |
+-------+------------+------------+
| 1 | 2018-01-01 | 2018-05-01 |
+-------+------------+------------+
| 1 | 2018-06-01 | NULL |
+-------+------------+------------+
| 2 | 2018-01-01 | 2018-06-30 |
+-------+------------+------------+

Group Non-Contiguous Dates By Criteria In Column

I have a table with start and end dates for team consultations with customers.
I need to merge certain consultations based on a number of days specified in another column (sometimes the consultations overlap, sometimes they are contiguous, sometimes they aren't), as well as on Team and Type.
Some example data is as follows:
DECLARE @TempTable TABLE([CUSTOMER_ID] INT
,[TEAM] VARCHAR(1)
,[TYPE] VARCHAR(1)
,[START_DATE] DATETIME
,[END_DATE] DATETIME
,[GROUP_DAYS_CRITERIA] INT)
INSERT INTO @TempTable VALUES (1,'A','A','2013-08-07','2013-12-31',28)
,(2,'B','A','2015-05-15','2015-05-28',28)
,(2,'B','A','2015-05-15','2016-05-12',28)
,(2,'B','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2013-05-27','2014-07-23',28)
,(3,'C','A','2015-01-12','2015-05-28',28)
,(3,'B','A','2015-01-12','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-12-17',28)
,(4,'A','B','2013-07-09','2014-04-21',7)
,(4,'A','B','2014-04-29','2014-08-01',7)
Which looks like this:
+-------------+------+------+------------+------------+---------------------+
| CUSTOMER_ID | TEAM | TYPE | START_DATE | END_DATE | GROUP_DAYS_CRITERIA |
+-------------+------+------+------------+------------+---------------------+
| 1 | A | A | 07/08/2013 | 31/12/2013 | 28 |
| 2 | B | A | 15/05/2015 | 28/05/2015 | 28 |
| 2 | B | A | 15/05/2015 | 12/05/2016 | 28 |
| 2 | B | A | 28/05/2015 | 28/05/2015 | 28 |
| 3 | C | A | 27/05/2013 | 23/07/2014 | 28 |
| 3 | C | A | 12/01/2015 | 28/05/2015 | 28 |
| 3 | B | A | 12/01/2015 | 28/05/2015 | 28 |
| 3 | C | A | 28/05/2015 | 28/05/2015 | 28 |
| 3 | C | A | 28/05/2015 | 17/12/2015 | 28 |
| 4 | A | B | 09/07/2013 | 21/04/2014 | 7 |
| 4 | A | B | 29/04/2014 | 01/08/2014 | 7 |
+-------------+------+------+------------+------------+---------------------+
My desired output is as follows:
+-------------+------+------+------------+------------+---------------------+
| CUSTOMER_ID | TEAM | TYPE | START_DATE | END_DATE | GROUP_DAYS_CRITERIA |
+-------------+------+------+------------+------------+---------------------+
| 1 | A | A | 07/08/2013 | 31/12/2013 | 28 |
| 2 | B | A | 15/05/2015 | 12/05/2016 | 28 |
| 3 | C | A | 27/05/2013 | 23/07/2014 | 28 |
| 3 | C | A | 12/01/2015 | 17/12/2015 | 28 |
| 3 | B | A | 12/01/2015 | 28/05/2015 | 28 |
| 4 | A | B | 09/07/2013 | 21/04/2014 | 7 |
| 4 | A | B | 29/04/2014 | 01/08/2014 | 7 |
+-------------+------+------+------------+------------+---------------------+
I am struggling to do this at all, let alone with any efficiency! Any ideas / code will be gratefully received.
Server version is MS SQL Server 2014
Thanks,
Dan
If I am understanding your question correctly, we want to return rows only when a second, third, etc consultation has not occurred within group_days_criteria number of days after the previous consultation end date.
We can get the previous consultation end date and eliminate rows (since we are not concerned with the number of consultations) where a consultation occurred for the same customer by the same team and of the same consultation type within our date range.
DECLARE @TempTable TABLE([CUSTOMER_ID] INT
,[TEAM] VARCHAR(1)
,[TYPE] VARCHAR(1)
,[START_DATE] DATETIME
,[END_DATE] DATETIME
,[GROUP_DAYS_CRITERIA] INT)
INSERT INTO @TempTable VALUES (1,'A','A','2013-08-07','2013-12-31',28)
,(2,'B','A','2015-05-15','2015-05-28',28)
,(2,'B','A','2015-05-15','2016-05-12',28)
,(2,'B','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2013-05-27','2014-07-23',28)
,(3,'C','A','2015-01-12','2015-05-28',28)
,(3,'B','A','2015-01-12','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-12-17',28)
,(4,'A','B','2013-07-09','2014-04-21',7)
,(4,'A','B','2014-04-29','2014-08-01',7)
;with prep as (
    select Customer_ID,
           Team,
           [Type],
           [Start_Date],
           [End_Date],
           Group_Days_Criteria,
           ROW_NUMBER() over (partition by customer_id, team, [type] order by [start_date] asc, [end_date] desc) as rn, -- earliest start date with latest end date
           lag([End_Date] + Group_Days_Criteria, 1, 0) over (partition by customer_id, team, [type] order by [start_date] asc, [end_date] desc) as PreviousEndDate -- previous end date + Group_Days_Criteria
    from @TempTable
)
select p.Customer_Id,
       p.[Team],
       p.[Type],
       p.[Start_Date],
       p.[End_Date],
       p.Group_Days_Criteria
from prep p
where p.rn = 1
or (p.rn != 1 and p.[Start_date] > p.PreviousEndDate)
order by p.Customer_Id, p.[Team], p.[Start_Date], p.[Type]
This returned the desired result set.

SQL Server : how to use variable values from CTE in WHERE clause?

First of all, please correct me if my title is not specific/clear enough.
I have used the following code to generate the start dates and end dates:
DECLARE @start_date date, @end_date date;
SET @start_date = '2016-07-01';
with dates as
(
    select
        @start_date AS startDate,
        DATEADD(DAY, 6, @start_date) AS endDate
    union all
    select
        DATEADD(DAY, 7, startDate) AS startDate,
        DATEADD(DAY, 7, endDate) AS endDate
    from
        dates
    where
        startDate < '2017-03-31'
)
select * from dates
Below is part of the output from above query :
+------------+------------+
| startDate | endDate |
+------------+------------+
| 2016-07-01 | 2016-07-07 |
| 2016-07-08 | 2016-07-14 |
| 2016-07-15 | 2016-07-21 |
| 2016-07-22 | 2016-07-28 |
| 2016-07-29 | 2016-08-04 |
+------------+------------+
Now I have another table named sales, which has 3 columns, sales_ID, sales_date and sales_amount, as below:
+----------+------------+--------------+
| sales_ID | sales_date | sales_amount |
+----------+------------+--------------+
| 1 | 2016-07-04 | 10 |
| 2 | 2016-07-06 | 20 |
| 3 | 2016-07-13 | 30 |
| 4 | 2016-07-19 | 15 |
| 5 | 2016-07-21 | 20 |
| 6 | 2016-07-25 | 25 |
| 7 | 2016-07-26 | 40 |
| 8 | 2016-07-29 | 20 |
| 9 | 2016-08-01 | 30 |
| 10 | 2016-08-02 | 30 |
| 11 | 2016-08-03 | 40 |
+----------+------------+--------------+
How can I create a query to show the total sales amount for each week (that is, between each startDate and endDate from the first table)? I suppose I will need to use a recursive query with a WHERE clause to check whether the dates fall between startDate and endDate, but I can't find a working example.
Here is my expected result (the startDate and endDate are the records from the first table):
+------------+------------+--------------+
| startDate | endDate | sales_amount |
+------------+------------+--------------+
| 2016-07-01 | 2016-07-07 | 30 |
| 2016-07-08 | 2016-07-14 | 30 |
| 2016-07-15 | 2016-07-21 | 35 |
| 2016-07-22 | 2016-07-28 | 65 |
| 2016-07-29 | 2016-08-04 | 120 |
+------------+------------+--------------+
Thank you!
Your final Select (after the cte) should be something like this
Select D.startDate
,D.endDate
,Sales_Amount = sum(S.sales_amount)
From dates D
Join Sales S on (S.sales_date between D.startDate and D.endDate)
Group By D.startDate,D.endDate
Order By D.startDate
EDIT: You could use a Left Join if you want to see weeks with no rows in Sales
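A sketch of the Left Join variant, put together as one batch. The @sales table variable stands in for the real sales table, and the recursion cutoff is shortened to '2016-08-01' (an assumption made only to keep the output small), which produces one extra week without any sales.

```sql
DECLARE @start_date date = '2016-07-01';

-- Stand-in for the sales table from the question
DECLARE @sales TABLE (sales_ID int, sales_date date, sales_amount int);
INSERT INTO @sales VALUES
(1,'2016-07-04',10),(2,'2016-07-06',20),(3,'2016-07-13',30),(4,'2016-07-19',15),
(5,'2016-07-21',20),(6,'2016-07-25',25),(7,'2016-07-26',40),(8,'2016-07-29',20),
(9,'2016-08-01',30),(10,'2016-08-02',30),(11,'2016-08-03',40);

WITH dates AS
(
    SELECT @start_date AS startDate,
           DATEADD(DAY, 6, @start_date) AS endDate
    UNION ALL
    SELECT DATEADD(DAY, 7, startDate),
           DATEADD(DAY, 7, endDate)
    FROM dates
    WHERE startDate < '2016-08-01'
)
SELECT D.startDate,
       D.endDate,
       -- Weeks without sales yield NULL from SUM; show them as 0 instead
       ISNULL(SUM(S.sales_amount), 0) AS sales_amount
FROM dates D
LEFT JOIN @sales S
       ON S.sales_date BETWEEN D.startDate AND D.endDate
GROUP BY D.startDate, D.endDate
ORDER BY D.startDate;
```

With the sample rows the first five weeks should total 30, 30, 35, 65 and 120, and the week starting 2016-08-05 should appear with 0 instead of being dropped.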
