Can someone help me combine consecutive dates in different rows, as shown above? The rows in the first image are the input and the records in the second image are the output.
In case it helps to find information about this in the future, this kind of problem is usually called "gaps and islands" (or "islands and gaps"). Below is example code that produces the output you are looking for. It makes a few assumptions:
The dates have no time portion. This may not be an issue but I did not test it.
Dates are not consolidated across different WID/LIDs.
There are no null dates in EndDate. If nulls are present, the data needs to be brought in like this: ISNULL(EndDate, '9999-12-31')
Here is the code:
IF OBJECT_ID('tempdb..#SourceData') IS NOT NULL
DROP TABLE #SourceData;
CREATE TABLE #SourceData
(
WID VARCHAR(2) NOT NULL
, LID VARCHAR(2) NOT NULL
, StartDate DATE NOT NULL
, EndDate DATE NOT NULL
)
INSERT INTO #SourceData (WID, LID, StartDate, EndDate) VALUES
('W1','L1','1960-02-10','1988-03-22'),
('W1','L1','1988-03-23','1988-03-28'),
('W1','L1','1991-03-14','2010-10-20'),
('W2','L2','1964-10-29','1991-07-04'),
('W2','L2','1991-07-05','1992-01-28'),
('W2','L2','1992-01-29','1992-01-30');
IF OBJECT_ID('tempdb..#ConsolidatedData') IS NOT NULL
DROP TABLE #ConsolidatedData;
CREATE TABLE #ConsolidatedData
(
WID VARCHAR(2) NOT NULL
, LID VARCHAR(2) NOT NULL
, StartDate DATE NOT NULL
, EndDate DATE NULL
);
WITH base_data AS
(
SELECT
WID
, LID
, StartDate
, EndDate
-- For each record, get the closest previous thru date, for consolidating records below
, MAX(EndDate) OVER (
PARTITION BY WID, LID
ORDER BY StartDate, EndDate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS RunningMaxEndDate
FROM #SourceData
)
, islands AS
(
SELECT
WID
, LID
, StartDate
, EndDate
-- Create a running count of each gap (island), which prevents them from being consolidated
, SUM(IIF(RunningMaxEndDate >= DATEADD(DAY, -1, StartDate), 0, 1)) OVER (
PARTITION BY WID, LID
ORDER BY StartDate, EndDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS IslandNumber
FROM base_data
)
INSERT INTO #ConsolidatedData
(
WID
, LID
, StartDate
, EndDate
)
SELECT
WID
, LID
, MIN(StartDate) AS StartDate
, MAX(EndDate) AS EndDate
FROM islands
GROUP BY
WID
, LID
, IslandNumber;
SELECT * FROM #SourceData;
SELECT * FROM #ConsolidatedData;
DROP TABLE #SourceData;
DROP TABLE #ConsolidatedData;
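The third assumption above (no NULL EndDate) can be relaxed along the lines the answer suggests. The following is only a sketch under stated assumptions: the table variable @Src and its two sample rows are made up for illustration, NULL is taken to mean "still open", and '9999-12-31' is used as the stand-in date and turned back into NULL at the end.

```sql
-- Hypothetical sample data: the second row is open-ended (EndDate is NULL).
DECLARE @Src TABLE (WID VARCHAR(2), LID VARCHAR(2), StartDate DATE NOT NULL, EndDate DATE NULL);
INSERT INTO @Src (WID, LID, StartDate, EndDate) VALUES
  ('W1', 'L1', '2020-01-01', '2020-01-31'),
  ('W1', 'L1', '2020-02-01', NULL);

WITH filled AS
(
    -- Replace NULL end dates with a far-future stand-in so MAX() and comparisons work
    SELECT WID, LID, StartDate, ISNULL(EndDate, '9999-12-31') AS EndDate
    FROM @Src
)
, base_data AS
(
    SELECT WID, LID, StartDate, EndDate
         , MAX(EndDate) OVER (
               PARTITION BY WID, LID
               ORDER BY StartDate, EndDate
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS RunningMaxEndDate
    FROM filled
)
, islands AS
(
    SELECT WID, LID, StartDate, EndDate
         , SUM(IIF(RunningMaxEndDate >= DATEADD(DAY, -1, StartDate), 0, 1)) OVER (
               PARTITION BY WID, LID
               ORDER BY StartDate, EndDate
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS IslandNumber
    FROM base_data
)
SELECT WID
     , LID
     , MIN(StartDate) AS StartDate
     , NULLIF(MAX(EndDate), '9999-12-31') AS EndDate  -- turn the stand-in back into NULL
FROM islands
GROUP BY WID, LID, IslandNumber;
```

With the two sample rows, the ranges are adjacent, so they collapse into one row that starts on 2020-01-01 and has a NULL EndDate.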
No idea how performant this will be on large datasets or if it will cover all scenarios, but it does return the desired output from the sample data provided.
SELECT DISTINCT
WID
,LID
,ISNULL(StartDate, LAG(StartDate) OVER (PARTITION BY WID, LID ORDER BY RowNum)) as StartDate
,ISNULL(Enddate, LEAD(endDate) OVER (PARTITION BY WID, LID ORDER BY RowNum)) as Enddate
FROM (
SELECT
WID
,LID
,IIF(DATEADD(dd, -1, startdate) = ISNULL(LAG(endDate) OVER (PARTITION BY WID, LID ORDER BY enddate), enddate), null, startdate) AS StartDate
,IIF(DATEADD(dd, 1, enddate) = LEAD(startDate) OVER (PARTITION BY WID, LID ORDER BY Startdate), null, enddate) AS EndDate
,ROW_NUMBER() OVER (PARTITION BY WID, LID ORDER BY startDate, Enddate) AS rownum
FROM (VALUES
('W1','L1','1960-02-10','1988-03-22'),
('W1','L1','1988-03-23','1988-03-28'),
('W1','L1','1991-03-14','2010-10-20'),
('W2','L2','1964-10-29','1991-07-04'),
('W2','L2','1991-07-05','1992-01-28'),
('W2','L2','1992-01-29','1992-01-30')
) Sub(WID, LID, StartDate, EndDate)
) sub
WHERE
ISNULL(StartDate, EndDate) IS NOT NULL
ORDER BY
wid
,lid
,StartDate
,Enddate
Returns:
WID | LID | StartDate | Enddate
--------------------------------------
W1 | L1 | 1960-02-10 | 1988-03-28
W1 | L1 | 1991-03-14 | 2010-10-20
W2 | L2 | 1964-10-29 | 1992-01-30
I have a table as follows: (Expected result without weekend exclude logic)
Start Date | End Date (Expected Date) | No of Days (input)
-----------------------------------------------------------
01-01-2021 | 02-01-2021 | 2
03-01-2021 | 08-01-2021 | 5
09-01-2021 | 10-01-2021 | 2
11-01-2021 | 20-01-2021 | 10
21-01-2021 | 09-02-2021 | 20
10-02-2021 | 10-02-2021 | 1
I want to regenerate the StartDate and EndDate values from the NumberOfDays values: each subsequent row's StartDate should be the previous row's EndDate + 1 day, and weekend dates should be excluded from the sequence. I also have another scenario in which weekend dates are included, based on a condition.
I want to apply this logic and select the data in same select query using SQL Server.
This is what I have tried
declare @t table ( StartDate date, EndDate date, DaysToAdd int );
insert into @t(StartDate, EndDate, DaysToAdd)
values('20210217', '20210227', 10), ('20210312', '20210310', 10), ('20210326', '20210401', 10), ('20210409', '20210401', 10), ('20210507', '20210401', 10), ('20210606', '20210529', 10), ('20210618', '20210417', 3), ('20210620', '20210309', 2), ('20300913', '20210227', 2), (null, '20300914', 4);
select * from @t;
select
    dateadd(day,
            -DaysToAdd - 1
            + count(*) over(order by isnull(StartDate, EndDate), EndDate)
            + sum(DaysToAdd) over(order by isnull(StartDate, EndDate), EndDate),
            min(StartDate) over()) as NewStartDate,
    dateadd(day,
            -1
            + count(*) over(order by isnull(StartDate, EndDate), EndDate)
            + sum(DaysToAdd) over(order by isnull(StartDate, EndDate), EndDate),
            min(StartDate) over()) as NewEndDate,
    *
from @t;
My Expected result:
Start Date | End Date (Expected Date) | No of Days (input)
-----------------------------------------------------------
01-01-2021 | 04-01-2021 | 2
05-01-2021 | 11-01-2021 | 5
12-01-2021 | 13-01-2021 | 2
14-01-2021 | 27-01-2021 | 10
28-01-2021 | 24-02-2021 | 20
25-02-2021 | 25-02-2021 | 1
It is best if you have a calendar table.
For this solution, I create a simple calendar table:
create table calendar
(
CalDate date,
isWeekEnd bit
);
then populate it with dates
with rcte as
(
select CalDate = convert(date, '2021-01-01')
union all
select CalDate = dateadd(day, 1, CalDate)
from rcte
where CalDate <= '2021-12-30'
)
insert into calendar (CalDate, isWeekEnd)
select CalDate,
case when left(datename(weekday, CalDate), 3) in ('Sat', 'Sun') then 1 else 0 end
from rcte
option (maxrecursion 0)
Your sample table and data:
declare @t table (id int identity, StartDate date, EndDate date, DaysToAdd int );
insert into @t(StartDate, EndDate, DaysToAdd)
values('2021-01-01', '2021-01-02', 2),
('2021-01-03', '2021-01-08', 5),
('2021-01-09', '2021-01-10', 2),
('2021-01-11', '2021-01-20', 10),
('2021-01-21', '2021-02-09', 20),
('2021-02-10', '2021-02-10', 1);
Since you are only interested in the StartDate of the first row, I select it into a variable.
The actual query:
declare @StartDate date;
select @StartDate = StartDate
from @t
where id = 1;
with
cal as
(
-- number the non-weekend dates from the start date onwards
select CalDate, rn = row_number() over (order by CalDate)
from Calendar
where CalDate >= @StartDate
and isWeekEnd = 0
),
t as
(
-- s and e are the positions of the first and last working day of each row
select t.id, t.DaysToAdd,
s = sum(t.DaysToAdd) over (order by t.id) - t.DaysToAdd + 1,
e = sum(t.DaysToAdd) over (order by t.id)
from @t t
)
select t.id,
t.DaysToAdd,
StartDate = s.CalDate,
EndDate = e.CalDate
from t
inner join cal s on t.s = s.rn
inner join cal e on t.e = e.rn
order by t.id
db<>fiddle demo
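The question also mentions a second scenario in which weekends are included based on a condition. One way to sketch that on top of the query above is a switch in the cal CTE. Everything new here is an assumption for illustration: the @IncludeWeekends variable is hypothetical, the dates are generated inline instead of read from the calendar table, and weekdays are detected with date arithmetic (1900-01-01 was a Monday) so the result does not depend on language settings.

```sql
DECLARE @IncludeWeekends BIT = 0;             -- hypothetical switch: 1 = count weekends too
DECLARE @StartDate DATE = '2021-01-01';
DECLARE @t TABLE (id INT IDENTITY, DaysToAdd INT);
INSERT INTO @t (DaysToAdd) VALUES (2), (5), (2), (10), (20), (1);

WITH dates AS
(
    -- inline stand-in for the calendar table
    SELECT CalDate = @StartDate
    UNION ALL
    SELECT DATEADD(DAY, 1, CalDate)
    FROM dates
    WHERE CalDate < '2021-12-31'
),
cal AS
(
    SELECT CalDate, rn = ROW_NUMBER() OVER (ORDER BY CalDate)
    FROM dates
    WHERE @IncludeWeekends = 1
       OR DATEDIFF(DAY, '19000101', CalDate) % 7 < 5   -- 0..4 = Monday..Friday
),
t AS
(
    SELECT id, DaysToAdd,
           s = SUM(DaysToAdd) OVER (ORDER BY id) - DaysToAdd + 1,
           e = SUM(DaysToAdd) OVER (ORDER BY id)
    FROM @t
)
SELECT t.id, t.DaysToAdd, StartDate = s.CalDate, EndDate = e.CalDate
FROM t
INNER JOIN cal s ON t.s = s.rn
INNER JOIN cal e ON t.e = e.rn
ORDER BY t.id
OPTION (MAXRECURSION 0);
```

With @IncludeWeekends = 0 this should reproduce the expected table from the question (first row 2021-01-01 to 2021-01-04, last row 2021-02-25 to 2021-02-25). Setting it to 1 makes every calendar day count.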
I have a list of dates like this (no gaps, each calendar date):
DateKey | Valid
------------------
2021-01-01 | 1
2021-01-02 | 1
2021-01-03 | 1
2021-01-04 | 0
2021-01-05 | 0
2021-01-06 | 1
2021-01-07 | 1
I would like to convert them using T-SQL to date ranges considering valid dates only.
So the results would be:
ValidFrom | ValidTo
-----------------------
2021-01-01 | 2021-01-03
2021-01-06 | 2021-01-07
Simply grouping by the Valid flag returns the wrong results:
select min(dateKey),max(dateKey)
from #t
group by Valid
If I knew how to assign a unique value for each continuous segment of valid dates, that would solve my problem. Is there anyone that can help me with this?
Just another option using the window function sum() over()
Select ValidFrom = min(DateKey)
,ValidTo = max(DateKey)
From (
Select *
,Grp = sum(case when Valid=0 then 1 else 0 end) over (order by DateKey)
from YourTable
) A
Where Valid=1
Group By Grp
Returns
ValidFrom ValidTo
2021-01-01 2021-01-03
2021-01-06 2021-01-07
Something like the following may work for you:
DECLARE @Dates TABLE (Dt DATE, Valid BIT)
INSERT @Dates
VALUES('2021-01-01', 1),
('2021-01-02', 1),
('2021-01-03', 1),
('2021-01-04', 0),
('2021-01-05', 0),
('2021-01-06', 1),
('2021-01-07', 1)
SELECT MIN(dt.Dt) AS BeginRange,
MAX(dt.Dt) AS EndRange
FROM (
SELECT d.Dt,
DATEDIFF(D, ROW_NUMBER() OVER(ORDER BY d.Dt), d.Dt) AS DtRange
FROM @Dates d
WHERE Valid = 1
) AS dt
GROUP BY dt.DtRange;
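To see why subtracting a row number produces a usable grouping key, it can help to look at the intermediate values. This is just an illustrative sketch on the same sample data; it uses DATEADD instead of the DATEDIFF expression above, and the column names rn and GroupKey are made up for the example.

```sql
DECLARE @Dates TABLE (Dt DATE, Valid BIT);
INSERT @Dates
VALUES ('2021-01-01', 1), ('2021-01-02', 1), ('2021-01-03', 1),
       ('2021-01-04', 0), ('2021-01-05', 0),
       ('2021-01-06', 1), ('2021-01-07', 1);

-- Within a run of consecutive dates, the date and the row number both grow by 1 per row,
-- so (date - row number) stays constant. A gap makes the date jump ahead of the row number,
-- which changes the key.
SELECT d.Dt,
       ROW_NUMBER() OVER (ORDER BY d.Dt) AS rn,
       DATEADD(DAY, -CAST(ROW_NUMBER() OVER (ORDER BY d.Dt) AS INT), d.Dt) AS GroupKey
FROM @Dates d
WHERE d.Valid = 1
ORDER BY d.Dt;

-- Dt          rn  GroupKey
-- 2021-01-01  1   2020-12-31
-- 2021-01-02  2   2020-12-31
-- 2021-01-03  3   2020-12-31
-- 2021-01-06  4   2021-01-02
-- 2021-01-07  5   2021-01-02
```

Grouping by GroupKey and taking MIN(Dt) and MAX(Dt) then gives one row per run.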
I think I've just found the solution of my problem:
https://dba.stackexchange.com/questions/197972/convert-list-of-dates-in-a-date-range-in-sql-server
DECLARE @t TABLE (dt DATE);
INSERT INTO @t (dt)
VALUES ('20180202')
,('20180203')
,('20180204')
,('20180205')
,('20180209')
,('20180212')
,('20180213');
WITH c
AS (
SELECT dt
,dateadd(day, - 1 * dense_rank() OVER (ORDER BY dt), dt) AS grp
FROM @t
)
SELECT min(dt) AS start_range
,max(dt) AS end_range
FROM c
GROUP BY grp;
;with cte as (
select Domain_Id, Starting_Date, End_Date
from Que_Date
union all
select t.Domain_Id, cte.Starting_Date, t.End_Date
from cte
join Que_Date t on cte.Domain_Id = t.Domain_Id and cte.End_Date = t.Starting_Date),
cte2 as (
select *, rn = row_number() over (partition by Domain_Id, End_Date order by Domain_Id)
from cte
)
select DISTINCT Domain_Id, Starting_Date, max(End_Date) enddate
from cte2
where rn=1
group by Domain_Id, Starting_Date
order by Domain_Id, Starting_Date;
select * from Que_Date
This is the code that I have written, but I am getting an extra row: the 2nd row is extra. The expected output should contain only the 1st, 3rd and 4th rows, so please help me with it.
I have attached an image showing the input, the expected output, and the output that I am getting.
Your first CTE returns too many rows. Its anchor member selects every row of Que_Date, so a chain is started from every row, including rows in the middle of a chain. You therefore cannot filter the domains based on that CTE, and your query returns unnecessary rows.
Try this solution. The ConsistentDomains CTE contains only the rows of consecutive ("consistent") domains, so based on this CTE we can get the non-overlapping results.
Create and fill data:
CREATE TABLE FooTable
(
Domain_ID INT,
Starting_Date DATE,
End_Date Date
)
INSERT INTO dbo.FooTable
(
Domain_ID,
Starting_Date,
End_Date
)
VALUES
( 1, -- Domain_ID - int
CONVERT(datetime,'01-01-2011',103), -- Starting_Date - date
CONVERT(datetime,'05-01-2011',103) -- End_Date - date
)
, (1, CONVERT(datetime,'05-01-2011',103), CONVERT(datetime,'07-01-2011',103))
, (1, CONVERT(datetime,'07-01-2011',103), CONVERT(datetime,'15-01-2011',103))
, (2, CONVERT(datetime,'11-05-2011',103), CONVERT(datetime,'12-05-2011',103))
, (2, CONVERT(datetime,'13-05-2011',103), CONVERT(datetime,'14-05-2011',103))
Query to find the non-overlapping results:
DECLARE @startDate varchar(50) = '2011-01-01';
WITH ConsistentDomains AS
(
SELECT
f.Domain_ID
, f.Starting_Date
, f.End_Date
FROM FooTable f
WHERE f.Starting_Date = @startDate
UNION ALL
SELECT
s.Domain_ID
, s.Starting_Date
, s.End_Date
FROM FooTable s
INNER JOIN ConsistentDomains cd
ON s.Domain_ID = cd.Domain_ID
AND s.Starting_Date = cd.End_Date
), ConsistentDomainsRownumber AS
(
SELECT
cd.Domain_ID
, cd.Starting_Date
, cd.End_Date
, ROW_NUMBER() OVER (PARTITION BY cd.Domain_ID ORDER BY cd.Starting_Date,
cd.End_Date) RN
FROM ConsistentDomains cd
)
SELECT cd.Domain_ID
, convert(varchar, cd.Starting_Date, 105) Starting_Date
, convert(varchar, cd.End_Date, 105) End_Date
FROM ConsistentDomainsRownumber cd WHERE cd.RN = 1
UNION ALL
SELECT
ft.Domain_ID
, convert(varchar, ft.Starting_Date, 105) Starting_Date
, convert(varchar, ft.End_Date, 105) End_Date
FROM dbo.FooTable ft WHERE ft.Domain_ID NOT IN (SELECT cd.Domain_ID FROM
ConsistentDomainsRownumber cd)
Output:
I used the same table creation script as provided by @stepup, but you can also get your result this way.
CREATE TABLE testtbl
(
Domain_ID INT,
Starting_Date DATE,
End_Date Date
)
INSERT INTO testtbl
VALUES
(1, convert(date, '01-01-2011' ,103), convert(date, '05-01-2011',103) )
,(1, convert(date, '05-01-2011' ,103), convert(date, '07-01-2011',103) )
,(1, convert(date, '07-01-2011' ,103), convert(date, '15-01-2011',103) )
,(2, convert(date, '11-05-2011' ,103), convert(date, '12-05-2011',103) )
,(2, convert(date, '13-05-2011' ,103), convert(date, '14-05-2011',103) )
You can use a self join together with FIRST_VALUE and LAST_VALUE within the group, to make sure that you are comparing rows with the same ID and adjoining dates.
select distinct t.Domain_ID,
    case when lag(t1.starting_date) over (partition by t.Domain_id order by t.starting_date) is not null
         then first_value(t.Starting_Date) over (partition by t.domain_id order by t.starting_date)
         else t.Starting_Date
    end StartingDate,
    case when lead(t.domain_id) over (partition by t.domain_id order by t.starting_date) = t1.Domain_ID
         then isnull(last_value(t.End_Date) over (partition by t.domain_id order by t.end_date rows between unbounded preceding and unbounded following), t.End_Date)
         else t.End_Date
    end end_date
from testtbl t
left join testtbl t1 on t.Domain_ID = t1.Domain_ID
    and t.End_Date = t1.Starting_Date
    and t.Starting_Date < t1.Starting_Date
Output:
Domain_ID StartingDate end_date
1 2011-01-01 2011-01-15
2 2011-05-11 2011-05-12
2 2011-05-13 2011-05-14
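For comparison, the same result can also be sketched with the running-sum "gaps and islands" pattern instead of a self join or a recursive CTE. This is a different technique from the two answers above, not a rewrite of them. It assumes, as in the sample data, that a row continues the previous one only when its Starting_Date equals the previous End_Date; the table variable @testtbl and the names IsNewIsland and IslandNo are made up for the example.

```sql
DECLARE @testtbl TABLE (Domain_ID INT, Starting_Date DATE, End_Date DATE);
INSERT INTO @testtbl VALUES
  (1, '2011-01-01', '2011-01-05'),
  (1, '2011-01-05', '2011-01-07'),
  (1, '2011-01-07', '2011-01-15'),
  (2, '2011-05-11', '2011-05-12'),
  (2, '2011-05-13', '2011-05-14');

WITH flagged AS
(
    -- 1 when the row does not continue the previous row of the same domain
    SELECT Domain_ID, Starting_Date, End_Date,
           IsNewIsland = CASE WHEN LAG(End_Date) OVER (PARTITION BY Domain_ID ORDER BY Starting_Date) = Starting_Date
                              THEN 0 ELSE 1 END
    FROM @testtbl
),
numbered AS
(
    -- the running total of the flags numbers the islands within each domain
    SELECT Domain_ID, Starting_Date, End_Date,
           IslandNo = SUM(IsNewIsland) OVER (PARTITION BY Domain_ID ORDER BY Starting_Date
                                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM flagged
)
SELECT Domain_ID,
       MIN(Starting_Date) AS StartingDate,
       MAX(End_Date) AS end_date
FROM numbered
GROUP BY Domain_ID, IslandNo
ORDER BY Domain_ID, MIN(Starting_Date);
```

On the sample rows this gives the same three ranges as the output above: one for domain 1 (2011-01-01 to 2011-01-15) and two for domain 2.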
I am fairly new to SSIS, and now I have this requirement to exclude weekends in order to do a performance management. Now I have created a calendar and marked the weekends; what I am trying to do, using SSIS, is get the start and end date of every status and count how many weekends are there. I am kind of struggling to know which component to use to achieve this task.
So I have mainly two tables:
1- Table Calendar
2- Table History-Log
Calendar has the following columns:
1- ID
2- date
3- year
4- month
5- day of week
6- isweekend
History-Log has the following:
1- ID
2- Status
3- startdate
4- enddate
Your help is really appreciated.
I'm not an SSIS user, so apologies if this answer does not help, but if I wanted to get the result you describe, based on some test data:
DECLARE @Calendar TABLE (
ID INT,
[Date] DATETIME,
[Year] INT,
[Month] INT,
[DayOfWeek] VARCHAR(10),
IsWeekend BIT
)
DECLARE @HistoryLog TABLE (
ID INT,
[Status] INT,
StartDate DATETIME,
EndDate DATETIME
)
DECLARE @StartDate DATE = '20100101', @NumberOfYears INT = 10
DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate);
INSERT INTO @Calendar
SELECT ROW_NUMBER() OVER (ORDER BY d) AS ID,
d AS [Date],
DATEPART(YEAR,d) AS [Year],
DATEPART(MONTH,d) AS [Month],
DATENAME(WEEKDAY,d) AS [DayOfWeek],
CASE WHEN DATENAME(WEEKDAY,d) IN ('Saturday','Sunday') THEN 1 ELSE 0 END AS IsWeekend
FROM
(
SELECT d = DATEADD(DAY, rn - 1, @StartDate)
FROM
(
SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.[object_id]
) AS x
) AS y;
INSERT INTO @HistoryLog
SELECT 1, 3, '2016-01-05', '2016-01-20'
UNION
SELECT 2, 7, '2016-01-08', '2016-01-25'
UNION
SELECT 3, 4, '2016-01-01', '2016-02-03'
UNION
SELECT 4, 3, '2016-02-09', '2016-02-10'
I would use a query like this to return all of the HistoryLog records with a count of the number of weekend days between their StartDate and EndDate:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(c.ID) AS WeekendDays
FROM @HistoryLog h
LEFT JOIN @Calendar c ON c.[Date] >= h.StartDate AND c.[Date] <= h.EndDate AND c.IsWeekend = 1
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate
ORDER BY 1
If you wanted to know the number of weekends, rather than the number of weekend days, we'd need to slightly amend this logic (and define how a range containing only one weekend day - or one starting on a Sunday and ending on a Saturday inclusive - should be handled). Assuming you just want to know how many distinct weekends are at least partially within the date range, you could do:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(weekends.ID) AS Weekends
FROM @HistoryLog h
LEFT JOIN
(
SELECT c.ID,
c.[Date] AS SatDate,
DATEADD(DAY,1,c.[Date]) AS SunDate
FROM @Calendar c
WHERE c.[DayOfWeek] = 'Saturday'
) weekends ON h.StartDate BETWEEN weekends.SatDate AND weekends.SunDate
OR h.EndDate BETWEEN weekends.SatDate AND weekends.SunDate
OR (h.StartDate <= weekends.SatDate AND h.EndDate >= weekends.SunDate)
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate
I need to concatenate rows with a date and a code into a date range
Table with two columns that are a composite primary key (date and a code )
Date Code
1/1/2011 A
1/2/2011 A
1/3/2011 A
1/1/2011 B
1/2/2011 B
2/1/2011 A
2/2/2011 A
2/27/2011 A
2/28/2011 A
3/1/2011 A
3/2/2011 A
3/3/2011 A
3/4/2011 A
Needs to be converted to
Start Date End Date Code
1/1/2011 1/3/2011 A
2/1/2011 2/2/2011 A
1/1/2011 1/2/2011 B
2/27/2011 3/4/2011 A
Is there any other way or is a cursor loop the only way?
declare @T table
(
[Date] date,
Code char(1)
)
insert into @T values
('1/1/2011','A'),
('1/2/2011','A'),
('1/3/2011','A'),
('1/1/2011','B'),
('1/2/2011','B'),
('3/1/2011','A'),
('3/2/2011','A'),
('3/3/2011','A'),
('3/4/2011','A')
;with C as
(
select *,
datediff(day, 0, [Date]) - row_number() over(partition by Code
order by [Date]) as rn
from @T
)
select min([Date]) as StartDate,
max([Date]) as EndDate,
Code
from C
group by Code, rn
SQL Server 2000 has its limitations. I rewrote the solution to make it more readable.
declare @t table
(
[Date] datetime,
Code char(1)
)
insert into @t values
('1/1/2011','A'),
('1/2/2011','A'),
('1/3/2011','A'),
('1/1/2011','B'),
('1/2/2011','B'),
('3/1/2011','A'),
('3/2/2011','A'),
('3/3/2011','A'),
('3/4/2011','A')
select a.code, a.date as StartDate, min(b.date) as EndDate
from
(
-- rows with no row for the previous day: the start of each range
select *
from @t t
where not exists (select 1 from @t where t.code = code and t.date -1 = date)
) a
join
(
-- rows with no row for the next day: the end of each range
select *
from @t t
where not exists (select 1 from @t where t.code = code and t.date = date -1)
) b
on a.code = b.code and a.date <= b.date
group by a.code, a.date
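Using the DATEPART function on the month will get you the "groups" you want, as long as every run of dates stays within a single calendar month (a run such as 2/27/2011 to 3/4/2011 would be split in two, and two separate runs in the same month would be merged):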
SELECT Min(Date) as StartDate, Max(Date) as EndDate, Code
FROM ThisTable Group By DatePart(m, Date), Code