I have a question about SQL Server.
Table: holidaylist
Date | weekendStatus | Holidaystatus
2015-12-01 | 0 | 0
2015-12-02 | 0 | 0
2015-12-03 | 0 | 0
2015-12-04 | 1 | 0
2015-12-05 | 1 | 0
2015-12-06 | 0 | 1
2015-12-07 | 0 | 0
2015-12-08 | 0 | 0
2015-12-09 | 0 | 1
2015-12-10 | 0 | 0
2015-12-11 | 0 | 0
2015-12-12 | 1 | 1
2015-12-13 | 1 | 0
Table: emp
empid | doj | dos
1 | 2015-12-01 | 2015-12-06
2 |2015-12-01 | 2015-12-13
3 |2015-12-03 |2015-12-13
I want the number of days between doj and dos, counted two ways: including weekend days and holidays (includeweekendandholidays),
and excluding them (witoutincludeweekendandholidayslist).
I want output like this:
Empid | doj | dos |includeweekendandholidays | witoutincludeweekendandholidayslist
1 | 2015-12-01 |2015-12-06 | 5 | 3
2 | 2015-12-01 |2015-12-13 | 12 | 8
3 | 2015-12-03 |2015-12-13 | 10 | 6
I tried this query:
select
a.empid, a.doj, a.dos,
case
when b.weekendstatus = 1 and c.Holidaystatus = 1
then datediff(day, c.date, b.date)
end as includeweekenandholidays
case
when b.weekendstatus != 1 or c.Holidaystatus = 1
then datediff(day, c.date, b.date)
end as witoutincludeweekendandholidayslist
from
emp a
left join
holidaylist b on a.doj = b.date
left join
holidaylist c on a.dos = c.date
The above query does not give the expected result. Please tell me how to write a query that achieves this in SQL Server.
Try this:
select e.empid,
e.doj, e.dos,
IncludeRest = (select count(h.date) from holidaylist h where e.doj <= h.date AND e.dos >= h.date),
ExcludeRest = (select count(h.date) from holidaylist h where e.doj <= h.date AND e.dos >= h.date AND h.weekendstatus = 0 AND h.Holidaystatus = 0)
from emp e
You can use a CASE inside your COUNT to decide whether or not to count each day:
SELECT
e.empid,
e.doj,
e.dos,
COUNT(*) includeweekendandholidays,
COUNT(CASE WHEN Holidaystatus = 0
AND [weekendStatus] = 0 THEN 1
END) withoutincludeweekendandholidayslist
FROM
emp e
JOIN holidaylist hl ON hl.Date >= e.doj
AND hl.Date < e.dos
GROUP BY
e.empid,
e.doj,
e.dos
This might perform better, since it only joins to the holidaylist rows you need:
SELECT
e.empid,
e.doj,
e.dos,
DATEDIFF(DAY, e.doj, e.dos) includeweekendandholidays,
COUNT(*) withoutincludeweekendandholidayslist
FROM
emp e
JOIN holidaylist hl ON hl.Date BETWEEN e.doj AND e.dos
WHERE
weekendStatus = 0
AND Holidaystatus = 0
GROUP BY
e.empid,
e.doj,
e.dos,
DATEDIFF(DAY, e.doj, e.dos)
I don't get your output, though, since it appears that you're only excluding weekends and not holidays.
You can use OUTER APPLY:
SELECT a.empid, a.doj, a.dos,
DATEDIFF(d, a.doj, a.dos) + 1 AS include,
DATEDIFF(d, a.doj, a.dos) + 1 - b.wd - b.hd + b.common AS without
FROM emp AS a
OUTER APPLY (
SELECT SUM(weekendStatus) AS wd,
SUM(Holidaystatus) AS hd,
COUNT(CASE WHEN weekendStatus = 1 AND Holidaystatus = 1 THEN 1 END) AS common
FROM holidaylist
WHERE [Date] BETWEEN a.doj AND a.dos) AS b
For each row of table emp, OUTER APPLY counts the weekendStatus = 1 and Holidaystatus = 1 rows that fall within that row's interval.
Calculated fields selected:
include is the total number of days in the emp interval, including weekend days and holidays.
without is the total number of days in the emp interval minus weekend days and holidays. The common field makes sure that days which are both a weekend day and a holiday are not subtracted twice.
Note: The above query includes the start and end days of the interval in the calculations, so the interval considered is [doj, dos]. You can change the predicate of the WHERE clause in the OUTER APPLY operation to exclude the start day, the end day, or both.
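As a cross-check of the arithmetic only, here is a small Python sketch (not part of the answer's SQL) that applies the same include - wd - hd + common formula to the question's sample rows over the closed interval [doj, dos]. The names flags, day_counts and results are made up for this illustration.

```python
from datetime import date, timedelta

# Sample rows from the question: day -> (weekendStatus, Holidaystatus).
flags = {
    date(2015, 12, d): (w, h)
    for d, w, h in [
        (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 1, 0), (5, 1, 0),
        (6, 0, 1), (7, 0, 0), (8, 0, 0), (9, 0, 1), (10, 0, 0),
        (11, 0, 0), (12, 1, 1), (13, 1, 0),
    ]
}

def day_counts(doj, dos):
    """Return (include, without) for the closed interval [doj, dos]."""
    days = [doj + timedelta(days=n) for n in range((dos - doj).days + 1)]
    wd = sum(flags[d][0] for d in days)                   # weekend days
    hd = sum(flags[d][1] for d in days)                   # holidays
    common = sum(1 for d in days if flags[d] == (1, 1))   # counted in both sums
    include = len(days)
    without = include - wd - hd + common
    # Cross-check against a direct count of days that are neither.
    assert without == sum(1 for d in days if flags[d] == (0, 0))
    return include, without

results = {
    1: day_counts(date(2015, 12, 1), date(2015, 12, 6)),
    2: day_counts(date(2015, 12, 1), date(2015, 12, 13)),
    3: day_counts(date(2015, 12, 3), date(2015, 12, 13)),
}
print(results)
```

With the sample data this gives (6, 3), (13, 7) and (11, 5). These differ from the figures in the question because both end days are counted and holidays are excluded as well as weekends.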
Demo here
Another way, using a cross join:
select t.empid,t.doj,t.dos,datediff(day,t.doj,t.dos) includeweekendandholidays,
datediff(day,t.doj,t.dos)-isnull(t1.wes,0) as witoutincludeweekendandholidayslist
from #emp t left join (
select empid, sum(hd.Holidaystatus+hd.weekendStatus) wes from
#emp emp cross join #holidaylist hd where hd.[Date] between doj
and dateadd(day,-1,dos) group by empid) t1 on t.empid=t1.empid
Sample data:
create table #holidaylist ([Date] date, weekendStatus int, Holidaystatus int)
insert into #holidaylist([Date], weekendStatus, Holidaystatus) values
('2015-12-01' , 0 , 0),
('2015-12-02' , 0 , 0),
('2015-12-03' , 0 , 0),
('2015-12-04' , 1 , 0),
('2015-12-05' , 1 , 0),
('2015-12-06' , 0 , 1),
('2015-12-07' , 0 , 0),
('2015-12-08' , 0 , 0),
('2015-12-09' , 0 , 1),
('2015-12-10' , 0 , 0),
('2015-12-11' , 0 , 0),
('2015-12-12' , 1 , 1),
('2015-12-13' , 1 , 0)
create table #emp (empid int, doj date, dos date)
insert into #emp (empid,doj,dos) values
(1 , '2015-12-01' , '2015-12-06'),
(2 ,'2015-12-01' , '2015-12-13'),
(3 ,'2015-12-03' ,'2015-12-13')
I have a table with a calendar and a table with rates. The rates table has no rows for weekend days. I'm trying to join the two so that every day has a rate, with weekend days taking the latest available rate. Instead of showing NULL, as a left join does when the record doesn't exist, the result should repeat the previous value.
I have the below code, which works, but it takes 2 min to do on 7,397 rows, which is way too long.
Does anyone know a faster way to get the same results?
SELECT
c.CalendarID,
MAX(r.RateID)
FROM Dim_Calendar c
LEFT JOIN Dim_Rates r ON r.RateDate <= c.CalendarID
GROUP BY c.CalendarID
With = instead of <= in the join condition, I get the following:
CalendarID | RateID
20131001 | 2
20131002 | 3
20131003 | 4
20131004 | 5
20131005 | NULL
20131006 | NULL
20131007 | 6
And this is the desired table:
CalendarID | RateID
20131001 | 2
20131002 | 3
20131003 | 4
20131004 | 5
20131005 | 5
20131006 | 5
20131007 | 6
You can use LAG() window function:
SELECT c.CalendarID,
COALESCE(
r.RateID,
LAG(r.RateID, 1) OVER (ORDER BY c.CalendarID),
LAG(r.RateID, 2) OVER (ORDER BY c.CalendarID)
) RateID
FROM Dim_Calendar c LEFT JOIN Dim_Rates r
ON r.RateDate = c.CalendarID
ORDER BY c.CalendarID
See the demo.
Results:
> CalendarID | RateID
> ---------: | :-----
> 20131001 | 2
> 20131002 | 3
> 20131003 | 4
> 20131004 | 5
> 20131005 | 5
> 20131006 | 5
> 20131007 | 6
You could use a correlated subquery to fill the gaps:
SELECT
c.CalendarID,
(SELECT TOP 1 r.RateID FROM Dim_Rates r
WHERE r.RateDate <= c.CalendarID AND r.RateID IS NOT NULL
ORDER BY r.RateDate DESC) AS RateID
FROM Dim_Calendar c
ORDER BY c.CalendarID;
This query can be improved by using the following index:
CREATE INDEX idx ON Dim_Rates (RateDate, RateID);
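To illustrate what the correlated subquery computes for each calendar row (the latest rate dated on or before that day), here is a rough Python sketch using the question's sample values. The sorted list plays a role loosely similar to the suggested index; rates, rate_as_of and filled are names invented for this example.

```python
from bisect import bisect_right

# (RateDate, RateID) rows from the question's sample, sorted by RateDate.
# The weekend days 20131005 and 20131006 have no row.
rates = [(20131001, 2), (20131002, 3), (20131003, 4), (20131004, 5), (20131007, 6)]
rate_dates = [d for d, _ in rates]

def rate_as_of(calendar_id):
    """Return the RateID of the latest rate dated on or before calendar_id."""
    pos = bisect_right(rate_dates, calendar_id)
    return rates[pos - 1][1] if pos else None  # None: no earlier rate exists

calendar = range(20131001, 20131008)
filled = [(day, rate_as_of(day)) for day in calendar]
print(filled)
```

Unlike a fixed number of LAG() lookups, this kind of "as of" lookup fills a gap of any length, which is also what the TOP 1 ... ORDER BY RateDate DESC subquery does.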
As pointed out, you need to check for proper and covering indexes. It appears you are running against a data warehouse database; if that is the case, you can replace the CTE with indexed temp tables when the estimated row count in the query plan is way off.
;WITH NormalizedData AS
(
SELECT
RateID,CalendarID,
VirtualGroupID = SUM(LastRecordBeforeGap) OVER (ORDER BY CalendarID ROWS UNBOUNDED PRECEDING)
FROM
(
SELECT RateID,CalendarID,
LastRecordBeforeGap = CASE WHEN LEAD(RateID) OVER(ORDER BY CalendarID) IS NULL AND RateID IS NOT NULL THEN 1 ELSE 0 END
FROM
Dim_Calendar c
LEFT JOIN Dim_Rates r ON r.RateDate = c.CalendarID
)AS x
)
SELECT
RateID = ISNULL(RateID, SUM(RateID) OVER(PARTITION BY VirtualGroupID)),
CalendarID
FROM
NormalizedData
I have two tables of dates; for the question's sake call them dates1 and dates2. Sometimes a given date is in both, sometimes it is in 1 but not 2, and other times in 2 but not 1.
My original requirement was just a list of all dates from both sets
SELECT Date FROM dates1
UNION
SELECT Date FROM dates2
Easy peasy. New requirement; know which list the dates came from (or both if that is the case). The columns I need are as follows:
Date, IsList1, IsList2
So, some example data:
Dates1
======
Date
====
2017-01-31
2017-02-28
2017-03-31
Dates2
======
Date
====
2017-01-31
2017-04-30
Expected output
Date | IsList1 | IsList2
2017-01-31 | 1 | 1
2017-02-28 | 1 | 0
2017-03-31 | 1 | 0
2017-04-30 | 0 | 1
SQL fiddle with the above data: http://sqlfiddle.com/#!18/9eecb/5425
You'll most likely need to use a FULL OUTER JOIN and some expressions to achieve this.
SELECT ISNULL(D1.[Date], D2.[Date]) AS [Date],
CASE WHEN D1.[Date] IS NULL THEN 0 ELSE 1 END AS IsList1,
CASE WHEN D2.[Date] IS NULL THEN 0 ELSE 1 END AS IsList2
FROM #dates1 D1
FULL OUTER JOIN #dates2 D2 ON D1.[Date] = D2.[Date];
It's also worth noting that your SQL fiddle has 2 INSERT statements into #Dates1 and none into #Dates2, so the result set there is 1 for all of IsList1 and 0 for all of IsList2.
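The effect of the FULL OUTER JOIN with the two CASE expressions can be pictured as a set union with two membership flags. A minimal Python sketch using the question's example data (dates1, dates2 and rows are illustrative names only):

```python
# Example data from the question, one set per table.
dates1 = {"2017-01-31", "2017-02-28", "2017-03-31"}
dates2 = {"2017-01-31", "2017-04-30"}

# Every date from either list, flagged by which list(s) contain it.
rows = [
    (d, int(d in dates1), int(d in dates2))
    for d in sorted(dates1 | dates2)
]
for row in rows:
    print(row)
```

Sets remove duplicates, so this sketch assumes each date appears at most once per table; with duplicates, the SQL join would return more rows than this does.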
;with cte
as
(
select dt.date as tb1,dt1.date as tbl2
from
#dates1 dt
full join
#dates2 dt1
on dt.date =dt1.date
)
select isnull(tb1 ,tbl2 ) as dt,
case when tb1 is not null then 1 else 0 end as list1,
case when tbl2 is not null then 1 else 0 end as list2
from cte
Perhaps another option with a simple aggregation:
Select Date
,InList1=sum(InList1)
,InList2=sum(InList2)
From (
Select Distinct Date,InList1=1,InList2=0 from #dates1
Union All
Select Distinct Date,InList1=0,InList2=1 from #dates2
) A
Group By Date
Returns
Date InList1 InList2
2017-01-31 1 1
2017-02-28 1 0
2017-03-31 1 0
2017-04-30 0 1
I am using SQL Server 2008. When an orderno has duplicate rows and one of them has activityCode = 1, I would like to return only the row with activityCode = 1 and not the one with activityCode = 0.
Rows with activityCode = 0 should also be displayed, but only when the same orderno has no row with activityCode = 1, that is, when its only activityCode is 0. I hope this is clear and makes sense, but let me know if I need to provide more details. Thanks
--create table
create table po_v
(
orderno int,
amount decimal(18, 2),
activityCode int
)
--insert values
insert into po_v values
(170268, 2774.31, 0),
(17001988, 288.82, 0),
(17001988, 433.23, 1),
(170271, 3786, 1),
(170271, 8476, 0),
(170055, 34567, 0)
--Results
170268 | 2774.31 | 0
17001988 | 433.23 | 1
170271 | 3786 | 1
170055 | 34567 | 0
*****Updated*****
I have inserted two new records and the results have been updated. The data in the actual table has other numbers besides 0 and 1. The select statement displays the correct orderno's but I would like the other records for the orderno to display also. The partition only populates one record per orderno. If possible I would like to see the records with the same activityCode.
--insert values
insert into po_v values
(170271, 3799, 1),
(172525, 44445, 2)
--select statement
SELECT Orderno,
Amount,
Activitycode
FROM (SELECT orderno,
amount,
activitycode,
ROW_NUMBER()
OVER(
PARTITION BY orderno
ORDER BY activitycode DESC) AS dup
FROM Po_v)dt
WHERE dt.dup = 1
ORDER BY 1
--select statement results
170055 | 34567 | 0
170268 | 2774.31 | 0
170271 | 3786 | 1
172525 | 44445 | 2
17001988 | 433.23 | 1
--expected results
170055 | 34567 | 0
170268 | 2774.31 | 0
170271 | 3786 | 1
170271 | 3799 | 1
172525 | 44445 | 2
17001988 | 433.23 | 1
Not totally clear what you are trying to do here but this returns the output you are expecting.
select orderno
, amount
, activityCode
from
(
select *
, RowNum = ROW_NUMBER() over(partition by orderno order by activityCode desc)
from po_v
) x
where x.RowNum = 1
---EDIT---
With the new details this is a very different question. As I understand it now, you want all rows that share the max activity code for each orderno. You can do this pretty easily with a CTE.
with MyGroups as
(
select orderno
, Activitycode = max(activitycode)
from po_v
group by orderno
)
select *
from po_v p
join MyGroups g on g.orderno = p.orderno
and g.Activitycode = p.Activitycode
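To check the idea against the updated sample data, here is a short Python sketch of the same two steps: find the highest activityCode per orderno, then keep every row that carries it. The variable names (po_v as a list, max_code, kept) are made up for this example.

```python
# All rows inserted in the question, including the two added in the update.
po_v = [
    (170268, 2774.31, 0), (17001988, 288.82, 0), (17001988, 433.23, 1),
    (170271, 3786, 1), (170271, 8476, 0), (170055, 34567, 0),
    (170271, 3799, 1), (172525, 44445, 2),
]

# Step 1 (the MyGroups CTE): highest activityCode per orderno.
max_code = {}
for orderno, _, code in po_v:
    max_code[orderno] = max(code, max_code.get(orderno, code))

# Step 2 (the join): keep every row whose activityCode equals that maximum.
kept = sorted(row for row in po_v if row[2] == max_code[row[0]])
for row in kept:
    print(row)
```

Both rows for orderno 170271 with activityCode 1 are kept, which is the difference from the ROW_NUMBER() version that returns one row per orderno.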
Try this
SELECT Orderno,
Amount,
Activitycode
FROM (SELECT orderno,
amount,
activitycode,
ROW_NUMBER()
OVER(
PARTITION BY orderno
ORDER BY activitycode DESC) AS dup
FROM Po_v)dt
WHERE dt.dup = 1
ORDER BY 1
Result
Orderno Amount Activitycode
------------------------------------
170055 34567.00 0
170268 2774.31 0
170271 3786.00 1
17001988 433.23 1
In my table, I have a primary key and a date. What I'd like to achieve is to have an incremental label based on whether or not there is a break between the dates - column Goal.
Now, below is an example. The break column was calculated using LEAD function (I thought it might help).
I am able to solve it using T-SQL, but this would be last resort. Nothing I tried has worked so far. I am using MSSQL 2014.
PK | Date | break | Goal |
-------------------------------
1 | 03/2017 | 0 | 1 |
1 | 04/2017 | 0 | 1 |
1 | 08/2017 | 1 | 2 |
1 | 09/2017 | 0 | 2 |
1 | 10/2017 | 0 | 2 |
1 | 02/2018 | 1 | 3 |
1 | 03/2018 | 0 | 3 |
Here is code to reproduce this example:
CREATE TABLE #test
(
ConsumerId INT,
FullDate DATE,
Goal INT
)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-03-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-04-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-08-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-09-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-10-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-02-01',3)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-03-01',3)
SELECT ConsumerId,
FullDate,
CASE WHEN (datediff(month,
isnull(
LEAD (FullDate,1) OVER (PARTITION BY ConsumerId ORDER BY FullDate DESC),
FullDate),
FullDate) > 1)
THEN 1
ELSE 0
END AS break,
Goal
FROM #test
ORDER BY FullDate ASC
EDIT
This is apparently a famous problem, "Islands and gaps", as pointed out in the comments. Google offers many solutions, as do other questions here on SO.
Try this...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
An explanation of the code and how it works...
The 1st query, in cte_TestGap, uses the LAG function along with the ROW_NUMBER() function to mark the location of each gap in the data. We can see that by breaking it out and looking at its results...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
)
SELECT * FROM cte_TestGap;
cte_TestGap results...
ConsumerId FullDate Gap
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 0
1 2017-08-01 3
1 2017-09-01 0
1 2017-10-01 0
1 2018-02-01 6
1 2018-03-01 0
At this point we want the 0 values to take on the value of the preceding non-0 values, allowing them to be grouped together. This is done in the 2nd query (cte_SmearGap) using the MAX function with a "window frame". So if we look at the output of cte_SmearGap, we can see that...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT * FROM cte_SmearGap;
cte_SmearGap results...
ConsumerId FullDate GV
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 3
1 2017-09-01 3
1 2017-10-01 3
1 2018-02-01 6
1 2018-03-01 6
At this point all of the rows are in distinct groups, but we'd like to have our group numbers in a contiguous sequence (1,2,3) as opposed to (1,3,6).
Of course that's easy enough to fix using the DENSE_RANK() function, which is what's happening in the final select...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
The end result...
ConsumerId FullDate GroupValue
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 2
1 2017-09-01 2
1 2017-10-01 2
1 2018-02-01 3
1 2018-03-01 3
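The three steps above can also be traced outside SQL. The following Python sketch mirrors them for the sample dates, assuming a single ConsumerId and rows already sorted by FullDate; month_index, gap, smeared and group_value are names chosen for this illustration.

```python
from datetime import date

full_dates = [
    date(2017, 3, 1), date(2017, 4, 1), date(2017, 8, 1), date(2017, 9, 1),
    date(2017, 10, 1), date(2018, 2, 1), date(2018, 3, 1),
]

def month_index(d):
    """Months since year 0, so consecutive months differ by exactly 1."""
    return d.year * 12 + d.month

# Step 1 (cte_TestGap): 0 if the previous row is exactly one month earlier,
# otherwise the row number, which marks the start of a new island.
gap = []
for i, d in enumerate(full_dates):
    consecutive = i > 0 and month_index(d) - month_index(full_dates[i - 1]) == 1
    gap.append(0 if consecutive else i + 1)

# Step 2 (cte_SmearGap): running maximum, so each 0 takes the preceding marker.
smeared = []
running = 0
for g in gap:
    running = max(running, g)
    smeared.append(running)

# Step 3 (final select): dense rank turns (1, 3, 6) into (1, 2, 3).
rank_of = {value: rank for rank, value in enumerate(sorted(set(smeared)), start=1)}
group_value = [rank_of[v] for v in smeared]
print(gap, smeared, group_value)
```

The three printed lists correspond to the Gap, GV and GroupValue columns in the result sets shown above.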
The comment from David Browne was actually extremely useful. If you google "Islands and Gaps", there are many variations of the solution. Below is the one I liked the most.
In the end, I needed the Goal column to be able to group the dates into MIN/MAX. This solution skips this step and directly creates the aggregated range.
Here is the source.
SELECT MIN(FullDate) AS range_start,
MAX(FullDate) AS range_end
FROM (
SELECT FullDate,
DATEADD(MM, -1 * ROW_NUMBER() OVER(ORDER BY FullDate), FullDate) AS grp
FROM #test
) a
GROUP BY a.grp
And the output:
range_start | range_end |
--------------------------
2017-03-01 | 2017-04-01 |
2017-08-01 | 2017-10-01 |
2018-02-01 | 2018-03-01 |
Say we have such a table:
create table #periods (
s date,
e date,
t tinyint
);
with date intervals without gaps ordered by start date (s)
insert into #periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
All date intervals have different types (t).
It is required to combine date intervals of the same type where they are not broken by intervals of the other types (having all intervals ordered by start date).
So the result table should look like:
s | e | t
------------|------------|-----
2013-01-01 | 2013-01-02 | 3
2013-01-02 | 2013-01-05 | 1
2013-01-05 | 2013-01-08 | 2
2013-01-08 | 2013-01-09 | 1
Any ideas how to do this without cursor?
I've got one working solution:
create table #periods (
s datetime primary key clustered,
e datetime,
t tinyint,
period_number int
);
insert into #periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
declare @anchor date;
update #periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t,
@anchor = s
option (maxdop 1);
select
s = min(s),
e = max(e),
t = min(t)
from
#periods
group by
period_number
order by
s;
but I doubt whether I can rely on this behavior of the UPDATE statement.
I use SQL Server 2008 R2.
Edit:
Thanks to Daniel and this article: http://www.sqlservercentral.com/articles/T-SQL/68467/
I found three important things that were missing from the solution above:
1. There must be a clustered index on the table.
2. There must be an anchor variable that references the clustered column.
3. The UPDATE statement must be executed by a single processor, i.e. without parallelism.
I've changed the above solution in accordance with these rules.
Since your ranges are continuous, the problem essentially becomes a gaps-and-islands one. If only you had a criterion to help you to distinguish between different sequences with the same t value, you could group all the rows using that criterion, then just take MIN(s), MAX(e) for every group.
One method of obtaining such a criterion is to use two ROW_NUMBER calls. Consider the following query:
SELECT
*,
rnk1 = ROW_NUMBER() OVER ( ORDER BY s),
rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM #periods
;
For your example it would return the following set:
s e t rnk1 rnk2
---------- ---------- -- ---- ----
2013-01-01 2013-01-02 3 1 1
2013-01-02 2013-01-04 1 2 1
2013-01-04 2013-01-05 1 3 2
2013-01-05 2013-01-06 2 4 1
2013-01-06 2013-01-07 2 5 2
2013-01-07 2013-01-08 2 6 3
2013-01-08 2013-01-09 1 7 3
The interesting thing about the rnk1 and rnk2 rankings is that if you subtract one from the other, you will get values that, together with t, uniquely identify every distinct sequence of rows with the same t:
s e t rnk1 rnk2 rnk1 - rnk2
---------- ---------- -- ---- ---- -----------
2013-01-01 2013-01-02 3 1 1 0
2013-01-02 2013-01-04 1 2 1 1
2013-01-04 2013-01-05 1 3 2 1
2013-01-05 2013-01-06 2 4 1 3
2013-01-06 2013-01-07 2 5 2 3
2013-01-07 2013-01-08 2 6 3 3
2013-01-08 2013-01-09 1 7 3 4
Knowing that, you can easily apply grouping and aggregation. This is what the final query might look like:
WITH partitioned AS (
SELECT
*,
g = ROW_NUMBER() OVER ( ORDER BY s)
- ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM #periods
)
SELECT
s = MIN(s),
e = MAX(e),
t
FROM partitioned
GROUP BY
t,
g
;
If you like, you can play with this solution at SQL Fiddle.
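For readers who want to trace the rnk1 - rnk2 idea step by step, here is a small Python sketch of the same grouping on the sample rows. It is only an illustration of the technique, not a translation of the query plan; periods, seen_per_type, islands and merged are invented names.

```python
from collections import defaultdict

# (s, e, t) rows from the question; ISO date strings sort chronologically.
periods = [
    ("2013-01-01", "2013-01-02", 3), ("2013-01-02", "2013-01-04", 1),
    ("2013-01-04", "2013-01-05", 1), ("2013-01-05", "2013-01-06", 2),
    ("2013-01-06", "2013-01-07", 2), ("2013-01-07", "2013-01-08", 2),
    ("2013-01-08", "2013-01-09", 1),
]

seen_per_type = defaultdict(int)  # running rnk2: row number within each t
islands = {}                      # (t, rnk1 - rnk2) -> [min s, max e]
for rnk1, (s, e, t) in enumerate(sorted(periods), start=1):
    seen_per_type[t] += 1
    key = (t, rnk1 - seen_per_type[t])
    if key in islands:
        islands[key][0] = min(islands[key][0], s)
        islands[key][1] = max(islands[key][1], e)
    else:
        islands[key] = [s, e]

merged = sorted((s, e, t) for (t, _), (s, e) in islands.items())
for row in merged:
    print(row)
```

The difference alone is not enough as a key, which is why t is part of it: two different types could happen to share the same rnk1 - rnk2 value.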
How about this?
create table #periods (
s datetime primary key,
e datetime,
t tinyint,
s2 datetime
);
insert into #periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
update #periods set s2 = s;
while @@ROWCOUNT > 0
begin
update p2 SET s2=p1.s
from #periods p1
join #periods p2 ON p2.t = p1.t AND p2.s2 = p1.e;
end
select s2 as s, max(e) as e, min(t) as t
from #periods
group by s2
order by s2;
A possible solution that avoids both UPDATE and a cursor is to use common table expressions,
like this:
create table #periods (
s date,
e date,
t tinyint
);
insert into #periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
with cte as ( select 0 as n
,p.s as s
,p.e as e
,p.t
,case when p2.s is null then 1 else 0 end fl_s
,case when p3.e is null then 1 else 0 end fl_e
from #periods p
left outer join #periods p2
on p2.e = p.s
and p2.t = p.t
left outer join #periods p3
on p3.s = p.e
and p3.t = p.t
union all
select n+1 as n
, p2.s as s
, p.e as e
,p.t
,case when not exists(select * from #periods p3 where p3.e =p2.s and p3.t=p2.t) then 1 else 0 end as fl_s
,p.fl_e as fl_e
from cte p
inner join #periods p2
on p2.e = p.s
and p2.t = p.t
where p.fl_s = 0
union all
select n+1 as n
, p.s as s
, p2.e as e
,p.t
,p.fl_s as fl_s
,case when not exists(select * from #periods p3 where p3.s =p2.e and p3.t=p2.t) then 1 else 0 end as fl_e
from cte p
inner join #periods p2
on p2.s = p.e
and p2.t = p.t
where p.fl_s = 1
and p.fl_e = 0
)
,result as (select s,e,t,COUNT(*) as count_lines
from cte
where fl_e = 1
and fl_s = 1
group by s,e,t
)
select * from result
option(maxrecursion 0)
Result set:
s e t count_lines
2013-01-01 2013-01-02 3 1
2013-01-02 2013-01-05 1 2
2013-01-05 2013-01-08 2 3
2013-01-08 2013-01-09 1 1
Hooray! I've found the solution that suits me and it is done without iteration
with cte1 as (
select s, t from #periods
union all
select max(e), null from #periods
),
cte2 as (
select rn = row_number() over(order by s), s, t from cte1
),
cte3 as (
select
rn = row_number() over(order by a.rn),
a.s,
a.t
from
cte2 a
left join cte2 b on a.rn = b.rn + 1 and a.t = b.t
where
b.rn is null
)
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
Thanks everyone for sharing your thoughts and solutions!
Details:
cte1 returns the chain of dates with the types after them:
s t
---------- ----
2013-01-01 3
2013-01-02 1
2013-01-04 1
2013-01-05 2
2013-01-06 2
2013-01-07 2
2013-01-08 1
2013-01-09 NULL -- there is no type *after* the last date
cte2 just adds a row number to the above result:
rn s t
---- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-04 1
4 2013-01-05 2
5 2013-01-06 2
6 2013-01-07 2
7 2013-01-08 1
8 2013-01-09 NULL
If we output all the fields from the query in cte3 without the where condition, we get the following results:
select * from cte2 a left join cte2 b on a.rn = b.rn + 1 and a.t = b.t;
rn s t rn s t
---- ---------- ---- ------ ---------- ----
1 2013-01-01 3 NULL NULL NULL
2 2013-01-02 1 NULL NULL NULL
3 2013-01-04 1 2 2013-01-02 1
4 2013-01-05 2 NULL NULL NULL
5 2013-01-06 2 4 2013-01-05 2
6 2013-01-07 2 5 2013-01-06 2
7 2013-01-08 1 NULL NULL NULL
8 2013-01-09 NULL NULL NULL NULL
For the dates where the type is repeated, there are values on the right side of the results. So we can just remove all the lines where values exist on the right side.
So cte3 returns:
rn s t
----- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-05 2
4 2013-01-08 1
5 2013-01-09 NULL
Note that removing some rows leaves gaps in the rn sequence, so we have to renumber the rows.
From here only one thing is left: to transform the dates into periods:
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
and we've got the required result:
s e t
---------- ---------- ----
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-08 2
2013-01-08 2013-01-09 1
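The same boundary-point reasoning can be followed in a few lines of Python. This is only a sketch of the logic on the sample data, with points, boundaries and merged as illustrative names standing in for cte1/cte2, cte3 and the final select.

```python
# (s, e, t) rows from the question; ISO date strings sort chronologically.
periods = [
    ("2013-01-01", "2013-01-02", 3), ("2013-01-02", "2013-01-04", 1),
    ("2013-01-04", "2013-01-05", 1), ("2013-01-05", "2013-01-06", 2),
    ("2013-01-06", "2013-01-07", 2), ("2013-01-07", "2013-01-08", 2),
    ("2013-01-08", "2013-01-09", 1),
]

# cte1 / cte2: every start date with its type, plus the final end date,
# which has no type after it.
points = sorted((s, t) for s, _, t in periods)
points.append((max(e for _, e, _ in periods), None))

# cte3: drop each point whose type repeats the previous point's type.
boundaries = [
    p for i, p in enumerate(points)
    if i == 0 or p[1] != points[i - 1][1]
]

# Final select: each boundary runs until the next boundary begins.
merged = [(a[0], b[0], a[1]) for a, b in zip(boundaries, boundaries[1:])]
for row in merged:
    print(row)
```

As in the SQL, this relies on the intervals being contiguous: the end of a merged period is taken from the start of the next one.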
This is your solution with different data in the table:
create table #periods (
s datetime primary key,
e datetime,
t tinyint,
period_number int
);
insert into #periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-09' , '2013-01-10', 2),
('2013-01-10' , '2013-01-11', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
update #periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t;
select
s = min(s),
e = max(e),
t = min(t)
from
#periods
group by
period_number
order by
s;
where there is a gap between
('2013-01-05' , '2013-01-06', 2),
--and
('2013-01-09' , '2013-01-10', 2),
Your solution's result set is:
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-10 2
2013-01-10 2013-01-11 1
Wasn't the expected result set something like this?
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-06 2
2013-01-09 2013-01-10 2
2013-01-10 2013-01-11 1
Maybe I misunderstood the rules of your problem.