Group by month and year of a date column - sql-server

My query is as follows:
SELECT
MAX(Reimbursement_EBSUtilization.Id) AS Id,
ProviderReimbursementRequest.Contractor_Id,
Reimbursement_EBSUtilization.ServiceMonth,
fContractor.ContractorName,
Reimbursement_EBSUtilization.SD_Id,
MAX(StandardUnits) AS StandardUnits,
MAX(Rate) AS Rate,
SUM(Reimbursement_EBSUtilization.UnitsDelivered) AS UnitsDelivered,
NULL AS ReduceUnits,
CAST(1 AS bit) AS IsEbs,
Reimbursement_EBSUtilization.BHFormName,
fExpenseType.ExpenseType,
CASE
WHEN Reimbursement_EBSUtilization.BHFormName IS NULL THEN MAX(Rate) * SUM(Reimbursement_EBSUtilization.UnitsDelivered) * ISNULL(MAX(Reimbursement_EBSUtilization.StandardUnits), 0)
ELSE (CASE
WHEN fExpenseType.ExpenseType = 'Payable' THEN SUM(ISNULL(Reimbursement_BHForms.ReimburseAmount, 0)) - SUM(ISNULL(Reimbursement_BHForms.ReducedAmount, 0))
ELSE 0
END) -
(CASE
WHEN fExpenseType.ExpenseType = 'Offset' THEN SUM(ISNULL(Reimbursement_BHForms.ReimburseAmount, 0)) - SUM(ISNULL(Reimbursement_BHForms.ReducedAmount, 0))
ELSE 0
END)
END AS ReimbursementAmount
FROM
ProviderReimbursementRequest
LEFT JOIN
Reimbursement_EBSUtilization ON ProviderReimbursementRequest.Id = Reimbursement_EBSUtilization.PRR_Id
LEFT JOIN
Reimbursement_BHForms ON Reimbursement_EBSUtilization.Id = Reimbursement_BHForms.REU_Id
LEFT JOIN
fExpenseCategory ON Reimbursement_BHForms.EC_Id = fExpenseCategory.ID
LEFT JOIN
fExpenseType ON fExpenseCategory.ExpenseType = fExpenseType.Id
LEFT JOIN
fContractor ON ProviderReimbursementRequest.Contractor_Id = fContractor.Id
WHERE
MRR_Id = @MrrId
AND Reimbursement_EBSUtilization.SD_Id = @ServiceDetailId
GROUP BY
ProviderReimbursementRequest.Contractor_Id,
Reimbursement_EBSUtilization.ServiceMonth,
fContractor.ContractorName,
Reimbursement_EBSUtilization.SD_Id,
Reimbursement_EBSUtilization.BHFormName,
fExpenseType.ExpenseType
Executing it gives this result:
Id | Contractor_Id | ServiceMonth | ContractorName | SD_Id | StandardUnits | Rate | UnitsDelivered | ReduceUnits | IsEbs | BHFormName | ExpenseType | ReimbursementAmount
3976 | 845 | 2016-05-01 | Payments SC1 | 2867 | 1.00 | 10.00 | 20 | NULL | 1 | NULL | NULL | 200.00
3966 | 845 | 2016-07-31 | Payments SC1 | 2867 | 1.00 | 10.00 | NULL | NULL | 1 | NULL | NULL | NULL
3974 | 846 | 2016-07-01 | Payments SC2 | 2867 | 1.00 | 10.00 | 100 | NULL | 1 | NULL | NULL | 1000.00
3970 | 846 | 2016-07-31 | Payments SC2 | 2867 | 1.00 | 10.00 | 20 | NULL | 1 | NULL | NULL | 200.00
3978 | 847 | 2016-07-31 | Payments SC3 | 2867 | 1.00 | 10.00 | 30 | NULL | 1 | NULL | NULL | 300.00
3983 | 847 | 2016-08-01 | Payments SC3 | 2867 | 1.00 | 10.00 | NULL | NULL | 1 | NULL | NULL | NULL
Look at the ServiceMonth column for Contractor_Id = 846: there are two records in the same month, one dated 2016-07-01 and the other 2016-07-31.
Because both dates fall in the same month and year, I want the output to combine them into a single row.
Can anyone help with this?

You really should get in the habit of using aliases and formatting your queries. As posted that query is impossible to decipher. With just some aliases and a little formatting it is a lot cleaner.
select Max(ru.Id) as Id
, prr.Contractor_Id
, ru.ServiceMonth
, c.ContractorName
, ru.SD_Id
, Max(StandardUnits) as StandardUnits
, max(Rate) as Rate
, sum(ru.UnitsDelivered) as UnitsDelivered
, null as ReduceUnits
, Cast(1 as BIT) as IsEbs
, ru.BHFormName
, et.ExpenseType
, case when ru.BHFormName is null
then max(Rate) * sum(ru.UnitsDelivered) * ISNULL(max(ru.StandardUnits),0)
else
(
case when et.ExpenseType = 'Payable'
then sum(ISNULL(f.ReimburseAmount,0)) - sum(ISNULL(f.ReducedAmount,0))
else 0
end
) -
(
case when et.ExpenseType = 'Offset'
then sum(ISNULL(f.ReimburseAmount,0)) - sum(ISNULL(f.ReducedAmount,0))
else 0
end
) end as ReimbursementAmount
from ProviderReimbursementRequest prr
left join Reimbursement_EBSUtilization ru on prr.Id = ru.PRR_Id
left join Reimbursement_BHForms f on ru.Id = f.REU_Id
left join fExpenseCategory ec on f.EC_Id = ec.ID
left join fExpenseType et on ec.ExpenseType = et.Id
left join fContractor c on prr.Contractor_Id = c.Id
where MRR_Id = @MrrId
and ru.SD_Id = @ServiceDetailId
group by prr.Contractor_Id
, ru.ServiceMonth
, c.ContractorName
, ru.SD_Id
, ru.BHFormName
, et.ExpenseType
To actually help with your problem I think we need a bit more detail. This is a great place to start. http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/
--EDIT--
If I understand the issue you need to group by the first of the month in ru.ServiceMonth instead of the actual value.
Something like this.
dateadd(month, datediff(month, 0, ru.ServiceMonth), 0)
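A minimal sketch of how that expression could be used in both the SELECT list and the GROUP BY. It assumes a simplified, hypothetical table variable @util in place of the real joins; the column names are borrowed from the question, while the data and the alias ServiceMonthStart are made up for illustration.

```sql
-- Hypothetical sample data; the real query joins several tables.
DECLARE @util TABLE (Contractor_Id int, ServiceMonth datetime, UnitsDelivered int);
INSERT INTO @util (Contractor_Id, ServiceMonth, UnitsDelivered) VALUES
    (846, '20160701', 100),
    (846, '20160731', 20),
    (847, '20160731', 30),
    (847, '20160801', 5);

SELECT  Contractor_Id,
        -- whole months since 1900-01-01, added back to that date => first day of the month
        DATEADD(month, DATEDIFF(month, 0, ServiceMonth), 0) AS ServiceMonthStart,
        SUM(UnitsDelivered) AS UnitsDelivered
FROM    @util
GROUP BY Contractor_Id,
         DATEADD(month, DATEDIFF(month, 0, ServiceMonth), 0)
ORDER BY Contractor_Id, ServiceMonthStart;
-- Contractor 846: one row for 2016-07-01 with UnitsDelivered = 120
-- Contractor 847: two rows (2016-07-01 with 30, 2016-08-01 with 5)
```

The same expression has to replace ru.ServiceMonth in both places of the original query. Grouping by the raw column while selecting the truncated value would still produce one row per distinct date.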

Related

Count Days in SQL SERVER in Case When Clause

I have this table:
1 2013-10-01 08:21 Null Null Null
1 2013-10-01 14:30 Null Null Null
2 2013-10-01 08:31 Null Lt Null
2 2013-10-01 14:31 EO Null Null
3 2013-10-01 14:30 EO Null Ab here
This table is the result of this query:
SELECT m.ID,L.Log_D,W.Sat,L.C,
(CASE WHEN convert(time,L.Log_D)>'08:31:00' and convert(time,L.Log_D)<'10:30:00' and L.C =1 then 'Lt'end )as Late,
(CASE WHEN convert(time,L.Log_D)<'13:30:00' and L.C =2 then 'Ab'end )as EarlyOut,
(CASE WHEN DATENAME(DAY, day( L.Log_D)) =2 THEN 'Ab' END) as Counte
from WorkPeriod W,LogT L,MinimumInfoT m where day(L.Log_D) =1
and month(L.Log_D) =10
and year(L.Log_D) =2013
and W.id=54
and m.Branch_ID=35
and L.C in(1,2)
and M.ID =L.EmpId
and W.id =m.W_Period
group by m.ID,L.Log_D,W.Sat,L.C
order by m.ID
I need the Absent column to display 'Ab' when the count of rows for the day is less than 2 for each Id. Is that possible? Any help will be appreciated.
You can use a window function to do this:
CASE WHEN COUNT(*) OVER (PARTITION BY m.ID) < 2 THEN 'Ab' ELSE NULL END as Counte
This counts the records for each distinct m.ID. If the count for the current m.ID is less than 2, the expression returns 'Ab'; otherwise it returns NULL.
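A small sketch of the windowed count on its own, assuming a hypothetical table variable @log with made-up rows (the column names EmpId and Log_D follow the question; RowsForEmp is only there to make the count visible):

```sql
-- Hypothetical sample rows: employee 1 logged twice, employee 3 once.
DECLARE @log TABLE (EmpId int, Log_D datetime);
INSERT INTO @log (EmpId, Log_D) VALUES
    (1, '2013-10-01T08:21:00'),
    (1, '2013-10-01T14:30:00'),
    (3, '2013-10-01T14:30:00');

SELECT  EmpId,
        Log_D,
        COUNT(*) OVER (PARTITION BY EmpId) AS RowsForEmp,   -- rows sharing this EmpId
        CASE WHEN COUNT(*) OVER (PARTITION BY EmpId) < 2 THEN 'Ab' END AS Counte
FROM    @log
ORDER BY EmpId, Log_D;
-- EmpId 1 has 2 rows, so Counte is NULL on both; EmpId 3 has 1 row, so Counte is 'Ab'
```

This works here because the question's query is already filtered to a single day. If several days were returned at once, the partition would probably also need the day, for example PARTITION BY EmpId, CAST(Log_D AS date).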

How to Format SQL Query? Use Pivot?

Here is my Query. The results that I get are correct but I'm having trouble getting it in the desired format. I've tried to use Pivot, but I get errors. Any ideas?
Query:
DECLARE @SMonth DATETIME
SET @SMonth = '12/01/2015'
SELECT
SMonth 'Sales Month',
c.CustNumber 'Customer',
b.Description 'Brand',
Sum (SaleQuantity) 'Qty'
FROM
DistStructure.Customer c
JOIN Sales.Sale s ON s.CustId = c.CustId
JOIN Sales.Import i on i.ImportRefId = s.ImportRefId
JOIN AppSecurity.Log l on l.LogId = s.ImportRefId
JOIN Sales.Prod p on p.ProdId = s.ProdId
JOIN Sales.Brand b on b.BrandId = p.BrandId
WHERE
s.SMonth = @SMonth AND
i.ImportStatId = 50
Group By
CustNumber,
SMonth,
Description
Order By
CustNumber
Query Results:
Sales Month Customer Brand Qty
----------------------------------------------------
2015-12-01 00:00:00.000 030554 FS 29
2015-12-01 00:00:00.000 030554 BS 5
2015-12-01 00:00:00.000 032204 FZ 21
2015-12-01 00:00:00.000 032204 BS 14
2015-12-01 00:00:00.000 032204 FS 114
2015-12-01 00:00:00.000 034312 FZ 8
2015-12-01 00:00:00.000 034312 FS 104
2015-12-01 00:00:00.000 034312 BS 16
2015-12-01 00:00:00.000 034983 FS 63
2015-12-01 00:00:00.000 034983 BS 18
2015-12-01 00:00:00.000 034983 FZ 3
Desired Format:
Note: The Customer should be rolled up by Brand (so there is only one row per Customer) and then totaled. If the Brand has no data a zero should be placed in the spot.
Sales Month Customer BS FS FZ Total
--------------------------------------------------------------
2015-12-01 00:00:00.000 030554 5 29 0 34
2015-12-01 00:00:00.000 032204 14 114 21 149
2015-12-01 00:00:00.000 034312 16 104 8 128
2015-12-01 00:00:00.000 034983 18 63 3 84
Here is one way using Conditional Aggregate to alter your existing query to get the desired result format.
;with cte as
(
SELECT [Sales Month]=SMonth,
[Customer]= c.CustNumber,
[BS] = Sum(CASE WHEN b.Description = 'BS' THEN SaleQuantity ELSE 0 END),
[FS]= Sum(CASE WHEN b.Description = 'FS' THEN SaleQuantity ELSE 0 END),
[FZ]= Sum(CASE WHEN b.Description = 'FZ' THEN SaleQuantity ELSE 0 END)
FROM DistStructure.Customer c
JOIN Sales.Sale s
ON s.CustId = c.CustId
JOIN Sales.Import i
ON i.ImportRefId = s.ImportRefId
JOIN AppSecurity.Log l
ON l.LogId = s.ImportRefId
JOIN Sales.Prod p
ON p.ProdId = s.ProdId
JOIN Sales.Brand b
ON b.BrandId = p.BrandId
WHERE s.SMonth = @SMonth
AND i.ImportStatId = 50
GROUP BY CustNumber,
SMonth
)
SELECT [Sales Month],
[Customer],
[BS],
[FS],
[FZ],
TOTAL=[BS] + [FS] + [FZ]
FROM CTE
ORDER BY [Customer]
Note: If the number of brands is unknown, you need to use dynamic SQL.
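One way such dynamic code might look is sketched below. It builds one SUM(CASE ...) column per distinct brand and runs the statement with sp_executesql. The temp table #Sales and its columns are hypothetical stand-ins for the question's joined query, and the FOR XML PATH concatenation is just one common way to assemble the column list.

```sql
-- Hypothetical temp table standing in for the question's joined query.
-- A real temp table is used because dynamic SQL cannot see table variables.
CREATE TABLE #Sales (SalesMonth datetime, Customer varchar(6), Brand varchar(10), Qty int);
INSERT INTO #Sales (SalesMonth, Customer, Brand, Qty) VALUES
    ('20151201', '030554', 'FS', 29),
    ('20151201', '030554', 'BS', 5),
    ('20151201', '032204', 'FZ', 21);

DECLARE @sums nvarchar(max), @sql nvarchar(max);

-- One "SUM(CASE WHEN Brand = 'X' THEN Qty ELSE 0 END) AS [X]" per distinct brand.
-- QUOTENAME escapes both the string literal and the column name.
SELECT @sums = STUFF((
        SELECT ', SUM(CASE WHEN Brand = ' + QUOTENAME(Brand, '''')
             + ' THEN Qty ELSE 0 END) AS ' + QUOTENAME(Brand)
        FROM (SELECT DISTINCT Brand FROM #Sales) AS b
        ORDER BY Brand
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, '');

SET @sql = N'SELECT SalesMonth, Customer, ' + @sums + N', SUM(Qty) AS Total
FROM #Sales
GROUP BY SalesMonth, Customer
ORDER BY Customer;';

EXEC sys.sp_executesql @sql;

DROP TABLE #Sales;
```

Brands missing for a customer come out as 0 because of the ELSE 0 branch, which matches the desired format in the question.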
I believe this is what you are looking for:
/*
Setup Sample Table
*/
declare @t table
(
[Sales Month] datetime,
Customer nvarchar(6),
Brand nvarchar(2),
Qty tinyint
)
/*
Populate Sample Table
*/
insert into @t
([Sales Month], Customer, Brand, Qty)
values ('2015-12-01', '030554', N'FS', 29),
('2015-12-01', '030554', N'BS', 5),
('2015-12-01', '032204', N'FZ', 21),
('2015-12-01', '032204', N'BS', 14),
('2015-12-01', '032204', N'FS', 114),
('2015-12-01', '034312', N'FZ', 8),
('2015-12-01', '034312', N'FS', 104),
('2015-12-01', '034312', N'BS', 16),
('2015-12-01', '034983', N'FS', 63),
('2015-12-01', '034983', N'BS', 18),
('2015-12-01', '034983', N'FZ', 3)
/*
Generating desired output
*/
select pvt.[Sales Month],
pvt.Customer,
isnull(pvt.BS, 0) as BS,
isnull(pvt.FS, 0) as FS,
isnull(pvt.FZ, 0) as FZ,
isnull(pvt.BS, 0) + isnull(pvt.FS, 0) + isnull(pvt.FZ, 0) as Total
from @t as t pivot
( sum(Qty) for Brand in (BS, FS, FZ) ) as pvt

Days difference in SQL Server

I have a question about SQL Server.
Table: holidaylist
Date | weekendStatus | Holidaystatus
2015-12-01 | 0 | 0
2015-12-02 | 0 | 0
2015-12-03 | 0 | 0
2015-12-04 | 1 | 0
2015-12-05 | 1 | 0
2015-12-06 | 0 | 1
2015-12-07 | 0 | 0
2015-12-08 | 0 | 0
2015-12-09 | 0 | 1
2015-12-10 | 0 | 0
2015-12-11 | 0 | 0
2015-12-12 | 1 | 1
2015-12-13 | 1 | 0
Table: emp
empid | doj | dos
1 | 2015-12-01 | 2015-12-06
2 |2015-12-01 | 2015-12-13
3 |2015-12-03 |2015-12-13
I want to get the difference in days between doj and dos in two forms: excluding weekend and holiday days (weekendStatus and Holidaystatus), and including them.
I want output like this:
Empid | doj | dos |includeweekendandholidays | witoutincludeweekendandholidayslist
1 | 2015-12-01 |2015-12-06 | 5 | 3
2 | 2015-12-01 |2015-12-13 | 12 | 8
3 | 2015-12-03 |2015-12-13 | 10 | 6
I tried this query:
select
a.empid, a.doj, a.dos,
case
when b.weekendstatus = 1 and c.Holidaystatus = 1
then datediff(day, c.date, b.date)
end as includeweekenandholidays
case
when b.weekendstatus != 1 or c.Holidaystatus = 1
then datediff(day, c.date, b.date)
end as witoutincludeweekendandholidayslist
from
emp a
left join
holidaylist b on a.doj = b.date
left join
holidaylist c on a.dos = c.date
The query above does not give the expected result. Please tell me how to write a query to achieve this in SQL Server.
Try this:
select e.empid,
e.doj, e.dos,
IncludeRest = (select count(h.date) from holidaylist h where e.doj<=h.date AND e.dos>=h.date),
ExcludeRest = (select count(h.date) from holidaylist h where e.doj<=h.date AND e.dos>=h.date AND h.weekendstatus = 0 AND h.Holidaystatus = 0)
from emp e
you can use a CASE in your COUNT to determine whether or not to count that day..
SELECT
e.empid,
e.doj,
e.dos,
COUNT(*) includeweekendandholidays,
COUNT(CASE WHEN Holidaystatus = 0
AND [weekendStatus] = 0 THEN 1
END) withoutincludeweekendandholidayslist
FROM
emp e
JOIN holidaylist hl ON hl.Date >= e.doj
AND hl.Date < e.dos
GROUP BY
e.empid,
e.doj,
e.dos
This might perform better since it only joins to holidaylist table on records you need..
SELECT
e.empid,
e.doj,
e.dos,
DATEDIFF(DAY, e.doj, e.dos) includeweekendandholidays,
COUNT(*) withoutincludeweekendandholidayslist
FROM
emp e
JOIN holidaylist hl ON hl.Date BETWEEN e.doj AND e.dos
WHERE
weekendStatus = 0
AND Holidaystatus = 0
GROUP BY
e.empid,
e.doj,
e.dos,
DATEDIFF(DAY, e.doj, e.dos)
I don't get your output though since it only appears that you're excluding weekends and not holidays..
You can use OUTER APPLY:
SELECT a.empid, a.doj, a.dos,
DATEDIFF(d, a.doj, a.dos) + 1 AS include,
DATEDIFF(d, a.doj, a.dos) + 1 - b.wd - b.hd + b.common AS without
FROM emp AS a
OUTER APPLY (
SELECT SUM(weekendStatus) AS wd,
SUM(Holidaystatus) AS hd,
COUNT(CASE WHEN weekendStatus = 1 AND Holidaystatus = 1 THEN 1 END) AS common
FROM holidaylist
WHERE [Date] BETWEEN a.doj AND a.dos) AS b
For each row of table emp, OUTER APPLY calculates weekendStatus=1 and Holidaystatus=1 rows that correspond to the interval of this row.
Calculated fields selected:
include is the total number of days of the emp interval including weekend days and holidays.
without is the total number of days of the emp interval minus weekend days and holidays. common field makes sure common weekend days and holidays are not subtracted twice.
Note: The above query includes start and end days of the interval in the calculations, so the interval considered is [doj - dos]. You can change the predicate of the WHERE clause in the OUTER APPLY operation so as to exclude start, end, or both, days of the interval.
Demo here
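As a sketch of the predicate change mentioned in the note, the variant below excludes the end day (a half-open interval from doj up to, but not including, dos). It is a simplified alternative rather than the answer's exact query: it counts each non-working day once with OR instead of subtracting weekends and holidays separately. The table variables reuse the question's sample data; the names off_days, total_days and working_days are made up for illustration.

```sql
DECLARE @holidaylist TABLE ([Date] date, weekendStatus int, Holidaystatus int);
INSERT INTO @holidaylist ([Date], weekendStatus, Holidaystatus) VALUES
    ('2015-12-01', 0, 0), ('2015-12-02', 0, 0), ('2015-12-03', 0, 0),
    ('2015-12-04', 1, 0), ('2015-12-05', 1, 0), ('2015-12-06', 0, 1),
    ('2015-12-07', 0, 0), ('2015-12-08', 0, 0), ('2015-12-09', 0, 1),
    ('2015-12-10', 0, 0), ('2015-12-11', 0, 0), ('2015-12-12', 1, 1),
    ('2015-12-13', 1, 0);

DECLARE @emp TABLE (empid int, doj date, dos date);
INSERT INTO @emp (empid, doj, dos) VALUES
    (1, '2015-12-01', '2015-12-06'),
    (2, '2015-12-01', '2015-12-13'),
    (3, '2015-12-03', '2015-12-13');

SELECT  a.empid, a.doj, a.dos,
        DATEDIFF(day, a.doj, a.dos) AS total_days,                      -- end day excluded
        DATEDIFF(day, a.doj, a.dos) - ISNULL(b.off_days, 0) AS working_days
FROM    @emp AS a
OUTER APPLY (
    -- a day that is both weekend and holiday is counted once
    SELECT COUNT(*) AS off_days
    FROM   @holidaylist AS h
    WHERE  h.[Date] >= a.doj
      AND  h.[Date] <  a.dos
      AND  (h.weekendStatus = 1 OR h.Holidaystatus = 1)
) AS b
ORDER BY a.empid;
-- empid 1: total_days 5,  working_days 3
-- empid 2: total_days 12, working_days 7
-- empid 3: total_days 10, working_days 5
```

With this sample data only employee 1 matches the "without" figure shown in the question; the figures 8 and 6 for employees 2 and 3 are not reproduced by excluding weekends and holidays over these rows.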
try another way with cross join
select t.empid,t.doj,t.dos,datediff(day,t.doj,t.dos) includeweekendandholidays,
datediff(day,t.doj,t.dos)-isnull(t1.wes,0) as witoutincludeweekendandholidayslist
from @emp t left join (
select empid, sum(hd.Holidaystatus+hd.weekendStatus) wes from
@emp emp cross join @holidaylist hd where hd.[Date] between doj
and dateadd(day,-1,dos) group by empid) t1 on t.empid=t1.empid
sample data
declare @holidaylist table ([Date] date, weekendStatus int, Holidaystatus int)
insert into @holidaylist([Date], weekendStatus, Holidaystatus) values
('2015-12-01' , 0 , 0),
('2015-12-02' , 0 , 0),
('2015-12-03' , 0 , 0),
('2015-12-04' , 1 , 0),
('2015-12-05' , 1 , 0),
('2015-12-06' , 0 , 1),
('2015-12-07' , 0 , 0),
('2015-12-08' , 0 , 0),
('2015-12-09' , 0 , 1),
('2015-12-10' , 0 , 0),
('2015-12-11' , 0 , 0),
('2015-12-12' , 1 , 1),
('2015-12-13' , 1 , 0)
declare @emp table(empid int, doj date, dos date)
insert into @emp (empid,doj,dos) values
(1 , '2015-12-01' , '2015-12-06'),
(2 ,'2015-12-01' , '2015-12-13'),
(3 ,'2015-12-03' ,'2015-12-13')

SQL Server 2005 - Update column where DATEDIFF between two dates is minimum

I have two tables, defined as following:
PTable:
[StartDate], [EndDate], [Type], PValue
.................................................
2011-07-01 2011-07-07 001 5
2011-07-08 2011-07-14 001 10
2011-07-01 2011-07-07 002 15
2011-07-08 2011-07-14 002 20
TTable:
[Date], [Type], [TValue]
..................................
2011-07-01 001 11
2011-07-02 001 4
2011-07-03 001 0
2011-07-08 002 12
2011-07-09 002 12
2011-07-10 002 0
I want to update Tvalue column in TTable with the PValue in PTable, where [Date] in TTable is between [StartDate] and [EndDate] in PTable and DATEDIFF(DAY,TTable.[Date],PTable.[EndDate]) is minimum, AND PTable.Type = TTable.Type
The final TTable should look like this:
[Date], [Type], [TValue]
..................................
2011-07-01 001 11
2011-07-02 001 4
2011-07-03 001 5 --updated
2011-07-08 002 12
2011-07-09 002 12
2011-07-10 002 20 --updated
What I have tried is this:
UPDATE [TTable]
SET
TValue = T1.PValue
FROM TTable
INNER JOIN PTable T1 ON
[Date] BETWEEN T1.StartDate AND T1.EndDate
AND DATEDIFF(DAY,[Date],T1.EndDate) =
(SELECT MIN( DATEDIFF(DAY,TTable.[Date],T.EndDate) )
FROM PTable T WHERE TTable.[Date] BETWEEN T.StartDate AND T.EndDate
)
AND
T1.[Type] = TTable.[Type]
It gives me this error :
"Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression."
Later edit:
Considering TTable AS T and PTable AS P, the condition for update are:
1. T.Type = P.Type
2. T.Date BETWEEN P.StartDate AND P.EndDate
3. DATEDIFF(DAY,T.Date,P.EndDate) = minimum value of all DATEDIFFs WHERE P.Type = T.Type AND T.Date BETWEEN P.StartDate AND P.EndDate
Later Edit 2:
Sorry, because I typed wrong the last row in PTable (2011-08-10 instead 2011-07-14), the final result was wrong.
I also managed to update in a simpler way, which I obviously should have tried from the start:
UPDATE TTABLE
SET
TValue = T1.PValue
FROM TTable
INNER JOIN PTABLE T1 ON
[Date] = (SELECT TOP(1) MAX(Date) FROM [TTABLE] WHERE [Date] BETWEEN T1.StartDate AND T1.EndDate)
AND
T1.Type = [TTABLE].Type
Sorry about this.
You said something about "DATEDIFF(DAY,TTable.[Date],PTable.[EndDate]) is minimum", which confused me. It would seem that if there is one weekly entry per Type, then a particular Date, Type combination would only ever match one row. You might give this a try:
UPDATE TTABLE
SET TValue = T1.PValue
FROM TTable
INNER JOIN PTABLE T1 ON T1.Type = [TTABLE].Type -- find row in PTable that the Date falls between
and [Date] BETWEEN T1.StartDate AND T1.EndDate
where
TValue = ( select MIN(TValue) -- finds the lowest TValue, 0 in example
from TTable)
...updated...
So it appears I read the problem incorrectly the first time. I had thought we update the TTable entries that have the lowest TValue. Not sure how I got that impression. Still seems like there needs to be a check for if it is 0?
UPDATE TTable
SET TValue = T1.PValue
FROM TTable
INNER JOIN PTable T1 ON T1.Type = TTable.Type
and T1.EndDate = (
SELECT top 1 EndDate
FROM PTable
WHERE Type=TTable.Type
ORDER BY abs(DATEDIFF(day,TTable.Date,PTable.EndDate)) asc) -- closest EndDate first
WHERE
TValue = 0 -- only updating entries that aren't set, have a 0
This only works if there is exactly one row in PTable with a given EndDate (7/7, say) for a given type. If there are two entries for Type 001 with an end date of 7/7, the update will join to both. The same happens if two entries are equally distant from the date in question; for example, EndDates of 7/7 and 7/13 are both 3 days from 7/10. If the EndDates are all 7 days apart (weekly) you should be OK.
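The three conditions from the question's "Later edit" could also be sketched with ROW_NUMBER, which is available in SQL Server 2005. This is one reading of the requirement, not a confirmed one: for each PTable row, the TTable row of the same type with the latest date inside the interval (the smallest DATEDIFF to EndDate) gets the PValue. The table variables @PTable and @TTable hold the question's sample data, and the sketch assumes ([Date], [Type]) is unique in TTable.

```sql
DECLARE @PTable TABLE (StartDate datetime, EndDate datetime, [Type] char(3), PValue int);
INSERT INTO @PTable (StartDate, EndDate, [Type], PValue)
SELECT '20110701', '20110707', '001', 5  UNION ALL
SELECT '20110708', '20110714', '001', 10 UNION ALL
SELECT '20110701', '20110707', '002', 15 UNION ALL
SELECT '20110708', '20110714', '002', 20;

DECLARE @TTable TABLE ([Date] datetime, [Type] char(3), TValue int);
INSERT INTO @TTable ([Date], [Type], TValue)
SELECT '20110701', '001', 11 UNION ALL
SELECT '20110702', '001', 4  UNION ALL
SELECT '20110703', '001', 0  UNION ALL
SELECT '20110708', '002', 12 UNION ALL
SELECT '20110709', '002', 12 UNION ALL
SELECT '20110710', '002', 0;

UPDATE t
SET    t.TValue = x.PValue
FROM   @TTable AS t
JOIN (
    -- rn = 1 marks, per PTable row, the matching TTable row closest to EndDate
    SELECT t2.[Date], t2.[Type], p.PValue,
           ROW_NUMBER() OVER (PARTITION BY p.[Type], p.StartDate
                              ORDER BY t2.[Date] DESC) AS rn
    FROM   @TTable AS t2
    JOIN   @PTable AS p
           ON  p.[Type] = t2.[Type]
           AND t2.[Date] BETWEEN p.StartDate AND p.EndDate
) AS x
    ON  x.[Date] = t.[Date]
    AND x.[Type] = t.[Type]
    AND x.rn = 1;

SELECT [Date], [Type], TValue FROM @TTable ORDER BY [Type], [Date];
-- 2011-07-03 / 001 becomes 5 and 2011-07-10 / 002 becomes 20; the other rows are unchanged
```

Unlike the question's final query, the ranking here is restricted to rows of the same type, so it does not depend on the two types occupying different date ranges.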

Combine continuous datetime intervals by type

Say we have such a table:
declare @periods table (
s date,
e date,
t tinyint
);
It holds date intervals without gaps, ordered by start date (s):
insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
All date intervals have different types (t).
It is required to combine date intervals of the same type where they are not broken by intervals of the other types (having all intervals ordered by start date).
So the result table should look like:
s | e | t
------------|------------|-----
2013-01-01 | 2013-01-02 | 3
2013-01-02 | 2013-01-05 | 1
2013-01-05 | 2013-01-08 | 2
2013-01-08 | 2013-01-09 | 1
Any ideas how to do this without cursor?
I've got one working solution:
declare @periods table (
s datetime primary key clustered,
e datetime,
t tinyint,
period_number int
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
declare @anchor date;
update @periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t,
@anchor = s
option (maxdop 1);
select
s = min(s),
e = max(e),
t = min(t)
from
@periods
group by
period_number
order by
s;
But I doubt whether I can rely on this behavior of the UPDATE statement.
I use SQL Server 2008 R2.
Edit:
Thanks to Daniel and this article: http://www.sqlservercentral.com/articles/T-SQL/68467/
I found three important things that were missed in the solution above:
There must be a clustered index on the table
There must be an anchor variable that references the clustered column
The UPDATE statement should be executed by one processor, i.e. without parallelism
I've changed the above solution in accordance with these rules.
Since your ranges are continuous, the problem essentially becomes a gaps-and-islands one. If only you had a criterion to help you to distinguish between different sequences with the same t value, you could group all the rows using that criterion, then just take MIN(s), MAX(e) for every group.
One method of obtaining such a criterion is to use two ROW_NUMBER calls. Consider the following query:
SELECT
*,
rnk1 = ROW_NUMBER() OVER ( ORDER BY s),
rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
;
For your example it would return the following set:
s e t rnk1 rnk2
---------- ---------- -- ---- ----
2013-01-01 2013-01-02 3 1 1
2013-01-02 2013-01-04 1 2 1
2013-01-04 2013-01-05 1 3 2
2013-01-05 2013-01-06 2 4 1
2013-01-06 2013-01-07 2 5 2
2013-01-07 2013-01-08 2 6 3
2013-01-08 2013-01-09 1 7 3
The interesting thing about the rnk1 and rnk2 rankings is that if you subtract one from the other, you will get values that, together with t, uniquely identify every distinct sequence of rows with the same t:
s e t rnk1 rnk2 rnk1 - rnk2
---------- ---------- -- ---- ---- -----------
2013-01-01 2013-01-02 3 1 1 0
2013-01-02 2013-01-04 1 2 1 1
2013-01-04 2013-01-05 1 3 2 1
2013-01-05 2013-01-06 2 4 1 3
2013-01-06 2013-01-07 2 5 2 3
2013-01-07 2013-01-08 2 6 3 3
2013-01-08 2013-01-09 1 7 3 4
Knowing that, you can easily apply grouping and aggregation. This is what the final query might look like:
WITH partitioned AS (
SELECT
*,
g = ROW_NUMBER() OVER ( ORDER BY s)
- ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
)
SELECT
s = MIN(s),
e = MAX(e),
t
FROM partitioned
GROUP BY
t,
g
;
If you like, you can play with this solution at SQL Fiddle.
How about this?
declare @periods table (
s datetime primary key,
e datetime,
t tinyint,
s2 datetime
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
update @periods set s2 = s;
while @@ROWCOUNT > 0
begin
update p2 SET s2=p1.s
from @periods p1
join @periods p2 ON p2.t = p1.t AND p2.s2 = p1.e;
end
select s2 as s, max(e) as e, min(t) as t
from @periods
group by s2
order by s2;
A possible solution that avoids both UPDATE and a cursor is to use common table expressions, like this:
declare @periods table (
s date,
e date,
t tinyint
);
insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
with cte as ( select 0 as n
,p.s as s
,p.e as e
,p.t
,case when p2.s is null then 1 else 0 end fl_s
,case when p3.e is null then 1 else 0 end fl_e
from @periods p
left outer join @periods p2
on p2.e = p.s
and p2.t = p.t
left outer join @periods p3
on p3.s = p.e
and p3.t = p.t
union all
select n+1 as n
, p2.s as s
, p.e as e
,p.t
,case when not exists(select * from @periods p3 where p3.e =p2.s and p3.t=p2.t) then 1 else 0 end as fl_s
,p.fl_e as fl_e
from cte p
inner join @periods p2
on p2.e = p.s
and p2.t = p.t
where p.fl_s = 0
union all
select n+1 as n
, p.s as s
, p2.e as e
,p.t
,p.fl_s as fl_s
,case when not exists(select * from @periods p3 where p3.s =p2.e and p3.t=p2.t) then 1 else 0 end as fl_e
from cte p
inner join @periods p2
on p2.s = p.e
and p2.t = p.t
where p.fl_s = 1
and p.fl_e = 0
)
,result as (select s,e,t,COUNT(*) as count_lines
from cte
where fl_e = 1
and fl_s = 1
group by s,e,t
)
select * from result
option(maxrecursion 0)
Result set achieved:
s e t count_lines
2013-01-01 2013-01-02 3 1
2013-01-02 2013-01-05 1 2
2013-01-05 2013-01-08 2 3
2013-01-08 2013-01-09 1 1
Hooray! I've found the solution that suits me and it is done without iteration
with cte1 as (
select s, t from @periods
union all
select max(e), null from @periods
),
cte2 as (
select rn = row_number() over(order by s), s, t from cte1
),
cte3 as (
select
rn = row_number() over(order by a.rn),
a.s,
a.t
from
cte2 a
left join cte2 b on a.rn = b.rn + 1 and a.t = b.t
where
b.rn is null
)
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
Thanks everyone for sharing your thoughts and solutions!
Details:
cte1 returns the chain of dates with the types after them:
s t
---------- ----
2013-01-01 3
2013-01-02 1
2013-01-04 1
2013-01-05 2
2013-01-06 2
2013-01-07 2
2013-01-08 1
2013-01-09 NULL -- there is no type *after* the last date
cte2 just adds a row number to the above result:
rn s t
---- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-04 1
4 2013-01-05 2
5 2013-01-06 2
6 2013-01-07 2
7 2013-01-08 1
8 2013-01-09 NULL
If we output all the fields from the query in cte3 without the where condition, we get the following results:
select * from cte2 a left join cte2 b on a.rn = b.rn + 1 and a.t = b.t;
rn s t rn s t
---- ---------- ---- ------ ---------- ----
1 2013-01-01 3 NULL NULL NULL
2 2013-01-02 1 NULL NULL NULL
3 2013-01-04 1 2 2013-01-02 1
4 2013-01-05 2 NULL NULL NULL
5 2013-01-06 2 4 2013-01-05 2
6 2013-01-07 2 5 2013-01-06 2
7 2013-01-08 1 NULL NULL NULL
8 2013-01-09 NULL NULL NULL NULL
For the dates where the type is repeated there are values on the right side of the results. So we can just remove all the lines where values exist on the right side.
So cte3 returns:
rn s t
----- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-05 2
4 2013-01-08 1
5 2013-01-09 NULL
Note that removing some rows leaves gaps in the rn sequence, so we have to renumber the rows.
From here only one thing is left: to transform the dates into periods:
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
and we've got the required result:
s e t
---------- ---------- ----
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-08 2
2013-01-08 2013-01-09 1
This is your solution with different data in the table:
declare @periods table (
s datetime primary key,
e datetime,
t tinyint,
period_number int
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-09' , '2013-01-10', 2),
('2013-01-10' , '2013-01-11', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
update @periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t;
select
s = min(s),
e = max(e),
t = min(t)
from
@periods
group by
period_number
order by
s;
where there is a gap between
('2013-01-05' , '2013-01-06', 2),
--and
('2013-01-09' , '2013-01-10', 2),
Your solution's result set is:
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-10 2
2013-01-10 2013-01-11 1
Wasn't the expected result set like this?
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-06 2
2013-01-09 2013-01-10 2
2013-01-10 2013-01-11 1
Maybe I misunderstood the rule of your problem.
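If gaps like this can occur and should split a group, one possible sketch is to flag each row that starts a new run and number the runs with a running sum. The asker states the intervals have no gaps, so this is only an assumption about a variant of the problem. The names flagged, numbered, is_start and grp are made up for illustration, and the running sum uses a correlated subquery so that it also runs on SQL Server 2008 R2.

```sql
DECLARE @periods TABLE (s date PRIMARY KEY, e date, t tinyint);
INSERT INTO @periods (s, e, t) VALUES
    ('2013-01-01', '2013-01-02', 3),
    ('2013-01-02', '2013-01-04', 1),
    ('2013-01-04', '2013-01-05', 1),
    ('2013-01-05', '2013-01-06', 2),
    ('2013-01-09', '2013-01-10', 2),
    ('2013-01-10', '2013-01-11', 1);

WITH flagged AS (
    -- a row starts a new run unless a row of the same type ends exactly where it begins
    SELECT p.s, p.e, p.t,
           CASE WHEN EXISTS (SELECT 1 FROM @periods AS q
                             WHERE q.e = p.s AND q.t = p.t)
                THEN 0 ELSE 1 END AS is_start
    FROM @periods AS p
),
numbered AS (
    -- running count of run starts up to and including the current row
    SELECT a.s, a.e, a.t,
           (SELECT SUM(b.is_start) FROM flagged AS b WHERE b.s <= a.s) AS grp
    FROM flagged AS a
)
SELECT  MIN(s) AS s, MAX(e) AS e, MIN(t) AS t
FROM    numbered
GROUP BY grp
ORDER BY s;
-- returns five rows; 2013-01-05..2013-01-06 and 2013-01-09..2013-01-10 stay separate
```

On gap-free data such as the question's original sample, this gives the same four rows as the other answers.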
