Running sum from a point - sql-server

I have a forecast of change that I need to add on to actuals.
Example:
Date Group Count ActForc
Nov-15 GrpA 10 A
Dec-15 GrpA 12 A
Jan-16 GrpA -1 F
Feb-16 GrpA 2 F
What I would like to see is:
Date Group Count
Nov-15 GrpA 10
Dec-15 GrpA 12
Jan-16 GrpA 11
Feb-16 GrpA 13
All of the counting/running-sum queries I have seen assume that I want the sections kept separate, and show ways to create a sum for each section. What I actually want is to seed the sum for the second section with the final value from the first section and continue from that point, without disturbing the values from the second section.

If your forecasts are always at the end of the date range, you can also do this with a couple of nested window functions. The inner query builds a helper column, Count2, that holds [Count] when the next row is a forecast ('F') and 0 otherwise (the last row has no next row, so it falls back to its own flag). The outer query returns the running total of Count2 when the next row is 'F' and the plain [Count] otherwise, which gives the figure you want.
select
    [date],
    [group],
    case when isnull(lead(ActForc) over (order by [Date] asc), ActForc) = 'F'
         then sum(Count2) over (order by [Date] asc)
         else [Count]
    end as RunningCount,
    [count],
    ActForc
from (
    select
        [date],
        [group],
        -- keep the count only from the last actual row onwards
        case when isnull(lead(ActForc) over (order by [Date] asc), ActForc) = 'F'
             then [Count] else 0 end as Count2,
        [count],
        ActForc
    from
        table1
) X
This should perform better than any recursive CTEs / correlated subqueries because the data isn't read several times. If you have more groups, partitioning the window functions with the group should fix that.
Example in SQL Fiddle with few more months.
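The remark about partitioning by group can be sketched as follows. This is a minimal sketch, not the answerer's code: the table variable @t and the GrpB rows are made up for illustration, and it assumes SQL Server 2012 or later (for LEAD) and that each group's forecasts come after its actuals.

```sql
-- Hypothetical sample data: two groups, actuals (A) followed by forecasts (F).
DECLARE @t TABLE ([Date] date, [Group] varchar(10), [Count] int, ActForc char(1));
INSERT INTO @t VALUES
    ('2015-11-01', 'GrpA', 10, 'A'), ('2015-12-01', 'GrpA', 12, 'A'),
    ('2016-01-01', 'GrpA', -1, 'F'), ('2016-02-01', 'GrpA',  2, 'F'),
    ('2015-11-01', 'GrpB', 20, 'A'), ('2015-12-01', 'GrpB', 25, 'A'),
    ('2016-01-01', 'GrpB',  3, 'F');

SELECT [Date], [Group],
       CASE WHEN ISNULL(NextFlag, ActForc) = 'F'
            THEN SUM(Count2) OVER (PARTITION BY [Group] ORDER BY [Date])
            ELSE [Count]
       END AS RunningCount
FROM (
    SELECT [Date], [Group], [Count], ActForc,
           -- flag of the next row within the same group (NULL on the last row)
           LEAD(ActForc) OVER (PARTITION BY [Group] ORDER BY [Date]) AS NextFlag,
           -- keep the count only from the last actual row of the group onwards
           CASE WHEN ISNULL(LEAD(ActForc) OVER (PARTITION BY [Group] ORDER BY [Date]), ActForc) = 'F'
                THEN [Count] ELSE 0 END AS Count2
    FROM @t
) AS x
ORDER BY [Group], [Date];
-- GrpA: 10, 12, 11, 13   GrpB: 20, 25, 28
```

Every window uses the same PARTITION BY, so each group's running total starts from that group's own last actual.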

Try a recursive CTE.
First, create a subquery that adds a row id (rn).
Then create the base case with rn = 1.
Finally, the recursive step calculates each next level from the previous one.
SQL Fiddle Demo
WITH addID as (
SELECT [Date], [Group], [Count], [ActForc],
ROW_NUMBER() OVER ( ORDER BY [DATE]) as rn
FROM myTable
), cte_name ( [Date], [Group], [Count], [level] ) AS
(
SELECT [Date], [Group], [Count], 1 as [level]
FROM addID
WHERE rn = 1
UNION ALL
SELECT A.[Date],
A.[Group],
CASE WHEN [ActForc] = 'F' THEN C.[Count] + A.[Count]
ELSE A.[Count]
END AS [Count],
C.[level] + 1
FROM addID A
INNER JOIN cte_name C
ON A.rn = C.[level] + 1
)
SELECT *
FROM cte_name
OUTPUT
| Date | Group | Count | level |
|----------------------------|-------|-------|-------|
| November, 01 2015 00:00:00 | GrpA | 10 | 1 |
| December, 01 2015 00:00:00 | GrpA | 12 | 2 |
| January, 01 2016 00:00:00 | GrpA | 11 | 3 |
| February, 01 2016 00:00:00 | GrpA | 13 | 4 |
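Two things are worth checking before using the recursive approach on real data: a recursive CTE stops with an error after 100 recursion levels by default, and the query above numbers all rows together rather than per group. A hedged sketch of a per-group variant, where the table variable @myTable and the GrpB rows are invented for illustration:

```sql
DECLARE @myTable TABLE ([Date] date, [Group] varchar(10), [Count] int, ActForc char(1));
INSERT INTO @myTable VALUES
    ('2015-11-01', 'GrpA', 10, 'A'), ('2015-12-01', 'GrpA', 12, 'A'),
    ('2016-01-01', 'GrpA', -1, 'F'), ('2016-02-01', 'GrpA',  2, 'F'),
    ('2015-11-01', 'GrpB', 20, 'A'), ('2016-01-01', 'GrpB',  3, 'F');

WITH addID AS (
    -- number the rows separately inside each group
    SELECT [Date], [Group], [Count], ActForc,
           ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Date]) AS rn
    FROM @myTable
), running AS (
    -- anchor: first row of every group
    SELECT [Date], [Group], [Count], rn
    FROM addID
    WHERE rn = 1
    UNION ALL
    -- recursive step: forecasts add to the previous row's value, actuals restart it
    SELECT a.[Date], a.[Group],
           CASE WHEN a.ActForc = 'F' THEN r.[Count] + a.[Count] ELSE a.[Count] END,
           a.rn
    FROM addID AS a
    INNER JOIN running AS r
        ON a.[Group] = r.[Group] AND a.rn = r.rn + 1
)
SELECT [Date], [Group], [Count]
FROM running
ORDER BY [Group], [Date]
OPTION (MAXRECURSION 0);  -- 0 removes the default limit of 100 levels
```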

Related

SQL Server - assign value to a field based on a running total

For a customer, I send sales orders to another system as an XML file. I sum the quantities for each item across all sales order lines (e.g. if "ItemA" appears in 10 sales orders with a different quantity in each one, I sum the quantities and send the total).
In return, I get a response saying whether the requested quantities can be delivered to the customers or not. If not, I still get the total quantity that can be delivered. There can be situations where I request 100 pieces of "ItemA" and only 98 can be delivered. In cases like this, I need to distribute those 98 pieces FIFO (by updating a custom field), according to the requested quantity in each sales order and the registration date of each sales order.
I tried to use a WHILE loop but I couldn't achieve the desired result. Here's my code:
DECLARE @PickedQty int
DECLARE @PickedERPQty int
DECLARE @OrderedERPQty int=2
SET @PickedQty =
WHILE (@PickedQty>0)
BEGIN
    SET @PickedERPQty=(SELECT CASE WHEN @PickedQty>@OrderedERPQty THEN @OrderedERPQty ELSE @PickedQty END)
    SET @PickedQty=@PickedQty-@PickedERPQty
    PRINT @PickedQty
    IF @PickedQty>=0
    BEGIN
        UPDATE OrderLines
        SET UDFValue2=@PickedERPQty
        WHERE fDocID='82DADC71-6706-44C7-9B78-7FCB55D94A69'
    END
    IF @PickedQty <= 0
        BREAK;
END
GO
GO
Example of response
I requested 35 pieces but only 30 pieces are available to be delivered. I need to distribute those 30 pieces for each sales order, based on requested quantity and also FIFO, based on the date of the order. So, in this example, I will update the RealQty column with the requested quantity (because I have stock) and in the last one, I assign the remaining 5 pieces.
ord_Code CustOrderCode Date ItemCode ReqQty AvailQty RealQty
----------------------------------------------------------------------------
141389 CV/2539 2018-11-25 PX085 10 30 10
141389 CV/2550 2018-11-26 PX085 5 30 5
141389 CV/2563 2018-11-27 PX085 10 30 10
141389 CV/2564 2018-11-28 PX085 10 30 5
Could anyone give me a hint? Thanks
This might be more verbose than it needs to be, but I'll leave it to you to skinny it down if that's possible.
Set up the data:
DECLARE @OrderLines TABLE(
    ord_Code INTEGER NOT NULL
    ,CustOrderCode VARCHAR(7) NOT NULL
    ,[Date] DATE NOT NULL
    ,ItemCode VARCHAR(5) NOT NULL
    ,ReqQty INTEGER NOT NULL
    ,AvailQty INTEGER NOT NULL
    ,RealQty INTEGER NOT NULL
);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2539','2018-11-25','PX085',10,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2550','2018-11-26','PX085', 5,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2563','2018-11-27','PX085',10,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2564','2018-11-28','PX085',10,0,0);
DECLARE @AvailQty INTEGER = 30;
For running totals, in SQL Server 2012 and up anyway, SUM() OVER is the preferred technique, so I started off with some variants on that. This query brought in some useful numbers:
SELECT
    ol.ord_Code,
    ol.CustOrderCode,
    ol.Date,
    ol.ItemCode,
    ol.ReqQty,
    @AvailQty AS AvailQty,
    SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS TotalOrderedQty,
    @AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS RemainingQty
FROM
    @OrderLines AS ol;
Then I used the RemainingQty to do a little math. The CASE expression is hairy, but the first step checks to see if the RemainingQty after processing this row will be positive, and if it is, we fulfill the order. If not, we fulfill what we can. The nested CASE is there to stop negative numbers from coming into the result set.
SELECT
    ol.ord_Code,
    ol.CustOrderCode,
    ol.Date,
    ol.ItemCode,
    ol.ReqQty,
    @AvailQty AS AvailQty,
    SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS TotalOrderedQty,
    @AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS RemainingQty,
    CASE
        WHEN (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
        THEN ol.ReqQty
        ELSE
            CASE
                WHEN ol.ReqQty + (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
                THEN ol.ReqQty + (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]))
                ELSE 0
            END
    END AS RealQty
FROM
    @OrderLines AS ol
Windowing functions (like SUM() OVER) can only be in SELECT and ORDER BY clauses, so I had to do a derived table with a JOIN. A CTE would work here, too, if you prefer. But I used that derived table to UPDATE the base table.
UPDATE Lines
SET
    Lines.AvailQty = d.AvailQty
    ,Lines.RealQty = d.RealQty
FROM
    @OrderLines AS Lines
JOIN
    (
    SELECT
        ol.ord_Code,
        ol.CustOrderCode,
        ol.Date,
        ol.ItemCode,
        @AvailQty AS AvailQty,
        CASE
            WHEN (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
            THEN ol.ReqQty
            ELSE
                CASE
                    WHEN ol.ReqQty + (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
                    THEN ol.ReqQty + (@AvailQty-SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]))
                    ELSE 0
                END
        END AS RealQty
    FROM
        @OrderLines AS ol
    ) AS d
    ON d.CustOrderCode = Lines.CustOrderCode
    AND d.ord_Code = Lines.ord_Code
    AND d.ItemCode = Lines.ItemCode
    AND d.Date = Lines.Date;
SELECT * FROM @OrderLines;
Results:
+----------+---------------+---------------------+----------+--------+----------+---------+
| ord_Code | CustOrderCode | Date | ItemCode | ReqQty | AvailQty | RealQty |
+----------+---------------+---------------------+----------+--------+----------+---------+
| 141389 | CV/2539 | 25.11.2018 00:00:00 | PX085 | 10 | 30 | 10 |
| 141389 | CV/2550 | 26.11.2018 00:00:00 | PX085 | 5 | 30 | 5 |
| 141389 | CV/2563 | 27.11.2018 00:00:00 | PX085 | 10 | 30 | 10 |
| 141389 | CV/2564 | 28.11.2018 00:00:00 | PX085 | 10 | 30 | 5 |
+----------+---------------+---------------------+----------+--------+----------+---------+
Play with different available qty values here: https://rextester.com/MMFAR17436
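For readers who want the slimmed-down version the answer alludes to, here is one possible sketch, not the answerer's code. It computes how much stock earlier orders have already taken and updates the table through a CTE. The trimmed table layout, the name TakenBefore, partitioning by ItemCode, and the CustOrderCode tie-breaker are all assumptions made for illustration.

```sql
DECLARE @OrderLines TABLE (
    CustOrderCode varchar(7), [Date] date, ItemCode varchar(5),
    ReqQty int, RealQty int
);
INSERT INTO @OrderLines VALUES
    ('CV/2539', '2018-11-25', 'PX085', 10, 0),
    ('CV/2550', '2018-11-26', 'PX085',  5, 0),
    ('CV/2563', '2018-11-27', 'PX085', 10, 0),
    ('CV/2564', '2018-11-28', 'PX085', 10, 0);
DECLARE @AvailQty int = 30;

WITH fifo AS (
    SELECT ReqQty, RealQty,
           -- quantity requested by all earlier orders of the same item
           SUM(ReqQty) OVER (PARTITION BY ItemCode
                             ORDER BY [Date], CustOrderCode
                             ROWS UNBOUNDED PRECEDING) - ReqQty AS TakenBefore
    FROM @OrderLines
)
UPDATE fifo
SET RealQty = CASE
                  WHEN @AvailQty - TakenBefore >= ReqQty THEN ReqQty   -- fully served
                  WHEN @AvailQty - TakenBefore > 0 THEN @AvailQty - TakenBefore  -- partly served
                  ELSE 0                                                -- nothing left
              END;

SELECT * FROM @OrderLines ORDER BY [Date];
-- RealQty: 10, 5, 10, 5
```

With several items in one response, the single @AvailQty variable would need to become a per-item lookup; that part is left out here.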

MSSQL: Create incremental row label per group

In my table, I have a primary key and a date. What I'd like to achieve is to have an incremental label based on whether or not there is a break between the dates - column Goal.
Now, below is an example. The break column was calculated using LEAD function (I thought it might help).
I am able to solve it using T-SQL, but this would be last resort. Nothing I tried has worked so far. I am using MSSQL 2014.
PK | Date | break | Goal |
-------------------------------
1 | 03/2017 | 0 | 1 |
1 | 04/2017 | 0 | 1 |
1 | 08/2017 | 1 | 2 |
1 | 09/2017 | 0 | 2 |
1 | 10/2017 | 0 | 2 |
1 | 02/2018 | 1 | 3 |
1 | 03/2018 | 0 | 3 |
Here is a code to reproduce this example:
CREATE TABLE #test
(
ConsumerId INT,
FullDate DATE,
Goal INT
)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-03-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-04-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-08-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-09-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-10-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-02-01',3)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-03-01',3)
SELECT ConsumerId,
FullDate,
CASE WHEN (datediff(month,
isnull(
LEAD (FullDate,1) OVER (PARTITION BY ConsumerId ORDER BY FullDate DESC),
FullDate),
FullDate) > 1)
THEN 1
ELSE 0
END AS break,
Goal
FROM #test
ORDER BY FullDate ASC
EDIT
This is apparently a famous problem "Islands and gaps" as pointed out in the comments. And Google offers many solutions as well as other questions here at SO.
Try this...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
An explanation of the code and how it works...
The 1st query, in cte_TestGap, uses the LAG function along with the ROW_NUMBER() function to mark the location of each gap in the data. We can see that by breaking it out and looking at its results...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
)
SELECT * FROM cte_TestGap;
cte_TestGap results...
ConsumerId FullDate Gap
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 0
1 2017-08-01 3
1 2017-09-01 0
1 2017-10-01 0
1 2018-02-01 6
1 2018-03-01 0
At this point we want the 0 values to take on the value of the preceding non-0 values, allowing them to be grouped together. This is done in the 2nd query (cte_SmearGap) using the MAX function with a "window frame". So if we look at the output of cte_SmearGap, we can see that...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT * FROM cte_SmearGap;
cte_SmearGap results...
ConsumerId FullDate GV
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 3
1 2017-09-01 3
1 2017-10-01 3
1 2018-02-01 6
1 2018-03-01 6
At this point all of the rows are in distinct groups, but we'd like the group numbers to form a contiguous sequence (1, 2, 3) as opposed to (1, 3, 6).
Of course that's easy enough to fix using the DENSE_RANK() function, which is what's happening in the final select...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
The end result...
ConsumerId FullDate GroupValue
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 2
1 2017-09-01 2
1 2017-10-01 2
1 2018-02-01 3
1 2018-03-01 3
The comment from David Browne was actually extremely useful. If you google "Islands and Gaps", there are many variations of the solution. Below is the one I liked the most.
In the end, I needed the Goal column to be able to group the dates into MIN/MAX. This solution skips this step and directly creates the aggregated range.
Here is the source.
SELECT MIN(FullDate) AS range_start,
MAX(FUllDate) AS range_end
FROM (
SELECT FullDate,
DATEADD(MM, -1 * ROW_NUMBER() OVER(ORDER BY FullDate), FullDate) AS grp
FROM #test
) a
GROUP BY a.grp
And the output:
range_start | range_end |
--------------------------
2017-03-01 | 2017-04-01 |
2017-08-01 | 2017-10-01 |
2018-02-01 | 2018-03-01 |
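If the Goal label itself is still needed, the same "date minus row number" key can be ranked instead of aggregated. The following is a sketch under assumptions: the table variable @demo stands in for #test, one row per consumer per month is assumed, and the explicit CAST just keeps the DATEADD argument an int.

```sql
DECLARE @demo TABLE (ConsumerId int, FullDate date);
INSERT INTO @demo VALUES
    (1, '2017-03-01'), (1, '2017-04-01'),
    (1, '2017-08-01'), (1, '2017-09-01'), (1, '2017-10-01'),
    (1, '2018-02-01'), (1, '2018-03-01');

WITH keyed AS (
    SELECT ConsumerId, FullDate,
           -- consecutive months share the same key: the date moves forward one
           -- month per row while the row number also grows by one per row
           DATEADD(MONTH,
                   -CAST(ROW_NUMBER() OVER (PARTITION BY ConsumerId ORDER BY FullDate) AS int),
                   FullDate) AS grp
    FROM @demo
)
SELECT ConsumerId, FullDate,
       DENSE_RANK() OVER (PARTITION BY ConsumerId ORDER BY grp) AS Goal
FROM keyed
ORDER BY ConsumerId, FullDate;
-- Goal: 1, 1, 2, 2, 2, 3, 3
```

Partitioning by ConsumerId keeps the islands of different consumers apart, which the ungrouped query above does not do.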

How can I group / window date ordered events delineated by an arbitrary expression?

I would like to group some data together based on dates and some (potentially arbitrary) indicator:
Date | Ind
================
2016-01-02 | 1
2016-01-03 | 5
2016-03-02 | 10
2016-03-05 | 15
2016-05-10 | 6
2016-05-11 | 2
I would like to group together subsequent (date-ordered) rows but breaking the group after Indicator >= 10:
Date | Ind | Group
========================
2016-01-02 | 1 | 1
2016-01-03 | 5 | 1
2016-03-02 | 10 | 1
2016-03-05 | 15 | 2
2016-05-10 | 6 | 3
2016-05-11 | 2 | 3
I did find a promising technique at the end of a blog post: "Use this Neat Window Function Trick to Calculate Time Differences in a Time Series" (the final subsection, "Extra Bonus"), but the important part of the query uses a keyword (FILTER) that doesn't seem to be supported in SQL Server (and a quick Google later and I'm not sure where it is supported!).
I'm still hopeful a technique using a window function might be the answer. I just need a counter that I can add to every row, (like RANK or ROW_NUMBER does) but that only increments when some arbitrary condition evaluates as true. Is there a way to do this in SQL Server?
Here is the solution:
DECLARE @t TABLE ([Date] DATETIME, Ind INT)
INSERT INTO @t
VALUES
    ('2016-01-02', 1),
    ('2016-01-03', 5),
    ('2016-03-02', 10),
    ('2016-03-05', 15),
    ('2016-05-10', 6),
    ('2016-05-11', 2)
SELECT [Date],
    Ind,
    1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
    (
    SELECT *,
        CASE WHEN LAG(ind) OVER(ORDER BY [Date]) >= 10
            THEN 1
            ELSE 0
        END AS [Group]
    FROM @t
    ) t
Just mark a row as 1 when the previous row's Ind is greater than or equal to 10, else 0. Then a running sum of that flag gives you the desired result.
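The FILTER idea mentioned in the question can also be approximated in a single pass, without LAG, by summing a conditional flag over the rows strictly before the current one. This is an alternative sketch rather than the answer's method; it assumes SQL Server 2012 or later for the ROWS frame.

```sql
DECLARE @t TABLE ([Date] date, Ind int);
INSERT INTO @t VALUES
    ('2016-01-02', 1), ('2016-01-03', 5), ('2016-03-02', 10),
    ('2016-03-05', 15), ('2016-05-10', 6), ('2016-05-11', 2);

SELECT [Date], Ind,
       -- count the earlier rows that satisfy the condition; the frame stops one
       -- row before the current one, so a qualifying row closes its own group
       1 + ISNULL(SUM(CASE WHEN Ind >= 10 THEN 1 ELSE 0 END)
                      OVER (ORDER BY [Date]
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS [Group]
FROM @t
ORDER BY [Date];
-- Group: 1, 1, 1, 2, 3, 3
```

The ISNULL handles the first row, whose frame is empty. Any other condition can replace Ind >= 10 inside the CASE.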
Giving full credit to Giorgi for the idea, but I've modified his answer (both for my benefit and for future readers).
Just change the CASE statement to see if 30 or more days have lapsed since the last record:
DECLARE @t TABLE ([Date] DATETIME)
INSERT INTO @t
VALUES
    ('2016-01-02'),
    ('2016-01-03'),
    ('2016-03-02'),
    ('2016-03-05'),
    ('2016-05-10'),
    ('2016-05-11')
SELECT [Date],
    1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
    (
    SELECT [Date],
        CASE WHEN DATEADD(d, -30, [Date]) >= LAG([Date]) OVER(ORDER BY [Date])
            THEN 1
            ELSE 0
        END AS [Group]
    FROM @t
    ) t

Count number of days in a year with a record

I have a SQL Server table named AgentLog in which I store for each agent his daily number of sales.
+-----------+------------+-------------+
| AgentName | Date | SalesNumber |
+-----------+------------+-------------+
| John | 01.01.2014 | 45 |
| Terry | 01.01.2014 | 30 |
| John | 02.01.2014 | 20 |
| Terry | 02.01.2014 | 15 |
| Terry | 03.01.2014 | 52 |
| Terry | 04.01.2014 | 24 |
| Terry | 05.01.2014 | 12 |
| Terry | 06.01.2014 | 10 |
| Terry | 07.01.2014 | 23 |
| John | 08.01.2014 | 48 |
| Terry | 08.01.2014 | 35 |
| John | 09.01.2014 | 37 |
| Terry | 10.01.2014 | 35 |
+-----------+------------+-------------+
If an agent doesn't work on one particular day, there is no record of his sales on that date.
I want to generate a report(query) on a given date interval (ex: 01.01.2014 - 10.01.2014) that counts on how many days an agent wasn't present for work (ex: John - 6 days), was at work (John - 4 days) and also returns the date interval it wasn't present (ex: John 03.01.2014 - 07.01.2014, 10.01.2014) (there can be multiple intervals).
You need to create a custom table and populate it with a record for each date you want in your range (Feel free to go as far back in the past and forward into the future as you feel you may need.). You could do this in Excel very easily and import it.
Select *
from Custom.DateListTable dlt
left outer join agentlog ag
on dlt.Date = ag.Date
I would approach this by getting the number of dates in the interval, as well as the number of dates the agent was at work, and you then have everything you need.
To get the number of days in the interval you can use DATEDIFF. The dates in the question are in dd.mm.yyyy format, so the interval runs from 1 to 10 January; add 1 so that both end dates are counted:
SELECT DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 AS totalDays;
To get the number of days an agent worked, you can use the COUNT(*) aggregate function (this assumes one row per agent per day, and that only rows inside the interval are counted):
SELECT agentName, COUNT(*) AS daysWorked
FROM myTable
GROUP BY agentName;
Then, you can just add to that query to get the days not worked by subtracting totalDays - daysWorked:
SELECT agentName, COUNT(*) AS daysWorked, (DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 - COUNT(*)) AS daysMissed
FROM myTable
GROUP BY agentName;
Here is an SQL Fiddle example.
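As a quick check of the counting approach against the sample data from the question, here is a self-contained sketch. The table variable @AgentLog and the variables @From and @To are names chosen for this example, and the WHERE clause is an addition that restricts the count to the reporting interval.

```sql
DECLARE @AgentLog TABLE (AgentName varchar(20), [Date] date, SalesNumber int);
INSERT INTO @AgentLog VALUES
    ('John',  '2014-01-01', 45), ('Terry', '2014-01-01', 30),
    ('John',  '2014-01-02', 20), ('Terry', '2014-01-02', 15),
    ('Terry', '2014-01-03', 52), ('Terry', '2014-01-04', 24),
    ('Terry', '2014-01-05', 12), ('Terry', '2014-01-06', 10),
    ('Terry', '2014-01-07', 23), ('John',  '2014-01-08', 48),
    ('Terry', '2014-01-08', 35), ('John',  '2014-01-09', 37),
    ('Terry', '2014-01-10', 35);

DECLARE @From date = '2014-01-01', @To date = '2014-01-10';

SELECT AgentName,
       COUNT(DISTINCT [Date]) AS DaysWorked,
       -- inclusive length of the interval minus the days with a record
       DATEDIFF(DAY, @From, @To) + 1 - COUNT(DISTINCT [Date]) AS DaysMissed
FROM @AgentLog
WHERE [Date] BETWEEN @From AND @To
GROUP BY AgentName
ORDER BY AgentName;
-- John: 4 worked, 6 missed   Terry: 9 worked, 1 missed
```

An agent with no rows at all inside the interval will not appear in this result; listing such agents needs a separate agent table to join from.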
The only way I can think of to resolve this is to create a temporary table with only one column (datetime) and store in it all the dates from the selected range. You can create a stored procedure that fills that temporary table, using a cursor, with all the dates from the interval. Then do a LEFT JOIN from the temporary table to your table and look for NULL values in your table (the days when that person didn't come to work).
Try this...
SET DATEFIRST 1; --Monday
DECLARE @StartDate DATETIME = '20140101',
        @EndDate DATETIME = '20140110';
WITH data as (
    -- one row per day from @StartDate to @EndDate
    select 0 as i, DATEADD(DAY, 0, @StartDate) as TheDate
    union all
    select i + 1, DATEADD(DAY, i + 1, @StartDate) as TheDate
    from data
    where i < DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT a.AgentName,
    SUM(CASE WHEN c.Date IS NULL THEN 1 ELSE 0 END) AS Missing,
    SUM(CASE WHEN c.Date IS NOT NULL THEN 1 ELSE 0 END) AS Working
FROM Agent a
JOIN data b ON NOT EXISTS(SELECT NULL FROM SpecialDate s WHERE s.date = b.TheDate)
LEFT JOIN AgentLog c ON
    c.AgentName = a.AgentName
    AND c.Date = b.TheDate
WHERE DATEPART(weekday, b.TheDate) <= 5
GROUP BY a.AgentName
OPTION (MAXRECURSION 10000);
It includes a check for weekends, as well as a reference to "SpecialDate" where a list of non working days can be maintained, and excluded from the check.
Reading your question again, I realise that this will only solve half your problem.
NOTE: The following answer mainly addresses the trickiest part of the question, which is how to obtain "absence from work" intervals.
Given these values as Interval Start - End dates:
DECLARE @IntervalStart DATE = '2013-12-30'
DECLARE @IntervalEnd DATE = '2014-01-10'
the following query gives you the "absence from work" intervals:
SELECT AgentName,
    DATEADD(d, 1, t.[Date]) As OffWorkStart,
    DATEADD(d, -1, t.NextDate) As OffWorkEnd
FROM (
    SELECT AgentName, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
        DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
    FROM #AgentLog) t
WHERE t.NextMinusCurrent > 1
-- Get marginal beginning interval (in case such an interval exists)
UNION ALL
SELECT AgentName, @IntervalStart AS OffWorkStart, DATEADD(DAY, -1, MIN([Date])) AS OffWorkEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MIN([Date]) > @IntervalStart
-- Get marginal ending interval (in case such an interval exists)
UNION ALL
SELECT AgentName, DATEADD(DAY, 1, MAX([Date])) AS OffWorkStart, @IntervalEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MAX([Date]) < @IntervalEnd
ORDER By AgentName, OffWorkStart
With the input data you supplied, the above query gives you the following output:
AgentName OffWorkStart OffWorkEnd
---------------------------------------
John 2013-12-30 2013-12-31
John 2014-01-03 2014-01-07
John 2014-01-10 2014-01-10
Terry 2013-12-30 2013-12-31
Terry 2014-01-09 2014-01-09
The idea behind the basic part of the query is to employ the following nested query:
SELECT AgentName,
[Date],
LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog
in order to get any existing gaps between the days a certain agent is present for work. A value of NextMinusCurrent > 1 indicates such a gap.
Counting days is trivial once you have the above query in place. E.g. by placing the above query in a CTE you can count the total number of absence days with something like:
;WITH cte AS (
... query goes here
)
SELECT AgentName, SUM(DATEDIFF(DAY, OffWorkStart, OffWorkEnd) + 1) AS AbsenceDays
FROM cte
GROUP By AgentName
P.S. The above query makes use of SQL Server LEAD function, which is available from SQL SERVER 2012 onwards.
SQL Fiddle here
EDIT:
CTEs together with ROW_NUMBER() can be used to simulate LEAD function. The first part of the query becomes:
;WITH cte1 AS (
SELECT AgentName,
[Date],
ROW_NUMBER() OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As rn
FROM #AgentLog
),
cte2 AS (
SELECT cte1.AgentName, cte1.[Date],
cteLead.[Date] AS NextDate,
DATEDIFF(DAY, cte1.[Date], cteLead.[Date]) As NextMinusCurrent
FROM cte1
LEFT OUTER JOIN cte1 AS cteLead
ON (cte1.rn = cteLead.rn - 1) AND (cte1.AgentName = cteLead.AgentName)
)
SELECT AgentName,
DATEADD(d, 1, cte2.[Date]) As OffWorkStart,
DATEADD(d, -1, cte2.NextDate) As OffWorkEnd
FROM cte2
WHERE NextMinusCurrent > 1
SQL Fiddle for SQL Server 2008 here. I hope it executes in SQL Server 2005 also!

Find the min and max dates between multiple sets of dates

Given the following set of data, I'm trying to determine how I can select the start and end dates of the combined date ranges, when they intersect with each other.
For instance, for PartNum 115678, I would want my final result set to display the date ranges 2012/01/01 - 2012/01/19 (rows 1, 2 and 4 combined, since the date ranges intersect) and 2012/02/01 - 2012/03/28 (row 3, since this one does not intersect with the range found previously).
For PartNum 213275, I would want to select the only row for that part, 2012/12/01 - 2013/01/01.
Edit:
I'm currently playing around with the following SQL statement, but it's not giving me exactly what I need.
with DistinctRanges as (
select distinct
ha1.PartNum "PartNum",
ha1.StartDt "StartDt",
ha2.EndDt "EndDt"
from dbo.HoldsAll ha1
inner join dbo.HoldsAll ha2
on ha1.PartNum = ha2.PartNum
where
ha1.StartDt <= ha2.EndDt
and ha2.StartDt <= ha1.EndDt
)
select
PartNum,
StartDt,
EndDt
from DistinctRanges
Here are the results of the query shown in the edit:
You're better off having a persisted Calendar table, but if you don't, the CTE below will create one ad hoc. The TOP(36600) part is enough to give you roughly 100 years' worth of dates, counting from the pivot date ('20100101') on the same line.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table data (
partnum int,
startdt datetime,
enddt datetime,
age int
);
insert data select
12345, '20120101', '20120116', 15 union all select
12345, '20120115', '20120116', 1 union all select
12345, '20120201', '20120328', 56 union all select
12345, '20120113', '20120119', 6 union all select
88872, '20120201', '20130113', 43;
Query 1:
with Calendar(thedate) as (
select TOP(36600) dateadd(d,row_number() over (order by 1/0),'20100101')
from sys.columns a
cross join sys.columns b
cross join sys.columns c
), tmp as (
select partnum, thedate,
grouper = datediff(d, dense_rank() over (partition by partnum order by thedate), thedate)
from Calendar c
join data d on d.startdt <= c.thedate and c.thedate <= d.enddt
)
select partnum, min(thedate) startdt, max(thedate) enddt
from tmp
group by partnum, grouper
order by partnum, startdt
Results:
| PARTNUM | STARTDT | ENDDT |
------------------------------------------------------------------------------
| 12345 | January, 01 2012 00:00:00+0000 | January, 19 2012 00:00:00+0000 |
| 12345 | February, 01 2012 00:00:00+0000 | March, 28 2012 00:00:00+0000 |
| 88872 | February, 01 2012 00:00:00+0000 | January, 13 2013 00:00:00+0000 |
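A calendar-free alternative, shown here only as a sketch of a different technique and not as the answerer's method, flags each range that starts after every earlier range has ended and then takes a running sum of those flags. It assumes SQL Server 2012 or later (for the ROWS frame), and the table variable @data mirrors the answer's sample rows.

```sql
DECLARE @data TABLE (partnum int, startdt date, enddt date);
INSERT INTO @data VALUES
    (12345, '2012-01-01', '2012-01-16'),
    (12345, '2012-01-15', '2012-01-16'),
    (12345, '2012-02-01', '2012-03-28'),
    (12345, '2012-01-13', '2012-01-19'),
    (88872, '2012-02-01', '2013-01-13');

WITH ordered AS (
    SELECT partnum, startdt, enddt,
           -- latest end date among all earlier ranges of the same part
           MAX(enddt) OVER (PARTITION BY partnum ORDER BY startdt, enddt
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_max_end
    FROM @data
), flagged AS (
    SELECT partnum, startdt, enddt,
           -- 1 when this range does not intersect anything before it
           CASE WHEN prev_max_end >= startdt THEN 0 ELSE 1 END AS is_new_range
    FROM ordered
), grouped AS (
    SELECT partnum, startdt, enddt,
           SUM(is_new_range) OVER (PARTITION BY partnum ORDER BY startdt, enddt
                                   ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT partnum, MIN(startdt) AS startdt, MAX(enddt) AS enddt
FROM grouped
GROUP BY partnum, grp
ORDER BY partnum, startdt;
-- 12345: 2012-01-01 to 2012-01-19 and 2012-02-01 to 2012-03-28
-- 88872: 2012-02-01 to 2013-01-13
```

One difference from the calendar query: ranges that merely touch (one ends the day before the next starts) stay separate here, whereas the day-by-day approach merges them.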
