Getting the Min(startdate) and Max(enddate) for an ID when that ID shows up multiple times

I have a table with columns for ID, StartDate, EndDate, and an indicator of whether there is a gap between that row's end date and the next start date. If each ID had only one continuous span, I know that I could just do
SELECT min(startdate),max(enddate)
FROM table
GROUP BY ID
However, each ID appears in several non-connected timespans. So if I were to do that, I would get the very first start date and the very last end date across all of those timespans for that ID. How would I go about making sure I get the min and max dates for each specific block of time?
I thought about creating a new column that numbers each block of time: the first block with no gaps would get 1, and whenever a row follows a gap the number would increase by 1 to mark a new block. I am not really sure how to go about that, though. Here is some sample data to illustrate what I am working with:
ID   StartDate  EndDate    NextDate   Gap_ind
001  1/1/2018   1/31/2018  2/1/2018   N
001  2/1/2018   2/30/2018  3/1/2018   N
001  3/1/2018   3/31/2018  5/1/2018   Y
001  5/1/2018   5/31/2018  6/1/2018   N
001  6/1/2018   6/30/2018  6/30/2018  N

This is a classic "gaps and islands" problem, where you are trying to define the boundaries of your islands, and which you can solve by using some windowing functions.
Your initial effort is on track. Rather than getting the next start date, though, I used the previous end date to calculate the groupings.
The innermost subquery below gets the previous end date for each of your date ranges, and also assigns a row number that we use later to keep our groupings in order.
The next subquery out uses the previous end date to identify which groups of date ranges go together (overlap, or nearly so).
The outermost query is the end result you're looking for.
SELECT
    Grp.ID,
    MIN(Grp.StartDate) AS GroupingStartDate,
    MAX(Grp.EndDate) AS GroupingEndDate
FROM
(
    -- A running sum of "new island" flags numbers the islands within each ID
    SELECT
        PrevDt.ID,
        PrevDt.StartDate,
        PrevDt.EndDate,
        SUM(CASE WHEN DATEADD(DAY,1,PrevDt.PreviousEndDate) >= PrevDt.StartDate THEN 0 ELSE 1 END)
            OVER (PARTITION BY PrevDt.ID ORDER BY PrevDt.RN) AS GrpNum
    FROM
    (
        -- Row number and previous end date, both computed in the same order
        SELECT
            ROW_NUMBER() OVER (PARTITION BY ID ORDER BY StartDate, EndDate) AS RN,
            ID,
            StartDate,
            EndDate,
            LAG(EndDate,1) OVER (PARTITION BY ID ORDER BY StartDate, EndDate) AS PreviousEndDate
        FROM
            tbl
    ) AS PrevDt
) AS Grp
GROUP BY
    Grp.ID,
    Grp.GrpNum;
Results:
+-----+------------------+--------------+
| ID | InitialStartDate | FinalEndDate |
+-----+------------------+--------------+
| 001 | 2018-01-01 | 2018-03-01 |
| 001 | 2018-05-01 | 2018-06-01 |
+-----+------------------+--------------+
SQL Fiddle demo.
Further reading:
The SQL of Gaps and Islands in Sequences
Gaps and Islands Across Date Ranges

This is an example of a gaps-and-islands problem. A simple solution is to use lag() to determine if there are overlaps. When there is none, you have the start of a group. A cumulative sum defines the group -- and you aggregate on that.
select t.id, min(startdate), max(enddate)
from (select t.*,
             sum(case when prev_enddate >= dateadd(day, -1, startdate)
                      then 0 else 1
                 end) over (partition by id order by startdate) as grp
      from (select t.*, lag(enddate) over (partition by id order by startdate) as prev_enddate
            from t
           ) t
     ) t
group by id, grp;
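For readers who want to trace the island logic outside the database, here is a minimal sketch in Python rather than T-SQL. The function name `merge_islands` and the tuple layout are assumptions made for illustration, and the sample rows use 2/28/2018 in place of the question's 2/30/2018. Unlike the LAG-based queries, this sketch compares each start date with the running maximum end date of the current island, so a range nested inside an earlier one does not start a new island.

```python
from datetime import date, timedelta
from itertools import groupby


def merge_islands(rows):
    """Collapse (id, start, end) rows into one (id, start, end) per island.

    Two rows belong to the same island when the later one starts no more
    than one day after the latest end date seen so far for that id.
    """
    islands = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    for row_id, group in groupby(ordered, key=lambda r: r[0]):
        current_start = current_end = None
        for _, start, end in group:
            if current_end is not None and start <= current_end + timedelta(days=1):
                # Contiguous or overlapping: extend the current island.
                current_end = max(current_end, end)
            else:
                # Gap found: close the previous island (if any) and open a new one.
                if current_start is not None:
                    islands.append((row_id, current_start, current_end))
                current_start, current_end = start, end
        islands.append((row_id, current_start, current_end))
    return islands


# Hypothetical sample rows modelled on the question's data.
sample_rows = [
    ("001", date(2018, 1, 1), date(2018, 1, 31)),
    ("001", date(2018, 2, 1), date(2018, 2, 28)),
    ("001", date(2018, 3, 1), date(2018, 3, 31)),
    ("001", date(2018, 5, 1), date(2018, 5, 31)),
    ("001", date(2018, 6, 1), date(2018, 6, 30)),
]
```

Calling `merge_islands(sample_rows)` yields two islands for ID 001: January 1 through March 31, and May 1 through June 30.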

Related

return amount per year/month records based on start and enddate

I have a table with, for example this data:
ID |start_date |end_date |amount
---|------------|-----------|--------
1 |2019-03-21 |2019-05-09 |10000.00
2 |2019-04-02 |2019-04-10 |30000.00
3 |2018-11-01 |2019-01-08 |20000.00
I would like to get the records back split per year/month, with the amount calculated proportionally for each month.
I expect the outcome to be like this:
ID |month |year |amount
---|------|-------|--------
1 |3 | 2019 | 2200.00
1 |4 | 2019 | 6000.00
1 |5 | 2019 | 1800.00
2 |4 | 2019 |30000.00
3 |11 | 2018 | 8695.65
3 |12 | 2018 | 8985.51
3 |1 | 2019 | 2318.84
What would be the best way to achieve this? I think you would have to use DATEDIFF to get the number of days between the start_date and end_date to calculate the amount per day, but I'm not sure how to return it as records per month/year.
Thanks in advance!

This is one idea. I use a Tally to create a row for every day the amount applies to for that ID. Then I sum the Amount divided by the number of days, grouped by month and year:
CREATE TABLE dbo.YourTable(ID int,
                           StartDate date,
                           EndDate date,
                           Amount decimal(12,2));
GO
INSERT INTO dbo.YourTable (ID,
                           StartDate,
                           EndDate,
                           Amount)
VALUES(1,'2019-03-21','2019-05-09',10000.00),
      (2,'2019-04-02','2019-04-10',30000.00),
      (3,'2018-11-01','2019-01-08',20000.00);
GO
--Create a tally
WITH N AS(
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
    SELECT TOP (SELECT MAX(DATEDIFF(DAY, t.StartDate, t.EndDate)+1) FROM dbo.YourTable t) --Limits the rows, might be needed in a large dataset, might not be, remove as required
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
    FROM N N1, N N2, N N3), --1000 days, is that enough?
--Create the dates
Dates AS(
    SELECT YT.ID,
           DATEADD(DAY, T.I, YT.StartDate) AS [Date],
           YT.Amount,
           COUNT(T.I) OVER (PARTITION BY YT.ID) AS [Days]
    FROM Tally T
         JOIN dbo.YourTable YT ON T.I <= DATEDIFF(DAY, YT.StartDate, YT.EndDate))
--And now aggregate
SELECT D.ID,
       DATEPART(MONTH,D.[Date]) AS [Month],
       DATEPART(YEAR,D.[Date]) AS [Year],
       CONVERT(decimal(12,2),SUM(D.Amount / D.[Days])) AS Amount
FROM Dates D
GROUP BY D.ID,
         DATEPART(MONTH,D.[Date]),
         DATEPART(YEAR,D.[Date])
ORDER BY D.ID,
         [Year],
         [Month];
GO
DROP TABLE dbo.YourTable;
GO
DB<>Fiddle
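The arithmetic behind the tally approach can be checked with a small sketch in Python rather than T-SQL. The function name `split_by_month` is an assumption made for illustration. It counts the days that fall in each month and multiplies the amount by that share, which should match summing a per-day amount up to rounding.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP


def split_by_month(start, end, amount):
    """Return [(year, month, prorated_amount)] for an inclusive date range."""
    total_days = (end - start).days + 1  # both endpoints count
    days_per_month = defaultdict(int)
    day = start
    while day <= end:
        days_per_month[(day.year, day.month)] += 1
        day += timedelta(days=1)
    cent = Decimal("0.01")
    return [
        (year, month,
         (amount * days / total_days).quantize(cent, rounding=ROUND_HALF_UP))
        for (year, month), days in sorted(days_per_month.items())
    ]
```

For the question's third record, `split_by_month(date(2018, 11, 1), date(2019, 1, 8), Decimal("20000.00"))` covers 69 days and gives 8695.65 for November, 8985.51 for December and 2318.84 for January, the same figures as the expected output in the question.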

How to sum any credits before debits SQL Server?

I'm trying to sum all the credits that occur before a debit, then sum all the debits after that credit, within a 4-day time period.
Table
ACCT |Date | Amount | Credit or debit
-----+----------+---------+----------------
152 |8/14/2017 | 48 | C
152 |8/12/2017 | 22.5 | D
152 |8/12/2017 | 40 | D
152 |8/11/2017 | 226.03 | C
152 |8/10/2017 | 143 | D
152 |8/10/2017 | 107.23 | C
152 |8/10/2017 | 20 | D
152 |8/10/2017 | 49.41 | C
My query should only sum if there is a credit before the debit. The results will have 3 rows with the data above.
Output needed:
acct DateRange credit_amount debit_amount
--------------------------------------------------------------------------
152 2017-10-14 to 2017-10-18 49.41 20
152 2017-10-14 to 2017-10-18 107.23 143
152 2017-10-14 to 2017-10-18 226.03 62.5
The last row sums the two debits that occur before the next credit.
First, find the first credit.
Sum the credits if there is more than one before a debit.
Then find the debit and sum the debits together until the next credit.
I only need the case where the credit date is before the debit date. The 48 on 8/14 is ignored because there is no debit after it.
The logic is to see if the account was credited then debited after it.
My attempt
DECLARE @StartDate DATE
DECLARE @EndDate DATE
DECLARE @OverallEndDate DATE
SET @OverallEndDate = '2017-08-14'
SET @StartDate = '2017-08-10'
SET @EndDate = dateadd(dd, 4, @StartDate);
WITH Dates
AS (
    SELECT @StartDate AS sd, @EndDate AS ed, @OverallEndDate AS od
    UNION ALL
    SELECT dateadd(dd, 1, sd), DATEADD(dd, 1, ed), od
    FROM Dates
    WHERE od > sd
    ), credits
AS (
    SELECT DISTINCT A.Acct, LEFT(CONVERT(VARCHAR, @StartDate, 120), 10) + 'to' + LEFT(CONVERT(VARCHAR, @EndDate, 120), 10) AS DateRange, credit_amount, debit_amount
    FROM (
        SELECT t1.acct, sum(amount) AS credit_amount, MAX(t1.datestart) AS c_datestart
        FROM [Transactions] T1
        WHERE Credit_or_debit = 'C' AND T1.Datestart BETWEEN @StartDate AND @EndDate AND T1.[acct] = '152' AND T1.Datestart <= (
            SELECT MIN(D1.Datestart)
            FROM [Transactions] D1
            WHERE T1.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
            )
        GROUP BY T1.acct
        ) AS A
    CROSS JOIN (
        SELECT t2.acct, sum(amount) AS debit_amount, MAX(t2.datestart) AS c_datestart
        FROM [Transactions] T2 AND T2.DBCR = 'D' AND T2.Datestart BETWEEN @StartDate AND @EndDate AND T2.[acct] = '152' AND T2.Datestart <= (
            SELECT MAX(D1.Datestart)
            FROM [Transactions] D1
            WHERE T2.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
            )
        GROUP BY T2.acct
        ) AS B
    WHERE A.acct = B.acct AND A.c_datestart <= B.d_datestart
    )
SELECT *
FROM credits
OPTION (MAXRECURSION 0)
Update:
The date stored is actually a datetime that includes the time of day. That is how I verify whether the debit comes after the credit.

It should be clear now that you definitely need a column that specifies the sequential order of transactions, because otherwise you can't decide whether a debit is placed before or after a credit when they both have the same datestart. Assuming that you have such a column (in my query I named it ID), a solution could be as follows, without recursion and also without a self-join. The problem can be solved using some of the window functions available since SQL Server 2008.
My solution processes the data in several steps that I implemented as a sequence of 2 CTEs and a final PIVOT query:
DECLARE @StartDate DATE = '20170810';
DECLARE @EndDate DATE = dateadd(dd, 4, @StartDate);
DECLARE @DateRange nvarchar(24);
SET @DateRange =
    CONVERT(nvarchar(10), @StartDate, 120) + ' to '
    + CONVERT(nvarchar(10), @EndDate, 120);
WITH
blocks (acct, CD, amount, blockno, r_blockno) AS (
    SELECT acct, Credit_or_debit, amount
         , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID ASC)
           - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID ASC)
         , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID DESC)
           - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID DESC)
    FROM Transactions
    WHERE datestart BETWEEN @StartDate AND @EndDate
      AND Credit_or_debit IN ('C','D') -- not needed, if always true
),
blockpairs (acct, CD, amount, pairno) AS (
    SELECT acct, CD, amount
         , DENSE_RANK() OVER (PARTITION BY acct, CD ORDER BY blockno)
    FROM blocks
    WHERE (blockno > 0 OR CD = 'C') -- remove leading debits
      AND (r_blockno > 0 OR CD = 'D') -- remove trailing credits
)
SELECT acct, @DateRange AS DateRange
     , amt.C AS credit_amount, amt.D AS debit_amount
FROM blockpairs PIVOT (SUM(amount) FOR CD IN (C, D)) amt
ORDER BY acct, pairno;
And this is how it works:
blocks
Here, the relevant data is retrieved from the table, meaning that the date range filter is applied, and another filter on the Credit_or_debit column makes sure that only the values C and D are contained in the result (if this is the case by design in your table, then that part of the WHERE clause can be omitted). The essential part in this CTE is the difference of two row numbers (blockno). Credits and debits are numbered separately, and their respective row number is subtracted from the overall row number. Within a consecutive block of debits or credits, these numbers will be the same for each record, and they will be different (higher) in later blocks of the same type. The main use of this numbering is to identify the very first block (number 0) in order to be able to exclude it from further processing in the next step in case it's a debit block. To be able to also identify the very last block (and filter it away in the next step if it's a credit block), a similar block numbering is made in the reverse order (r_blockno). The result (which I ordered just for visualization with your sample data) will look like this:
blockpairs
In this CTE, as described before, the very first block is filtered away if it's a debit block, and the very last block is filtered away if it's a credit block. After this, the number of remaining blocks must be even, and the logical order of blocks must be a sequence of pairs of credit and debit blocks, each pair starting with a credit block and followed by its associated debit block. Each pair of credit/debit blocks will result in a single row in the end. To associate the credit and debit blocks correctly in the query, I give them the same number by using separate numberings per type (the n-th credit block and the n-th debit block are associated by giving them the same number n). For this numbering, I use the DENSE_RANK function, so that all records in a block obtain the same number (pairno) and the numbering is gapless. For numbering the blocks of the same type, I reuse the blockno field described above for ordering. The result in your example (again sorted for visualization):
The final PIVOT query
Finally, the credit_amount and debit_amount are aggregated over the respective blocks, grouping by acct and pairno, and then displayed side by side using a PIVOT query.
Although the column pairno isn't visible, it is used for sorting the resulting records.
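The block-and-pair idea described above can be traced with a minimal sketch in Python rather than T-SQL. The function name `pair_credit_debit_blocks` and the input layout are assumptions made for illustration, and the input must already be in transaction order for a single account (the role played by the ID column in the query). Consecutive rows of the same type are collapsed into one block, a leading debit block and a trailing credit block are dropped, and the remaining blocks are paired off.

```python
from decimal import Decimal
from itertools import groupby


def pair_credit_debit_blocks(transactions):
    """transactions: [(kind, amount)] for one account, in transaction order.

    Returns [(credit_total, debit_total)], one tuple per credit block that
    is followed by a debit block.
    """
    # Collapse each run of consecutive credits or debits into one block.
    blocks = [
        (kind, sum(amount for _, amount in run))
        for kind, run in groupby(transactions, key=lambda t: t[0])
    ]
    if blocks and blocks[0][0] == "D":   # remove leading debits
        blocks = blocks[1:]
    if blocks and blocks[-1][0] == "C":  # remove trailing credits
        blocks = blocks[:-1]
    # What remains alternates C, D, C, D, ... so pair neighbours.
    return [(blocks[i][1], blocks[i + 1][1]) for i in range(0, len(blocks), 2)]


# The question's rows, in an assumed chronological order.
sample_transactions = [
    ("C", Decimal("49.41")), ("D", Decimal("20")),
    ("C", Decimal("107.23")), ("D", Decimal("143")),
    ("C", Decimal("226.03")), ("D", Decimal("40")), ("D", Decimal("22.5")),
    ("C", Decimal("48")),
]
```

With this ordering, `pair_credit_debit_blocks(sample_transactions)` gives the three credit/debit pairs requested in the question, and the final credit of 48 is dropped because no debit follows it.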

GROUP BY DAY, CUMULATIVE SUM

I have a table in MSSQL with the following structure:
PersonId
StartDate
EndDate
I need to be able to show the number of distinct people in the table within a date range or at a given date.
As an example, I need to show running totals on a daily basis, e.g. if we have 2 entries on the 1st June, 3 on the 2nd June and 1 on the 3rd June, the system should show the following result:
1st June: 2
2nd June: 5
3rd June: 6
If, however, one of the entries on the 2nd June also has an end date of the 2nd June, then the 3rd June result would show just 5.
Would someone be able to assist with this?
Thanks
UPDATE
This is what I have so far, which seems to work. Is there a better solution, though? My solution only gets me the employed figures. I also need the unemployed figures in another column. Unemployed would mean either no entry in the table, or the date not falling between the start and end dates and no other entry as employed.
CREATE TABLE #Temp(CountTotal int NOT NULL, CountDate datetime NOT NULL);
DECLARE @StartDT DATETIME
SET @StartDT = '2015-01-01 00:00:00'
WHILE @StartDT < '2015-08-31 00:00:00'
BEGIN
    INSERT INTO #Temp(CountTotal, CountDate)
    SELECT COUNT(DISTINCT PERSON.Id) AS CountTotal, @StartDT AS CountDate FROM PERSON
    INNER JOIN DATA_INPUT_CHANGE_LOG ON PERSON.DataInputTypeId = DATA_INPUT_CHANGE_LOG.DataInputTypeId AND PERSON.Id = DATA_INPUT_CHANGE_LOG.DataItemId
    LEFT OUTER JOIN PERSON_EMPLOYMENT ON PERSON.Id = PERSON_EMPLOYMENT.PersonId
    WHERE PERSON.Id > 0 AND DATA_INPUT_CHANGE_LOG.Hidden = '0' AND DATA_INPUT_CHANGE_LOG.Approved = '1'
    AND ((PERSON_EMPLOYMENT.StartDate <= DATEADD(MONTH,1,@StartDT) AND PERSON_EMPLOYMENT.EndDate IS NULL)
    OR (@StartDT BETWEEN PERSON_EMPLOYMENT.StartDate AND PERSON_EMPLOYMENT.EndDate) AND PERSON_EMPLOYMENT.EndDate IS NOT NULL)
    SET @StartDT = DATEADD(MONTH,1,@StartDT)
END
select * from #Temp
drop TABLE #Temp

You can use the following query. The cte part is to generate a set of serial dates between the start date and end date.
DECLARE @ViewStartDate DATETIME
DECLARE @ViewEndDate DATETIME
SET @ViewStartDate = '2015-01-01 00:00:00.000';
SET @ViewEndDate = '2015-02-25 00:00:00.000';
;WITH Dates([Date])
AS
(
    SELECT @ViewStartDate
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date])
    FROM Dates
    WHERE DATEADD(DAY, 1, [Date]) <= @ViewEndDate
)
SELECT [Date], COUNT(PersonData.PersonId) AS PersonCount -- counts matched rows only, so a date with nobody shows 0 rather than 1
FROM Dates
LEFT JOIN PersonData ON Dates.[Date] >= PersonData.StartDate
    AND Dates.[Date] <= PersonData.EndDate
GROUP BY [Date]
OPTION (MAXRECURSION 0) -- the default limit of 100 recursion levels is too low for ranges longer than 100 days
Replace PersonData with your table name.
If the StartDate and EndDate columns can be NULL, you need to add additional conditions to the join.
It assumes one person has only one record in the same date range.

You could do this by creating data where every start date is a +1 event and end date is -1 and then calculate a running total on top of that.
For example if your data is something like this
PersonId StartDate EndDate
1 20150101 20150201
2 20150102 20150115
3 20150101
You first create a data set that looks like this:
EventDate ChangeValue
20150101 +2
20150102 +1
20150115 -1
20150201 -1
And if you use running total, you'll get this:
EventDate Total
2015-01-01 2
2015-01-02 3
2015-01-15 2
2015-02-01 1
You can get it with something like this:
select
p.eventdate,
sum(p.changevalue) over (order by p.eventdate asc) as total
from
(
select startdate as eventdate, sum(1) as changevalue from personnel group by startdate
union all
select enddate, sum(-1) from personnel where enddate is not null group by enddate
) p
order by p.eventdate asc
Using SUM() as a window function with ORDER BY requires SQL Server 2012 or later. If you're using an older version, you can check other options for running totals.
My example in SQL Fiddle
If you have dates that don't have any events and you need to show those too, then the best option is probably to create a separate table of dates for the whole range you'll ever need, for example 1.1.2000 - 31.12.2099.
-- Edit --
To get the count for a specific day, it's possible to use the same logic, but just sum everything up to that day:
declare @eventdate date
set @eventdate = '20150117'
select
    sum(p.changevalue)
from
(
    select startdate as eventdate, 1 as changevalue from personnel
    where startdate <= @eventdate
    union all
    select enddate, -1 from personnel
    where enddate < @eventdate
) p
Hopefully this is ok, can't test since SQL Fiddle seems to be unavailable.
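The +1/-1 event idea can be checked with a minimal sketch in Python rather than T-SQL. The function name `headcount_by_event_date` is an assumption made for illustration. Each start date adds one, each non-empty end date subtracts one, and a running total over the sorted event dates gives the headcount. Unlike the UNION ALL query, this sketch merges a start and an end that fall on the same date into a single net change.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def headcount_by_event_date(rows):
    """rows: [(start_date, end_date_or_None)] -> [(event_date, running_total)]."""
    changes = Counter()
    for start, end in rows:
        changes[start] += 1
        if end is not None:  # an open-ended row never subtracts
            changes[end] -= 1
    dates = sorted(changes)
    totals = accumulate(changes[d] for d in dates)
    return list(zip(dates, totals))


# The three example people from the answer above.
sample_people = [
    (date(2015, 1, 1), date(2015, 2, 1)),
    (date(2015, 1, 2), date(2015, 1, 15)),
    (date(2015, 1, 1), None),
]
```

For `sample_people` the totals are 2, 3, 2 and 1 on 2015-01-01, 2015-01-02, 2015-01-15 and 2015-02-01, the same running total shown in the answer.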

Gap between two dates of different cells

I need help with a script where I got stuck.
MemberId BeginDate EndDate Output
1039725910 3/1/2014 8/10/2014 0 End on 10th August
1039725910 8/11/2014 11/10/2014 1 Start on 11th August, 1 day gap
1039725910 11/11/2014 12/31/2014 1 Start on 11th November, 1 day gap
1166164140 1/1/2014 4/30/2039 0 End on 30 April
1166164140 2/5/2014 12/30/2039 2 Start on 1st May, Here is a 2 days gap
Here, one member has three different begin and end dates. For the first record of each member, the output would be 0. For the 2nd record, the gap would be (2nd BeginDate - 1st EndDate). For the 3rd record, the difference would be (3rd BeginDate - 2nd EndDate), and so on. I am not able to attach any screenshot.
Kindly help me on this.
Regards,
Ratan

You can use the row_number() window function together with a self-join to access the previous row partitioned by MemberId like this:
select
a.MemberId,
a.BeginDate,
a.EndDate,
Output = ISNULL(DATEDIFF(DAY, isnull(b.EndDate, a.BeginDate), a.BeginDate), 0)
from
(select *, rn = ROW_NUMBER() over (partition by memberid order by begindate) from members) a
left join
(select *, rn = ROW_NUMBER() over (partition by memberid order by begindate) from members) b
on a.MemberId = b.MemberId and a.rn - 1 = b.rn
With your sample data this would give you:
MemberId BeginDate EndDate Output
1039725910 2014-03-01 2014-08-10 0
1039725910 2014-08-11 2014-11-10 1
1039725910 2014-11-11 2014-12-31 1
1166164140 2014-01-01 2039-04-30 0
1166164140 2014-05-02 2039-12-30 -9129
If you need to disregard the year component you'll have to do some date arithmetic.

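You can use ROW_NUMBER() together with LAG() (SQL Server 2012 or later).
Try a query like the one given below:
select *,
       case when rno = 1 then 0
            else datediff(day, prev_enddate, begindate) end as difference
from
    (select *,
            row_number() over (partition by MemberId order by BeginDate) as rno,
            lag(EndDate) over (partition by MemberId order by BeginDate) as prev_enddate
     from members)
    tbl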
Check below demo code:
SQLFiddle Demo
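The "previous row" comparison used in these answers can be traced with a minimal sketch in Python rather than T-SQL. The function name `gaps_to_previous` and the tuple layout are assumptions made for illustration. Rows are ordered by member and begin date, and each row's begin date is compared with the end date of the row before it for the same member.

```python
from datetime import date
from itertools import groupby


def gaps_to_previous(rows):
    """rows: [(member_id, begin, end)] -> [(member_id, begin, end, gap_days)].

    gap_days is 0 for a member's first row, otherwise the number of days
    from the previous row's end date to this row's begin date.
    """
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for member, group in groupby(ordered, key=lambda r: r[0]):
        prev_end = None
        for _, begin, end in group:
            gap = 0 if prev_end is None else (begin - prev_end).days
            out.append((member, begin, end, gap))
            prev_end = end
    return out


# The first member from the question's sample data.
sample_members = [
    (1039725910, date(2014, 3, 1), date(2014, 8, 10)),
    (1039725910, date(2014, 8, 11), date(2014, 11, 10)),
    (1039725910, date(2014, 11, 11), date(2014, 12, 31)),
]
```

For `sample_members` the gaps come out as 0, 1 and 1, matching the Output column in the question. Overlapping rows produce a negative gap, as in the -9129 result shown in the first answer.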

Count number of days in a year with a record

I have a SQL Server table named AgentLog in which I store for each agent his daily number of sales.
+-----------+------------+-------------+
| AgentName | Date | SalesNumber |
+-----------+------------+-------------+
| John | 01.01.2014 | 45 |
| Terry | 01.01.2014 | 30 |
| John | 02.01.2014 | 20 |
| Terry | 02.01.2014 | 15 |
| Terry | 03.01.2014 | 52 |
| Terry | 04.01.2014 | 24 |
| Terry | 05.01.2014 | 12 |
| Terry | 06.01.2014 | 10 |
| Terry | 07.01.2014 | 23 |
| John | 08.01.2014 | 48 |
| Terry | 08.01.2014 | 35 |
| John | 09.01.2014 | 37 |
| Terry | 10.01.2014 | 35 |
+-----------+------------+-------------+
If an agent doesn't work on one particular day, there is no record of his sales on that date.
I want to generate a report(query) on a given date interval (ex: 01.01.2014 - 10.01.2014) that counts on how many days an agent wasn't present for work (ex: John - 6 days), was at work (John - 4 days) and also returns the date interval it wasn't present (ex: John 03.01.2014 - 07.01.2014, 10.01.2014) (there can be multiple intervals).

You need to create a custom table and populate it with a record for each date you want in your range (Feel free to go as far back in the past and forward into the future as you feel you may need.). You could do this in Excel very easily and import it.
Select *
from Custom.DateListTable dlt
left outer join agentlog ag
on dlt.Date = ag.Date

I would approach this by getting the number of dates in the interval, as well as the number of dates the agent was at work, and you then have everything you need.
To get the number of days you can use DATEDIFF. It counts day boundaries, so add 1 to include both the first and the last day of the interval:
SELECT DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 AS totalDays;
To get the number of days an agent worked within the interval, you can use the COUNT(*) aggregate function:
SELECT agentName, COUNT(*) AS daysWorked
FROM myTable
WHERE [Date] BETWEEN '2014-01-01' AND '2014-01-10'
GROUP BY agentName;
Then, you can just add to that query to get the days not worked by subtracting totalDays - daysWorked:
SELECT agentName, COUNT(*) AS daysWorked, (DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 - COUNT(*)) AS daysMissed
FROM myTable
WHERE [Date] BETWEEN '2014-01-01' AND '2014-01-10'
GROUP BY agentName;
Here is an SQL Fiddle example.

The only way I can think of to resolve this is to create a temporary table with only one column (datetime) and save in it all the dates from the selected range. You can create a stored procedure that fills that temporary table, using a cursor, with all the dates from the interval. Then do a LEFT JOIN from the temporary table to your table and look for NULL values on your table's side (the days when that person didn't come to work).

Try this...
SET DATEFIRST 1; --Monday
DECLARE @StartDate DATETIME = '2014-01-01',
        @EndDate DATETIME = '2014-01-10';
WITH data as (
    select 0 as i, DATEADD(DAY, 0, @StartDate) as TheDate
    union all
    select i + 1, DATEADD(DAY, i + 1, @StartDate) as TheDate
    from data
    where i < DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT a.AgentName,
       SUM(CASE WHEN c.Date IS NULL THEN 1 ELSE 0 END) AS Missing,
       SUM(CASE WHEN c.Date IS NOT NULL THEN 1 ELSE 0 END) AS Working
FROM Agent a
JOIN data b ON NOT EXISTS(SELECT NULL FROM SpecialDate s WHERE s.date = b.TheDate)
LEFT JOIN AgentLog c ON
    c.AgentName = a.AgentName
    AND c.Date = b.TheDate
WHERE DATEPART(weekday, b.TheDate) <= 5
GROUP BY a.AgentName
OPTION (MAXRECURSION 10000);
It includes a check for weekends, as well as a reference to "SpecialDate" where a list of non working days can be maintained, and excluded from the check.
Reading your question again, I realise that this will only solve half your problem.

NOTE: The following answer mainly addresses the trickiest part of the question, which is how to obtain "absence from work" intervals.
Given these values as Interval Start - End dates:
DECLARE @IntervalStart DATE = '2013-12-30'
DECLARE @IntervalEnd DATE = '2014-01-10'
the following query gives you the "absence from work" intervals:
SELECT AgentName,
       DATEADD(d, 1, t.[Date]) As OffWorkStart,
       DATEADD(d, -1, t.NextDate) As OffWorkEnd
FROM (
    SELECT AgentName, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
           DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
    FROM #AgentLog) t
WHERE t.NextMinusCurrent > 1
-- Get marginal beginning interval (in case such an interval exists)
UNION ALL
SELECT AgentName, @IntervalStart AS OffWorkStart, DATEADD(DAY, -1, MIN([Date])) AS OffWorkEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MIN([Date]) > @IntervalStart
-- Get marginal ending interval (in case such an interval exists)
UNION ALL
SELECT AgentName, DATEADD(DAY, 1, MAX([Date])) AS OffWorkStart, @IntervalEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MAX([Date]) < @IntervalEnd
ORDER By AgentName, OffWorkStart
With the input data you supplied, the above query gives you the following output:
AgentName OffWorkStart OffWorkEnd
---------------------------------------
John 2013-12-30 2013-12-31
John 2014-01-03 2014-01-07
John 2014-01-10 2014-01-10
Terry 2013-12-30 2013-12-31
Terry 2014-01-09 2014-01-09
The idea behind the basic part of the query is to employ the following nested query:
SELECT AgentName,
[Date],
LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog
in order to get any existing gaps between the days a certain agent is present for work. A value of NextMinusCurrent > 1 indicates such a gap.
Counting days is trivial once you have the above query in place. E.g., by placing the above query in a CTE you can count the total number of absence days with something like:
;WITH cte AS (
... query goes here
)
SELECT AgentName, SUM(DATEDIFF(DAY, OffWorkStart, OffWorkEnd) + 1) AS AbsenceDays
FROM cte
GROUP By AgentName
P.S. The above query makes use of SQL Server LEAD function, which is available from SQL SERVER 2012 onwards.
SQL Fiddle here
EDIT:
CTEs together with ROW_NUMBER() can be used to simulate LEAD function. The first part of the query becomes:
;WITH cte1 AS (
SELECT AgentName,
[Date],
ROW_NUMBER() OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As rn
FROM #AgentLog
),
cte2 AS (
SELECT cte1.AgentName, cte1.[Date],
cteLead.[Date] AS NextDate,
DATEDIFF(DAY, cte1.[Date], cteLead.[Date]) As NextMinusCurrent
FROM cte1
LEFT OUTER JOIN cte1 AS cteLead
ON (cte1.rn = cteLead.rn - 1) AND (cte1.AgentName = cteLead.AgentName)
)
SELECT AgentName,
DATEADD(d, 1, cte2.[Date]) As OffWorkStart,
DATEADD(d, -1, cte2.NextDate) As OffWorkEnd
FROM cte2
WHERE NextMinusCurrent > 1
SQL Fiddle for SQL Server 2008 here. I hope it executes in SQL Server 2005 also!
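The interval logic, including the marginal intervals at the start and end of the reporting window, can be traced with a minimal sketch in Python rather than T-SQL. The function names `absence_intervals` and `absence_days` are assumptions made for illustration. The sketch walks the sorted working days with a cursor that marks the first day not yet accounted for, and emits an interval whenever a working day lies beyond the cursor.

```python
from datetime import date, timedelta


def absence_intervals(worked_dates, interval_start, interval_end):
    """Return [(off_start, off_end)] for the days in the window without a record."""
    one_day = timedelta(days=1)
    present = sorted(
        d for d in set(worked_dates) if interval_start <= d <= interval_end
    )
    intervals = []
    cursor = interval_start  # first day not yet accounted for
    for day in present:
        if day > cursor:
            # Everything from the cursor up to the day before is an absence.
            intervals.append((cursor, day - one_day))
        cursor = day + one_day
    if cursor <= interval_end:
        # Trailing absence after the last working day.
        intervals.append((cursor, interval_end))
    return intervals


def absence_days(intervals):
    """Total number of days covered by inclusive (start, end) intervals."""
    return sum((end - start).days + 1 for start, end in intervals)


# John's working days from the question's sample data (dd.mm.yyyy in the source).
john_dates = [date(2014, 1, 1), date(2014, 1, 2), date(2014, 1, 8), date(2014, 1, 9)]
```

With the window 2013-12-30 to 2014-01-10 this reproduces John's three intervals from the output above. With the question's own window, 2014-01-01 to 2014-01-10, it gives 2014-01-03 to 2014-01-07 plus 2014-01-10, which is the 6 absent days mentioned in the question.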
