Counting on 2 Date columns - sql-server

I have two date columns, start and end. I am trying to build a daily inventory going back to the beginning of 2020, with these fields:
Start Date, count of new adds that day, count of closed that day, and count of existing open from previous days. The basic data structure I derive is Start Date, End Date, Request Type (if start date = date of report and end date is null then 'New Add', if end date is not null then 'Work Closed', and if start date is less than date of report then 'Existing Open'). The problem is that these depend on how the report date relates to the open and close dates. I need to group by a date and give the counts for each day. I tried these two solutions, but they didn't work as I had hoped because they are slightly different from my scenario: Count Function on Multiple Columns by Date (SQL Server) and Get count on two different date columns and group by date. What it boils down to is that I need a count for each day based on that day's inventory plus the items still open from previous days.
My basic data structure is like this and is fake data:
+----+------------+----------+
| ID | StartDate  | EndDate  |
+----+------------+----------+
| 1  | 1/1/2020   | NULL     |
| 2  | 12/1/2019  | 1/1/2020 |
| 3  | 1/1/2020   | 1/3/2020 |
| 4  | 12/17/2019 | 1/2/2020 |
+----+------------+----------+
Expected Result:
+-------------+---------+--------------+--------+
| Report Date | NewAdds | ExistingOpen | Closed |
+-------------+---------+--------------+--------+
| 1/1/2020    | 2       | 1            | 1      |
| 1/2/2020    | 0       | 1            | 1      |
| 1/3/2020    | 0       | 1            | 1      |
+-------------+---------+--------------+--------+

declare @report_start date = '20200101';
declare @report_end date = '20200103';
select
d.dt,
count(case when t.start_dt = d.dt then 1 end) as Adds,
/* a null end_dt never satisfies d.dt < t.end_dt, so still-open rows are not counted here */
count(case when d.dt > t.start_dt and d.dt < t.end_dt then 1 end) as Existing,
count(case when t.end_dt = d.dt then 1 end) as Closed
from T t inner join Dates d on d.dt <= coalesce(t.end_dt, @report_end)
where d.dt between @report_start and @report_end
group by d.dt;
Create a table of dates and join against it. Counting is fairly easy at that point.
This is a bad idea because you need to count up across all dates ever. Also I don't know what null end date means. Apologies if this is sloppy as I typed it on my phone.

This is the beginning of a solution that fixes the logic to handle any report date:
If start_date = report_date and (end_date is null or end_date > report_date) then 'New Add'
If end_date is not null and end_date <= report_date then 'Work Closed'
If start_date < report_date and (end_date is null or end_date > report_date) then 'Existing Open'
You need a case expression that will give you one of the three values.
Once you get it working for a single report date, you can generate a range of report dates using this solution and join it with your table: Generate Dates between date ranges
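The three rules above can be checked outside the database with a small sketch. It is written in Python rather than T-SQL only so it runs without a server; the function names request_type and daily_counts are made up for illustration, and the rows are the sample data from the question.

```python
from datetime import date, timedelta
from typing import Optional

def request_type(start: date, end: Optional[date], report: date) -> Optional[str]:
    """Classify one request for one report date, following the three rules above."""
    still_open = end is None or end > report
    if start == report and still_open:
        return "New Add"
    if end is not None and end <= report:
        return "Work Closed"
    if start < report and still_open:
        return "Existing Open"
    return None  # none of the three rules applies on this report date

# Sample rows from the question: (ID, StartDate, EndDate).
rows = [
    (1, date(2020, 1, 1), None),
    (2, date(2019, 12, 1), date(2020, 1, 1)),
    (3, date(2020, 1, 1), date(2020, 1, 3)),
    (4, date(2019, 12, 17), date(2020, 1, 2)),
]

def daily_counts(rows, first: date, last: date):
    """Apply request_type to every row for every report date in [first, last]."""
    out = {}
    day = first
    while day <= last:
        counts = {"New Add": 0, "Work Closed": 0, "Existing Open": 0}
        for _id, start, end in rows:
            kind = request_type(start, end, day)
            if kind is not None:
                counts[kind] += 1
        out[day] = counts
        day += timedelta(days=1)
    return out

report = daily_counts(rows, date(2020, 1, 1), date(2020, 1, 3))
```

Because the second rule uses end_date <= report_date, the 'Work Closed' count is cumulative: for 1/2/2020 this sketch gives 2 closed and 2 existing open. To count only the requests closed on the report date itself, the comparison would need to be end_date = report_date.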

To implement my solution, add a table that holds the calendar.
You have to do several steps to solve the problem:
Establish which tasks are opened in each calendar interval (NewAddTask)
Calculate the total of opened tasks by interval (TotalNewAddTask)
Establish which tasks are closed in each calendar interval (ClosedTask)
Calculate the total of closed tasks by interval (TotalClosedTask)
Calculate the combinations of calendar intervals (ExistingOpenCalendar)
Establish which tasks are existing open for each calendar interval (ExistingOpenDetail)
Calculate the total of existing open tasks by interval (TotalExistingOpenTask)
Finally, combine all the totals with the calendar:
with NewAddTask as
(
SELECT IdCalendar,IdTask
FROM
Calendar CROSS JOIN Task
where StarDate between FirstDate and LastDate
),
TotalNewAddTask as
(
select IdCalendar,count(IdTask) as Total
from NewAddTask
group by IdCalendar
),
ClosedTask as
(
SELECT IdCalendar,IdTask
FROM
Calendar CROSS JOIN Task
where isnull(CloseDate,'2020-12-31') between FirstDate and LastDate
),
TotalClosedTask as
(
select IdCalendar,count(IdTask) as Total
from ClosedTask
group by IdCalendar
),
ExistingOpenCalendar as
(
SELECT
Calendarend.IdCalendar ,
CalendarStart.FirstDate,
Calendarend.LastDate
FROM
Calendar as CalendarStart CROSS JOIN Calendar as Calendarend
where
CalendarStart.FirstDate<Calendarend.LastDate
)
, ExistingOpenDetail as
(
select ExistingOpenCalendar.IdCalendar,Task.IdTask
from ExistingOpenCalendar CROSS JOIN Task
where StarDate between FirstDate and LastDate
and not (isnull(CloseDate,'2020-12-31') between FirstDate and LastDate)
and (CloseDate is null or (CloseDate < LastDate))
)
,TotalExistingOpenTask as
(
select IdCalendar,count(IdTask) as Total
from ExistingOpenDetail
group by IdCalendar
)
select
Calendar.IdCalendar,Calendar.FirstDate ,
isnull(TotalNewAddTask.Total,0)as NewAddTask,
isnull(TotalClosedTask.Total,0)as ClosedTask,
isnull(TotalExistingOpenTask.Total,0)as ExistingOpen
from Calendar
left join TotalNewAddTask on Calendar.IdCalendar=TotalNewAddTask.IdCalendar
left join TotalClosedTask on Calendar.IdCalendar=TotalClosedTask.IdCalendar
left join TotalExistingOpenTask on Calendar.IdCalendar=TotalExistingOpenTask.IdCalendar
This query meets the conditions; the linked example shows it in use.

Related

Update a column with LastExclusionDate

In SQL Server 2012, I have a table t1 where we store a list of excluded products.
I would like to add a column LastExclusionDate to store the date since which the product has been excluded.
Every day, a product is inserted into the table if it is excluded. If it is not excluded, no row is inserted, so the next time the product is excluded there will be a gap in dates since the previous insert.
I would like to find a T-SQL query to update the LastExclusionDate column.
I would like to use it to populate the LastExclusionDate column the first time (initialisation) and then every day to update the column when we insert a new row.
I've tried this query, but I don't know how to get LastExclusionDate:
;WITH Cte AS
(
SELECT
product_id,
CreationDate,
LAG(CreationDate) OVER (PARTITION BY Product_ID ORDER BY CreationDate) AS GapStart,
(DATEDIFF(DAY, LAG(CreationDate) OVER (PARTITION BY Product_id ORDER BY CreationDate), CreationDate) -1) AS GapDays
FROM
#t1
)
SELECT *
FROM cte
Here's some sample data:
+------------+--------------+--------------------------------+
| product_id | CreationDate | LastExclusionDate_(toPopulate) |
+------------+--------------+--------------------------------+
| 100 | 2018-05-01 | 2018-05-01 |
| 100 | 2018-05-02 | 2018-05-01 |
| 100 | 2018-05-03 | 2018-05-01 |
| 100 | 2018-06-01 | 2018-06-01 |
| 100 | 2018-06-02 | 2018-06-01 |
| 200 | 2018-09-01 | 2018-09-01 |
| 200 | 2018-09-02 | 2018-09-01 |
| 200 | 2018-09-17 | 2018-09-17 |
+------------+--------------+--------------------------------+
Thanks
The idea in finding gap-less sequences is to compare the series to a gap-less sequence and find groups of records where the difference of both doesn't change. For example, when the date increases one by one and a row number also does, then the difference between both stays the same and we found a group:
WITH
cte (product_id, CreationDate, grp) AS (
SELECT product_id, CreationDate
, DATEDIFF(day, '19000101', CreationDate)
- ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY CreationDate)
FROM #t1
)
SELECT product_id, CreationDate
, MIN(CreationDate) OVER (PARTITION BY product_id, grp) AS LastExclusionDate
FROM cte
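To see why the difference between the day number and the row number identifies each group, here is a minimal sketch of the same idea in Python (used only so it can be run without a database). The function name last_exclusion_dates is hypothetical, and the rows are the sample data from the question.

```python
from datetime import date
from itertools import groupby

def last_exclusion_dates(rows):
    """rows: iterable of (product_id, creation_date).
    Returns (product_id, creation_date, first date of the consecutive run)."""
    result = []
    ordered = sorted(set(rows))  # order by product_id, then creation_date
    for product_id, group in groupby(ordered, key=lambda r: r[0]):
        dates = [d for _, d in group]
        # Day number minus row position stays constant while dates are consecutive.
        keyed = [(d.toordinal() - i, d) for i, d in enumerate(dates)]
        for _, island in groupby(keyed, key=lambda k: k[0]):
            island_dates = [d for _, d in island]
            start = island_dates[0]  # equivalent of MIN(CreationDate) per group
            result.extend((product_id, d, start) for d in island_dates)
    return result

sample = [
    (100, date(2018, 5, 1)), (100, date(2018, 5, 2)), (100, date(2018, 5, 3)),
    (100, date(2018, 6, 1)), (100, date(2018, 6, 2)),
    (200, date(2018, 9, 1)), (200, date(2018, 9, 2)), (200, date(2018, 9, 17)),
]
filled = last_exclusion_dates(sample)
```

For the sample data this reproduces the LastExclusionDate column shown in the question, for example 2018-06-01 for both June rows of product 100.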
For ongoing daily insertions it can be done with something like this.
INSERT INTO <yourTable>
SELECT
newProduct.[product_id],
newProduct.[creationDate],
isnull(existingProduct.[lastExclusionDate], newProduct.[creationDate]) AS [lastExclusionDate]
FROM
(SELECT <@product_id> AS [product_id], <@creationDate> AS [creationDate]) AS newProduct
LEFT JOIN #temp existingProduct
ON existingProduct.[product_id] = newProduct.product_id
AND existingProduct.[creationDate] = DATEADD(DAY,-1,newProduct.[creationDate])
I've got a demo here http://rextester.com/BDEO23118 . It's a larger than necessary demo because it uses the code above with the data you provided to populate a table row-by-row like you might in a daily update process. It then does individual insertions using this code with some new dates so you can see the way it handles new ranges. (just an FYI, rextester displays result dates in day.month.year hh:mm:ss format, but you can dump the script into management studio and it will output in DATE format)

Horizontal date intervals to Vertical dates (possibly a sql/vba loop solution?)

Is there a quicker way to convert my data, where columns A to D hold personnel information, column E is the leave start date and column F is the leave end date, into the following:
Columns A to D repeated on each row, with column E holding a separate row for each date included in the range?
At the moment I am doing this manually to prepare a large leave taken/clocked in recon.
I should also add that each row contains one interval of leave taken by an employee, and the same employee can appear more than once in the dataset.
I am reading up on SQL scripts, although what I have found doesn't appear to cover this case, with so many rows and intervals to create for each person.
If you want to solve this problem in SQL, then you can use a calendar or dates table for this sort of thing.
For only 152kb in memory, you can have 30 years of dates in a table with this:
/* dates table */
declare @fromdate date = '20000101';
declare @years int = 30;
/* 30 years, 19 used data pages ~152kb in memory, ~264kb on disk */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
select top (datediff(day, @fromdate,dateadd(year,@years,@fromdate)))
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
into dbo.Dates
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date];
create unique clustered index ix_dbo_Dates_date on dbo.Dates([Date]);
Without taking the actual step of creating a table, you can generate an ad hoc table of dates using a common table expression with just this:
declare @fromdate date, @thrudate date;
select @fromdate = min(fromdate), @thrudate = max(thrudate) from dbo.leave;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (datediff(day, @fromdate, @thrudate)+1)
[Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date]
)
Use either like so:
/* `distinct` if there are overlaps or duplicates to remove */
select distinct
l.personid
, d.[Date]
from dbo.leave l
inner join dates d
on d.date >= l.fromdate
and d.date <= l.thrudate;
rextester demo: http://rextester.com/AVOIN59493
from this test data:
create table leave (personid int, fromdate date, thrudate date)
insert into leave values
(1,'20170101','20170107')
,(1,'20170104','20170106') -- overlapped
,(1,'20170420','20170422')
,(2,'20170207','20170207') -- single day
,(2,'20170330','20170405')
returns:
+----------+------------+
| personid | Date |
+----------+------------+
| 1 | 2017-01-01 |
| 1 | 2017-01-02 |
| 1 | 2017-01-03 |
| 1 | 2017-01-04 |
| 1 | 2017-01-05 |
| 1 | 2017-01-06 |
| 1 | 2017-01-07 |
| 1 | 2017-04-20 |
| 1 | 2017-04-21 |
| 1 | 2017-04-22 |
| 2 | 2017-02-07 |
| 2 | 2017-03-30 |
| 2 | 2017-03-31 |
| 2 | 2017-04-01 |
| 2 | 2017-04-02 |
| 2 | 2017-04-03 |
| 2 | 2017-04-04 |
| 2 | 2017-04-05 |
+----------+------------+
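The join against the dates table amounts to expanding each inclusive interval into one row per day and then removing duplicates. A minimal Python sketch of that expansion, using the test data above, looks like this; the function name expand_leave is made up for illustration, and Python is used only so the logic can be run without a database.

```python
from datetime import date, timedelta

def expand_leave(intervals):
    """intervals: iterable of (person_id, from_date, thru_date), both ends inclusive."""
    days = set()  # the set plays the role of `distinct`
    for person_id, start, end in intervals:
        for offset in range((end - start).days + 1):
            days.add((person_id, start + timedelta(days=offset)))
    return sorted(days)

leave = [
    (1, date(2017, 1, 1), date(2017, 1, 7)),
    (1, date(2017, 1, 4), date(2017, 1, 6)),   # overlapped
    (1, date(2017, 4, 20), date(2017, 4, 22)),
    (2, date(2017, 2, 7), date(2017, 2, 7)),   # single day
    (2, date(2017, 3, 30), date(2017, 4, 5)),
]
rows = expand_leave(leave)
```

The overlapped interval adds no extra rows, so the result has the same 18 rows as the table above.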
Number and Calendar table reference:
Generate a set or sequence without loops - 2 - Aaron Bertrand
The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden
Creating a Date Table/Dimension in sql Server 2008 - David Stein
Calendar Tables - Why You Need One - David Stein
Creating a date dimension or calendar table in sql Server - Aaron Bertrand
I actually solved this using another formula from this forum, involving a join with a between on the date intervals.
It worked fine!
P.S. I used the calendar table for another scenario: counting actual work days, taking weekends and public holidays into account.
Thanks

Select all rows within a date range, with at least one row occurring in the last given month

I've run out of ideas for this one and I'm not sure how to do this whatsoever.
Here is my current query, which gives me the rows I need for the date range:
WITH CTE AS
(SELECT FER.*, COUNT(*) OVER (PARTITION BY FER.Report_Subject, FER.Event_Category) AS Event_Count
FROM FacilityEventReport FER INNER JOIN
FacilityEventReport_RMReview RMR ON FER.ID = RMR.FER_ID
WHERE RMR.Review_Status = 'Active' AND FER.Report_About = 'Resident' AND @BeginDate <= FER.Event_Date AND @EndDate >= FER.Event_Date AND (LEN(ISNULL(@Category,'')) = 0 OR @Category = FER.Event_Category))
SELECT *
FROM CTE
WHERE Event_Count > 1
ORDER BY Report_Subject
The above query returns all events that have occurred within the date parameters @BeginDate and @EndDate. @Category is an optional parameter, only used to filter the query by Event_Category. This query returns all of the requested rows, but I need at least ONE of the rows returned to be in the MONTH of the second date parameter (@EndDate).
This query is used for a report.
The parameters @BeginDate and @EndDate are both of type DateTime, as is the field Event_Date.
Does anyone have any ideas?
Thanks,
Kramb
CLARIFICATION:
Condensed Data:
Event_Date | Report_Subject | Event_Category
----------------------------------------------------
2016-01-01 | Patient 1 | Aggressive Act
2016-01-02 | Patient 1 | Aggressive Act
2016-02-01 | Patient 1 | Aggressive Act
2016-01-01 | Patient 2 | Fall
2016-01-02 | Patient 2 | Fall
2016-03-01 | Patient 3 | Fall
If I run the query with the following parameters:
@BeginDate = '2016-01-01';
@EndDate = '2016-02-01';
I want the following data returned:
Event_Date | Report_Subject | Event_Category
----------------------------------------------------
2016-01-01 | Patient 1 | Aggressive Act
2016-01-02 | Patient 1 | Aggressive Act
2016-02-01 | Patient 1 | Aggressive Act
Notice that Patient 1 was returned because they had more than one event of the same type. Also, an event occurred within the month of the second parameter.
Patient 2 did have more than one event, but because those events did not occur in the month of the second parameter the rows were not returned.
There are lots of ways. One is to add this to the final WHERE clause of your current query (before the ORDER BY):
...
AND EXISTS(
SELECT *
FROM CTE t2 WHERE t2.Report_Subject=CTE.Report_Subject
AND DATEDIFF(Month, t2.Event_Date, @EndDate)=0
)
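The combined filter (more than one event in the range, and at least one of them in the month of the end date) can be sketched in Python against the condensed data from the question. This is only an illustration of the logic, not the report query itself; the function name is hypothetical, and unlike the EXISTS clause above, which correlates on Report_Subject only, the sketch assumes the end-month check applies per subject and category.

```python
from collections import defaultdict
from datetime import date

def repeated_events_ending_in_month(events, begin, end):
    """events: iterable of (event_date, subject, category)."""
    groups = defaultdict(list)
    for event_date, subject, category in events:
        if begin <= event_date <= end:  # date-range filter
            groups[(subject, category)].append(event_date)
    kept = []
    for (subject, category), dates in groups.items():
        # At least one event must fall in the same year and month as `end`.
        in_end_month = any((d.year, d.month) == (end.year, end.month) for d in dates)
        if len(dates) > 1 and in_end_month:  # Event_Count > 1 plus the EXISTS check
            kept.extend((d, subject, category) for d in sorted(dates))
    return kept

events = [
    (date(2016, 1, 1), "Patient 1", "Aggressive Act"),
    (date(2016, 1, 2), "Patient 1", "Aggressive Act"),
    (date(2016, 2, 1), "Patient 1", "Aggressive Act"),
    (date(2016, 1, 1), "Patient 2", "Fall"),
    (date(2016, 1, 2), "Patient 2", "Fall"),
    (date(2016, 3, 1), "Patient 3", "Fall"),
]
result = repeated_events_ending_in_month(events, date(2016, 1, 1), date(2016, 2, 1))
```

With the parameters from the question, only the three Patient 1 rows remain.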

Efficient Date Comparisons in SQL

I hope this question provides all of the necessary information, but please do request more if anything is unclear. This is my first question on Stack Overflow, so please bear with me.
I am running this query on SQL Server 2005.
I have a large derived dataset (I'll provide a small subset later) which has 4 fields:
ID,
Year,
StartDate,
EndDate
Within this data set the ID may (correctly) appear multiple times with different date combinations.
The question I have is: what ways are there to identify whether a record is 'new', i.e. its start date does not fall between the start and end date of any other record for the same ID?
For an example take the data set below (I hope this table comes out correctly!);
+----+------+------------+------------+
| ID | Year | Start Date | End Date |
+----+------+------------+------------+
| 1 | 2007 | 01/01/2007 | 10/10/2007 |
| 1 | 2007 | 01/01/2007 | 05/04/2007 |
| 1 | 2007 | 05/04/2007 | 08/10/2007 |
| 1 | 2007 | 15/10/2007 | 20/10/2007 |
| 1 | 2007 | 25/10/2007 | 01/01/2008 |
| 2 | 2007 | 01/01/2007 | 01/01/2008 |
| 2 | 2008 | 01/01/2008 | 15/07/2008 |
| 2 | 2008 | 10/06/2008 | 01/01/2009 |
+----+------+------------+------------+
If we say nothing existed before 2007 then Row 1 and Row 6 are 'new' at that time.
Rows 2, 3, 7 and 8 are not 'new' as they either join the end of a previous record or overlap it to form a continuous date period (take rows 6 and 7: there are no 'breaks' between 01/01/2008 and 01/01/2009).
Rows 4 and 5 would each be considered a new record, as neither attaches directly to the end of the previous period for ID 1 or overlaps any of the other periods.
Currently, to get this data set I have to put all of my data into temporary tables and then join them together on various fields to remove the records I don't want.
First, I remove rows where the start date equals the end date of another row for that ID (this gets rid of rows 3 and 7).
Then I remove rows where the start date is between the start date and end date of other records for that ID (this removes rows 2 and 8).
That leaves me with rows 1, 4, 5 and 6 as the 'new' records, which is correct.
Is there a more efficient way to do this, such as some sort of loop, CTE or (cough) cursor?
As above, if anything is unclear, don't hesitate to ask and I will try to provide the information you request.
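The two removal steps described above can be expressed as a single test per record. The Python sketch below is only a way to check the rule against the sample table; the function name new_records is made up, and the tie-break for records sharing a start date (the longest one counts as 'new') is an assumption added so that rows 1 and 2 come out as in the question.

```python
from datetime import date

def new_records(rows):
    """rows: list of (row_no, id, start, end). Returns the row numbers of 'new' records."""
    result = []
    for row_no, rec_id, start, end in rows:
        covered = False
        for other_no, other_id, o_start, o_end in rows:
            if other_no == row_no or other_id != rec_id:
                continue
            # The other record began earlier and still covers (or touches) this start date.
            starts_inside = o_start < start <= o_end
            # Assumed tie-break: same start date, the longer record (then lower row number) wins.
            same_start_longer = o_start == start and (o_end, -other_no) > (end, -row_no)
            if starts_inside or same_start_longer:
                covered = True
                break
        if not covered:
            result.append(row_no)
    return result

# Sample table from the question (dates are day/month/year there).
rows = [
    (1, 1, date(2007, 1, 1), date(2007, 10, 10)),
    (2, 1, date(2007, 1, 1), date(2007, 4, 5)),
    (3, 1, date(2007, 4, 5), date(2007, 10, 8)),
    (4, 1, date(2007, 10, 15), date(2007, 10, 20)),
    (5, 1, date(2007, 10, 25), date(2008, 1, 1)),
    (6, 2, date(2007, 1, 1), date(2008, 1, 1)),
    (7, 2, date(2008, 1, 1), date(2008, 7, 15)),
    (8, 2, date(2008, 6, 10), date(2009, 1, 1)),
]
new_rows = new_records(rows)
```

On the sample table this yields rows 1, 4, 5 and 6, matching the result described above.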
Try
;with cte as
(
Select *, row_number() over (partition by id order by startdate) rn from yourtable
)
select distinct t1.*
from cte t1
left join cte t2
on t1.ID = t2.ID
and t1.EndDate>=t2.StartDate and t1.StartDate<=t2.EndDate
and t1.rn<>t2.rn
where t2.ID is null
or t1.rn=1
This should work if you have a unique identifier for each row:
select * from
tbl t3
left outer join
(
select distinct t1.id as id_inside, t1.recno as recno_inside
from
tbl t1 inner join
tbl t2 on
t1.id = t2.id and
(t1.startdate <> t2.startdate or t1.enddate <> t2.enddate) and
(t1.startdate >= t2.startdate and t1.enddate <= t2.enddate)
) t4 on
t3.id = t4.id_inside and
t3.recno = t4.recno_inside
where
id_inside is null and
recno_inside is null
sqlfiddle

SQL Query for Date Range, multiple start/end times

A table exists in Microsoft SQL Server with record ID, Start Date, End Date and Quantity.
The idea is that for each record, the quantity/total days in range = daily quantity.
Given that a table containing all possible dates exists, how can I generate a result set in SQL Server to look like the following example?
EX:
RecordID | Start Date | End Date | Quantity
1 | 1/1/2010 | 1/5/2010 | 30000
2 | 1/3/2010 | 1/9/2010 | 20000
3 | 1/1/2010 | 1/7/2010 | 10000
Results as
1 | 1/1/2010 | QTY (I can do the math easy, just need the dates view)
1 | 1/2/2010 |
1 | 1/3/2010 |
1 | 1/4/2010 |
1 | 1/5/2010 |
2 | 1/3/2010 |
2 | 1/4/2010 |
2 | 1/5/2010 |
2 | 1/6/2010 |
2 | 1/7/2010 |
2 | 1/8/2010 |
2 | 1/9/2010 |
3 | 1/1/2010 |
3 | 1/2/2010 |
3 | 1/3/2010 |
3 | 1/4/2010 |
3 | 1/5/2010 |
3 | 1/6/2010 |
3 | 1/7/2010 |
Grouping on dates, I could then get the sum of quantity for each day; however, the final result set can't be aggregated because user-provided filters may exclude some of these records down the road.
EDIT
To clarify, this is just a sample. The filters are irrelevant, as I can join to the side to pull in details related to the record ID in the results.
The real data contains N records, a number that increases weekly, and the dates are never the same. There could be 2000 records with different start and end dates. That is what I want to generate a view for. I can right join onto the data to do the rest of what I need.
I should also mention this is for past, present and future data. I would love to get rid of a temporary table of dates. I was using a recursive query to get all dates within a 50-year span, but this exceeds the MAXRECURSION limit, which I cannot raise inside a view.
Answer
select RecordId,d.[Date], Qty/ COUNT(*) OVER (PARTITION BY RecordId) AS Qty
from EX join Dates d on d.Date between [Start Date] and [End Date]
ORDER BY RecordId,[Date]
NB: The demo CTEs below use the date datatype, which requires SQL Server 2008; the general approach should work on SQL Server 2005 as well, though.
Test Case
/*CTEs for testing purposes only*/
WITH EX AS
(
SELECT 1 AS RecordId,
cast('1/1/2010' as date) as [Start Date],
cast('1/5/2010' as date) as [End Date],
30000 AS Qty
union all
SELECT 2 AS RecordId,
cast('1/3/2010' as date) as [Start Date],
cast('1/9/2010' as date) as [End Date],
20000 AS Qty
),Dates AS /*Dates Table now adjusted to do greater range*/
(
SELECT DATEADD(day,s1.number + 2048*s2.number,'1990-01-01') AS [Date]
FROM master.dbo.spt_values s1 CROSS JOIN master.dbo.spt_values s2
where s1.type='P' AND s2.type='P' and s2.number <= 8
/* no ORDER BY here: SQL Server rejects ORDER BY inside a CTE unless TOP, OFFSET or FOR XML is also specified */
)
select RecordId,d.[Date], Qty/ COUNT(*) OVER (PARTITION BY RecordId) AS Qty
from EX join Dates d on d.Date between [Start Date] and [End Date]
ORDER BY RecordId,[Date]
Results
RecordId Date Qty
----------- ---------- -----------
1 2010-01-01 6000
1 2010-01-02 6000
1 2010-01-03 6000
1 2010-01-04 6000
1 2010-01-05 6000
2 2010-01-03 2857
2 2010-01-04 2857
2 2010-01-05 2857
2 2010-01-06 2857
2 2010-01-07 2857
2 2010-01-08 2857
2 2010-01-09 2857
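The per-day figure is simply the record's quantity divided by the number of days in its inclusive range, which is what Qty / COUNT(*) OVER (PARTITION BY RecordId) computes. A small Python sketch of the same arithmetic, using the two test records above, is shown here; the function name daily_quantities is hypothetical, and Python is used only so the numbers can be checked without a database.

```python
from datetime import date, timedelta

def daily_quantities(records):
    """records: iterable of (record_id, start, end, qty) with inclusive dates."""
    out = []
    for record_id, start, end, qty in records:
        n_days = (end - start).days + 1
        # Integer division: for positive values this matches int / int in T-SQL.
        per_day = qty // n_days
        for offset in range(n_days):
            out.append((record_id, start + timedelta(days=offset), per_day))
    return out

ex = [
    (1, date(2010, 1, 1), date(2010, 1, 5), 30000),
    (2, date(2010, 1, 3), date(2010, 1, 9), 20000),
]
daily = daily_quantities(ex)
```

Note that 20000 over 7 days truncates to 2857 per day, so the daily values for record 2 sum to 19999 rather than 20000. If that matters, the quantity would need to be cast to a decimal type before dividing.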
I think you can try this.
SELECT [Quantities].[RecordID], [Dates].[Date], SUM([Quantity])
FROM [Dates]
JOIN [Quantities] on [Dates].[Date] between [Quantities].[Start Date] and [End Date]
GROUP BY [Quantities].[RecordID], [Dates].[Date]
ORDER BY [Quantities].[RecordID], [Dates].[Date]
