Creating sequential date ranges for items in a queue - sql-server

I have a table 'item_queue' containing items, groups and a sequence number.
Each item is unique and is held against a group, with a number indicating its position in the sequence. The count is a total for that item, e.g.:
group_id|item_id|sequence_order_number|count
--------------------------------------------
A |123 |1 |20
A |124 |2 |30
B |125 |1 |10
Given this information, I am trying to set up sequential start and end datetimes.
The start datetime of the first item in a group is the current time. For example, assume the start of item 123 is '2019-04-04 12:00:00.000'; then
the end datetime would be start + count minutes, so '2019-04-04 12:20:00.000'.
The start of item 124 would equal that end datetime, as it is next in the sequence for that group. Its end is then calculated the same way to be '2019-04-04 12:50:00.000'.
Item 125 would start again at '2019-04-04 12:00:00.000', as it is in a different group.
I have attempted a few ways to do this, and I think the answer is a recursive cte, but I can't wrap my head around it to make it work for one or multiple groups, my unsuccessful attempt for a single group:
;with cte as
(
select
group_id,
item_id,
count,
GETDATE() as start_datetime,
DATEADD(MINUTE, count, GETDATE()) as end_datetime,
iq.sequence_order_number
from item_queue iq
where iq.group_id = 'A'
union all
select
group_id,
item_id,
count,
cte.end_datetime,
DATEADD(MINUTE, count, cte2.end_datetime) as end_datetime,
iq.sequence_order_number
from item_queue iq
inner join cte
on cte.group_id = iq.group_id
and cte.sequence_order_number > iq.sequence_order_number
where iq.group_id = 'A'
)
select * from cte
I suspect the answer may involve a row number window something like
ROW_NUMBER() OVER (Partition By iq.group_id Order By iq.sequence_order_number ASC)
But I have had trouble using it recursively.
Using SQL server 2012, without the ability to upgrade this database.

The minutes you want to add are practically a cumulative sum. The sum() over() window function is available in 2012 and performs exactly that. Try:
select
*,
isnull(sum([count]) over
(
partition by group_id
order by item_id asc
rows between unbounded PRECEDING and 1 PRECEDING
)
,0) as cum_count_start,
sum([count]) over ( partition by group_id order by item_id asc ) as cum_count_end
from item_queue
You already know how to use dateadd after this point.
What the individual window function clauses do:
partition by group_id : Separate (partition) the calculations for each group_id value subset
order by item_id asc : make a virtual sorting of the rows on which the window range will be applied. This relies on item_id increasing in queue order, as it does in the sample data; if it might not, order by sequence_order_number instead.
rows between.... : The actual window. For the start date, we want to consider all the lines from the start (thus unbounded preceding) to the previous one (thus 1 preceding), since you don't want the start date to include the current line's [count]. Note that omitting this clause, as we did for cum_count_end, makes the frame default to range between unbounded preceding and current row. That gives the same running total as rows between unbounded preceding and current row as long as the order by values are unique within each group.
The isnull(...,0) is needed because for the first line of each group_id you want to add 0 to the start date, but the window function sees no rows and returns NULL, so we need to change this to 0.
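For completeness, here is a minimal sketch of the final dateadd step, run against the sample rows from the question. The table variable @item_queue and the variable @queue_start are names made up for this illustration (the real query would read from item_queue and use GETDATE()), and the ordering uses sequence_order_number on the assumption that it defines the queue order.

```sql
-- Illustrative names: @queue_start stands in for GETDATE(), @item_queue for item_queue.
DECLARE @queue_start datetime = '2019-04-04T12:00:00';

DECLARE @item_queue TABLE (
    group_id              char(1) NOT NULL,
    item_id               int     NOT NULL,
    sequence_order_number int     NOT NULL,
    [count]               int     NOT NULL
);

INSERT INTO @item_queue (group_id, item_id, sequence_order_number, [count])
VALUES ('A', 123, 1, 20),
       ('A', 124, 2, 30),
       ('B', 125, 1, 10);

SELECT
    q.group_id,
    q.item_id,
    -- Minutes queued ahead of this item give its start; including it gives its end.
    DATEADD(MINUTE, q.cum_count_start, @queue_start) AS start_datetime,
    DATEADD(MINUTE, q.cum_count_end,   @queue_start) AS end_datetime
FROM (
    SELECT
        group_id,
        item_id,
        sequence_order_number,
        ISNULL(SUM([count]) OVER (
                   PARTITION BY group_id
                   ORDER BY sequence_order_number
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS cum_count_start,
        SUM([count]) OVER (
                   PARTITION BY group_id
                   ORDER BY sequence_order_number
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_count_end
    FROM @item_queue
) AS q
ORDER BY q.group_id, q.sequence_order_number;
```

With the sample data this should give item 123 the range 12:00 to 12:20, item 124 the range 12:20 to 12:50, and item 125 a fresh start at 12:00, which matches the values described in the question.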

Related

Calculating Days Between Dates in Separate Rows For Same UnitID

I am trying to calculate the time a commercial real estate space sits vacant. I have move-in & move-out dates for each tenant that has occupied that unit. It is easy to calculate the occupied time of each tenant as that data is within the same row. However, I want to calculate the vacant time: the time between move-out of the previous tenant and move-in of the next tenant. These dates appear in separate rows.
Here is a sample of what I have currently:
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC
Is there a way to assign an id to each row in chronological, sequential order and find the difference between row 2 move-in date less row 1 move-out date and so on?
Thank you in advance for the help.
I can't really tell which tables provide which columns for your query. Please alias and dot-qualify them in the future.
If you're using SQL 2012 or later, you've got LEAD and LAG functions which do exactly what you want: bring a "leading" or "lagging" row into a current row. See if this works (hopefully it should at least get you started):
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
, LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate) as Previous_Move_Out_Date
, DATEDIFF(day,LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate),tul_dtm_MoveInDate) as Days_Vacant
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC
Just comparing a value from the current row with a value in the previous row is functionality provided by the lag() function.
Try this in your query:
select...
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DateDiff(day, Lag(tul_dtm_MoveOutDate,1) over(partition by uty_vch_Code, tul_int_FacilityId order by tul_dtm_MoveInDate), tul_dtm_MoveInDate) DaysVacant,
...
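If it helps to see lag() in isolation, here is a small self-contained sketch. The table variable @leases, its columns and its three rows are all invented for the illustration; it is only a simplified stand-in for TenantUnitLeases, not the real schema.

```sql
-- @leases is a hypothetical, simplified stand-in for TenantUnitLeases.
DECLARE @leases TABLE (
    unit_id       int  NOT NULL,
    move_in_date  date NOT NULL,
    move_out_date date NULL      -- NULL while the tenant is still in place
);

INSERT INTO @leases (unit_id, move_in_date, move_out_date)
VALUES (26490, '20180101', '20180630'),
       (26490, '20180715', '20190131'),
       (26490, '20190301', NULL);

SELECT
    unit_id,
    move_in_date,
    move_out_date,
    -- Previous tenant's move-out for the same unit; NULL for the first lease.
    LAG(move_out_date) OVER (PARTITION BY unit_id ORDER BY move_in_date) AS previous_move_out_date,
    -- Days between the previous move-out and this move-in; NULL for the first lease.
    DATEDIFF(DAY,
             LAG(move_out_date) OVER (PARTITION BY unit_id ORDER BY move_in_date),
             move_in_date) AS days_vacant
FROM @leases
ORDER BY unit_id, move_in_date;
```

The first lease has no previous row, so both derived columns come back NULL; the second and third rows should show 15 and 29 vacant days respectively.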
This needs a window function or correlated sub query. The goal is to provide the previous move out date for each row, which is in turn a function of that row. The term 'window' in this context means to apply an aggregate function over a smaller range than the whole set.
If you had a function called GetPreviousMoveOutDate, the parameters would be the key to filter on, and the ranges to search within the filter. So we would pass the UnitID as the key and the MoveInDate for this row, and the function should return the most recent MoveOutDate for the same unit that is before the passed in date. By getting the max date before this one, we will ensure we get only the previous occupancy if it exists.
To use a sub-query in ANSI-SQL you just add the select as a column. This should work on MS-SQL as well as other DB platforms; however, it requires using aliases for the table names so they can be referenced in the query more than once. I've updated your sample SQL with aliases using the AS syntax, although it looks redundant to your table naming convention. I added a uni_dtm_UnitFirstAvailableDate to your units table to handle the first vacancy, but this can be a default:
SELECT
uni.uni_vch_UnitNo AS UnitNumber,
uty.uty_vch_Code AS UnitCode,
uty.uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul.tul_dtm_MoveInDate AS Move_In_Date,
tul.tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul.tul_dtm_MoveInDate, tul.tul_dtm_MoveOutDate) AS Occupancy_Days,
-- select the date:
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate < tul.tul_dtm_MoveInDate -- only move-outs before this move-in
AND prev_tul.tul_dtm_MoveOutDate is not null
) AS previous_moveout,
-- use the date in a function (previous move-out first, so the result is positive):
DATEDIFF(day,
ISNULL(
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate < tul.tul_dtm_MoveInDate
AND prev_tul.tul_dtm_MoveOutDate is not null
) , uni.uni_dtm_UnitFirstAvailableDate), -- handle first occupancy
tul.tul_dtm_MoveInDate
) AS Vacancy_Days
FROM TenantUnitLeases AS tul
JOIN units AS uni
ON tul.tul_int_UnitId = uni.uni_int_UnitId
JOIN UnitTypes AS uty
ON uni.uni_int_UnitTypeId = uty.uty_int_Id
WHERE
tul.tul_int_UnitId = '26490'
ORDER BY tul.tul_dtm_MoveInDate ASC

SQL Server issue when grouping by year,month, day

I have the below query where I get the past 6 rows from column 'FileSize' and total them into separate column called 'previous'. What I need is to group the results by year,month, day.
This is what I have:
SELECT DATEPART(DAY,CompleteTime )
, SUM(ja.FileSize)
, SUM(FileSize) OVER (ORDER BY DATEPART(DAY,CompleteTime ) ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) as previous
FROM Jobs_analytics ja
WHERE CompleteTime Between '2020-7-13 00:00:00' AND GETDATE()
GROUP BY DATEPART(DAY,CompleteTime )
However SQL wants me to add the FileSize to the group by clause. But When I do that I get every file in the results set. Since SUM(FileSize) OVER (ORDER BY DATEPART(DAY,CompleteTime ) ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) as previous was in a SUM function I didn't think I needed to include it in the group by clause?
Is there any way I can group my result set by year, month, day?
It's expecting to sum the column FileSize when you want to sum the sum:
SUM(SUM(FileSize)) OVER (ORDER BY DATEPART(DAY,CompleteTime ) ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) as previous
The inner sum() takes care of the group aggregate. The outer sum() over () is the analytic function that looks over the prior rows (which are now grouped and summed themselves.)
SELECT
CAST(CompleteTime AS DATE), SUM(FileSize) AS TotalSize,
SUM(SUM(FileSize)) OVER (
ORDER BY CAST(CompleteTime AS DATE)
ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
) AS Previous
FROM Jobs_analytics
WHERE CompleteTime BETWEEN '2020-07-13 00:00:00' AND GETDATE()
GROUP BY CAST(CompleteTime AS DATE);
Be careful with datepart(day, ...) as it's going to return a value from 1 to 31 and will collide with other months/years once you expand your date range enough to cover multiple dates falling on the same day of month.
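A quick way to see the collision, using two made-up completion times a month apart (the values and the alias t are purely illustrative):

```sql
-- Two hypothetical completion times that fall on the same day of the month.
SELECT
    t.CompleteTime,
    DATEPART(DAY, t.CompleteTime) AS day_of_month,   -- 13 for both rows
    CAST(t.CompleteTime AS date)  AS calendar_date   -- a different date per row
FROM (VALUES (CAST('2020-07-13T08:00:00' AS datetime)),
             (CAST('2020-08-13T09:30:00' AS datetime))) AS t (CompleteTime);
```

Grouping by day_of_month would fold these two rows into one bucket, while grouping by calendar_date keeps them apart, which is why the query above groups on the cast to date.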

Transaction data aggregate

As a disclaimer, I am not entirely sure the title of the question is best, if not I apologize.
I am trying to calculate cycle times for individuals, but files are occasionally transferred out of their work queues and eventually back. There are no unique transaction IDs recorded just a date and time stamp.
I tried looking for aggregate group-by functions and was told that this is not a feature SQL Server has.
I started by trying to identify the first and last transaction and was going to build out the query from there, but it wasn't too helpful. Any insight would be very helpful.
Changedate is when the transfer from one person to another is recorded (year, month, day, time).
select a.claimId,
a.claimincidentID,
cast(a.changeDate as date) changedate,
a.claimNum,
a.Coverage,
a.AssignedAdjID,
a.AssignedAdj,
a.AssignedUnit,
a.TransferedAdjID,
a.TransferedAdj,
a.TransferedUnit,
a.usertypeid,
a.ChangedBy,
b.Feature_Create_Date,
DATEDIFF(day, b.Feature_Create_Date, a.changedate) transfer1,
cast(FIRST_VALUE(changeDate) OVER (ORDER BY changedate ASC)as date) AS firstchangedate,
cast(LAST_VALUE(changeDate) OVER (ORDER BY a.changedate ASC)as date) AS lastchangedate
from DB1.dbo.Assign_Transfer a
left join DB2.claimslist b on a.claimid=b.claimId
group by a.claimId, a.claimincidentID, a.changeDate, a.claimNum, a.Coverage, a.AssignedAdjID, a.AssignedAdj, a.AssignedUnit, a.TransferedAdjID, a.TransferedAdj, a.TransferedUnit, a.usertypeid, a.ChangedBy, b.Feature_Create_Date
Think of each of these rows as a Start (because the most recent one hasn't ended).
We would need to generate the complementary End for this person in the chain.
Then, with pairs of Start/End, one could compute a GrossDuration.
Even after we get an assignment's start and end date/time,
we will have workday (8-4, or 9-5, or noon-8, ...) considerations,
as well as Sat/Sun/holiday and vacation/out-of-office time,
all of which affect Duration differently for each person
and would need to be factored by workday etc. into an AdjDuration.
Let's say we can sequence these:
Row_Number() Over (Partition by claimID Order by changeDate) as tfrNum
Assigned is the prior, and Transfered is the next
1, 2, 3, ... thru N
V
a.changeDate -- NOW()
V V
a.AssignedAdjID, | a.TransferedAdjID,
a.AssignedAdj, | a.TransferedAdj,
a.AssignedUnit, | a.TransferedUnit,
|
a.usertypeid,
a.ChangedBy,
So, is tfrNum=1 or tfrNum=N the oddball??
Let's look at pairs: each pair goes StartFrom->EndTo
1-2, 2-3, 3-4, 4-5, 5-6, 6-Now
----
From row1 we get TransferredID Start(changeDate) and
from row2 we get AssignedAdjID End (changeDate)
-- 2-3, 3-4, 4-5, etc repeating
--except for
From row6 we get TransferredID Start(changeDate) and
from default (still them) End (Now)
-- -- except again when TransferredUnit is "Closed"
After getting these pairs and their Start and End, we can do the Duration calc.
I need to visualize this problem before I try to run some sql. Real data would help.
Let's start with this; I can expand on it later, once you get it working and have looked at some data:
With cte_tfrNum (claimID, changeDate, tfrNum, tfrMax) AS
(
SELECT
a.claimId
,a.changeDate
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM DB1.dbo.Assign_Transfer a
-- just for giggles, lets also get the max# of transfers for this claim
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM DB1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
)
-- Statement using the CTE
Select
tfrTo.*
From cte_tfrNum as tfrTo
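As a rough sketch of the Start/End pairing described above, LEAD() can pull the next transfer's changeDate onto the current row. This assumes the receiving adjuster holds the claim from that row's changeDate until the next transfer, and treats a missing next transfer as "still theirs, ends now". The table variable @Assign_Transfer, its three rows and the adjuster names are invented for the illustration, and the "Closed" case and workday adjustments are not handled.

```sql
-- @Assign_Transfer is a hypothetical cut-down copy of DB1.dbo.Assign_Transfer.
DECLARE @Assign_Transfer TABLE (
    claimId       int         NOT NULL,
    changeDate    datetime    NOT NULL,
    TransferedAdj varchar(50) NOT NULL   -- adjuster receiving the claim
);

INSERT INTO @Assign_Transfer (claimId, changeDate, TransferedAdj)
VALUES (1, '2020-01-02T09:00:00', 'Adjuster X'),
       (1, '2020-01-10T14:00:00', 'Adjuster Y'),
       (1, '2020-01-15T11:00:00', 'Adjuster X');

SELECT
    s.claimId,
    s.TransferedAdj AS HeldBy,
    s.changeDate    AS StartDate,
    s.EndDate,
    DATEDIFF(DAY, s.changeDate, s.EndDate) AS GrossDurationDays
FROM (
    SELECT
        claimId,
        TransferedAdj,
        changeDate,
        -- Next transfer on the same claim ends this holding period;
        -- the latest transfer has no next row, so it runs until now.
        ISNULL(LEAD(changeDate) OVER (PARTITION BY claimId ORDER BY changeDate),
               GETDATE()) AS EndDate
    FROM @Assign_Transfer
) AS s
ORDER BY s.claimId, s.changeDate;
```

Summing GrossDurationDays per HeldBy would then give a gross cycle time per person, before any of the workday or out-of-office adjustments mentioned earlier.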
Thank you! I was able to take what you gave me and add a few things to be able to look at what I needed.
select
case when abc.tfrMax > abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.changeDate),abc.changeDate)
when abc.tfrMax = abc.tfrnum then datediff(day,lag(abc.changedate) over(partition by abc.claimID order by abc.changeDate),abc.changeDate)
end as test
, abc.*
from
(
SELECT
a.claimId
,a.changeDate
,a.AssignedAdj
,a.TransferedAdj
,a.Coverage
,ROW_NUMBER() Over ( Partition By a.claimId Order By a.changeDate) as tfrNum
,b.tfrMax
FROM db1.dbo.Assign_Transfer a
Left Join
(SELECT claimId, COUNT(*) as tfrMax
FROM db1.dbo.Assign_Transfer
Group By claimId
) as b
On b.claimId = a.claimId
) abc
group by
abc.claimId
,abc.changeDate
,abc.AssignedAdj
,abc.TransferedAdj
,abc.Coverage
,abc.tfrMax
,abc.tfrNum

Printing the current value and previous value between the date range

I have a sample data like this
ID DATE TIME STATUS
---------------------------------------------
A 01-01-2000 0900 ACTIVE
A 05-02-2000 1000 INACTIVE
A 01-07-2000 1300 ACTIVE
B 01-05-2005 1000 ACTIVE
B 01-08-2007 1050 ACTIVE
C 01-01-2010 0900 ACTIVE
C 01-07-2010 1900 INACTIVE
From the above data set, if we focus only on ID='A', we note that A was initially active, then became inactive on 05-02-2000, and remained inactive until 01-07-2000.
This means that A was inactive from 05-Feb-2000 to 01-July-2000.
My questions are:
if I execute a query with (ID=A, Date=01-04-2000) it should give me
A 05-02-2000 1000 INACTIVE
because that date is not in the data set, so the query should fall back to the previous entry and print that.
Also, if my condition is (ID=A, Date=01-07-2000), it should print not only the row present in the table for that date but also the previous one:
A 05-02-2000 1000 INACTIVE
A 01-07-2000 1300 ACTIVE
I would really appreciate if any one can assist me solve this query. I am trying my best to solve this.
Thank you every one.
Any take on this?
Afaq
Something like the following should work:
SELECT ID, Date, Time, Status
from (select ID, Date, Time, Status, row_number() over (order by Date desc) Ranking
from MyTable
where ID = @SearchId
and Date <= @SearchDate) xx
where Ranking < 3
order by Date, Time
This will return at most two rows: the two most recent on or before the search date. It's not clear if you are using Date and Time datatyped columns, or if you are actually using reserved words as column names, so you'll have to fuss with that. (I left out Time, but you could easily add that to the various orderings and filterings.)
Given the revised criteria, it gets a bit trickier, as the inclusion or exclusion of a row depends upon the value returned in a different row. Here, the “second” row, if there are two or more rows, is included only if the “first” row equals a particular value. The standard way to do this is to query the data to get the max value, then query it again while referencing the result of the first set.
However, you can do a lot of screwy things with row_number. Work on this:
SELECT ID, Date, Time, Status
from (select
ID, Date, Time, Status
,row_number() over (partition by case when Date = @SearchDate then 0 else 1 end
order by case when Date = @SearchDate then 0 else 1 end
,Date desc) Ranking
from MyTable
where ID = @SearchId
and Date <= @SearchDate) xx
where Ranking = 1
order by Date, Time
You'll have to resolve the date/time issue, since this only works against dates.
Basically you need to pull a row if, for the specified date, it is:
1) the last record, or
2) the last inactive record.
And the two conditions may match the same row as well as two distinct rows.
Here's how this logic could be implemented in SQL Server 2005+:
WITH ranked AS (
SELECT
ID,
Date,
Time,
Status,
RankOverall = ROW_NUMBER() OVER ( ORDER BY Date DESC),
RankByStatus = ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Date DESC)
FROM Activity
WHERE ID = @ID
AND Date <= @Date
)
SELECT
ID,
Date,
Time,
Status
FROM ranked
WHERE RankOverall = 1
OR (Status = 'INACTIVE' AND RankByStatus = 1)
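To try this against the sample rows for ID 'A', something like the following self-contained sketch should do. The table variable @Activity, the column types, and the bracketed column names are assumptions made for the illustration (the real table and datatypes are not shown in the question), and Time is added as a tie-breaker in the ordering.

```sql
-- @Activity is a hypothetical stand-in for the real table; types are assumed.
DECLARE @Activity TABLE (
    ID       char(1)     NOT NULL,
    [Date]   date        NOT NULL,
    [Time]   char(4)     NOT NULL,
    [Status] varchar(10) NOT NULL
);

INSERT INTO @Activity (ID, [Date], [Time], [Status])
VALUES ('A', '20000101', '0900', 'ACTIVE'),
       ('A', '20000205', '1000', 'INACTIVE'),
       ('A', '20000701', '1300', 'ACTIVE');

DECLARE @ID   char(1) = 'A';
DECLARE @Date date    = '20000701';   -- try '20000401' as well

WITH ranked AS (
    SELECT
        ID, [Date], [Time], [Status],
        -- 1 = most recent row on or before @Date
        ROW_NUMBER() OVER (ORDER BY [Date] DESC, [Time] DESC) AS RankOverall,
        -- 1 = most recent row of each status on or before @Date
        ROW_NUMBER() OVER (PARTITION BY [Status]
                           ORDER BY [Date] DESC, [Time] DESC) AS RankByStatus
    FROM @Activity
    WHERE ID = @ID
      AND [Date] <= @Date
)
SELECT ID, [Date], [Time], [Status]
FROM ranked
WHERE RankOverall = 1
   OR ([Status] = 'INACTIVE' AND RankByStatus = 1)
ORDER BY [Date], [Time];
```

With @Date set to 1 July 2000 this should return the 5 February INACTIVE row and the 1 July ACTIVE row; with 1 April 2000 both conditions land on the same 5 February row, so only that one comes back.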

Merge rows based on date in SQL Server

I want to display data based on start date and end date. A code can have several date ranges. If any time intervals are contiguous, I need to merge those rows and display them as a single row.
Here is sample data
Code Start_Date End_Date Volume
470 24-Oct-10 30-Oct-10 28
470 17-Oct-10 23-Oct-10 2
470 26-Sep-10 2-Oct-10 2
471 22-Aug-10 29-Aug-10 2
471 15-Aug-10 21-Aug-10 2
The output result I want is
Code Start_Date End_Date Volume
470 17-Oct-10 30-Oct-10 30
470 26-Sep-10 2-Oct-10 2
471 15-Aug-10 29-Aug-10 4
A code can have any number of time intervals. Please help. Thank you.
Based on your sample data (which I've put in a table called Test), and assuming no overlaps:
;with Ranges as (
select Code,Start_Date,End_Date,Volume from Test
union all
select r.Code,r.Start_Date,t.End_Date,(r.Volume + t.Volume)
from
Ranges r
inner join
Test t
on
r.Code = t.Code and
DATEDIFF(day,r.End_Date,t.Start_Date) = 1
), ExtendedRanges as (
select Code,MIN(Start_Date) as Start_Date,End_Date,MAX(Volume) as Volume
from Ranges
group by Code,End_Date
)
select Code,Start_Date,MAX(End_Date),MAX(Volume)
from ExtendedRanges
group by Code,Start_Date
Explanation:
The Ranges CTE contains all rows from the original table (because some of them might be relevant) and all rows we can form by joining ranges together (both original ranges, and any intermediate ranges we construct - we're doing recursion here).
Then ExtendedRanges (poorly named) finds, for any particular End_Date, the earliest Start_Date that can reach it.
Finally, we query this second CTE, to find, for any particular Start_Date, the latest End_Date that is associated with it.
These two queries combine to basically filter the Ranges CTE down to "the widest possible Start_Date/End_Date pair" in each set of contiguous date ranges.
Sample data setup:
create table Test (
Code int not null,
Start_Date date not null,
End_Date date not null,
Volume int not null
)
insert into Test(Code, Start_Date, End_Date, Volume)
select 470,'24-Oct-10','30-Oct-10',28 union all
select 470,'17-Oct-10','23-Oct-10',2 union all
select 470,'26-Sep-10','2-Oct-10',2 union all
select 471,'22-Aug-10','29-Aug-10',2 union all
select 471,'15-Aug-10','21-Aug-10',2
go
if I understand your request, you're looking for something like:
select code, min(Start_date), max(end_date), sum(volume)
from yourtable
group by code

Resources