T-SQL Issue using lead or lag

I have a table with the columns EVENT_ACTION and TIMESTAMP. EVENT_ACTION has two possible values, 225 and 226.
225 represents the start_time and 226 represents the end_time. Since they are in two different rows, I'm trying to use LAG or LEAD and am having some issues.
Here is what I have so far; the column MRDF is my unique id:
SELECT
f.EVENT_ACTION ,
(f.TIMESTAMP) AS starttime,
LEAD(f.TIMESTAMP) OVER (ORDER BY f.MRDF) AS endtime
FROM
dbo.flext f
WHERE
EVENT_ACTION IN (225,226)
ORDER BY
MRDF, EVENT_ACTION
This is what I'm getting: it's not getting the next row's timestamp as I thought it would.
I'm getting a null value for my last EVENT_ACTION 255. I'm planning to place this into a temp table and only take EVENT_ACTION 225.
As you can see I'm lost :-).
Any help would be appreciated
Mike

I think you want to use f.TIMESTAMP as your ORDER BY for the LEAD(). I think your query should look something more like this:
SELECT
f.EVENT_ACTION ,
(f.TIMESTAMP) AS starttime,
LEAD(f.TIMESTAMP) OVER (ORDER BY f.TIMESTAMP ASC) AS endtime
FROM
dbo.flext f
WHERE
EVENT_ACTION IN (225,226)
ORDER BY MRDF, EVENT_ACTION
However, this will still leave you with a NULL for the endtime of your last 226 record. So you can add a default value to the LEAD() function for this situation. The syntax is:
LEAD ( scalar_expression [ , offset [ , default ] ] ) OVER ( [ partition_by_clause ] order_by_clause )
Using this syntax, your LEAD() would then become:
LEAD(f.TIMESTAMP, 1, GETDATE()) OVER (ORDER BY f.TIMESTAMP ASC) AS endtime
You can replace the GETDATE() with whatever you'd want the default value to be when there is no leading record.
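If the goal is one row per start event, a sketch along the following lines might help. It assumes TIMESTAMP is a datetime-like column (not a rowversion) and that each 225 row is normally followed by its matching 226 row. The CTE name paired and the column next_action are made up for illustration and are not part of the original query.

```sql
-- Sketch only: pair each start row (225) with the row that follows it in time.
-- "paired" and "next_action" are hypothetical names introduced for this example.
WITH paired AS (
    SELECT
        f.MRDF,
        f.EVENT_ACTION,
        f.[TIMESTAMP] AS starttime,
        -- MRDF is added to the ordering only to break ties between equal timestamps
        LEAD(f.[TIMESTAMP])  OVER (ORDER BY f.[TIMESTAMP], f.MRDF) AS endtime,
        LEAD(f.EVENT_ACTION) OVER (ORDER BY f.[TIMESTAMP], f.MRDF) AS next_action
    FROM dbo.flext AS f
    WHERE f.EVENT_ACTION IN (225, 226)
)
SELECT
    MRDF,
    starttime,
    -- keep the end time only when the following row really is an end event
    CASE WHEN next_action = 226 THEN endtime END AS endtime
FROM paired
WHERE EVENT_ACTION = 225
ORDER BY starttime;
```

The filter on 225 sits in the outer query on purpose: a WHERE clause is evaluated before window functions, so filtering inside the CTE would hide the 226 rows from LEAD().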

Related

Calculating Days Between Dates in Separate Rows For Same UnitID

I am trying to calculate the time a commercial real estate space sits vacant. I have move-in & move-out dates for each tenant that has occupied that unit. It is easy to calculate the occupied time of each tenant as that data is within the same row. However, I want to calculate the vacant time: the time between move-out of the previous tenant and move-in of the next tenant. These dates appear in separate rows.
Here is a sample of what I have currently:
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC
Is there a way to assign an id to each row in chronological, sequential order and find the difference between row 2 move-in date less row 1 move-out date and so on?
Thank you in advance for the help.

I can't really tell which tables provide which columns for your query. Please alias and dot-qualify them in the future.
If you're using SQL 2012 or later, you've got LEAD and LAG functions which do exactly what you want: bring a "leading" or "lagging" row into a current row. See if this works (hopefully it should at least get you started):
SELECT
uni_vch_UnitNo AS UnitNumber,
uty_vch_Code AS UnitCode,
uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul_dtm_MoveInDate, tul_dtm_MoveOutDate) AS Occupancy_Days
, LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate) as Previous_Move_Out_Date
, DATEDIFF(day,LAG(tul_dtm_MoveOutDate) over (partition by uni_vch_UnitNo order by tul_dtm_MoveOutDate),tul_dtm_MoveInDate) as Days_Vacant
FROM TenantUnitLeases
JOIN units
ON tul_int_UnitId = uni_int_UnitId
JOIN UnitTypes
ON uni_int_UnitTypeId = uty_int_Id
WHERE
tul_int_UnitId = '26490'
ORDER BY tul_dtm_MoveInDate ASC

Just comparing a value from the current row with a value in the previous row is functionality provided by the lag() function.
Try this in your query:
select...
tul_dtm_MoveInDate AS Move_In_Date,
tul_dtm_MoveOutDate AS Move_Out_Date,
DateDiff(day, Lag(tul_dtm_MoveOutDate,1) over(partition by uty_vch_Code, tul_int_FacilityId order by tul_dtm_MoveInDate), tul_dtm_MoveInDate) DaysVacant,
...
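To see what LAG produces here without touching the real tables, the following self-contained sketch may be useful. The three leases are invented sample values, and the column names are simplified stand-ins for the tul_ columns in the question.

```sql
-- Hypothetical sample leases for one unit (values invented for illustration).
WITH leases AS (
    SELECT *
    FROM (VALUES
        (26490, CAST('2020-01-01' AS date), CAST('2020-06-30' AS date)),
        (26490, CAST('2020-08-01' AS date), CAST('2021-01-31' AS date)),
        (26490, CAST('2021-02-15' AS date), NULL)   -- current tenant, no move-out yet
    ) AS v (UnitId, MoveInDate, MoveOutDate)
)
SELECT
    UnitId,
    MoveInDate,
    MoveOutDate,
    -- move-out date of the previous lease for the same unit
    LAG(MoveOutDate) OVER (PARTITION BY UnitId ORDER BY MoveInDate) AS PreviousMoveOut,
    -- days between the previous move-out and this move-in
    DATEDIFF(day,
             LAG(MoveOutDate) OVER (PARTITION BY UnitId ORDER BY MoveInDate),
             MoveInDate) AS DaysVacant
FROM leases
ORDER BY UnitId, MoveInDate;
-- DaysVacant is NULL for the first lease, 32 for the second and 15 for the third.
```

The first lease has no predecessor, so LAG returns NULL and DATEDIFF propagates it; wrap the LAG in ISNULL or pass a default as its third argument if a different value is wanted there.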

This needs a window function or correlated sub query. The goal is to provide the previous move out date for each row, which is in turn a function of that row. The term 'window' in this context means to apply an aggregate function over a smaller range than the whole set.
If you had a function called GetPreviousMoveOutDate, the parameters would be the key to filter on, and the ranges to search within the filter. So we would pass the UnitID as the key and the MoveInDate for this row, and the function should return the most recent MoveOutDate for the same unit that is before the passed in date. By getting the max date before this one, we will ensure we get only the previous occupancy if it exists.
To use a sub-query in ANSI-SQL you just add the select as a column. This should work on MS-SQL as well as other DB platforms; however, it requires using aliases for the table names so they can be referenced in the query more than once. I've updated your sample SQL with aliases using the AS syntax, although it looks redundant to your table naming convention. I added a uni_dtm_UnitFirstAvailableDate to your units table to handle the first vacancy, but this can be a default:
SELECT
uni.uni_vch_UnitNo AS UnitNumber,
uty.uty_vch_Code AS UnitCode,
uty.uty_int_Id AS UnitID, tul_int_FacilityId AS FacilityID,
tul.tul_dtm_MoveInDate AS Move_In_Date,
tul.tul_dtm_MoveOutDate AS Move_Out_Date,
DATEDIFF(day, tul.tul_dtm_MoveInDate, tul.tul_dtm_MoveOutDate) AS Occupancy_Days,
-- select the date of the most recent move-out before this move-in:
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate < tul.tul_dtm_MoveInDate
AND prev_tul.tul_dtm_MoveOutDate is not null
) AS previous_moveout,
-- use the date in a function (days from the previous move-out to this move-in):
DATEDIFF(day,
ISNULL(
(SELECT MAX (prev_tul.tul_dtm_MoveOutDate )
FROM TenantUnitLeases AS prev_tul
WHERE prev_tul.tul_int_UnitId = tul.tul_int_UnitId
AND prev_tul.tul_dtm_MoveOutDate < tul.tul_dtm_MoveInDate
AND prev_tul.tul_dtm_MoveOutDate is not null
) , uni.uni_dtm_UnitFirstAvailableDate), -- handle first occupancy
tul.tul_dtm_MoveInDate
) AS Vacancy_Days
FROM TenantUnitLeases AS tul
JOIN units AS uni
ON tul.tul_int_UnitId = uni.uni_int_UnitId
JOIN UnitTypes AS uty
ON uni.uni_int_UnitTypeId = uty.uty_int_Id
WHERE
tul.tul_int_UnitId = '26490'
ORDER BY tul.tul_dtm_MoveInDate ASC

TSQL - Matching a date between two dates in another table

I currently have two tables, tbl_Invoices
InvoiceNumber NextBillingDate
------------------------------
100 3/15/21
200 3/31/21
300 4/15/21
400 5/15/21
and tbl_GLPeriods:
GLPeriod PeriodStartDate PeriodEndDate
----------------------------------------------
250 3/3/21 4/3/21
251 4/4/21 5/2/21
252 5/3/21 6/3/21
I need a view that returns a column where the GL period for the next billing date is provided, ie:
InvoiceNumber NextBillingPeriod
---------------------------------
100 250
200 250
300 251
400 252
How do I query to find if one column is between the two columns in another table? I'm blanking on how to do this, thinking something with a CASE.
Edit: where I'm currently at, structurally won't work, but it shows what I'm currently trying to get going:
SELECT
*,
CASE
WHEN tbl_Invoices.NextBillingDate BETWEEN (SELECT PeriodStartDate FROM tbl_GLPeriods) AS stdt
AND (SELECT PeriodEndDate FROM tbl_GLPeriods) AS endt
THEN endt.GLPeriod
END AS NextBillingPeriod
FROM
tbl_Invoices
Solved with this thanks to @Charlieface:
select tbl_Invoices.InvoiceNumber, tbl_GLPeriods.GLPeriod
from tbl_Invoices
left join tbl_GLPeriods on tbl_Invoices.NextBillingDate between tbl_GLPeriods.PeriodStartDate AND tbl_GLPeriods.PeriodEndDate

You can use AND to connect multiple predicates to check for a range with <= and > (or equivalent). That way you can use a correlated subquery similar to what you've tried, provided the periods cannot overlap.
SELECT i.invoicenumber,
(SELECT p.glperiod
FROM tbl_glperiods p
WHERE p.periodstartdate <= i.nextbillingdate
AND dateadd(DAY, 1, p.periodenddate) > i.nextbillingdate) nextbillingperiod
FROM tbl_invoices i;
You can also use a left join. Then the periods can overlap: you'll get multiple rows if a date falls in two or more periods. A join might also perform better.
SELECT i.invoicenumber,
p.glperiod nextbillingperiod
FROM tbl_invoices i
LEFT JOIN tbl_glperiods p
ON p.periodstartdate <= i.nextbillingdate
AND dateadd(DAY, 1, p.periodenddate) > i.nextbillingdate;
Note that you can shorten dateadd(DAY, 1, p.periodenddate) to just p.periodenddate if tbl_glperiods.periodenddate is meant to be an exclusive upper bound. If it's inclusive, you can compare with >= p.periodenddate instead, provided tbl_invoices.nextbillingdate is guaranteed not to be more precise than a day, i.e. it cannot have an hour, minute, second and so on portion. Otherwise you might miss timestamps on the last day past midnight.
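As a quick check, the left join can be run against the sample rows from the question. This is only a sketch: the two CTEs stand in for tbl_Invoices and tbl_GLPeriods, and the m/d/yy values from the question are rewritten here as ISO dates of type date.

```sql
-- Stand-ins for tbl_Invoices and tbl_GLPeriods, filled with the question's sample rows.
WITH invoices AS (
    SELECT *
    FROM (VALUES
        (100, CAST('2021-03-15' AS date)),
        (200, CAST('2021-03-31' AS date)),
        (300, CAST('2021-04-15' AS date)),
        (400, CAST('2021-05-15' AS date))
    ) AS v (InvoiceNumber, NextBillingDate)
), periods AS (
    SELECT *
    FROM (VALUES
        (250, CAST('2021-03-03' AS date), CAST('2021-04-03' AS date)),
        (251, CAST('2021-04-04' AS date), CAST('2021-05-02' AS date)),
        (252, CAST('2021-05-03' AS date), CAST('2021-06-03' AS date))
    ) AS v (GLPeriod, PeriodStartDate, PeriodEndDate)
)
SELECT
    i.InvoiceNumber,
    p.GLPeriod AS NextBillingPeriod
FROM invoices AS i
LEFT JOIN periods AS p
    ON  p.PeriodStartDate <= i.NextBillingDate
    -- half-open upper bound: anything before the day after PeriodEndDate
    AND DATEADD(DAY, 1, p.PeriodEndDate) > i.NextBillingDate
ORDER BY i.InvoiceNumber;
-- Returns 100 -> 250, 200 -> 250, 300 -> 251, 400 -> 252.
```

An invoice whose date falls outside every period would still appear, with NULL as its period, because of the LEFT JOIN.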

select InvoiceNumber, (select GLPeriod from tbl_GLPeriods where NextBillingDate between PeriodStartDate and PeriodEndDate) 'NextBillingPeriod' from tbl_Invoices

Creating sequential date ranges for items in a queue

I have a table 'item_queue' containing items, groups and a sequence number.
Each item is unique and is held against a group with a number indicating the sequence. The count is a total for that item e.g.
group_id|item_id|sequence_order_number|count
--------------------------------------------
A |123 |1 |20
A |124 |2 |30
B |125 |1 |10
Given this information I am trying to set up sequential start and end dates.
The start datetime of the first item for a group is the current time. For example, assume the start of item 123 is '2019-04-04 12:00:00.000'; then
the end datetime would be start + (count * minutes), so '2019-04-04 12:20:00.000'.
The start of item 124 would equal that end date, as it is the next in the sequence for that group. The end is then calculated the same way to be '2019-04-04 12:50:00.000'.
Item 125 would start the time again at '2019-04-04 12:00:00.000', as it is in a different group.
I have attempted a few ways to do this, and I think the answer is a recursive cte, but I can't wrap my head around it to make it work for one or multiple groups, my unsuccessful attempt for a single group:
;with cte as
(
select
group_id,
item_id,
count,
GETDATE() as start_datetime,
DATEADD(MINUTE, count, GETDATE()) as end_datetime,
iq.sequence_order_number
from item_queue iq
where iq.group_id = 'A'
union all
select
group_id,
item_id,
count,
cte.end_datetime,
DATEADD(MINUTE, count, cte2.end_datetime) as end_datetime,
iq.sequence_order_number
from item_queue iq
inner join cte
on cte.group_id = iq.group_id
and cte.sequence_order_number > iq.sequence_order_number
where iq.group_id = 'A'
)
select * from cte
I suspect the answer may involve a row number window something like
ROW_NUMBER() OVER (Partition By iq.group_id Order By iq.sequence_order_number ASC)
But I have had trouble using it recursively.
Using SQL server 2012, without the ability to upgrade this database.

The minutes you want to add are practically a cumulative sum. The sum() over() window function is available in 2012 and performs exactly that. Try:
select
*,
isnull(sum([count]) over
(
partition by group_id
order by item_id asc
rows between unbounded PRECEDING and 1 PRECEDING
)
,0) as cum_count_start,
sum([count]) over ( partition by group_id order by item_id asc ) as cum_count_end
from item_queue
You already know how to use dateadd after this point.
What the individual window function clauses do:
partition by group_id : Separate (partition) the calculations for each group_id value subset.
order by item_id asc : Sort the rows virtually; the window range is applied to this ordering.
rows between.... : The actual window. For the start date, we want to consider all the lines from the start (thus unbounded preceding) to the previous one (thus 1 preceding), since you don't want the start date to include the current line's [count]. Note that omitting this clause, as we did for cum_count_end, defaults to range between unbounded preceding and current row, which gives the same result as rows between unbounded preceding and current row as long as the order by values are unique within each group.
The isnull(...,0) is needed because for the first line of each group_id you want to add 0 to the start date, but the window function sees no rows and returns NULL, so we need to change this to 0.
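Putting the running sums together with DATEADD could look like the sketch below. It is an illustration under a few assumptions: the CTE item_queue is filled with the sample rows from the question, a fixed @start replaces GETDATE() so the result is reproducible, and the rows are ordered by sequence_order_number (the column the question names as the sequence) rather than item_id. The names running, minutes_before and minutes_through are invented for the example.

```sql
-- Fixed start time instead of GETDATE(), so the output matches the question's example.
DECLARE @start datetime = '2019-04-04T12:00:00';

WITH item_queue AS (
    SELECT *
    FROM (VALUES
        ('A', 123, 1, 20),
        ('A', 124, 2, 30),
        ('B', 125, 1, 10)
    ) AS v (group_id, item_id, sequence_order_number, [count])
), running AS (
    SELECT
        group_id,
        item_id,
        sequence_order_number,
        -- minutes consumed by all earlier items in the same group (0 for the first item)
        ISNULL(SUM([count]) OVER (PARTITION BY group_id
                                  ORDER BY sequence_order_number
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS minutes_before,
        -- minutes consumed up to and including the current item
        SUM([count]) OVER (PARTITION BY group_id
                           ORDER BY sequence_order_number
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS minutes_through
    FROM item_queue
)
SELECT
    group_id,
    item_id,
    DATEADD(MINUTE, minutes_before,  @start) AS start_datetime,
    DATEADD(MINUTE, minutes_through, @start) AS end_datetime
FROM running
ORDER BY group_id, sequence_order_number;
-- 123: 12:00 to 12:20, 124: 12:20 to 12:50, 125: 12:00 to 12:10 (all on 2019-04-04).
```

Windowed SUM with a ROWS frame is available from SQL Server 2012, so no recursive CTE is needed.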

Printing the current value and previous value between the date range

I have a sample data like this
ID DATE TIME STATUS
---------------------------------------------
A 01-01-2000 0900 ACTIVE
A 05-02-2000 1000 INACTIVE
A 01-07-2000 1300 ACTIVE
B 01-05-2005 1000 ACTIVE
B 01-08-2007 1050 ACTIVE
C 01-01-2010 0900 ACTIVE
C 01-07-2010 1900 INACTIVE
From the above data set, if we only focus on ID='A' we note that A was initially active, then became inactive on 05-02-2000, and remained inactive until 01-07-2000.
Which means that A was inactive from 05-Feb-2000 to 01-July-2000.
My questions are:
if I execute a query with (ID=A, Date=01-04-2000) it should give me
A 05-02-2000 1000 INACTIVE
because since that date is not available in that data set, it should search for the previous one and print that
Also, if my condition is (ID=A, Date=01-07-2000) it should not only print the value which is present in the table, but also print a previous value
A 05-02-2000 1000 INACTIVE
A 01-07-2000 1300 ACTIVE
I would really appreciate if any one can assist me solve this query. I am trying my best to solve this.
Thank you every one.
Any take on this?
Afaq

Something like the following should work:
SELECT ID, Date, Time, Status
from (select ID, Date, Time, Status, row_number() over (order by Date desc) Ranking
from MyTable
where ID = @SearchId
and Date <= @SearchDate) xx
where Ranking < 3
order by Date, Time
This will return at most two rows: the two most recent ones on or before the search date. It's not clear if you are using Date and Time datatyped columns, or if you are actually using reserved words as column names, so you'll have to fuss with that. (I left out Time, but you could easily add that to the various orderings and filterings.)
Given the revised criteria, it gets a bit trickier, as the inclusion or exclusion of a row depends upon the value returned in a different row. Here, the “second” row, if there are two or more rows, is included only if the “first” row equals a particular value. The standard way to do this is to query the data to get the max value, then query it again while referencing the result of the first set.
However, you can do a lot of screwy things with row_number. Work on this:
SELECT ID, Date, Time, Status
from (select
ID, Date, Time, Status
,row_number() over (partition by case when Date = @SearchDate then 0 else 1 end
order by case when Date = @SearchDate then 0 else 1 end
,Date desc) Ranking
from MyTable
where ID = @SearchId
and Date <= @SearchDate) xx
where Ranking = 1
order by Date, Time
You'll have to resolve the date/time issue, since this only works against dates.

Basically you need to pull a row if, for the specified date, it is:
1) the last record, or
2) the last inactive record.
And the two conditions may match the same row as well as two distinct rows.
Here's how this logic could be implemented in SQL Server 2005+:
WITH ranked AS (
SELECT
ID,
Date,
Time,
Status,
RankOverall = ROW_NUMBER() OVER ( ORDER BY Date DESC),
RankByStatus = ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Date DESC)
FROM Activity
WHERE ID = @ID
AND Date <= @Date
)
SELECT
ID,
Date,
Time,
Status
FROM ranked
WHERE RankOverall = 1
OR (Status = 'INACTIVE' AND RankByStatus = 1)
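The two-rank idea can be tried on the sample rows from the question with a sketch like this one. The Activity CTE is a stand-in for the real table, the columns are renamed EventDate and EventTime to avoid the Date/Time naming question, and the DD-MM-YYYY values are rewritten as ISO dates. It needs SQL Server 2008 or later because of the date type and the initialised DECLARE.

```sql
DECLARE @ID   char(1) = 'A';
DECLARE @Date date    = '2000-07-01';

-- Stand-in for the real table, using rows from the question.
WITH Activity AS (
    SELECT *
    FROM (VALUES
        ('A', CAST('2000-01-01' AS date), '0900', 'ACTIVE'),
        ('A', CAST('2000-02-05' AS date), '1000', 'INACTIVE'),
        ('A', CAST('2000-07-01' AS date), '1300', 'ACTIVE'),
        ('C', CAST('2010-01-01' AS date), '0900', 'ACTIVE'),
        ('C', CAST('2010-07-01' AS date), '1900', 'INACTIVE')
    ) AS v (ID, EventDate, EventTime, Status)
), ranked AS (
    SELECT
        ID, EventDate, EventTime, Status,
        -- 1 = most recent row overall on or before @Date
        ROW_NUMBER() OVER (ORDER BY EventDate DESC, EventTime DESC) AS RankOverall,
        -- 1 = most recent row of each status on or before @Date
        ROW_NUMBER() OVER (PARTITION BY Status
                           ORDER BY EventDate DESC, EventTime DESC) AS RankByStatus
    FROM Activity
    WHERE ID = @ID
      AND EventDate <= @Date
)
SELECT ID, EventDate, EventTime, Status
FROM ranked
WHERE RankOverall = 1
   OR (Status = 'INACTIVE' AND RankByStatus = 1)
ORDER BY EventDate, EventTime;
-- With @Date = '2000-07-01' this returns the INACTIVE row of 2000-02-05 and the ACTIVE row of 2000-07-01.
-- With @Date = '2000-04-01' it returns only the INACTIVE row of 2000-02-05.
```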

Gaps in recurring series of a group with datetime [duplicate]

We have a table with following data
Id,ItemId,SeqNumber,DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2
We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.
In the above expected result, the MissingNumber column is the first occurring missing number, just to illustrate the example.
We will not have a case where we would have a complete series reset at one time, i.e. it can be either a series rundown from 255 to 0 as for ItemId 100 or 0 to 255 as for ItemId 300. Hence we need to identify missing sequence numbers whether they are in ascending order (0,1,...255) or in descending order (254,254,0,2) etc.
How can we accomplish this in T-SQL?

Could work like this:
;WITH b AS (
SELECT *
,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
FROM tbl
), x AS (
SELECT
b.Id
,b.ItemId AS prev_Itm
,b.SeqNumber AS prev_Seq
,c.ItemId AS next_Itm
,c.SeqNumber AS next_Seq
FROM b
JOIN b c ON c.rn = b.rn + 1 -- next row
WHERE c.ItemId = b.ItemId -- only with same ItemId
AND c.SeqNumber <> (b.SeqNumber + 1)%256 -- Seq cycles modulo 256
)
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM x
UNION ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM x
ORDER BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest version - or so I think.
Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.
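On SQL Server 2012 or later the self-join on row numbers could be swapped for LEAD, which is a different technique from the query above but applies the same modulo-256 test. The sketch below is self-contained: the tbl CTE holds the sample rows from the question, and the names stepped, prev_Trx, next_Trx and MissingNumber are chosen for this example only.

```sql
-- Sample rows from the question, so the query can run on its own.
WITH tbl AS (
    SELECT *
    FROM (VALUES
        (1, 100, 254, CAST('2011-12-01T09:00:00' AS datetime)),
        (2, 100,   1, '2011-12-01T09:10:00'),
        (3, 200,   7, '2011-12-02T11:00:00'),
        (4, 200,   5, '2011-12-02T10:00:00'),
        (5, 100, 255, '2011-12-01T09:05:00'),
        (6, 200,   3, '2011-12-02T09:00:00'),
        (7, 300,   0, '2011-12-03T10:00:00'),
        (8, 300, 255, '2011-12-03T11:00:00'),
        (9, 300,   1, '2011-12-03T10:30:00')
    ) AS v (Id, ItemId, SeqNumber, DateTimeTrx)
), stepped AS (
    SELECT
        ItemId,
        SeqNumber   AS prev_Seq,
        DateTimeTrx AS prev_Trx,
        -- next row of the same item in transaction order; SeqNumber breaks ties
        LEAD(SeqNumber)   OVER (PARTITION BY ItemId ORDER BY DateTimeTrx, SeqNumber) AS next_Seq,
        LEAD(DateTimeTrx) OVER (PARTITION BY ItemId ORDER BY DateTimeTrx, SeqNumber) AS next_Trx
    FROM tbl
)
SELECT
    ItemId,
    prev_Seq,
    prev_Trx,
    next_Seq,
    next_Trx,
    (prev_Seq + 1) % 256 AS MissingNumber   -- first number missing after prev_Seq
FROM stepped
WHERE next_Seq IS NOT NULL                   -- the last row of each item has no successor
  AND next_Seq <> (prev_Seq + 1) % 256       -- sequence cycles modulo 256
ORDER BY ItemId, prev_Trx;
-- Gaps found: item 100 (255 -> 1, missing 0), item 200 (3 -> 5, missing 4; 5 -> 7, missing 6),
-- item 300 (1 -> 255, missing 2).
```

Each result row carries both neighbours of a gap; unpivot it with UNION ALL, as in the query above, if one row per "Previous" and "Next" is required.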
