Grouping based on Lag/Lead

In SQL Server 2014, I have the following table that tracks user activity:
USER_ID      EVENT       EVENT_DATE
15552221111  LOGIN       2022-06-01
15552221111  COMPLETE    2022-06-08
15552221111  LOGIN       2022-09-01
15552221111  SHUTDOWN    2022-09-11
15552222222  LOGIN       2022-04-01
15552222222  PROCESSING  2022-04-08
15552222222  PROCESSING  2022-06-10
15552222222  COMPLETE    2022-06-11
15552222222  LOGIN       2022-09-08
I need to create some sort of sequencing value per user, so that every record whose event falls within 60 days of the user's previous event shares the same number, and a gap of more than 60 days starts a new number. Desired result:
USER_ID      EVENT       EVENT_DATE  SEQ
15552221111  LOGIN       2022-06-01  1
15552221111  COMPLETE    2022-06-08  1
15552221111  LOGIN       2022-09-01  2
15552221111  SHUTDOWN    2022-09-11  2
15552222222  LOGIN       2022-04-01  1
15552222222  PROCESSING  2022-04-08  1
15552222222  PROCESSING  2022-06-10  2
15552222222  COMPLETE    2022-06-11  2
15552222222  LOGIN       2022-09-08  3
Totally stuck, any ideas?
Here's some test code:
WITH testTable (USERID, EVENT, EVENT_DATE) AS
(
    SELECT 15552221111, 'LOGIN', '2022-06-01' UNION ALL
    SELECT 15552221111, 'COMPLETE', '2022-06-08' UNION ALL
    SELECT 15552221111, 'LOGIN', '2022-09-01' UNION ALL
    SELECT 15552221111, 'SHUTDOWN', '2022-09-11' UNION ALL
    SELECT 15552222222, 'LOGIN', '2022-04-01' UNION ALL
    SELECT 15552222222, 'PROCESSING', '2022-04-08' UNION ALL
    SELECT 15552222222, 'PROCESSING', '2022-06-10' UNION ALL
    SELECT 15552222222, 'COMPLETE', '2022-06-11' UNION ALL
    SELECT 15552222222, 'LOGIN', '2022-09-08'
)
SELECT
    USERID
    , EVENT
    , EVENT_DATE
    , LEAD (EVENT_DATE, 1, 0) OVER (PARTITION BY USERID ORDER BY EVENT_DATE) NEXT_DATE
    , ROW_NUMBER() OVER (PARTITION BY USERID ORDER BY EVENT_DATE) RECORD_SEQ
FROM testTable

I would use LAG() to flag each row whose EVENT_DATE is more than 60 days after the previous row's EVENT_DATE for the same user, and then take a cumulative SUM() OVER (..) of that flag to get the SEQ that you want:
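-- Append to the WITH testTable (...) CTE from the question, hence the leading comma
, CTE AS
(
    SELECT
        USERID
        , EVENT
        , EVENT_DATE
        -- 1 when the gap to the previous event exceeds 60 days, else 0 (also 0 for a user's first row)
        , CASE WHEN DATEDIFF(DAY
                    , LAG (EVENT_DATE, 1) OVER (PARTITION BY USERID ORDER BY EVENT_DATE)
                    , EVENT_DATE) > 60
               THEN 1
               ELSE 0
          END AS S
    FROM testTable
)
SELECT *, SUM(S) OVER (PARTITION BY USERID ORDER BY EVENT_DATE) + 1 AS SEQ
FROM CTE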
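
To see why the running sum produces the groups, it can help to look at the gap itself for one user. This is only a sketch: it inlines the second user's rows from the question with VALUES, casts the dates explicitly, and the names gaps and GAP_DAYS are my own, not from the question.

```sql
-- Hypothetical inline sample: the rows of user 15552222222 from the question
WITH testTable (USERID, EVENT, EVENT_DATE) AS
(
    SELECT v.USERID, v.EVENT, CAST(v.EVENT_DATE AS date)
    FROM (VALUES
        (15552222222, 'LOGIN',      '2022-04-01'),
        (15552222222, 'PROCESSING', '2022-04-08'),
        (15552222222, 'PROCESSING', '2022-06-10'),
        (15552222222, 'COMPLETE',   '2022-06-11'),
        (15552222222, 'LOGIN',      '2022-09-08')
    ) AS v (USERID, EVENT, EVENT_DATE)
),
gaps AS
(
    -- Days since the same user's previous event (NULL for the first event)
    SELECT USERID, EVENT, EVENT_DATE,
           DATEDIFF(DAY,
                    LAG(EVENT_DATE) OVER (PARTITION BY USERID ORDER BY EVENT_DATE),
                    EVENT_DATE) AS GAP_DAYS
    FROM testTable
)
SELECT USERID, EVENT, EVENT_DATE, GAP_DAYS,
       -- Every gap above 60 days bumps the running total, which starts a new SEQ
       1 + SUM(CASE WHEN GAP_DAYS > 60 THEN 1 ELSE 0 END)
               OVER (PARTITION BY USERID ORDER BY EVENT_DATE
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SEQ
FROM gaps
ORDER BY USERID, EVENT_DATE;
-- GAP_DAYS: NULL, 7, 63, 1, 89
-- SEQ:      1,    1, 2,  2, 3
```

Note that this compares each event with the one immediately before it, so a long chain of events that are each less than 60 days apart stays in one group even if the first and last are far apart.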

Related

SQL Server Window Paging Based on # of Groups

Given the following table structure
Column
-----------
Id
Name
DateCreated
with the following data
Id  Name   DateCreated
1   Joe    1/13/2021
2   Fred   1/13/2021
3   Bob    1/12/2021
4   Sue    1/12/2021
5   Sally  1/10/2021
6   Alex   1/9/2021
I need SQL that will page over the data based on DateCreated. The query should return the top 3 records, plus any record that shares a DateCreated with one of the top 3.
So given the data above, we should get back Joe, Fred and Bob (as the top 3 records) plus Sue, since Sue has the same date as Bob.
Is there something like ROW_NUMBER that only increments when it encounters a different value?
For some context, this query is being used to generate an agenda-type view, and once we select any date we want to keep all data for that date together.
EDIT
I do have a solution but it smells:
;WITH CTE AS (
    SELECT ROW_NUMBER() OVER(ORDER BY DateCreated DESC) RowNum, CAST(DateCreated AS DATE) DateCreated, Name
    FROM MyTable
),
PAGE AS (
    SELECT *
    FROM CTE
    WHERE RowNum <= 5
)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated = (SELECT MIN(DateCreated) FROM Page)

I've used a TOP 3 WITH TIES example and a ROW_NUMBER example and a CTE to return four records:
DROP TABLE IF EXISTS #tmp
GO
CREATE TABLE #tmp (
    Id INT PRIMARY KEY,
    name VARCHAR(20) NOT NULL,
    dateCreated DATE
)
GO
INSERT INTO #tmp VALUES
    ( 1, 'Joe', '13 Jan 2021' ),
    ( 2, 'Fred', '13 Jan 2021' ),
    ( 3, 'Bob', '12 Jan 2021' ),
    ( 4, 'Sue', '12 Jan 2021' ),
    ( 5, 'Sally', '10 Jan 2021' ),
    ( 6, 'Alex', '9 Jan 2021' )
GO
-- Gets same result
SELECT TOP 3 WITH TIES *
FROM #tmp t
ORDER BY dateCreated DESC

;WITH cte AS (
    SELECT ROW_NUMBER() OVER( ORDER BY dateCreated DESC ) rn, *
    FROM #tmp
)
SELECT *
FROM #tmp t
WHERE EXISTS
(
    SELECT *
    FROM cte c
    WHERE rn <= 3
    AND t.dateCreated = c.dateCreated
)

As @Charlieface pointed out, we only need to replace ROW_NUMBER with DENSE_RANK, so that rows with the same DateCreated share the same number.
When we run the query:
SELECT DENSE_RANK () OVER(ORDER BY DateCreated DESC) RowNum,CAST(DateCreated AS DATE) DateCreated,Name
FROM MyTable
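With the data from the question, rows that share a date get the same number: Joe and Fred get 1, Bob and Sue get 2, Sally gets 3, and Alex gets 4.
As a result, we can set RowNum<=3 in the query to get every row for the three most recent dates: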
;WITH CTE AS (
    SELECT DENSE_RANK() OVER(ORDER BY DateCreated DESC) RowNum, CAST(DateCreated AS DATE) DateCreated, Name
    FROM MyTable
),
PAGE AS (
    SELECT *
    FROM CTE
    WHERE RowNum <= 3
)
SELECT *
FROM Page
UNION
SELECT *
FROM CTE
WHERE DateCreated = (SELECT MIN(DateCreated) FROM Page)
The first result is from your query and the second is from the query above. The two queries return the same rows.
Kindly let us know if you need more information.
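
For comparison, here is a small sketch that puts the three numbering functions side by side. It assumes the six rows from the question are inlined as MyTable; the column aliases Rnk and DenseRnk are mine. If I read the question correctly, filtering on RANK() <= 3 would return Joe, Fred, Bob and Sue (the same rows as TOP 3 WITH TIES), while DENSE_RANK() <= 3 returns every row for the three most recent dates, which also includes Sally.

```sql
-- Hypothetical inline copy of the question's data
WITH MyTable (Name, DateCreated) AS
(
    SELECT v.Name, CAST(v.DateCreated AS date)
    FROM (VALUES
        ('Joe',   '20210113'),
        ('Fred',  '20210113'),
        ('Bob',   '20210112'),
        ('Sue',   '20210112'),
        ('Sally', '20210110'),
        ('Alex',  '20210109')
    ) AS v (Name, DateCreated)
)
SELECT Name, DateCreated,
       ROW_NUMBER() OVER (ORDER BY DateCreated DESC) AS RowNum,   -- 1 to 6, ties broken arbitrarily
       RANK()       OVER (ORDER BY DateCreated DESC) AS Rnk,      -- 1, 1, 3, 3, 5, 6
       DENSE_RANK() OVER (ORDER BY DateCreated DESC) AS DenseRnk  -- 1, 1, 2, 2, 3, 4
FROM MyTable
ORDER BY DateCreated DESC;
```

Which one fits depends on whether a "page" means a number of rows (RANK) or a number of distinct dates (DENSE_RANK).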

How to SELECT from multiple ranges of time

Let's say I have a table in SQL Server named Events. It contains some events with timestamps.
ID TimeStamp EventDescription
1 '2019-04-04 08:20' Machine Error 1
2 '2019-04-04 09:01' Machine Error 2
3 '2019-04-05 09:23' Machine Error 3
4 '2019-04-05 12:23' Machine Error 4
5 '2019-04-06 11:33' Machine Error 5
6 '2019-04-06 18:07' Machine Error 6
7 '2019-04-07 12:23' Machine Error 7
In addition, I have a second table named Ranges. It contains ranges of time.
ID From To
1 '2019-04-04 08:00' '2019-04-04 09:00'
2 '2019-04-05 10:30' '2019-04-05 16:00'
3 '2019-04-06 10:00' '2019-04-06 12:00'
I need to SELECT the events from table Events whose TimeStamp falls within any of the time ranges in table Ranges.
The result:
ID TimeStamp EventDescription
1 '2019-04-04 08:20' Machine Error 1
4 '2019-04-05 12:23' Machine Error 4
5 '2019-04-06 11:33' Machine Error 5
I have no idea what to do.
Do I have to use dynamic SQL to build this query?

A correlated subquery can be used here:
SELECT *
FROM Events E
WHERE EXISTS (SELECT 1 FROM Ranges R WHERE E.TimeStamp BETWEEN R.[From] AND R.[To])

I created your data as temp tables:
SELECT 1 ID, CAST('2019-04-04 08:20' AS DATETIME) TimeStamp, 'Machine Error 1' EventDescription
INTO #Events
UNION
SELECT 2 ID, CAST('2019-04-04 09:01' AS DATETIME) TimeStamp, 'Machine Error 2' EventDescription
UNION
SELECT 3 ID, CAST('2019-04-05 09:23' AS DATETIME) TimeStamp, 'Machine Error 3' EventDescription
UNION
SELECT 4 ID, CAST('2019-04-05 12:23' AS DATETIME) TimeStamp, 'Machine Error 4' EventDescription
UNION
SELECT 5 ID, CAST('2019-04-06 11:33' AS DATETIME) TimeStamp, 'Machine Error 5' EventDescription
UNION
SELECT 6 ID, CAST('2019-04-06 18:07' AS DATETIME) TimeStamp, 'Machine Error 6' EventDescription
UNION
SELECT 7 ID, CAST('2019-04-07 12:23' AS DATETIME) TimeStamp, 'Machine Error 7' EventDescription

SELECT 1 ID, CAST('2019-04-04 08:00' AS DATETIME) [From], CAST('2019-04-04 09:00' AS DATETIME) [To]
INTO #Ranges
UNION
SELECT 2 ID, CAST('2019-04-05 10:30' AS DATETIME) [From], CAST('2019-04-05 16:00' AS DATETIME) [To]
UNION
SELECT 3 ID, CAST('2019-04-06 10:00' AS DATETIME) [From], CAST('2019-04-06 12:00' AS DATETIME) [To]
And then it's as simple as joining them together:
SELECT E.*
FROM #Ranges R
JOIN #Events E ON E.TimeStamp BETWEEN R.[From] AND R.[To]
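
One difference between the JOIN and the EXISTS version is worth keeping in mind if ranges can overlap. The sketch below uses made-up data (one event, two overlapping ranges, held in table variables that are not part of the question) to show that the JOIN returns the event once per matching range, while EXISTS returns it at most once.

```sql
-- Hypothetical data: one event that falls inside two overlapping ranges
DECLARE @Events TABLE (ID int, [TimeStamp] datetime);
DECLARE @Ranges TABLE (ID int, [From] datetime, [To] datetime);

INSERT INTO @Events VALUES (1, '2019-04-04T08:20:00');
INSERT INTO @Ranges VALUES
    (1, '2019-04-04T08:00:00', '2019-04-04T09:00:00'),
    (2, '2019-04-04T08:10:00', '2019-04-04T08:30:00');

-- JOIN: one output row per matching range, so event 1 is listed twice
SELECT E.ID
FROM @Events E
JOIN @Ranges R ON E.[TimeStamp] BETWEEN R.[From] AND R.[To];

-- EXISTS: event 1 is listed once, however many ranges contain it
SELECT E.ID
FROM @Events E
WHERE EXISTS (SELECT 1
              FROM @Ranges R
              WHERE E.[TimeStamp] BETWEEN R.[From] AND R.[To]);
```

With the non-overlapping ranges from the question both forms return the same three rows. BETWEEN is inclusive at both ends in either form.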

SQL Server: fill a range with dates from overlapping intervals with priority

I need to fill the range from 2017-04-01 to 2017-04-30 with the data from this table, knowing that the highest priority records should prevail over those with lower priorities
id startValidity endValidity priority
-------------------------------------------
1004 2017-04-03 2017-04-30 1
1005 2017-04-10 2017-04-22 2
1010 2017-04-19 2017-04-23 3
1006 2017-04-24 2017-04-28 2
1008 2017-04-26 2017-04-28 3
In practice I would need to get a result like this:
id startValidity endValidity priority
--------------------------------------------
1004 2017-04-03 2017-04-09 1
1005 2017-04-10 2017-04-18 2
1010 2017-04-19 2017-04-23 3
1006 2017-04-24 2017-04-25 2
1008 2017-04-26 2017-04-28 3
1004 2017-04-29 2017-04-30 1

I can't think of a more elegant or more efficient solution right now . . .
-- Sample Table
declare @tbl table
(
    id int,
    startValidity date,
    endValidty date,
    priority int
)
-- Sample Data
insert into @tbl select 1004, '2017-04-03', '2017-04-30', 1
insert into @tbl select 1005, '2017-04-10', '2017-04-22', 2
insert into @tbl select 1010, '2017-04-19', '2017-04-23', 3
insert into @tbl select 1006, '2017-04-24', '2017-04-28', 2
insert into @tbl select 1008, '2017-04-26', '2017-04-28', 3
-- Query
; with
date_range as -- find the min and max date for generating list of dates
(
    select start_date = min(startValidity), end_date = max(endValidty)
    from @tbl
),
dates as -- gen the list of dates using recursive CTE
(
    select rn = 1, date = start_date
    from date_range
    union all
    select rn = rn + 1, date = dateadd(day, 1, d.date)
    from dates d
    where d.date < (select end_date from date_range)
),
cte as -- for each date, get the ID based on priority
(
    -- grp stays constant over consecutive dates won by the same id;
    -- date is added to the ordering so the numbering is deterministic
    select *, grp = row_number() over(order by id, date) - rn
    from dates d
    outer apply
    (
        select top 1 x.id, x.priority
        from @tbl x
        where x.startValidity <= d.date
        and x.endValidty >= d.date
        order by x.priority desc
    ) t
)
-- final result
select id, startValidity = min(date), endValidty = max(date), priority
from cte
group by grp, id, priority
order by startValidity
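
One caveat about the recursive dates CTE: SQL Server stops a recursive CTE after 100 levels by default, which is enough for the 28 days in the sample but not for a range longer than about three months. A minimal sketch of the date generator on its own, assuming the fixed range from the question is passed in as variables (@from and @to are my names):

```sql
-- Assumed bounds: the range mentioned in the question
DECLARE @from date = '2017-04-01', @to date = '2017-04-30';

WITH dates AS
(
    SELECT @from AS [date]
    UNION ALL
    SELECT DATEADD(DAY, 1, [date])
    FROM dates
    WHERE [date] < @to
)
SELECT [date]
FROM dates
OPTION (MAXRECURSION 0);  -- 0 removes the default limit of 100 recursion levels
```

The same OPTION clause can be appended to the final select of the query above if longer ranges are expected. A permanent calendar table would be another way to avoid the recursion entirely.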

I do not understand the purpose of the calendar CTE or table, so I am not using any recursive CTE or calendar.
Maybe I haven't understood the requirement completely.
Try this with different sample data:
declare @tbl table
(
    id int,
    startValidity date,
    endValidty date,
    priority int
)
-- Sample Data
insert into @tbl select 1004, '2017-04-03', '2017-04-30', 1
insert into @tbl select 1005, '2017-04-10', '2017-04-22', 2
insert into @tbl select 1010, '2017-04-19', '2017-04-23', 3
insert into @tbl select 1006, '2017-04-24', '2017-04-28', 2
insert into @tbl select 1008, '2017-04-26', '2017-04-28', 3
;With CTE as
(
    select *, ROW_NUMBER() over(order by startValidity) rn
    from @tbl
)
,CTE1 as
(
    -- each row ends the day before the next row starts; the last row keeps its own end
    select c.id, c.startvalidity
        , isnull(dateadd(day, -1, c1.startvalidity), c.endValidty) Endvalidity
        , c.[priority], c.rn
    from cte c
    left join cte c1
        on c.rn + 1 = c1.rn
)
select id, startvalidity, Endvalidity, priority from cte1
union ALL
-- the lowest-priority id fills the rest of the month after the last row
select id, startvalidity, Endvalidity, priority from
(
    select top 1 id, ca.startvalidity, ca.Endvalidity, priority from cte1
    cross apply(
        select top 1
            dateadd(day, 1, endvalidity) startvalidity
            , dateadd(day, -1, dateadd(month, datediff(month, 0, endvalidity) + 1, 0)) Endvalidity
        from cte1
        order by rn desc) CA
    order by priority
) t4
--order by startvalidity --if req

How to select only data with consecutive row number starting from 1

I have a table similar to the one below.
What I want to do is select the rows that have consecutive RowNo values for the same job name, but only if the sequence begins with RowNo = 1. Here is the sample output:
Hope you can help. Thank you.

Try this
DECLARE @Tbl TABLE (RowNo INT, Jobname NVARCHAR(50), AuditDate DATETIME)
INSERT INTO @Tbl
SELECT 3, 'Backup Database Sales', '2016.07.26' UNION ALL
SELECT 1, 'Send Autoemail Sales Report', '2016.07.26' UNION ALL
SELECT 2, 'Send Autoemail Sales Report', '2016.07.25' UNION ALL
SELECT 3, 'Send Autoemail Sales Report', '2016.07.24' UNION ALL
SELECT 4, 'Update Sales Stats', '2016.07.23' UNION ALL
SELECT 4, 'Update Sales Stats', '2016.07.22' UNION ALL
SELECT 1, 'Generate new item codes', '2016.07.26' UNION ALL
SELECT 2, 'Generate new item codes', '2016.07.25'
;WITH CTE
AS
(
    SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Id FROM @Tbl
)
SELECT
    *
FROM
    @Tbl T
WHERE
    EXISTS
    (
        SELECT TOP 1
            1
        FROM
        (
            -- Id - RowNo is constant within a run of consecutive RowNo values
            SELECT
                C.Jobname,
                MIN(C.RowNo) MinRowNo,
                MAX(C.RowNo) MaxRow
            FROM
                CTE C
            GROUP BY
                C.Jobname,
                C.Id - C.RowNo
        ) A
        WHERE
            A.MinRowNo <> A.MaxRow AND
            A.MinRowNo = 1 AND
            A.Jobname = T.Jobname AND
            T.RowNo BETWEEN A.MinRowNo AND A.MaxRow
    )
Output
RowNo Jobname AuditDate
----------- -------------------------------------------------- -----------------------
1 Send Autoemail Sales Report 2016-07-26 00:00:00.000
2 Send Autoemail Sales Report 2016-07-25 00:00:00.000
3 Send Autoemail Sales Report 2016-07-24 00:00:00.000
1 Generate new item codes 2016-07-26 00:00:00.000
2 Generate new item codes 2016-07-25 00:00:00.000

SELECT T1.*
FROM
    YourTable T1
INNER JOIN
    YourTable T2
    ON (T1.RowNo = 1 AND T2.RowNo = 1 AND T1.JobName = T2.Jobname)
    OR (T1.RowNo > 1 AND T1.RowNo - 1 = T2.RowNo AND T1.JobName = T2.Jobname)
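
Another way to express "consecutive starting from 1" is to compare RowNo with its rank inside the job. This is only a sketch, assuming RowNo is always a positive integer and reusing the sample rows from the first answer; the names Ranked and Pos are mine. A row is kept when its RowNo equals its DENSE_RANK within the job, which can only happen when every value from 1 up to that RowNo exists for the job.

```sql
DECLARE @Tbl TABLE (RowNo INT, Jobname NVARCHAR(50), AuditDate DATETIME);

INSERT INTO @Tbl (RowNo, Jobname, AuditDate) VALUES
    (3, 'Backup Database Sales',       '20160726'),
    (1, 'Send Autoemail Sales Report', '20160726'),
    (2, 'Send Autoemail Sales Report', '20160725'),
    (3, 'Send Autoemail Sales Report', '20160724'),
    (4, 'Update Sales Stats',          '20160723'),
    (4, 'Update Sales Stats',          '20160722'),
    (1, 'Generate new item codes',     '20160726'),
    (2, 'Generate new item codes',     '20160725');

WITH Ranked AS
(
    -- Pos counts the distinct RowNo values up to and including this one, per job
    SELECT RowNo, Jobname, AuditDate,
           DENSE_RANK() OVER (PARTITION BY Jobname ORDER BY RowNo) AS Pos
    FROM @Tbl
)
SELECT RowNo, Jobname, AuditDate
FROM Ranked
WHERE RowNo = Pos          -- true only while 1, 2, ..., RowNo are all present
ORDER BY Jobname, RowNo;
```

On the sample data this returns the two 'Generate new item codes' rows and the three 'Send Autoemail Sales Report' rows. Unlike the first answer, it would also return a job whose only row is RowNo = 1, so add a count check if single-row jobs should be excluded.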

Is it possible to write single query for following scenario?

Is it possible to write single query for following scenario?
Scenario -
Table -
id  date      isPaid
1   1/1/2011  1
2   1/2/2011  1
3   1/3/2011  0
4   1/4/2011  0
5   1/5/2011  0
I want a result set that contains all rows with isPaid = 1, plus only one row with isPaid = 0: the one with the earliest date.
Result set:
id  date      isPaid
1   1/1/2011  1
2   1/2/2011  1
3   1/3/2011  0
Thanks

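You can use UNION ALL:
SELECT
    [id],
    [date],
    [isPaid]
FROM
    [tablename]
WHERE
    [ispaid] = 1
UNION ALL
SELECT
    [id],
    [date],
    [isPaid]
FROM
(
    -- TOP 1 needs its own ORDER BY inside a derived table;
    -- an ORDER BY after the UNION would sort the combined result instead
    SELECT TOP 1
        [id],
        [date],
        [isPaid]
    FROM
        [tablename]
    WHERE
        [ispaid] = 0
    ORDER BY
        [date] ASC
) AS firstUnpaid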

This should do what you need in SQL Server 2005 and higher.
select
    [id],
    [date],
    isPaid
from (
    select
        [id],
        [date],
        isPaid,
        ROW_NUMBER() over (partition by ispaid order by [date]) as [row]
    from table_name t ) a
where ispaid = 1
or [row] = 1
order by [date]

Assuming the user supplies the date for which data is to be retrieved (the ? placeholders are the query parameters):
select t.*
from [table] t,
    (select top 1 id, [date], ispaid
     from [table]
     where ispaid = 0 and [date] < ?
     order by [date]) np
where (t.ispaid = 1 and t.[date] = ?) or (t.id = np.id)