Observations in Group Per Any Rolling 5 Minute Period (Snowflake SQL)

I have a table with 2 columns, here is a sample:
ID  DateTime
1   2022-04-01 13:19:15
1   2022-04-01 13:20:19
1   2022-04-01 15:01:37
2   2022-04-01 10:08:21
2   2022-04-01 12:09:32
2   2022-04-01 15:07:25
I am trying to build a SQL query in Snowflake that returns all of the IDs that have 2 or more records within ANY rolling 5 minute window. For the example data provided, ID 1 would be returned but ID 2 would not, since all times for that ID are more than 5 minutes apart.
When attempting to find solutions to this problem, it seems that a recursive CTE may be the correct approach to solve this, but I can't seem to understand exactly how to approach this.
Any support or guidance would be appreciated!

Using a CTE for the sample data:
with data_cte (id, date1) as
(
select * from values
(1,'2022-04-01 13:19:15'::timestamp),
(1,'2022-04-01 13:20:19'::timestamp),
(1,'2022-04-01 15:01:37'::timestamp),
(2,'2022-04-01 10:08:21'::timestamp),
(2,'2022-04-01 12:09:32'::timestamp),
(2,'2022-04-01 15:07:25'::timestamp)
), twindow_cte as (
select
id,
-- seconds until the next record of the same id (NULL for the last record);
-- 'minute' would count minute boundaries crossed, not elapsed minutes
TIMESTAMPDIFF('second', date1,
lead(date1) over (partition by id order by date1)) tdiff
from data_cte)
-- any gap of at most 300 seconds means two records share a 5 minute window
select distinct id from twindow_cte
where tdiff <= 300;
ID
1
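The idea of comparing each record with its neighbour inside the same ID can also be tried locally, without a Snowflake account. Below is a minimal sketch in Python using the standard-library sqlite3 module. The table name observations and the strftime-based time arithmetic are assumptions of this sketch (SQLite dialect, not Snowflake syntax). It relies on the observation that an ID has two records inside some 5 minute window exactly when two consecutive records, ordered by time, are at most 300 seconds apart.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (id INTEGER, date1 TEXT)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [
        (1, "2022-04-01 13:19:15"),
        (1, "2022-04-01 13:20:19"),
        (1, "2022-04-01 15:01:37"),
        (2, "2022-04-01 10:08:21"),
        (2, "2022-04-01 12:09:32"),
        (2, "2022-04-01 15:07:25"),
    ],
)

# For every record, compute the gap in seconds to the previous record of the
# same id. The first record of each id has no predecessor, so its gap is NULL
# and it is dropped by the WHERE clause.
QUERY = """
WITH gaps AS (
    SELECT id,
           CAST(strftime('%s', date1) AS INTEGER)
             - CAST(strftime('%s', LAG(date1) OVER (
                   PARTITION BY id ORDER BY date1)) AS INTEGER) AS gap_seconds
    FROM observations
)
SELECT DISTINCT id
FROM gaps
WHERE gap_seconds <= 300
ORDER BY id
"""

ids_with_close_pair = [row[0] for row in conn.execute(QUERY)]
print(ids_with_close_pair)
```

For the sample data, only ID 1 has two records close enough together (13:19:15 and 13:20:19 are 64 seconds apart). No recursive CTE is needed for the "at least 2 records" case; a larger minimum count would need a different approach.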

Related

subquery returned more than 1 value table join

I am writing a course project (something like a program for the hotel manager) and I need a little help. I have tables Reservations and Rooms and I need to calculate the amount of payment after the client leaves the room ((End_date - Start_date) * price_per_day), but I'm having trouble getting the price_per_day from the table Rooms.
My query only works if there is one record in the Reservations table. If there are 2 or more, I get the error "subquery returned more than 1 value" and I don't know how to fix it (the problem is in this part of the query: SELECT price_per_day FROM Rooms AS ro JOIN Reservations AS re ON ro.room_id = re.room_id).
I'm using Visual Studio 2019 + SQL Server Express LocalDB.
I will be grateful for any help or hint!
UPDATE Reservations
SET Amount_payable = (
DATEDIFF(day, CONVERT(datetime, Start_date, 104), CONVERT(datetime, End_date, 104) * (SELECT price_per_day FROM Rooms AS ro JOIN Reservations AS re ON ro.room_id = re.room_id))
)
WHERE Status = 'Archived'
Table Reservations
reservation_id customer_id room_id start_date end_date status Amount_payable
1 3 3 12.04.2020 05.06.2020 Archived 0
2 2 4 11.04.2020 30.05.2020 Active 0
Table Rooms
reservation_id room_id number_of_persons room_type price_per_day
0 1 3 Double 300
0 2 4 Triple 600
0 3 3 Studio 400
2 4 2 Single 444

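You need a slightly different approach. The subquery is not tied to the row being updated, so it returns one price for every reservation that has a matching room. Join Rooms in the UPDATE's FROM clause instead, so that each reservation is matched to the price of its own room.
Try the following: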
UPDATE re
SET
Amount_payable = (DATEDIFF(day, CONVERT(DATETIME, Start_date, 104), CONVERT(DATETIME, End_date, 104)) * price_per_day)
FROM Reservations re
JOIN Rooms AS ro ON ro.room_id = re.room_id
WHERE STATUS = 'Archived';

DISTINCT and GROUP BY with SQL Server

I have the following table (SQL Server) and I'm looking for a query to select the last two rows, with all fields:
order by created_at
group by / distinct type_id
id type_id some_value created_at
1 B mk2 2016-10-01 00:00:00.000
2 A mbs 2016-10-01 10:02:39.077
3 B sa 2016-10-02 10:03:08.123
4 A xc 2016-10-02 10:03:28.777
5 B q1 2016-10-03 10:04:20.920
6 A tr 2016-10-03 10:04:48.533
7 A 1a 2016-09-30 10:36:26.287
In MySQL it's an easy task, but with SQL Server all fields have to be contained in either an aggregate function or the GROUP BY clause. That results in field combinations that do not exist in the table.
Is there a way to handle this?
Thanks in advance!

Solution
Based on the comment from Andrew Deighton, I did this:
SELECT *
FROM (
SELECT
id,
type_id,
some_value,
created_at,
ROW_NUMBER()
OVER (PARTITION BY type_id
ORDER BY created_at DESC) AS row
FROM test_sql
) AS ts
WHERE row = 1
ORDER BY row
Conclusion: No need for GROUP BY and DISTINCT.
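The ROW_NUMBER approach can be checked against the sample table. This is a minimal sketch in Python with the standard-library sqlite3 module, so it runs SQLite SQL rather than T-SQL; the alias rn and the final ORDER BY type_id are choices made for this sketch. Each type_id partition is numbered from newest to oldest, and only row number 1 is kept.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_sql (id INTEGER, type_id TEXT, some_value TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO test_sql VALUES (?, ?, ?, ?)",
    [
        (1, "B", "mk2", "2016-10-01 00:00:00.000"),
        (2, "A", "mbs", "2016-10-01 10:02:39.077"),
        (3, "B", "sa", "2016-10-02 10:03:08.123"),
        (4, "A", "xc", "2016-10-02 10:03:28.777"),
        (5, "B", "q1", "2016-10-03 10:04:20.920"),
        (6, "A", "tr", "2016-10-03 10:04:48.533"),
        (7, "A", "1a", "2016-09-30 10:36:26.287"),
    ],
)

# Number the rows of each type_id from newest to oldest, then keep the newest.
LATEST_PER_TYPE = """
SELECT id, type_id, some_value, created_at
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY type_id ORDER BY created_at DESC) AS rn
    FROM test_sql
) AS ranked
WHERE rn = 1
ORDER BY type_id
"""

latest = conn.execute(LATEST_PER_TYPE).fetchall()
for row in latest:
    print(row)
```

Row 7 has the highest id for type A but an older created_at, so the row kept for type A is id 6, and for type B it is id 5.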

Calculate Bounce Rate SQL Server 2008

I'm trying to calculate the Bounce Rate of pages in SQL Server in a table with Audit Data from SharePoint.
ItemId UserId DocLocation Occurred
1 1 Home.aspx 2016-08-02 13:39:41
1 2 Home.aspx 2016-08-02 13:40:07
2 1 Other.aspx 2016-08-02 13:40:16
3 1 Items.aspx 2016-08-02 13:40:17
2 2 Other.aspx 2016-08-02 13:40:11
ItemId is the id of the page, DocLocation the location of the page and Occurred when the user goes into the page.
To calculate the bounce rate we have to divide the number of bounces by the total number of visits.
A bounce happens when a user leaves the page in less than 5 seconds.
This should be the results for that table:
ItemId Bounces Visits BounceRate(Bounces/Visits)
1 1 2 0.5
2 1 2 0.5
3 0 1 0
I want to count a bounce by measuring how much time passes between a user's visit to a page and that same user's next visit to another page. If that time is less than 5 seconds, it counts as a bounce.
I'm writing a stored procedure that runs the query to show the bounce rate of each page, but it doesn't work.
SELECT
SUM(CASE
WHEN (DATEDIFF(second, @Occurred,
(SELECT TOP 1 a.Occurred
FROM [AuditPages] a
WHERE a.UserId = @userId
AND a.Occurred > @occurred
ORDER BY a.Occurred ASC))) < 30
THEN 1.0
ELSE 0.0
END) / COUNT(@itemId)
Does anyone know how I can calculate this Bounce Rate?
Thanks for all the answers.

I like using row_number for this type of sequenced problem. The query below gives the desired result. I find performance with CTEs can sometimes be problematic with larger tables and you may need to convert to a temp table. You might consider using milliseconds if there is a chance you would want to use 4.5 seconds or such in the future.
declare @bounce_seconds int = 5;
with audit_cte as (
select *, ROW_NUMBER() over (partition by UserId order by Occurred) row_num
from AuditPages
--order by UserId,row_num
)
select a.ItemId, sum(a.bounce) Bounces, count(1) Visits, sum(a.bounce)/convert(float, count(1)) BounceRate
from (
select a1.ItemId, datediff(s,a1.Occurred, a2.Occurred) elapsed, case when datediff(s,a1.Occurred, a2.Occurred) < @bounce_seconds then 1 else 0 end bounce
from audit_cte a1
left join audit_cte a2
on a2.UserId = a1.UserId
and a2.row_num = a1.row_num + 1
--order by a1.UserId, a1.row_num
) a
group by a.ItemId
order by a.ItemId;

SELECT ItemId,COUNT(1) VISITS,SUM(BOUNCE_IND) BOUNCE, cast(SUM(BOUNCE_IND) as decimal(5,2))/cast(COUNT(1) as decimal(5,2)) BOUNCE_RATE
FROM (
Select
UserID,
ItemID,
DocLocation,
Occurred as Entry_time,
Lead(Occurred,1) Over (Partition by Userid order by Occurred) Exit_time,
CASE WHEN DATEDIFF(ss,Occurred,Lead(Occurred,1) Over (Partition by Userid order by Occurred)) < 5 THEN 1 ELSE 0 END BOUNCE_IND
FROM Web_Data_Sample
) TBL GROUP BY ItemId
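Both answers pair each visit with the same user's next visit and count a bounce when the gap is under 5 seconds. The sketch below reproduces that logic on the sample data using Python's standard-library sqlite3 module. It is SQLite SQL, not T-SQL, and the column alias seconds_on_page is an assumption of this sketch. A user's last visit has no following visit, so its gap is NULL and it is counted as a visit but not as a bounce.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AuditPages (ItemId INTEGER, UserId INTEGER, DocLocation TEXT, Occurred TEXT)"
)
conn.executemany(
    "INSERT INTO AuditPages VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Home.aspx", "2016-08-02 13:39:41"),
        (1, 2, "Home.aspx", "2016-08-02 13:40:07"),
        (2, 1, "Other.aspx", "2016-08-02 13:40:16"),
        (3, 1, "Items.aspx", "2016-08-02 13:40:17"),
        (2, 2, "Other.aspx", "2016-08-02 13:40:11"),
    ],
)

# seconds_on_page = time between this visit and the same user's next visit.
BOUNCE_RATE = """
WITH visits AS (
    SELECT ItemId,
           CAST(strftime('%s', LEAD(Occurred) OVER (
                   PARTITION BY UserId ORDER BY Occurred)) AS INTEGER)
             - CAST(strftime('%s', Occurred) AS INTEGER) AS seconds_on_page
    FROM AuditPages
)
SELECT ItemId,
       SUM(CASE WHEN seconds_on_page < 5 THEN 1 ELSE 0 END) AS Bounces,
       COUNT(*) AS Visits,
       SUM(CASE WHEN seconds_on_page < 5 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS BounceRate
FROM visits
GROUP BY ItemId
ORDER BY ItemId
"""

result = conn.execute(BOUNCE_RATE).fetchall()
for row in result:
    print(row)
```

On the sample data, user 2 leaves Home.aspx after 4 seconds and user 1 leaves Other.aspx after 1 second, which gives the rates 0.5, 0.5 and 0 from the question's expected table.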

How can i use sql query for the following

My data table has the sample time in one column and the sample value in another, with data like the following:
sampletime value
----------------------------
2016-03-02 08:31:14 1
2016-03-02 09:31:14 2
2016-03-02 12:31:14 3
2016-03-04 08:31:14 4
2016-03-04 09:31:14 5
2016-03-05 08:31:14 3
I need the two earliest sample times for each day. How can I group the rows?
Query
SELECT rn.sampletime AS stime
FROM rn_qos_data_0007 rn
INNER JOIN s_qos_data qos
ON qos.table_id = rn.table_id
AND qos.qos = 'QOS_CPU_USAGE'
AND Substring(qos.origin, 1, 4) = 'A0C3'
AND qos.host = '10.98.48.100'
WHERE rn.sampletime BETWEEN '2016/01/01' AND '2016/06/22'
GROUP BY rn.sampletime

You need the ROW_NUMBER window function:
Select * From
(
select row_number()over(partition by cast(sampletime as date) order by sampletime) RN,*
From ..
) A
Where RN <=2
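To see the numbering at work on the sample rows, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name samples is hypothetical (the question's real tables and join are not reproduced), and SQLite's date() stands in for the cast to date. Rows are numbered within each calendar day by sampletime, and only numbers 1 and 2 are kept.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sampletime TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [
        ("2016-03-02 08:31:14", 1),
        ("2016-03-02 09:31:14", 2),
        ("2016-03-02 12:31:14", 3),
        ("2016-03-04 08:31:14", 4),
        ("2016-03-04 09:31:14", 5),
        ("2016-03-05 08:31:14", 3),
    ],
)

# Number the rows of each day from earliest to latest, then keep the first two.
TWO_EARLIEST_PER_DAY = """
SELECT sampletime, value
FROM (
    SELECT sampletime, value,
           ROW_NUMBER() OVER (
               PARTITION BY date(sampletime) ORDER BY sampletime) AS rn
    FROM samples
) AS ranked
WHERE rn <= 2
ORDER BY sampletime
"""

earliest = conn.execute(TWO_EARLIEST_PER_DAY).fetchall()
for row in earliest:
    print(row)
```

Only the 12:31:14 row of 2016-03-02 is dropped, because it is the third sample of its day; a day with a single sample keeps that one row.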

Select randomly few Rows of the same ID in the same table (T-SQL)

I'm trying to randomly select a few rows for each Id stored in a table, where an Id can have multiple rows in that table. It's difficult to explain in words, so let me show you an example:
Example from the table :
Id Review
1 Text11
1 Text12
1 Text13
2 Text21
3 Text31
3 Text32
4 Text41
5 Text51
6 Text61
6 Text62
6 Text63
Result expected :
Id Review
1 Text11
1 Text13
2 Text21
3 Text32
4 Text41
5 Text51
6 Text62
In fact, the table contains thousands of rows. Some Ids have only one review, but others can have hundreds. I would like to select 10% of the reviews for each Id, and at least one row for every Id that has only 1-9 reviews. (I saw that SELECT TOP 10 percent FROM table ORDER BY NEWID() includes the row even if it's alone.)
I read some Stack topics and I think I have to use a subquery, but I can't find the correct solution.
Thanks in advance.
Regards.

Try this:
DECLARE @t table(Id int, Review char(6))
INSERT @t values
(1,'Text11'),
(1,'Text12'),
(1,'Text13'),
(2,'Text21'),
(3,'Text31'),
(3,'Text32'),
(4,'Text41'),
(5,'Text51'),
(6,'Text61'),
(6,'Text62'),
(6,'Text63')
;WITH CTE AS
(
SELECT
id, Review,
row_number() over (partition by id order by newid()) rn,
count(*) over (partition by id) cnt
FROM @t
)
)
SELECT id, Review
FROM CTE
WHERE rn <= (cnt / 10) + 1
Result(random):
id Review
1 Text12
2 Text21
3 Text31
4 Text41
5 Text51
6 Text63
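The filter rn <= (cnt / 10) + 1 uses integer division, so the number of rows kept per Id is one more than the number of full tens of reviews. The following sketch, in Python with the standard-library sqlite3 module, runs the same rule on the sample data. It is SQLite SQL, with RANDOM() standing in for NEWID(); the table name reviews and the helper rows_kept are hypothetical names used only here.

```python
import sqlite3
from collections import Counter

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (Id INTEGER, Review TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [
        (1, "Text11"), (1, "Text12"), (1, "Text13"),
        (2, "Text21"),
        (3, "Text31"), (3, "Text32"),
        (4, "Text41"),
        (5, "Text51"),
        (6, "Text61"), (6, "Text62"), (6, "Text63"),
    ],
)

# Shuffle the rows of each Id, then keep the first cnt / 10 + 1 of them.
SAMPLE = """
WITH numbered AS (
    SELECT Id, Review,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY RANDOM()) AS rn,
           COUNT(*) OVER (PARTITION BY Id) AS cnt
    FROM reviews
)
SELECT Id, Review
FROM numbered
WHERE rn <= cnt / 10 + 1
ORDER BY Id
"""

sample = conn.execute(SAMPLE).fetchall()
rows_per_id = Counter(row[0] for row in sample)
print(dict(rows_per_id))


def rows_kept(cnt: int) -> int:
    """Rows kept for an Id with cnt reviews under the rule above."""
    return cnt // 10 + 1


# How the rule scales for larger groups.
for cnt in (1, 9, 10, 25, 100):
    print(cnt, rows_kept(cnt))
```

Which review is picked changes from run to run, but every Id in the sample has fewer than 10 reviews, so each one contributes exactly one row. The rule slightly overshoots 10% at round counts (an Id with 10 reviews keeps 2), which may or may not matter for the intended use.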
