convert time in string to number of hours - sql-server

I am trying to convert a time stored as a string to a number of hours.
For instance:
| Hours in text | Number of hours |
| ------------------- | --------------- |
| 1 minute | 0.02 |
| 30 minutes | 0.5 |
| 2 Hours 15 Minutes | 2.25 |
| 8 Hours | 8 |
| 4 Hours 30 Minutes | 4.5 |
| 1 Hour | 1 |
DECLARE @tabvar TABLE(TimeInText VARCHAR(100));
INSERT INTO @tabvar(TimeInText)
SELECT '1 minute' UNION ALL
SELECT '30 minutes' UNION ALL
SELECT '2 Hours 15 Minutes' UNION ALL
SELECT '8 Hours' UNION ALL
SELECT '4 Hours 30 Minutes' UNION ALL
SELECT '1 Hour'
SELECT CONVERT(CHAR(5), DATEADD(MINUTE, 60 * NULLIF(RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(TimeInText, ' Hours ', '.'), ' Hours', '.'), ' Hour', '.00'), ' Minutes', ''), ' ', ''))), ''), 0), 108)
FROM @tabvar
When converting '8 Hours' I am stuck with "8." instead of "8".

This is not pretty. String manipulation is not T-SQL's forte, and your choice of storing times as a descriptive string is a problematic design choice at best. I also feel, however, that having the values as a decimal is a bad idea; there's a time data type that you should be making use of.
That said, this works for the sample data provided:
SELECT tv.TimeInText,
ISNULL(CONVERT(decimal(5,3),LEFT(tv.TimeInText,NULLIF(h.CI,0)-1)),0) + ISNULL((CONVERT(decimal(5,3),LEFT(s.m,NULLIF(m.CI,0)-1)) / 60),0) AS NumberOfHours
FROM @tabvar tv
CROSS APPLY (VALUES(CHARINDEX('hour',tv.TimeInText)))h(CI)
CROSS APPLY (VALUES(CHARINDEX(' ',tv.TimeInText,NULLIF(h.ci,0))))ws(ci)
CROSS APPLY (VALUES(STUFF(tv.TimeInText,1,ISNULL(ws.ci,0),'')))s(m)
CROSS APPLY (VALUES(CHARINDEX('minute',s.m)))m(CI);
Or without using VALUES to make the expressions easy to read... (enjoy this mess 🙃):
SELECT tv.TimeInText,
ISNULL(CONVERT(decimal(5,3),LEFT(tv.TimeInText,NULLIF(CHARINDEX('hour',tv.TimeInText),0)-1)),0) + ISNULL((CONVERT(decimal(5,3),LEFT(STUFF(tv.TimeInText,1,ISNULL(CHARINDEX(' ',tv.TimeInText,NULLIF(CHARINDEX('hour',tv.TimeInText),0)),0),''),NULLIF(CHARINDEX('minute',STUFF(tv.TimeInText,1,ISNULL(CHARINDEX(' ',tv.TimeInText,NULLIF(CHARINDEX('hour',tv.TimeInText),0)),0),'')),0)-1)) / 60),0) AS NumberOfHours
FROM @tabvar tv;
db<>fiddle
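As a follow-up on the time data type point, here is a hedged sketch of my own (not part of the answer above) that reuses the same CROSS APPLY parsing but returns a time(0) value instead of a decimal. It assumes no duration ever reaches 24 hours, since time cannot represent a full day:
SELECT tv.TimeInText,
       CAST(DATEADD(MINUTE,
                    ISNULL(TRY_CONVERT(int, LEFT(tv.TimeInText, NULLIF(h.CI,0)-1)), 0) * 60
                  + ISNULL(TRY_CONVERT(int, LEFT(s.m, NULLIF(m.CI,0)-1)), 0),
                    CAST('00:00' AS time(0))) AS time(0)) AS Duration -- assumes durations under 24 hours
FROM @tabvar tv
CROSS APPLY (VALUES(CHARINDEX('hour',tv.TimeInText)))h(CI)
CROSS APPLY (VALUES(CHARINDEX(' ',tv.TimeInText,NULLIF(h.CI,0))))ws(CI)
CROSS APPLY (VALUES(STUFF(tv.TimeInText,1,ISNULL(ws.CI,0),'')))s(m)
CROSS APPLY (VALUES(CHARINDEX('minute',s.m)))m(CI);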

select *, format(datepart(hour, _time) + datepart(minute, _time)/60., '0.##') as NumberOfHours
from
(
select *,
timefromparts
(
left(TimeInText, charindex(' hour', TimeInText)),
substring(TimeInText, charindex('minute', TimeInText)-3, 3),
0, 0, 0
) as _time
from @tabvar
) as t;
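A small hedged variation (my addition, not the original answer): FORMAT returns nvarchar, so if a numeric NumberOfHours is preferred, the derived _time can be converted directly instead:
select *, cast(datepart(hour, _time) + datepart(minute, _time)/60. as decimal(10,2)) as NumberOfHours
from
(
select *,
timefromparts
(
left(TimeInText, charindex(' hour', TimeInText)),
substring(TimeInText, charindex('minute', TimeInText)-3, 3),
0, 0, 0
) as _time
from @tabvar
) as t;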

Another option
Example
Select A.*
,NewVal = convert(decimal(10,2),
case when Pos2 like 'Hour%' then try_convert(int,Pos1)+isnull(try_convert(decimal(10,2),Pos3),0)/60
else try_convert(decimal(10,2),Pos1)/60 end
)
From YourTable A
Cross Apply (
Select Pos1 = trim(JSON_VALUE(S,'$[0]'))
,Pos2 = trim(JSON_VALUE(S,'$[1]'))
,Pos3 = trim(JSON_VALUE(S,'$[2]'))
,Pos4 = trim(JSON_VALUE(S,'$[3]'))
From ( values ( '["'+replace(replace([Hours in text],'"','\"'),' ','","')+'"]' ) ) A(S)
) B
Returns
Hours in text NewVal
1 minute 0.02
30 minutes 0.50
2 Hours 15 Minutes 2.25
8 Hours 8.00
4 Hours 30 Minutes 4.50
1 Hour 1.00
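As a usage sketch (my adaptation, not the original answer), the same split applied to the @tabvar data from the question; note that JSON_VALUE needs SQL Server 2016+ and TRIM needs 2017+:
Select A.TimeInText
      ,NumberOfHours = convert(decimal(10,2),
           case when Pos2 like 'Hour%' then try_convert(int,Pos1)+isnull(try_convert(decimal(10,2),Pos3),0)/60
                else try_convert(decimal(10,2),Pos1)/60 end
       )
From @tabvar A
Cross Apply (
    Select Pos1 = trim(JSON_VALUE(S,'$[0]'))
          ,Pos2 = trim(JSON_VALUE(S,'$[1]'))
          ,Pos3 = trim(JSON_VALUE(S,'$[2]'))
    From ( values ( '["'+replace(replace(A.TimeInText,'"','\"'),' ','","')+'"]' ) ) X(S)
) B;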

Related

Summing over a numeric column with moving window of varying size in Snowflake

I have a sample dataset given as follows:
time | time_diff | amount
time1 | time1-time2 | 1000
time2 | time2-time3 | 2000
time3 | time3-time4 | 3000
time4 | time4-time5 | 4500
time5 | NULL | 1000
Quick explanation: the first column gives the time of the transaction, the second column gives the difference from the next row, i.e. the transaction interval (in hours), and the third column gives the money made in that particular transaction. The data is sorted in ascending order by the time column.
Some values are given as:
time | time_diff | amount
time1 | 2. | 1000
time2 | 3. | 2000
time3 | 1. | 3000
time4 | 19. | 4500
time5 | NULL | 1000
The goal is to find, for a given time, the total of the transactions that occurred within 24 hours of that transaction. For example, the output for time1 should be 1000+2000+3000=6000, because if we add the value at time4 the total interval becomes 25 hours, so we omit the 4500 from the sum.
Example output:
time | amount
time1 | 6000
time2 | 9500
time3 | 7500
time4 | 4500
time5 | 1000
The concept of a moving window sum should work, to my knowledge, but here the width of the window is variable. That's the challenge I am facing. Can I kindly get some help here?
You could ignore the time_diff column and use a theta self-join based on a timestamp range, like this:
WITH srctab AS ( SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:00:00') AS "time", 1000::INT AS "amount"
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:02:00'), 2000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:05:00'), 3000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:06:00'), 4500::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-16 00:01:00'), 1000::INT
)
SELECT t1."time", SUM(t2."amount") AS tot
FROM srctab t1
JOIN srctab t2 ON t2."time" BETWEEN t1."time" AND TIMESTAMPADD(HOUR, +24, t1."time")
GROUP BY t1."time"
ORDER BY t1."time";
Minor detail: if your second column gives the time difference with the next row then I'd say the first value should be 10500 (not 6000) because it's only your 5th transaction of 1000 which is more than 24 hours ahead... I'm guessing your actual timestamps are at 0, 2, 5, 6 and 25 hours?
Another option might be to use the sliding WINDOW function by tweaking your transactional data to include each hour.
It's perhaps overkill, but it might be a useful technique.
First, generate a placeholder for each hour using the timestamps. I used time_slice to map each timestamp into neat hour blocks, and generator with dateadd to backfill each hour, putting a zero in where no transactions took place.
So now I can use the sliding window function knowing that I can safely choose the 23 preceding hours.
Copy|Paste|Run
WITH SRCTAB AS (
SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:00:00') AS TRANS_TS, 1000::INT AS AMOUNT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 02:00:00'), 2000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 05:00:00'), 3000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 06:00:00'), 4500::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-16 01:00:00'), 1000::INT
)
SELECT
TRANS_TIME_HOUR
,SUM(AMOUNT) OVER ( ORDER BY TRANS_TIME_HOUR ROWS BETWEEN 23 PRECEDING AND 0 PRECEDING ) OVERKILL FROM (
SELECT
TRANS_TIME_HOUR,
SUM(AMOUNT) AMOUNT
FROM
(
SELECT
DATEADD(HOUR, NUMBER, TRANS_TS) TRANS_TIME_HOUR,
DECODE( DATEADD(HOUR, NUMBER, TRANS_TS), TIME_SLICE(TRANS_TS, 1, 'HOUR', 'START'), AMOUNT,0) AMOUNT
FROM
SRCTAB,
(SELECT SEQ4() NUMBER FROM TABLE(GENERATOR(ROWCOUNT => 24)) ) G
)
GROUP BY
TRANS_TIME_HOUR
)

Cannot increment values in a T-SQL CTE

I have a case where I need to write a CTE (at least this seems like the best approach). I have almost everything I need in place except one last issue. I am using a CTE to generate many millions of records, which I will then insert into a table. The data itself is almost irrelevant except for three columns: two datetime columns and one character column.
The idea behind the CTE is this: I want one datetime field called Start and one int field called DataValue. I will have one variable that is the count of records I want to aim for, and another variable that is the number of times I want to repeat the datetime value. I don't think I need to explain the software this data represents, but basically I need 16 rows where the Start value is the same, and after the 16th row I want to add 15 minutes and then repeat. Effectively there will be events at 15-minute intervals and I will need X rows per 15-minute interval to represent those events.
This is my code
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH mycte(DataValue,start) AS
(
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
if DataValue % @inter = 0
SELECT
DataValue + 1,
cast(DateAdd(minute,15,start) as datetime)
else
select
DataValue + 1,
start
FROM mycte
WHERE DataValue + 1 <= @tot)
select
m.start,
m.start,
m.DataValue % @inter
from mycte as m
option (maxrecursion 0);
I'll change the select statement into an insert statement once I get it working, and the m.DataValue % @inter will give me the repeating integer when inserting, so the only thing I need to figure out is how to make start stay the same 16 times in a row and then increment.
It seems that I cannot have an IF statement in the CTE, and I am not sure how else to accomplish this; what I was going to do was basically say that if DataValue % 16 was 0 then increase the value of start.
In the end I should hopefully have something like this, where in this case I only repeat it 4 times:
+-----------+-------------------+
| DateValue | start |
+-----------+-------------------+
| 1 | 01/01/01 00:00:00 |
| 2 | 01/01/01 00:00:00 |
| 3 | 01/01/01 00:00:00 |
| 4 | 01/01/01 00:00:00 |
| 5 | 01/01/01 00:15:00 |
| 6 | 01/01/01 00:15:00 |
| 7 | 01/01/01 00:15:00 |
| 8 | 01/01/01 00:15:00 |
Is there another way to accomplish this without conditional statements?
You can use case when as below:
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH mycte(DataValue,start) AS
(
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
SELECT DataValue+1 [Datavalue],
case when (DataValue % @inter) = 0 then cast(DateAdd(minute,15,start) as datetime) else [start] end [start]
FROM mycte
WHERE (DataValue + 1) <= @tot)
select
m.DataValue,
m.[start]
from mycte as m
option (maxrecursion 0);
This will give the below result
DataValue Start
========= =============
1 2011-01-01 00:00:00.000
2 2011-01-01 00:00:00.000
3 2011-01-01 00:00:00.000
4 2011-01-01 00:15:00.000
5 2011-01-01 00:15:00.000
6 2011-01-01 00:15:00.000
7 2011-01-01 00:30:00.000
8 2011-01-01 00:30:00.000
9 2011-01-01 00:30:00.000
10 2011-01-01 00:45:00.000
11 2011-01-01 00:45:00.000
12 2011-01-01 00:45:00.000
....
26 2011-01-01 02:00:00.000
And if you don't want to use CASE WHEN, you can use a double recursive CTE as below:
WITH mycte(DataValue,start) AS
( --this recursive cte will generate the same record @inter times
SELECT 1 DataValue, cast('01/01/2011 00:00:00' as datetime) as start
UNION all
SELECT DataValue+1 [DataValue],[start]
FROM mycte
WHERE (DataValue + 1) <= @inter)
,Increments as (
-- this recursive cte will do the 15 additions
select * from mycte
union all
select DataValue+@inter [DataValue]
,DateAdd(minute,15,[start]) [start]
from Increments
WHERE (DataValue + @inter) <= @tot
)
select
m.DataValue,
m.[start]
from Increments as m
order by DataValue
option (maxrecursion 0);
It will give the same results.
You can do this with a tally table and some basic math. I'm not sure whether your total row count should be @tot or @tot * @inter; if it's the latter, you just need to change the TOP clause. If you need more rows, you just need to alter the tally table generation.
Declare @tot as int;
Declare @inter as int;
Set @tot = 26
Set @inter = 3;
WITH
E(n) AS(
SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)
),
E2(n) AS(
SELECT a.n FROM E a, E b
),
E4(n) AS(
SELECT a.n FROM E2 a, E2 b
),
cteTally(n) AS(
SELECT TOP(@tot) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) n
FROM E4
)
SELECT n, DATEADD(MI, 15 * ((n-1)/@inter), '20110101')
FROM cteTally;
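Since the goal is ultimately an INSERT, here is a hedged sketch of the same tally approach wrapped as one, assuming a hypothetical target table dbo.Events (DataValue int, StartTime datetime):
Declare @tot as int = 26;
Declare @inter as int = 3;
WITH
E(n) AS(
SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)
),
E2(n) AS(
SELECT a.n FROM E a, E b
),
E4(n) AS(
SELECT a.n FROM E2 a, E2 b
),
cteTally(n) AS(
SELECT TOP(@tot) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) n
FROM E4
)
INSERT INTO dbo.Events (DataValue, StartTime) -- hypothetical target table
SELECT n, DATEADD(MI, 15 * ((n-1)/@inter), '20110101')
FROM cteTally;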

SQL Server Group By Help Required

Being a beginner, how can I get results as shown in the diagram? I am not getting the right results by grouping. Please advise.
If I'm reading your example data correctly, it seems that for Query3, you have the data flip-flopped for the dates. I think Query3 elapsed for 10/20/2017 should be 2:03 and for 10/21/2017, the elapsed time should be 1:48.
If that is indeed the case, here is a solution that uses a combination of Common Table Expressions, aggregation and PIVOT. It works for the example data you've provided in your question. You may need to adjust it for other data.
set nocount on
Declare @t Table (ID int, [Date] Date, ExecutedTime varchar(10), Label Varchar(10))
insert into @t values
(1,'2017-10-20','0:01:16','Query1'),
(2,'2017-10-20','0:00:20','Query1'),
(3,'2017-10-20','0:00:14','Query1'),
(4,'2017-10-20','0:01:43','Query2'),
(5,'2017-10-20','0:00:33','Query2'),
(6,'2017-10-20','0:00:34','Query2'),
(7,'2017-10-20','0:01:18','Query3'),
(8,'2017-10-20','0:00:30','Query3'),
(9,'2017-10-20','0:00:15','Query3'),
(10,'2017-10-21','0:01:16','Query1'),
(11,'2017-10-21','0:00:20','Query1'),
(12,'2017-10-21','0:00:14','Query1'),
(13,'2017-10-21','0:01:43','Query2'),
(14,'2017-10-21','0:00:33','Query2'),
(15,'2017-10-21','0:00:34','Query2'),
(16,'2017-10-21','0:01:18','Query3'),
(16,'2017-10-21','0:00:30','Query3')
;
With ExecutedTimeInSeconds as --Convert ExecutedTime to seconds in preparation for aggregation
(
select
[Date],
Label,
datepart(hour,(convert(time,executedTime))) * 3600 +
datepart(minute,(convert(time,executedTime))) * 60 +
datepart(second,(convert(time,executedTime))) as ElapsedSeconds
from @t
)
,AggregatedDataInElapsedSeconds as --Aggregate the seconds by Date and Label
(
SELECT [Date]
,Label
,sum(ElapsedSeconds) AS ElapsedSeconds
FROM ExecutedTimeInSeconds
GROUP BY [Date]
,Label
),
DataReadyForPivot as --Convert the aggregated seconds back to an elapsed time
(
SELECT [Date], Label,
RIGHT('0' + CAST(ElapsedSeconds / 3600 AS VARCHAR),2) + ':' +
RIGHT('0' + CAST((ElapsedSeconds / 60) % 60 AS VARCHAR),2) + ':' +
RIGHT('0' + CAST(ElapsedSeconds % 60 AS VARCHAR),2) as ExecutedSum
from AggregatedDataInElapsedSeconds
)
,
PivotedData as --Pivot the data
(
SELECT *
FROM DataReadyForPivot
PIVOT(MAX(ExecutedSum) FOR Label IN (
[Query1]
,[Query2]
,[Query3]
)) AS pvt
)
select --add a row number as Id
ROW_NUMBER() over (order by [Date] desc) as Id,
*
from PivotedData
| Id | Date | Query1 | Query2 | Query3 |
|----|------------|----------|----------|----------|
| 1 | 2017-10-21 | 00:01:50 | 00:02:50 | 00:01:48 |
| 2 | 2017-10-20 | 00:01:50 | 00:02:50 | 00:02:03 |
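For comparison, a hedged alternative sketch (my addition) that skips PIVOT and uses conditional aggregation over the same @t data; it returns the elapsed time per query in seconds rather than formatting it back to hh:mm:ss:
;With Seconds as --elapsed seconds per row
(
select
[Date],
Label,
datediff(second, cast('00:00:00' as time), convert(time, ExecutedTime)) as ElapsedSeconds
from @t
)
select
ROW_NUMBER() over (order by [Date] desc) as Id,
[Date],
sum(case when Label = 'Query1' then ElapsedSeconds else 0 end) as Query1Seconds,
sum(case when Label = 'Query2' then ElapsedSeconds else 0 end) as Query2Seconds,
sum(case when Label = 'Query3' then ElapsedSeconds else 0 end) as Query3Seconds
from Seconds
group by [Date];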

Get data in groups of "Week of..." when dates may be missing

I have data in a table with dates, and want to count the rows by "Week of" (e.g., "Week of 2017-05-01"), where the result has the week's date (starting on Mondays) and the count of matching rows, even if there are no rows for that week. (This will all be in a date range.)
I can partition things into weeks readily enough by grouping on DATEPART(wk, D) (where D is the date column), but I'm struggling with:
How to get the "Week of" date and fill, and
How to have a row for a week where there are no matching rows in the data
Here's grouping by week:
SET DATEFORMAT ymd;
SET DATEFIRST 1; -- Monday is first day of week
DECLARE @startDate DATETIME = '2017-05-01';
DECLARE @endDate DATETIME = '2017-07-01';
SELECT DATEPART(wk, D) AS [Week Number], COUNT(*) AS [Count]
FROM #temp
GROUP BY DATEPART(wk, D)
ORDER BY DATEPART(wk, D);
Which gives me:
+-------------+-------+
| Week Number | Count |
+-------------+-------+
| 19          | 5     |
| 20          | 19    |
| 22          | 8     |
| 23          | 10    |
| 24          | 5     |
| 26          | 4     |
+-------------+-------+
But ideally I want:
+------------+-------+
| Week       | Count |
+------------+-------+
| 2017-05-01 | 5     |
| 2017-05-08 | 19    |
| 2017-05-15 | 0     |
| 2017-05-22 | 8     |
| 2017-05-29 | 10    |
| 2017-06-05 | 5     |
| 2017-06-12 | 0     |
| 2017-06-19 | 4     |
| 2017-06-26 | 0     |
+------------+-------+
How can I do that?
Set up information for testing:
SET DATEFIRST 1;
SET DATEFORMAT ymd;
CREATE TABLE #temp (
D DATETIME
);
GO
INSERT INTO #temp (D)
VALUES -- Week of 2017-05-01 (#19)
('2017-05-01'),('2017-05-01'),('2017-05-01'),
('2017-05-06'),('2017-05-06'),
-- Week of 2017-05-08 (#20) - note no data actually on the 8th
('2017-05-10'),
('2017-05-11'),('2017-05-11'),('2017-05-11'),('2017-05-11'),('2017-05-11'),('2017-05-11'),
('2017-05-12'),('2017-05-12'),('2017-05-12'),('2017-05-12'),
('2017-05-13'),('2017-05-13'),('2017-05-13'),('2017-05-13'),('2017-05-13'),('2017-05-13'),('2017-05-13'),
('2017-05-14'),
-- Week of 2017-05-15 (#21)
-- (note we have no data for this week)
-- Week of 2017-05-22 (#22)
('2017-05-22'),('2017-05-22'),('2017-05-22'),
('2017-05-23'),('2017-05-23'),('2017-05-23'),('2017-05-23'),('2017-05-23'),
-- Week of 2017-05-29 (#23)
('2017-05-29'),('2017-05-29'),('2017-05-29'),
('2017-06-02'),('2017-06-02'),
('2017-06-03'),
('2017-06-04'),('2017-06-04'),('2017-06-04'),('2017-06-04'),
-- Week of 2017-06-05 (#24) - note no data actually on the 5th
('2017-06-08'),('2017-06-08'),('2017-06-08'),
('2017-06-11'),('2017-06-11'),
-- Week of 2017-06-12 (#25)
-- (note we have no data for this week)
-- Week of 2017-06-19 (#26)
('2017-06-19'),('2017-06-19'),('2017-06-19'),
('2017-06-20');
GO
To do this, you have to generate a table or CTE with the Monday dates and their week numbers (as shown in this answer, slightly modified for what we need to do below), then LEFT JOIN or OUTER APPLY that with your data grouped by week, using the week numbers:
SET DATEFORMAT ymd;
SET DATEFIRST 1;
DECLARE @startDate DATETIME = '2017-05-01';
DECLARE @endDate DATETIME = '2017-07-01';
;WITH Mondays AS (
SELECT @startDate AS D, DATEPART(WK, @startDate) AS W
UNION ALL
SELECT DATEADD(DAY, 7, D), DATEPART(WK, DATEADD(DAY, 7, D))
FROM Mondays m
WHERE DATEADD(DAY, 7, D) < @endDate
)
SELECT LEFT(CONVERT(NVARCHAR(MAX), Mondays.D, 120), 10) AS [Week Of], d.Count
FROM Mondays
OUTER APPLY (
SELECT COUNT(*) AS [Count]
FROM #temp
WHERE DATEPART(WK, D) = W
AND D >= @startDate
AND D < @endDate
) d
ORDER BY Mondays.D;
Two notes on that:
I'm assuming we can ensure that @startDate is a Monday, which is easily done outside the query or could be done with a simple loop in T-SQL if needed (backing up until DATEPART(WEEKDAY, @startDate) is 1). (Or, worst case, we could generate all the dates and then filter them with DATEPART(WEEKDAY, ...).)
I'm assuming the date range is always a year or less; otherwise, we'd have duplicated week numbers. If the date range could be longer than a year, combine the week number with the year everywhere we're just using a week number above (e.g., DATEPART(YEAR, D) * 100 + DATEPART(wk, D)).
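For the first note, here is a hedged sketch (my addition) of snapping @startDate back to a Monday without a loop; with DATEFIRST 1 in effect, DATEPART(WEEKDAY, ...) returns 1 for Monday, so we just back up the surplus days:
SET DATEFORMAT ymd;
SET DATEFIRST 1;
DECLARE @startDate DATETIME = '2017-05-03'; -- a Wednesday, for illustration
SET @startDate = DATEADD(DAY, 1 - DATEPART(WEEKDAY, @startDate), @startDate);
SELECT @startDate; -- 2017-05-01, the Monday of that week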
You can use this.
SET DATEFORMAT ymd;
SET DATEFIRST 1; -- Monday is first day of week
DECLARE @startDate DATETIME = '2017-05-01';
DECLARE @endDate DATETIME = '2017-07-01';
;WITH OrgResult AS ( -- Grouping result with missing week. Answer of the first question
SELECT
DATEADD(DAY, 1 - DATEPART (WEEKDAY, D), D) [Week] -- First day of the week
, COUNT(*) [Count]
FROM #temp
WHERE D BETWEEN @startDate AND @endDate
GROUP BY
DATEADD(DAY, 1 - DATEPART (WEEKDAY, D), D)
)
, Result AS -- Adds only missing weeks. Answer of the second question
(
SELECT * FROM OrgResult
UNION ALL
SELECT DATEADD( DAY, 7, R.[Week] ), 0 [Count]
FROM Result R
WHERE NOT EXISTS( SELECT * FROM OrgResult O WHERE [Week] = DATEADD( DAY, 7, R.[Week] ) )
AND DATEADD( DAY, 7, R.[Week] ) <= @endDate
)
SELECT * FROM Result
ORDER BY [Week]
Result:
Week Count
----------- -----------
2017-05-01 5
2017-05-08 19
2017-05-15 0
2017-05-22 8
2017-05-29 10
2017-06-05 5
2017-06-12 0
2017-06-19 4
2017-06-26 0
Here's another approach. I've included it because it will generate fewer reads than the recursive CTE solutions and will be a lot faster:
WITH E(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))x(x)),
iTally(N) AS
(
SELECT TOP (((DATEDIFF(day, @startDate, @endDate))/7)+1)
(ROW_NUMBER() OVER (ORDER BY (SELECT 1))-1)
FROM E a, E b, E c
)
SELECT WeekOf = DATEADD(WEEK,N,@startDate), [count] = COUNT(t.D)
FROM iTally i
LEFT JOIN #temp t ON t.D >= DATEADD(WEEK,N,@startDate) AND t.D < DATEADD(WEEK,N+1,@startDate)
GROUP BY DATEADD(WEEK,N,@startDate)
ORDER BY DATEADD(WEEK,N,@startDate); -- not required
Results:
WeekOf count
---------- -----------
2017-05-01 5
2017-05-08 19
2017-05-15 0
2017-05-22 8
2017-05-29 10
2017-06-05 5
2017-06-12 0
2017-06-19 4
2017-06-26 0
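A cosmetic tweak (my addition, reusing the iTally CTE above) in case the literal "Week of ..." label from the question is wanted:
SELECT WeekOf = CONCAT('Week of ', CONVERT(char(10), DATEADD(WEEK,N,@startDate), 120)),
       [count] = COUNT(t.D)
FROM iTally i
LEFT JOIN #temp t ON t.D >= DATEADD(WEEK,N,@startDate) AND t.D < DATEADD(WEEK,N+1,@startDate)
GROUP BY DATEADD(WEEK,N,@startDate)
ORDER BY DATEADD(WEEK,N,@startDate);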

How can I group / window date ordered events delineated by an arbitrary expression?

I would like to group some data together based on dates and some (potentially arbitrary) indicator:
Date | Ind
================
2016-01-02 | 1
2016-01-03 | 5
2016-03-02 | 10
2016-03-05 | 15
2016-05-10 | 6
2016-05-11 | 2
I would like to group together consecutive (date-ordered) rows, breaking the group after Indicator >= 10:
Date | Ind | Group
========================
2016-01-02 | 1 | 1
2016-01-03 | 5 | 1
2016-03-02 | 10 | 1
2016-03-05 | 15 | 2
2016-05-10 | 6 | 3
2016-05-11 | 2 | 3
I did find a promising technique at the end of a blog post: "Use this Neat Window Function Trick to Calculate Time Differences in a Time Series" (the final subsection, "Extra Bonus"), but the important part of the query uses a keyword (FILTER) that doesn't seem to be supported in SQL Server (and a quick Google later and I'm not sure where it is supported!).
I'm still hopeful a technique using a window function might be the answer. I just need a counter that I can add to every row (like RANK or ROW_NUMBER does), but that only increments when some arbitrary condition evaluates as true. Is there a way to do this in SQL Server?
Here is the solution:
DECLARE @t TABLE ([Date] DATETIME, Ind INT)
INSERT INTO @t
VALUES
('2016-01-02', 1),
('2016-01-03', 5),
('2016-03-02', 10),
('2016-03-05', 15),
('2016-05-10', 6),
('2016-05-11', 2)
SELECT [Date],
Ind,
1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
(
SELECT *,
CASE WHEN LAG(ind) OVER(ORDER BY [Date]) >= 10
THEN 1
ELSE 0
END AS [Group]
FROM @t
) t
Just mark a row as 1 when the previous row's Ind is greater than or equal to 10, else 0. Then a running sum will give you the desired result.
Giving full credit to Giorgi for the idea, but I've modified his answer (both for my benefit and for future readers).
Just change the CASE statement to see whether 30 or more days have elapsed since the last record:
DECLARE @t TABLE ([Date] DATETIME)
INSERT INTO @t
VALUES
('2016-01-02'),
('2016-01-03'),
('2016-03-02'),
('2016-03-05'),
('2016-05-10'),
('2016-05-11')
SELECT [Date],
1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
(
SELECT [Date],
CASE WHEN DATEADD(d, -30, [Date]) >= LAG([Date]) OVER(ORDER BY [Date])
THEN 1
ELSE 0
END AS [Group]
FROM @t
) t
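As a possible next step (my sketch, not part of the answer), the computed [Group] can feed an outer aggregation, for example the first and last date and row count of each island:
SELECT [Group],
       MIN([Date]) AS GroupStart,
       MAX([Date]) AS GroupEnd,
       COUNT(*) AS RowsInGroup
FROM
(
    SELECT [Date],
           1 + SUM([Flag]) OVER(ORDER BY [Date]) AS [Group]
    FROM
    (
        SELECT [Date],
               CASE WHEN DATEADD(d, -30, [Date]) >= LAG([Date]) OVER(ORDER BY [Date])
                    THEN 1
                    ELSE 0
               END AS [Flag]
        FROM @t
    ) f
) g
GROUP BY [Group]
ORDER BY [Group];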
