TSQL - Missing Month - sql-server

I need to find the months missing from a table
between the earliest and latest start dates for each ID_No. As an example:
create table #InputTable (ID_No int ,OccurMonth datetime)
insert into #InputTable (ID_No,OccurMonth)
select 10, '2007-11-01' Union all
select 10, '2007-12-01' Union all
select 10, '2008-01-01' Union all
select 20, '2009-01-01' Union all
select 20, '2009-02-01' Union all
select 20, '2009-04-01' Union all
select 30, '2010-05-01' Union all
select 30, '2010-08-01' Union all
select 30, '2010-09-01' Union all
select 40, '2008-03-01'
For the above table, the answer should be:
ID_No OccurMonth
----- ----------
20 2009-03-01
30 2010-06-01
30 2010-07-01
The other solutions posted on this site are similar, but:
1) don't include an ID column,
2) don't use the start date/end dates in the data or
3) use cursors, which are forbidden in my environment.

Try this:
;WITH
MonthRange AS
(
SELECT ID_No,
MinMonth = MIN(OccurMonth),
MaxMonth = MAX(OccurMonth)
FROM #InputTable
GROUP BY ID_No
),
AllMonths AS
(
SELECT ID_No,
OccurMonth = MinMonth
FROM MonthRange
UNION ALL
SELECT a.ID_No,
DATEADD(MONTH, 1, a.OccurMonth)
FROM AllMonths a
INNER JOIN MonthRange r ON a.ID_No = r.ID_No
WHERE a.OccurMonth < r.MaxMonth
)
SELECT a.*
FROM AllMonths a
LEFT JOIN #InputTable i ON a.ID_No = i.ID_No
AND a.OccurMonth = i.OccurMonth
WHERE i.ID_No IS NULL
OPTION (MAXRECURSION 0)
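AllMonths is a recursive CTE that lists every month between the minimum and maximum month for each ID_No. A LEFT JOIN back to #InputTable then returns the months that have no matching row, which are the missing ones. OPTION (MAXRECURSION 0) lifts the default limit of 100 recursion levels, which only matters if a range spans more than 100 months.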
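To see what the recursive part produces on its own, here is a minimal sketch that expands a single range. The two dates are hard-coded from ID_No 20 in the sample data, purely to keep the snippet self-contained; it is an illustration, not a replacement for the query above.

```sql
-- Sketch: expand one month range with a recursive CTE.
-- The range below is ID_No 20 from the question (2009-01 through 2009-04).
WITH MonthRange AS
(
    SELECT CAST('2009-01-01' AS datetime) AS MinMonth,
           CAST('2009-04-01' AS datetime) AS MaxMonth
),
AllMonths AS
(
    -- Anchor: start at the first month of the range
    SELECT MinMonth AS OccurMonth
    FROM MonthRange
    UNION ALL
    -- Recursive step: add one month until the last month is reached
    SELECT DATEADD(MONTH, 1, a.OccurMonth)
    FROM AllMonths a
    CROSS JOIN MonthRange r
    WHERE a.OccurMonth < r.MaxMonth
)
SELECT OccurMonth
FROM AllMonths
ORDER BY OccurMonth;
-- Returns four rows: 2009-01-01, 2009-02-01, 2009-03-01, 2009-04-01
```

Of these four months, only 2009-03-01 has no row in #InputTable for ID_No 20, so it is the one the LEFT JOIN reports.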

Related

SQL Server 2016 - Running Count and Sum for a 24 hours sliding window

I am trying to count orders over a 24-hour sliding window. I have a 'datetime' field and I'm calculating the 24-hour window by aggregating at the minute level. The count should restart every time the gap between two consecutive orders is over 1440 minutes, or when the running time of consecutive orders is over 1440 minutes.
The environment is SQL Server 2016. I can create temp tables but no physical tables and no memory-optimized objects (I guess anything that works on 2012+ should work).
I tried an inner join on the same table and tested with recursive CTEs, ROW_NUMBER etc., but the issue is that there is never a set number of rows for the 24-hour window, and the base time from which to calculate the start of the period changes. The only constant I have is the 24-hour time span.
Tried the following:
https://www.red-gate.com/simple-talk/sql/t-sql-programming/calculating-values-within-a-rolling-window-in-transact-sql/
Calculate running total / running balance
Cross apply seems to be working for the most part but in some instances - when calculating the running 24 hours window - it isn't. I tried changing the datetime conditions in the WHERE clause in many ways but I still can't figure out how to get it to work correctly.
I thought about creating a reset event at the 24-hour mark as shown here https://blog.jooq.org/2015/05/12/use-this-neat-window-function-trick-to-calculate-time-differences-in-a-time-series/ but at this point my brain is melting and I can't even get the logic straight.
DROP TABLE IF EXISTS #Data
CREATE TABLE #Data
(
START_TIME DATETIME
,ORDER_ID NUMERIC(18,0)
,PROD_ID NUMERIC(18,0)
,ACC_ID NUMERIC(18,0)
);
INSERT INTO #Data
SELECT '2018-06-22 11:00:00.000', 198151606, 58666, 1601554883
UNION ALL SELECT '2018-07-09 10:15:00.000',2008873061,58666,1601554883
UNION ALL SELECT '2018-07-09 12:33:00.000',2009269222,58666,1601554883
UNION ALL SELECT '2018-07-10 08:29:00.000',2010735393,58666,1601554883
UNION ALL SELECT '2018-07-10 10:57:00.000',2010735584,58666,1601554883
UNION ALL SELECT '2018-06-27 23:53:00.000',1991467555,58666,2300231016
UNION ALL SELECT '2018-06-28 00:44:00.000',1991583916,58666,2300231016
UNION ALL SELECT '2018-07-04 04:15:00.000',2001154497,58666,2300231016
UNION ALL SELECT '2018-07-04 15:44:00.000',2001154818,58666,2300231016
UNION ALL SELECT '2018-07-04 21:30:00.000',2002057919,58666,2300231016
UNION ALL SELECT '2018-07-05 02:09:00.000',1200205808,58666,2300231016
UNION ALL SELECT '2018-07-05 04:15:00.000',2200205814,58666,2300231016
UNION ALL SELECT '2018-07-05 17:23:00.000',3200370070,58666,2300231016
UNION ALL SELECT '2018-07-05 18:07:00.000',4200370093,58666,2300231016
UNION ALL SELECT '2018-07-06 20:15:00.000',5200571962,58666,2300231016
UNION ALL SELECT '2018-07-07 07:45:00.000',6200571987,58666,2300231016
UNION ALL SELECT '2018-07-07 12:13:00.000',7200571993,58666,2300231016
UNION ALL SELECT '2018-07-09 18:29:00.000',8200939551,58666,2300231016
UNION ALL SELECT '2018-07-09 21:05:00.000',9200939552,58666,2300231016
UNION ALL SELECT '2018-07-11 21:31:00.000',2011107311,58666,2300231016
UNION ALL SELECT '2018-06-27 18:23:00.000',1991016382,58669,2300231016
UNION ALL SELECT '2018-06-27 19:07:00.000',1991181363,58669,2300231016
UNION ALL SELECT '2018-06-27 19:28:00.000',1991181374,58669,2300231016
UNION ALL SELECT '2018-06-28 01:44:00.000',1991583925,58669,2300231016
UNION ALL SELECT '2018-06-28 02:19:00.000',1991583946,58669,2300231016
UNION ALL SELECT '2018-07-03 10:15:00.000',1999231747,58669,2300231016
UNION ALL SELECT '2018-07-03 10:45:00.000',2000293678,58669,2300231016
UNION ALL SELECT '2018-07-03 14:22:00.000',200029380,58669,2300231016
UNION ALL SELECT '2018-07-04 19:45:00.000',2002057789,58669,2300231016
UNION ALL SELECT '2018-07-04 21:00:00.000',1200205781,58669,2300231016
UNION ALL SELECT '2018-07-05 15:12:00.000',2200254833,58669,2300231016
UNION ALL SELECT '2018-07-05 17:52:00.000',3200370071,58669,2300231016
UNION ALL SELECT '2018-07-09 22:30:00.000',4200939553,58669,2300231016
UNION ALL SELECT '2018-07-09 23:23:00.000',5200939566,58669,2300231016
UNION ALL SELECT '2018-07-30 17:45:00.000',6204364207,58666,2300231016
UNION ALL SELECT '2018-07-30 23:30:00.000',7204364211,58666,2300231016
;WITH TimeBetween AS(
SELECT
ACC_ID
,PROD_ID
,ORDER_ID
,START_TIME
,TIME_BETWEEN_ORDERS = COALESCE(CASE WHEN DATEDIFF(MINUTE, LAG(START_TIME) OVER(PARTITION BY ACC_ID, PROD_ID
ORDER BY START_TIME), START_TIME) >= 1440
THEN 0
ELSE DATEDIFF(MINUTE, LAG(START_TIME) OVER(PARTITION BY ACC_ID, PROD_ID
ORDER BY START_TIME), START_TIME)
END, 0)
FROM #Data
)
SELECT
TimeBetween.ACC_ID
,TimeBetween.PROD_ID
,TimeBetween.ORDER_ID
,TimeBetween.START_TIME
,TIME_BETWEEN_ORDERS
--Not working correctly, repeats the previous time at the end of the window when it should be 0.
,RUNNING_TIME_BETWEEN_ORDERS = SUM(TIME_BETWEEN_ORDERS) OVER(PARTITION BY ACC_ID, PROD_ID ORDER BY START_TIME)
,Running24h.*
FROM TimeBetween
CROSS APPLY(SELECT TOP 1
RUNNING_COUNT_24h = COUNT(*) OVER() --Count admin units within the time window in the WHERE clause
--Check what APPLY is returning for running time
,RUNNING_TIME_BETWEEN_ORDERS_Apply = DATEDIFF(MINUTE, StageBaseApply.START_TIME, TimeBetween.START_TIME)
--Check what APPLY is using as base event anchor for the calculation
,START_TIME_Apply = StageBaseApply.START_TIME
FROM #Data AS StageBaseApply
WHERE
StageBaseApply.ACC_ID = TimeBetween.ACC_ID
AND StageBaseApply.PROD_ID = TimeBetween.PROD_ID
AND (StageBaseApply.START_TIME > DATEADD(MINUTE, -1440, TimeBetween.START_TIME)
AND StageBaseApply.START_TIME <= TimeBetween.START_TIME
)
ORDER BY StageBaseApply.START_TIME
) AS Running24h
ORDER BY ACC_ID,PROD_ID, START_TIME
When the running time between orders is over 24 hours the running count should re-start from 1.
Currently it repeats the last value and the time it's using for the calculation seems to be off.
Current result from CROSS APPLY with notes on where it's not working and what it should be for what I'm trying to achieve
First create a Numbers table with at least as many rows as the minutes in the maximum time range you will ever be dealing with
CREATE TABLE dbo.Numbers(Number INT PRIMARY KEY);
WITH E1(N) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
, Nums AS (SELECT TOP (10000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E8)
INSERT INTO dbo.Numbers
SELECT N
FROM Nums
Then you should be able to use something like the query below. I'm assuming that all start times fall on exact minutes and that there are no duplicates per ACC_ID, PROD_ID, START_TIME, as in your example data. If there are, you will need to pre-aggregate at the minute level before the rows take part in the LEFT JOIN.
WITH G
AS (SELECT ACC_ID,
PROD_ID,
MIN = MIN(START_TIME),
MAX = MAX(START_TIME),
Range = DATEDIFF(MINUTE, MIN(START_TIME), MAX(START_TIME))
FROM #Data
GROUP BY ACC_ID,
PROD_ID),
E
AS (SELECT *
FROM G
JOIN dbo.Numbers N
ON N.Number <= Range + 1),
R AS (SELECT E.ACC_ID,
E.PROD_ID,
D.START_TIME,
Cnt = COUNT(D.START_TIME) OVER (PARTITION BY E.ACC_ID, E.PROD_ID
ORDER BY DATEADD(MINUTE, NUMBER-1, MIN)
ROWS BETWEEN 1439 PRECEDING AND CURRENT ROW)
FROM E
LEFT JOIN #Data D
ON D.ACC_ID = E.ACC_ID
AND D.PROD_ID = E.PROD_ID
AND D.START_TIME = DATEADD(MINUTE, NUMBER-1, MIN) )
SELECT *
FROM R
WHERE START_TIME IS NOT NULL
ORDER BY ACC_ID,
PROD_ID,
START_TIME
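If start times are not on exact minutes, or several orders share a minute, the pre-aggregation mentioned above could look roughly like this. It is a sketch under assumptions: the three sample rows and the column names START_MINUTE and ORDERS_IN_MINUTE are made up for illustration.

```sql
-- Sketch: collapse orders to one row per account and minute.
WITH Orders AS
(
    SELECT *
    FROM (VALUES
        (1, CAST('2018-07-09 10:15:20' AS datetime)),
        (1, CAST('2018-07-09 10:15:45' AS datetime)),
        (1, CAST('2018-07-09 12:33:00' AS datetime))
    ) AS v(ACC_ID, START_TIME)
)
SELECT ACC_ID,
       -- Truncate to the minute: whole minutes since 1900-01-01, added back to that base date
       START_MINUTE = DATEADD(MINUTE, DATEDIFF(MINUTE, 0, START_TIME), 0),
       ORDERS_IN_MINUTE = COUNT(*)
FROM Orders
GROUP BY ACC_ID,
         DATEADD(MINUTE, DATEDIFF(MINUTE, 0, START_TIME), 0)
ORDER BY START_MINUTE;
-- Returns two rows: 10:15 with 2 orders and 12:33 with 1 order
```

With data shaped like this, the windowed COUNT(D.START_TIME) in the query above would presumably become a SUM over the per-minute count instead.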
After finding this post on how to reset a running sum, I think I may have finally been able to crack this nut. Not sure about how well it scales but it is working.
I also added a new column for order quantity since it may be useful sometimes to track the orders running total during the same time window.
The length of the sliding time window (1440 minutes here) is set in this CASE expression:
CASE WHEN RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG >= 1440 THEN 0 ELSE RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG
END
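The reset rule in that CASE expression can be tried in isolation. The sketch below applies it to five hypothetical lag values (in minutes) that are not taken from the sample data; the names Lags, Running, LagMinutes and RunningLag are also invented for the illustration.

```sql
-- Sketch: running sum of lags that resets to 0 once it reaches 1440 minutes.
WITH Lags AS
(
    SELECT *
    FROM (VALUES (1, 0), (2, 600), (3, 700), (4, 300), (5, 2000)) AS v(RowNum, LagMinutes)
),
Running AS
(
    -- Anchor: the first order starts a window
    SELECT RowNum, LagMinutes, RunningLag = 0
    FROM Lags
    WHERE RowNum = 1
    UNION ALL
    -- Recursive step: carry the running lag forward, or reset it
    SELECT l.RowNum,
           l.LagMinutes,
           CASE WHEN r.RunningLag + l.LagMinutes >= 1440 THEN 0
                ELSE r.RunningLag + l.LagMinutes
           END
    FROM Running r
    INNER JOIN Lags l ON l.RowNum = r.RowNum + 1
)
SELECT RowNum, LagMinutes, RunningLag
FROM Running
ORDER BY RowNum;
-- RunningLag per row: 0, 600, 1300, 0, 0
```

Row 4 resets because 1300 + 300 crosses the 1440-minute limit, and row 5 resets because its own lag already exceeds it. Each 0 marks the start of a new group, which is what the full script counts to build its partitions.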
DROP TABLE IF EXISTS #Data
CREATE TABLE #Data
(
ORDER_TIME DATETIME
,ORDER_ID NUMERIC(18,0)
,PROD_ID NUMERIC(18,0)
,ACCOUNT_ID NUMERIC(18,0)
,ORDER_QUANTITY INT
);
INSERT INTO #Data
SELECT '2018-06-22 11:00:00.000', 1981516061, 158666, 1601554883,5
UNION ALL SELECT '2018-07-09 10:15:00.000',2008873062,158666,1601554883,3
UNION ALL SELECT '2018-07-09 12:33:00.000',2009269223,158666,1601554883,2
UNION ALL SELECT '2018-07-10 08:29:00.000',2010735394,158666,1601554883,4
UNION ALL SELECT '2018-07-10 10:57:00.000',2010735584,158666,1601554883,7
UNION ALL SELECT '2018-06-27 23:53:00.000',1991467553,158666,2300231016,6
UNION ALL SELECT '2018-06-28 00:44:00.000',1991583913,158666,2300231016,6
UNION ALL SELECT '2018-07-04 04:15:00.000',2001154492,158666,2300231016,4
UNION ALL SELECT '2018-07-04 15:44:00.000',2001154814,158666,2300231016,5
UNION ALL SELECT '2018-07-04 21:30:00.000',2002057915,158666,2300231016,4
UNION ALL SELECT '2018-07-05 02:09:00.000',2002058086,158666,2300231016,4
UNION ALL SELECT '2018-07-05 04:15:00.000',2002058147,158666,2300231016,3
UNION ALL SELECT '2018-07-05 17:23:00.000',2003700706,158666,2300231016,2
UNION ALL SELECT '2018-07-05 18:07:00.000',2003700938,158666,2300231016,1
UNION ALL SELECT '2018-07-06 20:15:00.000',2005719626,158666,2300231016,7
UNION ALL SELECT '2018-07-07 07:45:00.000',2005719879,158666,2300231016,8
UNION ALL SELECT '2018-07-07 12:13:00.000',2005719931,158666,2300231016,9
UNION ALL SELECT '2018-07-09 18:29:00.000',2009395510,158666,2300231016,8
UNION ALL SELECT '2018-07-09 21:05:00.000',2009395523,158666,2300231016,6
UNION ALL SELECT '2018-07-11 21:31:00.000',2011107312,158666,2300231016,5
UNION ALL SELECT '2018-06-27 18:23:00.000',1991016381,258669,2300231016,4
UNION ALL SELECT '2018-06-27 19:07:00.000',1991181365,258669,2300231016,4
UNION ALL SELECT '2018-06-27 19:28:00.000',1991181376,258669,2300231016,3
UNION ALL SELECT '2018-06-28 01:44:00.000',1991583923,258669,2300231016,9
UNION ALL SELECT '2018-06-28 02:19:00.000',1991583943,258669,2300231016,2
UNION ALL SELECT '2018-07-03 10:15:00.000',1999231742,258669,2300231016,1
UNION ALL SELECT '2018-07-03 10:45:00.000',2000293679,258669,2300231016,1
UNION ALL SELECT '2018-07-03 14:22:00.000',2000293804,258669,2300231016,3
UNION ALL SELECT '2018-07-04 19:45:00.000',2002057785,258669,2300231016,2
UNION ALL SELECT '2018-07-04 21:00:00.000',2002057813,258669,2300231016,1
UNION ALL SELECT '2018-07-05 15:12:00.000',2002548332,258669,2300231016,7
UNION ALL SELECT '2018-07-05 17:52:00.000',2003700719,258669,2300231016,6
UNION ALL SELECT '2018-07-09 22:30:00.000',2009395530,258669,2300231016,5
UNION ALL SELECT '2018-07-09 23:23:00.000',2009395666,258669,2300231016,3
UNION ALL SELECT '2018-07-30 17:45:00.000',2043642075,158666,2300231016,2
UNION ALL SELECT '2018-07-30 23:30:00.000',2043642114,158666,2300231016,4
;WITH NextEventLag AS(
--Returns the next event information.
SELECT
ORDER_TIME
,ORDER_ID
,PROD_ID
,ACCOUNT_ID
,RowNum = ROW_NUMBER() OVER(PARTITION BY ACCOUNT_ID, PROD_ID ORDER BY ORDER_TIME)
--NEXT_ORDER_TIME_LAG: Returns the time difference between two consecutive order times.
,NEXT_ORDER_TIME_LAG = DATEDIFF(MINUTE, LAG(ORDER_TIME, 1, ORDER_TIME) OVER(PARTITION BY ACCOUNT_ID, PROD_ID ORDER BY ORDER_TIME), ORDER_TIME)
,ORDER_QUANTITY
FROM #Data
)
,RunningOrders AS(
SELECT
RowNum
,ORDER_TIME
,ACCOUNT_ID
,PROD_ID
,NEXT_ORDER_TIME_LAG
,LAG_LESS_THAN_24h = 0
,ORDER_QUANTITY
FROM NextEventLag
WHERE RowNum = 1
UNION ALL
SELECT
NextEventLag.RowNum
,NextEventLag.ORDER_TIME
,NextEventLag.ACCOUNT_ID
,NextEventLag.PROD_ID
,NextEventLag.NEXT_ORDER_TIME_LAG
--If the time lag between consecutive events and the time running sum is over 1440 minutes then set the value to 0.
--Change the NEXT_ORDER_TIME_LAG time interval to the desired interval value in minutes.
,LAG_LESS_THAN_24h = CASE WHEN RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG >= 1440 THEN 0
ELSE RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG
END
,NextEventLag.ORDER_QUANTITY
FROM RunningOrders
INNER JOIN NextEventLag ON RunningOrders.RowNum + 1 = NextEventLag.RowNum
AND RunningOrders.ACCOUNT_ID = NextEventLag.ACCOUNT_ID
AND RunningOrders.PROD_ID = NextEventLag.PROD_ID
)
,GroupedLags AS(
--This Groups together the LAG(s) less than 1440 minutes and is used by the outer query window functions
--to calculate the running aggregates.
SELECT RunningOrders.*
,Running24h.*
FROM RunningOrders
CROSS APPLY(SELECT
--Number of resets (rows with LAG_LESS_THAN_24h = 0) up to and including the current order; used as the group number.
Groups = COUNT(CASE WHEN GroupApply.LAG_LESS_THAN_24h = 0 THEN 1 END)
FROM RunningOrders AS GroupApply
WHERE
GroupApply.ACCOUNT_ID = RunningOrders.ACCOUNT_ID
AND GroupApply.PROD_ID = RunningOrders.PROD_ID
AND GroupApply.ORDER_TIME <= RunningOrders.ORDER_TIME
) AS Running24h
)
select
GroupedLags.ACCOUNT_ID
,GroupedLags.PROD_ID
,GroupedLags.ORDER_TIME
,GroupedLags.NEXT_ORDER_TIME_LAG
,GroupedLags.LAG_LESS_THAN_24h
,RUNNING_COUNT_24h = ROW_NUMBER() OVER(PARTITION BY GroupedLags.ACCOUNT_ID, GroupedLags.PROD_ID, GroupedLags.Groups ORDER BY GroupedLags.ORDER_TIME)
,RUNNING_SUM_24h = SUM(ORDER_QUANTITY) OVER(PARTITION BY GroupedLags.ACCOUNT_ID, GroupedLags.PROD_ID, GroupedLags.Groups ORDER BY GroupedLags.ORDER_TIME)
from GroupedLags
ORDER BY
GroupedLags.ACCOUNT_ID
,GroupedLags.PROD_ID
,GroupedLags.ORDER_TIME
Here is the db<>fiddle demo

Find minimum datetime while using FK in two different tables

I have 2 tables:
COURSE
------
Id
Name
TEST
------
Id
CourseId (FK to `COURSE.ID`)
DATETIME
NUMBERS
Suppose the COURSE table has IDs 1 and 2 (only 2 rows) and the TEST table has 8 rows of data with different DATETIME values, with CourseId 1 (3 rows) and CourseId 2 (6 rows).
I want to find the minimum DATETIME, CourseID and Name by joining these 2 tables. The query below returns 2 rows:
(SELECT min([DATETIME]) as DATETIME ,[TEST].CourseID,Name
FROM [dbo].[TEST]
left JOIN [dbo].[COURSE]
ON [dbo].[TEST].CourseID=[COURSE].ID GROUP BY CourseID,Name)
I want a single row of output (the minimum DATETIME along with the Name and ID). How can I achieve this?
With 2 courses you are always going to get 2 rows when joining like this. It will give you the minimum date value for each course. The first way you can get a single row is to use TOP 1 in your query, which will simply give you the course with the earliest test date. The other way is to use a WHERE clause to filter it by a single course.
Please run this sample code with some variations of what you can do, notes included in comments:
CREATE TABLE #course ( id INT, name NVARCHAR(20) );
CREATE TABLE #Test
(
id INT ,
courseId INT ,
testDate DATETIME -- you shouldn't use a keyword for a column name
);
INSERT INTO #course
( id, name )
VALUES ( 1, 'Maths' ),
( 2, 'Science' );
-- note: DATEADD(HOUR, -n, GETDATE()) is used simply to generate some distinct datetime values
INSERT INTO #Test
( id, courseId, testDate )
VALUES ( 1, 1, DATEADD(HOUR, -1, GETDATE()) ),
( 2, 1, DATEADD(HOUR, -2, GETDATE()) ),
( 3, 1, DATEADD(HOUR, -3, GETDATE()) ),
( 4, 2, DATEADD(HOUR, -4, GETDATE()) ),
( 5, 2, DATEADD(HOUR, -5, GETDATE()) ),
( 6, 2, DATEADD(HOUR, -6, GETDATE()) ),
( 7, 2, DATEADD(HOUR, -7, GETDATE()) ),
( 8, 2, DATEADD(HOUR, -8, GETDATE()) );
-- returns minimum date for each course - 2 rows
SELECT MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
GROUP BY courseId , name;
-- to get course with minimum date - 1 row
SELECT TOP 1
MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
GROUP BY t.courseId , c.name
ORDER BY MIN(t.testDate); -- requires order by
-- to get minimum date for a specified course - 1 row
SELECT MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
WHERE t.courseId = 1 -- requires you specify a course id
GROUP BY courseId , name;
DROP TABLE #course;
DROP TABLE #Test;
In my understanding, you want to return the minimum date from the entire table with the course details of that day.
Please try the below script
SELECT TOP 1 MIN(t.testDate) OVER (ORDER BY t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM Test t
INNER JOIN course c ON t.courseId = c.id
ORDER BY t.testDate

Extracting a (sampled) time series from an SQL DB

I have an MS SQL database which contains values stored with their time stamps. So my result table looks like this:
date value
03.01.2016 11
19.01.2016 22
29.01.2016 33
17.02.2016 44
01.03.2016 55
06.03.2016 66
The time stamps don't really follow much of a pattern. Now, I need to extract weekly data from this: (sampled on Friday, for example)
date value
01.01.2016 11 // friday
08.01.2016 11 // next friday
15.01.2016 11
22.01.2016 22
29.01.2016 33
05.02.2016 33
12.02.2016 33
19.02.2016 44
26.02.2016 44
04.03.2016 55
11.03.2016 66
Is there a reasonable way to do this directly in T-SQL?
I could reformat the result table using a C# or Matlab program, but it seems a bit weird, because I seem to again query the result table...
You could possibly use a CROSS JOIN or an INNER JOIN. I would personally go with the INNER JOIN, as it's much more efficient.
SAMPLE DATA:
CREATE TABLE #Temp(SomeDate DATE
, SomeValue VARCHAR(10));
INSERT INTO #Temp(SomeDate
, SomeValue)
VALUES
('20160103'
, 11),
('20160119'
, 22),
('20160129'
, 33),
('20160217'
, 44),
('20160301'
, 55),
('20160306'
, 66)
QUERY USING CROSS JOIN:
;WITH T
AS (SELECT *
FROM #Temp),
D
AS (
SELECT SomeDate
, SomeValue
FROM #Temp AS A
UNION
SELECT DATEADD(day, 7, SomeDate)
, SomeValue
FROM #Temp AS B
UNION
SELECT DATEADD(day, 14, SomeDate)
, SomeValue
FROM #Temp AS C)
SELECT D.*
FROM T
CROSS JOIN D
WHERE T.SomeValue = D.SomeValue
ORDER BY SomeValue
, SomeDate;
QUERY USING INNER JOIN:
;WITH T
AS (SELECT *
FROM #Temp),
D
AS (
SELECT SomeDate
, SomeValue
FROM #Temp AS A
UNION
SELECT DATEADD(day, 7, SomeDate)
, SomeValue
FROM #Temp AS B
UNION
SELECT DATEADD(day, 14, SomeDate)
, SomeValue
FROM #Temp AS C)
SELECT D.*
FROM T
INNER JOIN D
ON T.SomeValue = D.SomeValue
ORDER BY SomeValue
, SomeDate;
This solution supports a maximum time window of 252 weeks from the date of the first value.
The first row of your desired output is missing because that Friday is before the first value.
If needed, you can add it by means of a UNION with the minimum of the table.
DECLARE @tbl TABLE ( [date] date, [value] int )
INSERT INTO @tbl
VALUES
('2016-01-03','11'),
('2016-01-19','22'),
('2016-01-29','33'),
('2016-02-17','44'),
('2016-03-01','55'),
('2016-03-06','66')
;WITH DATA
AS (
SELECT (S+P+Q) WeekNum, DATEADD( week, S + P + Q, MinDate ) Fridays, SubFri, [value]
FROM ( SELECT 1 S UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 ) A
CROSS JOIN ( SELECT 0 P UNION SELECT 7 UNION SELECT 14 UNION SELECT 21 UNION SELECT 28 UNION SELECT 35 ) B
CROSS JOIN ( SELECT 0 Q UNION SELECT 42 UNION SELECT 84 UNION SELECT 126 UNION SELECT 168 UNION SELECT 210 ) C
CROSS JOIN (
SELECT
min ( DATEADD( day, -8 - DATEPART(weekday,[date]), [date] ) ) MinDate,
max ( DATEADD( day, 13 - DATEPART(weekday,[date]), [date] ) ) MaxDate
FROM @tbl
) MD
LEFT JOIN ( SELECT DATEADD( day, 6 - DATEPART(weekday,[date]), [date] ) SubFri, [value] FROM @tbl ) Val
ON SubFri<=DATEADD( week, S + P + Q, MinDate )
WHERE DATEADD( week, S + P + Q, MinDate )<=MaxDate
)
SELECT DATA.Fridays, DATA.value
FROM DATA
INNER JOIN
(
SELECT Fridays, max(SubFri) MaxSubFri
FROM DATA
GROUP BY Fridays
) idx
ON DATA.Fridays=idx.Fridays
AND SubFri=MaxSubFri
ORDER BY Fridays
You should be able to use DATENAME to get all the records of a certain day:
SELECT *
FROM table
WHERE DATENAME(WEEKDAY, date) = 'Friday'
This causes a scan in the query plan, though, so it would be advisable to add another column holding the day of the week; you could then just select WHERE dayOfWeekCol = 'Friday'.
I found my own solution, which I find more readable. I'm first using a WHILE loop to generate the dates I'm looking for. Then I 'join' these dates to the actual data table using an OUTER APPLY, which looks up 'last value before a specific date'. Here's the code:
-- prepare in-memory table
declare @tbl table ( [date] date, [value] int )
insert into @tbl
values
('2016-01-03','11'),
('2016-01-19','22'),
('2016-01-29','33'),
('2016-02-17','44'),
('2016-03-01','55'),
('2016-03-06','66')
-- query
declare @startDate date='2016-01-01';
declare @endDate date='2016-03-31';
with Fridays as (
select @startDate as fridayDate
union all
select dateadd(day,7,fridayDate) from Fridays where dateadd(day,7,fridayDate)<=@endDate
)
select *
from
Fridays f
outer apply (
-- latest row on or before the Friday: order by date, not by value
select top(1) * from @tbl t
where f.fridayDate >= t.[date]
order by t.[date] desc
) as result
option (maxrecursion 10000)
Gives me:
fridayDate  date        value
----------  ----------  -----
2016-01-01  NULL        NULL
2016-01-08  2016-01-03  11
2016-01-15  2016-01-03  11
2016-01-22  2016-01-19  22
2016-01-29  2016-01-29  33
2016-02-05  2016-01-29  33
2016-02-12  2016-01-29  33
2016-02-19  2016-02-17  44
2016-02-26  2016-02-17  44
2016-03-04  2016-03-01  55
2016-03-11  2016-03-06  66
2016-03-18  2016-03-06  66
2016-03-25  2016-03-06  66
Thanks for everybody's ideas and support though!
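The recursive list of Fridays assumes the start date is itself a Friday. If it might not be, one possible way to snap an arbitrary date forward to the next Friday is sketched below. The variable names @anyDate and @daysToFriday are made up for this illustration, and the approach relies on 1900-01-01 having been a Monday rather than on DATEPART(weekday, ...), whose result depends on the session's DATEFIRST setting.

```sql
-- Hypothetical input: any calendar date, not necessarily a Friday.
DECLARE @anyDate date = '2016-01-03';

-- Days since 1900-01-01 (a Monday) modulo 7 gives 0 = Monday ... 4 = Friday.
-- The "+ 7) % 7" keeps the offset in the range 0..6.
DECLARE @daysToFriday int = (4 - DATEDIFF(DAY, '19000101', @anyDate) % 7 + 7) % 7;

SELECT DATEADD(DAY, @daysToFriday, @anyDate) AS firstFriday;
-- Returns 2016-01-08
```

The result could then be used as the anchor value of the Fridays CTE in place of a hand-picked start date.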

Getting quantity between a range of months from 2 date parameters

I have a table that stores budget quantities for a company whose fiscal year begins 1st April and ends on 31st March the next year.
I have this query to extract figures for a particular month.
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
WHERE T1.U_Month = DATENAME(MONTH, '2015-04-01') AND T0.U_Source = 'NTEL'
There is an existing report that takes two parameters, a Start and End Date. (type datetime)
Table below: The month column is of type nvarchar.
How do I modify the query so that when a user enters a StartDate and EndDate, e.g.
1st May 2015 and 31st July 2015, I get a quantity result of 12640?
There are a couple of ways to do this.
One way would be to use PARSE, like this:
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
WHERE PARSE((T1.U_Month + CONVERT(VARCHAR(4),YEAR(CURRENT_TIMESTAMP))) as datetime) BETWEEN @StartDate AND @EndDate
AND T0.U_Source = 'NTEL'
Another way would be to use a numbers table to map your month name to a month number and use it in your query.
;WITH CTE AS (
SELECT 1 as rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
MonthMap AS
(
SELECT ROW_NUMBER()OVER(ORDER BY rn ASC) as monthnumber FROM CTE
)
SELECT monthnumber,DATENAME(MONTH,DATEFROMPARTS(2016,monthnumber,1)) FROM MonthMap;
and then join it with your month table like this.
;WITH CTE AS (
SELECT 1 as rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
MonthMap AS
(
SELECT ROW_NUMBER()OVER(ORDER BY rn ASC) as monthnumber FROM CTE
)
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
INNER JOIN MonthMap M ON T1.U_Month = DATENAME(MONTH,DATEFROMPARTS(2016,monthnumber,1))
WHERE M.monthnumber BETWEEN DATEPART(MONTH,@StartDate) AND DATEPART(MONTH,@EndDate)
AND T0.U_Source = 'NTEL';
You should compare both approaches for performance. PARSE is simpler to use but would be difficult to index properly.
On a separate note, you should avoid storing dates or date parts as month names, as these take up more storage (even more since you are using NVARCHAR) and are difficult to use efficiently.
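As a rough illustration of the month-name mapping, the sketch below uses a literal VALUES list instead of DATENAME, so the names do not depend on the session language. The parameter values are the ones from the question; the derived table M and its column names are assumptions made for the example.

```sql
-- Sketch: map month names to numbers and filter by the parameter months.
DECLARE @StartDate datetime = '2015-05-01',
        @EndDate   datetime = '2015-07-31';

SELECT M.MonthNumber, M.MonthName
FROM (VALUES
        (1,  N'January'), (2,  N'February'), (3,  N'March'),
        (4,  N'April'),   (5,  N'May'),      (6,  N'June'),
        (7,  N'July'),    (8,  N'August'),   (9,  N'September'),
        (10, N'October'), (11, N'November'), (12, N'December')
     ) AS M(MonthNumber, MonthName)
WHERE M.MonthNumber BETWEEN MONTH(@StartDate) AND MONTH(@EndDate)
ORDER BY M.MonthNumber;
-- Returns three rows: 5 May, 6 June, 7 July
```

Note that a plain BETWEEN on month numbers only works when both dates fall in the same calendar year. A range that runs past December, such as November to February within one April-to-March fiscal year, returns no rows and would need separate handling.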

Consecutive Day Query

I have a table of contactIDs and datetimes, the time being when a letter was generated for the contact. Each contact can only have one letter generated a day. I want to write a query to select any contact that has had letters generated on more than one consecutive day.
I guess I'd need to increment the datetime as records are found but how would I do this separately for each contact?
select a.contactid from ContactTable a inner join ContactTable b on a.contactid = b.contactid and datediff(day, a.date, b.date) = 1
I decided to utilise a calendar table. Use your favourite search engine to find a script to create a calendar table.
Alternatively, here's one I made earlier
Here is the full query I went with; I will explain it in detail afterwards.
DECLARE @your_table table (
contact_id int
, created_on datetime
);
INSERT INTO @your_table (contact_id, created_on)
SELECT 9, '2014-01-02 06:00'
UNION ALL SELECT 9, '2014-01-02 18:00'
UNION ALL SELECT 9, '2014-01-05 08:00'
UNION ALL SELECT 9, '2014-01-07 01:00'
UNION ALL SELECT 3, '2014-01-02 00:01'
UNION ALL SELECT 3, '2014-01-03 23:59' -- Over 24 hours but a "day" different
UNION ALL SELECT 7, '2014-01-04 01:00'
UNION ALL SELECT 7, '2014-01-06 01:00'
UNION ALL SELECT 7, '2014-01-08 01:00'
UNION ALL SELECT 7, '2014-01-09 01:00'
UNION ALL SELECT 7, '2014-01-10 01:00'
UNION ALL SELECT 7, '2014-01-11 01:00'
;
; WITH x AS (
SELECT your_table.contact_id
, your_table.created_on
, calendar.the_date
, Row_Number() OVER (PARTITION BY your_table.contact_id ORDER BY calendar.the_date) As sequence
FROM @your_table As your_table
INNER
JOIN dbo.calendar
ON your_table.created_on >= calendar.the_date
AND your_table.created_on < DateAdd(dd, 1, calendar.the_date)
)
, y AS (
SELECT curr.contact_id
, curr.created_on
, curr.the_date As the_date
, prev.the_date As previous_date
, DateDiff(dd, prev.the_date, curr.the_date) As difference_in_days
FROM x As curr
LEFT
JOIN x As prev
ON curr.contact_id = prev.contact_id
AND curr.sequence = prev.sequence + 1
)
SELECT contact_id
, created_on
, the_date
, previous_date
, difference_in_days
FROM y
WHERE difference_in_days = 1
Because you didn't provide any sample data, that's where I had to start, so the query is self-contained and uses a table variable (@your_table) as its source.
Once it is populated, we start out with a couple of common table expressions (CTEs for short). Read up here if you're not familiar with the concept: http://msdn.microsoft.com/en-us/library/ms175972.aspx . There's not a lot of difference between these and subqueries.
Our first CTE (x) joins @your_table to the calendar table. It returns the single calendar row on which the created_on value falls, by checking that created_on is greater than or equal to the calendar date and less than the next calendar date (DateAdd()).
We then use the window function Row_Number() to provide some sequencing.
We partition (i.e. reset the sequence) by contact_id and order the sequence by the calendar date.
Moving on to the second CTE (y): we perform a self-join on CTE x, joining each contact record with its "previous" record based on the sequencing.
This allows us to work out the difference in days (DateDiff()) between the current and the previous records.
Finally, we reduce our result set to only those records where the difference in days is 1, i.e. contacts with letters on consecutive days.
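The "compare each record with its previous one" step can also be written without a calendar table or self-join. The sketch below swaps in the LAG window function, which needs SQL Server 2012 or later (the question does not state a version, so that is an assumption). The table variable @letters and its four rows are a cut-down, hypothetical sample.

```sql
-- Sketch: find contacts with letters on consecutive calendar days using LAG.
DECLARE @letters table (contact_id int, created_on datetime);

INSERT INTO @letters (contact_id, created_on)
VALUES (3, '2014-01-02 00:01'),
       (3, '2014-01-03 23:59'),  -- next calendar day, although almost 48 hours later
       (7, '2014-01-04 01:00'),
       (7, '2014-01-06 01:00');  -- two days later, so not consecutive

WITH d AS
(
    SELECT contact_id,
           CAST(created_on AS date) AS the_date,
           -- Date of the same contact's previous letter (NULL for the first one)
           LAG(CAST(created_on AS date)) OVER (PARTITION BY contact_id
                                               ORDER BY created_on) AS previous_date
    FROM @letters
)
SELECT DISTINCT contact_id
FROM d
WHERE DATEDIFF(DAY, previous_date, the_date) = 1;
-- Returns contact_id 3 only
```

Casting to date plays the same role as the calendar join above: it compares calendar days rather than elapsed hours.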

Resources