Convert Historical Local Time to UTC Time in SQL Server - sql-server

I am working with a SQL Server database that contains many DateTime values stored in local time. (Yes, this is unfortunate!) We have perhaps 5-10 years of data, so the UTC offset for a given location changes with the time of year wherever Daylight Saving Time is observed. The schedule for that change can itself change, as it did in 2007 in the United States (where most of these data originate).
My objective is to convert these DateTimes to UTC at the SQL level. Short of loading the entire Olson (TZ) database and querying it, does anyone have a technique for converting a historical local timestamp to UTC? [If it helps, we conveniently have the latitude and longitude for each row as well, which could be used to identify the time zone.]
Note: for a row written in real time, the trick of DATEDIFF(Hour, GETUTCDATE(), GETDATE()) AS UtcOffset works fine, of course. The problem is applying this retroactively to dates on either side of the Daylight Saving Time "barrier".

You can use AT TIME ZONE (available in SQL Server 2016 and later) to convert to UTC. SQL Server knows about the switches to daylight saving time, so it will account for them. You just have to figure out the time zone (using the latitude and longitude, as you said).
You can list all supported time zones with:
SELECT * FROM sys.time_zone_info
So the solution will be something like this:
First, add a column to your table holding the time zone name (which you determine from the latitude and longitude).
Then populate your (newly added) UTC date column using AT TIME ZONE, for example:
-- some sample data to play with
CREATE TABLE #YourTable
(
LocalDateTime DATETIME,
[UtcDateTime] DATETIMEOFFSET,
TimeZoneName VARCHAR(100)
);
INSERT INTO #YourTable
(
LocalDateTime,
TimeZoneName
)
VALUES
('20150101', 'Alaskan Standard Time'),
('20150101', 'US Mountain Standard Time'),
('20190701', 'Alaskan Standard Time'),
('20190701', 'US Mountain Standard Time');
-- convert to UTC
UPDATE #YourTable
SET [UtcDateTime] = LocalDateTime AT TIME ZONE TimeZoneName AT TIME ZONE 'UTC';
-- check results
SELECT * FROM #YourTable;
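If the target column should hold a plain UTC DATETIME2 rather than a DATETIMEOFFSET, the result can be cast once the offset has been switched to UTC. The following is a minimal sketch against the #YourTable sample above; the UtcPlain alias is only an illustrative name, and the LIKE filter is just one way to search sys.time_zone_info for a zone name.

```sql
-- Look up candidate Windows time zone names.
-- name, current_utc_offset and is_currently_dst are columns of sys.time_zone_info.
SELECT name, current_utc_offset, is_currently_dst
FROM sys.time_zone_info
WHERE name LIKE '%Mountain%';

-- Convert local time to UTC, then drop the (+00:00) offset.
-- Casting a DATETIMEOFFSET to DATETIME2 keeps its date and time parts,
-- which are UTC here because of the final AT TIME ZONE 'UTC'.
SELECT LocalDateTime,
       TimeZoneName,
       CAST(LocalDateTime AT TIME ZONE TimeZoneName AT TIME ZONE 'UTC' AS DATETIME2(3)) AS UtcPlain  -- illustrative alias
FROM #YourTable;
```

Local times that fall inside the hour repeated when clocks go back are inherently ambiguous, so it is worth spot-checking a few rows around each transition before updating a whole table.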

This is based on a previous answer by Chris Barlow, at
SQL Server - Convert date field to UTC
This is a solution component in the form of a SQL Server 2008 view that uses a table of daylight saving time (DST) rules to convert historical data.
(No lat/long data needed.)
You can use this view to build a custom solution, for example an UPDATE that references the local-time columns of your own tables that need converting, such as dbo.mytable.created_date.
Some notes on using the view follow; of particular interest is the section "EXAMPLE USAGE - FOR HISTORICAL DATA CONVERSION":
--
-- DATETIME VS. DATETIMEOFFSET
--
-- WHERE, t = '2016-12-13 04:32:00'
--
declare
@Sydney DATETIME
set
@Sydney = '2016-12-13 04:32:00'
select
Sydney = @Sydney
declare
@Sydney_UTC DATETIMEOFFSET
set
@Sydney_UTC = '2016-12-13 04:32:00.6427663 +10:00'
select
Sydney_UTC = @Sydney_UTC
declare
@NewYork DATETIME
set
@NewYork = '2016-12-13 04:32:00:34'
select
NewYork = @NewYork
declare
@NewYork_UTC DATETIMEOFFSET
set
@NewYork_UTC = '2016-12-13 04:32:00.6427663 -04:00'
select
NewYork_UTC = @NewYork_UTC
select
DATEDIFF(hh, @Sydney, @NewYork) as DIFF_DATETIME
select
DATEDIFF(hh, @Sydney_UTC, @NewYork_UTC) as DIFF_DATETIMEOFFSET
--
-- LOCAL UTC OFFSET FOR REAL-TIME DATA TODAY
--
select
DATEDIFF( Hour, GETUTCDATE(), GETDATE() ) AS UtcOffset
--
-- LOCAL UTC DATE FOR REAL-TIME DATA TODAY - EASTERN STANDARD EXAMPLE
--
select
convert( datetimeoffset( 5 ), GETDATE(), 120 )
--
-- EXAMPLE USAGE -
--
select
*
from
vw_datetime__dst__timezone
--
-- EXAMPLE USAGE - FOR HISTORICAL DATA CONVERSION - EASTERN STANDARD
--
select
created_date,
isnull( dst.zone, 'NO TZ' ) as zone,
isnull(
case
when created_date >= dstlow and
created_date < dsthigh
then dst.daylight
else dst.standard
end,
'NO OFFSET'
) as zone_offsettime,
TODATETIMEOFFSET(
created_date,
case
when created_date >= dstlow and
created_date < dsthigh
then dst.daylight
else dst.standard
end
) as zone_time,
SWITCHOFFSET(
TODATETIMEOFFSET(
created_date,
case
when created_date >= dstlow and
created_date < dsthigh
then dst.daylight
else dst.standard
end
),
'+00:00' -- parameterize?
) as utc_time
from
(
select GETDATE() as created_date
union
select SYSDATETIMEOFFSET() as created_date
union
select '2017-01-01 15:20:24.653' as created_date
) DYNAMIC_temp_table
left outer join vw_datetime__dst__timezone dst on
created_date between yrstart and yrend and
dst.zone = 'ET'
order by
created_date
-- Here is the view SQL:
drop view
vw_datetime__dst__timezone
go
create view
vw_datetime__dst__timezone
as
select
yr,
zone,
standard,
daylight,
rulename,
strule,
edrule,
yrstart,
yrend,
dateadd(day, (stdowref + stweekadd), stmonthref) dstlow,
dateadd(day, (eddowref + edweekadd), edmonthref) dsthigh
from (
select
yrs.yr,
timezone.zone,
timezone.standard,
timezone.daylight,
timezone.rulename,
dst_rule.strule,
dst_rule.edrule,
yrs.yr + '-01-01 00:00:00' yrstart,
yrs.yr + '-12-31 23:59:59' yrend,
yrs.yr + dst_rule.stdtpart + ' ' + dst_rule.cngtime stmonthref,
yrs.yr + dst_rule.eddtpart + ' ' + dst_rule.cngtime edmonthref,
case
when dst_rule.strule in ('1', '2', '3')
then
case
when datepart(dw, yrs.yr + dst_rule.stdtpart) = '1'
then 0
else 8 - datepart(dw, yrs.yr + dst_rule.stdtpart)
end
else (datepart(dw, yrs.yr + dst_rule.stdtpart) - 1) * -1
end as stdowref,
case
when dst_rule.edrule in ('1', '2', '3')
then
case
when datepart(dw, yrs.yr + dst_rule.eddtpart) = '1'
then 0
else 8 - datepart(dw, yrs.yr + dst_rule.eddtpart)
end
else (datepart(dw, yrs.yr + dst_rule.eddtpart) - 1) * -1
end as eddowref,
datename(dw, yrs.yr + dst_rule.stdtpart) as stdow,
datename(dw, yrs.yr + dst_rule.eddtpart) as eddow,
case
when dst_rule.strule in ('1', '2', '3')
then (7 * CAST(dst_rule.strule AS Integer)) - 7
else 0
end as stweekadd,
case
when dst_rule.edrule in ('1', '2', '3')
then (7 * CAST(dst_rule.edrule AS Integer)) - 7
else 0
end as edweekadd
from (
select '1900' yr
union select '1901' yr
union select '1902' yr
union select '1903' yr
union select '1904' yr
union select '1905' yr
union select '1906' yr
union select '1907' yr
union select '1908' yr
union select '1909' yr
union select '1910' yr
union select '1911' yr
union select '1912' yr
union select '1913' yr
union select '1914' yr
union select '1915' yr
union select '1916' yr
union select '1917' yr
union select '1918' yr
union select '1919' yr
union select '1920' yr
union select '1921' yr
union select '1922' yr
union select '1923' yr
union select '1924' yr
union select '1925' yr
union select '1926' yr
union select '1927' yr
union select '1928' yr
union select '1929' yr
union select '1930' yr
union select '1931' yr
union select '1932' yr
union select '1933' yr
union select '1934' yr
union select '1935' yr
union select '1936' yr
union select '1937' yr
union select '1938' yr
union select '1939' yr
union select '1940' yr
union select '1941' yr
union select '1942' yr
union select '1943' yr
union select '1944' yr
union select '1945' yr
union select '1946' yr
union select '1947' yr
union select '1948' yr
union select '1949' yr
union select '1950' yr
union select '1951' yr
union select '1952' yr
union select '1953' yr
union select '1954' yr
union select '1955' yr
union select '1956' yr
union select '1957' yr
union select '1958' yr
union select '1959' yr
union select '1960' yr
union select '1961' yr
union select '1962' yr
union select '1963' yr
union select '1964' yr
union select '1965' yr
union select '1966' yr
union select '1967' yr
union select '1968' yr
union select '1969' yr
union select '1970' yr
union select '1971' yr
union select '1972' yr
union select '1973' yr
union select '1974' yr
union select '1975' yr
union select '1976' yr
union select '1977' yr
union select '1978' yr
union select '1979' yr
union select '1980' yr
union select '1981' yr
union select '1982' yr
union select '1983' yr
union select '1984' yr
union select '1985' yr
union select '1986' yr
union select '1987' yr
union select '1988' yr
union select '1989' yr
union select '1990' yr
union select '1991' yr
union select '1992' yr
union select '1993' yr
union select '1994' yr
union select '1995' yr
union select '1996' yr
union select '1997' yr
union select '1998' yr
union select '1999' yr
union select '2000' yr
union select '2001' yr
union select '2002' yr
union select '2003' yr
union select '2004' yr
union select '2005' yr
union select '2006' yr -- OLD US RULES
union select '2007' yr
union select '2008' yr
union select '2009' yr
union select '2010' yr
union select '2011' yr
union select '2012' yr
union select '2013' yr
union select '2014' yr
union select '2015' yr
union select '2016' yr
union select '2017' yr
union select '2018' yr
union select '2019' yr
union select '2020' yr
union select '2021' yr
union select '2022' yr
union select '2023' yr
union select '2024' yr
union select '2025' yr
union select '2026' yr
union select '2027' yr
union select '2028' yr
union select '2029' yr
union select '2030' yr
union select '2031' yr
union select '2032' yr
union select '2033' yr
union select '2034' yr
union select '2035' yr
union select '2036' yr
union select '2037' yr
union select '2038' yr
union select '2039' yr
union select '2040' yr
union select '2041' yr
union select '2042' yr
union select '2043' yr
union select '2044' yr
union select '2045' yr
union select '2046' yr
union select '2047' yr
union select '2048' yr
union select '2049' yr
union select '2050' yr
union select '2051' yr
union select '2052' yr
union select '2053' yr
union select '2054' yr
union select '2055' yr
union select '2056' yr
union select '2057' yr
union select '2058' yr
union select '2059' yr
union select '2060' yr
union select '2061' yr
union select '2062' yr
union select '2063' yr
union select '2064' yr
union select '2065' yr
union select '2066' yr
union select '2067' yr
union select '2068' yr
union select '2069' yr
union select '2070' yr
union select '2071' yr
union select '2072' yr
union select '2073' yr
union select '2074' yr
union select '2075' yr
union select '2076' yr
union select '2077' yr
union select '2078' yr
union select '2079' yr
union select '2080' yr
union select '2081' yr
union select '2082' yr
union select '2083' yr
union select '2084' yr
union select '2085' yr
union select '2086' yr
union select '2087' yr
union select '2088' yr
union select '2089' yr
union select '2090' yr
union select '2091' yr
union select '2092' yr
union select '2093' yr
union select '2094' yr
union select '2095' yr
union select '2096' yr
union select '2097' yr
union select '2098' yr
union select '2099' yr
) yrs
cross join (
-- Dynamic, hardcoded table of timezone-based, daylight savings time (DST) rules
-- -- TIMEZONE
select 'UTC' zone, '+00:00' standard, '+01:00' daylight, 'UTC' rulename -- UTC - STAGING ONLY - this line is not accurate
union select 'CET' zone, '+01:00' standard, '+02:00' daylight, 'EU' rulename -- Central Europe
union select 'ET' zone, '-05:00' standard, '-04:00' daylight, 'US' rulename -- Eastern Time
union select 'CT' zone, '-06:00' standard, '-05:00' daylight, 'US' rulename -- Central Time
union select 'MT' zone, '-07:00' standard, '-06:00' daylight, 'US' rulename -- Mountain Time
union select 'PT' zone, '-08:00' standard, '-07:00' daylight, 'US' rulename -- Pacific Time
) timezone
join (
-- Dynamic, hardcoded table of country-based, daylight savings time (DST) rules
select 'UTC' rulename, 'L' strule, '-03-31' stdtpart, 'L' edrule, '-10-31' eddtpart, 1900 firstyr, 2099 lastyr, '01:00:00' cngtime
-- Country - Europe
union select 'EU' rulename, 'L' strule, '-03-31' stdtpart, 'L' edrule, '-10-31' eddtpart, 1900 firstyr, 2099 lastyr, '01:00:00' cngtime
-- Country - US
union select 'US' rulename, '1' strule, '-04-01' stdtpart, 'L' edrule, '-10-31' eddtpart, 1900 firstyr, 2006 lastyr, '02:00:00' cngtime
union select 'US' rulename, '2' strule, '-03-01' stdtpart, '1' edrule, '-11-01' eddtpart, 2007 firstyr, 2099 lastyr, '02:00:00' cngtime
) dst_rule on
dst_rule.rulename = timezone.rulename and
datepart( year, yrs.yr ) between firstyr and lastyr
) dst_dates
go
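To apply the view to stored data rather than to the inline sample rows, the same CASE expression can drive an UPDATE. This is a sketch under stated assumptions: dbo.mytable.created_date is taken from the description above and is assumed to hold Eastern local time, while created_date_utc is a hypothetical, initially empty DATETIME column added for the result.

```sql
-- Assumptions: dbo.mytable.created_date holds Eastern local time;
-- created_date_utc is a hypothetical DATETIME column that receives the UTC value.
UPDATE t
SET created_date_utc =
    CAST(
        SWITCHOFFSET(
            TODATETIMEOFFSET(
                t.created_date,
                CASE
                    WHEN t.created_date >= dst.dstlow
                     AND t.created_date <  dst.dsthigh
                    THEN dst.daylight      -- inside the DST period for that year
                    ELSE dst.standard      -- outside it
                END
            ),
            '+00:00'                       -- shift to UTC
        ) AS DATETIME                      -- keep only the UTC date and time
    )
FROM dbo.mytable AS t
JOIN vw_datetime__dst__timezone AS dst
    ON t.created_date BETWEEN dst.yrstart AND dst.yrend
   AND dst.zone = 'ET';
```

Because this uses an inner join, any row whose created_date matches no year in the view is left untouched, so it may be worth checking for NULLs in created_date_utc afterwards.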

I used the following to convert from local Eastern time to UTC (hence the fixed values of 4 and 5 in the function). If you have pre-2007 values, you would need to modify udf_IsInDST below to accommodate the older DST rules as well.
CREATE FUNCTION [dbo].[udf_ConvertTimeLocalToUTC](@dt DATETIME)
RETURNS DATETIME
AS
BEGIN
-- Eastern Time: add 4 hours during DST (EDT), 5 hours otherwise (EST)
SET @dt = DATEADD(HOUR, CASE WHEN [dbo].udf_IsInDST(@dt) = 1 THEN 4 ELSE 5 END, @dt)
RETURN @dt
END
GO
CREATE FUNCTION [dbo].[udf_IsInDST](@dt DATETIME)
RETURNS BIT
AS
BEGIN
-- US rules from 2007 on: DST runs from 02:00 on the second Sunday of March
-- to 02:00 on the first Sunday of November. Assumes DATEFIRST 7 (Sunday = 1).
DECLARE @returnValue BIT = 0
DECLARE @mm INT = DATEPART(MONTH, @dt)
DECLARE @dd INT = DATEPART(DAY, @dt)
DECLARE @dow INT = DATEPART(dw, @dt) -- 1 = sun
DECLARE @hr INT = DATEPART(HOUR, @dt)
SET @returnValue =
CASE WHEN @mm > 3 AND @mm < 11 THEN 1
WHEN @mm = 3 THEN
CASE WHEN @dd < 8 THEN 0
WHEN @dd >= 8 AND @dd <= 14 THEN (CASE WHEN @dow = 1 THEN (CASE WHEN @hr >= 2 THEN 1 ELSE 0 END) ELSE (CASE WHEN @dd - @dow >= 7 THEN 1 ELSE 0 END) END)
ELSE 1
END
WHEN @mm = 11 THEN
-- the first Sunday of November can fall on day 1 through 7
CASE WHEN @dd <= 7 THEN (CASE WHEN @dow = 1 THEN (CASE WHEN @hr < 2 THEN 1 ELSE 0 END) ELSE (CASE WHEN @dow > @dd THEN 1 ELSE 0 END) END)
ELSE 0
END
ELSE 0
END;
RETURN @returnValue
END
GO
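A quick way to sanity-check these functions is to run them over a handful of timestamps on either side of the transitions. The sketch below uses hypothetical sample values for Eastern Time in 2019 (DST ran from March 10 to November 3 that year) and assumes both functions have been created successfully in the dbo schema.

```sql
SET DATEFIRST 7;  -- the functions assume DATEPART(dw, ...) returns 1 for Sunday

SELECT t.LocalTime,
       dbo.udf_IsInDST(t.LocalTime)               AS IsInDst,
       dbo.udf_ConvertTimeLocalToUTC(t.LocalTime) AS UtcTime
FROM (VALUES ('2019-03-10T01:59:00'),   -- just before clocks go forward
             ('2019-03-10T03:00:00'),   -- just after clocks go forward
             ('2019-07-01T12:00:00'),   -- midsummer
             ('2019-11-03T00:30:00'),   -- last hours of daylight time
             ('2019-11-03T02:00:00'),   -- back on standard time
             ('2019-12-25T12:00:00')    -- midwinter
     ) AS v(LocalText)
CROSS APPLY (SELECT CAST(v.LocalText AS DATETIME) AS LocalTime) AS t
ORDER BY t.LocalTime;
```

Local times between 01:00 and 02:00 on the November transition day occur twice, so no function of the local timestamp alone can tell the two apart; this one treats them as daylight time.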

I've used 2 methods in the past.
The first was to create a .Net CLR that takes a datetime and timezone and returns the UTC datetime value which was stored with the data.
The second solution was only required to work for a limited number of time zones and involved creating a table consisting of time zone ID, date from, date to and the correct UTC offset for dates in the past and 20 years in the future. From there it is simple to join and apply the correct offset.
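The second approach could look roughly like the sketch below. All table and column names here (#TimeZoneOffset, #Events, UtcOffsetMinutes and so on) are hypothetical, and only a few US Eastern Time periods around the 2007 rule change are loaded for illustration.

```sql
-- Hypothetical lookup table: one row per zone and period with a constant offset.
CREATE TABLE #TimeZoneOffset
(
    TimeZoneId       VARCHAR(10) NOT NULL,
    LocalFrom        DATETIME    NOT NULL,  -- inclusive, in local time
    LocalTo          DATETIME    NOT NULL,  -- exclusive, in local time
    UtcOffsetMinutes INT         NOT NULL   -- local time = UTC + offset
);

INSERT INTO #TimeZoneOffset (TimeZoneId, LocalFrom, LocalTo, UtcOffsetMinutes)
VALUES
('ET', '2006-04-02T02:00:00', '2006-10-29T02:00:00', -240),  -- daylight time 2006
('ET', '2006-10-29T02:00:00', '2007-03-11T02:00:00', -300),  -- standard time
('ET', '2007-03-11T02:00:00', '2007-11-04T02:00:00', -240),  -- daylight time 2007
('ET', '2007-11-04T02:00:00', '2008-03-09T02:00:00', -300);  -- standard time

-- Hypothetical data table with local timestamps.
CREATE TABLE #Events (EventId INT, TimeZoneId VARCHAR(10), LocalDateTime DATETIME);
INSERT INTO #Events (EventId, TimeZoneId, LocalDateTime)
VALUES (1, 'ET', '2006-07-04T12:00:00'),
       (2, 'ET', '2007-01-15T09:30:00'),
       (3, 'ET', '2007-03-20T18:45:00');

-- Join each event to the period that contains it and remove the offset.
SELECT e.EventId,
       e.LocalDateTime,
       DATEADD(MINUTE, -o.UtcOffsetMinutes, e.LocalDateTime) AS UtcDateTime
FROM #Events AS e
JOIN #TimeZoneOffset AS o
    ON o.TimeZoneId = e.TimeZoneId
   AND e.LocalDateTime >= o.LocalFrom
   AND e.LocalDateTime <  o.LocalTo;
```

Storing the period boundaries in local time keeps the join simple, at the cost of treating the repeated hour at the end of daylight time as still being on daylight time.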

Related

SSMS Rolling Average over Day of Week

Leadership wants to know how Teammates are performing on Mondays & Fridays in comparison to the rest of the work week. Below is a sample temp table of Teammate X's daily performance over a two-month period. Each subsequent Teammate has a different starting point from which they are measured. I initially looked at using UNBOUNDED PRECEDING in conjunction with the various start dates, but window functions are not cooperating. Help!
CREATE TABLE #RollingAverage
(
[Date] DATE PRIMARY KEY
,[Value] INT
);
INSERT INTO #RollingAverage
SELECT '2019-01-02',626
UNION ALL SELECT '2019-01-03',231 UNION ALL SELECT '2019-01-04',572
UNION ALL SELECT '2019-01-07',775 UNION ALL SELECT '2019-01-09',660
UNION ALL SELECT '2019-01-10',662 UNION ALL SELECT '2019-01-11',541
UNION ALL SELECT '2019-01-14',849 UNION ALL SELECT '2019-01-15',632
UNION ALL SELECT '2019-01-16',906 UNION ALL SELECT '2019-01-18',961
UNION ALL SELECT '2019-01-21',501 UNION ALL SELECT '2019-01-24',311
UNION ALL SELECT '2019-01-25',614 UNION ALL SELECT '2019-01-28',296
UNION ALL SELECT '2019-01-29',390 UNION ALL SELECT '2019-01-31',804
UNION ALL SELECT '2019-02-01',928 UNION ALL SELECT '2019-02-05',855
UNION ALL SELECT '2019-02-06',605 UNION ALL SELECT '2019-02-08',283
UNION ALL SELECT '2019-02-12',144 UNION ALL SELECT '2019-02-14',382
UNION ALL SELECT '2019-02-15',862 UNION ALL SELECT '2019-02-18',549
UNION ALL SELECT '2019-02-19',401 UNION ALL SELECT '2019-02-20',515
UNION ALL SELECT '2019-02-21',590 UNION ALL SELECT '2019-02-22',625
UNION ALL SELECT '2019-02-25',304 UNION ALL SELECT '2019-02-26',402
UNION ALL SELECT '2019-02-27',326;
AVG(Value) over (ORDER BY [Date] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) did not work
The first thing to understand is that your "daily" performance data is not actually daily: some dates are missing. A simple solution is to fill the gaps so that a fixed number of rows corresponds to a fixed number of days.
I filled the gaps using a CTE that generates a calendar table on the fly, but you could use a permanent calendar table if available.
WITH
E(n) AS(
SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n)
),
E2(n) AS(
SELECT a.n FROM E a, E b
),
cteCalendar(calDate) AS(
SELECT TOP (61)
CAST( DATEADD( DD, 1-ROW_NUMBER() OVER(ORDER BY (SELECT NULL)), GETDATE()) AS date) AS calDate
FROM E2
),
cteRollingAverages AS(
SELECT ra.[Date],
ra.value,
AVG(Value) over (ORDER BY calDate ROWS BETWEEN 7 PRECEDING AND CURRENT ROW) RollingAverage
FROM #RollingAverage AS ra
RIGHT JOIN cteCalendar AS c ON ra.[Date] = c.calDate
)
SELECT *
FROM cteRollingAverages
WHERE [Date] IS NOT NULL
ORDER BY [Date];
A different option is to use APPLY. This does not depend on the date range of a generated calendar.
SELECT *
FROM #RollingAverage r
CROSS APPLY( SELECT AVG(i.[Value]) AS RollingAvg
FROM #RollingAverage i
WHERE i.[Date] BETWEEN DATEADD( DD, -7, r.[Date]) AND r.[Date]) av
ORDER BY [Date];
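Two details of the APPLY query are easy to miss: AVG over an INT column returns a truncated INT in SQL Server, and BETWEEN with DATEADD(DD, -7, ...) spans eight calendar days including the current one. The variant below is a sketch that assumes a seven-day window and a decimal result are what is wanted; neither is stated in the question.

```sql
SELECT r.[Date],
       r.[Value],
       av.RollingAvg
FROM #RollingAverage AS r
CROSS APPLY (SELECT AVG(CAST(i.[Value] AS DECIMAL(10, 2))) AS RollingAvg  -- avoid integer truncation
             FROM #RollingAverage AS i
             WHERE i.[Date] >  DATEADD(DAY, -7, r.[Date])   -- seven calendar days,
               AND i.[Date] <= r.[Date]) AS av              -- current day included
ORDER BY r.[Date];
```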

SQL Server 2016 - Running Count and Sum for a 24 hours sliding window

I am trying to count orders over a 24-hour sliding window. I have a 'datetime' field and I'm calculating the 24-hour window at minute granularity. The count should restart every time the gap between two consecutive orders is over 1440 minutes, or when the running time across consecutive orders is over 1440 minutes.
Environment is SQL server 2016, I can create Temp tables but no physical tables and no memory-optimized objects (I guess anything working on 2012+ should work).
I tried an inner join on the same table and tested with recursive CTEs, ROW_NUMBER etc. but the issue is that there is never a set number of rows for the 24 hours window and the base time from which to calculate the start of the period changes. The only constant I have is the 24 hours time span.
Tried the following:
https://www.red-gate.com/simple-talk/sql/t-sql-programming/calculating-values-within-a-rolling-window-in-transact-sql/
Calculate running total / running balance
Cross apply seems to be working for the most part but in some instances - when calculating the running 24 hours window - it isn't. I tried changing the datetime conditions in the WHERE clause in many ways but I still can't figure out how to get it to work correctly.
I thought about creating a reset event at the 24-hour mark, as shown here https://blog.jooq.org/2015/05/12/use-this-neat-window-function-trick-to-calculate-time-differences-in-a-time-series/ but at this point my brain is melting and I can't even get the logic straight.
DROP TABLE IF EXISTS #Data
CREATE TABLE #Data
(
START_TIME DATETIME
,ORDER_ID NUMERIC(18,0)
,PROD_ID NUMERIC(18,0)
,ACC_ID NUMERIC(18,0)
);
INSERT INTO #Data
SELECT '2018-06-22 11:00:00.000', 198151606, 58666, 1601554883
UNION ALL SELECT '2018-07-09 10:15:00.000',2008873061,58666,1601554883
UNION ALL SELECT '2018-07-09 12:33:00.000',2009269222,58666,1601554883
UNION ALL SELECT '2018-07-10 08:29:00.000',2010735393,58666,1601554883
UNION ALL SELECT '2018-07-10 10:57:00.000',2010735584,58666,1601554883
UNION ALL SELECT '2018-06-27 23:53:00.000',1991467555,58666,2300231016
UNION ALL SELECT '2018-06-28 00:44:00.000',1991583916,58666,2300231016
UNION ALL SELECT '2018-07-04 04:15:00.000',2001154497,58666,2300231016
UNION ALL SELECT '2018-07-04 15:44:00.000',2001154818,58666,2300231016
UNION ALL SELECT '2018-07-04 21:30:00.000',2002057919,58666,2300231016
UNION ALL SELECT '2018-07-05 02:09:00.000',1200205808,58666,2300231016
UNION ALL SELECT '2018-07-05 04:15:00.000',2200205814,58666,2300231016
UNION ALL SELECT '2018-07-05 17:23:00.000',3200370070,58666,2300231016
UNION ALL SELECT '2018-07-05 18:07:00.000',4200370093,58666,2300231016
UNION ALL SELECT '2018-07-06 20:15:00.000',5200571962,58666,2300231016
UNION ALL SELECT '2018-07-07 07:45:00.000',6200571987,58666,2300231016
UNION ALL SELECT '2018-07-07 12:13:00.000',7200571993,58666,2300231016
UNION ALL SELECT '2018-07-09 18:29:00.000',8200939551,58666,2300231016
UNION ALL SELECT '2018-07-09 21:05:00.000',9200939552,58666,2300231016
UNION ALL SELECT '2018-07-11 21:31:00.000',2011107311,58666,2300231016
UNION ALL SELECT '2018-06-27 18:23:00.000',1991016382,58669,2300231016
UNION ALL SELECT '2018-06-27 19:07:00.000',1991181363,58669,2300231016
UNION ALL SELECT '2018-06-27 19:28:00.000',1991181374,58669,2300231016
UNION ALL SELECT '2018-06-28 01:44:00.000',1991583925,58669,2300231016
UNION ALL SELECT '2018-06-28 02:19:00.000',1991583946,58669,2300231016
UNION ALL SELECT '2018-07-03 10:15:00.000',1999231747,58669,2300231016
UNION ALL SELECT '2018-07-03 10:45:00.000',2000293678,58669,2300231016
UNION ALL SELECT '2018-07-03 14:22:00.000',200029380,58669,2300231016
UNION ALL SELECT '2018-07-04 19:45:00.000',2002057789,58669,2300231016
UNION ALL SELECT '2018-07-04 21:00:00.000',1200205781,58669,2300231016
UNION ALL SELECT '2018-07-05 15:12:00.000',2200254833,58669,2300231016
UNION ALL SELECT '2018-07-05 17:52:00.000',3200370071,58669,2300231016
UNION ALL SELECT '2018-07-09 22:30:00.000',4200939553,58669,2300231016
UNION ALL SELECT '2018-07-09 23:23:00.000',5200939566,58669,2300231016
UNION ALL SELECT '2018-07-30 17:45:00.000',6204364207,58666,2300231016
UNION ALL SELECT '2018-07-30 23:30:00.000',7204364211,58666,2300231016
;WITH TimeBetween AS(
SELECT
ACC_ID
,PROD_ID
,ORDER_ID
,START_TIME
,TIME_BETWEEN_ORDERS = COALESCE(CASE WHEN DATEDIFF(MINUTE, LAG(START_TIME) OVER(PARTITION BY ACC_ID, PROD_ID
ORDER BY START_TIME), START_TIME) >= 1440
THEN 0
ELSE DATEDIFF(MINUTE, LAG(START_TIME) OVER(PARTITION BY ACC_ID, PROD_ID
ORDER BY START_TIME), START_TIME)
END, 0)
FROM #Data
)
SELECT
TimeBetween.ACC_ID
,TimeBetween.PROD_ID
,TimeBetween.ORDER_ID
,TimeBetween.START_TIME
,TIME_BETWEEN_ORDERS
--Not working correctly, repeats the previous time at the end of the window when it should be 0.
,RUNNING_TIME_BETWEEN_ORDERS = SUM(TIME_BETWEEN_ORDERS) OVER(PARTITION BY ACC_ID, PROD_ID ORDER BY START_TIME)
,Running24h.*
FROM TimeBetween
CROSS APPLY(SELECT TOP 1
RUNNING_COUNT_24h = COUNT(*) OVER() --Count admin units within the time window in the WHERE clause
--Check what APPLY is returning for running time
,RUNNING_TIME_BETWEEN_ORDERS_Apply = DATEDIFF(MINUTE, StageBaseApply.START_TIME, TimeBetween.START_TIME)
--Check what APPLY is using as base event anchor for the calculation
,START_TIME_Apply = StageBaseApply.START_TIME
FROM #Data AS StageBaseApply
WHERE
StageBaseApply.ACC_ID = TimeBetween.ACC_ID
AND StageBaseApply.PROD_ID = TimeBetween.PROD_ID
AND (StageBaseApply.START_TIME > DATEADD(MINUTE, -1440, TimeBetween.START_TIME)
AND StageBaseApply.START_TIME <= TimeBetween.START_TIME
)
ORDER BY StageBaseApply.START_TIME
) AS Running24h
ORDER BY ACC_ID,PROD_ID, START_TIME
When the running time between orders is over 24 hours the running count should re-start from 1.
Currently it repeats the last value and the time it's using for the calculation seems to be off.
Current result from CROSS APPLY with notes on where it's not working and what it should be for what I'm trying to achieve
First create a Numbers table with at least as many rows as the minutes in the maximum time range you will ever be dealing with
CREATE TABLE dbo.Numbers(Number INT PRIMARY KEY);
WITH E1(N) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
, Nums AS (SELECT TOP (10000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E8)
INSERT INTO dbo.Numbers
SELECT N
FROM Nums
And then you should be able to use something like this. (I'm assuming that all start times are exact minutes and that there are no duplicates per ACC_ID, PROD_ID, START_TIME, as in your example data. If there are, you will need to pre-aggregate at the minute level before the left join.)
WITH G
AS (SELECT ACC_ID,
PROD_ID,
MIN = MIN(START_TIME),
MAX = MAX(START_TIME),
Range = DATEDIFF(MINUTE, MIN(START_TIME), MAX(START_TIME))
FROM #Data
GROUP BY ACC_ID,
PROD_ID),
E
AS (SELECT *
FROM G
JOIN dbo.Numbers N
ON N.Number <= Range + 1),
R AS (SELECT E.ACC_ID,
E.PROD_ID,
D.START_TIME,
Cnt = COUNT(D.START_TIME) OVER (PARTITION BY E.ACC_ID, E.PROD_ID
ORDER BY DATEADD(MINUTE, NUMBER-1, MIN)
ROWS BETWEEN 1439 PRECEDING AND CURRENT ROW)
FROM E
LEFT JOIN #Data D
ON D.ACC_ID = E.ACC_ID
AND D.PROD_ID = E.PROD_ID
AND D.START_TIME = DATEADD(MINUTE, NUMBER-1, MIN) )
SELECT *
FROM R
WHERE START_TIME IS NOT NULL
ORDER BY ACC_ID,
PROD_ID,
START_TIME
After finding this post on how to reset a running sum, I think I may have finally been able to crack this nut. Not sure about how well it scales but it is working.
I also added a new column for order quantity since it may be useful sometimes to track the orders running total during the same time window.
The sliding time window can be set in this CASE statement:
CASE WHEN RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG >= 1440 THEN 0 ELSE RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG
END
DROP TABLE IF EXISTS #Data
CREATE TABLE #Data
(
ORDER_TIME DATETIME
,ORDER_ID NUMERIC(18,0)
,PROD_ID NUMERIC(18,0)
,ACCOUNT_ID NUMERIC(18,0)
,ORDER_QUANTITY INT
);
INSERT INTO #Data
SELECT '2018-06-22 11:00:00.000', 1981516061, 158666, 1601554883,5
UNION ALL SELECT '2018-07-09 10:15:00.000',2008873062,158666,1601554883,3
UNION ALL SELECT '2018-07-09 12:33:00.000',2009269223,158666,1601554883,2
UNION ALL SELECT '2018-07-10 08:29:00.000',2010735394,158666,1601554883,4
UNION ALL SELECT '2018-07-10 10:57:00.000',2010735584,158666,1601554883,7
UNION ALL SELECT '2018-06-27 23:53:00.000',1991467553,158666,2300231016,6
UNION ALL SELECT '2018-06-28 00:44:00.000',1991583913,158666,2300231016,6
UNION ALL SELECT '2018-07-04 04:15:00.000',2001154492,158666,2300231016,4
UNION ALL SELECT '2018-07-04 15:44:00.000',2001154814,158666,2300231016,5
UNION ALL SELECT '2018-07-04 21:30:00.000',2002057915,158666,2300231016,4
UNION ALL SELECT '2018-07-05 02:09:00.000',2002058086,158666,2300231016,4
UNION ALL SELECT '2018-07-05 04:15:00.000',2002058147,158666,2300231016,3
UNION ALL SELECT '2018-07-05 17:23:00.000',2003700706,158666,2300231016,2
UNION ALL SELECT '2018-07-05 18:07:00.000',2003700938,158666,2300231016,1
UNION ALL SELECT '2018-07-06 20:15:00.000',2005719626,158666,2300231016,7
UNION ALL SELECT '2018-07-07 07:45:00.000',2005719879,158666,2300231016,8
UNION ALL SELECT '2018-07-07 12:13:00.000',2005719931,158666,2300231016,9
UNION ALL SELECT '2018-07-09 18:29:00.000',2009395510,158666,2300231016,8
UNION ALL SELECT '2018-07-09 21:05:00.000',2009395523,158666,2300231016,6
UNION ALL SELECT '2018-07-11 21:31:00.000',2011107312,158666,2300231016,5
UNION ALL SELECT '2018-06-27 18:23:00.000',1991016381,258669,2300231016,4
UNION ALL SELECT '2018-06-27 19:07:00.000',1991181365,258669,2300231016,4
UNION ALL SELECT '2018-06-27 19:28:00.000',1991181376,258669,2300231016,3
UNION ALL SELECT '2018-06-28 01:44:00.000',1991583923,258669,2300231016,9
UNION ALL SELECT '2018-06-28 02:19:00.000',1991583943,258669,2300231016,2
UNION ALL SELECT '2018-07-03 10:15:00.000',1999231742,258669,2300231016,1
UNION ALL SELECT '2018-07-03 10:45:00.000',2000293679,258669,2300231016,1
UNION ALL SELECT '2018-07-03 14:22:00.000',2000293804,258669,2300231016,3
UNION ALL SELECT '2018-07-04 19:45:00.000',2002057785,258669,2300231016,2
UNION ALL SELECT '2018-07-04 21:00:00.000',2002057813,258669,2300231016,1
UNION ALL SELECT '2018-07-05 15:12:00.000',2002548332,258669,2300231016,7
UNION ALL SELECT '2018-07-05 17:52:00.000',2003700719,258669,2300231016,6
UNION ALL SELECT '2018-07-09 22:30:00.000',2009395530,258669,2300231016,5
UNION ALL SELECT '2018-07-09 23:23:00.000',2009395666,258669,2300231016,3
UNION ALL SELECT '2018-07-30 17:45:00.000',2043642075,158666,2300231016,2
UNION ALL SELECT '2018-07-30 23:30:00.000',2043642114,158666,2300231016,4
;WITH NextEventLag AS(
--Returns the next event information.
SELECT
ORDER_TIME
,ORDER_ID
,PROD_ID
,ACCOUNT_ID
,RowNum = ROW_NUMBER() OVER(PARTITION BY ACCOUNT_ID, PROD_ID ORDER BY ORDER_TIME)
--NEXT_ORDER_TIME_LAG: Returns the time difference between two consecutive order times.
,NEXT_ORDER_TIME_LAG = DATEDIFF(MINUTE, LAG(ORDER_TIME, 1, ORDER_TIME) OVER(PARTITION BY ACCOUNT_ID, PROD_ID ORDER BY ORDER_TIME), ORDER_TIME)
,ORDER_QUANTITY
FROM #Data
)
,RunningOrders AS(
SELECT
RowNum
,ORDER_TIME
,ACCOUNT_ID
,PROD_ID
,NEXT_ORDER_TIME_LAG
,LAG_LESS_THAN_24h = 0
,ORDER_QUANTITY
FROM NextEventLag
WHERE RowNum = 1
UNION ALL
SELECT
NextEventLag.RowNum
,NextEventLag.ORDER_TIME
,NextEventLag.ACCOUNT_ID
,NextEventLag.PROD_ID
,NextEventLag.NEXT_ORDER_TIME_LAG
--If the time lag between consecutive events and the time running sum is over 1440 minutes then set the value to 0.
--Change the NEXT_ORDER_TIME_LAG time interval to the desired interval value in minutes.
,LAG_LESS_THAN_24h = CASE WHEN RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG >= 1440 THEN 0
ELSE RunningOrders.LAG_LESS_THAN_24h + NextEventLag.NEXT_ORDER_TIME_LAG
END
,NextEventLag.ORDER_QUANTITY
FROM RunningOrders
INNER JOIN NextEventLag ON RunningOrders.RowNum + 1 = NextEventLag.RowNum
AND RunningOrders.ACCOUNT_ID = NextEventLag.ACCOUNT_ID
AND RunningOrders.PROD_ID = NextEventLag.PROD_ID
)
,GroupedLags AS(
--This Groups together the LAG(s) less than 1440 minutes and is used by the outer query window functions
--to calculate the running aggregates.
SELECT RunningOrders.*
,Running24h.*
FROM RunningOrders
CROSS APPLY(SELECT TOP 1
Groups = COUNT(*) OVER(ORDER BY GroupApply.LAG_LESS_THAN_24h) --Count admin units within the time window in the WHERE clause
FROM RunningOrders AS GroupApply
WHERE
GroupApply.ACCOUNT_ID = RunningOrders.ACCOUNT_ID
AND GroupApply.PROD_ID = RunningOrders.PROD_ID
AND GroupApply.ORDER_TIME <= RunningOrders.ORDER_TIME
--ORDER BY StageBaseApply.ORDER_TIME
) AS Running24h
)
select
GroupedLags.ACCOUNT_ID
,GroupedLags.PROD_ID
,GroupedLags.ORDER_TIME
,GroupedLags.NEXT_ORDER_TIME_LAG
,GroupedLags.LAG_LESS_THAN_24h
,RUNNING_COUNT_24h = ROW_NUMBER() OVER(PARTITION BY GroupedLags.ACCOUNT_ID, GroupedLags.PROD_ID, GroupedLags.Groups ORDER BY GroupedLags.ORDER_TIME)
,RUNNING_SUM_24h = SUM(ORDER_QUANTITY) OVER(PARTITION BY GroupedLags.ACCOUNT_ID, GroupedLags.PROD_ID, GroupedLags.Groups ORDER BY GroupedLags.ORDER_TIME)
from GroupedLags
ORDER BY
GroupedLags.ACCOUNT_ID
,GroupedLags.PROD_ID
,GroupedLags.ORDER_TIME
Here is the db<>fiddle demo

Why not grouping correctly when calculating the moving average in SQL Server

I have the following code, which is used to calculate a 12-month moving average. I want to calculate the 12-month moving average per ACNBR. When only a single ACNBR is present the code works fine, but when I try to calculate the moving average for more than one ACNBR it no longer works. Please assist. Thank you.
-Sample data code:
CREATE TABLE #RollingTotalsExample
(
[Date] DATE
,[Value] INT
,[ACNBR] INT
,[CIS] INT
);
INSERT INTO #RollingTotalsExample
SELECT '2011-01-01',626,100,12
UNION ALL SELECT '2011-02-01',231,100,12 UNION ALL SELECT '2011-03-01',572,100,12
UNION ALL SELECT '2011-04-01',775,100,12 UNION ALL SELECT '2011-05-01',660,100,12
UNION ALL SELECT '2011-06-01',662,100,12 UNION ALL SELECT '2011-07-01',541,100,12
UNION ALL SELECT '2011-08-01',849,100,12 UNION ALL SELECT '2011-09-01',632,100,12
UNION ALL SELECT '2011-10-01',906,100,12 UNION ALL SELECT '2011-11-01',961,100,12
UNION ALL SELECT '2011-12-01',361,100,12 UNION ALL SELECT '2012-01-01',461,100,12
UNION ALL SELECT '2012-02-01',928,100,12 UNION ALL SELECT '2012-03-01',855,100,12
UNION ALL SELECT '2012-04-01',605,100,12 UNION ALL SELECT '2012-05-01',83,100,12
UNION ALL SELECT '2012-06-01',44,100,12 UNION ALL SELECT '2012-07-01',382,100,12
UNION ALL SELECT '2012-08-01',862,100,12 UNION ALL SELECT '2012-09-01',549,100,12
UNION ALL SELECT '2012-10-01',632,100,12 UNION ALL SELECT '2012-11-01',2,100,12
UNION ALL SELECT '2012-12-01',26,100,12
UNION ALL SELECT '2011-01-01',626,200,12
UNION ALL SELECT '2011-02-01',231,200,12 UNION ALL SELECT '2011-03-01',572,200,12
UNION ALL SELECT '2011-04-01',775,200,12 UNION ALL SELECT '2011-05-01',660,200,12
UNION ALL SELECT '2011-06-01',662,200,12 UNION ALL SELECT '2011-07-01',541,200,12
UNION ALL SELECT '2011-08-01',849,200,12 UNION ALL SELECT '2011-09-01',632,200,12
UNION ALL SELECT '2011-10-01',906,200,12 UNION ALL SELECT '2011-11-01',961,200,12
UNION ALL SELECT '2011-12-01',361,200,12 UNION ALL SELECT '2012-01-01',461,200,12
UNION ALL SELECT '2012-02-01',928,200,12 UNION ALL SELECT '2012-03-01',855,200,12
UNION ALL SELECT '2012-04-01',605,200,12 UNION ALL SELECT '2012-05-01',83,200,12
UNION ALL SELECT '2012-06-01',44,200,12 UNION ALL SELECT '2012-07-01',382,200,12
UNION ALL SELECT '2012-08-01',862,200,12 UNION ALL SELECT '2012-09-01',549,200,12
UNION ALL SELECT '2012-10-01',632,200,12 UNION ALL SELECT '2012-11-01',2,200,12
UNION ALL SELECT '2012-12-01',26,200,12;
-code
SELECT a.[Date]
,a.ACNBR
,Value=MAX(CASE WHEN a.[Date] = b.[Date] THEN a.Value END)
,Rolling12Months=CASE
WHEN ROW_NUMBER() OVER (ORDER BY a.[Date]) < (12)
THEN NULL
ELSE avg(b.Value)
END
FROM #RollingTotalsExample a
JOIN #RollingTotalsExample b ON b.[Date] BETWEEN DATEADD(month, -11, a.[Date]) AND a.[Date]
GROUP BY a.ACNBR,a.[Date]
ORDER BY a.ACNBR,a.[Date]
Are you looking for output like this?
select *, sum(value) over(partition by acnbr order by date rows between 11 preceding and current row) from #RollingTotalsExample
If not, please post the expected output.
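As to why the original query breaks with several accounts: the self-join matches rows on [Date] only, so each row also picks up the other ACNBR's values, and ROW_NUMBER() has no PARTITION BY, so the "first 11 months are NULL" test counts rows across all accounts. The windowed query above returns a rolling sum; a minimal sketch of the average variant follows. It assumes SQL Server 2012 or later (for the ROWS frame), that the question's #RollingTotalsExample table exists in the current session, and that each ACNBR has exactly one row per month with no gaps, since the frame counts rows rather than calendar months.

```sql
-- Sketch only: rolling 12-row average per ACNBR (SQL Server 2012+).
SELECT  [Date]
       ,ACNBR
       ,[Value]
       ,Rolling12Months =
            CASE
                -- Fewer than 12 rows so far for this account: no full window yet.
                WHEN ROW_NUMBER() OVER (PARTITION BY ACNBR ORDER BY [Date]) < 12
                    THEN NULL
                -- 1.0 * [Value] avoids integer averaging on the INT column.
                ELSE AVG(1.0 * [Value]) OVER (PARTITION BY ACNBR
                                              ORDER BY [Date]
                                              ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
            END
FROM #RollingTotalsExample
ORDER BY ACNBR, [Date];
```

If the self-join form has to be kept (for example on SQL Server 2008), adding `a.ACNBR = b.ACNBR` to the join condition and `PARTITION BY a.ACNBR` to the ROW_NUMBER() should address the same two problems.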

Need help in SQL Query 5

I am using SQL Server 2008. I have one row per employee per day. Below is the sample data.
WITH RawData as
(
SELECT '10001' AS EmpNo,'2015-01-01' as AttendanceDate,'FS' AS ShiftCode UNION
SELECT '10001','2015-01-02','WO' UNION
SELECT '10001','2015-01-03','FS' UNION
SELECT '10001','2015-01-04','FS' UNION
SELECT '10001','2015-01-05','FS' UNION
SELECT '10001','2015-01-06','FS' UNION
SELECT '10001','2015-01-07','FS' UNION
SELECT '10001','2015-01-08','FS' UNION
SELECT '10001','2015-01-09','WO' UNION
SELECT '10001','2015-01-10','FS' UNION
SELECT '10001','2015-01-11','FS' UNION
SELECT '10001','2015-01-12','FS' UNION
SELECT '10001','2015-01-13','FS' UNION
SELECT '10001','2015-01-14','FS' UNION
SELECT '10001','2015-01-15','FS' UNION
SELECT '10001','2015-01-16','WO' UNION
SELECT '10001','2015-01-17','FS' UNION
SELECT '10001','2015-01-18','FS' UNION
SELECT '10001','2015-01-19','FS' UNION
SELECT '10001','2015-01-20','FS' UNION
SELECT '10001','2015-01-21','FS' UNION
SELECT '10001','2015-01-22','FS' UNION
SELECT '10001','2015-01-23','WO' UNION
SELECT '10001','2015-01-24','FS' UNION
SELECT '10001','2015-01-25','FS' UNION
SELECT '10001','2015-01-26','FS' UNION
SELECT '10001','2015-01-27','FS' UNION
SELECT '10001','2015-01-28','FS' UNION
SELECT '10001','2015-01-29','FS' UNION
SELECT '10001','2015-01-30','WO' UNION
SELECT '10001','2015-01-31','FS' UNION
SELECT '10002','2015-01-01','FS' UNION
SELECT '10002','2015-01-02','WO' UNION
SELECT '10002','2015-01-03','WO' UNION
SELECT '10002','2015-01-04','FS' UNION
SELECT '10002','2015-01-05','FS' UNION
SELECT '10002','2015-01-06','FS' UNION
SELECT '10002','2015-01-07','FS' UNION
SELECT '10002','2015-01-08','FS' UNION
SELECT '10002','2015-01-09','WO' UNION
SELECT '10002','2015-01-10','WO' UNION
SELECT '10002','2015-01-11','FS' UNION
SELECT '10002','2015-01-12','FS' UNION
SELECT '10002','2015-01-13','FS' UNION
SELECT '10002','2015-01-14','FS' UNION
SELECT '10002','2015-01-15','FS' UNION
SELECT '10002','2015-01-16','WO' UNION
SELECT '10002','2015-01-17','WO' UNION
SELECT '10002','2015-01-18','FS' UNION
SELECT '10002','2015-01-19','FS' UNION
SELECT '10002','2015-01-20','FS' UNION
SELECT '10002','2015-01-21','FS' UNION
SELECT '10002','2015-01-22','FS' UNION
SELECT '10002','2015-01-23','WO' UNION
SELECT '10002','2015-01-24','WO' UNION
SELECT '10002','2015-01-25','FS' UNION
SELECT '10002','2015-01-26','FS' UNION
SELECT '10002','2015-01-27','FS' UNION
SELECT '10002','2015-01-28','FS' UNION
SELECT '10002','2015-01-29','FS' UNION
SELECT '10002','2015-01-30','WO' UNION
SELECT '10002','2015-01-31','WO')
SELECT * FROM RawData Order By EmpNo,AttendanceDate
How can I write a SQL query to get the following output from this sample data? Each employee's workweek starts on the day after the weekly off, which can be any day (Mon, Tue, etc.). The shift codes are WO: weekly off, FS: first shift, SS: second shift.
EmpNo WeekFrom WeekTo
10001 2015-01-01 2015-01-02
10001 2015-01-03 2015-01-09
10001 2015-01-10 2015-01-16
10001 2015-01-17 2015-01-23
10001 2015-01-24 2015-01-30
10001 2015-01-31 2015-01-31
10002 2015-01-01 2015-01-03
10002 2015-01-04 2015-01-10
10002 2015-01-11 2015-01-17
10002 2015-01-18 2015-01-24
10002 2015-01-25 2015-01-31
I have a solution, but it takes quite a long time on the live table with 1 million rows. Have I done something wrong in the query, or is there a better way of doing this?
WITH RawData as
(
-- Insert above data here.
)
,ProcessData AS (
SELECT EmpNo,AttendanceDate,ShiftCode,RowID = ROW_NUMBER() OVER (
ORDER BY EmpNo, AttendanceDate
), WeekNo = 1 FROM RawData
)
,FinalData
AS (
SELECT EmpNo, AttendanceDate, ShiftCode, RowID, WeekNo = 1
FROM ProcessData DA
WHERE RowID = 1
UNION ALL
SELECT DA.EmpNo, DA.AttendanceDate, DA.ShiftCode, DA.RowID,
WeekNo = (CASE WHEN FinalData.EmpNo != DA.EmpNo THEN 1 ELSE FinalData.WeekNo + (CASE WHEN (FinalData.ShiftCode = 'WO' AND DA.ShiftCode != 'WO') THEN 1 ELSE 0 END) END)
FROM FinalData
INNER JOIN ProcessData DA ON DA.RowID = FinalData.RowID + 1
)
SELECT EmpNo, MIN(AttendanceDate) AS StartDate, MAX(AttendanceDate) AS EndDate, WeekNo
FROM FinalData
GROUP BY EmpNo, WeekNo
ORDER BY EmpNo, WeekNo
Try this:
SQL Fiddle
;WITH RawData AS (
-- Your insert statements here
),
Cte AS(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY EmpNo, grp ORDER BY AttendanceDate DESC)
FROM (
SELECT *,
grp = DATEADD(DAY, -ROW_NUMBER() OVER(PARTITION BY EmpNo ORDER BY AttendanceDate), AttendanceDate)
FROM RawData
WHERE ShiftCode = 'WO'
)t
),
CteWeekOff AS(
SELECT EmpNo, AttendanceDate, ShiftCode FROM cte WHERE RN = 1
),
CteFinal AS(
SELECT
EmpNo,
WeekFrom = MIN(AttendanceDate),
Weekto = MAX(AttendanceDate)
FROM (
SELECT *,
grp = DATEADD(DAY, - ROW_NUMBER() OVER(PARTITION BY EmpNo ORDER BY AttendanceDate), AttendanceDate)
FROM RawData
WHERE ShiftCode <> 'WO'
)t
GROUP BY EmpNo, grp
)
SELECT
EmpNo,
WeekFrom = x.WeekFrom,
WeekTo = w.AttendanceDate
FROM CteWeekOff w
CROSS APPLY(
SELECT TOP 1 WeekFrom
FROM CteFinal r
WHERE
r.EmpNo = w.EmpNo
AND r.WeekFrom <= w.AttendanceDate
ORDER BY r.WeekFrom DESC
)x(WeekFrom)
UNION ALL
SELECT
EmpNo,
WeekFrom = x.WeekFrom,
WeekTo = t.AttendanceDate
FROM (
SELECT *, RN = ROW_NUMBER() OVER(PARTITION BY EmpNo ORDER BY AttendanceDate DESC)
FROM RawData
)t
CROSS APPLY(
SELECT TOP 1 r.WeekFrom -- start of the trailing run of working days
FROM CteFinal r
WHERE
r.EmpNo = t.EmpNo
AND r.WeekFrom <= t.AttendanceDate
ORDER BY r.WeekFrom DESC
)x(WeekFrom)
WHERE
RN = 1
AND ShiftCode <> 'WO'
ORDER BY EmpNo, WeekFrom
Finally, this worked: 5 seconds on 230,000 records. I will go ahead with my own solution. Thanks for your time. I hope this solution helps someone.
-- Step 1 : Save it to temp table
SELECT EmpNo,AttendanceDate,ShiftCode,RowID = ROW_NUMBER() OVER (
ORDER BY EmpNo, AttendanceDate
), WeekNo = 1 into #RawData FROM -- My table
-- Step 2 : Use temp table
;WITH FinalData
AS (
SELECT EmpNo, AttendanceDate, ShiftCode, RowID, WeekNo = 1
FROM #RawData DA
WHERE RowID = 1
UNION ALL
SELECT DA.EmpNo, DA.AttendanceDate, DA.ShiftCode, DA.RowID,
WeekNo = (CASE WHEN FinalData.EmpNo != DA.EmpNo THEN 1 ELSE FinalData.WeekNo + (CASE WHEN (FinalData.ShiftCode = 'WO' AND DA.ShiftCode != 'WO') THEN 1 ELSE 0 END) END)
FROM FinalData
INNER JOIN #RawData DA ON DA.RowID = FinalData.RowID + 1
)
SELECT EmpNo, MIN(AttendanceDate) AS StartDate, MAX(AttendanceDate) AS EndDate, WeekNo
FROM FinalData
GROUP BY EmpNo, WeekNo
ORDER BY EmpNo, WeekNo
OPTION (MAXRECURSION 0)
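For readers on SQL Server 2012 or later (the question targets 2008, where LAG and ordered window aggregates are not available), the same week numbering can be sketched without recursion: flag each working day that directly follows a weekly off, then take a running sum of the flags per employee. The #Attendance table and its rows below are hypothetical stand-ins for the real table.

```sql
-- Sketch only (SQL Server 2012+). #Attendance is a hypothetical stand-in table.
CREATE TABLE #Attendance (EmpNo VARCHAR(10), AttendanceDate DATE, ShiftCode CHAR(2));

INSERT INTO #Attendance (EmpNo, AttendanceDate, ShiftCode)
VALUES ('10001', '2015-01-01', 'FS'),
       ('10001', '2015-01-02', 'WO'),
       ('10001', '2015-01-03', 'FS'),
       ('10001', '2015-01-04', 'FS'),
       ('10001', '2015-01-05', 'WO'),
       ('10001', '2015-01-06', 'WO'),
       ('10001', '2015-01-07', 'FS');

WITH Flagged AS (
    SELECT EmpNo, AttendanceDate,
           -- 1 when a working day directly follows a weekly off, i.e. a new week starts.
           IsWeekStart = CASE WHEN ShiftCode <> 'WO'
                               AND LAG(ShiftCode) OVER (PARTITION BY EmpNo
                                                        ORDER BY AttendanceDate) = 'WO'
                              THEN 1 ELSE 0 END
    FROM #Attendance
),
Numbered AS (
    SELECT EmpNo, AttendanceDate,
           -- Running count of week starts gives a week number per employee.
           WeekNo = 1 + SUM(IsWeekStart) OVER (PARTITION BY EmpNo
                                               ORDER BY AttendanceDate
                                               ROWS UNBOUNDED PRECEDING)
    FROM Flagged
)
SELECT EmpNo,
       WeekFrom = MIN(AttendanceDate),
       WeekTo   = MAX(AttendanceDate)
FROM Numbered
GROUP BY EmpNo, WeekNo
ORDER BY EmpNo, WeekFrom;
-- With the rows above, the weeks come out as
-- 2015-01-01..2015-01-02, 2015-01-03..2015-01-06, 2015-01-07..2015-01-07.

DROP TABLE #Attendance;
```

Because each row is read once per window function rather than joined row by row, this form avoids the MAXRECURSION setting entirely; how it performs on a given table would still need to be measured.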

How to query Open-high-low-close (OHLC) data from SQL Server

I'm trying to retrieve data for an open-high-low-close (OHLC) chart directly from the database; it's the kind of chart you see for stocks. Is this possible, and if so, how?
I have a table like this (simplified):
Date | Price | PriceType
A record is created for each day. I will report per month or year, not per day as is usual for stocks.
I would like to query something like this:
SELECT PriceType, MAX(Price) as High, MIN(Price) as Low, [Price of first item of month] as Open, [Price of last item of month] as Close GROUP BY PriceType, Year(Date), Month(Date)
I access SQL Server through LLBLGen, so an answer based on that technology would be great, but a generic SQL Server answer will do too!
It's SQL 2005, but 2008 is also an option.
Thanks.
This appears to work. There may well be a less verbose way to do it.
--create test data
CREATE TABLE #t
(priceDate DATETIME
,price MONEY
,priceType CHAR(1)
)
INSERT #t
SELECT '20090101',100,'A'
UNION SELECT '20090102',500,'A'
UNION SELECT '20090103',20 ,'A'
UNION SELECT '20090104',25 ,'A'
UNION SELECT '20090105',28 ,'A'
UNION SELECT '20090131',150,'A'
UNION SELECT '20090201',501,'A'
UNION SELECT '20090203',21 ,'A'
UNION SELECT '20090204',26 ,'A'
UNION SELECT '20090205',29 ,'A'
UNION SELECT '20090228',151,'A'
UNION SELECT '20090101',100,'B'
UNION SELECT '20090102',500,'B'
UNION SELECT '20090103',20 ,'B'
UNION SELECT '20090104',25 ,'B'
UNION SELECT '20090105',28 ,'B'
UNION SELECT '20090131',150,'B'
UNION SELECT '20090201',501,'B'
UNION SELECT '20090203',21 ,'B'
UNION SELECT '20090204',26 ,'B'
UNION SELECT '20090205',29 ,'B'
UNION SELECT '20090228',151,'B'
--query
;WITH rangeCTE
AS
(
SELECT MIN(priceDate) minDate
,MAX(priceDate) maxDate
FROM #t
)
,datelistCTE
AS
(
SELECT CAST(CONVERT(CHAR(6),minDate,112) + '01' AS DATETIME) AS monthStart
,DATEADD(mm,1,CAST(CONVERT(CHAR(6),minDate,112) + '01' AS DATETIME)) -1 AS monthEnd
,1 AS monthID
FROM rangeCTE
UNION ALL
SELECT DATEADD(mm,1,monthStart)
,DATEADD(mm,2,monthStart) - 1
,monthID + 1
FROM datelistCTE
WHERE monthStart <= (SELECT maxDate FROM rangeCTE)
)
,priceOrderCTE
AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY monthID, priceType
ORDER BY priceDate
) AS rn1
,ROW_NUMBER() OVER (PARTITION BY monthID, priceType
ORDER BY priceDate DESC
) AS rn2
,ROW_NUMBER() OVER (PARTITION BY monthID, priceType
ORDER BY price DESC
) AS rn3
,ROW_NUMBER() OVER (PARTITION BY monthID, priceType
ORDER BY price
) AS rn4
FROM datelistCTE AS d
JOIN #t AS t
ON t.priceDate BETWEEN d.monthStart AND d.monthEnd
WHERE monthStart <= (SELECT maxDate FROM rangeCTE)
)
SELECT o.MonthStart
,o.priceType
,o.Price AS opening
,c.price AS closing
,h.price AS high
,l.price AS low
FROM priceOrderCTE AS o
JOIN priceOrderCTE AS c
ON c.priceType = o.PriceType
AND c.monthID = o.MonthID
JOIN priceOrderCTE AS h
ON h.priceType = o.PriceType
AND h.monthID = o.MonthID
JOIN priceOrderCTE AS l
ON l.priceType = o.PriceType
AND l.monthID = o.MonthID
WHERE o.rn1 = 1
AND c.rn2 = 1
AND h.rn3 = 1
AND l.rn4 = 1
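One possibly less verbose alternative, offered as a sketch rather than a drop-in replacement: rank the rows within each month and price type once, then pick the open and close with conditional aggregation while MAX and MIN supply the high and low. It should run on SQL Server 2005 and later. The #prices table is a hypothetical cut-down copy of the test data above, and the sketch assumes at most one row per priceType per priceDate (with ties, the open or close row is chosen arbitrarily).

```sql
-- Sketch only. #prices is a hypothetical, smaller copy of the answer's #t.
CREATE TABLE #prices (priceDate DATETIME, price MONEY, priceType CHAR(1));

INSERT #prices
SELECT '20090101', 100, 'A'
UNION ALL SELECT '20090102', 500, 'A'
UNION ALL SELECT '20090103', 20,  'A'
UNION ALL SELECT '20090131', 150, 'A'
UNION ALL SELECT '20090201', 501, 'A'
UNION ALL SELECT '20090203', 21,  'A'
UNION ALL SELECT '20090228', 151, 'A';

WITH ranked AS
(
    SELECT priceType
          ,price
          -- First day of the month the row falls in.
          ,DATEADD(month, DATEDIFF(month, 0, priceDate), 0) AS monthStart
          -- rnFirst = 1 marks the earliest row of the month, rnLast = 1 the latest.
          ,ROW_NUMBER() OVER (PARTITION BY priceType, DATEDIFF(month, 0, priceDate)
                              ORDER BY priceDate)      AS rnFirst
          ,ROW_NUMBER() OVER (PARTITION BY priceType, DATEDIFF(month, 0, priceDate)
                              ORDER BY priceDate DESC) AS rnLast
    FROM #prices
)
SELECT monthStart
      ,priceType
      ,MAX(CASE WHEN rnFirst = 1 THEN price END) AS opening
      ,MAX(CASE WHEN rnLast  = 1 THEN price END) AS closing
      ,MAX(price)                                AS high
      ,MIN(price)                                AS low
FROM ranked
GROUP BY priceType, monthStart
ORDER BY monthStart, priceType;
-- January 2009:  opening 100, closing 150, high 500, low 20
-- February 2009: opening 501, closing 151, high 501, low 21

DROP TABLE #prices;
```

To report per year instead, the same pattern applies with `year` in place of `month` in the DATEDIFF and DATEADD calls.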
This is a little query I wrote that seems to work nicely for one time span at a time. To change the time span, comment out the DATEPART columns that are finer than the span you want (in the inner SELECT, its GROUP BY, and the matching tmp columns in the outer SELECT). Alternatively, you could create one view per time span. The underlying table holds bid/ask tick-style data; if you are using mid or last prices, you can drop the CASE expressions from the SELECT lists.
Select
tmp.num,
rf.CurveName,
rf.Period as Period,
CASE WHEN (tmp2.Bid is null or tmp2.Ask is null) then isnull(tmp2.Bid,0)+isnull(tmp2.Ask,0) else (tmp2.Bid+tmp2.Ask)/2 end as [Open],
tmp.Hi,
tmp.Lo,
CASE WHEN (rf.Bid is null or Rf.Ask is null) then isnull(rf.Bid,0)+isnull(rf.Ask,0) else (rf.Bid+rf.Ask)/2 end as [Close],
tmp.OpenDate,
tmp.CloseDate,
tmp.yr,
tmp.mth,
tmp.wk,
tmp.dy,
tmp.hr
from BidAsk rf inner join
(SELECT count(CurveName)as num,CurveName,
Period,
max(CASE WHEN (Bid is null or Ask is null) then isnull(Bid,0)+isnull(Ask,0) else (Bid+Ask)/2 end) as Hi,
min(CASE WHEN (Bid is null or Ask is null) then isnull(Bid,0)+isnull(Ask,0) else (Bid+Ask)/2 end) as Lo,
max(CurveDateTime) as CloseDate, min(CurveDateTime) as OpenDate,
DATEPART(year, CurveDateTime) As yr,
DATEPART(month, CurveDateTime) As mth,
DATEPART(week, CurveDateTime) As wk,
DATEPART(Day, CurveDateTime) as dy,
DATEPART(Hour, CurveDateTime) as hr
--DATEPART(minute, CurveDateTime) as mnt
FROM
BidAsk
GROUP BY
CurveName,Period,
DATEPART(year, CurveDateTime),
DATEPART(month, CurveDateTime),
DATEPART(week, CurveDateTime),
DATEPART(Day, CurveDateTime) ,
DATEPART(Hour, CurveDateTime)
--DATEPART(minute, CurveDateTime)
) tmp on
tmp.CurveName=rf.CurveName and
tmp.CloseDate=rf.CurveDateTime and
tmp.Period=rf.Period
inner join BidAsk tmp2 on
tmp2.CurveName=rf.CurveName and
tmp2.CurveDateTime=tmp.Opendate and
tmp2.Period=rf.Period
ORDER BY
CurveName,Period,tmp.yr,tmp.mth
--DATEPART(year, CurveDateTime),
--DATEPART(month, CurveDateTime)
--DATEPART(day, CurveDateTime),
--DATEPART(Hour, CurveDateTime),
--DATEPART(minute, CurveDateTime) )

Resources