Find the n highest consecutive values in a set of rows

Find the n highest consecutive values in a set of rows - sql-server

I have some data in a table as follows:
FileDate SumAmount
20150401 90.99
20150401 313
20150403 481.2
20150404 321.27
20150405 103
20150406 25
20150407 180.5
20150408 319.91
20150409 688
20150411 69
20150412 65
20150413 322
20150414 100
20150415 111.97
20150416 979.15
20150417 655.4
20150418 124
20150419 30
20150420 457
20150421 192.6
20150422 191.96
20150423 220
20150424 252.5
20150425 109.1
20150426 135.25
20150427 648.08
20150428 692
20150429 410.99
20150430 170
20150501 166.19
20150502 92
20150503 100
20150504 59
20150505 124.01
20150506 44.5
20150507 331.64
20150508 299.8
I am trying to devise a query that finds the 4 consecutive days with the highest total value in the data.
Essentially, I think I need to partition by date and perform a row numbering over it but I can't seem to get the syntax right to evaluate the values.

I use -3 in the join conditions because the day itself counts as one of the four. I also use the day of year (DY) so that only consecutive days are matched and I don't have to rank the dates manually. Let me know what you think. Hope this helps!
DECLARE @yourTable TABLE(FileDate DATE ,SumAmount FLOAT);
INSERT INTO @yourTable
VALUES ('20150401',90.99),
('20150402',313),
('20150403',481.2),
('20150404',321.27),
('20150405',103),
('20150406',25),
('20150407',180.5),
('20150408',319.91),
('20150409',688),
('20150411',69),
('20150412',65),
('20150413',322),
('20150414',100),
('20150415',111.97),
('20150416',979.15),
('20150417',655.4),
('20150418',124),
('20150419',30),
('20150420',457),
('20150421',192.6),
('20150422',191.96),
('20150423',220),
('20150424',252.5),
('20150425',109.1),
('20150426',135.25),
('20150427',648.08),
('20150428',692),
('20150429',410.99),
('20150430',170),
('20150501',166.19),
('20150502',92),
('20150503',100),
('20150504',59),
('20150505',124.01),
('20150506',44.5),
('20150507',331.64),
('20150508',299.8);
WITH CTE
AS
(
SELECT YEAR(FileDate) yr,DATEPART(DY,FileDate) dy,fileDate,SumAmount
FROM @yourTable
),
CTE_Max_Sum
AS
(
SELECT TOP 1 A.yr,A.dy,A.FileDate,SUM(B.SumAmount) consec4DaySum
FROM CTE A
INNER JOIN CTE B
ON B.dy BETWEEN A.dy - 3 AND A.dy
AND A.yr = B.yr
GROUP BY A.yr,A.dy,A.FileDate
ORDER BY SUM(B.SumAmount) DESC
)
SELECT A.*,B.consec4DaySum
FROM CTE A
INNER JOIN CTE_Max_Sum B
ON A.dy BETWEEN B.dy - 3 AND B.dy
AND A.yr = B.yr
Results:
yr dy fileDate SumAmount consec4DaySum
----------- ----------- ---------- ---------------------- ----------------------
2015 117 2015-04-27 648.08 1921.07
2015 118 2015-04-28 692 1921.07
2015 119 2015-04-29 410.99 1921.07
2015 120 2015-04-30 170 1921.07
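For comparison, on SQL Server 2012 or later the same result can also be sketched with window functions instead of a self-join. This is a different technique from the query above, not a rewrite of it. The table variable @t and its sample rows are assumptions made for illustration, and the sketch assumes there is at most one row per FileDate.

```sql
-- Hypothetical sample data standing in for the question's table.
DECLARE @t TABLE (FileDate DATE, SumAmount DECIMAL(10,2));
INSERT INTO @t VALUES
    ('20150427', 648.08), ('20150428', 692), ('20150429', 410.99),
    ('20150430', 170),    ('20150501', 166.19), ('20150503', 100);

WITH Windowed AS
(
    SELECT FileDate,
           -- running total over the current row and the three rows before it
           SUM(SumAmount) OVER (ORDER BY FileDate
                                ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS FourDaySum,
           -- date of the row three positions earlier (NULL for the first three rows)
           LAG(FileDate, 3) OVER (ORDER BY FileDate) AS WindowStart
    FROM @t
)
SELECT TOP (1) WindowStart, FileDate AS WindowEnd, FourDaySum
FROM Windowed
-- four rows spanning exactly three days means no date is missing in between
WHERE DATEDIFF(DAY, WindowStart, FileDate) = 3
ORDER BY FourDaySum DESC;
```

With these sample rows the window from 2015-04-27 to 2015-04-30 wins with 1921.07. If a date can appear more than once, the amounts would need to be summed per date before applying the window.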

You can use a recursive CTE for that, joining every row with the three rows that follow it (day-wise) and summing up. The Fiddle sadly does not work for me, but the query runs on my SQL Server and should work for you. Watch out for the recursion depth: without WHERE cte.Consecutive < 4 you quickly run into an error.
WITH cte (StartDate, EndDate, Consecutive, SumAmount)
AS (
SELECT t.FileDate, t.FileDate, 1, t.SumAmount FROM dbo.table30194903 t
UNION ALL
SELECT cte.StartDate, t.FileDate, cte.Consecutive + 1, cte.SumAmount + t.SumAmount
FROM dbo.table30194903 t INNER JOIN cte ON DATEADD(DAY, 1, cte.EndDate) = t.FileDate
WHERE cte.Consecutive < 4
)
SELECT *
FROM cte
WHERE cte.Consecutive = 4
ORDER BY cte.SumAmount DESC
EDIT: I had two errors in my query: it summed up the wrong rows and showed the last day in the series.

I would also like to add an answer using a subquery; however, it takes more time than my CTE.
SELECT t.FileDate, SUM(s.SumAmount)
FROM dbo.table30194903 t
LEFT JOIN dbo.table30194903 s ON t.FileDate <= s.FileDate AND DATEDIFF(DAY, t.FileDate, s.FileDate) < 4
GROUP BY t.FileDate
HAVING COUNT(s.SumAmount) = 4
ORDER BY SUM(s.SumAmount) DESC

I think the simplest way is to use APPLY to sum the records in the n days starting at each row and count the distinct dates among them, then keep only the rows where that count is n, which ensures the days are consecutive. You can then order by the sum and select the top 1:
DECLARE @n INT = 4;
SELECT TOP 1
    FirstDate = t.FileDate,
    FourDaySum = t2.Amount
FROM dbo.T AS t
CROSS APPLY
(   SELECT Amount = SUM(t2.SumAmount),
           Dates = COUNT(DISTINCT t2.FileDate)
    FROM dbo.T AS t2
    WHERE t2.FileDate >= t.FileDate
    AND t2.FileDate < DATEADD(DAY, @n, t.FileDate)
) AS t2
WHERE t2.Dates = @n
ORDER BY t2.Amount DESC;

How about a simple WHILE loop that sums the values over a range of dates?
DECLARE @startingDate DATETIME, @searchDate DATETIME;
-- FLOAT rather than INT so the decimal amounts are not truncated
DECLARE @maxSoFar FLOAT, @sum FLOAT, @daysRange INT;
SET @startingDate = CONVERT(datetime, '20150401', 112);
SET @searchDate = @startingDate;
SET @daysRange = 3; -- start day plus 3 more days = a 4-day inclusive window
SET @maxSoFar = 0;
WHILE GETDATE() > @searchDate
BEGIN
    --PRINT @searchDate
    --PRINT DATEADD(DAY,@daysRange,@searchDate)
    SELECT @sum = SUM(SumAmount) FROM MyTable WHERE FileDate >= @searchDate AND FileDate <= DATEADD(DAY,@daysRange,@searchDate)
    IF @sum > @maxSoFar
    BEGIN
        SET @maxSoFar = @sum;
    END
    SET @searchDate = DATEADD(DAY,1,@searchDate)
END

Related

Is it possible to use the SQL DATEADD function but exclude dates from a table in the calculation?

Is it possible to use the DATEADD function but exclude dates from a table?
We already have a table with all dates we need to exclude. Basically, I need to add number of days to a date but exclude dates within a table.
Example: Add 5 days to 01/08/2021. Dates 03/08/2021 and 04/08/2021 exist in the exclusion table. So, resultant date should be: 08/08/2021.
Thank you

A bit of a "wonky" solution, but it works. Firstly we use a tally to create a Calendar table of dates, that exclude your dates in the table, then we get the nth row, where n is the number of days to add:
DECLARE @DaysToAdd int = 5,
        @StartDate date = '20210801';
WITH N AS(
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
    SELECT 0 AS I
    UNION ALL
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
    FROM N N1, N N2, N N3), --Up to 1,000
Calendar AS(
    SELECT DATEADD(DAY,T.I, @StartDate) AS D,
           ROW_NUMBER() OVER (ORDER BY T.I) AS I
    FROM Tally T
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.DatesTable DT
                      WHERE DT.YourDate = DATEADD(DAY,T.I, @StartDate)))
SELECT D
FROM Calendar
WHERE I = @DaysToAdd+1;

The best solution is probably a calendar table.
But if you're willing to traverse every date, then a recursive CTE can work. It requires tracking the total number of iterations plus another column that is only incremented when the traversed date is not in the exclusion table. The exit condition uses that second column.
An example dataset would be:
CREATE TABLE mytable(mydate date); INSERT INTO mytable VALUES ('20210803'), ('20210804');
And an example function, run in its own batch:
CREATE FUNCTION dbo.fn_getDays (@mydate date, @daysadd int)
RETURNS date
AS
BEGIN
    DECLARE @newdate date;
    WITH CTE(num, diff, mydate) AS (
        SELECT 0 AS [num]
              ,0 AS [diff]
              ,DATEADD(DAY, 0, @mydate) [mydate]
        UNION ALL
        SELECT num + 1 AS [num]
              ,CTE.diff +
               CASE WHEN DATEADD(DAY, num+1, @mydate) IN (SELECT mydate FROM mytable)
                    THEN 0 ELSE 1 END
               AS [diff]
              ,DATEADD(DAY, num+1, @mydate) [mydate]
        FROM CTE
        WHERE (CTE.diff +
               CASE WHEN DATEADD(DAY, num+1, @mydate) IN (SELECT mydate FROM mytable)
                    THEN 0 ELSE 1 END) <= @daysadd
    )
    SELECT @newdate = (SELECT MAX(mydate) AS [mydate] FROM CTE);
    RETURN @newdate;
END
Running the function:
SELECT dbo.fn_getDays('20210801', 5)
Produces output, which is the MAX(mydate) from the function:
----------
2021-08-08
For reference the MAX(mydate) is taken from this dataset:
n diff mydate
----------- ----------- ----------
0 0 2021-08-01
1 1 2021-08-02
2 1 2021-08-03
3 1 2021-08-04
4 2 2021-08-05
5 3 2021-08-06
6 4 2021-08-07
7 5 2021-08-08
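The calendar-table option mentioned at the start of this answer could be sketched as follows. The table variable @WorkingDays is hypothetical: it is assumed to hold one row per date that is not excluded. OFFSET ... FETCH needs SQL Server 2012 or later.

```sql
-- Hypothetical calendar: every date EXCEPT the excluded ones (03 and 04 August).
DECLARE @WorkingDays TABLE (D date PRIMARY KEY);
INSERT INTO @WorkingDays (D) VALUES
    ('20210801'), ('20210802'), ('20210805'), ('20210806'),
    ('20210807'), ('20210808'), ('20210809');

DECLARE @StartDate date = '20210801',
        @DaysToAdd int = 5;

-- Skip @DaysToAdd non-excluded dates (the start date itself is the first one
-- skipped) and return the next one.
SELECT D
FROM @WorkingDays
WHERE D >= @StartDate
ORDER BY D
OFFSET @DaysToAdd ROWS FETCH NEXT 1 ROW ONLY;
```

For the example in the question this returns 2021-08-08. The sketch assumes the start date itself is not an excluded date.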

You can use the IN clause.
To perform the test, I used a W3Schools Test DB
SELECT DATE_ADD(BirthDate, INTERVAL 10 DAY) FROM Employees WHERE FirstName NOT IN (Select FirstName FROM Employees WHERE FirstName LIKE 'N%')
This query shows all the birth dates + 10 days except for the only employee with name starting with N (Nancy)

How to get a derive 'N' Date Rows from a single record with From / To date columns?

Title sounds confusing but let me please explain:
I have a table that has two columns that provide a date range, and one column that provides a value. I need to query that table and "detail" the data such as this
Is it possible to do only using TSQL?
Additional Info
The table in question is about 2-3million records long (and growing)

Assuming the range of dates is fairly narrow, an alternative is to use a recursive CTE to create a list of all dates in the range and then join to it:
WITH LastDay AS
(
SELECT MAX(Date_To) AS MaxDate
FROM MyTable
),
Days AS
(
SELECT MIN(Date_From) AS TheDate
FROM MyTable
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS JOIN LastDay
WHERE TheDate <= LastDay.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To;
I've also assumed that Date_From and Date_To represent an inclusive range (i.e. both edges are included), although it is unusual to use an inclusive BETWEEN on dates.
Edit
The default MAXRECURSION for a recursive CTE in SQL Server is 100, which limits the date range in the query to a span of 100 days. You can raise this to a maximum of 32767, or set it to 0 to remove the limit.
Also, if you are filtering just a smaller range of dates in your large table, you can adjust the CTE to limit the number of days in the range:
WITH DateRange AS
(
SELECT CAST('2014-01-01' AS DATE) AS MinDate,
CAST('2014-02-16' AS DATE) AS MaxDate
),
Days AS
(
SELECT MinDate AS TheDate
FROM DateRange
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS APPLY DateRange
WHERE TheDate <= DateRange.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To
OPTION (MAXRECURSION 0);

This can be achieved using Cursors.
I've simulated the test data provided, created another table named "DesiredTable" to store the results, and written the following cursor, which does what you are looking for:
SET NOCOUNT ON;
DECLARE @ITEM_ID int, @COST_OF_ITEM Money,
        @DATE_FROM date, @DATE_TO date;
DECLARE @DateDiff INT; -- holds number of days between from & to columns
DECLARE @counter INT = 0; -- for loop counter
PRINT '-------- Begin the Date Expanding Cursor --------';
-- defining the cursor target statement
DECLARE Date_Expanding_Cursor CURSOR FOR
SELECT [ITEM_ID]
      ,[COST_OF_ITEM]
      ,[DATE_FROM]
      ,[DATE_TO]
FROM [dbo].[OriginalTable]
-- opening the cursor
OPEN Date_Expanding_Cursor
-- fetching next row data into the declared variables
FETCH NEXT FROM Date_Expanding_Cursor
INTO @ITEM_ID, @COST_OF_ITEM, @DATE_FROM, @DATE_TO
-- if next row is found
WHILE @@FETCH_STATUS = 0
BEGIN
    -- calculate the number of days in between the date columns
    SELECT @DateDiff = DATEDIFF(day,@DATE_FROM,@DATE_TO)
    -- reset the counter to 0 for the next loop
    SET @counter = 0;
    WHILE @counter <= @DateDiff
    BEGIN
        -- inserting rows inside the new table
        INSERT INTO DesiredTable
        VALUES (@COST_OF_ITEM, DATEADD(day,@counter,@DATE_FROM))
        SET @counter = @counter + 1
    END
    -- fetching next row
    FETCH NEXT FROM Date_Expanding_Cursor
    INTO @ITEM_ID, @COST_OF_ITEM, @DATE_FROM, @DATE_TO
END
-- cleanup code
CLOSE Date_Expanding_Cursor;
DEALLOCATE Date_Expanding_Cursor;
The code fetches every row from your original table, then it calculates the number of days between DATE_FROM and DATE_TO columns, then using this number the script will create identical rows to be inserted inside the new table DesiredTable.
give it a try and let me know of the results.

You can generate an increment table and join it to your date From:
Query:
With inc(n) as (
Select ROW_NUMBER() over (order by (select 1)) -1 From (
Select 1 From (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x1(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x2(n)
) as x(n)
)
Select item_id, cost, DATEADD(day, n, dateFrom), n From @dates d
Inner Join inc i on n <= DATEDIFF(day, dateFrom, dateTo)
Order by item_id
Output:
item_id cost Date n
1 100 2014-01-01 00:00:00.000 0
1 100 2014-01-02 00:00:00.000 1
1 100 2014-01-03 00:00:00.000 2
2 105 2014-01-08 00:00:00.000 2
2 105 2014-01-07 00:00:00.000 1
2 105 2014-01-06 00:00:00.000 0
2 105 2014-01-09 00:00:00.000 3
3 102 2014-02-14 00:00:00.000 3
3 102 2014-02-15 00:00:00.000 4
3 102 2014-02-16 00:00:00.000 5
3 102 2014-02-11 00:00:00.000 0
3 102 2014-02-12 00:00:00.000 1
3 102 2014-02-13 00:00:00.000 2
Sample Data:
declare @dates table(item_id int, cost int, dateFrom datetime, dateTo datetime);
insert into @dates(item_id, cost, dateFrom, dateTo) values
(1, 100, '20140101', '20140103')
, (2, 105, '20140106', '20140109')
, (3, 102, '20140211', '20140216');

Yet another way is to create and maintain a calendar table containing all dates for many years (in our app we have a table covering 30 years or so, which we extend every year). Then you can just join to the calendar:
select <whatever you need>, calendar.day
from <your tables> inner join calendar on calendar.day between <min date> and <max date>
This approach also lets you include additional information (holidays etc.) in the calendar table, which is sometimes very helpful.
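A concrete version of that template might look like the sketch below. The table variables @Calendar and @Items, their columns, and the sample rows are assumptions made for illustration; a real calendar table would be permanent and cover many years.

```sql
-- Hypothetical calendar covering 60 days from 2014-01-01.
DECLARE @Calendar TABLE ([day] date PRIMARY KEY);
INSERT INTO @Calendar ([day])
SELECT DATEADD(DAY, nums.n, '20140101')
FROM (SELECT TOP (60) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
      FROM sys.all_objects) AS nums;

-- Hypothetical source rows, each with an inclusive date range.
DECLARE @Items TABLE (Item_ID int, Cost_Of_Item money, Date_From date, Date_To date);
INSERT INTO @Items VALUES
    (1, 100, '20140101', '20140103'),
    (2, 105, '20140106', '20140109');

-- One output row per item per calendar day inside its range.
SELECT i.Item_ID, i.Cost_Of_Item, c.[day]
FROM @Items AS i
INNER JOIN @Calendar AS c
    ON c.[day] BETWEEN i.Date_From AND i.Date_To
ORDER BY i.Item_ID, c.[day];
```

With these sample rows the query returns three rows for item 1 and four rows for item 2.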

Conditional counting based on comparison to previous row sql

Let's start with a sample of the data I'm working with:
Policy No | start date
1 | 2/15/2006
1 | 2/15/2009
1 | 2/15/2012
2 | 3/15/2006
3 | 3/19/2006
3 | 3/19/2012
4 | 3/31/2006
4 | 3/31/2009
I'm trying to write code in SQL Server 2008 that counts a few things. The principle is that the policyholder's earliest start date is when the policy began. Every three years an increase is offered to the client. If they agree to the increase, the start date is refreshed with the same date as the original, three years later. If they decline, nothing is added to the database at all.
I'm trying to not only count the number of times a customer accepted the offer (or increased the start date by three years), but separate it out by first offer or second offer. Taking the original start date and dividing the number of days between now and then by 1095 gets me the total number of offers, so I've gotten that far. What I really want it to do is compare each policy number to the one before it to see if it's the same (it's already ordered by policy number), then count the date change in a new "accepted" column and count the times it didn't change but could have as "declined".
Is this a case where I would need to self-join the table to itself to compare the dates? Or is there an easier way?

Are you looking for something like this?
Set Nocount On;
Declare @Test Table
(
PolicyNo Int
,StartDate Date
)
Declare @PolicyWithInc Table
(
RowId Int Identity(1,1) Primary Key
,PolicyNo Int
,StartDate Date
)
Insert Into @Test(PolicyNo,StartDate) Values
(1,'2/15/2006')
,(1,'2/15/2009')
,(1,'2/15/2012')
,(2,'3/15/2006')
,(3,'3/19/2006')
,(3,'3/19/2012')
,(4,'3/31/2006')
,(4,'3/31/2009')
-- Order the insert so that RowId follows PolicyNo and StartDate
Insert Into @PolicyWithInc(PolicyNo,StartDate)
Select t.PolicyNo
,t.StartDate
From @Test As t
Order By t.PolicyNo, t.StartDate
Select pw.PolicyNo
,Sum(Case When Datediff(Year,t.StartDate, pw.StartDate) = 3 Then 1 Else 0 End) As DateArrived
,Sum(Case When Datediff(Year,t.StartDate, pw.StartDate) > 3 Then 1 Else 0 End) As DateNotArrived
,Sum(Case When Isnull(Datediff(Year,t.StartDate,pw.StartDate),0) = 3 Then 1 Else 0 End) As Years3IncrementCount
From @PolicyWithInc As pw
Left Join @PolicyWithInc As t On pw.PolicyNo = t.PolicyNo And pw.RowId = (t.RowId + 1)
Group By pw.PolicyNo

The following may help:
Set Nocount On;
Declare @Test Table
(
PolicyNo Int
,StartDate Date
)
Insert Into @Test(PolicyNo,StartDate) Values
(1,'2/15/2006')
,(1,'2/15/2009')
,(1,'2/15/2012')
,(2,'3/15/2006')
,(3,'3/19/2006')
,(3,'3/19/2012')
,(4,'3/31/2006')
,(4,'3/31/2009')
select PolicyNo, StartDate, dateadd(yy, 3, StartDate) Offer1, dateadd(yy, 6, StartDate) Offer2, dateadd(yy, 9, StartDate) Offer3 from
(select * , row_number() over (partition by PolicyNo order by StartDate) rn from @Test) A
where rn = 1
select
    count(*) * 3 TotalOffersMade,
    count(Data1.StartDate) FirstOfferAccepted,
    count(Data2.StartDate) SecondOfferAccepted,
    count(Data3.StartDate) ThirdOfferAccepted,
    count(*) - count(Data1.StartDate) FirstOfferDeclined,
    count(*) - count(Data2.StartDate) SecondOfferDeclined,
    count(*) - count(Data3.StartDate) ThirdOfferDeclined
from
(
    select PolicyNo, StartDate, dateadd(yy, 3, StartDate) Offer1, dateadd(yy, 6, StartDate) Offer2, dateadd(yy, 9, StartDate) Offer3 from
    (select * , row_number() over (partition by PolicyNo order by StartDate) rn from @Test) A
    where rn = 1
) Offers
LEFT JOIN
@Test Data1
on Offers.PolicyNo = Data1.PolicyNo and Offers.Offer1 = Data1.StartDate
LEFT JOIN
@Test Data2
on Offers.PolicyNo = Data2.PolicyNo and Offers.Offer2 = Data2.StartDate
LEFT JOIN
@Test Data3
on Offers.PolicyNo = Data3.PolicyNo and Offers.Offer3 = Data3.StartDate

Set based solution for processing rows in a SQL table

Can someone steer me in the right direction for solving this issue with a set-based solution versus cursor-based?
Given a table with the following rows:
Date Value
2013-11-01 12
2013-11-12 15
2013-11-21 13
2013-12-01 0
I need a query that will give me a row for each date between 2013-11-1 and 2013-12-1, as follows:
2013-11-01 12
2013-11-02 12
2013-11-03 12
...
2013-11-12 15
2013-11-13 15
2013-11-14 15
...
2013-11-21 13
2013-11-22 13
...
2013-11-30 13
2013-11-31 13
Any advice and/or direction will be appreciated.

The first thing that came to my mind was to fill in the missing dates by looking at the day of the year. You can do this by joining to the spt_values table in the master DB and adding the number to the first day of the year.
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
SELECT
DateAdd(D, v.number, MinDate) Date
FROM (SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
Next I would wrap that to make a derived table, and add a subquery to get the most recent number. Your end result may look something like:
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
-- Uncomment the following line to see how it behaves when the date range spans a year end
--UPDATE @Table SET ADate = DateAdd(d, 45, ADate)
SELECT
AllDates.Date
,(SELECT TOP 1 ANumber FROM @Table t WHERE t.ADate <= AllDates.Date ORDER BY ADate DESC)
FROM (
SELECT
DateAdd(D, v.number, MinDate) Date
FROM
(SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
) AllDates

Another solution, not sure how it compares to the two already posted performance wise but it's a bit more concise:
Uses a numbers table:
Query:
DECLARE @SDATE DATETIME
DECLARE @EDATE DATETIME
DECLARE @DAYS INT
SET @SDATE = '2013-11-01'
SET @EDATE = '2013-11-29'
SET @DAYS = DATEDIFF(DAY,@SDATE, @EDATE)
SELECT Num, DATEADD(DAY,N.Num,@SDATE), SUB.[Value]
FROM Numbers N
LEFT JOIN MyTable M ON DATEADD(DAY,N.Num,@SDATE) = M.[Date]
CROSS APPLY (SELECT TOP 1 [Value]
             FROM MyTable M2
             WHERE [Date] <= DATEADD(DAY,N.Num,@SDATE)
             ORDER BY [Date] DESC) SUB
WHERE N.Num <= @DAYS

It's possible, but neither pretty nor very performant at scale:
In addition to your_table, you'll need to create a second table/view dates containing every date you'd ever like to appear in the output of this query. For your example it would need to contain at least 2013-11-01 through 2013-12-01.
SELECT m.date, y.value
FROM your_table y
INNER JOIN (
SELECT md.date, MAX(my.date) AS max_date
FROM dates md
INNER JOIN your_table my ON md.date >= my.date
GROUP BY md.date
) m
ON y.date = m.max_date

SQL Query to return 24 hour, hourly count even when no values exist?

I've written a query that groups the number of rows per hour, based on a given date range.
SELECT CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108) as TDate,
COUNT(TransactionID) AS TotalHourlyTransactions
FROM MyTransactions WITH (NOLOCK)
WHERE TransactionTime BETWEEN CAST(@StartDate AS SMALLDATETIME) AND CAST(@EndDate AS SMALLDATETIME)
AND TerminalId = @TerminalID
GROUP BY CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108)
ORDER BY TDate ASC
Which displays something like this:
02/11/20 07 4
02/11/20 10 1
02/11/20 12 4
02/11/20 13 1
02/11/20 14 2
02/11/20 16 3
Giving the number of transactions and the given hour of the day.
How can I display all hours of the day - from 0 to 23, and show 0 for those which have no values?
Thanks.
UPDATE
Using the tvf below works for me for one day, however I'm not sure how to make it work for a date range.
Using the temp table of 24 hours:
-- temp table to store hours of the day
DECLARE @tmp_Hours TABLE ( WhichHour SMALLINT )
DECLARE @counter SMALLINT
SET @counter = -1
WHILE @counter < 23
BEGIN
    SET @counter = @counter + 1
    --print
    INSERT INTO @tmp_Hours
    ( WhichHour )
    VALUES ( @counter )
END
SELECT MIN(CONVERT(VARCHAR(10),[dbo].[TerminalTransactions].[TransactionTime],101)) AS TDate, [@tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108) AS TheHour,
COUNT([dbo].[TerminalTransactions].[TransactionId]) AS TotalTransactions,
ISNULL(SUM([dbo].[TerminalTransactions].[TransactionAmount]), 0) AS TransactionSum
FROM [dbo].[TerminalTransactions] RIGHT JOIN @tmp_Hours ON [@tmp_Hours].[WhichHour] = CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108)
GROUP BY [@tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108), COALESCE([dbo].[TerminalTransactions].[TransactionAmount], 0)
Gives me a result of:
TDate WhichHour TheHour TotalTransactions TransactionSum
---------- --------- ------- ----------------- ---------------------
02/16/2010 0 00 4 40.00
NULL 1 NULL 0 0.00
02/14/2010 2 02 1 10.00
NULL 3 NULL 0 0.00
02/14/2010 4 04 28 280.00
02/14/2010 5 05 11 110.00
NULL 6 NULL 0 0.00
02/11/2010 7 07 4 40.00
NULL 8 NULL 0 0.00
02/24/2010 9 09 2 20.00
So how can I get this to group properly?
The other issue is that for some days there will be no transactions, and these days also need to appear.
Thanks.

You do this by first building a table of the 24 hours, then doing an outer join against the transactions table. For this purpose I use a table-valued function:
create function tvfGetDay24Hours(@date datetime)
returns table
as return (
    select dateadd(hour, number, cast(floor(cast(@date as float)) as datetime)) as StartHour
         , dateadd(hour, number+1, cast(floor(cast(@date as float)) as datetime)) as EndHour
    from master.dbo.spt_values
    where number < 24 and type = 'p');
Then I can use the TVF in queries that need 'per-hour' data, even for intervals that are missing from the data:
select h.StartHour, t.TotalHourlyTransactions
from tvfGetDay24Hours(@StartDate) as h
outer apply (
    SELECT
    COUNT(TransactionID) AS TotalHourlyTransactions
    FROM MyTransactions
    WHERE TransactionTime BETWEEN h.StartHour and h.EndHour
    AND TerminalId = @TerminalID) as t
order by h.StartHour
Updated
Example of a TVF that returns the hours between any two arbitrary dates:
create function tvfGetAnyDayHours(@dateFrom datetime, @dateTo datetime)
returns table
as return (
    select dateadd(hour, number, cast(floor(cast(@dateFrom as float)) as datetime)) as StartHour
         , dateadd(hour, number+1, cast(floor(cast(@dateFrom as float)) as datetime)) as EndHour
    from master.dbo.spt_values
    where type = 'p'
    and number < datediff(hour,@dateFrom, @dateTo) + 24);
Note that since master.dbo.spt_values contains only 2048 numbers, the function will not work between dates further apart than 2048 hours.

You have just discovered the value of a NUMBERS table. You need to create a table with a single column containing the numbers 0 to 23. Then you join against this table using an OUTER join to ensure you always get 24 rows returned.
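A minimal sketch of that idea for a single day is shown below. The table variables @Hours and @Tx, the sample transactions, and the chosen day are assumptions made for illustration; in practice the numbers table would be permanent and @Tx would be the real transactions table.

```sql
-- Numbers table with the hours 0 to 23.
DECLARE @Hours TABLE (HourOfDay tinyint PRIMARY KEY);
INSERT INTO @Hours (HourOfDay)
SELECT TOP (24) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
FROM sys.all_objects;

-- Hypothetical transactions.
DECLARE @Tx TABLE (TransactionID int, TransactionTime datetime);
INSERT INTO @Tx VALUES
    (1, '2010-02-11T07:15:00'),
    (2, '2010-02-11T07:45:00'),
    (3, '2010-02-11T10:05:00');

-- The date filter lives in the ON clause so hours without rows survive the join.
SELECT h.HourOfDay,
       COUNT(t.TransactionID) AS TotalHourlyTransactions
FROM @Hours AS h
LEFT JOIN @Tx AS t
    ON DATEPART(HOUR, t.TransactionTime) = h.HourOfDay
   AND t.TransactionTime >= '20100211'
   AND t.TransactionTime <  '20100212'
GROUP BY h.HourOfDay
ORDER BY h.HourOfDay;
```

This returns 24 rows: hour 7 has a count of 2, hour 10 has a count of 1, and every other hour shows 0, because COUNT ignores the NULLs produced by the outer join.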

So, going back to Remus' original function, I've re-used it in a loop and stored the results in a table variable:
DECLARE @count INT
DECLARE @NumDays INT
DECLARE @StartDate DATETIME
DECLARE @EndDate DATETIME
DECLARE @CurrentDay DATE
DECLARE @tmp_Transactions TABLE
(
    StartHour DATETIME,
    TotalHourlyTransactions INT
)
SET @StartDate = '2000/02/10'
SET @EndDate = '2010/02/13'
SET @count = 0
SET @NumDays = DateDiff(Day, @StartDate, @EndDate)
WHILE @count < @NumDays
BEGIN
    SET @CurrentDay = DateAdd(Day, @count, @StartDate)
    INSERT INTO @tmp_Transactions (StartHour, TotalHourlyTransactions)
    SELECT h.StartHour ,
           t.TotalHourlyTransactions
    FROM tvfGetDay24Hours(@CurrentDay) AS h
    OUTER APPLY ( SELECT COUNT(TransactionID) AS TotalHourlyTransactions
                  FROM [dbo].[TerminalTransactions]
                  WHERE TransactionTime BETWEEN h.StartHour AND h.EndHour
                  AND TerminalId = 4
                ) AS t
    ORDER BY h.StartHour
    SET @count = @count + 1
END
SELECT *
FROM @tmp_Transactions

Group by DATEPART(hour, thetime). To show the hours with no values, you'd have to left join a table of hours against the grouped result and use COALESCE(transaction.amount, 0) for the missing ones.

I've run into a version of this problem before. The suggestion that worked best was to set up a table (temporary or not) with the hours of the day, then do an outer join to that table and group by DATEPART(hour, timeOfRecord).
I don't remember why, but probably due to lack of flexibility because of the need for the other table, I ended up using a method where I group by whatever datepart I want and order by the datetime, then loop through and fill any spaces that are skipped with a 0. This approach worked well for me because I'm not reliant on the database to do all my work for me, and it's also MUCH easier to write an automated test for it.

Step 1: Create a table or a CTE that generates a days-and-hours table, with an outer loop over the days and an inner loop over hours 0-23. It should have 3 columns: Date, Days, Hours.
Step 2: Write your main query so that it also has day and hour columns, and alias it so you can join to it. CTEs have to be declared above this main query, and any pivots should go inside the CTEs for it to work naturally.
Step 3: Select from the step 1 table and left join the main query to it
ON A.[DATE] = B.[DATE]
AND A.[HOUR] = B.[HOUR]
You can also add an ORDER BY on your date column, like
ORDER BY substring(CONVERT(VARCHAR(15), A.[DATE], 105),4,2)
Guidelines
This gives you rows for all days and hours, including the hours with no matches. To show zeros for those, use isnull([col1],0) as [col1].
You can now graph facts against days and hours.
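The steps above can be sketched in one statement for a date range. This is only an illustration: the table variable @Tx, the sample rows, and the date range are assumptions, and the days-and-hours grid is built with a cross join over generated numbers rather than with loops.

```sql
DECLARE @StartDate date = '20100211',
        @EndDate   date = '20100212';

-- Hypothetical transactions.
DECLARE @Tx TABLE (TransactionID int, TransactionTime datetime);
INSERT INTO @Tx VALUES
    (1, '2010-02-11T07:15:00'),
    (2, '2010-02-11T07:45:00'),
    (3, '2010-02-12T10:05:00');

WITH Nums AS
(   -- numbers 0..399; enough for 24 hours and for ranges of up to 400 days
    SELECT TOP (400) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
    FROM sys.all_objects
),
Slots AS
(   -- step 1: one row per hour of every day in the range
    SELECT DATEADD(HOUR, h.n, DATEADD(DAY, d.n, CAST(@StartDate AS datetime))) AS StartHour
    FROM Nums AS d
    CROSS JOIN Nums AS h
    WHERE d.n <= DATEDIFF(DAY, @StartDate, @EndDate)
      AND h.n < 24
)
-- steps 2 and 3: left join the facts to the grid and count per slot
SELECT s.StartHour,
       COUNT(t.TransactionID) AS TotalHourlyTransactions
FROM Slots AS s
LEFT JOIN @Tx AS t
    ON t.TransactionTime >= s.StartHour
   AND t.TransactionTime <  DATEADD(HOUR, 1, s.StartHour)
GROUP BY s.StartHour
ORDER BY s.StartHour;
```

For the two-day range above this returns 48 rows, with zero counts for every hour that has no transactions. The half-open comparison (>= start, < next hour) keeps a transaction that falls exactly on the hour from being counted in two slots.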
