How to calculate the average of a Time column in SQL - sql-server

I have an ID column, and a time column. I want to group the IDs by average time.
IDs: 1234, 1234, 5678, 5678
Times: 13:21, 19:55, 14:25, 15:04
select ID,
avg(cast(CONCAT(left(cast(Time as varchar),2),substring(cast(Time as varchar),4,2)) as int)*1.0)
It does return a result, but I don't believe the average to be correct as the average time can be outside of normal time constraints (aka the minutes can be > 59).

time stores a point in time, not a duration. What would you do for a duration longer than a day? You should instead store either the duration in seconds, minutes, what have you, and format it as hh:mm etc. when you want to display it. Or better yet, store a start date and end date, which is more complete information, and you can always derive the duration (in whatever format you like) from that.
Anyway, dealing with what you have, and assuming this table and sample data:
CREATE TABLE dbo.BadChoices
(
ID int,
DurationWithWrongType time(0)
);
INSERT dbo.BadChoices(ID, DurationWithWrongType) VALUES
(1234, '13:21'),
(1234, '19:55'),
(5678, '14:25'),
(5678, '15:04');
You could I suppose do:
SELECT ID, AvgDuration = CONVERT(DECIMAL(10,2),
AVG(DATEDIFF(MINUTE, '00:00', DurationWithWrongType)*1.0))
FROM dbo.BadChoices
GROUP BY ID;
Output:
ID
AvgDuration
1234
998.00
5678
884.50
Example db<>fiddle
If you want the display to be HH:MM, and you know for sure your durations will always be < 24 hours, you could do:
;WITH src AS
(
SELECT ID, AvgDuration = CONVERT(DECIMAL(10,2),
AVG(DATEDIFF(MINUTE, '00:00', DurationWithWrongType)*1.0))
FROM dbo.BadChoices
GROUP BY ID
)
SELECT ID, AvgDuration,
AvgDurHHMMSS = CONVERT(time(0), DATEADD(SECOND, AvgDuration*60, '00:00'))
FROM src;
Output:
ID
AvgDuration
AvgDurHHMMSS
1234
998.00
16:38:00
5678
884.50
14:44:30
Example db<>fiddle

We have to cast to datetime to be able to cast to float. We can then find the average and cast back to datetime and then back to time.
A second alternative is to convert the time into minutes, get the average and then use dateadd() and cast back to time
create table times(
t time);
insert into times values
('13:21'),
('19:55'),
('14:25'),
('15:04');
GO
4 rows affected
select
cast(
cast(
avg(
cast(
cast(t
as datetime)
as float)
)
as datetime)
as time)
from times
GO
| (No column name) |
| :--------------- |
| 15:41:15 |
select
cast(
dateadd(second,
avg(
DateDiff(second,0,t)
),
2000-01-01)
as time)
from times
GO
| (No column name) |
| :--------------- |
| 15:41:15 |
db<>fiddle here

Related

Summing over a numeric column with moving window of varying size in Snowflake

I have a sample dataset given as follows;
time | time_diff | amount
time1 | time1-time2 | 1000
time2 | time2-time3 | 2000
time3 | time3-time4 | 3000
time4 | time4-time5 | 4500
time5 | NULL | 1000
Quick explanation; first column gives time of transaction, second column gives difference with next row to get transaction interval(in hours), and third column gives money made in a particular transaction. We have sorted the data in ascending order using time column.
Some values are given as;
time | time_diff | amount
time1 | 2. | 1000
time2 | 3. | 2000
time3 | 1. | 3000
time4 | 19. | 4500
time5 | NULL | 1000
The goal is to find the total transaction for a given time, which occurred within 24 hours of that transaction. For example, the output for time1 shd be; 1000+2000+3000=6000. Because if we add the value at time4, the total time interval becomes 25, hence we omit the value of 4500 from the sum.
Example output:
time | amount
time1 | 6000
time2 | 9500
time3 | 7500
time4 | 4500
time5 | 1000
The concept of Mong window sum should work, in my knowledge, but here the width of the window is variable. Thats the challenge I am facing.Can I kindly get some help here?
You could ignore the time_diff column and use a theta self-join based on a timestamp range, like this:
WITH srctab AS ( SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:00:00') AS "time", 1000::INT AS "amount"
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:02:00'), 2000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:05:00'), 3000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:06:00'), 4500::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-16 00:01:00'), 1000::INT
)
SELECT t1."time", SUM(t2."amount") AS tot
FROM srctab t1
JOIN srctab t2 ON t2."time" BETWEEN t1."time" AND TIMESTAMPADD(HOUR, +24, t1."time")
GROUP BY t1."time"
ORDER BY t1."time";
Minor detail: if your second column gives the time difference with the next row then I'd say the first value should be 10500 (not 6000) because it's only your 5th transaction of 1000 which is more than 24 hours ahead... I'm guessing your actual timestamps are at 0, 2, 5, 6 and 25 hours?
Another option might be to use the sliding WINDOW function by tweaking your transactional data to include each hour.
It's perhaps an overkill but might be a useful technique.
Firstly generate a placeholder for each hour using the timestamps. I utilised time_slice to map each timestamp into nice hour blocks and generator with dateadd to back fill each hour putting a zero in where no transactions took place.
So now I can use the sliding window function knowing that I can safely choose the 23 preceding hours.
Copy|Paste|Run
WITH SRCTAB AS (
SELECT TO_TIMESTAMP_NTZ('2020-04-15 00:00:00') AS TRANS_TS, 1000::INT AS AMOUNT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 02:00:00'), 2000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 05:00:00'), 3000::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-15 06:00:00'), 4500::INT
UNION ALL SELECT TO_TIMESTAMP_NTZ('2020-04-16 01:00:00'), 1000::INT
)
SELECT
TRANS_TIME_HOUR
,SUM(AMOUNT) OVER ( ORDER BY TRANS_TIME_HOUR ROWS BETWEEN 23 PRECEDING AND 0 PRECEDING ) OVERKILL FROM (
SELECT
TRANS_TIME_HOUR,
SUM(AMOUNT) AMOUNT
FROM
(
SELECT
DATEADD(HOUR, NUMBER, TRANS_TS) TRANS_TIME_HOUR,
DECODE( DATEADD(HOUR, NUMBER, TRANS_TS), TIME_SLICE(TRANS_TS, 1, 'HOUR', 'START'), AMOUNT,0) AMOUNT
FROM
SRCTAB,
(SELECT SEQ4() NUMBER FROM TABLE(GENERATOR(ROWCOUNT => 24)) ) G
)
GROUP BY
TRANS_TIME_HOUR
)

T-SQL - Finding records with chronological gaps

This is my first post here. I'm still a novice SQL user at this point though I've been using it for several years now. I am trying to find a solution to the following problem and am looking for some advice, as simple as possible, please.
I have this 'recordTable' with the following columns related to transactions; 'personID', 'recordID', 'item', 'txDate' and 'daySupply'. The recordID is the primary key. Almost every personID should have many distinct recordID's with distinct txDate's.
My focus is on one particular 'item' for all of 2017. It's expected that once the item daySupply has elapsed for a recordID that we would see a newer recordID for that person with a more recent txDate somewhere between five days before and five days after the end of the daySupply.
What I'm trying to uncover are the number of distinct recordID's where there wasn't an expected new recordID during this ten day window. I think this is probably very simple to solve but I am having a lot of difficulty trying to create a query for it, let alone explain it to someone.
My thought thus far is to create two temp tables. The first temp table stores all of the records associated with the desired items and I'm just storing the personID, recordID and txDate columns. The second temp table has the personID, recordID and the two derived columns from the txDate and daySupply; these would represent the five days before and five days after.
I am trying to find some way to determine the number of recordID's from the first table that don't have expected refills for that personID in the second. I thought a simple EXCEPT would do this but I don't think there's anyway of getting around a recursive type statement to answer this and I have never gotten comfortable with recursive queries.
I searched Stackoverflow and elsewhere but couldn't come up with an answer to this one. I would really appreciate some help from some more clever data folks. Here is the code so far. Thanks everyone!
CREATE TABLE #temp1 (personID VARCHAR(20), recordID VARCHAR(10), txDate
DATE)
CREATE TABLE #temp2 (personID VARCHAR(20), recordID VARCHAR(10), startDate
DATE, endDate DATE)
INSERT INTO #temp1
SELECT [personID], [recordID], txDate
FROM recordTable
WHERE item = 'desiredItem'
AND txDate > '12/31/16'
AND txDate < '1/1/18';
INSERT INTO #temp2
SELECT [personID], [recordID], (txDate + (daySupply - 5)), (txDate +
(daySupply + 5))
FROM recordTable
WHERE item = 'desiredItem'
AND txDate > '12/31/16'
AND txDate < '1/1/18';
I agree with mypetlion that you could have been more concise with your question, but I think I can figure out what you are asking.
SQL Window Functions to the rescue!
Here's the basic idea...
CREATE TABLE #fills(
personid INT,
recordid INT,
item NVARCHAR(MAX),
filldate DATE,
dayssupply INT
);
INSERT #fills
VALUES (1, 1, 'item', '1/1/2018', 30),
(1, 2, 'item', '2/1/2018', 30),
(1, 3, 'item', '3/1/2018', 30),
(1, 4, 'item', '5/1/2018', 30),
(1, 5, 'item', '6/1/2018', 30)
;
SELECT *,
ABS(
DATEDIFF(
DAY,
LAG(DATEADD(DAY, dayssupply, filldate)) OVER (PARTITION BY personid, item ORDER BY filldate),
filldate
)
) AS gap
FROM #fills
ORDER BY filldate;
... outputs ...
+----------+----------+------+------------+------------+------+
| personid | recordid | item | filldate | dayssupply | gap |
+----------+----------+------+------------+------------+------+
| 1 | 1 | item | 2018-01-01 | 30 | NULL |
| 1 | 2 | item | 2018-02-01 | 30 | 1 |
| 1 | 3 | item | 2018-03-01 | 30 | 2 |
| 1 | 4 | item | 2018-05-01 | 30 | 31 |
| 1 | 5 | item | 2018-06-01 | 30 | 1 |
+----------+----------+------+------------+------------+------+
You can insert the results into a temp table and pull out only the ones you want (gap > 5), or use the query above as a CTE and pull out the results without the temp table.
This could be stated as follows: "Given a set of orders, return a subset for which there is no order within +/- 5 days of the expected resupply date (defined as txDate + DaysSupply)."
This can be solved simply with NOT EXISTS. Define the range of orders you wish to examine, and this query will find the subset of those orders for which there is no resupply order (NOT EXISTS) within 5 days of either side of the expected resupply date (txDate + daysSupply).
SELECT
gappedOrder.personID
, gappedOrder.recordID
, gappedOrder.item
, gappedOrder.txDate
, gappedOrder.daysSupply
FROM
recordTable as gappedOrder
WHERE
gappedOrder.item = 'desiredItem'
AND gappedOrder.txDate > '12/31/16'
AND gappedOrder.txDate < '1/1/18'
--order not refilled within date range tolerance
AND NOT EXISTS
(
SELECT
1
FROM
recordTable AS refilledOrder
WHERE
refilledOrder.personID = gappedOrder.personID
AND refilledOrder.item = gappedOrder.item
--5 days prior to (txDate + daysSupply)
AND refilledOrder.txtDate >= DATEADD(day, -5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txDate))
--5 days after (txtDate + daysSupply)
AND refilledOrder.txtDate <= DATEADD(day, 5, DATEADD(day, gappedOrder.daysSupply, gappedOrder.txtDate))
);

Binary operator OR in TSQL?

What I 'm trying to achieve is to count occurrences in a sort of time line, considering overlapping events as a single one, starting from a field like this and using TSQL:
Pattern (JSON array of couple of values indicating
the start day and the duration of the event)
----------------------------------------------------
[[0,22],[0,24],[18,10],[30,3]]
----------------------------------------------------
For this example the result expected should be 30
What i need is a TSQL function to obtain this number...
Even If I'm not sure it's the right path to follow, I'm trying to simulate a sort of BINARY OR between rows of my dataset.
After some trying I managed to turn my dataset into something like this:
start | length | pattern
----------------------------------------------------
0 | 22 | 1111111111111111111111
0 | 24 | 111111111111111111111111
18 | 10 | 000000000000000001111111111
30 | 3 | 000000000000000000000000000000111
----------------------------------------------------
But now I dont' know how to proceed in TSQL =)
a solution as i said could be a binary OR between the "pattern" fields to obtain something like this:
1111111111111111111111...........
111111111111111111111111.........
000000000000000001111111111......
000000000000000000000000000000111
--------------------------------------
111111111111111111111111111000111
Is it possible to do it in TSQL?
Maybe i'm just complicating things here do you have other ideas?
DO NOT forget I just need the result number!!!
Thank you all
Only the total days that an event occurs needs to be returned.
But I was wondering how hard it would be to actually calculate that binary OR'd pattern.
declare #T table (start int, length int);
insert into #T values
(0,22),
(0,24),
(18,10),
(30,3);
WITH
DIGITS as (
select n
from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) D(n)
),
NUMBERS as (
select (10*d2.n + d1.n) as n
from DIGITS d1, DIGITS d2
where (10*d2.n + d1.n) < (select max(start+length) from #T)
),
CALC as (
select N.n, max(case when N.n between IIF(T.start>0,T.start,1) and IIF(T.start>0,T.start,1)+T.length-1 then 1 else 0 end) as ranged
from #T T
cross apply NUMBERS N
group by N.n
)
select SUM(c.ranged) as total,
stuff(
(
select ranged as 'text()'
from CALC
order by n
for xml path('')
),1,1,'') as pattern
from CALC c;
Result:
total pattern
30 11111111111111111111111111100111
Depending on your input date, you should be able to do something like the following to calculate your Days With An Event.
The cte is used to generate a table of dates, the start and end of which are defined by the two date variables. These would be best suited as data driven from your source data. If you have to use numbered date values, you could simply return incrementing numbers instead of incrementing dates:
declare #Events table (StartDate date
,DaysLength int
)
insert into #Events values
('20160801',22)
,('20160801',24)
,('20160818',10)
,('20160830',3)
declare #StartDate date = getdate()-30
,#EndDate date = getdate()+30
;with Dates As
(
select DATEADD(day,1,#StartDate) as Dates
union all
select DATEADD(day,1, Dates)
from Dates
where Dates < #EndDate
)
select count(distinct d.Dates) as EventingDays
from Dates d
inner join #Events e
on(d.Dates between e.StartDate and dateadd(d,e.DaysLength-1,e.StartDate)
)
option(maxrecursion 0)

SQL Query - Average time from datetime field

I'm struggling with what I thought would be a simple SQL query. Running SQL Server 2014
I have an SQL table, "Visits":
Id | EntryTime | Duration
And I want to find the average entry TIME OF DAY between two dates, taking into account all records between those dates.
so if my EntryTime field between my dates is:
2016-04-28 12:00:00
2016-04-20 10:00:00
2016-04-19 08:00:00
2016-04-17 10:00:00
Then the average time returned should just be:
10:00:00
The date should not be taken into account at all, and it should be returned in string format, or a manner which returns ONLY 10:00:00.
create table mytimes(
id int identity,
mydatetime datetime
)
insert into mytimes (mydatetime) values ('2016-04-28 12:00:00')
insert into mytimes (mydatetime) values ('2016-04-20 10:00:00')
insert into mytimes (mydatetime) values ('2016-04-19 08:00:00')
insert into mytimes (mydatetime) values ('2016-04-17 10:00:00')
SELECT Cast(DateAdd(ms, AVG(CAST(DateDiff( ms, '00:00:00', cast(mydatetime as time)) AS BIGINT)), '00:00:00' ) as Time )
from mytimes
-- where mydatetime between XXX and YYY
SELECT convert(varchar(8), Cast(DateAdd(ms, AVG(CAST(DateDiff( ms, '00:00:00', cast(mydatetime as time)) AS BIGINT)), '00:00:00' ) as Time ))
from mytimes
-- where mydatetime between XXX and YYY
output-1 10:00:00.0000000 - this is an actual Time type that you can do more with if needed
output-2 10:00:00 - this is output as a varchar(8)
Add your where clause as you see fit
The steps include
Casting to a Time type from a DateTime.
Using the AVG on Time, this is not supported by type Time so you have to first convert Time to milliseconds.
Converting the milliseconds back to a Time type
To avoid Arithmetic overflow error converting expression to data type int you can cast the result of DateAdd to a BigInt. Alternatively you can use seconds instead of milliseconds in the DateDiff function which should work unless your result set is overly large.
SO Sources:
T-SQL calculating average time
How to get Time from DateTime format in SQL?
Operand data type time is invalid for avg operator…?
SELECT CONVERT(TIME, DATEADD(SECOND, AVG(DATEDIFF(SECOND, 0, CONVERT(TIME, EntryTime ))), 0))
FROM Visits
WHERE EntryTime >= #begin AND EntryTime <= #end
The idea came from here: T-SQL calculating average time
Yap, you can use the Time() to get this done.
Your query becomes like this (modify accordingly)
SELECT COUNT(*) FROM Visits WHERE TIME(EntryTime) > '06:00' AND EntryTime < '18:00';

TSQL Performance issues using DATEADD in where clause

I have a query using the DATEADD method which takes a lot of time.
I'll try to simplify what we do.
We are monitoring tempretures and every 5 minutes we store the highest temp and lowest temp in
table A
Date | Time | MaxTemp | MinTemp
2011-09-18 | 12:05:00 | 38.15 | 38.099
2011-09-18 | 12:10:00 | 38.20 | 38.10
2011-09-18 | 12:15:00 | 38.22 | 38.17
2011-09-18 | 12:20:00 | 38.21 | 38.20
...
2011-09-19 | 11:50:00 | 38.17 | 38.10
2011-09-19 | 12:55:00 | 38.32 | 38.27
2011-09-19 | 12:00:00 | 38.30 | 38.20
Date/Time columns are of type date/time (and not datetime)
In another table (Table B) we store some data for the entire day, where a day is from NOON (12PM) to noon (not midnight to midnight).
So table B columns include:
Date (date only no time)
ShiftManager
MaxTemp (this is the max temp for the entire 24 hours starting at that date noon till next day noon)
MinTemp
I get table B with all the data and just need to update the MaxTemp and MinTemp using table A
For example:For 09/18/2011 I need the maximum temp reading that was between 09/18/2011 12PM and 09/19/2011 12PM.
In the TableA sample we have above, the returend result would be 38.32 as it is the MAX(MaxTemp) for the desired period.
The SQL I'm using:
update TableB
set MaxTemp = (
select MAX(HighTemp) from TableA
where
(Date=TableB.Date and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.Date) and Time <= '12:00:00')
)
And it takes a lot of time (if I remove the DATEADD method it is quick).
Here is a simplified sample that shows the data I have and the expected result:
DECLARE #TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2));
DECLARE #TableB TABLE ([Date] DATE, MaxTemp DECIMAL(6,2));
INSERT #TableA VALUES
('2011-09-18','12:05:00',38.15),
('2011-09-18','12:10:00',38.20),
('2011-09-18','12:15:00',38.22),
('2011-09-19','11:50:00',38.17),
('2011-09-19','11:55:00',38.32),
('2011-09-19','12:00:00',38.31),
('2011-09-19','12:05:00',38.33),
('2011-09-19','12:10:00',38.40),
('2011-09-19','12:15:00',38.12),
('2011-09-20','11:50:00',38.27),
('2011-09-20','11:55:00',38.42),
('2011-09-20','12:00:00',38.16);
INSERT #TableB VALUES
('2011-09-18', 0),
('2011-09-19', 0);
-- This is how I get the data, now I just need to update the max temp for each day
with TableB(d, maxt) as
(
select * from #TableB
)
update TableB
set maxt = (
select MAX(HighTemp) from #TableA
where
(Date=TableB.d and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.d) and Time <= '12:00:00')
)
select * from #TableB
Hope I was able to explian myself, any ideas how can I do it differently? Thx!
Functions on column usually kill performance. So can OR.
However, I assume you want AND not OR because it is a range.
So, applying some logic and having just one calculation
update TableB
set MaxTemp =
(
select MAX(HighTemp) from TableA
where
(Date + Time - 0.5 = TableB.Date)
)
(Date + Time - 0.5) will change noon to noon to be midnight to midnight (0.5 = 12 hours). More importantly, you can make this a computed column and index it
More correctly, Date + Time - 0.5 is DATEADD(hour, -12, Date+Time) assuming Date and Time are real dates/times and not varchar...
Edit: this answer is wrong but I'll leave it up as "what not to do"
See this for more:
Bad Habits to Kick : Using shorthand with date/time operations
This would probably be a lot easier if you used a single SMALLDATETIME column instead of separating this data into DATE/TIME columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE/TIME data as strings. Please specify the version of SQL Server and the actual data types being used.
DECLARE #d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));
INSERT #d VALUES
('2011-09-18','12:05:00',38.15,38.099),
('2011-09-18','12:10:00',38.20,38.10),
('2011-09-18','12:15:00',38.22,38.17),
('2011-09-18','12:20:00',38.21,38.20),
('2011-09-19','11:50:00',38.17,38.10),
('2011-09-19','12:55:00',38.32,38.27),
('2011-09-19','12:00:00',38.30,38.20);
SELECT '-- before update';
SELECT * FROM #d;
;WITH d(d,t,dtr,maxt) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8),
[Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM #d
),
d2(dtr, maxt) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d SET maxt = d2.maxt FROM d
INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);
SELECT '-- after update';
SELECT * FROM #d;
Results:
-- before update
2011-09-18 12:05:00 38.150 38.099
2011-09-18 12:10:00 38.200 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.210 38.200
2011-09-19 11:50:00 38.170 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.300 38.200
-- after update
2011-09-18 12:05:00 38.220 38.099
2011-09-18 12:10:00 38.220 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.220 38.200
2011-09-19 11:50:00 38.220 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.320 38.200
Presumably you want to update the MinTemp as well, and that would just be:
;WITH d(d,t,dtr,maxt,mint) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12,
CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112)
+ ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MaxTemp
FROM #d
),
d2(dtr, maxt, mint) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt), MIN(mint) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d
SET maxt = d2.maxt, mint = d2.maxt
FROM d
INNER JOIN d2
ON d.dtr >= d2.dtr
AND d.dtr < DATEADD(DAY, 1, d2.dtr);
Now, this is not really better than your existing query, because it's still going to be using scans to figure out aggregates and all the rows that need to be updating. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:
USE [tempdb];
GO
CREATE TABLE dbo.d
(
[Date] SMALLDATETIME,
MaxTemp DECIMAL(6,3),
MinTemp DECIMAL(6,3),
RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
);
CREATE INDEX rd ON dbo.d(RoundedDate);
INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
('2011-09-18 12:05:00',38.15,38.099),
('2011-09-18 12:10:00',38.20,38.10),
('2011-09-18 12:15:00',38.22,38.17),
('2011-09-18 12:20:00',38.21,38.20),
('2011-09-19 11:50:00',38.17,38.10),
('2011-09-19 12:55:00',38.32,38.27),
('2011-09-19 12:00:00',38.30,38.20);
Then your update is this simple, and the plan is much nicer:
;WITH g(RoundedDate,MaxTemp)
AS
(
SELECT RoundedDate, MAX(MaxTemp)
FROM dbo.d
GROUP BY RoundedDate
)
UPDATE d
SET MaxTemp = g.MaxTemp
FROM dbo.d AS d
INNER JOIN g
ON d.RoundedDate = g.RoundedDate;
Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE clause to recent data only? I see no need to go recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query, to limit the date range where it is attempting to do this work? Do you really want to update the WHOLE able, EVERY time? This is probably something you should only be doing once a day, sometime in the afternoon, to update yesterday. So whether it takes 2 seconds or 2.5 seconds shouldn't really matter.
You may need to use -12 depending on date as start date or end date for the noon to noon internal.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
group by tableAx.Date
Because of the 12 hour offset not sure how much would would gain by putting TableB Date plus Time in a DateTime field directly. Cannot get away from the DATEADD and the output from a functions is not indexed even if the parameters going into the function are indexed. What you might be able to to is create a computed column that = date + time +/- 12h and index that column.
Like the recommendation from Arron to only update those without values.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where tableAx.MaxTemp is null
group by tableAx.Date
or an insert of new dates
insert into tableA (date, MaxTemp)
select CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]), as Date) as [date] , MAX(TableB.HighTemp) as [MaxTemp]
from tableA as tableAx
right outer join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where TableB.Date is null
group by CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)

Resources