Select Statement Returning Multiple of Same Dates


I am having issues with a select statement that is returning multiple dates that are the same.
I will start off with the tables that I have:
Calendar: Very basic, just a table that contains all of the days for the next 20 years.
PKDate
------
2015-04-01
2015-04-02
2015-04-03
etc...
DaysWorked: This table contains all days that all equipment in the company has worked. There is a foreign key constraint to the PKDate in calendar to the DayWorked in this table.
DayWorked | Unit
----------------
2015-04-01 | 102
2015-04-05 | 103
Event: This is the table behind our scheduling system. It holds all of the days that can be booked as days off. The user can select a start and end date for the days off or vacation. There are no foreign key constraints in this table.
Name | EventStart | EventEnd | Unit
-----------------------------------------
Days Off | 2015-04-06 | 2015-04-08 | 103
Days Off | 2015-04-03 | 2015-04-09 | 102
This is the stored procedure that I am executing:
select distinct PKDate as 'Date', case when PKDate not in (select DayWorked
from DaysWorked
where Unit='124')
then 'AVAILABLE'
else ''
end
as 'Available',
case when PKDate in (select DayWorked
from DaysWorked
where Unit='124')
then 'WORKED'
else ''
end
as 'Worked',
case when PKDate between E.EventStart and DATEADD(day, -1, E.EventEnd)
and E.ResourceID='124'
then UPPER(E.Name)
else ''
end
as 'Schedule'
from Event E
full outer join Calendar C
on PKDate between E.EventStart and E.EventEnd
where PKDate between '2015-04-01' and GETDATE()
order by PKDate asc
This stored procedure almost works as planned. I want the result of the procedure to show every day in the calendar in one column (Date), then display if the equipment was available (available), if the equipment worked (worked), and if the equipment was booked for days off or vacation (Schedule).
What is happening when I run the procedure is the date displays more than once for the same date. An example is shown in the photo below:
For the days April 13th to April 16th the days are repeated. I believe these days are repeated because I have something for those days in the Event table, but I do not know why the day displays twice. How can I get these days to display only once?

select C.PKDate
,case when not exists ( select * from DaysWorked where Unit = '124' and DayWorked = C.PKDate )
and not exists ( select * from Event E where E.EventStart <= C.PKDate and E.EventEnd >= C.PKDate and E.ResourceID = '124')
then 'AVAILABLE' else '' end as Available
,case when exists ( select * from DaysWorked where Unit = '124' and DayWorked = C.PKDate ) then 'WORKED' else '' end as Worked
,isnull((select max(E.Name) from Event E where E.EventStart <= C.PKDate and E.EventEnd >= C.PKDate and E.ResourceID = '124' ), '') as Schedule
from Calendar C
where C.PKDate between '2015-4-1' and getdate()
order by c.PKDate
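As for why the original query repeats dates: a join between Calendar and Event returns one row per date and overlapping event, and DISTINCT cannot collapse those rows when the Schedule column differs between them. A minimal sketch of that effect, assuming hypothetical table variables in place of the real tables and unit '103' as the unit being reported on:

```sql
-- Hypothetical sample data; table variables stand in for the real Calendar and Event tables.
DECLARE @Calendar TABLE (PKDate date);
DECLARE @Event TABLE (Name varchar(20), EventStart date, EventEnd date, ResourceID varchar(10));

INSERT @Calendar VALUES ('2015-04-06'), ('2015-04-07');
INSERT @Event VALUES
    ('Days Off', '2015-04-06', '2015-04-08', '103'),
    ('Days Off', '2015-04-03', '2015-04-09', '102');

-- Both events overlap each date, so the join yields two rows per date.
-- The event for unit 102 is mapped to '' by the CASE, the one for unit 103 to 'DAYS OFF',
-- so the two rows differ and DISTINCT keeps both.
SELECT DISTINCT C.PKDate,
       CASE WHEN E.ResourceID = '103' THEN UPPER(E.Name) ELSE '' END AS Schedule
FROM @Calendar C
LEFT JOIN @Event E
    ON C.PKDate BETWEEN E.EventStart AND E.EventEnd
ORDER BY C.PKDate;
```

The query above avoids this by selecting only from Calendar and moving the Event lookup into subqueries, so each date can appear only once.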

Related

get unique records existing only once per day in a date range

I have a case where I want to extract the device ids (DIDs) that exist only and only once for each day in a certain period. I have tried different methods and partitions but I seem to only be able to get that data individually per day (where date = X, but I need a query where I can put where date between X & Y)
Example, this is the data:
DID date
A 2019-01-01
A 2019-01-01
A 2019-01-02
A 2019-01-03
B 2019-01-01
B 2019-01-02
B 2019-01-03
C 2019-01-01
C 2019-01-02
C 2019-01-02
C 2019-01-03
D 2019-01-01
D 2019-01-02
D 2019-01-03
The query should return only B & D(because B & D exists once in each day from 01 to 03)
I also wish to get the count, which would be 2 in this case
thanks!

You want the devices to exist only once on each day of the period, so if you group by did you need to return the dids that have count(date) and count(distinct date) equal to the number of days of that period:
select did
from tablename
where date between cast('2019-01-01' as date) and cast('2019-01-03' as date)
group by did
having
count(distinct date) = cast('2019-01-03' as date) - cast('2019-01-01' as date) + 1
and
count(date) = cast('2019-01-03' as date) - cast('2019-01-01' as date) + 1
See the demo.
Or:
select t.did
from (
select did, date
from tablename
where date between cast('2019-01-01' as date) and cast('2019-01-03' as date)
group by did, date
having count(*) = 1
)t
group by t.did
having count(*) = cast('2019-01-03' as date) - cast('2019-01-01' as date) + 1
See the demo.
Result:
| did |
| --- |
| B |
| D |

One option would be to aggregate by DID and assert that the total count is equal to the count of distinct dates. If this assertion passes, it means that a given DID has only distinct dates present.
SELECT DID
FROM yourTable
GROUP BY DID
HAVING COUNT(date) = COUNT(DISTINCT date);
If you want to get the total count of matching DID, then you could subquery the above and take COUNT(*). Or, if you wanted to use the same query you might try:
SELECT DID, COUNT(*) OVER () AS total_cnt
FROM yourTable
GROUP BY DID
HAVING COUNT(date) = COUNT(DISTINCT date);
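The subquery approach mentioned above could look roughly like this. It is only a sketch: yourTable is the placeholder name used above, and the date range and the hard-coded number of days (3) are assumptions taken from the question's sample data.

```sql
-- Count the DIDs that appear exactly once on every day of the range.
SELECT COUNT(*) AS total_cnt
FROM (
    SELECT DID
    FROM yourTable
    WHERE date BETWEEN '2019-01-01' AND '2019-01-03'
    GROUP BY DID
    HAVING COUNT(date) = COUNT(DISTINCT date)   -- no day is repeated
       AND COUNT(DISTINCT date) = 3             -- assumed number of days in the range
) AS t;
```

For the sample data in the question, only B and D pass both conditions, so total_cnt is 2.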

TSQL - Return duplicate rows with highest value and longest date

I have a list of staff who are contractors, and it includes duplicates because some work on multiple contracts at the same time. I need to find the row with the most hours for each person and, if the hours are the same, the one with the end date furthest away. I guess this is the current main contract. I also need to make sure the current date falls between Date From and Date To. How can this be done?
+------------+----------+------+-------+------------+------------+
| ContractID | PersonID | Name | Hours | Date From | Date To |
+------------+----------+------+-------+------------+------------+
| 8 | 1 | John | 30 | 20/02/2018 | 26/02/2018 |
| 8 | 2 | Paul | 5 | 20/02/2018 | 26/02/2018 |
| 7 | 3 | John | 7 | 20/02/2018 | 26/02/2018 |
+------------+----------+------+-------+------------+------------+
In the above example, I would need to bring back the John – 30hours and the Paul 5 Hours row. PS - The PersonID is different for each row but the "Name" is the same for the person if on multiple contracts.
Thanks

One approach is simply to use a correlated subquery with appropriate ordering logic:
select c.*
from contracts c
where c.contractid = (select top 1 c2.contractid
from contracts c2
where c2.name = c.name and
getdate() >= c2.datefrom and
getdate() < c2.dateto
order by c2.hours desc, c2.dateto desc
);
You can put similar logic into a window function:
select c.*
from (select c.*,
row_number() over (partition by c.name order by c.hours desc, c.dateto desc) as seqnum
from contracts c
where getdate() >= c.datefrom and getdate() < c.dateto
) c
where seqnum = 1;

If you need the full row, I'd do something like this:
with
rankedByHours as (
select
ContractID,
PersonID,
Name,
Hours,
[Date From],
[Date To],
row_number() over (partition by PersonID order by Hours desc) as RowID
from
Contracts
)
select
ContractID,
PersonID,
Name,
Hours,
[Date From],
[Date To],
case
when getdate() between [Date From] and [Date To] then 'Current'
when getdate() < [Date From] then 'Not Started'
else 'Expired'
end as ContractStatus
from
RankedByHours
where
RowID = 1;
Use the CTE to inject a row_number() sorting all rows by your sort criteria, then select out the top one in the main body. It can be easily extended to also capture your farthest-out end date.
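One way that extension might look, as a sketch rather than a tested solution. It assumes, as the question states, that Name (not PersonID) identifies the person, and that only contracts covering the current date should be considered; the CTE name RankedContracts is made up for illustration.

```sql
WITH RankedContracts AS (
    SELECT ContractID, PersonID, Name, Hours, [Date From], [Date To],
           -- Most hours first; ties are broken by the latest end date.
           ROW_NUMBER() OVER (
               PARTITION BY Name
               ORDER BY Hours DESC, [Date To] DESC
           ) AS RowID
    FROM Contracts
    -- Only contracts that are active today take part in the ranking.
    WHERE GETDATE() BETWEEN [Date From] AND [Date To]
)
SELECT ContractID, PersonID, Name, Hours, [Date From], [Date To]
FROM RankedContracts
WHERE RowID = 1;
```

Filtering inside the CTE matters here: if the date check were applied after ranking, a person whose top-ranked contract has expired would drop out entirely instead of falling back to their next contract.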

Joining multiple date fields to calendar table

I have a table, let's call it Records, where the relevant data is organized as such:
| Employee | SubmissionDate | FirstReviewDate | SecondReviewDate |
Anne 2017-10-02 2017-10-03 2017-10-10
Bernard 2017-10-03 2017-10-05 2017-10-10
Charlene 2017-10-06 2017-10-09 2017-10-09
Danielle 2017-10-02 2017-10-03 2017-10-09
Anne 2017-10-03 2017-10-03 2017-10-09
Every time an employee makes a submission, a new entry is added with a SubmissionDate. The record is later edited to include when the first and second reviews take place.
I also have a calendar table called Calendar with a field called TheDate which has dates for every day this year. What I would like to do is associate SubmissionDate, FirstReviewDate, and SecondReview date with Calendar.TheDate so that I can do a count for all three fields for any given day.
I have tried the following code:
SELECT Employee,
Count(SubmissionDate) AS "Submissions",
Count(FirstReviewDate) AS "First Reviews",
Count(SecondReviewDate) AS "Second Reviews"
FROM Records
LEFT JOIN Calendar ON Records.SubmissionDate = Calendar.TheDate
AND Records.FirstReviewDate = Calendar.TheDate
AND Records.SecondReviewDate = Calendar.TheDate
WHERE TheDate = '2017-10-11'
All of the variations of this I have tried output nothing:
| Employee | Submissions | First Reviews | Second Reviews |
My desired code would look like this:
SELECT Employee,
Count(SubmissionDate),
Count(FirstReviewDate),
Count(SecondReviewDate)
FROM ???
WHERE TheDate = '2017-10-03'
where ??? would be the proper join. Using the desired code block (including the where clause), the desired output would look like this for the example data provided:
| Employee | Submissions | First Reviews | Second Reviews |
Anne 1 2 0
Bernard 1 0 0
Charlene 0 0 0
Danielle 0 1 0
I am not sure how to do this join. I have read many resources about calendar tables, but the examples always include joining tables that each have a single date identifier. My problem is that I have one table with three date fields that need to be associated with Calendar.TheDate.
I am arranging my data like this so that the data can be visualized in Qlik Sense. I would like users to be able to select TheDate from a filter panel and have it aggregate all three fields for the date specified (essentially identical to the WHERE clause in my example code).

Try below code
create table #recs (
Employee varchar(10),
SubmissionDate date,
FirstReviewDate date,
SecondReviewDate date
)
insert into #recs values
('Anne', '2017-10-02', '2017-10-03', '2017-10-10'),
('Bernard', '2017-10-03', '2017-10-05', '2017-10-10'),
('Charlene', '2017-10-06', '2017-10-09', '2017-10-09'),
('Danielle', '2017-10-02', '2017-10-03', '2017-10-09'),
('Anne', '2017-10-03', '2017-10-03', '2017-10-09')
select Employee
, sum(IIF(SubD.Date_Value = #recs.SubmissionDate, 1, 0)) AS Submissions
, sum(IIF(SubD.Date_Value = #recs.FirstReviewDate, 1, 0)) AS [First Reviews]
, sum(IIF(SubD.Date_Value = #recs.SecondReviewDate, 1, 0)) AS [Second Reviews]
from #recs
left join DimDate SubD on SubD.Date_Value = '2017-10-03' and
(SubD.Date_Value = #recs.SubmissionDate
or SubD.Date_Value = #recs.FirstReviewDate
or SubD.Date_Value = #recs.SecondReviewDate)
group by Employee
If your SQL Server is older than 2012, use CASE instead of IIF
select Employee
, sum(case when SubD.Date_Value = #recs.SubmissionDate then 1 else 0 end) AS Submissions
, sum(case when SubD.Date_Value = #recs.FirstReviewDate then 1 else 0 end) AS [First Reviews]
, sum(case when SubD.Date_Value = #recs.SecondReviewDate then 1 else 0 end) AS [Second Reviews]
from #recs
left join DimDate SubD on SubD.Date_Value = '2017-10-03' and
(SubD.Date_Value = #recs.SubmissionDate
or SubD.Date_Value = #recs.FirstReviewDate
or SubD.Date_Value = #recs.SecondReviewDate)
group by Employee

This should work - I've included some simple DML for sample data:-
--Table Setup
IF OBJECT_ID('Records') IS NOT NULL
DROP TABLE Records;
IF OBJECT_ID('Calendar') IS NOT NULL
DROP TABLE Calendar;
CREATE TABLE Records (
Employee varchar(100),
SubmissionDate datetime,
FirstReviewDate datetime,
SecondReviewDate datetime
)
CREATE TABLE Calendar (
TheDate datetime
)
INSERT Records
VALUES ('Anne', '2017-10-02', '2017-10-03', '2017-10-10'),
('Bernard', '2017-10-03', '2017-10-05', '2017-10-10'),
('Charlene', '2017-10-06', '2017-10-09', '2017-10-09'),
('Danielle', '2017-10-02', '2017-10-03', '2017-10-09'),
('Anne', '2017-10-03', '2017-10-03', '2017-10-09')
INSERT Calendar
VALUES ('2017-10-03'), ('2017-10-04'), ('2017-10-05')
--Main Query
DECLARE @TheDate datetime = '2017-10-03'
SELECT
Employee,
SUM(CASE
WHEN c1.TheDate IS NOT NULL THEN 1
ELSE 0
END),
SUM(CASE
WHEN c2.TheDate IS NOT NULL THEN 1
ELSE 0
END),
SUM(CASE
WHEN c3.TheDate IS NOT NULL THEN 1
ELSE 0
END)
FROM Records r
LEFT JOIN Calendar c1
ON r.SubmissionDate = c1.TheDate
AND c1.thedate = @TheDate
LEFT JOIN Calendar c2
ON r.FirstReviewDate = c2.TheDate
AND c2.thedate = @TheDate
LEFT JOIN Calendar c3
ON r.SecondReviewDate = c3.TheDate
AND c3.thedate = @TheDate
GROUP BY Employee

Count number of days in a year with a record

I have a SQL Server table named AgentLog in which I store for each agent his daily number of sales.
+-----------+------------+-------------+
| AgentName | Date | SalesNumber |
+-----------+------------+-------------+
| John | 01.01.2014 | 45 |
| Terry | 01.01.2014 | 30 |
| John | 02.01.2014 | 20 |
| Terry | 02.01.2014 | 15 |
| Terry | 03.01.2014 | 52 |
| Terry | 04.01.2014 | 24 |
| Terry | 05.01.2014 | 12 |
| Terry | 06.01.2014 | 10 |
| Terry | 07.01.2014 | 23 |
| John | 08.01.2014 | 48 |
| Terry | 08.01.2014 | 35 |
| John | 09.01.2014 | 37 |
| Terry | 10.01.2014 | 35 |
+-----------+------------+-------------+
If an agent doesn't work on one particular day, there is no record of his sales on that date.
I want to generate a report(query) on a given date interval (ex: 01.01.2014 - 10.01.2014) that counts on how many days an agent wasn't present for work (ex: John - 6 days), was at work (John - 4 days) and also returns the date interval it wasn't present (ex: John 03.01.2014 - 07.01.2014, 10.01.2014) (there can be multiple intervals).

You need to create a custom table and populate it with a record for each date you want in your range (Feel free to go as far back in the past and forward into the future as you feel you may need.). You could do this in Excel very easily and import it.
Select *
from Custom.DateListTable dlt
left outer join agentlog ag
on dlt.Date = ag.Date

I would approach this by getting the number of dates in the interval, as well as the number of dates the agent was at work, and you then have everything you need.
To get the number of days you can use DATEDIFF, adding 1 so that both the first and the last day of the interval are counted:
SELECT DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 AS totalDays;
To get the number of days an agent worked, you can use the COUNT(*) aggregate function:
SELECT agentName, COUNT(*) AS daysWorked
FROM myTable
GROUP BY agentName;
Then, you can just add to that query to get the days not worked by subtracting totalDays - daysWorked:
SELECT agentName, COUNT(*) AS daysWorked, (DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 - COUNT(*)) AS daysMissed
FROM myTable
GROUP BY agentName;
Here is an SQL Fiddle example.

The only way I can think of to resolve this is to create a temporary table with only one column (datetime) and save all the dates from the selected range there. You can create a stored procedure that fills that temporary table using a cursor with all the dates from the interval. Then do a LEFT JOIN between the temporary table and your table and look for null values on your table's side (the days where that person didn't come to work).
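A rough sketch of that idea, using a simple WHILE loop instead of a cursor. The temp table #Dates, the variable names and the date range are assumptions for illustration; AgentLog and its columns come from the question, and [Date] is assumed to be stored as a date type.

```sql
-- Assumed range, taken from the question's example.
DECLARE @StartDate date = '2014-01-01',
        @EndDate   date = '2014-01-10';

-- Hypothetical temp table holding one row per day in the range.
CREATE TABLE #Dates (TheDate date PRIMARY KEY);

DECLARE @d date = @StartDate;
WHILE @d <= @EndDate
BEGIN
    INSERT #Dates (TheDate) VALUES (@d);
    SET @d = DATEADD(DAY, 1, @d);
END;

-- Every (agent, day) pair without a matching AgentLog row is an absence.
SELECT a.AgentName, d.TheDate AS AbsentOn
FROM (SELECT DISTINCT AgentName FROM AgentLog) AS a
CROSS JOIN #Dates AS d
LEFT JOIN AgentLog AS l
    ON l.AgentName = a.AgentName
   AND l.[Date] = d.TheDate
WHERE l.AgentName IS NULL
ORDER BY a.AgentName, d.TheDate;

DROP TABLE #Dates;
```

Note that an agent with no rows at all in AgentLog would not show up, since the agent list is derived from the log itself; a separate table of agents would be needed for that case.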

Try this...
SET DATEFIRST 1; --Monday
DECLARE @StartDate DATETIME = '2014-01-01',
@EndDate DATETIME = '2014-01-10';
WITH data as (
select 0 as i, DATEADD(DAY, 0, @StartDate) as TheDate
union all
select i + 1, DATEADD(DAY, i + 1, @StartDate) as TheDate
from data
where i < DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT a.AgentName,
SUM(CASE WHEN c.Date IS NULL THEN 1 ELSE 0 END) AS Missing,
SUM(CASE WHEN c.Date IS NOT NULL THEN 1 ELSE 0 END) AS Working
FROM Agent a
JOIN data b ON NOT EXISTS(SELECT NULL FROM SpecialDate s WHERE s.date = b.TheDate)
LEFT JOIN AgentLog c ON
c.AgentName = a.AgentName
AND c.Date = b.TheDate
WHERE DATEPART(weekday, b.TheDate) <= 5
GROUP BY a.AgentName
OPTION (MAXRECURSION 10000);
It includes a check for weekends, as well as a reference to "SpecialDate" where a list of non working days can be maintained, and excluded from the check.
Reading your question again, I realise that this will only solve half your problem.

NOTE: The following answer mainly addresses the trickiest part of the question, which is how to obtain "absence from work" intervals.
Given these values as Interval Start - End dates:
DECLARE @IntervalStart DATE = '2013-12-30'
DECLARE @IntervalEnd DATE = '2014-01-10'
the following query gives you the "absence from work" intervals:
SELECT AgentName,
DATEADD(d, 1, t.[Date]) As OffWorkStart,
DATEADD(d, -1, t.NextDate) As OffWorkEnd
FROM (
SELECT AgentName, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog) t
WHERE t.NextMinusCurrent > 1
-- Get marginal beginning interval (in case such an interval exists)
UNION ALL
SELECT AgentName, @IntervalStart AS OffWorkStart, DATEADD(DAY, -1, MIN([Date])) AS OffWorkEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MIN([Date]) > @IntervalStart
-- Get marginal ending interval (in case such an interval exists)
UNION ALL
SELECT AgentName, DATEADD(DAY, 1, MAX([Date])) AS OffWorkStart, @IntervalEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MAX([Date]) < @IntervalEnd
ORDER By AgentName, OffWorkStart
With the input data you supplied, the above query gives you the following output:
AgentName OffWorkStart OffWorkEnd
---------------------------------------
John 2013-12-30 2013-12-31
John 2014-01-03 2014-01-07
John 2014-01-10 2014-01-10
Terry 2013-12-30 2013-12-31
Terry 2014-01-09 2014-01-09
The idea behind the basic part of the query is to employ the following nested query:
SELECT AgentName,
[Date],
LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog
in order to get any existing gaps between the days a certain agent is present for work. A value of NextMinusCurrent > 1 indicates such a gap.
Counting days is trivial once you have the above query in place. E.g. placing the above query in a CTE, you can count the total number of absence days with something like:
;WITH cte AS (
... query goes here
)
SELECT AgentName, SUM(DATEDIFF(DAY, OffWorkStart, OffWorkEnd) + 1) AS AbsenceDays
FROM cte
GROUP By AgentName
P.S. The above query makes use of SQL Server LEAD function, which is available from SQL SERVER 2012 onwards.
SQL Fiddle here
EDIT:
CTEs together with ROW_NUMBER() can be used to simulate LEAD function. The first part of the query becomes:
;WITH cte1 AS (
SELECT AgentName,
[Date],
ROW_NUMBER() OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As rn
FROM #AgentLog
),
cte2 AS (
SELECT cte1.AgentName, cte1.[Date],
cteLead.[Date] AS NextDate,
DATEDIFF(DAY, cte1.[Date], cteLead.[Date]) As NextMinusCurrent
FROM cte1
LEFT OUTER JOIN cte1 AS cteLead
ON (cte1.rn = cteLead.rn - 1) AND (cte1.AgentName = cteLead.AgentName)
)
SELECT AgentName,
DATEADD(d, 1, cte2.[Date]) As OffWorkStart,
DATEADD(d, -1, cte2.NextDate) As OffWorkEnd
FROM cte2
WHERE NextMinusCurrent > 1
SQL Fiddle for SQL Server 2008 here. I hope it executes in SQL Server 2005 also!

TSQL Performance issues using DATEADD in where clause

I have a query using the DATEADD method which takes a lot of time.
I'll try to simplify what we do.
We are monitoring temperatures, and every 5 minutes we store the highest temp and lowest temp in
table A
Date | Time | MaxTemp | MinTemp
2011-09-18 | 12:05:00 | 38.15 | 38.099
2011-09-18 | 12:10:00 | 38.20 | 38.10
2011-09-18 | 12:15:00 | 38.22 | 38.17
2011-09-18 | 12:20:00 | 38.21 | 38.20
...
2011-09-19 | 11:50:00 | 38.17 | 38.10
2011-09-19 | 12:55:00 | 38.32 | 38.27
2011-09-19 | 12:00:00 | 38.30 | 38.20
Date/Time columns are of type date/time (and not datetime)
In another table (Table B) we store some data for the entire day, where a day is from NOON (12PM) to noon (not midnight to midnight).
So table B columns include:
Date (date only no time)
ShiftManager
MaxTemp (this is the max temp for the entire 24 hours starting at that date noon till next day noon)
MinTemp
I get table B with all the data and just need to update the MaxTemp and MinTemp using table A
For example: for 09/18/2011 I need the maximum temp reading between 09/18/2011 12PM and 09/19/2011 12PM.
In the TableA sample above, the returned result would be 38.32, as it is the MAX(MaxTemp) for the desired period.
The SQL I'm using:
update TableB
set MaxTemp = (
select MAX(HighTemp) from TableA
where
(Date=TableB.Date and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.Date) and Time <= '12:00:00')
)
And it takes a lot of time (if I remove the DATEADD method it is quick).
Here is a simplified sample that shows the data I have and the expected result:
DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2));
DECLARE @TableB TABLE ([Date] DATE, MaxTemp DECIMAL(6,2));
INSERT @TableA VALUES
('2011-09-18','12:05:00',38.15),
('2011-09-18','12:10:00',38.20),
('2011-09-18','12:15:00',38.22),
('2011-09-19','11:50:00',38.17),
('2011-09-19','11:55:00',38.32),
('2011-09-19','12:00:00',38.31),
('2011-09-19','12:05:00',38.33),
('2011-09-19','12:10:00',38.40),
('2011-09-19','12:15:00',38.12),
('2011-09-20','11:50:00',38.27),
('2011-09-20','11:55:00',38.42),
('2011-09-20','12:00:00',38.16);
INSERT @TableB VALUES
('2011-09-18', 0),
('2011-09-19', 0);
-- This is how I get the data, now I just need to update the max temp for each day
with TableB(d, maxt) as
(
select * from @TableB
)
update TableB
set maxt = (
select MAX(HighTemp) from @TableA
where
(Date=TableB.d and Time > '12:00:00')
or
(Date=DATEADD(dd,1,TableB.d) and Time <= '12:00:00')
)
select * from @TableB
Hope I was able to explain myself. Any ideas how I can do it differently? Thx!

Functions on column usually kill performance. So can OR.
However, I assume you want AND not OR because it is a range.
So, applying some logic and having just one calculation
update TableB
set MaxTemp =
(
select MAX(HighTemp) from TableA
where
(Date + Time - 0.5 = TableB.Date)
)
(Date + Time - 0.5) will change noon to noon to be midnight to midnight (0.5 = 12 hours). More importantly, you can make this a computed column and index it
More correctly, Date + Time - 0.5 is DATEADD(hour, -12, Date+Time) assuming Date and Time are real dates/times and not varchar...
Edit: this answer is wrong but I'll leave it up as "what not to do"
See this for more:
Bad Habits to Kick : Using shorthand with date/time operations

This would probably be a lot easier if you used a single SMALLDATETIME column instead of separating this data into DATE/TIME columns. Also I'm assuming you are using SQL Server 2008 and not a previous version where you're storing DATE/TIME data as strings. Please specify the version of SQL Server and the actual data types being used.
DECLARE @d TABLE ([Date] DATE, [Time] TIME(0), MaxTemp DECIMAL(6,3), MinTemp DECIMAL(6,3));
INSERT @d VALUES
('2011-09-18','12:05:00',38.15,38.099),
('2011-09-18','12:10:00',38.20,38.10),
('2011-09-18','12:15:00',38.22,38.17),
('2011-09-18','12:20:00',38.21,38.20),
('2011-09-19','11:50:00',38.17,38.10),
('2011-09-19','12:55:00',38.32,38.27),
('2011-09-19','12:00:00',38.30,38.20);
SELECT '-- before update';
SELECT * FROM @d;
;WITH d(d,t,dtr,maxt) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12, CONVERT(SMALLDATETIME, CONVERT(CHAR(8),
[Date], 112) + ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp FROM @d
),
d2(dtr, maxt) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d SET maxt = d2.maxt FROM d
INNER JOIN d2 ON d.dtr >= d2.dtr AND d.dtr < DATEADD(DAY, 1, d2.dtr);
SELECT '-- after update';
SELECT * FROM @d;
Results:
-- before update
2011-09-18 12:05:00 38.150 38.099
2011-09-18 12:10:00 38.200 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.210 38.200
2011-09-19 11:50:00 38.170 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.300 38.200
-- after update
2011-09-18 12:05:00 38.220 38.099
2011-09-18 12:10:00 38.220 38.100
2011-09-18 12:15:00 38.220 38.170
2011-09-18 12:20:00 38.220 38.200
2011-09-19 11:50:00 38.220 38.100
2011-09-19 12:55:00 38.320 38.270
2011-09-19 12:00:00 38.320 38.200
Presumably you want to update the MinTemp as well, and that would just be:
;WITH d(d,t,dtr,maxt,mint) AS
(
SELECT [Date], [Time], DATEADD(HOUR, -12,
CONVERT(SMALLDATETIME, CONVERT(CHAR(8), [Date], 112)
+ ' ' + CONVERT(CHAR(8), [Time], 108))), MaxTemp, MinTemp
FROM @d
),
d2(dtr, maxt, mint) AS
(
SELECT CONVERT([Date], dtr), MAX(maxt), MIN(mint) FROM d
GROUP BY CONVERT([Date], dtr)
)
UPDATE d
SET maxt = d2.maxt, mint = d2.mint
FROM d
INNER JOIN d2
ON d.dtr >= d2.dtr
AND d.dtr < DATEADD(DAY, 1, d2.dtr);
Now, this is not really better than your existing query, because it's still going to be using scans to figure out aggregates and all the rows that need to be updating. I'm not saying you should be updating the table at all, because this information can always be derived at query time, but if it is something you really want to do, I would combine the advice in these answers and consider revising the schema. For example, if the schema were:
USE [tempdb];
GO
CREATE TABLE dbo.d
(
[Date] SMALLDATETIME,
MaxTemp DECIMAL(6,3),
MinTemp DECIMAL(6,3),
RoundedDate AS (CONVERT(DATE, DATEADD(HOUR, -12, [Date]))) PERSISTED
);
CREATE INDEX rd ON dbo.d(RoundedDate);
INSERT dbo.d([Date],MaxTemp,MinTemp) VALUES
('2011-09-18 12:05:00',38.15,38.099),
('2011-09-18 12:10:00',38.20,38.10),
('2011-09-18 12:15:00',38.22,38.17),
('2011-09-18 12:20:00',38.21,38.20),
('2011-09-19 11:50:00',38.17,38.10),
('2011-09-19 12:55:00',38.32,38.27),
('2011-09-19 12:00:00',38.30,38.20);
Then your update is this simple, and the plan is much nicer:
;WITH g(RoundedDate,MaxTemp)
AS
(
SELECT RoundedDate, MAX(MaxTemp)
FROM dbo.d
GROUP BY RoundedDate
)
UPDATE d
SET MaxTemp = g.MaxTemp
FROM dbo.d AS d
INNER JOIN g
ON d.RoundedDate = g.RoundedDate;
Finally, one of the reasons your existing query is probably taking so long is that you are updating all of time, every time. Is data from last week changing? Probably not. So why not limit the WHERE clause to recent data only? I see no need to go recalculate anything earlier than yesterday unless you are constantly receiving revised estimates of how warm it was last Tuesday at noon. So why are there no WHERE clauses on your current query, to limit the date range where it is attempting to do this work? Do you really want to update the WHOLE table, EVERY time? This is probably something you should only be doing once a day, sometime in the afternoon, to update yesterday. So whether it takes 2 seconds or 2.5 seconds shouldn't really matter.

You may need to use -12 instead, depending on whether the date is the start date or the end date of the noon-to-noon interval.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
group by tableAx.Date
Because of the 12 hour offset, I am not sure how much you would gain by putting TableB Date plus Time in a DateTime field directly. You cannot get away from the DATEADD, and the output from a function is not indexed even if the parameters going into the function are indexed. What you might be able to do is create a computed column that = date + time +/- 12h and index that column.
Like the recommendation from Arron to only update those without values.
update tableA
set tableAx.MaxTemp = MAX(TableB.HighTemp)
from tableA as tableAx
join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where tableAx.MaxTemp is null
group by tableAx.Date
or an insert of new dates
insert into tableA (date, MaxTemp)
select CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date) as [date] , MAX(TableB.HighTemp) as [MaxTemp]
from tableA as tableAx
right outer join TableB
on tableAx.Date = CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
where TableB.Date is null
group by CAST(DATEADD(hh,12,TableB.[Date]+TableB.[Time]) as Date)
