Why is selecting date slow [duplicate] - sql-server

This question already has answers here:
Performance of SQL Server 2005 Query
(3 answers)
Closed 7 years ago.
I have a table with about 1.5 million rows and a non-clustered index on date_run. Query #1 takes 0 seconds to finish and query #2 takes 3 seconds. Can someone please explain why query #2 runs slower? I also included execution plans for both. SQL Server version is 2014.
query #1
select avg14gain
from stocktrack
where
date_run >= '2013-3-21'
and date_run < '2013-3-22'
(Execution plan screenshot: http://biginkz.com/Pics/DateHardCoded.jpg)
query #2
declare @today date
declare @yesterday date
set @today = '2013-3-22'
set @yesterday = '2013-3-21'
select avg14gain
from stocktrack
where
date_run >= @yesterday
and date_run < @today
(Execution plan screenshot: http://biginkz.com/Pics/DataAsigned.jpg)

I am not sure why your query is not picking up the index, but you can use an index hint.
Try something like this:
declare @today date
declare @yesterday date
set @today = '2013-3-22'
set @yesterday = '2013-3-21'
select avg14gain
from stocktrack with (index([ix_drun]))
where
date_run >= @yesterday
and date_run < @today
Also, you can try what is suggested in this post: TSQL not using indexes.
See @Justin Dearing's answer (rebuild index / update statistics).
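A minimal sketch of those two suggestions against the question's table (ix_drun is the index name assumed by the hint above):
-- Refresh optimizer statistics on the table:
UPDATE STATISTICS stocktrack;
-- Or rebuild the non-clustered index on date_run:
ALTER INDEX ix_drun ON stocktrack REBUILD;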

Create an index on date_run, with avg14gain as an INCLUDE column on that index. That way the entire query can be satisfied from the one index, and the optimizer will see that.
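A sketch of that covering index (the index name here is arbitrary):
-- date_run is the seek key; avg14gain is carried in the index leaf pages,
-- so the query never has to touch the base table.
CREATE NONCLUSTERED INDEX ix_stocktrack_date_run_incl
ON stocktrack (date_run)
INCLUDE (avg14gain);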

Related

SQL getdate() - not the same in one statement

The GETDATE() function always returns the same value anywhere within one statement.
However, on one SQL Server 2017 instance, I'm seeing otherwise.
To set this up, create a table and put two rows into it:
CREATE TABLE Test
(
TestDate datetime2(0) NULL,
OtherValue varchar(5) NULL
)
INSERT INTO Test (OtherValue) VALUES ('x')
INSERT INTO Test (OtherValue) VALUES ('x')
Then run this query a number of times:
SELECT
CASE
WHEN GETDATE() < COALESCE(TestDate, GETDATE())
THEN 'less'
WHEN GETDATE() > COALESCE(TestDate, GETDATE())
THEN 'greater'
ELSE 'same'
END [Compare]
FROM
Test
Both rows always return matching results.
When I do this in SQL Server 2008 R2 (v10.50) and other SQL Server 2017 machines, the result is always 'same'.
However, on one of my SQL Server 2017 instances, it varies randomly between 'same', 'less' and 'greater':
Why is this happening? Is there a server setting that can cause this?
Edit:
Using SYSDATETIME in place of GETDATE works as expected on the 'bad' server, always returning 'same'.
Edit #2:
If I test GETDATE as above against a column defined as DATETIME (which is the type GETDATE() returns), then it works as expected. So it seems to be related to the conversion between DATETIME and DATETIME2.
Interesting enough question.
The behaviour in your example can be explained by the following:
Since SQL Server 2016, datetime rounding has changed. In short: from SQL Server 2016 on, the value is not rounded before the comparison; the comparison runs against the raw value. Before SQL Server 2016, the value was rounded first and then compared.
By default, a comparison between datetime and datetime2 is performed by converting the datetime value to datetime2(7). You can see that in the execution plan.
A datetime value ending in 3 - for example .003 - gets converted to .0033333, and .007 gets converted to .0066667.
And the most interesting part: nanoseconds. During the comparison SQL Server uses 8 (or more!) digits in the fractional part. Here are two examples to illustrate.
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
select datepart(NANOSECOND, @DateTime) as "DateTimeRes",
datepart(nanosecond, @DateTime2) as "DateTime2Res"
go
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
select datepart(NANOSECOND, @DateTime) as "DateTimeRes",
datepart(nanosecond, @DateTime2) as "DateTime2Res"
Results:
+-------------+--------------+
| DateTimeRes | DateTime2Res |
+-------------+--------------+
|     3333333 |      3333300 |
+-------------+--------------+
+-------------+--------------+
| DateTimeRes | DateTime2Res |
+-------------+--------------+
|     6666666 |      6666670 |
+-------------+--------------+
I took it all from this article.
Also, there is a similar question on SO.
I believe this behaviour is independent of your server's performance (virtual machine, etc.).
Good luck!
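As a hedged workaround sketch (not from the article): casting GETDATE() to the column's own datetime2(0) precision avoids the implicit widening to datetime2(7), and the question's edit notes that SYSDATETIME() also behaves as expected.
SELECT
    CASE
        WHEN CAST(GETDATE() AS datetime2(0)) < COALESCE(TestDate, CAST(GETDATE() AS datetime2(0))) THEN 'less'
        WHEN CAST(GETDATE() AS datetime2(0)) > COALESCE(TestDate, CAST(GETDATE() AS datetime2(0))) THEN 'greater'
        ELSE 'same'
    END AS [Compare]
FROM Test   -- both sides now compare as datetime2(0), so no extra fractional digits appear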
Turns out the behaviour of getdate changed from SQL 2000 to SQL 2005.
See https://stackoverflow.com/a/3620119/32429 explaining the old behaviour:
In practice, GETDATE() is only evaluated once for each expression
where it is used -- at execution time rather than compile time.
However, Microsoft puts rand() and getdate() into a special category,
called non-deterministic runtime constant functions.
and the following discussion:
In SQL 2000 if you did something like this
INSERT INTO tbl (fields, LOADDATE) SELECT fields, GETDATE() FROM tblb
you would get the same date/time for all records inserted.
This same command, in SQL 2005, reruns GETDATE() for every single
record selected from tblb and gives you potentially unique values for
each record. Also causes HUGE performance problems if you are
inserting say, 17 million rows at a time.
This has caused me many a headache, as we use this code to do batch
date/times in many tables. This was a very simple way to back out a
"batch" of transactions, because everything had the same date/time.
Now in 2005, that is not true.
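A hedged sketch of the usual workaround for that 2005 behaviour (table and column names taken from the quoted example): capture GETDATE() once into a variable and reuse it, so every row in the batch shares the same timestamp.
DECLARE @BatchDate datetime;
SET @BatchDate = GETDATE();        -- evaluated once, before the insert

INSERT INTO tbl (fields, LOADDATE)
SELECT fields, @BatchDate          -- every inserted row gets the same batch date/time
FROM tblb;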

Correct week number from datepart [duplicate]

This question already has answers here:
Isoweek in SQL Server 2005
(4 answers)
Closed 7 years ago.
This code returns 1:
set datefirst 1
select datepart(wk,'2016-01-01')
but
set datefirst 1
select datepart(wk,'2015-12-31')
returns 53 :/
But in fact this is the same week. There are more days belonging to 2015 in this week, so it should be "53" or "1" (the same value for both dates) for any date in this particular week. Is it possible to achieve this without building a dedicated procedure to analyse the date and adjust the returned value?
I use SQL Server 2005
You probably want iso_week:
set datefirst 1
select datepart(iso_week,'2015-12-31') --53
select datepart(iso_week,'2016-01-01') --53
LiveDemo
EDIT:
It looks like iso_week is only supported from SQL Server 2008 onwards.
Is it possible to achieve this without building a dedicated procedure to analyse the date and adjust the returned value?
Probably you need to write custom code to calculate it.
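A hedged sketch of one common formula that works on SQL Server 2005 (it relies on SET DATEFIRST 1; the ISO week of a date is the week that contains that week's Thursday):
SET DATEFIRST 1;    -- Monday is day 1

DECLARE @d datetime;
SET @d = '2015-12-31';

-- Day-of-year of the Thursday in @d's Monday-to-Sunday week, folded into a week number.
SELECT (DATEPART(dy, DATEADD(day, 4 - DATEPART(dw, @d), @d)) - 1) / 7 + 1 AS iso_week;
-- '2015-12-31' -> 53, '2016-01-01' -> 53, '2016-01-04' -> 1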

SQL Server Query Performance with Timestamp and variable

I have a simple SQL query to count the number of telemetry records by clients within the last 24 hours.
With an index on TimeStamp, the following query runs in less than 1 second for about 10k rows:
select MachineName,count(Message) from Telemetry where TimeStamp between DATEADD(HOUR,-24, getutcdate()) and getutcdate() group by MachineName
However, when I tried to make the hard-coded -24 configurable by adding a variable, the query took more than 5 minutes to execute.
DECLARE @cutoff int; SET @cutoff = 24
select MachineName,count(Message) from Telemetry where TimeStamp between DATEADD(HOUR, -1*@cutoff, getutcdate()) and getutcdate() group by MachineName
Is there any specific reason for the significant decrease of performance? What's the best way of adding a variable without impacting performance?
My guess is that you also have an index on MachineName - or that SQL is deciding that since it needs to group by MachineName, that would be a better way to access the records.
Updating statistics as suggested by AngularRat is a good start - but SQL often maintains those automatically. (In fact, the good performance when SQL knows the 24-hour interval in advance is evidence that the statistics are good... but when SQL doesn't know the size of the BETWEEN range in advance, it thinks other approaches might be a better idea.)
Given:
CREATE TABLE Telemetry (MachineName sysname, Message varchar(88), [TimeStamp] datetime2)
CREATE INDEX Telemetry_TS ON Telemetry([TimeStamp]);
First, try the OPTION (OPTIMIZE FOR (@cutoff = 24)) clause to let SQL know how to approach the query; if that is insufficient, then try WITH (INDEX(Telemetry_TS)). Using the INDEX hint is less desirable.
DECLARE @cutoff int = 24;
select MachineName,count(Message)
from Telemetry -- WITH (Index( Telemetry_TS))
where TimeStamp between DATEADD(HOUR, -1*@cutoff, getutcdate()) and getutcdate()
group by MachineName
OPTION (OPTIMIZE FOR ( @cutoff = 24 ));
Your parameter should actually work, but you MIGHT be seeing an issue where the database is using out-of-date statistics for the query plan. I'd try updating statistics for the table you are querying. Something like:
UPDATE STATISTICS TableName;
Additionally, if your code is running from within a stored procedure, you might want to recompile the procedure. Something like:
EXEC sp_recompile N'ProcedureName';
A lot of times when I have a query that seems like it should run a lot faster but isn't, it's an out-of-date statistics/query plan issue.
References:
https://msdn.microsoft.com/en-us/library/ms187348.aspx
https://msdn.microsoft.com/en-us/library/ms190439.aspx

Why is a T-SQL variable comparison slower than GETDATE() function-based comparison?

I have a T-SQL statement that I am running against a table with many rows. I am seeing some strange behavior. Comparing a DateTime column against a precalculated value is slower than comparing each row against a calculation based on the GETDATE() function.
The following SQL takes 8 secs:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO
DECLARE @TimeZoneOffset int = -(DATEPART("HH", GETUTCDATE() - GETDATE()))
DECLARE @LowerTime DATETIME = DATEADD("HH", ABS(@TimeZoneOffset), CONVERT(VARCHAR, GETDATE(), 101) + ' 17:00:00')
SELECT TOP 200 Id, EventDate, Message
FROM Events WITH (NOLOCK)
WHERE EventDate > @LowerTime
GO
This alternate strangely returns instantly:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO
SELECT TOP 200 Id, EventDate, Message
FROM Events WITH (NOLOCK)
WHERE EventDate > GETDATE()-1
GO
Why is the second query so much faster?
EDITED: I updated the SQL to accurately reflect other settings I am using
After doing a lot of reading and researching, I've discovered that the issue here is parameter sniffing. SQL Server attempts to determine how best to use indexes based on the WHERE clause, but in this case it isn't doing a very good job.
See the examples below :
Slow version:
declare @dNow DateTime
Select @dNow = GetDate()
Select *
From response_master_Incident rmi
Where rmi.response_date between DateAdd(hh,-2,@dNow) AND @dNow
Fast version:
Select *
From response_master_Incident rmi
Where rmi.response_date between DateAdd(hh,-2,GetDate()) AND GetDate()
The "Fast" version runs around 10x faster than the slow version. The Response_Date field is indexed and is a DateTime type.
The solution is to tell SQL Server how best to optimise the query. Modifying the example as follows to include the OPTIMIZE FOR option resulted in it using the same execution plan as the "fast" version. The OPTIMIZE FOR option here explicitly tells SQL Server to treat the local @dNow variable as a date (as if declaring it as DateTime wasn't enough :s ).
Care should be taken when doing this however because in more complicated WHERE clauses you could end up making the query perform worse than Sql Server's own optimisations.
declare @dNow DateTime
SET @dNow = GetDate()
Select ID, response_date, call_back_phone
from response_master_Incident rmi
where rmi.response_date between DateAdd(hh,-2,@dNow) AND @dNow
-- The optimizer does not know much about the variable, so it assumes it should perform a clustered index scan (on the clustered index ID) - this is slow
-- This hint tells the optimizer that the variable is indeed a datetime in this format (why it does not know that already, who knows)
OPTION(OPTIMIZE FOR (@dNow = '99991231'));
The execution plans must be different, because SQL Server does not evaluate the value of the variable when it creates the execution plan; the variable is only resolved at execution time. So it uses average statistics over all the different dates that can be stored in the table.
On the other hand, the value of the getdate function is available when the plan is created, so the execution plan is built using statistics for that specific date, which of course are more realistic than the previous ones.
If you create a stored procedure with @LowerTime as a parameter, you will get better results.
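A minimal sketch of that idea, using the question's Events query (the procedure name here is made up):
CREATE PROCEDURE dbo.GetRecentEvents   -- hypothetical name
    @LowerTime datetime
AS
BEGIN
    SET NOCOUNT ON;
    SELECT TOP 200 Id, EventDate, Message
    FROM Events WITH (NOLOCK)
    WHERE EventDate > @LowerTime;
END
GO

-- The parameter value is visible to the optimizer at compile time, so it can be sniffed:
DECLARE @LowerTime datetime = DATEADD(day, -1, GETDATE());
EXEC dbo.GetRecentEvents @LowerTime = @LowerTime;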

SQL Server Query: Fast with Literal but Slow with Variable

I have a view that returns 2 ints from a table using a CTE. If I query the view like this it runs in less than a second
SELECT * FROM view1 WHERE ID = 1
However if I query the view like this it takes 4 seconds.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
I've checked the two query plans: the first query performs a clustered index seek on the main table, returning 1 record, and then applies the rest of the view query to that result set, whereas the second query performs an index scan, returning about 3000 records rather than just the one I'm interested in, and only filters the result set later.
Is there anything obvious that I'm missing to try to get the second query to use the Index Seek rather than an index scan. I'm using SQL 2008 but anything I do needs to also run on SQL 2005. At first I thought it was some sort of parameter sniffing problem but I get the same results even if I clear the cache.
Probably it is because in the parameter case, the optimizer cannot know that the value is not null, so it needs to create a plan that returns correct results even when it is. If you have SQL Server 2008 SP1 you can try adding OPTION(RECOMPILE) to the query.
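A minimal sketch of that suggestion against the question's query:
DECLARE @id INT = 1
-- OPTION (RECOMPILE) compiles the statement with the current value of @id,
-- letting the optimizer pick the seek (SQL Server 2008 SP1 or later).
SELECT * FROM View1 WHERE ID = @id OPTION (RECOMPILE)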
You could add an OPTIMIZE FOR hint to your query, e.g.
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id OPTION (OPTIMIZE FOR (@id = 1))
In my case, the DB table column was defined as VarChar while the parameter in the parameterized query was defined as NVarChar. This introduced a CONVERT_IMPLICIT in the actual execution plan to match the data types before comparing, and that was the culprit for the slow performance: 2 sec vs 11 sec. Just correcting the parameter type made the parameterized query as fast as the non-parameterized version.
One possible way to do that is to CAST the parameters, as such:
SELECT ...
FROM ...
WHERE name = CAST(:name AS varchar)
Hope this may help someone with similar issue.
I ran into this problem myself with a view that ran in < 10 ms with a direct assignment (WHERE UtilAcctId = 12345), but took over 100 times as long with a variable assignment (WHERE UtilAcctId = @UtilAcctId).
The execution plan for the latter was no different than if I had run the view on the entire table.
My solution didn't require tons of indexes, optimizer hints, or a long statistics update.
Instead I converted the view into a user-defined table function where the parameter was the value needed in the WHERE clause. In fact this WHERE clause was nested 3 queries deep and it still worked, and it was back to the < 10 ms speed.
Eventually I changed the parameter to be a TYPE that is a table of UtilAcctIds (int). Then I can limit the WHERE clause to a list from the table.
WHERE UtilAcctId = [parameter-List].UtilAcctId.
This works even better. I think the user-table-functions are pre-compiled.
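A hedged sketch of the table-function approach; the function name and body below are placeholders, since the real view definition isn't shown:
-- Hypothetical inline table-valued function; BaseTable stands in for the view's real query.
CREATE FUNCTION dbo.fn_View1 (@UtilAcctId int)
RETURNS TABLE
AS
RETURN
(
    SELECT *
    FROM BaseTable
    WHERE UtilAcctId = @UtilAcctId
);
GO

SELECT * FROM dbo.fn_View1(12345);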
When SQL starts to optimize the query plan for the query with the variable it will match the available index against the column. In this case there was an index so SQL figured it would just scan the index looking for the value. When SQL made the plan for the query with the column and a literal value it could look at the statistics and the value to decide if it should scan the index or if a seek would be correct.
Using the optimize hint and a value tells SQL that “this is the value which will be used most of the time so optimize for this value” and a plan is stored as if this literal value was used. Using the optimize hint and the sub-hint of UNKNOWN tells SQL you do not know what the value will be, so SQL looks at the statistics for the column and decides what, seek or scan, will be best and makes the plan accordingly.
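For the question's query, the two hint variants described above would look like this:
DECLARE @id INT = 1

-- Plan is built as if the literal 1 had been used:
SELECT * FROM View1 WHERE ID = @id OPTION (OPTIMIZE FOR (@id = 1))

-- Plan is built from the column's overall statistics, with no assumed value:
SELECT * FROM View1 WHERE ID = @id OPTION (OPTIMIZE FOR (@id UNKNOWN))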
I know this is long since answered, but I came across this same issue and have a fairly simple solution that doesn't require hints, statistics-updates, additional indexes, forcing plans etc.
Based on the comment above that "the optimizer cannot know that the value is not null", I decided to move the values from a variable into a table:
Original Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
SELECT * FROM ...
WHERE
C.CreateDtTm >= @StartTime
AND C.CreateDtTm < @EndTime
New Code:
declare @StartTime datetime2(0) = '10/23/2020 00:00:00'
declare @EndTime datetime2(0) = '10/23/2020 01:00:00'
CREATE TABLE #Times (StartTime datetime2(0) NOT NULL, EndTime datetime2(0) NOT NULL)
INSERT INTO #Times(StartTime, EndTime) VALUES(@StartTime, @EndTime)
SELECT * FROM ...
WHERE
C.CreateDtTm >= (SELECT MAX(StartTime) FROM #Times)
AND C.CreateDtTm < (SELECT MAX(EndTime) FROM #Times)
This performed instantly, as opposed to several minutes for the original code (obviously your results may vary).
I assume if I changed my data type in my main table to be NOT NULL, it would work as well, but I was not able to test this at this time due to system constraints.
Came across this same issue myself and it turned out to be a missing index involving a (left) join on the result of a subquery.
select *
from foo A
left outer join (
select x, count(*) as cnt
from bar
group by x
) B on A.x = B.x
Added an index named bar_x for bar.x
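The index itself would be something along these lines (a sketch, matching the names in the answer):
CREATE INDEX bar_x ON bar (x);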
Instead of
DECLARE @id INT = 1
SELECT * FROM View1 WHERE ID = @id
do this:
DECLARE @sql varchar(max)
SET @sql = 'SELECT * FROM View1 WHERE ID = ' + CAST(@id as varchar)
EXEC (@sql)
Solves your problem
