No blocking, good execution plan, slow query: why? - sql-server

I have a query that occasionally takes several minutes to complete. Several processes are running concurrently but there is no blocking (I'm running an extended events session, I can see blocking of other transactions, so the query to inspect the logged events is working).
Looking at the query plan cache, the execution plan is a good one: running it in SSMS, it takes less than 100 IOs, and there are no table or index scans.
There is the possibility that the users are getting a different plan, but if I add hints to use scans on all tables (and some are fairly large), it still returns in around 1 second. So the worst possible execution plan still wouldn't result in a query that takes several minutes.
Having ruled out blocking and a bad execution plan, what else can make a query slow?
One thing worth pointing out is that SQL Server uses an indexed view we have created, although the code doesn't reference it (we're using SQL Server Enterprise). That indexed view has a covering index to support the query and it is being used - again, the execution plan is very good. The original query is using NOLOCK, and I observed that no locks are taken on any rows or pages of the indexed view either (so SQL Server respects our locking hints, even though it's accessing an indexed view instead of the underlying tables - good). This makes sense, otherwise I would have expected to see blocking.
We are using indexed views in some other queries but we reference them in SQL code (and specify NOLOCK, NOEXPAND). I've not seen any problems with those queries, and I'm not aware that there should be any difference between indexed views that we tell the optimizer to use and indexed views that the optimizer itself decides to use, but what I'm seeing suggests that there is.
Any thoughts? Anything else I should be looking at?
This is the query:
execute sp_executesql
N'SELECT DISTINCT p.policy_id
, p.name_e AS policy_name_e
, p.name_l AS policy_name_l
FROM patient_visit_nl_view AS pv
INNER JOIN swe_cashier_transaction_nl_view AS ct ON ct.patient_visit_id = pv.patient_visit_id
AND ct.split_date_time IS NOT NULL
INNER JOIN ar_invoice_nl_view AS ai ON ai.ar_invoice_id = ct.invoice_id
AND ai.company_code = ''KOC''
AND ai.transaction_status_rcd = ''TEMP''
INNER JOIN policy_nl_view p ON p.policy_id = ai.policy_id
WHERE pv.patient_id = @pv__patient_id'
, N' @pv__patient_id uniqueidentifier'
, @pv__patient_id = '5D61EDF1-7542-11E8-BFCB-D89EF37315A2'
Note: views with suffix _nl_view select from the table with NOLOCK (the idea is we can change this in future without affecting the business tier code).
You can see the query plan here: https://www.brentozar.com/pastetheplan/?id=HJI9Lj_WH
IO stats:
Table 'policy'. Scan count 0, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice_cashier_transaction_visit_iview'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Locks taken (IS locks on the objects involved, nothing else):
Below the relevant part of the indexed view:
CREATE VIEW dbo.ar_invoice_cashier_transaction_visit_iview WITH SCHEMABINDING
AS
SELECT ai.ar_invoice_id
, ai.company_code
, ai.policy_id
, ai.transaction_status_rcd
, ct.cashier_transaction_id
, pv.patient_id
-- more columns
FROM dbo.ar_invoice AS ai
INNER JOIN dbo.swe_cashier_transaction AS ct ON ct.invoice_id = ai.ar_invoice_id AND ct.split_date_time IS NOT NULL
INNER JOIN dbo.patient_visit AS pv ON pv.patient_visit_id = ct.patient_visit_id
CREATE UNIQUE CLUSTERED INDEX XPKar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (ar_invoice_id, cashier_transaction_id)
CREATE INDEX XIE4ar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (patient_id, transaction_status_rcd, company_code) INCLUDE (policy_id)
So far so good.
But every few days (and not at the same time of day), things go pear-shaped, the query takes minutes and actually times out (the command timeout of the provider is set to 10 minutes). When this happens, there is no blocking. I have an extended event session and this is my query
DECLARE @event_xml xml;
SELECT @event_xml = CONVERT(xml, target_data)
FROM sys.dm_xe_sessions AS s
INNER JOIN sys.dm_xe_session_targets AS t ON s.address = t.event_session_address
WHERE s.name = 'Blocking over 10 seconds'
SELECT DATEADD(hour, DATEDIFF(hour, GETUTCDATE(), GETDATE()), R.c.value('@timestamp', 'datetime')) AS time_stamp
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process)[1]/@spid', 'int') AS blocked_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocked_inputbuf
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/@waitresource)[1]', 'varchar(max)') AS wait_resource
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process)[1]/@spid', 'int') AS blocking_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocking_inputbuf
, R.c.query('.')
FROM @event_xml.nodes('/RingBufferTarget/event') AS R(c)
ORDER BY R.c.value('@timestamp', 'datetime') DESC
This query is returning other cases of blocking, so I believe it's correct. At the time the problem (the timeouts) occur, there are no cases of blocking involving the query above, or any other query.
Since there is no blocking, I'm looking at the possibility of a bad query plan. I didn't find a bad plan in the cache (I had already recommended an sp_recompile on one of the tables before I was given remote access), so I tried to think of the worst possible one: scans for every table. Applying the relevant options, here are the IO stats for this query:
Table 'patient_visit'. Scan count 1, logical reads 4559, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'swe_cashier_transaction'. Scan count 9, logical reads 24840, physical reads 0, read-ahead reads 23660, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice'. Scan count 9, logical reads 21247, physical reads 0, read-ahead reads 7074, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'policy'. Scan count 9, logical reads 271, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
And here is the execution plan: https://www.brentozar.com/pastetheplan/?id=rJr29s_br
The customer has a beefy SQL Server 2012 box, plenty of cores (maxdop is set to 8), tons of memory. It eats this bad query for breakfast (takes around 350 msec).
For completeness, here are the row counts of the tables involved:
ar_invoice: 2363527
swe_cashier_transaction: 2946514
patient_visit: 654976
policy: 1038
ar_invoice_cashier_transaction_visit_iview: 1999609
I also ran the query for a patient_id that returns the most rows, and for a patient_id that didn't exist (i.e. 0 rows). I ran these with the recompile option: in both cases the optimizer selected the same (good) execution plan.
So back to the question: there is no blocking, the query plan seems to be good (and even if it was bad, it wouldn't be bad to the extent this query takes 10 minutes), so what can cause this ?
The only thing a little unusual here is that, although the SQL doesn't select from the indexed view, the optimizer uses it anyway - and this is or should be a good thing. I know the Enterprise version claims it can do this, but this is the first time I've seen it in the wild (I've seen plenty of the opposite though: referencing an indexed view in SQL, but the optimizer selects from the view's underlying tables anyway). I'm tempted to believe that this is relevant.
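One way to test whether automatic indexed-view matching is relevant would be to run the same logic against the indexed view explicitly and compare behaviour over a few days. The following is only a sketch: it assumes that dbo.policy exposes the name_e and name_l columns that policy_nl_view returns, and the variable name @patient_id is made up for the example.

```sql
-- Sketch: reference the indexed view directly instead of relying on
-- the optimizer's automatic view matching.
-- Assumption: dbo.policy has the columns name_e and name_l.
DECLARE @patient_id uniqueidentifier = '5D61EDF1-7542-11E8-BFCB-D89EF37315A2';

SELECT DISTINCT p.policy_id
     , p.name_e AS policy_name_e
     , p.name_l AS policy_name_l
FROM dbo.ar_invoice_cashier_transaction_visit_iview AS v WITH (NOEXPAND, NOLOCK)
INNER JOIN dbo.policy AS p WITH (NOLOCK) ON p.policy_id = v.policy_id
WHERE v.patient_id = @patient_id
  AND v.company_code = 'KOC'
  AND v.transaction_status_rcd = 'TEMP';
-- The split_date_time IS NOT NULL filter is already part of the view definition.
```

The opposite experiment is also possible: adding OPTION (EXPAND VIEWS) to the original statement stops the optimizer from substituting the indexed view, so the base tables are always used.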

Without knowing anything about your setup, a few other things I would check:
What is overall CPU and memory utilisation like on the box? Could there be resource contention?
If your storage is on a SAN rather than local storage, is there contention at the storage end? (This can happen if you have heavy reads/writes on the same disk arrays from different systems.)
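To check the points above while the slowdown is actually happening, it may help to look at what the long-running request is waiting on. This is a minimal sketch using sys.dm_exec_requests; the 10-second threshold is an arbitrary assumption, and the query needs VIEW SERVER STATE permission.

```sql
-- Sketch: list requests that have been running for more than 10 seconds,
-- together with their current wait and resource usage.
SELECT r.session_id
     , r.status
     , r.command
     , r.wait_type                        -- NULL when the request is not waiting
     , r.wait_time           AS wait_time_ms
     , r.last_wait_type
     , r.wait_resource
     , r.cpu_time            AS cpu_time_ms
     , r.logical_reads
     , r.reads
     , r.granted_query_memory             -- in 8 KB pages
     , r.total_elapsed_time  AS elapsed_ms
     , t.text                AS batch_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
  AND r.total_elapsed_time > 10 * 1000    -- assumed threshold: 10 seconds
ORDER BY r.total_elapsed_time DESC;
```

A request that shows high elapsed time with low cpu_time and few reads is most likely waiting on something (I/O, memory grant, the client), and wait_type should say what.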

Several other factors can slow a query down. Personally, I don't fully trust SQL Server's optimizer, so I normally recommend writing the query so that the optimizer does not have to work hard. For example, use EXISTS / IN against the main table instead of joining and then applying DISTINCT or GROUP BY. Take a query like this:
select distinct ia.AttributeCode, ia.AttributeDescription
from ItemsTable as i
inner join ItemAttributesTable as ia on i.AttributeCode = ia.AttributeCode
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearYend
Instead of running the query above, run it like this:
select ia.AttributeCode, ia.AttributeDescription
from ItemAttributesTable as ia
where ia.AttributeCode in (
select i.AttributeCode
from ItemsTable as i
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearYend
)
I am not an expert in indexing, but for the case above I think a single index on ItemsTable should be sufficient.
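The EXISTS form mentioned above can be sketched in the same way. The table variables and sample rows below are invented purely so that the example runs on its own; the column names and the three parameter names follow the queries in this answer.

```sql
-- Hypothetical sample data so the sketch is self-contained.
DECLARE @ItemAttributes TABLE (AttributeCode int PRIMARY KEY, AttributeDescription varchar(100));
DECLARE @Items TABLE (ItemId int PRIMARY KEY, AttributeCode int, Manufacturer varchar(50), MfrYear int);

INSERT INTO @ItemAttributes VALUES (1, 'Red'), (2, 'Blue'), (3, 'Green');
INSERT INTO @Items VALUES (10, 1, 'Acme', 2016), (11, 1, 'Acme', 2017),
                          (12, 2, 'Other', 2016), (13, 3, 'Acme', 2010);

DECLARE @paramMfr varchar(50) = 'Acme', @paramYearStart int = 2015, @paramYearYend int = 2018;

-- EXISTS returns each attribute row at most once, so no DISTINCT is needed.
SELECT ia.AttributeCode, ia.AttributeDescription
FROM @ItemAttributes AS ia
WHERE EXISTS (
    SELECT 1
    FROM @Items AS i
    WHERE i.AttributeCode = ia.AttributeCode
      AND i.Manufacturer = @paramMfr
      AND i.MfrYear BETWEEN @paramYearStart AND @paramYearYend
);
-- With the sample rows above this returns the single row (1, 'Red').
```

Note that this is only equivalent to the DISTINCT join when the attribute table itself has no duplicate (AttributeCode, AttributeDescription) pairs.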
Another optimization is to remove the views and query the tables directly, because the views may be joining other tables that are not needed here.
All in all, the main point is this: while the query optimizer is searching for the best plan, it can hit its time limit (the optimizer timeout). When that happens it may settle for a plan that is not good for that specific execution, which is why the plan cache should be used. That is why I recommend focusing on optimizing the query rather than on the reasons it is timing out.
Check this out as well https://blogs.msdn.microsoft.com/psssql/2018/10/19/understanding-optimizer-timeout-and-how-complex-queries-can-be-affected-in-sql-server/
Update-1:
Recommendations:
Use EXISTS / IN. Even if you see the same execution plan as for your current query, this helps the optimizer pick the correct plan almost every time.
Try eliminating the views and querying the tables directly, selecting fewer columns.
Make sure you have proper indexes defined for the given parameters.
Try breaking the query into smaller parts: for example, put the filtered data into a temporary table and then fetch the remaining details by joining to it.
Try googling "Timeout in Application not in SSMS" and see different hacks
Common causes of query timeout:
No indexing defined
Extracting too much data
Locks on one or more of the tables you are trying to read
A mismatch between parameter type and column type, for example a varchar column compared with an nvarchar parameter
Parameter sniffing

Related

Understanding Implicit Type Conversion in SQL Server

On the SQL Server documentation page, MS provides the following matrix showing what conversions are supported and not:
What would be an example in SQL of an explicit conversion and an implicit conversion?
For example, I would assume that an explicit conversion would be something like CAST('2014-01-01' AS DATE), but then it also allows odd things like converting varchar to image. Or, how could you explicitly cast a datetime to a float?
https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver16
We have a table Employee with a NationalIDNumber column defined with a NVARCHAR data type. In this query, we will use a WHERE clause to search for a specific ID.
In the query below, we have requested NationalIDNumber equal to the integer value 14417807. For SQL Server to compare these two data types, it must convert the NVARCHAR to INT, which means every value in that column must go through a conversion, and that causes a table scan.
USE AdventureWorks2016CTP3
GO
SET STATISTICS IO ON
GO
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
In the execution plan, you will see an exclamation point warning you that there is a potential issue with the query. Hovering over the SELECT operator, you will see that a CONVERT_IMPLICIT is happening, which may have prevented the optimizer from using a SEEK.
(1 row affected)
Table 'Employee'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now the question is how do we fix it. It’s really simple but it does require a code change. Let’s look back at our query.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
Remember we asked for an integer value. Just by adding single quotes to the value, we can eliminate our issue. It’s important to always know what data types your columns are when querying them. In this case, since it is an NVARCHAR, all I need to do is to supply a character value. This is accomplished by adding single quotes around the value.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = '14417807'
It’s simple to see the results. Note above the Scan count 1, logical reads 9, physical reads 0. When we rerun it we get the below.
(1 row affected)
Table 'Employee'. Scan count 0, logical reads 4, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
You can also see in the graphical plan that the warning is now gone, and we have a SEEK instead of the SCAN which is much more efficient.
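The same idea applies to parameters: the variable should be declared with the column's type. The sketch below first looks up the declared type in the catalog and then uses a matching variable. The nvarchar(15) length is an assumption about the AdventureWorks schema, so check the first result before relying on it.

```sql
-- Step 1: check the declared type of the column.
-- max_length is in bytes, so an nvarchar(15) column reports 30.
SELECT c.name AS column_name, t.name AS type_name, c.max_length
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'HumanResources.Employee')
  AND c.name = N'NationalIDNumber';

-- Step 2: use a variable of the same type (assumed here to be nvarchar(15)),
-- so no CONVERT_IMPLICIT is needed on the column side.
DECLARE @NationalIDNumber nvarchar(15) = N'14417807';

SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate, JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = @NationalIDNumber;
```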
Source

How to best use multicolumn index with value ranges in SQL Server?

I'm running SQL Server 2016 Enterprise edition.
I have a table with 24 columns and 10 indexes. Those indexes are vendor defined, so I cannot change them. I find it hard to understand how to get the best performance, because whatever I do SQL Server chooses what is, in my opinion, a poor execution plan.
The following query :
SELECT event_id
FROM Events e WITH(NOLOCK, index=[Event_By_PU_And_TimeStamp])
WHERE e.timestamp > '2022-05-12 15:00'
AND e.PU_Id BETWEEN 103 AND 186
results in this index seek:
The specified index is the clustered index and it has two columns PU_ID and Timestamp. Even though the SEEK PREDICATE lists both PU_ID and Timestamp as the used columns the "Number of rows read" is too high in my opinion. Without the index hint SQL chooses a different index for the seek with double rows-read number.
Unfortunately the order of the columns in the index is PU_ID, Timestamp, while Timestamp is the much more selective column here.
However if I change the PU_ID condition to list every possible number between the margins
PU_ID IN (103,104,105,...186)
then "Number of rows read" is exactly the number of returned rows, and the statistics output confirms better performance (validated with a Profiler trace).
Between-condition:
(632 rows affected)
Table 'Events'. Scan count 7, logical reads 139002, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
IN-condition with every number written out:
(632 rows affected)
Table 'Events'. Scan count 84, logical reads 459, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Edit: the IndexSeek for the IN-query
What is the best way to make SQL Server choose the better plan?
Do I really need to write out all possible PU_IDs in every query?
The index used is a simple two-column index, which is also the clustered index:
CREATE UNIQUE CLUSTERED INDEX [Event_By_PU_And_TimeStamp] ON [dbo].[Events]
(
[PU_Id] ASC,
[TimeStamp] ASC
)
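One possible way to get the per-PU_Id seeks of the IN list without writing out every value is to generate the id range and join to it. This is only a sketch, not a confirmed fix: it assumes PU_Id is an int, the variable names are invented, and sys.all_objects is used merely as a convenient row source for numbering. Whether the optimizer picks a nested loops join with one seek per id would have to be checked in the actual plan.

```sql
-- Sketch: build the list of PU_Id values 103..186 and join to it, so each id
-- can be looked up with a seek on (PU_Id, TimeStamp) instead of one wide range.
DECLARE @from_pu int = 103, @to_pu int = 186;

WITH pu AS (
    SELECT TOP (@to_pu - @from_pu + 1)
           CAST(@from_pu - 1 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) AS PU_Id
    FROM sys.all_objects            -- any row source with enough rows would do
)
SELECT e.event_id
FROM pu
INNER JOIN dbo.Events AS e WITH (NOLOCK)
        ON e.PU_Id = pu.PU_Id
       AND e.[TimeStamp] > '2022-05-12 15:00';
```

The CAST keeps the generated value the same type as the assumed int column, which avoids an implicit conversion in the join predicate.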

Improve performance against sys tables in SQL Server

We're experiencing performance issues in our SQL Server application (written in PHP, but we experience the same timings when running the queries in the management studio, so I don't think that's relevant).
The offending query is as follows:
SELECT
c.name, t.name AS type, c.is_nullable, c.is_identity,
object_definition(c.default_object_id) AS default_value,
c.precision, c.scale, c.max_length, c.collation_name,
CASE WHEN p.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_primary,
CASE WHEN u.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_unique
FROM
sys.columns AS c
LEFT JOIN sys.types AS t
ON c.user_type_id = t.user_type_id
LEFT JOIN (
SELECT DISTINCT
ic.object_id, ic.column_id
FROM sys.indexes ix
JOIN sys.index_columns ic
ON ix.object_id = ic.object_id
AND ix.index_id = ic.index_id
WHERE is_primary_key = 1
) AS p
ON p.object_id = c.object_id AND p.column_id = c.column_id
LEFT JOIN (
SELECT DISTINCT
ic.object_id, ic.column_id
FROM sys.indexes ix
JOIN sys.index_columns ic
ON ix.object_id = ic.object_id
AND ix.index_id = ic.index_id
WHERE is_unique = 1
) AS u
ON u.object_id = c.object_id AND u.column_id = c.column_id
WHERE
c.object_id = object_id('tblTestTable');
Testing locally on SQL Server 2014 Express we get a first-run time of about 0.3s with subsequent runs between 0.1s and 0.2s. On our production server, running the full version of SQL Server 2014, the performance is worse!
I would expect a query such as this (using the system tables) to be running much faster, e.g. in the 0.01 -> 0.05 range, which is the kind of performance we get for similar queries against our own user tables.
Should I expect a query like this to be slow?
If yes, is there an alternative, faster method of getting this information? If no, what should we do to optimise it?
I notice that these system views don't appear to have indexes. Is that a factor?
Also of relevance is that this query originally used the INFORMATION_SCHEMA views, but these were at least twice as slow as the current performance we are getting using sys (though possibly that was because the sub-selects were in the field list rather than in the join).
Note that the timings come from the properties window in the management studio, and are consistent with what I get if I microtime() the query execution from within PHP.
UPDATE #1
I have constructed a query using our user data tables, which is basically the same structure as the one above (or as close as I could get).
This query runs at about 0.14s on first run, and then between 0.015 and 0.07 on subsequent runs. This is the kind of performance I would be expecting for the sys queries. Therefore, this appears to be an issue specific to the sys tables, rather than a general server configuration issue.
I can post the query here if it would be helpful, but will hold-off for now in case it's just spammy.
UPDATE #2
As requested, here is the stats output with SET STATISTICS TIME|IO ON for the original query, from a cold cache.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(56 row(s) affected)
Table 'sysiscols'. Scan count 112, logical reads 224, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 112, logical reads 224, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 448, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syssingleobjrefs'. Scan count 0, logical reads 112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysscalartypes'. Scan count 0, logical reads 112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 210 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
System views do have indexes (the underlying system catalog tables have indexes, actually). First and foremost, start by identifying the cause of the slowness. Read How to analyse SQL Server performance. I doubt the root cause is an indexing issue (i.e. driven by the size of the data in the metadata views). It is much more probable that you are experiencing blocking, for various reasons. Scanning system views is still subject to locks, and DDL operations (create/alter/drop) will block scans until the DDL commits.
Also, just apply some common-sense query optimizations. You are filtering the result by c.object_id = object_id('tblTestTable'), but perhaps the inner queries (SELECT DISTINCT ... FROM sys.indexes ix JOIN sys.index_columns ic) cannot push down this predicate. Try forcing it, i.e. add the WHERE object_id = object_id('tblTestTable') clause to the inner queries as well.
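Applied to the query from the question, the suggestion could look like the sketch below. The variable @object_id is introduced here only to avoid repeating the OBJECT_ID call; everything else follows the original query.

```sql
DECLARE @object_id int = OBJECT_ID(N'tblTestTable');

SELECT
    c.name, t.name AS type, c.is_nullable, c.is_identity,
    OBJECT_DEFINITION(c.default_object_id) AS default_value,
    c.precision, c.scale, c.max_length, c.collation_name,
    CASE WHEN p.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_primary,
    CASE WHEN u.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_unique
FROM sys.columns AS c
LEFT JOIN sys.types AS t
    ON c.user_type_id = t.user_type_id
LEFT JOIN (
    -- Columns of the primary key index, restricted to the one table.
    SELECT DISTINCT ic.column_id
    FROM sys.indexes AS ix
    JOIN sys.index_columns AS ic
        ON ix.object_id = ic.object_id
       AND ix.index_id = ic.index_id
    WHERE ix.is_primary_key = 1
      AND ix.object_id = @object_id
) AS p
    ON p.column_id = c.column_id
LEFT JOIN (
    -- Columns of any unique index, restricted to the one table.
    SELECT DISTINCT ic.column_id
    FROM sys.indexes AS ix
    JOIN sys.index_columns AS ic
        ON ix.object_id = ic.object_id
       AND ix.index_id = ic.index_id
    WHERE ix.is_unique = 1
      AND ix.object_id = @object_id
) AS u
    ON u.column_id = c.column_id
WHERE c.object_id = @object_id;
```

Because both derived tables are already limited to a single object, the join back to sys.columns only needs column_id.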
Based on various comments in this thread and in my own research, I have concluded the following:
Q1. Should I expect a query like this to be slow?
Answer: Apparently, yes.
I reduced the query to the following, which is still really slow, by my standards (i.e. 0.11 to 0.17 seconds on a warm cache, instead of 0.0x seconds, which I would expect):
SELECT
c.name, c.is_nullable, c.is_identity,
c.precision, c.scale, c.max_length, c.collation_name
FROM
sys.columns AS c;
If the straight query, with no joins or calculated fields, is this slow then I can only conclude that it is a limitation of SQL Server rather than something strange that I am doing.
Q2. If yes, is there an alternative, faster method of getting this information? If no, what should we do to optimise it?
Answer: It appears that there is no faster method, nor are there any meaningful optimisations to be made (assuming that all columns are required).
Here is a quote from Remus Rusanu in a comment to one of the other answers:
Are you saying that these are complex views and that therefore I should therefore expect them to be slow?
I'm saying that you will get slower response than compared to a table of similar structure as the view. The underlying tables are optimized for metadata maintenance operations (find by Id and by name) and to prevent DDL excessive locking or even deadlocks. Performance of querying of the views on top of the system tables has to be 'good enough', not the main goal.
Based on this, as well as some experiments of my own, it seems unlikely that there are any further optimisations that could be made to extract this information in a faster way, either through alternative query formulation or via different lookup mechanisms.
The solution, in my case
So, given the above, the only solution for our use-case is to cache the table schema locally, via the caching mechanisms provided by our framework. This requires that we run the re-cache script every time we update the DB schema, and runs the risk of the application blowing-up massively if schema changes are made without this script being run. However, it has removed this performance bottle-neck from our application, simply by removing the need to run the query during normal use.

SQL Server does not choose to use index although everything seems to suggest it

Something is wrong here and I don't understand what. It's worth mentioning that the searched value is not in the table; for an existing value there is no problem. Still, why does the first query require a clustered key lookup for the primary key, which is not even used in the query, while the second can run on the index directly?
Forcing the query to use the index WITH(INDEX(indexname)) does work, but why does the optimizer not choose to use it by itself?
The column PIECE_NUM is not in any other index and is also not the primary key.
SET STATISTICS IO ON
DECLARE @vchEventNum VARCHAR(50)
SET @vchEventNum = '54235DDS28KC1F5SJQMWZ'
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE @vchEventNum + '%'
ORDER BY fwt.DTTM_INSERT DESC
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE '54235DDS28KC1F5SJQMWZ' + '%'
ORDER BY fwt.DTTM_INSERT DESC
SET STATISTICS IO OFF
I let both queries run in one batch:
IO statistics report:
Query 1: logical reads 16244910
Query 2: logical reads 5
Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 16244910, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The table has a non-clustered index on PIECE_NUM INCLUDING all three other columns from the query.
Here are the query execution plans(with a little editing to remove the actual names):
I noticed the convert_implicit, but that is just due to conversion of the varchar parameter to nvarchar column. Changing the parameter type did not change the behaviour of the query.
Why does the query with the parameter not use the index while replacing the parameter with its value does?
The first query is going to scan because you are using a local variable. The optimizer sees this as an "anonymous" value and therefore cannot use statistics to build a good query plan.
The second query seeks because it uses a literal value, so SQL Server can look into its statistics and estimate much better how many rows it will find with that value.
If you run your first query as follows I would imagine you will see it use the better plan:
DECLARE @vchEventNum VARCHAR(50)
SET @vchEventNum = '54235DDS28KC1F5SJQMWZ'
SELECT TOP 1
fwt.WEIGHT,
fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE @vchEventNum + '%'
ORDER BY fwt.DTTM_INSERT DESC
OPTION(RECOMPILE)
I would suggest using a parameterized procedure to run this code to ensure that it uses a cached plan. Using the RECOMPILE hint has its own drawbacks, as the optimizer will need to rebuild the plan every time it runs. So if you run this code very often I would avoid this hint.
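If a stored procedure is not an option, a parameterized batch through sp_executesql behaves similarly: the value arrives as a real parameter, so it can be sniffed when the plan is compiled, and the plan is cached for reuse. A minimal sketch, with the parameter name @prefix chosen for illustration and nvarchar assumed because the question describes the column as nvarchar:

```sql
DECLARE @vchEventNum nvarchar(50) = N'54235DDS28KC1F5SJQMWZ';

-- The value is passed as a true parameter rather than a local variable,
-- so the optimizer can use its value when compiling the statement.
EXEC sys.sp_executesql
    N'SELECT TOP (1)
             fwt.WEIGHT,
             fwt.TEST_RESULT
      FROM FIN_WEIGHT_TESTS AS fwt WITH (NOLOCK)
      WHERE fwt.PIECE_NUM LIKE @prefix + N''%''
      ORDER BY fwt.DTTM_INSERT DESC;',
    N'@prefix nvarchar(50)',
    @prefix = @vchEventNum;
```

The trade-off is the usual one for sniffed parameters: the plan compiled for the first value is reused for later values, whether or not it suits them.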
You can read about local variables here:https://www.brentozar.com/archive/2014/06/tuning-stored-procedures-local-variables-problems/
I think the cause is the combination of the @variable and the ORDER BY in your query. To test my guess, remove the ORDER BY from your query; it may lead to equal plans in both cases (this time with a different estimated number of rows reported in the SELECT).
As mentioned in the previous answer, local variables cannot be sniffed at compilation time because the batch is compiled as a whole. Only the RECOMPILE option lets the server know the value of a variable at compilation time, because the recompilation starts when the variable is already assigned. This leads to an "estimate for unknown" in the first case: statistics cannot be used because the value in the filter is not known, so more output rows are estimated.
But the query has TOP + ORDER BY in it. This means that if many rows are expected, all the filtered rows must be sorted just to return the first one ordered by DTTM_INSERT DESC. In fact, if you look at the second plan you will see that the SORT operator is the most expensive one. When you use the constant, SQL Server uses the statistics and finds out that only one row will be returned, so it can afford to sort the result.
When many rows are expected, it decides to use an index that is already ordered by DTTM_INSERT. This is only my guess because you did not post the creation scripts for your indexes, but from the plan I can see that the first plan goes to the clustered index to fetch the columns that are missing from the non-clustered index. This means it is not the same non-clustered index that is used in the second case, and I'm sure the index chosen in the first case has DTTM_INSERT as its leading key column. By doing so the server eliminates the sort that we see in the second plan.

To get the number of orders placed on the current day

I am using sql-server 2012
The query is
1. select count(*) from table where orderti=getdate()
2. select count(*) from table where orderti>=convert(date,getdate()) and orderti<dateadd(day,convert(date,getdate())
the table structure is:
sales(orderti datetime)
non clustered index on orderti.
I want to know what is the difference in writing styles of 2 queries mentioned above.
which one is efficient ?
any help?
Thanks,
Chio
Based on your question
Query 1
select count(*) from table where orderti=getdate()
This query will not give you the orders for the current day because
orderti is datetime and will contain the time portion as well
getdate() contains both the current date and time as well
What this query is trying to do is get all orders with orderti equal to the current date and time, which is not what you require.
The query which you are looking for is this
select count(*) from table where CONVERT(DATE,orderti)=CONVERT(DATE,getdate())
Query 2
The query you are looking for is
select count(*) from table where orderti>=convert(date,getdate()) and orderti< dateadd(day,1,convert(date,getdate()))
On your question
which one is efficient ?
Based on the statistics and execution plan, both queries do an index seek and are equally efficient.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Update: With around 650k records there are variations in the statistics however the plan remains the same.
Table 'sales'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I would recommend Query 2 as it has fewer logical reads and doesn't have a CAST/CONVERT on the column.
I would prefer option 2 for a couple of reasons:
As standard practice, it is better not to wrap filtered columns in functions. That does not hurt in this case, of course, because the optimizer in the 2012 version is clever about it.
Although the 2012 optimizer uses an index seek, that doesn't mean it saves the CPU cycles spent on the CAST.
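The difference between the two writing styles can be seen with a small, self-contained sketch. The table variable and its three rows are invented for the demonstration; only the column name orderti comes from the question.

```sql
-- Hypothetical sample data: one order today, one yesterday, one tomorrow.
DECLARE @today datetime = CONVERT(datetime, CONVERT(date, GETDATE()));
DECLARE @sales TABLE (orderti datetime NOT NULL);

INSERT INTO @sales (orderti)
VALUES (DATEADD(MINUTE, 1, @today)),    -- today at 00:01
       (DATEADD(SECOND, -1, @today)),   -- yesterday at 23:59:59
       (DATEADD(DAY, 1, @today));       -- tomorrow at 00:00

-- Style 1: equality against the current date AND time.
-- Matches only a row stamped with exactly this instant, so in practice 0 rows.
SELECT COUNT(*) AS equality_match
FROM @sales
WHERE orderti = GETDATE();

-- Style 2: half-open range [today 00:00, tomorrow 00:00).
-- Counts the single order placed today and leaves the column unwrapped.
SELECT COUNT(*) AS range_match
FROM @sales
WHERE orderti >= CONVERT(date, GETDATE())
  AND orderti <  DATEADD(DAY, 1, CONVERT(date, GETDATE()));
```

The upper bound is exclusive on purpose, so an order stamped exactly at midnight tomorrow is not counted as today's.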
