Improve performance against sys tables in SQL Server

We're experiencing performance issues in our SQL Server application (it's written in PHP, but we see the same timings when running the queries in Management Studio, so I don't think that's relevant).
The offending query is as follows:
SELECT
    c.name, t.name AS type, c.is_nullable, c.is_identity,
    object_definition(c.default_object_id) AS default_value,
    c.precision, c.scale, c.max_length, c.collation_name,
    CASE WHEN p.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_primary,
    CASE WHEN u.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_unique
FROM
    sys.columns AS c
    LEFT JOIN sys.types AS t
        ON c.user_type_id = t.user_type_id
    LEFT JOIN (
        SELECT DISTINCT
            ic.object_id, ic.column_id
        FROM sys.indexes ix
        JOIN sys.index_columns ic
            ON ix.object_id = ic.object_id
            AND ix.index_id = ic.index_id
        WHERE is_primary_key = 1
    ) AS p
        ON p.object_id = c.object_id AND p.column_id = c.column_id
    LEFT JOIN (
        SELECT DISTINCT
            ic.object_id, ic.column_id
        FROM sys.indexes ix
        JOIN sys.index_columns ic
            ON ix.object_id = ic.object_id
            AND ix.index_id = ic.index_id
        WHERE is_unique = 1
    ) AS u
        ON u.object_id = c.object_id AND u.column_id = c.column_id
WHERE
    c.object_id = object_id('tblTestTable');
Testing locally on SQL Server 2014 Express, we get a first-run time of about 0.3s, with subsequent runs between 0.1s and 0.2s. On our production server, running the full version of SQL Server 2014, the performance is worse!
I would expect a query such as this (using the system tables) to run much faster, e.g. in the 0.01s to 0.05s range, which is the kind of performance we get for similar queries against our own user tables.
Should I expect a query like this to be slow?
If yes, is there an alternative, faster method of getting this information? If no, what should we do to optimise it?
I notice that these system views don't appear to have indexes. Is that a factor?
Also of relevance: this query originally used the INFORMATION_SCHEMA views, but those were at least twice as slow as what we are currently getting using sys (though possibly that was because the sub-selects were in the field list rather than in the joins).
Note that the timings come from the properties window in Management Studio, and are consistent with what I get if I microtime() the query execution from within PHP.
UPDATE #1
I have constructed a query using our user data tables, which is basically the same structure as the one above (or as close as I could get).
This query runs at about 0.14s on first run, and then between 0.015 and 0.07 on subsequent runs. This is the kind of performance I would be expecting for the sys queries. Therefore, this appears to be an issue specific to the sys tables, rather than a general server configuration issue.
I can post the query here if it would be helpful, but will hold off for now in case it's just spammy.
UPDATE #2
As requested, here is the stats output with SET STATISTICS TIME|IO ON for the original query, from a cold cache.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(56 row(s) affected)
Table 'sysiscols'. Scan count 112, logical reads 224, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysidxstats'. Scan count 112, logical reads 224, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 0, logical reads 448, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syssingleobjrefs'. Scan count 0, logical reads 112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysscalartypes'. Scan count 0, logical reads 112, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 210 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

System views do have indexes (the underlying system catalog tables have indexes, actually). First and foremost, start by identifying the cause of the slowness. Read How to analyse SQL Server performance. I would somewhat doubt this is root-caused by an indexing issue (i.e. driven by size-of-data in the metadata views). It is much more probable that you are experiencing blocking, for various reasons: scanning system views is still subject to locks, and DDL operations (create/alter/drop) will block such scans until the DDL commits.
Also, just apply some common-sense query optimizations. You are filtering the result by c.object_id = object_id('tblTestTable'), but perhaps the inner queries (SELECT DISTINCT ... FROM sys.indexes ix JOIN sys.index_columns ic) cannot have this predicate pushed down into them. Try forcing it, i.e. add a WHERE ix.object_id = object_id('tblTestTable') clause to the inner queries as well.
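A sketch of that rewrite (the same query, with the filter repeated inside each derived table):

SELECT
    c.name, t.name AS type, c.is_nullable, c.is_identity,
    object_definition(c.default_object_id) AS default_value,
    c.precision, c.scale, c.max_length, c.collation_name,
    CASE WHEN p.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_primary,
    CASE WHEN u.column_id IS NOT NULL THEN 1 ELSE 0 END AS is_unique
FROM
    sys.columns AS c
    LEFT JOIN sys.types AS t
        ON c.user_type_id = t.user_type_id
    LEFT JOIN (
        SELECT DISTINCT ic.object_id, ic.column_id
        FROM sys.indexes ix
        JOIN sys.index_columns ic
            ON ix.object_id = ic.object_id AND ix.index_id = ic.index_id
        WHERE ix.is_primary_key = 1
          AND ix.object_id = object_id('tblTestTable')   -- predicate pushed down
    ) AS p
        ON p.object_id = c.object_id AND p.column_id = c.column_id
    LEFT JOIN (
        SELECT DISTINCT ic.object_id, ic.column_id
        FROM sys.indexes ix
        JOIN sys.index_columns ic
            ON ix.object_id = ic.object_id AND ix.index_id = ic.index_id
        WHERE ix.is_unique = 1
          AND ix.object_id = object_id('tblTestTable')   -- predicate pushed down
    ) AS u
        ON u.object_id = c.object_id AND u.column_id = c.column_id
WHERE
    c.object_id = object_id('tblTestTable');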

Based on various comments in this thread and in my own research, I have concluded the following:
Q1. Should I expect a query like this to be slow?
Answer: Apparently, yes.
I reduced the query to the following, which is still really slow, by my standards (i.e. 0.11 to 0.17 seconds on a warm cache, instead of 0.0x seconds, which I would expect):
SELECT
    c.name, c.is_nullable, c.is_identity,
    c.precision, c.scale, c.max_length, c.collation_name
FROM
    sys.columns AS c;
If the straight query, with no joins or calculated fields, is this slow, then I can only conclude that it is a limitation of SQL Server rather than something strange that I am doing.
Q2. If yes, is there an alternative, faster method of getting this information? If no, what should we do to optimise it?
Answer: It appears that there is no faster method, nor are there any meaningful optimisations to be made (assuming that all columns are required).
Here is a quote from Remus Rusanu in a comment to one of the other answers:
Are you saying that these are complex views and that I should therefore expect them to be slow?
I'm saying that you will get a slower response than you would from a table of similar structure to the view. The underlying tables are optimized for metadata maintenance operations (find by ID and by name) and to prevent excessive DDL locking or even deadlocks. Performance of querying the views on top of the system tables only has to be 'good enough'; it is not the main goal.
Based on this, as well as some experiments of my own, it seems unlikely that there are any further optimisations that could be made to extract this information in a faster way, either through alternative query formulation or via different lookup mechanisms.
The solution, in my case
So, given the above, the only solution for our use case is to cache the table schema locally, via the caching mechanisms provided by our framework. This requires that we run the re-cache script every time we update the DB schema, and runs the risk of the application blowing up massively if schema changes are made without this script being run. However, it has removed this performance bottleneck from our application, simply by removing the need to run the query during normal use.
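Our cache lives in the application framework (PHP), but the same idea could be materialized server-side; a minimal sketch, with dbo.schema_cache as a hypothetical table name of our own choosing:

-- rebuild the metadata cache; to be run by the re-cache script after each schema change
IF OBJECT_ID('dbo.schema_cache') IS NOT NULL
    DROP TABLE dbo.schema_cache;

SELECT o.name AS table_name,
       c.name AS column_name,
       t.name AS type,
       c.is_nullable, c.is_identity,
       c.precision, c.scale, c.max_length, c.collation_name
INTO dbo.schema_cache
FROM sys.columns AS c
JOIN sys.objects AS o
    ON o.object_id = c.object_id
LEFT JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
WHERE o.type = 'U';   -- user tables only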

Related

Understanding Implicit Type Conversion in SQL Server

On the SQL Server documentation page, MS provides a matrix showing which conversions are allowed implicitly, which require an explicit CAST/CONVERT, and which are not supported at all.
What would be an example in SQL of an explicit conversion and an implicit conversion?
For example, I would assume that an explicit conversion would be something like CAST('2014-01-01' AS DATE), but then it also allows odd things like converting varchar to image. Or, how could you explicitly cast a datetime to a float?
https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver16
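For the explicit side of the question, a quick illustration (a minimal sketch; the float value is the number of days since 1900-01-01, with the time of day as the fractional part):

SELECT CAST('2014-01-01' AS date) AS explicit_date,     -- explicit string -> date
       CAST(GETDATE() AS float)   AS datetime_as_float; -- explicit datetime -> float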
We have a table, Employee, with a NationalIDNumber column defined with the NVARCHAR data type. In this query, we will use a WHERE clause to search for a specific ID.
In the query below, we have requested the NationalIDNumber equal to the integer value 14417807. For SQL Server to compare these two data types, it must convert the NVARCHAR into INT, which means every value in that column must go through a conversion process, and that causes a table scan.
USE AdventureWorks2016CTP3
GO
SET STATISTICS IO ON
GO
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
In the execution plan, you will see an exclamation point warning you that there is a potential issue with the query. Hovering over the SELECT operator, you will see that a CONVERT_IMPLICIT is happening, which may have prevented the optimizer from using a SEEK.
(1 row affected)
Table 'Employee'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now the question is: how do we fix it? It’s really simple, but it does require a code change. Let’s look back at our query.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = 14417807
Remember, we asked for an integer value. Just by adding single quotes to the value, we can eliminate the issue. It’s important to always know what data types your columns are when querying them. In this case, since it is an NVARCHAR, all I need to do is supply a character value, which is accomplished by adding single quotes around the value.
SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate,JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = '14417807'
It’s simple to see the results. Note above the Scan count 1, logical reads 9, physical reads 0. When we rerun it, we get the following:
(1 row affected)
Table 'Employee'. Scan count 0, logical reads 4, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
You can also see in the graphical plan that the warning is now gone, and we have a SEEK instead of the SCAN which is much more efficient.
Source
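Beyond the quoted article: if the value arrives as a parameter rather than a literal, declaring the parameter with the column's own type avoids the implicit conversion just as well; a minimal sketch:

DECLARE @nid nvarchar(15) = N'14417807';   -- matches the column's NVARCHAR type

SELECT BusinessEntityID, NationalIDNumber, LoginID, HireDate, JobTitle
FROM HumanResources.Employee
WHERE NationalIDNumber = @nid;             -- no CONVERT_IMPLICIT, the seek remains possible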

How to best use multicolumn index with value ranges in SQL Server?

I'm running SQL Server 2016 Enterprise edition.
I have a table with 24 columns and 10 indexes. The indexes are vendor-defined, so I cannot change them. I have a hard time understanding how to get the best performance, because whatever I do, SQL Server chooses what is in my opinion a poor execution plan.
The following query:
SELECT event_id
FROM Events e WITH(NOLOCK, index=[Event_By_PU_And_TimeStamp])
WHERE e.timestamp > '2022-05-12 15:00'
AND e.PU_Id BETWEEN 103 AND 186
results in an index seek (plan screenshot not reproduced here).
The specified index is the clustered index, and it has two columns: PU_ID and Timestamp. Even though the seek predicate lists both PU_ID and Timestamp as used columns, the "Number of rows read" is too high in my opinion. Without the index hint, SQL Server chooses a different index for the seek, with double the rows-read number.
Unfortunately the order of the columns in the index is PU_ID, Timestamp, while Timestamp is the much more selective column here.
However, if I change the PU_ID condition to list every possible number between the bounds
PU_ID IN (103,104,105,...186)
then the rows read exactly match the number of returned rows, and the statistics output confirms the better performance (validated with a Profiler trace).
Between-condition:
(632 rows affected)
Table 'Events'. Scan count 7, logical reads 139002, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
IN-condition with every number written out:
(632 rows affected)
Table 'Events'. Scan count 84, logical reads 459, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Edit: the index seek for the IN query (plan screenshot not reproduced here)
What is the best way to make SQL Server choose the better plan?
Do I really need to write out all possible PU_IDs in every query?
The index used is just a simple two-column index; it's the clustered index as well:
CREATE UNIQUE CLUSTERED INDEX [Event_By_PU_And_TimeStamp] ON [dbo].[Events]
(
[PU_Id] ASC,
[TimeStamp] ASC
)
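One workaround worth sketching (an illustration, not from the original thread): generate the PU_Id list with a derived numbers table instead of writing the IN list out by hand. This gives the optimizer the same one-seek-per-PU_Id shape as the written-out IN condition:

SELECT e.event_id
FROM (
    SELECT TOP (186 - 103 + 1)
           102 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS PU_Id
    FROM sys.all_objects                   -- any row source with enough rows
) AS p
JOIN dbo.Events AS e
    ON e.PU_Id = p.PU_Id
   AND e.[TimeStamp] > '2022-05-12 15:00';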

No blocking, good execution plan, slow query: why?

I have a query that occasionally takes several minutes to complete. Several processes are running concurrently but there is no blocking (I'm running an extended events session, I can see blocking of other transactions, so the query to inspect the logged events is working).
Looking at the query plan cache, the execution plan is a good one: running it in SSMS, it takes less than 100 IOs, and there are no table or index scans.
There is the possibility that the users are getting a different plan, but if I add hints to use scans on all tables (and some are fairly large), it still returns in around 1 second. So the worst possible execution plan still wouldn't result in a query that takes several minutes.
Having ruled out blocking and a bad execution plan, what else can make a query slow?
One thing worth pointing out is that SQL Server uses an indexed view we have created, although the code doesn't reference it (we're using SQL Server Enterprise). That indexed view has a covering index to support the query and it is being used - again, the execution plan is very good. The original query is using NOLOCK, and I observed that no locks are taken on any rows or pages of the indexed view either (so SQL Server respects our locking hints, even though it's accessing an indexed view instead of the underlying tables - good). This makes sense, otherwise I would have expected to see blocking.
We are using indexed views in some other queries but we reference them in SQL code (and specify NOLOCK, NOEXPAND). I've not seen any problems with those queries, and I'm not aware that there should be any difference between indexed views that we tell the optimizer to use and indexed views that the optimizer itself decides to use, but what I'm seeing suggests that there is.
Any thoughts? Anything else I should be looking at?
This is the query:
execute sp_executesql
N'SELECT DISTINCT p.policy_id
, p.name_e AS policy_name_e
, p.name_l AS policy_name_l
FROM patient_visit_nl_view AS pv
INNER JOIN swe_cashier_transaction_nl_view AS ct ON ct.patient_visit_id = pv.patient_visit_id
AND ct.split_date_time IS NOT NULL
INNER JOIN ar_invoice_nl_view AS ai ON ai.ar_invoice_id = ct.invoice_id
AND ai.company_code = ''KOC''
AND ai.transaction_status_rcd = ''TEMP''
INNER JOIN policy_nl_view p ON p.policy_id = ai.policy_id
WHERE pv.patient_id = @pv__patient_id'
, N'@pv__patient_id uniqueidentifier'
, @pv__patient_id = '5D61EDF1-7542-11E8-BFCB-D89EF37315A2'
Note: views with suffix _nl_view select from the table with NOLOCK (the idea is we can change this in future without affecting the business tier code).
You can see the query plan here: https://www.brentozar.com/pastetheplan/?id=HJI9Lj_WH
IO stats:
Table 'policy'. Scan count 0, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice_cashier_transaction_visit_iview'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Locks taken (IS locks on the objects involved, nothing else):
Below the relevant part of the indexed view:
CREATE VIEW dbo.ar_invoice_cashier_transaction_visit_iview WITH SCHEMABINDING
AS
SELECT ai.ar_invoice_id
, ai.company_code
, ai.policy_id
, ai.transaction_status_rcd
, ct.cashier_transaction_id
, pv.patient_id
-- more columns
FROM dbo.ar_invoice AS ai
INNER JOIN dbo.swe_cashier_transaction AS ct ON ct.invoice_id = ai.ar_invoice_id AND ct.split_date_time IS NOT NULL
INNER JOIN dbo.patient_visit AS pv ON pv.patient_visit_id = ct.patient_visit_id
CREATE UNIQUE CLUSTERED INDEX XPKar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (ar_invoice_id, cashier_transaction_id)
CREATE INDEX XIE4ar_invoice_cashier_transaction_visit_iview ON dbo.ar_invoice_cashier_transaction_visit_iview (patient_id, transaction_status_rcd, company_code) INCLUDE (policy_id)
So far so good.
But every few days (and not at the same time of day), things go pear-shaped: the query takes minutes and actually times out (the command timeout of the provider is set to 10 minutes). When this happens, there is no blocking. I have an extended events session, and this is my query:
DECLARE @event_xml xml;
SELECT @event_xml = CONVERT(xml, target_data)
FROM sys.dm_xe_sessions AS s
INNER JOIN sys.dm_xe_session_targets AS t ON s.address = t.event_session_address
WHERE s.name = 'Blocking over 10 seconds'
SELECT DATEADD(hour, DATEDIFF(hour, GETUTCDATE(), GETDATE()), R.c.value('@timestamp', 'datetime')) AS time_stamp
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process)[1]/@spid', 'int') AS blocked_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocked_inputbuf
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocked-process[1]/process[1]/@waitresource)[1]', 'varchar(max)') AS wait_resource
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process)[1]/@spid', 'int') AS blocking_spid
, R.c.value('(data[@name="blocked_process"]/value[1]/blocked-process-report[1]/blocking-process[1]/process[1]/inputbuf)[1]', 'varchar(max)') AS blocking_inputbuf
, R.c.query('.')
FROM @event_xml.nodes('/RingBufferTarget/event') AS R(c)
ORDER BY R.c.value('@timestamp', 'datetime') DESC
This query is returning other cases of blocking, so I believe it's correct. At the time the problem (the timeouts) occurs, there are no cases of blocking involving the query above, or any other query.
Since there is no blocking, I'm looking at the possibility of a bad query plan. I didn't find a bad plan in the cache (I had already recommended an sp_recompile on one of the tables before I was given remote access), so I tried to think of the worst possible one: scans for every table. Applying the relevant options, here are the IO stats for this query:
Table 'patient_visit'. Scan count 1, logical reads 4559, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'swe_cashier_transaction'. Scan count 9, logical reads 24840, physical reads 0, read-ahead reads 23660, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ar_invoice'. Scan count 9, logical reads 21247, physical reads 0, read-ahead reads 7074, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'policy'. Scan count 9, logical reads 271, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
And here is the execution plan: https://www.brentozar.com/pastetheplan/?id=rJr29s_br
The customer has a beefy SQL Server 2012 box, plenty of cores (maxdop is set to 8), tons of memory. It eats this bad query for breakfast (takes around 350 msec).
For completeness, here are the row counts of the tables involved:
ar_invoice: 2363527
swe_cashier_transaction: 2946514
patient_visit: 654976
policy: 1038
ar_invoice_cashier_transaction_visit_iview: 1999609
I also ran the query for a patient_id that returns the most rows, and for a patient_id that didn't exist (i.e. 0 rows). I ran these with the recompile option: in both cases the optimizer selected the same (good) execution plan.
So back to the question: there is no blocking, and the query plan seems to be good (even if it were bad, it wouldn't be bad to the extent that the query takes 10 minutes), so what can cause this?
The only thing a little unusual here is that, although the SQL doesn't select from the indexed view, the optimizer uses it anyway - and this is or should be a good thing. I know the Enterprise version claims it can do this, but this is the first time I've seen it in the wild (I've seen plenty of the opposite though: referencing an indexed view in SQL, but the optimizer selects from the view's underlying tables anyway). I'm tempted to believe that this is relevant.
Without knowing anything about your setup, a few other things I would check (a wait-stats sketch follows this list):
what the overall CPU and memory utilisation is like on the box; could there be resource contention
if your storage is on a SAN rather than local storage, whether there is contention at the storage end (this can happen if you have heavy reads/writes on the same disk arrays from different systems)
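A quick look at the cumulative top waits can hint at both kinds of contention; a minimal sketch:

-- top cumulative waits since the last service restart;
-- high signal_wait_s relative to wait_s suggests CPU pressure,
-- PAGEIOLATCH_* waits point at the storage side
SELECT TOP (10) wait_type,
       wait_time_ms / 1000.0        AS wait_s,
       signal_wait_time_ms / 1000.0 AS signal_wait_s,
       waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT LIKE 'SLEEP%'
ORDER BY wait_time_ms DESC;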
There can be several other factors involved in slowing down a query. Personally, I don't entirely trust SQL Server's optimization technique, though. Normally I would recommend optimizing your query so that the optimizer does NOT have to do hard work, for example using EXISTS / IN on the main table instead of joining and doing DISTINCT/grouping, like:
select distinct ia.AttributeCode, ia.AttributeDescription
from ItemsTable as i
inner join ItemAttributesTable as ia on i.AttributeCode = ia.AttributeCode
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearEnd
Instead of running a query like the above, run it like this:
select ia.AttributeCode, ia.AttributeDescription
from ItemAttributesTable as ia
where ia.AttributeCode in (
select i.AttributeCode
from ItemsTable as i
where i.Manufacturer = @paramMfr
and i.MfrYear between @paramYearStart and @paramYearEnd
)
I am NOT really an expert in indexing, but for the above case, I think a single index on ItemsTable should be sufficient.
Another optimization can be done by removing the views and directly using the tables, because views may also be doing joins on other tables that are really not required here.
All in all, the main point is that while the query optimizer is searching for the best possible plan, it may hit its internal time limit (the so-called Optimizer TimeOut), in which case it may settle on a plan which is NOT really good for that specific execution, and that plan is then reused from the plan cache. That's the reason I am recommending focusing on optimizing the query rather than looking at the reasons why it's timing out.
Check this out as well https://blogs.msdn.microsoft.com/psssql/2018/10/19/understanding-optimizer-timeout-and-how-complex-queries-can-be-affected-in-sql-server/
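Whether the optimizer actually gave up early is recorded in the cached plan XML; a minimal sketch for checking the plan cache:

-- cached plans where optimization ended early due to the optimizer timeout
SELECT TOP (20)
       qs.total_elapsed_time / 1000 AS elapsed_ms,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE CONVERT(nvarchar(max), qp.query_plan) LIKE '%StatementOptmEarlyAbortReason="TimeOut"%';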
Update-1:
Recommendations:
Use EXISTS / IN; even if you see the same execution plan as your current query, this will still help the optimizer almost always pick the correct plan
Try eliminating the views and using the tables directly, selecting fewer columns
Make sure you have proper indexes defined for the given parameters
Try breaking the query into smaller parts, for example pulling the filtered data into a temporary table first and then grabbing the rest of the details from it (see the sketch after this list)
Try googling "Timeout in Application not in SSMS" and see the different hacks
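A minimal sketch of the temp-table decomposition from the fourth point, reusing the illustrative ItemsTable / ItemAttributesTable names from above:

-- stage the filtered keys first...
SELECT i.AttributeCode
INTO #filtered
FROM ItemsTable AS i
WHERE i.Manufacturer = @paramMfr
  AND i.MfrYear BETWEEN @paramYearStart AND @paramYearEnd;

-- ...then fetch the details from the small staged set
SELECT DISTINCT ia.AttributeCode, ia.AttributeDescription
FROM ItemAttributesTable AS ia
INNER JOIN #filtered AS f ON f.AttributeCode = ia.AttributeCode;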
Common causes of query timeout:
No indexing defined
Extracting too much data
There is/are lock(s) on one or more tables while you are trying to read data from those table(s)
Parameter type and Field type difference, for example, the column is varchar while parameter type is nvarchar
Parameter sniffing

CPU Time or Elapsed Time - Which actually means SQL Query's Performance?

I have a SQL Server 2012 table with 2697 records, and the table is not indexed. The data will grow to up to 100k records in future. I am not joining any other table with this one to retrieve records. Initially, I created a user-defined function to retrieve the records from the table.
Later I came to know that a view would be faster than the user-defined function, and hence I created a view for that table.
To measure the query's performance, I included the statements below to get the CPU time and elapsed time of my UDF, view, and direct SQL statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
When I pulled the data directly from my table with a SELECT query, I got the below CPU time and elapsed time:
SELECT [CollegeName]
,[CandidateID]
,[age]
,[race]
,[sex]
,[ethnic]
,[arm]
,[Weeknum]
,[siteid]
,[country]
,[Region]
,[SubRegion]
,[SNAME]
,[UID]
FROM [testdata]
---- Result
Scan count 1, logical reads 1338, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 4381 ms.
When I used the view, I got the CPU time and elapsed time as:
CREATE VIEW vw_testdata
AS
SELECT [CollegeName]
,[CandidateID]
,[age]
,[race]
,[sex]
,[ethnic]
,[arm]
,[Weeknum]
,[siteid]
,[country]
,[Region]
,[SubRegion]
,[SNAME]
,[UID]
FROM [testdata]
-- Result
Scan count 1, logical reads 1324, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 5853 ms.
And my UDF returned as
CREATE FUNCTION [dbo].[fn_DocApproval] (@collegename nvarchar(30) = NULL)
RETURNS TABLE
AS
RETURN
(
SELECT [CollegeName]
,[CandidateID]
,[age]
,[race]
,[sex]
,[ethnic]
,[arm]
,[Weeknum]
,[siteid]
,[country]
,[Region]
,[SubRegion]
,[SNAME]
,[UID]
FROM [testdata] WHERE CollegeName = ISNULL(@collegename, CollegeName)
)
-- Result
Scan count 1, logical reads 1338, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 785 ms.
The UDF has a much lower elapsed time than the direct SQL and the view; however, its CPU time is higher.
The view, in turn, has lower CPU time than both the direct SQL and the UDF.
I want to know which of the two we need to look at to determine the query's performance.
Also, why do both the CPU time and the elapsed time change each time I run the same query?
My schema and sample data: Fiddle
I currently have 2697 rows and I'm not able to load all of them into the fiddle.
As per the article SQL Query performance Tuning
SQL Server parse and compile time: When we submit a query to SQL Server to execute, it has to be parsed and compiled to catch any syntax errors, and the optimizer has to produce the optimal plan for the execution. SQL Server parse and compile time refers to the time taken to complete these pre-execution steps. If you look at the output of the second execution, the CPU time and elapsed time are 0 in the SQL Server parse and compile time section. That shows that SQL Server did not spend any time parsing and compiling the query, as the execution plan was readily available in the cache. CPU time refers to the actual time spent on the CPU, and elapsed time refers to the total time taken for the completion of the parse and compile. The difference between the CPU time and the elapsed time might be wait time in the queue to get a CPU cycle, or time spent waiting for IO completion. This does not have much significance in performance tuning, as the value will vary from execution to execution. If you are getting a consistent value in this section, you are probably running the procedure with the recompile option.
SQL Server Execution Times: This refers to the time taken by SQL Server to complete the execution of the compiled plan. CPU time refers to the actual time spent on the CPU, whereas the elapsed time is the total time to complete the execution, which includes signal wait time, wait time for IO operations to complete, and the time taken to transfer the output to the client. The CPU time can be used as a baseline for performance tuning. This value will not vary much from execution to execution unless you modify the query or the data. The load on the server will not have much impact on this value. Please note that the time shown is in milliseconds. The value of CPU time might vary from execution to execution for the same query with the same data, but only by on the order of 100 ms, which is only a fraction of a second. The elapsed time depends on many factors, such as the load on the server, the IO load, and the network bandwidth between server and client. So always use the CPU time as the baseline while doing performance tuning.
The fewer logical reads you have in the plan, the more efficient the query.
If you are using a modern server, always look at "elapsed time", not "CPU time". In the era of fast multi-core processors, multiprocessor boards and so on, all the other factors conditioning a quick response, rather than the processor, are what matter. With complicated queries it can happen that the reported CPU time is 5 times greater than the total time (check the execution plan; there will be many parallel operators).
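To see both numbers side by side for recently executed queries, the plan-cache DMVs can be used as well; a minimal sketch (total_worker_time is CPU time; both columns are stored in microseconds):

SELECT TOP (20)
       qs.total_worker_time / 1000  AS cpu_ms,      -- CPU summed across all parallel threads
       qs.total_elapsed_time / 1000 AS elapsed_ms,  -- wall-clock time
       SUBSTRING(st.text, 1, 100)   AS query_start
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
-- cpu_ms greater than elapsed_ms is the parallelism case described above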

To get the number of orders placed on the current day

I am using SQL Server 2012.
The query is
1. select count(*) from table where orderti=getdate()
2. select count(*) from table where orderti>=convert(date,getdate()) and orderti< dateadd(day,convert(date,getdate()))
the table structure is:
sales(orderti datetime)
non clustered index on orderti.
I want to know what the difference is between the writing styles of the 2 queries mentioned above.
Which one is more efficient?
Any help?
Thanks,
Chio
Based on your question
Query 1
select count(*) from table where orderti=getdate()
This query will not give you the orders for the current day because:
orderti is datetime and will contain the time portion as well
getdate() returns both the current date and the current time
What this query is trying to do is get all orders with orderti equal to the current date and time, which is not what you require.
The query which you are looking for is this
select count(*) from table where CONVERT(DATE,orderti)=CONVERT(DATE,getdate())
Query 2
The query you are looking for is
select count(*) from table where orderti>=convert(date,getdate()) and orderti< dateadd(day,1,convert(date,getdate()))
On your question
which one is efficient ?
Based on the statistics and execution plan, both queries do an index seek and are equally efficient.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Update: with around 650k records there are variations in the statistics; however, the plan remains the same.
Table 'sales'. Scan count 1, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sales'. Scan count 1, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I would recommend Query 2, as it has fewer logical reads and doesn't need a CAST/CONVERT.
I would prefer option 2 for a couple of reasons:
As standard practice, it's better not to wrap filtered columns in functions; that concern is less pressing here, of course, because the optimizer is clever in the 2012 version.
Although the 2012 optimizer still uses an index seek, that doesn't mean it saves the CPU cycles spent on the CAST.
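For completeness, the same sargable pattern parameterized for an arbitrary day (a minimal sketch; @d is an illustrative variable name):

DECLARE @d date = '2014-01-01';   -- any day of interest

SELECT COUNT(*)
FROM sales
WHERE orderti >= @d                                    -- the date promotes to datetime at midnight
  AND orderti <  DATEADD(day, 1, CAST(@d AS datetime));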
