I have two stored procedures, where the second is an improved version of the first.
I'm trying to measure exactly how big that improvement is.
1/ Measuring clock time doesn't seem to be an option, as I get different execution times on every run. Even worse, sometimes (rarely, but it happens) the execution time of the second stored procedure is longer than that of the first (I guess due to the server workload at that moment).
2/ Including client statistics also produces different results.
3/ DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE help, but it's the same story...
4/ SET STATISTICS IO ON could be an option, but how could I get an overall score when many tables are involved in my stored procedures? (See the sketch after this question for one way to total the reads.)
5/ Including the actual execution plan could also be an option. I get an estimated subtree cost of 0.3253 for the first stored procedure and 0.3079 for the second one. Can I say the second stored procedure is about 6% faster (0.3253/0.3079 ≈ 1.06)?
6/ Using the "Reads" field from SQL Server Profiler? Even in this case I get different results when I execute those stored procedures many times.
So how can I say the second stored procedure is x% faster than the first one, no matter the execution conditions (the workload of the server, the server where these stored procedures are executed, etc.)?
If that is not possible, how can I at least prove that the second stored procedure is better optimized?
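One way to get an overall figure, rather than per-table STATISTICS IO output, is to compare logical reads per execution from the procedure stats DMV. This is only a sketch: it assumes SQL Server 2008 or later, the two procedure names are hypothetical, and the counters are cumulative since each plan was cached.

SELECT OBJECT_NAME(ps.object_id, ps.database_id)   AS proc_name,
       ps.execution_count,
       ps.total_logical_reads / ps.execution_count AS avg_logical_reads,
       ps.total_worker_time   / ps.execution_count AS avg_cpu_microseconds
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.object_id IN (OBJECT_ID('dbo.usp_ReportOld'), OBJECT_ID('dbo.usp_ReportNew'));

Logical reads and CPU time are usually far more stable across runs than wall-clock time, so they make a better basis for an "x% better" claim.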
I have a stored procedure used for a DI report that contains 62 sub-queries using UNION ALL. Recently, performance went from under 1 minute to over 8 minutes and using SQL Profiler, it was showing very high CPU and Reads. The procedure currently has passed in variables set to local variables to prevent parameter sniffing.
Running the contents of the procedure as a SELECT statement and performance was back to under a minute.
Calling the procedure via EXEC in Management Studio and performance was horrible and over 8 minutes.
Calling procedure via EXEC including WITH RECOMPILE command and performance did not improve. I ran DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS and still no improvement.
In the end, I dropped the procedure and re-applied it and performance is now back.
Can anyone help explain to me why the initial steps did not correct the performance of the procedure but dropping and re-applying the procedure did?
Sounds like blocking parameter sniffing produced a bad plan. When you use local variables, the query optimizer uses the density for each column to come up with cardinality estimates, essentially optimizing for the average value. If your data distribution is significantly skewed, this estimate will be significantly off for some values. This theory explains why your initial steps did not work: using WITH RECOMPILE or running DBCC FREEPROCCACHE will not help if parameter sniffing is blocked; it will just produce the same plan every time. The fact that running the contents of the procedure as a plain SELECT statement made it faster makes me think you actually need parameter sniffing. In that case, try WITH RECOMPILE if the compilation time is acceptable; otherwise there's a risk of getting stuck with a bad plan based on atypical sniffed values.
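If you do end up re-enabling sniffing, a minimal sketch (all object, column and parameter names here are hypothetical) is to use the parameter directly instead of copying it into a local variable, and to recompile the statement so each call is optimised for its own value:

CREATE PROCEDURE dbo.usp_DIReport
    @RegionID int
AS
BEGIN
    -- Use the parameter directly (no local-variable copy) so its value is sniffed,
    -- and recompile this statement so every call gets a plan for that value.
    SELECT OrderID, Amount
    FROM dbo.Orders
    WHERE RegionID = @RegionID
    OPTION (RECOMPILE);
END

The trade-off is the extra compilation on every call, which is usually acceptable for a report that runs for minutes.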
I am developing a system which periodically (4-5 times daily) runs a select statement, that normally takes less than 10 seconds but periodically has taken up to 40 minutes.
The database is on Windows Server 2008 + SQL Server 2008 R2; both 64bit.
There is a service on the machine running the database which polls the database and generates values for records which require it. These records are then periodically queried using a multi-table join SELECT from a service on a second machine written in C++ (VS 2010), using the MFC CRecordset class to extract the data. An example of the query causing the problem is shown below.
SELECT DISTINCT "JobKeysFrom"."Key" AS "KeyFrom","KeysFrom"."ID" AS "IDFrom",
"KeysFrom"."X" AS "XFrom","KeysFrom"."Y" AS "YFrom","JobKeysTo"."Key" AS "KeyTo",
"KeysTo"."ID" AS "IDTo","KeysTo"."X" AS "XTo","KeysTo"."Y" AS "YTo",
"Matrix"."TimeInSeconds","Matrix"."DistanceInMetres","Matrix"."Calculated"
FROM "JobKeys" AS "JobKeysFrom"
INNER JOIN "JobKeys" AS "JobKeysTo" ON
("JobKeysFrom"."Key"<>"JobKeysTo"."Key") AND
("JobKeysFrom"."JobID"=531) AND
("JobKeysTo"."JobID"=531)
INNER JOIN "Keys" AS "KeysFrom" ON
("JobKeysFrom"."Key"="KeysFrom"."Key") AND ("JobKeysFrom"."Status"=4)
INNER JOIN "Keys" AS "KeysTo" ON
("JobKeysTo"."Key"="KeysTo"."Key") AND ("JobKeysTo"."Status"=4)
INNER JOIN "Matrix" AS "Matrix" ON
("Matrix"."IDFrom"="KeysFrom"."ID") AND ("Matrix"."IDTo"="KeysTo"."ID")
ORDER BY "JobKeysFrom"."Key","JobKeysTo"."Key"
I have tried the following
checked the indexes; all seem correct, they are active, and they are being used according to the query plan
the design advisor comes back with no suggestions
I have tried defragging the indexes and data
rebuilt the database from scratch by exporting the data and reimporting it in a new database.
ran the profiler on it and found that when it goes wrong it seems to do many millions (up to 100 million) of reads rather than a few hundred thousand.
ran the database on a different server
While it is running the slow query, I can run exactly the same query in a Management Studio window and it comes back in 10 seconds. The problem does not seem to be lock, deadlock, CPU, disk or memory related, as it has happened when the machine running the database was only running this one query. The server has 4 processors and 16 GB of memory. I have also tried upgrading the disks to much faster ones, and this had no effect.
It seems to me that it is almost as though the database receives the query, starts to process it and then goes to sleep for 40 minutes or runs the query without using the indexes.
When it takes a long time it will eventually finish and send the query results (normally about 70-100000 records) back to the calling application.
Any help or suggestions would be gratefully received, many thanks
This sounds very much like parameter sniffing.
When a stored procedure is invoked and there is no existing execution plan in the cache matching the SET options for the connection, a new execution plan will be compiled using the parameter values passed in on that invocation.
Sometimes this will happen when the parameters passed are atypical (e.g. have unusually high selectivity) so the generated plan will not be suitable for most other invocations with different parameters. For example it may choose a plan with index seeks and bookmark lookups which is fine for a highly selective case but poor if it needs to be done hundreds of thousands of times.
This would explain why the number of reads goes through the roof.
Your SSMS connection will likely have different SET ... options, so it will not be handed the same problematic plan from the cache when you execute the stored procedure inside SSMS.
You can use the following to get the plan for the slow session:
select p.query_plan, *
from sys.dm_exec_requests r
cross apply sys.dm_exec_query_plan(r.plan_handle) p
where r.session_id = <session_id>   -- the session_id of the slow-running request
Then compare with the plan for the good session.
If you do determine that parameter sniffing is at fault you can use OPTIMIZE FOR hints to avoid it choosing the bad plan.
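As a sketch of what that hint looks like: the query above filters on a literal JobID, so the parameter name below is hypothetical, but if the application parameterises that filter you can pin the optimizer's estimate to a representative value:

DECLARE @JobID int = 531;

SELECT jk."Key", jk."Status"
FROM "JobKeys" AS jk
WHERE jk."JobID" = @JobID
  AND jk."Status" = 4
OPTION (OPTIMIZE FOR (@JobID = 531));

This way an atypical first execution cannot leave a plan in the cache that is wrong for the common case.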
Check that you don't have a maintenance task running that is rebuilding indexes, or that your database statistics are somehow invalid when the query is executed.
This is exactly the sort of thing one would expect to see if the query is not using your indexes. That is usually either because the indexes are not accessible to the query at the point it runs, or because the statistics are invalid and make the optimiser believe that your large table(s) only have a few rows in them, so the query would run faster with a full table scan than with indexed access.
I have been working on a stored procedure performance problem for over a week now, and it is related to my other post on Stackoverflow here. Let me give you some background information.
We have a nightly process which runs and is started by a stored procedure which calls many, many, many other stored procedures. Lots of the called stored procedures call others, etc. I have looked at some of the called procs and there are all sorts of frightening, complicated things in there such as XML string processing, unnecessary over-use of cursors, over-used NOLOCK hints, rare use of set-based processing, etc. - the list goes on; it's quite horrendous.
This nightly process in our production environment takes on average 1:15 to run. It sometimes takes 2 hours to run which is unacceptable. I have created a test environment on identical hardware to production and run the proc. It took 45 minutes the first time I ran it. If I restore the database to the exact same point and run it again, it takes longer: indeed, if I repeat this action several times (restoring and re-running), the proc takes progressively longer until it plateaus at around 2 hours. This really puzzles me because I restore the database to the exact same point every time. There are no other user databases on the server.
I thought of two lines of investigation to pursue:
Query plans and parameter spoofing
Tempdb
As a test, I restarted SQL Server to clear out both the cache and tempdb and re-ran the proc with the same database restore. The proc took 45 minutes. I repeated this several times to ensure that it was repeatable - again it took 45 minutes each time. I then embarked on several tests to try and isolate the puzzling increase in run times when SQL Server does not get restarted:
Run the initial stored procedure WITH RECOMPILE
Before running the procedure, execute DBCC FREEPROCCACHE to clear out the procedure cache
Before running the procedure, execute CHECKPOINT followed by DBCC DROPCLEANBUFFERS to ensure that the cache was empty and clean
Execute the following script to mark all stored procedures for recompilation:
DECLARE @proc_schema SYSNAME
DECLARE @proc_name SYSNAME
DECLARE @stmt NVARCHAR(MAX)

DECLARE prcCsr CURSOR LOCAL
FOR SELECT specific_schema,
           specific_name
    FROM INFORMATION_SCHEMA.routines
    WHERE routine_type = 'PROCEDURE'

OPEN prcCsr
FETCH NEXT FROM prcCsr INTO @proc_schema, @proc_name

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @stmt = N'exec sp_recompile ''[' + @proc_schema + '].['
                + @proc_name + ']'''
    -- PRINT @stmt -- DEBUG
    EXEC ( @stmt )
    FETCH NEXT FROM prcCsr INTO @proc_schema, @proc_name
END

CLOSE prcCsr
DEALLOCATE prcCsr
In all the above tests, the procedure takes longer and longer to run with the same database restore. I am really at a loss now as to what to try. Looking into the code at this point is an option, but realistically it's going to take 3-6 months to get it optimised, as there is lots of room for improvement there. What I am really interested in getting to the bottom of is why the proc execution time gets longer on each run against the same restored database, even when the procedure and buffer caches have been cleared.
I did also investigate tempdb and tried to clear out old tables in there as described in my other Stackoverflow post, but I am unable to manually clear out the temp tables that were created from table variables, and they don't seem to want to disappear on their own (even after leaving them for 24 hours).
Any insight or suggestions for further testing would be greatly appreciated. I am running SQL Server 2005 SP3 64-bit Enterprise edition on a Windows 2003 R2 Ent. edition cluster.
Regards,
Mark.
One thing that could cause this is if the process is leaking XML documents. That would cause SQL Server to use more memory, and parts of that might be written to a page file on disk, causing the process to slow down.
Code that creates an XML document looks like:
EXEC sp_xml_preparedocument @idoc OUTPUT, @strXML
It leaks if there is no corresponding:
EXEC sp_xml_removedocument @idoc
XML documents are COM objects stored outside the configured SQL Server memory. Even if you set SQL Server to use max 5 GB, leaking XML documents grows memory usage beyond that.
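A minimal sketch of the non-leaking pattern (the XML shape and variable names are only illustrative); the important part is that every sp_xml_preparedocument is paired with an sp_xml_removedocument:

DECLARE @idoc int
DECLARE @strXML nvarchar(max)
SET @strXML = N'<orders><order id="1" amount="10.50"/><order id="2" amount="7.25"/></orders>'

-- Parse the XML into an internal document (memory allocated outside the buffer pool)
EXEC sp_xml_preparedocument @idoc OUTPUT, @strXML

SELECT id, amount
FROM OPENXML(@idoc, '/orders/order', 1)
WITH (id int '@id', amount money '@amount')

-- Always release the handle, including on error paths, or the memory is leaked
EXEC sp_xml_removedocument @idoc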
Reviewing all posts to-date and your related question, it certainly sounds like your strongest lead is the mystery behind those tempdb objects. Some leading questions:
After a fresh start and a single run of the process, how many objects are in tempdb? Is it the same number after every fresh start?
Do the numbers grow after “successive” runs? Do they grow at the same rate?
Can you determine if they occupy space?
For that matter, do your tempdb files grow with each successive run of your process?
I followed the links, but didn't find any reference discussing the actual problem. You might want to raise the issue on the Microsoft SQL Technet forums here -- they can be pretty good with the abstract stuff. (If all else fails, you can open a case with MS technical support. It might take days, but odds are very good that they will figure things out. And if it is an MS bug, they refund your money!)
You've said that rewriting the code is not an option. However, if temp table abuse is a factor, identifying and refactoring those parts of the code first might help a lot. To find out which parts those may be, run SQL Profiler while your process executes. This kind of work is, alas, subjective and highly iterative (meaning you hardly ever get just the right set of counters on the first pass). Some thoughts:
Start with tracking SP:Starting, to see which stored procedures are being called.
SQL Profiler can be used to group data; it's awkward and I'm not sure how to describe it in mere text, but configured properly you'll get a Profiler display showing the number of times each procedure was called. Ideally, this would show the most frequently called procs, and you can analyze them for temp table abuse and refactor as necessary.
If nothing jumps out there, you can trace SP:StmtStarting and do the same thing for individual statements. The problem here is that in a 2+/- hour spaghetti-code run, you might run out of disk space, and analyzing 100s of MB of trace data can be a nightmare. (Hint: load it in a table, build indexes, then carefully delete out the cruft.) Again, the goal would be to identify overly used/abused temp table code to be refactored.
Mark-
So it might take 3-6 months to totally re-write this procedure, but that doesn't mean you can't do some relatively quick performance optimization.
Some of the routines I have to support run 30hrs+, I would be ecstatic to get them to run in 2hrs!! The kind of optimization that you do on these routines is a little different than your normal OLTP database:
Capture a trace of the entire process, making sure to capture SP:StmtCompleted and SQL:StmtCompleted events. Make sure to put a filter on Duration (>10ms or something) to eliminate all the quick, unimportant statements.
Pull this trace into a table and do some filtering/sorting/grouping, focusing on Duration and Reads (see the sketch after this list for one way to load a trace file). You will likely end up with one of two situations:
(A) A handful of individual queries/statements are responsible for the bulk of the time of the procedure (good news)
(B) A whole lot of similar statements each take a short amount of time, but together they add up to a long time.
In scenario (A), just focus your attention on these queries. Optimize them using indexes, or using other standard techniques. I highly recommend Dan Tow's book "SQL Tuning" for a powerful technique to optimize queries, especially messy ones with complicated joins.
In scenario (B), step back a bit and look at the set of statements as a whole. Are they all similar in some way? Can you add an index on a key, common table that will improve them all? Can you eliminate a loop that executes 10,000 dynamic queries, and instead do a single set-based query?
Still two other possibilities, I suppose:
(C) 15,000 totally different dynamic SQL statements, each requiring its own painstaking optimization. In this case, try to focus on server-level optimizations, such as I/O based improvements that will benefit them all.
(D) Something else weird going on with TempDB or something mis-configured on the server. Not much else I can say here, other than find the problem, and fix it!
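As a sketch of the "pull this trace into a table" step mentioned above (the file path and table name are placeholders), sys.fn_trace_gettable loads a saved trace file so you can aggregate by statement text:

-- Load the saved trace file into a table
SELECT *
INTO dbo.NightlyProcessTrace
FROM sys.fn_trace_gettable(N'C:\Traces\nightly_process.trc', DEFAULT);

-- Find the statements that cost the most overall (Duration from a trace file is in microseconds on SQL 2005+)
SELECT TOP 50
       CAST(TextData AS nvarchar(max)) AS statement_text,
       COUNT(*)                        AS executions,
       SUM(Duration)                   AS total_duration,
       SUM(Reads)                      AS total_reads
FROM dbo.NightlyProcessTrace
GROUP BY CAST(TextData AS nvarchar(max))
ORDER BY total_duration DESC;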
Hope this helps.
Can you try the following scenario on the test server:
Make two copies of the database on the server: [A] and [B]. [A] is the database in question, [B] is the copy.
Restart server
Run your process
Drop the database [A]
Rename [B] to [A]
Run your process
This would be like a hot database swap. If the second run takes longer, something on the server level is happening (tempdb, memory, I/O, etc). If the second run takes about the same time, then the problem is on the database level (locks, index fragmentation, etc).
Good luck!
Run the following script at start of test and then after each iteration:
select sum(single_pages_kb) as sum_bp_kb
, sum(multi_pages_kb) as sum_va_kb
, type
from sys.dm_os_memory_clerks
group by type
having sum(single_pages_kb+multi_pages_kb) > 16
order by sum(single_pages_kb+multi_pages_kb) desc
select sum(total_pages), type_desc
from tempdb.sys.allocation_units
group by type_desc;
select * from sys.dm_os_performance_counters
where counter_name in (
'Log Truncations'
,'Log Growths'
,'Log Shrinks'
,'Data File(s) Size (KB)'
,'Log File(s) Size (KB)'
,'Active Temp Tables');
If the results are not self-evident, you can post them somewhere and place a link here; I can look into them and see if something strikes me as odd.
What does the overall process do, what is the purpose of the operation being performed?
I would assume that executing the process results in data modification within the database. Is this the case?
If this is the case, then each time you run the process the data being considered is different, so a different execution plan may be produced, and with it a different execution time.
Assuming that modification to the database data is occurring, you should also investigate the following (a sketch of both checks follows this list):
Updating the relevant database statistics between each process run.
Reviewing the level of index fragmentation between each process run and determining whether defragmentation could prove beneficial.
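A minimal sketch of both checks, run between iterations (it assumes the current database is the one being tested):

-- Refresh statistics so the optimizer sees the data as it now is
EXEC sp_updatestats;

-- Check index fragmentation (SQL 2005+ DMV)
SELECT OBJECT_NAME(ips.object_id)          AS table_name,
       i.name                              AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30
ORDER BY ips.avg_fragmentation_in_percent DESC;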
Apparently you want to try anything except what you really have to do, which is fix the process. Start by getting rid of the cursors. If it takes two hours right now, I'll bet you can get it down to less than ten minutes without the cursors.
I would log information into a log table, including the time it took to run each step. That will help you narrow down the issue and also help you progressively improve the process by tackling it one step at a time (starting with the procs that take the longest).
The simplest way is to insert a log row at the beginning and the end of each proc, as sketched below.
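A minimal sketch of that instrumentation (the table and procedure names are hypothetical):

CREATE TABLE dbo.ProcRunLog
(
    LogID    int IDENTITY(1,1) PRIMARY KEY,
    ProcName sysname     NOT NULL,
    Step     varchar(10) NOT NULL,              -- 'start' or 'end'
    LoggedAt datetime    NOT NULL DEFAULT GETDATE()
);
GO

CREATE PROCEDURE dbo.usp_SomeNightlyStep
AS
BEGIN
    INSERT dbo.ProcRunLog (ProcName, Step) VALUES (OBJECT_NAME(@@PROCID), 'start');

    -- ... the real work of the proc goes here ...

    INSERT dbo.ProcRunLog (ProcName, Step) VALUES (OBJECT_NAME(@@PROCID), 'end');
END

Pairing the start and end rows per proc then gives you the duration of each step.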
Cursors are not performance boosters; others have addressed that (and the cursors were not your decision anyway).
Look into the temp table use/management. Are they global temp tables or session/local temp tables? The fact that they are hanging around looks interesting. Tempdb is locked when temp tables are created, which might be part of the issue.
Local temp tables (#mytable syntax) should go away when the session goes out of scope, but you SHOULD have dropped these (release early) to free up resources.
Use of local temp tables in transaction then cancel without COMMIT/ROLLBACK can increase locking in tempdb causing performance issues.
Speaking of transactions - this will cause locks on syscolumns, sysindexes, etc. if temp tables are created in transactions - thus other executions are blocked from using the same query.
Temp tables that are created by a calling procedure and then used in the called procedures point to a design problem - rethink this and try to use relational structures instead.
If you need temp tables (to eliminate cursors :) then avoid SELECT INTO, to avoid locks on system objects; see the sketch at the end of this answer.
Use of global temp tables (##myglobaltable syntax) should be avoided, as access from multiple sessions can be an issue (the table hangs around until all sessions clear) and, for me at least, they add no logical value (look into the use of a permanent table instead). If yours are global, ask whether procedures are blocking each other.
Are there a lot of sparse temp tables (sized for large data but usually holding only small data sets)?
Microsoft SQL Server Book Online,
“Consider using table variables instead of temporary tables. Temporary tables are useful in cases when indexes need to be created explicitly on them, or when the table values need to be visible across multiple stored procedures or functions. In general, table variables contribute to more efficient query processing.”
Of course, if the temp table needs indexes, table variables are not an option.
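A minimal sketch of the "avoid SELECT INTO" point above (table and column names are hypothetical): create the temp table explicitly, load it with INSERT ... SELECT, and drop it as soon as you are done:

-- Instead of: SELECT KeyID, Amount INTO #work FROM dbo.SourceTable WHERE Status = 4
CREATE TABLE #work
(
    KeyID  int   NOT NULL,
    Amount money NOT NULL
);

INSERT INTO #work (KeyID, Amount)
SELECT KeyID, Amount
FROM dbo.SourceTable
WHERE Status = 4;

-- ... set-based work against #work ...

DROP TABLE #work;   -- release tempdb resources early rather than waiting for scope end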
I don't have the answer but some ideas of what I would do to isolate issues like this.
First, I would take snapshots of sys.dm_os_wait_stats before and after each execution. You subtract the two snapshots (get the deltas) and see if any particular WAIT is prominent or gets worse with each run. An easy way to calculate deltas is to copy the sys.dm_os_wait_stats values into Excel worksheets and use VLOOKUP() to subtract corresponding values (a T-SQL sketch follows). I've used this investigation technique hundreds of times. You don't know what aspect SQL Server is hung up on?! Let SQL Server "tell" you via sys.dm_os_wait_stats!
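A minimal sketch of that snapshot-and-diff approach, done in T-SQL rather than Excel:

-- Snapshot before the run
SELECT wait_type, waiting_tasks_count, wait_time_ms
INTO #waits_before
FROM sys.dm_os_wait_stats;

-- ... run the nightly process here ...

-- Diff after the run: the biggest deltas show what SQL Server was waiting on
SELECT a.wait_type,
       a.wait_time_ms        - b.wait_time_ms        AS delta_wait_time_ms,
       a.waiting_tasks_count - b.waiting_tasks_count AS delta_waiting_tasks
FROM sys.dm_os_wait_stats AS a
JOIN #waits_before         AS b ON b.wait_type = a.wait_type
ORDER BY delta_wait_time_ms DESC;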
The other thing I might try is to adjust the behavior of the loop to understand whether the subsequent slower executions exhibit constant throughput for all records from beginning to end, or whether they only slow down for particular sproc(s) in INFORMATION_SCHEMA.routines. Two techniques for exploring this are:
1) Add a "top N" clause the SQL SELECT such as "top 100" or "top 1000" (create an artificial limit) to see if you get subsequent slowdowns for all record count scenarios ... or ... do you only get the slowdowns when the cursor resultset is large enough to include the offending sproc.
2) Instead of adding "top N", you can add more print statements (instrumentation) to calculate the throughput as it is processing.
Of course, you can do combination of both.
Maybe these diagnostics will get you closer to the root cause.
Edited to add: Btw, SQL2008 has a new performance monitor that makes it easy to "eyeball" the numbers of sys.dm_os_wait_stats. However for SQL2005, you'll have to manually calculate the deltas via Excel or a script.
These are long shots:
Quickly look through all of the stored procedures for things that are unusual and that SQL Server should not really be doing, for example sending email or writing files. SQL trying to send email to a non-existent email server could cause delays.
The other thing to keep in mind is that as you restore the database before each test, possibly your disk is getting more fragmented (not really sure about this though). That may explain why run times get longer each time until they plateau.
Firstly, thanks to everyone for some really great help. I much appreciate your time and expertise in helping me to solve this very strange issue. I have an update.
I started a server-side trace to try and isolate the stored procs that were running slower between iterations. What I found surprised me. 96 stored procedures are involved in the process. Most of these stored procedures ran slower the second time around - about 50 of them. The rest were very quick to run and didn't influence the overall time at all, and in fact some of these ran a little quicker (as would be expected).
I failed over the database instance to another node in my cluster and ran the tests there with the exact same results - so I can rule out any OS differences between cluster nodes - when building the clusters I was very conscious to build them identically.
1100 temp tables get created during the process and persist after it has finished - these are all table variables and I found a way to remove them. Running sp_recompile on every proc and function in the database caused all the temp tables to get cleared up. However this did not improve the run times at all. The only thing that helps the run times is a restart of the SQL Server service. Unfortunately I am out of time now to investigate this further - I have other work to do, but would like to persist with it. Perhaps I will come back to it later if I get a spare few hours. In the meantime however, I have to admit defeat with no solution and no bounty to give.
Thanks again everyone.
A while ago I had a query that I ran quite a lot for one of my users. It was still being evolved and tweaked, but eventually it stabilised and ran quite quickly, so we created a stored procedure from it.
So far, so normal.
The stored procedure, though, was dog slow. No material difference between the query and the proc, but the speed change was massive.
[Background, we're running SQL Server 2005.]
A friendly local DBA (who no longer works here) took one look at the stored procedure and said "parameter spoofing!" (Edit: although it seems that it is possibly also known as 'parameter sniffing', which might explain the paucity of Google hits when I tried to search it out.)
We abstracted some of the stored procedure to a second one, wrapped the call to this new inner proc into the pre-existing outer one, called the outer one and, hey presto, it was as quick as the original query.
So, what gives? Can someone explain parameter spoofing?
Bonus credit for
highlighting how to avoid it
suggesting how to recognise possible causes
discussing alternative strategies, e.g. stats, indices, keys, for mitigating the situation
FYI - you need to be aware of something else when you're working with SQL 2005 and stored procs with parameters.
SQL Server will compile the stored proc's execution plan with the first parameter that's used. So if you run this:
usp_QueryMyDataByState 'Rhode Island'
The execution plan will work best with a small state's data. But if someone turns around and runs:
usp_QueryMyDataByState 'Texas'
The execution plan designed for Rhode-Island-sized data may not be as efficient with Texas-sized data. This can produce surprising results when the server is restarted, because the newly generated execution plan will be targeted at whatever parameter is used first - not necessarily the best one. The plan won't be recompiled until there's a big reason to do it, like if statistics are rebuilt.
This is where query plans come in, and SQL Server 2008 offers a lot of new features that help DBAs pin a particular query plan in place long-term no matter what parameters get called first.
My concern is that when you rebuilt your stored proc, you forced the execution plan to recompile. You called it with your favorite parameter, and then of course it was fast - but the problem may not have been the stored proc. It might have been that the stored proc was recompiled at some point with an unusual set of parameters and thus, an inefficient query plan. You might not have fixed anything, and you might face the same problem the next time the server restarts or the query plan gets recompiled.
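As a sketch of the "pin a plan" idea mentioned above: a plan guide can attach a hint such as OPTIMIZE FOR to a statement inside an existing procedure without changing its code (plan guides exist in SQL 2005, with more options in 2008). Everything below, the procedure, parameter and statement text, is hypothetical, and the @stmt text must match the statement in the module exactly:

EXEC sp_create_plan_guide
    @name            = N'PG_QueryMyDataByState',
    @stmt            = N'SELECT * FROM dbo.MyData WHERE State = @State',
    @type            = N'OBJECT',
    @module_or_batch = N'dbo.usp_QueryMyDataByState',
    @params          = NULL,
    @hints           = N'OPTION (OPTIMIZE FOR (@State = N''Texas''))';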
Yes, I think you mean parameter sniffing, which is a technique the SQL Server optimizer uses to try to figure out parameter values/ranges so it can choose the best execution plan for your query. In some instances SQL Server does a poor job at parameter sniffing and doesn't pick the best execution plan for the query.
I believe this blog article http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx has a good explanation.
It seems that the DBA in your example chose option #4: moving the query to another sproc, i.e. a separate procedural context.
You could also have used WITH RECOMPILE on the original sproc, or used the OPTIMIZE FOR option on the parameter.
A simple way to speed that up is to reassign the input parameters to local variables at the very beginning of the sproc, e.g.
CREATE PROCEDURE uspParameterSniffingAvoidance
    @SniffedFormalParameter int
AS
BEGIN
    DECLARE @SniffAvoidingLocalParameter int
    SET @SniffAvoidingLocalParameter = @SniffedFormalParameter
    -- Work with @SniffAvoidingLocalParameter in the sproc body
    -- ...
END
In my experience, the best solution for parameter sniffing is dynamic SQL. Two important things to note are: 1. you should use parameters in your dynamic SQL query; 2. you should use sp_executesql (and not sp_execute), which saves the execution plan for each set of parameter values. A sketch follows.
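A minimal sketch of that pattern (table and column names are hypothetical):

DECLARE @sql   nvarchar(max);
DECLARE @state nvarchar(50);

SET @state = N'Texas';
SET @sql   = N'SELECT ID, State, Amount
               FROM dbo.MyData
               WHERE State = @pState';

EXEC sp_executesql
     @sql,
     N'@pState nvarchar(50)',   -- parameter definition
     @pState = @state;          -- actual value supplied at execution time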
Parameter sniffing is a technique SQL Server uses to optimize the query execution plan for a stored procedure. When you first call the stored procedure, SQL Server looks at the given parameter values of your call and decides which indices to use based on the parameter values.
So when the first call contains atypical parameters, SQL Server might select and store an execution plan that is sub-optimal for the following calls of the stored procedure.
You can work around this by either
using WITH RECOMPILE
copying the parameter values to local variables inside the stored procedure and using the locals in your queries.
I even heard that it's better to not use stored procedures at all but to send your queries directly to the server.
I recently came across the same problem where I have no real solution yet.
For some queries, copying to local vars helps get back to the right execution plan; for other queries, performance degrades with local vars.
I still have to do more research on how SQL Server caches and reuses (sub-optimal) execution plans.
I had a similar problem. My stored procedure's execution took 30-40 seconds. I tried running the SP's statements in a query window and it took a few ms to execute the same thing.
Then I tried declaring local variables within the stored procedure and transferring the values of the parameters to them. This made the SP execution very fast, and now the same SP executes within a few milliseconds instead of 30-40 seconds.
Very simply and shortly: the query optimizer reuses the old query plan for frequently running queries. But the size of the data is also increasing, so at some point a newly optimized plan is required, yet the query optimizer is still using the old plan for the query. This is called parameter sniffing.
I have also created a detailed post on this. Please visit this URL:
http://www.dbrnd.com/2015/05/sql-server-parameter-sniffing/
Changing your stored procedure to execute as a batch should increase the speed.
Batch select, i.e.:
exec ('select * from [Order] where OrderID = ''' + @ordersID + '''')
Instead of the normal stored procedure select:
select * from [Order] where OrderID = @ordersID
Just pass in the parameter as nvarchar and you should get quicker results.