Setup:
Using SQL Server 2008 R2.
We've got a stored procedure that has been intermittently running very long. I'd like to test a theory that parameter sniffing is causing the query engine to choose a bad plan.
Question:
How can I copy the query's execution plans from one database to another (test) database?
Note:
I'm fully aware that this may not be a parameter sniffing issue. However, I'd like to go through the motions of creating a test plan and using it, if at all possible. Therefore, please do not ask me to post code and/or table schema, since this is irrelevant at this time.
Plans are not portable; they bind to object IDs. You can use plan guides, but they are strictly tied to the database. What you have to do is test on a restored backup of the same database. On a restored backup you can use a plan guide, but for the results to be relevant the physical characteristics of the machines should be similar (CPUs, RAM, disks).
Normally, though, one does not need to resort to such shenanigans as copying the plans. Looking at the actual execution plans, all the answers are right there.
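If you do go the plan guide route on the restored backup, the call looks roughly like the sketch below. This is a minimal sketch only: dbo.MyProc, dbo.Orders, and @CustomerId are hypothetical names, and the statement text has to match a statement inside the module exactly.
-- Minimal OBJECT plan guide sketch (SQL Server 2008 R2); all names are placeholders.
EXEC sp_create_plan_guide
    @name = N'PG_MyProc_Test',
    @stmt = N'SELECT OrderId, OrderDate FROM dbo.Orders WHERE CustomerId = @CustomerId;',
    @type = N'OBJECT',
    @module_or_batch = N'dbo.MyProc',
    @params = NULL,
    @hints = N'OPTION (OPTIMIZE FOR (@CustomerId UNKNOWN))';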
Have you tried using the OPTIMIZE FOR query hint? With it you can tune your procedure more easily, and without the risk that a plan copied from another database would be inappropriate due to differences between those databases (if copying the plan is even possible).
http://www.mssqltips.com/sqlservertip/1354/optimize-parameter-driven-queries-with-sql-server-optimize-for-hint/
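As a rough illustration of what the hint looks like in a procedure (the procedure, table, and the pinned value of 42 are all hypothetical):
-- Hypothetical procedure: the plan is always built for the representative value 42,
-- regardless of which parameter value happens to be sniffed on the first call.
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId int
AS
BEGIN
    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR (@CustomerId = 42));
END;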
Related
I was wondering if you guys could help me get to the bottom of a weird problem I have recently had on SQL Server.
I have a stored procedure (let's call it SPold) which is reasonably large with a lot of calculations (I can't possibly do this in the app, as info for around 6,000 users needs to come back in one go; I reduce this to 1,000 based on surname). The stored procedure usually executes in a couple of seconds, and is called once every couple of minutes.
Now this morning, the stored procedure was suddenly taking 4-10 times as long to execute, causing a number of timeouts. I discovered that by making a copy of the procedure with a new name (SPnew) and executing it, I would get the fast execution times again. This indicated to me that the execution plan was the problem with the original, SPold, so I decided to execute it with recompile. This would return the results more quickly (although not as fast as SPnew), but subsequent calls from users to SPold were once again slow. It was like the new plan wasn't being kept.
What I have done to fix this is put EXEC SPnew inside SPold, and now calls to SPold are returning fast again.
Does anyone have any idea what is going on here? The only thing that updated overnight was the statistics, although I think that this should affect both SPold and SPnew.
Sounds like you are experiencing an incorrectly cached query plan due to parameter sniffing.
Can you post the stored procedure?
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005
I Smell a Parameter!
In SQL Server 2005, you can use the OPTIMIZE FOR query hint for preferred values of parameters to remedy some of the problems associated with parameter sniffing:
OPTIMIZE FOR instructs the query optimizer to use a particular value for a local variable when the query is compiled and optimized. The value is used only during query optimization, and not during query execution. OPTIMIZE FOR can counteract the parameter detection behavior of the optimizer or can be used when you create plan guides. For more information, see Recompiling Stored Procedures and Optimizing Queries in Deployed Applications by Using Plan Guides.
SQL Server 2005 does not support OPTIMIZE FOR UNKNOWN (introduced in SQL Server 2008), which eliminates parameter sniffing for a given parameter:
OPTION (OPTIMIZE FOR (@myParam UNKNOWN))
However, you can achieve the same effect in SQL Server 2005 by copying the parameter into a local variable and then using the local variable in the query.
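A minimal sketch of that local-variable technique (the procedure, table, and column names are invented for illustration):
-- Hypothetical procedure: the parameter is copied into a local variable, so the optimizer
-- cannot sniff its value and instead optimizes based on the column's average density.
CREATE PROCEDURE dbo.GetOrdersByStatus
    @Status varchar(20)
AS
BEGIN
    DECLARE @LocalStatus varchar(20);
    SET @LocalStatus = @Status;   -- masks the parameter from the optimizer

    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE OrderStatus = @LocalStatus;
END;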
I've also encountered two "strange" cases with SQL Server 2005, which might relate to your problem as well.
In the first case my procedure executed pretty fast when being run as dbo, and it was slow when being run from the application under a different user account.
In the second case the query plan of the procedure got optimized for the parameter values with which the procedure was called for the first time, and this plan was then reused later for other parameter values as well, resulting in a slow execution.
For this second case the solution was to copy the parameter values into local variables in the procedure, and then using the variables in the queries instead of the parameters.
Here's one I need help from the SQL administrators out there. I have two separate SQL Server instances on Amazon EC2. One is our staging environment, and the other is our production environment, but they are configured exactly the same way (spawned from the same image).
We had a database that we copied from staging to our production environment last week. The way we copy a db to production is we take a backup of it on our staging site, and restore the backup in production. Anyways, we found that in production, one particular complex query was timing out after an hour, but that exact query in our staging environment completed in 10 minutes.
The execution plans on both were almost the same, except that on one server it was doing a PK scan on a large table (8M rows), and on the other it was doing an index seek. We're assuming this was the difference. So one server was doing a lot of disk I/O, and the other was not.
So my question is, what are the reasons that one installation of SQL Server would decide to use an index while another one ignores it, assuming the same version of SQL Server and the same data set? Even better, what are the best ways to find out why SQL Server is ignoring an index?
SQL Server uses statistics to determine the query execution plan.
Normally, they should be the same on the same datasets, but there is a chance of outdated statistics on one of the machines.
Use sp_updatestats to update statistics on both machines.
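For example, run in the affected database on each instance (dbo.BigTable is a placeholder for whichever table the query touches):
-- Refresh statistics for all user tables in the current database.
EXEC sp_updatestats;
-- Or, for a single suspect table, force a full-scan statistics update.
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;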
Also, I'm not familiar with Amazon EC2, but there may be a chance that the machines running the two instances have a different number of CPUs installed (or made available for use by SQL Server). This is also taken into account by the optimizer.
Parameter Sniffing?
An SP will use the query plan that was deemed most appropriate based on the parameters passed to it when it was executed (and so compiled) for the first time.
Restoring a database wipes the plan cache; if the SP on the copy of the database was run with parameters that favored an index seek, then that's what will subsequently be used.
You can check this by running sp_recompile against both and executing them again with identical parameters.
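For instance (the procedure name is a placeholder):
-- Marks the procedure for recompilation, so a fresh plan is built on the next execution.
EXEC sp_recompile N'dbo.YourComplexProc';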
This was our mistake.
After much digging and investigation, we found that one of our devs had added a couple of additional indexes to the production db after the transfer. This was a case where the additional indexes actually caused the query optimizer to pick a less efficient route in the production environment.
Removing those additional indexes appeared to have addressed the performance issue for the particular query, and both explain plans are now the same.
We have a site in development that when we deployed it to the client's production server, we started getting query timeouts after a couple of hours.
This was with a single user testing it, and on our server (which is identical in terms of SQL Server version number, 2005 SP3) we have never had the same problem.
One of our senior developers had come across similar behaviour in a previous job, and he ran a query to manually update the statistics and the problem magically went away: the query returned in a few milliseconds.
A couple of hours later, the same problem occurred. So we again manually updated the statistics and, again, the problem went away. We've checked the database properties and, sure enough, auto update statistics is TRUE.
As a temporary measure, we've set a task to update stats periodically, but clearly, this isn't a good solution.
The developer who experienced this problem before is certain it's an environment problem - when it occurred for him previously, it went away of its own accord after a few days.
We have examined the SQL Server installation on their db server and it's not what I would regard as normal. Although they have SQL 2005 installed (and not 2008), there's an empty "100" folder in the installation directory. There are also MSSQL.1, MSSQL.2, MSSQL.3 and MSSQL.4 folders (the last of which is where the executables and data are actually stored).
If anybody has any ideas we'd be very grateful - I'm of the opinion that rather than the statistics failing to update, they are somehow becoming corrupt.
Many thanks
Tony
Disagreeing with Remus...
Parameter sniffing allows SQL Server to guess the optimal plan for a wide range of input values. Sometimes it's wrong, and the plan is bad because of an atypical value or a poorly chosen default.
I used to be able to demonstrate this on demand by changing a default between 0 and NULL: plan and performance changed dramatically.
A statistics update will invalidate the plan. The query will thus be compiled and cached when next used.
The workarounds are one of the following (a sketch appears after this list):
parameter masking
use the OPTIMIZE FOR UNKNOWN hint
duplicate "default"
See these SO questions
Why does the SqlServer optimizer get so confused with parameters?
At some point in your career with SQL Server does parameter sniffing just jump out and attack?
SQL poor stored procedure execution plan performance - parameter sniffing
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter
...and Google search on SO
Now, Remus works for the SQL Server development team. However, this phenomenon is well documented by Microsoft on their own website, so blaming developers is unfair:
How Data Access Code Affects Database Performance (MSDN mag)
Suboptimal index usage within stored procedure (MS Connect)
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005 (an excellent white paper)
It's not that the statistics are outdated. What happens is that when you update statistics, all plans get invalidated and some bad cached plan gets evicted. Things run smoothly until a bad plan gets cached again and causes slow execution.
The real question is why you get bad plans to start with. We can get into lengthy technical and philosophical arguments about whether a query processor should create a bad plan to start with, but the thing is that, when applications are written in a certain way, bad plans can happen. The typical example is having a WHERE clause like (@somevariable IS NULL) OR (SomeField = @somevariable). Ultimately, 99% of the bad plans can be traced to developers writing queries with C-style procedural expectations instead of sound, set-based, relational processing.
What you need to do now is identify the bad queries. It's really easy: just check sys.dm_exec_query_stats; the bad queries will stand out in terms of total_elapsed_time and total_logical_reads. Once you have identified the bad plan, you can take corrective measures, which vary from query to query.
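Something along these lines surfaces the worst offenders (a minimal sketch; adjust TOP and the ORDER BY to taste):
-- Top cached statements by cumulative elapsed time, with logical reads and the cached plan.
SELECT TOP (20)
    qs.total_elapsed_time,
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
        (CASE WHEN qs.statement_end_offset = -1
              THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2 + 1) AS statement_text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_elapsed_time DESC;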
I'm tracking down an odd and massive performance problem in my SQL server installation. On my system, a particular stored procedure takes 2 minutes to execute; on a colleague's system it takes less than 1 second. We have similar databases/data and configurations, but there's obviously something very different.
I ran the SP in question through the Profiler on both systems and noticed something odd. On my system, I see 9 entries with the following properties:
The Duration is way high relative to other rows. I have values as high as 37,698 and as low as 1,734. On the "fast" system the maximum duration (for the entire SP call) is 259.
They are executed for two databases related to the one that contains the SP I'm running. (This SP makes calls via Linked Servers to these two databases).
They are executions of one of the following system SPs:
sp_tables_info_90_rowset
sp_check_constbytable_rowset
sp_columns_90_rowset
sp_table_statistics2_rowset
sp_indexes_90_rowset
I can't find any Googleable documentation on what these are, why they would be so slow, or why they would run on one system but not the other. Does anyone know what they're all about?
Try manually updating statistics on that table.
UPDATE STATISTICS [TableName]
Then double-check that the database option Auto Update Statistics is TRUE. Even if it is, though, I've seen cases where adding large amounts of data to a table doesn't always cause the statistics to update in a timely way, and queries can be slow.
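One way to check that setting (the database name is a placeholder):
-- Returns 1 if Auto Update Statistics is enabled for the given database.
SELECT DATABASEPROPERTYEX(N'YourDatabase', 'IsAutoUpdateStatistics');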
I don't know the answer to your question. But to try to fix the problem you're having (which, I assume, is what you're actually interested in), the first thing I'd do is run a re-index on the tables you're querying. This frequently will fix any kind of slowness when the conditions are as you described (same database structure, different data/database, same query).
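For example (the table name is a placeholder; rebuilding an index also refreshes the statistics on it with a full scan):
-- Rebuild every index on the table; on SQL Server 2005 and later this replaces DBCC DBREINDEX.
ALTER INDEX ALL ON dbo.YourTable REBUILD;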
These are created when you have linked server calls. They are called work tables, created in tempdb; the database engine creates them automatically for temporary operations like spooling.
Those sp's mean your query is hitting linked servers by using synonyms. This should be avoided whenever possible.
I'm not familiar with those specific procedures, but you can try running:
SELECT object_definition(object_id('Procedure Name'))
To get a better idea of what's going on under the hood.
Last index rebuild? Last statistics update?
Otherwise, these stored procs are used by the SQL Server client too, no? And they probably won't cause these errors.
Why is it that if I run my query as a parameterized procedure it runs 10 times faster than if I run it directly as a parameterized query?
I'm using the exact same query in both cases, and it doesn't matter if I'm calling from Management Studio or an SqlCommand from code.
EDIT: The execution plan looks different. So, why? I'm calling it with EXACTLY the same set of parameters.
EDIT: After more testing it seems the 10x slowdown only occurs when running the parameterized query from SQL Management Studio.
One thing I've seen recently is that if you set up the query parameters wrong it can cause major problems.
For example, say you have a parameter for an indexed varchar column and set it up from .NET using the SqlCommand's AddWithValue() method. You're in for a world of hurt with this scenario. .NET uses Unicode strings and will set up your parameter as an nvarchar rather than a varchar. Now SQL Server won't be able to use your index, and you'll see a significant performance penalty.
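The effect is easy to reproduce from T-SQL alone; here is a rough sketch (the table, column, and index are hypothetical, and whether the mismatch forces a scan depends on the column's collation):
-- Assumes dbo.Customers has an index on the varchar column LastName.
DECLARE @Typed varchar(50);
DECLARE @Unicode nvarchar(50);
SET @Typed = 'Smith';     -- same type as the column
SET @Unicode = N'Smith';  -- what AddWithValue produces for a .NET string
SELECT CustomerId FROM dbo.Customers WHERE LastName = @Typed;    -- straightforward index seek
SELECT CustomerId FROM dbo.Customers WHERE LastName = @Unicode;  -- implicit conversion of LastName;
                                                                 -- with a SQL collation this becomes a scan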
One way to find out whether they are using the same execution plan is to display it while running: use "Include Actual Execution Plan" in Management Studio and see what the difference is.
The connection-level settings can be critical in some cases, especially ANSI_NULLS, CONCAT_NULL_YIELDS_NULL, etc. In particular, if you have persisted, indexed computed columns (including promoted "xml" columns), then SQL Server won't trust the pre-calculated, indexed value if the settings aren't compatible, and will recalculate it for each row (i.e. a table scan instead of an index seek).
Parameterized queries have a lot of advantages, often including a hefty performance increase:
Caching of query plans
String concatenation problems minimized
Addressing SQL injection
Data does not have to be converted to a string before processing
Parameter sniffing may be affecting the stored procedure performance.
http://omnibuzz-sql.blogspot.com/2006/11/parameter-sniffing-stored-procedures.html
Stored procedures can run faster because the execution plan is cached by SQL Server.
But a 10x performance difference is suspicious.
Does it run the same the first time, after you've cleared the cached execution plans? You can use these commands to clear the cache, but they clear the whole server's cache, so do it only on development servers.
DBCC FREEPROCCACHE
DBCC FLUSHPROCINDB (<dbid>)
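For example, to flush only one database's plans (the database name is a placeholder; FLUSHPROCINDB is undocumented, so treat it as a development-only tool):
-- Resolve the database id and flush only that database's cached plans.
DECLARE @dbid int;
SET @dbid = DB_ID(N'YourDatabase');
DBCC FLUSHPROCINDB (@dbid);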
Are you running these directly on the SQL server to eliminate any network I/O from the performance testing?
My guess is that it is running slowly the first time, then once it is cached it is running faster.