Non-optimal execution plan using sp_executesql - sql-server

I'm having problems with slow performance in a sql select statement with some parameters, for the same query, executing this select using sp_executesql way it takes double time that the inline-way.
The problem is that in sp_execute-way sql server is not using optimal execution plan. Although plans are different, it seems that in both cases indexes of tables are being used correctly. I really don't understand why performance are so different.
My original query is more complex but to try to figure out what's happening I have simplify the original query to a select with 3 tables and 2 joins. The main difference is the use of Hash Match in optimal , I really don't know the meaning of this but is the only difference I can see.
Optimal plan (hash match, over 3 seconds)
Wrong plan (no hash match, same indexes than above, over 12 seconds)
I think my problem is not "parameter sniffing", in my case the query is always slow for all distinct parameter values because the execution plan is always incorrect.
OPTION (RECOMPILE) doesn't help, sp_executesql keeps going slow and inline-way take more time (because the query always compile the execution plan)
Statistics for tables are updated
I have to use sp_executesql way because it seems that reporting services encapsulates the select in sp_executesql calls
Does anybody know why sp_executesql generates a different (wrong) execution plan than the inline query?
EDIT: Queries wasn't using same indexes I guess that because the execution tree is not the same and sqlserver takes indexes as it pleases, attached you can find new execution plans to force to use the same indexes, performance is now even worst, from 12 seconds to more than 15 minutes (I have cancelled) in slow query. I'm really not interested in run this specific query more speed, as I say this is not the real query I'm dealing with, what I'm trying to figure out is why execution plans are so different between inline-query and sp_executesql-query.
Is there any magic option in sp_executesql that do this works properly? :)
Optimal
Slow

My understanding is that sp_executesql keeps a cached plan after the first execution. Subsequent queries maybe using a bad cached plan. You can use the following command to clear out the ENTIRE SQL Server procedure cache.
DBCC FREEPROCCACHE
http://msdn.microsoft.com/en-us/library/ms174283.aspx

Related

Executing stored procedure takes too long than executing TSQL

I have a stored procedure and when I want to execute it using exec proc_name it takes 1 min
If I copy the code from stored procedure, declare params as variables and then execute the code it takes 10 sec.
What's wrong ?
I am missing something here ?
I am asking this because I use ADO.NET and I get a timeout error when I want to execute that stored procedure using ExecuteNonQuery.
Thank you
Its caused by suboptimal plans being used.
You mention that the s.p. has parameters, I've had similar issues due to 'parameter sniffing'.
The quickest check to see if this is the issue is just to, inside the SP, copy the input parameters into local variables then use only the local variables.
This stops e.g. optimisation for certain paramater values at the expense of others.
I've had this before in an s.p. which had int parameters where certain parameter values changed the control flow (as well as how queries would be executed) a bit.
Start Sql Profiler and compare those two executions: is the extra 50 mins spent on the server? Are the queries really the same?
You can than copy the actual query text and run it manually and check execution plan.
try the executing proc with Execution plan icon switched on.
It will tell you exactly which part takes time and you/we can probably take over (suggestions) from there.
Thanks
As a general idea, query plans are cached differently when we talk about adhoc statements vs stored procedures. So the execution time could be different as chosen query plan could be different.
As suggestions, I think at:
1/ Invalidate the query plan associated with that stored procedure:
sp_recompile <procname>
2/ Delete all query plans from cache (the hard-way, non recommended in PROD unless you understand very well the consequences): :
DBCC FREEPROCCACHE
3/ Update statistics for involved tables.
4/ Have look at actual execution plan for both cases and isolate where is the performance bottleneck. Post some code and we'll provide you more details about.
Option 1 : execute SP in Alter State and try again with parameters.
Option 2 : EXEC sp_updatestats
Option 3 : Failing with option 1, add "option(recompile)" at the end of your query.
Eg : Select Id from Table1 Order BY Id desc option(recompile)
If this runs faster, slow down was due to executions plans made by SQL.

SQL performance: Recompile versus drop and re-apply for stored procedure

I have a stored procedure used for a DI report that contains 62 sub-queries using UNION ALL. Recently, performance went from under 1 minute to over 8 minutes and using SQL Profiler, it was showing very high CPU and Reads. The procedure currently has passed in variables set to local variables to prevent parameter sniffing.
Running the contents of the procedure as a SELECT statement and performance was back to under a minute.
Calling the procedure via EXEC in Management Studio and performance was horrible and over 8 minutes.
Calling procedure via EXEC including WITH RECOMPILE command and performance did not improve. I ran DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS and still no improvement.
In the end, I dropped the procedure and re-applied it and performance is now back.
Can anyone help explain to me why the initial steps did not correct the performance of the procedure but dropping and re-applying the procedure did?
Sounds like blocking parameter sniffing produced a bad plan. When you use local variables the query optimizer uses the density for each column to come up with cardinality estimates, essentially optimizing for the average value. If you data distribution is skewed significantly, this estimate will be significantly off for some values. This theory explains why your initial steps did not work. Using WITH RECOMPILE or running DBCC FREEPROCCACHE will not help if parameter sniffing is blocked. It will just produce the same plan every time. Because you say that running the contents of the procedure as a SELECT statement made it faster makes me think you actually need parameter sniffing. However you also need to try using WITH RECOMPILE if compilation time is acceptable, otherwise there's a risk of getting stuck with a bad plan based on atipical sniffed values.

In SQL Server, how to allow for multiple execution plans for a single query in a SP without having to recompile every time?

In SQL Server, what is the best way to allow for multiple execution plans to exist for a query in a SP without having to recompile every time?
For example, I have a case where the query plan varies significantly depending on how many rows are in a temp table that the query uses. Since there was no "one size fits all" plan that was satisfactory, and since it was unacceptable to recompile every time, I ended up copy/pasting (ick) the main query in the SP multiple times within several IF statements, forcing the SQL engine to give each case its own optimal plan. It actually seemed to work beautifully performance-wise, but it feels a bit clunky. (I know I could similarly break this part out into multiple SPs to do the same thing.) Is there a better way to do this?
IF #RowCount < 1
[paste query here]
ELSE IF #RowCount < 50
[paste query here]
ELSE IF #RowCount < 200
[paste query here]
ELSE
[paste query here]
You can use OPTIMIZE FOR in certain situations, to create a plan targeted to a certain value of a parameter (but not multiple plans per se). This allows you to specify what parameter value we want SQL Server to use when creating the execution plan. This is a SQL Server 2005 onwards hint.
Optimize Parameter Driven Queries with the OPTIMIZE FOR Hint in SQL Server
There is also OPTIMIZE FOR UNKNOWN – a SQL Server 2008 onwards feature (use judiciously):
This hint directs the query optimizer
to use the standard algorithms it has
always used if no parameters values
had been passed to the query at all.
In this case the optimizer will look
at all available statistical data to
reach a determination of what the
values of the local variables used to
generate the queryplan should be,
instead of looking at the specific
parameter values that were passed to
the query by the application.
Perhaps also look into optimize for ad hoc workloads Option
SQL Server 2005+ has statement level recompilation and is better at dealing with this kind of branching. You have one plan still but the plan can be partially recompiled at the statement level.
But it is ugly.
I'd go with #Mitch Wheat's option personally because you have recompilations anyway with the stored procedure using a temp table. See Temp table and stored proc compilation

SQL Server query plan differences

I'm having trouble understanding the behavior of the estimated query plans for my statement in SQL Server when a change from a parameterized query to a non-parameterized query.
I have the following query:
DECLARE #p0 UniqueIdentifier = '1fc66e37-6eaf-4032-b374-e7b60fbd25ea'
SELECT [t5].[value2] AS [Date], [t5].[value] AS [New]
FROM (
SELECT COUNT(*) AS [value], [t4].[value] AS [value2]
FROM (
SELECT CONVERT(DATE, [t3].[ServerTime]) AS [value]
FROM (
SELECT [t0].[CookieID]
FROM [dbo].[Usage] AS [t0]
WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = #p0)
GROUP BY [t0].[CookieID]
) AS [t1]
OUTER APPLY (
SELECT TOP (1) [t2].[ServerTime]
FROM [dbo].[Usage] AS [t2]
WHERE ((([t1].[CookieID] IS NULL) AND ([t2].[CookieID] IS NULL))
OR (([t1].[CookieID] IS NOT NULL) AND ([t2].[CookieID] IS NOT NULL)
AND ([t1].[CookieID] = [t2].[CookieID])))
AND ([t2].[CookieID] IS NOT NULL)
AND ([t2].[ProductID] = #p0)
ORDER BY [t2].[ServerTime]
) AS [t3]
) AS [t4]
GROUP BY [t4].[value]
) AS [t5]
ORDER BY [t5].[value2]
This query is generated by a Linq2SQL expression and extracted from LINQPad. This produces a nice query plan (as far as I can tell) and executes in about 10 seconds on the database. However, if I replace the two uses of parameters with the exact value, that is replace the two '= #p0' parts with '= '1fc66e37-6eaf-4032-b374-e7b60fbd25ea' ' I get a different estimated query plan and the query now runs much longer (more than 60 seconds, haven't seen it through).
Why is it that performing the seemingly innocent replacement produces a much less efficient query plan and execution? I have cleared the procedure cache with 'DBCC FreeProcCache' to ensure that I was not caching a bad plan, but the behavior remains.
My real problem is that I can live with the 10 seconds execution time (at least for a good while) but I can't live with the 60+ sec execution time. My query will (as hinted above) by produced by Linq2SQL so it is executed on the database as
exec sp_executesql N'
...
WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = #p0)
...
AND ([t2].[ProductID] = #p0)
...
',N'#p0 uniqueidentifier',#p0='1FC66E37-6EAF-4032-B374-E7B60FBD25EA'
which produces the same poor execution time (which I think is doubly strange since this seems to be using parameterized queries.
I'm not looking for advise on which indexes to create or the like, I'm just trying to understand why the query plan and execution are so dissimilar on three seemingly similar queries.
EDIT: I have uploaded execution plans for the non-parameterized and the parameterized query as well as an execution plan for a parameterized query (as suggested by Heinz) with a different GUID here
Hope it helps you help me :)
If you provide an explicit value, SQL Server can use statistics of this field to make a "better" query plan decision. Unfortunately (as I've experienced myself recently), if the information contained in the statistics is misleading, sometimes SQL Server just makes the wrong choices.
If you want to dig deeper into this issue, I recommend you to check what happens if you use other GUIDs: If it uses a different query plan for different concrete GUIDs, that's an indication that statistics data is used. In that case, you might want to look at sp_updatestats and related commands.
EDIT: Have a look at DBCC SHOW_STATISTICS: The "slow" and the "fast" GUID are probably in different buckets in the histogram. I've had a similar problem, which I solved by adding an INDEX table hint to the SQL, which "guides" SQL Server towards finding the "right" query plan. Basically, I've looked at what indices are used during a "fast" query and hard-coded those into the SQL. This is far from an optimal or elegant solution, but I haven't found a better one yet...
I'm not looking for advise on which indexes to create or the like, I'm just trying to understand why the query plan and execution are so dissimilar on three seemingly similar queries.
You seem to have two indexes:
IX_NonCluster_Config (ProductID, ServerTime)
IX_NonCluster_ProductID_CookieID_With_ServerTime (ProductID, CookieID) INCLUDE (ServerTime)
The first index does not cover CookieID but is ordered on ServerTime and hence is more efficient for the less selective ProductID's (i. e. those that you have many)
The second index does cover all columns but is not ordered, and hence is more efficient for more selective ProductID's (those that you have few).
In average, you ProductID cardinality is so that SQL Server expects the second method to be efficient, which is what it uses when you use parametrized queries or explicitly provide selective GUID's.
However, your original GUID is considered less selective, that's why the first method is used.
Unfortunately, the first method requires additional filtering on CookieID which is why it's less efficient in fact.
My guess is that when you take the non paramaterized route, your guid has to be converted from a varchar to a UniqueIdentifier which may cause an index not to be used, while it will be used taking the paramatarized route.
I've seen this happen with using queries that have a smalldatetime in the where clause against a column that uses a datetime.
Its difficult to tell without looking at the execution plans, however if I was going to guess at a reason I'd say that its a combinaton of parameter sniffing and poor statistics - In the case where you hard-code the GUID into the query, the query optimiser attempts to optimise the query for that value of the parameter. I believe that the same thing happens with the parameterised / prepared query (this is called parameter sniffing - the execution plan is optimised for the parameters used the first time that the prepared statement is executed), however this definitely doesn't happen when you declare the parameter and use it in the query.
Like I said, SQL server attempt to optimise the execution plan for that value, and so usually you should see better results. It seems here that that information it is basing its decisions on is incorrect / misleading, and you are better off (for some reason) when it optimises the query for a generic parameter value.
This is mostly guesswork however - its impossible to tell really without the execution - if you can upload the executuion plan somewhere then I'm sure someone will be able to help you with the real reason.

SQL Server query taking up 100% CPU and runs for hours

I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
#id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(#id) beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT type and beta.id is an INT type. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using a WHERE id = #id. The function returns approximately 11000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in the activity manager.
Look at the query plan. My guess is that there is a table scan or more in the execution plan. This will cause huge amounts of I/O for the few record you get in the result.
You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can for instance also give you hints on how to improve your query by suggesting indexes.
If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Tuning Adviser as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Makes sure there's not too many queries - it takes a long time to build advice.
I'm not an expert on it, but have had some luck with it in the past.
Do you need to use a function? Can you re-write the entire thing into a stored procedure in which you pass in the #ID as a parameter.
Even if your table has indexes because you pass the #ID as a variable to the WHERE clause potentially greatly increasing the amount of time for the query to run.
The reason the indexes may not be used is because the Query Analyzer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch file, only one pass is made of the Transact-SQL code, preventing the Query Optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot re-write the SQL.
it might also be possible, since this just started happening, that the INDEXes have become fragmented and might need to be rebuilt.
I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested. Put the results in a temp table and join on that.
Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
#id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(#id) beta
INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id
-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be a source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]. I think you'll find a bit more scalability in joining to a more flattened table with indexes than a costly function.
Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan, look for a "SCAN" on an index or a table. That is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc)

Resources