Is recompiling a long running query a good habit


I have some long-running (a few hours) stored procedures containing queries that go against tables with millions of records in a distributed environment. These stored procedures take a date parameter and filter these tables according to it.
Because of SQL Server's parameter sniffing, I expect that the first time my stored procedure is called, the execution plan will be cached for that specific date and any future calls will use that exact plan. Since creating an execution plan takes only a few seconds, why would I not use the RECOMPILE option in my long-running queries? Does it have any cons that I have missed?

If the query should run within your acceptable performance limits and you suspect parameter sniffing is why it doesn't, I suggest you add a RECOMPILE hint to the query.
Also, if the query is part of a stored procedure, you can do a statement-level recompilation instead of recompiling the entire procedure, like this:
create proc procname
(
@a int
)
as
begin
-- this statement is recompiled on every execution
select * from t1 where a = @a
option (recompile)
-- no recompile here
select * from t1
join
t2 on t1.id = t2.id
end
Keep in mind that recompiling a query has a cost. But to quote Paul White:
There is a price to pay for the plan compilation on every execution, but the improved plan quality often repays this cost many times over.
Query Store in SQL Server 2016 helps you track these issues and also stores plans for queries over time, so you can see which ones are performing worse.
If you are not on 2016, William Durkin has developed Open Query Store for versions 2008-2014, which works more or less the same way and helps you troubleshoot issues.
Further reading:
Parameter Sniffing, Embedding, and the RECOMPILE Options
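As a rough sketch of the Query Store suggestion above, assuming SQL Server 2016 or later and permission to alter the current database, a query along these lines lists every plan recorded for each query together with its execution count and duration. The column aliases are illustrative, and the duration is an unweighted average across collection intervals, so treat it as a first look rather than a precise measurement.

```sql
-- Enable Query Store on the current database (SQL Server 2016+).
ALTER DATABASE CURRENT SET QUERY_STORE = ON;
GO

-- One row per (query, plan): several rows for the same query_id with very
-- different durations are a hint that plan choice matters for that query.
SELECT q.query_id,
       qt.query_sql_text,
       p.plan_id,
       SUM(rs.count_executions) AS executions,
       AVG(rs.avg_duration)     AS avg_duration_us  -- microseconds, unweighted
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
     ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
     ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
     ON rs.plan_id = p.plan_id
GROUP BY q.query_id, qt.query_sql_text, p.plan_id
ORDER BY q.query_id, avg_duration_us DESC;
```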

Related

Does query form have an influence on getting into parameter sniffing?

Recently, one of my colleagues working in SQL development ran into a problem like this: a procedure ran fine in all environments except production, which has the most resources. It looked like a typical case of parameter sniffing, but the profiler indicated that only one query in the whole procedure took very long to execute:
UPDATE a
SET status_id = 6
FROM usr.tpt_udef_article_grouping_buffer a
LEFT JOIN (SELECT DISTINCT buying_domain_id, suppl_no FROM usr.buyingdomain_supplier_article) b ON a.buying_domain_id = b.buying_domain_id
AND a.suppl_no = b.suppl_no
WHERE a.tpt_file_id = @tpt_file_id
AND a.status_id IS NULL
AND b.suppl_no IS NULL
As I am biased towards development (I have little administration experience), I suggested that this query should be rewritten:
replace LEFT JOIN (SELECT DISTINCT ...) with NOT EXISTS (SELECT 1 ...)
put the appropriate index on table usr.tpt_udef_article_grouping_buffer (SSMS suggested the effort would be reduced by 95% when the query was run outside the procedure)
Also, multiple queries in the procedure share the same pattern.
I know that parameter sniffing is related to how the plan is constructed when the procedure runs for the first time after its (re)creation, and I think it is also favored by high cyclomatic complexity.
My question is:
Does the way queries in the procedure are written (bad execution plans from the beginning) favor parameter sniffing appearance or just worsen their effects?

Your only parameter here is a.tpt_file_id = @tpt_file_id, so if this is parameter sniffing, the data must be such that some tpt_file_id values have thousands (or more) of records while others have few (or none).
The other reason you may get different plans in production than in the test environment is that the machines are different. Production usually has a lot more memory and more CPUs / cores, which can cause the optimizer to choose a different plan, and if the row counts in the tables are not the same, that can of course lead to a totally different plan as well.
You can check this by using option (recompile) to see if the plan changes, or by looking in the plan cache for the parameter value that was used to create the plan. It can be seen in the properties of the leftmost operator in the plan.
Changing the select distinct into an exists clause is probably a good idea, as is indexing the tables properly.
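A minimal sketch of the rewrite discussed above, combined with the option (recompile) check. The table and column names come from the question; the parameter's type and value are assumptions, and the DECLARE only stands in for the procedure parameter.

```sql
-- Stand-in for the procedure parameter; int is an assumed type.
DECLARE @tpt_file_id int = 1;

UPDATE a
SET status_id = 6
FROM usr.tpt_udef_article_grouping_buffer AS a
WHERE a.tpt_file_id = @tpt_file_id
  AND a.status_id IS NULL
  -- Anti-join: keep only buffer rows with no matching supplier article.
  AND NOT EXISTS (
        SELECT 1
        FROM usr.buyingdomain_supplier_article AS b
        WHERE b.buying_domain_id = a.buying_domain_id
          AND b.suppl_no = a.suppl_no
      )
-- Diagnostic only: compile a fresh plan for the current parameter value.
OPTION (RECOMPILE);
```

If the plan and run time with the hint differ sharply from the cached behaviour, that points towards parameter sniffing; if they do not, the query shape or missing index is the more likely culprit.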

Stored procedures eating CPU SQL Server 2005

I am trying to resolve 100% CPU usage by the SQL Server process on the database server. While investigating, I found that stored procedures are taking the most worker time.
With the following DMV query to find the queries taking the most time,
SELECT TOP 20 st.text
,st.dbid
,st.objectid
,qs.total_worker_time
,qs.last_worker_time
,qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
ORDER BY qs.total_worker_time DESC
most of the results are stored procedures. The weird thing is that all these stored procedures query different tables. And yet they take the most worker time, even though when I look in Profiler for the queries with top CPU, Reads, and Duration, the stored procedures don't figure at the top.
Why could this be happening?
==Edit==
The application actually uses ad hoc queries more than stored procedures. Some of these procedures are to be migrated to ad hoc queries. The thing is that these procedures are not called as often as some of the other queries, which are CPU intensive and are called very frequently.
Also, it strikes me as odd that a stored procedure which does a simple select a,b,c from tbl where id=@id would have a higher total worker time than a query which has multiple joins, user-defined functions in the WHERE clause, a sort, and a ROW_NUMBER() OVER, when the simple one queries a table with 20,000 records and the complex query runs on a table with over 200,000 records.

It depends on what you are doing in that stored procedure and how you are tuning your tables, indexes, etc.
For example, if you create a stored procedure that loops with a cursor, it can push CPU usage to the maximum. If you don't set up your indexes and your selects use several joins, you will also overload the CPU.
It comes down to how you use your database.
Some tips:
I recommend creating stored procedures most of the time.
When you create your stored procedures, run them with the execution plan displayed to get some suggestions.
If the tables have a lot of records (more than 1 or 2 million), think about creating indexed views.
Do updates or deletes only when necessary; sometimes it is better to only insert records and run a daily or weekly job to update or remove the records you don't need (depending on the case).
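One thing worth checking, as a sketch rather than a diagnosis: total_worker_time in sys.dm_exec_query_stats is cumulative for as long as the plan stays cached, so a cheap statement that has run many times can outrank an expensive one. Adding execution_count and a per-execution average to the question's query makes that visible. The avg_worker_time alias is illustrative.

```sql
SELECT TOP 20
       st.text,
       st.dbid,
       st.objectid,
       qs.execution_count,
       qs.total_worker_time,
       -- Average CPU per execution, in microseconds.
       qs.total_worker_time / qs.execution_count AS avg_worker_time,
       qs.last_worker_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_worker_time DESC;
```

Ad hoc statements whose text differs on every call get separate cache entries, so their cost is spread over many rows and may not reach the top of the total_worker_time ordering at all.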

In SQL Server, how to allow for multiple execution plans for a single query in a SP without having to recompile every time?

In SQL Server, what is the best way to allow for multiple execution plans to exist for a query in a SP without having to recompile every time?
For example, I have a case where the query plan varies significantly depending on how many rows are in a temp table that the query uses. Since there was no "one size fits all" plan that was satisfactory, and since it was unacceptable to recompile every time, I ended up copy/pasting (ick) the main query in the SP multiple times within several IF statements, forcing the SQL engine to give each case its own optimal plan. It actually seemed to work beautifully performance-wise, but it feels a bit clunky. (I know I could similarly break this part out into multiple SPs to do the same thing.) Is there a better way to do this?
IF @RowCount < 1
[paste query here]
ELSE IF @RowCount < 50
[paste query here]
ELSE IF @RowCount < 200
[paste query here]
ELSE
[paste query here]

You can use OPTIMIZE FOR in certain situations, to create a plan targeted to a certain value of a parameter (but not multiple plans per se). This allows you to specify what parameter value we want SQL Server to use when creating the execution plan. This is a SQL Server 2005 onwards hint.
Optimize Parameter Driven Queries with the OPTIMIZE FOR Hint in SQL Server
There is also OPTIMIZE FOR UNKNOWN – a SQL Server 2008 onwards feature (use judiciously):
This hint directs the query optimizer to use the standard algorithms it has always used if no parameters values had been passed to the query at all. In this case the optimizer will look at all available statistical data to reach a determination of what the values of the local variables used to generate the queryplan should be, instead of looking at the specific parameter values that were passed to the query by the application.
Perhaps also look into the "optimize for ad hoc workloads" option.
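For illustration, here is how the two hints are written. The procedure, the dbo.Orders table, its columns, and the value 42 are all hypothetical; only the hint syntax is the point.

```sql
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer   -- hypothetical procedure
    @CustomerId int
AS
BEGIN
    -- Always compile this statement as if @CustomerId were 42,
    -- whatever value the caller actually passes.
    SELECT o.OrderId, o.OrderDate
    FROM dbo.Orders AS o                       -- hypothetical table
    WHERE o.CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR (@CustomerId = 42));

    -- SQL Server 2008+: ignore the sniffed value and estimate from
    -- average density statistics instead.
    SELECT o.OrderId, o.OrderDate
    FROM dbo.Orders AS o
    WHERE o.CustomerId = @CustomerId
    OPTION (OPTIMIZE FOR (@CustomerId UNKNOWN));
END;
```

Either form still yields a single cached plan per statement, so it does not replace the IF branching from the question when genuinely different plans are needed.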

SQL Server 2005+ has statement level recompilation and is better at dealing with this kind of branching. You have one plan still but the plan can be partially recompiled at the statement level.
But it is ugly.
I'd go with @Mitch Wheat's option personally because you have recompilations anyway with the stored procedure using a temp table. See Temp table and stored proc compilation

SQL Server query taking up 100% CPU and runs for hours

I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT type and beta.id is an INT type. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using a WHERE id = @id. The function returns approximately 11000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in the activity manager.

Look at the query plan. My guess is that there are one or more table scans in the execution plan. This will cause huge amounts of I/O for the few records you get in the result.

You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can for instance also give you hints on how to improve your query by suggesting indexes.

If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Engine Tuning Advisor as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Make sure there are not too many queries in the trace, because it takes a long time to build advice.
I'm not an expert on it, but have had some luck with it in the past.

Do you need to use a function? Can you rewrite the entire thing as a stored procedure into which you pass @ID as a parameter?
Even if your table has indexes, they may not be used because you pass @ID as a variable to the WHERE clause, potentially greatly increasing the time the query takes to run.
The reason the indexes may not be used is that the Query Optimizer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch, only one pass is made over the Transact-SQL code, preventing the Query Optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot rewrite the SQL.
It might also be possible, since this just started happening, that the indexes have become fragmented and need to be rebuilt.

I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested. Put the results in a temp table and join on that.
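A sketch of the temp table approach, using the object names from the question. The type and value of @id are assumptions, and the CAST to bigint is an optional addition so the join compares like types with alpha.id. Note that #beta here really is a temp table name, not a parameter.

```sql
DECLARE @id int = 1;   -- placeholder; the real type and value are not given

-- Materialise the function result once, so the optimizer gets a real
-- row count and statistics for the join.
SELECT CAST(beta.id AS bigint) AS id,   -- match alpha.id's BIGINT type
       beta.X, beta.Y, beta.Z
INTO #beta
FROM [DifferentDatabase].dbo.fnGetStuff(@id) AS beta;

SELECT
    @id,
    alpha.A, alpha.B, alpha.C,
    beta.X, beta.Y, beta.Z,
    alpha.P, alpha.Q
FROM #beta AS beta
INNER JOIN vwSomeData AS alpha ON beta.id = alpha.id;

DROP TABLE #beta;
```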

Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
@id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(@id) beta
INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id

-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be a source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]. I think you'll find a bit more scalability in joining to a more flattened table with indexes than a costly function.

Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan; look for a "SCAN" on an index or a table. That is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc.).
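Put together, the sequence might look like the sketch below. SET SHOWPLAN_ALL has to be the only statement in its batch, hence the GO separators (which assume SSMS or sqlcmd). While it is ON, statements are compiled but not executed. The value of @id is a placeholder.

```sql
SET SHOWPLAN_ALL ON;
GO

-- Not executed: returns one row per plan operator instead of query results.
DECLARE @id int = 1;   -- placeholder value
SELECT
    @id,
    alpha.A, alpha.B, alpha.C,
    beta.X, beta.Y, beta.Z,
    alpha.P, alpha.Q
FROM [DifferentDatabase].dbo.fnGetStuff(@id) AS beta
INNER JOIN vwSomeData AS alpha ON beta.id = alpha.id;
GO

SET SHOWPLAN_ALL OFF;
GO
```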

Parameter Sniffing (or Spoofing) in SQL Server

A while ago I had a query that I ran quite a lot for one of my users. It was still being evolved and tweaked but eventually it stabilised and ran quite quickly, so we created a stored procedure from it.
So far, so normal.
The stored procedure, though, was dog slow. No material difference between the query and the proc, but the speed change was massive.
[Background, we're running SQL Server 2005.]
A friendly local DBA (who no longer works here) took one look at the stored procedure and said "parameter spoofing!" (Edit: although it seems that it is possibly also known as 'parameter sniffing', which might explain the paucity of Google hits when I tried to search it out.)
We abstracted some of the stored procedure to a second one, wrapped the call to this new inner proc into the pre-existing outer one, called the outer one and, hey presto, it was as quick as the original query.
So, what gives? Can someone explain parameter spoofing?
Bonus credit for
highlighting how to avoid it
suggesting how to recognise possible cause
discussing alternative strategies, e.g. stats, indices, keys, for mitigating the situation

FYI - you need to be aware of something else when you're working with SQL 2005 and stored procs with parameters.
SQL Server will compile the stored proc's execution plan with the first parameter that's used. So if you run this:
usp_QueryMyDataByState 'Rhode Island'
The execution plan will work best with a small state's data. But if someone turns around and runs:
usp_QueryMyDataByState 'Texas'
The execution plan designed for Rhode-Island-sized data may not be as efficient with Texas-sized data. This can produce surprising results when the server is restarted, because the newly generated execution plan will be targeted at whatever parameter is used first - not necessarily the best one. The plan won't be recompiled until there's a big reason to do it, like if statistics are rebuilt.
This is where query plans come in, and SQL Server 2008 offers a lot of new features that help DBAs pin a particular query plan in place long-term no matter what parameters get called first.
My concern is that when you rebuilt your stored proc, you forced the execution plan to recompile. You called it with your favorite parameter, and then of course it was fast - but the problem may not have been the stored proc. It might have been that the stored proc was recompiled at some point with an unusual set of parameters and thus, an inefficient query plan. You might not have fixed anything, and you might face the same problem the next time the server restarts or the query plan gets recompiled.
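To see what is currently cached for such a procedure, something along these lines should work on SQL Server 2005 and later. The dbo schema is an assumption; the procedure name is the example one used above. Opening the returned query_plan XML shows the values the plan was compiled for, recorded as ParameterCompiledValue.

```sql
-- Find the cached plan for the procedure in the current database.
SELECT cp.usecounts,
       cp.objtype,
       st.text,
       qp.query_plan      -- look for ParameterCompiledValue in this XML
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE cp.objtype = N'Proc'
  AND st.dbid = DB_ID()
  AND st.objectid = OBJECT_ID(N'dbo.usp_QueryMyDataByState');

-- Mark the procedure so a new plan is compiled on its next execution.
EXEC sp_recompile N'dbo.usp_QueryMyDataByState';
```

The new plan is again built for whatever parameter value arrives first, so this resets the problem rather than removing it.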

Yes, I think you mean parameter sniffing, which is a technique the SQL Server optimizer uses to try to figure out parameter values/ranges so it can choose the best execution plan for your query. In some instances SQL Server does a poor job at parameter sniffing & doesn't pick the best execution plan for the query.
I believe this blog article http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx has a good explanation.
It seems that the DBA in your example chose option #4 to move the query to another sproc to a separate procedural context.
You could have also used the with recompile on the original sproc or used the optimize for option on the parameter.

A simple way to speed that up is to reassign the input parameters to local variables at the very beginning of the sproc, e.g.
CREATE PROCEDURE uspParameterSniffingAvoidance
@SniffedFormalParameter int
AS
BEGIN
DECLARE @SniffAvoidingLocalParameter int
SET @SniffAvoidingLocalParameter = @SniffedFormalParameter
--Work w/ @SniffAvoidingLocalParameter in sproc body
-- ...
END

In my experience, the best solution for parameter sniffing is dynamic SQL. Two important things to note: 1. you should use parameters in your dynamic SQL query, and 2. you should use sp_executesql (and not sp_execute), which caches the execution plan for the parameterized statement.
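A minimal sketch of parameterized dynamic SQL with sp_executesql. The dbo.Orders table, its columns, and the value 42 are hypothetical.

```sql
DECLARE @OrderId int = 42;   -- hypothetical value

-- The statement text stays constant; only the parameter value changes,
-- so the cached plan for this text can be reused across calls.
DECLARE @sql nvarchar(max) =
    N'SELECT o.OrderId, o.OrderDate
      FROM dbo.Orders AS o
      WHERE o.OrderId = @OrderId;';

EXEC sys.sp_executesql
     @sql,
     N'@OrderId int',        -- parameter definition list
     @OrderId = @OrderId;    -- value passed as a parameter, not concatenated
```

Because the plan is reused, the parameterized statement is still compiled for the first value it sees. Dynamic SQL helps mainly when the statement text itself is varied per case, which gives each variant its own plan.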

Parameter sniffing is a technique SQL Server uses to optimize the query execution plan for a stored procedure. When you first call the stored procedure, SQL Server looks at the given parameter values of your call and decides which indices to use based on the parameter values.
So when the first call contains not very typical parameters, SQL Server might select and store a sub-optimal execution plan in regard to the following calls of the stored procedure.
You can work around this by either
using WITH RECOMPILE
copying the parameter values to local variables inside the stored procedure and using the locals in your queries.
I even heard that it's better to not use stored procedures at all but to send your queries directly to the server.
I recently came across the same problem where I have no real solution yet.
For some queries the copy to local vars helps getting back to the right execution plan, for some queries performance degrades with local vars.
I still have to do more research on how SQL Server caches and reuses (sub-optimal) execution plans.
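For reference, WITH RECOMPILE can be applied either to the procedure definition or to a single call. The procedure and table names below are hypothetical.

```sql
-- Definition level: no plan is cached; every call compiles a fresh one.
CREATE PROCEDURE dbo.usp_GetOrdersByCustomer_Recompiled   -- hypothetical
    @CustomerId int
WITH RECOMPILE
AS
BEGIN
    SELECT o.OrderId, o.OrderDate
    FROM dbo.Orders AS o                                  -- hypothetical
    WHERE o.CustomerId = @CustomerId;
END;
GO

-- Call level: compile a fresh plan for this one execution only. This form
-- also works for procedures that were not created WITH RECOMPILE.
EXEC dbo.usp_GetOrdersByCustomer_Recompiled @CustomerId = 42 WITH RECOMPILE;
```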

I had a similar problem. My stored procedure took 30-40 seconds to execute. When I ran the statements from the procedure in a query window, they took a few milliseconds.
Then I tried declaring local variables within the stored procedure and copying the parameter values into them. This made the procedure very fast: it now executes within a few milliseconds instead of 30-40 seconds.

Put very simply and briefly: the query optimizer reuses an old query plan for frequently run queries. Meanwhile the data keeps growing, so a newly optimized plan is required, yet the optimizer keeps using the old plan. This is called parameter sniffing.
I have also created detailed post on this. Please visit this url:
http://www.dbrnd.com/2015/05/sql-server-parameter-sniffing/

Changing your stored procedure to execute the select as a batch should increase the speed.
Batch select, i.e.:
exec ('select * from [order] where [order id] = ''' + @ordersID + '''')
Instead of the normal stored procedure select:
select * from [order] where [order id] = @ordersID
Just pass in the parameter as nvarchar and you should get quicker results.