I'm running SQL Server 2014.
In the following code, I have a temp table (defined earlier in the code) that is being filled from a stored procedure. Most of the parameters for the stored procedure are standard data types, but @GroupLayoutSpecifications is a table-valued parameter that accepts a small heap table and is joined to within the stored procedure.
INSERT #StandardizedResponses
EXEC rpt.usp_QueryBuilder_GatherStandardizedResponses
@Member, @OrgUnits, @Groups, @MeasureName,
@StartDate, @EndDate, @InstrumentTypeIDs,
@BackgroundDataIDs, @GroupLayoutSpecifications;
The problem I'm having is that the query engine isn't able to effectively estimate the number of rows the stored procedure is likely to return. It typically estimates something like 1 row, with an actual return closer to 200k rows. I believe this is what is causing a tempdb spill later in the plan.
Is it likely that it is the table-type parameter that is causing the query engine some grief? If so, how might I get around that?
Similarly, is there a way to hint to SQL Server that the following query will likely result in a larger than expected row count?
I've researched this quite a bit through things like MSDN, this site, SQL Authority, and others, and am hoping someone here can help me tune this.
If you need more information to supply a reasonable answer, just let me know what you might need.
Cheers,
Joe
Related
I have a stored procedure (SP) in which a table-valued parameter (TVP) is being passed in. The same code in the SP executes a lot more slowly than it does outside the SP.
I took a look at the execution plans and they are very different.
At first this seemed like a sign of parameter sniffing, however this is a TVP! Which works a bit differently (I am not too certain - apparently there is no sniffing for TVPs?).
In any case, if I create a new local TVP and insert the rows into it, then I get a good execution plan!
CREATE PROCEDURE [dbo].[TVPSniffTest] (
@GuidList dbo.Guid_LIST READONLY
)
AS
BEGIN
DECLARE @GuidList2 dbo.Guid_LIST
INSERT INTO @GuidList2
SELECT * FROM @GuidList
--query code here using @GuidList2, produces a good plan!
END
What is going on?
Edit: I've tried a number of query optimizer hints; they do not work, including the one in the suggested duplicate question. It's almost like the bad (slow) plan is the one that is correct in terms of the estimated number of rows, while the fast plan has an incorrect estimate of the number of rows.
TVPs don't have distribution statistics, but they do have cardinality information.
Parameter sniffing applies to table-valued parameters: the optimizer might reuse a plan compiled for a low-cardinality TVP in a subsequent invocation of the SP with a TVP containing many rows.
Sources:
Erland Sommarskog's Arrays and Lists in SQL Server: The Long Version
(section 13.1 Performance in SQL Server)
Brent Ozar's Table Valued Parameters: Unexpected Parameter Sniffing (see "Second Thing: They don’t get a fixed estimate like Table Variables" and "Third Thing: Non-join cardinality estimates behave like local variables (fixed estimates)")
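As a hedged sketch of the reuse scenario both sources describe, using the procedure and table type from the question above (the column name inside dbo.Guid_LIST isn't shown, so the inserts assume a single-column type and omit the column list):

DECLARE @few dbo.Guid_LIST
INSERT @few VALUES (NEWID())                        -- 1 row
EXEC dbo.TVPSniffTest @GuidList = @few              -- plan compiled for ~1 row

DECLARE @many dbo.Guid_LIST
INSERT @many SELECT NEWID() FROM sys.all_objects    -- thousands of rows
EXEC dbo.TVPSniffTest @GuidList = @many             -- may reuse the 1-row plan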
I have a stored procedure which contains a lot of queries.
I would like to see the query plan for only one specific query.
EX:
select * from A
select * from B
--QUERY PLAN THIS
select * from C
--END OF QUERY PLAN
select * from D
Thanks a lot
You really can't do that. So you have to work around the problem by extracting that specific bit of SQL and building its execution plan by itself.
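If all you need is the plan for that one statement in isolation, a minimal sketch (using table C from the example above) is to run it on its own in SSMS:

SET STATISTICS XML ON;    -- or enable "Include Actual Execution Plan" (Ctrl+M)
SELECT * FROM C;          -- only this statement's plan is returned
SET STATISTICS XML OFF;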
It may be that you're curious to see if the execution plan is affected when running in the context of the rest of the stored procedure. In that case you'll have to go ahead and generate a plan on the whole procedure and decipher what piece of it is the query you're concerned with.
You may also find SQL profiler helpful when trying to optimize/troubleshoot specific elements of complicated procedures.
I have a relatively complex query, with several self joins, which works on a rather large table.
For that query to perform faster, I thus need to only work with a subset of the data.
Said subset of data can range between 12 000 and 120 000 rows depending on the parameters passed.
More details can be found here: SQL Server CTE referred in self joins slow
As you can see, I was using a CTE to return the data subset before, which caused some performance problems as SQL Server was re-running the Select statement in the CTE for every join instead of simply being run once and reusing its data set.
The alternative, using temporary tables worked much faster (while testing the query in a separate window outside the UDF body).
However, when I tried to implement this in a multi-statement UDF, I was harshly reminded by SQL Server that multi-statement UDFs do not support temporary tables for some reason...
UDFs do allow table variables however, so I tried that, but the performance is absolutely horrible as it takes 1m40 for my query to complete whereas the CTE version only took 40 seconds.
I believe the table variable is slow for the reasons listed in this thread: Table variable poor performance on insert in SQL Server Stored Procedure
The temporary table version takes around 1 second, but I can't make it into a function due to the SQL Server restrictions, and I have to return a table back to the caller.
Considering that CTEs and table variables are both too slow, and that temporary tables are rejected in UDFs, what are my options for making my UDF perform quickly?
Thanks a lot in advance.
In many such cases, all we need to do is declare primary keys for those table variables, and it is fast again.
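For example, a minimal sketch (the table and column names are placeholders):

DECLARE @Ids TABLE
(
    Id uniqueidentifier NOT NULL PRIMARY KEY   -- the PK gives the optimizer a usable index
);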
Set up and use a process-keyed table; see the article How to Share Data Between Stored Procedures by Erland Sommarskog.
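A rough sketch of that pattern, with illustrative names only: a permanent table carries a key identifying the calling process, so several callers can share it safely.

CREATE TABLE dbo.ProcessKeyedResults
(
    ProcessKey uniqueidentifier NOT NULL,
    SomeValue  int              NOT NULL,
    PRIMARY KEY (ProcessKey, SomeValue)
);
GO

DECLARE @ProcessKey uniqueidentifier = NEWID();

-- The producing routine inserts rows tagged with the caller's key...
INSERT dbo.ProcessKeyedResults (ProcessKey, SomeValue)
VALUES (@ProcessKey, 42);

-- ...and the caller reads only its own rows, then cleans up.
SELECT SomeValue FROM dbo.ProcessKeyedResults WHERE ProcessKey = @ProcessKey;
DELETE dbo.ProcessKeyedResults WHERE ProcessKey = @ProcessKey;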
One kludgey work-around I've used involves code like so (pseudo code follows):
-- Calling code (the stored procedure below must already exist)
CREATE TABLE #foo (SomeValue int)   -- a temp table; there is no CREATE TEMP TABLE in T-SQL
EXECUTE MyStoredProcedure
SELECT *
FROM #foo
GO
-- Stored procedure definition
CREATE PROCEDURE MyStoredProcedure
AS
INSERT #foo VALUES (1)   -- whatever values you need
RETURN
GO
In short, the stored procedure references and uses a temp table created by the calling procedure (or routine). This will work, but it can be confusing for others to follow what's going on if you don't document it clearly, and you will get recompiles, statistics recalcs, and other oddness that may consume unwanted clock cycles.
I have a report that renders data returned from a stored procedure. Using profiler I can catch the call to the stored procedure from the reporting services.
The report fails, stating that the report timed out, yet I can execute the stored procedure from SSMS and it returns the data in five to six seconds.
Note, in the example test run only two rows are returned to the report for rendering though within the stored procedure it may have been working over thousands or even millions of records in order to collate the result passed back to reporting services.
I know the stored procedure could be optimised more but I do not understand why SSRS would be timing out when the execution only seems to take a few seconds to execute from SSMS.
Also another issue has surfaced. If I recreate the stored procedure, the report starts to render perfectly fine again. That is fine except after a short period of time, the report starts timing out again.
The return of the time out seems to be related to new data being added into the main table the report is running against. In the example I was testing, just one hundred new records being inserted was enough to screw up the report.
I imagine, more correctly, it's not the report that is the root cause; it is the stored procedure that is causing the time-out when executed from SSRS.
Once it is timing out again, the best fix I have so far is to recreate the stored procedure. This doesn't seem to be an ideal solution.
The problem also only seems to be occurring on our production environment. Our test and development platforms do not seem to be exhibiting the same problem, though dev and test do not have the same volume of records as production.
The problem, as you described it, seems to come from variations in the execution plans of some parts of your stored procedure. Look at what statistics are kept on the tables used and how adding new rows affects them.
If you're adding a lot of rows at the end of the range of a column (think about adding autonumbers, or timestamps), the histogram for that column will become outdated rapidly. You can force an immediate update from T-SQL by executing the UPDATE STATISTICS statement.
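For example (the table and statistics names are placeholders):

UPDATE STATISTICS dbo.MainReportTable;                 -- refresh all statistics on the table
UPDATE STATISTICS dbo.MainReportTable IX_CreatedDate   -- or just one statistics object
    WITH FULLSCAN;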
I have also had this issue where the SPROC takes seconds to run yet SSRS simply times out.
I have found from my own experience that there are a couple of different methods to overcome this issue.
It is parameter sniffing! When your stored procedure is executed from SSRS it will "sniff" out your parameters to see how your SPROC is using them. SQL Server will then produce an execution plan based on its findings. This is good the first time you execute your SPROC, but you don't want it to be doing this every time you run your report. So I declare a new set of variables at the top of my SPROCs which simply store the parameters passed in the query, and use these new variables throughout the query.
Example:
CREATE PROCEDURE [dbo].[usp_REPORT_ITD001]
@StartDate DATETIME,
@EndDate DATETIME,
@ReportTab INT
AS
-- Deter parameter sniffing
DECLARE @snf_StartDate DATETIME = @StartDate
DECLARE @snf_EndDate DATETIME = @EndDate
DECLARE @snf_ReportTab INT = @ReportTab
...this means that when your SPROC is executed by SSRS it is only looking at the first few rows in your query for the passed parameters rather than the whole of your query, which cuts down execution time considerably in SSRS.
If your SPROC has a lot of temp tables that are declared as variables (DECLARE @MyTable AS TABLE), these are really intensive on the server (in terms of memory) when generating reports. By using hash temp tables (SELECT MyCol1, MyCol2 INTO #MyTable) instead, SQL Server will store your temp tables in TempDB on the server rather than in system memory, making the report generation less intensive.
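A short sketch of the two forms contrasted above (the column and table names are made up):

DECLARE @MyTable AS TABLE (MyCol1 int, MyCol2 nvarchar(50));   -- table variable

SELECT MyCol1, MyCol2
INTO #MyTable              -- hash temp table, materialised in tempdb
FROM dbo.SourceTable;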
Sometimes adding the WITH RECOMPILE option to the CREATE statement of the stored procedure helps.
This is effective in situations where the number of records examined by the procedure changes in such a way that the original execution plan is no longer optimal.
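For example (the procedure name, parameters, and table are placeholders):

CREATE PROCEDURE dbo.usp_MyReport
    @StartDate datetime,
    @EndDate   datetime
WITH RECOMPILE              -- a fresh plan is compiled on every execution
AS
BEGIN
    SELECT OrderID
    FROM dbo.Orders
    WHERE OrderDate BETWEEN @StartDate AND @EndDate;
END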
Basically, all I've done so far is optimise the sproc a bit more, and it seems to at least temporarily solve the problem.
I would still like to know what the difference is between calling the sproc from SSMS and SSRS.
A while ago I had a query that I ran quite a lot for one of my users. It was still being evolved and tweaked but eventually it stabilised and ran quite quickly, so we created a stored procedure from it.
So far, so normal.
The stored procedure, though, was dog slow. No material difference between the query and the proc, but the speed change was massive.
[Background, we're running SQL Server 2005.]
A friendly local DBA (who no longer works here) took one look at the stored procedure and said "parameter spoofing!" (Edit: although it seems that it is possibly also known as 'parameter sniffing', which might explain the paucity of Google hits when I tried to search it out.)
We abstracted some of the stored procedure to a second one, wrapped the call to this new inner proc into the pre-existing outer one, called the outer one and, hey presto, it was as quick as the original query.
So, what gives? Can someone explain parameter spoofing?
Bonus credit for
highlighting how to avoid it
suggesting how to recognise possible causes
discussing alternative strategies, e.g. stats, indices, keys, for mitigating the situation
FYI - you need to be aware of something else when you're working with SQL 2005 and stored procs with parameters.
SQL Server will compile the stored proc's execution plan with the first parameter that's used. So if you run this:
usp_QueryMyDataByState 'Rhode Island'
The execution plan will work best with a small state's data. But if someone turns around and runs:
usp_QueryMyDataByState 'Texas'
The execution plan designed for Rhode-Island-sized data may not be as efficient with Texas-sized data. This can produce surprising results when the server is restarted, because the newly generated execution plan will be targeted at whatever parameter is used first - not necessarily the best one. The plan won't be recompiled until there's a big reason to do it, like if statistics are rebuilt.
This is where query plans come in, and SQL Server 2008 offers a lot of new features that help DBAs pin a particular query plan in place long-term no matter what parameters get called first.
My concern is that when you rebuilt your stored proc, you forced the execution plan to recompile. You called it with your favorite parameter, and then of course it was fast - but the problem may not have been the stored proc. It might have been that the stored proc was recompiled at some point with an unusual set of parameters and thus, an inefficient query plan. You might not have fixed anything, and you might face the same problem the next time the server restarts or the query plan gets recompiled.
Yes, I think you mean parameter sniffing, which is a technique the SQL Server optimizer uses to try to figure out parameter values/ranges so it can choose the best execution plan for your query. In some instances SQL Server does a poor job at parameter sniffing & doesn't pick the best execution plan for the query.
I believe this blog article http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx has a good explanation.
It seems that the DBA in your example chose option #4: moving the query to another sproc in a separate procedural context.
You could have also used WITH RECOMPILE on the original sproc, or used the OPTIMIZE FOR option on the parameter.
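Hedged sketches of those two options, reusing the procedure name from the earlier answer (the underlying table dbo.MyData and its State column are assumptions):

-- 1. Recompile the whole procedure on every call:
ALTER PROCEDURE dbo.usp_QueryMyDataByState
    @State nvarchar(50)
WITH RECOMPILE
AS
    SELECT * FROM dbo.MyData WHERE State = @State;
GO

-- 2. Or keep a cached plan, but tell the optimizer which value to plan for:
ALTER PROCEDURE dbo.usp_QueryMyDataByState
    @State nvarchar(50)
AS
    SELECT * FROM dbo.MyData
    WHERE State = @State
    OPTION (OPTIMIZE FOR (@State = N'Texas'));
GO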
A simple way to speed that up is to reassign the input parameters to local parameters in the very beginning of the sproc, e.g.
CREATE PROCEDURE uspParameterSniffingAvoidance
@SniffedFormalParameter int
AS
BEGIN
DECLARE @SniffAvoidingLocalParameter int
SET @SniffAvoidingLocalParameter = @SniffedFormalParameter
-- Work with @SniffAvoidingLocalParameter in the sproc body
-- ...
END
In my experience, the best solution for parameter sniffing is dynamic SQL. Two important things to note are that 1. you should use parameters in your dynamic SQL query, and 2. you should use sp_executesql (and not sp_execute), which saves the execution plan for each set of parameter values.
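A minimal sketch of parameterised dynamic SQL (the table and column names are placeholders):

DECLARE @sql nvarchar(max) =
    N'SELECT * FROM dbo.Orders WHERE OrderID = @OrderID;';

EXEC sp_executesql
    @sql,
    N'@OrderID int',
    @OrderID = 12345;   -- the plan is cached and reused across parameter values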
Parameter sniffing is a technique SQL Server uses to optimize the query execution plan for a stored procedure. When you first call the stored procedure, SQL Server looks at the given parameter values of your call and decides which indices to use based on the parameter values.
So when the first call contains not very typical parameters, SQL Server might select and store a sub-optimal execution plan in regard to the following calls of the stored procedure.
You can work around this by either
using WITH RECOMPILE
copying the parameter values to local variables inside the stored procedure and using the locals in your queries.
I even heard that it's better to not use stored procedures at all but to send your queries directly to the server.
I recently came across the same problem where I have no real solution yet.
For some queries the copy to local vars helps getting back to the right execution plan, for some queries performance degrades with local vars.
I still have to do more research on how SQL Server caches and reuses (sub-optimal) execution plans.
I had a similar problem. My stored procedure's execution took 30-40 seconds. I tried running the SP's statements in a query window and it took a few ms to execute the same code.
Then I tried declaring local variables within the stored procedure and transferring the values of the parameters to the local variables. This made the SP execution very fast, and now the same SP executes within a few milliseconds instead of 30-40 seconds.
Put very simply: the query optimizer reuses an old query plan for frequently run queries. But the size of the data keeps increasing, so at some point a newly optimized plan is required, yet the optimizer is still using the old plan for the query. This is called parameter sniffing.
I have also written a detailed post on this. Please visit this URL:
http://www.dbrnd.com/2015/05/sql-server-parameter-sniffing/
Changing your stored procedure to execute the select as a dynamic SQL batch should increase the speed.
Batch select, i.e.:
exec ('select * from [order] where orderid = ''' + @ordersID + '''')
Instead of the normal stored procedure select:
select * from [order] where orderid = @ordersID
Just pass in the parameter as nvarchar and you should get quicker results.