I have a query in my MVC application which takes about 20 seconds to complete (using NHibernate 3.1). When I execute the same query manually in Management Studio it takes 0 seconds.
I've seen similar questions on SO about problems like this one, so I took my test one step further.
I intercepted the query using Sql Server Profiler, and executed the query using ADO.NET in my application.
The query that I got from the Profiler is something like: "exec sp_executesql N'select...."
My ADO.NET code:
SqlConnection conn = (SqlConnection) NHibernateManager.Current.Connection;
var query = @"<query from profiler...>";
var cmd = new SqlCommand(query, conn);
SqlDataReader reader = cmd.ExecuteReader(CommandBehavior.CloseConnection);
return RedirectToAction("Index");
This query is also very fast, taking no time to execute.
Also, I've seen something very strange in the Profiler. The query, when executed from NH, has the following statistics:
reads: 281702
writes: 0
The one from ADO.NET:
reads: 333
writes: 0
Does anyone have a clue? Is there any info I can provide to help diagnose the problem?
I thought it might be related to some connection settings, but the ADO.NET version is using the same connection as NHibernate.
Thanks in advance
UPDATE:
I'm using NHibernate LINQ. The query is enormous, but it is a paging query, with just 10 records being fetched.
The parameters that are passed to the "exec sp_executesql" are:
@p0 int,@p1 datetime,@p2 datetime,@p3 bit,@p4 int,@p5 int
@p0=10,@p1='2009-12-01 00:00:00',@p2='2009-12-31 23:59:59',@p3=0,@p4=1,@p5=0
It turned out that ADO.NET and NHibernate were using different query plans, and the NH version was suffering the effects of parameter sniffing. Why? Because I had previously run the query with a small date interval, and the cached query plan was optimized for it.
Afterwards, when querying with a large date interval, the cached plan was reused and it took ages to get a result.
I confirmed that this was in fact the problem because a simple:
DBCC FREEPROCCACHE -- clears the query-plan cache
made my query fast again.
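(Flushing the whole plan cache is heavy-handed outside a test box. On SQL Server 2008 and later you can find and evict just the one plan; a rough sketch, with a made-up search fragment:)

-- Find the cached plan for the offending query
SELECT cp.plan_handle, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%<fragment of the slow query>%'

-- Evict just that plan instead of the whole cache:
-- DBCC FREEPROCCACHE (<plan_handle from the query above>)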
I found 2 ways to solve this:
Injecting an "option(recompile)" to the query, using a NH Interceptor
Adding a dummy predicate to my NH Linq expression, like: query = query.Where(true) when the expected result-set was small (date interval wise). This way two different query plans would be created, one for large-sets of data and one for small-sets.
I tried both options, and both worked, but opted for the second approach. It's a little bit hacky but works really well I my case, because the data is uniformly distributed date-wise.
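To make the two fixes concrete, here is a rough T-SQL sketch of what the intercepted statement ends up looking like (the table and column names are made up; the real query is a huge LINQ-generated paging query):

-- Fix 1: the Interceptor appends OPTION (RECOMPILE), so a fresh plan is
-- compiled for the actual parameter values on every execution
EXEC sp_executesql
    N'SELECT TOP (@p0) * FROM Orders
      WHERE OrderDate BETWEEN @p1 AND @p2
      OPTION (RECOMPILE)',
    N'@p0 int, @p1 datetime, @p2 datetime',
    @p0 = 10, @p1 = '2009-12-01', @p2 = '2009-12-31'

-- Fix 2: the dummy predicate changes the statement text, so SQL Server
-- caches a second, independent plan for the other query shape
EXEC sp_executesql
    N'SELECT TOP (@p0) * FROM Orders
      WHERE OrderDate BETWEEN @p1 AND @p2 AND 1 = 1',
    N'@p0 int, @p1 datetime, @p2 datetime',
    @p0 = 10, @p1 = '2009-12-14', @p2 = '2009-12-15'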
I had the exact same problem as the OP. I tried @psousa's suggestion of injecting an "option(recompile)" hint, which did improve my performance. But in the end I found that simply updating statistics on SQL Server did the trick for me.
update statistics tablename;
I ended up backing out my code that injected the "option(recompile)" hint. I realize this may not be the answer for everyone, but I wanted to share it since it was the cause of my problems.
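For reference, this is roughly what I ran (the table name is made up; WITH FULLSCAN is optional but gives the most accurate histogram):

UPDATE STATISTICS dbo.MyTable WITH FULLSCAN;

-- Check when the statistics were last refreshed
SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.MyTable');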
Look at the parameters being supplied to the sp_executesql call. If the parameters are supplied as nvarchar (N'value') and the columns they reference are varchar, SQL Server will use a very inefficient query plan. This has been the root cause of all the performance issues I've had that exhibit these symptoms (slow in the app, fast in SSMS).
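A minimal sketch of the symptom, assuming a hypothetical Customers table with a varchar LastName column:

-- N'...' makes the parameter nvarchar; comparing it to a varchar column
-- forces an implicit conversion of the column, typically turning seeks into scans
EXEC sp_executesql
    N'SELECT * FROM dbo.Customers WHERE LastName = @name',
    N'@name nvarchar(50)', @name = N'Smith'

-- Declaring the parameter with the column's own type allows a plain index seek
EXEC sp_executesql
    N'SELECT * FROM dbo.Customers WHERE LastName = @name',
    N'@name varchar(50)', @name = 'Smith'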
You didn't specify your query or the size of its result set, but there's a known issue with fetching a large number of entities with NHibernate.
Basically, the time to 'hydrate' the objects is what takes so long.
You can try turning on the reflection optimizer, or using an IStatelessSession.
See some suggestions I've got here.
I'm maintaining stored procedures for SQL Server 2005 and I wish I could use a new feature in 2008 that allows the query hint: "OPTIMIZE FOR UNKNOWN"
It seems as though the following query (written for SQL Server 2005) estimates the same number of rows (i.e. selectivity) as if OPTION (OPTIMIZE FOR UNKNOWN) were specified:
CREATE PROCEDURE SwartTest(@productid INT)
AS
DECLARE @newproductid INT
SET @newproductid = @productid
SELECT ProductID
FROM Sales.SalesOrderDetail
WHERE ProductID = @newproductid
This query avoids parameter sniffing by declaring and setting a new variable. Is this really a SQL Server 2005 work-around for the OPTIMIZE-FOR-UNKNOWN feature? Or am I missing something? (Authoritative links, answers or test results are appreciated).
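For comparison, the 2008 hint I'm asking about looks like this (same query; the procedure name is made up):

CREATE PROCEDURE SwartTest2008(@productid INT)
AS
SELECT ProductID
FROM Sales.SalesOrderDetail
WHERE ProductID = @productid
OPTION (OPTIMIZE FOR UNKNOWN)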
More Info:
A quick test on SQL Server 2008 tells me that the number of estimated rows in this query is in fact the same as if OPTIMIZE FOR UNKNOWN were specified. Is this the same behavior on SQL Server 2005? I think I remember hearing once that, without more info, the SQL Server optimizer has to guess at the selectivity of the parameter (usually at 10% for inequality predicates). I'm still looking for definitive info on SQL 2005 behavior, though I'm not quite sure that info exists.
More Info 2:
To be clear, this question is asking for a comparison of the UNKNOWN query hint and the parameter-masking technique I describe.
It's a technical question, not a problem solving question. I considered a lot of other options and settled on this. So the only goal of this question was to help me gain some confidence that the two methods are equivalent.
I've used that solution several times recently to avoid parameter sniffing on SQL 2005, and it seems to me to do the same thing as OPTIMIZE FOR UNKNOWN on SQL 2008. It fixed a lot of problems we had with some of our bigger stored procedures sometimes just hanging when passed certain parameters.
Okay, so I've done some experimenting. I'll write up the results here, but first I want to say that based on what I've seen and know, I'm confident that using temporary parameters in 2005 and 2008 is exactly equivalent to using 2008's OPTIMIZE FOR UNKNOWN. At least in the context of stored procedures.
So this is what I've found.
In the procedure above, I'm using the AdventureWorks database (but I use similar methods and get similar results for any other database). I ran:
dbcc show_statistics ('Sales.SalesOrderDetail', IX_SalesOrderDetail_ProductID)
And I see statistics with 200 steps in the histogram. Looking at the histogram I see that there are 66 distinct range rows (i.e., 66 distinct values that weren't included in the stats as equality values). Add the 200 equality rows (one from each step), and I get an estimate of 266 distinct values for ProductId in Sales.SalesOrderDetail.
With 121317 rows in the table, each ProductId has 121317 / 266 ≈ 456 rows on average. And when I look at the query plan for my test procedure (in XML format) I see something like:
...
<QueryPlan DegreeOfParallelism="1" >
<RelOp NodeId="0"
PhysicalOp="Index Seek"
LogicalOp="Index Seek"
EstimateRows="456.079"
TableCardinality="121317" />
...
<ParameterList>
<ColumnReference
Column="#newproductid"
ParameterRuntimeValue="(999)" />
</ParameterList>
</QueryPlan>
...
So I know where the EstimateRows value comes from (accurate to three decimals). Notice also that the ParameterCompiledValue attribute is missing from the query plan. This is exactly what a plan looks like when using 2008's OPTIMIZE FOR UNKNOWN.
Interesting question.
There's a good article on the SQL Programming and API Development Team blog here, which lists the pre-SQL 2008 workaround solutions as:
use RECOMPILE hint so the query is recompiled every time
unparameterise the query
give specific values in OPTIMIZE FOR hint
force use of a specific index
use a plan guide
Which leads me on to this article, which mentions your workaround of using local parameters and how it generates an execution plan based on statistics. How similar this process is to the new OPTIMIZE FOR UNKNOWN hint, I don't know. My hunch is that it's a reasonable workaround.
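As a rough sketch of the two hint-based workarounds from that list (table and parameter names invented; both statements shown in one procedure for brevity):

CREATE PROCEDURE usp_OrdersByCustomer (@custid int)
AS
BEGIN
    -- Workaround 1: recompile this statement on every execution,
    -- so the plan always fits the current parameter value
    SELECT * FROM dbo.Orders
    WHERE CustomerID = @custid
    OPTION (RECOMPILE)

    -- Workaround 3: always optimize for one representative value,
    -- regardless of which value happens to be sniffed first
    SELECT * FROM dbo.Orders
    WHERE CustomerID = @custid
    OPTION (OPTIMIZE FOR (@custid = 42))
END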
I've been using this parameter masking technique for at least the past year because of odd performance problems, and it has worked well, but it is VERY annoying to have to do all the time.
I have ALSO been using WITH RECOMPILE.
I do not have controlled tests because I can't selectively turn the usage of each on and off automatically in the system, but I suspect that parameter masking will only help IF the parameter is used. I have some complex SPs where the parameter is not used in every statement, and I expect that WITH RECOMPILE was still necessary because some of the "temporary" work tables are not populated (or even indexed identically, if I'm trying to tune) the same way on every run, and some later statements don't rely on the parameter once the work tables are already appropriately populated. I have broken some processes up into multiple SPs precisely so that work done to populate a work table in one SP can be properly analyzed and executed WITH RECOMPILE in the next SP.
Using SQL Server Management Studio.
How can I test the performance of a large select (say 600k rows) without the results window impacting my test? All things being equal it doesn't really matter, since the two queries will both be outputting to the same place. But I'd like to speed up my testing cycles and I'm thinking that the output settings of SQL Server Management Studio are getting in my way. Output to text is what I'm using currently, but I'm hoping for a better alternative.
I think this is impacting my numbers because the database is on my local box.
Edit: I had a question here about using WHERE 1=0 (thinking that the join would happen but nothing would be output), but I tested it and it didn't work: it's not a valid indicator of query performance, presumably because the optimizer detects the contradiction and short-circuits the plan.
You could do SET ROWCOUNT 1 before your query. I'm not sure it's exactly what you want but it will avoid having to wait for lots of data to be returned and therefore give you accurate calculation costs.
However, if you add Client Statistics to your query, one of the numbers is 'Wait time on server replies', which gives you the server calculation time without the time it takes to transfer the data over the network.
You can SET STATISTICS TIME ON to get a measurement of the time on the server. And you can use Query > Include Client Statistics (Shift+Alt+S) in SSMS to get detailed information about client time usage. Note that SQL queries don't run to completion and then return the result to the client; instead they return results as they run, and can even suspend execution if the communication channel is full.
The only context in which a query completely skips sending the result packets back to the client is activation. But then the time to return the output to the client should also be considered when you measure performance. Are you sure your own client will be any faster than SSMS?
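A quick sketch of measuring server-side cost this way:

SET STATISTICS TIME ON
SET STATISTICS IO ON

SELECT name, type FROM master..spt_values   -- your query here

SET STATISTICS TIME OFF
SET STATISTICS IO OFF

The Messages tab then reports CPU time and elapsed time for the statement, plus the logical reads from STATISTICS IO.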
SET ROWCOUNT 1 will stop processing after the first row is returned, which means that unless the plan happens to have a blocking operator, the results will be useless.
Taking a trivial example
SELECT * FROM TableX
The cost of this query in practice will heavily depend on the number of rows in TableX.
Using SET ROWCOUNT 1 won't show any of that. Irrespective of whether TableX has 1 row or 1 billion rows, it will stop executing after the first row is returned.
I often assign the SELECT results to variables to be able to look at things like logical reads without being slowed down by SSMS displaying the results.
SET STATISTICS IO ON

-- Assigning the columns to variables means SSMS never has to render the rows
DECLARE @name nvarchar(35),
        @type nchar(3)

SELECT @name = name,
       @type = type
FROM master..spt_values
There is a related Connect Item request Provide "Discard results at server" option in SSMS and/or TSQL
The best thing you can do is to check the Query Execution Plan (press Ctrl+L) for the actual query. That will give you the best guesstimate for performance available.
I'd think that the WHERE 1=0 clause is definitely evaluated on the SQL Server side, not in Management Studio. No results would be returned.
Is your DB engine on the same machine you're running Management Studio on?
You could:
Output to Text, or
Output to File.
Close the Query Results pane.
That'd just move the cycles spent on drawing the grid in Mgmt Studio. Results to Text would perhaps be more performant on the whole. Hiding the pane saves Mgmt Studio the cycles spent drawing the data, but the data is still being returned to Mgmt Studio, so it really isn't saving a lot of cycles.
How can you test the performance of your query if you don't output the results? Speeding up the testing is pointless if the testing doesn't tell you anything about how the query is going to perform. Do you really want to find out that this dog of a query takes ten minutes to return data after you push it to prod?
And of course it's going to take some time to return 600,000 records. It will in your user interface as well; it will probably take longer there than in your query window, because the info has to go across the network.
There are a lot of more correct answers here, but I assume the real question is the one I asked myself when I stumbled upon this question:
I have a query A and a query B on the same test data. Which is faster? And I want a quick-and-dirty check. For me the answer is temp tables (the overhead of creating a temp table is easy to ignore here). This is to be done on a perf/testing/dev server only!
Query A:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS -- clears the buffer cache
SELECT * INTO #temp1 FROM ...
Query B
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
SELECT * INTO #temp2 FROM ...
I currently have a 'Filter' object which corresponds to a business object. This object has properties that relate to the different ways that I want to be able to filter/search a list of such business objects. Currently these Filter objects have a method that builds the contents of a where clause, which is then passed to a SQL Server 2000 stored procedure where it is concatenated with the rest of the select query. The final string is then executed using EXEC.
Currently this works fine, except I worry about the performance issue with the lack of execution-plan caching. In my research I have seen the use of sp_executesql; is this a better solution, or are there better conventions for what I am doing?
Update: I think part of the issue with using sp_executesql is that, based on a collection in my filter, I need to generate a list of OR statements. I am not sure that a 'parameterized' query would be my solution (but see the sketch after the example below).
example
var whereClause = new StringBuilder();
if (Status.Count > 0)
{
whereClause.Append("(");
foreach (OrderStatus item in Status)
{
whereClause.AppendFormat("Orders.Status = {0} OR ", (int)item);
}
whereClause.Remove(whereClause.Length - 4, 3);
whereClause.Append(") AND ");
}
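One way to keep that variable-length OR list parameterized is to generate one numbered parameter per status value and hand them all to sp_executesql. A sketch (SQL 2000 compatible; the query itself is invented):

DECLARE @sql nvarchar(4000)
SET @sql = N'SELECT * FROM Orders
             WHERE (Orders.Status = @s0 OR Orders.Status = @s1 OR Orders.Status = @s2)'

-- One plan is cached per distinct parameter count and reused for any values
EXEC sp_executesql @sql,
     N'@s0 int, @s1 int, @s2 int',
     @s0 = 1, @s1 = 3, @s2 = 7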
Yes, sp_executesql will "cache" the execution plan of the query it executes.
Alternatively, instead of passing part of the query to the stored procedure, building the full query there, and executing dynamic SQL, you could build the entire query on the .NET side and execute it using an ADO.NET command object. All queries executed through ADO.NET get "cached" by default.
sp_executesql is better than EXEC because of plan reuse, and you can use parameters, which helps guard against SQL injection. sp_executesql also won't cause procedure cache bloat if used correctly.
Take a look at these two articles:
Avoid Conversions In Execution Plans By Using sp_executesql Instead of Exec
Changing exec to sp_executesql doesn't provide any benefit if you are not using parameters correctly
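The difference in a nutshell (hypothetical Orders table):

-- EXEC: the value is part of the statement text, so every distinct
-- value compiles and caches its own ad-hoc plan
EXEC ('SELECT * FROM Orders WHERE Status = 1')

-- sp_executesql: one parameterized plan, reused for every value
EXEC sp_executesql
    N'SELECT * FROM Orders WHERE Status = @status',
    N'@status int', @status = 1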
You should be using sp_executesql, simply because, as you say, the query plan is stored and future executions will be optimized. It also generally seems to handle dynamic SQL better than EXECUTE.
Modern RDBMSes (I can't really say whether to consider SQL Server 2000 a "modern" one) are optimized for ad-hoc queries, so there's a negligible performance hit (if any). What bothers me is that you're using a sproc to construct dynamic SQL: this is a huge debugging/support PITA.
sp_executesql is the better option. Have you considered not using a stored procedure for this, or at least taking out some of the dynamics? I think it would be much safer from any kind of injection. I write filters much like you're describing, but I try to take care of the input in my code rather than in a stored procedure. I really like dynamic SQL, but maybe it's safer to go the extra mile sometimes.
Profiling LINQ queries and their execution plans is especially important due to the crazy SQL that can sometimes be created.
I often find that I need to track a specific query and have a hard time finding it in the trace. I often do this on a database which has a lot of running transactions (sometimes a production server), so just opening Profiler is no good.
I've also found trying to use the DataContext to trace inadequate, since it doesn't give me SQL I can actually execute myself.
My best strategy so far is to add in a 'random' number to my query, and filter for it in the trace.
LINQ:
where o.CompletedOrderID != "59872547981"
Profiler filter:
'TextData' like '%59872547981'
This works fine, with a couple of caveats:
I have to be careful to remember to remove the criteria, or to pick something that won't affect the query plan too much. Yes, I know leaving it in is asking for trouble.
As far as I can tell, even with this approach I need to start a new trace for every LINQ query I want to track. If I go to 'File > Properties' for an existing trace, I cannot change the filter criteria.
You can't beat running a query in your app and seeing it pop up in the Profiler without any extra effort. I was just hoping someone else had come up with a better way than this, or could at least suggest a less 'dangerous' token to search for than a query on a column.
Messing with the where clause is maybe not the best thing to do, since it can and will affect the execution plans for your queries.
Do something funky with projection into anonymous classes instead - use a unique static column name or something that will not affect the execution plan. (That way you can leave it intact in production code in case you later need to do any profiling of production code...)
from someobject in dc.SomeTable
where someobject.xyz == 123
select new { MyObject = someobject, QueryTraceID1234132412 = "boo" }
You can use the Linq to SQL Debug Visualiser - http://weblogs.asp.net/scottgu/archive/2007/07/31/linq-to-sql-debug-visualizer.aspx and see it in your watch window.
Or you can use DataContext.GetCommand(); to see the SQL before it executes.
You can also look at the DataContext.GetChangeSet() to view what's going to be inserted/ updated or deleted.
EF Core has a TagWith() feature exactly for this purpose.
var nearestFriends =
(from f in context.Friends.TagWith("This is my spatial query!")
orderby f.Location.Distance(myLocation) descending
select f).Take(5).ToList();
https://learn.microsoft.com/en-us/ef/core/querying/tags
Unfortunately you can't use Query Store to find them :-)
This is because comments before the query are stripped out.
Such a shame! Hope I don't have to wait another 12 years.
You can have your datacontext log out the raw SQL, which you could then search for in the profiler to examine performance.
// DebuggerWriter is a custom TextWriter that writes to the debugger output;
// it is not part of the BCL (e.g. Kris Vandermotten's DebuggerWriter utility)
yourDataContext.Log = new DebuggerWriter();
All of your SQL queries will now be displayed in the debugger output window.
A while ago I had a query that I ran quite a lot for one of my users. It was still being evolved and tweaked, but eventually it stabilised and ran quite quickly, so we created a stored procedure from it.
So far, so normal.
The stored procedure, though, was dog slow. There was no material difference between the query and the proc, but the speed difference was massive.
[Background, we're running SQL Server 2005.]
A friendly local DBA (who no longer works here) took one look at the stored procedure and said "parameter spoofing!" (Edit: although it seems that it is possibly also known as 'parameter sniffing', which might explain the paucity of Google hits when I tried to search it out.)
We abstracted some of the stored procedure to a second one, wrapped the call to this new inner proc into the pre-existing outer one, called the outer one and, hey presto, it was as quick as the original query.
So, what gives? Can someone explain parameter spoofing?
Bonus credit for:
highlighting how to avoid it
suggesting how to recognise possible causes
discussing alternative strategies (e.g. statistics, indexes, keys) for mitigating the situation
FYI - you need to be aware of something else when you're working with SQL 2005 and stored procs with parameters.
SQL Server will compile the stored proc's execution plan with the first parameter that's used. So if you run this:
usp_QueryMyDataByState 'Rhode Island'
The execution plan will work best with a small state's data. But if someone turns around and runs:
usp_QueryMyDataByState 'Texas'
The execution plan designed for Rhode-Island-sized data may not be as efficient with Texas-sized data. This can produce surprising results when the server is restarted, because the newly generated execution plan will be targeted at whatever parameter is used first - not necessarily the best one. The plan won't be recompiled until there's a big reason to do it, like if statistics are rebuilt.
This is where query plans come in, and SQL Server 2008 offers a lot of new features that help DBAs pin a particular query plan in place long-term no matter what parameters get called first.
My concern is that when you rebuilt your stored proc, you forced the execution plan to recompile. You called it with your favorite parameter, and then of course it was fast - but the problem may not have been the stored proc. It might have been that the stored proc was recompiled at some point with an unusual set of parameters and thus, an inefficient query plan. You might not have fixed anything, and you might face the same problem the next time the server restarts or the query plan gets recompiled.
Yes, I think you mean parameter sniffing, which is a technique the SQL Server optimizer uses to try to figure out parameter values/ranges so it can choose the best execution plan for your query. In some instances SQL Server does a poor job of parameter sniffing and doesn't pick the best execution plan for the query.
I believe this blog article http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx has a good explanation.
It seems that the DBA in your example chose option #4: moving the query to another sproc, in a separate procedural context.
You could also have used WITH RECOMPILE on the original sproc, or used the OPTIMIZE FOR option on the parameter.
A simple way to speed that up is to reassign the input parameters to local parameters in the very beginning of the sproc, e.g.
CREATE PROCEDURE uspParameterSniffingAvoidance
    @SniffedFormalParameter int
AS
BEGIN
    DECLARE @SniffAvoidingLocalParameter int
    SET @SniffAvoidingLocalParameter = @SniffedFormalParameter

    -- Work with @SniffAvoidingLocalParameter in the sproc body
    -- ...
END
In my experience, the best solution for parameter sniffing is dynamic SQL. Two important things to note: first, you should use parameters in your dynamic query; second, you should use sp_executesql (rather than plain EXEC), which caches the execution plan so it can be reused across parameter values.
Parameter sniffing is a technique SQL Server uses to optimize the query execution plan for a stored procedure. When you first call the stored procedure, SQL Server looks at the parameter values of your call and decides which indexes to use based on those values.
So when the first call contains atypical parameters, SQL Server might select and cache a sub-optimal execution plan with regard to the subsequent calls of the stored procedure.
You can work around this by either:
using WITH RECOMPILE (see the sketch after this list), or
copying the parameter values to local variables inside the stored procedure and using the locals in your queries.
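A sketch of the first workaround, reusing the example procedure name from earlier in the thread (the table name is invented); the second workaround is the uspParameterSniffingAvoidance pattern shown above:

CREATE PROCEDURE usp_QueryMyDataByState (@state varchar(50))
WITH RECOMPILE   -- a fresh plan is compiled on every call, so no sniffed plan is reused
AS
SELECT * FROM dbo.MyData WHERE State = @state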
I even heard that it's better to not use stored procedures at all but to send your queries directly to the server.
I recently came across the same problem, and I have no real solution yet.
For some queries, copying to local vars gets back to the right execution plan; for others, performance degrades with local vars.
I still have to do more research on how SQL Server caches and reuses (sub-optimal) execution plans.
I had a similar problem. My stored procedure's execution took 30-40 seconds. I tried running the SP's statements in a query window and it took a few ms to execute the same.
Then I tried declaring local variables within the stored procedure and transferring the parameter values to the local variables. This made the SP execution very fast: the same SP now executes within a few milliseconds instead of 30-40 seconds.
Very simply put: the query optimizer reuses the old query plan for frequently run queries, but the amount of data keeps increasing, so a newly optimized plan is required while the optimizer is still using the old plan. This is called parameter sniffing.
I have also created a detailed post on this. Please visit this URL:
http://www.dbrnd.com/2015/05/sql-server-parameter-sniffing/
Changing your stored procedure to execute as a batch should increase the speed.
Batch select, i.e.:
exec ('select * from Orders where OrderID = ''' + @ordersID + '''')
Instead of the normal stored procedure select:
select * from Orders where OrderID = @ordersID
Just pass in the parameter as nvarchar and you should get quicker results.