SQL Server ORDER BY Performance Aberration - sql-server

SQL Server 2008 running on Windows Server 2008 Enterprise(?) Edition.
I have a query joining against twenty-some-odd tables (mostly LEFT OUTER JOINs). The full dataset returned by the unfiltered query is less than 1,000 rows and comes back in less than 1s. When I apply a WHERE clause to filter the query, it returns less than 300 rows in less than 1s.
When I apply an ORDER BY clause to the query it returns in 90s.
I examined the results of the query and noticed a number of NULL results returned in the column that is being used to sort. I modified the query to COALESCE the NULL values to a valid search value, without any change to the performance of the query.
I then did a
SELECT * FROM
(
my query goes here
) qry
ORDER BY myOrderByHere
And that produced the same results.
When I SELECT ... INTO #tempTable (without the ORDER BY) and then SELECT FROM the #tempTable with the order by the query returns in less than 1s.
What is really strange at this point is that the SELECT... INTO will also take 90s even without the ORDER BY.
The execution plan says that the SORT is taking 98% of the execution time when it is included in the main query. If I am doing the INSERT INTO, the execution plan says that the actual insert into the temp table takes 99% of the execution time.
And to rule out server issues, I have run the same tests on two different instances of SQL Server 2008 with nearly identical results.
Many thanks!
rjsjr

Sounds like something strange is going on with your tempdb. Inserting 1000 rows in a temporary table should be fast, whether it's an implicit spool for sorting, or an explicit select into.
Check the size of your tempdb, the health of the hard disk it's on, and its recovery model (it should be simple, not full or bulk-logged).
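If it helps, here is a minimal sketch for checking both at once (it only uses the standard system catalog views, nothing specific to your setup):

USE tempdb;

-- File sizes and free space, in MB
SELECT name,
       size / 128.0 AS size_mb,
       size / 128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int) / 128.0 AS free_mb
FROM sys.database_files;

-- Recovery model (should normally be SIMPLE for tempdb)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'tempdb';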

A sort operation is usually an expensive step in the query. So, it's not surprising that the addition of the sort adds time. You may be seeing similar results when you incorporate a temp table in your steps. The sort operation in your original query may use tempdb to help do the sort, and that can be the time-consuming step in each query you compare.
If you want to learn more about each query you're running, you can review query plan outputs.
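As a generic sketch (not specific to the query above), you can have SQL Server report timing, I/O, and the actual plan while the statement runs:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SET STATISTICS XML ON;   -- returns the actual execution plan as XML

-- run the query under investigation here

SET STATISTICS XML OFF;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;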

Related

Extremely High Estimated Number of Rows in Execution Plan

I have a stored procedure running 10 times slower in production than in staging. I took a look at the execution plan, and the first thing I noticed was that the cost of the Table Insert (into a table variable @temp) was 100% in production and 2% in staging.
The estimated number of rows in production showed almost 200 million rows, but in staging it was only about 33.
The production DB is running on SQL Server 2008 R2 while staging is on SQL Server 2012, but I don't think this difference could cause such a problem.
What could be the cause of such a huge difference?
UPDATED
Added the execution plan. As you can see, the large number of estimated rows shows up in the Nested Loops (Inner Join), but all it does is a clustered index seek into another table.
UPDATED2
Link for the plan XML included
plan.xml
And SQL Sentry Plan Explorer view (with estimated counts shown)
This looks like a bug to me.
There are an estimated 90,991.1 rows going into the nested loops.
The table cardinality of the table being seeked on is 24,826.
If there are no statistics for a column and the equality operator is used, SQL Server can't know the density of the column, so it uses a fixed value of 10 percent.
90,991.1 * 24,826 * 10% = 225,894,504.86, which is pretty close to your estimated row count of 225,894,000.
But the execution plan shows that only 1 row is estimated per seek. Not the 24,826 from above.
So these figures don't add up. I would assume that it starts off from the original 10% ballpark estimate and then later adjusts it to 1 because of the presence of a unique constraint, without making a compensating adjustment to the other branches.
I see that the seek is calling a scalar UDF [dbo].[TryConvertGuid]. I was able to reproduce similar behavior on SQL Server 2005, where seeking on a unique index on the inside of a nested loops join, with the predicate being a UDF, produced a result where the number of rows estimated out of the join was much larger than would be expected from multiplying estimated seeked rows * estimated number of executions.
But, in your case, the operators to the left of the problematic part of the plan are pretty simple and not sensitive to the number of rows (neither the rowcount top operator nor the insert operator will change), so I don't think this quirk is responsible for the performance issues you noticed.
Regarding the point in the comments to another answer that switching to a temp table helped the performance of the insert: this may be because it allows the read part of the plan to operate in parallel (inserting into a table variable would block this).
Run EXEC sp_updatestats; on the production database. This updates statistics on all tables. It might produce more sane execution plans if your statistics are screwed up.
Please don't run EXEC sp_updatestats; on a large system it could take hours, or days, to complete. What you may want to do instead is look at the query plan that is being used in production. Try to see if there is an index that could be used but is not being used. Try rebuilding that index (as a side effect, it rebuilds the statistics on the index). After rebuilding, look at the query plan and note whether it is using the index. Perhaps you may need to add an index to the table. Does the table have a clustered index?
As a general rule, since 2005, SQL Server manages statistics on its own rather well. The only time you need to explicitly update statistics is if you know that the query would execute a lot faster if SQL Server used an index, but it's not doing so. You may want to run (on a nightly or weekly basis) scripts that automatically test every table and every index to see if the index needs to be reorganized or rebuilt (depending on how fragmented it is). These kinds of scripts, on a large active OLTP system, may take a long time to run, and you should consider carefully when you have a window to run them. There are quite a few versions of this script floating around, but I have used this one often:
https://msdn.microsoft.com/en-us/library/ms189858.aspx
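As a rough sketch of what such a script does (the table name below is a placeholder), you can check fragmentation yourself and rebuild where needed:

-- Fragmentation for every index in the current database
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Rebuild (or REORGANIZE for lighter fragmentation); a rebuild also refreshes
-- the statistics on the rebuilt indexes
ALTER INDEX ALL ON dbo.SomeTable REBUILD;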
Sorry this is probably too late to help you.
Table Variables are impossible for SQL Server to predict. They always estimate one row and exactly one row coming back.
To get accurate estimates so that a better plan can be created, you need to switch your table variable to a temp table or a CTE.
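A minimal sketch of the difference (made-up names): the table variable is always estimated at one row, while the temp table gets real statistics.

-- Table variable: the optimizer assumes ~1 row regardless of contents
DECLARE @ids TABLE (id int PRIMARY KEY);
INSERT INTO @ids (id) SELECT object_id FROM sys.objects;

-- Temp table: statistics are created, so row estimates reflect reality
CREATE TABLE #ids (id int PRIMARY KEY);
INSERT INTO #ids (id) SELECT object_id FROM sys.objects;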

SQL query gets too slow

A few days ago I wrote a query and it executed quickly, but now it takes 1 hour.
This query runs on my SQL 7 server and takes about 10 seconds.
The same query exists on another SQL 7 server, and until last week it also took about 10 seconds.
The configuration of both servers is the same; only the hardware is different.
Now, on the second server, this query takes about 30 minutes to extract the same details, but nobody has changed anything.
If I execute this query without the WHERE clause, it shows me the details in 7 seconds.
So is the WHERE clause the problem?
Without seeing the query and probably the data, I can't do a lot other than offer tips.
Can you put more constraints on the query? If you can reduce the amount of data involved, that will speed up the query.
Look at the columns used in your joins, WHERE and HAVING clauses, and ORDER BY. Check that the tables those columns belong to have indexes on those columns.
Do you need to use the user-defined function, or can it be done another way?
Are you using subqueries? If so, can these be pulled out into separate views?
Hope this helps.
Without knowing how much data is going into your tables, and not knowing your schema, it's hard to give a definitive answer but things to look at:
Try running UPDATE STATISTICS or DBCC DBREINDEX.
Do you have any indexes on the tables? If not, try adding indexes to columns used in WHERE clauses and JOIN predicates.
Avoid cross-table OR clauses (i.e., where you do WHERE table1.col1 = @somevalue OR table2.col2 = @someothervalue). SQL Server can't use indexes effectively with this construct, and you may get better performance by splitting the query into two and UNION'ing the results, as in the sketch below.
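A hypothetical illustration (table, column, and variable names are invented for the example):

DECLARE @somevalue int = 1, @someothervalue int = 2;

-- Instead of a cross-table OR...
SELECT t1.col1, t2.col2
FROM table1 t1
INNER JOIN table2 t2 ON t2.t1_id = t1.id
WHERE t1.col1 = @somevalue OR t2.col2 = @someothervalue;

-- ...split it into two index-friendly queries and UNION the results
SELECT t1.col1, t2.col2
FROM table1 t1
INNER JOIN table2 t2 ON t2.t1_id = t1.id
WHERE t1.col1 = @somevalue
UNION
SELECT t1.col1, t2.col2
FROM table1 t1
INNER JOIN table2 t2 ON t2.t1_id = t1.id
WHERE t2.col2 = @someothervalue;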
What do your functions (UDFs) do and how are you using them? It's worth noting that dropping them in the columns part of a query gets expensive as the function is executed per row returned: thus if a function does a select against the database, then you end up running n + 1 queries against the database (where n = number of rows returned in the main select). Try and engineer the function out if possible.
Make sure your JOINs are correct -- where you're using a LEFT JOIN, revisit the logic and see if it needs to be a LEFT JOIN or whether it can be turned into an INNER JOIN. Sometimes people use LEFT JOINs, but when you examine the logic in the rest of the query, it can become apparent that the LEFT JOIN gives you nothing (because, for example, someone may have added a WHERE col IS NOT NULL predicate against the joined table). INNER JOINs can be faster, so it's worth reviewing all of these; a small illustration follows.
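For instance (hypothetical tables), a WHERE predicate against the outer-joined table quietly turns a LEFT JOIN into an INNER JOIN:

-- The IS NOT NULL filter on c throws away exactly the rows the LEFT JOIN preserved...
SELECT o.id, c.name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id
WHERE c.id IS NOT NULL;

-- ...so it is equivalent to the plain INNER JOIN, which can be cheaper
SELECT o.id, c.name
FROM orders o
INNER JOIN customers c ON c.id = o.customer_id;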
It would be a lot easier to suggest things if we could see the query.

SQL Server query plan differences

I'm having trouble understanding the behavior of the estimated query plans for my statement in SQL Server when changing from a parameterized query to a non-parameterized query.
I have the following query:
DECLARE @p0 UniqueIdentifier = '1fc66e37-6eaf-4032-b374-e7b60fbd25ea'
SELECT [t5].[value2] AS [Date], [t5].[value] AS [New]
FROM (
    SELECT COUNT(*) AS [value], [t4].[value] AS [value2]
    FROM (
        SELECT CONVERT(DATE, [t3].[ServerTime]) AS [value]
        FROM (
            SELECT [t0].[CookieID]
            FROM [dbo].[Usage] AS [t0]
            WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
            GROUP BY [t0].[CookieID]
        ) AS [t1]
        OUTER APPLY (
            SELECT TOP (1) [t2].[ServerTime]
            FROM [dbo].[Usage] AS [t2]
            WHERE ((([t1].[CookieID] IS NULL) AND ([t2].[CookieID] IS NULL))
                OR (([t1].[CookieID] IS NOT NULL) AND ([t2].[CookieID] IS NOT NULL)
                AND ([t1].[CookieID] = [t2].[CookieID])))
                AND ([t2].[CookieID] IS NOT NULL)
                AND ([t2].[ProductID] = @p0)
            ORDER BY [t2].[ServerTime]
        ) AS [t3]
    ) AS [t4]
    GROUP BY [t4].[value]
) AS [t5]
ORDER BY [t5].[value2]
This query is generated by a Linq2SQL expression and extracted from LINQPad. This produces a nice query plan (as far as I can tell) and executes in about 10 seconds on the database. However, if I replace the two uses of the parameter with the exact value, that is, replace the two "= @p0" parts with "= '1fc66e37-6eaf-4032-b374-e7b60fbd25ea'", I get a different estimated query plan and the query now runs much longer (more than 60 seconds; I haven't seen it through).
Why is it that performing the seemingly innocent replacement produces a much less efficient query plan and execution? I have cleared the procedure cache with 'DBCC FreeProcCache' to ensure that I was not caching a bad plan, but the behavior remains.
My real problem is that I can live with the 10 second execution time (at least for a good while) but I can't live with the 60+ second execution time. My query will (as hinted above) be produced by Linq2SQL, so it is executed on the database as
exec sp_executesql N'
...
WHERE ([t0].[CookieID] IS NOT NULL) AND ([t0].[ProductID] = @p0)
...
AND ([t2].[ProductID] = @p0)
...
',N'@p0 uniqueidentifier',@p0='1FC66E37-6EAF-4032-B374-E7B60FBD25EA'
which produces the same poor execution time (which I think is doubly strange, since this seems to be using a parameterized query).
I'm not looking for advice on which indexes to create or the like; I'm just trying to understand why the query plan and execution are so dissimilar for three seemingly similar queries.
EDIT: I have uploaded execution plans for the non-parameterized and the parameterized query as well as an execution plan for a parameterized query (as suggested by Heinz) with a different GUID here
Hope it helps you help me :)
If you provide an explicit value, SQL Server can use statistics of this field to make a "better" query plan decision. Unfortunately (as I've experienced myself recently), if the information contained in the statistics is misleading, sometimes SQL Server just makes the wrong choices.
If you want to dig deeper into this issue, I recommend you check what happens if you use other GUIDs: if it uses a different query plan for different concrete GUIDs, that's an indication that statistics data is being used. In that case, you might want to look at sp_updatestats and related commands.
EDIT: Have a look at DBCC SHOW_STATISTICS: The "slow" and the "fast" GUID are probably in different buckets in the histogram. I've had a similar problem, which I solved by adding an INDEX table hint to the SQL, which "guides" SQL Server towards finding the "right" query plan. Basically, I've looked at what indices are used during a "fast" query and hard-coded those into the SQL. This is far from an optimal or elegant solution, but I haven't found a better one yet...
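For instance (the index name below is a placeholder; the table and columns come from the query in the question, and @p0 is declared as there):

-- Inspect the histogram behind the index used to seek on ProductID
DBCC SHOW_STATISTICS ('dbo.Usage', IX_Usage_ProductID);

-- If the optimizer picks the wrong index for a particular GUID, a table hint
-- forces the one seen in the "fast" plan
SELECT [t0].[CookieID]
FROM [dbo].[Usage] AS [t0] WITH (INDEX (IX_Usage_ProductID))
WHERE [t0].[ProductID] = @p0
GROUP BY [t0].[CookieID];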
I'm not looking for advice on which indexes to create or the like; I'm just trying to understand why the query plan and execution are so dissimilar for three seemingly similar queries.
You seem to have two indexes:
IX_NonCluster_Config (ProductID, ServerTime)
IX_NonCluster_ProductID_CookieID_With_ServerTime (ProductID, CookieID) INCLUDE (ServerTime)
The first index does not cover CookieID but is ordered on ServerTime, and hence is more efficient for the less selective ProductIDs (i.e. those of which you have many).
The second index does cover all columns but is not ordered, and hence is more efficient for the more selective ProductIDs (those of which you have few).
On average, your ProductID cardinality is such that SQL Server expects the second method to be efficient, which is what it uses when you use parameterized queries or explicitly provide selective GUIDs.
However, your original GUID is considered less selective, which is why the first method is used.
Unfortunately, the first method requires additional filtering on CookieID, which is why it's less efficient in fact.
My guess is that when you take the non-parameterized route, your GUID has to be converted from a varchar to a uniqueidentifier, which may cause an index not to be used, while it will be used when taking the parameterized route.
I've seen this happen with using queries that have a smalldatetime in the where clause against a column that uses a datetime.
It's difficult to tell without looking at the execution plans, however if I was going to guess at a reason I'd say that it's a combination of parameter sniffing and poor statistics - in the case where you hard-code the GUID into the query, the query optimiser attempts to optimise the query for that value of the parameter. I believe that the same thing happens with the parameterised / prepared query (this is called parameter sniffing - the execution plan is optimised for the parameters used the first time that the prepared statement is executed), however this definitely doesn't happen when you declare the parameter and use it in the query.
Like I said, SQL Server attempts to optimise the execution plan for that value, and so usually you should see better results. It seems here that the information it is basing its decisions on is incorrect / misleading, and you are better off (for some reason) when it optimises the query for a generic parameter value.
This is mostly guesswork however - it's impossible to tell really without the execution plans; if you can upload the execution plans somewhere then I'm sure someone will be able to help you with the real reason.
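One way to test the parameter-sniffing theory (my suggestion, not something from the question; OPTIMIZE FOR UNKNOWN needs SQL Server 2008 or later) is to pin the value the plan is compiled for and compare the resulting plans:

-- @p0 declared as in the question
-- Compile the plan as if this specific GUID had been sniffed
SELECT COUNT(*)
FROM [dbo].[Usage] AS [t0]
WHERE [t0].[ProductID] = @p0
OPTION (OPTIMIZE FOR (@p0 = '1FC66E37-6EAF-4032-B374-E7B60FBD25EA'));

-- Force the generic, density-based estimate instead
SELECT COUNT(*)
FROM [dbo].[Usage] AS [t0]
WHERE [t0].[ProductID] = @p0
OPTION (OPTIMIZE FOR UNKNOWN);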

SQL Server query taking up 100% CPU and runs for hours

I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
    @id,
    alpha.A, alpha.B, alpha.C,
    beta.X, beta.Y, beta.Z,
    alpha.P, alpha.Q
FROM
    [DifferentDatabase].dbo.fnGetStuff(@id) beta
    INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT and beta.id is an INT. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using a WHERE id = @id. The function returns approximately 11,000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590,000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in Activity Monitor.
Look at the query plan. My guess is that there is a table scan (or more than one) in the execution plan. This will cause huge amounts of I/O for the few records you get in the result.
You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can for instance also give you hints on how to improve your query by suggesting indexes.
If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Engine Tuning Advisor as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Make sure there aren't too many queries - it takes a long time to build advice.
I'm not an expert on it, but have had some luck with it in the past.
Do you need to use a function? Can you re-write the entire thing into a stored procedure in which you pass in the @ID as a parameter?
Even if your table has indexes, passing the @ID as a variable into the WHERE clause can keep them from being used, potentially greatly increasing the amount of time for the query to run.
The reason the indexes may not be used is that the query optimizer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch, only one pass is made over the Transact-SQL code, preventing the query optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot re-write the SQL.
It might also be possible, since this just started happening, that the indexes have become fragmented and might need to be rebuilt.
I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested. Put the results in a temp table and join on that.
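A sketch of that workaround using the objects from the question (the @id declaration and the temp table alias are assumed):

DECLARE @id int = 1;  -- assumed type and value, for illustration only

-- Materialize the function's output first...
SELECT beta.*
INTO #stuff
FROM [DifferentDatabase].dbo.fnGetStuff(@id) AS beta;

-- ...then join against the temp table instead of the function
SELECT @id,
       alpha.A, alpha.B, alpha.C,
       beta.X, beta.Y, beta.Z,
       alpha.P, alpha.Q
FROM #stuff AS beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id;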
Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
    @id,
    alpha.A, alpha.B, alpha.C,
    beta.X, beta.Y, beta.Z,
    alpha.P, alpha.Q
FROM
    [DifferentDatabase].dbo.fnGetStuff(@id) beta
    INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id
-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be a source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]? I think you'll find a bit more scalability in joining to a more flattened table with indexes than in a costly function.
Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan; look for a "SCAN" on an index or a table. That is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc.).

SQL Server query execution plan shows wrong "actual row count" on a used index and performance is terribly slow

Today I stumbled upon an interesting performance problem with a stored procedure running on SQL Server 2005 SP2 in a DB running at compatibility level 80 (SQL 2000).
The proc runs about 8 minutes and the execution plan shows the usage of an index with an actual row count of 1,339,241,423, which is about a factor of 1,000 higher than the "real" actual row count of the table itself, which is 1,144,640, as shown correctly by the estimated row count. So the actual row count given by the query plan is definitely wrong!
Interestingly enough, when I copy the proc's parameter values to local variables inside the proc and then use the local variables in the actual query, everything works fine - the proc runs in 18 seconds and the execution plan shows the right actual row count.
EDIT: As suggested by TrickyNixon, this seems to be a sign of the parameter sniffing problem. But actually, I get exactly the same execution plan in both cases. The same indexes are being used in the same order. The only difference I see is the way-too-high actual row count on the PK_ED_Transitions index when directly using the parameter values.
I have already done DBCC DBREINDEX and UPDATE STATISTICS without any success.
DBCC SHOW_STATISTICS shows good data for the index, too.
The proc is created WITH RECOMPILE so every time it runs a new execution plan is getting compiled.
To be more specific - this one runs fast:
CREATE Proc [dbo].[myProc](
    @Param datetime
)
WITH RECOMPILE
as
set nocount on
declare @local datetime
set @local = @Param
select
    some columns
from
    table1
where
    column = @local
group by
    some other columns
And this version runs terribly slow, but produces exactly the same execution plan (apart from the too-high actual row count on a used index):
CREATE Proc [dbo].[myProc](
    @Param datetime
)
WITH RECOMPILE
as
set nocount on
select
    some columns
from
    table1
where
    column = @Param
group by
    some other columns
Any ideas?
Anybody out there who knows where Sql Server gets the actual row count value from when calculating query plans?
Update: I tried the query on another server with compat mode set to 90 (SQL 2005). It's the same behavior. I think I will open an MS support call, because this looks to me like a bug.
OK, finally I got to it myself.
The two query plans are different in a small detail which I missed at first: the slow one uses a nested loops operator to join two subqueries together. And that results in the high number for actual row count on the index scan operator, which is simply the result of multiplying the number of rows of input A by the number of rows of input B.
I still don't know why the optimizer decides to use the nested loops instead of a hash match, which runs 1,000 times faster in this case, but I could handle my problem by creating a new index, so that the engine does an index seek instead of an index scan under the nested loops.
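As a minimal sketch of that kind of fix against the simplified proc above (the index name is invented and [column] stands in for the real column):

-- Give the equality predicate something it can seek on
CREATE NONCLUSTERED INDEX IX_table1_column
ON dbo.table1 ([column]);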
When you're checking execution plans of the stored proc against the copy/paste query, are you using the estimated plans or the actual plans? Make sure to click Query, Include Execution Plan, and then run each query. Compare those plans and see what the differences are.
It sounds like a case of Parameter Sniffing. Here's an excellent explanation along with possible solutions: I Smell a Parameter!
Here's another StackOverflow thread that addresses it: Parameter Sniffing (or Spoofing) in SQL Server
To me it still sounds as if the statistics were incorrect. Rebuilding the indexes does not necessarily update them.
Have you already tried an explicit UPDATE STATISTICS for the affected tables?
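For example, against the placeholder table from the simplified proc above:

-- FULLSCAN rebuilds the statistics from every row instead of a sample
UPDATE STATISTICS dbo.table1 WITH FULLSCAN;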
Have you run sp_spaceused to check whether SQL Server has the right summary for that table? I believe in SQL 2000 the engine used that sort of metadata when building execution plans. We used to have to run DBCC UPDATEUSAGE weekly to update the metadata on some of the rapidly changing tables, as SQL Server was choosing the wrong indexes due to the incorrect row count data.
You're running SQL 2005, and BOL says that in 2005 you shouldn't have to run UpdateUsage anymore, but since you're in 2000 compat mode you might find that it is still required.
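A short sketch of both checks (the table name is a placeholder):

-- Report the stored row and page counts for the table
EXEC sp_spaceused 'dbo.table1';

-- Correct those counts if they have drifted
DBCC UPDATEUSAGE (0, 'dbo.table1') WITH COUNT_ROWS;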
