order by slows query down massively - sql-server

Using SQL Server 2014 (SP1-CU3) (KB3094221), Oct 10 2015, x64.
I have the following query
SELECT * FROM dbo.table1 t1
LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
order by t1.tradeNo
There are ~70k, 35k and 73k rows in t1, t2 and t3 respectively.
When I omit the order by this query executes in 3 seconds with 73k rows.
As written, the query ran for 8.5 minutes and had returned only ~50k rows before I stopped it.
Switching the order of the LEFT JOINs makes a difference:
SELECT * FROM dbo.table1 t1
LEFT JOIN dbo.table3 t3 ON t3.TradeReportID = t1.tradeNo
LEFT JOIN dbo.table2 t2 ON t2.trade_id = t1.tradeNo
order by t1.tradeNo
This now runs in 3 seconds.
I don't have any indexes on the tables. Adding indexes on t1.tradeNo, t2.trade_id and t3.TradeReportID has no effect.
Running the query with only one left join (both scenarios) in combination with the order by is fast.
It's fine for me to swap the order of the LEFT JOINs, but that doesn't go far towards explaining why this happens and under what scenarios it may happen again.
The estimated execution plan for the slow query is:
[plan screenshot, with exclamation-mark warning details]
vs. the plan after switching the order of the left joins (fast):
[plan screenshot]
The two plans are markedly different, but I cannot interpret them to explain the performance difference.
UPDATE
It appears that adding the order by clause results in an execution plan that uses a Table Spool (lazy spool), whereas the fast query's plan does not.
Turning off the table spool via DBCC RULEOFF ('BuildSpool') 'fixes' the speed, but according to this post that isn't recommended, and it can't be done per query anyway.
UPDATE 2
One of the columns returned (table3.Text) has type varchar(max). If this is changed to nvarchar(512) then the original (slow) query becomes fast, i.e. the execution plan no longer uses the Table Spool. Note that even though the type is varchar(max), the field value is NULL for every one of the rows. This is now fixable, but I am none the wiser.
UPDATE 3
Warnings in the execution plan stated
Type conversion in expression (CONVERT_IMPLICIT(nvarchar(50),[t2].[trade_id],0)) may affect "CardinalityEstimate" in query plan choice, ...
t1.tradeNo is nvarchar(21); the other two are varchar(50). After altering the latter two to match the first, the problem disappears! (The varchar(max) column from UPDATE 2 was left unchanged.)
Given that the problem goes away when either UPDATE 2 or UPDATE 3 is rectified, I would guess it's a combination of the query optimizer using a temp table (table spool) for a column that has an unbounded size. Very interesting, especially since the varchar(max) column contains no data.

Your likely best fix is to make sure both sides of your joins have the same datatype. There's no need for one to be varchar and the other nvarchar.
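For example, something along these lines; this is only a sketch based on the column types mentioned in the question (nvarchar(21) on t1, varchar(50) on the other two), so check for longer values and preserve the original nullability before altering:
-- Align both join columns with t1.tradeNo, which is nvarchar(21)
ALTER TABLE dbo.table2 ALTER COLUMN trade_id nvarchar(21);
ALTER TABLE dbo.table3 ALTER COLUMN TradeReportID nvarchar(21);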
This is a class of problems that comes up quite frequently in DBs: the database is assuming the wrong thing about the composition of the data it's about to deal with. The costs shown in your estimated execution plan are likely a long way from what you'd get in your actual plan. We all make mistakes, and it would be good if SQL Server learned from its own, but currently it doesn't: it will estimate a 2-second return time despite being immediately proven wrong again and again. To be fair, I don't know of any DBMS that has machine learning to do better.
Where your query is fast
The hardest part is done up front by sorting table3. That means it can do an efficient merge join which in turn means it has no reason to be lazy about spooling.
Where it's slow
Storing an ID that refers to the same thing with two different types and data lengths shouldn't ever be necessary and will never be a good idea for an ID. In this case it's nvarchar in one place and varchar in another. When that mismatch causes the optimizer to fail to get a cardinality estimate, that's the key flaw, and here's why:
The optimizer is hoping to require only a few unique rows from table3, just a handful of options like "Male", "Female", "Other". That would be what is known as "low cardinality". So imagine tradeNo actually contained IDs for genders, for some weird reason. Remember, it's you, with your human skills of contextualisation, who knows that's very unlikely; the DB is blind to that. So here is what it expects to happen: as it executes the query, the first time it encounters the ID for "Male" it will lazily fetch the associated data (like the word "Male") and put it in the spool. Next, because the data is sorted, it expects simply more males and plans to re-use what it has already put in the spool.
Basically, it plans to fetch the data from tables 1 and 2 in a few big chunks stopping once or twice to fetch new details from table 3. In practice the stopping isn't occasional. In fact, it may even be stopping for every single row because there are lots of different IDs here. The lazy spool is like going upstairs to get one small thing at a time. Good if you think you just need your wallet. Not so good if you're moving house, in which case you'll want a big box (the eager spool).
The likely reason that shrinking the size of the field in table3 helped is that it meant it estimated less of a comparative benefit in doing the lazy spool over a full sort up front. With varchar it doesn't know how much data there is, just how much there could potentially be. The bigger the potential chunks of data that need shuffling, the more physical work needs doing.
What you can do to avoid this in future
Make your table schema and indexes reflect the real shape of the data.
If an ID can be varchar in one table then it's very unlikely to need the extra characters available in nvarchar for another table. Avoid the need for conversions on IDs and also use integers instead of characters where possible.
Ask yourself if any of these tables need tradeNo to be filled in for all rows. If so, make it not nullable on that table. Next, ask if the ID should be unique for any of these tables and set it up as such in the appropriate index. Unique is the definition of maximum cardinality, so it won't make that mistake again.
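A minimal sketch, assuming tradeNo really should be non-nullable and unique in table1 (adjust the type and table to your actual schema):
-- Enforce the real shape of the data so the optimizer knows the cardinality
ALTER TABLE dbo.table1 ALTER COLUMN tradeNo nvarchar(21) NOT NULL;
CREATE UNIQUE INDEX IX_table1_tradeNo ON dbo.table1 (tradeNo);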
Nudge in the right direction with join order.
The order you have your joins in the SQL is a signal to the database about how powerful/difficult you expect each join to be. (Sometimes, as a human, you know more: e.g. if querying for 50-year-old astronauts you know that filtering for astronauts should be the first filter to apply, but you might begin with the age when searching for 50-year-old office workers.) The heavy stuff should come first. The optimizer will ignore you if it thinks it has the information to know better, but in this case it's relying on your knowledge.
If all else fails
A possible fix would be to INCLUDE all the fields you'll need from table3 in the index on TradeReportId. The reason the indexes couldn't help so much already is that they make it easy to identify how to re-sort but it still hasn't been physically done. That is work it was hoping to optimize with a lazy spool but if the data were included it would be already available so no work to optimize.
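A sketch of what that could look like; the INCLUDE list is an assumption, so add whichever table3 columns the query actually returns:
CREATE NONCLUSTERED INDEX IX_table3_TradeReportID
    ON dbo.table3 (TradeReportID)
    INCLUDE ([Text]);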

Having indexes on a table is key to speeding up retrieval of data. Start with this and then retry your query to see if the speed improves when using ORDER BY.

Related

Interpreting SQL Server table statistics

I write queries and procedures, I have no experience as a DB Admin and I am not in a position to be such. I work with hundreds of tables and certain older tables are difficult to work with. I suspect that statistics are a problem but the DBA states that isn't the case.
I don't know how to interpret statistics or even which ones I should look at. As an example, I am currently JOINing 2 tables; it is a simple JOIN that uses an index.
It returns just under 500 rows in 4 columns. It runs very quickly in isolation, but not in production with thousands of runs a day. My estimated and actual rows on this JOIN are off by 462%.
I have distilled this stored procedure down to a lot of very basic temp tables to locate the problem areas and it appears to be 2 tables, this example is one of them.
What I want is to know is which commands to run and what statistics to look at to take to the DBA to discuss the specific problem at hand. I do not want to be confrontational but informational. I have a very good professional relationship with this DBA but he is very black and white with his policies so I may not get anywhere with it in the end, but then I can also take that to my lead if I get stonewalled.
I ran a DBCC SHOW_STATISTICS on the table's index. I am not sure if this is the data I need or what I am really looking at. I would really like to know where to start with this. I have googled but all the pages I read are very geared towards DBAs and assume prior knowledge in areas I don't have.
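For reference, this is the sort of command I ran (the index name here is a placeholder rather than the real one):
DBCC SHOW_STATISTICS ('Billing.dbo.Charges', 'IX_Charges_ForeignID_ChargeCode');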
Below is a sample of my JOIN, obfuscated. My JOIN is on a temp table; the first 2 conditions are needed for the index, and removing the date conditions actually makes the JOIN much worse, with 10x the reads:
SELECT
x.UniqueID,
x.ChargeCode,
x.dtDate,
x.uniqueForeignID
INTO
#AnotherTempTable
FROM
Billing.dbo.Charges x
JOIN
#temptable y ON x.uniqueForeignID = y.uniqueID
AND x.ChargeCode = y.ChargeCode
AND @PostMonthStart <= x.dtDate
AND x.dtDate < @PostMonthEnd
The JOIN above is part of a new plan where I have been dissecting all the data needed into temp tables to determine the root cause of the problem in high CPU and reads in production. Below is a list of all the statements that are executing, sorted by the number of reads. The second row is this example query but there are others with similar issues.
Below is the execution plan operations for the plan prior to my updates.
While the new plan has better run time and closer estimates, I worry that I am still going to run into issues if the statistics are off. If I am completely off-base, please tell me and point me in the right direction, I will gladly bark up a different tree if I am making incorrect assumptions.
The first table returned shows some general information. You can see the statistics on this index were last updated 12/25/2019 at 10:19 PM. As of the writing of this answer, that is yesterday evening, so stats were updated recently. That is likely to be some kind of evening maintenance, but it could also be a threshold of data modifications that triggered an automatic statistics update.
There were 222,596,063 rows in the table at the time the statistics were sampled. The statistics update sampled 626,452 of these rows, so the sample rate is 0.2%. This sample size was likely the default sample rate used by a simple update statistics MyTable command.
A sample rate of 0.2% is fast to calculate but can lead to very bad estimates, especially if an index is being used to support a foreign key. For example, a parent/child relationship may have a ParentKey column on the child table. A low statistics sample rate will result in very high estimates per parent row, which can lead to strange decisions in query plans.
Look at the third table (the histogram). The RANGE_HI_KEY corresponds to a specific key value of the first column in this index. The EQ_ROWS column is the histogram's estimate of the number of rows that correspond to this key. If you get a count of the rows in this table by one of these keys in the RANGE_HI_KEY column, does the number in the EQ_ROWS column look like an accurate estimate? If not, a higher sample rate may yield better query plans.
For example, take the value 1475616. Is the count of rows for this key close to the EQ_ROWS value of 3893?
select count(*) from MyTable where FirstIndexColumn = 1475616
If the estimate is very bad, the DBA may need to increase the sample size on this table:
update statistics MyTable with sample 5 percent
If the DBA uses Ola Hallengren's maintenance solution (an excellent choice, in my opinion), this can be done by passing the @StatisticsSample parameter to the IndexOptimize procedure.
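A hedged sketch of what that call might look like, assuming the stock IndexOptimize interface (the database name is a placeholder):
EXECUTE dbo.IndexOptimize
    @Databases = 'Billing',
    @UpdateStatistics = 'ALL',
    @StatisticsSample = 5;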

Is there a way I can tell what OPTION (FAST 1) does at the end of my query?

I have a number of horribly large queries which work OK on small databases, but as the volume of data grows their performance degrades. They are badly designed, really, and we must address that. These queries have a very large number of LEFT OUTER JOINs, and I note that once the number of LEFT OUTER JOINs goes past 10, performance gets dramatically slower with each new join added. If I put OPTION (FAST 1) at the end of my query, the results appear almost immediately. Of course I do not want to rely on this: firstly, it is not going to help all of the time (if it did, every query would have it), and secondly I want to know how to optimise these joins properly. When I run the query without the option set, the execution plan shows a number of nested loops on my LEFT OUTER JOINs with a high percentage cost, but with the option on it does not. How can I find out what the hint does to speed the query up, so I can reflect it in the query itself?
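For reference, I am just appending the hint to the statement, something like this (table names invented for illustration):
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders o
LEFT OUTER JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
OPTION (FAST 1);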
I cannot get the query or the execution plans today, as the server I am on does not let me copy data from it. If they are needed I can arrange to get them sent, but that will take some time (tomorrow morning).
I would be really interested in your comments.
Kind regards,
Derek.
You can set a column as the primary key and it will automatically become the clustered index (by default).
Clustered index: benefits and drawbacks
Benefit: a performance boost if implemented correctly.
Drawback: requires an understanding of clustered/non-clustered indexes and their storage implications.
Note: varchar foreign keys can lead to poor performance as well. Change the base table to have an integer primary key instead.
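A minimal sketch of that idea (hypothetical table): an integer identity column declared as the primary key becomes the clustered index by default, and the varchar value stays as ordinary data rather than the key:
CREATE TABLE dbo.Customer
(
    CustomerID int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    CustomerCode varchar(20) NOT NULL
);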
I would also suggest using database paging (e.g. via the ROW_NUMBER function) to partition your result set and query only the data you want to show (e.g. 20 rows per page in a GridView).
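A minimal paging sketch using ROW_NUMBER (table and column names are made up; on SQL Server 2012 and later, OFFSET ... FETCH is a simpler alternative):
;WITH Paged AS
(
    SELECT o.OrderID, o.OrderDate,
           ROW_NUMBER() OVER (ORDER BY o.OrderDate, o.OrderID) AS rn
    FROM dbo.Orders o
)
SELECT OrderID, OrderDate
FROM Paged
WHERE rn BETWEEN 21 AND 40; -- page 2 at 20 rows per page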

Can joining with an iTVF be as fast as joining with a temp table?

Scenario
Quick background on this one: I am attempting to optimize the use of an inline table-valued function uf_GetVisibleCustomers(@cUserId). The iTVF wraps a view CustomerView and filters out all rows containing data for customers whom the requesting user is not permitted to see. This way, should selection criteria ever change in the future for certain user types, we won't have to implement that new condition a hundred times (hyperbole) all over the SQL codebase.
Performance is not great, however, so I want to fix that before encouraging use of the iTVF. Changed database object names here just so it's easier to demonstrate (hopefully).
Queries
In attempting to optimize our iTVF uf_GetVisibleCustomers, I've noticed that the following SQL …
CREATE TABLE #tC ( idCustomer INT )
INSERT #tC
SELECT idCustomer
FROM [dbo].[uf_GetVisibleCustomers]('requester')
SELECT T.fAmount
FROM [Transactions] T
JOIN #tC C ON C.idCustomer = T.idCustomer
… is orders of magnitude faster than my original (IMO more readable, likely to be used) SQL here…
SELECT T.fAmount
FROM [Transactions] T
JOIN [dbo].[uf_GetVisibleCustomers]('requester') C ON C.idCustomer = T.idCustomer
I don't get why this is. The former (top block of SQL) returns ~700k rows in 17 seconds on a fairly modest development server. The latter (second block of SQL) returns the same number of rows in about ten minutes when there is no other user activity on the server. Maybe worth noting that there is a WHERE clause, however I have omitted it here for simplicity; it is the same for both queries.
Execution Plan
Below is the execution plan for the first query. It enjoys automatic parallelism, while the plan for the latter isn't worth displaying here because it's just massive (it expands the entire iTVF and its underlying view and subqueries). The latter also does not execute in parallel (AFAIK) to any extent.
My Questions
Is it possible to achieve performance comparable to the first block without a temp table?
That is, with the relative simplicity and human-readability of the slower SQL.
Why is a join to a temp table faster than a join to iTVF?
Why is it faster to use a temp table than an in-memory table populated the same way?
Beyond those explicit questions, if someone can point me in the right direction toward understanding this better in general then I would be very grateful.
Without seeing the DDL for your inline function - it's hard to say what the issue is. It would also help to see the actual execution plans for both queries (perhaps you could try: https://www.brentozar.com/pastetheplan/). That said, I can offer some food for thought.
As you mentioned, the iTVF accesses the underlying tables, views and associated indexes. If your statistics are not up to date you can get a bad plan; that won't happen with your temp table. On that note, how long does it take to populate that temp table?
Another thing to look at (again, this is why DDL is helpful) is: are the data types the same for Transactions.idCustomer and #tC.idCustomer? I see a hash match in the plan you posted, which seems bad for a join between two IDs (a nested loops or merge join would be better). This could be slowing both queries down but would appear to have a more dramatic impact on the query that leverages your iTVF.
Again, this ^^^ is speculation based on my experience. A couple of quick things to try (not as a permanent fix, but for troubleshooting):
1. Check to see if re-compiling your query when using the iTVF speeds things up (this would be a sign of a bad stats or a bad execution plan being cached and re-used)
2. Try forcing a parallel plan for the iTVF query. You can do this by adding OPTION (QUERYTRACEON 8649) to the end of your query or by using make_parallel() by Adam Machanic.
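For example, applied to the slower query from the question (troubleshooting only: trace flag 8649 is undocumented and make_parallel() is a third-party function, so treat this as a diagnostic sketch rather than a fix):
SELECT T.fAmount
FROM [Transactions] T
JOIN [dbo].[uf_GetVisibleCustomers]('requester') C ON C.idCustomer = T.idCustomer
OPTION (RECOMPILE, QUERYTRACEON 8649); -- RECOMPILE covers point 1, the trace flag covers point 2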

Small table has very high cost in query plan

I am having an issue with a query where the query plan says that 15% of the execution cost is for one table. However, this table is very small (only 9 rows).
Clearly there is a problem if the smallest table involved in the query has the highest cost.
My guess is that the query keeps on looping over the same table again and again, rather than caching the results.
What can I do about this?
Sorry, I can't paste the exact code (which is quite complex), but here is something similar:
SELECT Foo.Id
FROM Foo
-- Various other joins have been removed for the example
LEFT OUTER JOIN SmallTable as st_1 ON st_1.Id = Foo.SmallTableId1
LEFT OUTER JOIN SmallTable as st_2 ON st_2.Id = Foo.SmallTableId2
WHERE (
-- various where clauses removed for the example
)
AND (st_1.Id is null OR st_1.Code = 7)
AND (st_2.Id is null OR st_2.Code = 4)
Take these execution-plan statistics with a wee grain of salt. If this table is "disproportionately small," relative to all the others, then those cost-statistics probably don't actually mean a hill o' beans.
I mean... think about it ... :-) ... if it's a tiny table, what actually is it? Probably, "it's one lousy 4K storage-page in a file somewhere." We read it in once, and we've got it, period. End of story. Nothing (actually...) there to index; no (actual...) need to index it; and, at the end of the day, the DBMS will understand this just as well as we do. Don't worry about it.
Now, having said that ... one more thing: make sure that the "cost" which seems to be attributed to "the tiny table" is not actually being incurred by very-expensive access to the tables to which it is joined. If those tables don't have decent indexes, or if the query as-written isn't able to make effective use of them, then there's your actual problem; that's what the query optimizer is actually trying to tell you. ("It's just a computer ... backwards things says it sometimes.")
Without the query plan it's difficult to solve your problem here, but there is one glaring clue in your example:
AND (st_1.Id is null OR st_1.Code = 7)
AND (st_2.Id is null OR st_2.Code = 4)
This is going to be incredibly difficult for SQL Server to optimize because it's nearly impossible to accurately estimate the cardinality. Hover over the elements of your query plan and look at EstimatedRows vs. ActualRows and EstimatedExecutions vs. ActualExecutions. My guess is these are way off.
Not sure what the whole query looks like, but you might want to see if you can rewrite it as two queries with a UNION operator rather than using the OR logic.
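A rough sketch of that idea, applied to just one of the OR pairs from the example (the other joins and filters would need to be carried through both branches, and UNION removes duplicates, so check whether that matters for your data):
SELECT Foo.Id
FROM Foo
LEFT OUTER JOIN SmallTable as st_1 ON st_1.Id = Foo.SmallTableId1
WHERE st_1.Id IS NULL
UNION
SELECT Foo.Id
FROM Foo
JOIN SmallTable as st_1 ON st_1.Id = Foo.SmallTableId1
WHERE st_1.Code = 7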
Well, with the limited information available, all I can suggest is that you ensure all columns being used for comparisons are properly indexed.
In addition, you haven't stated if you have an actual performance problem. Even if those table accesses took up 90% of the query time, it's most likely not a problem if the query only takes (for example) a tenth of a second.

Why would using a temp table be faster than a nested query?

We are trying to optimize some of our queries.
One query is doing the following:
SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
INTO [#Gadget]
FROM task t
SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID) as Client
FROM [#Gadget]
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC
DROP TABLE [#Gadget]
(I have removed the complex subquery. I don't think it's relevant other than to explain why this query has been done as a two stage process.)
I thought it would be far more efficient to merge this down into a single query using subqueries as:
SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID)
FROM
(
SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
FROM task t
) as sub
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC
This would give the optimizer better information to work out what was going on and avoid any temporary tables. I assumed it should be faster.
But it turns out it is a lot slower. 8 seconds vs. under 5 seconds.
I can't work out why this would be the case, as all my knowledge of databases implies that subqueries would always be faster than using temporary tables.
What am I missing?
Edit --
From what I have been able to see from the query plans, both are largely identical, except that the temporary-table version has an extra "Table Insert" operation with a cost of 18%.
Obviously as it has two queries the cost of the Sort Top N is a lot higher in the second query than the cost of the Sort in the Subquery method, so it is difficult to make a direct comparison of the costs.
Everything I can see from the plans would indicate that the subquery method would be faster.
"should be" is a hazardous thing to say of database performance. I have often found that temp tables speed things up, sometimes dramatically. The simple explanation is that it makes it easier for the optimiser to avoid repeating work.
Of course, I've also seen temp tables make things slower, sometimes much slower.
There is no substitute for profiling and studying query plans (read their estimates with a grain of salt, though).
Obviously, SQL Server is choosing the wrong query plan. Yes, that can happen; I've had exactly the same scenario as you a few times.
The problem is that optimizing a query (you mention a "complex subquery") is a non-trivial task: If you have n tables, there are roughly n! possible join orders -- and that's just the beginning. So, it's quite plausible that doing (a) first your inner query and (b) then your outer query is a good way to go, but SQL Server cannot deduce this information in reasonable time.
What you can do is to help SQL Server. As Dan Tow writes in his great book "SQL Tuning", the key is usually the join order, going from the most selective to the least selective table. Using common sense (or the method described in his book, which is a lot better), you could determine which join order would be most appropriate and then use the FORCE ORDER query hint.
Anyway, every query is unique, there is no "magic button" to make SQL Server faster. If you really want to find out what is going on, you need to look at (or show us) the query plans of your queries. Other interesting data is shown by SET STATISTICS IO, which will tell you how much (costly) HDD access your query produces.
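A minimal sketch of both suggestions (the Client table and its columns are invented): FORCE ORDER makes SQL Server keep the join order exactly as written, and STATISTICS IO reports the logical and physical reads per table:
SET STATISTICS IO ON;
SELECT t.TaskID, c.ClientName
FROM task t
JOIN Client c ON c.ClientID = t.ClientID
OPTION (FORCE ORDER);
SET STATISTICS IO OFF;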
I have re-iterated this question here: How can I force a subquery to perform as well as a #temp table?
The nub of it is: yes, I get that sometimes the optimiser is right to meddle with your subqueries as if they weren't fully self-contained, but sometimes it takes a wrong turn when it tries to be clever, in a way that we're all familiar with. I'm saying there must be a way of switching off that "cleverness" where necessary, instead of wrecking a view-led approach with temp tables.
