I'm using SQL Server 2008 and I noticed an enormous difference in performance when running these two almost identical queries.
Fast query (takes less than a second):
SELECT Seasons.Description, sum( Sales.Value ) AS Value
FROM Seasons, Sales
WHERE Sales.Property05=Seasons.Season
GROUP BY Seasons.Description
Slow query (takes around 5 minutes):
SELECT Seasons.Description, sum( Sales.Value ) AS Value
FROM Seasons, Sales
WHERE Sales.Property04=Seasons.Season
GROUP BY Seasons.Description
The only difference is that the tables SALES and SEASONS are joined on Property05 in the fast query and Property04 in the slow one.
Neither of the two property fields is part of a key or an index, so I really don't understand why the execution plans and the performance are so different between the two queries.
Can somebody enlighten me?
EDIT: The query is automatically generated by a Business Intelligence program, so I have no power there. I would normally have used the JOIN ON syntax, although I don't know if that makes a difference.
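For what it's worth, the comma-join form the tool generates is equivalent to an explicit INNER JOIN; the optimizer normally produces the same plan for both, so this rewrite of the fast query is about readability rather than speed:

```sql
SELECT Seasons.Description, SUM(Sales.Value) AS Value
FROM Sales
INNER JOIN Seasons
    ON Sales.Property05 = Seasons.Season
GROUP BY Seasons.Description;
```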
Slow query plan: https://www.brentozar.com/pastetheplan/?id=HkcBc7gXZ
Fast query plan: https://www.brentozar.com/pastetheplan/?id=rJQ95mgXb
Note that the queries above were simplified to the essential part. The query plans are more detailed.
Related
Query A
SELECT Distinct ord_no FROM Orders
ORDER BY ord_no
Query B
SELECT ord_no FROM Orders
GROUP BY ord_no
ORDER BY ord_no
In the Orders table, ord_no is a varchar column and contains duplicates. It is part of a composite key together with an identity column.
May I know which query is better?
How could we check the query performance using MS SQL Server 2008 R2
(express version)?
You can see the amount of time each query takes, in milliseconds, in SQL Profiler. From Management Studio, go to Tools --> SQL Server Profiler and start a trace on your DB. Then run your queries and check their durations. Mind you, you'll need a considerable amount of data to see a difference.
You can use SQL Express Profiler if you are not on the full-blown version of SQL Server.
Check the execution plans for both queries. It's very likely that they will be the same, especially with such a simple query (you'll probably see a stream aggregate operator, doing the grouping, in both cases).
If the execution plans are the same, then there is no (statistically significant) difference in performance between the two.
Having said that, use GROUP BY instead of DISTINCT whenever in doubt.
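One way to compare the two forms directly (using the table and column names from the question) is to turn on the time and IO statistics and run both:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT DISTINCT ord_no FROM Orders ORDER BY ord_no;

SELECT ord_no FROM Orders GROUP BY ord_no ORDER BY ord_no;
```

If both report the same logical reads and similar CPU time, the plans are effectively identical and the choice is purely stylistic.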
I'm trying to run a query from multiple tables and I'm having an issue with the query taking over 10 minutes to provide just 3 records. The query is as follows:
select TOP 100 pm_entity_type_name, year(event_date),
pm_event_type_name, pm_event_name, pm_entity_name,
pm_entity_code, event_priority, event_cost
from pm_event_priority, pm_entity, pm_entity_type, pm_event_type, pm_event
where pm_event.pm_event_id = pm_event_priority.pm_event_id
And pm_entity.pm_entity_id = pm_event_priority.pm_entity_id
And pm_entity_type.pm_entity_type_id = pm_entity.pm_entity_type_id
And pm_event_type.pm_event_type_id = pm_event_priority.pm_event_type_id
And ( pm_entity.pm_entity_type_id = '002LEITUU0005T8EX40001XFTEW000000OZX' OR
pm_entity_type.parent_id= '002LEITUU0005T8EX40001XFTEW000000OZX' )
ORDER BY 1,2,3
I wonder, is there any way I can modify this query to possibly make the query a little faster?
Query performance can tank when you have to join many large tables together, particularly when the join columns are not properly indexed. In your case, I suspect your tables are quite large (many rows) and the _id columns are not indexed.
If you are using SQL Server Management Studio, you can click on "Display Estimated Execution Plan" to see how the query optimizer is interpreting your query. If you see a bunch of Table Scans rather than Index Scans/Seeks, this means SQL Server has to read through each and every row in your tables; a performance nightmare! Try putting some indexes on the _id columns of each table (perhaps a clustered index), and/or using Database Engine Tuning Advisor to automatically recommend the best index structure to apply to your tables to improve this query's performance.
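As a sketch of that advice (the index names here are made up; the table and column names come from the query, so adjust to your actual schema), nonclustered indexes on the join keys of the bridging table often help the most:

```sql
CREATE NONCLUSTERED INDEX IX_pm_event_priority_event
    ON pm_event_priority (pm_event_id);

CREATE NONCLUSTERED INDEX IX_pm_event_priority_entity
    ON pm_event_priority (pm_entity_id);

CREATE NONCLUSTERED INDEX IX_pm_event_priority_event_type
    ON pm_event_priority (pm_event_type_id);

CREATE NONCLUSTERED INDEX IX_pm_entity_entity_type
    ON pm_entity (pm_entity_type_id);
```

After adding these, re-check the plan to see whether the Table Scans have become Index Seeks.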
You need to look at the query plan. See this question on how to obtain it.
Once you have a query plan, see if you can tell from the list what is so slow. Chances are there are table scans because of missing indexes.
What happens if you take out the TOP 100 and do a SELECT * using the same criteria? If you get a ridiculous amount of data back, there may be missing join criteria.
A few days ago I wrote a query and it executed quickly, but now it takes 1 hour.
This query runs on my SQL Server 7 server and takes about 10 seconds.
The same query exists on another SQL Server 7 server, and until last week it also took about 10 seconds.
The configuration of both servers is the same; only the hardware is different.
Now, on the second server, this query takes about 30 minutes to extract the same details, but nobody has changed anything.
If I execute this query without the WHERE clause, it shows the details in 7 seconds, so it looks like the WHERE clause is the problem.
Without seeing the query, and probably the data, I can't do a lot other than offer tips.
Can you put more constraints on the query? If you can reduce the amount of data involved, this will speed up the query.
Look at the columns used in your JOIN, WHERE, HAVING and ORDER BY clauses, and check that the tables those columns belong to have indexes on them.
Do you need to use the user defined function or can it be done another way?
Are you using subqueries? If so, can these be pulled out into separate views?
Hope this helps.
Without knowing how much data is going into your tables, and not knowing your schema, it's hard to give a definitive answer but things to look at:
Try running UPDATE STATISTICS or DBCC DBREINDEX.
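In SQL Server 7/2000-era syntax that would look something like the following (the table name is a placeholder):

```sql
-- Refresh the optimizer's statistics for one table
UPDATE STATISTICS MyTable;

-- Rebuild all indexes on the table (takes locks while it runs)
DBCC DBREINDEX ('MyTable');
```

Stale statistics are a classic cause of a query that was fast last week suddenly running slowly.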
Do you have any indexes on the tables? If not, try adding indexes to columns used in WHERE clauses and JOIN predicates.
Avoid cross-table OR clauses (i.e., where you do WHERE table1.col1 = #somevalue OR table2.col2 = #someothervalue). SQL Server can't use indexes effectively with this construct, and you may get better performance by splitting the query into two and UNIONing the results.
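A sketch of that split, using placeholder table, column, and variable names:

```sql
-- Instead of:
--   SELECT ... FROM table1 t1 JOIN table2 t2 ON t1.id = t2.t1_id
--   WHERE t1.col1 = @somevalue OR t2.col2 = @someothervalue

SELECT t1.col1, t2.col2
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.t1_id
WHERE t1.col1 = @somevalue
UNION
SELECT t1.col1, t2.col2
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.t1_id
WHERE t2.col2 = @someothervalue;
```

Note that UNION (rather than UNION ALL) removes duplicates, matching what the OR form would return.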
What do your functions (UDFs) do and how are you using them? It's worth noting that dropping them in the columns part of a query gets expensive as the function is executed per row returned: thus if a function does a select against the database, then you end up running n + 1 queries against the database (where n = number of rows returned in the main select). Try and engineer the function out if possible.
Make sure your JOINs are correct -- where you're using a LEFT JOIN, revisit the logic and see if it needs to be a LEFT JOIN or whether it can be turned into an INNER JOIN. Sometimes people use LEFT JOINs, but when you examine the logic in the rest of the query, it can become apparent that the LEFT JOIN gives you nothing (because, for example, someone may have added a WHERE col IS NOT NULL predicate against the joined table). INNER JOINs can be faster, so it's worth reviewing all of these.
It would be a lot easier to suggest things if we could see the query.
Can anyone help me understand the SQL Server execution plan for the following queries?
I expected the subquery version (Query 2) to execute faster, because it's set-based. This appears to be the case when running the queries independently, if only marginally; however, the execution plan shows the query costs as 15% vs. 85% respectively:
-- Query 1 (15%) - Scalar Function
SELECT
gi.GalleryImageId,
gi.FbUserId,
dbo.GetGalleryImageVotesByGalleryImageId(gi.GalleryImageId) AS Votes
FROM
GalleryImage gi
-- Query 2 (85%) - Subquery
SELECT
gi.GalleryImageId,
gi.FbUserId,
(SELECT COUNT(*) FROM GalleryImageVote WHERE GalleryImageId = gi.GalleryImageId) AS Votes
FROM
GalleryImage gi
What am I missing here; does the execution plan skip over the cost of the function? Also, any suggestions as to whether either of the above would be better served with a CTE or OVER/PARTITION query?
Thank you in advance!
Never trust the Execution Plan.
It is very useful for seeing what the plan will be, but if you want real metrics, always turn on statistics:
set statistics io on
set statistics time on
...and compare actual executions. The plan may say the expectation is 15% / 85%, but the actual statistics will show you what that really translates to.
There is no silver bullet to performance tuning. Even "best" queries can change over time as the shape or distribution of your data changes.
The CTE won't be much different, and I am not sure how you plan to do a PARTITION query over this, but you can try the left join form.
SELECT
gi.GalleryImageId,
gi.FbUserId,
count(v.GalleryImageId) AS Votes
FROM
GalleryImage gi
LEFT JOIN GalleryImageVote v ON v.GalleryImageId = gi.GalleryImageId
GROUP BY
gi.GalleryImageId, gi.FbUserId
The optimiser does not know the cost of the function.
You can see the CPU, Reads and Duration via Profiler, though.
Some related answers from similar questions. One Two
Inline table functions expand into the main query (they are macros, like views).
Scalar functions (yours) and multi-statement table functions do not, and are black boxes to the "outer" query.
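If you control the function, one common fix (a sketch only; the function body is assumed to match the correlated count in the question) is to rewrite the scalar UDF as an inline table-valued function, which the optimizer can expand into the outer query:

```sql
CREATE FUNCTION dbo.GetGalleryImageVotes (@GalleryImageId int)
RETURNS TABLE
AS
RETURN
(
    SELECT COUNT(*) AS Votes
    FROM GalleryImageVote
    WHERE GalleryImageId = @GalleryImageId
);
GO

-- The inline TVF is expanded like a view, so this plans like the subquery form
SELECT gi.GalleryImageId, gi.FbUserId, v.Votes
FROM GalleryImage gi
CROSS APPLY dbo.GetGalleryImageVotes(gi.GalleryImageId) v;
```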
SQL Server 2008 running on Windows Server Enterprise(?) Edition 2008
I have a query joining against twenty some-odd tables (mostly LEFT OUTER JOINs). The full dataset returned by an unfiltered query returns less than 1,000 rows in less than 1s. When I apply a WHERE clause to filter the query it returns less than 300 rows in less than 1s.
When I apply an ORDER BY clause to the query it returns in 90s.
I examined the results of the query and noticed a number of NULL results in the column being used to sort. I modified the query to COALESCE the NULL values to a valid search value, without any change to the performance of the query.
I then did a
SELECT * FROM
(
my query goes here
) qry
ORDER BY myOrderByHere
And that produced the same results.
When I SELECT ... INTO #tempTable (without the ORDER BY) and then SELECT FROM the #tempTable with the order by the query returns in less than 1s.
What is really strange at this point is that the SELECT ... INTO also takes 90s, even without the ORDER BY.
The execution plan says that the SORT takes 98% of the execution time when included in the main query. If I do the INSERT INTO, the plan says that the actual insert into the temp table takes 99% of the execution time.
And to take out server issues I have run the same tests on two different instances of SQL Server 2008 with nearly identical results.
Many thanks!
rjsjr
Sounds like something strange is going on with your tempdb. Inserting 1000 rows in a temporary table should be fast, whether it's an implicit spool for sorting, or an explicit select into.
Check the size of your tempdb, the health of the hard disk it's on, and its recovery model (it should be simple, not full or bulk-logged).
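A quick way to check both, using the standard catalog views:

```sql
-- Recovery model of tempdb (should be SIMPLE)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'tempdb';

-- tempdb file sizes in MB (size is reported in 8 KB pages)
SELECT name, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;
```

If tempdb has to autogrow while your query runs, the growth itself can easily account for tens of seconds.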
A sort operation is usually an expensive step in the query. So, it's not surprising that the addition of the sort adds time. You may be seeing similar results when you incorporate a temp table in your steps. The sort operation in your original query may use tempdb to help do the sort, and that can be the time-consuming step in each query you compare.
If you want to learn more about each query you're running, you can review query plan outputs.