Does ORDER BY clause slows down the query? - sql-server

Does the ORDER BY clause slows down the query performance ??? How much will it effect if the column is indexed as against when it is not.

#JamesZ is correct. There are many things to consider when adding the order by clause to your query. For instance if you did a select top 10 * from dbo.Table order by field with let's say 10,000,000 rows would cause the query to spill into tempdb as it spooled the entire table to tempdb and then after sorting by your non-indexed field would then return the 10 rows. If you did the same select without the sort, results would return almost immediately.
It's very important to know how your tables are indexed before issuing an order by clause. CTRL-M is your friend in SSMS.

If there is a sort operator in the query plan (and there is more than 1 row), yes, it has an affect. If the data is already in the order you need it (either clustered index, or non-clustered index that has all the fields that the query needs in correct order), the sorting might not be needed, but other operations in the plan might still cause sorting to be done to be sure that the data is still in the correct order.
How much does if affect, well test it. Take the sorting away and compare the performance.

It is better to test. You can use actual execution plan.
Here is a simple ORDEY BY but the additional cost is 4 times the original one

Related

Is there a way I can tell what OPTION (FAST 1) does at the end of my query?

I have a number of horribly large queries which work ok on small databases but when the volume of data gets larger then the performance of these queries get slower. They are badly designed really and we must address that really. These queries have a very large number of LEFT OUTER JOINS. I note that when the number of LEFT OUTER JOINS goes past 10 then performance gets logarithmically slower each time a new join is added. If I put a OPTION (FAST 1) at the end of my query then the results appear nearly immediately. Of course I do not want to use this as it firstly, it is not going to help all of the time (if it did then every query would have it) and secondly I want to know how to optimise these joins better. When I run the query without the OPTION set then the execution plan shows a number of nested loops on my LEFT OUTER JOINS are showing a high percentage cost, but with the option off it does not. How can I find out what it does to speed the query up so I can reflect it in the query ?
I cannot get the query nor the execution plans today as the server I am on does not let me copy data from it. If they are needed for this I can arrange to get them sent but will take some time, in the morning.
Your comments would be really interesting to know.
Kind regards,
Derek.
You can set column to primary key and column automatically will be Clustered Index.
Clustered Index->Benefit and Drawback
Benefit: Performance boost if implemented correctly
Drawback: requires understanding of clustered/non clustered indexes and storage implications
Note: varchar foreign keys can lead to poor performance as well. Change the base table to have an integer primary key instead.
And also
I would suggest to use database paging(f.e. via ROW_NUMBER function) to partition your result set and query only the data you want to show(f.e. 20 rows per page in a GridView).

SQL Server ignoring index and performs table scan

we have a little problem with one of our queries, which is executed inside a .Net (4.5) application via System.Data.SqlClient.SqlCommand.
The problem is, that the query is going to perform a Table-Scan which is very slow. So the execution plan shows the Table-Scan here
Screenshot:
The details:
So the text shows, that the filter to Termine.Datum and Termine.EndDatum causing the Table-Scan. But why is the SQL-Server ignoring the Indexes? There are two indexes on Termine.Datum and Termine.EndDatum. We also tryed to add a third one with Datum and EndDatum combined.
The indexes are all non-clustered indexes and both fields are DateTime.
It decides on Table Scan based on Estimated number of rows 124844 where as your actual rows are only 831.
Optimizer thinks that to traverse 124844 it will better do scan in table instead of Index Seek.
Also need to check about other columns selected apart from Index. If you have selected other columns apart from Index it has to Do RID Lookup after doing index seek, Optimizer might think instead of RID lookup it preferred to go with Table Scan.
First fix: Update the statistics and provide enough information to optimizer to choose better plan.
Can you provide the full query? I see that you are pulling a range of data that span a range of 3 months. If this range is a high percentage of the dataset it might be scanning due to you attempting to return such a large percentage of the data. If the index is not selective enough it won't get picked up.
Also...
You have an OR clause in the filter. From looking at the predicate in the screenshot you provided it looks like you might be missing () around the two different filters. This might also lead to the scan.
One more thing...
OR clauses can sometimes lead to bad plans - an alternative is to split the query into two UNIONED queries each with the different OR in it. If you provide the query I should be able to give you a re-written version to show this.

SQL Server detecting slow vs fast columns

I have an ASP.Net MVC application & I use PetaPoco and SQL Server.
My usecase is I want to allow a search on a table with many fields, but hide fields that are "slow" (ie) unindexed. I'm going to modify the PetaPoco T4 template to decorate this information on the columns.
I found this answer that gives you a list of tables vs indexes. My concern is it shows a lot of columns for a particular table. Is the query given in the answer reliable for my usecase ? (ie) can the columns shown be included in the where clause & it wont be slow ? I have some tables that have 40M rows. I dont want to include slow columns in the where condition.
Or is there a better way to solve this problem ?
There are no slow columns in the sense of your question. You have to distinguish between two uses of a column.
Searching. When the column appears in the WHERE, or JOIN clause, it slows down your query, if there is no index for it.
Returning in recordset. If the column appears in the SELECT clause, its content must be returned with each row, whether you need it, or not. So for queries returning many rows, each additional column to be returned means a performance penalty.
Conclusion: As you can see, the performance impact of SELECTED columns does NOT DEPEND on index, but on the number of the returned rows.
Advice: Create indexes for columns used to search and do not return unnecessary columns. Let your queries be as specific as possible in terms of both, selected columns and returned rows.
I think it will not be that simple. You can check indexed columns using the suggested approach (or similar), but the fact that a column is present in an index does not mean your query will necessarily utilize it efficiently. For example if an index is created on columns A, B and C (in that order) and you only have a 'WHERE' clause on B or C (but not on A) you will probably end up with index scan rather than index seek and your query is likely to be slower than expected.
So your check should take into account the sequence of the columns in the indices - instantly fast columns (in your situation) might probably be considered the first columns of the indices (where ic.index_column_id = 1 in the post you mentioned). Columns that are not first in the indices (i.e. ic.index_column_id > 1) will be fast as long as the first columns are also included in the filter. There are other things you might also need to take into account (e.g. cardinality), but this is important to make sure you drive index seeks rather than scans.

problem with limit and offset in sql?

Well I have a sorted table by id and I want to retrieve n rows offset m rows, but when I use it without orderby it is producing unpredictable results and with order by id its taking too much of time, since my table is already ordered, I just want to retrieve n rows leaving first m rows.
The documentation says:
The query planner takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you use for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
So what you are trying cannot really be done.
I suppose you could play with the planner options or rearrange the query in a clever way to trick the planner into using a plan that suits you, but without actually showing the query, it's hard to say.
SELECT * FROM mytable LIMIT 100 OFFSET 0
You really shouldn't rely on implicit ordering though, because you may not be able to predict the exact order of how data goes into the database.
As pointed out above, SQL does not guarantee anything about order unless you have an ORDER BY clause. LIMIT can still be useful in such a situation, but I can't think of any use for OFFSET. It sounds like you don't have an index on id, because if you do, the query should be extremely fast, clustered or not. Take another look at that. (Also check CLUSTER, which may improve your performance at the margin.)
REPEAT: this is not something about Postgresql. Its behavior here is conforming.

needed index to make this sql query run faster

here is the query I am stuck with:
SELECT *
FROM customers
WHERE salesmanid = #salesrep
OR telephonenum IN (SELECT telephonenum
FROM salesmancustomers
WHERE salesmanname = #salesrepname)
ORDER BY customernum
It is SLOW and crushing my CPU at 99%. I know an index would help but not sure what kind or if it should be 2 indexes or 1 with both columns included.
Three indexes probably each on a single column. This is assuming that your queries are all quite selective relative to the size of the tables.
It would help if you told us what your table schemas are along with details of existing indexes (Your PKs will get a clustered index by default if you don't specify otherwise) and some details about size of tables / selectivity.
Customers
SalesmanId
TelephoneNum
SalesmanCustomers
SalesmanName
Take a look at the Query Execution Plan and see if there are any table scans going on. This will help you identify what indexes you need.
I suppose that in addition to columns suggested by #Martin, an index on CustomerNum is also required since its used in order by clause.
If you have lots of records, the OrderBy is something which takes a lot of time. You can also try to run query without the orderby and see how much time it takes.

Resources