Index Seek with Bookmark Lookup Only Option for SQL Query? - sql-server

I am working on optimizing a SQL query that goes against a very wide table in a legacy system. I am not able to narrow the table at this point for various reasons.
My query is running slowly because it does an Index Seek on an Index I've created, and then uses a Bookmark Lookup to find the additional columns it needs that do not exist in the Index. The bookmark lookup takes 42% of the query time (according to the query optimizer).
The table has 38 columns, some of which are nvarchars, so I cannot make a covering index that includes all the columns. I have tried to take advantage of index intersection by creating indexes that cover all the columns, however those "covering" indexes are not picked up by the execution plan and are not used.
Also, since 28 of the 38 columns are pulled out via this query, I'd have 28/38 of the columns in the table stored in these covering indexes, so I'm not sure how much this would help.
Do you think a Bookmark Lookup is as good as it is going to get, or what would another option be?
(I should specify that this is SQL Server 2000)

OH,
the covering index with INCLUDE should work, but note that the INCLUDE clause requires SQL Server 2005 or later. Another option might be to create an indexed view (a view with a unique clustered index) containing only the columns you need.
Regards,
Lieven

As another option, you could create an index with included columns.
Here is an example from Books Online (BOL); it works on SQL Server 2005 and up:
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);
To answer this part: "I have tried to take advantage of index intersection by creating indexes that cover all the columns, however those "covering" indexes are not picked up by the execution plan and are not used."
An index can only be used for a seek when the predicate is sargable. In other words, if you wrap the indexed column in a function, or leave the first column of the index out of your WHERE clause, the optimizer cannot seek on that index (at best it scans it). The index is also unlikely to be chosen when its selectivity is low, because a large number of bookmark lookups costs more than scanning the table.
Check out SQL Server covering indexes for some more info
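As a minimal sketch of the sargability point, the two queries below ask for the same rows. The table dbo.Orders, its columns, and the index name are hypothetical and exist only for illustration.

```sql
-- Hypothetical table and index, used only for illustration.
CREATE TABLE dbo.Orders (
    OrderID   INT      NOT NULL PRIMARY KEY,
    OrderDate DATETIME NOT NULL,
    Amount    MONEY    NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate ON dbo.Orders (OrderDate);
GO

-- Not sargable: the indexed column is wrapped in a function,
-- so the optimizer cannot seek on IX_Orders_OrderDate.
SELECT OrderID
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2008;

-- Sargable rewrite: the bare column is compared with a range,
-- which allows an index seek.
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '20080101'
  AND OrderDate <  '20090101';
```

Comparing the two execution plans should show a scan for the first query and a seek for the second, although the optimizer's choice also depends on the data and statistics.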

Related

How does sql server look up in composite non-clustered index?

If for example I have composite non-clustered index as following:
CREATE NONCLUSTERED INDEX idx_Test ON dbo.Persons(IsActive, UserName)
Based on this answer: How important is the order of columns in indexes?
If I run this query :
Select * From Persons Where UserName='Smith'
In the query above, IsActive, which is the first column in the non-clustered index, is not present. Does that mean the SQL Server query optimizer will not look anything up in the index because IsActive is missing?
Of course I can just test it and check the execution plan, and I will do that, but I'm also curious about the theory behind it. When does cardinality matter and when does it not?
SQL Server will scan the whole index; in this case it may pick that index simply because it is the narrowest one available.
Below is a small example on an orders table I have.
The query predicate (shipperid='G') is satisfied by 199748 rows, but SQL Server has to read all 998123 rows to find them. This is visible in the plan by comparing the number of rows read with the actual number of rows.
I found this explanation from Craig Freedman very useful. Assuming you have an index on (a,b), SQL Server can seek efficiently on both columns for predicates like these:
a=somevalue and b=somevalue
a=someval and b>0
a=someval and b>=0
For the predicate below, SQL Server will filter out as many rows as possible with the first predicate (this is also why you may have heard the advice to put the column with more unique values first) and will apply the second predicate as a residual:
a>=somevalue and b=someval
For the case below, SQL Server has to scan the entire index:
b=someval
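The cases above can be tried against the index from the question. This is only a sketch: the column types of dbo.Persons and the second index name are assumptions, and the plan you actually get depends on row counts and statistics.

```sql
-- Hypothetical table mirroring the question; column types are assumptions.
CREATE TABLE dbo.Persons (
    PersonID INT          NOT NULL PRIMARY KEY,
    IsActive BIT          NOT NULL,
    UserName NVARCHAR(50) NOT NULL
);
CREATE NONCLUSTERED INDEX idx_Test ON dbo.Persons (IsActive, UserName);
GO

-- Leading column present: both predicates can be used as seek predicates.
SELECT PersonID FROM dbo.Persons WHERE IsActive = 1 AND UserName = N'Smith';

-- Leading column missing: idx_Test can only be scanned, with UserName
-- checked row by row as a residual predicate.
SELECT PersonID FROM dbo.Persons WHERE UserName = N'Smith';

-- If UserName is the usual search column, an index that leads with it
-- lets the second query seek instead.
CREATE NONCLUSTERED INDEX idx_Persons_UserName ON dbo.Persons (UserName, IsActive);
```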
Further reading:
Craig Freedman's SQL Server Blog: Seek Predicates
Rob Farley: Probe Residual when you have a Hash Match – a hidden cost in execution plans
Kimberly L. Tripp: The Tipping Point Query Answers

Is there benefit to index base tables of an indexed view?

After I created the indexed view, I tried disabling all the indexes in base tables including the indexes for foreign key column (constraint is still there) and the query plan for the view stays the same.
It is just like magic to me that the indexed view would be able to optimize the query so much even without base table being indexed. Even without any index on the View, SQL Server is able to do an index scan on the primary key index of the indexed view to retrieve data like 1000 times faster than using the base table.
Something like SELECT * FROM MyView WITH(NOEXPAND) WHERE NotIndexedColumn = 5 ORDER BY NotIndexedColumn
So the first two questions are:
Is there any benefit to index base tables of indexed view?
What is SQL Server doing when it does an index scan on the PK while the filter is on a non-indexed column?
Then I noticed that if I use full-text search + order by I would see a table spool (eager spool) in the query plan with a cost like 95%.
Query looks like SELECT ID FROM View WITH(NOEXPAND) WHERE CONTAINS(IndexedColumn, '"SomeText*"') ORDER BY IndexedColumn
Question n° 3:
Is there any index I could add to get rid of that operation?
It's important to understand that an indexed view is a "materialized view": its results are stored on disk.
So the speedup you are seeing comes from reading the precomputed result of the view's query from disk, rather than recomputing it from the base tables.
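A minimal sketch of how a view becomes materialized is shown here. The table dbo.OrderLines, the view dbo.vProductTotals, and the index name are hypothetical; the session is assumed to use the default SET options (ANSI_NULLS, QUOTED_IDENTIFIER and so on) that indexed views require.

```sql
-- Hypothetical base table; Price is NOT NULL because an indexed view
-- cannot SUM a nullable expression.
CREATE TABLE dbo.OrderLines (
    OrderLineID INT   NOT NULL PRIMARY KEY,
    ProductID   INT   NOT NULL,
    Price       MONEY NOT NULL
);
GO

-- SCHEMABINDING and two-part table names are required for indexed views;
-- COUNT_BIG(*) is required when the view uses GROUP BY.
CREATE VIEW dbo.vProductTotals
WITH SCHEMABINDING
AS
SELECT ProductID,
       SUM(Price)   AS TotalPrice,
       COUNT_BIG(*) AS LineCount
FROM dbo.OrderLines
GROUP BY ProductID;
GO

-- Creating the unique clustered index is what stores the results on disk.
CREATE UNIQUE CLUSTERED INDEX IX_vProductTotals
ON dbo.vProductTotals (ProductID);
GO

-- NOEXPAND makes the query read the stored view rows directly.
SELECT ProductID, TotalPrice
FROM dbo.vProductTotals WITH (NOEXPAND)
WHERE ProductID = 42;
```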
To answer your questions:
1) Is there any benefit to index base tables of indexed view?
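This is situational. If your view flattens out data or has many extra aggregate columns, then an indexed view is better than the table. If you are just using your indexed view like this:
SELECT * FROM foo WHERE createdDate > getDate()
then probably not.
But if you are doing something like SELECT sum(price),min(id) FROM x GROUP BY id,price then the indexed view would probably be better. Granted, your actual query is probably more complex, with joins and other advanced options. Queries that use NOEXPAND read only the view's own indexes, so base-table indexes matter mainly for queries that hit the base tables directly and for the work SQL Server does to keep the view up to date when the base tables change.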
2) What is SQL Server doing when it does an index scan on the PK while the filter is on a non-indexed column?
First we need to understand how clustered indexes are stored. The index is stored as a B-tree whose leaf level holds the rows themselves. A seek walks the tree to find the values that match your criteria; a scan reads the leaf level and tests each row against the filter, which is what happens when the filter column is not part of any index key. How your indexes are set up (covering vs. non-covering, and which non-clustered indexes exist) determines what the pages and extents look like. Without more knowledge of the table structure I can't tell you exactly what the scan is doing.
3) Is there any index I could add to get rid of that operation?
Just because something accounts for 95% of the query's cost doesn't make it a bad thing. The percentages always add up to 100%, so no matter what you do there is always going to be something taking up a large share. What you need to check is the IO reads and how much time the query itself takes.
To interpret those timings, you need to understand that SQL Server caches data pages in memory (it does not cache query results). With this in mind, a query can take a long time the first time it runs but be much quicker afterward, since the data it reads is already cached. It all depends on the frequency of the query and how your system is set up.
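One way to check reads and elapsed time is the pair of session settings sketched below. The SELECT against sys.objects is only a placeholder; substitute the statement you are tuning.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement
-- in this session; the output appears on the Messages tab in SSMS.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder query: replace with the statement you are tuning.
SELECT name, create_date
FROM sys.objects
WHERE type = 'U';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the statement twice and comparing physical reads with logical reads gives a rough idea of how much of the first run's time went on pulling pages from disk into the cache.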
For a more in-depth read on indexed view

Create more than one non clustered index on same column in SQL Server

What is the index creating strategy?
Is it possible to create more than one non-clustered index on the same column in SQL Server?
How about creating clustered and non-clustered on same column?
Very sorry, but indexing is very confusing to me.
Is there any way to find out the estimated query execution time in SQL Server?
The words are rather logical and you'll learn them quite quickly. :)
In layman's terms, SEEK implies seeking out precise locations for records, which is what SQL Server does when the column you're searching on is indexed and your filter (the WHERE condition) is selective enough.
SCAN means reading a larger range of rows, where the query optimizer estimates it's faster to fetch a whole range as opposed to individually seeking each value.
And yes, you can have multiple indexes on the same column, and sometimes it can be a very good idea. Play around with the indexes and use the execution plan to see what happens (shortcut in SSMS: Ctrl + M to include the actual execution plan). You can even run two versions of the same query in one batch and the plan will show each one's estimated cost relative to the batch, which makes comparing them quite easy.
But to expand on these a bit, say you have an address table like so, and it has over 1 billion records:
CREATE TABLE ADDRESS
(ADDRESS_ID INT -- CLUSTERED primary key ADDRESS_PK_IDX
, PERSON_ID INT -- FOREIGN KEY, NONCLUSTERED INDEX ADDRESS_PERSON_IDX
, CITY VARCHAR(256)
, MARKED_FOR_CHECKUP BIT
-- , ... plus a large number of other columns
)
Now, if you want to find all the address information for person 12345, the index on PERSON_ID is perfect. Since the table has loads of other data on the same row, it would be inefficient and space-consuming to create a nonclustered index that covers all the other columns as well as PERSON_ID. In this case, SQL Server will execute an index SEEK on the PERSON_ID index, then use the result to do a Key Lookup on the clustered index on ADDRESS_ID, and from there return the data in all the other columns of that row.
However, say you want to search for all the persons in a city, but you don't need other address information. This time, the most effective way would be to create an index on CITY and use the INCLUDE option to cover PERSON_ID as well. That way, a single index seek or scan returns all the information you need without having to go to the CLUSTERED index for the PERSON_ID data on the same row.
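The CITY index described here might look like the sketch below. It uses a cut-down version of the ADDRESS table above; the NOT NULL constraints and the index name ADDRESS_CITY_IDX are assumptions, and INCLUDE requires SQL Server 2005 or later.

```sql
-- Cut-down version of the ADDRESS table; created only if it does not exist yet.
IF OBJECT_ID('dbo.ADDRESS', 'U') IS NULL
    CREATE TABLE dbo.ADDRESS (
        ADDRESS_ID         INT          NOT NULL PRIMARY KEY CLUSTERED,
        PERSON_ID          INT          NOT NULL,
        CITY               VARCHAR(256) NOT NULL,
        MARKED_FOR_CHECKUP BIT          NOT NULL
    );
GO

-- Key on CITY for the search, PERSON_ID carried at the leaf level.
-- ADDRESS_CITY_IDX is a hypothetical name.
CREATE NONCLUSTERED INDEX ADDRESS_CITY_IDX
ON dbo.ADDRESS (CITY)
INCLUDE (PERSON_ID);
GO

-- Can be answered from ADDRESS_CITY_IDX alone, with no Key Lookup.
SELECT PERSON_ID
FROM dbo.ADDRESS
WHERE CITY = 'New York';
```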
Now, let's say both of those queries are required but still rather heavy because of the 1 billion records. But there's one special query that needs to be really, really fast. That query wants all the persons on addresses that have been MARKED_FOR_CHECKUP, and who must live in New York (ignore whatever checkup means, that doesn't matter). Now you might want to create a third, filtered index on MARKED_FOR_CHECKUP and CITY, with INCLUDE covering PERSON_ID, and with a filter saying CITY = 'New York' and MARKED_FOR_CHECKUP = 1. This index would be insanely fast, as it only ever covers queries that satisfy those exact conditions, and therefore has a fraction of the data to go through compared to the other indexes.
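A sketch of such a filtered index follows, again on a cut-down ADDRESS table. The NOT NULL constraints and the index name ADDRESS_NY_CHECKUP_IDX are assumptions, and filtered indexes require SQL Server 2008 or later.

```sql
-- Cut-down version of the ADDRESS table; created only if it does not exist yet.
IF OBJECT_ID('dbo.ADDRESS', 'U') IS NULL
    CREATE TABLE dbo.ADDRESS (
        ADDRESS_ID         INT          NOT NULL PRIMARY KEY CLUSTERED,
        PERSON_ID          INT          NOT NULL,
        CITY               VARCHAR(256) NOT NULL,
        MARKED_FOR_CHECKUP BIT          NOT NULL
    );
GO

-- Only rows matching the WHERE clause are stored in this index.
-- ADDRESS_NY_CHECKUP_IDX is a hypothetical name.
CREATE NONCLUSTERED INDEX ADDRESS_NY_CHECKUP_IDX
ON dbo.ADDRESS (MARKED_FOR_CHECKUP, CITY)
INCLUDE (PERSON_ID)
WHERE CITY = 'New York' AND MARKED_FOR_CHECKUP = 1;
GO

-- The query's predicate must match the index filter for the index to qualify.
SELECT PERSON_ID
FROM dbo.ADDRESS
WHERE CITY = 'New York' AND MARKED_FOR_CHECKUP = 1;
```

Note that if the city or the flag is passed as a parameter rather than a literal, the optimizer may not be able to match the query to the filtered index.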
(Disclaimer here, bear in mind that the query execution planner is not stupid, it can use multiple nonclustered indexes together to produce the correct results, so the examples above may not be the best ones available as it's very hard to imagine when you would need 3 different indexes covering the same column, but I'm sure you get the idea.)
The types of index, their columns, included columns, sort orders, filters, etc. depend entirely on the situation. You will need covering indexes that satisfy several different types of queries, as well as customized indexes created specifically for single, important queries. However, each index takes up disk space, requires extra maintenance whenever the data model changes, and adds time to defragmentation and statistics update operations, so you don't want to just slap an index on everything either.
Experiment, learn and work out which works best for your needs.
I'm not the expert on indexing either, but here is what I know.
You can have only ONE Clustered Index per table.
You can have up to a certain limit of non clustered indexes per table. Refer to http://social.msdn.microsoft.com/Forums/en-US/63ba3877-e0bd-4417-a04b-19c3bfb02ac9/maximum-number-of-index-per-table-max-no-of-columns-in-noncluster-index-in-sql-server?forum=transactsql
Indexes just need to have different names, but it's better not to use the same column(s) in a lot of different indexes, as you will run into performance problems.
A very important point to remember is that although indexes make your SELECTs faster, they slow down your INSERT/UPDATE/DELETE statements, because every change also has to be written to each affected index. The more indexes you have on a column that gets updated a lot, the slower those updates become.
You can include columns that are used in the CLUSTERED index in one or more NON-CLUSTERED indexes.
Here is some more reading material
http://www.sqlteam.com/article/sql-server-indexes-the-basics
http://www.programmerinterview.com/index.php/database-sql/what-is-an-index/
EDIT
Another point to remember is that an index takes up space just like the table. The more indexes you create, the more space they use, so try not to use char/varchar (or nchar/nvarchar) columns in an index. They use too much space in the index, and on huge columns they give basically no benefit. When your indexes start to become bigger than your table, it also means that you have to revisit your index strategy.

Composite and Covering index

What is the difference between a composite index and a covering index in SQL Server?
A covering index is a composite index that contains every column you are currently retrieving with your select statement and that participates in the where clause. It is one of the best ways to improve query performance substantially.
A covering index is a composite index that covers (hence the name) all columns that are necessary to fulfill a query or a join condition.
There is nothing special about SQL server here, these are generic designations.
A composite index is also a covering index when the index contains your search criteria and all the data your query is attempting to retrieve. In this example:
SELECT a,b,c FROM Foo WHERE a = 'FooFoo'
A covering index would contain column a (your search predicate) as well as the columns b and c.
In this case SQL Server can return the values it finds in the index and does not need to make an additional lookup in the actual table. If b and c are frequently returned but rarely searched on, then the index might be set up such that b and c are included in the index but not indexed.
Before SQL Server 2005, DBAs would add additional 'covering' columns to their index keys to achieve this optimization. SQL Server 2005 added a feature that lets you store covering columns in the leaf nodes of the index without making them part of the index tree. When creating an index you can specify these additional 'covering' columns in the INCLUDE clause. They will not be indexed, but they are added to the leaf nodes, which saves SQL Server from looking up the additional data in the main table. Using the INCLUDE clause avoids the overhead of carrying the extra data in the search tree while keeping the benefit that a covering index brings.
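The two styles can be sketched for the example query as follows. The definition of dbo.Foo (column types, the id column) and both index names are assumptions made for illustration.

```sql
-- Hypothetical table matching the example query; column types are assumptions.
CREATE TABLE dbo.Foo (
    id INT         NOT NULL PRIMARY KEY,
    a  VARCHAR(50) NOT NULL,
    b  INT         NOT NULL,
    c  INT         NOT NULL
);
GO

-- Pre-2005 style: every covering column is part of the index key.
CREATE NONCLUSTERED INDEX IX_Foo_a_b_c ON dbo.Foo (a, b, c);

-- SQL Server 2005 and later: key on a only, b and c stored at the leaf level.
CREATE NONCLUSTERED INDEX IX_Foo_a_incl ON dbo.Foo (a) INCLUDE (b, c);
GO

-- Either index covers this query, so no lookup into the base table is needed.
SELECT a, b, c FROM dbo.Foo WHERE a = 'FooFoo';
```

In practice you would keep only one of the two indexes; both are created here just to show the syntax side by side.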

Covering Index versus Clustered Index (Database Index)

I'm working on a database system and its indexes, but I'm having a really hard time seeing the clear difference between a covering index and a clustered index.
I've googled my way around but haven't found a clear-cut answer on:
What are the differences between the two types of indexes?
When do I use a covering index and when do I use a clustered index?
I hope someone can explain it to me in an almost child-like answer :-)
Sincerely Mestika
By the way, I'm using IBM DB2 version 9.7
I cannot speak to DB2, but the following applies to SQL Server.
When all of the columns a query requires are part of the index, the index is called a "covering index". SQL Server 2005 made such indexes easier to build by allowing you to have "included columns" in the index. This lets you add columns beyond the 16-column key limit, or columns that would be too large to be part of the key.
While you can only have one clustered index per table, you can have up to 249 non-clustered indexes per table (999 from SQL Server 2008 onward).
By having a covering index available to satisfy a query, SQL Server won't need to go back to the clustered index to retrieve the rest of the data required by the query.
Randy
