What is a non-clustered index scan - sql-server

I know what a table scan, a clustered index scan, and an index seek are, but my Google skills failed to turn up a precise explanation of non-clustered index scans. Why and when does a query use a non-clustered index scan?
Thank you.

As the name suggests, Non Clustered Index Scans are scans on Non Clustered Indexes. NCI scans will typically be done if all of the fields in a select can be fulfilled from a non clustered index, but where the selectivity or indexing of the query is too poor to result in a Seek.
NCI scans potentially have a performance benefit over a clustered index scan in that NCIs are generally narrower than Clustered Indexes (since they generally have fewer columns), hence fewer pages to fetch, and less I/O.
I've put a contrived scenario up on SqlFiddle; click on 'View Execution Plan' at the bottom.
Given the following setup of table, clustered, and non clustered indexes:
CREATE TABLE Foo
(
FooId INT,
Name VARCHAR(50),
BigCharField CHAR(7000),
CONSTRAINT PK_FOO PRIMARY KEY CLUSTERED(FooId)
);
CREATE NONCLUSTERED INDEX IX_FOO ON Foo(Name);
The following queries demonstrate the different scans:
-- Clustered Index Scan - because we need all fields, CI is most efficient
SELECT * FROM FOO;
-- Non Clustered Index Scan - because we just need Name, but have no selectivity, the NCI
-- will suffice and is narrower.
SELECT DISTINCT(Name) FROM FOO;
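To see the difference between a seek and a scan on the same nonclustered index, the following is a minimal sketch that assumes the Foo table and IX_FOO index above already exist. The sample rows and search values are hypothetical, and the plan the optimizer actually picks depends on row counts and statistics, so treat the comments as typical outcomes rather than guarantees.

```sql
-- Assumes the Foo table and IX_FOO index defined above already exist.
-- Hypothetical sample rows, purely for illustration.
INSERT INTO Foo (FooId, Name, BigCharField)
VALUES (1, 'alpha', 'x'), (2, 'beta', 'y'), (3, 'gamma', 'z');

SET STATISTICS IO ON;  -- report logical reads for each statement

-- Equality predicate on the index key: typically a Non Clustered Index Seek.
SELECT Name FROM Foo WHERE Name = 'beta';

-- A leading wildcard cannot be used for a seek. IX_FOO still contains
-- Name and the clustering key FooId, so a Non Clustered Index Scan is typical.
SELECT FooId, Name FROM Foo WHERE Name LIKE '%et%';

-- BigCharField is not in IX_FOO, so the narrow index no longer covers the
-- query; a Clustered Index Scan is the likely choice here.
SELECT * FROM Foo WHERE Name LIKE '%et%';

SET STATISTICS IO OFF;
```

Comparing the logical reads reported for the last two statements should show the benefit of scanning the narrower index.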

Related

Multiple Clustered Indexes on a Single Table?

I thought we could only place one clustered index on one table, and put multiple non-clustered indexes on a table, but using the code below I can easily add more than one clustered index to my table.
CREATE CLUSTERED INDEX TBL_MULTI_LC_HIST ON dbo.TBL_MULTI_LC_HIST (ID,AsOfDate)
Is this completely wrong?
It isn't possible to create multiple clustered indexes for a single table. From the docs (emphasis mine):
Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order.
For example this will fail:
CREATE TABLE Thing
(
Column1 INT NOT NULL,
Column2 INT NOT NULL
)
CREATE CLUSTERED INDEX IX1 ON dbo.Thing(Column1)
CREATE CLUSTERED INDEX IX2 ON dbo.Thing(Column2)
Error:
Cannot create more than one clustered index on table 'dbo.Thing'. Drop the existing clustered index 'IX1' before creating another.
Example: http://www.sqlfiddle.com/#!18/53a63/1
You can however have a single index with multiple columns in it which is perhaps where you are getting confused:
CREATE CLUSTERED INDEX IX3 ON dbo.Thing(Column1, Column2)
You can only have one clustered index. A "clustered" index IS the table's rows: its leaf level contains all the columns. Every other (nonclustered) index contains only its own key and included columns plus a pointer back to the clustered row (the clustering key). The key of the clustered index determines the logical ordering of the rows.
If there is no clustered index, then the rows are basically stored in a heap, with no order or structure.
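One way to check whether a table is a heap or has a clustered index is to look at the sys.indexes catalog view. The sketch below uses a hypothetical table named dbo.HeapDemo (not from the original posts) and shows the catalog before and after a clustered index is created.

```sql
-- Hypothetical demo table; the name HeapDemo is an assumption for illustration.
CREATE TABLE dbo.HeapDemo
(
    Column1 INT NOT NULL,
    Column2 INT NOT NULL
);

-- No clustered index yet: expect a single row with index_id = 0, type_desc = 'HEAP'.
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.HeapDemo');

-- One clustered index (on two columns) plus one nonclustered index.
CREATE CLUSTERED INDEX CIX_HeapDemo ON dbo.HeapDemo (Column1, Column2);
CREATE NONCLUSTERED INDEX IX_HeapDemo_Column2 ON dbo.HeapDemo (Column2);

-- The heap row is gone: index_id = 1 is the clustered index,
-- and nonclustered indexes get index_id values of 2 or higher.
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.HeapDemo')
ORDER BY index_id;
```

There is never more than one row with index_id 0 or 1 per table, which is the catalog's view of the "one clustered index" rule.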

Why does QO choose a clustered index scan vs. a table scan?

If I have a query like this:
SELECT * FROM tTable
where tTable does not contain any indexes, a table scan happens, as expected. If I add a clustered index on some column, then QO decides to use a clustered index scan for this query. Why? Why is a clustered index scan preferred over a table scan in this case?
If I add a clustered index on some column then QO decides to use clustered index scan on this query
Because when you create a clustered index on a table, the data in the table is rearranged in index order, so the table itself is the clustered index. This is also the reason why you can't have two clustered indexes on the same table.
To summarize, when you create a clustered index, there is only one structure, not two (clustered index and table).
The query is "give me all rows and all columns", which means "read every row", which is a scan.
There is nothing to do an index seek on, because there is no WHERE clause.
Unlike this, which can seek on the clustered key:
SELECT * FROM tTable WHERE PrimaryClusteredKeyValue = 45
The next query may use a nonclustered seek followed by a clustered key lookup, or it may still scan the clustered index because you ask for all columns. It depends on how many rows 'gbn' will match:
SELECT * FROM tTable WHERE NonClusteredOtherColumnValue = 'gbn'
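The progression described in these answers can be tried end to end. This is a rough sketch on a hypothetical table, dbo.ScanDemo, whose column and index names are assumptions. The operators named in the comments are what the execution plan typically shows; the optimizer may choose differently depending on data volume and statistics.

```sql
-- Hypothetical table; all names here are assumptions for illustration.
CREATE TABLE dbo.ScanDemo
(
    Id INT NOT NULL,
    OtherCol VARCHAR(20) NOT NULL,
    Payload CHAR(200) NOT NULL
);

-- No indexes at all: the table is a heap, so the plan shows a Table Scan.
SELECT * FROM dbo.ScanDemo;

-- After this, the table and the clustered index are one and the same structure.
CREATE CLUSTERED INDEX CIX_ScanDemo ON dbo.ScanDemo (Id);

-- Same query, same "read every row" work, now reported as a Clustered Index Scan.
SELECT * FROM dbo.ScanDemo;

-- A predicate on the clustering key allows a Clustered Index Seek.
SELECT * FROM dbo.ScanDemo WHERE Id = 45;

CREATE NONCLUSTERED INDEX IX_ScanDemo_OtherCol ON dbo.ScanDemo (OtherCol);

-- Either a nonclustered seek plus Key Lookup (few matching rows)
-- or a Clustered Index Scan (many matching rows).
SELECT * FROM dbo.ScanDemo WHERE OtherCol = 'gbn';
```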

Redundant indexes?

I noticed a strange combination of indexes in one of the databases I was working on.
Here is the table design:
CREATE TABLE tblABC
(
id INT NOT NULL IDENTITY(1,1),
AnotherId INT NOT NULL, --not unique column
Othercolumn1 INT,
OtherColumn2 VARCHAR(10),
OtherColumn3 DATETIME,
OtherColumn4 DECIMAL(14, 4),
OtherColumn5 INT,
CONSTRAINT idxPKNCU
PRIMARY KEY NONCLUSTERED (id)
)
CREATE CLUSTERED INDEX idx1
ON tblABC(AnotherId ASC)
CREATE NONCLUSTERED INDEX idx2
ON tblABC(AnotherId ASC) INCLUDE(OtherColumn4)
CREATE NONCLUSTERED INDEX idx3
ON tblABC (AnotherId) INCLUDE (OtherColumn2, OtherColumn4)
Please note that column id is identity and defined as primary key.
A clustered index is defined on column - AnotherId, this column is not unique.
There are two additional nonclustered indexes defined on AnotherId, with additional include columns
My opinion is that both of the nonclustered indexes on AnotherId (idx2 and idx3) are redundant, because the main copy of the table (the clustered index) has the same data.
When I checked the index usage, I was expecting to see no usage on idx2 and idx3, but idx3 had the highest number of index seeks.
I have included screenshots of the index design and usage.
My question is: aren't these nonclustered indexes, idx2 and idx3, redundant? The optimizer can get the same data from the clustered index, idx1. Maybe it would have done so if there were no NC indexes defined.
Am I missing something?
Regards,
Nayak
It is a bit odd to have two very similar non-clustered indexes, though they may both be getting used equally. I do also find it positively weird that the clustered index was made on a non-unique field.
Check out the following link for information and a free tool to ascertain index usage. I use this all the time to see which indexes are being used etc.
https://www.brentozar.com/blitzindex/
For the non-clustered indexes: you can consolidate them and remove the unused ones, because if you're only writing to an index it is a royal waste of resources.
For the clustered index, you may consider redoing it based on your findings with the blitz index tool.
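Besides the tool linked above, index usage can be read directly from the sys.dm_db_index_usage_stats dynamic management view. The query below is a sketch that assumes tblABC lives in the dbo schema of the current database. The counters are reset when the SQL Server service restarts, so a low number may only mean the server has not been up for long.

```sql
-- Reads versus writes per index on tblABC (dbo schema is an assumption).
SELECT
    i.name          AS index_name,
    i.type_desc,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.user_updates  -- maintenance cost: writes that had to touch this index
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON  s.object_id   = i.object_id
    AND s.index_id    = i.index_id
    AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.tblABC')
ORDER BY i.index_id;
```

An index with many user_updates but no seeks, scans, or lookups is a candidate for removal; an index with no row in the DMV (NULLs from the LEFT JOIN) has not been touched since the counters were last reset.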

Should the PK on an identity column (which is surrogate key) be non-clustered?

For a table with a PK on an identity column, the PK will be clustered by default. Would it be better as non-clustered? The PK is a surrogate key which may never be used for querying directly; it may only be used to join to another table.
The reason is that other indexes will be created for queries. Will a query that uses a non-clustered index, and returns columns not covered by that index, use less LIO because there are no extra clustered index seek steps?
create table T (
Id int identity(1,1) primary key, -- clustered or non-clustered?
A ....
B ....
C ....
....)
create index ix_A on T (A)
create index ix_..... -- Many indexes can be created for different queries
select A, B
from T
where A between @a and @a+5 -- This query will have less LIO if the PK is non-clustered (seek)
It's perfectly fine to set your surrogate PK to be non-clustered if there is a better candidate in the table for the clustered index.
Good candidates for a clustered index are columns that you will frequently do either range searches ([ColumnName] BETWEEN This AND That) on, or ORDER BY clauses on.
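As an illustration of that advice, here is a minimal sketch with a hypothetical table, dbo.OrdersDemo. The table, columns, and date range are assumptions, not taken from the question. The surrogate identity key stays the primary key but is declared NONCLUSTERED, and the clustered index goes on the column used for range searches.

```sql
-- Hypothetical table: surrogate PK kept, but not as the clustered index.
CREATE TABLE dbo.OrdersDemo
(
    Id        INT IDENTITY(1,1) NOT NULL,
    OrderDate DATE NOT NULL,
    Amount    DECIMAL(10, 2) NOT NULL,
    CONSTRAINT PK_OrdersDemo PRIMARY KEY NONCLUSTERED (Id)
);

-- Cluster on the column that is range-searched and sorted on.
CREATE CLUSTERED INDEX CIX_OrdersDemo_OrderDate
    ON dbo.OrdersDemo (OrderDate);

-- A range predicate on the clustering key can seek to the start of the range
-- and read the rows in order, with no key lookups and usually no sort.
SELECT Id, OrderDate, Amount
FROM dbo.OrdersDemo
WHERE OrderDate BETWEEN '2024-01-01' AND '2024-01-31'
ORDER BY OrderDate;
```

Because OrderDate is not unique, SQL Server adds a hidden uniquifier to duplicate key values, which makes the clustering key (and every nonclustered index that carries it) slightly wider.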

What is the difference between composite non clustered index and covering index

SQL Server 2005 includes a "covering index" feature, which allows us to select more than one non-key column to be included in an existing non clustered index.
For example, I have the following columns:
EmployeeID, DepartmentID, DesignationID, BranchID
Here are two scenarios:
1. EmployeeID is a primary key with clustered index and the remaining columns (DepartmentID, DesignationID, BranchID) are taken as non clustered index (composite index).
2. EmployeeID is a primary key with clustered index and DepartmentID is non clustered index with DesignationID, BranchID are "included columns" for non clustered index.
What is the difference between the above two? If both are the same, what's new in introducing the "covering index" concept?
The difference is that if there are two rows with the same DepartmentID in the first index they will be sorted based on their values of DesignationID and BranchID. In the second case they will not be sorted relative to each other and could appear in any order in the index.
In terms of what this means to your application:
A query which can use an index on (DepartmentID, DesignationID) can be more efficient with the first index than the second.
Building the first index may take slightly longer because of the extra sorting required.
A covering index is a nonclustered index that contains every column a given query needs; the INCLUDE clause is one way to add non-key columns to it.
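The two scenarios from the question can be written out as follows. This is a sketch only: the table name dbo.EmployeeDemo, the index names, and the literal values in the query are assumptions for illustration.

```sql
-- Hypothetical table built from the columns listed in the question.
CREATE TABLE dbo.EmployeeDemo
(
    EmployeeID    INT NOT NULL PRIMARY KEY CLUSTERED,
    DepartmentID  INT NOT NULL,
    DesignationID INT NOT NULL,
    BranchID      INT NOT NULL
);

-- Scenario 1: composite key. All three columns are key columns, so rows
-- are sorted by DepartmentID, then DesignationID, then BranchID.
CREATE NONCLUSTERED INDEX IX_EmployeeDemo_Composite
    ON dbo.EmployeeDemo (DepartmentID, DesignationID, BranchID);

-- Scenario 2: single key column plus included columns. Only DepartmentID is
-- sorted; DesignationID and BranchID are stored at the leaf level only.
CREATE NONCLUSTERED INDEX IX_EmployeeDemo_Include
    ON dbo.EmployeeDemo (DepartmentID)
    INCLUDE (DesignationID, BranchID);

-- Both indexes cover this query. The composite index can seek on both
-- predicates; the INCLUDE index can seek only on DepartmentID and must
-- then filter DesignationID as a residual predicate.
SELECT BranchID
FROM dbo.EmployeeDemo
WHERE DepartmentID = 10
  AND DesignationID = 3;
```

In practice only one of the two indexes would normally be kept; they are created side by side here purely for comparison.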
