indexes that appear to be redundant with clustered PK - sql-server

I am working on a database at a client with the following table:
CREATE TABLE [Example] (
[ID] INT IDENTITY (1, 1) NOT NULL,
....
[AddressID] INT NULL,
[RepName] VARCHAR(50) NULL,
....
CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED ([ID] ASC)
)
And it has the following indexes:
CREATE NONCLUSTERED INDEX [IDX_Example_Address]
ON [example]( [ID] ASC, [AddressId] ASC);
CREATE NONCLUSTERED INDEX [IDX_Example_Rep]
ON [example]( [ID] ASC, [RepName] ASC);
To me these appear to be redundant with the clustered index. I cannot imagine any scenario where these would be beneficial. If anyone can come up with a situation where these would be useful, let me know.
Here is another example:
CREATE NONCLUSTERED INDEX [IDX_Example_IsDeleted]
ON [example]( [IsDeleted] ASC)
INCLUDE( [ID], [SomeNumber]);
Why would you need to INCLUDE [ID]? My understanding is that the clustered index key is already present in every non-clustered index, so why would they do that? I would just INCLUDE ([SomeNumber])

You are correct that the clustered index key is already included in every non-clustered index, but not in the way the non-clustered indices in your example suggest.
For example, if you have a non-clustered index as in your example for IDX_Example_Rep, and you run this query:
SELECT [RepName], [Id] FROM [Example] WHERE [RepName] = 'some_value';
The IDX_Example_Rep index will be used, but it will be an index scan (every row will be checked). This is because the [Id] column was specified as the first column in the index, so the index is ordered by [Id] first and cannot be searched directly by [RepName].
If the index is instead specified as follows:
CREATE NONCLUSTERED INDEX [IDX_Example_Rep]
ON [example]([RepName] ASC);
Then when you run the same sample query, the IDX_Example_Rep index is used and the operation is an index seek - the engine knows exactly where to find the records by [RepName] within the IDX_Example_Rep index and, because the only other field being returned by the SELECT is the [Id] field, which is the key of the clustered index and therefore included in the non-clustered index, no further operations are necessary.
If the SELECT list were expanded to include, say, the [AddressId] field, then you'll find the engine still performs the index seek against IDX_Example_Rep to find the correct records, but then also has to do a key lookup against the clustered index to get the "other" fields (the [AddressId] in this example).
So, no - you probably don't want to repeat the [Id] column as part of the non-clustered indices in general, but when it comes to non-clustered indices you definitely want to pay attention to your SELECTed fields and know whether or not you're covering the fields you're going to need.
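As a rough sketch of the covering idea described above: the index name below is hypothetical, and the column names are taken from the question's table. Leading with [RepName] allows a seek, and INCLUDE ([AddressID]) should let the query avoid the key lookup. [ID] is not listed because, as the clustered key, it is already carried in the non-clustered index.

```sql
-- Sketch only: IDX_Example_Rep_Covering is an illustrative name, not from the original schema.
CREATE NONCLUSTERED INDEX [IDX_Example_Rep_Covering]
    ON [Example] ([RepName] ASC)
    INCLUDE ([AddressID]);

-- All three columns can be served from the index above:
-- [RepName] is the key, [AddressID] is included, [ID] is the clustered key.
SELECT [RepName], [ID], [AddressID]
FROM [Example]
WHERE [RepName] = 'some_value';
```

Checking the actual execution plan is the reliable way to confirm whether a seek without a key lookup is chosen for your data.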

Related

Should I explicitly list the partitioning column as part of the index key, or is it enough to specify it in the ON clause with the partition scheme?

I have SQL Server 2019 where I want to partition one of my tables. Let's say we have a simple table like so:
IF OBJECT_ID('dbo.t') IS NOT NULL
DROP TABLE t;
CREATE TABLE t
(
PKID INT NOT NULL,
PeriodId INT NOT NULL,
ColA VARCHAR(10),
ColB INT
);
Let's also say that I have defined a partition function and scheme. The scheme is called [PS_PartitionKey].
Now I can partition this table by building a clustered index in a couple of ways.
Like this:
CREATE CLUSTERED INDEX IX_1 ON t ([PKId] ASC )
ON [PS_PartitionKey]([PeriodID])
Or like this:
CREATE CLUSTERED INDEX IX_1 ON t ([PKId] ASC, [PeriodId] ASC )
ON [PS_PartitionKey]([PeriodID])
As you can see, in the first case I did not explicitly specify my partitioning column as part of the index key, but in the second case I did. Both of these work, but what's the difference?
A similar question would apply if I were building these as non-clustered indexes. Using the same table as an example. Let's say I start by creating a clustered PK:
ALTER TABLE [dbo].t
ADD CONSTRAINT PK_t
PRIMARY KEY CLUSTERED ([PKId] ASC, [PeriodId]) ON [PS_PartitionKey]([PeriodID])
Now I want to define an additional non-clustered index. Once again, I can do it in two ways:
CREATE NONCLUSTERED INDEX IX_1 ON t ([ColA] ASC)
ON [PS_PartitionKey]([PeriodID])
or:
CREATE NONCLUSTERED INDEX IX_1 ON t ([ColA] ASC, [PeriodId] ASC)
ON [PS_PartitionKey]([PeriodID])
What difference would it make?
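One way to examine the difference is to create each variant in turn and look at how SQL Server recorded the index columns. The query below is a sketch that assumes the table dbo.t and its indexes from the question exist. It uses only the catalog views sys.indexes, sys.index_columns and sys.columns. As far as I understand, when the partitioning column is not listed for a non-unique index, the engine adds it for you, so it should still show up here with a non-zero partition_ordinal.

```sql
-- Inspect how each index on dbo.t stores its columns.
SELECT i.name                AS IndexName,
       c.name                AS ColumnName,
       ic.key_ordinal,          -- position in the index key (0 = not a key column)
       ic.is_included_column,   -- 1 = stored as a nonkey (included) column
       ic.partition_ordinal     -- greater than 0 = column is used for partitioning
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.t')
ORDER BY i.index_id, ic.key_ordinal, ic.index_column_id;
```

Comparing the output for the two definitions shows whether [PeriodId] ends up as a key column or only as a partitioning column, which is the practical difference between them.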

SQL Server create clustered index on nvarchar column, enforce sorting

I want to have a small table with two columns, [Id] [bigint] and [Name] [nvarchar](63). The table is used for tags and it will contain all tags that exist.
I want to force an alphabetical sorting by the Name column so that a given tag is found more quickly.
Necessary points are:
The Id is my primary key, I use it e.g. for foreign keys.
The Name is unique as well.
I want to sort by Name alphabetically.
I need the SQL command for creating the constraints since I use scripts to create the table.
I know you can sort the table by using a clustered index, but I know that the table is not necessarily in that order.
My query looks like this but I don't understand how to create the clustered index on Name but still keep the Id as Primary Key:
IF NOT EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[Tags]')
AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[Tags]
(
[Id] [bigint] IDENTITY(1,1) PRIMARY KEY NOT NULL,
[Name] [nvarchar](63) NOT NULL,
CONSTRAINT AK_TagName UNIQUE(Name)
)
END
Edit:
I decided to follow paparazzo's advice. So if you have the same problem make sure you read his answer as well.
You should NOT do what you want to do.
Let the Id identity be the clustered PK. It (under normal use) will not fragment.
A table has no natural order. You have to use ORDER BY to get an order. Yes, data is typically presented in PK order, but that is just a convenience the query optimizer may or may not use.
Just put a non clustered unique index on Name and sort by it in the select.
Do you really need bigint? That would imply a massive table.
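A minimal sketch of that advice, using the table from the question. The constraint name PK_Tags and the sample tag value are illustrative assumptions. The identity column stays the clustered primary key, Name gets a unique non-clustered index, and ordering is requested explicitly in the query.

```sql
CREATE TABLE [dbo].[Tags]
(
    [Id]   [bigint] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](63) NOT NULL,
    CONSTRAINT PK_Tags PRIMARY KEY CLUSTERED (Id),      -- illustrative constraint name
    CONSTRAINT AK_TagName UNIQUE NONCLUSTERED (Name)
);

-- Alphabetical order is only guaranteed by an explicit ORDER BY.
SELECT [Id], [Name]
FROM [dbo].[Tags]
ORDER BY [Name];

-- Looking up a single tag can use the unique index on Name.
SELECT [Id]
FROM [dbo].[Tags]
WHERE [Name] = N'sql-server';   -- sample value, not from the original post
```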
You can specify that the Primary Key is NONCLUSTERED when declaring it as a constraint; you can then declare the Unique Key as the CLUSTERED index.
CREATE TABLE [dbo].[Tags] (
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](63) NOT NULL,
CONSTRAINT PK_Tag PRIMARY KEY NONCLUSTERED (Id ASC),
CONSTRAINT AK_TagName UNIQUE CLUSTERED (Name ASC)
);
Also, specifying ASC or DESC after the column name (within the key/index declaration) sets the index sort order. The default is ascending.

Redundant indexes?

I noticed a strange combination of indexes in one of the databases I was working on.
Here is the table design:
CREATE TABLE tblABC
(
id INT NOT NULL IDENTITY(1,1),
AnotherId INT NOT NULL, --not unique column
Othercolumn1 INT,
OtherColumn2 VARCHAR(10),
OtherColumn3 DATETIME,
OtherColumn4 DECIMAL(14, 4),
OtherColumn5 INT,
CONSTRAINT idxPKNCU
PRIMARY KEY NONCLUSTERED (id)
)
CREATE CLUSTERED INDEX idx1
ON tblABC(AnotherId ASC)
CREATE NONCLUSTERED INDEX idx2
ON tblABC(AnotherId ASC) INCLUDE(OtherColumn4)
CREATE NONCLUSTERED INDEX idx3
ON tblABC (AnotherId) INCLUDE (OtherColumn2, OtherColumn4)
Please note that column id is identity and defined as primary key.
A clustered index is defined on the column AnotherId; this column is not unique.
There are two additional nonclustered indexes defined on AnotherId, with additional include columns.
My opinion is that both of the nonclustered indexes on AnotherId (idx2 and idx3) are redundant, because the main copy of the table (the clustered index) has the same data.
When I checked the index usage, I was expecting to see no usage on idx2 and idx3, but idx3 had the highest number of index seeks.
I have given screenshots of the index design and usage.
My question is: aren't these nonclustered indexes, idx2 and idx3, redundant? The optimizer can get the same data from the clustered index, idx1. Maybe it would have done so if there were no NC index defined.
Am I missing something?
Regards,
Nayak
It is a bit odd to have two very similar non-clustered indexes, though they may both be getting used equally. I do also find it positively weird that the clustered index was made on a non-unique field.
Check out the following link for information and a free tool to ascertain index usage. I use this all the time to see which indexes are being used etc.
https://www.brentozar.com/blitzindex/
For the non-clustered indexes: you can consolidate them and remove the unused ones, because an index that is only written to and never read is a royal waste of resources.
For the clustered index, you may consider redoing it based on your findings with the blitz index tool.
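If you prefer to check usage without an external tool, a query along these lines is one option. It is a sketch that assumes the table tblABC from the question and reads the built-in DMV sys.dm_db_index_usage_stats. Note that these counters are reset when the instance restarts, so they only reflect activity since then.

```sql
-- Reads vs. writes per index on tblABC since the last instance restart.
SELECT i.name        AS IndexName,
       i.type_desc,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates   -- write cost paid to maintain the index
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON s.object_id   = i.object_id
   AND s.index_id    = i.index_id
   AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'tblABC')
ORDER BY i.index_id;

-- idx3 has the same key as idx2 and includes a superset of its columns,
-- so idx2 is the natural candidate to drop if the numbers support it:
-- DROP INDEX idx2 ON tblABC;
```

An index with high user_updates and no seeks, scans or lookups is the kind of write-only index the answer above describes.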

Is Unique key Clustered or Non-Clustered Index in SQL Server?

I am new to SQL Server and while learning about clustered index, I got confused!
Is a unique key a clustered or a non-clustered index? A unique key holds only unique values in the column, including null, so according to this concept a unique key should be a clustered index, right? But when I went through this MSDN article I got confused:
When you create a UNIQUE constraint, a unique nonclustered index is
created to enforce a UNIQUE constraint by default. You can specify a
unique clustered index if a clustered index on the table does not
already exist.
Please help me to understand the concept in a better manner, Thank you.
There are three ways of enforcing uniqueness in SQL Server indexes.
Primary Key constraint
Unique constraint
Unique index (not constraint based)
Whether they are clustered or non clustered is orthogonal to whether or not the indexes are declared as unique using any of these methods.
All three methods can create a clustered or non clustered index.
By default the unique constraint and unique index will create a non clustered index if you don't specify otherwise (and the PK will by default be created as CLUSTERED if no conflicting clustered index exists), but you can explicitly specify CLUSTERED/NONCLUSTERED for any of them.
Example syntax is
CREATE TABLE T
(
X INT NOT NULL,
Y INT NOT NULL,
Z INT NOT NULL
);
ALTER TABLE T ADD PRIMARY KEY NONCLUSTERED(X);
--Unique constraint NONCLUSTERED would be the default anyway
ALTER TABLE T ADD UNIQUE NONCLUSTERED(Y);
CREATE UNIQUE CLUSTERED INDEX ix ON T(Z);
DROP TABLE T;
For indexes that are not specified as unique, SQL Server will silently make them unique anyway. For clustered indexes this is done by appending a uniquefier to duplicate keys. For non clustered indexes the row identifier (logical or physical) is added to the key to guarantee uniqueness.
A unique index can be either clustered or non-clustered.
But if the column is nullable, NULL is treated as a value that must also be unique (only one row where the column is NULL).
If you want to store more than one NULL, you can create the index with the filter "where columnName is not null".
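A short sketch of that filtered unique index. The table and index names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical demo table: Code is nullable but must be unique when present.
CREATE TABLE dbo.FilteredDemo
(
    Id   INT NOT NULL PRIMARY KEY,
    Code INT NULL
);

-- Uniqueness is enforced only for rows where Code is not NULL.
CREATE UNIQUE NONCLUSTERED INDEX UX_FilteredDemo_Code
    ON dbo.FilteredDemo (Code)
    WHERE Code IS NOT NULL;

-- Two NULLs are accepted because they fall outside the filter.
INSERT INTO dbo.FilteredDemo (Id, Code)
VALUES (1, NULL), (2, NULL), (3, 10);

-- Inserting another row with Code = 10 would fail with a duplicate key error.

DROP TABLE dbo.FilteredDemo;
```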
All the answers provided were very helpful, but I would like to add a more detailed answer so that it may help others as well.
A table can contain only one clustered index, and a primary key can
be a clustered or non-clustered index.
A unique key can be a clustered or non-clustered index as well.
Below are some examples.
Scenario 1 : Primary Key will default to Clustered Index
In this case we will create only a Primary Key, and when we check the kind of index created on the table we will notice that a clustered index has been created automatically for it.
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc] FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO
Scenario 2: Primary Key is defined as a Non-clustered Index
In this case we will explicitly define the Primary Key as non-clustered, and it will be created as a non-clustered index. This shows that a Primary Key can be a non-clustered index.
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY NONCLUSTERED,
Col1 INT NOT NULL)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc] FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO
Scenario 3: Primary Key defaults to Non-Clustered Index with another column defined as a Clustered Index
In this case we will create a clustered index on another column; SQL Server will automatically create the Primary Key as a non-clustered index because the clustered index is specified on another column.
-- Case 3 Primary Key Defaults to Non-clustered Index
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE CLUSTERED)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc] FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO
Scenario 4: Primary Key defaults to Clustered Index with other index defaults to Non-clustered index
In this case we will create two indexes on the table but will not specify the type of either index. When we check the results we will notice that the Primary Key automatically defaults to a Clustered Index and the other column to a Non-clustered index.
-- Case 4 Primary Key and Defaults
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc] FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO
Reference: the above details are taken from this article.

How indexes work for below queries?

I have created the below table with primary key:
create table test2
(id int primary key,name varchar(20))
insert into test2 values
(1,'mahesh'),(2,'ram'),(3,'sham')
then created the non clustered index on it.
create nonclustered index ind_non_name on test2(name)
When I write the queries below, the query execution plan always uses the non clustered index.
select COUNT(*) from test2
select id from test2
select * from test2
Could you please help me understand why it always uses the non clustered index even though we have a clustered index on the table?
Thanks in advance.
Basically when you create a non-clustered index on name, the index actually contains name and id, so it kind of contains all the table itself.
If you add another field like this:
create table test4
(id int primary key clustered,name varchar(20), name2 varchar(20))
insert into test4 values
(1,'mahesh','mahesh'),(2,'ram','mahesh'),(3,'sham','mahesh')
create nonclustered index ind_non_name on test4(name)
You'll see that some of the queries will start using the clustered index.
In your case the indexes are pretty much the same thing. The clustered index also contains the data, so your clustered index is (id, name). Non clustered indexes contain the clustering key, so the non-clustered index is (name, id).
You don't have any search criteria, so no matter which index is used, it must be scanned completely anyhow. So why should it use the clustered index?
If you add a third field to your table, then at least select * will use the clustered index.
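To compare the two access paths yourself, one option is to force each index with a table hint and look at the reported reads. This is a sketch that assumes the test2 table and ind_non_name index from the question. INDEX(1) refers to the clustered index when one exists. With only three rows both indexes will most likely fit on a single page, so expect near-identical numbers.

```sql
SET STATISTICS IO ON;

-- Let the optimizer choose.
SELECT COUNT(*) FROM test2;

-- Force the clustered index (index id 1).
SELECT COUNT(*) FROM test2 WITH (INDEX(1));

-- Force the nonclustered index on name.
SELECT COUNT(*) FROM test2 WITH (INDEX(ind_non_name));

SET STATISTICS IO OFF;
```

The logical reads shown in the Messages output indicate how many pages each scan touched; on a larger table the narrower index would normally read fewer pages.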
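Primary keys and clustering keys are not the same thing, although SQL Server makes the primary key clustered by default when no other clustered index exists. If you want to be sure which column is the clustering key, declare it explicitly.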
To create the clustering key on the primary key in the create statement:
create table test2
(id int, name varchar(20),
constraint PK_ID_test2 primary key clustered(id))
To add the clustering key to what you have already:
ALTER TABLE test2
ADD CONSTRAINT PK_ID_test2 primary key clustered(id)
