How do I monitor and find unused indexes in SQL database

How do I monitor and find unused indexes in SQL database - sql-server

I would like to monitor index usage for an SQL database in order to find unused indexes and then drop them. How can I monitor index usage most efficiently? Which scripts could be useful?
I'm aware of this question about identifying unused objects, but this applies only to the current run of the SQL server. I would like to monitor index usage over a period of time.

This is an interesting question. I've been working on this same question over the past week. There is a system table called dm_db_index_usage_stats that contains usage statistics on indexes.
Indexes That Never Appear in the Usage Statistics Table
However, many indexes never appear in this table at all. The query David Andres posted lists all indexes for this case. I've updated it a little bit to ignore primary keys, which probably shouldn't be deleted, even if they aren't ever used. I also joined on the dm_db_index_physical_stats table to get other information, including Page Count, Total Index Size, and the Fragmentation Percentage. An interesting note is that indexes that are returned by this query don't seem to show up in the SQL Report for Index Usage Statistics.
DECLARE #dbid INT
SELECT #dbid = DB_ID(DB_NAME())
SELECT Databases.Name AS [Database],
Objects.NAME AS [Table],
Indexes.NAME AS [Index],
Indexes.INDEX_ID,
PhysicalStats.page_count as [Page Count],
CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)],
CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
FROM SYS.INDEXES Indexes
INNER JOIN SYS.OBJECTS Objects ON Indexes.OBJECT_ID = Objects.OBJECT_ID
LEFT JOIN sys.dm_db_index_physical_stats(#dbid, null, null, null, null) PhysicalStats
on PhysicalStats.object_id = Indexes.object_id and PhysicalStats.index_id = indexes.index_id
INNER JOIN sys.databases Databases
ON Databases.database_id = PhysicalStats.database_id
WHERE OBJECTPROPERTY(Objects.OBJECT_ID,'IsUserTable') = 1
AND Indexes.type = 2 -- Nonclustered indexes
AND Indexes.INDEX_ID NOT IN (
SELECT UsageStats.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS UsageStats
WHERE UsageStats.OBJECT_ID = Indexes.OBJECT_ID
AND Indexes.INDEX_ID = UsageStats.INDEX_ID
AND DATABASE_ID = #dbid)
ORDER BY PhysicalStats.page_count DESC,
Objects.NAME,
Indexes.INDEX_ID,
Indexes.NAME ASC
Indexes That Do Appear in the Usage Statistics Table, But Are Never Used
There are other indexes that do appear in the dm_db_index_usage_stats table, but which have never been used for user seeks, scans, or lookups. This query will identify indexes that fall into this category. Incidentally, unlike the indexes returned from the other query, the indexes returned in this query can be verified on the SQL Report by Index Usage Statistics.
I added a Minimum Page Count that allows me to initially focus on and remove unused indexes that are taking up a lot of storage.
DECLARE #MinimumPageCount int
SET #MinimumPageCount = 500
SELECT Databases.name AS [Database],
Indexes.name AS [Index],
Objects.Name AS [Table],
PhysicalStats.page_count as [Page Count],
CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)],
CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)],
ParititionStats.row_count AS [Row Count],
CONVERT(decimal(18,2), (PhysicalStats.page_count * 8.0 * 1024) / ParititionStats.row_count) AS [Index Size/Row (Bytes)]
FROM sys.dm_db_index_usage_stats UsageStats
INNER JOIN sys.indexes Indexes
ON Indexes.index_id = UsageStats.index_id
AND Indexes.object_id = UsageStats.object_id
INNER JOIN sys.objects Objects
ON Objects.object_id = UsageStats.object_id
INNER JOIN SYS.databases Databases
ON Databases.database_id = UsageStats.database_id
INNER JOIN sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS PhysicalStats
ON PhysicalStats.index_id = UsageStats.Index_id
and PhysicalStats.object_id = UsageStats.object_id
INNER JOIN SYS.dm_db_partition_stats ParititionStats
ON ParititionStats.index_id = UsageStats.index_id
and ParititionStats.object_id = UsageStats.object_id
WHERE UsageStats.user_scans = 0
AND UsageStats.user_seeks = 0
AND UsageStats.user_lookups = 0
AND PhysicalStats.page_count > #MinimumPageCount -- ignore indexes with less than 500 pages of memory
AND Indexes.type_desc != 'CLUSTERED' -- Exclude primary keys, which should not be removed
ORDER BY [Page Count] DESC
I hope this helps.
Final Thought
Of course, once indexes are identified as candidates for removal, careful consideration should still be employed to make sure it's a good decision to do so.
For more information, see Identifying Unused Indexes in a SQL Server Database

Currently (as of SQL Server 2005 - 2008) the SQL index stats information is only kept in memory and so you have to do some of the work yourself if you would like to have that persisted across restarts and database detaches.
What I usually do, is I create a job that runs every day and takes a snapshot of the information found in the sys.dm_db_index_usage_stats table, into a custom table that I create for the database in question.
This seems to work pretty well until a future version of SQL which will support persistent index usage stats.

Pulled this puppy off of http://blog.sqlauthority.com/2008/02/11/sql-server-2005-find-unused-indexes-of-current-database/. Note that this works for 2005 and above. The key is the JOIN to the SYS.DM_DB_INDEX_USAGE_STATS system table.
USE AdventureWorks
GO
DECLARE #dbid INT
SELECT #dbid = DB_ID(DB_NAME())
SELECT OBJECTNAME = OBJECT_NAME(I.OBJECT_ID),
INDEXNAME = I.NAME,
I.INDEX_ID
FROM SYS.INDEXES I
JOIN SYS.OBJECTS O ON I.OBJECT_ID = O.OBJECT_ID
WHERE OBJECTPROPERTY(O.OBJECT_ID,'IsUserTable') = 1
AND I.INDEX_ID NOT IN (
SELECT S.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS S
WHERE S.OBJECT_ID = I.OBJECT_ID
AND I.INDEX_ID = S.INDEX_ID
AND DATABASE_ID = #dbid)
ORDER BY OBJECTNAME,
I.INDEX_ID,
INDEXNAME ASC
GO

I tweaked John Pasquet's queries here: Identifying Unused Indexes in a SQL Server Database to return indexes used 10 or less times, unioned the results that aren't in the usage stats tables, exclude heap indexes and unique constraints or primary key indexes, and finally to exclude indexes with zero pages.
Be careful with the results of this query – it's best to use in production where indexes are actually getting used the way you would expect. If you query on a database with rebuilt or dropped/recreated indexes or on a recent database backup you could get false positives (indexes that normally would get used but aren't because of special circumstances). Not safe to use in test or dev environments to decide whether to drop indexes. As Narnian says, this query just identifies candidates for removal for your careful consideration.
USE [DatabaseName]
DECLARE #MinimumPageCount int
SET #MinimumPageCount = 500
DECLARE #dbid INT
SELECT #dbid = DB_ID(DB_NAME())
-- GET UNUSED INDEXES THAT APPEAR IN THE INDEX USAGE STATS TABLE
SELECT
Databases.name AS [Database]
,object_name(Indexes.object_id) AS [Table]
,Indexes.name AS [Index]
,PhysicalStats.page_count as [Page Count]
,CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)]
,CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
,ParititionStats.row_count AS [Row Count]
,CONVERT(decimal(18,2), (PhysicalStats.page_count * 8.0 * 1024) / ParititionStats.row_count) AS [Index Size Per Row (Bytes)]
,1 AS [Appears In Usage Stats Table]
FROM sys.dm_db_index_usage_stats UsageStats
INNER JOIN sys.indexes Indexes
ON Indexes.index_id = UsageStats.index_id AND Indexes.object_id = UsageStats.object_id
INNER JOIN SYS.databases Databases
ON Databases.database_id = UsageStats.database_id
INNER JOIN sys.dm_db_index_physical_stats (DB_ID(),NULL,NULL,NULL,NULL) AS PhysicalStats
ON PhysicalStats.index_id = UsageStats.Index_id AND PhysicalStats.object_id = UsageStats.object_id
INNER JOIN SYS.dm_db_partition_stats ParititionStats
ON ParititionStats.index_id = UsageStats.index_id AND ParititionStats.object_id = UsageStats.object_id
WHERE
UsageStats.user_scans <= 10
AND UsageStats.user_seeks <= 10
AND UsageStats.user_lookups <= 10
-- exclude heap indexes
AND Indexes.name IS NOT NULL
-- ignore indexes with less than a certain number of pages of memory
AND PhysicalStats.page_count > #MinimumPageCount
-- Exclude primary keys, which should not be removed
AND Indexes.is_primary_key = 0
-- ignore unique constraints - those shouldn't be removed
AND Indexes.is_unique_constraint = 0
AND Indexes.is_unique = 0
UNION ALL
(
-- GET UNUSED INDEXES THAT DO **NOT** APPEAR IN THE INDEX USAGE STATS TABLE
SELECT
Databases.Name AS [Database]
,Objects.NAME AS [Table]
,Indexes.NAME AS [Index]
,PhysicalStats.page_count as [Page Count]
,CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)]
,CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
,-1 AS [Row Count]
,-1 AS [Index Size Per Row (Bytes)]
,0 AS [Appears In Usage Stats Table]
FROM SYS.INDEXES Indexes
INNER JOIN SYS.OBJECTS Objects
ON Indexes.OBJECT_ID = Objects.OBJECT_ID
LEFT JOIN sys.dm_db_index_physical_stats(#dbid, null, null, null, null) PhysicalStats
ON PhysicalStats.object_id = Indexes.object_id AND PhysicalStats.index_id = indexes.index_id
INNER JOIN sys.databases Databases
ON Databases.database_id = PhysicalStats.database_id
WHERE
Objects.type = 'U' -- Is User Table
-- exclude heap indexes
AND Indexes.name IS NOT NULL
-- exclude empty tables
AND PhysicalStats.page_count <> 0
-- Exclude primary keys, which should not be removed
AND Indexes.is_primary_key = 0
-- ignore unique constraints - those shouldn't be removed
AND Indexes.is_unique_constraint = 0
AND Indexes.is_unique = 0
AND Indexes.INDEX_ID NOT IN
(
SELECT UsageStats.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS UsageStats
WHERE
UsageStats.OBJECT_ID = Indexes.OBJECT_ID
AND Indexes.INDEX_ID = UsageStats.INDEX_ID
AND DATABASE_ID = #dbid
)
)
ORDER BY [Table] ASC, [Total Index Size (MB)] DESC

You should take a look at Brent Ozars sp_BlitzIndex. This stored procedure lists among others unsused indexes. It lists the disorders in a report. For each entry an URL is given which explains what to look for and how to handle the issue.

Related

Sum of index space :: T-SQL vs GUI

I have recently rebuilt all indexes on a table and the GUI from SSMS tells me that the index space is 7 555.711 MB.
But if I look at the actual index space through T-SQL I have a different result:
SELECT tn.[name] AS [Table name], ix.[name] AS [Index name],
SUM(sz.[used_page_count]) * 8 * 1024/(1024 * 1024) AS [Index size (MB)]
FROM sys.dm_db_partition_stats AS sz
INNER JOIN sys.indexes AS ix ON sz.[object_id] = ix.[object_id]
AND sz.[index_id] = ix.[index_id]
INNER JOIN sys.tables tn ON tn.OBJECT_ID = ix.object_id
where tn.[name] = 'MyTableName'
GROUP BY tn.[name], ix.[name]
ORDER BY tn.[name]
Why?

Thank you #Jeroen Mostert,
In fact the cluster index is part of the table data.
With Index Space SSMS is calculating only the non-clustered index which are:
1452 + 1590 + 1590 + 1452 + 1452 = 7536MB
which is very close to 7555MB

SQL Server number of pages in a table

I'm trying to calculate how many pages a single table has. For this i end up with different queries
SELECT SUM(si.dpages)
FROM sys.sysindexes si
INNER JOIN sys.tables st on si.id = st.object_id
WHERE st.name = 'job'
SELECT SUM(page_count)
FROM sys.dm_db_index_physical_stats (NULL, NULL, NULL, NULL, NULL) s
WHERE
s.database_id = db_id()
and s.object_id = OBJECT_ID('job')
SELECT SUM(used_page_count)
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('job');
dbcc ind
(
'wellnew'
,'dbo.job'
,-1
);
And these are the results:
16681
16681
16771
(16771 row(s) affected)
Now, which one is right? Or what's the correct way to get that information?
There is quite a similar question here it's even using a query similar tom mine, but i have 4 queries that in theory are all correct but they return different results
Thanks in advance
Henrry

How to get the size of some rows in a table

First of all sorry for my english, I know it is not perfect
I have the table in my DB in whtch I have about 40 columns, and I need to get size of some records in this table to make a report.
I tried to do something with dbcc SHOWCONTIG, but I didn't get satisfactory results.
Have you any advice/ideas to solve my problem?
Thanks for help!

Would suggest writing a query to calculate based on contents using the reference info here: http://www.connectionstrings.com/sql-server-data-types-reference/
E.g. let's say your table has 2 columns: A varchar(20), and a datetime. Then your query would look something like this:
SELECT RowID, -- Some identifier for the row, e.g. primary key
LEN(varcharcolumn) + 2 -- Length of varchar is L + 2 bytes
+ 8 -- Length of datetime column is 8 bytes
) AS TotalBytes
-- Add WHERE clause to restrict rows in report

Assuming you want data type lengths try selecting from INFORMATION_SCHEMA to get the structure of a table.
Something like
SELECT * FROM INFORMATION_SCHEMA.COLUMNS;
should suffice. The column CHARACTER_MAXIMUM_LENGTH gives you string lengths.

I'm not sure I exactly understand your need, but may be DataLength function will help you:
select DataLength(ColumnOfInterest)
from TableName
where <filter condition>
for a set of columns:
select DataLength(ColumnOfInterest1) + DataLength(ColumnOfInterest2) + etc.
from TableName
where <filter condition>

In sql server it is easy to get the size of a table. So another approach is to store the records you would like filtered in a separate table, and then just get the size of the table.
Below is an example:
--CREATE A TABLE TO STORE YOUR FILTERED RECORDS
CREATE TABLE dbo.resulttable
(
id INT,
employee VARCHAR(32)
)
GO
--INSERT FILTERED RECORDS IN NEW TABLE
INSERT INTO dbo.resulttable (id,employee)
SELECT id,employee FROM dbo.originaltable WHERE id > 100
GO
--GET SIZE OF TABLE WHICH HAS THE FILTERED RECORDS WITH HOWEVER MANY COLUMNS
SELECT
t.name AS TableName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.object_id = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME = 'resulttable'
GROUP BY
t.Name, p.Rows
ORDER BY
t.Name
GO

Why would SQL Server choose Clustered Index Scan over Non-Clustered one?

In one of the tables I am querying, a clustered index was created over a key that's not a primary key. (I don't know why.)
However, there's a non-clustered index for the primary key for this table.
In the execution plan, SQL is choosing the clustered index, rather than the non-clustered index for the primary key which I am using in my query.
Is there a reason why SQL would do this? How can I force SQL to choose the non-clustered index instead?
Appending more detail:
The table has many fields and the query contains many joins. Let me abstract it a bit.
The table definition looks like this:
SlowTable
[SlowTable_id] [int] IDENTITY(200000000,1) NOT NULL,
[fk1Field] [int] NULL,
[fk2Field] [int] NULL,
[other1Field] [varchar] NULL,
etc. etc...
and then the indices for this table are:
fk1Field (Clustered)
SlowTable_id (Non-Unique, Non-Clustered)
fk2Field (Non-Unique, Non-Clustered)
... and 14 other Non-Unique, Non-Clustered indices on other fields
Presumably there are lots more queries made against fk1Field which is why they selected this as the basis for the Clustered index.
The query I have uses a view:
SELECT
[field list]
FROM
SourceTable1 S1
INNER JOIN SourceTable2 S2
ON S2.S2_id = S1.S2_id
INNER JOIN SourceTable3 S3
ON S3.S3_id = S2.S3_id
INNER JOIN SlowTable ST
ON ST.SlowTable_id = S1.SlowTable_id
INNER JOIN [many other tables, around 7 more...]
The execution plan is quite big, with the nodes concerned say
Hash Match
(Inner Join)
Cost: 9%
with a thick arrow pointing to
Clustered Index Scan (Clustered)
SlowTable.fk1Field
Cost: 77%
I hope this provides enough detail on the issue.
Thanks!
ADDENDUM 2:
Correction to my previous post. The view doesn't have a where clause. It is just a series of inner joins. The execution plan was taken from an Insert statement that uses the View (listed as SLOW_VIEW) in a complex query that looks like the following:
(What this stored procedure does is to do a proportional split of the total amount of some records, based on weights, computed as percentage against a total. This mimics distributing a value from, say, one account, to other accounts.)
INSERT INTO dbo.WDTD(
FieldA,
FieldB,
GWB_id,
C_id,
FieldC,
PG_id,
FieldD,
FieldE,
O_id,
FieldF,
FieldG,
FieldH,
FieldI,
GWBIH_id,
T_id,
JO_id,
PC_id,
PP_id,
FieldJ,
FieldK,
FieldL,
FieldM,
FieldN,
FieldO,
FieldP,
FieldQ,
FieldS)
SELECT DISTINCT
#FieldA FieldA,
GETDATE() FieldB,
#Parameter1 GWB_id,
GWBIH.C_id C_id,
P.FieldT FieldC,
P.PG_id PG_id,
PAM.FieldD FieldD,
PP.FieldU FieldE,
GWBIH.O_id O_id,
CO.FieldF FieldF,
CO.FieldG FieldG,
PSAM.FieldH FieldH,
PSAM.FieldI FieldI,
SOURCE.GWBIH_id GWBIH_id,
' ' T_id,
GWBIH.JO_id JO_id,
SOURCE.PC_id PC_id,
GWB.PP_id,
SOURCE.FieldJ FieldJ,
1 FieldK,
ROUND((SUM(GWBIH.Total) / AGG.Total) * SOURCE.Total, 2) FieldL,
ROUND((SUM(GWBIH.Total) / AGG.Total) * SOURCE.Total, 2) FieldM,
0 FieldN,
' ' FieldO,
ESGM.FieldP_flag FieldP,
SOURCE.FieldQ FieldQ,
'[UNPROCESSED]'
FROM
dbo.Table1 GWBIH
INNER JOIN dbo.Table2 GWBPH
ON GWBPH.GWBP_id = GWBIH.GWBP_id
INNER JOIN dbo.Table3 GWB
ON GWB.GWB_id = GWBPH.GWB_id
INNER JOIN dbo.Table4 P
ON P.P_id = GWBPH.P_id
INNER JOIN dbo.Table5 ESGM
ON ESGM.ET_id = P.ET_id
INNER JOIN dbo.Table6 PAM
ON PAM.PG_id = P.PG_id
INNER JOIN dbo.Table7 O
ON O.dboffcode = GWBIH.O_id
INNER JOIN dbo.Table8 CO
ON
CO.Country_id = O.Country_id
AND CO.Brand_id = O.Brand_id
INNER JOIN dbo.Table9 PSAM
ON PSAM.Office_id = GWBIH.O_id
INNER JOIN dbo.Table10 PCM
ON PCM.PC_id = GWBIH.PC_id
INNER JOIN dbo.Table11 PC
ON PC.PC_id = GWBIH.PC_id
INNER JOIN dbo.Table12 PP
ON PP.PP_id = GWB.PP_id
-- THIS IS THE VIEW THAT CONTAINS THE CLUSTERED INDEX SCAN
INNER JOIN dbo.SLOW_VIEW GL
ON GL.JO_id = GWBIH.JO_id
INNER JOIN (
SELECT
GWBIH.C_id C_id,
GWBPH.GWB_id,
SUM(GWBIH.Total) Total
FROM
dbo.Table1 GWBIH
INNER JOIN dbo.Table2 GWBPH
ON GWBPH.GWBP_id = GWBIH.GWBP_id
INNER JOIN dbo.Table10 PCM
ON PCM.PC_id = GWBIH.PC_id
WHERE
PCM.Split_flag = 0
AND GWBIH.JO_id IS NOT NULL
GROUP BY
GWBIH.C_id,
GWBPH.GWB_id
) AGG
ON AGG.C_id = GWBIH.C_id
AND AGG.GWB_id = GWBPH.GWB_id
INNER JOIN (
SELECT
GWBIH.GWBIH_id GWBIH_id,
GWBIH.C_id C_id,
GWBIH.FieldQ FieldQ,
GWBP.GWB_id GWB_id,
PCM.PC_id PC_id,
CASE
WHEN WT.FieldS IS NOT NULL
THEN WT.FieldS
WHEN WT.FieldS IS NULL
THEN PCMS.FieldT
END FieldJ,
SUM(GWBIH.Total) Total
FROM
dbo.Table1 GWBIH
INNER JOIN dbo.Table2 GWBP
ON GWBP.GWBP_id = GWBIH.GWBP_id
INNER JOIN dbo.Table4 P
ON P.P_id = GWBP.P_id
INNER JOIN dbo.Table10 PCM
ON PCM.PC_id = GWBIH.PC_id
INNER JOIN dbo.Table11 PCMS
ON PCMS.PC_id = PCM.PC_id
LEFT JOIN dbo.WT WT
ON WT.ET_id = P.ET_id
AND WT.PC_id = GWBIH.PC_id
WHERE
PCM.Split_flag = 1
GROUP BY
GWBIH.GWBI_id,
GWBIH.C_id,
GWBIH.FieldQ,
GWBP.GWB_id,
WT.FieldS,
PCM.PC_id,
PCMS.ImportCode
) SOURCE
ON SOURCE.C_id = GWBIH.C_id
AND SOURCE.GWB_id = GWBPH.GWB_id
WHERE
PCM.Split_flag = 0
AND AGG.Total > 0
AND GWBPH.GWB_id = #Parameter1
AND NOT EXISTS (
SELECT *
FROM dbo.WDTD
WHERE
TD.C_id = GWBIH.C_id
AND TD.FieldA = GWBPH.GWB_id
AND TD.JO_id = GWBIH.JO_id
AND TD.PC_id = SOURCE.PC_id
AND TD.GWBIH_id = ' ')
GROUP BY
GWBIH.C_id,
P.FieldT,
GWBIH.JO_id,
GWBIH.O_id,
GWBPH.GWB_id,
P.PG_id,
PAM.FieldD,
PP.FieldU,
GWBIH.O_id,
CO.FieldF,
CO.FieldG,
PSAM.FieldH,
PSAM.FieldI,
GWBIH.JO_id,
SOURCE.PC_id,
GWB.PP_id,
SOURCE.FieldJ,
ESGM.FieldP_flag,
SOURCE.GWBIH_id,
SOURCE.FieldQ,
AGG.Total,
SOURCE.Total
ADDENDUM 3: When doing an execution plan on the select statement of the view, I see this:
Hash Match <==== Bitmap <------ etc...
(Inner Join) (Bitmap Create)
Cost: 0% Cost: 0%
^
|
|
Parallelism Clustered Index Scan (Clustered)
(Repartition Streams) <==== Slow_Table.fk1Field
Cost: 1% Cost: 98%
ADDENDUM 4: I think I found the problem. The Clustered Index Scan isn't referring to my clause that references the Primary Key, but rather another clause that needs a field that is, in some way, related to fk1Field above.

Most likely one of:
too many rows to make the index effective
index doesn't fit the ON/WHERE conditions
index isn't covering and SQL Server avoids a key lookup
Edit, after update:
Your indexes are useless because they are all single column indexes, so it does a clustered index scan.
You need an index that matches your ON, WHERE, GROUP BY conditions with INCLUDES for your SELECT list.

If the query you're executing isn't selecting a small subset of the records, SQL Server may well choose to ignore any "otherwise useful" non-clustered index and just scan through the clustered index (in this instance, most likely all rows in the table) - the logic being that the amount of I/O required to perform the query vs. the non-clustered index outweights that required for a full scan.
If you can post the schema of your table(s) + a sample query, I'm sure we can offer more information.

Ideally you shouldn't be telling SQL Server to do either or, it can pick the best, if you give it a good query. Query hints was created to steer the engine a bit, but you shouldn't have to use this just yet.
Sometimes it is beneficial to cluster the table differently that the primary key, is rare, but it can be useful (the clustering controls the data layout while the primary key ensures correctness).
I can tell you exactly why SQL Server picks the clustered index if you show me your query and schema otherwise I'd only be guessing on likely causes and execution plan is helpful in these cases.
For a non-clustered index to be considered it has to be meaningful to the query and if you non-clustered index doesn't cover your query, there's no guaratee that it will be used at all.

A clustered index scan is essentially a table scan (on a table that happens to have a clustered index). You really should post your statement to get a better answer. Your where clause may not be searchable (see sargs), or if you are selecting many records, sql server may scan the table rather than use the index and later have to look up related columns.

Optimising this query. Relevant for DBA's working on a social network/community type website

I suppose this is quite a common SP present in socialnetworks and community type websites.
I have this SP that returns all of a user's friends on their 'friends' page order by those currently online, then alphabetically. It's taking quite a while to load and I am looking to speed it up.
I remember reading somewhere on SO that breaking up multiple joins into smaller result sets might speed it up. I haven't tried this yet but I am curious to see what other recommendations SO might have on this procedure.
DECLARE #userID INT -- This variable is parsed in
DECLARE #lastActivityMinutes INT
SET #lastActivitytMinutes = '15'
SELECT
Active = CASE WHEN DATEDIFF("n", b.LastActivityDate ,GETDATE()) < #lastActivityMinutes THEN 1 ELSE 0 END,
a.DisplayName, a.ImageFile, a.UserId, b.LastActivityDate
FROM
Profile AS a
INNER JOIN aspnet_Users as b on b.userId = a.UserId
LEFT JOIN Friend AS x ON x.UserID = a.UserID
LEFT JOIN Friend AS z ON z.FriendID = a.UserID
WHERE ((x.FriendId = #userID AND x.status = 1) -- Status = 1 means friendship accepted
OR (z.UserID = #userID AND z.Status = 1))
GROUP BY a.userID, a.DisplayName, a.ImageFile, a.UserId, b.LastActivityDate
ORDER BY Active DESC, DisplayName ASC
I am not sure how to clip in my execution plan but the main bottle neck seems to be occurring on a MERGE JOIN (Right Outer Join) that's costing me 29%. At various stages, Parallelism is also costing 9%, 6%, 5% and 9% for a total of 29% as well.
My initial thoughts are to first return the JOINED results from the Profile and aspnet tables with a CTE and then do LEFT JOINS to the Friends table.

You are joining Friend twice, using a LEFT JOIN, then you are removing the NULL's returned by the LEFT JOIN by WHERE condition, then using GROUP BY to get rid on distincts.
This is not the best query possible.
Why don't you just use this:
SELECT Active = CASE WHEN DATEDIFF("n", b.LastActivityDate ,GETDATE()) < #lastActivityMinutes THEN 1 ELSE 0 END,
a.DisplayName, a.ImageFile, a.UserId, b.LastActivityDate
FROM (
SELECT FriendID
FROM Friends
WHERE UserID = #UserId
AND status = 1
UNION
SELECT UserID
FROM Friends
WHERE FriendID = #UserId
AND status = 1
) x
INNER JOIN
Profile AS a
ON a.UserID = x.FriendID
INNER JOIN
aspnet_Users as b
ON b.userId = a.UserId
ORDER BY
Active DESC, DisplayName ASC

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How do I monitor and find unused indexes in SQL database - sql-server

You should take a look at Brent Ozars sp_BlitzIndex. This stored procedure lists among others unsused indexes. It lists the disorders in a report. For each entry an URL is given which explains what to look for and how to handle the issue.

Related

Sum of index space :: T-SQL vs GUI

SQL Server number of pages in a table

How to get the size of some rows in a table

Why would SQL Server choose Clustered Index Scan over Non-Clustered one?

Optimising this query. Relevant for DBA's working on a social network/community type website

Categories

Resources