Sum of index space :: T-SQL vs GUI

I recently rebuilt all indexes on a table, and the SSMS GUI reports the index space as 7 555.711 MB.
But when I compute the index space through T-SQL, I get a different result:
SELECT tn.[name] AS [Table name], ix.[name] AS [Index name],
SUM(sz.[used_page_count]) * 8 * 1024/(1024 * 1024) AS [Index size (MB)]
FROM sys.dm_db_partition_stats AS sz
INNER JOIN sys.indexes AS ix ON sz.[object_id] = ix.[object_id]
AND sz.[index_id] = ix.[index_id]
INNER JOIN sys.tables tn ON tn.OBJECT_ID = ix.object_id
where tn.[name] = 'MyTableName'
GROUP BY tn.[name], ix.[name]
ORDER BY tn.[name]
Why?

Thank you @Jeroen Mostert.
In fact, the clustered index is part of the table data.
For Index Space, SSMS counts only the non-clustered indexes, which are:
1452 + 1590 + 1590 + 1452 + 1452 = 7536 MB
which is very close to 7555 MB.
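To check this against the DMV directly, here is a minimal sketch that sums only the nonclustered indexes. It assumes the SSMS figure corresponds to index_id > 1, which is what the numbers above suggest rather than something documented here; 'MyTableName' is the placeholder from the question.

```sql
-- Sketch: sum used pages for nonclustered indexes only (index_id > 1).
-- Assumption: SSMS "Index space" excludes the heap/clustered index.
SELECT tn.[name] AS [Table name],
       SUM(sz.[used_page_count]) * 8 / 1024.0 AS [Nonclustered index size (MB)]
FROM sys.dm_db_partition_stats AS sz
INNER JOIN sys.tables AS tn ON tn.[object_id] = sz.[object_id]
WHERE tn.[name] = 'MyTableName'
  AND sz.[index_id] > 1   -- 0 = heap, 1 = clustered index (the table data itself)
GROUP BY tn.[name];
```

Dividing by 1024.0 instead of using integer arithmetic keeps the fractional megabytes, which makes the comparison with the GUI value easier.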

Related

Can't get stored procedure to change index scan to index seek

Background: I'm running on Sql Server 2014 (12.0.2000.8) on Azure...
I found a handy script the other day which shows the queries/stored procedures that are "touching" indexes. I've been looking for this for a while because I have some indexes that were performing very poorly, but I couldn't find where they were called.
Now that I have that information, I've been trying to rework the procs that are touching the index in question.
When looking at the execution plan of my query, it says it's doing a scan, which is obviously not optimal.
Hovering over the index shows the output list of the join, but no predicates.
I went ahead and created an index with the exact fields that are in that output list.
Here's the query that is being run:
declare @season int = 2017
select s.SchoolId,
s.Name [SchoolName],
s.Conference,
tr.DualRank [Rank],
convert(varchar(2), tr.DualWins) + ' - ' + convert(varchar(2), tr.DualLosses) [Record],
tr.RankingDate,
case when tr.WeekNumber = 0 then null
else
(select trx.DualRank from dbo.TeamRankings trx where trx.Season = tr.Season and trx.WeekNumber = (tr.WeekNumber - 1) and trx.SchoolId = tr.SchoolId)
- tr.DualRank
end [Trend],
(select trx.DualRank from dbo.TeamRankings trx where trx.Season = tr.Season and trx.WeekNumber = (tr.WeekNumber - 1) and trx.SchoolId = tr.SchoolId) [PreviousWeek]
from dbo.TeamRankings tr
join dbo.School s on s.SchoolId = tr.SchoolId
where tr.Season = @season
and tr.IsCurrent = 1
order by tr.DualRank
The only join in this list that has a scan instead of a seek is the one to the School table. It's joining on SchoolId, and then in the select portion it's outputting Name and Conference. Seems pretty straightforward.
In my first try, I went ahead and created my index like this:
create nonclustered index idx_NC_School_SchoolId_incs on dbo.School (SchoolId asc) include (Name, Conference)
but that still resulted in a scan. My second attempt was to do it like this:
create nonclustered index idx_NC_School_SchoolId_Name_Conference on dbo.School (SchoolId asc, Name asc, Conference asc)
But that STILL is doing a scan while utilizing the index that I created.
What else should I be looking at to try to get this query to do a seek instead of a scan?
For more background info, here's a subset of the table definition:
dbo.School
SchoolId int identity(1,1) primary key,
Name varchar(100) not null,
Conference varchar(100) not null -- will soon change this to a lookup table
......
I know someone will ask, but I can't figure out how to do it; how do I attach my execution plan to the question?
Here's a link to the page where the data is displayed: http://www.wrestlestat.com/rankings/dual/live

Index scans are not always a bad thing, especially when you have a very small table.
But something that can definitely improve the performance of your query is moving these sub-queries from the select clause to the from clause and using a join.
Something like:
declare @season int = 2017
select s.SchoolId,
s.Name [SchoolName],
s.Conference,
tr.DualRank [Rank],
convert(varchar(2), tr.DualWins) + ' - ' + convert(varchar(2), tr.DualLosses) [Record],
tr.RankingDate,
CASE WHEN tr.WeekNumber = 0 then null
ELSE trx.DualRank - tr.DualRank end [Trend],
trx.DualRank [PreviousWeek]
from dbo.TeamRankings tr
Inner join dbo.School s on s.SchoolId = tr.SchoolId
Left join dbo.TeamRankings trx ON trx.Season = tr.Season
and trx.WeekNumber = (tr.WeekNumber - 1)
and trx.SchoolId = tr.SchoolId
where tr.Season = @season
and tr.IsCurrent = 1
order by tr.DualRank
When a sub-query sits in the select clause, it is logically evaluated for each row returned by the outer query. If you move it to the from clause and use a join, the lookup is expressed once and its result set is joined with the result sets coming from the other joins, which is usually more efficient and cleaner.

You can use windowing functions such as LAG and LEAD to get around self-joining to the table.
It can lead to simpler execution plans.
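declare @season int = 2017
select
s.SchoolId,
s.Name [SchoolName],
s.Conference,
tr.DualRank [Rank],
convert(varchar(2), tr.DualWins) + ' - ' + convert(varchar(2), tr.DualLosses) [Record],
tr.RankingDate,
-- Trend is the previous week's rank minus the current rank, as in the question.
-- LAG without a default returns NULL when there is no earlier row.
CASE WHEN tr.WeekNumber = 0 THEN NULL ELSE LAG(tr.DualRank) OVER(Partition BY tr.Season,tr.SchoolId ORDER BY tr.WeekNumber) - tr.DualRank END AS [Trend],
LAG(tr.DualRank) OVER(Partition BY tr.Season,tr.SchoolId ORDER BY tr.WeekNumber) AS [PreviousWeek]
from
dbo.TeamRankings tr
join dbo.School s on s.SchoolId = tr.SchoolId
-- Caution: LAG only sees rows that pass the WHERE clause. If the previous week's row
-- has IsCurrent = 0, compute LAG in a derived table or CTE before filtering on IsCurrent.
where
tr.Season = @season
and
tr.IsCurrent = 1
order by
tr.DualRank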
When you use
trx.WeekNumber = (tr.WeekNumber - 1)
you're comparing against a computed value of tr.WeekNumber rather than the value stored in the index, so SQL Server may perform a scan instead of a seek.

How to identify duplicate index in sql server

I have a table with around 100 columns with total rows around 73 million.
e.g. Table(Col1, Col2, Col3, Col4,....Col100)
Composite Clustered Index(Col1, Col2, Col3, Col4, Col5)
Composite Non Clustered Index(Col25, Col1, Col2, Col3, Col4, Col5)
Can we say the nonclustered index is a duplicate index, and that we can tune performance and storage by creating the NC index only on Col25, with it working the same way?

You can run this script to determine index usage for your database:
select
object_name(s.object_id) as table_name,
i.type_desc as index_type_desc,
case when s.index_id=0 then '' else i.name end as index_name,
s.user_seeks + s.user_scans + s.user_lookups as total_reads,
case when (s.user_seeks + s.user_scans + s.user_lookups)=0 then 0
else (convert(float,s.user_scans)) / (s.user_seeks + s.user_scans + s.user_lookups) * 100.00 end as scan_percentage,
s.user_updates as total_writes,
case when (s.user_seeks + s.user_scans + s.user_lookups)=0 then 0 else
(convert(float,s.user_updates)) / (s.user_seeks + s.user_scans + s.user_lookups) * 100 end as writes_percentage,
ios.lock_count, ios.lock_wait_in_ms, ios.latch_wait_in_ms, ios.io_latch_wait_in_ms, ios.index_lock_promotion_count,
ph.avg_fragmentation_in_percent, ph.page_count
from
sys.dm_db_index_usage_stats s
left join sys.indexes i on s.object_id=i.object_id and s.index_id=i.index_id
left join (
select
database_id, object_id, index_id,
row_lock_count + page_lock_count as lock_count,
row_lock_wait_in_ms + page_lock_wait_in_ms as lock_wait_in_ms,
page_latch_wait_in_ms + tree_page_latch_wait_in_ms as latch_wait_in_ms,
page_io_latch_wait_in_ms + tree_page_io_latch_wait_in_ms as io_latch_wait_in_ms,
index_lock_promotion_count
from sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL)
where object_id>100 and database_id=DB_ID()
) ios on s.database_id=ios.database_id and s.object_id=ios.object_id and s.index_id=ios.index_id
left join(
select
p.database_id,
p.object_id,
p.index_id,
p.avg_fragmentation_in_percent,
p.page_count
from
sys.dm_db_index_physical_stats(db_id(),null,null,null,'limited') p
where
p.avg_fragmentation_in_percent > 0
) ph on ph.database_id=s.database_id and ph.object_id=s.object_id and ph.index_id=s.index_id
where
s.database_id=DB_ID() and s.object_id>100 OPTION (RECOMPILE)
The columns are:
table_name
index_type_desc: clustered, heap or nonclustered
index_name
total_reads
scan_percentage: the percentage of reads that were scans (a high value may indicate a poor query plan)
total_writes
writes_percentage: writes relative to reads; values above 100% flag indexes that are updated more often than they are read
lock & latch columns: show multi-user latency (locks) and single-user latency (latches)
avg_fragmentation_in_percent & page_count: describe index fragmentation (index health)
Run it to see how many times SQL Server uses each of your indexes.
Beyond answering your question, it also shows the latency and maintenance cost of your indexes. With 73 million records, I think these values matter.
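To compare index definitions side by side, a sketch like the following lists the key columns of each index on one table, in key order. The table name dbo.MyTableName is a placeholder, and FOR XML PATH is used for string concatenation only so that it also runs on older versions.

```sql
-- Sketch: list key columns of every index on one table, in key order.
-- dbo.MyTableName is a placeholder; replace it with your own table.
SELECT i.name AS index_name,
       i.type_desc,
       STUFF((SELECT ', ' + c.name
              FROM sys.index_columns AS ic
              INNER JOIN sys.columns AS c
                  ON c.object_id = ic.object_id
                 AND c.column_id = ic.column_id
              WHERE ic.object_id = i.object_id
                AND ic.index_id = i.index_id
                AND ic.is_included_column = 0   -- key columns only
              ORDER BY ic.key_ordinal
              FOR XML PATH('')), 1, 2, '') AS key_columns
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.MyTableName')
  AND i.index_id > 0                            -- skip the heap row
ORDER BY i.index_id;
```

Indexes whose key_columns start with the same columns are candidates for a closer look, together with the usage numbers from the script above.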

SQL Server - how to identify if table is heap or B-tree

SQL Server - Is there a system table for identifying if my table is heap or b-tree?

Yes, the catalog view sys.partitions holds this information. The field index_id will tell you if a table is heap (index_id = 0) or b-tree (index_id > 0).
select
o.name,
o.object_id,
case
when p.index_id = 0 then 'Heap'
when p.index_id = 1 then 'Clustered Index/b-tree'
when p.index_id > 1 then 'Non-clustered Index/b-tree'
end as 'Type'
from sys.objects o
inner join sys.partitions p on p.object_id = o.object_id
where name = 'YourTableName'
From the documentation - Table and Index Organization:
ID of the index within the object to which this partition belongs.
0 = heap
1 = clustered index
2 or greater = nonclustered
Heaps are tables that have no clustered index. Nonclustered indexes
have a B-tree index structure similar to the one in clustered indexes.

Every table that has no clustered index is a heap.
You can check whether a table has a clustered index to determine whether it is a heap.
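That check can be sketched against sys.indexes, which holds one row with index_id 0 for a heap or index_id 1 for a clustered index. The query below is an illustration, not part of the original answers.

```sql
-- Sketch: show, for every user table, whether it is a heap or has a clustered index.
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name                   AS table_name,
       i.type_desc              -- e.g. HEAP or CLUSTERED
FROM sys.tables AS t
INNER JOIN sys.indexes AS i
    ON i.object_id = t.object_id
WHERE i.index_id IN (0, 1)      -- 0 = heap, 1 = clustered index
ORDER BY schema_name, table_name;
```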

How to get the size of some rows in a table

First of all, sorry for my English; I know it is not perfect.
I have a table in my DB with about 40 columns, and I need to get the size of some records in this table to make a report.
I tried to do something with dbcc SHOWCONTIG, but I didn't get satisfactory results.
Have you any advice/ideas to solve my problem?
Thanks for help!

Would suggest writing a query to calculate based on contents using the reference info here: http://www.connectionstrings.com/sql-server-data-types-reference/
E.g. let's say your table has 2 columns: A varchar(20), and a datetime. Then your query would look something like this:
SELECT RowID, -- Some identifier for the row, e.g. primary key
(LEN(varcharcolumn) + 2 -- Length of varchar is L + 2 bytes
+ 8 -- Length of datetime column is 8 bytes
) AS TotalBytes
FROM TableName -- placeholder table name
-- Add WHERE clause to restrict rows in report

Assuming you want data type lengths try selecting from INFORMATION_SCHEMA to get the structure of a table.
Something like
SELECT * FROM INFORMATION_SCHEMA.COLUMNS;
should suffice. The column CHARACTER_MAXIMUM_LENGTH gives you string lengths.
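A narrower version of that query might look like the sketch below; the schema and table names are placeholders. Note that it returns declared maximum lengths, not the actual size of any row.

```sql
-- Sketch: declared column lengths for one table (names are placeholders).
SELECT COLUMN_NAME,
       DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH   -- NULL for non-character types, -1 for (max) types
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'MyTableName'
ORDER BY ORDINAL_POSITION;
```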

I'm not sure I understand your need exactly, but maybe the DATALENGTH function will help you:
select DataLength(ColumnOfInterest)
from TableName
where <filter condition>
for a set of columns:
select DataLength(ColumnOfInterest1) + DataLength(ColumnOfInterest2) + etc.
from TableName
where <filter condition>
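One detail to watch: DATALENGTH returns NULL for a NULL value, so a plain sum becomes NULL if any column is NULL. A minimal self-contained sketch, using a made-up table variable @t purely for illustration:

```sql
-- Sketch: per-row data size with NULL-safe addition.
-- @t and its columns are hypothetical, for illustration only.
DECLARE @t TABLE (id int, a varchar(20), d datetime);

INSERT INTO @t (id, a, d)
VALUES (1, 'abc', GETDATE()),
       (2, NULL,  GETDATE());

SELECT id,
       ISNULL(DATALENGTH(a), 0)     -- bytes stored in the varchar value
     + ISNULL(DATALENGTH(d), 0)     -- datetime is 8 bytes
       AS data_bytes
FROM @t;
```

This counts only the column data; it does not include row headers or other per-row storage overhead.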

In SQL Server it is easy to get the size of a table. So another approach is to store the records you want to measure in a separate table, and then just get the size of that table.
Below is an example:
--CREATE A TABLE TO STORE YOUR FILTERED RECORDS
CREATE TABLE dbo.resulttable
(
id INT,
employee VARCHAR(32)
)
GO
--INSERT FILTERED RECORDS IN NEW TABLE
INSERT INTO dbo.resulttable (id,employee)
SELECT id,employee FROM dbo.originaltable WHERE id > 100
GO
--GET SIZE OF TABLE WHICH HAS THE FILTERED RECORDS WITH HOWEVER MANY COLUMNS
SELECT
t.name AS TableName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.object_id = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME = 'resulttable'
GROUP BY
t.Name, p.Rows
ORDER BY
t.Name
GO

How do I monitor and find unused indexes in SQL database

I would like to monitor index usage for an SQL database in order to find unused indexes and then drop them. How can I monitor index usage most efficiently? Which scripts could be useful?
I'm aware of this question about identifying unused objects, but this applies only to the current run of the SQL server. I would like to monitor index usage over a period of time.

This is an interesting question. I've been working on this same question over the past week. There is a dynamic management view called sys.dm_db_index_usage_stats that contains usage statistics on indexes.
Indexes That Never Appear in the Usage Statistics Table
However, many indexes never appear in this table at all. The query David Andres posted lists all indexes for this case. I've updated it a little bit to ignore primary keys, which probably shouldn't be deleted, even if they aren't ever used. I also joined on the dm_db_index_physical_stats table to get other information, including Page Count, Total Index Size, and the Fragmentation Percentage. An interesting note is that indexes that are returned by this query don't seem to show up in the SQL Report for Index Usage Statistics.
DECLARE @dbid INT
SELECT @dbid = DB_ID(DB_NAME())
SELECT Databases.Name AS [Database],
Objects.NAME AS [Table],
Indexes.NAME AS [Index],
Indexes.INDEX_ID,
PhysicalStats.page_count as [Page Count],
CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)],
CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
FROM SYS.INDEXES Indexes
INNER JOIN SYS.OBJECTS Objects ON Indexes.OBJECT_ID = Objects.OBJECT_ID
LEFT JOIN sys.dm_db_index_physical_stats(@dbid, null, null, null, null) PhysicalStats
on PhysicalStats.object_id = Indexes.object_id and PhysicalStats.index_id = indexes.index_id
INNER JOIN sys.databases Databases
ON Databases.database_id = PhysicalStats.database_id
WHERE OBJECTPROPERTY(Objects.OBJECT_ID,'IsUserTable') = 1
AND Indexes.type = 2 -- Nonclustered indexes
AND Indexes.INDEX_ID NOT IN (
SELECT UsageStats.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS UsageStats
WHERE UsageStats.OBJECT_ID = Indexes.OBJECT_ID
AND Indexes.INDEX_ID = UsageStats.INDEX_ID
AND DATABASE_ID = @dbid)
ORDER BY PhysicalStats.page_count DESC,
Objects.NAME,
Indexes.INDEX_ID,
Indexes.NAME ASC
Indexes That Do Appear in the Usage Statistics Table, But Are Never Used
There are other indexes that do appear in the dm_db_index_usage_stats table, but which have never been used for user seeks, scans, or lookups. This query will identify indexes that fall into this category. Incidentally, unlike the indexes returned from the other query, the indexes returned in this query can be verified on the SQL Report by Index Usage Statistics.
I added a Minimum Page Count that allows me to initially focus on and remove unused indexes that are taking up a lot of storage.
DECLARE @MinimumPageCount int
SET @MinimumPageCount = 500
SELECT Databases.name AS [Database],
Indexes.name AS [Index],
Objects.Name AS [Table],
PhysicalStats.page_count as [Page Count],
CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)],
CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)],
PartitionStats.row_count AS [Row Count],
-- NULLIF avoids a divide-by-zero error when a partition has no rows
CONVERT(decimal(18,2), (PhysicalStats.page_count * 8.0 * 1024) / NULLIF(PartitionStats.row_count, 0)) AS [Index Size/Row (Bytes)]
FROM sys.dm_db_index_usage_stats UsageStats
INNER JOIN sys.indexes Indexes
ON Indexes.index_id = UsageStats.index_id
AND Indexes.object_id = UsageStats.object_id
INNER JOIN sys.objects Objects
ON Objects.object_id = UsageStats.object_id
INNER JOIN SYS.databases Databases
ON Databases.database_id = UsageStats.database_id
INNER JOIN sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS PhysicalStats
ON PhysicalStats.index_id = UsageStats.Index_id
and PhysicalStats.object_id = UsageStats.object_id
INNER JOIN SYS.dm_db_partition_stats PartitionStats
ON PartitionStats.index_id = UsageStats.index_id
and PartitionStats.object_id = UsageStats.object_id
WHERE UsageStats.user_scans = 0
AND UsageStats.user_seeks = 0
AND UsageStats.user_lookups = 0
AND PhysicalStats.page_count > @MinimumPageCount -- ignore indexes with no more than 500 pages
AND Indexes.type_desc != 'CLUSTERED' -- Exclude clustered indexes (often the primary key), which should not be removed
ORDER BY [Page Count] DESC
I hope this helps.
Final Thought
Of course, once indexes are identified as candidates for removal, careful consideration should still be employed to make sure it's a good decision to do so.
For more information, see Identifying Unused Indexes in a SQL Server Database

Currently (as of SQL Server 2005 - 2008), index usage statistics are kept only in memory, so you have to do some of the work yourself if you want them persisted across restarts and database detaches.
What I usually do is create a job that runs every day and takes a snapshot of sys.dm_db_index_usage_stats into a custom table that I create for the database in question.
This works pretty well until a future version of SQL Server supports persistent index usage stats.
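A minimal sketch of such a snapshot step is shown here. The table name dbo.IndexUsageSnapshot and its column selection are assumptions for illustration; the job scheduling itself (for example through SQL Server Agent) is left out.

```sql
-- Sketch: persist a daily snapshot of index usage stats.
-- dbo.IndexUsageSnapshot is a hypothetical table name.
IF OBJECT_ID(N'dbo.IndexUsageSnapshot', N'U') IS NULL
    CREATE TABLE dbo.IndexUsageSnapshot
    (
        snapshot_time datetime NOT NULL,
        database_id   smallint NOT NULL,
        object_id     int      NOT NULL,
        index_id      int      NOT NULL,
        user_seeks    bigint   NOT NULL,
        user_scans    bigint   NOT NULL,
        user_lookups  bigint   NOT NULL,
        user_updates  bigint   NOT NULL
    );

-- Copy the current counters for this database into the snapshot table.
INSERT INTO dbo.IndexUsageSnapshot
    (snapshot_time, database_id, object_id, index_id,
     user_seeks, user_scans, user_lookups, user_updates)
SELECT GETDATE(), database_id, object_id, index_id,
       user_seeks, user_scans, user_lookups, user_updates
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID();
```

Because the counters reset when the instance restarts, comparing consecutive snapshots rather than reading only the latest one gives a more reliable picture of usage over time.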

Pulled this puppy off of http://blog.sqlauthority.com/2008/02/11/sql-server-2005-find-unused-indexes-of-current-database/. Note that this works for 2005 and above. The key is the JOIN to the SYS.DM_DB_INDEX_USAGE_STATS system table.
USE AdventureWorks
GO
DECLARE @dbid INT
SELECT @dbid = DB_ID(DB_NAME())
SELECT OBJECTNAME = OBJECT_NAME(I.OBJECT_ID),
INDEXNAME = I.NAME,
I.INDEX_ID
FROM SYS.INDEXES I
JOIN SYS.OBJECTS O ON I.OBJECT_ID = O.OBJECT_ID
WHERE OBJECTPROPERTY(O.OBJECT_ID,'IsUserTable') = 1
AND I.INDEX_ID NOT IN (
SELECT S.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS S
WHERE S.OBJECT_ID = I.OBJECT_ID
AND I.INDEX_ID = S.INDEX_ID
AND DATABASE_ID = @dbid)
ORDER BY OBJECTNAME,
I.INDEX_ID,
INDEXNAME ASC
GO

I tweaked John Pasquet's queries here: Identifying Unused Indexes in a SQL Server Database. My version returns indexes used 10 or fewer times, unions in the indexes that do not appear in the usage stats table, excludes heaps, unique constraints, and primary key indexes, and finally excludes indexes with zero pages.
Be careful with the results of this query: it is best used in production, where indexes are actually getting used the way you would expect. If you query a database with rebuilt or dropped/recreated indexes, or a recent database backup, you could get false positives (indexes that normally would get used but aren't because of special circumstances). It is not safe to use in test or dev environments to decide whether to drop indexes. As Narnian says, this query just identifies candidates for removal for your careful consideration.
USE [DatabaseName]
DECLARE @MinimumPageCount int
SET @MinimumPageCount = 500
DECLARE @dbid INT
SELECT @dbid = DB_ID(DB_NAME())
-- GET UNUSED INDEXES THAT APPEAR IN THE INDEX USAGE STATS TABLE
SELECT
Databases.name AS [Database]
,object_name(Indexes.object_id) AS [Table]
,Indexes.name AS [Index]
,PhysicalStats.page_count as [Page Count]
,CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)]
,CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
,PartitionStats.row_count AS [Row Count]
-- NULLIF avoids a divide-by-zero error when a partition has no rows
,CONVERT(decimal(18,2), (PhysicalStats.page_count * 8.0 * 1024) / NULLIF(PartitionStats.row_count, 0)) AS [Index Size Per Row (Bytes)]
,1 AS [Appears In Usage Stats Table]
FROM sys.dm_db_index_usage_stats UsageStats
INNER JOIN sys.indexes Indexes
ON Indexes.index_id = UsageStats.index_id AND Indexes.object_id = UsageStats.object_id
INNER JOIN SYS.databases Databases
ON Databases.database_id = UsageStats.database_id
INNER JOIN sys.dm_db_index_physical_stats (DB_ID(),NULL,NULL,NULL,NULL) AS PhysicalStats
ON PhysicalStats.index_id = UsageStats.Index_id AND PhysicalStats.object_id = UsageStats.object_id
INNER JOIN SYS.dm_db_partition_stats PartitionStats
ON PartitionStats.index_id = UsageStats.index_id AND PartitionStats.object_id = UsageStats.object_id
WHERE
UsageStats.user_scans <= 10
AND UsageStats.user_seeks <= 10
AND UsageStats.user_lookups <= 10
-- exclude heap indexes
AND Indexes.name IS NOT NULL
-- ignore indexes with no more than a certain number of pages
AND PhysicalStats.page_count > @MinimumPageCount
-- Exclude primary keys, which should not be removed
AND Indexes.is_primary_key = 0
-- ignore unique constraints - those shouldn't be removed
AND Indexes.is_unique_constraint = 0
AND Indexes.is_unique = 0
UNION ALL
(
-- GET UNUSED INDEXES THAT DO **NOT** APPEAR IN THE INDEX USAGE STATS TABLE
SELECT
Databases.Name AS [Database]
,Objects.NAME AS [Table]
,Indexes.NAME AS [Index]
,PhysicalStats.page_count as [Page Count]
,CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Total Index Size (MB)]
,CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragmentation (%)]
,-1 AS [Row Count]
,-1 AS [Index Size Per Row (Bytes)]
,0 AS [Appears In Usage Stats Table]
FROM SYS.INDEXES Indexes
INNER JOIN SYS.OBJECTS Objects
ON Indexes.OBJECT_ID = Objects.OBJECT_ID
LEFT JOIN sys.dm_db_index_physical_stats(@dbid, null, null, null, null) PhysicalStats
ON PhysicalStats.object_id = Indexes.object_id AND PhysicalStats.index_id = indexes.index_id
INNER JOIN sys.databases Databases
ON Databases.database_id = PhysicalStats.database_id
WHERE
Objects.type = 'U' -- Is User Table
-- exclude heap indexes
AND Indexes.name IS NOT NULL
-- exclude empty tables
AND PhysicalStats.page_count <> 0
-- Exclude primary keys, which should not be removed
AND Indexes.is_primary_key = 0
-- ignore unique constraints - those shouldn't be removed
AND Indexes.is_unique_constraint = 0
AND Indexes.is_unique = 0
AND Indexes.INDEX_ID NOT IN
(
SELECT UsageStats.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS UsageStats
WHERE
UsageStats.OBJECT_ID = Indexes.OBJECT_ID
AND Indexes.INDEX_ID = UsageStats.INDEX_ID
AND DATABASE_ID = @dbid
)
)
ORDER BY [Table] ASC, [Total Index Size (MB)] DESC

You should take a look at Brent Ozar's sp_BlitzIndex. Among other things, this stored procedure lists unused indexes. It reports the problems it finds, and for each entry it gives a URL that explains what to look for and how to handle the issue.