How would I go about finding the largest objects in a SQL Server database? First, by determining which tables (and related indices) are the largest and then determining which rows in a particular table are largest (we're storing binary data in BLOBs)?
Are there any tools out there for helping with this kind of database analysis? Or are there some simple queries I could run against the system tables?
I've been using this SQL script (which I got from someone, somewhere - can't reconstruct who it came from) for ages and it's helped me quite a bit understanding and determining the size of indices and tables:
SELECT
t.name AS TableName,
i.name as indexName,
sum(p.rows) as RowCounts,
sum(a.total_pages) as TotalPages,
sum(a.used_pages) as UsedPages,
sum(a.data_pages) as DataPages,
(sum(a.total_pages) * 8) / 1024 as TotalSpaceMB,
(sum(a.used_pages) * 8) / 1024 as UsedSpaceMB,
(sum(a.data_pages) * 8) / 1024 as DataSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.object_id = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.name NOT LIKE 'dt%' AND
i.object_id > 255 AND
i.index_id <= 1
GROUP BY
t.name, i.object_id, i.index_id, i.name
ORDER BY
object_name(i.object_id)
Of course, you can use a different ordering criterion, e.g.
ORDER BY SUM(p.rows) DESC
to get the tables with the most rows, or
ORDER BY SUM(a.total_pages) DESC
to get the tables with the most pages (8K blocks) used.
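The query above covers tables and indexes, but not the second half of the question (which rows in a table are largest). A minimal sketch for that part, assuming a hypothetical table dbo.Documents with a key column Id and a varbinary(max) column Content (none of these names come from the original post), is to order by DATALENGTH, which returns the number of bytes stored in a value:

```sql
-- Hypothetical names: dbo.Documents, Id, Content. Substitute your own.
-- DATALENGTH returns the size in bytes of the stored value (NULL for NULL).
SELECT TOP (20)
    d.Id,
    DATALENGTH(d.Content)             AS ContentBytes,
    DATALENGTH(d.Content) / 1048576.0 AS ContentMB
FROM dbo.Documents AS d
ORDER BY DATALENGTH(d.Content) DESC;
```

This has to scan the table, so on a large table it can take a while.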
In SQL Server 2008, you can also just run the standard report Disk Usage by Top Tables. This can be found by right clicking the DB, selecting Reports->Standard Reports and selecting the report you want.
This query helps find the largest table (by row count) in the database you are connected to.
SELECT TOP 1 OBJECT_NAME(OBJECT_ID) TableName, st.row_count
FROM sys.dm_db_partition_stats st
WHERE index_id < 2
ORDER BY st.row_count DESC
You may also use the following code:
USE AdventureWorks
GO
CREATE TABLE #GetLargest
(
table_name sysname ,
row_count INT,
reserved_size VARCHAR(50),
data_size VARCHAR(50),
index_size VARCHAR(50),
unused_size VARCHAR(50)
)
SET NOCOUNT ON
INSERT #GetLargest
EXEC sp_msforeachtable 'sp_spaceused ''?'''
SELECT
a.table_name,
a.row_count,
COUNT(*) AS col_count,
a.data_size
FROM #GetLargest a
INNER JOIN information_schema.columns b
ON a.table_name collate database_default
= b.table_name collate database_default
GROUP BY a.table_name, a.row_count, a.data_size
ORDER BY CAST(REPLACE(a.data_size, ' KB', '') AS integer) DESC
DROP TABLE #GetLargest
@marc_s's answer is great and I've been using it for a few years. However, I noticed that the script misses data in some columnstore indexes and doesn't show the complete picture. E.g. when you do SUM(TotalSpace) against the script and compare it with the total space database property in Management Studio, the numbers don't match in my case (Management Studio shows larger numbers). I modified the script to overcome this issue and extended it a little bit:
select
tables.[name] as table_name,
schemas.[name] as schema_name,
isnull(db_name(dm_db_index_usage_stats.database_id), 'Unknown') as database_name,
sum(allocation_units.total_pages) * 8 as total_space_kb,
cast(round(((sum(allocation_units.total_pages) * 8) / 1024.00), 2) as numeric(36, 2)) as total_space_mb,
sum(allocation_units.used_pages) * 8 as used_space_kb,
cast(round(((sum(allocation_units.used_pages) * 8) / 1024.00), 2) as numeric(36, 2)) as used_space_mb,
(sum(allocation_units.total_pages) - sum(allocation_units.used_pages)) * 8 as unused_space_kb,
cast(round(((sum(allocation_units.total_pages) - sum(allocation_units.used_pages)) * 8) / 1024.00, 2) as numeric(36, 2)) as unused_space_mb,
count(distinct indexes.index_id) as indexes_count,
max(dm_db_partition_stats.row_count) as row_count,
iif(max(isnull(user_seeks, 0)) = 0 and max(isnull(user_scans, 0)) = 0 and max(isnull(user_lookups, 0)) = 0, 1, 0) as no_reads,
iif(max(isnull(user_updates, 0)) = 0, 1, 0) as no_writes,
max(isnull(user_seeks, 0)) as user_seeks,
max(isnull(user_scans, 0)) as user_scans,
max(isnull(user_lookups, 0)) as user_lookups,
max(isnull(user_updates, 0)) as user_updates,
max(last_user_seek) as last_user_seek,
max(last_user_scan) as last_user_scan,
max(last_user_lookup) as last_user_lookup,
max(last_user_update) as last_user_update,
max(tables.create_date) as create_date,
max(tables.modify_date) as modify_date
from
sys.tables
left join sys.schemas on schemas.schema_id = tables.schema_id
left join sys.indexes on tables.object_id = indexes.object_id
left join sys.partitions on indexes.object_id = partitions.object_id and indexes.index_id = partitions.index_id
left join sys.allocation_units on partitions.partition_id = allocation_units.container_id
left join sys.dm_db_index_usage_stats on tables.object_id = dm_db_index_usage_stats.object_id and indexes.index_id = dm_db_index_usage_stats.index_id
left join sys.dm_db_partition_stats on tables.object_id = dm_db_partition_stats.object_id and indexes.index_id = dm_db_partition_stats.index_id
group by schemas.[name], tables.[name], isnull(db_name(dm_db_index_usage_stats.database_id), 'Unknown')
order by 5 desc
Hope it will be helpful for someone.
This script was tested against large, terabyte-scale databases with hundreds of different tables, indexes and schemas.
If you are using SQL Server Management Studio 2008, there are certain data fields you can view in the Object Explorer Details window. Simply browse to and select the Tables folder. In the details view you can right-click the column titles and add fields to the "report". Your mileage may vary if you are on SSMS 2008 Express.
I also found this query on SQLServerCentral very helpful (original post: "Sql Server largest tables"):
select name=object_schema_name(object_id) + '.' + object_name(object_id)
, rows=sum(case when index_id < 2 then row_count else 0 end)
, reserved_kb=8*sum(reserved_page_count)
, data_kb=8*sum( case
when index_id<2 then in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count
else lob_used_page_count + row_overflow_used_page_count
end )
, index_kb=8*(sum(used_page_count)
- sum( case
when index_id<2 then in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count
else lob_used_page_count + row_overflow_used_page_count
end )
)
, unused_kb=8*sum(reserved_page_count-used_page_count)
from sys.dm_db_partition_stats
where object_id > 1024
group by object_id
order by
rows desc
In my database, this query and the first answer gave different results.
Hope somebody finds it useful.
I have a database in Azure SQL Server, and I run the following query to track my DELETE statement.
SELECT
t.NAME AS TableName,
i.name as indexName,
p.[Rows],
sum(a.total_pages) as TotalPages,
sum(a.used_pages) as UsedPages,
sum(a.data_pages) as DataPages,
(sum(a.total_pages) * 8) / 1024 as TotalSpaceMB,
(sum(a.used_pages) * 8) / 1024 as UsedSpaceMB,
(sum(a.data_pages) * 8) / 1024 as DataSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%' AND
i.OBJECT_ID > 255
GROUP BY
t.NAME, i.object_id, i.index_id, i.name, p.[Rows]
ORDER BY
object_name(i.object_id) ;
The results show the table Events twice, once with an index name and once without.
# TableName indexName Rows
1 Events [NULL] 1,878
2 Events IX_timestamp 1,886
When I run the following DELETE statement, it deletes from both tables:
DELETE from [dbo].[Events]
where timestamp > parse('2021-11-29 08:59:00.1130000' as datetime)
and timestamp < cast('2021-11-30' as date)
Can someone please assist in understanding why this happens? I only want the table with the index to exist.
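Judging from the query output, the two rows appear to describe two storage structures of the same table rather than two tables: in sys.indexes, a table without a clustered index has a row with index_id = 0, a NULL name and type_desc = 'HEAP', and each nonclustered index gets its own row. A small sketch that makes this visible for dbo.Events by adding index_id and type_desc to the output:

```sql
-- Lists every storage structure (heap or index) that belongs to dbo.Events.
SELECT
    t.name      AS TableName,
    i.index_id,                  -- 0 = heap, 1 = clustered, >1 = nonclustered
    i.name      AS IndexName,    -- NULL for a heap
    i.type_desc AS IndexType,
    SUM(p.rows) AS RowCounts
FROM sys.tables AS t
INNER JOIN sys.indexes AS i
    ON t.object_id = i.object_id
INNER JOIN sys.partitions AS p
    ON i.object_id = p.object_id AND i.index_id = p.index_id
WHERE t.object_id = OBJECT_ID(N'dbo.Events')
GROUP BY t.name, i.index_id, i.name, i.type_desc
ORDER BY i.index_id;
```

If the NULL-named row is reported as HEAP, it is the table's own data, so a DELETE against dbo.Events removes rows from it along with the matching entries in IX_timestamp.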
I've been testing a DB-migration tool, which inserts data into all my DB tables from another RDBMS.
On each run, the Migration tool deletes * from the target tables.
I also run 'Shrink' on the SQL Server db on a regular basis
Despite this, my DB has now grown to be orders of magnitude larger than the sum of the sizes of its tables. The well-known query for determining table sizes is as follows:
SELECT
sum( p.rows ) AS RowCounts,
CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS TotalSpaceMB,
CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2) AS NUMERIC(36, 2)) AS UsedSpaceMB,
CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2) AS NUMERIC(36, 2)) AS UnusedSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN
sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
Returns:
RowCounts TotalSpaceMB UsedSpaceMB UnusedSpaceMB
2816317 161.44 154.04 7.40
Yet, the size of the DB on disk is 32024.13 MB
What else may be taking up space? How can I safely free it up?
DELETE is not ideal if you want to reduce the size. Try truncating the tables (TRUNCATE TABLE) and then inserting. Also, which tool are you using? I would recommend DMA or SSMA.
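The table-size query only counts pages allocated to user tables, so it may also help to look at the database files themselves, including the transaction log. A sketch using sys.database_files and FILEPROPERTY, to be run in the database in question (both report sizes in 8 KB pages):

```sql
-- Size and used space of each file of the current database.
SELECT
    f.name                                          AS LogicalFileName,
    f.type_desc                                     AS FileType,   -- ROWS (data) or LOG
    f.size * 8 / 1024.0                             AS FileSizeMB,
    FILEPROPERTY(f.name, 'SpaceUsed') * 8 / 1024.0  AS SpaceUsedMB
FROM sys.database_files AS f;

-- Shows what, if anything, is currently preventing log space from being reused.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();
```

If most of the 32 GB turns out to sit in the LOG file, the repeated DELETEs are a plausible cause, since each deleted row is logged.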
Does a query that gets the count of all rows in a table have to do a full table scan or does SQL Server maintain a count of rows somewhere?
SELECT COUNT(*) FROM TABLE_NAME;
The table TABLE_NAME has a primary key, and therefore a clustered index, and looks like so:
CREATE TABLE TABLE_NAME
(
Id int PRIMARY KEY IDENTITY(1, 1),
Name nvarchar(50) NOT NULL
);
I am using Microsoft SQL Server 2014.
The server will always read all records (if there's an index then it will scan the entire index) to count the rows. You can't escape this as long as you are doing SELECT COUNT(*) FROM Table.
If your table has a clustered index, you can change your query to an "under the hood" query to retrieve the count without actually fetching the records with:
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rowcnt desc
If you are looking for an approximate count of the records, you can also use the following query:
SELECT
TableName = t.NAME,
SchemaName = s.Name,
[RowCount] = p.rows,
TotalSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.total_pages) * 8 / 1024.0),
UsedSpaceMB = CONVERT(DECIMAL(18,2), SUM(a.used_pages) * 8 / 1024.0),
UnusedSpaceMB = CONVERT(DECIMAL(18,2), (SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024.0)
FROM
sys.tables t
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY
t.Name,
s.Name,
p.Rows
ORDER BY
TotalSpaceMB DESC
This will show non-system tables with their calculated (not exact) row count and the sum of the sizes of their data (with any index they might have), relatively fast without retrieving the records.
When SQL Server performs a query like SELECT COUNT(*), SQL Server will use the narrowest non-clustered index to count the rows. If the table does not have any non-clustered index, it will have to scan the table.
If your table has a clustered index you can get your count even faster.
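One way to get a count without scanning, sketched here under the assumption that the table lives in the dbo schema and that the caller has VIEW DATABASE STATE permission, is to read the row count that SQL Server keeps in sys.dm_db_partition_stats. The value comes from metadata, so treat it as approximate rather than transactionally exact:

```sql
-- Sums the per-partition row counts of the heap or clustered index.
SELECT SUM(ps.row_count) AS ApproxRowCount
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.TABLE_NAME')
  AND ps.index_id IN (0, 1);   -- 0 = heap, 1 = clustered index
```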
SELECT COUNT(*) FROM TABLE_NAME;
Does a full table scan.
For optimizations you can refer to this.
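You can also write it the following way, although in SQL Server COUNT(1) and COUNT(*) produce the same execution plan, so it should not be expected to perform any better.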
SELECT COUNT(1) FROM TABLE_NAME
It's easy to check storage sizes for tables and indexes: you can right-click the table object in SSMS Object Explorer and voila, the details appear in a nice popup.
But since indexed views are displayed the same as normal views, there is no storage information available in SSMS to show me the current size taken up on disk.
Is there an alternate way to calculate the size (say via a system SP or similar method)?
Thanks.
EXEC sys.sp_spaceused @objname = N'dbo.YourView'
You can use this query here to find your data for any given indexed view:
SELECT
v.NAME AS ViewName,
i.name AS IndexName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
SUM(a.data_pages) * 8 AS DataSpaceKB
FROM
sys.views v
INNER JOIN
sys.indexes i ON v.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
v.Name = 'YourViewNameHere' --View name only, not 'schema.viewname'
AND
i.index_id = 1 -- clustered index, remove this to see all indexes
GROUP BY
v.NAME, i.object_id, i.index_id, i.name, p.Rows
Gives an output something like
ViewName IndexName RowCounts TotalSpaceKB UsedSpaceKB DataSpaceKB
YourViewName IX_YourView 1771 592 552 536
How can you tell what Tables are taking up the most space in a SQL Server 2005 Database?
I am sure there is some System Stored Procedure that shows this information.
I have a TEST database that grew from 1 TB to 23 TB. We are currently doing a lot of client conversion testing in the database, which entails running the same conversion stored procedure multiple times. It does DELETEs, which I am sure is increasing the transaction log. But this got me thinking to ask this question.
The big problem is the dbo.Download table; it takes up a massive amount of storage that is not actually needed. I had 3 GB before truncating it, then 52 MB ;)
Try this script - it will list the number of rows and the space used by data rows (and the total space used) for all tables in your database:
SELECT
t.NAME AS TableName,
i.name AS indexName,
SUM(p.rows) AS RowCounts,
SUM(a.total_pages) AS TotalPages,
SUM(a.used_pages) AS UsedPages,
SUM(a.data_pages) AS DataPages,
(SUM(a.total_pages) * 8) / 1024 AS TotalSpaceMB,
(SUM(a.used_pages) * 8) / 1024 AS UsedSpaceMB,
(SUM(a.data_pages) * 8) / 1024 AS DataSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%' AND
i.OBJECT_ID > 255 AND
i.index_id <= 1
GROUP BY
t.NAME, i.object_id, i.index_id, i.name
ORDER BY
OBJECT_NAME(i.object_id)
Use sp_spaceused
Exec sp_spaceused N'YourTableName'
Or if you want to execute the sp_spaceused for each table in your database then you can use this SQL:
set nocount on
create table #spaceused (
name nvarchar(120),
rows char(11),
reserved varchar(18),
data varchar(18),
index_size varchar(18),
unused varchar(18)
)
declare Tables cursor for
select name
from sysobjects where type='U'
order by name asc
OPEN Tables
DECLARE @table varchar(128)
FETCH NEXT FROM Tables INTO @table
WHILE @@FETCH_STATUS = 0
BEGIN
insert into #spaceused exec sp_spaceused @table
FETCH NEXT FROM Tables INTO @table
END
CLOSE Tables
DEALLOCATE Tables
select * from #spaceused
drop table #spaceused
exec sp_spaceused
The above SQL is from here
Rossisdead's comment answered this question the best for me, I wish it wasn't buried in a comment. This will be useful for people like me not trying to script the solution (the OP did not ask for a code snippet)
If you're using Management Studio you can also right click on the
database and go to Reports -> Disk Usage by Table for the same
results
Thanks to @marc_s for the answer. I needed to know data vs index space so I went ahead and expanded on the query to include that.
SELECT TableName
, SUM(DataRowCounts) AS DataRowCounts
, SUM(DataTotalSpaceGB) AS DataTotalSpaceGB
, SUM(DataSpaceUsedGB) AS DataSpaceUsedGB
, SUM(DataUnusedSpaceGB) AS DataUnusedSpaceGB
, SUM(IndexRowCounts) AS IndexRowCounts
, SUM(IndexTotalSpaceGB) AS IndexTotalSpaceGB
, SUM(IndexSpaceUsedGB) AS IndexSpaceUsedGB
, SUM(IndexUnusedSpaceGB) AS IndexUnusedSpaceGB
, SUM(DataTotalSpaceGB) + SUM(IndexTotalSpaceGB) AS TotalSpaceGB
FROM
(
SELECT t.NAME AS TableName
, i.type_desc AS IndexType
, CASE WHEN i.type_desc IN ('CLUSTERED', 'CLUSTERED COLUMNSTORE', 'HEAP') THEN CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS DataTotalSpaceGB
, CASE WHEN i.type_desc IN ('CLUSTERED', 'CLUSTERED COLUMNSTORE', 'HEAP') THEN CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS DataSpaceUsedGB
, CASE WHEN i.type_desc IN ('CLUSTERED', 'CLUSTERED COLUMNSTORE', 'HEAP') THEN CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS DataUnusedSpaceGB
, CASE WHEN i.type_desc IN ('CLUSTERED', 'CLUSTERED COLUMNSTORE', 'HEAP') THEN SUM(p.Rows) ELSE 0 END AS DataRowCounts
, CASE WHEN i.type_desc = 'NONCLUSTERED' THEN CAST(ROUND(((SUM(a.total_pages) * 8) / 1024.00), 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS IndexTotalSpaceGB
, CASE WHEN i.type_desc = 'NONCLUSTERED' THEN CAST(ROUND(((SUM(a.used_pages) * 8) / 1024.00), 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS IndexSpaceUsedGB
, CASE WHEN i.type_desc = 'NONCLUSTERED' THEN CAST(ROUND(((SUM(a.total_pages) - SUM(a.used_pages)) * 8) / 1024.00, 2)/1024 AS NUMERIC(36, 2)) ELSE 0 END AS IndexUnusedSpaceGB
, CASE WHEN i.type_desc = 'NONCLUSTERED' THEN SUM(p.Rows) ELSE 0 END AS IndexRowCounts
FROM sys.tables t
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
LEFT JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE t.NAME NOT LIKE 'dt%'
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
AND s.Name = 'dbo' --update this filter
AND t.Name = 'MyTable'
GROUP BY t.Name
, i.type_desc
) x
GROUP BY TableName
ORDER BY TotalSpaceGB DESC