Reclaim free space in partition with as little fragmentation as possible - sql-server

Question:
In SQL Server 2012, what is the best way to reclaim as much reserved space as possible while causing as little fragmentation as possible?
Background:
Our SQL Server is running low on disk space, and as part of a HW+SW upgrade we are going to move the data files to a different server. For this reason we want to reduce the size of the data files first, to avoid unnecessarily moving 'reserved space' (we are talking about terabytes). I also want to do this partition by partition, so the work can run overnight and limit the production impact.
One approach I tried (per partition on the heavy consumer table with single index):
ALTER TABLE <Tbl>
REBUILD PARTITION = <PartitionId> WITH (DATA_COMPRESSION = PAGE)
GO
--I know this is bad practice - but I need to reclaim space to speed up moving
DBCC SHRINKFILE(<DataFileName>, 10)  -- target size is in MB; SHRINKFILE takes the logical data file name
GO
-- This is to mitigate the impact of shrinkfile
ALTER TABLE <Tbl>
REBUILD PARTITION = <PartitionId>
GO
--Run this at the end (I also run it between the individual steps) to see the impact on index fragmentation
SELECT * FROM sys.dm_db_index_physical_stats
(DB_ID(<DbName>), OBJECT_ID(<TblName>), <IndexId>, <PartitionId>, 'SAMPLED');
GO
In the test environment this yielded great results for some partitions (0% fragmentation and near 0% 'wasted' reserved space, wasted in the sense that the next stage is moving the data over the wire). But I have a case where SHRINKFILE reduces the partition's size significantly but causes 99.99% fragmentation; the REBUILD then solves the fragmentation but doubles the filegroup size (half of it being reserved space), which is probably expected since a rebuild creates the index from scratch. If I shrink afterwards I can reclaim the space, but I again end up with the heavy fragmentation, and this can go in circles.
I'm now trying to run REORGANIZE on the shrunk filegroup:
ALTER INDEX <IdxName> on <Tbl> REORGANIZE PARTITION = <PartitionId>
As this should hopefully fix the index fragmentation without growing the data file.
However:
Is it a good idea to run REORGANIZE on a 99.99% fragmented index?
Will the result be comparable, inferior, or superior to running a rebuild?
Another option I'm considering is to rebuild the partition into a brand new filegroup, but this would require manipulating the partition scheme, and I want to keep the process as simple as possible.

What about backing up the database using compression and restoring it to the new server? Backups do not include unused space.
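A minimal sketch of that approach (paths and logical file names are placeholders, not from the answer):
-- Back up with compression; only allocated pages go into the backup file
BACKUP DATABASE <DbName>
    TO DISK = N'<BackupPath>\DbName.bak'
    WITH COMPRESSION, STATS = 10;
GO
-- Restore on the new server; MOVE is needed if the file paths differ
RESTORE DATABASE <DbName>
    FROM DISK = N'<BackupPath>\DbName.bak'
    WITH MOVE N'<DataFileLogicalName>' TO N'<NewDataPath>\DbName.mdf',
         MOVE N'<LogFileLogicalName>' TO N'<NewLogPath>\DbName.ldf';
GO
Note that the restore recreates the files at their recorded size, reserved space included, so the saving is mainly in what travels over the wire.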

This is not the best answer, but it is what I did because it best addressed my specific circumstances: especially the lack of any additional free space (so I couldn't even use the backup-restore approach) and the ability to perform the operation in smaller (overnight) batches.
I wanted to post it in case someone would find it helpful.
However, it's definitely better to make sure that you always have at least as much free space as your DB currently occupies; then you can use a more appropriate solution, such as the compressed backup suggestion that I marked as the answer.
--Rename the table so that nothing interacts with it during the entire process.
-- reorganize is done online, but I want the process to finish as fast as possible and
-- the app logic is resilient to not seeing the table for a while
exec sp_rename <tbl_orig>, <tbl>
GO
print 'starting compressing: ' + CAST(GETDATE() AS NVARCHAR)
GO
-- this is to enable compression on the partition
ALTER TABLE <tbl>
REBUILD PARTITION = <PartitionId> WITH (DATA_COMPRESSION = PAGE)
GO
print 'Compressing done: ' + CAST(GETDATE() AS NVARCHAR)
GO
-- reclaiming all free space; potentially very bad fragmentation is possible
DBCC SHRINKFILE(<DataFile>, 10)  -- target size is in MB
GO
print 'shrinking done: ' + CAST(GETDATE() AS NVARCHAR)
GO
-- solve the fragmentation without giving up any reclaimed space; a rebuild would use additional space.
-- This assumes that my partitions are stored in dedicated filegroups (which is always a good idea)
ALTER INDEX <IdxName> on <tbl> REORGANIZE PARTITION = <PartitionId>
GO
print 'index reorganizing done: ' + CAST(GETDATE() AS NVARCHAR)
GO
-- see the stats
SELECT * FROM sys.dm_db_index_physical_stats
(DB_ID(<DBName>), OBJECT_ID(<Tbl>), 1, <PartitionId> , 'SAMPLED');
GO
print 'DONE: ' + CAST(GETDATE() AS NVARCHAR)
GO
-- make the table visible to the app logic again
exec sp_rename <tbl>, <tbl_orig>
GO
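To verify how much each step actually reclaimed, a query along these lines (not part of the original script) can be run before and after:
-- Per-file size vs. actually used space, in MB (size and SpaceUsed are in 8KB pages)
SELECT name,
       size / 128.0 AS FileSizeMB,
       CAST(FILEPROPERTY(name, 'SpaceUsed') AS int) / 128.0 AS UsedMB,
       (size - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)) / 128.0 AS FreeMB
FROM sys.database_files;
GO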

You can rebuild the partition over to a new filegroup. This results in perfectly contiguous pages and a perfectly filled file, and it is in general a very nice way of defragmenting. You can automate it.
Defragmenting by rebuilding in place has some issues, as you found out. You need a lot of temporary space, and your newly allocated b-tree will be squished into lots of free-space holes by SQL Server's allocation algorithm. The allocation algorithm is not smart: it will not try to find big holes, and it is happy to spread the new tree over tiny holes if they exist. That's the reason you can end up fragmented directly after rebuilding. (Interestingly, NTFS has the same issue: if you write a 100GB file sequentially, it might end up extremely fragmented.)
I believe this issue is not widely understood in the SQL Server community.
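A hedged sketch of that pattern for a partitioned clustered index, assuming the index is not backed by a PRIMARY KEY constraint (all names are illustrative, and only one filegroup is shown): build a parallel partition scheme on fresh filegroups and rebuild onto it with DROP_EXISTING.
-- Fresh filegroup and file to receive the rebuilt partitions
ALTER DATABASE <DbName> ADD FILEGROUP FG_P1_New;
ALTER DATABASE <DbName>
    ADD FILE (NAME = N'FG_P1_New_1', FILENAME = N'E:\Data\FG_P1_New_1.ndf')
    TO FILEGROUP FG_P1_New;
GO
-- New scheme over the existing partition function (one filegroup per partition)
CREATE PARTITION SCHEME PS_New
    AS PARTITION <PartitionFunction> TO (FG_P1_New /* , ... */);
GO
-- Rebuild onto the new scheme; pages are written contiguously into the new files
CREATE UNIQUE CLUSTERED INDEX <IdxName>
    ON <Tbl> (<KeyColumn>)
    WITH (DROP_EXISTING = ON)
    ON PS_New (<PartitionColumn>);
GO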

Related

Reclaim unused space from a SQL Server table

In SQL Server, in one of our databases, we have a big database table that's using over 1.2 TB of space. It has about 200 GB of actual data but over 1 TB of unused space.
This happened over 2 years as old time series data was deleted from this table daily and new data was inserted daily.
We do not expect the table size to increase much going forward.
I am looking for the best way to reclaim unused space from this table without taking the database or table offline, and without causing too much CPU overhead.
I think for this you'll need to use DBCC SHRINKFILE, possibly in several incremental steps.
First see if TRUNCATEONLY has an acceptable effect; it depends on how the data is distributed within the file (note that SHRINKFILE takes the logical data file name, not a table name):
DBCC SHRINKFILE (N'<LogicalFileName>', TRUNCATEONLY)
If the file does not shrink sufficiently, you can specify a target size in MB to shrink to, e.g.
DBCC SHRINKFILE (N'<LogicalFileName>', 256000)
You can monitor the impact on performance while this executes and stop it if need be, resuming again as appropriate.
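If you go the incremental route, a sketch like this shrinks in fixed steps so each run is short and easy to stop (the 10 GB step and the names are arbitrary assumptions):
DECLARE @target int = 256000;   -- final target size in MB
DECLARE @current int = (SELECT size / 128 FROM sys.database_files
                        WHERE name = N'<LogicalFileName>');
WHILE @current > @target
BEGIN
    -- step down by 10 GB at a time, never below the target
    SET @current = CASE WHEN @current - 10240 > @target
                        THEN @current - 10240 ELSE @target END;
    DBCC SHRINKFILE (N'<LogicalFileName>', @current);
END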

How to get the space used by SQL Server back?

I created a table in SQL Server and inserted 135 million records into it.
Then I truncated it and tried to re-insert the same 135 million records.
Something went wrong and I had to restart the computer; the database came up in recovery mode, which was then fixed.
The problem now is that drive C has only 10GB free (210GB used), while before all this I had 105GB free!
I checked the folders, but the sum of all sizes, including hidden ones, does not add up to 210GB.
What happened, and where did these GBs go?
The space is not automatically released by the database. In the case of tempdb, if you restart the SQL Server service, tempdb will be reinitialized to its original size; but that is not the case for other databases.
If you want to reclaim the space, there are two approaches:
Cleaner approach:
As suggested by Paul Randal, go for a new filegroup for the existing tables and then drop the old filegroup (a T-SQL sketch follows the steps). Refer to his article:
Create a new filegroup
Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time
Drop the old filegroup that you were going to shrink anyway (or shrink it way down if it's the primary filegroup)
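A T-SQL sketch of those three steps (all names are placeholders, and this assumes the clustered index is not a PK constraint; see Paul Randal's article for the full treatment):
-- 1. New filegroup and file
ALTER DATABASE <DbName> ADD FILEGROUP FG_New;
ALTER DATABASE <DbName>
    ADD FILE (NAME = N'FG_New_1', FILENAME = N'E:\Data\FG_New_1.ndf')
    TO FILEGROUP FG_New;
GO
-- 2. Move a table by rebuilding its clustered index onto the new filegroup
CREATE UNIQUE CLUSTERED INDEX <IdxName>
    ON dbo.<Tbl> (<KeyColumn>)
    WITH (DROP_EXISTING = ON)
    ON FG_New;
GO
-- 3. Once the old filegroup is empty, drop its file(s) and the filegroup
ALTER DATABASE <DbName> REMOVE FILE <OldFileLogicalName>;
ALTER DATABASE <DbName> REMOVE FILEGROUP FG_Old;
GO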
Brute force approach and remediation
Here, you can use DBCC SHRINKFILE to reduce the size. Keep a little extra space in the database to avoid frequent autogrowth. Beware that shrinking the file will lead to index fragmentation, and you will have to fix that fragmentation after shrinking the files (a sketch follows Paul Randal's quote below).
As Paul Randal recommends:
If you absolutely have no choice and have to run a data file shrink operation, be aware that you're going to cause index fragmentation and you might need to take steps to remove it afterwards if it's going to cause performance problems. The only way to remove index fragmentation without causing data file growth again is to use DBCC INDEXDEFRAG or ALTER INDEX … REORGANIZE. These commands only require a single 8KB page of extra space, instead of needing to build a whole new index in the case of an index rebuild operation (which will likely cause the file to grow).
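Put together, the brute-force sequence the quote describes is roughly this (names and target size are placeholders):
-- Shrink first; this is the step that fragments the indexes
DBCC SHRINKFILE (N'<LogicalFileName>', <TargetSizeMB>);
GO
-- Then remove the fragmentation in place; REORGANIZE needs only a single spare 8KB page
ALTER INDEX ALL ON dbo.<Tbl> REORGANIZE;
GO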

DB2 - Reclaiming disk space used by dropped tables

I have an application that logs to a DB2 database. Each log is stored in a daily table, meaning that I have several tables, one per each day.
Since the application is running for quite some time, I dropped some of the older daily tables, but the disk space was not reclaimed.
I understand this is normal in DB2, so I googled and found out that the following command can be used to reclaim space:
db2 alter tablespace <table space> reduce max
Since the tablespace that store the daily log tables is called USERSPACE1, I executed the following command successfully:
db2 alter tablespace userspace1 reduce max
Unfortunately the disk space used by the DB2 instance is still the same...
I've read somewhere that the REORG command can be executed, but from what I've seen it is used to reorganize tables. Since I dropped the tables, how can I use REORG?
Is there any other way to do this?
Thanks
Reducing the size of a tablespace is very complex. The extents (sets of contiguous pages; the unit of tablespace allocation) belonging to the same table are not laid out sequentially. When you reorg a table, the rows are reorganized into pages, and the new pages are normally written at the end of the tablespace. Sometimes the high-water mark is increased, and your tablespace gets bigger.
You need to reorg all tables in a tablespace in order to "defrag" them all. Then you have to perform a second reorg so that the now-empty space earlier in the tablespace gets reused.
However, many factors affect how tables are laid out in a tablespace: new extents are created (new rows, row overflow due to updates), and compression could be activated after a reorg.
What you can do is assign just a few tables, or a single table, per tablespace; however, you will waste a lot of space (overhead, empty pages, etc.).
The command that you are using is an automatic way to do that, but it does not always work as desired: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0055392.html
If you want to see the distribution of the tables in your tablespace, you can use db2dart. Then, you can have an idea of which table to reorg (move).
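For completeness, the reorg-then-reduce sequence from the command line looks roughly like this (schema and table names are placeholders; verify the exact options against your DB2 version):
db2 connect to <database>
db2 reorg table <schema>.<table>
db2 runstats on table <schema>.<table>
db2 alter tablespace userspace1 reduce max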
Sorry guys,
the command that I mentioned in the original post works after all; the space was just reclaimed very slowly.
Thanks for the help

Enable index without rebuild?

Using SQL Server 2012 Enterprise.
I have a table of 12 billion rows that takes 700GB on disk, in 30 partitions.
It has only one index, clustered.
I have 500 GB free disk space.
I disabled the index (please don't ask why. If you have to know, I targeted the wrong database).
I now want to enable the index. If I do
alter index x1 on t1 rebuild
I eventually get an error because there is not enough free disk space. That was a painful lesson about disk space requirements for rebuilding a clustered index.
Ideally, I want to rebuild the index one partition at a time. If I do
alter index x1 on t1 rebuild partition = 1
I get the error: Cannot perform the specified operation on disabled index.
Any solution, besides buying more physical disks? The table has not changed since disabling the index (it can't be accessed anyway), so I am really looking for a hack that can fool SQL Server into thinking the index is enabled. Any suggestions?
Thanks
If it's a clustered index that you have disabled, you have effectively disabled the table; the only operations you can execute on it are drop or rebuild, as far as I am aware.
You could try the deprecated DBCC DBREINDEX command; maybe you are lucky and it rebuilds in a more disk-space-efficient way. You might also squeeze out some more space by setting the fill factor to 100 when you rebuild, assuming your table is now only being read.
DBCC DBREINDEX ('Person.Address', 'PK_Address_AddressID', 100)
This allows you to reindex just the clustered index.

How to reduce size of SQL Server table that grew from a datatype change

I have a table on SQL Server 2005 that was about 4gb in size.
(about 17 million records)
I changed one of the fields from datatype char(30) to char(60) (there are 25 fields in total, most of which are char(10), so the char space adds up to about 300)
This caused the table to double in size (over 9gb)
I then changed the char(60) to varchar(60) and then ran a function to cut extra whitespace out of the data (so as to reduce the average length of the data in the field to about 15)
This did not reduce the table size. Shrinking the database did not help either.
Short of actually recreating the table structure and copying the data over (that's 17 million records!), is there a less drastic way of getting the size back down again?
You have not cleaned or compacted any data, even with a "shrink database".
DBCC CLEANTABLE
Reclaims space from dropped variable-length columns in tables or indexed views.
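For example (the database and table names are placeholders; the optional batch-size argument is omitted):
-- Reclaims space left by dropped variable-length columns
DBCC CLEANTABLE (N'<DbName>', N'dbo.Mytable');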
However, if there is a clustered index, a simple index rebuild should also do it:
ALTER INDEX ALL ON dbo.Mytable REBUILD
A worked example from Tony Rogerson
Well, it's clear you're not getting any space back! :-)
When you changed your text fields to CHAR(60), they were all padded to capacity with spaces, so ALL your fields are now really 60 characters long.
Changing that back to VARCHAR(60) won't help; the fields are still all 60 chars long....
What you really need to do is run a TRIM function over all those fields to cut them back to their trimmed length, and then do a database shrink.
After you've done that, you need to REBUILD your clustered index in order to reclaim some of that wasted space. The clustered index is really where your data lives - you can rebuild it like this:
ALTER INDEX IndexName ON YourTable REBUILD
By default, your primary key is your clustered index (unless you've specified otherwise).
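A minimal sketch of that trim / shrink / rebuild sequence (the column, index, and database names are placeholders, not from the question):
-- 1. Trim the padding left over from the char(60) columns
UPDATE dbo.YourTable
SET Field1 = RTRIM(Field1),
    Field2 = RTRIM(Field2);   -- repeat for each converted column
GO
-- 2. Shrink to release the freed space
DBCC SHRINKDATABASE (N'<DbName>');
GO
-- 3. Rebuild the clustered index to undo the fragmentation the shrink caused
ALTER INDEX <IndexName> ON dbo.YourTable REBUILD;
GO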
Marc
I know I'm not answering the question exactly as you're asking it, but have you considered archiving some of the data to a history table and working with fewer rows?
Most of the time you might think at first glance that you need all that data all the time, but when you actually sit down and examine it, there are cases where that's not true. Or at least I've experienced that situation before.
I had a similar problem here (SQL Server, Converting NTEXT to NVARCHAR(MAX)) that was related to changing ntext to nvarchar(max).
I had to do an UPDATE MyTable SET MyValue = MyValue in order to get it to resize everything nicely.
This obviously takes quite a long time with a lot of records. There were a number of suggestions on how to do it better. The key one was a temporary flag indicating whether a row had been done or not, then updating a few thousand at a time in a loop until it was all done. This meant I had "some" control over how much it was doing, as sketched below.
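A sketch of that batched loop (the Done flag is the hypothetical marker column from the suggestion, not something the table already has):
-- Rewrite rows a few thousand at a time to keep each transaction small
WHILE 1 = 1
BEGIN
    UPDATE TOP (5000) dbo.MyTable
    SET MyValue = MyValue,
        Done = 1
    WHERE Done = 0;
    IF @@ROWCOUNT = 0 BREAK;   -- nothing left to rewrite
END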
On another note though, if you really want to shrink the database as much as possible, it can help if you turn the recovery model down to simple, shrink the transaction logs, reorganise all the data in the pages, then set it back to full recovery model. Be careful though, shrinking of databases is generally not advisable, and if you reduce the recovery model of a live database you are asking for something to go wrong.
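That recovery-model dance, as a hedged sketch (names are placeholders; only acceptable if losing point-in-time recovery during the window is fine):
-- Drop to SIMPLE so the log can be shrunk
ALTER DATABASE <DbName> SET RECOVERY SIMPLE;
DBCC SHRINKFILE (N'<LogFileLogicalName>', <TargetSizeMB>);
GO
-- Back to FULL; take a full backup afterwards to restart the log chain
ALTER DATABASE <DbName> SET RECOVERY FULL;
BACKUP DATABASE <DbName> TO DISK = N'<BackupPath>' WITH COMPRESSION;
GO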
Alternatively, you could do a full table rebuild to ensure there's no extra data hanging around anywhere:
CREATE TABLE tmp_table(<column definitions>);
GO
INSERT INTO tmp_table(<columns>) SELECT <columns> FROM <table>;
GO
DROP TABLE <table>;
GO
EXEC sp_rename N'tmp_table', N'<table>';
GO
Of course, things get more complicated with identity, indexes, etc etc...
