Is it ok to truncate a LOG file during Fulltext Index Repopulation? - sql-server

A simple question ...
As part of a database maintenance routine we occasionally completely delete and rebuild a Fulltext Index and its underlying Clustered index.
This works quite well, and there is no problem with it, apart from ONE thing:
After we have re-created the Clustered index we execute a statement to re-create the fulltext index:
CREATE FULLTEXT INDEX ON [dbo].[<ourtablename>] (<thefieldswewanttoindex>) KEY INDEX [<theclusteredindex>] ON [<thefulltextcatalog>] WITH CHANGE_TRACKING AUTO
This, again, works perfectly well, and it takes a number of hours to complete, which is also fine, as it is done during downtime and affects no users. But there is ONE small thing that annoys me. While the fulltext index is repopulating, i.e. while
SELECT FULLTEXTCATALOGPROPERTY('<thefulltextcatalog>', 'Populatestatus')
returns 1, the LOG file keeps growing, up to 110GB. After that we truncate it and the users carry on the next day.
So the question is:
would it be OK to occasionally truncate the LOG file during the hours while the Populatestatus returns 1 so that we keep the LOG file size to a manageable level?

It is perfectly fine to do this; however, it may not release the log space used for rebuilding the index. In other words, truncating the log may not reduce its size until the index is fully populated.
Deleting such a huge amount of data and rebuilding it will always consume I/O and log resources. If you can avoid deleting and repopulating the clustered table in the first place, you will significantly reduce the log growth, and you will not need to recreate the fulltext index either.
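If you do want to try reclaiming log space while the population is still running, here is a minimal sketch of the check-then-shrink approach (the database and log file names are placeholders, and this assumes the SIMPLE recovery model, where checkpoints truncate the log automatically):
-- Check whether anything is still preventing log truncation:
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'YourDatabase';

-- If log_reuse_wait_desc shows NOTHING, shrinking the log file back down is safe:
USE YourDatabase;
DBCC SHRINKFILE (N'YourDatabase_log', 1024);  -- target size in MB
If log_reuse_wait_desc reports something like ACTIVE_TRANSACTION or LOG_BACKUP instead, that is what is keeping the log from shrinking, regardless of how often you run the shrink.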

Related

How to get the space used by SQL Server back?

I created a table in SQL Server and inserted 135 million records into it.
Then I truncated it
and tried to re-insert the same 135 million records,
but something went wrong and I had to restart the computer.
The database came up in recovery mode,
which was then fixed.
The problem now is that the C drive has only 10GB free (210GB used), while before all this I had 105GB free!
I checked the folders, but the sum of all their sizes, including hidden ones, does not add up to 210GB.
What happened, and where have these GBs gone?
The space is not automatically released by the database. In the case of tempdb, restarting the SQL Server service reinitializes tempdb to its original size, but that is not the case for other databases.
If you want to reclaim the space, there are two approaches:
Cleaner approach:
As suggested by Paul Randal, move the existing tables to a new filegroup and then drop the old filegroup. Refer to his article:
Create a new filegroup
Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time (see the sketch below)
Drop the old filegroup that you were going to shrink anyway (or shrink it way down if it's the primary filegroup)
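A hedged sketch of that sequence (the database, filegroup, file, table, and index names are all placeholders):
-- 1. Create the new filegroup and give it a file:
ALTER DATABASE YourDatabase ADD FILEGROUP FG_New;
ALTER DATABASE YourDatabase
    ADD FILE (NAME = N'YourDatabase_New',
              FILENAME = N'D:\Data\YourDatabase_New.ndf',
              SIZE = 10GB)
    TO FILEGROUP FG_New;

-- 2. Rebuild the clustered index onto the new filegroup; DROP_EXISTING
--    moves the table's data and removes fragmentation in one pass:
CREATE UNIQUE CLUSTERED INDEX PK_BigTable
    ON dbo.BigTable (Id)
    WITH (DROP_EXISTING = ON)
    ON FG_New;

-- 3. Once everything has been moved, remove the old file and filegroup:
ALTER DATABASE YourDatabase REMOVE FILE YourDatabase_Old;
ALTER DATABASE YourDatabase REMOVE FILEGROUP FG_Old;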
Brute force approach and remediation
Here you can use DBCC SHRINKFILE to reduce the size. Leave a little extra space in the database to avoid frequent autogrowth. Beware that shrinking a file leads to index fragmentation, which you will have to deal with after the shrink (see the sketch after the quote below).
As Paul Randal recommends:
If you absolutely have no choice and have to run a data file shrink operation, be aware that you're going to cause index fragmentation and you might need to take steps to remove it afterwards if it's going to cause performance problems. The only way to remove index fragmentation without causing data file growth again is to use DBCC INDEXDEFRAG or ALTER INDEX … REORGANIZE. These commands only require a single 8KB page of extra space, instead of needing to build a whole new index in the case of an index rebuild operation (which will likely cause the file to grow).
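Putting both pieces together, a hedged sketch of the brute-force route (the file, database, and table names are placeholders):
USE YourDatabase;

-- Shrink the data file to a target size (in MB), leaving some headroom:
DBCC SHRINKFILE (N'YourDatabase_Data', 20480);

-- Then remove the fragmentation the shrink caused, without regrowing the file:
ALTER INDEX ALL ON dbo.BigTable REORGANIZE;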

Clustered columnstore index gets created at the beginning and vanishes once the job is completed

I have an existing application with many SQL Server stored procedures that run as below. These stored procs are applied to a data file, and computations are done according to some business rules.
1) Pre-process
2) Process
3) Post-Process
In pre-process, we create a number of tables, each with a clustered columnstore index in place. When the job kicks off, the tables get created with their clustered columnstore indexes, but the indexes vanish once the job is completed. (This happens only for a large input data file.)
When I run the job on a small data file, the clustered columnstore indexes get created on the tables and still exist after the job completes.
Note: the code is the same when I execute it for both small and large data files.
Can somebody share their thoughts on this if they have encountered a similar problem?
Two things will cause an already fully established index to 'vanish' from a table:
A process or user deletes it.
The transaction in which the index was created is rolled back - because an exception was raised later in the transaction, because the transaction wasn't recoverable, or via an explicit ROLLBACK.
And that's it. Your answer lies in one of the two above.
I know this is not the answer you were looking for; it is, however, guaranteed to be THE answer. Somewhere your code is failing, and that's why the indexes are vanishing.
SQL Server isn't a slapdash RDBMS - if it arbitrarily dropped indexes, you know we'd be all over it. By your own admission you have complicated code.
Our DataWarehouse routinely drops and rebuilds indexes of all sorts - the only times it's 'missing' them has been the result of a bug in our code.
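The rollback cause is easy to demonstrate. A minimal sketch (the table and index names are hypothetical):
-- The index is created inside a transaction...
BEGIN TRANSACTION;
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Staging ON dbo.Staging;

-- ...later work in the same transaction fails, and the rollback undoes
-- everything, including the index creation:
ROLLBACK TRANSACTION;

-- The index is gone again:
SELECT name
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Staging')
  AND type_desc = N'CLUSTERED COLUMNSTORE';
With a large input file, a late failure (a timeout or a full transaction log, say) is more likely, which would explain why only large files show the behaviour.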

SQL Server - Index maintenance for index with uniqueidentifier?

I have some unique non-clustered indexes with a uniqueidentifier (GUID) column. These indexes fragment heavily all the time.
How should I handle this with Ola Hallengren's maintenance script?
Should I skip reorganizing/rebuilding these indexes?
The problem is described here:
https://blogs.msdn.microsoft.com/sqlserverfaq/2011/08/30/another-reason-of-index-logical-fragmentation/
Here you have two options (very basic information):
DBCC DBREINDEX: locks up the tables, and users may not be able to access the data until the reindex is done. Bottom line - this drops the indexes and creates them from scratch. You have brand new indexes when this is done, so they are in the 'best state' possible. Again, it ties up the database tables. This is an all-or-nothing action: if you stop the process, everything has to roll back.
DBCC INDEXDEFRAG: does not lock up the tables as much, so users can still access the data. The indexes still exist; they are just being 'fixed'. If this is stopped, it doesn't roll everything back, so the index is simply left partially defragmented.
If you run DBREINDEX, you don't need to run INDEXDEFRAG - there's nothing to defrag when you have brand new indexes.
Hope this helps!
I think in this instance you should exclude these indexes from Ola Hallengren's maintenance script (see the sketch below). Also, GUIDs should not be part of any clustered index.
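A hedged sketch of what that exclusion can look like with Ola Hallengren's IndexOptimize procedure (the database, table, and index names are placeholders; the minus prefix in @Indexes excludes an index, as described in his documentation):
EXECUTE dbo.IndexOptimize
    @Databases = 'YourDatabase',
    -- Maintain all indexes except the constantly fragmenting GUID index:
    @Indexes = 'ALL_INDEXES, -YourDatabase.dbo.YourTable.IX_YourGuidIndex';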

Enable index without rebuild?

Using SQL Server 2012 Enterprise.
I have a table of 12 billion rows that takes 700GB on disk, in 30 partitions.
It has only one index, clustered.
I have 500 GB free disk space.
I disabled the index (please don't ask why. If you have to know, I targeted the wrong database).
I now want to enable the index. If I do
alter index x1 on t1 rebuild
I eventually get an error because there is not enough free disk space. That was a painful lesson about disk space requirements for rebuilding a clustered index.
Ideally, I want to rebuild the index one partition at a time. If I do
alter index x1 on t1 rebuild partition = 1
I get the error: Cannot perform the specified operation on disabled index.
Any solution, besides buying more physical disks? The table has not changed since disabling the index (it can't be accessed anyway), so I am really looking for a hack that can fool SQL Server into thinking the index is enabled. Any suggestions?
Thanks
If it's a clustered index that you have disabled, you have effectively disabled the table; the only operations you can execute on it are "drop" or "rebuild", as far as I am aware.
You could try the deprecated DBCC DBREINDEX command; maybe you are lucky and it rebuilds in a more disk-space-efficient way. You might also squeeze out some more space if you set the fill factor to 100 when you rebuild, assuming your table is now only being read.
DBCC DBREINDEX ('Person.Address', 'PK_Address_AddressID', 100)
allows you to reindex just the clustered index.
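Another hedged idea: if you retry the full rebuild, sorting in tempdb moves the intermediate sort space onto tempdb's drive instead of the data file's (the new index itself still needs room in the data file, and tempdb must have space for the sort):
ALTER INDEX x1 ON t1 REBUILD
    WITH (SORT_IN_TEMPDB = ON, FILLFACTOR = 100);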

How to reduce size of SQL Server table that grew from a datatype change

I have a table on SQL Server 2005 that was about 4GB in size
(about 17 million records).
I changed one of the fields from datatype char(30) to char(60). (There are 25 fields in total, most of which are char(10), so the char space adds up to about 300 bytes per row.)
This caused the table to double in size (over 9GB).
I then changed the char(60) to varchar(60) and ran a function to trim the extra whitespace out of the data (reducing the average length of the data in that field to about 15 characters).
This did not reduce the table size. Shrinking the database did not help either.
Short of actually recreating the table structure and copying the data over (that's 17 million records!), is there a less drastic way of getting the size back down again?
You have not cleaned or compacted any data, even with a "shrink database".
DBCC CLEANTABLE
Reclaims space from dropped variable-length columns in tables or indexed views.
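For example, a hedged sketch (the database and table names are placeholders; the third argument is the number of rows processed per transaction):
DBCC CLEANTABLE (N'YourDatabase', N'dbo.MyTable', 5000);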
However, if there is a clustered index, a simple index rebuild should also do it:
ALTER INDEX ALL ON dbo.Mytable REBUILD
A worked example from Tony Rogerson
Well, it's clear you're not getting any space back! :-)
When you changed your text fields to CHAR(60), they are all filled up to capacity with spaces. So ALL your fields are now really 60 characters long.
Changing that back to VARCHAR(60) won't help - the fields are still all 60 chars long....
What you really need to do is run RTRIM over all your fields to reduce them back to their trimmed length (as sketched below), and then do a database shrink.
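A hedged sketch of that update (the table and column names are placeholders; RTRIM rather than TRIM, since SQL Server 2005 has no TRIM function):
UPDATE dbo.MyTable
SET MyWideField = RTRIM(MyWideField);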
After you've done that, you need to REBUILD your clustered index in order to reclaim some of that wasted space. The clustered index is really where your data lives - you can rebuild it like this:
ALTER INDEX IndexName ON YourTable REBUILD
By default, your primary key is your clustered index (unless you've specified otherwise).
Marc
I know I'm not answering the question as you asked it, but have you considered archiving some of the data to a history table and working with fewer rows?
At first glance you might think you need all that data all the time, but when you actually sit down and examine it, there are cases where that's not true. Or at least I've been in that situation before.
I had a similar problem here SQL Server, Converting NTEXT to NVARCHAR(MAX) that was related to changing ntext to nvarchar(max).
I had to do an UPDATE MyTable SET MyValue = MyValue in order to get it to resize everything nicely.
This obviously takes quite a long time with a lot of records. There were a number of suggestions on how to do it better. The key one was a temporary flag indicating whether a row had been done or not, then updating a few thousand rows at a time in a loop until it was all done (see the sketch below). This meant I had "some" control over how much it was doing.
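A hedged sketch of that batching idea (the table, column, and flag names are hypothetical; the flag column would be added for the migration and dropped afterwards):
DECLARE @rows INT;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Re-write a batch of rows so they are stored at their new size:
    UPDATE TOP (5000) dbo.MyTable
    SET MyValue = MyValue,
        Converted = 1
    WHERE Converted = 0;

    SET @rows = @@ROWCOUNT;
END;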
On another note, though: if you really want to shrink the database as much as possible, it can help to switch the recovery model down to simple, shrink the transaction log, reorganize all the data in the pages, and then set it back to the full recovery model. Be careful, though - shrinking databases is generally not advisable, and if you reduce the recovery model of a live database, you are asking for something to go wrong.
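If you do go that route, a hedged sketch of the recovery-model flip (the database and log file names are placeholders; switching to SIMPLE breaks the log backup chain, so take a full backup after switching back):
ALTER DATABASE YourDatabase SET RECOVERY SIMPLE;
DBCC SHRINKFILE (N'YourDatabase_log', 1024);  -- target size in MB
ALTER DATABASE YourDatabase SET RECOVERY FULL;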
Alternatively, you could do a full table rebuild to ensure there's no extra data hanging around anywhere:
CREATE TABLE tmp_table(<column definitions>);
GO
INSERT INTO tmp_table(<columns>) SELECT <columns> FROM <table>;
GO
DROP TABLE <table>;
GO
EXEC sp_rename N'tmp_table', N'<table>';
GO
Of course, things get more complicated with identity, indexes, etc etc...
