SQL Server - Delete statement increases the LOG size

I have a logging database and it is quite big - 400 GB. It has millions of rows.
I just ran a delete statement which took 2.5 hours and probably deleted millions of rows.
delete FROM [DB].[dbo].[table]
where [Level] not in ('info','error')
This database uses the simple recovery model, but when I ran the above statement the log file grew to 800 GB and crashed the server. Why does the log file grow for a database in the simple recovery model?
How can I avoid this in future?
Thanks for your time - RM

I bet you tried to run the whole delete in one transaction. Correct?
Once the transaction is complete, the log space can be reclaimed. Since the transaction never completed, the log file grew until it crashed the server.
Check out my blog entry on How to Delete Large Data.
http://craftydba.com/?p=3079
The key to the solution is the following: use the SIMPLE recovery model, DELETE in small batches, and take a FULL backup at the end of the purge. Set the recovery model back to whichever one you want at the end.
Here is some sample code to help you on your way.
--
-- Delete in batches in SIMPLE recovery mode
--
-- Select correct db
USE [MATH]
GO
-- Set to simple mode
ALTER DATABASE [MATH] SET RECOVERY SIMPLE;
GO
-- Get count of records
SELECT COUNT(*) AS Total FROM [MATH].[dbo].[TBL_PRIMES];
GO
-- Delete in batches
DECLARE @VAR_ROWS INT = 1;
WHILE (@VAR_ROWS > 0)
BEGIN
    DELETE TOP (10000) FROM [MATH].[dbo].[TBL_PRIMES];
    SET @VAR_ROWS = @@ROWCOUNT;
    CHECKPOINT;
END;
GO
-- Set to full mode
ALTER DATABASE [MATH] SET RECOVERY FULL;
GO
Last but not least, if the amount of data remaining after the delete is very small, it might be quicker to do the following.
1 - SELECT * INTO [Temp Table] FROM [Original Table] WHERE (clause that matches the small amount of data to keep).
2 - DROP [Original Table].
3 - Rename [Temp Table] to [Original Table].
4 - Add any constraints or missing objects.
The DROP TABLE action does not log every row being removed.
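Here is a rough sketch of that swap, using the table from the question; the new table name is illustrative and you still need to re-create everything that lived on the original table.
-- Sketch only: copy the small set of rows to keep, then swap the tables.
USE [DB];
GO
SELECT *
INTO [dbo].[table_keep]
FROM [dbo].[table]
WHERE [Level] IN ('info', 'error');   -- the rows you want to keep

DROP TABLE [dbo].[table];             -- deallocates pages, not logged row by row

EXEC sp_rename 'dbo.table_keep', 'table';
GO
-- Re-create any constraints, indexes, defaults and permissions here.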
Sincerely,
John

Consider using the open-source PowerShell module sqlsizer-mssql.
It's available on GitHub and it's published under the MIT license:
https://github.com/sqlsizer/sqlsizer-mssql
I think it could help you with your task. It has a "slow delete" feature.

Related

Stored procedure - truncate table

I've created a stored procedure to add data to a table. In mock fashion the steps are:
truncate original table
Select data into the original table
The query that selects data into the original table is quite long (it can take almost a minute to complete), which means that the table is then empty of data for over a minute.
To fix this empty table I changed the stored procedure to:
select data into #temp table
truncate Original table
insert * from #temp into Original
While the stored procedure was running, I did a select * on the original table and it was empty (refreshing, it stayed empty until the stored procedure completed).
Does the truncate happen at the beginning of the procedure no matter where it actually is in the code? If so is there something else I can do to control when the data is deleted?
A very interesting method to move data into a table very quickly is to use partition switching.
Create two staging tables, myStaging1 and myStaging2, with the new data in myStaging2. They must be in the same DB and the same filegroup (so not temp tables or table variables), with the EXACT same columns, PKs, FKs and indexes.
Then run this:
SET XACT_ABORT, NOCOUNT ON; -- force immediate rollback if session is killed
BEGIN TRAN;
ALTER TABLE myTargetTable SWITCH TO myStaging1
WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));
-- not strictly necessary to use WAIT_AT_LOW_PRIORITY but better for blocking
-- use SELF instead of BLOCKERS to kill your own session
ALTER TABLE myStaging2 SWITCH TO myTargetTable
WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));
-- force blockers off immediately
COMMIT TRAN;
TRUNCATE TABLE myStaging1;
This is extremely fast, as it's just a metadata change.
You will ask: partitions are only supported on Enterprise Edition (or Developer), how does that help?
Switching non-partitioned tables between each other is still allowed even in Standard or Express Editions.
See this article by Kendra Little for further info on this technique.
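As a minimal illustration (the columns here are made up; your real tables will have more), the staging tables have to mirror the target exactly and sit on the same filegroup:
-- Hypothetical target structure; myStaging1 and myStaging2 must match it exactly.
CREATE TABLE dbo.myStaging1
(
    Id      INT           NOT NULL,
    Payload NVARCHAR(200) NULL,
    CONSTRAINT PK_myStaging1 PRIMARY KEY CLUSTERED (Id)
) ON [PRIMARY];   -- same filegroup as myTargetTable

-- Repeat for dbo.myStaging2, load the new data into myStaging2,
-- then run the SWITCH script above.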
The sp is being called by code in an HTTP Get, so I didn't want the table to be empty for over a minute during refresh. When I asked the question I was using a select * from the table to test, but just now I tested by hitting the endpoint in postman and I never received an empty response. So it appears that putting the truncate later in the sp did work.

Reserving clean block of identity values in T-SQL for data migration

We're currently working on the following process whose goal is to move data between 2 sets of database servers while maintaining FK's and handling the fact that the destination tables already have rows with overlapping identity column values:
Extract a set of rows from a "root" table and all of its children tables' FK associated data n-levels deep along with related rows that may reside in other databases on the same instance from the source database server.
Place that extracted data set into a set of staging tables on the destination database server.
Rekey the data in the staging tables by reserving a block of identities for the destination tables and updating all related child staging tables (each of these staging tables will have the same schema as the source/destination table with the addition of an "lNewIdentityID" column).
Insert the data with its new identity into the destination tables in correct order (option SET IDENTITY_INSERT 'desttable' ON will be used obviously).
I'm struggling with the block reservation portion of this process (step 3). Our system is pretty much a 24-hour system except for a short weekly maintenance window. Management needs this process NOT to have to wait each week for the maintenance window to migrate data between servers. That being said, I may have 100 insert transactions competing with our migration process while it is on step 3. Below is my rough attempt at reserving the block of identities, but I'm worried that between "SET @newIdent..." and "DBCC CHECKIDENT..." an insert transaction will complete and the migration process won't have a "clean" block of identities in a known range that it can use to rekey the staging data.
I essentially need to lock the table, get the current identity, increase the identity, and then unlock the table. I don't know how to do that in T-SQL and am looking for ideas. Thank you.
IF EXISTS (SELECT TOP 1 1 FROM sys.procedures WHERE [name] = 'DataMigration_ReserveBlock')
    DROP PROC DataMigration_ReserveBlock
GO
CREATE PROC DataMigration_ReserveBlock (
    @tableName varchar(100),
    @blockSize int
)
AS
BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
    DECLARE @newIdent bigint;
    SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
    DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
    SELECT @newIdent AS NewIdentity;
END
GO
DataMigration_ReserveBlock 'tblAddress', 1234
You could wrap it in a transaction
BEGIN TRANSACTION
...
COMMIT
It should be fast enough not to cause problems with your other insert processes. Though it would be a good idea to include TRY/CATCH logic so you can roll back if problems do occur.
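For example, here is a sketch of that proc with the read-and-reseed wrapped in a transaction; the TABLOCKX trick and the TRY/CATCH are my additions and not tested against your workload (THROW needs SQL Server 2012+).
CREATE PROC DataMigration_ReserveBlock (
    @tableName varchar(100),
    @blockSize int
)
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE @newIdent bigint;

    BEGIN TRY
        BEGIN TRANSACTION;

        -- Hold an exclusive table lock for the duration of the transaction so
        -- no concurrent insert can consume an identity value between the read
        -- and the reseed. Dynamic SQL because the table name is a parameter;
        -- validate/QUOTENAME it in real code.
        EXEC (N'SELECT TOP (0) 1 FROM ' + @tableName + N' WITH (TABLOCKX, HOLDLOCK);');

        SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
        DBCC CHECKIDENT (@tableName, RESEED, @newIdent);

        COMMIT TRANSACTION;

        SELECT @newIdent AS NewIdentity;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        THROW;   -- requires SQL Server 2012+
    END CATCH
END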

DBCC SHRINKFILE on log file not reducing size even after BACKUP LOG TO DISK

I've got a database, [My DB], that has the following info:
SQL Server 2008
MDF size: 30 GB
LDF size: 67 GB
I wanted to shrink the log file as much as possible and so I started my quest to figure out how to do this. Caveat: I am not a DBA (or even approaching a DBA) and have been progressing by feel through this quest.
First, I just went into SSMS, DB properties, Files, and edited the Initial Size (MB) value to 10. That reduced the log file to 62 GB (not exactly the 10 MB that I entered). So, I attached SQL Profiler, saw that DBCC SHRINKFILE was being called. I then entered that command into the query editor and here's the results.
DBCC SHRINKFILE (N'My DB_Log' , 10)
And the output was:
Cannot shrink log file 2 (My DB_Log) because the logical log file located at the end of the file is in use.
DbId FileId CurrentSize MinimumSize UsedPages EstimatedPages
------ ----------- ----------- ----------- ----------- --------------
8 2 8044104 12800 8044104 12800
(1 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
I then did some research on that and found this:
http://support.microsoft.com/kb/907511
Which says that I need to backup the log file before the shrinkfile so that the virtual log files will be released and the shrinkfile can do its job - I don't know what that means... I'm just paraphrasing here :)
So, I figured I'd try to backup the log file and then do a DBCC SHRINKFILE (and I changed the new log file size to 12800 since that was the MinimumSize identified in the output of the previous DBCC SHRINKFILE command)
BACKUP LOG [My DB] TO DISK = 'D:\SQLBackup\20110824-MyDB-Log.bak'
GO
DBCC SHRINKFILE (N'My DB_Log' , 12800)
GO
The result was the same as the first go around. I can only get the log file down to 62 GB.
I'm not sure what I'm doing wrong and what I should try next.
Okay, here is a solution to reduce the physical size of the transaction log file, but without changing the recovery model to simple.
Within your database, locate the file_id of the log file using the following query.
SELECT * FROM sys.database_files;
In my instance, the log file is file_id 2. Now we want to locate the virtual logs in use, and do this with the following command.
DBCC LOGINFO;
Here you can see whether any virtual logs are in use by checking whether the status is 2 (in use) or 0 (free). When shrinking files, empty virtual logs are physically removed, starting at the end of the file, until the first in-use one is hit. This is why shrinking a transaction log file sometimes shrinks it part way but does not remove all free virtual logs.
If you notice status 2s that occur after 0s, this is blocking the shrink from fully shrinking the file. To get around this, take another transaction log backup and immediately run these commands, supplying the file_id found above and the size you would like your log file reduced to.
-- DBCC SHRINKFILE (file_id, LogSize_MB)
DBCC SHRINKFILE (2, 100);
DBCC LOGINFO;
This will then show the virtual log file allocation, and hopefully you'll notice that it's been reduced somewhat. Because virtual log files are not always allocated in order, you may have to backup the transaction log a couple of times and run this last query again; but I can normally shrink it down within a backup or two.
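As a rough sketch, the repeat cycle might look like this (the database name, file_id, target size, and backup paths are placeholders for your own values):
-- Back up the log, shrink, and inspect the VLFs; repeat if needed.
BACKUP LOG [My DB] TO DISK = N'D:\SQLBackup\MyDB-Log-1.trn';
DBCC SHRINKFILE (2, 100);   -- file_id 2, target 100 MB
DBCC LOGINFO;               -- check whether the in-use VLFs moved

-- If status-2 VLFs are still at the end of the file, back up and shrink again:
BACKUP LOG [My DB] TO DISK = N'D:\SQLBackup\MyDB-Log-2.trn';
DBCC SHRINKFILE (2, 100);
DBCC LOGINFO;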
In addition to the steps you have already taken, you will need to set the recovery mode to simple before you can shrink the log.
THIS IS NOT A RECOMMENDED PRACTICE for production systems... You will lose your ability to recover to a point in time from previous backups/log files.
See example B on this DBCC SHRINKFILE (Transact-SQL) msdn page for an example, and explanation.
Try this
ALTER DATABASE XXXX SET RECOVERY SIMPLE
use XXXX
declare @log_File_Name varchar(200)
select @log_File_Name = name from sysfiles where filename like '%LDF'
declare @i int = FILE_IDEX(@log_File_Name)
dbcc shrinkfile (@i, 50)
I use this script on sql server 2008 R2.
USE [db_name]
ALTER DATABASE [db_name] SET RECOVERY SIMPLE WITH NO_WAIT
DBCC SHRINKFILE([log_file_name]/log_file_number, wanted_size)
ALTER DATABASE [db_name] SET RECOVERY FULL WITH NO_WAIT
Paul Randal has an excellent discussion of this problem on his blog: http://www.sqlskills.com/blogs/paul/post/backup-log-with-no_log-use-abuse-and-undocumented-trace-flags-to-stop-it.aspx
I tried many ways, but this works.
Sample code is available in the DBCC SHRINKFILE documentation:
USE DBName;
GO
-- Truncate the log by changing the database recovery model to SIMPLE.
ALTER DATABASE DBName
SET RECOVERY SIMPLE;
GO
-- Shrink the truncated log file to 1 MB.
DBCC SHRINKFILE (DBName_log, 1); -- Use SELECT * FROM sys.database_files; to get the log file name
GO
-- Reset the database recovery model.
ALTER DATABASE DBName
SET RECOVERY FULL;
GO
I resolved this problem by taking a full backup followed by a transaction log backup. Sometimes the backup process has not completed, and that is one of the reasons the .ldf file does not shrink. Try this; it worked for me.
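A minimal sketch of that sequence (the database name, log file name, and backup paths are placeholders):
BACKUP DATABASE [My DB] TO DISK = N'D:\SQLBackup\MyDB-Full.bak';
GO
BACKUP LOG [My DB] TO DISK = N'D:\SQLBackup\MyDB-Log.trn';
GO
DBCC SHRINKFILE (N'My DB_Log', 12800);
GO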
One of our heavily transacted databases grows by a few hundred thousand records in a log table every day. There are multiple log files that grow by a few hundred GB every day.
We have a scheduled job that takes differential backup every half an hour. We have another scheduled job for housekeeping that runs early morning every day.
We do SHRINKFILE during the housekeeping after setting the RECOVERY to SIMPLE. We do take a full backup at the beginning and at the end of the process in order to overcome the issue of losing our ability to recover to a point in time from previous backups/log files. We use some flag in the database to make sure that the differential backup is not attempted until the housekeeping job is completed.
The brief outline is as below -
The House Keeping job:
Set the status to 'Housekeeping in progress'
Set the database to Single User mode
Take a full backup of the database
Delete records from various tables that are old
Set the database RECOVERY mode to SIMPLE
Iterate through the log files and shrink each of them (see the sketch after this outline)
Set the database RECOVERY mode to FULL
Take a full backup of the database
Set the database to Multi-User mode
Set the status to 'Housekeeping completed'
The differential backup job:
Proceed only if the status is 'Housekeeping completed'
Take a differential backup
It takes a while to complete but it gets our database tidy and fresh before the regular business starts in the morning. It has been working fine for us.
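A rough sketch of the "set SIMPLE, shrink each log file, set FULL" portion of that housekeeping step; the database name and target size are placeholders, and our actual job has more error handling around it.
USE [Logging];
ALTER DATABASE [Logging] SET RECOVERY SIMPLE;

DECLARE @fileName sysname;

DECLARE log_files CURSOR LOCAL FAST_FORWARD FOR
    SELECT name
    FROM sys.database_files
    WHERE type_desc = 'LOG';

OPEN log_files;
FETCH NEXT FROM log_files INTO @fileName;

WHILE @@FETCH_STATUS = 0
BEGIN
    DBCC SHRINKFILE (@fileName, 1024);   -- target size in MB, adjust as needed
    FETCH NEXT FROM log_files INTO @fileName;
END;

CLOSE log_files;
DEALLOCATE log_files;

ALTER DATABASE [Logging] SET RECOVERY FULL;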
Thanks to @user2630576 and @Ed.S.
the following worked a treat:
BACKUP LOG [database] TO DISK = 'D:\database.bak'
GO
ALTER DATABASE [database] SET RECOVERY SIMPLE
use [database]
declare @log_File_Name varchar(200)
select @log_File_Name = name from sysfiles where filename like '%LDF'
declare @i int = FILE_IDEX(@log_File_Name)
dbcc shrinkfile (@i, 50)
ALTER DATABASE [database] SET RECOVERY FULL

How to recover data from truncated table

While going through SQL Server interview questions in the book by Mr. Shiv Prashad Koirala, I learned that even after using the TRUNCATE TABLE command the data can be recovered.
Please tell me how we can recover data when it is deleted using the DELETE command, and how data can be recovered if it is deleted using the TRUNCATE command.
What I know is that when we use the DELETE command to delete records, an entry is made in the log file, but I don't know how to recover the data from it; and since I read that TRUNCATE TABLE does not make entries in the log, how can that also be recovered?
If you can give me a good link that shows how to do it practically, step by step, that would be a great help to me.
I have SQL Server 2008.
Thanks
If you use TRANSACTIONS in your code, TRUNCATE can be rolled back. If no transaction is used and the TRUNCATE operation is committed, it cannot be retrieved from the log file. TRUNCATE is a DDL operation and is not logged row by row in the log file.
DELETE and TRUNCATE can both be rolled back when surrounded by a TRANSACTION if the current session is not closed. If the TRUNCATE is written in the Query Editor surrounded by a TRANSACTION and the session is closed, it cannot be rolled back, but a DELETE can.
USE tempdb
GO
-- Create Test Table
CREATE TABLE TruncateTest (ID INT)
INSERT INTO TruncateTest (ID)
SELECT 1
UNION ALL
SELECT 2
UNION ALL
SELECT 3
GO
-- Check the data before truncate
SELECT * FROM TruncateTest
GO
-- Begin Transaction
BEGIN TRAN
-- Truncate Table
TRUNCATE TABLE TruncateTest
GO
-- Check the data after truncate
SELECT * FROM TruncateTest
GO
-- Rollback Transaction
ROLLBACK TRAN
GO
-- Check the data after Rollback
SELECT * FROM TruncateTest
GO
-- Clean up
DROP TABLE TruncateTest
GO
By default none of these two can be reverted but there are special cases when this is possible.
Truncate: when truncate is executed SQL Server doesn’t delete data but only deallocates pages. This means that if you can still read these pages (using query or third party tool) there is a possibility to recover data. However you need to act fast before these pages are overwritten.
Delete: If the database is in full recovery mode then all transactions are logged in the transaction log. If you can read the transaction log, you can in theory figure out what the previous values of all affected rows were and then recover the data.
Recovery methods:
One method is using SQL queries similar to the one posted here for truncate, or using functions like fn_dblog to read the transaction log.
Another is to use third-party tools such as ApexSQL Log, SQL Log Rescue, ApexSQL Recover, or Quest Toad.
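For the DELETE case, here is a rough sketch of peeking at the log with fn_dblog; it is undocumented, so the columns and behaviour can change between versions, and the table name is a placeholder.
-- Exploratory only: fn_dblog is undocumented and its output can change.
SELECT [Current LSN],
       [Transaction ID],
       Operation,
       AllocUnitName,
       [RowLog Contents 0]   -- raw row image of the deleted row; decoding it is manual work
FROM sys.fn_dblog(NULL, NULL)
WHERE Operation = 'LOP_DELETE_ROWS'
  AND AllocUnitName LIKE 'dbo.Student%';   -- placeholder table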
SQL Server keeps an entry (page # and file #) for the truncated records, and you can easily browse those records with the query below.
Once you get the page ID and file ID, you can pass them to DBCC PAGE to retrieve the complete record (see the sketch after the query).
SELECT LTRIM(RTRIM(Replace([Description],'Deallocated',''))) AS [PAGE ID]
,[Slot ID],[AllocUnitId]
FROM sys.fn_dblog(NULL, NULL)
WHERE
AllocUnitId IN
(Select [Allocation_unit_id] from sys.allocation_units allocunits
INNER JOIN sys.partitions partitions ON (allocunits.type IN (1, 3)
AND partitions.hobt_id = allocunits.container_id) OR (allocunits.type = 2
AND partitions.partition_id = allocunits.container_id)
Where object_id=object_ID('' + 'dbo.Student' + ''))
AND Operation IN ('LOP_MODIFY_ROW') AND [Context] IN ('LCX_PFS')
AND Description Like '%Deallocated%'
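And here is a sketch of the DBCC PAGE call for one of the returned pages; the database, file, and page numbers below are made-up values, and the query above reports the page in file:page form (hex), so convert it first.
-- Send DBCC PAGE output to the client session instead of the error log.
DBCC TRACEON (3604);

-- DBCC PAGE ({'dbname' | dbid}, file_id, page_id, print_option)
-- print_option 3 dumps each row with per-column values.
DBCC PAGE ('LoggingDB', 1, 280, 3);   -- placeholder database/file/page values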
Given below is the link to an article that explains how to recover truncated records from SQL Server.
http://raresql.com/2012/04/08/how-to-recover-truncated-data-from-sql-server-without-backup/
If your database is in full recovery mode, you can recover data whether it was truncated, deleted, or dropped.
A complete step-by-step article is here: https://codingfry.blogspot.com/2018/09/how-to-recover-data-from-truncated.html

SQL Server: chunking deletes still fills up transaction log; on fail all deletes are rolled back - why?

Here is my scenario: we have a database, let's call it Logging, with a table that holds records from Log4Net (via MSMQ). The database's recovery model is set to Simple: we don't care about the transaction logs -- they can roll over.
We have a job that uses data from sp_spaceused to determine if we've met a certain size threshold. If the threshold is exceeded, we determine how many rows need to be deleted to bring the size down to x percent of that threshold. (As an aside, I'm using exec sp_spaceused MyLogTable, TRUE to get the number of rows and a rough approximation of their average size, although I'm not convinced that's the best way to go about it. But that's a different issue.)
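Roughly, the size check captures the sp_spaceused output like this (a sketch; the 'reserved' figure comes back as text with a ' KB' suffix, so it needs cleaning up before the math):
DECLARE @space TABLE
(
    [name]       sysname,
    [rows]       bigint,
    [reserved]   varchar(50),
    [data]       varchar(50),
    [index_size] varchar(50),
    [unused]     varchar(50)
);

INSERT INTO @space
EXEC sp_spaceused @objname = N'dbo.MyLogTable', @updateusage = 'TRUE';

SELECT [rows],
       CAST(REPLACE([reserved], ' KB', '') AS bigint) * 1024 / NULLIF([rows], 0)
           AS approx_bytes_per_row
FROM @space;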
I then try to chunk deletes (say, 5000 at a time) by looping a call to a sproc that basically does this:
DELETE TOP (@RowsToDelete) FROM [dbo].[MyLogTable]
until I've deleted what needs to be deleted.
Here's the issue: If I have a lot of rows to delete, the transaction log file fills up. I can watch it grow by running
dbcc sqlperf (logspace)
What puzzles me is that, when the job fails, ALL deleted rows get rolled back. In other words, it appears all the chunks are getting wrapped (somehow) in an implicit transaction.
I've tried expressly setting implicit transactions off, wrapping each DELETE statement in a BEGIN and COMMIT TRAN, but to no avail: either all deleted chunks succeed, or none at all.
I know the simple answer is, Make your log file big enough to handle the largest possible number of records you'd ever delete, but still, why is this being treated as a single transaction?
Sorry if I missed something easy, but I've looked at a lot of posts regarding log file growth, recovery modes, etc., and I can't figure this out.
One other thing: Once the job has failed, the log file stays up at around 95 - 100 percent full for a while before it drops back. However, if I run
checkpoint
dbcc dropcleanbuffers
it drops right back down to about 5 percent utilization.
TIA.
The log file in the simple recovery model is truncated automatically at every checkpoint, generally speaking. You can invoke a checkpoint manually as you do at the end of the loop, but you can also do it every iteration. The frequency of automatic checkpoints is determined by SQL Server based on the recovery interval setting.
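For instance, something along these lines (a sketch with a placeholder table name):
-- Delete in chunks and checkpoint each iteration so the simple-recovery
-- log can be truncated as you go.
DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) FROM [dbo].[MyLogTable];   -- placeholder table

    IF @@ROWCOUNT = 0 BREAK;

    CHECKPOINT;   -- allow log truncation between batches
END;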
As for the 'all deletes are rolled back' behaviour, I don't see any explanation other than an external transaction. Can you post the entire code that cleans up the log? How do you invoke this code?
What is your setting of IMPLICIT_TRANSACTIONS?
Hmm, if the log grows and doesn't truncate automatically, that may also indicate there is a transaction running outside of the loop. Can you select @@TRANCOUNT before your loop, and perhaps with each iteration, to find out what's going on?
Well, I tried several things, but still all deletes get rolled back. I added printing of @@TRANCOUNT both before and after the delete and I get zero as the count. Yet, on failure, all deletes are rolled back. I added SET IMPLICIT_TRANSACTIONS OFF in several places (including within my initial call from Query Analyzer), but that does not seem to help. This is the body of the stored procedure that is being called (I have set @RowsToDelete to 5000 and 8000):
SET NOCOUNT ON;
print N'@@TRANCOUNT PRIOR TO DELETE: ' + CAST(@@TRANCOUNT AS VARCHAR(20));
set implicit_transactions off;
WITH RemoveRows AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY [Date] ASC) AS RowNum
    FROM [dbo].[Log4Net]
)
DELETE FROM RemoveRows
WHERE RowNum < @RowsToDelete + 1
print N'@@TRANCOUNT AFTER DELETE: ' + CAST(@@TRANCOUNT AS VARCHAR(20));
It is called from this t-sql:
WHILE @RowsDeleted < @RowsToDelete
BEGIN
    EXEC [dbo].[DeleteFromLog4Net] @RowsToDelete
    SET @RowsDeleted = @RowsDeleted + @RowsToDelete
    SET @loops = @loops + 1
    print 'Loop: ' + cast(@loops as varchar(10))
END
I have to admit I am puzzled. I am not a DB guru, but I thought I understood enough to figure this out....
