Can a large transaction log cause CPU spikes to occur - sql-server

I have a client with a very large database on SQL Server 2005. The total space allocated to the db is 15 GB, with roughly 5 GB for the data file and 10 GB for the transaction log. Just recently, a web application that connects to that db has started timing out.
I have traced the actions on the web page and examined the queries that execute while these web operations are performed. There is nothing untoward in the execution plan.
The query itself uses multiple joins but completes very quickly. However, the db server's CPU spikes to 100% for a few seconds. The issue occurs when several simultaneous users are working on the system (when I say multiple, read about 5). Under this load, timeouts start to occur.
I suppose my question is: can a large transaction log cause issues with CPU performance? There is about 12 GB of free space on the disk currently. The configuration is a little out of my hands, but the db and log are both on the same physical disk.
I appreciate that the log file is massive and needs attending to, but I'm just looking for a heads up as to whether this may cause CPU spikes (i.e. trying to find the correlation). The timeouts are a recent thing and this app has been responsive for a few years (i.e. it's a recent manifestation).
Many Thanks,

It's hard to say exactly given the lack of data, but such spikes are commonly observed around transaction log checkpoints.
A checkpoint is the process of writing data pages that have been modified in memory (and whose changes are recorded sequentially in the transaction log) out to the actual data files.
This involves a lot of I/O plus some CPU work, and may be the reason for the CPU activity spikes.
Normally, an automatic checkpoint occurs when the transaction log is about 70% full (in simple recovery) or when SQL Server estimates that recovery (reapplying the log) would otherwise take longer than the configured recovery interval (1 minute by default).
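If you want to see whether the spikes line up with checkpoint and log-flush activity, here is a minimal sketch against the performance-counter DMV (the counter names are standard, but the values are cumulative, so sample twice during a spike and compare the difference):

SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Checkpoint pages/sec', 'Log Flushes/sec');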

Your first priority should be to address the transaction log size. Is the DB being backed up correctly, and how frequently? Address these issues and then see if the CPU spikes go away.
CHECKPOINT is the process of applying the changes described in your transaction log to the DB file; if the transaction log is HUGE, it makes sense that it could have an effect.
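One quick way to check how (and how often) the database and its log are being backed up is the msdb backup history; a minimal sketch:

SELECT database_name, type, MAX(backup_finish_date) AS last_backup
FROM msdb.dbo.backupset
GROUP BY database_name, type;   -- type 'D' = full, 'L' = log, 'I' = differential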

You could try increasing the autogrowth increment: Kimberly Tripp suggests an autogrowth of around 500 MB for transaction logs measured in GBs:
http://www.sqlskills.com/blogs/kimberly/post/8-Steps-to-better-Transaction-Log-throughput.aspx
(see point 7)
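If you do change it, a minimal sketch (the database name MyDb and the logical log file name MyDb_log are hypothetical; substitute your own):

ALTER DATABASE MyDb
MODIFY FILE (NAME = MyDb_log, FILEGROWTH = 500MB);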

While I wouldn't be surprised if a log that size was causing a problem, there are other things it could be as well. Have the statistics been updated lately? Are the spikes happening when some automated job is running? Is there a clear time pattern to when you have the spikes - then look at what else is running. Did you load a new version of anything on the server about the time the spikes started happening?
In any event, the transaction log needs to be fixed. The reason it is so large is that it is not being backed up (or not backed up frequently enough). It is not enough to back up the database; you must also back up the log. We back ours up every 15 minutes, but ours is a highly transactional system and we cannot afford to lose data.
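A log backup in its simplest form looks like the statement below (database name and backup path are hypothetical; in practice you would schedule this every 15 minutes or so via a SQL Agent job or maintenance plan):

BACKUP LOG MyDb TO DISK = N'D:\Backups\MyDb_log.trn';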

Related

MSSQL Recurring Transaction Log full. Need to know who causes it

MSSQL V18.7.1
The transaction log on the databases is backed up every hour.
The log file for this database auto-grows in 128 MB increments, with a maximum size of 5 GB.
This runs smoothly but sometimes we do get an error in our application:
"The transaction log for database 'Borculo' is full due to 'LOG_BACKUP'."
We got this message at 8:15 AM, while the log backup had run (and emptied the log) at 8:01 AM.
I would really like to have a script or command to check what caused this sudden growth.
We could back up more often (every 30 minutes) or change the size, but that would not solve the underlying problem.
Basically this problem should not occur with the number of transactions we have.
Probably some task is running (in our ERP) which causes this.
This does not happen every day but in the last month this is the 2nd time.
The transaction log I want to get the information from is a backed-up one, not the active one.
Can anyone point me in the right direction?
Thanks
An hourly transaction log backup means in case of a disaster you could lose up to an hour's worth of data.
It is usually advised to keep your transaction log backups as frequent as possible.
Every 15 minutes is usually a good starting point, but if it is a business-critical database, consider a transaction log backup every minute.
Also, why would you limit the size of your transaction log file? If you have more space available on the disk, allow the file to grow if it needs to.
It is possible that the transaction log file is getting full because some maintenance task is running (index/statistics maintenance, etc.), and because the log file is not backed up for an entire hour, the log doesn't get truncated for an hour and the file reaches its 5 GB limit. Hence the error message.
Things I would do to sort this out:
Remove the file size limit, or at least increase the limit to allow it to grow bigger than 5 GB.
Take transaction log backups more frequently, maybe every 5 minutes.
Set the log file growth increment to at least 1 GB instead of 128 MB (to reduce the number of VLFs).
Monitor closely what is running on the server when the log file gets full; it is very likely to be a maintenance task (or maybe a bad hung connection). The sketch below can help identify which sessions are generating the log.
Instead of setting a max limit on the log file size, set up some alerts to inform you when the log file is growing too much. This will allow you to investigate the issue without any interference or even potential downtime for the end users.
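A minimal sketch for the monitoring step, using the transaction DMVs to see which open transactions are currently using the most log space (run it while the log is filling up; 'Borculo' is the database from the question):

SELECT st.session_id,
       dt.database_transaction_begin_time,
       dt.database_transaction_log_bytes_used
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions AS st
  ON st.transaction_id = dt.transaction_id
WHERE dt.database_id = DB_ID('Borculo')
ORDER BY dt.database_transaction_log_bytes_used DESC;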

SQL Server ROLLBACK transaction took forever. Why?

We have a huge DML script that opens a transaction, performs a lot of changes, and only then commits.
Recently I triggered this script (through an app), and as it was taking quite a long time, I killed the session, which triggered a ROLLBACK.
The problem is that this ROLLBACK took forever, and moreover it was hogging a lot of CPU (100% utilization). As I was monitoring the session (using the exec DMVs), I saw a lot of IO-related waits (IO_COMPLETION, PAGEIOLATCH_* etc.).
So my question is:
1. Why does a rollback take so much time? Is it because it needs to write every reverting change to the LOG file? And could the IO waits I saw be related to IO operations against this LOG file?
2. Are there any online resources that explain how the ROLLBACK mechanism works?
Thank You
Based on another article on the DBA side of Stack Exchange, ROLLBACKs are slower for at least two reasons: one, the original SQL is capable of being multithreaded, whereas the rollback is single-threaded; and two, a commit simply confirms work that is already complete, whereas the rollback must not only identify each log action to reverse but then go back and touch the impacted rows.
https://dba.stackexchange.com/questions/5233/is-rollback-a-fast-operation
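As a side note, if you ever need to see how far a rollback has progressed, here is a minimal sketch (session id 53 is just an example):

KILL 53 WITH STATUSONLY;   -- reports estimated completion for a session that is already rolling back

SELECT session_id, command, percent_complete, estimated_completion_time
FROM sys.dm_exec_requests
WHERE session_id = 53;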
This is what I have found out about why a ROLLBACK operation in SQL Server can be time-consuming and why it can produce a lot of IO.
Background Knowledge (Open Tran/Log mechanism):
When a lot of changes to the DB are written as part of an open transaction, these changes modify the data pages in memory (dirty pages), and the log records generated (grouped into structures called log blocks) are initially written to the log buffer, in memory. The dirty pages are flushed to disk either by the recurring checkpoint operation or by the lazy-writer process. In accordance with SQL Server's write-ahead logging mechanism, before a dirty page is flushed, the log records describing its changes must be flushed to disk as well.
Keeping this background knowledge in mind: when a transaction is rolled back, the operation is almost like recovery, where all the changes that were written to disk have to be undone. The heavy IO we were experiencing probably happened because of this, as there were lots of data changes that had to be undone.
Information Source: https://app.pluralsight.com/library/courses/sqlserver-logging/table-of-contents
This course has a very deep and detailed explanation of how logging and recovery work in SQL Server.

Monitoring SQL Log File

After deploying a project on the client's machine, the SQL db log file has grown up to 450 GB, although the db size is less than 100 MB. The recovery model is set to Simple, and the transactions are sent from a Windows service that issues insert and update transactions every 30 seconds.
My question is: how can I find out the reason for the db log file growth?
I would like to know how to monitor the log file to find out exactly which transactions are causing the problem.
Should I debug the front end, or is there a way to expose the transactions that cause the db log file to grow?
Thank you.
Note that the simple recovery model does not allow for log backups, since it keeps the least amount of information and relies on CHECKPOINT, so if this is a critical database, consider protecting the client with the FULL recovery model. Yes, you have to use more space, but disk space is cheap, and you gain point-in-time recovery and greater control over your log files. Trying to be concise:
A) In Simple mode your database will only truncate the transaction log when a CHECKPOINT occurs.
B) Unfortunately, long-running/uncommitted transactions, as well as things like BACKUP, snapshot creation, and log scans, will prevent that truncation, and the log will keep growing until those operations complete.
Your current system relies on having the right .bak file, which depending on the backup schedule may mean hours of potential data loss.
In other words, the log is that ridiculous size because the database is not able to truncate it at CHECKPOINT often enough....
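If the client is moved to full recovery, a minimal sketch of the switch and of starting the log backup chain (database name and paths are hypothetical):

ALTER DATABASE ClientDb SET RECOVERY FULL;

-- a full backup is required before log backups can begin
BACKUP DATABASE ClientDb TO DISK = N'D:\Backups\ClientDb.bak';
BACKUP LOG ClientDb TO DISK = N'D:\Backups\ClientDb_log.trn';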
a little note on log files
Foremost, log files are not automatically truncated every time a transaction is committed (otherwise, you would only have the last committed transaction to go back to). Taking frequent log backups will ensure pertinent changes are kept (point in time), and SHRINKFILE will squeeze the log file down to the smallest size available/the size specified.
Use DBCC SQLPERF(LOGSPACE) to see how large your log file is and how much of it is in use. Once you take a log backup (under the full recovery model), the log will be truncated down to the remaining uncommitted/active transactions. (Do not confuse this with shrinking the file.)
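For example (the logical file name MyDb_log and the 1 GB target are hypothetical; only shrink after the log has actually been backed up and truncated):

DBCC SQLPERF(LOGSPACE);           -- log size and percent used for every database
DBCC SHRINKFILE (MyDb_log, 1024); -- target size in MB, run in the database that owns the file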
Some suggestions on researching your transactions:
You can use the system DMVs to see the most expensive, most frequently executed, and currently active query plans.
You can search the log file using an undocumented table-valued function, fn_dblog.
Pinal Dave has great info on this topic that you can read at this webpage and link:
Beginning Reading Transaction Log
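A minimal sketch of using fn_dblog against the current database (it is undocumented and unsupported, so treat the output strictly as a diagnostic aid):

SELECT TOP (100) [Current LSN], Operation, Context, [Transaction ID], AllocUnitName
FROM fn_dblog(NULL, NULL);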
A Log File is text, and depending on your log levels and how many errors and messages you receive these files can grow very quickly.
You need to rotate your logs with something like logrotate although from your question it sounds like you're using Windows so not sure what the solution for that would be.
The basics of log rotation are taking daily/weekly versions of the logs, and compressing them with gzip or similar and trashing the uncompressed version.
As it is text with a lot of repetition this will make the files very very small in comparison, and should solve your storage issues.
Log file space won't be reused if there is an open transaction. You can check the reason log space cannot be reused with the DMV below:
SELECT log_reuse_wait_desc, database_id FROM sys.databases;
In your case, your database is set to simple recovery and the data is only 100 MB, but the log has grown up to 450 GB, which is huge.
My theory is that there may be some open transaction that prevented log space reuse; the log file won't shrink back on its own once it has grown.
As of now, you can run the above DMV to see what is preventing log space reuse at this point; you can't go back in time to find out what prevented it earlier.

Huge transaction in SQL Server, are there any problems?

I have a program which does many bulk operations on a SQL Server 2005 or 2008 database (drops and creates indexes, adds columns, full-table updates, etc.), all in one transaction.
Are there any problems to be expected?
I know that the transaction log expands even in Simple recovery mode.
This program is not executed during normal operation of the system, so locking and concurrency is not an issue.
Are there other reasons to split the transaction into smaller steps?
In short,
Using smaller transactions provides more robust recovery from failure.
Long transactions may also unnecessarily hold locks on objects for extended periods of time that other processes may require access to i.e. blocking.
Consider that if at any point between the time the transaction started and finished your server experienced a failure, then in order to bring the database online SQL Server would have to perform the crash recovery process, which would involve rolling back all uncommitted transactions from the log.
Suppose you developed a data processing solution that is intelligent enough to pick up from where it left off. With a single transaction this would not be an option available to you, because you would need to start the process from the beginning once again.
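A minimal sketch of what "smaller steps" can look like for the full-table updates mentioned in the question (the table dbo.BigTable and the Processed flag are hypothetical):

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;

    -- each batch commits on its own, so a failure only rolls back the batch in flight
    UPDATE TOP (5000) dbo.BigTable
    SET    Processed = 1
    WHERE  Processed = 0;

    SET @rows = @@ROWCOUNT;
    COMMIT TRANSACTION;
END;

If the process fails partway through, only the current batch is rolled back and the job can resume from where it stopped.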
If the transaction generates too many log records, the log itself becomes the limiting factor. SQL Server reserves additional log space as the transaction runs so that a rollback can always be logged, which means a very large transaction can need considerably more log space than the changes themselves. If the log reaches its maximum size (or the disk fills) mid-transaction, the statement fails with error 9002 (the transaction log is full) and the whole transaction is rolled back.
Rolling back at that point consumes roughly as much time and I/O as the original work did, and until the rollback finishes and the log is truncated, that log space cannot be reused.
It isn't really a problem until you run out of disk space, but you'll find that rollback will take a long time. I'm not saying to plan for failure of course.
However, consider the process rather than the transaction log as such. I'd consider separating:
DDL into a separate transaction
Bulk loading of staging tables into another transaction
Flushing data from staging to the final tables in a third transaction
If something goes wrong I'd hope that you have rollback scripts and/or a backup.
Is there really a need to do everything atomically?
Depending on the complexity of your update statements, I'd recommend doing this only on small tables of, say, a few hundred rows, especially if you have only a small amount of main memory available. Otherwise updates on big tables can take a very long time and even appear to hang, and then it's difficult to figure out what the process (spid) is doing and how long it might take.
I'm not sure whether "Drop index" is a logged operation anyway. See this question here on stackoverflow.com.

Does the transaction log drive need to be as fast as the database drive?

We are telling our client to put the SQL Server database file (mdf) on a different physical drive than the transaction log file (ldf). The tech company (hired by our client) wanted to put the transaction log on a slower (i.e. cheaper) drive than the database drive, because with transaction logs you are just sequentially writing to the log file.
I told them that I thought the log drive (actually a RAID configuration) needed to be fast as well, because every data-changing call to the database needs to be saved there, as well as to the database itself.
After saying that, though, I realized I was not entirely sure. Does the speed of the transaction log drive make a significant difference in performance if the drive with the database is fast?
The speed of the log drive is the most critical factor for a write-intensive database. No update can occur faster than the log can be written, so your drive must support the maximum update rate you experience at a spike, and all updates generate log. Database file (MDF/NDF) writes can afford slower write rates because of two factors:
data updates are written out lazily and flushed on checkpoint, which means an update spike can be amortized over the average drive throughput
multiple updates can accumulate on a single page and thus need only one single write
So you are right that the log throughput is critical.
But at the same time, log writes have a specific pattern: they are sequential, since the log is always appended at the end. All mechanical drives have a much higher throughput, for both reads and writes, for sequential operations, since they involve less physical movement of the disk heads. So it is also true, as your ops guys say, that a slower drive can in fact offer sufficient throughput (a quick way to check the actual log write latency is sketched after the warnings below).
But all these come with some big warnings:
the slower drive (or RAID combination) must truly offer high sequential throughput
the drive must see log writes from one and only one database, and nothing else. Any other operation that could interfere with the current disk head position will damage your write throughput and result in slower database performance
the log must be write-only, not read. Keep in mind that certain components need to read from the log, and they will move the disk heads to other positions so they can read back the previously written log:
transactional replication
database mirroring
log backup
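A quick way to sanity-check whether the log drive is keeping up is to look at the write stalls SQL Server records per file; a minimal sketch (write latency consistently above a few milliseconds on the log file is usually worth investigating):

SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name                  AS log_file,
       vfs.num_of_writes,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
WHERE mf.type_desc = 'LOG';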
In simplistic terms, if you are talking about an OLTP database, your throughput is determined by the speed of your writes to the transaction log. Once this performance ceiling is hit, all other dependent actions must wait for the commit to the log to complete.
This is a VERY simplistic take on the internals of the Transaction Log, to which entire books are dedicated, but the rudimentary point remains.
Now if the storage system you are working with can provide the IOPS that you require to support both your Transaction Log and Database data files together then a shared drive/LUN would provide adequately for your needs.
To provide you with a specific recommended course of action I would need to know more about your database workload and the performance you require your database server to deliver.
Get your hands on the title SQL Server 2008 Internals to get a thorough look into the internals of the SQL Server transaction log, it's one of the best SQL Server titles out there and it will pay for itself in minutes from the value you gain from reading.
Well, the transaction log is the main structure that provides ACID and can be a big bottleneck for performance, and if you do backups regularly its required space has an upper limit, so I would put it on a safe, fast drive with just enough space plus a bit of margin.
The transaction log should be on the fastest drives. If SQL Server can just complete the write to the log, it can do the rest of the transaction in memory and let it hit disk later.
