Question about database transaction log - sql-server

I read the following statement:
SQL Server doesn’t write data immediately to disk. It is kept in a
buffer cache until this cache is full or until SQL Server issues a
checkpoint, and then the data is written out. If a power failure
occurs while the cache is still filling up, then that data is lost.
Once the power comes back, though, SQL Server would start from its
last checkpoint state, and any updates after the last checkpoint that
were logged as successful transactions will be performed from the
transaction log.
And a couple of questions arise:
What if the power failure happens after SQL Server issues a
checkpoint and before the buffer cache is actually written to
disk? Isn't the content in buffer cache permanently missing?
The transaction log is also stored as disk file, which is no
different from the actual database file. So how could we guarantee
the integrity of log file?
So, is it true that no real transaction ever exists? It's only a matter of probability.

The statement is correct in that modified data pages can stay in the cache, but it misses the vital point that SQL Server uses a technique called Write-Ahead Logging (WAL). Log records pass through a small log buffer, but they are flushed to the log file on disk before the commit is acknowledged, so a transaction is only considered complete once its log records have been written to the log.
http://msdn.microsoft.com/en-us/library/ms186259.aspx
In the event of a failure, the log is replayed as you mention, and it does not matter that the modified data pages were still in memory and never written to disk, since the log of their modification is stored and can be retrieved.
It is not true that there is no real transaction. Crash recovery replays the log under every recovery model, including simple. What the simple recovery model gives up is log backups, and with them the ability to restore to a point in time.
As for the integrity of the log file, it is the same as for the data file: keep a proper backup schedule and a proper restore-testing schedule. Do not just back up data and logs and assume the backups work.
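As a starting point for restore testing, here is a minimal sketch. The database name YourDatabase and the backup path are placeholders (assumptions), not anything from the question. Note that RESTORE VERIFYONLY only confirms that the backup set is readable; it does not replace an actual test restore onto another server.

```sql
-- YourDatabase and the file path are hypothetical placeholders.
-- WITH CHECKSUM makes the backup validate page checksums as it reads them.
BACKUP DATABASE YourDatabase
TO DISK = N'D:\Backups\YourDatabase_full.bak'
WITH CHECKSUM, INIT;

-- Confirms the backup set is complete and readable and rechecks the checksums.
-- It does not restore any data.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH CHECKSUM;
```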

What if the power failure happens after SQL Server issues a checkpoint and before the buffer cache is actually written to disk? Isn't the content in buffer cache permanently missing?
The checkpoint start and end are different records on the transaction log.
The checkpoint is marked as succeeded only after the end-of-checkpoint record has been written to the log and the LSN of the oldest active transaction (including the checkpoint itself) has been written into the database.
If the checkpoint fails to complete, recovery starts from the previous completed checkpoint and takes the data from the transaction log as necessary.
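The separate begin and end records can be observed directly. The sketch below assumes you are connected to a test database and are comfortable using fn_dblog, which is an undocumented function whose output may change between versions.

```sql
-- Run in the context of the database you want to inspect.
CHECKPOINT;  -- force a manual checkpoint

-- fn_dblog is undocumented; NULL, NULL means no start/end LSN filter.
-- Each checkpoint appears as a begin record and a separate end record.
SELECT [Current LSN], Operation, [Transaction ID]
FROM fn_dblog(NULL, NULL)
WHERE Operation IN (N'LOP_BEGIN_CKPT', N'LOP_END_CKPT')
ORDER BY [Current LSN];
```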
The transaction log is also stored as disk file, which is no different from the actual database file. So how could we guarantee the integrity of log file?
We can't, in any absolute sense. The data is simply stored in two places rather than one.
If someone steals your server with both data and log files on it, your transactions are lost.

Related

conditions when the Transaction logs are flushed into the Log File

Under which conditions are the T-logs flushed from the log cache to the log file on disk?
Does it happen after every commit, every 3 seconds, or only after a checkpoint?
And where are the dirty pages stored in SQL Server when memory is not big enough to hold the data in the buffer pool (in tempdb or in the respective databases)? And for how long is uncommitted data preserved in SQL Server, and where?
You are asking two separate questions, about:
1. The transaction log buffer
2. The buffer pool
Under which conditions are the T-logs flushed from the log cache to the log file on disk? Does it happen after every commit, every 3 seconds, or only after a checkpoint?
Consider the update statement below (some_table is a placeholder name):
UPDATE some_table SET id = 1
WHERE id = 2;
First, this modification is written to the transaction log buffer. SQL Server then writes the log records to disk before we get a successful commit. This is called write-ahead logging. These log flushes are not periodic: they happen at every commit, which for an autocommit statement like this one means once per statement.
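The same sequence with an explicit transaction makes the commit boundary visible. This is only a sketch: dbo.WalDemo is a hypothetical table created for the demonstration, and the comments assume the default fully durable commit behaviour (delayed durability, available in newer versions, relaxes it).

```sql
-- dbo.WalDemo is a hypothetical demo table, not part of the original question.
CREATE TABLE dbo.WalDemo (id INT NOT NULL);
INSERT INTO dbo.WalDemo (id) VALUES (2);

BEGIN TRANSACTION;
    -- The change is made to the page in the buffer pool and its log record
    -- is placed in the log buffer. Neither is necessarily on disk yet.
    UPDATE dbo.WalDemo SET id = 1 WHERE id = 2;
-- COMMIT returns only after the log records up to and including the commit
-- record have been written to the log file. The data page can stay dirty
-- in memory until a later checkpoint.
COMMIT TRANSACTION;

DROP TABLE dbo.WalDemo;
```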
And where are the dirty pages stored in SQL Server when memory is not big enough to hold the data in the buffer pool (in tempdb or in the respective databases)? And for how long is uncommitted data preserved in SQL Server, and where?
Consider the same update transaction, and suppose the update needs to touch three pages, one of which is not in the buffer pool. In this case, SQL Server reads the page from disk, places it in the buffer pool, and modifies it there. The page is now called a dirty page.
Dirty pages are flushed to disk when a checkpoint occurs. A checkpoint can be triggered by various conditions, as described in the link below:
https://msdn.microsoft.com/en-us/library/ms189573.aspx
A checkpoint is an internal process that writes all dirty (modified) pages from the buffer cache to physical disk. It also writes the log records from the log buffer to the physical log file.
A checkpoint always writes out all pages that have changed (known as being marked dirty) since the last checkpoint, or since the page was read in from disk. It doesn't matter whether the transaction that changed a page has committed or not – the page is written to disk regardless. The only exception is for tempdb, where data pages are not written to disk as part of a checkpoint.
A checkpoint is only done for tempdb when the tempdb log file reaches 70% full – this is to prevent the tempdb log from growing if at all possible (note that a long-running transaction can still essentially hold the log hostage and prevent it from clearing, just like in a user database).
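To see the effect of a checkpoint on dirty pages, one option is to count modified pages in the buffer pool before and after it. This is a rough sketch; it assumes you have permission to read server-level DMVs, and scanning sys.dm_os_buffer_descriptors can be slow on servers with a lot of memory.

```sql
-- Count dirty (modified) pages currently held in the buffer pool, per database.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS dirty_pages
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id
ORDER BY dirty_pages DESC;

-- Force a checkpoint in the current database, then rerun the query above:
-- the count for this database should drop.
CHECKPOINT;
```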
Conditions when the Transaction logs are flushed into the Log File:
The log writer is the process responsible for writing log records from the log cache to the log file.
The conditions under which the log buffer is flushed to disk include:
A session issues a commit or a rollback command.
The current log block fills up (a log block holds at most 60 KB).
After every checkpoint.
Whenever the log file becomes 70% full, which under the simple recovery model triggers an automatic checkpoint.
How often checkpoints occur also depends on the target recovery time.
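The target recovery time mentioned above is a per-database setting in SQL Server 2012 and later. The sketch below shows how it could be inspected and changed; YourDatabase is a placeholder name and 60 seconds is only an illustrative value, not a recommendation.

```sql
-- YourDatabase is a hypothetical placeholder.
-- A value of 0 means the database uses automatic checkpoints driven by the
-- server-wide 'recovery interval' setting; a non-zero value enables
-- indirect checkpoints with that target.
SELECT name, target_recovery_time_in_seconds
FROM sys.databases
WHERE name = N'YourDatabase';

-- Illustrative value only.
ALTER DATABASE YourDatabase
SET TARGET_RECOVERY_TIME = 60 SECONDS;
```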
Does it happen after every commit or after every 3 seconds or only
after checkpoint?
It happens after every commit and after every checkpoint.
When a checkpoint occurs for a user database, all dirty pages for that database are flushed to disk (as well as other operations). This does not happen for tempdb. Tempdb is not recovered in the event of a crash, so there is no need to force dirty tempdb pages to disk, except when the lazywriter process (part of the buffer pool) has to make space for pages from other databases. When you issue a manual CHECKPOINT in tempdb, all of its dirty pages are flushed, but for automatic checkpoints they are not.
How long the uncommitted data is preserved in SQL server and where?
SQL Server keeps uncommitted changes in the data and log files until the transaction is committed or rolled back.

Monitoring SQL Log File

After deploying a project on the client's machine, the SQL database log file has grown to 450 GB, although the database itself is smaller than 100 MB. The recovery model is set to Simple, and the transactions come from a Windows service that sends insert and update transactions every 30 seconds.
My question is: how can I find the reason for the log file growth?
I would like to know how to monitor the log file to find the exact transaction that causes the problem.
Should I debug the front end, or is there a way to expose the transactions that cause the log file growth?
Thank you.
Note that the simple recovery model does not allow log backups, since it keeps the least amount of log information and relies on CHECKPOINT to truncate the log. If this is a critical database, consider protecting the client by using the FULL recovery model. Yes, you have to use more space, but disk space is cheap, and you gain greater control over point-in-time recovery and over managing your log files. Trying to be concise:
A) A database in the simple recovery model only truncates its transaction log when a CHECKPOINT occurs.
B) Unfortunately, large or numerous uncommitted transactions, a running BACKUP, the creation of a SNAPSHOT, and LOG SCANs, among other things, will stop those checkpoints from truncating the log until they have completed.
Your current system relies on having the right version of your .bak file, which, depending on the size, may mean hours of potential loss.
In other words, the log is that ridiculous size because checkpoints are not able to truncate it often enough.
A little note on log files:
Foremost, log files are not automatically truncated every time a transaction is committed (otherwise, you would only have the last committed transaction to go back to). Under the full recovery model, taking frequent log backups ensures pertinent changes are kept (point in time) and lets log space be reused. SHRINKFILE then squeezes the log file to the smallest size available or to the size specified.
Use DBCC SQLPERF(logspace) to see how large your log file is and how much of it is in use. A full backup does not truncate the log by itself: under the full recovery model a log backup does, and under the simple recovery model a checkpoint does, in both cases leaving only the active portion of the log. (Do not confuse truncation with shrinking the file.)
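A short sketch of these checks follows. The logical file name YourDatabase_log and the 1024 MB target are assumptions for illustration; look up the real logical name first, and shrink only after the cause of the growth has been dealt with, otherwise the file will simply grow again.

```sql
-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- The same information for the current database only (SQL Server 2012+).
SELECT total_log_size_in_bytes / 1048576.0 AS log_size_mb,
       used_log_space_in_bytes / 1048576.0 AS used_log_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;

-- Find the logical name of the log file in the current database.
SELECT name, type_desc, size
FROM sys.database_files;

-- Shrink the log file to a target size in MB.
-- 'YourDatabase_log' and 1024 are hypothetical values.
DBCC SHRINKFILE (N'YourDatabase_log', 1024);
```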
Some suggestions on researching your transactions:
You can use the system views to see the most expensive cached plans, the most frequently used ones, and the active ones.
You can search the log file using an undocumented function, fn_dblog.
Pinal has great info on this topic in the article "Beginning Reading Transaction Log".
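One way fn_dblog could be used to look for the transactions that generate the most log is sketched below. Because the function is undocumented, treat the column names as observed behaviour rather than a contract, and be careful running it against a very large log, since it reads the whole active portion.

```sql
-- Run in the context of the affected database.
-- Groups the active log by transaction and lists the heaviest ones first.
SELECT TOP (20)
       [Transaction ID],
       COUNT(*)                 AS log_records,
       SUM([Log Record Length]) AS log_bytes
FROM fn_dblog(NULL, NULL)
GROUP BY [Transaction ID]
ORDER BY log_bytes DESC;
```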
If the file in question is a plain-text application log, it can grow very quickly, depending on your log levels and on how many errors and messages you receive. (A SQL Server transaction log is a binary file managed by the engine, so rotation does not apply to it.)
You need to rotate your logs with something like logrotate, although from your question it sounds like you're using Windows, so I am not sure what the solution for that would be.
The basics of log rotation are taking daily/weekly versions of the logs, and compressing them with gzip or similar and trashing the uncompressed version.
As it is text with a lot of repetition this will make the files very very small in comparison, and should solve your storage issues.
Log file space won't be reused if there is an open transaction. You can check what is preventing log space reuse with the query below:
SELECT log_reuse_wait_desc, database_id FROM sys.databases;
In your case, the database is set to simple recovery and is only 100 MB, but the log has grown to 450 GB, which is huge.
My theory is that some open transactions prevented log space reuse. The log file won't shrink back by itself once it has grown.
For now, you can run the query above to see what is preventing log space reuse at this moment. You can't go back in time to find out what prevented it earlier.
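If the reported wait reason is ACTIVE_TRANSACTION, the next step is usually to find the open transaction itself. The sketch below is one way to do that; YourDatabase is a placeholder name.

```sql
-- YourDatabase is a hypothetical placeholder.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'YourDatabase';

-- Reports the oldest active transaction in the named database, if any.
DBCC OPENTRAN (N'YourDatabase');

-- Lists sessions that currently hold an open transaction, oldest first,
-- so the owning application or login can be identified.
SELECT s.session_id,
       s.login_name,
       s.program_name,
       t.transaction_begin_time
FROM sys.dm_tran_active_transactions AS t
JOIN sys.dm_tran_session_transactions AS st
     ON st.transaction_id = t.transaction_id
JOIN sys.dm_exec_sessions AS s
     ON s.session_id = st.session_id
ORDER BY t.transaction_begin_time;
```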

Log suspend reason unknown

I am not a DBA but a programmer. Recently we have been getting LOG SUSPEND issue daily on our production. I am unable to catch the scenario as it is not reproducible on my local system.
A file when uploaded on production fails with log suspend while same file uploaded on local seems to work fine. Also, when the same file is uploaded again after some time it seems to work fine in production too.
Really confused as to why this is happening.
Log Suspend indicates that the transaction log is filling up, and may not be properly sized for the transaction rate you are supporting. Have the DBA/System Administrator add additional log device space to the database that is having issues. If possible, you may also want to break up any large transactions to lower the possibility of the log filling up.
As for a cause, it's very dependent on how the system is set up. First check the database settings.
sp_helpdb will print out the list of databases on the server, as well as any options that may be set for each database.
If you don't see trunc log on chkpt, then the database is set up for maximum recoverability: the log space will only free up after a backup is run, or after the transaction log is dumped. This allows for up-to-the-second recovery in the event of a failure, at the expense of using more log space.
If you DO see trunc log on chkpt, then the database will automatically truncate the log after a checkpoint occurs in the database. Checkpoints are issued by the database itself as part of routine processing, but the command can also be issued manually. If this option is set and the database still goes into log suspend, then you may have a transaction that did not properly close (whether by committing or rolling back). You can check the master..syslogshold table to find long-running transactions.
A third possibility is that if the system is using SAP/Sybase Replication Server, there is actually a secondary truncation point used as part of the replication processes. The system will not truncate the transaction log until after a transaction has been read by the RepAgent process, so this too can cause a system to go into log suspend.
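The checks described above might look like the following in isql. This is a sketch in Sybase ASE syntax; yourdb is a placeholder for the affected database name.

```sql
-- Sybase ASE syntax; yourdb is a hypothetical placeholder.
-- Shows the options set on the database, including "trunc log on chkpt".
sp_helpdb yourdb
go

-- Lists the oldest open transaction per database, oldest first.
-- A row that has been there for a long time points at the session (spid)
-- that is holding the log.
select spid, dbid, starttime, name
from master..syslogshold
order by starttime
go
```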

Writing to transaction log when log comes to full size

Let's say we have a database whose transaction log has an initial size of 100 MB and a maxsize of UNLIMITED.
SQL Server will write into the log sequentially from start to end. In one book I found the following passage:
When SQL Server reaches the end of the file as defined by the size
when it was set up, it will wrap around to the beginning again,
looking for free space to use. SQL Server can wrap around without
increasing the physical log file size when there is free virtual
transaction space. Virtual transaction log space becomes free when SQL
Server can write the data from the transaction log into the underlying
tables within the database.
The last part is really confusing to me. What does the last sentence mean? Does it mean that SQL Server overwrites old, committed transactions with new transactions?
As far as I know, that would not be the case, because all transactions must be present until a backup is done.
I don't know if I was clear enough; I will update the post if some explanation is needed.
This only applies to SIMPLE transaction logging:
Virtual transaction log space becomes free when SQL Server can write the data from the transaction log into the underlying tables within the database.
This means that once the changes have actually been written to the data files, their log records are no longer needed in the transaction log. At this point, a power outage or another catastrophic failure can no longer cause the transactions to be "lost", as they have already been persisted to disk.
There is no need to wait until a backup is done. However, if you need full point-in-time recovery, you would use FULL transaction logging, and in that case log records are not overwritten until they have been captured by a log backup.
A log record is no longer needed in the transaction log if all of the following are true:
The transaction of which it is part has committed.
The database pages it changed have all been written to disk by a checkpoint.
The log record is not needed for a backup (full, differential, or log).
The log record is not needed for any feature that reads the log (such as database mirroring or replication).
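The wrap-around behaviour can be observed through the virtual log files (VLFs) that make up the physical log. The sketch below assumes a recent SQL Server version where sys.dm_db_log_info is available; on older versions the undocumented DBCC LOGINFO command gives similar output.

```sql
-- Run in the context of the database you want to inspect.
-- One row per virtual log file, in physical order within the log file.
-- vlf_active = 1 means the VLF still holds log records that are needed;
-- inactive VLFs are the free space that SQL Server can wrap around into.
SELECT vlf_sequence_number,
       vlf_size_mb,
       vlf_active,
       vlf_status
FROM sys.dm_db_log_info(DB_ID())
ORDER BY vlf_begin_offset;
```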
Further reading:
https://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx
https://technet.microsoft.com/en-us/library/jj835093%28v=sql.110%29.aspx

Does the Full Recovery Model Generate Additional Transaction Logs?

I read some of Books Online about backup and recovery. One stupid question: if I use full database backups under the full recovery model, will the backup operation itself generate any additional transaction log on the source database server? Will a full restore operation generate additional transaction log on the destination database?
A more useful view of this might be to say that Full Recovery prevents the contents of the transaction log from being overwritten without some other action allowing them to be overwritten.
SQL Server will log most operations (bulk loads and a few others aside) and, when running in simple recovery mode, effectively makes the newly created log contents reusable once the transaction that created them has ended and a checkpoint has occurred. When running in Full Recovery mode, the contents of the transaction log are retained until marked as available to be overwritten. To mark them as available, one normally performs a transaction log backup; a full database backup on its own does not do this.
If there is no space in the transaction log and no log contents are marked as available to be overwritten, then SQL Server will attempt to increase the size of the log.
In practical terms, Full Recovery requires you to manage your transaction logs, generally by performing a transaction log backup every so often (every hour is probably a good rule of thumb if you have no SLA to work to or other driver to determine how often to do this).
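A minimal sketch of that routine is shown below. YourDatabase and the backup paths are placeholders (assumptions); in practice the log backup would be scheduled, for example as a SQL Server Agent job, and each run would write to a new file.

```sql
-- YourDatabase and the file paths are hypothetical placeholders.
ALTER DATABASE YourDatabase SET RECOVERY FULL;

-- A full backup is needed first: it starts the log backup chain.
BACKUP DATABASE YourDatabase
TO DISK = N'D:\Backups\YourDatabase_full.bak';

-- Run this on a schedule. Each log backup marks the backed-up portion
-- of the log as reusable, which is what keeps the log file from growing.
BACKUP LOG YourDatabase
TO DISK = N'D:\Backups\YourDatabase_log.trn';
```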
I'm not sure I completely understand your question, but here goes. Keeping your DB in Full Recovery mode can make your transaction logs grow to be very large. The trade-off is that you can restore to a point in time.
The reason the transaction logs are larger than normal is that ALL operations are fully logged. This includes operations that could otherwise be minimally logged, such as bulk loads and index creation.
If drive space is not a concern (and with drives being so inexpensive, it shouldn't be), this is the recommended backup approach.
