I have studied how durability is achieved in databases, and if I understand correctly it works like this (simplified):
Client's point of view:
start transaction.
insert into table values...
commit transaction
DB engine point of view:
write transaction start indicator to log file
write changes done by client to log file
write transaction commit indicator to log file
flush log file to HDD (this ensures durability of data)
return 'OK' to client
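The engine-side steps above can be sketched as follows (a toy illustration in Python, not how any real engine is written; the log path and record format are made up):

```python
import os
import tempfile

LOG_PATH = os.path.join(tempfile.mkdtemp(), "tx.log")  # stand-in for the engine's log file

def commit_transaction(changes):
    """Append begin/changes/commit records to the log, then flush before acking."""
    fd = os.open(LOG_PATH, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        os.write(fd, b"BEGIN\n")                   # transaction start indicator
        for change in changes:
            os.write(fd, change.encode() + b"\n")  # the client's changes
        os.write(fd, b"COMMIT\n")                  # transaction commit indicator
        os.fsync(fd)                               # flush to disk: the durability point
    finally:
        os.close(fd)
    return "OK"                                    # only now is it safe to ack the client

print(commit_transaction(["insert into t values (1)"]))  # OK
```

The key property is that the `os.fsync` happens before the "OK": if the machine crashes after the ack, the commit record is already on disk and recovery can replay it.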
What I observed:
The client application is single-threaded (one DB connection). I'm able to perform 400 transactions/sec, while a simple test that writes something to a file and then fsyncs that file to the HDD performs only 150 syncs/sec. If the client were multithreaded/multi-connection, I would imagine that the DB engine groups transactions and does one fsync per several transactions, but this is not the case here.
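A test like the one described might look like this in Python (a rough sketch; the write size and iteration count are arbitrary, and results vary enormously by disk, filesystem, and OS):

```python
import os
import tempfile
import time

fd, path = tempfile.mkstemp()
n = 50
start = time.perf_counter()
for _ in range(n):
    os.write(fd, b"x" * 512)  # small write, like a log record
    os.fsync(fd)              # force it down to the platter/flash
elapsed = time.perf_counter() - start
os.close(fd)
os.remove(path)

print(f"{n / elapsed:.0f} syncs/sec")
```

On spinning disks the rate is bounded by rotational latency, which is why a naive loop like this tops out far below what a database with group commit or battery-backed write cache can achieve.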
My question is whether, for example, MS SQL really synchronizes the log file (fsync, FlushFileBuffers, etc.) on every transaction commit, or whether there is some other kind of magic behind it.
The short answer is that, for a transaction to be durable, the log file has to be written to stable storage before changes to the database are written to disk.
Stable storage is more complicated than you might think. Disks, for example, are not usually considered to be stable storage. (Not by people who write code for transactional database engines, anyway.)
To see how a particular open source DBMS writes to stable storage, you'll need to read the source code. The PostgreSQL source code is online (the relevant file is xlog.c). I don't know about the MySQL source.
TL;DR: Is it possible to basically create a fast, temporary, "fork" of a database (like a snapshot transaction) without any locks given that I know for a fact that the changes will never be committed and always be rolled back.
Details:
I'm currently working with SQL Server and am trying to implement a feature where the user can try all sorts of stuff (in the application) that is never persisted in the database.
My first instinct was to (mis)use snapshot transactions for that to basically "fork" the database into a short lived (under 15min) user-specific context. The rest of the application wouldn't even have to know that all the actions the user performs will later be thrown away (I currently persist the connection across requests - it's a web application).
The problem is that there are situations where the snapshot transaction locks and waits for other transactions to complete. My guess is that this happens because SQL Server has to make sure it can merge the data if one of the open transactions commits, but in my case I know for a fact that I will never commit the changes from this transaction and will always throw the data away (note that not everything happens in this transaction; there are other things a user can do that happen on a different connection and are persisted).
Are there other ideas that don't involve cloning the database (too large/slow) or updating/changing the schema of all tables? (I'd like to avoid "poisoning" the schema with the implementation detail of the "try out" feature.)
No. SQL Server has copy-on-write Database Snapshots, but the snapshots are read-only. So where a SNAPSHOT transaction acquires regular exclusive locks when it modifies the database, a Database Snapshot would just give you an error.
There are storage technologies that can create a writable copy-on-write storage snapshot, such as NetApp. You would run a command to create a new LUN that is a snapshot of an existing LUN, present it to your server as a disk, mount its volume in a folder or drive letter, and attach the files you find there as a database. This is often done for cloning across environments, to refresh dev/test with prod data without having to copy all the data. But it seems like far too much infrastructure work for your use case.
I am reading about databases and I can't understand one thing about WAL files. They exist in order to make sure transactions are reliable and recoverable, however, apparently, to improve performance, WAL files are written in batches instead of immediately. This looks to me quite contradictory and negates the purpose of WAL files. What happens if there's a crash between WAL commits? How does this differ from not having the WAL at all and simply fsync'ing the database itself periodically?
I don't know much about this; I just looked for information because it seems interesting to me.
If anyone finds my explanation incorrect, please correct me. What I understand at this point is that WAL files are written before the commit; once it is confirmed that the transaction data is in the WAL, the transaction is confirmed.
What is done in batches is moving this WAL data to the heap and indexes, the real tables.
Write-Ahead Logging (WAL) is a standard method for ensuring data integrity. A detailed description can be found in most (if not all) books about transaction processing. Briefly, WAL's central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, after log records describing the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages can be redone from the log records. (This is roll-forward recovery, also known as REDO.)
https://www.postgresql.org/docs/current/wal-intro.html
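As a toy illustration of that rule (plain Python, not PostgreSQL's actual implementation): as long as the log records are flushed first, the log alone is enough to rebuild the data pages after a crash.

```python
# Toy write-ahead log: the flushed log is the source of truth for redo.
log = []         # stands in for WAL records already on stable storage
data_pages = {}  # stands in for table/index files, written lazily

def commit(txid, changes):
    # 1. Append ("flush") log records BEFORE touching any data page.
    for key, value in changes.items():
        log.append((txid, key, value))
    # 2. Data pages may be written later in batches, or not at all before a crash.

def recover():
    # Roll-forward recovery (REDO): replay every logged change in order.
    pages = {}
    for _txid, key, value in log:
        pages[key] = value
    return pages

commit(1, {"a": 10})
commit(2, {"b": 20, "a": 30})
# Simulated crash before any data page was written: data_pages is still empty,
# yet recovery reconstructs the committed state from the log alone.
print(recover())  # {'a': 30, 'b': 20}
```

So a crash "between WAL batches" only loses transactions that were never acknowledged as committed; anything acknowledged had its log records flushed first and is replayable.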
After deploying a project on the client's machine, the SQL DB log file has grown to 450 GB, although the DB size is less than 100 MB. The recovery model is set to Simple, and the transactions are sent from a Windows service that sends insert and update transactions every 30 seconds.
My question is: how can I find the reason for the DB log file growth?
I would like to know how to monitor the log file to find the exact transaction that causes the problem.
Should I debug the front end? Or is there a way to expose the transactions that cause the DB log file growth?
Thank you.
Note that the simple recovery model does not allow log backups, since it keeps the least amount of information and relies on CHECKPOINT, so if this is a critical database, consider protecting the client by using the FULL recovery model. Yes, you have to use more space, but disk space is cheap, and you gain greater control over point-in-time recovery and over managing your log files. Trying to be concise:
A) In Simple mode, your database will only truncate transactions in the transaction log when a CHECKPOINT is created.
B) Unfortunately, large or numerous uncommitted transactions, including BACKUP, SNAPSHOT creation, and log scans, among other things, will stop your database from creating those checkpoints, and your database will be left unprotected until those transactions complete.
Your current system relies on having the right version of your .bak file, which, depending on its age, may mean hours of potential data loss.
In other words, the log is that ridiculous size because your database is not able to create a CHECKPOINT to truncate these transactions often enough.
A little note on log files:
Foremost, log files are not automatically truncated every time a transaction is committed (otherwise, you would only have the last committed transaction to go back to). Taking frequent log backups will ensure pertinent changes are kept (point in time), and SHRINKFILE will squeeze the log file down to the smallest size available/specified.
Use DBCC SQLPERF(logspace) to see how much of your log file is in use and how large it is. Once you perform a full backup, the log file will be truncated down to the remaining uncommitted/active transactions. (Do not confuse this with shrinking the file.)
Some suggestions on researching your transactions:
You can use the system tables to see the most expensive, most frequent, and currently active plans in the cache.
You can search the log file using an undocumented extended stored procedure, fn_dblog.
Pinal Dave has great info on this topic that you can read here:
Beginning Reading Transaction Log
A log file is text, and depending on your log levels and how many errors and messages you receive, these files can grow very quickly.
You need to rotate your logs with something like logrotate, although from your question it sounds like you're using Windows, so I'm not sure what the solution for that would be.
The basics of log rotation are taking daily/weekly versions of the logs, compressing them with gzip or similar, and trashing the uncompressed versions.
As it is text with a lot of repetition, this will make the files very small in comparison and should solve your storage issues.
Log file space won't be reused if there is an open transaction. You can verify the reason log space cannot be reused using the DMV below:
select log_reuse_wait_desc, database_id from sys.databases
In your case, your database is set to Simple and the database is 100 MB, but the log has grown up to 450 GB, which is huge.
My theory is that there may have been some open transactions which prevented log space reuse. The log file won't shrink back on its own once it has grown.
As of now, you can run the above DMV and see what is preventing log space reuse at this point; you can't go back in time to know what prevented it earlier.
Hi, I have this silly question, but I want to be sure. I have created a database based on sqlite3. I trigger commit() after 1000k operations so I won't have too much disk I/O. When I query data in the database, will the SELECT query search only the database file, or will it check the uncommitted data too?
Thanks.
Transactions provide isolation and atomicity with respect to other users of the database.
Any changes you make are visible in your own connection immediately.
If you are using the same SQLite connection for reading as you are for writing the database, then the effects of the writing will be visible to the reader, as expected.
If you are using different connections -- even within a single thread -- for reading and writing, the reader will not see uncommitted writes to the database unless you go to rather significant lengths to allow it to do so.
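Both cases can be demonstrated with Python's sqlite3 module (a small sketch; the file path and table name are made up):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # arbitrary example path

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
writer.commit()

# The DML below opens a transaction on `writer`; leave it uncommitted for now.
writer.execute("INSERT INTO items (name) VALUES ('pending')")

# The same connection sees its own uncommitted change...
print(writer.execute("SELECT COUNT(*) FROM items").fetchone()[0])  # 1

# ...but a second connection does not.
reader = sqlite3.connect(path)
print(reader.execute("SELECT COUNT(*) FROM items").fetchone()[0])  # 0

writer.commit()
print(reader.execute("SELECT COUNT(*) FROM items").fetchone()[0])  # 1
```

Note that Python's sqlite3 module implicitly begins a transaction before the INSERT, which is why the change stays invisible to the second connection until commit().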
We are performing quite large transactions on an SQLite database, which causes the WAL file to grow extremely large (sometimes up to 1 GB for large transactions). Is there a way to checkpoint the WAL file while in the middle of a transaction? When I try calling sqlite3_wal_checkpoint() or executing the wal_checkpoint PRAGMA, both return SQLITE_BUSY.
Not really. This is the whole point of transactions: the WAL (or journal file) keeps data that becomes official only once it is successfully committed. Until that happens, if anything goes wrong (program crash, computer reboot, etc.), the WAL or journal file allows the uncommitted action to be safely rolled back (undone). Moving only part of an uncommitted transaction would defeat the purpose.
Note that the SQLite documentation defines checkpointing as moving transactions from the WAL file back into the database. In other words, checkpointing moves one or more committed transactions out of the WAL, but never part of a huge uncommitted transaction.
There are a few possible solutions to your problem:
Avoid huge transactions: commit in smaller chunks if you can. Of course, this is not always possible, depending on your application.
Use the old journaling mode with PRAGMA journal_mode=DELETE. It is slightly slower than the new WAL mode (PRAGMA journal_mode=WAL), but in my experience it tends to create much smaller journal files, and they get deleted when a transaction successfully commits. For example, Android 4.x still uses the old journaling mode; it tends to work faster on flash storage and does not create huge temporary or journal files.
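For what it's worth, switching journal modes is a one-line pragma; a quick check with Python's sqlite3 module might look like this (the file path here is arbitrary):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # arbitrary example path
conn = sqlite3.connect(path)

# The pragma returns the journal mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
print(mode)  # delete

# Opting into WAL instead (this setting persists in the database file):
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal
```

Checking the pragma's return value matters because SQLite may refuse the switch (for example, inside an open transaction) and silently stay in the previous mode.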