I was reading about the ACID properties of a DBMS, one of which is atomicity.
http://ecomputernotes.com/database-system/rdbms/transaction
Scenario:
Suppose that, just prior to the execution of transaction Ti, the values of accounts A and B are Rs.1000 and Rs.2000.
Now, suppose that during the execution of Ti a power failure occurred that prevented Ti from completing successfully. The point of failure may be after the completion of Write(A, a) and before Write(B, b). This means that the changes to A were performed but not those to B. Thus the values of accounts A and B are Rs.950 and Rs.2000 respectively. We have lost Rs.50 as a result of this failure.
Now, our database is in an inconsistent state.
My question is: in the case of a power failure which leaves us in this inconsistent state, how do we recover from it?
Can we do it at the application/code level?
How many ways are there to recover from it?
Generally speaking, these ways may differ from one database to another, but usually DBMSs ensure atomicity this way:
When a new request for a data change is received, the database first writes this request to a special log that describes the change. Only when this record has been successfully written can the transaction be committed.
In the case of a power failure this log persists, and the database can recover from the inconsistent state using it, applying the changes one by one.
For example, this log in the Oracle database is called the redo log. In PostgreSQL it is called the WAL (write-ahead log).
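As a toy illustration of the write-ahead idea, here is a minimal Python sketch. The file names, record format and function names are all made up (real redo logs and WAL segments are page-oriented binary structures); the point is only the ordering: append and fsync the log record first, apply the change second, and replay the log after a crash.

import json
import os

LOG_FILE = "wal.log"      # hypothetical log file
DATA_FILE = "data.json"   # hypothetical data file

def write_change(key, value):
    # 1. Append the intended change to the log and force it to disk.
    with open(LOG_FILE, "a") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())
    # 2. Only now apply the change to the data file.
    _apply(key, value)

def _apply(key, value):
    data = {}
    if os.path.exists(DATA_FILE):
        with open(DATA_FILE) as f:
            data = json.load(f)
    data[key] = value
    with open(DATA_FILE, "w") as f:
        json.dump(data, f)

def recover():
    # After a crash, replay every logged change; re-applying an
    # already-applied change is harmless because each write is idempotent.
    if not os.path.exists(LOG_FILE):
        return
    with open(LOG_FILE) as log:
        for line in log:
            record = json.loads(line)
            _apply(record["key"], record["value"])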
Related
I recently ran into a Postgres concurrency issue which I won't repeat here. The original post can be found at this link.
I am still trying to better understand how Postgres handles serializable concurrency. My situation is this. I have one stored procedure which reads a table and then inserts based on the output of the read. This stored procedure, if called by multiple clients, results in the 40001 read/write dependencies exception.
The question is this. Let's assume that the stored procedure which reads a table and then inserts into it based on the read only reads some rows. If it is guaranteed that every call to the stored procedure for the read-insert touches a different row, would the concurrency exception go away? Is Postgres smart enough to keep track of which rows were read during a transaction so that it can accurately detect modification of those specific rows by a different transaction, resulting in the exception? And if yes, how reliable is this mechanism? Can it be optimized away in some cases, so that Postgres, just to be safe, throws an exception on modification of any of the read tables?
First, what you encountered in the link you give is not a bug, but intended and documented behaviour.
I gather that you are using transaction isolation level SERIALIZABLE.
In this mode, every row you read is locked with a special SIReadLock which doesn't block anything, but is used to determine whether a serialization anomaly may have occurred, in which case the transaction is interrupted with a serialization error.
Note that not only rows that are returned to you are locked in this fashion, but all rows in all tables that are accessed during the execution of your query. So if there is a sequential scan in your execution plan, all rows of the table will have a SIReadLock. Moreover, if there are too many of these locks on a table, they get escalated to page or table level locks.
So it is possible that rows are locked unnecessarily. In addition to that, the algorithm which is used to detect inconsistencies can report false positives (it would be computationally too expensive to be exact).
As a consequence, you may receive serialization errors in the case you describe, although I would not expect any as long as everything is kept simple and there are no sequential scans.
Serialization errors are normal and to be expected on the SERIALIZABLE isolation level. Your application must be ready to handle them by retrying the transaction. It is the price you have to pay for not having to worry about data consistency.
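For example, a client-side retry loop could look roughly like this (a sketch using psycopg2; my_read_then_insert and its argument are placeholders for your stored procedure, and the DSN is whatever you normally connect with):

import psycopg2

def call_with_retry(dsn, max_retries=5):
    conn = psycopg2.connect(dsn)
    try:
        for attempt in range(max_retries):
            try:
                with conn:                      # one transaction per attempt
                    with conn.cursor() as cur:
                        cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                        # Placeholder: invoke your read-then-insert procedure
                        # (use SELECT my_func(...) if it is a function instead).
                        cur.execute("CALL my_read_then_insert(%s)", (42,))
                return                          # committed successfully
            except psycopg2.Error as e:
                if e.pgcode == "40001":         # serialization_failure
                    continue                    # retry with fresh data
                raise                           # any other error is real
        raise RuntimeError("gave up after %d serialization failures" % max_retries)
    finally:
        conn.close()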
Let's say I need to perform two different kinds of write operations on a datastore entity that might happen simultaneously, for example:
The client that holds a write-lock on the entry updates the entry's content
The client requests a refresh of the write-lock (updates the lock's expiration time-stamp)
As the content-update operation is only allowed if the client holds the current write-lock, I need to perform the lock-check and the content-write in a transaction (unless there is another way that I am missing?). Also, a lock-refresh must happen in a transaction because the client needs to first be confirmed as the current lock-holder.
The lock-refresh is a very quick operation.
The content-update operation can be quite complex. Think of it as the client sending the server a complicated update-script that the server executes on the content.
Given this, if there is a conflict between those two transactions (should they be executed simultaneously), I would much rather have the lock-refresh operation fail than the complex content-update.
Is there a way that I can "prioritize" the content-update transaction? I don't see anything in the docs and I would imagine that this is not a specific feature, but maybe there is some trick I can use?
For example, what happens if my content-update reads the entry, writes it back with a small modification (without committing the transaction), then performs the lengthy operation and finally writes the result and commits the transaction? Would the first write be applied immediately and cause a simultaneous lock-refresh transaction to fail? Or are all writes kept until the transaction is committed at the end?
Is there such a thing as keeping two transactions open? Or doing an intermediate commit in a transaction?
Clearly, I can just split my content-update into two transactions: The first one sets a "don't mess with this, please!"-flag and the second one (later) writes the changes and clears that flag.
But maybe there is some other trick to achieve this with fewer reads/writes/transactions?
Another thought I had was that there are 3 different "blocks" of data: The current lock-holder (LH), the lock expiration (EX), and the content that is being modified (CO). The lock-refresh operation needs to perform a read of LH and a write to EX in a transaction, while the content-update operation needs to perform a read of LH, a read of CO, and a write of CO in a transaction. Is there a way to break the data apart into three entities and somehow have the transactions span only the needed entities? Since LH is never modified by these two operations, this might help avoid the conflict in the first place?
The datastore uses optimistic concurrency control, which means that a (datastore primitive) transaction waits until it is committed, then succeeds only if someone else hasn't committed first. Typically, the app retries the failed transaction with fresh data. There is no way to modify this first-wins behavior.
It might help to know that datastore transactions are strongly consistent, so a client can first commit a lock refresh with a synchronous datastore call, and when that call returns, the client knows for sure whether it obtained or refreshed the lock. The client can then proceed with its update and lock clear. The case you describe where a lock refresh and an update might occur concurrently from the same client sounds avoidable.
I'm assuming you need the lock mechanism to prevent writes from other clients while the lock owner performs multiple datastore primitive transactions. If a client is actually only doing one update before it releases the lock and it can do so within seconds (well before the datastore RPC timeout), you might get by with just a primitive datastore transaction with optimistic concurrency control and retries. But a lock might be a good idea for simple serialization of, say, edits to a record in a user interface, where a user hits an "edit" button in a UI and you want that to guarantee that the user has some time to prepare and submit changes without the record being changed by someone else. (Whether that's the user experience you want is your decision. :) )
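To make that concrete, here is a rough sketch of the two operations as short datastore transactions. It assumes the legacy App Engine ndb library; the Entry model, its properties and the five-minute lease are invented for illustration.

from datetime import datetime, timedelta
from google.appengine.ext import ndb

class Entry(ndb.Model):
    content = ndb.TextProperty()
    lock_holder = ndb.StringProperty()      # LH
    lock_expires = ndb.DateTimeProperty()   # EX

def _check_lock(entry, client_id):
    if entry.lock_holder != client_id or entry.lock_expires < datetime.utcnow():
        raise ValueError("client is not the current lock holder")

@ndb.transactional
def refresh_lock(entry_key, client_id):
    entry = entry_key.get()
    _check_lock(entry, client_id)
    entry.lock_expires = datetime.utcnow() + timedelta(minutes=5)
    entry.put()

@ndb.transactional
def update_content(entry_key, client_id, new_content):
    entry = entry_key.get()
    _check_lock(entry, client_id)
    entry.content = new_content   # new_content is computed *before* this call
    entry.put()

Keeping the expensive update script outside the transaction and only writing its result inside keeps both transactions short, which shrinks the window in which a lock refresh and a content update can collide at all.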
Why does the ARIES algorithm apply a redo before an undo if it already knows which transactions to undo after the analysis phase?
I know (or at least think) it has something to do with LSNs and maintaining consistency, in the sense that undoing a transaction given the data flushed to disk may not be the same as undoing the transaction at the time of the crash (due to dirty pages), but I can't find any sort of 'formal' answer to this question (at least not one that I can understand).
Because there may be unflushed pages in the buffer even for transactions that have committed; ARIES uses a no-force policy in the buffer manager. Redoing (repeating history) brings the database back to the state it was in at the time of the crash. As a result, successful transactions are guaranteed to be reflected in stable storage.
Short answer:
We need to repeat all the history up to the crash in the redo pass in order to ensure database consistency before performing the undo pass.
Long answer:
The recovery algorithm ARIES, in order to ensure the atomicity and the durability properties of the DBMS, performs 3 passes:
Analysis pass: to see what needs to be done (plays log forward)
Redo pass: to make sure the disk reflects any updates that are in the log but not on disk, including those that belong to transactions that will eventually be rolled back. This ensures we are in a consistent state, which will allow logical undo.
Undo pass: to remove the actions of any losing transactions
The UNDO data log is logical, while the REDO data log is physical:
We must do physical REDO, since we can't guarantee that the database is in a consistent state (so, e.g., logging "INSERT VALUE X INTO TABLE Y" might not be a good idea, because X might be reflected in an index but not the table, or vice versa, in case a crash happens while inserting)
We can do logical UNDO, because after REDO we know that things are consistent. In fact, we must do logical UNDO because we only UNDO some actions, and physical logging of UNDOs of the form, e.g., "split page x of index y" might not be the right thing to do anymore in terms of index management or invariant maintenance. We don't have to worry about this during redo because we repeat history and replay everything, which means any physical modifications made to the database last time will still be correct.
Source
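To tie the three passes together, here is a deliberately simplified, runnable Python sketch. It omits LSNs, the dirty page table, checkpoints and real compensation log records, and treats the database as a plain dict, so it only shows the ordering of the passes, not real ARIES.

from collections import namedtuple

# Toy log records: ("BEGIN", xid), ("UPDATE", xid, key, old, new), ("COMMIT", xid)
LogRec = namedtuple("LogRec", "type xid key old new")

def rec(type, xid, key=None, old=None, new=None):
    return LogRec(type, xid, key, old, new)

def recover(log, db):
    # Analysis pass: find the losers (transactions with no COMMIT/ABORT record).
    active = set()
    for r in log:
        if r.type == "BEGIN":
            active.add(r.xid)
        elif r.type in ("COMMIT", "ABORT"):
            active.discard(r.xid)
    losers = active

    # Redo pass: repeat history for ALL transactions, winners and losers alike,
    # so the database reaches exactly the state it had at the time of the crash.
    for r in log:
        if r.type == "UPDATE":
            db[r.key] = r.new

    # Undo pass: roll back the losers in reverse order. Real ARIES also writes
    # a CLR for each undone update so a crash during undo never repeats work.
    for r in reversed(log):
        if r.type == "UPDATE" and r.xid in losers:
            db[r.key] = r.old

# Example: T1 committed, T2 did not; redo replays both, undo removes T2's change.
db = {}
log = [rec("BEGIN", 1), rec("UPDATE", 1, "A", 1000, 950), rec("COMMIT", 1),
       rec("BEGIN", 2), rec("UPDATE", 2, "B", 2000, 2050)]
recover(log, db)
print(db)   # {'A': 950, 'B': 2000}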
No idea what ARIES is, but assuming it is the same as what other databases do:
Starting from some base backup, redo logs are applied, which basically means all the data-changing statements that happened after the backup but before the crash get applied. Without that you would lose everything that happened since the last backup.
When that is finished, all incomplete transactions get rolled back, because there is nobody who could pick up those transactions to complete them.
You want to get back to the state at the time of failure in order to be accurate about which transactions need to be undone. One example which comes to mind is successive failures, specifically failures that happen while recovering from a crash. During recovery you write your actions to the log. If you fail during the recovery process, you will REDO all the operations in the log (even the UNDO operations written during the last attempt!).
It provides a simple algorithm, because you don't have to handle special cases and special cases of special cases. There is a guarantee that after any amount of crashes during recovery, we will go back to the same state as if there was no crash during recovery.
If you don't support record-level locking, then you can use selective redo, which only redoes winner transactions. Otherwise, it is better to repeat history (redo all) before undo.
You can consider what is really done during redo and undo.
Redo is repeating history, according to the existing log records.
Undo, in contrast, creates new CLR (compensation log record) entries. When the system crashes, the log contains records of uncommitted transactions. If you do not undo them, there will be no CLR log records, thus causing inconsistency.
One of the goals of ARIES is simplicity. While the undo after redo might not be necessary, it makes the correctness of the algorithm more apparent than a more complex scheme that would do an undo before a redo.
Besides making sure the database is consistent and the disk is exactly the same as before the crash happened (as Franck Dernoncourt answered), another benefit of performing redo before undo is that:
Failure may happen during recovery as well. Redo advances the progress of the whole "incremental recovery": if a failure happens during redo or undo, the next recovery can pick up what the previous recovery (redo) has left and continue, provided redo is performed before undo.
An extreme case is: if undo were performed before redo, and failures kept happening during undo again and again, all the undo work would be in vain.
I saw this sentence not only in one place:
"A transaction should be kept as short as possible to avoid concurrency issues and to enable maximum number of positive commits."
What does this really mean?
It puzzles me because I want to use transactions for my app, which in normal use will deal with inserting hundreds of rows from many clients, concurrently.
For example, I have a service which exposes a method: AddObjects(List<Objects>), and of course these objects contain other, different nested objects.
I was thinking of starting a transaction for each call from the client, performing the appropriate actions (a bunch of inserts/updates/deletes for each object with its nested objects). EDIT1: I meant a transaction for the entire "AddObjects" call in order to prevent undefined states/behaviour.
Am I going in the wrong direction? If yes, how would you do that and what are your recommendations?
EDIT2: Also, I understood that transactions are fast for bulk operations, but that somehow contradicts the quoted sentence. What is the conclusion?
Thanks in advance!
A transaction has to cover a business-specific unit of work. It has nothing to do with generic 'objects'; it must always be expressed in domain-specific terms: 'debit of account X and credit of account Y must be in a transaction', 'subtraction of an inventory item and the sale must be in a transaction', etc. Everything that must either succeed together or fail together must be in a transaction. If you are going down an abstract path of 'adding objects to a list is a transaction' then yes, you are on the wrong path. The fact that all inserts/updates/deletes triggered by an object save are in a transaction is not a purpose, but a side effect. The correct semantics should be 'update of object X and update of object Y must be in a transaction'. Even the degenerate case of a single 'object' being updated should still be regarded in domain-specific terms.
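As a small illustration of a domain-specific unit of work (a sketch using Python's built-in sqlite3 module and an invented accounts table; any DB-API driver follows the same pattern), the transaction is the transfer itself, not a generic "save these objects":

import sqlite3

def transfer(conn, from_account, to_account, amount):
    # Debit of X and credit of Y must succeed or fail together.
    with conn:   # one transaction: commits on success, rolls back on any error
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE name = ? AND balance >= ?",
            (amount, from_account, amount))
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown account")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, to_account))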
That recommendation is best understood as Do not allow user interaction in a transaction. If you need to ask the user during a transaction, roll back, ask and run again.
Other than that, do use transactions whenever you need to ensure atomicity.
It is not the transactions' fault that they may cause "concurrency issues"; rather, the database design might need some more thought, a better set of indices, or a more standardized data access order.
"A transaction should be kept as short as possible to avoid concurrency issues and to enable maximum number of positive commits."
The longer a transaction is kept open the more likely it will lock resources that are needed by other transactions. This blocking will cause other concurrent transactions to wait for the resources (or fail depending on the design).
SQL Server is usually set up in autocommit mode. This means that every SQL statement is a distinct transaction. Many times you want to use a multi-statement transaction so you can commit or roll back multiple updates together. The longer the updates take, the more likely other transactions will conflict.
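For illustration, here is the difference as seen from client code (a sketch using pyodbc; the connection string and table names are placeholders):

import pyodbc

conn = pyodbc.connect("DSN=mydb", autocommit=True)   # autocommit: every statement
cur = conn.cursor()                                  # is its own transaction
cur.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")   # committed now
cur.execute("UPDATE stock SET qty = qty - 1 WHERE item = 7")       # committed now

conn.autocommit = False          # switch to explicit, multi-statement transactions
try:
    cur.execute("UPDATE orders SET status = 'shipped' WHERE id = 2")
    cur.execute("UPDATE stock SET qty = qty - 1 WHERE item = 8")
    conn.commit()                # both changes become visible together ...
except Exception:
    conn.rollback()              # ... or neither does; locks are held until here
    raise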
Do I understand correctly that table/row lock hints are being used for pessimistic transaction (TX) isolation models of concurrency ONLY?
In other words, when can table/row lock hints be used during engagement of optimistic TX isolation provided by SQL Server (2005 and higher)?
When would one need pessimistic TX isolation levels/hints in SQL Server 2005+ if the latter provides built-in optimistic (aka snapshot, aka versioning) concurrency isolation?
I did read that pessimistic options are legacy and are not needed anymore, though I am in doubt.
Also, given the optimistic (aka snapshot, aka versioning) TX isolation levels built into SQL Server 2005+, when would one need to manually code for optimistic concurrency features?
The last question is inspired by having read:
"Optimistic Concurrency in SQL Server" (September 28, 2007)
describing custom coding to provide versioning in SQL Server.
Optimistic concurrency requires more resources and is more expensive when the conflict occurs.
Two sessions can read and modify the values, and the conflict only occurs when they try to apply their changes simultaneously. This means that in the case of a concurrent update, both versions of the value have to be stored somewhere (which of course requires resources).
Also, when a conflict occurs, usually the whole transaction should be rolled back or the cursor refetched, which is expensive too.
The pessimistic concurrency model uses locking, thus reducing concurrency but improving performance.
In the case of two concurrent tasks, it may be cheaper for the second task to wait for a lock to be released than to spend CPU time and disk I/O on doing the work twice simultaneously, and then spend yet more on rolling back the less fortunate attempt and redoing it.
Say, you have a query like this:
UPDATE mytable
SET myvalue = very_complex_function(@range)
WHERE rangeid = @range
with very_complex_function reading some data from mytable itself. In other words, this query transforms the subset of mytable that shares the value of @range.
Now, when two such queries work on the same range, there may be two scenarios:
Pessimistic: the first query locks, the second query waits for it. The first query completes in 10 seconds, the second one does too. Total: 20 seconds.
Optimistic: both queries work independently (on the same input). This shares CPU time between them, plus some overhead for switching. They have to keep their intermediate data somewhere, so the data is stored twice (which implies twice the I/O or memory). Let's say both complete almost at the same time, in 15 seconds.
But when it's time to commit the work, the second query will conflict and will have to roll back its changes (say this takes the same 15 seconds). Then it needs to reread the data and do the work again with the new set of data (10 seconds).
As a result, both queries complete later than with a pessimistic locking: 15 and 40 seconds vs. 10 and 20.
When would one need pessimistic TX isolation levels/hints in SQL Server 2005+ if the latter provides built-in optimistic (aka snapshot, aka versioning) concurrency isolation?
Optimistic isolation levels are, well, optimistic. You should not use them when you expect high contention on your data.
BTW, optimistic isolation (for the read queries) was available in SQL Server 2000 too.
I have a detailed answer here: Developing Modifications that Survive Concurrency
I think there's a bit of confusion over terminology here.
Optimistic locking/optimistic concurrency is a programming technique used to avoid the following scenario:
start transaction
read data, setting a "read" lock on it to prevent any deletes/modifications to our data
display data on user's screen
await user input, lock remains active
keep awaiting user input, lock still preventing any writes/modifications
user input never comes (for whatever reason)
transaction times out (and this usually does not happen very quickly, as the user must be given reasonable time to enter his input).
Optimistic locking replaces this with the following:
start transaction READ
read data, setting a "read" lock on it to prevent any deletes/modifications to our data
end transaction READ, releasing the read lock just set
display data on user's screen
await user input, but data can be modified/deleted meanwhile by other transactions
user input arrives
start transaction WRITE
verify that the data has remained unaltered, raising an exception if it has
apply user updates
end transaction WRITE
So the single "user transaction" that fetches some data, changes it, and updates it consists of two distinct "database transactions". What is usually called an "isolation level" applies to those database transactions. The "optimistic locking" that you refer to applies to the "user transaction".
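The usual way to implement the "verify that the data has remained unaltered" step is a version (or timestamp) column that every write increments. A sketch of both database transactions, using Python's sqlite3 module and an invented documents table:

import sqlite3

def read_for_edit(conn, doc_id):
    # READ transaction: fetch the data plus its current version; no lock is kept
    # afterwards, so other users may modify the row while ours is on screen.
    return conn.execute(
        "SELECT content, version FROM documents WHERE id = ?",
        (doc_id,)).fetchone()

def write_if_unchanged(conn, doc_id, new_content, version_seen):
    # WRITE transaction: apply the update only if nobody changed the row meanwhile.
    with conn:
        cur = conn.execute(
            "UPDATE documents SET content = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_content, doc_id, version_seen))
        if cur.rowcount == 0:
            raise RuntimeError("conflict: the document was modified by someone else")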
The matter is further complicated in that, broadly speaking, two completely distinct strategies are possible for the "isolating the database transactions" part:
MVCC
2-phase locking
I think the "snapshot versioning isolation level" means that the MVCC technique (well, one of its various possible variations) is being used for the database transaction. The other commonly known isolation levels apply more to transaction isolation using 2PL as the serialization(/isolation) technique. (And mixing them up can get messy ...)