I have trouble understanding the order of the operations when a ROLLBACK occurs in one of the transactions.
Consider the following schedule of transactions, read from top to bottom, using the timestamp protocol. When a transaction rolls back, it starts again from the beginning of its actions.
I figured out that when T1-R(T) occurs, T1 will roll back because WTS(T) = 4.
This is where my understanding breaks down: what order am I supposed to follow after T1 rolls back?
If the rollback had not occurred, I would simply move on to the next action, T3-R(N), but I fail to understand the order in which to proceed through the transactions' actions now.
I came across a piece of code in a SQL stored procedure in our code base that uses chunking without a transaction block. I don't see how chunking could be beneficial without a transaction block. I've been humbled a few times when I've jumped to conclusions without digging deeper, so what advantage does chunking without a transaction block offer? Is there any?
The pseudocode is something like:
Populate the Main temp table (ID, Name, UpdatedFlag). This flag column indicates whether the record has been updated or not.
Start while loop (run as long as there is a record in MainTable with UpdatedFlag = 0)
Select only chunkSize records into the ChunkSizeMain temp table (ID, Name) from the records that haven't been marked as updated
Begin TRY block
Start updating some other table by joining on ID of ChunkSizeMainTable.
Update UpdatedFlag = 1 in MainTable.
End TRY
Begin CATCH //some action End CATCH
Every update query in SQL Server runs in a transaction, irrespective of whether it has a BEGIN TRAN next to it (an autocommit transaction if IMPLICIT_TRANSACTIONS is not on).
"Chunking" is usually done to stop the transaction log from needing to grow when the database uses the simple recovery model. A single UPDATE statement that affects 1 million rows needs all of its changes logged in the active portion of the log. Dividing the work into batches allows the log space from earlier committed batches to be truncated and reused by later batches.
It may also be done to reduce the effect on concurrent queries by reducing the length of time of each operation and/or potentially reducing the risk of lock escalation by only updating a few thousand rows at a time.
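As a minimal sketch of the batching idea, assuming a hypothetical table dbo.TargetTable with a Processed flag column (neither name comes from the question), each loop iteration below runs as its own autocommit transaction:

```sql
-- Hypothetical names: dbo.TargetTable and its Processed column are
-- illustrative only; adjust them to the real schema.
DECLARE @BatchSize int = 5000;  -- assumed chunk size
DECLARE @Rows      int = 1;

WHILE @Rows > 0
BEGIN
    -- No BEGIN TRAN: this UPDATE commits on its own (autocommit),
    -- so its locks are released before the next batch starts.
    UPDATE TOP (@BatchSize) dbo.TargetTable
    SET    Processed = 1
    WHERE  Processed = 0;

    -- Stop once a pass updates nothing.
    SET @Rows = @@ROWCOUNT;
END;
```

Because every batch commits separately, a failure halfway through leaves the earlier batches applied. That is acceptable only when the work does not need to be all or nothing.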
I have a question that I can never get a clear answer to. Every stored procedure using a transaction that I had looked at before my current job had a commit plus a rollback in case of error. However, I have seen a lot of code at my new job that just has a BEGIN TRANSACTION and then a COMMIT at the end, with no rollback. I understand why you would use a transaction with a rollback, but why would you want to begin a transaction with no rollback? Is it so that, when you run the code, the table is locked and no values can be changed while your code is updating? If so, why would you not want the added security of a rollback in case something goes wrong? Is this proper use of the transaction statement? Any thoughts or ideas would be great!
For Example:
BEGIN TRANSACTION [Tran1]
INSERT INTO [Test].[dbo].[T1]
([Title], [AVG])
VALUES ('Tidd130', 130), ('Tidd230', 230)
UPDATE [Test].[dbo].[T1]
SET [Title] = N'az2' ,[AVG] = 1
WHERE [dbo].[T1].[Title] = N'az'
COMMIT TRANSACTION [Tran1]
GO
Shouldn't this code include a rollback for proper use of the BEGIN TRANSACTION statement?
The idea is that if that set of statements needs to be "all or nothing", wrapping the lot in a transaction is the way to ensure that is what will happen. You're not seeing an explicit rollback because that's not what they're guarding against. Imagine the following scenario with your contrived example:
The insert happens
The server crashes (or the log fills up or some other external reason why things can't continue) before the update can happen
If they're both wrapped in the same transaction, the insert won't be reflected in the table data. Which is the desired behavior.
When transactions are not explicitly declared, SQL Server will automatically BEGIN and COMMIT a TRANSACTION for each command. This frees up each command's lock as soon as the command executes.
When executing multiple commands inside a single transaction (as in the example you posted), locks from all commands are held until the transaction is committed.
Depending on the desired behavior, the script you posted may be correct. However, I would check that the developer did not mistakenly believe that the transaction would be rolled back automatically on error. If that behavior is desired, you do indeed need to explicitly ROLLBACK or SET XACT_ABORT ON.
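A hedged sketch of what the explicit version could look like, reusing the statements from the question. The TRY/CATCH wrapper and the use of THROW (available from SQL Server 2012) are my additions, not part of the original script:

```sql
-- With XACT_ABORT ON, most run-time errors make the open transaction
-- uncommittable instead of letting later statements carry on.
SET XACT_ABORT ON;

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO [Test].[dbo].[T1] ([Title], [AVG])
    VALUES ('Tidd130', 130), ('Tidd230', 230);

    UPDATE [Test].[dbo].[T1]
    SET    [Title] = N'az2', [AVG] = 1
    WHERE  [Title] = N'az';

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo both statements if either one failed.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Re-raise the original error to the caller.
    THROW;
END CATCH;
```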
You use a transaction when you need the outcome to be atomic. You see this a lot in finance-related procedures, where you are gravely worried about ACID consistency of the data. Otherwise it is not necessary and introduces a great deal of locking overhead. There is a good question here and here that goes into great depth.
Edit
The takeaway is that if the procedure is all or nothing and must either succeed or fail as a whole, the correct decision is to use a transaction. If the procedure is not all or nothing, such as a simple insert or update, using a transaction is a) unnecessary and b) can introduce undue performance overhead due to additional locking.
I have some questions about programming with a DBMS (no specific language needed, but I'm using Java; no specific DBMS in mind).
I open a transaction, select a row, read a field, add 1 to the field, update, and then commit. What happens if another user runs a transaction on that field at the same time? Does the transaction crash, or what?
Example: I'm in a shop that has 1 kg of bread. Waiter1 has a client who needs 1 kg of bread. Waiter2 has the same. If the program is:
select row "bread"
if quantity>=1 kg then quantity=quantity-1
update row
What happens if the two waiters run the transaction at the same time?
What are the best ways to implement multiuser, avoiding "collision"? Select and lock, transaction only, or what?
When to use optimistic lock, or pessimistic?
When to use lock, and when is it not needed?
Why are you handling this on the application side? Relational databases are built to handle situations like this. Just use an update statement:
UPDATE some_table
SET quantity = quantity - 1
WHERE item_name = 'bread' AND quantity >= 1
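To tell the two waiters apart, the application can look at how many rows the statement changed. A small sketch, assuming T-SQL, where @@ROWCOUNT reports that number (other systems expose it differently, for example as the return value of executeUpdate in JDBC):

```sql
UPDATE some_table
SET    quantity = quantity - 1
WHERE  item_name = 'bread' AND quantity >= 1;

-- Exactly one of two concurrent callers sees a row count of 1;
-- the other finds quantity already at 0 and updates nothing.
IF @@ROWCOUNT = 0
    PRINT 'Not enough bread in stock; nothing was sold.';
ELSE
    PRINT 'Sold 1 kg of bread.';
```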
What you are looking for is Transaction Isolation. The official SQL standard would handle it like this:
If you don't lock explicitly, your database will generally lock either the row or even the table for you. Depending on your isolation level, the other transaction will either wait or raise an error.
Serializable
The second transaction will wait for the first to complete before it can do anything.
Repeatable reads
As soon as the first transaction reads, the second will wait until the first one has committed. Or the other way around, if the second transaction somehow starts reading before the first.
Read committed
If the first transaction writes before the second writes, the first will have to wait until the second has committed. Otherwise the second will have to wait until the first has committed.
Read uncommitted
Both can read without an issue, but the first to write will make the other write stall till the transaction has been committed.
If one of the transactions commits after the other reads, you could lose the data and end up with only 1 update.
The documentation says that serializable transactions execute one after another.
But in practice this does not seem to be true. Here are two almost identical transactions; the only difference is a 15-second delay.
#1:
set transaction isolation level serializable
go
begin transaction
if not exists (select * from articles where title like 'qwe')
begin
waitfor delay '00:00:15'
insert into articles (title) values ('qwe')
end
commit transaction
go
#2:
set transaction isolation level serializable
go
begin transaction
if not exists (select * from articles where title like 'qwe')
begin
insert into articles (title) values ('asd')
end
commit transaction
go
The second transaction was started a couple of seconds after the first one.
The result is a deadlock. The first transaction dies with this error:
Transaction (Process ID 58) was deadlocked on
lock resources with another process and has been chosen as the deadlock victim.
Rerun the transaction.
Is the conclusion that serializable transactions are not serial?
Serializable transactions don't necessarily execute serially.
The promise is just that transactions can only commit if the result would be as if they had executed serially (in any order).
The locking requirements to meet this guarantee can frequently lead to deadlock where one of the transactions needs to be rolled back. You would need to code your own retry logic to resubmit the failed query.
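A rough sketch of such retry logic, assuming the articles table from the question. The retry budget and the one-second pause are arbitrary illustrative choices; error 1205 is the deadlock-victim error shown in the question:

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

DECLARE @Retries int = 3;  -- hypothetical retry budget

WHILE @Retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        IF NOT EXISTS (SELECT * FROM articles WHERE title LIKE 'qwe')
            INSERT INTO articles (title) VALUES ('qwe');

        COMMIT TRANSACTION;
        BREAK;  -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;

        -- Give up on anything other than a deadlock, or once the
        -- retry budget is used up.
        IF ERROR_NUMBER() <> 1205 OR @Retries <= 1
        BEGIN
            THROW;
        END;

        SET @Retries -= 1;
        WAITFOR DELAY '00:00:01';  -- brief pause before resubmitting
    END CATCH;
END;
```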
See The Serializable Isolation Level for more about the differences between the logical description and implementation.
What happens here:
Because Transaction 1 runs at the serializable isolation level, it keeps the shared lock it obtained on table articles while it waits. This guarantees that the NOT EXISTS condition remains true until the transaction terminates.
Transaction 2 obtains a shared lock as well, which allows it to evaluate the NOT EXISTS condition. Then, for the INSERT statement, Transaction 2 needs to convert the shared lock to an exclusive lock but has to wait because Transaction 1 holds a shared lock.
When Transaction 1 finishes waiting, it also requests a conversion to exclusive mode. This is a deadlock, and one of the transactions has to be terminated.
I ran into a similar problem and found the following:
From MSDN:
SERIALIZABLE
Specifies the following:
Statements cannot read data that has been modified but not yet committed by other transactions.
No other transactions can modify data that has been read by the current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
The second point does not prevent both sessions from taking a shared lock, which is what leads to the deadlock. We solved it with a hint on the SELECT:
select * from articles WITH (UPDLOCK, ROWLOCK) where title like 'qwe'
I have not tried whether it would work in this case, but I think you would have to lock at the table level, since the row has not been created yet.
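One commonly used variation, shown here only as an untested sketch against the articles table from the question, combines UPDLOCK with HOLDLOCK. HOLDLOCK gives the SELECT serializable range semantics and UPDLOCK requests update-mode locks, so a second session should block at the existence check instead of acquiring a compatible shared lock. How wide the locked range is depends on whether title is indexed, which the question does not say:

```sql
BEGIN TRANSACTION;

-- The second session waits here until the first one commits,
-- rather than deadlocking later on lock conversion.
IF NOT EXISTS (SELECT * FROM articles WITH (UPDLOCK, HOLDLOCK)
               WHERE title LIKE 'qwe')
BEGIN
    INSERT INTO articles (title) VALUES ('qwe');
END;

COMMIT TRANSACTION;
```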
What is the mechanism for transaction rollback in SQL Server?
Every update in the database first writes an entry into the log containing a description of the change. For example, if you update a column value from A to B, the log will contain a record of the update, something like: in table T, column C was changed from A to B for the record with key K by the transaction with id I. If you roll back the transaction, the engine scans the log backward, looking for records of work done by your transaction, and undoes that work: when it finds the record of the update from A to B, it changes the value back to A. An insert is undone by deleting the inserted row. A delete is undone by inserting the row back. This is described in Transaction Log Logical Architecture and Write-Ahead Transaction Log.
This is the high-level explanation. The exact internal details of how this happens are undocumented for laymen and are not subject to your inspection or change.
Have a look at ROLLBACK TRANSACTION (Transact-SQL)
Rolls back an explicit or implicit transaction to the beginning of the transaction, or to a savepoint inside the transaction.
In terms of how it does it, all of the data modifications within the transaction are stored within the transaction log, with additional space also reserved in the log for the undo records, in the event that it has to rollback.
Each transaction log record contains sufficient information to reverse the change it has made, so that the change can be undone if required (as well as replayed in a DR scenario).
If we take a simple delete operation as an example (since I've decoded that here as an example of the log contents) the record being deleted is stored inside the transaction log entry of LOP_DELETE_ROWS and with some non-trivial effort you can decode and demonstrate the entire row is within the log entry.
If the transaction is to be rolled back, the undo space reserved in the log is going to be used, and the row would be re-inserted. The reason for the undo reservation of space is to ensure that the transaction log can not be filled up mid transaction, leaving it no space to complete or rollback.
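The observable effect of this undo processing can be sketched with a scratch temp table (#Demo and the savepoint name AfterUpdate are hypothetical and used only for illustration). A rollback to a savepoint undoes the delete by putting the row back, and the full rollback then restores the original value:

```sql
-- Hypothetical scratch table used only for this demonstration.
CREATE TABLE #Demo (Id int PRIMARY KEY, Val char(1));
INSERT INTO #Demo (Id, Val) VALUES (1, 'A');

BEGIN TRANSACTION;

UPDATE #Demo SET Val = 'B' WHERE Id = 1;  -- logged as a change from A to B
SAVE TRANSACTION AfterUpdate;             -- savepoint inside the transaction

DELETE FROM #Demo WHERE Id = 1;           -- the deleted row is kept in the log
ROLLBACK TRANSACTION AfterUpdate;         -- undoes only the DELETE
SELECT Id, Val FROM #Demo;                -- returns (1, 'B'): row re-inserted

ROLLBACK TRANSACTION;                     -- undoes the UPDATE as well
SELECT Id, Val FROM #Demo;                -- returns (1, 'A')

DROP TABLE #Demo;
```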