Transaction isolation level REPEATABLE READ causes deadlocks - sql-server

A part of my application updates a table, per business logic, after opening a connection at transaction isolation level REPEATABLE READ. In a rare scenario, this operation coincides with another part of the application, which opens a different connection and tries to reset the same record to its default value. When that happens I get the following error:
Msg 1205, Level 13, State 45, Line 7
Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I think I am able to reproduce the issue using the following example.
1.
create table Accounts
(
id int identity(1,1),
Name varchar(50),
Amount decimal
)
2.
insert into Accounts (Name,Amount) values ('ABC',5000)
insert into Accounts (Name,Amount) values ('WXY',4000)
insert into Accounts (Name,Amount) values ('XYZ',4500)
3.
Start a long transaction with isolation level REPEATABLE READ:
Set transaction isolation level REPEATABLE READ
begin tran
declare @var int
select @var = amount
from Accounts
where id = 1
waitfor delay '0:0:10'
if @var > 4000
update accounts
set amount = amount - 100;
Commit
4.
While step 3 above is still executing, start another transaction on a different connection:
Begin tran
update accounts
set Amount = 5000
where id = 1
commit tran
The transaction started in step 3 eventually completes, but the one started in step 4 fails with the following error message.
Msg 1205, Level 13, State 45, Line 7
Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
What are my options to eventually be able to run the transaction in step 4? The idea is to reset the record to a default value; anything being performed by other transactions should be overridden in this case. I don't see any issue when the two transactions are not concurrent.

The idea is to be able to reset the record to a default value
In what order do you want the updates applied? Do you want the "reset" to always come through? Then you need to perform the reset strictly after the update in step 3 has completed. Also, the reset update should use a higher lock mode to avoid the deadlock:
update accounts WITH (XLOCK)
set Amount = 5000
where id = 1
That way the reset will wait for the other transaction to finish first because the other tran has an S-lock.
Alternatively, have step 3 acquire a U-lock or X-lock.
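For example, a sketch of that alternative using the script from the question: the read in step 3 takes a U lock up front, so the competing reset in step 4 simply waits for the lock instead of deadlocking.
Set transaction isolation level REPEATABLE READ
begin tran
declare @var int
select @var = amount
from Accounts WITH (UPDLOCK) -- U lock instead of S lock; held until the end of the transaction
where id = 1
waitfor delay '0:0:10'
if @var > 4000
update accounts
set amount = amount - 100;
Commit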

You can set the deadlock priority of the transaction in step 4 to be higher.
For more details see http://technet.microsoft.com/en-us/library/ms186736.aspx
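For example, a minimal sketch of that suggestion applied to the batch in step 4:
SET DEADLOCK_PRIORITY HIGH; -- this session is less likely to be picked as the deadlock victim
Begin tran
update accounts
set Amount = 5000
where id = 1
commit tran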

Related

Serializable Isolation Level Confusion - Write Skew (Postgres)

I'm running Postgres 12 and am confused about the behavior of the SERIALIZABLE transaction isolation level.
Tables:
Events (id, difficulty)
Managers (id, level)
Intended behavior (within a serializable transaction):
check if there are 7 or more events of difficulty=2
if so, insert a manager with level=2
I'm running the following transactions at SERIALIZABLE but not seeing the behavior I expected (I expected the serializable transactions to detect write skew between the 2 sessions).
-- session 1:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT count(*) from events WHERE difficulty=2
-- RETURNS 7
-- now start session 2
-- session 2:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT id FROM events WHERE difficulty=2 LIMIT 1;
/*
id
----
4
*/
UPDATE events SET difficulty=1 WHERE id=4;
COMMIT;
now there are only 6 events of difficulty=2
-- back in session 1
-- since we have counted 7 events of difficulty=2 in this session, create a manager
INSERT INTO manager (level) VALUES (2);
COMMIT;
-- Expected write skew to be detected here because the event rows read earlier have since been updated (only 6 remain)
Unfortunately, our final state is now 6 events of difficulty=2 and a manager of level 2.
Why didn't serializable isolation prevent this write skew?
What am I misunderstanding about the serializable isolation use case? Why are the events with difficulty=2 not locked or watched by predicate locking or some other isolation mechanism?
SERIALIZABLE means that there is a way to execute the transactions serially (one after the other) so that the effect is the same. In your case, this equivalent serial execution would run session 1 first, then session 2, with the same effect.
You could say that session 1 executes "logically" before session 2.
Answering my own question after some thinking!
The serialization check is not preventing the two sessions from committing because it is possible to serialize the two transactions and still end up with a level 2 manager and 6 events of difficulty=2.
E.g.
Run session 1 (check if 7 events with difficulty=2, create manager level=2) COMMIT;
Run session 2 (remove one event, now 6 events with difficulty=2) COMMIT;
Output: 6 events, 1 manager
This is the same result as running concurrently so this is deemed an "acceptable" state for these two serializable transactions.
If you want to prevent this behaviour, session 2 can be updated to the following:
begin transaction isolation level serializable;
select count(*) from manager where level=2;
--if no managers
update events set difficulty=1 where id=4;
Now there is no logical way to end up with the state of 6 events and 1 manager under any serial ordering. The two possible outcomes of a sequential ordering are:
Session 1 runs, then session 2 runs: output = 7 events, 1 manager
Session 2 runs, then session 1 runs: output = 6 events, 0 managers
So in this case (with the updated session 2), one of your transactions would be aborted with a serialization failure, because the concurrent execution is no longer serializable.
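With that change, the losing session would be expected to fail with PostgreSQL's usual serialization error rather than commit, along the lines of:
ERROR:  could not serialize access due to read/write dependencies among transactions
HINT:  The transaction might succeed if retried.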
Write skew doesn't occur in SERIALIZABLE in PostgreSQL:
| Isolation Level | Write Skew |
| --- | --- |
| READ UNCOMMITTED | Yes |
| READ COMMITTED | Yes |
| REPEATABLE READ | Yes |
| SERIALIZABLE | No |
I experimented to see whether write skew occurs under SERIALIZABLE in PostgreSQL. There is an event table with name and user columns, as shown below.
event table:
| name | user |
| --- | --- |
| Make Sushi | John |
| Make Sushi | Tom |
Then, I took the steps below for the write skew experiment. *Only 3 users can join the event "Make Sushi":
| Flow | Transaction (T1) | Transaction (T2) | Explanation |
| --- | --- | --- | --- |
| Step 1 | BEGIN; | | T1 starts. |
| Step 2 | | BEGIN; | T2 starts. |
| Step 3 | SELECT count(*) FROM event WHERE name = 'Make Sushi'; returns 2 | | T1 reads 2, so only one more user can join. |
| Step 4 | | SELECT count(*) FROM event WHERE name = 'Make Sushi'; returns 2 | T2 reads 2, so only one more user can join. |
| Step 5 | INSERT INTO event values ('Make Sushi', 'Lisa'); | | T1 inserts Lisa into the event table. |
| Step 6 | COMMIT; | | T1 commits. |
| Step 7 | | INSERT INTO event values ('Make Sushi', 'Kai'); ERROR: could not serialize access due to read/write dependencies among transactions | T2 cannot insert Kai into the event table and gets an error. |
| Step 8 | | COMMIT; | T2 rolls back with the COMMIT. *Write skew doesn't occur. |

PostgreSQL's Repeatable Read allows phantom reads, but its documentation says it does not

I have a problem with PostgreSQL's repeatable read isolation level.
I ran an experiment on the repeatable read isolation level's behavior with respect to phantom reads.
PostgreSQL's manual says "The table also shows that PostgreSQL's Repeatable Read implementation does not allow phantom reads."
But a phantom read occurred:
CREATE TABLE public.testmodel
(
id bigint NOT NULL
);
--Session 1 --
BEGIN TRANSACTION ISOLATION LEVEL Repeatable Read;
INSERT INTO TestModel(ID)
VALUES (10);
Select sum(ID)
From TestModel
where ID between 1 and 100;
--COMMIT;
--Session 2--
BEGIN TRANSACTION ISOLATION LEVEL Repeatable Read;
INSERT INTO TestModel(ID)
VALUES (10);
Select sum(ID)
From TestModel
where ID between 1 and 100;
COMMIT;
Steps I followed:
Create the table.
Run session 1 (with the COMMIT statement commented out).
Run session 2.
Run the COMMIT statement in session 1.
To my surprise, both of them (session 1 and session 2) worked without any exceptions.
As far as I understand from the documentation, that shouldn't have happened.
I was expecting session 1 to throw an exception when committing after session 2.
What is the reason for this? I am confused.
The docs you referenced define a "phantom read" as a situation where:
A transaction re-executes a query returning a set of rows
that satisfy a search condition and finds that the set of rows
satisfying the condition has changed due to another recently-committed
transaction.
In other words, a phantom read has occurred if you run the same query twice (or two queries seeking the same data), and you get different results. The REPEATABLE READ isolation level prevents this from happening, i.e. if you repeat the same read, you will get the same answer. It does not guarantee that either of those results reflects the current state of the database.
Since you are only reading data once in each transaction, this cannot be an example of a phantom read. It falls under the more general category of a "serialization anomaly", i.e. behaviour which could not occur if the transactions were executed serially. This type of anomaly is only avoided at the SERIALIZABLE isolation level.
There is an excellent set of examples on the Postgres wiki, describing anomalies which are allowed under REPEATABLE READ, but prevented under SERIALIZABLE isolation:
https://wiki.postgresql.org/wiki/SSI
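For comparison, a sketch (not part of the original answer) of the same experiment run at SERIALIZABLE, where this anomaly would be expected to be caught:
--Session 1--
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO TestModel(ID) VALUES (10);
SELECT sum(ID) FROM TestModel WHERE ID BETWEEN 1 AND 100;
-- run session 2 (the same statements, ending with COMMIT) here, then:
COMMIT;
-- expected: ERROR: could not serialize access due to read/write dependencies among transactions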
Phantom reads don't occur in REPEATABLE READ in PostgreSQL, as the documentation says. *I explain more about phantom reads in my answer on Stack Overflow.
I experimented to see whether a phantom read occurs in REPEATABLE READ in PostgreSQL, using 2 command prompts.
First, to set REPEATABLE READ, I ran the query below, then logged out and logged in again:
ALTER DATABASE postgres SET DEFAULT_TRANSACTION_ISOLATION TO 'repeatable read';
And I created a person table with id and name columns, as shown below.
person table:
| id | name |
| --- | --- |
| 1 | John |
| 2 | David |
Then, I did the steps below with PostgreSQL queries:
| Flow | Transaction 1 (T1) | Transaction 2 (T2) | Explanation |
| --- | --- | --- | --- |
| Step 1 | BEGIN; | | T1 starts. |
| Step 2 | | BEGIN; | T2 starts. |
| Step 3 | SELECT * FROM person; returns 1 John, 2 David | | T1 reads 2 rows. |
| Step 4 | | INSERT INTO person VALUES (3, 'Tom'); | T2 inserts the row (3, Tom) into the person table. |
| Step 5 | | COMMIT; | T2 commits. |
| Step 6 | SELECT * FROM person; returns 1 John, 2 David | | T1 still reads 2 rows instead of 3 rows after T2 commits. *Phantom read doesn't occur!! |
| Step 7 | COMMIT; | | T1 commits. |
You are misunderstanding what "does not allow phantom reads" means.
This simply means that phantom reads can not happen, not that there will be an error.
Session 2 will not see any changes committed by other transactions until the transaction in session 2 is itself committed.
REPEATABLE READ guarantees a consistent view of the database for the duration of the transaction, in which only changes made by that transaction itself are visible, but no other changes. There is no need to throw an error.
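A quick way to see what that guarantee actually covers, using the table from the question (a sketch, assuming session 2 has already committed its INSERT): repeat the read inside session 1 and the result does not change.
-- Session 1, still inside its REPEATABLE READ transaction:
SELECT sum(ID)
FROM TestModel
WHERE ID BETWEEN 1 AND 100;
-- returns the same sum as the first read; the row committed by session 2 is not visible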

Lost update in snapshot vs all the rest isolation levels

Let's suppose we create a new table and enable snapshot isolation for our database:
alter database database_name set allow_snapshot_isolation on
create table marbles (id int primary key, color char(5))
insert marbles values(1, 'Black')
insert marbles values(2, 'White')
Next, in session 1, begin a snapshot transaction:
set transaction isolation level snapshot
begin tran
update marbles set color = 'Blue' where id = 2
Now, before committing the changes, run the following in session 2:
set transaction isolation level snapshot
begin tran
update marbles set color = 'Yellow' where id = 2
Then, when we commit session 1, session 2 fails with an error saying the transaction was aborted. I understand that this prevents a lost update.
If we follow these steps one by one under any other isolation level (serializable, repeatable read, read committed or read uncommitted), session 2 executes successfully, applying a new update to our table.
Could someone please explain to me why this is happening?
For me this is some kind of lost update, but it seems like only snapshot isolation prevents it.
Could someone please explain to me why this is happening?
Because under all the other isolation levels the point-in-time at which the second session first sees the row is after the first transaction commits. Locking is a kind of time travel. A session enters a lock wait and is transported forward in time to when the resource is eventually available.
For me this is some kind of lost update
No. It's not. Both updates were properly completed, and the final state of the row would have been the same if the transactions had been 10 minutes apart.
In a lost update scenario, each session will read the row before attempting to update it, and the results of the first transaction are needed to properly complete the second transaction. EG if each is incrementing a column by 1.
And under locking READ COMMITTED, REPEATABLE READ, and SERIALIZABLE the SELECT would be blocked, and no lost update would occur. And under READ_COMMITTED_SNAPSHOT the SELECT should have a UPDLOCK hint, and it would block too.
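As a sketch of that read-then-update pattern, here is the marbles example with a hypothetical counter column (hits) added for illustration. Without the UPDLOCK hint, two concurrent sessions could both read the same value and one increment would be lost; with it, the second reader blocks until the first commits.
set transaction isolation level read committed
begin tran
declare @hits int
select @hits = hits
from marbles with (UPDLOCK) -- U lock blocks a second session running the same read
where id = 2
update marbles
set hits = @hits + 1
where id = 2
commit tran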

Query from multiple threads on a database table

I have a database table with thousands of entries. I have multiple worker threads which each pick up one row at a time and do some work on it (roughly one second per row). While picking up a row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads pick up the same row.
My general question is: what design approach should I follow here to ensure that each thread picks up a unique row and does its work independently?
Note: multiple threads run in parallel to speed up the processing of the database rows, so I would like the critical section or exclusive lock to be as small as possible.
Just to give some context, below is the stored proc which picks up a row from the table after it has updated the flag on that row. Please note that the stored proc is not compilable, as I have removed unnecessary portions from it, but that is generally its structure.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the UPDATE statement (note that the update is done after taking a lock) in one thread is not visible to the other threads until the transaction is committed. And as there is a SELECT statement (which takes around 50 ms) between the UPDATE and the COMMIT, in about 20% of cases the UPDATE statement in one thread picks up a row that has already been picked up by another thread.
I hope I am clear enough here.
USE [mydatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
-- some variable declaration here
BEGIN TRANSACTION
-- check if there are blocking rows in the request table
-- FM: Remove records that don't qualify for operation.
-- delete operation on the table to remove rows we don't want to process
delete FROM request where somecondition = 1
-- Identify the requests to process
DECLARE @TmpTableVar table(TmpRequestId int NULL);
UPDATE TOP(1) request
WITH (ROWLOCK)
SET Lock = DateAdd(mi, 5, GETDATE())
OUTPUT INSERTED.ID INTO @TmpTableVar
WHERE (Lock IS NULL OR GETDATE() > Lock) -- not locked, or lock expired
AND GETDATE() > NextRetry -- next in the queue
IF (@@ROWCOUNT = 0)
BEGIN
ROLLBACK TRANSACTION
RETURN
END
select @RequestID = TmpRequestId from @TmpTableVar
-- Get details about the request that has been just updated
SELECT somerows
FROM request
WHERE somecondition = 1
COMMIT TRANSACTION
END
The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.
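A sketch of the hinted-SELECT variant against the request table from the question (a rough adaptation, not the poster's exact procedure): READPAST skips rows already locked by other workers, UPDLOCK keeps the claimed row from being handed out twice, and ROWLOCK keeps the lock footprint small.
DECLARE @RequestID int;
BEGIN TRANSACTION;
SELECT TOP (1) @RequestID = ID
FROM request WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE (Lock IS NULL OR GETDATE() > Lock)
AND GETDATE() > NextRetry;
IF @RequestID IS NOT NULL
BEGIN
    UPDATE request
    SET Lock = DATEADD(mi, 5, GETDATE())
    WHERE ID = @RequestID;
    -- read the request details here, inside the same transaction
END
COMMIT TRANSACTION;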
You need to set a transaction isolation level in SQL to isolate your row, but this can have an impact on your performance.
See the sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
In closing, there is no single best isolation level to use. You need to analyze the positive and negative points of each isolation level and test your system's performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)

UPDATE heap table - Deadlock on RID

I'm setting up a test case to prove a certain deadlock scenario and require some insight on what is going on.
I have a heap table, conveniently called HeapTable. This table is updated by 2 transactions simultaneously.
Transaction 1:
BEGIN TRAN
UPDATE HeapTable
SET FirstName = 'Dylan'
WHERE FirstName = 'Ovidiu';
WAITFOR DELAY '00:00:15';
UPDATE HeapTable
SET FirstName = 'Bob'
WHERE FirstName = 'Thierry';
ROLLBACK TRANSACTION
Transaction 2:
BEGIN TRAN
UPDATE HeapTable
SET FirstName = 'Pierre'
WHERE FirstName = 'Michael';
ROLLBACK TRAN
I fire off transaction 1 first, closely followed by transaction 2. As expected transaction 1 will claim some exclusive locks, together with some intent exclusive ones. Transaction 2 will come in and request an Update lock on the same RID:
spid dbid ObjId IndId Type Resource Mode Status
55 5 711673583 0 RID 1:24336:10 X GRANT
57 5 711673583 0 RID 1:24336:10 U WAIT
I was kind of surprised to see the second transaction ask for an Update lock on the same RID, since I thought this pointed to a single record & both update statements handle different data. I was somehow expecting a conflict on page level instead.
When the second update of transaction 1 kicks in, transaction 2 is chosen as the deadlock victim, resulting in a rollback of transaction 2 and completion of transaction 1.
Can someone explain to me why the second transaction would require an update lock on the same RID although it is updating a different record?
Can someone explain to me why the second transaction would require an update lock on the same RID although it is updating a different record?
This can be rephrased as: how does an UPDATE statement acquire locks on the table it needs to update when no indexes are present?
SQL Server takes an intent exclusive (IX) lock on the page and then takes a U lock on each row of the page before reading it; if the row matches the value that is going to be updated, the U lock is converted to an X lock.
This U-lock strategy ensures that no other incompatible lock can be taken on the same row in the meantime.
Please see the link below by Kalen Delaney for in-depth details:
http://sqlblog.com/blogs/kalen_delaney/archive/2009/11/13/update-locks.aspx
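Not part of the original answer, but one common mitigation follows directly from this explanation: give the updates a seekable index on FirstName so each UPDATE only takes U/X locks on the rows it actually touches, instead of scanning every RID in the heap, which makes this particular deadlock much less likely. A minimal sketch:
CREATE NONCLUSTERED INDEX IX_HeapTable_FirstName ON HeapTable (FirstName);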
