Read Committed vs Repeatable Read Example

Read Committed vs Repeatable Read Example - sql-server

I'm trying to execute the following two queries in SQL Server Management Studio (in separate query windows). I run them in the same order I typed them here.
When isolation level is set to READ COMMITTED they execute ok, but when it's set to REPEATABLE READS the transactions are dead locked.
Can you please help me to understand what is dead locked here?
First:
begin tran
declare #a int, #b int
set #a = (select col1 from Test where id = 1)
set #b = (select col1 from Test where id = 2)
waitfor delay '00:00:10'
update Test set col1 = #a + #b where id = 1
update Test set col1 = #a - #b where id = 2
commit
Second:
begin tran
update Test set col1 = -1 where id = 1
commit
UPD Answer is laready given but folowing the advice I'm inserting the deadlock graph

In both cases the selects use a shared lock and the updates an exclusive lock.
In READ COMMITTED mode, the shared lock is released immediately after the select finishes.
In REPEATABLE READS mode, the shared locks for the selects are held untill the end of the transaction, to ensure that no other sessions can change the data that was read. A new read within the same transaction is garantueed to yield the same results, unless the data was changed in the current session/transaction
Originally I thought, that you executed "First" in both sessions. Then the explanation would be trivial: both sessions acquire and get a shared lock, which then blocks the exclusive lock required for the updates.
The situation with a second session doing only an update is a little more complex. An update staement will first acquire an update lock (UPDLOCK) for selecting the rows that must be updated, which is probably similar to a shared lock, but at least not blocked by a shared lock. Next, when the data is actually updated, it tries to convert the update lock to an exclusive lock, which fails, because the first session is still holding the shared lock. Now both sessions block each other.

Related

SQL Server updates are sequential or parallel?

I have a procedure which has only two update statements. Both are on same table updating data based on different columns. For example
update table1
set column1 = somevalue, column2 = somevalue
where column3 = somevalue
update table1
set column3 = somevalue, column2 = somevalue
where column1 = somevalue
Intermittently I am getting an error
Transaction (Process ID "different number)) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim
Process id is pointing to the same stored procedure when I check in SQL Server using command sp_who.
There can be a situation where both update statements can update same row. can this be a reason of deadlock?
CREATE PROCEDURE update_tib_details
(#tib_id INT, #sys_id INT)
AS
BEGIN
UPDATE tib_sys_asc
SET tib_id = #tib_id
WHERE sys_id = #sys_id
UPDATE tib_sys_asc
SET sys_id = #sys_id
WHERE tib_id = #tib_id
END

This won't happen if the updates execute in the same process. A single process can't cause a deadlock with itself.
If this update is being trigged by some other process, and isn't somehow protected from concurrency, you could experience a deadlock.
Deadlock occurs when two processes are each mid-transaction and waiting on the other to complete before they can continue. In this case, for example
Process A starts, and updates row 1
Process B starts, and updates row 2
Process A now wants to update row 2, and must wait for Process B to commit
Process B now wants to update row 1, and must wait for Process A to commit
The database engine is pretty good at detecting these cross dependencies, and chooses which process to kill. If Process B is killed, Process A can finally update row 2 and commit, or vice versa.
In your case you should decide what the appropriate end result should be. If you don't care in which order operations complete (last in wins) then just commit after each update. If you do care, then you should be able to write an exclusive lock around the entire operation (i.e. Process B waits until the entirety of Process A has completed).

When UPDLOCK get released in SQL server?

Recently I have gone through with Hints and Locks in SQL server. While google about this topic I have read one blog where some query have been written which I am not bale to understand. Here it is
BOL states: Use update locks instead of shared locks while reading a table, and hold locks until the end of the statement or transaction. I have some trouble translating this. Does this mean that the update locks are released after the execution of the SELECT statement, unless the SELECT statement in within a transaction?
I other words, are my assumptions in the following 2 scenario's correct?
Scenario 1: no transaction
SELECT something FROM table WITH (UPDLOCK)
/* update locks released */
Scenario 2: with transaction
BEGIN TRANSACTION
SELECT something FROM table WITH (UPDLOCK)
/* some code, including an UPDATE */
COMMIT TRANSACTION
/* update locks released */
Example for scenario 2 (referred for stackoverflow blog)
BEGIN TRAN
SELECT Id FROM Table1 WITH (UPDLOCK)
WHERE AlertDate IS NULL;
UPDATE Table1 SET AlertDate = getutcdate()
WHERE AlertDate IS NULL;
COMMIT TRAN
Please help to understand the above query.
My second question is: once execution of select statement completed at same time UPDLOCK get released or not?

Your assumption in scenario 2 is correct.
To answer your second question, no. The Update locks are held on the selected row(s) until the transaction ends, or until converted to exclusive locks when the update statement modifies those row(s). Step through each statement one at a time using SSMS to verify.
BEGIN TRAN
-- execute sp_lock in second session - no locks yet
SELECT Id FROM Table1 WITH (UPDLOCK) WHERE AlertDate IS NULL;
-- execute sp_lock in second session - update locks present
UPDATE Table1 SET AlertDate = getutcdate() WHERE AlertDate IS NULL;
-- execute sp_lock in second session - update (U) locks are replace by exclusive locks (X) for all row(s) returned by SELECT and modified by the UPDATE (Lock Conversion).
-- Update locks (U) continue to be held for any row(s) returned by the SELECT but not modified by the UPDATE
-- exclusive locks (X) are also held on all rows not returned by SELECT but modified by UPDATE. Internally, lock conversion still occurs, because UPDATE statements must read and write.
COMMIT TRAN
-- sp_lock in second session - all locks gone.
As for what is going on in scenario 1, all T-SQL statements exist either in an implicit or explicit transaction. Senario 1 is implicitly:
BEGIN TRAN
SELECT something FROM table WITH (UPDLOCK)
-- execute sp_lock in second session - update locks (U) will be present
COMMIT TRAN;
-- execute sp_lock in second session - update locks are gone.

Does this mean that the update locks are released after the execution of the SELECT statement, unless the SELECT statement in within a transaction?
The locks will be released as soon as the row is read..but the lock hold will be U lock,so any parallel transaction trying to modify this will have to wait
if you wrap above select in a transaction,locks will be released only when the transaction is committed,so any parallel transaction acquiring locks incompatible with U lock will have to wait
begin tran
select * from t1 with (updlock)
for the below second scenario
BEGIN TRANSACTION
SELECT something FROM table WITH (UPDLOCK)
/* some code, including an UPDATE */
COMMIT TRANSACTION
Imagine, if your select query returned 100 rows,all will use U lock and imagine the update in same transaction affects 2 rows,the two rows will be converted to x locks .so now your query will have 98 u locks and 2 xlocks until the transaction is committed
I would like to think Updlock as repeatable read,any new rows can be added,but any parallel transaction can't delete or update existing rows

Query from multiple threads on a database table

I have a database table with thousands of entries. I have multiple worker threads which pick up one row at a time, does some work (takes roughly one second each). While picking up the row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads are picking up the same row.
My general question is that what general design approach should I follow here to ensure that each thread picks up unique rows and does their task independently.
Note : Multiple threads are running in parallel to hasten the processing of the database rows. So I would like to have a as small as possible critical segment or exclusive lock.
Just to give some context, below is the stored proc which picks up the rows from the table after it has updated the flag on the row. Please note that the stored proc is not compilable as I have removed unnecessary portions from it. But generally that's the structure of it.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the update statement (note that the update is done after taking up a lock) in one thread is not visible to the other thread unless the transaction is committed. And as there is a SELECT statement (which takes around 50ms) between the UPDATE and the TRANSACTION COMMIT, on 20% cases the UPDATE statement in a thread picks up a row which has already been processed.
I hope I am clear enough here.
USE ['mydatabase']
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
-- some variable declaration here
BEGIN TRANSACTION
-- check if there are blocking rows in the request table
-- FM: Remove records that don't qualify for operation.
-- delete operation on the table to remove rows we don't want to process
delete FROM request where somecondition = 1
-- Identify the requests to process
DECLARE #TmpTableVar table(TmpRequestId int NULL);
UPDATE TOP(1) request
WITH (ROWLOCK)
SET Lock = DateAdd(mi, 5, GETDATE())
OUTPUT INSERTED.ID INTO #TmpTableVar
FROM request tur
WHERE (Lock IS NULL OR GETDATE() > Lock) -- not locked or lock expired
AND GETDATE() > NextRetry -- next in the queue
IF(##RowCount = 0)
BEGIN
ROLLBACK TRANSACTION
RETURN
END
select #RequestID = TmpRequestId from #TmpTableVar
-- Get details about the request that has been just updated
SELECT somerows
FROM request
WHERE somecondition = 1
COMMIT TRANSACTION
END

The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.

You need start a transaction isolation level on sql for isolation your line, but this can impact on your performance.
Look the sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
Finishing, not exist a better way for use isolation level. You need analyze the positive and negative point for each level isolation and test your system performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)

SELECT on data that might be under SERIALIZABLE lock

I have a part of application that constantly updates values in table rows (1-100 rows).
Since this data integrity is important i am using SERIALIZABLE lock on transaction in functions reading and updating those rows.
Now my question is if i execute a simple read only SELECT (without lock) on rows that is currently used by a transaction i could probably get a DEADLOCK exception right?
So would that mean that i would still need a retry mechanism in case of DEADLOCK even in case of simple SELECT?

Now that I know what your specific business scenario is (from the comments), here's how you'd do something like what you're proposing without having to implement serializable isolation
CREATE PROCEDURE [dbo].[updateUserState] (#UserID int)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION
SELECT [State]
FROM [dbo].[UserState] WITH (UPDLOCK)
WHERE [UserID] = #UserID;
IF ([State] = 'logged out')
BEGIN
UPDATE [us]
SET [State] = 'logged in'
FROM [dbo].[UserState] AS [us]
WHERE [UserID] = #UserID;
END
COMMIT TRANSACTION
END
Note that this is simplified, but presents the main idea. The UPDLOCK hint on the SELECT statement is the key. It says "try to select data as through I was going to do an update (which you are! just later) and keep it until the end of the transaction". In your example, if T2 comes in and T1 is still running, T2 will be unable to obtain the update lock and will thus wait until T1 is complete (either successfully or no). Also note that setting the transaction isolation level explicitly is just for completeness; READ COMMITTED is the default in SQL Server.

Record locking and concurrency issues

My logical schema is as follows:
A header record can have multiple child records.
Multiple PCs can be inserting Child records, via a stored procedure that accepts details about the child record, and a value.
When a child record is inserted, a header record may need to be inserted if one doesn't exist with the specified value.
You only ever want one header record inserted for any given "value". So if two child records are inserted with the same "Value" supplied, the header should only be created once. This requires concurrency management during inserts.
Multiple PCs can be querying unprocessed header records, via a stored procedure
A header record needs to be queried if it has a specific set of child records, and the header record is unprocessed.
You only ever want one machine PC to query and process each header record. There should never be an instance where a header record and it's children should be processed by more than one PC. This requires concurrency management during selects.
So basically my header query looks like this:
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT TOP 1
*
INTO
#unprocessed
FROM
Header h WITH (READPAST, UPDLOCK)
JOIN
Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
JOIN
Child part2 ON part1.HeaderID = part2.HeaderID AND
WHERE
h.Processed = 0x0;
UPDATE
Header
SET
Processed = 0x1
WHERE
HeaderID IN (SELECT [HeaderID] FROM #unprocessed);
SELECT * FROM #unprocessed
COMMIT TRAN
So the above query ensures that concurrent queries never return the same record.
I think my problem is on the insert query. Here's what I have:
DECLARE #HeaderID INT
BEGIN TRAN
--Create header record if it doesn't exist, otherwise get it's HeaderID
MERGE INTO
Header WITH (HOLDLOCK) as target
USING
(
SELECT
[Value] = #Value, --stored procedure parameter
[HeaderID]
) as source ([Value], [HeaderID]) ON target.[Value] = source.[Value] AND
target.[Processed] = 0
WHEN MATCHED THEN
UPDATE SET
--Get the ID of the existing header
#HeaderID = target.[HeaderID],
[LastInsert] = sysdatetimeoffset()
WHEN NOT MATCHED THEN
INSERT
(
[Value]
)
VALUES
(
source.[Value]
)
--Get new or existing ID
SELECT #HeaderID = COALESCE(#HeaderID , SCOPE_IDENTITY());
--Insert child with the new or existing HeaderID
INSERT INTO
[Correlation].[CorrelationSetPart]
(
[HeaderID],
[Name]
)
VALUES
(
#HeaderID,
#Name --stored procedure parameter
);
My problem is that insertion query is often blocked by the above selection query, and I'm receiving timeouts. The selection query is called by a broker, so it can be called fairly quickly. Is there a better way to do this? Note, I have control over the database schema.

To answer the second part of the question
You only ever want one machine PC to query and process each header
record. There should never be an instance where a header record and
it's children should be processed by more than one PC
Have a look at sp_getapplock.
I use app locks within the similar scenario. I have a table of objects that must be processed, similar to your table of headers. The client application runs several threads simultaneously. Each thread executes a stored procedure that returns the next object for processing from the table of objects. So, the main task of the stored procedure is not to do the processing itself, but to return the first object in the queue that needs processing.
The code may look something like this:
CREATE PROCEDURE [dbo].[GetNextHeaderToProcess]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE #VarHeaderID int = NULL;
DECLARE #VarLockResult int;
EXEC #VarLockResult = sp_getapplock
#Resource = 'GetNextHeaderToProcess_app_lock',
#LockMode = 'Exclusive',
#LockOwner = 'Transaction',
#LockTimeout = 60000,
#DbPrincipal = 'public';
IF #VarLockResult >= 0
BEGIN
-- Acquired the lock
-- Find the most suitable header for processing
SELECT TOP 1
#VarHeaderID = h.HeaderID
FROM
Header h
JOIN Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
JOIN Child part2 ON part1.HeaderID = part2.HeaderID
WHERE
h.Processed = 0x0
ORDER BY ....;
-- sorting is optional, but often useful
-- for example, order by some timestamp to process oldest/newest headers first
-- Mark the found Header to prevent multiple processing.
UPDATE Header
SET Processed = 2 -- in progress. Another procedure that performs the actual processing should set it to 1 when processing is complete.
WHERE HeaderID = #VarHeaderID;
-- There is no need to explicitly verify if we found anything.
-- If #VarHeaderID is null, no rows will be updated
END;
-- Return found Header, or no rows if nothing was found, or failed to acquire the lock
SELECT
#VarHeaderID AS HeaderID
WHERE
#VarHeaderID IS NOT NULL
;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
This procedure should be called from the procedure that does actual processing. In my case the client application does the actual processing, in your case it may be another stored procedure. The idea is that we acquire the app lock for the short time here. Of course, if the actual processing is fast you can put it inside the lock, so only one header can be processed at a time.
Once the lock is acquired we look for the most suitable header to process and then set its Processed flag. Depending on the nature of your processing you can set the flag to 1 (processed) right away, or set it to some intermediary value, like 2 (in progress) and then set it to 1 (processed) later. In any case, once the flag is not zero the header will not be chosen for processing again.
These app locks are separate from normal locks that DB puts when reading and updating rows and they should not interfere with inserts. In any case, it should be better than locking the whole table as you do WITH (UPDLOCK).
Returning to the first part of the question
You only ever want one header record inserted for any given "value".
So if two child records are inserted with the same "Value" supplied,
the header should only be created once.
You can use the same approach: acquire app lock in the beginning of the inserting procedure (with some different name than the app lock used in querying procedure). Thus you would guarantee that inserts happen sequentially, not simultaneously. BTW, in practice most likely inserts can't happen simultaneously anyway. The DB would perform them sequentially internally. They will wait for each other, because each insert locks a table for update. Also, each insert is written to transaction log and all writes to transaction log are also sequential. So, just add sp_getapplock to the beginning of your inserting procedure and remove that WITH (HOLDLOCK) hint in the MERGE.
The caller of the GetNextHeaderToProcess procedure should handle correctly the situation when procedure returns no rows. This can happen if the lock acquisition timed out, or there are simply no more headers to process. Usually the processing part simply retries after a while.
Inserting procedure should check if the lock acquisition failed and retry the insert or report the problem to the caller somehow. I usually return the generated identity ID of the inserted row (the ChildID in your case) to the caller. If procedure returns 0 it means that insert failed. The caller may decide what to do.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight