SELECT on data that might be under SERIALIZABLE lock - sql-server

I have a part of application that constantly updates values in table rows (1-100 rows).
Since this data integrity is important i am using SERIALIZABLE lock on transaction in functions reading and updating those rows.
Now my question is if i execute a simple read only SELECT (without lock) on rows that is currently used by a transaction i could probably get a DEADLOCK exception right?
So would that mean that i would still need a retry mechanism in case of DEADLOCK even in case of simple SELECT?

Now that I know what your specific business scenario is (from the comments), here's how you'd do something like what you're proposing without having to implement serializable isolation
CREATE PROCEDURE [dbo].[updateUserState] (#UserID int)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION
SELECT [State]
FROM [dbo].[UserState] WITH (UPDLOCK)
WHERE [UserID] = #UserID;
IF ([State] = 'logged out')
BEGIN
UPDATE [us]
SET [State] = 'logged in'
FROM [dbo].[UserState] AS [us]
WHERE [UserID] = #UserID;
END
COMMIT TRANSACTION
END
Note that this is simplified, but presents the main idea. The UPDLOCK hint on the SELECT statement is the key. It says "try to select data as through I was going to do an update (which you are! just later) and keep it until the end of the transaction". In your example, if T2 comes in and T1 is still running, T2 will be unable to obtain the update lock and will thus wait until T1 is complete (either successfully or no). Also note that setting the transaction isolation level explicitly is just for completeness; READ COMMITTED is the default in SQL Server.

Related

Query from multiple threads on a database table

I have a database table with thousands of entries. I have multiple worker threads which pick up one row at a time, does some work (takes roughly one second each). While picking up the row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads are picking up the same row.
My general question is that what general design approach should I follow here to ensure that each thread picks up unique rows and does their task independently.
Note : Multiple threads are running in parallel to hasten the processing of the database rows. So I would like to have a as small as possible critical segment or exclusive lock.
Just to give some context, below is the stored proc which picks up the rows from the table after it has updated the flag on the row. Please note that the stored proc is not compilable as I have removed unnecessary portions from it. But generally that's the structure of it.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the update statement (note that the update is done after taking up a lock) in one thread is not visible to the other thread unless the transaction is committed. And as there is a SELECT statement (which takes around 50ms) between the UPDATE and the TRANSACTION COMMIT, on 20% cases the UPDATE statement in a thread picks up a row which has already been processed.
I hope I am clear enough here.
USE ['mydatabase']
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
-- some variable declaration here
BEGIN TRANSACTION
-- check if there are blocking rows in the request table
-- FM: Remove records that don't qualify for operation.
-- delete operation on the table to remove rows we don't want to process
delete FROM request where somecondition = 1
-- Identify the requests to process
DECLARE #TmpTableVar table(TmpRequestId int NULL);
UPDATE TOP(1) request
WITH (ROWLOCK)
SET Lock = DateAdd(mi, 5, GETDATE())
OUTPUT INSERTED.ID INTO #TmpTableVar
FROM request tur
WHERE (Lock IS NULL OR GETDATE() > Lock) -- not locked or lock expired
AND GETDATE() > NextRetry -- next in the queue
IF(##RowCount = 0)
BEGIN
ROLLBACK TRANSACTION
RETURN
END
select #RequestID = TmpRequestId from #TmpTableVar
-- Get details about the request that has been just updated
SELECT somerows
FROM request
WHERE somecondition = 1
COMMIT TRANSACTION
END
The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.
You need start a transaction isolation level on sql for isolation your line, but this can impact on your performance.
Look the sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
Finishing, not exist a better way for use isolation level. You need analyze the positive and negative point for each level isolation and test your system performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)

Is a single SQL Server statement atomic and consistent?

Is a statement in SQL Server ACID?
What I mean by that
Given a single T-SQL statement, not wrapped in a BEGIN TRANSACTION / COMMIT TRANSACTION, are the actions of that statement:
Atomic: either all of its data modifications are performed, or none of them is performed.
Consistent: When completed, a transaction must leave all data in a consistent state.
Isolated: Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions.
Durable: After a transaction has completed, its effects are permanently in place in the system.
The reason I ask
I have a single statement in a live system that appears to be violating the rules of the query.
In effect my T-SQL statement is:
--If there are any slots available,
--then find the earliest unbooked transaction and mark it booked
UPDATE Transactions
SET Booked = 1
WHERE TransactionID = (
SELECT TOP 1 TransactionID
FROM Slots
INNER JOIN Transactions t2
ON Slots.SlotDate = t2.TransactionDate
WHERE t2.Booked = 0 --only book it if it's currently unbooked
AND Slots.Available > 0 --only book it if there's empty slots
ORDER BY t2.CreatedDate)
Note: But a simpler conceptual variant might be:
--Give away one gift, as long as we haven't given away five
UPDATE Gifts
SET GivenAway = 1
WHERE GiftID = (
SELECT TOP 1 GiftID
FROM Gifts
WHERE g2.GivenAway = 0
AND (SELECT COUNT(*) FROM Gifts g2 WHERE g2.GivenAway = 1) < 5
ORDER BY g2.GiftValue DESC
)
In both of these statements, notice that they are single statements (UPDATE...SET...WHERE).
There are cases where the wrong transaction is being "booked"; it's actually picking a later transaction. After staring at this for 16 hours, I'm stumped. It's as though SQL Server is simply violating the rules.
I wondered what if the results of the Slots view is changing before the update happens? What if SQL Server is not holding SHARED locks on the transactions on that date? Is it possible that a single statement can be inconsistent?
So I decided to test it
I decided to check if the results of sub-queries, or inner operations, are inconsistent. I created a simple table with a single int column:
CREATE TABLE CountingNumbers (
Value int PRIMARY KEY NOT NULL
)
From multiple connections, in a tight loop, I call the single T-SQL statement:
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
In other words the pseudo-code is:
while (true)
{
ADOConnection.Execute(sql);
}
And within a few seconds I get:
Violation of PRIMARY KEY constraint 'PK__Counting__07D9BBC343D61337'.
Cannot insert duplicate key in object 'dbo.CountingNumbers'.
The duplicate value is (1332)
Are statements atomic?
The fact that a single statement wasn't atomic makes me wonder if single statements are atomic?
Or is there a more subtle definition of statement, that differs from (for example) what SQL Server considers a statement:
Does this fundamentally means that within the confines of a single T-SQL statement, SQL Server statements are not atomic?
And if a single statement is atomic, what accounts for the key violation?
From within a stored procedure
Rather than a remote client opening n connections, I tried it with a stored procedure:
CREATE procedure [dbo].[DoCountNumbers] AS
SET NOCOUNT ON;
DECLARE #bumpedCount int
SET #bumpedCount = 0
WHILE (#bumpedCount < 500) --safety valve
BEGIN
SET #bumpedCount = #bumpedCount+1;
PRINT 'Running bump '+CAST(#bumpedCount AS varchar(50))
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
IF (#bumpedCount >= 500)
BEGIN
PRINT 'WARNING: Bumping safety limit of 500 bumps reached'
END
END
PRINT 'Done bumping process'
and opened 5 tabs in SSMS, pressed F5 in each, and watched as they too violated ACID:
Running bump 414
Msg 2627, Level 14, State 1, Procedure DoCountNumbers, Line 14
Violation of PRIMARY KEY constraint 'PK_CountingNumbers'.
Cannot insert duplicate key in object 'dbo.CountingNumbers'.
The duplicate key value is (4414).
The statement has been terminated.
So the failure is independent of ADO, ADO.net, or none of the above.
For 15 years i've been operating under the assumption that a single statement in SQL Server is consistent; and the only
What about TRANSACTION ISOLATION LEVEL xxx?
For different variants of the SQL batch to execute:
default (read committed): key violation
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
default (read committed), explicit transaction: no error key violation
BEGIN TRANSACTION
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
COMMIT TRANSACTION
serializable: deadlock
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
snapshot (after altering database to enable snapshot isolation): key violation
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRANSACTION
INSERT INTO CountingNumbers (Value)
SELECT ISNULL(MAX(Value), 0)+1 FROM CountingNumbers
COMMIT TRANSACTION
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
Bonus
Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64)
Default transaction isolation level (READ COMMITTED)
Turns out every query I've ever written is broken
This certainly changes things. Every update statement I've ever written is fundamentally broken. E.g.:
--Update the user with their last invoice date
UPDATE Users
SET LastInvoiceDate = (SELECT MAX(InvoiceDate) FROM Invoices WHERE Invoices.uid = Users.uid)
Wrong value; because another invoice could be inserted after the MAX and before the UPDATE. Or an example from BOL:
UPDATE Sales.SalesPerson
SET SalesYTD = SalesYTD +
(SELECT SUM(so.SubTotal)
FROM Sales.SalesOrderHeader AS so
WHERE so.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader AS so2
WHERE so2.SalesPersonID = so.SalesPersonID)
AND Sales.SalesPerson.BusinessEntityID = so.SalesPersonID
GROUP BY so.SalesPersonID);
without exclusive holdlocks, the SalesYTD is wrong.
How have I been able to do anything all these years.
I've been operating under the assumption that a single statement in SQL Server is consistent
That assumption is wrong. The following two transactions have identical locking semantics:
STATEMENT
BEGIN TRAN; STATEMENT; COMMIT
No difference at all. Single statements and auto-commits do not change anything.
So merging all logic into one statement does not help (if it does, it was by accident because the plan changed).
Let's fix the problem at hand. SERIALIZABLE will fix the inconsistency you are seeing because it guarantees that your transactions behave as if they executed single-threadedly. Equivalently, they behave as if they executed instantly.
You will be getting deadlocks. If you are ok with a retry loop, you're done at this point.
If you want to invest more time, apply locking hints to force exclusive access to the relevant data:
UPDATE Gifts -- U-locked anyway
SET GivenAway = 1
WHERE GiftID = (
SELECT TOP 1 GiftID
FROM Gifts WITH (UPDLOCK, HOLDLOCK) --this normally just S-locks.
WHERE g2.GivenAway = 0
AND (SELECT COUNT(*) FROM Gifts g2 WITH (UPDLOCK, HOLDLOCK) WHERE g2.GivenAway = 1) < 5
ORDER BY g2.GiftValue DESC
)
You will now see reduced concurrency. That might be totally fine depending on your load.
The very nature of your problem makes achieving concurrency hard. If you require a solution for that we'd need to apply more invasive techniques.
You can simplify the UPDATE a bit:
WITH g AS (
SELECT TOP 1 Gifts.*
FROM Gifts
WHERE g2.GivenAway = 0
AND (SELECT COUNT(*) FROM Gifts g2 WITH (UPDLOCK, HOLDLOCK) WHERE g2.GivenAway = 1) < 5
ORDER BY g2.GiftValue DESC
)
UPDATE g -- U-locked anyway
SET GivenAway = 1
This gets rid of one unnecessary join.
Below is an example of an UPDATE statement that does increment a counter value atomically
-- Do this once for test setup
CREATE TABLE CountingNumbers (Value int PRIMARY KEY NOT NULL)
INSERT INTO CountingNumbers VALUES(1)
-- Run this in parallel: start it in two tabs on SQL Server Management Studio
-- You will see each connection generating new numbers without duplicates and without timeouts
while (1=1)
BEGIN
declare #nextNumber int
-- Taking the Update lock is only relevant in case this statement is part of a larger transaction
-- to prevent deadlock
-- When executing without a transaction, the statement will itself be atomic
UPDATE CountingNumbers WITH (UPDLOCK, ROWLOCK) SET #nextNumber=Value=Value+1
print #nextNumber
END
Select does not lock exclusively, even serializable does, but only for the time the select is executed! Once the select is over, the select lock is gone. Then, update locks take on as they now know what to lock as Select has return results. Meanwhile, anyone else can Select again!
The only sure way to safely read and lock a row is:
begin transaction
--lock what i need to read
update mytable set col1=col1 where mykey=#key
--now read what i need
select #d1=col1,#d2=col2 from mytable where mykey=#key
--now do here calculations checks whatever i need from the row i read to decide my update
if #d1<#d2 set #d1=#d2 else set #d1=#d2 * 2 --just an example calc
--now do the actual update on what i read and the logic
update mytable set col1=#d1,col2=#d2 where mykey=#key
commit transaction
This way any other connection running the same statement for the same data it will surely wait at the first (fake) update statement until the previous is done. This ensures that when lock is released only one connection will granted permission to lock request to 'update' and this one will surely read committed finalized data to make calculations and decide if and what to actually update at the second 'real' update.
In other words, when you need to select information to decide if/how to update, you need a begin/commit transaction block plus you need to start with a fake update of what you need to select - before you select it(update output will also do).

In SQL Server, how can I lock a single row in a way similar to Oracle's "SELECT FOR UPDATE WAIT"?

I have a program that connects to an Oracle database and performs operations on it. I now want to adapt that program to also support an SQL Server database.
In the Oracle version, I use "SELECT FOR UPDATE WAIT" to lock specific rows I need. I use it in situations where the update is based on the result of the SELECT and other sessions can absolutely not modify it simultaneously, so they must manually lock it first. The system is highly subject to sessions trying to access the same data at the same time.
For example:
Two users try to fetch the row in the database with the highest priority, mark it as busy, performs operations on it, and mark it as available again for later use.
In Oracle, the logic would go basically like this:
BEGIN TRANSACTION;
SELECT ITEM_ID FROM TABLE_ITEM WHERE ITEM_PRIORITY > 10 AND ITEM_CATEGORY = 'CT1'
ITEM_STATUS = 'available' AND ROWNUM = 1 FOR UPDATE WAIT 5;
UPDATE [locked item_id] SET ITEM_STATUS = 'unavailable';
COMMIT TRANSACTION;
Note that the queries are built dynamically in my code. Also note that when the previously most favorable row is marked as unavailable, the second user will automatically go for the next one and so on. Furthermore, different users working on different categories will not have to wait for each other's locks to be released. Worst comes to worst, after 5 seconds, an error would be returned and the operation would be cancelled.
So finally, the question is: how do I achieve the same results in SQL Server? I have been looking at locking hints which, in theory, seem like they should work. However, the only locks that prevents other locks are "UPDLOCK" AND "XLOCK" which both only work at a table level.
Those locking hints that do work at a row level are all shared locks, which also do not satisfy my needs (both users could lock the same row at the same time, both mark it as unavailable and perform redundant operations on the corresponding item).
Some people seem to add a "time modified" column so sessions can verify that they are the ones who modified it, but this sounds like there would be a lot of redundant and unnecessary accesses.
You're probably looking forwith (updlock, holdlock). This will make a select grab an exclusive lock, which is required for updates, instead of a shared lock. The holdlock hint tells SQL Server to keep the lock until the transaction ends.
FROM TABLE_ITEM with (updlock, holdlock)
As documentation sayed:
XLOCK
Specifies that exclusive locks are to be taken and held until the
transaction completes. If specified with ROWLOCK, PAGLOCK, or TABLOCK,
the exclusive locks apply to the appropriate level of granularity.
So solution is using WITH(XLOCK, ROWLOCK):
BEGIN TRANSACTION;
SELECT ITEM_ID
FROM TABLE_ITEM
WITH(XLOCK, ROWLOCK)
WHERE ITEM_PRIORITY > 10 AND ITEM_CATEGORY = 'CT1' AND ITEM_STATUS = 'available' AND ROWNUM = 1;
UPDATE [locked item_id] SET ITEM_STATUS = 'unavailable';
COMMIT TRANSACTION;
In SQL Server there are locking hints but they do not span their statements like the Oracle example you provided. The way to do it in SQL Server is to set an isolation level on the transaction that contains the statements that you want to execute. See this MSDN page but the general structure would look something like:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
select * from ...
update ...
COMMIT TRANSACTION;
SERIALIZABLE is the highest isolation level. See the link for other options. From MSDN:
SERIALIZABLE Specifies the following:
Statements cannot read data that has been modified but not yet
committed by other transactions.
No other transactions can modify data that has been read by the
current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would
fall in the range of keys read by any statements in the current
transaction until the current transaction completes.
Have you tried WITH (ROWLOCK)?
BEGIN TRAN
UPDATE your_table WITH (ROWLOCK)
SET your_field = a_value
WHERE <a predicate>
COMMIT TRAN

SQL Server ROWLOCK over a SELECT if not exists INSERT transaction

I have upgraded from SQL Server 2005 to 2008. I remember that in 2005, ROWLOCK simply did not work and I had to use PAGELOCK or XLOCK to achieve any type of actual locking. I know a reader of this will ask "what did you do wrong?" Nothing. I conclusively proved that I could edit a "ROWLOCKED" row, but couldn't if I escalated the lock level. I haven't had a chance to see if this works in SQL 2008. My first question is has anyone come across this issue in 2008?
My second question is as follows. I want to test if a value exists and if so, perform an update on relevant columns, rather than an insert of the whole row. This means that if the row is found it needs to be locked as a maintenance procedure could delete this row mid-process, causing an error.
To illustrate the principle, will the following code work?
BEGIN TRAN
SELECT ProfileID
FROM dbo.UseSessions
WITH (ROWLOCK)
WHERE (ProfileID = #ProfileID)
OPTION (OPTIMIZE FOR (#ProfileID UNKNOWN))
if ##ROWCOUNT = 0 begin
INSERT INTO dbo.UserSessions (ProfileID, SessionID)
VALUES (#ProfileID, #SessionID)
end else begin
UPDATE dbo.UserSessions
SET SessionID = #SessionID, Created = GETDATE()
WHERE (ProfileID = #ProfileID)
end
COMMIT TRAN
An explanation...
ROWLOCK/PAGELOCK is granularity
XLOCK is mode
Granularity and isolation level and mode are orthogonal.
Granularity = what is locked = row, page, table (PAGLOCK, ROWLOCK, TABLOCK)
Isolation Level = lock duration, concurrency (HOLDLOCK, READCOMMITTED, REPEATABLEREAD, SERIALIZABLE)
Mode = sharing/exclusivity (UPDLOCK, XLOCK)
"combined" eg NOLOCK, TABLOCKX
XLOCK would have locked the row exclusively as you want. ROWLOCK/PAGELOCK wouldn't have.

Does SQL Server wrap Select...Insert Queries into an implicit transaction?

When I perform a select/Insert query, does SQL Server automatically create an implicit transaction and thus treat it as one atomic operation?
Take the following query that inserts a value into a table if it isn't already there:
INSERT INTO Table1 (FieldA)
SELECT 'newvalue'
WHERE NOT EXISTS (Select * FROM Table1 where FieldA='newvalue')
Is there any possibility of 'newvalue' being inserted into the table by another user between the evaluation of the WHERE clause and the execution of the INSERT clause if I it isn't explicitly wrapped in a transaction?
You are confusing between transaction and locking. Transaction reverts your data back to the original state if there is any error. If not, it will move the data to the new state. You will never ever have your data in an intermittent state when the operations are transacted. On the other hand, locking is the one that allows or prevents multiple users from accessing the data simultaneously. To answer your question, select...insert is atomic and as long as no granular locks are explicitly requested, no other user will be able to insert while select..insert is in progress.
John, the answer to this depends on your current isolation level. If you're set to READ UNCOMMITTED you could be looking for trouble, but with a higher isolation level, you should not get additional records in the table between the select and insert. With a READ COMMITTED (the default), REPEATABLE READ, or SERIALIZABLE isolation level, you should be covered.
Using SSMS 2016, it can be verified that the Select/Insert statement requests a lock (and so most likely operates atomically):
Open a new query/connection for the following transaction and set a break-point on ROLLBACK TRANSACTION before starting the debugger:
BEGIN TRANSACTION
INSERT INTO Table1 (FieldA) VALUES ('newvalue');
ROLLBACK TRANSACTION --[break-point]
While at the above break-point, execute the following from a separate query window to show any locks (may take a few seconds to register any output):
SELECT * FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
AND resource_associated_entity_id = OBJECT_ID(N'dbo.Table1');
There should be a single lock associated to the BEGIN TRANSACTION/INSERT above (since by default runs in an ISOLATION LEVEL of READ COMMITTED)
OBJECT ** ********** * IX LOCK GRANT 1
From another instance of SSMS, open up a new query and run the following (while still stopped at the above break-point):
INSERT INTO Table1 (FieldA)
SELECT 'newvalue'
WHERE NOT EXISTS (Select * FROM Table1 where FieldA='newvalue')
This should hang with the string "(Executing)..." being displayed in the tab title of the query window (since ##LOCK_TIMEOUT is -1 by default).
Re-run the query from Step 2.
Another lock corresponding to the Select/Insert should now show:
OBJECT ** ********** 0 IX LOCK GRANT 1
OBJECT ** ********** 0 IX LOCK GRANT 1
ref: How to check which locks are held on a table

Resources