SQL deadlock on delete then bulk insert - sql-server

I have an issue with a deadlock in SQL Server that I haven't been able to resolve.
Basically I have a large number of concurrent connections (from many machines) that are executing transactions where they first delete a range of entries and then re-insert entries within the same range with a bulk insert.
Essentially, the transaction looks like this
BEGIN TRANSACTION T1
DELETE FROM [TableName] WITH( XLOCK HOLDLOCK ) WHERE [Id]=#Id AND [SubId]=#SubId
INSERT BULK [TableName] (
[Id] Int
, [SubId] Int
, [Text] VarChar(max) COLLATE SQL_Latin1_General_CP1_CI_AS
) WITH(CHECK_CONSTRAINTS, FIRE_TRIGGERS)
COMMIT TRANSACTION T1
The bulk insert only inserts items matching the Id and SubId of the deletion in the same transaction. Furthermore, these Id and SubId entries should never overlap.
When I have enough concurrent transaction of this form, I start to see a significant number of deadlocks between these statements.
I added the locking hints XLOCK HOLDLOCK to attempt to deal with the issue, but they don't seem to be helpling.
The canonical deadlock graph for this error shows:
Connection 1:
Holds RangeX-X on PK_TableName
Holds IX Page lock on the table
Requesting X Page lock on the table
Connection 2:
Holds IX Page lock on the table
Requests RangeX-X lock on the table
What do I need to do in order to ensure that these deadlocks don't occur.
I have been doing some reading on the RangeX-X locks and I'm not sure I fully understand what is going on with these. Do I have any options short of locking the entire table here?

Following on from Sam Saffron's answer:
Consider READPAST hint to skip over any held locks if #ID7#SubID is distinc
Consider SERIALIZABLE and remove XLOCK, HOLDLOCK
Use a separate staging table for the bulk insert, then copy from that

Its hard to give you an accurate answer without having a list of indexes / table size etc, however keep in mind that SQL can not grab multiple locks at the same instance. It will grab locks one at at time, and if another connection already holds the lock and it holds a lock to something the first transaction needs, kaboom you have a deadlock.
In this particular instance there are a few things you can do:
Ensure there is an index on (Id, SubId), that way SQL will be able to grab a single range lock for the data being deleted.
If deadlocks become rare, retry your deadlocks.
You can approach this with a sledghammer and use a TABLOCKX which will not deadlock ever
Get an accurate deadlock analysis using trace flag 1204 http://support.microsoft.com/kb/832524 (the more info you have about the actual deadlock the easier it is to work around)

Related

UPDLOCK and HOLDLOCK query not creating the expected lock

I have the below table:
CREATE TABLE [dbo].[table1](
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [nvarchar](50) NULL,
CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
I'm learning how SQL locks work, and I'm trying to test a situation where I want to lock a row from being read and updated. Some of the inspiration in this quest starting from this article, and here's the original problem I was trying to solve.
When I run this T-SQL:
BEGIN TRANSACTION
SELECT * FROM dbo.table1 WITH (UPDLOCK, HOLDLOCK)
WAITFOR DELAY '00:00:15'
COMMIT TRANSACTION
I would expect an exclusive lock to be placed on the table, and specifically for the row (if I had a WHERE statement on the primary key)
But running this query, I can see that the GRANTed LOCK is for the request mode IX.
SELECT * FROM sys.dm_tran_locks WHERE resource_database_id = DB_ID() AND resource_associated_entity_id = OBJECT_ID(N'dbo.table1');
Also, in seperate SSMS windows, I can fully query the table while the transaction is running.
Why is MSSQL not respecting the lock hints?
(SQL Server 2016)
Edit 1
Any information about how these locks work is appreciated, however, the issue at hand is that SQL Server does not seem to be enforcing the locks I'm specifying. My hunch is that this has to do with row versioning, or something related.
Edit 2
I created this Github gist. It requires .NET and the external library Dapper to run (available via Nuget package).
Here's the interesting thing I noticed:
SELECT statements can be ran against table1 even though a previous query with UPDLOCK, HOLDLOCK has been requested.
INSERT statements cannot be ran while the lock is there
UPDATE statements against existing records cannot be ran while the lock is there
UPDATE statements against non-existing records can be ran.
Here's the Console output of that Gist:
Run locking SELECT Start - 00:00:00.0165118
Run NON-locking SELECT Start - 00:00:02.0155787
Run NON-locking SELECT Finished - 00:00:02.0222536
Run INSERT Start - 00:00:04.0156334
Run UPDATE ALL Start - 00:00:06.0259382
Run UPDATE EXISTING Start - 00:00:08.0216868
Run UPDATE NON-EXISTING Start - 00:00:10.0236223
Run UPDATE NON-EXISTING Finished - 00:00:10.0268826
Run locking SELECT Finished - 00:00:31.3204120
Run INSERT Finished - 00:00:31.3209670
Run UPDATE ALL Finished - 00:00:31.3213625
Run UPDATE EXISTING Finished - 00:00:31.3219371
and I'm trying to test a situation where I want to lock a row from
being read and updated
If you want to lock a row from being read and updated you need an exclusive lock, but UPDLOCK lock hint requests update locks, not exclusive locks. The query should be:
SELECT * FROM table1 WITH (XLOCK, HOLDLOCK, ROWLOCK)
WHERE Id = <some id>
Additionally, under READ COMMITTED SNAPSHOT and SNAPSHOT isolation levels, SELECT statements don't request shared locks, just schema stability locks. Therefore, the SELECT statement can read the row despite there is an exclusive lock. And surprisingly, under READ COMMITTED isolation level, SELECT statements might not request row level shared locks. You will need to add a query hint to the SELECT statement to prevent it from read the locked row:
SELECT * FROM dbo.Table1 WITH (REPEATABLEREAD)
WHERE id = <some id>
With REPEATABLEREAD lock hint, the SELECT statement will request shared locks and will hold them during the transaction, so it won't read exclusively locked rows. Note that using READCOMMITTEDLOCK is not enough, since SQL Server might not request shared locks under some circumstances as described in this blog post.
Please, take a look at the Lock Compatibility Table
Under the default isolation level READ COMMITTED, and with not lock hints, SELECT statements request shared locks for each row it reads, and those locks are released immediately after the row is read. However, if you use WITH (HOLDLOCK), the shared locks are held until the transaction ends. Taking into account the lock compatibility table, a SELECT statement running under READ COMMITTED, can read any row that is not locked exclusively (IX, SIX, X locks). Exclusive locks are requested by INSERT, UPDATE and DELETE statements or by SELECT statements with XLOCK hints.
I would expect an exclusive lock to be placed on the table, and
specifically for the row (if I had a WHERE statement on the primary
key)
I need to understand WHY SQL Server is not respcting the locking
directives given to it. (i.e. Why is an exclusive lock not on the
table, or row for that matter?)
UPDLOCK hint doesn't request exclusive locks, it requests update locks. Additionally, the lock can be granted on other resources than the row itself, it can be granted on the table, data pages, index pages, and index keys. The complete list of resource types SQL Server can lock is: DATABASE, FILE, OBJECT, PAGE, KEY, EXTENT, RID, APPLICATION, METADATA, HOBT, and ALLOCATION_UNIT. When ROWLOCK hint is specified, SQL Server will lock rows, not pages nor extents nor tables and the actual resources that SQL Server will lock are RID's and KEY's
#Remus Rusuanu has explained it a lot better than I ever could here.
In essence - you can always read UNLESS you ask for the same lock type (or more restrictive). However, if you want to UPDATE or DELETE then you will be blocked. But as I said, the link above explains it really well.
Your answer is right in the documentation:
https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table
Lock hints ROWLOCK, UPDLOCK, AND XLOCK that acquire row-level locks may place locks on index keys rather than the actual data rows. For example, if a table has a nonclustered index, and a SELECT statement using a lock hint is handled by a covering index, a lock is acquired on the index key in the covering index rather than on the data row in the base table.
This is why you are getting an index lock (IX) and not a table row lock.
And this explains why you can read while running the first query:
http://aboutsqlserver.com/2011/04/14/locking-in-microsoft-sql-server-part-1-lock-types/
Update locks (U). Those locks are the mix between shared and exclusive locks. SQL Server uses them with data modification statements while searching for the rows need to be modified. For example, if you issue the statement like: “update MyTable set Column1 = 0 where Column1 is null” SQL Server acquires update lock for every row it processes while searching for Column1 is null. When eligible row found, SQL Server converts (U) lock to (X).
Your UPDLock is an update lock. Notice that update locks are SHARED while searching, and changed EXCLUSIVE when performing the actual update. Since your query is a select with an update lock hint, the lock is a SHARED lock. This will allow other queries to also read the rows.

Are there Two exclusive locks on same table

I am inserting rows into single table from two instances and they are completing successfully.
When any transaction updating any table then it acquire 'Exclusive' on that Table (resource) and there must be single Exclusive lock on table while inserting data.
Granted
Requested Exclusive(X) Shared(S)
Exclusive NO NO
Shared NO Yes
Creating Sample table:
create table TestTransaction
(
Colid int Primary Key,
name varchar(10)
)
Inserting Instance 1:
Declare #counter int =1
Declare #countName varchar(10)='te'
Declare #max int=1000000
while #counter<#max
Begin
insert into TestTransaction
values
(
#counter,
#countName+Cast(#counter as varchar(7))
)
Set #counter=#counter+1
End
Inserting Instance 2:
insert into TestTransaction
values
(2000001,'yesOUTofT')
Why it is successful?
At the same time retrieval (Select) from this table is not happening because of the lock on table.
When any transaction updating any table then it acquire 'Exclusive' on that Table (resource) and there must be single Exclusive lock on table while inserting data.
That's a common myth. Locks in SQL Server are usually per-row. Various things cause them to escalate to page, partition or table level. SQL Server is designed, though, to try to lock at the smallest level first in order to allow for more concurrency.
Do not rely on any particular locking behavior in your apps if you can. Rather, make use of the isolation level setting if possible in order to obtain the required consistency guarantees you need.

Confused about UPDLOCK, HOLDLOCK

While researching the use of Table Hints, I came across these two questions:
Which lock hints should I use (T-SQL)?
What effect does HOLDLOCK have on UPDLOCK?
Answers to both questions say that when using (UPDLOCK, HOLDLOCK), other processes will not be able to read data on that table, but I didn't see this. To test, I created a table and started up two SSMS windows. From the first window, I ran a transaction that selected from the table using various table hints. While the transaction was running, from the second window I ran various statements to see which would be blocked.
The test table:
CREATE TABLE [dbo].[Test](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Value] [nvarchar](50) NULL,
CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
From SSMS Window 1:
BEGIN TRANSACTION
SELECT * FROM dbo.Test WITH (UPDLOCK, HOLDLOCK)
WAITFOR DELAY '00:00:10'
COMMIT TRANSACTION
From SSMS Window 2 (ran one of the following):
SELECT * FROM dbo.Test
INSERT dbo.Test(Value) VALUES ('bar')
UPDATE dbo.Test SET Value = 'baz' WHERE Value = 'bar'
DELETE dbo.Test WHERE Value= 'baz'
Effect of different table hints on statements run in Window 2:
(UPDLOCK) (HOLDLOCK) (UPDLOCK, HOLDLOCK) (TABLOCKX)
---------------------------------------------------------------------------
SELECT not blocked not blocked not blocked blocked
INSERT not blocked blocked blocked blocked
UPDATE blocked blocked blocked blocked
DELETE blocked blocked blocked blocked
Did I misunderstand the answers given in those questions, or make a mistake in my testing? If not, why would you use (UPDLOCK, HOLDLOCK) vs. (HOLDLOCK) alone?
Further explanation of what I am trying to accomplish:
I would like to select rows from a table and prevent the data in that table from being modified while I am processing it. I am not modifying that data, and would like to allow reads to occur.
This answer clearly says that (UPDLOCK, HOLDLOCK) will block reads (not what I want). The comments on this answer imply that it is HOLDLOCK that prevents reads. To try and better understand the effects of the table hints and see if UPDLOCK alone would do what I wanted, I did the above experiment and got results that contradict those answers.
Currently, I believe that (HOLDLOCK) is what I should use, but I am concerned that I may have made a mistake or overlooked something that will come back to bite me in the future, hence this question.
Why would UPDLOCK block selects? The Lock Compatibility Matrix clearly shows N for the S/U and U/S contention, as in No Conflict.
As for the HOLDLOCK hint the documentation states:
HOLDLOCK: Is equivalent to SERIALIZABLE. For more information, see SERIALIZABLE
later in this topic.
...
SERIALIZABLE: ... The scan is performed with the same semantics as a transaction running at the SERIALIZABLE isolation level...
and the Transaction Isolation Level topic explains what SERIALIZABLE means:
No other transactions can modify data that has been read by the
current transaction until the current transaction completes.
Other transactions cannot insert new rows with key values that would
fall in the range of keys read by any statements in the current
transaction until the current transaction completes.
Therefore the behavior you see is perfectly explained by the product documentation:
UPDLOCK does not block concurrent SELECT nor INSERT, but blocks any UPDATE or DELETE of the rows selected by T1
HOLDLOCK means SERALIZABLE and therefore allows SELECTS, but blocks UPDATE and DELETES of the rows selected by T1, as well as any INSERT in the range selected by T1 (which is the entire table, therefore any insert).
(UPDLOCK, HOLDLOCK): your experiment does not show what would block in addition to the case above, namely another transaction with UPDLOCK in T2:
SELECT * FROM dbo.Test WITH (UPDLOCK) WHERE ...
TABLOCKX no need for explanations
The real question is what are you trying to achieve? Playing with lock hints w/o an absolute complete 110% understanding of the locking semantics is begging for trouble...
After OP edit:
I would like to select rows from a table and prevent the data in that
table from being modified while I am processing it.
The you should use one of the higher transaction isolation levels. REPEATABLE READ will prevent the data you read from being modified. SERIALIZABLE will prevent the data you read from being modified and new data from being inserted. Using transaction isolation levels is the right approach, as opposed to using query hints. Kendra Little has a nice poster exlaining the isolation levels.
UPDLOCK is used when you want to lock a row or rows during a select statement for a future update statement. The future update might be the very next statement in the transaction.
Other sessions can still see the data. They just cannot obtain locks that are incompatiable with the UPDLOCK and/or HOLDLOCK.
You use UPDLOCK when you wan to keep other sessions from changing the rows you have locked. It restricts their ability to update or delete locked rows.
You use HOLDLOCK when you want to keep other sessions from changing any of the data you are looking at. It restricts their ability to insert, update, or delete the rows you have locked. This allows you to run the query again and see the same results.

Adding table lock manually to specified table in SQL Server

I want to INSERT into one tables but prevent INSERTING to another one. It is possible to LOCK for example table a for INSERTING, INSERT to table b and then UNLOCK table a?
TABLOCK can lock only the table I am INSERTING in.
Thanks
Martin Pilch
SQL Server does not allow locking objects like you would do semaphors. Also, locking a table will not make it read-only; it will make it locked out for everybody.
You can place a lock by using a table hint such as SELECT * FROM MyTable WITH (LOCKNAME) but that is not a good programming practice.

Why does row level locking not appear to work correctly in SQL server?

This is a continuation from When I update/insert a single row should it lock the entire table?
Here is my problem.
I have a table that holds locks so that other records in the system don’t have to take locks out on common resources, but can still queue the tasks so that they get executed one at a time.
When I access a record in this locks table I want to be able to lock it and update it (just the one record) without any other process being able to do the same. I am able to do this with a lock hint such as updlock.
What happens though is that even though I’m using a rowlock to lock the record, it blocks a request to another process to alter a completely unrelated row in the same table that would also have specified the updlock hint along with rowlock.
You can recreate this be making a table
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Locks](
[ID] [int] IDENTITY(1,1) NOT NULL,
[LockName] [varchar](50) NOT NULL,
[Locked] [bit] NOT NULL,
CONSTRAINT [PK_Locks] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[Locks] ADD CONSTRAINT [DF_Locks_LockName] DEFAULT ('') FOR [LockName]
GO
ALTER TABLE [dbo].[Locks] ADD CONSTRAINT [DF_Locks_Locked] DEFAULT ((0)) FOR [Locked]
GO
Add two rows for a lock with LockName=‘A’ and one for LockName=‘B’
Then create two queries to run in a transaction at the same time against it:
Query 1:
Commit
Begin transaction
select * From Locks with (updlock rowlock) where LockName='A'
Query 2:
select * From Locks with (updlock rowlock) where LockName='B'
Please note that I am leaving the transaction open so that you can see this issue since it wouldn’t be visible without this open transaction.
When you run Query 1 locks are issues for the row and any subsequent queries for LockName=’A’ will have to wait. This behaviour is correct.
Where this gets a bit frustrating is when you run Query 2 you are blocked until Query 1 finishes even thought these are unrelated records. If you then run Query 1 again just as I have it above, it will commit the previous transaction, Query 2 will run and then Query 1 will once again lock the record.
Please offer some suggestions as to how I might be able to have it properly lock ONLY the one row and not prevent other items from being updated as well.
PS. Holdlock also fails to produce the correct behaviour after one of the rows is updated.
In SQL Server, the lock hints are applied to the objects scanned, not matched.
Normally, the engine places a shared lock on the objects (pages etc) while reading them and lifts them (or does not lift in SERIALIZABLE transactions) after the scanning is done.
However, you instruct the engine to place (and lift) the update locks which are not compatible with each other.
The transaction B locks while trying to put an UPDLOCK onto the row already locked with an UPDLOCK by transaction A.
If you create an index and force its usage (so no conflicting reads ever occur), your tables will not lock:
CREATE INDEX ix_locks_lockname ON locks (lockname)
Begin transaction
select * From Locks with (updlock rowlock INDEX (ix_locks_lockname)) where LockName='A'
Begin transaction
select * From Locks with (updlock rowlock INDEX (ix_locks_lockname)) where LockName='B'
For query 2, try using the READPAST hint - this (quote):
Specifies that the Database Engine not
read rows that are locked by other
transactions. Under most
circumstances, the same is true for
pages. When READPAST is specified,
both row-level and page-level locks
are skipped. That is, the Database
Engine skips past the rows or pages
instead of blocking the current
transaction until the locks are
released
This is typically used in queue-processing type environments - so multiple processes can pull off the next item from a queue table without being blocked out by other processes (of course, using UPDLOCK to prevent multiple processes picking up the same row).
Edit 1:
It could be caused if you don't have an index on the LockName field. With the index, query 2 could do an index seek to the exact row. But without it, it would be doing a scan (checking every row) meaning it gets held up by the first transaction. So if it's not indexed, try indexing it.
I am not sure what you are trying to accomplish, but typically those who are dealing with similar problems want to use sp_getapplock. Covered by Tony Rogerson:Assisting Concurrency by creating your own Locks (Mutexs in SQL)
If you want queueing in SQL Server, use UPDLOCK, ROWLOCK, READPAST hints. It works.
I'd consider changing your approach rather than trying to change SQL Server behaviour...

Resources