I have a transaction involving several tables. I need to prevent inserts into TABLE_TO_BE_LOCKED after I have deleted some records from EDITED_TABLE. Currently I am using TABLOCKX:
DELETE EDITED_TABLE FROM EDITED_TABLE
left join TABLE_TO_BE_LOCKED with (TABLOCKX) on ....
WHERE ...
I need to prevent inserting new records into TABLE_TO_BE_LOCKED, but I would like to preserve the ability to read it. TABLE_TO_BE_LOCKED itself is not changed by the DELETE, which is why the explicit lock is needed.
Can I use TABLOCK, HOLDLOCK instead of TABLOCKX?
Note:
I know about How to lock a table for inserting in sql?. However, the upshot there is "don't do it", and that question is about preventing duplicate primary keys.
If a table is written to in a transaction, exclusive locks are held on it until the end of the transaction. You can't allow SELECTs in that case unless you use NOLOCK on the SELECT, but then you'll see the deleted rows.
TABLOCK, HOLDLOCK will simply lock the whole table exclusively until the end of the transaction.
It sounds like you're trying to fix your chosen solution; with more information we may be able to offer a better one that does what you want without lock hints. For example, if your transaction needs a consistent view of the deleted data, unaffected by other transactions doing INSERTs, then you could try one or more of:
Use the OUTPUT clause to work on the set of data you started with (see the sketch after this list)
Use SERIALIZABLE only to preserve the range (which is what HOLDLOCK means, BTW)
Use SNAPSHOT isolation modes
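For illustration, a minimal OUTPUT sketch; the #deleted table is a placeholder I've made up, and the join and WHERE conditions are the ones elided in the question:

-- #deleted must exist first, with the same columns as EDITED_TABLE:
SELECT * INTO #deleted FROM EDITED_TABLE WHERE 1 = 0;

BEGIN TRAN;

DELETE EDITED_TABLE
OUTPUT deleted.* INTO #deleted  -- keep a copy of exactly what was removed
FROM EDITED_TABLE
    LEFT JOIN TABLE_TO_BE_LOCKED ON ...
WHERE ...;

-- work with #deleted instead of re-reading the live tables

COMMIT;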
After comments...
BEGIN TRAN

-- Snapshot the rows you care about; later INSERTs into
-- TABLE_TO_BE_LOCKED can no longer affect this transaction
SELECT *
INTO #temp
FROM TABLE_TO_BE_LOCKED
WHERE .. -- or JOIN to EDITED_TABLE

-- No lock hint needed on #temp: it is private to this session
DELETE EDITED_TABLE FROM EDITED_TABLE
left join #temp on ....
WHERE ...

-- do stuff with #temp

COMMIT
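The point of the #temp copy is that the transaction then works on its own snapshot of TABLE_TO_BE_LOCKED, so concurrent INSERTs into that table no longer matter to your transaction and no long-held table lock is required.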
Related
A common case for DB transactions is performing operations on multiple tables, as you can then easily roll back all operations if one fails. However, a common scenario I run into is wanting to insert records into multiple tables where the later inserts need the serial ID from the previous inserts.
Since the ID is not generated/available until the transaction is actually committed, how can one accomplish this? If you have to commit after the first insert in order to get the ID and then execute the second insert, it seems to defeat the purpose of the transaction in the first place because after committing (or if I don't use a transaction at all) I cannot rollback the first insert if the second insert fails.
This seems like such a common use case for DB transactions that I can't imagine it would not be supported in some way. How can this be accomplished?
A CTE (common table expression) with data-modifying statements should cover your need; see the manual.
Typical example:
WITH cte AS (
    INSERT INTO table_A (id) VALUES ... RETURNING id
)
INSERT INTO table_B (id)
SELECT id FROM cte;
See the demo in dbfiddle.
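Note the example above is PostgreSQL syntax. In SQL Server, a rough equivalent sketch (table_A, table_B, and their columns are made-up names) captures the generated IDs with OUTPUT ... INTO; the IDs are available inside the open transaction, and nothing is visible to other sessions until COMMIT:

BEGIN TRANSACTION;

DECLARE @newIds TABLE (id INT);

-- The identity values are available before the commit
INSERT INTO table_A (name)
OUTPUT inserted.id INTO @newIds (id)
VALUES ('example');

INSERT INTO table_B (a_id)
SELECT id FROM @newIds;

COMMIT TRANSACTION;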
If I am performing SELECTS on a table, is it possible that a SELECT query on a table can block INSERTS into the same table?
A SELECT will take a shared lock on the rows, but it shouldn't affect inserts, correct?
The query is a LIKE clause - will this block an insert? Does it have the potential?
SELECT * FROM USERS WHERE Description LIKE '%HELLO%'
Reference:
I read this response, SQL Server SELECT statements causing blocking, and I am confused about how this would block an insert.
A SELECT will take a shared lock on the rows, but it shouldn't affect inserts, correct?
No, that's not exactly right.
When you run a SELECT, it can acquire shared locks on pages and even on the whole table. You can test this yourself by using the PAGLOCK or TABLOCK hints (you'll need REPEATABLE READ or SERIALIZABLE to see them, since under READ COMMITTED shared locks are released as soon as they are no longer needed).
The same situation can be modelled this way:
if object_id('dbo.t') is not null drop table dbo.t;

-- build a 10,000-row test table
select top 10000 cast(row_number() over(order by getdate()) as varchar(10)) as col
into dbo.t
from sys.columns c1
     cross join sys.columns c2;

set transaction isolation level serializable;

begin tran

select *
from dbo.t
where col like '%0%';

-- inspect the locks this session is holding
select resource_type,
       request_mode,
       count(*) as cnt
from sys.dm_tran_locks
where request_session_id = @@spid
group by resource_type,
         request_mode;
Here you can see the result of lock escalation: my query wanted more than 5,000 locks for a single statement, so instead of all of them the server took just one lock, a shared lock on the whole table.
Now if you try to insert any row into this table from another session, your INSERT will be blocked.
This is because any insert first needs to acquire an IX lock on the table and an IX lock on the page, but IX on a table is incompatible with S on the same table, so the insert is blocked.
This is how your SELECT can block your INSERT.
To see exactly what happens on your server, query sys.dm_tran_locks filtered by both sessions' session_ids.
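For example (a sketch; the session ids are placeholders to replace with your own):

-- Session 2: this INSERT waits on the table-level S lock held by session 1
insert into dbo.t (col) values ('x');

-- Session 3: see who holds locks and who is waiting
select request_session_id, resource_type, request_mode, request_status
from sys.dm_tran_locks
where request_session_id in (55, 56);  -- your two SPIDs here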
General info - this is called SQL Server concurrency, and in SQL Server you will find two models:
Pessimistic;
Optimistic.
Answering your question - yes, a read can block an insert, and this is called "pessimistic concurrency". However, this model comes with specific properties and you have to be careful, because:
Data being read is locked, so that no other user can modify the data;
Data being modified is locked, so that no other user can read or modify the data;
The number of locks acquired is high because every data access operation (read/write) acquires a lock;
Writers block readers and other writers. Readers block writers.
The point is that you should use pessimistic concurrency only if locks are held for a short period of time, and only if the cost of each lock is lower than the cost of rolling back the transaction in case of a conflict, as Neeraj said.
I would recommend reading more about the isolation levels used by the pessimistic and optimistic models here.
EDIT - I found a very detailed explanation about isolation levels on Stack, here.
I know that NOLOCK is default for SELECT operations. So, if I even don't write with (NOLOCK) keyword for a select query, the row won't be locked.
I couldn't find what happens if WITH (ROWLOCK) is not specified for UPDATE and DELETE queries. Is there a difference between the queries below?
UPDATE MYTABLE set COLUMNA = 'valueA';
and
UPDATE MYTABLE WITH (ROWLOCK) set COLUMNA = 'valueA';
If there is no hint, the engine chooses the lock mode as a function of the operation (select/modify), the isolation level, the granularity, and the possibility of escalating the granularity level. Even specifying ROWLOCK, XLOCK does not guarantee 100% that the locks taken will be X locks on individual rows. In general, this is a very large topic for such a broad question.
Read first about lock modes: https://technet.microsoft.com/en-us/library/ms175519(v=sql.105).aspx
With statement 1 (without ROWLOCK), the DBMS may decide to lock the entire table, or the page the updated record is in. That means that while the row is being updated, all or a number of the other rows in the table are locked and cannot be updated or deleted.
Statement 2 (WITH (ROWLOCK)) suggests that the DBMS lock only the record being updated. But be aware that this is just a hint, and there is no guarantee it will be honored by the DBMS.
So, if I even don't write with (NOLOCK) keyword for a select query, the row won't be locked.
SELECT queries always take a lock, called a shared lock, and the duration of the lock depends on your isolation level.
Is there a difference between below queries?
UPDATE MYTABLE set COLUMNA = 'valueA';
and
UPDATE MYTABLE WITH (ROWLOCK) set COLUMNA = 'valueA';
Suppose your first statement needs more than 5,000 locks: the locks will then be escalated to a table lock. With ROWLOCK, SQL Server won't lock the whole table.
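One way to observe which granularity was actually chosen (a sketch; it assumes MYTABLE has a key column named ID):

BEGIN TRAN;

UPDATE MYTABLE WITH (ROWLOCK) SET COLUMNA = 'valueA' WHERE ID = 1;

-- KEY rows here mean row locks; OBJECT means the lock escalated to the table
SELECT resource_type, request_mode, COUNT(*) AS cnt
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
GROUP BY resource_type, request_mode;

ROLLBACK;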
I know at least three ways to insert a record if it doesn't already exist in a table:
The first one is using IF NOT EXISTS:
IF NOT EXISTS(select 1 from table where <condition>)
INSERT...VALUES
The second one is using merge:
MERGE table AS target
USING (SELECT values) AS source
ON (condition)
WHEN NOT MATCHED THEN
INSERT ... VALUES ...
The third one is using insert...select:
INSERT INTO table (<values list>)
SELECT <values list>
WHERE NOT EXISTS(select 1 from table where <condition>)
But which one is the best?
The first option does not seem to be thread-safe, as the record might be inserted between the SELECT statement in the IF and the INSERT statement that follows, if two or more users try to insert the same record.
As for the second option, MERGE seems to be overkill for this, as the documentation states:
Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it does not exist, or updating the row if it does match. When simply updating one table based on the rows of another table, improved performance and scalability can be achieved with basic INSERT, UPDATE, and DELETE statements.
So I think the third option is the best for this scenario (only insert the record if it doesn't already exist, no need to update if it does), but I would like to know what SQL Server experts think.
Please note that after the insert, I'm not interested to know whether the record was already there or whether it's a brand new record, I just need it to be there so that I can carry on with the rest of the stored procedure.
When you need to guarantee the uniqueness of records under a condition that cannot be expressed by a UNIQUE or PRIMARY KEY constraint, you indeed need to make sure that the check for existence and the insert are done in one transaction. You can achieve this by either:
Using one SQL statement performing the check and the insert (your third option)
Using a transaction with the appropriate isolation level
There is a fourth way though that will help you better structure your code and also make it work in situations where you need to process a batch of records at once. You can create a TABLE variable or a temporary table, insert all of the records that need to be inserted in there and then write the INSERT, UPDATE and DELETE statements based on this variable.
Below is (pseudo)code demonstrating this approach:
-- Logic to create the data to be inserted if necessary
DECLARE @toInsert TABLE (idCol INT PRIMARY KEY, dataCol VARCHAR(MAX))
INSERT INTO @toInsert (idCol, dataCol) VALUES (1,'row 1'),(2,'row 2'),(3,'row 3')
-- Logic to insert the data
INSERT INTO realTable (idCol, dataCol)
SELECT TI.idCol, TI.dataCol
FROM @toInsert TI
WHERE NOT EXISTS (SELECT 1 FROM realTable RT WHERE RT.dataCol = TI.dataCol)
In many situations I use this approach, as it makes the T-SQL code easier to read, refactor, and apply unit tests to.
Following Vladimir Baranov's comment and reading Dan Guzman's blog posts about the Conditional INSERT/UPDATE Race Condition and the "UPSERT" Race Condition With MERGE, it seems all three options suffer from the same drawbacks in a multi-user environment.
Eliminating the MERGE option as overkill, we are left with options 1 and 3.
Dan's proposed solution is to use an explicit transaction and add lock hints to the select to avoid race condition.
This way, option 1 becomes:
BEGIN TRANSACTION
IF NOT EXISTS(select 1 from table WITH (UPDLOCK, HOLDLOCK) where <condition>)
BEGIN
INSERT...VALUES
END
COMMIT TRANSACTION
and option 3 becomes:
BEGIN TRANSACTION
INSERT INTO table (<values list>)
SELECT <values list>
WHERE NOT EXISTS(select 1 from table WITH (UPDLOCK, HOLDLOCK) where <condition>)
COMMIT TRANSACTION
Of course, in both options there needs to be some error handling: every transaction should use TRY...CATCH so that we can roll back the transaction in case of an error, for example:
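A minimal sketch of such a wrapper around option 3 (the <values list> and <condition> placeholders are the question's own):

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO table (<values list>)
    SELECT <values list>
    WHERE NOT EXISTS (SELECT 1 FROM table WITH (UPDLOCK, HOLDLOCK) WHERE <condition>);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise so the caller knows the insert failed (SQL Server 2012+)
END CATCH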
That being said, I think the 3rd option is probably my personal favorite, but I don't think there should be a difference.
Update
Following a conversation I've had with Aaron Bertrand in the comments of some other question - I'm not entirely convinced that using ISOLATION LEVEL is a better solution than individual query hints, but at least that's another option to consider:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
INSERT INTO table (<values list>)
SELECT <values list>
WHERE NOT EXISTS(select 1 from table where <condition>);
COMMIT TRANSACTION;
This SQL (called from c#) occasionally results in a deadlock.
The server is not under much load so the approach used is to lock as much as possible.
-- Lock to prevent race-conditions when multiple instances of an application calls this SQL:
BEGIN TRANSACTION
-- Check that no one has inserted the rows in T1 before me, and that T2 is in a valid state (Test1 != null)
IF NOT EXISTS (SELECT TOP 1 1 FROM T1 WITH(HOLDLOCK, TABLOCKX) WHERE FKId IN {0}) AND
NOT EXISTS(SELECT TOP 1 1 FROM T2 WITH(HOLDLOCK, TABLOCKX) WHERE DbID IN {0} AND Test1 IS NOT NULL)
BEGIN
-- Great! I'm the first - go insert the row in T1 and update T2 accordingly. Finally write a log row to T3
INSERT INTO T1(FKId, Status)
SELECT DbId, {1} FROM T2 WHERE DbId IN {0};
UPDATE T2 SET LastChangedBy = {2}, LastChangedAt = GETDATE() WHERE DbId IN {0};
INSERT INTO T3 (F1, FKId, F3)
SELECT {2}, DbId, GETDATE() FROM T2 WHERE DbId IN {0} ;
END;
-- Select status on the rows so the program can evaluate what just happened
SELECT FKId, Status FROM T1 WHERE FkId IN {0};
COMMIT TRANSACTION
I believe the problem is that multiple tables need to be locked.
I'm a bit unsure when the tables are actually X-locked: when each table is first used, or are all tables locked at once at BEGIN TRAN?
Using table locks can increase the likelihood of deadlocks. Not all deadlocks are caused by out-of-sequence operations; some can be caused (as you have found) by other activity that only tries to lock a single record in the same table you are locking completely, so locking the entire table increases the probability of that conflict occurring. When using the SERIALIZABLE isolation level, range locks are placed on index rows, which can prevent inserts/deletes by other SQL operations and can cause a deadlock between two concurrent operations from the same procedure, even though they are coded to perform their operations in the same order.
In any event, to find out exactly what is causing the deadlock, set SQL Server trace flags 1204 and 1222. These cause detailed info about each deadlock, including which statements were involved, to be written to the SQL Server log.
Here is a good article about how to do this.
(Don't forget to turn these flags off when you're done...)
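For reference, a minimal way to set and clear the flags server-wide:

-- Enable detailed deadlock reporting in the SQL Server error log
DBCC TRACEON (1204, 1222, -1);

-- ... reproduce the deadlock, then read the error log ...

-- Turn the flags off again when done
DBCC TRACEOFF (1204, 1222, -1);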
Locks are acquired when the statement that needs them executes (or when you explicitly request them with a hint), not at BEGIN TRAN, and they are released on commit or rollback.
You could get a deadlock if another procedure locks T3 first and T1 or T2 afterwards. Then two transactions are each waiting for the other to release a resource, while locking what the other needs.
You could also avoid the table locks and use the SERIALIZABLE isolation level instead, along the lines of the sketch below.
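A rough sketch of that variant, reusing the question's tables (the {0}/{1} placeholders are filled in by the calling C# code as before). Note that plain shared range locks let two sessions block each other and then deadlock on the later writes, so adding UPDLOCK to the existence checks is a common companion:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

IF NOT EXISTS (SELECT 1 FROM T1 WITH (UPDLOCK) WHERE FKId IN {0}) AND
   NOT EXISTS (SELECT 1 FROM T2 WITH (UPDLOCK) WHERE DbID IN {0} AND Test1 IS NOT NULL)
BEGIN
    INSERT INTO T1 (FKId, Status)
    SELECT DbId, {1} FROM T2 WHERE DbId IN {0};
    -- ... the UPDATE of T2 and the INSERT into T3 as before ...
END;

COMMIT TRANSACTION;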
The problem with locking is that you really need to look at all places you do locking at the same time, there's no way to isolate and split up the problem into many smaller ones and look at those individually.
For instance, what if some other code locks the same tables, but without it being obvious, and in the wrong order? That'll cause a deadlock.
You need to analyze the server state at the moment the deadlock is discovered to try to figure out what else is running at that moment. Only then can you try to fix it.