Is there any way to avoid a deadlock on an update query without changing (or adding) an index?
The following query always generates a deadlock:
update table1
set Batch_ID=1
where item_id in (select top 300 t1.item_id
From table1 t1 inner join table2 t2 on t1.item_id=t2.item_id
inner join table3 t3 on t1.item_ID=t3.item_ID
Where IsNull(t3.item_Delivered,0) = 0
And t1.TBatch_ID is Null
And t2.Shipper_ID = 2
And DateDiff(day,t1.TShipping_Date,getdate()) < 90
And (
DateDiff(minute,IsNull(t1.LastTrackingDate,DateAdd(day,-2,GetDate())),getdate()) > 180
OR (DateDiff(minute,IsNull(t1.LastTrackingDate,DateAdd(day,-2,GetDate())),getdate()) > 60 And IsNull(t3.item_Indelivery,0) = 1)
)
And t2.Customer_ID not in (700,800)
Order By t1.LastTrackingDate, t2.Customer_ID)
Usually I use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED on a SELECT query (reader), but in this case it is an UPDATE query (writer), so I cannot apply the same reasoning (isolation level).
Is there a way to set the transaction isolation level just for the subquery (just for the SELECT)?
Can I add WITH (NOLOCK) to each table in the subquery's SELECT clause?
Thanks
The query appears to toggle the Batch_ID column from NULL to 1 on the first 300 rows that meet certain criteria.
This update is prone to deadlocks: if two connections run the same query concurrently, both will find overlapping table1 rows and both will try to update them (there is a race condition between the rows returned by the subquery and the outer update).
Re: (NOLOCK) - no; read uncommitted will lead to even more unpredictable behaviour. One option would be to synchronize concurrent calls to the update by raising the locking pessimism so that any concurrent connection is blocked until the first connection has finished tagging its batch of 300, e.g.:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
update table1
set Batch_ID=1
where item_id in (select top 300 t1.item_id
From table1 t1 WITH (XLOCK) ...
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Related
In a stored procedure, three tables are updated in a single transaction. These updates are dependent on each other, but a deadlock occurs intermittently during the updates; it does not happen consistently.
A WCF service calls the SP. The input of the SP is XML, which is parsed using the OPENXML method, and the values are used to update the tables.
#Table is a table variable, populated by OPENXML from the SP's input XML. The input XML contains only one ID.
<A>
<Value>XYZ</Value>
<ID>1</ID>
</A>
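For reference, #Table is populated from this XML roughly like so (simplified here; the actual names and types in the SP differ):
--simplified sketch of the OPENXML parsing; names and types are illustrative only
DECLARE @xmlInput nvarchar(max) = N'<A><Value>XYZ</Value><ID>1</ID></A>';
DECLARE @hdoc int;
CREATE TABLE #Table (ID int, [Value] varchar(100));
EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmlInput;
INSERT INTO #Table (ID, [Value])
SELECT ID, [Value]
FROM OPENXML(@hdoc, '/A', 2)
WITH (ID int, [Value] varchar(100));
EXEC sp_xml_removedocument @hdoc;
The three updates then run in a single transaction: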
BEGIN TRAN
--update Table1
Update Table1
Set ColumnA = A.Value
FROM Table1
JOIN #Table A
ON Table1.ID = A.ID
--update Table2
Update Table2
Set ColumnA = Table1.ColumnA
FROM Table2
JOIN Table1
ON Table1.ID = Table2.ID
--update Table3
Update Table3
Set ColumnA = Table1.ColumnA
FROM Table3
JOIN Table1
ON Table1.ID = Table3.ID
COMMIT TRAN
In Table1, the ID column is the primary key.
In Table2, there is no index on the ID column.
The deadlock sometimes happens while updating Table2.
Receiving the error "Transaction (Process ID 100) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
Advice is required on resolving this intermittent deadlock issue.
Deadlocks are often the result of queries touching more data than they need. Query and index tuning can help ensure that only the data needed by a query is accessed and locked, reducing both blocking and the likelihood of deadlocks between concurrent sessions.
Because your queries join on ID with no other criteria, an index on that column may keep the UPDATE statements from touching other rows. I see from your comments that there was no index on the Table2 ID column, so a clustered index scan was performed. Not only does the scan result in suboptimal performance, it can lead to blocking and deadlocking when concurrent sessions contend for the same rows.
Adding a non-clustered index on ID changed the plan from a full clustered index scan to a non-clustered index seek. This should reduce, if not eliminate, the deadlocks going forward and improve performance considerably too. I like to say that performance and concurrency go hand-in-hand, an especially important detail with data modification statements.
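The index itself is a one-liner, something along these lines (the index name is just a placeholder):
--non-clustered index on the join column of Table2; the name is illustrative
CREATE NONCLUSTERED INDEX IX_Table2_ID
    ON dbo.Table2 (ID);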
All of the explanations of phantom reads I managed to find demonstrate a phantom read by running two SELECT statements in one transaction (e.g. https://blobeater.blog/2017/10/26/sql-server-phantom-reads/):
BEGIN TRAN
SELECT #1
DELAY DURING WHICH AN INSERT TAKES PLACE IN A DIFFERENT TRANSACTION
SELECT #2
END TRAN
Is it possible to reproduce a phantom read in one SELECT statement? This would mean that the SELECT statement starts in transaction #1. Then an INSERT runs in transaction #2 and commits. Finally, the SELECT statement from transaction #1 completes, but does not return the row that transaction #2 inserted.
The SQL Server Transaction Isolation Levels documentation defines a phantom row as one "that matches the search criteria but is not initially seen" (emphasis mine). Consequently, more than one SELECT statement is needed for a phantom read to occur.
Data inserted during SELECT statement execution might not be returned under the READ COMMITTED isolation level, depending on the timing, but this is not a phantom read by definition. The example below shows this behavior.
--create table with enough data for a long-running SELECT query
CREATE TABLE dbo.PhantomReadExample(
PhantomReadExampleID int NOT NULL
CONSTRAINT PK_PhantomReadExample PRIMARY KEY
, PhantomReadData char(8000) NOT NULL
);
--insert 100K rows
WITH
t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
,t1m AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a CROSS JOIN t1k AS b)
INSERT INTO dbo.PhantomReadExample WITH(TABLOCKX) (PhantomReadExampleID, PhantomReadData)
SELECT num*2, 'data'
FROM t1m
WHERE num <= 100000;
GO
--run this on connection 1
SELECT *
FROM dbo.PhantomReadExample
ORDER BY PhantomReadExampleID;
GO
--run this on connection 2 while the connection 1 SELECT is running
INSERT INTO dbo.PhantomReadExample(PhantomReadExampleID, PhantomReadData)
VALUES(1, 'data');
GO
Shared locks are acquired on rows as they are read during the SELECT query scan to ensure only committed data are read, but they are released immediately after each row is read in order to improve concurrency. This allows other sessions to insert, update, and delete rows while the SELECT query is running.
The inserted row is not returned in this case because the ordered clustered index scan had already passed the point of the insert.
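By contrast, a row inserted ahead of the scan position can be returned by that same single SELECT. For example, running this on connection 2 instead (the key value is simply one beyond the rows created by the setup script), the inserted row may show up in connection 1's results because the ordered scan has not yet reached that key:
--run this on connection 2 while the connection 1 SELECT is running
INSERT INTO dbo.PhantomReadExample(PhantomReadExampleID, PhantomReadData)
VALUES(200001, 'data');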
Below is the Wikipedia definition of phantom reads:
A phantom read occurs when, in the course of a transaction, new rows are added by another transaction to the records being read. This can occur when range locks are not acquired on performing a SELECT ... WHERE operation. The phantom reads anomaly is a special case of Non-repeatable reads when Transaction 1 repeats a ranged SELECT ... WHERE query and, between both operations, Transaction 2 creates (i.e. INSERT) new rows (in the target table) which fulfill that WHERE clause.
This is certainly possible to reproduce in a single reading query (of course other database activity must also be happening to produce the phantom rows).
Setup
CREATE TABLE Test(X INT PRIMARY KEY);
Connection 1 (leave this running)
SET NOCOUNT ON;
WHILE 1 = 1
INSERT INTO Test VALUES (CRYPT_GEN_RANDOM(4))
Connection 2
This is extremely likely to return some rows if running at the locking read committed isolation level (the default for the on-premises product, and enforced with the table hint below):
WITH CTE AS
(
SELECT *
FROM Test WITH (READCOMMITTEDLOCK)
WHERE X BETWEEN 0 AND 2147483647
)
SELECT *
FROM CTE c1
FULL OUTER HASH JOIN CTE c2 ON c1.X = c2.X
WHERE (c1.X IS NULL OR c2.X IS NULL)
The returned rows are values that were added between the first and second reads of the table and that match the WHERE X BETWEEN 0 AND 2147483647 predicate.
I'm working with an awful view which internally joins many, many tables together, some of which are the same table.
I'm wondering, when a table is being joined to itself, how is the NOLOCK hint interpreted if it's on one of the joins and not the other? Is the NOLOCK still in effect on the table, or is the table locked altogether if NOLOCK is not included on one of the joins of the same table?
For example (this is pseudo-code; assume that there are valid JOIN ON conditions):
SELECT *
FROM Table1 t1 (NOLOCK)
JOIN Table2 t2 (NOLOCK)
JOIN Table2_Table2 tt (NOLOCK)
JOIN Table2 t22 (NOLOCK)
JOIN Table1 t11
Does Table1 get locked or stay NOLOCKed?
Yes, it does get locked by the last Table1 t11 reference. Each table locking hint applies only to that specific reference; if you apply a hint to just one of the references to a table, the others keep their own individual locking settings. You can test this by using BEGIN TRANSACTION and executing two different queries.
Query 1 (locks the table)
Intentionally commenting out the COMMIT TRANSACTION
BEGIN TRANSACTION
SELECT *
FROM Table1 WITH (TABLOCK)
-- COMMIT TRANSACTION
Since COMMIT TRANSACTION is commented out, the transaction stays open and still holds the lock, so when the second query is run, the first query's lock on Table1 still applies.
Query 2 (this query will hang because the first query's lock blocks the Table1 t11 reference)
BEGIN TRANSACTION
SELECT *
FROM Table1 t1 (NOLOCK)
JOIN Table2 t2 (NOLOCK)
JOIN Table2_Table2 tt (NOLOCK)
JOIN Table2 t22 (NOLOCK)
JOIN Table1 t11
COMMIT TRANSACTION
I would guess that not using NOLOCK is going to result in some type of locking (most likely a row lock), regardless of whether the table is joined elsewhere in the query with NOLOCK, so put NOLOCK next to the join that is missing it; see the sketch below.
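Something like this, with the hint added to the previously bare reference (the join conditions are placeholders, as in the question's pseudo-code):
SELECT *
FROM Table1 t1 (NOLOCK)
JOIN Table2 t2 (NOLOCK) ON t1.ID = t2.ID              -- join conditions are placeholders
JOIN Table2_Table2 tt (NOLOCK) ON t2.ID = tt.Table2ID
JOIN Table2 t22 (NOLOCK) ON tt.Table2ID2 = t22.ID
JOIN Table1 t11 (NOLOCK) ON t22.Table1ID = t11.ID     -- NOLOCK added to the reference that was missing it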
In very simplified terms, think of it like this: Each of the tables you reference in a query results in a physical execution plan operator accessing that table. Table hints apply to that operator. This means that you can have mixed locking hints for the same table. The locking behavior that you request is applied to those rows that this particular operator happens to read. The respective operator might scan a table, or scan a range of rows, or read a single row. Whatever it is, it is performed under the specified locking options.
Look at the execution plan for your query to find the individual operators.
I need to run a query that selects ten records. Then, based on their values and some outside information, update said records.
Unfortunately I am running into deadlocks when I do this in a multi-threaded fashion. Both threads A and B run their selects at the same time, acquiring read locks on the ten records. So when one of them tries to do an update, the other transaction is aborted.
So what I need to be able to say is "select and write-lock these ten records".
(Yea, I know serial transactions should be avoided, but this is a special case for me.)
Try applying UPDLOCK so the initial SELECT takes update locks rather than shared locks; a second session running the same block will then wait for the first to commit instead of deadlocking:
BEGIN TRAN
SELECT * FROM table1
WITH (UPDLOCK, ROWLOCK)
WHERE col1 = 'value1'
UPDATE table1
set col1 = 'value2'
where col1 = 'value1'
COMMIT TRAN
Situation:
1) There is a big TABLE1 (9GB data, 20GB index space, 12M rows)
2) There are several UPDATE and UPDATE/SELECT statements on TABLE1, which are run one by one
3) Each UPDATE statement updates different columns
4) None of them use a previously updated column to calculate a newly updated column
5) It takes a while to complete them all
Issue:
I want to run those UPDATEs at the same time, but I'm concerned about deadlocks. How can I avoid them? Will SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED help?
The UPDATEs look like:
update TABLE1 set col1 = subs.col2
from (select ID, col2 from TABLE2) subs
where TABLE1.ID = subs.ID
update TABLE1 set col10 = col2+col3+col4
update TABLE1 set col100 = col2 + subs.col4
from (
select
b.ID, a.col4
from
TABLE3 a
join TABLE1 b on b.ID2 = a.ID2
) subs
where TABLE1.ID = subs.ID
update TABLE1 set col1000 = col2+col3+col4
from TABLE1
join TABLE4 on TABLE4.date = TABLE1.date
join TABLE5 on TABLE5.ID3 = TABLE1.ID
Dirty reads with READ UNCOMMITTED might work if the same columns are not updated and not used in other clauses, but I'm afraid this is a fragile solution.
For a more consistent solution you can mix ROWLOCK/UPDLOCK/NOLOCK hints depending on the operation, e.g.:
UPDATE
TABLE1 WITH (ROWLOCK)
SET
col1 = TABLE2.col2
FROM
TABLE1 WITH (ROWLOCK, UPDLOCK)
INNER JOIN TABLE2 WITH (NOLOCK) ON (TABLE1.ID = TABLE2.ID)
If your statements update mostly different rows, then ROWLOCK can be omitted.
In rare cases lock escalation might happen, but it can be limited by:
ALTER TABLE TABLE1 SET (LOCK_ESCALATION = DISABLE)
BTW, what is the purpose of your solution? I don't think you'll gain much performance, and small partial updates can complete faster than large updates run in parallel.
(1) Avoid sub-queries while updating; multiple sub-queries can quickly lead to lock escalation and cause deadlocks (see the sketch after this list).
(2) Check out the following discussion: TABLOCK vs TABLOCKX.
(3) For current blocking and locking, check out the discussion: How to find out what table a page lock belongs to.
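For (1), as a sketch, the first UPDATE from the question can be rewritten as a direct join instead of a derived-table sub-query (same effect, one less nesting level):
-- join the target directly rather than through a derived table
UPDATE t1
SET col1 = t2.col2
FROM TABLE1 t1
INNER JOIN TABLE2 t2 ON t1.ID = t2.ID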
Another strategy: create a temp table holding the IDs of the rows you want to update, along with each column's new value.
CREATE TABLE #tmp (
RowID int,
NewCol1Value ...,
NewCol2Value ...,
NewCol3Value ...
)
-- Insert into the tmp table
...
UPDATE Table1
SET Col1 = ISNULL(NewCol1Value, Col1),
Col2 = ISNULL(NewCol2Value, Col2),
...
FROM Table1 INNER JOIN #tmp ON Table1.RowID = #tmp.RowID