Deadlock in dependent multiple update statements - sql-server

In a stored procedure (SP), three tables are updated in a single transaction. The updates depend on each other. A deadlock occurs during these updates, not consistently but intermittently.
A WCF service is called, and that service calls the SP. The input of the SP is an XML document. The XML is parsed using the OPENXML method and the values are used to update the tables.
#Table is a table variable, populated by applying OPENXML to the input XML of the SP. The input XML contains only one ID:
<A>
  <Value>XYZ</Value>
  <ID>1</ID>
</A>
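Populating #Table looks something like this (a sketch; @xml is the SP's XML input parameter and the column names are assumed from the sample above):
-- Sketch (assumed names): fill #Table from the SP's XML input
DECLARE @hDoc INT;
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

INSERT INTO #Table (ID, Value)
SELECT ID, Value
FROM OPENXML(@hDoc, '/A', 2)   -- 2 = element-centric mapping
WITH (ID INT 'ID', Value VARCHAR(100) 'Value');

EXEC sp_xml_removedocument @hDoc;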
BEGIN TRAN
-- update Table1
UPDATE Table1
SET ColumnA = A.Value
FROM Table1
JOIN #Table A
  ON Table1.ID = A.ID
-- update Table2
UPDATE Table2
SET ColumnA = Table1.ColumnA
FROM Table2
JOIN Table1
  ON Table1.ID = Table2.ID
-- update Table3
UPDATE Table3
SET ColumnA = Table1.ColumnA
FROM Table3
JOIN Table1
  ON Table1.ID = Table3.ID
COMMIT TRAN
In Table1, the ID column is the primary key.
In Table2, there is no index on the ID column.
The deadlock sometimes occurs while updating Table2.
The error received is: "Transaction (Process ID 100) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
Advice is required on resolving this intermittent deadlock issue.

Deadlocks are often the result of queries touching more data than they need. Query and index tuning help ensure that only the data needed by a query is accessed and locked, reducing both blocking and the likelihood of deadlocks between concurrent sessions.
Because your queries join on ID with no other criteria, an index on that column may keep the UPDATE and DELETE statements from touching rows they don't need. I see from your comments that there was no index on the Table2 ID column, so a clustered index scan was performed. Not only did the scan result in suboptimal performance, it can lead to blocking and deadlocking when concurrent sessions contend for the same rows.
Adding a non-clustered index on ID changed the plan from a full clustered index scan to a non-clustered index seek. This should reduce, if not eliminate, the deadlocks going forward, and improve performance considerably too. I like to say that performance and concurrency go hand in hand, an especially important detail with data modification statements.
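For example, something like this, using the table and column names from the question (Table3 would likely benefit from the same index):
-- Assumed names from the question
CREATE NONCLUSTERED INDEX IX_Table2_ID ON dbo.Table2 (ID);
CREATE NONCLUSTERED INDEX IX_Table3_ID ON dbo.Table3 (ID);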

Related

Which type of locking mode for INSERT, UPDATE or DELETE operations in Sql Server?

I know that NOLOCK is the default for SELECT operations. So even if I don't write the WITH (NOLOCK) hint for a SELECT query, the row won't be locked.
I couldn't find what happens if WITH (ROWLOCK) is not specified for an UPDATE or DELETE query. Is there a difference between the queries below?
UPDATE MYTABLE SET COLUMNA = 'valueA';
and
UPDATE MYTABLE WITH (ROWLOCK) SET COLUMNA = 'valueA';
If there is no hint, then the DB engine chooses the lock mode as a function of the operation (select/modify), the isolation level and granularity, and the possibility of escalating the granularity level. Specifying ROWLOCK, XLOCK does not give a 100% guarantee that the lock will actually be an X lock on rows. In general, this is a very large topic for such a broad question.
Read first about lock modes: https://technet.microsoft.com/en-us/library/ms175519(v=sql.105).aspx
In statement 1 (without ROWLOCK), the DBMS may decide to lock the entire table, or the page the updated record is in. That means that while the row is being updated, all or a number of the other rows in the table are locked and cannot be updated or deleted.
Statement 2 (WITH (ROWLOCK)) suggests that the DBMS lock only the record being updated. But be aware that this is just a HINT, and there is no guarantee that it will be honored by the DBMS.
So even if I don't write the WITH (NOLOCK) hint for a SELECT query, the row won't be locked.
SELECT queries always take a lock, called a shared lock, and the duration of the lock depends on your isolation level.
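For example (a sketch using the question's MYTABLE): under the default READ COMMITTED level the shared lock is released as soon as the row has been read, while under REPEATABLE READ it is held until the transaction ends:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN;
SELECT * FROM MYTABLE WHERE COLUMNA = 'valueA';  -- S locks now held until COMMIT
COMMIT TRAN;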
Is there a difference between the queries below?
UPDATE MYTABLE SET COLUMNA = 'valueA';
and
UPDATE MYTABLE WITH (ROWLOCK) SET COLUMNA = 'valueA';
Suppose your first statement takes more than 5,000 locks: the locks will be escalated to a table lock. With ROWLOCK, SQL Server won't lock the whole table.
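You can watch this from a second session while the update's transaction is still open (a sketch; 53 is a hypothetical session id):
-- Inspect the granularity and mode of the locks the UPDATE is holding
SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 53   -- session id of the updating query
GROUP BY resource_type, request_mode;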

netezza left outer join query performance

I have a question related to Netezza query performance. I have two tables, Table A and Table B, where Table B is a subset of Table A with altered data. I need to apply those new values to Table A from Table B.
We can have two approaches here:
1) Left outer join the tables, select the relevant columns, and insert into the target table
2) Insert Table A's data into the target table, then update those values from Table B using a join
I tried both, and logically they are the same, but the explain plan gives a different cost.
for normal select
a)Sub-query Scan table "TM2" (cost=0.1..1480374.0 rows=8 width=4864 conf=100)
update
b)Hash Join (cost=356.5..424.5 rows=2158 width=27308 conf=21)
for left outer join
Sub-query Scan table "TM2" (cost=51.0..101474.8 rows=10000000 width=4864 conf=100)
From this I feel the left outer join is better. Can anyone put some thought into this and guide me?
Thanks
The reason that the cost of insert into table_c select ... from table_a; update table_c set ... from table_b; is higher is that you're inserting, then deleting, then inserting again. Updates in Netezza mark the records being updated as deleted, then insert new rows with the updated values. Once the data is written to an extent, it's never (to my knowledge) altered.
With insert into table_c select ... from table_a join table_b using (...); you're only inserting once, thereby only updating all the zone maps once. The cost will be noticeably lower.
Netezza does an excellent job of keeping you away from the disk on reads, but it will write to the disk as often as you tell it to; in the case of updates, seemingly more so. Try to write only as often as is necessary to gain the benefits of new distributions and co-located joins. Any more than that and you're just incurring excess commit actions.
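A sketch of the single-write approach, assuming a shared key column id and a value column (hypothetical names):
-- Single write: take Table B's altered value where one exists, else Table A's
INSERT INTO table_c
SELECT a.id,
       COALESCE(b.value, a.value) AS value
FROM table_a a
LEFT OUTER JOIN table_b b
  ON a.id = b.id;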

Using NOLOCK on a table which is being joined to itself

I'm working with an awful view which internally joins many, many tables together, some of which are the same table.
I'm wondering, when a table is being joined to itself, how is the NOLOCK hint interpreted if it's on one of the joins and not the other? Is the NOLOCK still in effect on the table, or is the table locked altogether if NOLOCK is not included on one of the joins of the same table?
For example (this is pseudo-code; assume that there are valid JOIN ON conditions):
SELECT *
FROM Table1 t1 (NOLOCK)
JOIN Table2 t2 (NOLOCK)
JOIN Table2_Table2 tt (NOLOCK)
JOIN Table2 t22 (NOLOCK)
JOIN Table1 t11
Does Table1 get locked or stay NOLOCKed?
Yes, it does get locked by the last Table1 t11 reference. Each table locking hint is applied to that specific reference. If you apply it to only one of the table references, it applies only to that reference; the others have their own individual locking settings. You can test this using BEGIN TRANSACTION and executing two different queries.
Query 1 (locks the table)
Intentionally commenting out the COMMIT TRANSACTION
BEGIN TRANSACTION
SELECT *
FROM Table1 WITH (TABLOCK)
-- COMMIT TRANSACTION
Since COMMIT TRANSACTION was commented out, the transaction is not closed and still holds the lock. When the second query is run, the lock from the first query still applies to the table.
Query 2 (this query will hang because the lock from the first query blocks Table1 t11)
BEGIN TRANSACTION
SELECT *
FROM Table1 t1 (NOLOCK)
JOIN Table2 t2 (NOLOCK)
JOIN Table2_Table2 tt (NOLOCK)
JOIN Table2 t22 (NOLOCK)
JOIN Table1 t11
COMMIT TRANSACTION
I would guess that not using NOLOCK is going to result in some type of locking, regardless of whether the table is joined elsewhere in the query with NOLOCK. It would likely result in a row lock, so put NOLOCK next to the join that is missing it.
In very simplified terms, think of it like this: Each of the tables you reference in a query results in a physical execution plan operator accessing that table. Table hints apply to that operator. This means that you can have mixed locking hints for the same table. The locking behavior that you request is applied to those rows that this particular operator happens to read. The respective operator might scan a table, or scan a range of rows, or read a single row. Whatever it is, it is performed under the specified locking options.
Look at the execution plan for your query to find the individual operators.
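For example (a sketch; the join column is hypothetical), two references to the same table can carry different hints, and each operator honors its own:
SELECT *
FROM Table1 t1 WITH (NOLOCK)      -- this operator reads without taking shared locks
JOIN Table1 t11 WITH (HOLDLOCK)   -- this operator holds its shared locks to end of transaction
  ON t11.id = t1.id;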

UPDATE query from OUTER JOINed tables or derived tables

Is there any way in MS Access to update a table where the data is coming from an outer-joined dataset or a derived table? I know how to do it in MSSQL, but in Access I always receive an "Operation must use updateable query" error. The table being updated is updateable; the source data is not.
After reading up on the error, Microsoft tells me that the error occurs when the query would violate referential integrity. I can assure you this dataset will not. This limitation is crippling when trying to update large datasets. I also read that this can supposedly be remedied by enabling cascading updates. If the relationship between my tables is defined in the query only, is this a possibility?
So far, writing the dataset to a temp table and then inner joining that to the update table is my only solution; that is incredibly clunky. I would like to do something along the lines of this:
UPDATE Table1
LEFT JOIN Table2 ON Table1.Field1 = Table2.Field1
SET Table1.Field1 = Table2.Field2
WHERE Table2.Field1 IS NULL
or
UPDATE Table1 INNER JOIN
(
SELECT Field1, Field2
FROM Table2, Table3
WHERE Field3='Whatever'
) AS T2 ON Table1.Field1=T2.Field1
SET Table1.Field1= T2.Field2
Update queries are very problematic in Access, as you've been finding out.
The temp table idea is sometimes your only option.
Sometimes using the DISTINCTROW declaration solves the problem (Query Properties -> Unique Records to 'Yes'), and is worth trying.
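Setting Unique Records to 'Yes' corresponds to a DISTINCTROW keyword in the SQL; a sketch using the question's table names:
UPDATE DISTINCTROW Table1
INNER JOIN Table2 ON Table1.Field1 = Table2.Field1
SET Table1.Field1 = Table2.Field2;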
Another thing to try would be to use Aliases on your tables, this seems to help out the JET engine as well.
UPDATE Table3
INNER JOIN
(Table1 INNER JOIN Table2 ON Table1.uid = Table2.uid)
ON
(Table3.uid = Table2.uid)
AND
(Table3.uid = Table1.uid)
SET
Table2.field=NULL;
What I did is:
1. Created 3 tables
2. Established relationships between them
3. Used the query builder to update a field in Table2
There seems to be a problem in the query logic. In your first example, you LEFT JOIN to Table2 on Field1, but then have
Table2.Field1 IS NULL
in the WHERE clause. So this limits you to records where no JOIN could be made. But then you try to update Table1 with data from Table2, despite there being no JOIN.
Perhaps you could explain what it is you are trying to do with this query?

Deadlock caused by SELECT JOIN statement with SQL Server

When executing a SELECT statement with a JOIN of two tables, SQL Server seems to
lock both tables of the statement individually. For example, with a query like
this:
SELECT ...
FROM
table1
LEFT JOIN table2
ON table1.id = table2.id
WHERE ...
I found out that the order of the locks depends on the WHERE condition. The
query optimizer tries to produce an execution plan that reads only as many
rows as necessary. So if the WHERE condition contains a column of table1
it will first get the result rows from table1 and then get the corresponding
rows from table2. If the column is from table2 it will do it the other way
round. More complex conditions or the use of indexes may have an effect on
the decision of the query optimizer too.
When the data read by a statement should be updated later in the transaction
with UPDATE statements it is not guaranteed that the order of the UPDATE
statements matches the order that was used to read the data from the 2 tables.
If another transaction tries to read data while a transaction is updating the
tables it can cause a deadlock when the SELECT statement is executed in
between the UPDATE statements because neither the SELECT can get the lock on
the first table nor can the UPDATE get the lock on the second table. For
example:
T1: SELECT ... FROM ... JOIN ...
T1: UPDATE table1 SET ... WHERE id = ?
T2: SELECT ... FROM ... JOIN ... (locks table2, then blocked by lock on table1)
T1: UPDATE table2 SET ... WHERE id = ?
Both tables represent a type hierarchy and are always loaded together. So it
makes sense to load an object using a SELECT with a JOIN. Loading both tables
individually would not give the query optimizer a chance to find the best
execution plan. But since UPDATE statements can only update one table at a
time, this can cause deadlocks when an object is loaded while the object
is updated by another transaction. Updates of objects often cause UPDATEs on
both tables when properties of the object that belong to different types of the
type hierarchy are updated.
I have tried to add locking hints to the SELECT statement, but that does not
change the problem. It just causes the deadlock in the SELECT statements when
both statements try to lock the tables and one SELECT statement gets the lock
in the opposite order of the other statement. Maybe it would be possible to
load data for updates always with the same statement, forcing the locks to be
taken in the same order. That would prevent a deadlock between two transactions
that want to update the data, but it would not prevent deadlocks with a
transaction that only reads data and needs different WHERE conditions.
The only workaround for this so far seems to be that reads may not take locks
at all. With SQL Server 2005 this can be done using SNAPSHOT isolation. The
only way for SQL Server 2000 would be to use the READ UNCOMMITTED isolation
level.
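For reference, a sketch of enabling it on SQL Server 2005 (MyDb is a placeholder name):
-- Enable snapshot isolation at the database level
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
-- then, in each reading session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;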
I would like to know if there is another possibility to prevent SQL Server
from causing these deadlocks.
This will never happen under snapshot isolation, where readers do not block writers. Other than that, there is no way to prevent such things. I wrote a lot of repro scripts here: Reproducing deadlocks involving only one table
Edit:
I don't have access to SQL 2000, but I would try to serialize access to the object by using sp_getapplock, so that reading and modifications never run concurrently. If you cannot use sp_getapplock, roll your own mutex.
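A minimal sketch of that approach ('MyObjectLock' is a hypothetical resource name):
BEGIN TRAN;
EXEC sp_getapplock
    @Resource    = 'MyObjectLock',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',  -- released automatically at COMMIT/ROLLBACK
    @LockTimeout = 10000;          -- wait up to 10 seconds for the lock

-- ... the SELECT ... JOIN ... and subsequent UPDATE statements go here ...

COMMIT TRAN;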
Another way to fix this is to split the SELECT ... FROM ... JOIN into multiple SELECT statements. Set the isolation level to READ COMMITTED. Use a table variable to pipe data from one SELECT into the other to be joined. Use DISTINCT to filter down the inserts into these table variables.
So if I have two tables A and B, and I'm inserting/updating into A and then B, whereas SQL's query optimizer prefers to read B first and then A: I'll split the single SELECT into two SELECTs. First I'll read B, then pass this data on to the next SELECT statement, which reads A.
Here a deadlock won't happen, because the read locks on table B are released as soon as the first statement is done.
PS: I've faced this issue and this worked very well. Much better than my FORCE ORDER answer.
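A sketch of that split, with hypothetical table and column names:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

DECLARE @b TABLE (id INT PRIMARY KEY, colB VARCHAR(50));

-- 1st statement: read B alone; its shared locks are released when the statement completes
INSERT INTO @b (id, colB)
SELECT DISTINCT id, colB
FROM B;

-- 2nd statement: join the captured rows to A
SELECT a.id, a.colA, b.colB
FROM A a
JOIN @b b
  ON b.id = a.id;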
I was facing the same issue. Using the query hint FORCE ORDER fixed it. The downside is that you won't be able to leverage the best plan the query optimizer has for your query, but it will prevent the deadlock.
So (this is from user "Bill the Lizard"): if you have a query FROM table1 LEFT JOIN table2 and your WHERE clause only contains columns from table2, the execution plan will normally first select the rows from table2 and then look up the rows from table1. With a small result set from table2, only a few rows from table1 have to be fetched. With FORCE ORDER, all rows from table1 have to be fetched first, because it has no WHERE clause; then the rows from table2 are joined and the result is filtered using the WHERE clause, thus degrading performance.
But if you know this won't be the case, use it. You might want to optimize the query manually.
The syntax is
SELECT ...
FROM
table1
LEFT JOIN table2
ON table1.id = table2.id
WHERE ...
OPTION (FORCE ORDER)
