How to automatically re-run a deadlocked transaction? (ASP.NET MVC/SQL Server)

I have a very popular site in ASP.NET MVC/SQL Server, and unfortunately a lot of deadlocks occur. While I'm trying to figure out why they occur via SQL Profiler, I wonder how I can change the default behavior of SQL Server when a deadlock happens.
Is it possible to re-run the transaction(s) that caused problems instead of showing the error screen?

Remus's answer is fundamentally flawed. According to https://stackoverflow.com/a/112256/14731 a consistent locking order does not prevent deadlocks. The best we can do is reduce their frequency.
He is wrong on two points:
The implication that deadlocks can be prevented. You will find that both Microsoft and IBM publish articles about reducing the frequency of deadlocks. Nowhere do they claim you can prevent them altogether.
The implication that all deadlocks require you to re-evaluate the state and come to a new decision. It is perfectly correct to retry some actions at the application level, so long as you go back far enough to the decision point.
Side-note: Remus's main point is that the database cannot automatically retry the operation on your behalf, and he is completely right on that count. But this doesn't mean that re-running operations is the wrong response to a deadlock.

You are barking up the wrong tree. You will never succeed in having the SQL engine retry deadlocks automatically; the concept is fundamentally wrong. The very definition of a deadlock is that the state you based your decision on has changed, so you need to read the state again and make a new decision. If your process has deadlocked, by definition another process has won the deadlock, and that means it has changed something you've read.
Your only focus should be on figuring out why the deadlocks occur and eliminating the cause. Invariably, the cause will turn out to be queries that scan more data than they should. While it is true that other types of deadlocks can occur, I bet that is not your case. Many problems will be solved by deploying appropriate indexes. Some problems will send you back to the drawing board and you will have to rethink your requirements.
There are many, many resources out there on how to identify and solve deadlocks:
Detecting and Ending Deadlocks
Minimizing Deadlocks
You may also consider using snapshot isolation, since the lock-free reads involved in snapshot isolation reduce the surface on which deadlocks can occur (i.e. only write-write deadlocks can occur). See Using Row Versioning-based Isolation Levels.
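A minimal sketch of enabling snapshot isolation and using it from a session (YourDatabase is a placeholder database name and dbo.Orders is an illustrative table, not anything from the original question):

-- One-time database setting; "YourDatabase" is a placeholder name.
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the sessions that should read from row versions instead of taking shared locks:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT OrderId, Status FROM dbo.Orders WHERE CustomerId = 42;  -- versioned read, no shared locks
COMMIT TRANSACTION;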

A lot of deadlocks occurring is often an indication that you either do not have the correct indexes and/or that your statistics are out of date. Do you have regularly scheduled index rebuilds as part of maintenance?
Your save code should automatically retry saves when error 1205 is returned (deadlock occurred). There is a standard pattern that looks like this:
// Standard retry pattern: attempt the save, and if SQL Server reports
// error 1205 (this session was chosen as the deadlock victim), retry a
// limited number of times before giving up.
int retriesLeft = 3;
while (true)
{
    try
    {
        SaveData();   // placeholder for your actual save call
        break;        // success: stop retrying
    }
    catch (SqlException ex)
    {
        if (ex.Number == 1205 && --retriesLeft > 0)
        {
            // Handle deadlock by retrying the save
            continue;
        }
        throw;
    }
}
The other option is to retry within your stored procedures. There is an example of that here: Using TRY...CATCH in Transact-SQL

One option in addition to those suggested by Mitch and Remus, since your comments suggest you're looking for a fast fix: if you can identify the queries involved in the deadlocks, you can influence which of them is rolled back and which continues by setting DEADLOCK_PRIORITY for each query, batch or stored procedure.
Looking at your example in the comment to Mitch's answer:
Let's say deadlock occurs on page A, but page B is trying to access the locked data. The error will be displayed on page B, but it doesn't mean that the deadlock occurred on page B. It still occurred on page A.
If you consistently see a deadlock occurring between the queries issued from page A and page B, you can influence which page results in an error and which completes successfully. As the others have said, you cannot automatically force a retry.
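A hedged sketch of what that could look like for the batch issued from the page you are willing to sacrifice (the SELECT itself is only illustrative):

-- Run this batch at LOW deadlock priority so that, if a deadlock occurs,
-- this session is chosen as the victim and the other query completes.
SET DEADLOCK_PRIORITY LOW;

SELECT OrderId, Status        -- stand-in for the query issued from page B
FROM   dbo.Orders
WHERE  CustomerId = 42;

SET DEADLOCK_PRIORITY NORMAL; -- restore the default if the connection is reused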
Post a question with the problem queries and/or the deadlock trace output and there's a good chance you'll get an explanation as to why it's occurring and how it could be fixed.

In some cases you can do something like the code below. Everything between BEGIN TRAN and COMMIT is all or nothing: either @errorcount is set to 0 and the loop ends, or, in case of failure, the counter is decreased by 1 and the work is retried. It may not work if the code between BEGIN TRAN and COMMIT depends on variables set outside it. Just an idea :)
declare @errorcount int = 4  -- retry attempts remaining

while @errorcount > 0
begin
    begin try
        begin tran;
            -- <your code here>
        set @errorcount = 0;  -- success: leave the loop
        commit;
    end try
    begin catch
        if @@trancount > 0 rollback;
        set @errorcount = @errorcount - 1;            -- failure: use up one retry
        if @errorcount = 0 or error_number() <> 1205
            throw;                                    -- out of retries, or not a deadlock: re-raise
    end catch
end

Related

Can Lost Update happen in read committed isolation level in PostgreSQL?

I have a query like below in PostgreSQL:
UPDATE queue
SET    status = 'PROCESSING'
WHERE  queue.status = 'WAITING' AND
       queue.id = (SELECT id FROM queue WHERE status = 'WAITING' LIMIT 1)
RETURNING queue.id
and many workers try to process one work item at a time (that's why I have the sub-query with LIMIT 1). After this update, each worker grabs information about the id and processes the work, but sometimes they grab the same work item and process it twice or more. The isolation level is Read Committed.
My question is: how can I guarantee each work item is processed only once? I know there are many posts out there about this, but I can say I have tried most of them and they didn't help;
I have tried SELECT FOR UPDATE, but it caused deadlocks.
I have tried pg_try_advisory_xact_lock, but it caused an "out of shared memory" error.
I tried adding AND pg_try_advisory_xact_lock(queue.id) to the outer query's WHERE clause, but ... [?]
Any help would be appreciated.
A lost update won't occur in the situation you describe, but it won't work properly either.
What will happen in the example you've given above is that given (say) 10 workers started simultaneously, all 10 of them will execute the subquery and get the same ID. They will all attempt to lock that ID. One of them will succeed; the others will block on the first one's lock. Once the first backend commits or rolls back, the 9 others will race for the lock. One will get it, re-check the WHERE clause and see that the queue.status test no longer matches, and return without modifying any rows. The same will happen with the other 8. So you used 10 queries to do the work of one query.
If you fail to explicitly check the UPDATE result and see that zero rows were updated you might think you were getting lost updates, but you aren't. You just have a concurrency bug in your application caused by a misunderstanding of the order-of-execution and isolation rules. All that's really happening is that you're effectively serializing your backends so that only one at a time actually makes forward progress.
The only way PostgreSQL could avoid having them all get the same queue item ID would be to serialize them, so it didn't start executing query #2 until query #1 finished. If you want to you can do this by LOCKing the queue table ... but again, you might as well just have one worker then.
You can't get around this with advisory locks, not easily anyway. Hacks where you iterated down the queue using non-blocking lock attempts until you got the first lockable item would work, but would be slow and clumsy.
You are attempting to implement a work queue using the RDBMS. This will not work well. It will be slow, it will be painful, and getting it both correct and fast will be very very hard. Don't roll your own. Instead, use a well established, well tested system for reliable task queueing. Look at RabbitMQ, ZeroMQ, Apache ActiveMQ, Celery, etc. There's also PGQ from Skytools, a PostgreSQL-based solution.
Related:
In PostgreSQL, do multiple UPDATES to different rows in the same table having a locking conflict?
Can multiple threads cause duplicate updates on constrained set?
Why do we need message brokers like rabbitmq over a database like postgres?
SKIP LOCKED can be used to implement a queue in PostgreSQL; see the sketch below.
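A minimal sketch of a SKIP LOCKED claim, reusing the question's queue table (requires PostgreSQL 9.5 or later):

-- Each worker claims one unclaimed row; rows already locked by other
-- workers are skipped instead of blocking or being handed out twice.
UPDATE queue
SET    status = 'PROCESSING'
WHERE  id = (
    SELECT id
    FROM   queue
    WHERE  status = 'WAITING'
    ORDER BY id
    LIMIT  1
    FOR UPDATE SKIP LOCKED
)
RETURNING id;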
In PostgreSQL, lost updates can happen under READ COMMITTED and READ UNCOMMITTED, but if you use SELECT FOR UPDATE under READ COMMITTED or READ UNCOMMITTED, lost updates don't happen.
In addition, lost updates don't happen under REPEATABLE READ and SERIALIZABLE whether or not you use SELECT FOR UPDATE; instead, an error is raised if a lost-update condition is detected.

Oracle deadlock without explicit locking and read committed isolation level, why?

I get this error message: ORA-00060: deadlock detected while waiting for resource, even though I am not using any explicit table locking and my isolation level is set to READ COMMITTED.
I use multiple threads over the Spring TransactionTemplate with default propagation. In my business logic the data is separated so that two transactions will never touch the same set of data. Therefore I don't need SERIALIZABLE.
Why does Oracle detect a deadlock? Deadlocks should be impossible in this configuration, or am I missing something? If I'm not missing anything, then my separation algorithm must be wrong, right? Or could there be some other explanation?
Oracle by default does row level locking. You mention using multiple threads. I suspect one thread is locking one row then attempting to lock another which has been locked by another thread. That other thread is then attempting to lock the row the first thread locked. At this point, Oracle will automatically detect a deadlock and break it. The two rows mentioned above could be in the same table or in different tables.
A careful review of what each thread is doing is the starting point. It may be necessary to decide to not run things in parallel, or it may be necessary to use an explicit locking mechanism (select for update for example).
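A hedged sketch of that explicit locking in Oracle SQL (the accounts table, columns and values are illustrative, not from the question):

-- Lock the row up front so a second thread blocks here rather than
-- deadlocking later when both sessions try to upgrade their locks.
SELECT balance
FROM   accounts
WHERE  account_id = 42
FOR UPDATE;

UPDATE accounts
SET    balance = balance - 100
WHERE  account_id = 42;

COMMIT;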
Let me know what you find and post any additional questions….
K
Encountering deadlocks has nothing to do per se with the isolation level. When a row is inserted/updated/deleted, Oracle locks the row. If you have two transactions running concurrently and trying to change the same rows, you can encounter a deadlock. The emphasis is on "CAN". This generally happens if different types of transactions take locks in a different order, which is a sign of bad transaction design.
As was previously mentioned a trace file is generated on encountering a deadlock. If you look at the trace file, you can determine which two sessions are involved in the deadlock. In addition it also shows the respective SQL statements.

SQL Server Deadlock Fix: Force join order, or automatically retry?

I have a stored procedure that performs a join of TableB to TableA:
SELECT <--- Nested <--- TableA
            Loop   <---
                       |
                       --- TableB
At the same time, in a transaction, rows are inserted into TableA, and then into TableB.
This situation is occasionally causing deadlocks, as the stored procedure select grabs rows from TableB, while the insert adds rows to TableA, and then each wants the other to let go of the other table:
INSERT        SELECT
=========     =========
Lock A        Lock B
Insert A      Select B
Want B        Want A
    ....deadlock...
Logic requires the INSERT to first add rows to A, and then to B, while I personally don't care about the order in which SQL Server performs its join - as long as it joins.
The common recommendation for fixing deadlocks is to ensure that everyone accesses resources in the same order. But in this case SQL Server's optimizer is telling me that the opposite order is "better". I can force another join order, and have a worse-performing query.
But should I?
Should I override the optimizer, now and forever, with a join order that I want it to use?
Or should I just trap native error 1205 and resubmit the SELECT statement?
The question isn't how much worse the query might perform when I override the optimizer and force it to do something non-optimal. The question is: is it better to automatically retry, rather than running worse queries?
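For reference, a hedged sketch of what forcing the join order could look like (the join column is illustrative; OPTION (FORCE ORDER) makes the optimizer join the tables in the order written):

SELECT a.Id, b.Value
FROM   TableA AS a
       INNER JOIN TableB AS b ON b.TableAId = a.Id  -- illustrative join column
OPTION (FORCE ORDER);  -- join TableA before TableB, overriding the optimizer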
It is better to automatically retry deadlocks. The reason is that you may fix this deadlock only to hit another one later. The behavior may change between SQL releases, if the size of the tables changes, if the server hardware specifications change, and even if the load on the server changes. If the deadlock is frequent, you should take active steps to eliminate it (an index is usually the answer), but for rare deadlocks (say every 10 minutes or so), a retry in the application can mask the deadlock. You can retry reads or writes, provided the writes are, of course, surrounded by proper BEGIN TRANSACTION/COMMIT TRANSACTION to keep all write operations atomic and hence able to be retried without problems.
Another avenue to consider is turning on read committed snapshot. When this is enabled, SELECT will simply not take any shared locks, yet still yield consistent reads.
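A minimal sketch of enabling it (YourDatabase is a placeholder; the statement needs exclusive access to the database to complete, hence the ROLLBACK IMMEDIATE option):

-- One-time database setting: readers get statement-level row versions
-- instead of shared locks under the default READ COMMITTED level.
ALTER DATABASE YourDatabase
SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;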
To avoid deadlocks, one of the most common recommendations is "to acquire locks in the same order" or "access objects in the same order". Clearly this makes perfect sense, but is it always feasible? Is it always possible? I keep encountering cases when I cannot follow this advice.
If I store an object in one parent table and one or more child ones, I cannot follow this advice at all. When inserting, I need to insert my parent row first. When deleting, I have to do it in the opposite order.
If I use commands that touch multiple tables or multiple rows in one table, then usually I have no control over the order in which locks are acquired (assuming that I am not using hints).
So, in many cases trying to acquire locks in the same order does not prevent all deadlocks. So, we need some way of handling deadlocks anyway - we cannot assume that we can eliminate them all. Unless, of course, we serialize all access using Service Broker or sp_getapplock.
When we retry after deadlocks, we are very likely to overwrite other processes' changes. We need to be aware that very likely someone else modified the data we intended to modify. Especially if all the readers run under snapshot isolation, readers cannot be involved in deadlocks, which means that all the parties involved in a deadlock are writers that modified or attempted to modify the same data. If we just catch the exception and automatically retry, we can overwrite someone else's changes.
This is called a lost update, and it is usually wrong. Typically the right thing to do after a deadlock is to retry at a much higher level: re-select the data and decide whether to save, using the same logic by which the original decision to save was made.
For example, if a user pushed a Save button and the saving transaction was chosen as a deadlock victim, it might be a good idea to re-display the data on the screen as of after the deadlock.
Trapping and rerunning can work, but are you sure that the SELECT is always the deadlock victim? If the insert is the deadlock victim, you'll have to be much more careful about retrying.
The easiest solution in this case, I think, is to use NOLOCK or READ UNCOMMITTED (the same thing) on your SELECT. People have justifiable concerns about dirty reads, but we've run NOLOCK all over the place for higher concurrency for years and have never had a problem.
I'd also do a little more research into lock semantics. For example, I believe if you set transaction isolation level to snapshot (requires 2005 or later) your problems go away.

How to Decide to use Database Transactions

How do you guys decide that you should be wrapping the SQL in a transaction?
Please throw some light on this.
Cheers !!
A transaction should be used when you need a set of changes to be processed completely to consider the operation complete and valid. In other words, if only a portion executes successfully, will that result in incomplete or invalid data being stored in your database?
For example, if you have an insert followed by an update, what happens if the insert succeeds and the update fails? If that would result in incomplete data (in this case, an orphaned record), you should wrap the two statements in a transaction to get them to complete as a "set".
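A minimal T-SQL sketch of that "set" behavior, with illustrative Orders and Customers tables (not from the question): either both statements take effect or neither does.

BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO dbo.Orders (OrderId, CustomerId) VALUES (42, 7);
    UPDATE dbo.Customers SET LastOrderId = 42 WHERE CustomerId = 7;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;  -- undo the insert if the update failed
    THROW;                                    -- re-raise so the caller sees the error
END CATCH;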
If you are executing two or more statements that you expect to be functionally atomic, you should wrap them in a transaction.
If you have more than a single data-modifying statement to execute to complete a task, all of them should be within a transaction.
This way, if the first one is successful, but any of the following ones has an error, you can rollback (undo) everything as if nothing was ever done.
Whenever you wouldn't like it if part of the operation can complete and part of it doesn't.
Anytime you want to lock up your database and potentially crash your production application, anytime you want to litter your application with hidden scalability nightmares, go ahead and create a transaction. Make it big, slow, and put a loop inside.
Seriously, none of the above answers acknowledge the trade-off and potential problems that come with heavy use of transactions. Be careful, and consider the risk/reward each time.
Ebay doesn't use them at all. I'm sure there are many others.
http://www.infoq.com/interviews/dan-pritchett-ebay-architecture
Whenever an operation needs to satisfy the ACID (Atomicity, Consistency, Isolation, Durability) criteria, you should use transactions.
Read this article
When you want to use the atomicity or isolation properties of the database for a set of changes.
Atomicity: An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs (according to Wikipedia).
Isolation: Isolation determines how transaction integrity is visible to other users and systems (according to Wikipedia).

How to solve a deadlock problem?

I have read this about the deadlock problem: when database tables start accumulating thousands of rows and many users start working on the same table concurrently, SELECT queries on the tables start producing lock contention and transaction deadlocks.
Is this deadlock problem related to TransactNo updlock?
If you know about this problem, please let me know.
Thanks in advance.
Deadlocks can occur for many reasons and sometimes troubleshooting deadlocks can be more of an art than a science.
What I use to find and get rid of deadlocks, outside of plain SQL Profiler, is a lightweight tool that gives a graphical depiction of deadlocks as they occur. When you see a deadlock, you can drill down and get valuable information. Deadlock Detector -- http://www.sqlsolutions.com/products/sql-deadlock-detector
It's a simple tool, but for me, it does exactly what it is supposed to do. One thing: the first time I used it, I had to wait 15 minutes for the tool to gather enough metrics to start showing deadlocks.
A common issue at higher isolation levels is the lock conversion deadlock, which arises from the following scenario (where X is any resource, such as a row):
SPID a reads X - gets a read lock
SPID b reads X - gets a read lock
SPID a attempts to update X - blocked by b's read lock, so has to wait
SPID b attempts to update X - blocked by a's read lock, so has to wait
Deadlock! This scenario can be avoided by taking stronger locks up front (a T-SQL sketch follows the steps below):
SPID a reads X with (UPDLOCK) specified - gets an update (U) lock
SPID b attempts to read X with (UPDLOCK) as well - blocked by a's update lock, so has to wait
SPID a attempts to update X - fine; the update lock is converted to an exclusive lock
... (SPID a commits/rolls-back, and releases the lock at some point)
... (SPID b does whatever it wanted to do)
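A hedged T-SQL sketch of that UPDLOCK pattern, using an illustrative Accounts table and values:

DECLARE @balance money;

BEGIN TRANSACTION;

    -- The UPDLOCK hint takes an update (U) lock on the row while reading, so a
    -- second session running the same code waits here instead of deadlocking later.
    SELECT @balance = Balance
    FROM   dbo.Accounts WITH (UPDLOCK)
    WHERE  AccountId = 42;

    UPDATE dbo.Accounts
    SET    Balance = @balance - 100
    WHERE  AccountId = 42;

COMMIT TRANSACTION;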
A deadlock can happen for many, many reasons, so you would have to do a little homework first if you want to be helped: tell us what is causing the deadlock, i.e. what the batches involved in the deadlock are executing, what resources are involved, and so on. The Profiler deadlock event graph is always a great place to start the investigation.
If I were to venture a shot in the dark: your queries and indexes are not tuned properly, so most of your read operations (and perhaps some of the writes) are full table scans and are thus guaranteed to collide with updates. This can cause deadlocks by order of index access, deadlocks by order of operations, deadlocks by escalation, and so on and so forth.
Once you identify the cause of the deadlock, the proper action to remove it can be taken. The cases where the proper action is to resort to dirty reads are extremely rare.
BTW I'm not sure what you mean by 'TransactNo updlock'. Are you specifically asking about the S-U/U-S asymmetry of the U locks?
You have not supplied enough information to answer your question directly.
But most locking and blocking can be reduced (or even eliminated) by having the 'correct' indexes to cover your query workload.
Do you have a regular index maintenance job scheduled?
If you have SELECTs that do not need to be 100% accurate (i.e. can allow dirty reads, etc.) then you can run some SELECTs with WITH(NOLOCK), which is the same as an isolation level of READ UNCOMMITTED. Please note: I'm not suggesting you place WITH(NOLOCK) everywhere; just on those SELECTs that do not need 100% intact data.
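For example (illustrative table; a report-style query where an approximate answer is acceptable):

-- Dirty reads are tolerable here: an approximate count for a dashboard.
SELECT COUNT(*) AS ShippedOrders
FROM   dbo.Orders WITH (NOLOCK)
WHERE  Status = 'Shipped';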
I'll throw my own articles and posts into the mix about deadlocks:
https://www.sqlskills.com/blogs/jonathan/category/deadlock/
I also have a series of videos on troubleshooting deadlocking on JumpstartTv.com as well:
http://jumpstarttv.com/profiles/1379/Jonathan-Kehayias.aspx
Deadlocks can be difficult to resolve, but unless you post your deadlock graph information, there isn't any way we can do more than offer up links to posts and information on solving deadlocks.
"Deadlock Troubleshooting, Part 1"
http://blogs.msdn.com/bartd/archive/2006/09/09/Deadlock-Troubleshooting_2C00_-Part-1.aspx
"When Index Covering Prevents Deadlocks"
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/05/03/when-index-covering-prevents-deadlocks.aspx
