I'm doing some reading on the advantages and disadvantages of using timestamps for concurrency control in a distributed database. The material I'm reading mentions that although timestamps overcome the traditional deadlock problems that can affect locking, they are still vulnerable to the problem of "global deadlock".
The material describes global deadlock as a situation where no cycle exists in any of the local wait-for graphs, but there is a cycle in the global graph.
I'm wondering how this could happen? Could someone describe a situation where a timestamp system could cause this problem?
Here is an example, probably the simplest possible. We have machines A and B. Machine A hosts transactions T1 and T2; machine B hosts T3 and T4.
Locally, T2 is waiting for a lock held by T1 on machine A, and T4 is waiting for a lock held by T3 on machine B. So each local wait-for graph is a single edge (T2 → T1 and T4 → T3), and there are no local cycles. But now assume that T1 is also waiting for a lock held by T4 on machine B, and T3 is waiting for a lock held by T2 on machine A. Putting all four edges together gives T2 → T1 → T4 → T3 → T2: a cycle, but only in the global graph.
So how does that cycle happen? The key is that in a distributed system you never have the full information on any one machine. Each machine sees only its own (acyclic) local graph, and the inter-machine dependencies are only discovered later, when the deadlock already exists.
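To see why detection requires combining the graphs, here is a minimal sketch in which the wait-for edges from all machines have been collected into one place. The wait_for table and the transaction names are hypothetical; the cycle check is a PostgreSQL-style recursive CTE:

-- Each row says "waiter is waiting for holder".
CREATE TABLE wait_for (waiter text, holder text);

INSERT INTO wait_for VALUES
  ('T2', 'T1'),  -- local edge on machine A
  ('T4', 'T3'),  -- local edge on machine B
  ('T1', 'T4'),  -- cross-machine edge
  ('T3', 'T2');  -- cross-machine edge

-- Walk the graph; any path that returns to its starting transaction is a cycle.
WITH RECURSIVE reach (start_txn, cur_txn, depth) AS (
  SELECT waiter, holder, 1 FROM wait_for
  UNION ALL
  SELECT r.start_txn, w.holder, r.depth + 1
  FROM reach r
  JOIN wait_for w ON w.waiter = r.cur_txn
  WHERE r.depth < 10             -- bound the walk: a cycle recurses forever
)
SELECT DISTINCT start_txn AS deadlocked_txn
FROM reach
WHERE cur_txn = start_txn;

Run the final query over only machine A's edges (or only B's) and it returns nothing; run it over the union and it reports all four transactions as deadlocked.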
Timestamping is used to resolve conflicts between local processes on a machine, and it gives a means to solve deadlocks at that level. For distributed processes there is the possibility of two processes on different machines waiting on each other, which is in fact a regular deadlock, but across machines. This is called a 'global' deadlock. IMHO timestamping might be used there too, but it is apparently impractical.
Some info on this can be found at http://www.cse.scu.edu/~jholliday/dd_9_16.htm
Why does it say "by a sequence of short transactions"? If the transactions were long, there should be no difference, no?
However, care must be taken to avoid the following scenario. Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction T1 requests an exclusive-mode lock on the data item. T1 has to wait for T2 to release the shared-mode lock. Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The lock request is compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point T2 may release the lock, but still T1 has to wait for T3 to finish. But again, there may be a new transaction T4 that requests a shared-mode lock on the same data item, and is granted the lock before T3 releases it. In fact, it is possible that there is a sequence of transactions that each requests a shared-mode lock on the data item, and each transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-mode lock on the data item. The transaction T1 may never make progress, and is said to be starved.
Long transactions (in time) are actually more susceptible to blocking problems than short transactions are. Consequently, it is usually recommended that transactions be designed to hold blocking locks for as short a time as possible.
So, in the scenario above a series of "long" transactions are actually much more likely to cause this problem. However, the writer refers to a series of "short" transactions to emphasize that this problem can happen even when the transactions are short (if there are enough nearly simultaneous compatible transactions).
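To make the scenario concrete, here is a hedged multi-session sketch (the item table and ids are hypothetical). Whether T3 is granted its lock immediately depends on the engine: a naive lock manager grants any compatible request at once, which is exactly what starves T1, while engines that queue waiters fairly (PostgreSQL, for example) avoid this.

-- Session T2: take and hold a shared-mode lock.
BEGIN;
SELECT * FROM item WHERE id = 1 FOR SHARE;

-- Session T1: request an exclusive-mode lock; must wait for T2.
BEGIN;
SELECT * FROM item WHERE id = 1 FOR UPDATE;  -- blocks

-- Session T3: another shared-mode request arrives while T1 waits.
BEGIN;
SELECT * FROM item WHERE id = 1 FOR SHARE;   -- a naive lock manager grants
                                             -- this at once; T1 keeps waiting

-- If T4, T5, ... keep arriving before the previous reader commits,
-- T1 never acquires its exclusive lock: it is starved.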
I am working with a postgres database that is being monitored by icinga2, and one of our monitors is looking at the commit ratio of a database:
SELECT round(100. * sd.xact_commit / (sd.xact_commit + sd.xact_rollback), 2) AS dcommitratio,
       d.datname,
       r.rolname
FROM pg_stat_database sd
JOIN pg_database d ON (d.oid = sd.datid)
JOIN pg_roles r ON (r.oid = d.datdba)
WHERE sd.xact_commit + sd.xact_rollback <> 0;
The problem is that an application recently had a bug (now fixed!) that increased the count of rollbacks considerably, so that the commit ratio is now only 78%, and it is triggering alarms every day.
I could run pg_stat_reset(), but is there a way to clear out these two counters only? I don't want to inadvertently clear out any other necessary stats, like any being used by autovacuum or the query optimizer. Or is pg_stat_reset() considered safe to run?
Unfortunately it is all-or-nothing with resetting PostgreSQL statistics.
But I'd say that your monitoring system is monitoring the wrong thing anyway. Rather than monitoring the absolute values of xact_commit and xact_rollback, you should monitor the changes in the values since the last check.
Otherwise you will not detect a potential problem in a timely fashion: if there have been many months of normal operation, it will take a long time of misbehavior to change the ratio perceptibly.
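One hedged way to do that is to have the check write each sample to a small history table and alert on the ratio over the last interval. The commit_ratio_history table below is hypothetical (many monitoring plugins keep this state themselves):

CREATE TABLE IF NOT EXISTS commit_ratio_history (
    checked_at    timestamptz NOT NULL DEFAULT now(),
    datname       name,
    xact_commit   bigint,
    xact_rollback bigint
);

-- Run this on every check to record a sample:
INSERT INTO commit_ratio_history (datname, xact_commit, xact_rollback)
SELECT d.datname, sd.xact_commit, sd.xact_rollback
FROM pg_stat_database sd
JOIN pg_database d ON d.oid = sd.datid;

-- Commit ratio per interval between consecutive samples:
SELECT datname,
       round(100.0 * (xact_commit - lag(xact_commit) OVER w)
             / nullif((xact_commit - lag(xact_commit) OVER w)
                      + (xact_rollback - lag(xact_rollback) OVER w), 0), 2)
         AS interval_commit_ratio
FROM commit_ratio_history
WINDOW w AS (PARTITION BY datname ORDER BY checked_at);

This way an application bug shows up on the very next check, and months of healthy history cannot dilute it.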
I have a question related to the one above. We are using WITH (NOLOCK) throughout the application. In some cases I need the SELECT to run faster, whatever the side effects.
So which will be faster: SELECT ... WITH (TABLOCKX) or WITH (NOLOCK)?
To answer your question, the with (nolock) table hint will be faster.
NOLOCK typically (depending on your DB engine) means give me your data, and I don't care what state it is in, and don't bother holding it still while you read from it. It is all at once faster, less resource-intensive, and very very dangerous.
As explained very well here: NoLock
NOLOCK means your SELECT takes no shared locks, so it can read rows that other sessions currently have locked, including uncommitted changes. You can still be blocked by some locks, such as schema modification locks.
TABLOCKX means you take an exclusive lock on the whole table: other sessions cannot take any locks on it, and once you hold the whole table you cannot be blocked at the row level. TABLOCKX is mostly used for rapid bulk inserts.
Avoid using NOLOCK everywhere. Try to avoid holding exclusive locks for long, or otherwise minimize your blocking, and then you won't need NOLOCK.
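For illustration, a hedged sketch of both hints against a hypothetical dbo.Orders table:

-- Dirty read: takes no shared locks, may see uncommitted rows.
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);

-- Exclusive table lock: blocks all other access until the transaction ends.
BEGIN TRANSACTION;
SELECT COUNT(*) FROM dbo.Orders WITH (TABLOCKX);
-- ... do work while holding the whole table ...
COMMIT;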
I have a stored procedure that performs a join of TableB to TableA:
SELECT <--- Nested Loop <--- TableA
                |
                +--- TableB
At the same time, in a transaction, rows are inserted into TableA, and then into TableB.
This situation is occasionally causing deadlocks: the stored procedure's SELECT grabs rows from TableB while the INSERT adds rows to TableA, and then each wants the table the other one is holding:
INSERT         SELECT
=========      ========
Lock A         Lock B
Insert A       Select B
Want B         Want A
   ...deadlock...
Logic requires the INSERT to first add rows to A and then to B, while I personally don't care about the order in which SQL Server performs its join, as long as it joins.
The common recommendation for fixing deadlocks is to ensure that everyone accesses resources in the same order. But in this case SQL Server's optimizer is telling me that the opposite order is "better". I can force another join order (see the sketch below) and get a worse-performing query.
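For reference, this is roughly what forcing the order looks like; the join columns are hypothetical. OPTION (FORCE ORDER) makes SQL Server join the tables in the order they are written, so here the SELECT touches TableA before TableB, matching the INSERT's order:

SELECT a.id, b.value
FROM TableA a
JOIN TableB b ON b.a_id = a.id
OPTION (FORCE ORDER);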
But should I?
Should I override the optimizer, now and forever, with a join order that I want it to use?
Or should I just trap native error 1205 and resubmit the SELECT statement?
The question isn't how much worse the query might perform when I override the optimizer and force it to do something non-optimal. The question is: is it better to automatically retry than to run worse queries?
It is better to automatically retry deadlocks. The reason is that you may fix this deadlock only to hit another one later. The behavior may change between SQL Server releases, if the size of the tables changes, if the server hardware specifications change, and even if the load on the server changes. If the deadlock is frequent, you should take active steps to eliminate it (an index is usually the answer), but for rare deadlocks (say, every 10 minutes or so), a retry in the application can mask the deadlock. You can retry both reads and writes, as long as the writes are surrounded by a proper BEGIN TRANSACTION/COMMIT TRANSACTION so that all the write operations are atomic and can be retried without problems.
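A minimal retry sketch in T-SQL; the transaction body is a placeholder, and THROW requires SQL Server 2012 or later:

DECLARE @retries int = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- ... the INSERT into TableA then TableB, or the SELECT ...
        COMMIT TRANSACTION;
        BREAK;                             -- success: stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries -= 1;             -- deadlock victim: try again
        ELSE
            THROW;                         -- anything else: re-raise
    END CATCH;
END;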
Another avenue to consider is turning on read committed snapshot. When this is enabled, SELECT will simply not take any locks, yet yield consistent reads.
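Turning it on is a single statement; MyDb is a placeholder, and WITH ROLLBACK IMMEDIATE kicks out other sessions so the option can switch:

ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

After that, plain READ COMMITTED SELECTs read row versions instead of taking shared locks.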
To avoid deadlocks, one of the most common recommendations is "to acquire locks in the same order" or "access objects in the same order". Clearly this makes perfect sense, but is it always feasible? Is it always possible? I keep encountering cases when I cannot follow this advice.
If I store an object in one parent table and one or more child ones, I cannot follow this advice at all. When inserting, I need to insert my parent row first. When deleting, I have to do it in the opposite order.
If I use commands that touch multiple tables, or multiple rows in one table, then usually I have no control over the order in which locks are acquired (assuming that I am not using hints).
So, in many cases trying to acquire locks in the same order does not prevent all deadlocks, and we need some kind of deadlock handling anyway: we cannot assume that we can eliminate them all. Unless, of course, we serialize all access using Service Broker or sp_getapplock.
When we retry after deadlocks, we are very likely to overwrite other processes' changes; we need to be aware that someone else very likely modified the data we intended to modify. Especially if all the readers run under snapshot isolation, readers cannot be involved in deadlocks, which means that all the parties involved in a deadlock are writers that modified, or attempted to modify, the same data. If we just catch the exception and automatically retry, we can overwrite someone else's changes.
This is called a lost update, and it is usually wrong. Typically the right thing to do after a deadlock is to retry at a much higher level: re-select the data and re-make the decision to save, the same way the original decision to save was made.
For example, if a user pushed a Save button and the saving transaction was chosen as a deadlock victim, it might be a good idea to re-display the data on the screen as of after the deadlock.
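A hedged sketch of such a higher-level save, using a hypothetical rowversion column rv on dbo.Orders to detect whether anyone changed the row since the user last saw it:

CREATE PROCEDURE dbo.SaveOrder @id int, @rv_seen binary(8)
AS
BEGIN
    -- Save only if the row is still in the state the user saw on screen.
    UPDATE dbo.Orders
    SET status = 'shipped'
    WHERE id = @id AND rv = @rv_seen;

    IF @@ROWCOUNT = 0
        -- Someone else changed it (or our transaction was a deadlock victim
        -- and the retry found new data): re-select so the user can re-decide.
        SELECT * FROM dbo.Orders WHERE id = @id;
END;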
Trapping and rerunning can work, but are you sure that the SELECT is always the deadlock victim? If the insert is the deadlock victim, you'll have to be much more careful about retrying.
The easiest solution in this case, I think, is to NOLOCK or READUNCOMMITTED (same thing) your select. People have justifiable concerns about dirty reads, but we've run NOLOCK all over the place for higher concurrency for years and have never had a problem.
I'd also do a little more research into lock semantics. For example, I believe if you set transaction isolation level to snapshot (requires 2005 or later) your problems go away.
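A hedged sketch of that approach; MyDb and dbo.TableA are placeholders:

ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.TableA;  -- reads a consistent snapshot and takes no shared
                           -- locks, so it cannot deadlock with the INSERT
COMMIT;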
I have read about this deadlock problem: when database tables start accumulating thousands of rows and many users start working on the same table concurrently, SELECT queries on the tables start producing lock contention and transaction deadlocks.
Is this deadlock problem related to TransactNo updlock?
If you know about this problem, please let me know.
Thanks in advance.
Deadlocks can occur for many reasons and sometimes troubleshooting deadlocks can be more of an art than a science.
What I use to find and get rid of deadlocks, outside of plain SQL Profiler, is a lightweight tool that gives a graphical depiction of deadlocks as they occur. When you see a deadlock, you can drill down and get valuable information. Deadlock Detector -- http://www.sqlsolutions.com/products/sql-deadlock-detector
It's a simple tool, but for me, it does exactly what it is supposed to do. One thing: the first time I used it, I had to wait 15 minutes for the tool to gather enough metrics to start showing deadlocks.
A common issue at higher isolation levels is the lock conversion deadlock, which arises from the following scenario (where X is any resource, such as a row):
SPID a reads X - gets a read lock
SPID b reads X - gets a read lock
SPID a attempts to update X - blocked by b's read lock, so has to wait
SPID b attempts to update X - blocked by a's read lock, so has to wait
Deadlock! This scenario can be avoided by taking a stronger lock up front:
SPID a reads X with (UPDLOCK) specified - gets an update (U) lock rather than a shared one
SPID b attempts to read X with (UPDLOCK) - blocked by a's update lock, so has to wait
SPID a attempts to update X - fine
... (SPID a commits/rolls-back, and releases the lock at some point)
... (SPID b does whatever it wanted to do)
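A concrete sketch of the fix, with a hypothetical dbo.Accounts table; both sessions run the same code:

-- SPID a:
BEGIN TRANSACTION;
SELECT balance FROM dbo.Accounts WITH (UPDLOCK) WHERE id = 1;  -- U lock
UPDATE dbo.Accounts SET balance = balance - 10 WHERE id = 1;   -- U converts to X
COMMIT;

-- SPID b runs the same batch: its UPDLOCK read waits until a commits,
-- instead of both sessions taking S locks and deadlocking on the upgrade.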
A deadlock can happen for many, many reasons, so you would have to do a little homework first if you want to be helped: tell us what is causing the deadlock, i.e. what batches are involved in the deadlock, what they are executing, what resources are involved, and so on. The Profiler deadlock event graph is always a great place to start the investigation.
If I were to venture a shot in the dark: your queries and indexes are not tuned properly, so most of your read operations (and perhaps some of the writes) are full table scans and thus are guaranteed to collide with updates. This can cause deadlocks by order of index access, deadlocks by order of operations, deadlocks by escalation, and so on.
Once you identify the cause of the deadlock, the proper action to remove it can be taken. The cases where the proper action is to resort to dirty reads are extremely rare.
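As a hedged example of the usual index fix (all names hypothetical), give the read a covering index so it no longer scans the table the writers are updating:

-- Covers a query such as:
--   SELECT order_date, total FROM dbo.Orders WHERE customer_id = @c;
CREATE INDEX IX_Orders_Customer
ON dbo.Orders (customer_id)
INCLUDE (order_date, total);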
BTW I'm not sure what you mean by 'TransactNo updlock'. Are you specifically asking about the S-U/U-S asymmetry of the U locks?
You have not supplied enough information to answer your question directly.
But most locking and blocking can be reduced (or even eliminated) by having the 'correct' indexes to cover your query workload.
Do you have a regular index maintenance job scheduled?
If you have SELECTs that do not need to be 100% accurate (i.e. dirty reads are acceptable), then you can run some SELECTs with WITH(NOLOCK), which is the same as an isolation level of READ UNCOMMITTED. Please note: I'm not suggesting you place WITH(NOLOCK) everywhere, just on those SELECTs that do not need 100% intact data.
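For clarity, the two forms are equivalent for a single SELECT (dbo.Orders is a placeholder): both read without taking shared locks, allowing dirty reads:

SELECT * FROM dbo.Orders WITH (NOLOCK);

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM dbo.Orders;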
I'll throw my own articles and posts into the mix about deadlocks:
https://www.sqlskills.com/blogs/jonathan/category/deadlock/
I also have a series of videos on troubleshooting deadlocking on JumpstartTv.com:
http://jumpstarttv.com/profiles/1379/Jonathan-Kehayias.aspx
Deadlocks can be difficult to resolve, but unless you post your deadlock graph information, there isn't any way we can do more than offer links to posts and information on solving deadlocks.
"Deadlock Troubleshooting, Part 1"
http://blogs.msdn.com/bartd/archive/2006/09/09/Deadlock-Troubleshooting_2C00_-Part-1.aspx
"When Index Covering Prevents Deadlocks"
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/05/03/when-index-covering-prevents-deadlocks.aspx