starvation by a sequence of short transactions - database

Why does it say "by a sequence of short transactions"? If the transactions were long, there should be no difference, should there?
However, care must be taken to avoid the following scenario. Suppose a
transaction T2 has a shared-mode lock on a data item, and another
transaction T1 requests an exclusive-mode lock on the data item. T1
has to wait for T2 to release the shared-mode lock. Meanwhile, a
transaction T3 may request a shared-mode lock on the same data item.
The lock request is compatible with the lock granted to T2, so T3 may
be granted the shared-mode lock. At this point T2 may release the
lock, but still T1 has to wait for T3 to finish. But again, there may
be a new transaction T4 that requests a shared-mode lock on the same
data item, and is granted the lock before T3 releases it. In fact, it
is possible that there is a sequence of transactions that each
requests a shared-mode lock on the data item, and each transaction
releases the lock a short while after it is granted, but T1 never gets
the exclusive-mode lock on the data item. The transaction T1 may never
make progress, and is said to be starved.

Long transactions (in time) are actually more susceptible to blocking problems than short transactions are. Consequently, it is usually recommended that transactions be designed to hold blocking locks for as short a time as possible.
So, in the scenario above, a series of "long" transactions is actually much more likely to cause this problem. However, the writer refers to a series of "short" transactions to emphasize that starvation can happen even when the transactions are short (if there are enough nearly simultaneous compatible transactions).
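To make the scenario concrete, here is a minimal T-SQL sketch of those waits (the table dbo.Item and its columns are hypothetical; run each block in a separate session, in order). Note that real lock managers guard against this: SQL Server, for instance, generally grants lock requests in arrival order, so a new shared-mode request is normally queued behind an already-waiting exclusive request rather than being granted ahead of it.

-- Hypothetical setup: CREATE TABLE dbo.Item (Id int PRIMARY KEY, Val int);

-- Session "T2": take a shared-mode lock and hold it until commit.
BEGIN TRAN;
SELECT Val FROM dbo.Item WITH (HOLDLOCK) WHERE Id = 1;  -- S lock held to end of tran

-- Session "T1": request an exclusive-mode lock on the same row; this blocks
-- behind T2's shared lock.
BEGIN TRAN;
UPDATE dbo.Item SET Val = Val + 1 WHERE Id = 1;         -- waits for the X lock

-- Session "T3": another shared-mode request on the same row. Whether it is
-- granted immediately (starving T1) or queued behind T1's pending X request
-- is exactly the grant-policy question the textbook passage is about.
BEGIN TRAN;
SELECT Val FROM dbo.Item WITH (HOLDLOCK) WHERE Id = 1;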

Related

can we replay jbd2 transactions after a corrupted transaction

I am trying to understand the behaviour of jbd2 journaling when one of the many transactions that are written to the journal gets corrupted.
As per my understanding, for a write operation, first a write is done to persist the data to its on-disk location, followed by the corresponding metadata transaction write to the journal. The format of a metadata update is as follows: 1) the transaction descriptor block, 2) the metadata blocks, and 3) the transaction commit block. This continues for multiple transactions. Finally, during a checkpoint, the metadata corresponding to these transactions is written to its on-disk location.
I also understand that order needs to be maintained between transactions during replay if the file system crashed before a checkpoint occurred, i.e. if we have three transactions T1, T2, T3, they will be replayed sequentially. This is to avoid the scenario where overwrites of the same block occur, or where there is a delete and subsequent re-allocation of the same block in two consecutive transactions.
My question is about a special case: T1, T2, and T3 are three consecutive transactions; T1 and T3 hold metadata changes for, say, metadata block M1, while T2 holds changes for block M2, and M1 and M2 do not overlap at all. In that case, if T2 gets corrupted, will T3 and all subsequent transactions be discarded?

How to get rid of the deadlock

I've described the problem here:
Deadlock under ReadCommited IL
and got an answer:
So a deadlock can occur when running SELECTs because the transaction running
those selects had acquired write locks before running the SELECT.
Okay, so what can I do to get rid of this? There are some common types of deadlock which can be solved by adding covering indexes, changing the isolation level, changing the SQL command text, using table hints, etc., but I can't think of a proper solution for my case.
Seems like this is the most common and simplest cause of deadlock:
Process A acquires a lock on resource R1, process B acquires a lock on resource R2;
process A then waits for resource R2 to be released, while process B waits for R1.
So this is largely a parallelism problem, and probably a business-logic one as well.
Maybe I would be able to avoid the deadlock if the locks were applied to rows, but it seems that several row locks within a page cause lock escalation, and then I have the whole page locked.
What would you advise? Disable lock escalation? Can I do that locally, for just one transaction?
Or maybe applying some table hint (WITH (ROWLOCK)) or something, I don't know.
Changing the isolation level to snapshot (or another type) is not an option right now.
Thanks.
Fixing deadlocks is mostly a task that is specific to the particular transactions under consideration. There is little general advice to be given (except "enable snapshot isolation", which you cannot do).
One pattern emerges as a standard fix, though: acquire all necessary locks in the right order and with the right lock modes. This might mean selecting WITH (UPDLOCK, ROWLOCK, HOLDLOCK) in order to proactively U-lock rows, as in the sketch below.
I haven't often seen lock escalation be the problem, because it takes a very high number of locks to kick in (by default, roughly 5,000 locks taken by one statement). Of course it might be the cause here, but most often plain row locks are enough to trigger a deadlock.
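A minimal sketch of that pattern, with hypothetical table and column names: by taking the U lock up front, the later UPDATE never has to convert a shared lock to an exclusive one, which is the classic deadlock window when two sessions read and then write the same row.

DECLARE @ProductId int = 42, @Ordered int = 1, @CurrentQty int;

BEGIN TRAN;

-- Proactively take the update lock on the row; HOLDLOCK keeps it until commit.
SELECT @CurrentQty = Qty
FROM dbo.Stock WITH (UPDLOCK, ROWLOCK, HOLDLOCK)
WHERE ProductId = @ProductId;

-- ... business logic deciding what to write ...

UPDATE dbo.Stock
SET Qty = @CurrentQty - @Ordered
WHERE ProductId = @ProductId;   -- the U lock converts to X; no second reader can sneak in

COMMIT;

Since only one session at a time can hold a U lock on the row, two concurrent executions serialize at the SELECT instead of deadlocking at the UPDATE.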

database unconfirmed deadlock t-sql 2005

For instance, there exist tables A and B, and I need to process an update in A and then in B, and I decide to table-lock both of them during use (as demanded by my architect). Simultaneously, another procedure is called which table-locks B, then locks A.
Will this transaction complete? I have a feeling it's a deadlock, quite sure of it, as it's not releasing any resources...
Yes it is a possible deadlock.
The deadlock scenario is:
Your task locks A.
The other task locks B.
Then your task tries to lock B, but it can't, because the other task holds that lock;
and the other task tries to lock A, but it can't, because your task holds it.
So one of these tasks has to fail/roll back so that the other can complete. Depending on the RDBMS used, the database will choose one of them to terminate as the deadlock victim.
Often the solution is a guideline that resources must be locked in the same order in all processes; usually this has to be manually enforced, for example as sketched below.
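A minimal sketch of that guideline, using the question's tables A and B and the table-level locks the architect demanded (the Id and Touched columns are hypothetical): every procedure that needs both tables takes the locks in the same fixed order, A before B.

-- Every procedure that touches both tables follows the same order.
CREATE PROCEDURE dbo.UpdateBoth_Sketch
    @IdA int, @IdB int
AS
BEGIN
    BEGIN TRAN;
    -- Exclusive table lock on A first...
    UPDATE dbo.A WITH (TABLOCKX) SET Touched = 1 WHERE Id = @IdA;
    -- ...then on B, never the reverse; a concurrent call now serializes
    -- on A instead of deadlocking.
    UPDATE dbo.B WITH (TABLOCKX) SET Touched = 1 WHERE Id = @IdB;
    COMMIT;  -- both exclusive table locks are released here
END;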
Yes. This approach can end in a classic cyclic deadlock, as mentioned here.
Using a table-level lock for an update is overkill. What is the rationale behind doing this? If you have the correct indexes, locks will be acquired at the key level, which lets multiple processes access the tables in question concurrently.
Still, it is best practice to access the tables in the same order whenever possible.

How can timestamping cause "global deadlock"?

I'm doing some reading up on the advantages/disadvantages of using timestamps for concurrency control in a distributed database. The material I'm reading mentions that although timestamps overcome the traditional deadlock problems that can affect locking, the approach is still vulnerable to "global deadlock".
The material describes a global deadlock as a situation where no cycle exists in any of the local wait-for graphs, but there is a cycle in the global graph.
I'm wondering how this could happen? Could someone describe a situation where a timestamp system could cause this problem?
Here is an example, probably the simplest possible. We have machines A and B. On machine A, transactions T1 and T2 conflict over a local item, and T2 must wait for T1; on machine B, transactions T3 and T4 conflict over a local item, and T4 must wait for T3.
So each local wait-for graph contains a single edge, and there are no local cycles. But now assume T1 also needs an item on machine B that T4 holds, so T1 has to wait for T4; and at the same time T3 needs an item on machine A that T2 holds, so T3 has to wait for T2. Globally there is now the cycle T2 -> T1 -> T4 -> T3 -> T2, even though each machine's own graph is acyclic.
So how does that cycle go undetected? The key is that in a distributed system no single machine ever has the full information: each site sees only its own, acyclic, local graph, and the inter-machine dependencies may only be learned later. And then we have a problem. The sketch below shows the cycle appearing once the local edges are combined.
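Here is a hypothetical T-SQL sketch of what a global deadlock detector has to do: pool the local wait-for edges from both machines into one graph and look for a cycle there. The edge list matches the example above; all names are illustrative.

CREATE TABLE #WaitFor (waiter varchar(8) NOT NULL, holder varchar(8) NOT NULL);

INSERT INTO #WaitFor VALUES
    ('T2', 'T1'),   -- local edge on machine A: T2 waits for T1
    ('T4', 'T3'),   -- local edge on machine B: T4 waits for T3
    ('T1', 'T4'),   -- cross-machine: T1 needs an item T4 holds
    ('T3', 'T2');   -- cross-machine: T3 needs an item T2 holds

-- Walk the combined graph; a path that returns to its starting transaction
-- is a global deadlock cycle that neither machine could see on its own.
WITH Walk AS (
    SELECT waiter AS start_tx, holder AS current_tx,
           CAST(waiter + ' -> ' + holder AS varchar(400)) AS path
    FROM #WaitFor
    UNION ALL
    SELECT w.start_tx, e.holder,
           CAST(w.path + ' -> ' + e.holder AS varchar(400))
    FROM Walk AS w
    JOIN #WaitFor AS e ON e.waiter = w.current_tx
    WHERE w.current_tx <> w.start_tx   -- stop extending once a cycle closes
      AND LEN(w.path) < 350            -- safety cap for other graph shapes
)
SELECT path AS global_deadlock_cycle
FROM Walk
WHERE current_tx = start_tx;           -- e.g. T2 -> T1 -> T4 -> T3 -> T2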
Timestamping is used for conflict resolution between local processes on a machine; it gives a means to resolve deadlocks at that level. For distributed processes there is the possibility of two processes on different machines waiting on each other, which is in fact a regular deadlock, just across machines. This is called a "global" deadlock. In my opinion timestamping might be used there as well, but it is apparently impractical.
Some info on this can be found on http://www.cse.scu.edu/~jholliday/dd_9_16.htm

When I update/insert a single row should it lock the entire table?

I have two long-running queries that each run in a transaction and access the same table, but completely separate rows in that table. These queries also perform some updates and inserts based on what they read.
It appears that when these run concurrently they encounter a lock of some kind that prevents the tasks from finishing: each locks up when it goes to update one of the rows. I'm using an exclusive row lock on the rows being read, and the wait that shows up on the process is an LCK_M_IX lock.
Two questions:
When I update/insert a single row does it lock the entire table?
What can be done to work around this sort of issue?
Typically no, but it depends (the most frequently used answer for SQL Server!).
SQL Server will have to lock the data involved in a transaction in some way. It has to lock the data in the table itself, and the data in any affected indexes, while you perform a modification. To improve concurrency, there are several "granularities" of locking that the server might decide to use, so that multiple processes can run: row locks, page locks, and table locks are common (there are more). Which scale of locking is in play depends on how the server decides to execute a given update. Complicating things, there are also classifications of locks, like shared, exclusive, and intent exclusive, that control whether the locked object can be read and/or modified.
It's been my experience that SQL Server mainly uses page locks for changes to small portions of tables, and past some threshold will automatically escalate to a table lock, if a larger portion of a table seems (from stats) to be affected by an update or delete. The idea is that it is faster to lock a table (one lock) than obtaining and managing thousands of individual row or page locks for a big update.
To see what is happening in your specific case, you'd need to look at the query logic and, while your stuff is running, examine the locking/blocking conditions in sys.dm_tran_locks, sys.dm_os_waiting_tasks, or other DMVs, along the lines of the sketch below. You would want to discover what exactly is getting locked by which step in each of your processes, to discover why one is blocking the other.
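As an example, a sketch of the kind of DMV query meant here; run it in a separate session while the two jobs are blocked. The waiting task is joined to the lock it is waiting on via the resource address.

SELECT
    wt.session_id          AS waiting_session,
    wt.blocking_session_id AS blocking_session,
    wt.wait_type,                        -- e.g. LCK_M_IX
    wt.wait_duration_ms,
    tl.resource_type,                    -- OBJECT / PAGE / KEY / ...
    tl.request_mode                      -- S, X, IX, U, ...
FROM sys.dm_os_waiting_tasks AS wt
JOIN sys.dm_tran_locks AS tl
    ON tl.lock_owner_address = wt.resource_address
WHERE wt.blocking_session_id IS NOT NULL;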
The short version:
No
Fix your code.
The long version:
LCK_M_IX is a wait on an intent lock, meaning the operation intends to place an X lock on a subordinate element. E.g., when updating a row in a table, the operation first takes an IX lock on the table before taking an X lock on the row being updated/inserted/deleted. Intent locks are a common strategy for dealing with hierarchies, like table/page/row, because the lock manager cannot understand the physical structure of the resources requested to be locked (i.e. it cannot know that an X lock on page P1 is incompatible with an S lock on row R1, because R1 is contained in P1). For more details, see Lock Modes. You can observe the hierarchy directly, as sketched below.
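A small sketch of that hierarchy in action, assuming a hypothetical table dbo.Item with a primary key on Id: a single-row update takes intent locks on the table and page, and an X lock only on the key.

BEGIN TRAN;
UPDATE dbo.Item SET Val = Val + 1 WHERE Id = 1;  -- hypothetical one-row update

SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
  AND resource_type <> 'DATABASE';
-- typical result: OBJECT IX, PAGE IX, KEY X

ROLLBACK;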
The fact that you are seeing contention on intent locks means you are trying to obtain high-level object locks, like table locks. You will need to analyze your source code for the request being blocked (the one requesting the lock incompatible with LCK_M_IX) and remove the cause of the object-level lock request. What that means will depend on your source code; I cannot know what you're doing there. My guess is that you are using an erroneous lock hint.
A more general approach is to rely on SNAPSHOT ISOLATION. But this, most likely, will not solve the problem you're seeing, since snapshot isolation can only help with row-level contention issues, not with applications that request table locks.
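For reference, a sketch of what enabling it would look like (the database and table names are hypothetical); note again that row versioning helps reader-versus-writer contention, not requests for table locks.

ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then, in the session doing the reads:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT * FROM dbo.Orders WHERE CustomerId = 42;  -- reads row versions, takes no S locks
COMMIT;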
A frequent piece of advice about transactions: keep them as short and sweet as possible. I get the sense from your wording in the question that you are opening a transaction, then doing all kinds of things, some of which take a long time, and then expecting multiple users to be able to run this same code concurrently. Unfortunately, if you perform an insert at the beginning of that set of code, then do 40 other things before committing or rolling back, it is possible that the insert will block everyone else from running the same type of insert, essentially turning your operation from a free-for-all into a serial one.
Find out what each query is doing, and check whether you are getting lock escalations that you wouldn't expect. Just because you say WITH (ROWLOCK) on a query doesn't mean SQL Server will be able to comply... if you are touching multiple indexes, indexed views, persisted computed columns, etc., then there are all kinds of reasons why your row lock may not hold any water. You also might have things later in the transaction that take longer than you think, and maybe you don't realize that the locks on all of the objects involved in the transaction (not just those touched by the statement that is currently running) can be held for the duration of the transaction.
Different databases have different locking mechanisms, and even broadly similar products like SQL Server and Oracle use different types of locking.
The default on SQL Server appears to be pessimistic page locking, so if you have a small number of records then all of them may get locked.
Most databases should not block while simply running a script, so I'm wondering whether you're potentially running multiple queries concurrently without transactions.
