To NOLOCK or NOT to NOLOCK, that is the question - sql-server

This is really more of a discussion than a specific question about nolock.
I recently took over an app in which almost every query (and there are lots of them) has the nolock option on it. I am pretty new to SQL Server (I used Oracle for 10 years), but I find this pretty disturbing. This weekend I was talking with one of my friends who runs a rather large ecommerce site (name will be withheld to protect the guilty), and he says he has to do this with all of his SQL Servers because otherwise he always ends up in deadlocks.
Is this just a huge shortfall of SQL Server? Is it a failure in the DB design (mine is not 3rd level, but it's close)? Is anybody out there running a SQL Server app without nolocks? These are issues that Oracle handles better with more granular record locks.
Is SQL Server just not able to handle big loads? Is there some better workaround than reading uncommitted data? I would love to hear what people think.
Thanks

SQL Server added snapshot isolation in SQL Server 2005. This lets you still read the latest committed value without having to wait for locks. StackOverflow is also using Snapshot Isolation. The Snapshot Isolation level is more or less the same as what Oracle uses, which is why deadlocks are not very common on an Oracle box. Just be sure to have plenty of tempdb space if you do enable it.
From Books Online:
When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. When the READ_COMMITTED_SNAPSHOT database option is set OFF, which is the default setting, read committed isolation behaves as it did in earlier versions of SQL Server. Both implementations meet the ANSI definition of read committed isolation.
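To make "row versioning" a bit more concrete, here is a toy sketch in Python. It is not how SQL Server implements it; the VersionedRow class and its timestamps are invented purely for illustration. The idea is that writers append new versions, and a reader picks the newest version committed at or before the moment its statement started, so it never has to wait on a writer.

```python
class VersionedRow:
    """Toy row that keeps every committed version, oldest first."""

    def __init__(self, value):
        # Each entry is (commit timestamp, value); timestamps only increase.
        self.versions = [(0, value)]

    def commit_write(self, commit_ts, value):
        # Writers append a new version instead of overwriting in place.
        self.versions.append((commit_ts, value))

    def read(self, snapshot_ts):
        # Readers take the newest version committed at or before their
        # snapshot timestamp; no lock on the row is needed.
        visible = [value for ts, value in self.versions if ts <= snapshot_ts]
        return visible[-1]
```

A statement that started at time 5 keeps seeing the old value even after a writer commits at time 10, while a statement starting at time 10 or later sees the new one. The old versions have to live somewhere, which is why tempdb space matters in SQL Server.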

If somebody says that without NOLOCK their application always gets deadlocked, then there is (more than likely) a problem with their queries. A deadlock means that two transactions cannot proceed because of resource contention and the problem cannot be resolved. An example:
Consider Transactions A and B. Both are in-flight. Transaction A has inserted a row into table X and Transaction B has inserted a row into table Y, so Transaction A has an exclusive lock on X and Transaction B has an exclusive lock on Y.
Now, Transaction A needs to run a SELECT against table Y and Transaction B needs to run a SELECT against table X.
The two transactions are deadlocked: A needs resource Y and B needs resource X. Since neither transaction can proceed until the other completes, the situation cannot be resolved: neither transaction's demand for a resource can be satisfied until the other transaction releases its lock on the resource in contention (either by ROLLBACK or COMMIT, doesn't matter).
SQL Server identifies this situation and selects one transaction or the other as the deadlock victim, aborts that transaction and rolls it back, leaving the other transaction free to proceed to its presumable completion.
Deadlocks are rare in real life (IMHO). One rectifies them by
ensuring that transaction scope is as small as possible, something SQL server does automatically (SQL Server's default transaction scope is a single statement with an implicit COMMIT), and
ensuring that transactions access resources in the same sequence. In the example above, if transactions A and B both locked resources X and Y in the same sequence, there would not be a deadlock.
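The "same sequence" rule can be sketched outside the database with plain Python threads. This is only an analogy: the two locks stand in for tables X and Y, and all names here are made up. Because both workers take X before Y, neither can end up holding one lock while waiting for the other.

```python
import threading

# Hypothetical resources standing in for tables X and Y.
lock_x = threading.Lock()
lock_y = threading.Lock()
LOCK_ORDER = [lock_x, lock_y]  # the one order every transaction agrees on

def run_transaction(name, log):
    # Acquire in the agreed order, so no cycle of waiters can form.
    acquired = []
    try:
        for lock in LOCK_ORDER:
            lock.acquire()
            acquired.append(lock)
        log.append(name)  # the "work", done while holding both locks
    finally:
        # Release in reverse order of acquisition.
        for lock in reversed(acquired):
            lock.release()

def run_all():
    log = []
    threads = [threading.Thread(target=run_transaction, args=(n, log))
               for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return sorted(log)
```

If A took X then Y while B took Y then X, the same program could hang, which is exactly the deadlock in the example above.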
Timeouts
A timeout, on the other hand, occurs when a transaction exceeds its wait time and is rolled back due to resource contention. For instance, Transaction A needs resource X. Resource X is locked by Transaction B, so Transaction A waits for the lock to be released. If the lock isn't released within the query's timeout limit, the waiting transaction is aborted and rolled back. Every query has a query timeout associated with it (the default value is 30s, I believe). The query timeout can be set to 0s, in which case SQL Server will let the query wait forever.
This is probably what they are talking about. In my experience, timeouts like this usually occur in big databases when large batch jobs are updating thousands and thousands of records in a single transaction, although they can happen because a transaction goes on too long. (Connect to your production database in Query Analyzer, execute BEGIN TRANSACTION, update a single row in a frequently hit table and go to lunch without executing ROLLBACK or COMMIT TRANSACTION, and see how long it takes for the production DBAs to go apes**t on you. Don't ask me how I know this.)
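The wait-then-give-up behaviour can be mimicked with a plain Python lock. This is just an analogy with made-up names (try_update, resource_x), not SQL Server's actual mechanism: the waiter blocks for at most the timeout and then abandons its work.

```python
import threading

def try_update(resource_lock, timeout_seconds):
    """Wait for the lock up to the timeout; give up if it never frees."""
    if not resource_lock.acquire(timeout=timeout_seconds):
        return "timed out, rolled back"
    try:
        return "updated"
    finally:
        resource_lock.release()

def demo():
    resource_x = threading.Lock()
    resource_x.acquire()                      # Transaction B locks X and goes to lunch
    blocked = try_update(resource_x, 0.05)    # Transaction A waits, then times out
    resource_x.release()                      # B finally commits
    free = try_update(resource_x, 0.05)       # A's retry now succeeds
    return blocked, free
```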
This sort of timeout is usually what results in splattering perfectly innocent SQL with all sorts of NOLOCK hints.
[TIP: if you're going to do that, just execute SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED as the first statement in your stored procedure and have done with it.]
The problem with this approach (NOLOCK/READ UNCOMMITTED) is that you can read uncommitted data from other transactions: stuff that is incomplete or that may get rolled back later, so your data integrity is compromised. You might be sending out a bill based on data with a high level of bogosity.
My general rule is that one should avoid the use of table hints insofar as possible. Let SQL Server and its query optimizer do their jobs.
The right way to avoid this sort of issue is to avoid the sort of transactions (insert a million rows all at one fell swoop, for instance) that cause the problems. The locking strategy implicit in relational database SQL is designed around small transactions of short scope. Locks should be small in scope and short in duration. Think "bank teller updating somebody's checking account with a deposit" as the underlying use case. Design your processes to work in that model and you'll be much happier all the way 'round.
Instead of inserting a million rows in one mondo insert statement, do the work in independent chunks and commit each chunk independently. If your million row insert dies after processing 999,000 rows, all the work done is lost (not to mention that the rollback can be a b*tch, and the table is still locked during rollback as well). If you insert the million rows in blocks of 1000 rows each, committing after each block, you avoid the lock contention that causes deadlocks, as locks will be obtained and released and things will keep moving. If something goes south in the 999th block of 1000 rows, and the transaction gets aborted and rolled back, you've still gotten 998,000 rows inserted; you've only lost 1000 rows of work. Restart/Retry is much easier.
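A rough sketch of the chunked approach, using Python's built-in sqlite3 module as a stand-in for SQL Server. The items table and the insert_in_chunks helper are made up for illustration; the point is only that each chunk is its own transaction, so a failure costs you one chunk rather than the whole load.

```python
import sqlite3

def insert_in_chunks(conn, rows, chunk_size=1000):
    """Insert rows in independent chunks, committing after each one.

    Returns the number of rows that were durably committed.
    """
    committed = 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        try:
            conn.executemany(
                "INSERT INTO items (id, payload) VALUES (?, ?)", chunk)
            conn.commit()          # locks for this chunk are released here
            committed += len(chunk)
        except sqlite3.Error:
            conn.rollback()        # only the current chunk is lost
            break
    return committed

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    rows = [(i, "x") for i in range(2500)]
    rows[2100] = (5, "duplicate key")   # forces a failure in the third chunk
    committed = insert_in_chunks(conn, rows, chunk_size=1000)
    stored = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return committed, stored
```

In the demo the third chunk hits a duplicate key and is rolled back, but the first two chunks (2000 rows) stay committed, so a restart only has to pick up from there.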
Also, lock escalation occurs in large transactions. For efficiency, locks escalate to a larger scope as the number of locks held by a transaction increases. If a single transaction inserts/updates/deletes a single row in a table, it gets a row lock. Keep doing that and once the number of locks held against that table hits a threshold value (roughly 5,000 locks taken by a single statement), SQL Server will escalate the locking strategy: the row or page locks are replaced by a single table lock (or a partition lock, where that is enabled). Note that SQL Server escalates straight to the table; it does not convert row locks into page locks as an intermediate step. From that point forward the transaction locks the entire table and nobody else may play until the transaction commits or rolls back.
Whether you can functionally avoid the use of NOLOCK/READ UNCOMMITTED is entirely dependent on the nature of the processes hitting the underlying database (and the culture of the organization owning it).
Myself, I try to avoid its use as much as possible.
Hope this helps.

No, there is no need to use NOLOCK. Links: SO 1
As for load, we deal with 2000 rows per second which is small change compared to 35k TPS
Deadlocks are caused by lock contention and usually caused by inconsistent write order on tables in transactions. ORMs especially are rubbish at this. We get them very infrequently. A well written DAL should retry too as per MSDN.

In a traditional normalized OLTP environment, NOLOCK is a code smell and almost certainly unnecessary in a properly designed system.
In a dimensional model, I used NOLOCK extensively to avoid locking very large fact and dimension tables which were being populated with later fact data (and dimensions may have been expiring). In the dimensional model, the facts either never change or never change after a certain point. Similarly, any dimension which is referenced will also be static, so for example, the NOLOCK will stop your long analysis operation on yesterday's data from blocking a dimension expiration during a data load for today's data.

You should only use nolock on an unchanging table. Of course, this will be the same then as Read Committed Snapshot. Without the snapshot, you are only saving the time it takes to apply a shared lock, and then to remove it, which for most cases isn't necessary.
As for a changing table... No lock doesn't just mean getting a row before a transaction is done updating all of its rows. You can get ghost data as data pages split, or even index pages split. Or no data. That alone scared me away, but I think there may be even more scenarios where you simply get the wrong data.
Of course, nolock for getting rough estimates or to just check in on a process might be reasonable.
Basic rule of thumb -- if you care about the data at all, and the data is changing, then do not use NoLOCK.

Related

postgres 9.5 row level locks concurrency exception

I recently ran into a postgres concurrency bug, which I won't repeat here. The original post can be found at this link.
I am still trying to better understand how postgres handles serializable concurrency. My situation is this. I have one stored procedure which reads a table and then inserts based on the output of the read. This stored procedure, if called by multiple clients results in the 40001 read/write dependencies exception.
The question is this. Let's assume that the stored procedure, which reads a table and then inserts into it based on the read, only reads some rows. If it is guaranteed that every call to the stored procedure for read-then-insert touches a different row, would the concurrency exception go away? Is postgres smart enough to keep track of which rows were read during a transaction so that it can accurately detect modification of those specific rows by a different transaction resulting in the exception? And if yes, how reliable is this mechanism? Can it be optimized away in some cases, so that postgres, just to be safe, throws an exception on modification to any of the read tables?
First, what you encountered in the link you give is not a bug, but intended and documented behaviour.
I gather that you are using transaction isolation level SERIALIZABLE.
In this mode, every row you read is locked with a special SIReadLock which doesn't block anything, but is used to determine if a serialization anomaly may have occurred, in which case the transaction is interrupted with a serialization error.
Note that not only rows that are returned to you are locked in this fashion, but all rows in all tables that are accessed during the execution of your query. So if there is a sequential scan in your execution plan, all rows of the table will have a SIReadLock. Moreover, if there are too many of these locks on a table, they get escalated to page or table level locks.
So it is possible that rows are locked unnecessarily. In addition to that, the algorithm which is used to detect inconsistencies can report false positives (it would be computationally too expensive to be exact).
As a consequence, you may receive serialization errors in the case you describe, although I would not expect any as long as everything is kept simple and there are no sequential scans.
Serialization errors are normal and to be expected on the SERIALIZABLE isolation level. Your application must be ready to handle them by retrying the transaction. It is the price you have to pay for not having to worry about data consistency.
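A minimal sketch of such a retry loop in Python. The SerializationFailure class is a stand-in I made up for whatever error your driver raises for SQLSTATE 40001, and run_with_retry is a hypothetical helper, not part of any library. The transaction is passed in as a callable so that the whole thing, reads included, is rerun from the start on each attempt.

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for the driver error that carries SQLSTATE 40001."""

def run_with_retry(transaction, max_attempts=5, base_delay=0.01):
    """Run transaction(); on a serialization failure, back off and rerun it."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            # Randomized exponential backoff so the competing
            # transactions do not collide again in lockstep.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())
```

The same shape works for SQL Server deadlock victims; only the exception you catch changes.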

is insert-select statement massive?

When multiple inserts are used with a select statement in a transaction, how does the database keep track of the changes during the transaction? Can there be problems with resources (such as memory or hard disk space) if a transaction is held open too long?
The short answer is, it depends on the size of the select. The select is part of the transaction, technically, but most selects don't have to be "rolled back", so the actual log of DB changes wouldn't include the select by itself. What it WILL include is a new row for every result from the select statement as an insert statement. If that select statement is 10k rows, the commit will be rather large, but no more so than if you'd written 10k individual insert statements within an explicit transaction.
Exactly how this works depends on the database. For example, in Oracle, it will require UNDO space (and eventually, if you run out, your transaction will be aborted, or your DBA will yell at you). In PostgreSQL, it'll prevent the vacuuming of old row versions. In MySQL/InnoDB, it'll use rollback space, and possibly cause lock timeouts.
There are several things the database must use space for:
Storing which rows your transaction has changed (the old values, the new values, or both) so that rollback can be performed
Keeping track of which data is visible to your transaction so that a consistent view is maintained (in transaction isolation levels other than read uncommitted). This overhead will often be greater the more isolation you request.
Keeping track of which data is visible to other transactions (unless the whole database is running in read uncommitted)
Keeping track of which objects which transactions have changed, so isolation rules are followed, especially in serializable isolation. (Probably not much space, but plenty of locks).
In general, you want your transactions to commit as soon as possible. So, e.g., you don't want to hold one open on an idle connection. How to best batch inserts depends on the database (often, many inserts on one transaction is better than one transaction per insert). And of course, the primary purpose of transactions is data integrity.
You can have many problems with a large transaction. First, in most databases you do not want to run row-by-row, because for a million records that will take hours. But inserting a million records in one complex statement can cause locking on the tables involved and harm performance for everyone else. And a rollback, if you kill the transaction, can take a good while too. Usually the best alternative is to loop in batches. I usually test 50,000 at a time and raise or lower the batch size depending on how long that takes. I've had some databases where I do no more than 1000 in one set-based operation. If possible, large inserts or updates should be scheduled for the database's off-peak hours. If the job is really large (and one-time, usually a large data migration), you might even want to close the database for maintenance, put it in single user mode, drop the indexes, do the insert and reindex.

when to prefer pessimistic model of transaction isolation over optimistic one?

Do I understand correctly that table/row lock hints are being used for pessimistic transaction (TX) isolation models of concurrency ONLY?
In other words, when can table/row lock hints be used during engagement of optimistic TX isolation provided by SQL Server (2005 and higher)?
When would one need pessimistic TX isolation levels/hints in SQL Server 2005+ if the latter provides built-in optimistic (aka snapshot aka versioning) concurrency isolation?
I did read that pessimistic options are legacy and are not needed anymore, though I am in doubt.
Also, with optimistic (aka snapshot aka versioning) TX isolation levels built into SQL Server 2005+,
when would one need to manually code for optimistic concurrency features?
The last question is inspired by having read:
"Optimistic Concurrency in SQL Server" (September 28, 2007)
describing custom coding to provide versioning in SQL Server.
Optimistic concurrency requires more resources and is more expensive when the conflict occurs.
Two sessions can read and modify the values and the conflict only occurs when they try to apply their changes simultaneously. This means that in case of the concurrent update both values should be stored somewhere (which of course requires resources).
Also, when a conflict occurs, usually the whole transaction should be rolled back or the cursor refetched, which is expensive too.
Pessimistic concurrency model uses locking, thus downgrading concurrency but improving performance.
In case of two concurrent tasks, it may be cheaper for the second task to wait for a lock to release than spending CPU time and disk I/O on two simultaneous works and then yet more on rolling back the less fortunate work and redoing it.
Say, you have a query like this:
UPDATE mytable
SET myvalue = very_complex_function(#range)
WHERE rangeid = #range
, with very_complex_function reading some data from mytable itself. In other words, this query transforms a subset of mytable sharing the value of range.
Now, when two functions work on the same range, there may be two scenarios:
Pessimistic: the first query locks, the second query waits for it. The first query completes in 10 seconds, the second one does too. Total: 20 seconds.
Optimistic: both queries work independently (on the same input). This shares CPU time between them plus some overhead on switching. They should keep their intermediate data somewhere, so the data is stored twice (which implies twice the I/O or memory). Let's say both complete almost at the same time, in 15 seconds.
But when it's time to commit the work, the second query will conflict and will have to rollback its changes (say, it takes the same 15 seconds). Then it needs to reread the data again and do the work again, with the new set of data (10 seconds).
As a result, both queries complete later than with a pessimistic locking: 15 and 40 seconds vs. 10 and 20.
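The arithmetic behind those totals, spelled out as a small Python sketch. The function names are mine, and the figures are just the illustrative ones from the example above, not measurements.

```python
def pessimistic_finish_times(work_seconds):
    # The second query waits for the first, then runs uncontended.
    first = work_seconds
    second = first + work_seconds
    return first, second

def optimistic_finish_times(contended_seconds, rollback_seconds, redo_seconds):
    # Both run at once (slower because they share resources); the loser
    # then rolls back and redoes its work on the fresh data.
    first = contended_seconds
    second = contended_seconds + rollback_seconds + redo_seconds
    return first, second
```

With 10 seconds of uncontended work, pessimistic locking finishes at 10 and 20 seconds. With 15 seconds of contended work, a 15 second rollback and a 10 second redo, the optimistic run finishes at 15 and 40 seconds.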
When would one need pessimistic TX isolation levels/hints in SQL Server 2005+ if the latter provides built-in optimistic (aka snapshot aka versioning) concurrency isolation?
Optimistic isolation levels are, well, optimistic. You should not use them when you expect high contention on your data.
BTW, optimistic isolation (for the read queries) was available in SQL Server 2000 too.
I have a detailed answer here: Developing Modifications that Survive Concurrency
I think there's a bit of confusion over terminology here.
The technique of optimistic locking/optimistic concurrency/... is a programming technique used to avoid the following scenario :
start transaction
read data, setting a "read" lock on it to prevent any deletes/modifications to our data
display data on user's screen
await user input, lock remains active
keep awaiting user input, lock still preventing any writes/modifications
user input never comes (for whatever reason)
transaction times out (and this is usually not very rapidly, as the user must be given reasonable time to enter his input).
Optimistic locking replaces this with the following:
start transaction READ
read data, setting a "read" lock on it to prevent any deletes/modifications to our data
end transaction READ, releasing the read lock just set
display data on user's screen
await user input, but data can be modified/deleted meanwhile by other transactions
user input arrives
start transaction WRITE
verify that the data has remained unaltered, raising an exception if it has
apply user updates
end transaction WRITE
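One common way to implement the "verify that the data has remained unaltered" step is a version column. Here is a rough sketch using Python's sqlite3 module as a stand-in database; the accounts table, its version column and the StaleDataError exception are all assumptions made for the example, not anything from the question.

```python
import sqlite3

class StaleDataError(Exception):
    """Raised when the row changed between the READ and WRITE transactions."""

def read_row(conn, row_id):
    # Transaction READ: fetch the data plus its version, then hold nothing.
    row = conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (row_id,)
    ).fetchone()
    conn.commit()
    return row

def write_row(conn, row_id, new_balance, expected_version):
    # Transaction WRITE: the UPDATE only matches if nobody bumped the version.
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, row_id, expected_version),
    )
    if cur.rowcount != 1:
        conn.rollback()
        raise StaleDataError("row %d was modified by someone else" % row_id)
    conn.commit()

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100, 1)")
    conn.commit()
    _, version_seen_by_alice = read_row(conn, 1)
    _, version_seen_by_bob = read_row(conn, 1)
    write_row(conn, 1, 150, version_seen_by_alice)   # first writer wins
    try:
        write_row(conn, 1, 90, version_seen_by_bob)  # stale version, rejected
        conflict = False
    except StaleDataError:
        conflict = True
    return conflict, read_row(conn, 1)
```

No lock is held while the user is thinking; the cost is that the second writer has to be told to reload and try again.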
So the single "user transaction" to go fetch some data, and change and update them, consists of two distinct "database transactions". What is usually called "isolation levels" applies to those database transactions. The "optimistic locking" that you refer to applies to the "user transaction".
The matter is further complicated in that, broadly speaking, two completely distinct strategies are possible for the "isolating the database transactions part" :
MVCC
2-phase locking
I think the "snapshot versioning isolation level" means that the MVCC technique (well, one of its various possible variations) is being used for the database transaction. The other commonly known isolation levels apply more to transaction isolation using 2PL as the serialization(/isolation) technique. (And mixing them up can get messy ...)

When I update/insert a single row should it lock the entire table?

I have two long running queries that are both on transactions and access the same table but completely separate rows in those tables. These queries also perform some update and inserts based on those queries.
It appears that when these run concurrently that they encounter a lock of some kind and it’s preventing the task from finishing and locks up when it goes to update one of the rows. I’m using an exclusive row lock on the rows being read and the lock that shows up on the process is a lck_m_ix lock.
Two questions:
When I update/insert a single row does it lock the entire table?
What can be done to work around this sort of issue?
Typically no, but it depends (most often used answer for SQL Server!)
SQL Server will have to lock the data involved in a transaction in some way. It has to lock the data in the table itself, and the data in any affected indexes, while you perform a modification. In order to improve concurrency, there are several "granularities" of locking that the server might decide to use, in order to allow multiple processes to run: row locks, page locks, and table locks are common (there are more). Which scale of locking is in play depends on how the server decides to execute a given update. Complicating things, there are also classifications of locks like shared, exclusive, and intent exclusive, that control whether the locked object can be read and/or modified.
It's been my experience that SQL Server mainly uses page locks for changes to small portions of tables, and past some threshold will automatically escalate to a table lock, if a larger portion of a table seems (from stats) to be affected by an update or delete. The idea is that it is faster to lock a table (one lock) than obtaining and managing thousands of individual row or page locks for a big update.
To see what is happening in your specific case, you'd need to look at the query logic and, while your stuff is running, examine the locking/blocking conditions in sys.dm_tran_locks, sys.dm_os_waiting_tasks or other DMV's. You would want to discover what exactly is getting locked by what step in each of your processes, to discover why one is blocking the other.
The short version:
No
Fix your code.
The long version:
LCK_M_IX is an intent lock, meaning the operation will place an X lock on a subordinate element. E.g. when updating a row in a table, the operation takes an IX lock on the table before taking an X lock on the row being updated/inserted/deleted. Intent locks are a common strategy for dealing with hierarchies, like table/page/row, because the lock manager cannot understand the physical structure of the resources being locked (i.e. it cannot know that an X-lock on page P1 is incompatible with an S-lock on row R1 because R1 is contained in P1). For more details, see Lock Modes.
The fact that you are seeing contention on intent locks means you are trying to obtain high level object locks, like table locks. You will need to analyze your source code for the request being blocked (the one requesting the lock incompatible with LCK_M_IX) and remove the cause of the object level lock request. What that means will depend on your source code, I cannot know what you're doing there. My guess is that you use an erroneous lock hint.
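To see why an IX request only ever blocks behind an object-level S or X lock, here is the standard compatibility table for the four basic modes as a small Python sketch. The can_grant helper is hypothetical, and the U and SIX modes are left out to keep it short.

```python
# For each held mode, the set of modes another session may still be granted
# on the same resource. Limited to IS, IX, S and X.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}

def can_grant(requested, held_modes):
    """True if `requested` is compatible with every mode already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)
```

Two sessions updating different rows both hold IX on the table, and can_grant("IX", ["IX"]) is true, so they do not block each other at table level. An IX request is refused only when someone holds S or X on the table itself, which is why waiting on LCK_M_IX points at a table-level lock somewhere.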
A more general approach is to rely on SNAPSHOT ISOLATION. But this, most likely, will not solve the problem you're seeing, since snapshot isolation can only benefit row level contention issues, not applications that request table locks.
A frequent aim of using transactions: keep them as short and sweet as possible. I get the sense from your wording in the question that you are opening a transaction, then doing all kinds of things, some of which take a long time. Then expecting multiple users to be able to run this same code concurrently. Unfortunately, if you perform an insert at the beginning of that set of code, then do 40 other things before committing or rolling back, it is possible that that insert will block everyone else from running the same type of insert, essentially turning your operation from free-for-all to serial.
Find out what each query is doing, and whether you are getting lock escalations that you wouldn't expect. Just because you say WITH (ROWLOCK) on a query doesn't mean SQL Server will be able to comply... if you are touching multiple indexes, indexed views, persisted computed columns etc. then there are all kinds of reasons why your rowlock may not hold any water. You also might have things later in the transaction that are taking longer than you think, and maybe you don't realize that the locks on all of the objects involved in the transaction (not just the statement that is currently running) can be held for the duration of the transaction.
Different databases have different locking mechanisms, but ones like SQL Server and Oracle have different types of locking.
The default on SQL Server appears to be pessimistic Page locking - so if you have a small number of records then all of them may get locked.
Most databases should not lock when running a script, so I'm wondering whether you're potentially running multiple queries concurrently without transactions.

Database deadlocks

One of the classical reasons we have a database deadlock is when two transactions are inserting and updating tables in a different order.
For example, transaction A inserts in Table A then Table B.
And transaction B inserts in Table B followed by A.
Such a scenario is always at risk of a database deadlock (assuming you are not using serializable isolation level).
My questions are:
What kind of patterns do you follow in your design to make sure that all transactions are inserting and updating in the same order.
A book I was reading suggested that you can sort the statements by the name of the table. Have you done something like this, or something different, that would enforce that all inserts and updates are in the same order?
What about deleting records? Delete needs to start from child tables and updates and inserts need to start from parent tables. How do you ensure that this would not run into a deadlock?
All transactions are inserting\updating in the same order.
Deletes: identify records to be deleted outside a transaction and then attempt the deletion in the smallest possible transaction, e.g. looking up by the primary key or similar identified during the lookup stage.
Small transactions generally.
Indexing and other performance tuning, both to speed transactions and to promote index lookups over tablescans.
Avoid 'hot tables', e.g. one table with incrementing counters for other tables' primary keys. Any other 'switchboard' type configuration is risky.
Especially if not using Oracle, learn the locking behaviour of the target RDBMS in detail (optimistic / pessimistic, isolation levels, etc.)
Ensure you do not allow row locks to escalate to table locks, as some RDBMSes will.
Deadlocks are no biggie. Just be prepared to retry your transactions on failure.
And keep them short. Short transactions consisting of queries that touch very few records (via the magic of indexing) are ideal to minimize deadlocks - fewer rows are locked, and for a shorter period of time.
You need to know that modern database engines don't lock tables; they lock rows; so deadlocks are a bit less likely.
You can also avoid locking by using MVCC and the CONSISTENT READ transaction isolation level: instead of locking, some threads will just see stale data.
Carefully design your database processes to eliminate as much as possible transactions that involve multiple tables. When I've had database design control there has never been a case of deadlock for which I could not design out the condition that caused it. That's not to say they don't exist and perhaps abound in situations outside my limited experience; but I've had no shortage of opportunities to improve designs causing these kinds of problems. One obvious strategy is to start with a chronological write-only table for insertion of new complete atomic transactions with no interdependencies, and apply their effects in an orderly asynchronous process.
Always use the database default isolation levels and locking settings unless you are absolutely sure what risks they incur, and have proven it by testing. Redesign your process if at all possible first. Then, impose the least increase in protection required to eliminate the risk (and test to prove it.) Don't increase restrictiveness "just in case" - this often leads to unintended consequences, sometimes of the type you intended to avoid.
To repeat the point from another direction, most of what you will read on this and other sites advocating the alteration of database settings to deal with transaction risks and locking problems is misleading and/or false, as demonstrated by how they conflict with each other so regularly. Sadly, especially for SQL Server, I have found no source of documentation that isn't hopelessly confusing and inadequate.
I have found that one of the best investments I ever made in avoiding deadlocks was to use an Object Relational Mapper that could order database updates. The exact order is not important, as long as every transaction writes in the same order (and deletes in exactly the reverse order).
The reason that this avoids most deadlocks out of the box is that your operations are always table A first, then table B, then table C (which perhaps depends on table B).
You can achieve a similar result as long as you exercise care in your stored procedures or data layer's access code. The only problem is that it requires great care to do it by hand, whereas an ORM with a Unit of Work concept can automate most cases.
UPDATE: A delete should run forward to verify that everything is the version you expect (you still need record version numbers or timestamps) and then delete backwards once everything verifies. As this should all happen in one transaction, the possibility of something changing out from under you shouldn't exist. The only reason for the ORM doing it backwards is to obey the key requirements, but if you do your check forward, you will have all the locks you need already in hand.
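If you are ordering by hand, the core of it is small. A sketch in Python, assuming a made-up three-table schema and a hypothetical order_operations helper: inserts and updates are sorted parents-first, deletes are sorted children-first, and every transaction uses the same table order.

```python
# Hypothetical schema, listed parents before children.
TABLE_ORDER = ["customers", "orders", "order_items"]
RANK = {table: position for position, table in enumerate(TABLE_ORDER)}

def order_operations(operations):
    """Sort pending (kind, table) operations into one deadlock-friendly order.

    kind is "insert", "update" or "delete". Writes go parent to child;
    deletes go child to parent, after the writes.
    """
    writes = sorted(
        (op for op in operations if op[0] != "delete"),
        key=lambda op: RANK[op[1]],
    )
    deletes = sorted(
        (op for op in operations if op[0] == "delete"),
        key=lambda op: RANK[op[1]],
        reverse=True,
    )
    return writes + deletes
```

Whatever order the application code queued its changes in, the statements reach the database in the same table sequence every time.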
I analyze all database actions to determine, for each one, if it needs to be in a multiple statement transaction, and then for each such case, what the minimum isolation level is required to prevent deadlocks... As you said serializable will certainly do so...
Generally, only a very few database actions require a multiple statement transaction in the first place, and of those, only a few require serializable isolation to eliminate deadlocks.
For those that do, set the isolation level for that transaction before you begin, and reset it to whatever your default is after it commits.
Your example would only be a problem if the database locked the ENTIRE table. If your database is doing that...run :)

Resources