Here's my situation (SQL Server):
I have a web application that uses NHibernate for data access, and another 3 desktop applications. All access the same database, and are likely to use the same tables at any one time.
Now, with the help of NH I'm batching selects in order to load an aggregate with all of its hierarchy - so I would see 4 to maybe 7 selects being issued at once (not sure if it matters).
Every few days one of the applications will get a "Transaction has been chosen as the deadlock victim." error (this usually appears on a select).
I tried changing to snapshot isolation on the database, but that didn't help - I was ending up with:
Snapshot isolation transaction aborted
due to update conflict. You cannot use
snapshot isolation to access table
'...' directly or indirectly in
database '...' to update,
delete, or insert the row that has
been modified or deleted by another
transaction. Retry the transaction or
change the isolation level for the
update/delete statement.
What suggestions do you have for this situation? What should I try, or what should I read in order to find a solution?
EDIT:
Actually there's no raid in there :). The number of users per day is small (I'll say 100 per day - with hundreds of small orders on a busy day), the database is a bit bigger at about 2GB and growing faster every day.
It's a business app, that handles orders, emails, reports, invoices and stuff like that.
Lazy loading would not be an option in this case.
I guess taking a very close look at those indexes is my best bet.
Deadlocks are complicated. A deadlock means that at least two sessions have locks and are waiting for one another to release a different lock; since both are waiting, the locks never get released, neither session can continue, and a deadlock occurs.
In other words, A has lock X, B has lock Y, now A wants Y and B wants X. Neither will give up the lock they have until they are finished with their transaction. Both would wait indefinitely for the other lock. SQL Server sees that this is happening and kills one of the transactions in order to break the deadlock. Snapshot isolation won't necessarily help you: it stops readers and writers from blocking each other, but two writers can still conflict, because the DB still needs to preserve atomicity of transactions.
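To make the "A waits for B, B waits for A" picture concrete, here is a minimal sketch of spotting such a cycle in a wait-for graph. It is only an illustration of the idea, not SQL Server's actual lock monitor; the function name find_deadlock and the session names are made up for the example.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the sessions forming a wait-for cycle, or None if there is none.

    waits_for maps a blocked session to the session holding the lock it wants.
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        # Follow the chain of "is waiting for" until it ends or repeats.
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # current already appears in path, so everything from its first
            # occurrence onward is a ring of sessions waiting on each other.
            return path[path.index(current):]
    return None


# A holds X and wants Y (held by B); B holds Y and wants X (held by A).
print(find_deadlock({"A": "B", "B": "A"}))
# C is merely blocked behind D, which is not waiting on anyone.
print(find_deadlock({"C": "D"}))
```

Plain blocking (the second call) is not a deadlock: D will eventually finish and C can proceed. Only a cycle needs a victim.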
There is no simple answer anyone can give as to why a deadlock would be occurring. You'll need to profile your application to find out.
Start here: How to debug SQL deadlocks. That's a good intro.
Next, look at Detecting and Ending Deadlocks on MSDN. That will give you a lot of good background information on why deadlocks occur, and help you understand what you're looking at/for.
There are also some previous SO questions that you might want to look at:
Diagnosing Deadlocks in SQL Server 2005
Zero SQL deadlock by design
Or, if the deadlocks are very infrequent, just write some exception-handling code into your application to retry the transaction if a deadlock occurs. Sometimes it can be extremely hard (if not nearly impossible) to prevent certain deadlocks. As long as you write transactionally-safe code, it's not the end of the world; it's completely safe to just try the transaction again.
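A retry wrapper along those lines might look like the sketch below. It assumes your data access layer surfaces the deadlock (SQL Server error 1205) as a distinguishable exception; DeadlockVictimError and run_with_deadlock_retry are hypothetical names for this example, not part of any driver or of NHibernate.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlockVictimError(Exception):
    """Hypothetical stand-in for the driver error raised for SQL Server error 1205."""


def run_with_deadlock_retry(work: Callable[[], T], max_attempts: int = 3,
                            base_delay: float = 0.05) -> T:
    """Run work(), re-running the whole unit of work if it was a deadlock victim."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except DeadlockVictimError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the error
            # Back off briefly (with jitter) so the competing transaction can finish.
            time.sleep(base_delay * attempt * random.uniform(0.5, 1.5))
    raise AssertionError("unreachable")


# Usage: a unit of work that is picked as the victim twice, then succeeds.
attempts = []


def place_order() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockVictimError()
    return "committed"


print(run_with_deadlock_retry(place_order, base_delay=0.0))
```

The important design point is that work() must contain the entire transaction (begin, statements, commit), since the victim's transaction has already been rolled back by the time the error arrives.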
Is your hardware properly configured (specifically RAID configuration)? Is it capable of matching your workload?
If hardware is all good and humming, you should ensure you have the 'right' indexes to match your query workload.
Many locking/deadlock problems can be eliminated with the correct indexes (covering indexes can take pressure off the clustered index during inserts).
BTW: turning on snapshot isolation will put increased pressure on your tempDB. How is tempDB configured? RAID 0 is preferred (and even better use an SSD if tempDB is a bottleneck).
While it's not uncommon to find this error in NHibernate sessions with large numbers of users, it seems to be happening too often in your case.
Perhaps your objects are very large, resulting in long-running selects? And if your selects are taking too long, that might indicate problems with your indexes (as Mitch Wheat explains).
If everything is in order, you could also try Lazy Loading to postpone your selects until when you really need your data. This might not be appropriate for your exact situation so you do have to see if it works.
Related
I joined a project a while ago, which consists of a few web servers and a few backend servers.
They all do CRUD things on one database.
Unfortunately, a few tables have been running into deadlocks for a while now. We can see the victim statements via SQL Server Management Studio and its extended events feature.
Primary keys and all the necessary indexes are already in place. We even rebuilt them; a lot of them had fragmentation over 50%.
The thing is, there is one table we would like to switch to the SNAPSHOT isolation level. I know this won't solve the deadlock situation entirely, since I read that write statements might still block each other.
One table contains logs (login of users, tasks started and ended on the backends, yadda yadda...), the other one contains all the processes, so the backends are selecting, inserting and updating (like setting the "running" field from 0 to 1 and vice versa). While the first one might be a good fit for snapshot isolation since it is only used for logging, I doubt it would be recommended for the process table, as far as I understand how snapshot isolation works. And I am also aware that rollbacks of transactions will block the tables during the rollback process anyway.
Even the sysobjects table is getting blocked sometimes when a table has to be dropped. And I must mention that the database is ridiculously large, with many, many tables.
What I would like to know is, if you guys ever switched from whatever isolation level to snapshot and what challenges you had to face, or even if you changed your mind when it came to deadlock prevention and tried a different approach, like hardware upgrade, etc...
We are using SQL Server 2012 EE but currently do not have the option to run queries on a R/O mirror. That is my long-term goal, though I am concerned I may run into the issue below in that scenario as well, since the mirror would also be updating the data I am querying.
I have a view that joins across several tables from two databases and is used for invoicing from existing data. Three of these tables are also actively updated by ongoing transactions. Running a report that used this view used not to be a problem, but now our database is getting much larger and I have run into some timeout problems. First the query was timing out, so I set the command timeout to 0 and reran the query, which pegged all 4 CPUs at 100% for 90 minutes before I killed it. There were no problems with active transactions during that time. I reviewed the query and found a field I was joining on that was not indexed, so I created an index on that field and reran the report, which then finished in three minutes; all the CPUs were busy but not at all pegged. The same amount of data was queried both times. I figured problem solved. Of course later, my boss ran a similar query, perhaps with some more data but probably not a lot more, and our live transactions started timing out 100% of the time while his query was running. I did not get a chance to see the CPU usage during that time.
So my questions are two:
Given I have to use the live and active database, what is the proper way to run a long R/O query so that active transactions can still continue? I am considering NOLOCK but am hoping there is a better standard practice.
And what might cause sqlserver to peg out 4 CPUs with 100% busy and not cause live transaction timeouts, yet when my boss ran his query, after I added the index and my query ran much better, the live update transactions start timing out 100%?
I know this is not a lot of info to go on. I'm not very familiar with sql profiling and performance monitoring yet this behavior seems rather odd and am hoping a best practice would be the correct workaround.
The default behavior of SELECT queries in the READ COMMITTED transaction isolation level is to acquire shared locks during query execution to provide the requested data consistency (read committed data only). These locks are typically row-level and released quickly during query execution, immediately after each row is read. There are also less granular intent locks at the page and table level, which prevent other sessions from taking conflicting locks at those levels while data is being read. Depending on the particulars of the execution plan, there may even be shared locks held at the table level for the duration of the query, which will prevent updates to the table during query execution and result in readers blocking writers.
Setting the READ_COMMITTED_SNAPSHOT database option causes SQL Server to use row versioning instead of locking to provide the same read consistency. A row version store is maintained in tempdb so that when a row requested by the query has changed since the query began, the most recent row version committed before the query began is returned instead. This row-versioning behavior avoids locking and effectively provides a statement-level snapshot of the database at the time the query began. Readers do not block writers and writers do not block readers. Do not confuse the READ_COMMITTED_SNAPSHOT database option with the SNAPSHOT isolation level (a common mistake).
The downside of setting READ_COMMITTED_SNAPSHOT is additional resource usage. An additional 14 bytes of storage overhead for each row is incurred once the database option is enabled. Updates and deletes will generate row versions in tempdb. These versions require tempdb space for the duration of the longest running query, and there is overhead in maintaining the version store. Also consider whether you have existing applications that depend on readers-block-writers locking behavior. Despite this overhead, the concurrency benefits may yield better overall performance depending on your workload, while providing read integrity. See http://technet.microsoft.com/en-us/library/ms188277.aspx for more information.
Actually I decided to create a snapshot at the beginning of each month for reporting to run against. Then delete when no longer needed for reporting. This seems to work fine. I could do something similar with a database restore but slightly more work. This allows not needing a second SQL EE license, and lets me run reports w/o locking tables for live transactions.
I have a process with a SELECT which takes a long time to finish, on the order of 5 to 10 minutes. I am currently not using NOLOCK as a hint to the MS SQL database engine. At the same time we have another process doing updates and inserts into the same database and same tables. The first process has recently started to end prematurely with the message
SQLEXCEPTION: Transaction was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
This first process is running at other sites in identical conditions but with smaller databases, and thus the select statement in question takes a much shorter period of time (on the order of 30 seconds or so). At these other sites, I don't get the deadlock message. I also did not get this message initially at the site that is having the problem, but I assume that, as the database has grown, I must have crossed some threshold. Here are my questions:
Could the time it takes for a transaction to execute make the associated process more likely to be flagged as a deadlock victim?
If I execute the select with a NOLOCK hint, will this remove the problem?
I suspect that a datetime field that is checked as part of the WHERE clause in the select statement is causing the slow lookup time. Can I create an index based on this field? Is it advisable?
Q1. Could the time it takes for a transaction to execute make the associated process more likely to be flagged as a deadlock victim?
No. The SELECT is the victim because it had only read data, therefore the transaction has a lower cost associated with it so is chosen as the victim:
By default, the Database Engine chooses as the deadlock victim the
session running the transaction that is least expensive to roll back.
Alternatively, a user can specify the priority of sessions in a
deadlock situation using the SET DEADLOCK_PRIORITY statement.
DEADLOCK_PRIORITY can be set to LOW, NORMAL, or HIGH, or alternatively
can be set to any integer value in the range (-10 to 10).
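As a rough sketch of the rule quoted above: the session with the lowest deadlock priority is picked first, and among equal priorities the one that is cheapest to roll back loses. The Session type, its fields and the cost numbers below are hypothetical, chosen only to illustrate why a read-only SELECT usually ends up as the victim.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    name: str
    deadlock_priority: int  # -10..10; LOW = -5, NORMAL = 0, HIGH = 5
    rollback_cost: int      # hypothetical measure of work to undo, e.g. log bytes


def choose_victim(sessions: List[Session]) -> Session:
    """Pick the lowest-priority session; break ties by cheapest rollback."""
    return min(sessions, key=lambda s: (s.deadlock_priority, s.rollback_cost))


reader = Session("long SELECT", deadlock_priority=0, rollback_cost=0)
writer = Session("UPDATE", deadlock_priority=0, rollback_cost=4096)

# Same priority: the SELECT has written nothing, so it is cheapest to roll back.
print(choose_victim([reader, writer]).name)

# Raising the reader's priority (SET DEADLOCK_PRIORITY HIGH) shifts the choice.
reader.deadlock_priority = 5
print(choose_victim([reader, writer]).name)
```

Note that changing the priority only changes who gets killed; it does nothing to remove the deadlock itself.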
Q2. If I execute the select with a NOLOCK hint, will this remove the problem?
No. For several reasons:
you should first try to eliminate the deadlock properly, by investigating the root cause
dirty reads are inconsistent reads.
the proper way to specify dirty reads is to use transaction isolation levels
there is a much better solution: read committed snapshot.
Q3. I suspect that a datetime field that is checked as part of the WHERE clause in the select statement is causing the slow lookup time. Can I create an index based on this field? Is it advisable?
Probably. The cause of the deadlock is very likely a poorly indexed database. 10-minute queries are acceptable only under such narrow conditions that I'm 100% certain they are not acceptable in your case.
With 99% confidence I declare that your deadlock is caused by a large table scan conflicting with updates. Start by capturing the deadlock graph to analyze the cause. You will very likely have to optimize the schema of your database. Before you make any modifications, read the topic Designing Indexes and its sub-articles.
Here is how this particular deadlock problem actually occurred and how it was actually resolved. This is a fairly active database with 130K transactions occurring daily. The indexes in the tables in this database were originally clustered. The client requested us to make the indexes nonclustered. As soon as we did, the deadlocking began. When we reestablished the indexes as clustered, the deadlocking stopped.
The answers here are worth a try, but you should also review your code. Specifically have a read of Polyfun's answer here:
How to get rid of deadlock in a SQL Server 2005 and C# application?
It explains the concurrency issue, and how the usage of "with (updlock)" in your queries might correct your deadlock situation - depending really on exactly what your code is doing. If your code does follow this pattern, this is likely a better fix to make, before resorting to dirty reads, etc.
Although @Remus Rusanu's is already an excellent answer, if you are looking for better insight into SQL Server's deadlock causes and tracing strategies, I would suggest reading Brad McGehee's How to Track Down Deadlocks Using SQL Server 2005 Profiler.
I'm writing some logging/auditing code that will be running in production (not just when errors are thrown or while developing). After reading Coding Horror's experiences with dead-locking and logging, I decided I should seek advice. (Jeff's solution of "not logging" won't work for me, this is legally mandated security auditing)
Is there a suitable isolation level for minimizing contention and dead-locking? Any query hints I can add to the insert statement or the stored procedure?
I care deeply about the transactional integrity for everything except the audit table. The idea is that so much will be logged that if a few entries fail, it's not a problem. If the logging stops some other transaction, that would be bad.
I can log to a database or a file, although logging to a file is less attractive because I need to be able to display the results somehow. Logging to a file would (almost) guarantee the logging wouldn't interfere with other code though.
A normal transaction (ie. READ COMMITTED) insert already does the 'minimal' locking. Insert intensive applications will not deadlock on the insert, no matter the order of how the insert is mixed with other operations. At worst an intensive insert system may cause page latch contention on the hot spot where insert occurs, but not deadlocks.
To cause deadlocks as described by Jeff there has to be more at play, like any one of the following:
The system is using a higher isolation level (they had it coming then and well deserve it)
They were reading from the log table during the transaction (so is no longer 'append-only')
The deadlock chain involved application layer locks (ie. .Net lock statements in the log4net framework) resulting in undetectable deadlocks (ie. application hangs). Given that solving the problem involved looking at process dumps, I guess this is the scenario they were having.
So as long as you do insert only logging in READ COMMITTED isolation level transactions you are safe. If you expect the same problem I suspect SO had (ie. deadlocks involving application layer locks) then no amount of database wizardry can save you, as the problem can still manifest even if you log on separate transaction or into separate connection.
If you don't care about consistency on your logging table, why not perform all the logging from a separate thread?
I probably would not wait for transactions to complete before logging, since the log can be pivotal in diagnosing long running transactions. Also, this enables you to see all the work a transaction that rolled back did.
Grab the stack trace and all of your logging data in the logging thread, chuck it on a queue when there are new logging messages, flush them to the db in a single transaction.
Steps to minimizing locking:
(KEY) perform all appends to the logging table outside of the main thread/connection/transaction.
Ensure your logging table has a monotonically increasing clustered index (Eg. int identity ) that is increasing each time you append a log message. This ensures the pages being inserted into are usually in memory and avoids the performance hits you get with heap tables.
Perform multiple appends to the log in a transaction (10 inserts in a transaction are faster than 10 inserts out of a transaction and usually acquire/release fewer locks)
Give it a break. Only perform logging to your db every N milliseconds. Batch up bits of work.
If you need to report on stuff historically, you can consider partitioning your logging table. Example: You could create a new logging table every month, and at the same time have a log VIEW that is a UNION ALL of all the older logging tables. Perform the reporting against the most appropriate source.
You will get better performance by flushing multiple logging messages in a single (smallish) transaction, and have the advantage that if 10 threads are doing work and logging stuff, only a single thread is flushing stuff to the logging table. This pipelining actually makes stuff scale better.
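The queue-plus-single-writer idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions: sqlite3 stands in for the real SQL Server connection so the example is self-contained, and the class name BatchingAuditLogger and the audit_log table are made up for the example.

```python
import queue
import sqlite3
import threading
from typing import List, Optional


class BatchingAuditLogger:
    """Worker threads enqueue messages; one background thread writes them in batches."""

    def __init__(self, conn: sqlite3.Connection, batch_size: int = 10) -> None:
        self._conn = conn
        self._batch_size = batch_size
        self._queue: "queue.Queue[Optional[str]]" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, message: str) -> None:
        # Callers never touch the database, so logging cannot block their transaction.
        self._queue.put(message)

    def close(self) -> None:
        self._queue.put(None)  # sentinel: flush what is queued, then stop
        self._thread.join()

    def _run(self) -> None:
        stopping = False
        while not stopping:
            batch: List[str] = []
            item = self._queue.get()  # sleep until there is something to do
            while True:
                if item is None:
                    stopping = True
                    break
                batch.append(item)
                if len(batch) >= self._batch_size:
                    break
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
            if batch:
                with self._conn:  # one small transaction per batch
                    self._conn.executemany(
                        "INSERT INTO audit_log (message) VALUES (?)",
                        [(m,) for m in batch],
                    )


# Usage: only the logger thread uses the connection until close() returns.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute(
    "CREATE TABLE audit_log (id INTEGER PRIMARY KEY AUTOINCREMENT, message TEXT)"
)
logger = BatchingAuditLogger(conn, batch_size=10)
for i in range(25):
    logger.log(f"event {i}")
logger.close()
print(conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0])
```

The auto-incrementing id plays the role of the monotonically increasing clustered key from the list above. A production version would also need a policy for a full queue and for failed flushes, which this sketch leaves out.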
Since you don't care about the transactional integrity of the audit table, you can obviously perform logging outside of the transaction (i.e. after it completes). That will minimise impact on the transaction.
Also, if you want to minimize locking, you should try to ensure that as much of your query workload as possible has covering non-clustered indexes. (In SQL Server 2005 and above, the use of the INCLUDE clause in NC indexes can make a big difference.)
One easy way to prevent your logging from having locking issues with your 'regular' database is to not use the same database. Just create another database for your logging. As a bonus, the rapid growth of your logging database won't result in fragmentation in your main DB. Personally, I usually prefer to log to a file -- but then again, I'm used to doing heavy text manipulation in my editor - VIM. Logging to a separate DB should help avoid deadlocking issues.
Just make sure that if you try writing your own database appender for the logging framework you use, you are very careful about your locks (which I'm guessing is what tripped up Jeff in the blog post you reference). Properly written (see several of the comments in Jeff's post), you shouldn't have locking issues with your logging framework unless they do something odd.
Can someone explain the implications of using with (nolock) on queries, when you should/shouldn't use it?
For example, if you have a banking application with high transaction rates and a lot of data in certain tables, in what types of queries would nolock be okay? Are there cases when you should always use it/never use it?
WITH (NOLOCK) is the equivalent of using READ UNCOMMITTED as a transaction isolation level. So, you stand the risk of reading an uncommitted row that is subsequently rolled back, i.e. data that never made it into the database. So, while it can prevent reads being deadlocked by other operations, it comes with a risk. In a banking application with high transaction rates, it's probably not going to be the right solution to whatever problem you're trying to solve with it IMHO.
The question is what is worse:
a deadlock, or
a wrong value?
For financial databases, deadlocks are far worse than wrong values. I know that sounds backwards, but hear me out. The traditional example of DB transactions is you update two rows, subtracting from one and adding to another. That is wrong.
In a financial database you use business transactions. That means adding one row to each account. It is of utmost importance that these transactions complete and the rows are successfully written.
Getting the account balance temporarily wrong isn't a big deal; that is what the end of day reconciliation is for. And an overdraft from an account is far more likely to occur because two ATMs are being used at once than because of an uncommitted read from a database.
That said, SQL Server 2005 fixed most of the bugs that made NOLOCK necessary. So unless you are using SQL Server 2000 or earlier, you shouldn't need it.
Further Reading
Row-Level Versioning
Unfortunately it's not just about reading uncommitted data. In the background you may end up reading pages twice (in the case of a page split), or you may miss the pages altogether. So your results may be grossly skewed.
Check out Itzik Ben-Gan's article. Here's an excerpt:
" With the NOLOCK hint (or setting the
isolation level of the session to READ
UNCOMMITTED) you tell SQL Server that
you don't expect consistency, so there
are no guarantees. Bear in mind though
that "inconsistent data" does not only
mean that you might see uncommitted
changes that were later rolled back,
or data changes in an intermediate
state of the transaction. It also
means that in a simple query that
scans all table/index data SQL Server
may lose the scan position, or you
might end up getting the same row
twice. "
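A toy model can show how an unlocked scan ends up with duplicates or gaps when rows move while the scan is in progress. This is not how SQL Server's storage engine is implemented; the dict-based "index", the function names and the row names are all invented for the illustration.

```python
from typing import Callable, Dict, List, Optional


def scan_in_key_order(index: Dict[int, str],
                      interfere: Optional[Callable[[int], None]] = None) -> List[str]:
    """Read rows in ascending key order without holding any locks.

    index maps an index key to a row name. interfere(key) is called after each
    read and models a concurrent writer changing the index mid-scan.
    """
    seen: List[str] = []
    last_key: Optional[int] = None
    while True:
        remaining = [k for k in index if last_key is None or k > last_key]
        if not remaining:
            return seen
        last_key = min(remaining)  # the scan only ever moves forward
        seen.append(index[last_key])
        if interfere is not None:
            interfere(last_key)


def move_row(index: Dict[int, str], old_key: int, new_key: int,
             after_key: int) -> Callable[[int], None]:
    """Build a writer that moves one row once the scan has read after_key."""
    def interfere(just_read: int) -> None:
        if just_read == after_key and old_key in index:
            index[new_key] = index.pop(old_key)
    return interfere


# A row already read is moved ahead of the scan position: it is returned twice.
idx = {10: "alice", 20: "bob", 30: "carol"}
print(scan_in_key_order(idx, move_row(idx, old_key=10, new_key=40, after_key=10)))

# A row not yet read is moved behind the scan position: it is never returned.
idx = {10: "alice", 20: "bob", 30: "carol"}
print(scan_in_key_order(idx, move_row(idx, old_key=30, new_key=5, after_key=10)))
```

In both runs the table holds exactly three rows the whole time and nothing was rolled back, yet the scan returns four rows in one case and two in the other.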
The text book example for legitimate usage of the nolock hint is report sampling against a high update OLTP database.
To take a topical example. If a large US high street bank wanted to run an hourly report looking for the first signs of a city level run on the bank, a nolock query could scan transaction tables summing cash deposits and cash withdrawals per city. For such a report the tiny percentage of error caused by rolled back update transactions would not reduce the value of the report.
Not sure why you are not wrapping financial transactions in database transactions (as when you transfer funds from one account to another - you don't commit one side of the transaction at-a-time - this is why explicit transactions exist). Even if your code is braindead to business transactions as it sounds like it is, all transactional databases have the potential to do implicit rollbacks in the event of errors or failure. I think this discussion is way over your head.
If you are having locking problems, implement versioning and clean up your code.
NOLOCK not only returns wrong values, it can also return phantom records and duplicates.
It is a common misconception that it always makes queries run faster. If there are no write locks on a table, it does not make any difference. If there are locks on the table, it may make the query faster, but there is a reason locks were invented in the first place.
In fairness, here are two special scenarios where a nolock hint may provide utility
1) A pre-2005 SQL Server database that needs to run a long query against a live OLTP database; this may be the only way.
2) A poorly written application that locks records and returns control to the UI, leaving readers indefinitely blocked. NOLOCK can be helpful here if the application cannot be fixed (third party, etc.) and the database is either pre-2005 or versioning cannot be turned on.
NOLOCK is equivalent to READ UNCOMMITTED, however Microsoft says you should not use it for UPDATE or DELETE statements:
For UPDATE or DELETE statements: This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.
http://msdn.microsoft.com/en-us/library/ms187373.aspx
This article applies to SQL Server 2005, so the support for NOLOCK exists if you are using that version. In order to future-proof your code (assuming you've decided to use dirty reads) you could use this in your stored procedures:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
You can use it when you're only reading data, and you don't really care about whether or not you might be getting back data that is not committed yet.
It can be faster on a read operation, but I cannot really say by how much.
In general, I recommend against using it - reading uncommitted data can be a bit confusing at best.
Another case where it's usually okay is in a reporting database, where data is perhaps already aged and writes just don't happen. In this case, though, the option should be set at the database or table level by the administrator by changing the default isolation level.
In the general case: you can use it when you are very sure that it's okay to read old data. The important thing to remember is that it's very easy to get that wrong. For example, even if it's okay at the time you write the query, are you sure something won't change in the database in the future to make these updates more important?
I'll also 2nd the notion that it's probably not a good idea in a banking app. Or an inventory app. Or anywhere you're thinking about transactions.
Simple answer - whenever your SQL is not altering data, and you have a query that might interfere with other activity (via locking).
It's worth considering for any queries used for reports, especially if the query takes more than, say, 1 second.
It's especially useful if you have OLAP-type reports you're running against an OLTP database.
The first question to ask, though, is "why am I worrying about this?" In my experience, fudging the default locking behavior often takes place when someone is in "try anything" mode and this is one case where unexpected consequences are not unlikely. Too often it's a case of premature optimization and can too easily get left embedded in an application "just in case." It's important to understand why you're doing it, what problem it solves, and whether you actually have the problem.
Short answer:
Only use WITH (NOLOCK) in SELECT statements on tables that have a clustered index.
Long answer:
WITH(NOLOCK) is often exploited as a magic way to speed up database reads.
The result set can contain rows that have not yet been committed, that are often later rolled back.
If WITH(NOLOCK) is applied to a table that has a non-clustered index then row-indexes can be changed by other transactions as the row data is being streamed into the result-table. This means that the result-set can be missing rows or display the same row multiple times.
READ COMMITTED adds an additional issue where data is corrupted within a single column where multiple users change the same cell simultaneously.
My 2 cents - it makes sense to use WITH (NOLOCK) when you need to generate reports. At this point, the data wouldn't change much & you wouldn't want to lock those records.
If you are handling finance transactions then you will never want to use nolock. nolock is best used to select from large tables that have lots of updates when you don't care if the record you get could possibly be out of date.
For financial records (and almost all other records in most applications) nolock would wreak havoc as you could potentially read data back from a record that was being written to and not get the correct data.
I've used it to retrieve a "next batch" of things to do. It doesn't matter in this case which exact item, and I have a lot of users running this same query.
Use nolock when you are okay with the "dirty" data. Which means nolock can also read data which is in the process of being modified and/or uncommitted data.
It's generally not a good idea to use it in a high-transaction environment, and that is why it is not the default option on a query.
I use the with (nolock) hint particularly in SQL Server 2000 databases with high activity. I am not certain that it is needed in SQL Server 2005 however. I recently added that hint in a SQL Server 2000 database at the request of the client's DBA, because he was noticing a lot of SPID record locks.
All I can say is that using the hint has NOT hurt us and appears to have made the locking problem solve itself. The DBA at that particular client basically insisted that we use the hint.
By the way, the databases I deal with are back-ends to enterprise medical claims systems, so we are talking about millions of records and 20+ tables in many joins. I typically add a WITH (nolock) hint for each table in the join (unless it is a derived table, in which case you can't use that particular hint)
The simplest answer is a simple question - do you need your results to be repeatable? If yes, then NOLOCK is not appropriate under any circumstances.
If you don't need repeatability, then NOLOCK may be useful, especially if you don't have control over all processes connecting to the target database.