Advice for minimizing locking on an append only table in MS SQL Server? - sql-server

I'm writing some logging/auditing code that will be running in production (not just when errors are thrown or while developing). After reading Coding Horror's experiences with dead-locking and logging, I decided I should seek advice. (Jeff's solution of "not logging" won't work for me, this is legally mandated security auditing)
Is there an suitable Isolation level for minimizing contention and dead-locking? Any query hints I can add to the insert statement or the stored procedure?
I care deeply about the transactional integrity for everything except the audit table. The idea is that so much will be logged that if a few entries fail, it's not a problem. If the logging stops a some other transaction-- that would be bad.
I can log to a database or a file, although logging to a file is less attractive because I need to be able to display the results somehow. Logging to a file would (almost) guarantee the logging wouldn't interfere with other code though.

A normal transaction (ie. READ COMMITTED) insert already does the 'minimal' locking. Insert intensive applications will not deadlock on the insert, no matter the order of how the insert is mixed with other operations. At worst an intensive insert system may cause page latch contention on the hot spot where insert occurs, but not deadlocks.
To cause deadlocks as described by Jeff there has to be more at play, like any one of the following:
The system is using a higher isolation level (they had it coming then and well deserve it)
They were reading from the log table during the transaction (so is no longer 'append-only')
The deadlock chain involved application layer locks (ie. .Net lock statements in the log4net framework) resulting in undetectable deadlocks (ie. application hangs). Given that solving the problem involved looking at process dumps, I guess this is the scenario they were having.
So as long as you do insert only logging in READ COMMITTED isolation level transactions you are safe. If you expect the same problem I suspect SO had (ie. deadlocks involving application layer locks) then no amount of database wizardry can save you, as the problem can still manifest even if you log on separate transaction or into separate connection.

If you don't care about consistency on your logging table, why not perform all the logging from a separate thread.
I probably would not wait for transactions to complete before logging, since the log can be pivotal in diagnosing long running transactions. Also, this enables you to see all the work a transaction that rolled back did.
Grab the stack trace and all of your logging data in the logging thread, chuck it on a queue when there are new logging messages, flush them to the db in a single transaction.
Steps to minimizing locking:
(KEY) perform all appends to the logging table outside of the main thread/connection/transaction.
Ensure your logging table has a monotonically increasing clustered index (Eg. int identity ) that is increasing each time you append a log message. This ensures the pages being inserted into are usually in memory and avoids the performance hits you get with heap tables.
Perform multiple appends to the log in a transaction (10 inserts in a transaction are faster than 10 inserts out of a transaction and usually acquire/release less locks)
Give it a break. Only perform logging to your db every N milliseconds. Batch up bits of works.
If you need to report on stuff historically, you can consider partitioning your logging table. Example: You could create a new logging table every month, and at the same time have a log VIEW that is a UNION ALL of all the older logging tables. Perform the reporting against the most appropriate source.
You will get better performance by flushing multiple logging messages in a single (smallish) transaction, and have the advantage that if 10 threads are doing work and logging stuff, only a single thread is flushing stuff to the logging table. This pipelining actually makes stuff scale better.

Since you don't care about the transactional integrity of the audit table, you can obviously perform logging outside of the transaction (i.e. after it completes). That will minimise impact on the transaction.
Also, if you want to minimize locking, you should try to ensure that as much of your query workload as possible has covering non-clustered indexes. (SQL Server 2005 and above, the use of the INCLUDE statement in NC indexes can make a big difference)

One easy way to prevent your logging from having locking issues with your 'regular' database is to not use the same database. Just create another database for your logging. As a bonus, the rapid growth of your logging database won't result in fragmentation in your main DB. Personall, I usually prefer to log to a file -- but then again, I'm used to doing heavy text manipulation in my editor - VIM. Logging to a separate DB should help avoid deadlocking issues.
Just make sure that if you try writing your own database appender for the logging framework you use, you be very careful about your locks (which I'm guessing is what tripped up Jeff in the blog post you reference). Properly written (see several of the comments in Jeff's post), you shouldn't have locking issues with your logging framework unless they do something odd.

Related

SQL Server Deadlocks Process

I have two question..
I want to confirm are deadlock may happen if one session is querying a table which is locked by another session with.
And how do resolve the above mentioned SQL error when there are more than one computer accessing to the MSSQL Server for actions such as update and delete
enter image description here
A deadlock can occur when a transaction that has started to change data, conflicts with another transaction for acquiring an exclusive lock. A blockage, even a very long one, does not necessarily lead to a deadly lock.
It is possible to eradicate any deadlock using the banker's algorithm but this leads to crippling any access conccurancy in the database, which leads to have only one user in parallel!
However, we can reduce the occurrence of deadly locks by:
modeling the base correctly, i.e. respecting the normal form
indexing tables for reading as well as updating
reducing the level of isoaltion of transactions
switching from pessimistic locking (default in SQL Server) to optimized locking
avoiding unnecessary transactions
reducing the number of requests in the explicit transaction
changing the sequence of processing logic in the transaction
See the link below Using WITH For SELECT queries use WITH(NOLOCK) for UPDATE,DELETE,INSERT use WITH(XLOCK) ..

What is the proper way to run a long query against an active database?

We are using SQL Server 2012 EE but currently do not have the option to run queries on a R/O mirror though that is my long term goal, though am concerned I may run into the below issue in that scenario as well since the mirror would also be updating data I am querying.
I have a view that joins across several tables from two databases and is used for invoicing from existing data. Three of these tables are also actively updated by ongoing transactions. Running a report that used this view did not used to be a problem but now our database is getting much larger and I have run into some timeout problems. First the query was timing out so I set command timeout to 0 and reran the query which pegged all 4 CPUs 100% for 90 minutes and then I killed it. There were no problems with active transactions during that time. I reviewed the query and found a field I was joining on that was not indexed so created an index on that field, reran the report, which then finished in three minutes and all the CPUs were busy but not at all pegged out. Same data amount queried both times. I figured problem solved. Of course later, my boss ran a similar query, perhaps with some more data but probably not a lot more, and our live transactions started timing out 100% while his query was running. I did not get a chance to see the CPU usage during that time.
So my questions are two:
Given I have to use the live and active database, what is the proper way to run a long R/O query so that active transactions can still continue? I am considering NO LOCK but am hoping there is a better standard practice.
And what might cause sqlserver to peg out 4 CPUs with 100% busy and not cause live transaction timeouts, yet when my boss ran his query, after I added the index and my query ran much better, the live update transactions start timing out 100%?
I know this is not a lot of info to go on. I'm not very familiar with sql profiling and performance monitoring yet this behavior seems rather odd and am hoping a best practice would be the correct workaround.
The default behavior of SELECT queries in the READ_COMMITTED transaction isolation level is to acquire shared locks during query execution to provide the requested data consistency (read committed data only). These locks are typically row-level and released quickly during query execution immediately after each row is read. There are also less granular intent locks at the page and table level prevent concurrent updates to data as it is being read. Depending on the particulars of the execution plan, there may even be shared locks held at the table level for the duration of the query, which will prevent updates to the table during query execution and result in readers blocking writers.
Setting the READ_COMMITTED_SNAPSHOT database option causes SQL Server to use row versioning instead of locking to provide the same read consistency. A row version store is maintained in tempdb so that when a row requested by the query has changed since the query began, the most recent committed row version is returned instead. This row-versioning behavior avoids locking and effectively provides a statement-level snapshot of the database at the time the query began. Readers do not block writers and writers do not block readers. Do not confuse the READ_COMMITTED_SNAPSHOT database option with the SNAPSHOT isolation level (a common mistake).
The downside of setting the READ_COMMITTED_SNAPSHOT is additional resource usage. An additional 14 bytes of storage overhead for each row is incurred once the database option is enabled. Updates and deletes will generate row versions in tempdb. These versions require tempdb space for the duration of the longest running query and there is overhead in maintained the version store. Also consider whether you have existing applications that depend on readers-block-writers locking behavior. Despite this overhead, the concurrency benefits may yield better overall performance depending on your workload, while providing read integrity. See http://technet.microsoft.com/en-us/library/ms188277.aspx for more information.
Actually I decided to create a snapshot at the beginning of each month for reporting to run against. Then delete when no longer needed for reporting. This seems to work fine. I could do something similar with a database restore but slightly more work. This allows not needing a second SQL EE license, and lets me run reports w/o locking tables for live transactions.

Solution for preventing DB deadlock in ColdFusion?

An app I am working on has to handle lots of ajax requests that needs to update some data on DB.
[Macromedia][SQLServer JDBC Driver][SQLServer]Transaction (Process ID
66) was deadlocked on lock | communication buffer resources with
another process and has been chosen as the deadlock victim. Rerun the
transaction.
For reads, I've already used the WITH (NOLOCK) hint and that prevented a lot of deadlocks on reads.
What can I do to better deal with writes?
CFLock the update code in CF?
Or is there a way to ask SQL Server to lock a Row instead of a Table?
Have anyone tried implementing CQRS? Seems to solve the problem but I am not clear on how to handle:
ID generation (right now it uses Auto Increment on DB)
How to deal with update request fail if server couldn't send the error back to the client right away.
Thanks
Here are my thoughts on this.
From the ColdFusion server side
I do believe that using named <cflock> tags around your ColdFusion code that updates the database could prevent the deadlock issue on the database server. Using a named lock would make each call single threaded. However, you could run into timeouts on the ColdFusion server side, waiting for the <cflock>, if transactions are taking a while. Handling it in this way on the ColdFusion server side may also slow down your application. You could do some load testing before and after to see how this method affects your app.
From the database server side
First of all let me say that I don't think deadlocks on the database server can be entirely prevented, just minimized and handled appropriately. I found this reference on TechNet for you - Minimizing Deadlocks. The first sentence from that page:
Although deadlocks cannot be completely avoided, following certain coding conventions can minimize the chance of generating a deadlock.
Here are the key points from that reference. They go into a bit more detail about each topic so please read the original source.
Minimizing deadlocks can increase transaction throughput and reduce system overhead because fewer transactions are:
Rolled back, undoing all the work performed by the transaction.
Resubmitted by applications because they were rolled back when deadlocked.
To help minimize deadlocks:
Access objects in the same order.
Avoid user interaction in transactions.
Keep transactions short and in one batch.
Use a lower isolation level.
Use a row versioning-based isolation level.
Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
Use snapshot isolation.
Use bound connections.
The "row versioning-based isolation level" may answer your question Or is there a way to ask SQL Server to lock a Row instead of a Table?. There are some notes mentioned in the original source regarding this option.
Here are some other references that came up during my search:
Avoiding deadlock by using NOLOCK hint
How to avoid sql deadlock?
Tips to avoid deadlocks? - This one mentions being careful when using the NOLOCK hint.
The Difficulty with Deadlocks
Using Row Versioning-based Isolation Levels

Huge transaction in Sql Server, are there any problems?

I have a program which does many bulk operations on an SQL Server 2005 or 2008 database (drops and creates indexes, creates columns, full table updates etc), all in one transaction.
Are there any problems to be expected?
I know that the transaction log expands even in Simple recovery mode.
This program is not executed during normal operation of the system, so locking and concurrency is not an issue.
Are there other reasons to split the transaction into smaller steps?
In short,
Using smaller transactions provides more robust recovery from failure.
Long transactions may also unnecessarily hold locks on objects for extended periods of time that other processes may require access to i.e. blocking.
Consider that if at any point between the time the transaction started and finished, your server experienced a failure, in order to be bring the database online SQL Server would have to perform the crash recovery process which would involve rolling back all uncommitted transactions from the log.
Supposing you developed a data processing solution that is intelligent enough to pick up from where it left off. By using a single transaction this would not be an option available to you because you would need to start the process from the begging once again.
If the transaction causes too many database log entries (updates) the log can hit what is known as the "high water mark". It's the point at which the log reaches (about) half of its absolute maximum size, when it must then commence rolling back all updates (which will consume about the same amount of disk as it took to do the updates.
Not rolling back at this point would mean risking eventually reaching the maximum log size and still not finishing the transaction or hitting a rollback command, at which point the database is screwed because there's not enough log space to rollback.
It isn't really a problem until you run out of disk space, but you'll find that rollback will take a long time. I'm not saying to plan for failure of course.
However, consider the process not the transaction log as such. I'd consider separating:
DDL into a separate transaction
Bulk load staging tables with a transaction
Flush data from staging to final table in another transaction
If something goes wrong I'd hope that you have rollback scripts and/or a backup.
Is there really a need to do everything atomically?
Depending on the complexity of your update statements, I'd recommend to do this only on small tables of, say, a few 100 rows. Especially if you have only a small amount of main memory available. Otherwise, for instance, updates on big tables can take a very long time and even appear to hang. Then it's difficult to figure out what the process (spid) is doing and how long it might take.
I'm not sure whether "Drop index" is transaction-logged operation anyway. See this question here on stackoverflow.com.

Concurrency issues

Here's my situation (SQL Server):
I have a web application that utilizes nHibernate for data access, and another 3 desktop applications. All access the same database, and are likely to utilize the same tables at any one time.
Now, with the help of NH I'm batching selects in order to load an aggregate with all of its hierarchy - so I would see 4 to maybe 7 selects being issued at once (not sure if it matters).
Every few days one of the applications will get a : "Transaction has been chosen as the deadlock victim." (this usually appears on a select)
I tried changing to snapshot isolation on the database , but that didn't helped - I was ending up with :
Snapshot isolation transaction aborted
due to update conflict. You cannot use
snapshot isolation to access table
'...' directly or indirectly in
database '...' to update,
delete, or insert the row that has
been modified or deleted by another
transaction. Retry the transaction or
change the isolation level for the
update/delete statement.
What suggestions to you have for this situation ? What should I try, or what should I read in order to find a solution ?
EDIT:
Actually there's no raid in there :). The number of users per day is small (I'll say 100 per day - with hundreds of small orders on a busy day), the database is a bit bigger at about 2GB and growing faster every day.
It's a business app, that handles orders, emails, reports, invoices and stuff like that.
Lazy loading would not be an option in this case.
I guess taking a very close looks at those indexes is my best bet.
Deadlocks are complicated. A deadlock means that at least two sessions have locks and are waiting for one another to release a different lock; since both are waiting, the locks never get released, neither session can continue, and a deadlock occurs.
In other words, A has lock X, B has lock Y, now A wants Y and B wants X. Neither will give up the lock they have until they are finished with their transaction. Both will wait indefinitely until they get the other lock. SQL Server sees that this is happening and kills one of the transactions in order to prevent the deadlock. Snapshot isolation won't help you - the DB still needs to preserve atomicity of transactions.
There is no simple answer anyone can give as to why a deadlock would be occurring. You'll need to profile your application to find out.
Start here: How to debug SQL deadlocks. That's a good intro.
Next, look at Detecting and Ending Deadlocks on MSDN. That will give you a lot of good background information on why deadlocks occur, and help you understand what you're looking at/for.
There are also some previous SO questions that you might want to look at:
Diagnosing Deadlocks in SQL Server 2005
Zero SQL deadlock by design
Or, if the deadlocks are very infrequent, just write some exception-handling code into your application to retry the transaction if a deadlock occurs. Sometimes it can be extremely hard (if not nearly impossible) to prevent certain deadlocks. As long as you write transactionally-safe code, it's not the end of the world; it's completely safe to just try the transaction again.
Is your hardware properly configured (specifically RAID configuration)? Is it capable of matching your workload?
If hardware is all good and humming, you should ensure you have the 'right' indexes to match your query workload.
Many locking/deadlock problems can be eliminated with the correct indexes (covering indexes can take pressure off the clustered index during inserts).
BTW: turning on snapshot isolation will put increased pressure on your tempDB. How is tempDB configured? RAID 0 is preferred (and even better use an SSD if tempDB is a bottleneck).
While it's not uncommon to find this error in NHibernate sessions with large numbers of users, it seems to be happening too often in your case.
Perhaps your objects are very large resulting in long-running selects? And if your selects are taking too long, that might indicate problems with your indexes (as Mitch Wheat explains)
If everything is in order, you could also try Lazy Loading to postpone your selects until when you really need your data. This might not be appropriate for your exact situation so you do have to see if it works.

Resources