I guess the real question is:
If I don't care about dirty reads, will adding the with (NOLOCK) hint to a SELECT statement affect the performance of:
the current SELECT statement
other transactions against the given table
Example:
SELECT *
FROM aTable WITH (NOLOCK)
1) Yes, a select with NOLOCK will complete faster than a normal select.
2) Yes, a select with NOLOCK will allow other queries against the affected table to complete faster than a normal select.
Why would this be?
NOLOCK typically (depending on your DB engine) means give me your data, and I don't care what state it is in, and don't bother holding it still while you read from it. It is all at once faster, less resource-intensive, and very very dangerous.
Be warned: never perform an update based on, or do anything system-critical or requiring absolute correctness with, data that originated from a NOLOCK read. It is absolutely possible that this data contains rows that were deleted during the query's run, or that have been deleted in other sessions that have yet to be finalized. It is possible that this data includes rows that have been partially updated. It is possible that this data contains records that violate foreign key constraints. It is possible that this data excludes rows that have been added to the table but have yet to be committed.
You really have no way to know what the state of the data is.
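To make the risk concrete, here is a minimal sketch of a dirty read; the dbo.Accounts table, its columns, and the values are hypothetical:

-- Session 1: makes a change but does not commit yet
BEGIN TRAN;
UPDATE dbo.Accounts SET Balance = 0 WHERE AccountID = 1;

-- Session 2: a NOLOCK read sees the uncommitted value
SELECT Balance
FROM dbo.Accounts WITH (NOLOCK)
WHERE AccountID = 1;   -- returns 0, a value that may never "exist"

-- Session 1: rolls back; Session 2 has read data that was never committed
ROLLBACK TRAN;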
If you're trying to get things like a Row Count or other summary data where some margin of error is acceptable, then NOLOCK is a good way to boost performance for these queries and avoid having them negatively impact database performance.
Always use the NOLOCK hint with great caution and treat any data it returns suspiciously.
NOLOCK makes most SELECT statements faster, because of the lack of shared locks. Also, because no locks are issued, writers will not be impeded by your SELECT.
NOLOCK is functionally equivalent to an isolation level of READ UNCOMMITTED. The main difference is that you can use NOLOCK on some tables but not others, if you choose. If you plan to use NOLOCK on all tables in a complex query, then using SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED is easier, because you don't have to apply the hint to every table.
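For illustration, here is a sketch of the two approaches; dbo.Orders and dbo.Customers are hypothetical tables:

-- Per-table: only Orders is read without shared locks
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o WITH (NOLOCK)
INNER JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;

-- Per-connection: every table in every subsequent query is read uncommitted
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;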
Here is information about all of the isolation levels at your disposal, as well as table hints.
SET TRANSACTION ISOLATION LEVEL
Table Hint (Transact-SQL)
In addition to what is said above, you should be very aware that NOLOCK actually imposes the risk of you not getting rows that have been committed before your select started.
See http://blogs.msdn.com/sqlcat/archive/2007/02/01/previously-committed-rows-might-be-missed-if-nolock-hint-is-used.aspx
It will be faster because it doesn't have to wait for locks.
The answer is yes if the query is run multiple times at once, because each transaction won't need to wait for the others to complete. However, if the query is run once on its own, then the answer is no.
Yes. There's a significant probability that careful use of WITH(NOLOCK) will speed up your database overall. It means that other transactions won't have to wait for this SELECT statement to finish, but on the other hand, other transactions will slow down as they're now sharing their processing time with a new transaction.
Be careful to only use WITH (NOLOCK) in SELECT statements on tables that have a clustered index.
WITH(NOLOCK) is often exploited as a magic way to speed up database read transactions.
The result set can contain rows that have not yet been committed, that are often later rolled back.
If WITH(NOLOCK) is applied to a table that has a non-clustered index then row-indexes can be changed by other transactions as the row data is being streamed into the result-table. This means that the result-set can be missing rows or display the same row multiple times.
READ UNCOMMITTED adds an additional issue: data within a single column can be read in a half-written state when multiple users change the same cell simultaneously.
Suppose I have a T-SQL statement like so:
BEGIN TRAN
UPDATE dbo.TableA
...
...
...
DELETE FROM dbo.TableB
COMMIT TRAN
Suppose that the update on TableA is going to take some time.
By default, would SQL Server lock TableB until the transaction is completed? Would that mean you can't read or write to it while the update is ongoing?
Short answer: NO and NO.
Long answer:
This is, in fact, a great question, as it goes deep into transaction concepts and how the engine works, but a complete answer could fill a good chapter of a book and is out of the scope of this site.
First, keep in mind the engine can work in several isolation modes: snapshot, read committed, etc. Researching this topic properly is worthwhile (it can take a few days).
Second, the engine locks at several levels of granularity and will try to use the "smallest" one, but it can escalate on demand, depending on many factors, for example: "will this operation need a page split?"
Third, BEGIN, COMMIT, and ROLLBACK work more like "semaphores", flagging how changes are phased from "memory" to "disk". It's a lot more complicated than that, which is why I use quotes.
That said, a "default transaction" will use row granularity in read committed isolation mode. Nothing guarantees how locks will be issued one way or another.
It depends on stuff like foreign keys, triggers, how much of the table is being changed, etc.
TLDR: It depends on a lot of minor details particular to your scenario. The best way to find out is by testing.
Following the comments of @Jeroen Mostert, @marc_s, and @Cato under the question, your locks on TableA and TableB here are likely to escalate to exclusive table locks, since there is no WHERE clause. If so, other read and write operations from different connections may be affected, depending on their transaction isolation level, until the end of this transaction.
Besides, locks are acquired on demand: the query first locks TableA, and only after the UPDATE has executed does it lock TableB.
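A rough way to observe this on-demand behavior from a second connection (the column in the UPDATE is hypothetical; sys.dm_tran_locks is the standard DMV for inspecting held locks):

-- Session 1: run the first statement of the transaction, then pause
BEGIN TRAN;
UPDATE dbo.TableA SET SomeColumn = SomeColumn;

-- Session 2: check which objects are locked so far;
-- TableB will not appear until the DELETE actually runs
SELECT resource_type,
       request_mode,
       request_status,
       OBJECT_NAME(resource_associated_entity_id) AS locked_object
FROM sys.dm_tran_locks
WHERE resource_type = 'OBJECT';

-- Session 1: only now does TableB get locked
DELETE FROM dbo.TableB;
COMMIT TRAN;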
I had a question similar to the one above. We are using WITH (NOLOCK) throughout the application. In some cases I need a SELECT to run faster, whatever the effect.
So which will be faster: a select with (TABLOCKX), or with (NOLOCK)?
To answer your question, the with (nolock) table hint will be faster.
NOLOCK typically (depending on your DB engine) means give me your data, and I don't care what state it is in, and don't bother holding it still while you read from it. It is all at once faster, less resource-intensive, and very very dangerous.
As explained very well here: NoLock
NOLOCK means your read takes no shared locks and can read rows that other sessions have locked. But you can still have to wait on other locks (for example, schema-modification locks).
TABLOCKX means you block the whole table with an exclusive lock: other sessions cannot take locks on it, and once you hold it you cannot be blocked. TABLOCKX is mostly used for rapid inserts.
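For example, a bulk-load sketch; dbo.StagingTable, dbo.SourceTable, and the columns are hypothetical:

-- Take an exclusive table lock up front so the load is never blocked mid-way
INSERT INTO dbo.StagingTable WITH (TABLOCKX) (Col1, Col2)
SELECT Col1, Col2
FROM dbo.SourceTable;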
Avoid using NOLOCK everywhere. Try to avoid holding exclusive locks for long periods, or try to minimize your blocking, and then you don't need NOLOCK.
Could someone give me some guidance on when I should use WITH (NOLOCK) as opposed to SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED?
What are the pros/cons of each? Are there any unintended consequences you've run into using one as opposed to the other?
They are the same thing. If you use the set transaction isolation level statement, it will apply to all the tables in the connection, so if you only want a nolock on one or two tables use that; otherwise use the other.
Both will give you dirty reads. If you are okay with that, then use them. If you can't have dirty reads, then consider snapshot or serializable hints instead.
WITH (NOLOCK) is a hint at the table level. Setting the transaction isolation level to READ UNCOMMITTED will affect the whole connection. The difference is in terms of scope. See READUNCOMMITTED and NOLOCK in the SQL Server documentation here:
http://technet.microsoft.com/en-us/library/ms187373.aspx
For TRANSACTION ISOLATION LEVEL:
http://technet.microsoft.com/en-us/library/ms173763.aspx
NOLOCK is local to the table (or views etc)
READ UNCOMMITTED is per session/connection
As for guidelines... a random search from StackOverflow and the electric interweb...
Is the NOLOCK (Sql Server hint) bad practice?
When is it appropriate to use NOLOCK?
Get rid of those NOLOCK hints…
To my knowledge the only difference is the scope of the effects as Strommy said. NOLOCK hint on a table and the READ UNCOMMITTED on the session.
As to problems that can occur, it's all about consistency. If you care, then be aware that you could get what are called dirty reads, which could lead to other data being manipulated based on incorrect information.
I personally don't think I have seen any problems from this, but that may be due to how I use NOLOCK. You need to be aware that there are scenarios where it is OK to use: for example, where you are mostly adding new data to a table and have another process that comes along behind to check for a particular data condition. That will probably be OK, since the main flow doesn't involve going back and updating rows during a read.
Also I believe that these days you should look into Multi-version Concurrency Control. I believe they added it in 2005 and it helps stop the writers from blocking readers by giving readers a snapshot of the database to use. I'll include a link and leave further research to the reader:
MVCC
Database Isolation Levels
You cannot use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED in a view (a view can contain only a single SELECT statement, in fact), so you have to use (NOLOCK) if dirty rows should be included.
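A sketch of what that looks like; the view, table, and column names are made up:

-- The hint has to go inline, because a view cannot contain
-- a SET TRANSACTION ISOLATION LEVEL statement
CREATE VIEW dbo.vwPendingOrders
AS
SELECT OrderID, Status
FROM dbo.Orders WITH (NOLOCK)
WHERE Status = 'Pending';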
As you have to write WITH (NOLOCK) for each table, it can be annoying to add it to every FROM and JOIN clause. However, there is a reason it is called a "dirty" read. You really should know when you are doing one, and not set it as the default for the session scope. Why?
Forgetting a WITH (NOLOCK) might not affect your program in a very dramatic way; doing a dirty read where you do not want one, however, can make all the difference in certain circumstances.
So use WITH (NOLOCK) if the data currently being selected is allowed to be incorrect, as it might be rolled back later. It is mostly used when you want to increase performance, and the requirements of your application context allow you to take the risk that inconsistent data is displayed. However, you or someone in charge has to weigh up the pros and cons of the decision to use WITH (NOLOCK).
I have a stored procedure that performs a join of TableB to TableA:
SELECT <--- Nested <--- TableA
            Loop   <--
                     |
                     ---TableB
At the same time, in a transaction, rows are inserted into TableA, and then into TableB.
This situation is occasionally causing deadlocks, as the stored procedure select grabs rows from TableB, while the insert adds rows to TableA, and then each wants the other to let go of the other table:
INSERT       SELECT
=========    =========
Lock A       Lock B
Insert A     Select B
Want B       Want A
    ....deadlock....
Logic requires the INSERT to first add rows to A, and then to B, while I personally don't care about the order in which SQL Server performs its join, as long as it joins.
The common recommendation for fixing deadlocks is to ensure that everyone accesses resources in the same order. But in this case SQL Server's optimizer is telling me that the opposite order is "better". I can force another join order, and have a worse-performing query.
But should I?
Should I override the optimizer, now and forever, with a join order that I want it to use?
Or should I just trap native error 1205, and resubmit the select statement?
The question isn't how much worse the query might perform when I override the optimizer and force it to do something non-optimal. The question is: is it better to automatically retry than to run worse queries?
It is better to automatically retry after deadlocks. The reason is that you may fix this deadlock, only to hit another one later. The behavior may change between SQL releases, if the size of the tables changes, if the server hardware specification changes, and even if the load on the server changes. If the deadlock is frequent, you should take active steps to eliminate it (an index is usually the answer), but for rare deadlocks (say, every 10 minutes or so), a retry in the application can mask them. You can retry reads or writes, since the writes are, of course, surrounded by proper BEGIN TRANSACTION/COMMIT TRANSACTION to keep all write operations atomic, and hence able to be retried without problems.
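A minimal retry sketch in T-SQL, assuming the work inside the transaction is safe to re-run (THROW requires SQL Server 2012 or later):

DECLARE @retries INT = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRAN;
        -- ... the INSERT or SELECT work goes here ...
        COMMIT TRAN;
        BREAK;  -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRAN;
        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries -= 1;   -- deadlock victim: try again
        ELSE
            THROW;               -- anything else, or out of retries: re-raise
    END CATCH;
END;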
Another avenue to consider is turning on read committed snapshot. When this is enabled, SELECT will simply not take any locks, yet yield consistent reads.
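Enabling it is a one-time, database-level setting; MyDatabase is a placeholder, and the ROLLBACK IMMEDIATE clause is there because the ALTER needs exclusive access to the database:

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;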
To avoid deadlocks, one of the most common recommendations is "to acquire locks in the same order" or "access objects in the same order". Clearly this makes perfect sense, but is it always feasible? Is it always possible? I keep encountering cases when I cannot follow this advice.
If I store an object in one parent table and one or more child ones, I cannot follow this advice at all. When inserting, I need to insert my parent row first. When deleting, I have to do it in the opposite order.
If I use commands that touch multiple tables or multiple rows in one table, then usually I have no control in which order locks are acquired, (assuming that I am not using hints).
So, in many cases trying to acquire locks in the same order does not prevent all deadlocks. So, we need some kind of handling deadlocks anyway - we cannot assume that we can eliminate them all. Unless, of course, we serialize all access using Service Broker or sp_getapplock.
When we retry after deadlocks, we are very likely to overwrite other processes' changes; we need to be aware that someone else has very likely modified the data we intended to modify. In particular, if all the readers run under snapshot isolation, readers cannot be involved in deadlocks, which means that all the parties involved in a deadlock are writers that modified, or attempted to modify, the same data. If we just catch the exception and automatically retry, we can overwrite someone else's changes.
This is called a lost update, and it is usually wrong. Typically the right thing to do after a deadlock is to retry at a much higher level: re-select the data and decide whether to save, in the same way the original decision to save was made.
For example, if a user pushed a Save button and the saving transaction was chosen as a deadlock victim, it might be a good idea to re-display the data on the screen as of after the deadlock.
Trapping and rerunning can work, but are you sure that the SELECT is always the deadlock victim? If the insert is the deadlock victim, you'll have to be much more careful about retrying.
The easiest solution in this case, I think, is to NOLOCK or READUNCOMMITTED (same thing) your select. People have justifiable concerns about dirty reads, but we've run NOLOCK all over the place for higher concurrency for years and have never had a problem.
I'd also do a little more research into lock semantics. For example, I believe if you set transaction isolation level to snapshot (requires 2005 or later) your problems go away.
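A sketch of what opting in to snapshot isolation looks like; MyDatabase is a placeholder:

-- One-time, database-level setting
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then each session that wants it opts in
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;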
I want to make almost 1,000 inserts into a table every second. And each day I want to query all the inserted rows just once, all together. I also want to improve efficiency through multi-threading and connection pooling. But I want to know which level of concurrency control is most suitable for me. The list of options for SQL Server is on the MSDN site.
Thank you.
You should be OK with the default isolation level for inserts. Do you have a clustered index? If so, ensure that it doesn't fragment as you insert new rows; typically a GUID would be a bad candidate for a clustered index. Also, if you have Enterprise Edition and you are able to identify a partitioning column in your table (for example region or city), you might want to partition the table on that column and store the partitions on different filegroups. This way you might avoid IO contention.
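For example, an ever-increasing key keeps inserts appending to the end of the clustered index instead of fragmenting it; the table below is hypothetical:

CREATE TABLE dbo.Readings
(
    ReadingID BIGINT IDENTITY(1,1) NOT NULL,
    Region    VARCHAR(32)  NOT NULL,   -- a candidate partitioning column
    Payload   VARCHAR(256) NOT NULL,
    CONSTRAINT PK_Readings PRIMARY KEY CLUSTERED (ReadingID)
);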
If you select all the data once a day and would like to maintain insert speed during the select without too much locking, you might consider creating a database snapshot (again, Enterprise Edition) and selecting from it. If you can live with dirty reads, you might add a with(nolock) hint to your select.
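A sketch of the snapshot approach; all names and the file path are placeholders:

-- Create a point-in-time, read-only snapshot of the database
CREATE DATABASE MyDatabase_Daily
ON (NAME = MyDatabase_Data,                        -- logical name of the source data file
    FILENAME = 'C:\Snapshots\MyDatabase_Daily.ss')
AS SNAPSHOT OF MyDatabase;

-- The once-a-day query reads from the snapshot,
-- so it never blocks the live inserts
SELECT *
FROM MyDatabase_Daily.dbo.Readings;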
You might be barking up the wrong tree. Have a look into using row-versioning transaction isolation instead of supplying lock hints for individual statements.
A lot of people I talk to have had good results using READ COMMITTED SNAPSHOT, which can be enabled at the database level and requires no code change.
I can say that SNAPSHOT has served me well in the past, but it does require a code change.
And a word of warning, be sure that your tempdb throughput is good, as row-versioning increases the load on tempdb significantly.