I start with a simple question:
According to the definition of a Dirty Read on Wikipedia and MSDN:
we have 2 concurrent transactions, T1 and T2.
A Dirty Read occurs when T1 is updating a row and T2 reads that row while it is "not committed yet" by T1.
But at the Read Committed level, shared locks are released as soon as the data is read (not at the end of the transaction, or even the end of the statement).
Then how does Read Committed prevent Dirty Reads?
Because as soon as the shared lock on the updated row is released, T2 can read the updated row, and T1 can then roll back the whole operation; at that point T2 has performed a dirty read.
It prevents the dirty read because T1 holds an exclusive lock on the row it is updating, so T2 can't read the "not yet committed" row that could be rolled back later.
The problem Read Committed tries to resolve is:
T1 creates a transaction and writes something
T2 reads that something
T1 rolls back the transaction
now T2 has data that never really existed.
Depending on how the DB is structured, there are two "good" possibilities:
T1 creates a transaction and writes something
T2 waits for T1 to end the transaction
or
T2 reads a "snapshot" of how the DB was BEFORE T1 began the transaction (it's called Read committed using row versioning)
(the default on MSSQL is the first option)
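For the second option, here is a minimal sketch of enabling row versioning (assuming a database named MyDb; the ALTER needs exclusive access, so run it when no other connections are open or add WITH ROLLBACK IMMEDIATE):

ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;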
Here, for example, is a comparison of the various isolation levels: http://msdn.microsoft.com/en-us/library/ms345124(SQL.90).aspx (read under "Isolation Levels Offered in SQL Server 2005")
When SQL Server executes a statement at the read committed isolation level, it acquires short lived share locks on a row by row basis. The duration of these share locks is just long enough to read and process each row; the server generally releases each lock before proceeding to the next row. Thus, if you run a simple select statement under read committed and check for locks (e.g., with sys.dm_tran_locks), you will typically see at most a single row lock at a time. The sole purpose of these locks is to ensure that the statement only reads and returns committed data. The locks work because updates always acquire an exclusive lock which blocks any readers trying to acquire a share lock.
Ripped from here
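A minimal sketch of watching these short-lived locks yourself (the table name is an assumption; the session id filter is a placeholder for the scanning session's @@SPID):

-- Session 1: run a scan under the default READ COMMITTED level
SELECT * FROM dbo.T;

-- Session 2: while the scan is in flight, inspect session 1's locks;
-- you will typically see at most one row-level S lock at a time
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 51;  -- placeholder: session 1's id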
Related
Suppose we have 2 transactions, T1 and T2, with T2 doing a dirty read on data modified by T1 and committing before T1. Now suppose T1 fails and is rolled back. My question is: since T2 is committed, are the changes made by T2 transferred from the shared buffer to the database on disk or not (I have read that changes made by a transaction are made permanent once the transaction commits)? And if they are transferred, how will T1 roll back and restore the previous value of the data item (which T2 read dirty)? From its buffer, or from the database?
Unless you are specifically allowing dirty reads by setting the isolation level, this sort of problem simply cannot happen. That is the whole idea of a transaction: T2 will be locked out of the row if T1 has updated it. If you are allowing dirty reads with set transaction isolation level read uncommitted, then the handling of the data is up to you, usually by using a rowversion column that T2 checks hasn't changed before it commits.
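A minimal sketch of that rowversion pattern (table and column names are assumptions, not from the original answer):

-- Table with a rowversion column that readers can check
CREATE TABLE dbo.Account (Id int PRIMARY KEY, Amt int NOT NULL, Rv rowversion);

-- T2 reads the row and remembers its version
DECLARE @Rv binary(8), @Amt int;
SELECT @Rv = Rv, @Amt = Amt FROM dbo.Account WHERE Id = 1;

-- ... work ...

-- The write succeeds only if the row is unchanged since it was read
UPDATE dbo.Account SET Amt = @Amt + 10 WHERE Id = 1 AND Rv = @Rv;
IF @@ROWCOUNT = 0
    RAISERROR('Row changed since it was read; retry.', 16, 1);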
I've read on Microsoft's site
http://msdn.microsoft.com/en-us/library/ms173763.aspx
that SQL Server doesn't request locks when reading data, except when a database is being recovered.
Does it mean that SQL Server using READ_COMMITTED_SNAPSHOT/SNAPSHOT ISOLATION doesn't use shared locks at all?
How is that possible?
For example, if there are 2 transactions.
First transaction T1 wants to update some row.
Second transaction T2 starts reading the same row (this transaction is copying it to some output buffer, response buffer, or whatever it's called in SQL Server).
At the same time, transaction T1 starts updating that row (it creates the versioned row first).
Isn't there a possibility that transaction T2 will read uncommitted data?
Remember, transaction T2 started copying that row before T1 made the update, so there is no exclusive lock on that row.
Is this situation even possible, and how can it be avoided without taking a shared lock on that row while copying its data?
Besides logical locks there are also physical latches that protect the database structures (in this example, pages). Latches protect against any change (modification of bits), regardless of isolation level. So even if the reader T2 does not acquire locks, it still needs to acquire a shared latch on the pages it reads; otherwise it would be a victim of low-level concurrent modifications to the very structures it is reading. T1 can modify the page containing the rows it updates only if it obtains an exclusive latch on that page. Thus T2 can only see the image of the row either from before T1 modified it (and therefore the row is the committed one T2 wants) or from after T1 completed its modifications (and then T2 has to look up the previous row image in the version store).
The latching protocol must be honored by all isolation levels, including read uncommitted and versioned reads (i.e., snapshot and friends).
Does it mean that Sql Server using READ_COMMITTED_SNAPSHOT/SNAPSHOT ISOLATION doesn't use shared locks at all? How is that possible?
It is possible because SQL Server is reading from a SNAPSHOT, which is not going to change at all. It is already frozen at the state the database was in when the snapshot was taken (at the start of the statement under READ_COMMITTED_SNAPSHOT, or at the start of the transaction under SNAPSHOT isolation), disregarding uncommitted transactions from other processes. SQL Server does this by keeping a row-versioned copy of the record in tempdb for transactions to refer to, letting the current in-progress data/index page(s) be changed.
Isn't there a possibility that transaction T2 will read uncommited data? Remember, transaction T2 started copying that row before T1 made update, so there is no exclusive lock on that row.
The above narrative explains this already. But to illustrate (simplified):
Scenario 1:
T1: begin tran (implicit/explicit)
T1: read value (4)
T2: read value (4) -- *
T1: update value to (8)
* - This is the committed value at the time the T2 transaction started
Scenario 2:
T1: begin tran (implicit/explicit)
T1: read value (4)
T1: update value to (8)
version of the row with the value (4) is made
T2: read value (4) -- * from the versioned row
T1: commit
* - (4) is [still] the *committed* value at the time the T2 transaction started
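The same scenario as runnable T-SQL, a minimal sketch (table, column, and value names are assumptions; the database must have READ_COMMITTED_SNAPSHOT or SNAPSHOT isolation enabled):

-- Session T1:
BEGIN TRAN;
UPDATE dbo.T SET value = 8 WHERE id = 1;  -- the old image (4) is copied to the version store

-- Session T2, while T1 is still open (T2 does not block):
SELECT value FROM dbo.T WHERE id = 1;     -- returns 4, read from the versioned row

-- Session T1:
COMMIT;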
I am trying to understand isolation/locks in SQL Server.
I have the following scenario at the READ COMMITTED isolation level (the default).
We have a table.
create table Transactions(Tid int,amt int)
with some records
insert into Transactions values(1, 100)
insert into Transactions values(2, -50)
insert into Transactions values(3, 100)
insert into Transactions values(4, -100)
insert into Transactions values(5, 200)
Now from MSDN I understood:
When a SELECT is fired, a shared lock is taken so no other transaction can modify the data (avoiding dirty reads). The documentation also talks about row-level, page-level, and table-level locks. I thought of the following scenario:
Begin Transaction
select * from Transactions
/*
some business logic which takes 5 minutes
*/
Commit
What I want to understand is for what duration of time the shared lock is held, and at which granularity (row, page, table).
Will the lock be acquired only while the statement select * from Transactions runs, or will it be held for the whole 5+ minutes until we reach COMMIT?
You are asking the wrong question; you are concerned with the implementation details. What you should think of, and be concerned with, are the semantics of the isolation level. Kendra Little has a nice poster explaining them: Free Poster! Guide to SQL Server Isolation Levels.
Your question should be rephrased like:
select * from Items
Q: What Items will I see?
A: All committed Items
Q: What happens if there are uncommitted transactions that have inserted/deleted/updated Items?
A: Your SELECT will block until all uncommitted Items are committed (or rolled back).
Q: What happens if new Items are inserted/deleted/updated while I run the query above?
A: The results are undetermined. You may see some of the modifications, won't see some others, and may possibly block until some of them commit.
READ COMMITTED makes no promise once your statement has finished, regardless of the length of the transaction. If you run the statement again you get exactly the same semantics as stated before, and the Items you saw before may change, disappear, or new ones may appear. Obviously this implies that changes can be made to Items after your select.
Higher isolation levels give stronger guarantees: REPEATABLE READ guarantees that no Item you selected the first time can be modified or deleted until you commit. SERIALIZABLE adds the guarantee that no new Item can appear in your second select before you commit.
This is what you need to understand, not how the implementation mechanism works. After you master these concepts, you may ask about the implementation details. They are all described in Transaction Processing: Concepts and Techniques.
Your question is a good one. Understanding what kinds of locks are acquired allows a deep understanding of DBMSs. In SQL Server, under all isolation levels (Read Uncommitted, Read Committed (default), Repeatable Read, Serializable) Exclusive Locks are acquired for write operations.
Exclusive locks are released when transaction ends, regardless of the isolation level.
The difference between the isolation levels refers to the way in which Shared (Read) Locks are acquired/released.
Under Read Uncommitted isolation level, no Shared locks are acquired. Under this isolation level the concurrency issue known as "Dirty Reads" (a transaction is allowed to read data from a row that has been modified by another running transaction and not yet committed, so it could be rolled back) can occur.
Under Read Committed isolation level, Shared Locks are acquired for the concerned records. The Shared Locks are released when the current instruction ends. This isolation level prevents "Dirty Reads" but, since the record can be updated by other concurrent transactions, "Non-Repeatable Reads" (transaction A retrieves a row, transaction B subsequently updates the row, and transaction A later retrieves the same row again. Transaction A retrieves the same row twice but sees different data) or "Phantom Reads" (in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first) can occur.
Under Repeatable Reads isolation level, Shared Locks are acquired for the transaction duration. "Dirty Reads" and "Non-Repeatable Reads" are prevented but "Phantom Reads" can still occur.
Under Serializable isolation level, ranged Shared Locks are acquired for the transaction duration. None of the above mentioned concurrency issues occur but performance is drastically reduced and there is the risk of Deadlocks occurrence.
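A minimal sketch of the difference in shared-lock duration, using the Transactions table from the question:

-- Under READ COMMITTED the S lock is released as soon as the row is read:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRAN;
SELECT amt FROM Transactions WHERE Tid = 1;
-- another session can UPDATE Tid = 1 here, before our COMMIT
COMMIT;

-- Under REPEATABLE READ the S lock is held until the transaction ends:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRAN;
SELECT amt FROM Transactions WHERE Tid = 1;
-- another session's UPDATE of Tid = 1 now blocks until our COMMIT
COMMIT;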
The lock will only be acquired while select * from Transactions runs.
You can check it with the code below.
Open a SQL session and run this query:
Begin Transaction
select * from Transactions
WAITFOR DELAY '00:05'
/*
some business logic which takes 5 minutes
*/
Commit
Open another SQL session and run the query below:
Begin Transaction
Update Transactions
Set amt = ...
where ....
commit
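If the answer above is right, the second session's UPDATE completes immediately even though the first transaction is still open. A minimal sketch for verifying from a third session that nothing is blocked (sys.dm_exec_requests should return no rows for these sessions):

SELECT session_id, blocking_session_id, wait_type
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;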
First, locks are only acquired when a statement runs.
Your workload separates into two pieces; to simplify, suppose they are:
select * from Transactions
update Transactions set amt = xxx where Tid = xxx
When/what locks are held/released at the READ COMMITTED isolation level?
When select * from Transactions runs, shared locks are acquired row by row and released almost immediately, so in practice you observe no lasting lock.
The following update Transactions set amt = xxx where Tid = xxx adds an X lock on the updated keys and IX locks on the page/table.
All of these exclusive locks are released only after commit/rollback; none are released while the transaction is running.
What is the difference between TABLOCK and TABLOCKX?
http://msdn.microsoft.com/en-us/library/ms187373.aspx states that TABLOCK is a shared lock while TABLOCKX is an exclusive lock. Is the first maybe only an index lock of sorts? And what is the concept of sharing a lock?
Big difference, TABLOCK will try to grab "shared" locks, and TABLOCKX exclusive locks.
If you are in a transaction and you grab an exclusive lock on a table, e.g.:
SELECT 1 FROM MyTable WITH (TABLOCKX)
No other processes will be able to grab any locks on the table, meaning all queries attempting to talk to the table will be blocked until the transaction commits.
TABLOCK only grabs a shared lock, shared locks are released after a statement is executed if your transaction isolation is READ COMMITTED (default). If your isolation level is higher, for example: SERIALIZABLE, shared locks are held until the end of a transaction.
Shared locks are, hmmm, shared. Meaning 2 transactions can both read data from the table at the same time if they both hold a S or IS lock on the table (via TABLOCK). However, if transaction A holds a shared lock on a table, transaction B will not be able to grab an exclusive lock until all shared locks are released. Read about which locks are compatible with which at msdn.
Both hints cause the db to bypass taking more granular locks (like row or page level locks). In principle, more granular locks allow better concurrency. So, for example, one transaction could be updating row 100 in your table and another row 1000, at the same time from two transactions (it gets tricky with page locks, but let's skip that).
In general, granular locks are what you want, but sometimes you may want to reduce db concurrency to increase the performance of a particular operation and eliminate the chance of deadlocks.
In general you would not use TABLOCK or TABLOCKX unless you absolutely needed it for some edge case.
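One such edge case is a bulk load, where a single table lock is cheaper than thousands of row locks (and, with the right recovery model, enables minimal logging). A minimal sketch, with assumed table names:

INSERT INTO dbo.Staging WITH (TABLOCK) (Tid, amt)
SELECT Tid, amt FROM dbo.Transactions;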
Quite an old article on mssqlcity attempts to explain the types of locks:
Shared locks are used for operations that do not change or update data, such as a SELECT statement.
Update locks are used when SQL Server intends to modify a page, and later promotes the update page lock to an exclusive page lock before actually making the changes.
Exclusive locks are used for the data modification operations, such as UPDATE, INSERT, or DELETE.
What it doesn't discuss are Intent locks (basically a modifier for these lock types). Intent (Shared/Exclusive) locks are held at a higher level than the real lock. So, for instance, if your transaction has an X lock on a row, it will also have an IX lock at the table level, which stops other transactions from attempting to obtain an incompatible lock at a higher level on the table (e.g., a schema modification lock) until your transaction completes or rolls back.
The concept of "sharing" a lock is quite straightforward - multiple transactions can have a Shared lock for the same resource, whereas only a single transaction may have an Exclusive lock, and an Exclusive lock precludes any transaction from obtaining or holding a Shared lock.
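A minimal sketch of seeing that intent-lock hierarchy for yourself (reusing the Transactions table from the earlier question):

BEGIN TRAN;
UPDATE Transactions SET amt = amt + 1 WHERE Tid = 1;

-- typically shows a row-level X lock plus IX locks on the page and table (OBJECT)
SELECT resource_type, request_mode
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;

ROLLBACK;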
This is more of an example where TABLOCK did not work for me and TABLOCKX did.
I have 2 sessions, that both use the default (READ COMMITTED) isolation level:
Session 1 is an explicit transaction that will copy data from a linked server to a set of tables in a database, and takes a few seconds to run. [Example, it deletes Questions]
Session 2 is an insert statement, that simply inserts rows into a table that Session 1 doesn't make changes to. [Example, it inserts Answers].
(In practice there are multiple sessions inserting multiple records into the table, simultaneously, while Session 1 is running its transaction).
Session 1 has to query the table Session 2 inserts into because it can't delete records that depend on entries that were added by Session 2. [Example: Delete questions that have not been answered].
So, while Session 1 is executing and Session 2 tries to insert, Session 2 loses in a deadlock every time.
So, a delete statement in Session 1 might look something like this:
DELETE a FROM tblQ LEFT JOIN tblX ON ...
LEFT JOIN tblA a ON tblQ.Qid = a.Qid
WHERE ... a.QId IS NULL AND ...
The deadlock seems to be caused from contention between querying tblA while Session 2, [3, 4, 5, ..., n] try to insert into tblA.
In my case I could change the isolation level of Session 1's transaction to SERIALIZABLE. When I did this, I got: "The transaction manager has disabled its support for remote/network transactions."
So, I could follow instructions in the accepted answer here to get around it: The transaction manager has disabled its support for remote/network transactions
But (a) I wasn't comfortable changing the isolation level to SERIALIZABLE in the first place (supposedly it degrades performance, and it may have other consequences I haven't considered), (b) I didn't understand why doing this suddenly caused the transaction to have a problem working across linked servers, and (c) I don't know what holes I might be opening up by enabling network access.
There seemed to be just 6 queries within a very large transaction that were causing the trouble.
So, I read about TABLOCK and TABLOCKX.
I wasn't crystal clear on the differences, and didn't know if either would work. But it seemed like it would. First I tried TABLOCK and it didn't seem to make any difference. The competing sessions generated the same deadlocks. Then I tried TABLOCKX, and no more deadlocks.
So, in six places, all I needed to do was add a WITH (TABLOCKX).
So, a delete statement in Session 1 might look something like this:
DELETE a FROM tblQ q LEFT JOIN tblX x ON ...
LEFT JOIN tblA a WITH (TABLOCKX) ON q.Qid = a.Qid
WHERE ... a.QId IS NULL AND ...
We're using a SQL Server 2005 database (no row versioning) with a huge select statement, and we're seeing it block other statements from running (seen using sp_who2). I didn't realise SELECT statements could cause blocking - is there anything I can do to mitigate this?
SELECT can block updates. A properly designed data model and query will only cause minimal blocking and not be an issue. The 'usual' WITH NOLOCK hint is almost always the wrong answer. The proper answer is to tune your query so it does not scan huge tables.
If the query is untunable then you should first consider the SNAPSHOT isolation level, second consider using DATABASE SNAPSHOTS, and only as a last option DIRTY READS (and it is better to change the isolation level than to use the NOLOCK hint). Note that dirty reads, as the name clearly states, will return inconsistent data (e.g. your total sheet may be unbalanced).
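A minimal sketch of the database-snapshot option: a static, transactionally consistent copy that the huge SELECT can run against without blocking writers (database and file names are assumptions; NAME must match the source database's logical data file name):

CREATE DATABASE MyDb_Report
ON (NAME = MyDb_Data, FILENAME = 'C:\Data\MyDb_Report.ss')
AS SNAPSHOT OF MyDb;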
From documentation:
Shared (S) locks allow concurrent transactions to read (SELECT) a resource under pessimistic concurrency control. For more information, see Types of Concurrency Control. No other transactions can modify the data while shared (S) locks exist on the resource. Shared (S) locks on a resource are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher, or a locking hint is used to retain the shared (S) locks for the duration of the transaction.
A shared lock is compatible with another shared lock or an update lock, but not with an exclusive lock.
That means that your SELECT queries will block UPDATE and INSERT queries and vice versa.
A SELECT query will place a temporary shared lock when it reads a block of values from the table, and remove it when it is done reading.
For the time the lock exists, you will not be able to do anything with the data in the locked area.
Two SELECT queries will never block each other (unless they are SELECT FOR UPDATE)
You can enable SNAPSHOT isolation level on your database and use it, but note that it will not prevent UPDATE queries from being locked by SELECT queries (which seems to be your case).
It, though, will prevent SELECT queries from being locked by UPDATE.
Also note that SQL Server, unlike Oracle, uses a lock manager and keeps its locks in an in-memory linked list.
That means that under heavy load, the mere fact of placing and removing a lock may be slow, since the linked list itself has to be locked by the transaction thread.
To perform dirty reads you can either:
using (new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions {
        IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted }))
{
    // Your code here
}
or
SelectCommand = "SELECT * FROM Table1 WITH (NOLOCK) INNER JOIN Table2 WITH (NOLOCK) ..."
Remember that you have to write WITH (NOLOCK) after every table you want to dirty-read.
You could set the transaction isolation level to Read Uncommitted:
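A minimal sketch; this affects the current session until it is changed back:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM Table1;  -- now reads uncommitted data without needing a NOLOCK hint per table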
You might also get deadlocks:
"deadlocks involving only one table"
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/01/reproducing-deadlocks-involving-only-one-table.aspx
and/or incorrect results:
"Selects under READ COMMITTED and REPEATABLE READ may return incorrect results."
http://www2.sqlblog.com/blogs/alexander_kuznetsov/archive/2009/04/10/selects-under-read-committed-and-repeatable-read-may-return-incorrect-results.aspx
You can use the WITH (READPAST) table hint. It's different from WITH (NOLOCK): instead of returning another transaction's uncommitted changes, it simply skips rows that are currently locked, so it will not block anyone and returns only committed rows that could be read immediately.
SELECT * FROM table1 WITH (READPAST)