I've read on Microsoft's site
http://msdn.microsoft.com/en-us/library/ms173763.aspx
that SQL Server doesn't request locks when reading data, except when a database is being recovered.
Does it mean that SQL Server using READ_COMMITTED_SNAPSHOT/SNAPSHOT ISOLATION doesn't use shared locks at all?
How is that possible?
For example, if there are 2 transactions.
First transaction T1 wants to update some row.
Second transaction T2 starts reading the same row (this transaction is copying it to some output buffer, response buffer or whatever it's called in SQL Server).
At the same time transaction T1 starts updating that row (it creates the versioned row first).
Isn't there a possibility that transaction T2 will read uncommitted data?
Remember, transaction T2 started copying that row before T1 made its update, so there is no exclusive lock on that row.
Is this situation even possible, and how can it be avoided without setting a shared lock on that row while its data is being copied?
Besides logical locks, there are also physical latches that protect the database structures (in this example, pages). Latches protect against any change (modification of bits), irrespective of isolation level. So even though the reader (T2 in your example) does not acquire locks, it still needs to acquire a shared latch on the pages it reads; otherwise it would fall victim to low-level concurrent modifications of the very structures it reads. The writer (T1) can modify the page containing the rows it changes only after it obtains an exclusive latch on that page. Thus T2 can only see the image of the row either before T1 modified it (and therefore the row is the one T2 wants) or after T1 has finished modifying it (and now T2 has to look up the previous row image in the version store).
The latching protocol must be honored by all isolation levels, including read uncommitted and versioned reads (i.e. snapshot and friends).
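To make the latch argument concrete, here is a minimal Python sketch. It is only a toy model, not how SQL Server implements latches: `threading.Lock` is an exclusive-only stand-in (real page latches also have a shared mode), and the `page` dictionary, `writer`, `reader` and `run` names are made up for illustration. The point it shows is that a reader which copies the row while holding the latch never observes a half-modified row.

```python
import threading

latch = threading.Lock()        # stand-in for a page latch (exclusive-only here)
page = {"value": 4, "copy": 4}  # invariant: both fields are always equal

def writer(n):
    # The writer changes both fields only while holding the latch.
    for i in range(n):
        with latch:
            page["value"] = i
            page["copy"] = i

def reader(n, torn):
    # The reader copies the row under the latch, then inspects its private copy.
    for _ in range(n):
        with latch:
            snapshot = (page["value"], page["copy"])
        if snapshot[0] != snapshot[1]:
            torn.append(snapshot)

def run(n=10000):
    torn = []
    threads = [threading.Thread(target=writer, args=(n,)),
               threading.Thread(target=reader, args=(n, torn))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return torn

print(len(run()))  # 0: no torn reads were observed
```

The latch is held only for the duration of the copy, not for the rest of the transaction, which is what distinguishes it from a lock.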
Does it mean that SQL Server using READ_COMMITTED_SNAPSHOT/SNAPSHOT ISOLATION doesn't use shared locks at all? How is that possible?
It is possible because SQL Server is reading from a snapshot, which does not change. The snapshot is frozen at the state of the DB as of the start of the current transaction (SNAPSHOT isolation) or of the current statement (READ_COMMITTED_SNAPSHOT), disregarding uncommitted changes from other transactions. SQL Server does this by keeping row-versioned copies of modified records in tempdb for readers to refer to, while the current data/index page(s) are free to change.
Isn't there a possibility that transaction T2 will read uncommitted data? Remember, transaction T2 started copying that row before T1 made update, so there is no exclusive lock on that row.
The above narrative explains this already. But to illustrate (simplified):
Scenario 1:
T1: begin tran (implicit/explicit)
T1: read value (4)
T2: read value (4) -- *
T1: update value to (8)
* - This is the committed value at the time the T2 transaction started
Scenario 2:
T1: begin tran (implicit/explicit)
T1: read value (4)
T1: update value to (8)
version of the row with the value (4) is made
T2: read value (4) -- * from the versioned row
T1: commit
* - (4) is [still] the *committed* value at the time the T2 transaction started
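Scenario 2 can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not SQL Server's storage format: the `VersionedRow` class and its methods are hypothetical, there is a single row, and reads behave like READ_COMMITTED_SNAPSHOT (each read sees the latest committed value; under SNAPSHOT isolation T2 would keep seeing 4 until its own transaction ends).

```python
class VersionedRow:
    """Toy row that keeps the committed image available while a writer is active."""

    def __init__(self, value):
        self.committed = value    # last committed value (the "versioned row")
        self.uncommitted = None   # pending value written by an open transaction
        self.writer = None        # transaction that owns the pending write

    def update(self, txn, value):
        # Only one writer at a time; this plays the role of the exclusive lock.
        if self.writer not in (None, txn):
            raise RuntimeError("row is already being modified by " + self.writer)
        self.writer = txn
        self.uncommitted = value

    def read(self, txn):
        # The writer sees its own change; everyone else sees the committed image.
        if txn == self.writer:
            return self.uncommitted
        return self.committed

    def commit(self, txn):
        if txn == self.writer:
            self.committed = self.uncommitted
            self.uncommitted = None
            self.writer = None

row = VersionedRow(4)
row.update("T1", 8)
print(row.read("T2"))  # 4: T2 does not block and never sees the uncommitted 8
row.commit("T1")
print(row.read("T2"))  # 8: a later read sees the newly committed value
```

Note that the reader never needed a shared lock: it is simply routed to the committed image.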
The specification for the Repeatable-Read isolation level defines that a transaction with this IL will prevent other transactions from updating any rows that this transaction has read until this transaction has completed. Thus, repeatable reads are guaranteed.
Consider the following order of operations for two concurrent transactions T1 and T2, both using repeatable read IL:
1. T1: Read row
2. T2: Read row
3. T1: Update row
4. T2: Update row
I think that the update in step 3 would violate the specification for the isolation level, since T2 would read a different value if it read the row again.
The converse can be said for the update in step 4.
So, what different options are available for RDBMSs in general to resolve this conflict?
More specifically, how is this handled in SQL Server 2017+?
Will this result in a deadlock since neither transaction can complete its operations?
Or would one transaction be rolled back?
I've seen that Lost Updates are prevented in SQL Server. What does this mean for the resolution of this specific case?
I have perused the answers to these questions:
Repeatable read and lock compatibility table
Repeatable Read - am I understanding this right?
repeatable read and second lost updates issue
MySQL Repeatable Read isolation level and Lost Update phenomena
Although the last one asks a similar question, it doesn't include any specific info about how RDBMSs that prevent lost updates for transactions at this isolation level handle this case.
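For what it's worth, the four-step sequence can be modelled as a wait-for graph. The sketch below is in Python and assumes a lock-based implementation in which both transactions keep their shared locks until they end (which is what REPEATABLE READ implies in a locking engine); `find_cycle` and the data structures are hypothetical helpers, not a database API. Under that assumption each update has to wait for the other transaction's shared lock, so the graph contains a cycle, i.e. a deadlock. SQL Server resolves a detected deadlock by picking one transaction as the victim and rolling it back.

```python
def find_cycle(waits_for):
    """Return the transactions on a cycle of the wait-for graph, or None.

    waits_for maps each waiting transaction to the one transaction it waits on
    (a simplification: real lock managers allow several blockers).
    """
    for start in waits_for:
        path = [start]
        current = start
        while current in waits_for:
            current = waits_for[current]
            if current == start:
                return path
            if current in path:
                break  # a cycle that does not pass through start
            path.append(current)
    return None

# Steps 1 and 2: both transactions read the row and keep their shared locks.
shared_holders = {"T1", "T2"}

# Steps 3 and 4: each update needs an exclusive lock, which is incompatible
# with the shared lock still held by the other transaction.
waits_for = {}
for txn in ("T1", "T2"):
    others = sorted(shared_holders - {txn})
    if others:
        waits_for[txn] = others[0]

print(find_cycle(waits_for))  # ['T1', 'T2']: neither can proceed
```

Once the victim is rolled back its shared lock disappears, the cycle is broken, and the surviving transaction's update goes through, so no update is silently lost.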
Suppose we have two transactions, T1 and T2, where T2 does a dirty read on data modified by T1 and commits before T1. Now suppose T1 fails and is rolled back. My question is: since T2 has committed, are the changes made by T2 transferred from the shared buffer to the original database or not? (I have read that changes made by a transaction are made permanent in the original database once the transaction commits.) And if they are transferred to the original database, how will T1 roll back and get the previous value of the data item (which was read dirty by T2)? From its buffer or from the original database?
Unless you explicitly allow dirty reads by setting the isolation level, this sort of problem simply cannot happen. That is the whole idea of a transaction: T2 will be locked out of the row if T1 has updated it. If you do allow dirty reads with set transaction isolation level read uncommitted, then handling the data correctly is up to you, usually by using a rowversion column which T2 checks hasn't changed before it commits.
I am trying to understand isolation/locks in SQL Server.
I have the following scenario at the READ COMMITTED isolation level (the default).
We have a table.
create table Transactions(Tid int,amt int)
with some records
insert into Transactions values(1, 100)
insert into Transactions values(2, -50)
insert into Transactions values(3, 100)
insert into Transactions values(4, -100)
insert into Transactions values(5, 200)
Now, from MSDN I understood the following.
When a select is fired, a shared lock is taken so that no other transaction can modify the data (avoiding dirty reads). The documentation also talks about row-level, page-level and table-level locks. I thought of the following scenario:
Begin Transaction
select * from Transactions
/*
some business logic which takes 5 minutes
*/
Commit
What I want to understand is for how long the shared lock is held, and on which resource (row, page, table).
Will the lock be held only while the statement select * from Transactions runs, or for the whole 5+ minutes until we reach COMMIT?
You are asking the wrong question: you are concerned with the implementation details. What you should think about and be concerned with are the semantics of the isolation level. Kendra Little has a nice poster explaining them: Free Poster! Guide to SQL Server Isolation Levels.
Your question should be rephrased like:
select * from Items
Q: What Items will I see?
A: All committed Items
Q: What happens if there are uncommitted transactions that have inserted/deleted/updated Items?
A: Your SELECT will block until all uncommitted Items are committed (or rolled back).
Q: What happens if Items are inserted/deleted/updated while I run the query above?
A: The results are undetermined. You may see some of the modifications, miss others, and possibly block until some of them commit.
READ COMMITTED makes no promise once your statement has finished, regardless of the length of the transaction. If you run the statement again you will get exactly the same semantics as stated before, and the Items you've seen before may change or disappear, and new ones can appear. Obviously this implies that changes can be made to Items after your select.
Higher isolation levels give stronger guarantees: REPEATABLE READ guarantees that no Item you've selected the first time can be modified or deleted until you commit. SERIALIZABLE adds the guarantee that no new Item can appear in your second select before you commit.
This is what you need to understand, not how the implementation mechanism works. After you master these concepts, you may ask about the implementation details. They're all described in Transaction Processing: Concepts and Techniques.
Your question is a good one. Understanding what kinds of locks are acquired allows a deep understanding of DBMSs. In SQL Server, under all of the lock-based isolation levels (Read Uncommitted, Read Committed (default), Repeatable Read, Serializable), Exclusive Locks are acquired for write operations.
Exclusive locks are released when transaction ends, regardless of the isolation level.
The difference between the isolation levels refers to the way in which Shared (Read) Locks are acquired/released.
Under Read Uncommitted isolation level, no Shared locks are acquired. Under this isolation level the concurrency issue known as "Dirty Reads" (a transaction is allowed to read data from a row that has been modified by another running transaction and not yet committed, so it could be rolled back) can occur.
Under Read Committed isolation level, Shared Locks are acquired for the concerned records. The Shared Locks are released when the current instruction ends. This isolation level prevents "Dirty Reads" but, since the record can be updated by other concurrent transactions, "Non-Repeatable Reads" (transaction A retrieves a row, transaction B subsequently updates the row, and transaction A later retrieves the same row again. Transaction A retrieves the same row twice but sees different data) or "Phantom Reads" (in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first) can occur.
Under Repeatable Read isolation level, Shared Locks are acquired for the transaction duration. "Dirty Reads" and "Non-Repeatable Reads" are prevented but "Phantom Reads" can still occur.
Under Serializable isolation level, ranged Shared Locks are acquired for the transaction duration. None of the above-mentioned concurrency issues occur, but concurrency is reduced considerably and the risk of deadlocks increases.
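The difference in shared-lock duration can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not SQL Server's lock manager: `ToyLockManager` and its methods are hypothetical, only row-level S and X locks exist, and a blocked request just returns the string "blocked" instead of waiting.

```python
class ToyLockManager:
    """Toy model: S-lock duration depends on isolation level, X locks last until commit."""

    def __init__(self):
        self.shared = {}     # row -> set of transactions still holding an S lock
        self.exclusive = {}  # row -> transaction holding the X lock

    def select(self, txn, row, isolation):
        writer = self.exclusive.get(row)
        if writer not in (None, txn) and isolation != "READ UNCOMMITTED":
            return "blocked"  # cannot get an S lock while another txn holds X
        if isolation in ("REPEATABLE READ", "SERIALIZABLE"):
            self.shared.setdefault(row, set()).add(txn)  # kept until commit
        # READ COMMITTED: the S lock is taken and released within the statement.
        # READ UNCOMMITTED: no S lock is taken at all.
        return "read"

    def update(self, txn, row):
        other_readers = self.shared.get(row, set()) - {txn}
        if other_readers or self.exclusive.get(row) not in (None, txn):
            return "blocked"
        self.exclusive[row] = txn  # X lock, held until commit at every level
        return "updated"

    def commit(self, txn):
        for holders in self.shared.values():
            holders.discard(txn)
        for row in [r for r, t in self.exclusive.items() if t == txn]:
            del self.exclusive[row]

for level in ("READ COMMITTED", "REPEATABLE READ"):
    locks = ToyLockManager()
    locks.select("reader", 1, level)  # reader's transaction stays open
    print(level, "->", locks.update("writer", 1))
# READ COMMITTED -> updated
# REPEATABLE READ -> blocked
```

In terms of the original question: under READ COMMITTED the writer is not held up by the five minutes of business logic that follow the select, whereas under REPEATABLE READ it would be.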
The lock is held only while select * from Transactions is running.
You can check this with the code below.
Open a SQL session and run this query:
Begin Transaction
select * from Transactions
WAITFOR DELAY '00:05'
/*
some business logic which takes 5 minutes
*/
Commit
Open another SQL session and run the query below:
Begin Transaction
Update Transactions
Set = ...
where ....
commit
First, a lock is acquired only when a statement runs.
Your scenario can be split into two statements, simplified as:
select * from Transactions
update Transactions set amt = xxx where Tid = xxx
When are locks held and released at the READ COMMITTED isolation level?
When select * from Transactions runs, shared locks are held only while the rows are being read; no lock remains once the statement has finished.
The following update Transactions set amt = xxx where Tid = xxx takes an X lock on the keys being updated and IX locks on the page and table.
These locks are released only on commit or rollback. That means none of them is released while the transaction is still running.
I'll start with a simple question.
According to the definition of a dirty read on Wikipedia and MSDN:
we have 2 concurrent transactions, T1 and T2.
Dirty reads occur when T1 is updating a row and T2 reads that row, which "is not committed yet" by T1.
But at the Read Committed level shared locks are released as soon as the data is read (not at the end of the transaction or even the end of the statement).
Then how does Read Committed prevent dirty reads?
Because as soon as the shared lock on the updated row is released, T2 can read the updated row, and T1 can still roll back the whole operation; then we have a dirty read on our hands.
It prevents the dirty read because T1 holds a lock on the row, so T2 can't read the "not yet committed" row that could be rolled back later.
The problem Read Committed tries to resolve is:
T1 creates a transaction and writes something
T2 reads that something
T1 rolls back the transaction
Now T2 has data that never really existed.
Depending on how the DB is structured, there are two "good" possibilities:
T1 creates a transaction and writes something
T2 waits for T1 to end the transaction
or
T2 reads a "snapshot" of how the DB was BEFORE T1 made its change (this is called Read Committed using row versioning)
(the default on MSSQL is the first option)
Here for example there is a comparison of the various isolation levels: http://msdn.microsoft.com/en-us/library/ms345124(SQL.90).aspx (read under Isolation Levels Offered in SQL Server 2005)
When SQL Server executes a statement at the read committed isolation level, it acquires short lived share locks on a row by row basis. The duration of these share locks is just long enough to read and process each row; the server generally releases each lock before proceeding to the next row. Thus, if you run a simple select statement under read committed and check for locks (e.g., with sys.dm_tran_locks), you will typically see at most a single row lock at a time. The sole purpose of these locks is to ensure that the statement only reads and returns committed data. The locks work because updates always acquire an exclusive lock which blocks any readers trying to acquire a share lock.
Ripped from here
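The row-by-row behaviour described above can be sketched in Python. Treat this as a toy model: `scan` and `hold_until_commit` are made-up names, the lock set is just a Python `set`, and the rows reuse the sample Transactions data from earlier in the thread. It counts how many row locks are held at once when each shared lock is released before moving to the next row, compared with holding them until commit as the higher isolation levels do.

```python
def scan(rows, hold_until_commit):
    """Read rows one by one; return the values and the peak number of row locks held."""
    held = set()
    peak = 0
    values = []
    for row_id, amount in rows:
        held.add(row_id)          # acquire the S lock on the current row
        peak = max(peak, len(held))
        values.append(amount)     # "read and process" the row
        if not hold_until_commit:
            held.discard(row_id)  # READ COMMITTED: release before the next row
    held.clear()                  # commit releases whatever is still held
    return values, peak

rows = [(1, 100), (2, -50), (3, 100), (4, -100), (5, 200)]
print(scan(rows, hold_until_commit=False)[1])  # 1
print(scan(rows, hold_until_commit=True)[1])   # 5
```

A peak of one matches the "at most a single row lock at a time" observation you would make with sys.dm_tran_locks.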
What is the difference between TABLOCK and TABLOCKX?
http://msdn.microsoft.com/en-us/library/ms187373.aspx states that TABLOCK is a shared lock while TABLOCKX is an exclusive lock. Is the first maybe only an index lock of sorts? And what is the concept of sharing a lock?
Big difference: TABLOCK will try to grab "shared" locks, and TABLOCKX exclusive locks.
If you are in a transaction and you grab an exclusive lock on a table, e.g.:
SELECT 1 FROM TABLE WITH (TABLOCKX)
No other processes will be able to grab any locks on the table, meaning all queries attempting to talk to the table will be blocked until the transaction commits.
TABLOCK only grabs a shared lock. Shared locks are released after a statement is executed if your transaction isolation level is READ COMMITTED (the default). If your isolation level is higher, for example SERIALIZABLE, shared locks are held until the end of the transaction.
Shared locks are, hmmm, shared. Meaning 2 transactions can both read data from the table at the same time if they both hold an S or IS lock on the table (via TABLOCK). However, if transaction A holds a shared lock on a table, transaction B will not be able to grab an exclusive lock until all shared locks are released. Read about which locks are compatible with which on MSDN.
Both hints cause the db to bypass taking more granular locks (like row or page level locks). In principle, more granular locks allow better concurrency. So for example, one transaction could be updating row 100 in your table while another updates row 1000 at the same time (it gets tricky with page locks, but let's skip that).
In general granular locks are what you want, but sometimes you may want to reduce db concurrency to increase performance of a particular operation and eliminate the chance of deadlocks.
In general you would not use TABLOCK or TABLOCKX unless you absolutely needed it for some edge case.
Quite an old article on mssqlcity attempts to explain the types of locks:
Shared locks are used for operations that do not change or update data, such as a SELECT statement.
Update locks are used when SQL Server intends to modify a page, and later promotes the update page lock to an exclusive page lock before actually making the changes.
Exclusive locks are used for the data modification operations, such as UPDATE, INSERT, or DELETE.
What it doesn't discuss are Intent (which basically is a modifier for these lock types). Intent (Shared/Exclusive) locks are locks held at a higher level than the real lock. So, for instance, if your transaction has an X lock on a row, it will also have an IX lock at the table level (which stops other transactions from attempting to obtain an incompatible lock at a higher level on the table (e.g. a schema modification lock) until your transaction completes or rolls back).
The concept of "sharing" a lock is quite straightforward - multiple transactions can have a Shared lock for the same resource, whereas only a single transaction may have an Exclusive lock, and an Exclusive lock precludes any transaction from obtaining or holding a Shared lock.
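The compatibility rules for the modes mentioned so far can be written down as a small lookup. The Python sketch below covers only IS, S, U, IX and X (SQL Server has more modes, such as SIX and the schema locks), and `COMPATIBLE` and `can_grant` are hypothetical names for illustration, not an API.

```python
# For each held mode, the set of modes another transaction may still be granted.
COMPATIBLE = {
    "IS": {"IS", "S", "U", "IX"},
    "S":  {"IS", "S", "U"},
    "U":  {"IS", "S"},
    "IX": {"IS", "IX"},
    "X":  set(),
}

def can_grant(requested, held_modes):
    """True if `requested` is compatible with every mode already held by others."""
    return all(requested in COMPATIBLE[held] for held in held_modes)

print(can_grant("S", ["S"]))   # True: shared locks can be shared
print(can_grant("X", ["S"]))   # False: a writer waits for readers
print(can_grant("S", ["X"]))   # False: a reader waits for the writer
print(can_grant("S", ["IX"]))  # False: a table-level S lock (TABLOCK) conflicts with row writers
print(can_grant("X", ["IS"]))  # False: a table-level X lock (TABLOCKX) conflicts with row readers
```

The last two lines show what the intent locks are for: a table-level request can be checked against the table's IS/IX locks without inspecting every row lock underneath.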
This is more of an example where TABLOCK did not work for me and TABLOCKX did.
I have 2 sessions, that both use the default (READ COMMITTED) isolation level:
Session 1 is an explicit transaction that will copy data from a linked server to a set of tables in a database, and takes a few seconds to run. [Example, it deletes Questions]
Session 2 is an insert statement, that simply inserts rows into a table that Session 1 doesn't make changes to. [Example, it inserts Answers].
(In practice there are multiple sessions inserting multiple records into the table, simultaneously, while Session 1 is running its transaction).
Session 1 has to query the table Session 2 inserts into because it can't delete records that depend on entries that were added by Session 2. [Example: Delete questions that have not been answered].
So, while Session 1 is executing and Session 2 tries to insert, Session 2 loses in a deadlock every time.
So, a delete statement in Session 1 might look something like this:
DELETE q FROM tblQ q LEFT JOIN tblX x on ...
LEFT JOIN tblA a ON q.Qid = a.Qid
WHERE ... a.QId IS NULL and ...
The deadlock seems to be caused from contention between querying tblA while Session 2, [3, 4, 5, ..., n] try to insert into tblA.
In my case I could change the isolation level of Session 1's transaction to SERIALIZABLE. When I did this, I got the error: The transaction manager has disabled its support for remote/network transactions.
So, I could follow instructions in the accepted answer here to get around it: The transaction manager has disabled its support for remote/network transactions
But a) I wasn't comfortable with changing the isolation level to SERIALIZABLE in the first place- supposedly it degrades performance and may have other consequences I haven't considered, b) didn't understand why doing this suddenly caused the transaction to have a problem working across linked servers, and c) don't know what possible holes I might be opening up by enabling network access.
There seemed to be just 6 queries within a very large transaction that are causing the trouble.
So, I read about TABLOCK and TABLOCKX.
I wasn't crystal clear on the differences, and didn't know if either would work. But it seemed like it would. First I tried TABLOCK and it didn't seem to make any difference. The competing sessions generated the same deadlocks. Then I tried TABLOCKX, and no more deadlocks.
So, in six places, all I needed to do was add a WITH (TABLOCKX).
With the hint added, a delete statement in Session 1 might look something like this:
DELETE q FROM tblQ q LEFT JOIN tblX x on ...
LEFT JOIN tblA a WITH (TABLOCKX) ON q.Qid = a.Qid
WHERE ... a.QId IS NULL and ...