I am issuing the following query with an UPDLOCK applied:
SELECT @local_var = Column
FROM   [table] WITH (UPDLOCK)
WHERE  OtherColumn = @parameter
Multiple connections hit this routine, which is used inside a stored procedure to compute a unique id. Once the lock is acquired we compute the next id, update the value in the row, and commit. This is done because the client has a specific formatting requirement for certain object IDs in their system.
The UPDLOCK locks the correct row and blocks the other processes, but every now and then we get a duplicate id. It seems the local variable is given the current value before the row is locked. I had assumed that the lock would be obtained before the select portion of the statement was processed.
I am using SQL Server 2012 and the isolation level is set to READ COMMITTED.
If there is other information required, just let me know. Or if I am doing something obviously stupid, that information is also welcome.
From the SQL Server documentation on UPDLOCK:
Use update locks instead of shared locks while reading a table, and hold locks until the end of the statement or transaction. UPDLOCK has the advantage of allowing you to read data (without blocking other readers) and update it later with the assurance that the data has not changed since you last read it.
That means that other processes can still read the values.
Try using XLOCK instead; that will lock other readers out as well.
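A minimal sketch of that suggestion, reusing the placeholder names from the question ([table], Column, OtherColumn and the variables are the question's placeholders, not real objects):

BEGIN TRANSACTION;

-- XLOCK takes an exclusive lock on the matching row, so other readers of
-- that row block too; HOLDLOCK keeps the lock until the transaction ends.
SELECT @local_var = Column
FROM   [table] WITH (XLOCK, HOLDLOCK)
WHERE  OtherColumn = @parameter;

-- ... compute and store the next id while the row is still locked ...

COMMIT TRANSACTION;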
I think the issue is that your lock is only being held during this Select.
So once your stored proc has the value, it releases the lock BEFORE it goes on to update the id (or insert a new row or whatever).
This means that another query running in parallel is able to read the same value and then update/insert the same row.
You should additionally add HOLDLOCK to your WITH clause so that the lock is held until the end of the transaction rather than released after the statement.
This is treated quite well in this Answer
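A minimal sketch of the combined read-then-update pattern, again with the question's placeholder names:

BEGIN TRANSACTION;

-- UPDLOCK reserves the row for this connection; HOLDLOCK keeps that lock
-- until the end of the transaction instead of the end of the statement.
SELECT @local_var = Column
FROM   [table] WITH (UPDLOCK, HOLDLOCK)
WHERE  OtherColumn = @parameter;

-- No other connection can take an UPDLOCK on the row or modify it here,
-- so writing back the incremented value cannot produce a duplicate id.
UPDATE [table]
SET    Column = @local_var + 1
WHERE  OtherColumn = @parameter;

COMMIT TRANSACTION;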
I have been asked to check a production issue for which I need help. I am trying to understand the isolation levels and different locks available in SQL server.
I have a table JOB_STATUS with columns job_name (string, primary key), job_status (string), and is_job_locked (string).
Sample data as below:
job_name | job_status | is_job_locked
---------|------------|--------------
JOB_A    | INACTIVE   | N
JOB_B    | RUNNING    | N
JOB_C    | SUCCEEDED  | N
JOB_D    | RUNNING    | N
JOB_E    | INACTIVE   | N
Multiple processes can update the table at the same time by calling a stored procedure and passing the job_name as input parameter. It is fine if two different rows are getting updated by separate processes at the same time.
BUT, two processes should not update the same row at the same time.
Sample update query is as follows:
update JOB_STATUS set is_job_locked='Y' where job_name='JOB_A' and is_job_locked='N';
Here if two processes are updating the same row, then one process should wait for the other one to complete. Also, if the is_job_locked column value is changed to Y by one process, then the other process should not update it again (which my update statement should handle if locking is proper).
So how can I do this row-level locking and make sure that the update query reads the latest data from the row before making the update, all from a stored procedure?
Also, I would like a return value indicating whether the update query actually updated the row, so that I can use it in my further application flow.
RE: "Here if two processes are updating the same row, then one process should wait for the other one to complete. "
That is how locking works in SQL Server. An UPDATE takes an exclusive lock on the row -- where "exclusive" means the English meaning of the word: the UPDATE process has excluded (locked out) all other processes while it is running. The other processes now wait for the UPDATE to complete. This includes READ processes for transaction isolation levels READ COMMITTED and above. When the UPDATE lock is released, then the next statement can access the value.
If what you are looking for is that two processes cannot change the same row in a single table at the same time, then SQL Server does that for you out of the box, and you do not need to add your own is_job_locked column.
However, typically an is_job_locked column is used to control access beyond a single table. For example, it may be used to prevent a second process from starting a job that is already running. Process A would mark is_job_locked, then start the job. Process B would check the flag before trying to start the job.
I did not have to use an explicit lock or set an isolation level, since it is a single UPDATE statement in the stored procedure.
SQL Server only allows one process at a time to update a row; the second process then reads the committed value and, because of the WHERE condition, does not update the row again.
Also, I used @@ROWCOUNT to get the number of rows updated, as in the sketch below. My issue is solved now.
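A minimal sketch of that procedure (the procedure and parameter names are illustrative, not from the original system):

CREATE PROCEDURE dbo.TryLockJob   -- hypothetical name
    @job_name VARCHAR(50),
    @locked   BIT OUTPUT
AS
BEGIN
    -- The UPDATE takes an exclusive row lock; a concurrent caller blocks,
    -- then re-reads the committed row and sees is_job_locked = 'Y'.
    UPDATE JOB_STATUS
    SET    is_job_locked = 'Y'
    WHERE  job_name = @job_name
      AND  is_job_locked = 'N';

    -- @@ROWCOUNT is 1 if this caller won the row, 0 otherwise.
    SET @locked = CASE WHEN @@ROWCOUNT = 1 THEN 1 ELSE 0 END;
END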
Thanks for the answers and comments.
I have a table which is used to create locks with unique keys to control execution of a critical section over multiple servers, i.e. only one thread at a time from all the web servers can enter that critical section.
The lock mechanism starts by trying to add a record to the database, and if successful it enters the region, otherwise it waits. When it exits the critical section, it removes that key from the table. I have the following procedure for this:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION
DECLARE @startTime DATETIME2
DECLARE @lockStatus INT
DECLARE @lockTime INT
SET @startTime = GETUTCDATE()
IF EXISTS (SELECT * FROM GuidLocks WITH (TABLOCKX, HOLDLOCK) WHERE Id = @lockName)
BEGIN
    SET @lockStatus = 0
END
ELSE
BEGIN
    INSERT INTO GuidLocks VALUES (@lockName, GETUTCDATE())
    SET @lockStatus = 1
END
SET @lockTime = (SELECT DATEDIFF(millisecond, @startTime, GETUTCDATE()))
SELECT @lockStatus AS Status, @lockTime AS Duration
COMMIT TRANSACTION GetLock
So I do a SELECT on the table with TABLOCKX and HOLDLOCK, so that I get an exclusive lock on the complete table and hold it until the end of the transaction. Then, depending on the result, I either return a failure status (0) or create a new record and return success (1).
However, I am getting this exception from time to time and I just don't know how it is happening:
System.Data.SqlClient.SqlException: Violation of PRIMARY KEY constraint 'PK_GuidLocks'. Cannot insert duplicate key in object 'dbo.GuidLocks'. The duplicate key value is (XXXXXXXXX). The statement has been terminated.
Any idea how this is happening? How is it possible that two threads managed to obtain an exclusive lock on the same table and tried to insert rows at the same time?
UPDATE: It looks like readers might not have fully understood my question, so I would like to elaborate. My understanding is that using TABLOCKX obtains an exclusive lock on the table. I also understood from the documentation (and I could be mistaken) that if I use the HOLDLOCK hint, the lock will be held until the end of the transaction, which in this case, I assume (and apparently my assumption is wrong, but that is what I understood from the documentation), is the outer transaction initiated by the BEGIN TRANSACTION statement and ended by the COMMIT TRANSACTION statement. So the way I understand things, by the time SQL Server reaches the SELECT statement carrying the TABLOCKX and HOLDLOCK hints, it will try to obtain an exclusive lock on the whole table and will not release it until COMMIT TRANSACTION executes. If that is the case, how come two threads seem to be trying to execute the same INSERT statement at the same time?
If you look up the documentation for tablock and holdlock, you'll see that it is not doing what you think it is:
Tablock: Specifies that the acquired lock is applied at the table level. The
type of lock that is acquired depends on the statement being executed.
For example, a SELECT statement may acquire a shared lock. By
specifying TABLOCK, the shared lock is applied to the entire table
instead of at the row or page level. If HOLDLOCK is also specified,
the table lock is held until the end of the transaction.
So the reason your query is not working is that you are only getting a shared lock on the table. What Frisbee is attempting to point out is that you don't need to re-implement all of the transaction isolation and locking code yourself, because there is a more natural syntax that handles this implicitly. His version is better than yours because it is much easier to use without introducing bugs.
More generally, when ordering statements in your query, you should place the statements requiring the more restrictive lock first.
In my concurrent programming text many years ago, we read the parable of the blind train engineers who needed to transport trains in both directions through a single-track pass across the Andes. In the first mutex model, an engineer would walk up to a synchronization bowl at the top of the pass and, if it was empty, place a pebble in it to lock the pass. After driving through the pass he would remove his pebble to unlock the pass for the next train. This is the mutex model you have implemented, and it doesn't work. In the parable a crash occurred soon after implementation, and sure enough there were two pebbles in the bowl: a READ-READ-WRITE-WRITE anomaly due to the multi-threaded environment.
The parable then describes a second mutex model, where there is already a single pebble in the bowl. Each engineer walks up to the bowl and removes the pebble if one is there, placing it in his pocket while he drives through the pass. Then he restores the pebble to unlock the pass for the next train. If an engineer finds the bowl empty he keeps trying (or blocks for some length of time) until a pebble is available. This is the model that works.
You can implement this (correct) model by having (only ever) a single row in the GuidLocks table with a (by default) NULL value for the lock holder. In a suitable transaction each process UPDATEs (in place) this single row with its SPID exactly when the old value IS NULL, returning 1 if this succeeds and 0 if it fails. It updates this column back to NULL when it releases the lock.
This ensures that the resource being locked actually includes the row being modified, which in your case is clearly not always true.
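A minimal sketch of that model; the LockHolder column is illustrative (the original GuidLocks table does not have one):

-- Acquire: take the pebble only if nobody currently holds it.
UPDATE GuidLocks
SET    LockHolder = @@SPID        -- hypothetical holder column
WHERE  Id = @lockName
  AND  LockHolder IS NULL;

SELECT CASE WHEN @@ROWCOUNT = 1 THEN 1 ELSE 0 END AS Status;  -- 1 = lock acquired

-- Release: put the pebble back.
UPDATE GuidLocks
SET    LockHolder = NULL
WHERE  Id = @lockName
  AND  LockHolder = @@SPID;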
See the answer by usr to this question for an interesting example.
I believe that you are being confused by the error message: clearly the engine is locating the row of a potential conflict before testing for the existence of a lock, resulting in a misleading error message. And since (due to implementing model 1 above instead of model 2) the TABLOCK is being held on the resource used by the SELECT rather than on the resource used by the INSERT/UPDATE, a second process is able to sneak in.
Note that, especially in the presence of support for snapshot isolation, the resource on which you have taken your TABLOCKX (the table snapshot before any inserts) is not guaranteed to include the resource to which you have written the lock specifics (the table snapshot after an insert).
Use an app lock.
EXEC sp_getapplock @Resource  = @lockName,
                   @LockMode  = 'Exclusive',
                   @LockOwner = 'Session';
Your approach is incorrect from many points of view: granularity (a table lock), scope (a transaction that commits and thus releases the lock), and leakage (it will leak locks). Session-scoped app locks are what you actually want here.
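A minimal sketch of the acquire/release pattern (the timeout value and error handling are illustrative):

DECLARE @result INT;

-- Returns 0 or 1 on success, a negative value on timeout/deadlock/error.
EXEC @result = sp_getapplock @Resource    = @lockName,
                             @LockMode    = 'Exclusive',
                             @LockOwner   = 'Session',
                             @LockTimeout = 10000;

IF @result >= 0
BEGIN
    -- ... critical section ...
    EXEC sp_releaseapplock @Resource = @lockName, @LockOwner = 'Session';
END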
INSERT INTO GuidLocks
SELECT @lockName, GETUTCDATE()
WHERE  NOT EXISTS ( SELECT *
                    FROM   GuidLocks
                    WHERE  Id = @lockName );

IF @@ROWCOUNT = 0 ...
To be safe about optimization, use SELECT 1 instead of SELECT * in the NOT EXISTS subquery:

SELECT 1
FROM   GuidLocks
WHERE  Id = @lockName
We are trying to determine more efficient ways to perform some database operations.
One of the issues that we have is with an ancient primary key system where the primary key for a new record is selected by finding the MAX value in the table and then adding 1 (we cannot change this implementation, please don't suggest this as an answer).
There are some different approaches that we can take to resolve this issue (table-valued parameters, temp tables, etc), but we can never assume that another process won't insert another record during this process and business rules will not allow us to lock the table.
So, the crux of my question is, if we get the current MAX value in a sub-query using an UPDLOCK hint, will the lock hint last for the life of the containing query?
For example:
INSERT INTO table1
       ( PKColumn,
         DataColumn1,
         DataColumn2 )
SELECT ( SELECT MAX(ISNULL(PKColumn, 0) + 1)
         FROM   table1 WITH (UPDLOCK) ) + RowNumber,
       DataColumn1,
       DataColumn2
FROM   #Table1Temp
If we use this to insert 100,000 records, for example, will the UPDLOCK hint hold on the table until all records are inserted or is it released as soon as the initial value is retrieved?
Specifies that update locks are to be taken and held until the
transaction completes. UPDLOCK takes update locks for read operations
only at the row-level or page-level. If UPDLOCK is combined with
TABLOCK, or a table-level lock is taken for some other reason, an
exclusive (X) lock will be taken instead.
So yes: the lock is held at least as long as that statement runs. (Possibly longer, if you aren't using autocommit transactions and have multiple statements in a transaction.)
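A minimal sketch of stretching the lock's lifetime with an explicit transaction, reusing the question's names:

BEGIN TRANSACTION;

-- The UPDLOCK taken by the subquery is now held until COMMIT,
-- not merely until this INSERT statement finishes.
INSERT INTO table1 (PKColumn, DataColumn1, DataColumn2)
SELECT ( SELECT MAX(ISNULL(PKColumn, 0) + 1)
         FROM   table1 WITH (UPDLOCK) ) + RowNumber,
       DataColumn1,
       DataColumn2
FROM   #Table1Temp;

-- ... any further statements that rely on the same lock ...

COMMIT TRANSACTION;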
I'm wondering what the benefit of using SELECT WITH (NOLOCK) on a table is if the only other queries affecting that table are SELECT queries.
How is that handled by SQL Server? Would a SELECT query block another SELECT query?
I'm using SQL Server 2012 and a Linq-to-SQL DataContext.
(EDIT)
About performance :
Would a 2nd SELECT have to wait for a 1st SELECT to finish if using a locked SELECT?
Versus a SELECT WITH (NOLOCK)?
A SELECT in SQL Server will place a shared lock on a table row - and a second SELECT would also require a shared lock, and those are compatible with one another.
So no - one SELECT cannot block another SELECT.
What the WITH (NOLOCK) query hint is used for is to be able to read data that's in the process of being inserted (by another connection) and that hasn't been committed yet.
Without that query hint, a SELECT might be blocked reading a table by an ongoing INSERT (or UPDATE) statement that places an exclusive lock on rows (or possibly a whole table), until that operation's transaction has been committed (or rolled back).
The problem with the WITH (NOLOCK) hint is that you might be reading data rows that aren't going to be inserted at all in the end (if the INSERT transaction is rolled back), so your report, for example, might show data that's never really been committed to the database.
There's another query hint that might be useful - WITH (READPAST). This instructs the SELECT command to just skip any rows that it attempts to read and that are locked exclusively. The SELECT will not block, and it will not read any "dirty" un-committed data - but it might skip some rows, e.g. not show all your rows in the table.
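A small sketch of the three behaviors, using a hypothetical dbo.Orders table and two sessions:

-- Session 1: insert inside a transaction that is left open (uncommitted).
BEGIN TRANSACTION;
INSERT INTO dbo.Orders (OrderId, Amount) VALUES (42, 100.00);

-- Session 2: three ways to read while session 1 is still open.
SELECT * FROM dbo.Orders;                  -- blocks until session 1 commits or rolls back
SELECT * FROM dbo.Orders WITH (NOLOCK);    -- returns the uncommitted row (dirty read)
SELECT * FROM dbo.Orders WITH (READPAST);  -- skips the locked row entirely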
On performance, you keep focusing on the SELECT side.
A shared lock does not block other reads.
A shared lock does block updates.
If you have hundreds of shared locks, it is going to take an update a while to get an exclusive lock, as it must wait for the shared locks to clear.
By default a select (read) takes a shared lock.
Shared (S) locks allow concurrent transactions to read (SELECT) a resource.
A shared lock has no effect on other SELECTs (1 or 1000).
The difference is in how NOLOCK versus a shared lock affects UPDATE or INSERT operations.
No other transactions can modify the data while shared (S) locks exist on the resource.
A shared lock blocks an update!
But nolock does not block an update.
This can have a huge impact on the performance of updates. It also impacts inserts.
Dirty read (nolock) just sounds dirty. You are never going to get partial data. If an update is changing John to Sally you are never going to get Jolly.
I use shared locks a lot for concurrency. Data is stale as soon as it is read. A read of John that changes to Sally the next millisecond is stale data. A read of Sally that gets rolled back to John the next millisecond is also stale data. That is staleness on the millisecond level. I have a data loader that takes 20 hours to run if users are taking shared locks and 4 hours to run if users are taking no locks. Shared locks in this case cause data to be 16 hours stale.
Don't use NOLOCK where it doesn't belong, but it does have a place. If you are going to cut a check when a byte is set to 1, and then set it to 2 when the check is cut, that is not the time for NOLOCK.
I have to add one important comment. Everyone is mentioning that NOLOCK only risks reading dirty (uncommitted) data. That is not precise. It is also possible that you'll get the same row twice, or that a whole row is skipped during your read. The reason is that you could ask for some data at the same time that SQL Server is re-balancing the b-tree.
Check these other threads:
https://stackoverflow.com/a/5469238/2108874
http://www.sqlmag.com/article/sql-server/quaere-verum-clustered-index-scans-part-iii.aspx
With the NOLOCK hint (or setting the isolation level of the session to READ UNCOMMITTED) you tell SQL Server that you don't expect consistency, so there are no guarantees. Bear in mind though that "inconsistent data" does not only mean that you might see uncommitted changes that were later rolled back, or data changes in an intermediate state of the transaction. It also means that in a simple query that scans all table/index data SQL Server may lose the scan position, or you might end up getting the same row twice.
At my work, we have a very big system that runs on many PCs at the same time, with very big tables with hundreds of thousands of rows, and sometimes many millions of rows.
When you make a SELECT on a very big table, let's say you want to know every transaction a user has made in the past 10 years, and the primary key of the table is not built in an efficient way, the query might take several minutes to run.
Then, our application might be running on many users' PCs at the same time, accessing the same database. So if someone tries to insert into the table that the other SELECT is reading (in pages that SQL Server is trying to read), a LOCK can occur and the two transactions block each other.
We had to add WITH (NOLOCK) to our SELECT statement, because it was a huge SELECT on a table that is used a lot by a lot of users at the same time, and we had LOCKS all the time.
I don't know if my example is clear enough, but this is a real-life example.
The SELECT WITH (NOLOCK) allows reads of uncommitted data, which is equivalent to having the READ UNCOMMITTED isolation level set on your database. The NOLOCK keyword allows finer grained control than setting the isolation level on the entire database.
Wikipedia has a useful article: Wikipedia: Isolation (database systems)
It is also discussed at length in other stackoverflow articles.
A SELECT with NOLOCK will return records that may or may not end up being committed; you will read dirty data.
For example, let's say a transaction inserts 1000 rows and then fails (rolls back).
When you SELECT with NOLOCK, you can still get those 1000 rows.
We're using a SQL Server 2005 database (no row versioning) with a huge select statement, and we're seeing it block other statements from running (seen using sp_who2). I didn't realise SELECT statements could cause blocking - is there anything I can do to mitigate this?
SELECT can block updates. A properly designed data model and query will only cause minimal blocking and not be an issue. The 'usual' WITH NOLOCK hint is almost always the wrong answer. The proper answer is to tune your query so it does not scan huge tables.
If the query is untunable then you should first consider SNAPSHOT isolation level, second consider using DATABASE SNAPSHOTS, and only as a last option consider DIRTY READS (and it is better to change the isolation level than to use the NOLOCK hint). Note that dirty reads, as the name clearly states, will return inconsistent data (e.g. your total sheet may be unbalanced).
From documentation:
Shared (S) locks allow concurrent transactions to read (SELECT) a resource under pessimistic concurrency control. For more information, see Types of Concurrency Control. No other transactions can modify the data while shared (S) locks exist on the resource. Shared (S) locks on a resource are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher, or a locking hint is used to retain the shared (S) locks for the duration of the transaction.
A shared lock is compatible with another shared lock or an update lock, but not with an exclusive lock.
That means that your SELECT queries will block UPDATE and INSERT queries and vice versa.
A SELECT query will place a temporary shared lock when it reads a block of values from the table, and remove it when it done reading.
For the time the lock exists, you will not be able to do anything with the data in the locked area.
Two SELECT queries will never block each other (unless one of them uses a restrictive locking hint such as UPDLOCK or XLOCK).
You can enable SNAPSHOT isolation level on your database and use it, but note that it will not prevent UPDATE queries from being locked by SELECT queries (which seems to be your case).
It, though, will prevent SELECT queries from being locked by UPDATE.
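A minimal sketch of enabling and using it (the database name is a placeholder):

ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the reading session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM Table1;   -- reads committed row versions; not blocked by writers
COMMIT TRANSACTION;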
Also note that SQL Server, unlike Oracle, uses a lock manager and keeps its locks in an in-memory linked list.
That means that under heavy load, the mere fact of placing and removing a lock may be slow, since the linked list should itself be locked by the transaction thread.
To perform dirty reads you can either:
using (new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions
    {
        IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted
    }))
{
    // Your code here
}
or
SelectCommand = "SELECT * FROM Table1 WITH (NOLOCK) INNER JOIN Table2 WITH (NOLOCK) ..."
Remember that you have to write WITH (NOLOCK) after every table you want to dirty read.
You could also set the transaction isolation level to READ UNCOMMITTED.
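For example, as the session-level equivalent of the NOLOCK hint:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM Table1;   -- reads without requesting shared locks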
You might also get deadlocks:
"deadlocks involving only one table"
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/01/reproducing-deadlocks-involving-only-one-table.aspx
and/or incorrect results:
"Selects under READ COMMITTED and REPEATABLE READ may return incorrect results."
http://www2.sqlblog.com/blogs/alexander_kuznetsov/archive/2009/04/10/selects-under-read-committed-and-repeatable-read-may-return-incorrect-results.aspx
You can use the WITH (READPAST) table hint. It is different from WITH (NOLOCK): instead of returning uncommitted data, it skips any rows that are locked by other transactions, so the SELECT neither blocks nor reads dirty data, although it may not return every row.
SELECT * FROM table1 WITH (READPAST)