I have this table:
TableAB
{
IDA;
IDB;
}
I want to ensure that I always have both pairs, (ID1, ID2) and (ID2, ID1). So I am trying to use these two scripts:
To insert:
begin tran
insert into tablaAB (IDTablaA, IDTablaB) VALUES(1,2);
insert into tablaAB (IDTablaA, IDTablaB) VALUES(2,1);
commit
To delete:
begin tran
delete tablaAB where IDTablaA = 1 and IDTablaB = 2
delete tablaAB where IDTablaA = 2 and IDTablaB = 1;
commit
I am using two instances of Microsoft Management Studio to run both queries. In most cases it works: I get both rows or neither of them. But sometimes I get only one of them.
The steps are:
run the query to delete (1,2).
run the query to add (1,2).
In most cases, the insert is blocked until the transaction that deletes both rows finishes, but in some cases it gets through and continues to the next line to insert the second row. If this happens, my data is no longer consistent.
But I don't know if this is because I made a mistake in the test, or because in some rare cases the first query is not blocked as I expect.
Should the first insert really be blocked in all cases once the first delete has run?
The table is empty. So it seems that the row is locked when I try to delete it, which prevents the insert, but I don't know whether there are rare situations in which the row is not locked.
Thanks.
But I don't know if this is because I made a mistake in the test, or because in some rare cases the first query is not blocked as I expect.
Should the first insert really be blocked in all cases once the first delete has run?
It seems you are running using the READ COMMITTED isolation level. In this case, no lock is held by the DELETE session when no rows qualify so the INSERT session can proceed to insert rows. This becomes a race condition where you may end up with zero, one, or two rows. Consider this sequence that results in one row:
--session 1:
begin tran;
delete TableAB where IDTablaA = 1 and IDTablaB = 2;
--no row deleted, no lock held
--session 2:
begin tran
insert into TableAB (IDTablaA, IDTablaB) VALUES(1,2);
--row inserted, lock held
insert into TableAB (IDTablaA, IDTablaB) VALUES(2,1);
--row inserted, lock held
commit;
-- inserts committed and locks released
--session 1:
delete TableAB where IDTablaA = 2 and IDTablaB = 1;
--row deleted, lock held
commit;
--deleted committed, lock released
If you instead use the SERIALIZABLE isolation level, the DELETE statement will hold a lock (table lock in this case due to no indexes) and block the insert session. A less restrictive key range lock will be held with an index on the column used to locate rows to be deleted.
Note that SERIALIZABLE is more prone to deadlocks than less restrictive isolation levels.
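As a rough sketch of the SERIALIZABLE approach described above: the index name below is hypothetical, and the table and column names are taken from the question's scripts. Assuming an index on the pair exists, the DELETE should hold a key range lock even when no row qualifies, so a concurrent insert of the same pair waits until the delete transaction commits.

```sql
-- Hypothetical index (name is an assumption) so that key range locks
-- can be taken instead of a lock on the whole table.
CREATE UNIQUE INDEX IX_tablaAB_pair ON tablaAB (IDTablaA, IDTablaB);
GO

-- Delete session
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
-- Under SERIALIZABLE the range is locked even if no row was deleted.
DELETE tablaAB WHERE IDTablaA = 1 AND IDTablaB = 2;
DELETE tablaAB WHERE IDTablaA = 2 AND IDTablaB = 1;
COMMIT;
```

Because deadlocks become more likely at this isolation level, the caller would probably want to retry when it is chosen as a deadlock victim (error 1205).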
I was trying to understand ROWLOCK in SQL Server in order to update a record after locking it. Here is what I observed; I would like confirmation whether ROWLOCK behaves like a table or page lock, or whether I have not tested it correctly. ROWLOCK should lock only the row, not the table or page.
Here is what I tried:
I created a simple table, row_lock_temp_test, with two columns, ID and Name, and no PK or index. I then opened two different SQL Server clients with the same credentials and executed the following queries:
Client 1:
1: BEGIN TRANSACTION;
2: update row_lock_temp_test set name = 'CC' where id = 2
3: COMMIT
Client 2:
1: BEGIN TRANSACTION;
2: update row_lock_temp_test set name= 'CC' where id = 2
3: COMMIT
I executed queries 1 and 2 on C-1, then went to C-2 and executed the same queries. Both clients executed the queries, and then I committed the transactions. All good.
Then I added ROWLOCK to the update query:
C-1
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 2
3: COMMIT
C-2
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 2
3: COMMIT
Now I executed queries 1 and 2 on C-1, then went to C-2 and tried to execute the same two queries. The query got stuck, as expected: the row is locked by C-1, so C-2 has to wait until the transaction is committed on C-1. As soon as I committed the transaction on C-1, the query on C-2 executed, and then I committed the transaction on C-2 as well. All good.
Then I tried another scenario, executing the same set of queries with row id = 3:
C-2
1: BEGIN TRANSACTION;
2: update row_lock_temp_test WITH(rowlock) set name = 'CC' where id = 3
3: COMMIT
I executed the first two queries in C-1 and then the first two queries in C-2. The row id is different in the two clients, but the query in C-2 still got stuck. This means that the update with id = 2 has locked the page or the table. I was expecting a row lock, but it seems to be a page or table lock.
I also tried XLOCK, HOLDLOCK, and UPDLOCK in different combinations, but it always locks the table. Is there any way to lock only a row?
Select and insert is working as expected.
Thanks in advance.
Lock hints are only hints. You can't "force" SQL to take a particular kind of lock.
You can see the locks being taken with the following query:
select tl.request_session_id,
tl.resource_type,
tl.request_mode,
tl.resource_description,
tl.request_status
from sys.dm_tran_locks tl
join sys.partitions pt on pt.hobt_id = tl.resource_associated_entity_id
join sys.objects ob on ob.object_id = pt.object_id
where tl.resource_database_id = db_id()
order by tl.request_session_id
OK, let's run some code in an SSMS query window:
create table t(i int, j int);
insert t values (1, 1), (2, 2);
begin tran;
update t with(rowlock) set j = 2 where i = 1;
Open a second SSMS window, and run this:
begin tran;
update t with(rowlock) set j = 2 where i = 2;
The second execution will be blocked. Why?
Run the locking query in a third window, and note that there are two rows with a resource_type of RID, one with a status of "grant", the other with a status of "wait". We'll get to the RID bit in a second. Also, look at the resource_description column for those rows. It's the same value.
OK, so what's a resource_description? It depends on the resource_type. But for our RID it represents: the file id, then the page id, then the row id (also known as the slot). But why are both executions taking a lock on row slot 0? Shouldn't they be trying to lock different rows? After all, we are updating different rows.
David Browne has given the answer: In order to find the correct row to update, SQL has to scan the entire table, because there is no index telling it how many rows there are where i = 1. It will take an update lock on each row as it scans through. Why does it take an update lock on each row? Well, it's not to "do" the update, so to speak. It will take an exclusive lock for that. Update locks are pretty much always taken to prevent deadlocks.
So, the first query has scanned through the rows, taking a U lock on each row. Of course, it found the row it wanted to update right away, in slot 0, and took an X lock. And it still has that X lock, because we haven't committed.
Then we started the second query, which also has to scan all of the rows to find the one it wants. It started off by trying to take the U lock on the first row, and was blocked. The X lock of our first query is blocking it.
So, you see, even with row locking, your second query is still blocked.
OK, let's rollback the queries, and see what happens if we have the first query update the second row, and the second query update the first row? Does that work? Nope! Because SQL still has no way of knowing how many rows match the predicate. So the first query takes its update lock on slot 0, sees that it doesn't have to update it, takes its update lock on slot 1, sees the correct value for i, takes its exclusive lock, and waits for us to commit.
Then query 2 comes along, takes the update lock on slot 0, sees the value it wants, takes its exclusive lock, updates the value, and then tries to take an update lock on slot 1, because that might also have the value it wants. That request is blocked by the exclusive lock the first query still holds on slot 1.
You'll also see "intent locks" on the next "level" up, i.e., the page. An intent lock tells the rest of the engine that the operation holds, or is about to take, locks at a lower level (here, rows on that page), so that nobody else can be granted a conflicting lock on the whole page or table. But that's not a factor here. Page locking is not causing the issue.
Solution in this case? Add an index on column i. In this case, that's probably the primary key. You can then do the updates in either order. Asking for row locking in this case makes no difference, because SQL doesn't know how many rows match the predicate. But even if you try to force a row lock in some situation, and even with a primary key or appropriate index, SQL can still choose to escalate the lock type, because it can be way more efficient to lock a whole page, or a whole table, than to lock and unlock individual rows.
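A minimal sketch of that fix, reusing the table t from the example above. The index name is hypothetical, and making it unique and clustered is an assumption about the data (it presumes i really is a key). Any open transactions from the earlier windows need to be rolled back first.

```sql
-- Assumption: i is unique, so it can serve as the clustered key.
CREATE UNIQUE CLUSTERED INDEX IX_t_i ON t (i);
GO

-- Window 1
BEGIN TRAN;
UPDATE t SET j = 2 WHERE i = 1;

-- Window 2: should no longer wait, because the row can be located
-- through the index instead of scanning past the row locked by window 1.
BEGIN TRAN;
UPDATE t SET j = 2 WHERE i = 2;

-- Both windows, when done
ROLLBACK;
```

If you rerun the locking query while both transactions are open, the row-level locks should now show a resource_type of KEY rather than RID, with different resource_description values.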
I am just wondering something about snapshot behavior on read committed isolation level. Let's assume that I have a table with name "A". Here is the first transaction:
Select blabla
From A
Insert Into A blabla
and second transaction does the same
Select blabla
From A
Insert Into A blabla
and assume that below timeline occurred:
Tran1: select
Tran1: insert (not yet committed)
Tran2: select (I don't know it is possible or not)
Tran2: insert
As far as I know, under the standard read committed isolation level, the tran2 select query would be blocked because the tran1 insert has not yet been committed or rolled back. But when "is_read_committed_snapshot" is enabled, I expect that no locks will be acquired during the insert or update command.
So what will happen to tran2?
I expect that tran2 select query won't see the data that inserted by tran1, because it would be "dirty read". But it wouldn't get block as well.
Since the tran1 insert query does not acquire any lock, wouldn't this be a concurrency problem when executing these two transactions?
I expect that no locks will be acquired during the insert or update
command.
That is wrong. Even if you have enabled RCSI, writers still block writers, and X locks are still acquired.
What differs between RC and RCSI is the reading behaviour.
Under pessimistic (locking) RC, the SELECT in Tran2 will be blocked by the X lock held on A, whereas under RCSI Tran2's SELECT will not be blocked; it will be given the last committed version of A, i.e. the state of A before Tran1 modified it.
What happens next depends on how your table is organised and on what you INSERT.
Some examples.
1) table A is a heap, you are doing single insert in both transactions.
In this case your INSERT in Tran2 will succeed in any case, be it the same value that you try to insert in both transactions or not, because what the server acquires in this case is IX on a table (that is compatible with IX held by Tran1), IX on a page (that is also compatible with IX held by Tran1, even if it is the same page), and X on RID (while Tran1 has X on another RID), so there is no conflict.
2) table A is clustered table, you are trying to insert the same new key in this table.
In this case your Tran2's INSERT will be blocked because of the conflict between two X locks on the same key: the first is held by Tran1, the second is requested by Tran2 and is blocked.
3) table A is clustered table, you are trying to insert different keys in this table.
Insert2 will succeed because X lock on key requested by Tran2 will be granted as Tran1 holds IX on table, IX on page, and X on another key.
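To try cases 2 and 3 yourself, something like the following two-session sketch could be used. The table name and values are made up for illustration; is_read_committed_snapshot_on is the sys.databases column that shows whether RCSI is enabled for the current database.

```sql
-- Check whether RCSI is enabled for the current database.
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = DB_NAME();

-- Setup (hypothetical demo table with a clustered primary key).
CREATE TABLE dbo.A_demo (id int NOT NULL PRIMARY KEY CLUSTERED, val int);

-- Session 1: insert and leave the transaction open.
BEGIN TRAN;
INSERT INTO dbo.A_demo (id, val) VALUES (1, 10);

-- Session 2:
-- Under RCSI this returns at once and does not show id = 1;
-- under locking READ COMMITTED it waits on session 1's X lock.
SELECT id, val FROM dbo.A_demo;

-- Case 3, different key: not blocked under either setting.
INSERT INTO dbo.A_demo (id, val) VALUES (2, 20);

-- Case 2, same key: blocked under either setting until session 1 ends.
INSERT INTO dbo.A_demo (id, val) VALUES (1, 30);
```

If session 1 then commits, the last insert in session 2 fails with a primary key violation; if session 1 rolls back, it succeeds.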
Let's say the first session is doing this:
SELECT id FROM customers
BEGIN TRAN new_tran
UPDATE customers
SET ID = '1'
WHERE ID = '01'
If the query in the second session is something like this:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
GO
BEGIN TRAN
SELECT *
FROM customers
WHERE id = '01'
Result: even though session 1 has changed the value, session 2 still sees the old record (2, TWO).
Now commit the transaction in session 1. Then, in session 2, commit and query again; you will now get the updated value:
COMMIT
SELECT *
FROM DemoTable
WHERE i = 2
You can read more about it on Pinal Dave's blog: blog.sqlauthority.com/2015/07/03/sql-server-difference-between-read-committed-snapshot-and-snapshot-isolation-level/
I'm trying to archive many records in batches rather than in one shot.
Will T-SQL join the two tables, TeamRoster and @teamIdsToDelete, on every iteration of the batch loop? My concern is that if my temporary table is huge and I don't remove records from it as I go, the JOIN might be unnecessarily expensive. On the other hand, how expensive is it to delete from the temporary table as I go? Is it made up for by the (real or hypothetical?) smaller joins I'll have to do in each batch?
(Can provide more details/thoughts but will do so if helpful.)
DECLARE @teamIdsToDelete TABLE
(
    RosterID int PRIMARY KEY
)
-- Collect the list of active team IDs. We will rely on the modified date to age them out.
INSERT INTO @teamIdsToDelete
SELECT DISTINCT tr.RosterID
FROM rosterload.TeamRoster tr WITH (NOLOCK)
WHERE tr.IsArchive = 0 AND tr.Loaded = 1
-- Age out remaining rosters. (no cap - proved we can update more than 50k by modifying test case:
WHILE (1 = 1)
BEGIN
    BEGIN TRANSACTION
    UPDATE TOP (1000) r
    SET [Status] = 'Delete', IsArchive = 1, ModifiedDate = GETDATE(), ModifiedBy = 'abc'
    FROM rosterload.TeamRoster r WITH (ROWLOCK)
    JOIN @teamIdsToDelete ttd ON ttd.RosterID = r.RosterID
    WHERE r.[Status] != 'Delete' AND r.IsArchive != 1 AND r.ModifiedBy != 'abc' -- predicate for filtering
    IF @@ROWCOUNT = 0 -- terminating condition
    BEGIN
        COMMIT TRANSACTION
        BREAK
    END
    COMMIT TRANSACTION
END
As I understand it, the goal of this query is to archive a huge number of rows without blocking other queries at the same time. The temp table helps you narrow down the subset of records to delete. Since it has a single column that is a clustered primary key, the join to another PK will be blazingly fast. You would spend more effort on calculating and deleting the updated records from the temp table.
Also, there is no reason to use a transaction and do batches. You could just do one big update instead. The result is the same: the table will be locked once the first 5k row locks are acquired (roughly after the first five batches are updated) until the COMMIT statement. The ROWLOCK hint does not prevent lock escalation. On the other hand, running without a transaction would give other queries the opportunity to continue after each 1000-row batch. If you need to make sure that all records are archived in one go, add some retry logic to your query or your application code for errors such as deadlocks or process interruption. And do you really need the NOLOCK hint?
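A sketch of the batched variant without an explicit transaction, reusing the names from the question. The @rows variable is introduced here for illustration only. In the default autocommit mode each UPDATE statement commits on its own, so locks are released after every batch.

```sql
DECLARE @teamIdsToDelete TABLE (RosterID int PRIMARY KEY);

INSERT INTO @teamIdsToDelete
SELECT DISTINCT tr.RosterID
FROM rosterload.TeamRoster tr
WHERE tr.IsArchive = 0 AND tr.Loaded = 1;

DECLARE @rows int = 1;  -- hypothetical helper variable

WHILE @rows > 0
BEGIN
    -- No BEGIN TRANSACTION: each batch autocommits.
    UPDATE TOP (1000) r
    SET [Status] = 'Delete', IsArchive = 1, ModifiedDate = GETDATE(), ModifiedBy = 'abc'
    FROM rosterload.TeamRoster r
    JOIN @teamIdsToDelete ttd ON ttd.RosterID = r.RosterID
    WHERE r.[Status] != 'Delete' AND r.IsArchive != 1 AND r.ModifiedBy != 'abc';

    SET @rows = @@ROWCOUNT;  -- loop ends when a batch updates nothing
END
```

The trade-off is that an interruption leaves the work partly done, which is why the retry logic mentioned above matters; the WHERE predicate makes a rerun skip rows that were already archived.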
I'm trying to understand a problem I have run into that I don't believe should be possible when dealing with transactions using the read committed isolation level. I have a table that is being used as a queue. In one thread (connection 1) I insert multiple batches of 20 records into the table. Each batch of 20 records is inserted inside a transaction. In a second thread (connection 2) I perform an update to change the status of the records that have been inserted into the queue, which also occurs inside a transaction. When running concurrently, it is my expectation that the number of rows affected by the update (connection 2) should be a multiple of 20, since connection 1 inserts rows into the table in batches of 20 within a transaction.
But my testing shows this is not always the case, and on occasion I'm able to update a subset of records from connection 1's batch. Should this be possible or am I missing something about transactions, concurrency, and isolation levels? Below is a set of test scripts I created to reproduce this issue in T-SQL.
This script inserts 20,000 records into the table in transaction batches of 20.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON
DECLARE @trans_id INTEGER
DECLARE @cmd_id INTEGER
DECLARE @text_str VARCHAR(4000)
SET @trans_id = 0
SET @text_str = 'Placeholder String Value'
-- First empty the table
DELETE FROM TABLE_A
WHILE @trans_id < 1000 BEGIN
    SET @trans_id = @trans_id + 1
    SET @cmd_id = 0
    BEGIN TRANSACTION
    -- Insert 20 records into the table per transaction
    WHILE @cmd_id < 20 BEGIN
        SET @cmd_id = @cmd_id + 1
        INSERT INTO TABLE_A ( transaction_id, command_id, [type], status, text_field )
        VALUES ( @trans_id, @cmd_id, 1, 1, @text_str )
    END
    COMMIT
END
PRINT 'DONE'
This script updates the records in the table, changing the status from 1 to 2, and then checks the rowcount from the update operation. When the rowcount is not a multiple of 20, a print statement indicates this along with the number of rows affected.
USE ReadTest
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
GO
SET NOCOUNT ON
DECLARE @loop_counter INTEGER
DECLARE @trans_id INTEGER
DECLARE @count INTEGER
SET @loop_counter = 0
WHILE @loop_counter < 100000 BEGIN
    SET @loop_counter = @loop_counter + 1
    BEGIN TRANSACTION
    UPDATE TABLE_A SET status = 2
    WHERE status = 1
    and type = 1
    SET @count = @@ROWCOUNT
    COMMIT
    IF ( @count % 20 <> 0 ) BEGIN
        -- Records in concurrent transaction inserting in batches of 20 records before commit.
        PRINT '*** Rowcount not a multiple of 20. Count = ' + CAST(@count AS VARCHAR) + ' ***'
    END
    IF @count > 0 BEGIN
        -- Delete the records where the status was changed.
        DELETE TABLE_A WHERE status = 2
    END
END
PRINT 'DONE'
This script creates the test queue table in a new database called ReadTest.
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'ReadTest')
BEGIN;
DROP DATABASE ReadTest;
END;
GO
CREATE DATABASE ReadTest;
GO
ALTER DATABASE ReadTest
SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
ALTER DATABASE ReadTest
SET READ_COMMITTED_SNAPSHOT OFF
GO
USE ReadTest
GO
CREATE TABLE [dbo].[TABLE_A](
[ROWGUIDE] [uniqueidentifier] NOT NULL,
[TRANSACTION_ID] [int] NOT NULL,
[COMMAND_ID] [int] NOT NULL,
[TYPE] [int] NOT NULL,
[STATUS] [int] NOT NULL,
[TEXT_FIELD] [varchar](4000) NULL,
CONSTRAINT [PK_TABLE_A] PRIMARY KEY NONCLUSTERED
(
[ROWGUIDE] ASC
) ON [PRIMARY]
) ON [PRIMARY]
ALTER TABLE [dbo].[TABLE_A] ADD DEFAULT (newsequentialid()) FOR [ROWGUIDE]
GO
Your expectations are completely misplaced. You have never expressed in your query the requirement to 'dequeue' exactly 20 rows. The UPDATE can return 0, 19, 20, 21 or 1000 rows and all results are correct, as long as the status is 1 and the type is 1. If you expect the 'dequeue' to occur in the order of the 'enqueue' (which is somewhat alluded to in your question, but never explicitly stated), then your 'dequeue' operation must contain an ORDER BY clause. Had you added such an explicitly stated requirement, your expectation that the 'dequeue' always returns an entire batch of 'enqueue' rows (i.e. a multiple of 20 rows) would be one step closer to being a reasonable expectation. As things stand right now, it is, as I said, completely misplaced.
For a lengthier discussion see Using Tables as Queues.
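For illustration only, a dequeue with an explicitly stated order might look roughly like the sketch below. It is not taken from the linked article; the clustered index and its name are assumptions, as is the choice of (TRANSACTION_ID, COMMAND_ID) as the enqueue order. It makes the ordering requirement explicit, but it still does not guarantee that whole batches of 20 are returned.

```sql
-- Hypothetical clustered index matching the assumed enqueue order.
CREATE CLUSTERED INDEX CIX_TABLE_A_order ON TABLE_A (TRANSACTION_ID, COMMAND_ID);
GO

-- Dequeue up to 20 rows in enqueue order through an updatable CTE.
WITH next_rows AS (
    SELECT TOP (20) TRANSACTION_ID, COMMAND_ID, STATUS
    FROM TABLE_A
    WHERE STATUS = 1 AND [TYPE] = 1
    ORDER BY TRANSACTION_ID, COMMAND_ID
)
UPDATE next_rows
SET STATUS = 2
OUTPUT inserted.TRANSACTION_ID, inserted.COMMAND_ID;
```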
I shouldn't be concerned that while one transaction is committing a
batch of 20 inserted records, another concurrent transaction is only
able to update a subset of those records and not all 20?
Basically the question boils down to: if I SELECT while I INSERT, how many inserted rows will I see? You only have a right to be concerned if the isolation level is declared as SERIALIZABLE. None of the other isolation levels make any prediction about how many rows inserted while the UPDATE was running will be visible. Only SERIALIZABLE states that the outcome has to be the same as running the two statements one after another (i.e. serialized, hence the name). While the technical details of how the UPDATE 'sees' only part of the INSERT batch are easy to understand once you consider physical order and the lack of an ORDER BY clause, the explanation is irrelevant. The fundamental issue is that the expectation is unwarranted. Even if the 'issue' is 'fixed' by adding a proper ORDER BY and the correct clustered index key (the article linked above explains the details), the expectation is still unwarranted. It will still be perfectly legal for the UPDATE to 'see' 1, 19 or 21 rows, although it will be unlikely to happen.
I guess I've always understood READ COMMITTED to only read committed
data, and that a transaction commit is an atomic operation, making all
the changes that occurred in the transaction available at once.
That is correct. What is incorrect is to expect a concurrent SELECT (or UPDATE) to see the entire change, irrespective of where it happens to be in its execution. Open an SSMS query and run the following:
use tempdb;
go
create table test (a int not null primary key, b int);
go
insert into test (a, b) values (5,0)
go
begin transaction
insert into test (a, b) values (10,0)
Now open a new SSMS query and run the following:
update test
set b=1
output inserted.*
where b=0
This will block behind the uncommitted INSERT. Now go back to first query and run the following:
insert into test (a, b) values (1,0)
commit
When this commits, the second SSMS query will finish, and it will return two rows, not three. QED. This is READ COMMITTED. What you expect is SERIALIZABLE execution (in which case the example above will deadlock).
It could happen like this:
The writer/inserter writes 20 rows (does not commit)
The reader/updater reads one row (which is not committed - it discards it)
The writer/inserter commits
The reader/updater reads 19 rows which are now committed thus visible
I believe that only an isolation level of serializable (or snapshot isolation which is more concurrent) fixes this.
I have a SQL Server table that I'm using as a queue, and it's being processed by a multi-threaded (and soon to be multi-server) application. I'd like a way for a process to claim the next row from the queue, flagging it as "in-process", without the possibility that multiple threads (or multiple servers) will claim the same row at the same time.
Is there a way to update a flag in a row and retrieve that row at the same time? I want something like this pseudocode, but ideally, without blocking the whole table:
Block the table to prevent others from reading
Grab the next ID in the queue
Update the row of that item with a "claimed" flag (or whatever)
Release the lock and let other threads repeat the process
What's the best way to use T-SQL to accomplish this? I remember seeing a statement one time that would DELETE rows and, at the same time, deposit the DELETED rows into a temp table so you could do something else with them, but I can't for the life of me find it now.
You can use the OUTPUT clause
UPDATE myTable SET flag = 1
OUTPUT DELETED.id
WHERE
id = 1
AND
flag <> 1
Main thing is to use a combination of table hints as shown below, within a transaction.
DECLARE @NextId INTEGER
BEGIN TRANSACTION
SELECT TOP 1 @NextId = ID
FROM QueueTable WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE BeingProcessed = 0
ORDER BY ID ASC
IF (@NextId IS NOT NULL)
BEGIN
    UPDATE QueueTable
    SET BeingProcessed = 1
    WHERE ID = @NextId
END
COMMIT TRANSACTION
IF (@NextId IS NOT NULL)
    SELECT * FROM QueueTable WHERE ID = @NextId
UPDLOCK will lock the next available row it finds, preventing other processes from grabbing it.
ROWLOCK will ensure only the individual row is locked (I've never found it to be a problem not using this as I think it will only use a rowlock anyway, but safest to use it).
READPAST will prevent a process being blocked, waiting for another to finish.
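The two suggestions above (the OUTPUT clause and the table hints) can also be combined into a single statement. This is only a sketch under the same assumptions about QueueTable, ID and BeingProcessed; the CTE name is made up for illustration.

```sql
-- Claim the next unprocessed row and return its ID in one statement.
WITH next_item AS (
    SELECT TOP (1) ID, BeingProcessed
    FROM QueueTable WITH (UPDLOCK, ROWLOCK, READPAST)
    WHERE BeingProcessed = 0
    ORDER BY ID ASC
)
UPDATE next_item
SET BeingProcessed = 1
OUTPUT inserted.ID;
```

Because the claim happens in one statement, no explicit transaction is needed around it; an empty result set means there was nothing available to claim.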