I have a SQL Server stored procedure with these steps:
Begin procedure
Use a CURSOR to loop through a small list (say 50 items)
Open CURSOR loop
For each item, do data manipulation (Insert or Delete) - let's call this operation A
For each item, insert an entry into a status table - let's call this operation B
Loop ends (and iterates)
Exit
Operation A takes a few minutes, while operation B takes just a fraction of a second, as it only inserts one row into a status table.
So now the problem is that the status table in operation B stays locked until all 50 or so iterations complete.
I can give more background but this seems to be the crux of the problem. Pointers for solution will be really helpful (time sensitive).
What I tried: I added explicit BEGIN TRANSACTION; and COMMIT TRANSACTION; around the operations in both A and B. Still, the status table from operation B seems to get locked after the first iteration and stay locked until all the iterations complete.
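If the whole cursor loop runs inside one outer transaction (for example, one opened by the caller), the BEGIN/COMMIT pairs inside the loop are merely nested transactions and release nothing until the outermost COMMIT. A sketch of committing each iteration independently; `work_list`, `status_log`, and `DoOperationA` are placeholder names, not your actual objects:

```sql
-- If the caller already opened a transaction, per-iteration commits
-- below will not release any locks; bail out early in that case.
IF @@TRANCOUNT > 0
    RAISERROR('Caller holds an open transaction; locks would persist.', 16, 1);

DECLARE @id INT;
DECLARE item_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT id FROM dbo.work_list;          -- hypothetical source list (~50 rows)

OPEN item_cursor;
FETCH NEXT FROM item_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRANSACTION;
        EXEC dbo.DoOperationA @id;                        -- slow operation A
        INSERT INTO dbo.status_log (item_id, done_at)     -- fast operation B
        VALUES (@id, GETDATE());
    COMMIT TRANSACTION;  -- releases this iteration's status_log locks

    FETCH NEXT FROM item_cursor INTO @id;
END
CLOSE item_cursor;
DEALLOCATE item_cursor;
```

With one short transaction per iteration, readers of the status table wait at most for a single iteration's locks rather than for the whole loop.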
Related
My understanding of locks in SQL Server is that when a transaction writes, it acquires an X lock only on the rows it is writing to, held for the duration of the transaction. So if I try to select from a table that is being written to, it should return all previously committed rows (under the default isolation level, READ COMMITTED).
Based on this, I have created a stored procedure to check.
USE [SANDPIT]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[slowly_insert_into_adventure_time_characters]
AS
SET XACT_ABORT OFF -- Turn off auto abort on errors
SET NOCOUNT ON -- Turn off row count messages
BEGIN TRANSACTION;
INSERT INTO SANDPIT.dbo.adventure_time_characters([name])
VALUES
('Jake')
WAITFOR DELAY '00:00:01'
INSERT INTO SANDPIT.dbo.adventure_time_characters([name])
VALUES
('Finn')
WAITFOR DELAY '00:00:01'
INSERT INTO SANDPIT.dbo.adventure_time_characters([name])
VALUES
('Princess Bubblegum')
WAITFOR DELAY '00:00:01'
INSERT INTO SANDPIT.dbo.adventure_time_characters([name])
VALUES
('Abracadaniel')
WAITFOR DELAY '00:00:01'
INSERT INTO SANDPIT.dbo.adventure_time_characters([name])
VALUES
('Earl of Lemongrab')
COMMIT;
GO
If I run the stored procedure and then SELECT * FROM SANDPIT.dbo.adventure_time_characters while it is running, the query appears to wait until the stored procedure finishes, then returns the new values. This is weird to me because I expected the SELECT statement to return an empty table, since at the time it was run there were no committed rows in the table.
Similarly, if I run the stored procedure again and select while it is running, I expect to see the previously committed rows, but not the new ones. However, again it waits before returning everything.
My hypothesis is that because I have no WHERE condition on my SELECT statement, it is trying to select everything, including rows that hold an X lock, and that is why it waits. I modified everything to include a WHERE clause that only returns previously committed rows, but I got the same results. I assumed this was because it has to check every row to know which rows satisfy the WHERE condition. So, I tried adding a nonclustered index, but the same thing happened.
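One way to test this hypothesis is to look at what the reading session is actually waiting on while the procedure runs. A sketch using the locking DMVs, run from a third session (session IDs and output will differ on your server):

```sql
-- Which sessions are blocked, and on what resource?
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,        -- e.g. LCK_M_S: waiting to acquire a shared lock
       r.wait_resource
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;

-- Locks currently held or requested in the database
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID('SANDPIT');
```

Under the locking READ COMMITTED default, the scan's shared-lock requests show up here as LCK_M_S waits behind the writer's X locks.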
What am I missing?
I have a database table with thousands of entries. I have multiple worker threads, each of which picks up one row at a time and does some work (taking roughly one second each). While picking up a row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads pick up the same row.
My general question is that what general design approach should I follow here to ensure that each thread picks up unique rows and does their task independently.
Note: Multiple threads are running in parallel to hasten the processing of the database rows, so I would like as small a critical section or exclusive lock as possible.
Just to give some context, below is the stored proc which picks up the rows from the table after it has updated the flag on the row. Please note that the stored proc is not compilable as I have removed unnecessary portions from it. But generally that's the structure of it.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the UPDATE statement (note that the update is done after taking a lock) in one thread is not visible to the other threads until the transaction commits. And as there is a SELECT statement (which takes around 50ms) between the UPDATE and the COMMIT, in about 20% of cases the UPDATE statement in one thread picks up a row that has already been processed.
I hope I am clear enough here.
USE [mydatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
-- some variable declaration here
BEGIN TRANSACTION
-- check if there are blocking rows in the request table
-- FM: Remove records that don't qualify for operation.
-- delete operation on the table to remove rows we don't want to process
delete FROM request where somecondition = 1
-- Identify the requests to process
DECLARE @TmpTableVar table(TmpRequestId int NULL);
UPDATE TOP(1) request
WITH (ROWLOCK)
SET Lock = DateAdd(mi, 5, GETDATE())
OUTPUT INSERTED.ID INTO @TmpTableVar
FROM request tur
WHERE (Lock IS NULL OR GETDATE() > Lock) -- not locked or lock expired
AND GETDATE() > NextRetry -- next in the queue
IF (@@ROWCOUNT = 0)
BEGIN
ROLLBACK TRANSACTION
RETURN
END
SELECT @RequestID = TmpRequestId FROM @TmpTableVar
-- Get details about the request that has been just updated
SELECT somerows
FROM request
WHERE somecondition = 1
COMMIT TRANSACTION
END
The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.
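A sketch of the hint-based option, folding the hints into the dequeue UPDATE itself (column and table names taken from the question's procedure; adapt to your schema):

```sql
ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @TmpTableVar table (TmpRequestId int NULL);

    BEGIN TRANSACTION;

    -- UPDLOCK: claim the row exclusively at read time, so two sessions
    --          can never select the same candidate row.
    -- READPAST: skip rows another session has already locked instead of
    --           blocking behind them -- each worker grabs a different row.
    UPDATE TOP (1) request
        WITH (UPDLOCK, READPAST, ROWLOCK)
    SET Lock = DATEADD(mi, 5, GETDATE())
    OUTPUT INSERTED.ID INTO @TmpTableVar
    WHERE (Lock IS NULL OR GETDATE() > Lock)  -- not locked or lock expired
      AND GETDATE() > NextRetry;              -- next in the queue

    IF @@ROWCOUNT = 0
    BEGIN
        ROLLBACK TRANSACTION;
        RETURN;
    END

    -- Details of the request this worker just claimed
    SELECT r.*
    FROM request AS r
    JOIN @TmpTableVar AS t ON t.TmpRequestId = r.ID;

    COMMIT TRANSACTION;
END
```

READPAST is what removes the "two threads pick the same row" race: a locked candidate row is simply skipped rather than waited on, so the critical section per worker is just its own single row.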
You need to start a transaction with an appropriate isolation level to isolate your row, but this can impact your performance.
Look at this sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
To finish: there is no single best isolation level. You need to analyze the pros and cons of each level and test your system's performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)
I'm starting to work with SQL Server database and I'm having a hard time trying to understand Transaction Isolation Levels and how they lock data.
I'm trying to accomplish the following simple task:
Accept a pair of integers [ID, counter] in a SQL stored procedure
Determine whether ID exists in a certain table: SELECT COUNT(*) FROM MyTable WHERE Id = {idParam}
If the previous COUNT statement returns 0, insert this ID and counter:
INSERT INTO MyTable(Id, Counter) VALUES({idParam}, {counterParam})
If the COUNT statement returns 1, update the existing record: UPDATE MyTable SET Counter = Counter + {counterParam} WHERE Id = {idParam}
Now, I understand I have to wrap this whole stored procedure in a transaction, and according to this MS article the appropriate isolation level would be SERIALIZABLE (it says: No other transactions can modify data that has been read by the current transaction until the current transaction completes). Please correct me if I'm wrong here.
Suppose I called the procedure with ID=1, so the first query would be SELECT COUNT(*) FROM MyTable WHERE Id = 1 (1st transaction begins). Then, immediately after this query is executed, the procedure is called with ID=2 (2nd transaction begins).
What I fail to understand is how much data would be locked during the execution of my stored procedure in this case:
If the 1st query of the 1st transaction returns 0 records, does this mean that 1st transaction locks nothing and other transactions are able to INSERT ID=1 before 1st transaction tries it?
Or does the 1st transaction lock the whole table making the 2nd transaction wait even though those 2 transactions can never try to read/update the same row?
Or does the 1st transaction somehow forbid anyone else from reading/writing only the records with ID=1 until it is completed?
If your filter is on an index, that's what's going to get locked. So regardless of whether the row already exists or not, it's locked for the duration of the transaction. Take care, though - it's very easy to turn a row lock into something nastier, especially full table locks. And of course, it's easy to introduce deadlocks this way :)
However, I'd suggest a different approach. First, try to do an insert. If it works, you're done - if it doesn't, you know you can safely do an atomic update. Very fast, very cheap, very reliable :)
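A sketch of that insert-first approach, assuming `Id` is the primary key (so a duplicate insert fails immediately with error 2627):

```sql
CREATE PROCEDURE dbo.UpsertCounter
    @idParam      int,
    @counterParam int
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        -- Try the insert first; the PK on Id makes duplicates fail fast.
        INSERT INTO MyTable (Id, Counter)
        VALUES (@idParam, @counterParam);
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() IN (2627, 2601)   -- PK / unique index violation
            UPDATE MyTable
            SET Counter = Counter + @counterParam
            WHERE Id = @idParam;
        ELSE
            THROW;   -- anything else is a real error
    END CATCH
END
```

Note that the UPDATE is a single atomic statement: `Counter = Counter + @counterParam` reads and writes the row under one exclusive lock, so no explicit transaction or elevated isolation level is needed around it.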
I have 3 stored procedures (simplified, please try to ignore why I'm updating the table twice and why the SP is called twice):
CREATE PROCEDURE SP1 AS
BEGIN
BEGIN TRANSACTION
-- Updated twice
UPDATE Customers SET Name = 'something' OUTPUT INSERTED.* WHERE Id = 1
UPDATE Customers SET Name = 'something'
COMMIT TRANSACTION;
END
GO
CREATE PROCEDURE SP2 AS
BEGIN
BEGIN TRANSACTION
UPDATE Customers SET Name = 'anothername'
COMMIT TRANSACTION;
END
GO
CREATE PROCEDURE SP3 AS
BEGIN
BEGIN TRANSACTION
-- Called twice
EXEC SP2
EXEC SP2
COMMIT TRANSACTION;
END
GO
The problem is that I got a deadlock from SQL Server. It says that SP1 and SP3 are both waiting for the Customers table resource. Does that make sense? Could it be because of the inner transaction in SP2? Or maybe the use of the OUTPUT clause?
The lock is a key lock on the PK of Customers. The requested lock mode of each waiting SP is U, and the owner's is X (held by the other SP, I guess).
A few more details:
1. These are called from the same user multiple times on different processes.
2. The statements are called twice only for the sake of the example.
3. In my actual code, Customers is actually called 'PendingInstructions'. The instructions table is sampled every minute by each listener (each computer, actually).
4. The first update query first gets all the pending instructions and the second one updates the status of the entire table to completed, just to make sure that none are left in pending mode.
5. SP3 calls SP2 twice because it updates 2 proprietary instruction rows; this happens once a day.
Thanks a lot!!
Why are you surprised by this? You have written the textbook case for a deadlock and hit it.
The first update query first gets all the pending instructions and the second one updates the status of the entire table to completed.
Yes, this will deadlock. Two concurrent calls will find different 'pending' instructions (as new 'pending' instructions can be inserted in between). Then they will proceed to attempt to update the entire table and block on each other, deadlock. Here is the timeline:
Table contains customer:1, pending
T1 (running first update of SP1) updates table and modifies customer:1
T2 inserts a new record, customer:2, pending
T3 (running first update of SP1) updates table and modifies customer:2
T1 (running second update of SP1) tries to update all table, is blocked by T3
T3 (running second update of SP1) tries to update all table, is blocked by T1. Deadlock.
I have good news though: the deadlock is the best outcome you can get. A far worse outcome is when your logic misses 'pending' customers (which will happen more often). Simply stated, your SP1 will erroneously mark any new 'pending' customer inserted after the first update as 'processed', when it was actually just skipped. Here is the timeline:
Table contains customer:1, pending
T1 (running first update of SP1) updates table and modifies customer:1
T2 inserts a new record, customer:2, pending
T1 (running second update of SP1) updates the whole table. customer:2 was pending and is reset without actually having been processed (it is not in SP1's result set).
Your business lost an update.
So I suggest going back to the drawing board and designing SP1 properly. For instance, SP1's second statement should only update what the first statement updated. Posting real code, with proper DDL, would go a long way toward getting a useful solution.
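A sketch of that suggestion: capture the keys touched by the first update and restrict the second update to exactly those rows (names follow the simplified example, not the real 'PendingInstructions' schema):

```sql
CREATE PROCEDURE SP1 AS
BEGIN
    DECLARE @touched table (Id int PRIMARY KEY);

    BEGIN TRANSACTION;

    -- First update: remember exactly which rows this call claimed.
    UPDATE Customers
    SET Name = 'something'
    OUTPUT INSERTED.Id INTO @touched
    WHERE Id = 1;

    -- Second update: touch only those same rows, not the whole table,
    -- so rows inserted meanwhile are neither clobbered nor fought over.
    UPDATE c
    SET c.Name = 'something'
    FROM Customers AS c
    JOIN @touched AS t ON t.Id = c.Id;

    COMMIT TRANSACTION;
END
```

Because the second statement now only locks rows the first statement already holds X locks on, two concurrent calls no longer escalate into a whole-table collision, which removes both the deadlock and the lost-update window.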
I have got a stored procedure in SQL Server 2008 and it does quite a fair amount of inserting / deleting / updating operations.
Now I am wondering if there would be any way that I might be able to detect whether or not a stored procedure has completed ALL Inserting / Deleting / Updating operations.
Also, I understand that there can be a return value from a stored procedure, which in this case would be a statusCode (0/1). But through some of my experiments, I found that the statusCode would always get returned immediately once the execution of the stored procedure finished, while in the meantime the inserting / deleting / updating was seemingly still running. So what should I do here to have the statusCode returned only when the inserting / deleting / updating operations have all completed?
Thanks.
Code Structure:
BEGIN
DECLARE @statusCode int
SET @statusCode = 0
-- Loop through all tables in a given database
-- using a cursor
-- do Insert / Update / Delete operations
SET @statusCode = 1
SELECT @statusCode
END
If the stored procedure returns, all operations for that call are complete.
You have not seen operations continuing after a stored procedure finishes unless another connection was making changes too; for one thing, that would break the 'A' (atomicity) in ACID.
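If the goal is simply a status the caller can rely on, note that a RETURN value is only produced after every preceding statement in the procedure has run. A sketch, with the actual DML elided since the original wasn't posted:

```sql
CREATE PROCEDURE dbo.DoBulkWork
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRY
        BEGIN TRANSACTION;
        -- ... insert / update / delete work here ...
        COMMIT TRANSACTION;
        RETURN 1;   -- reached only after all DML above has completed
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        RETURN 0;   -- failure
    END CATCH
END
GO

-- Caller:
DECLARE @statusCode int;
EXEC @statusCode = dbo.DoBulkWork;
SELECT @statusCode;   -- 1 only once everything has committed
```

Wrapping the work in TRY/CATCH with an explicit transaction also guarantees the status code of 1 means "all operations committed", not merely "the procedure reached its last line".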