I have a stored procedure which does a lot of probing of the database to determine if some records should be updated
Each record (Order) has a TIMESTAMP called [RowVersion]
I store the candidate record ids and RowVersions in a table variable called @Ids:
DECLARE @Ids TABLE (id int, [RowVersion] Binary(8))
I get the count of candidates with the following:
DECLARE @FoundCount int
SELECT @FoundCount = COUNT(*) FROM @Ids
Since records may change between when I SELECT and when I eventually try to UPDATE, I need a way to check concurrency and ROLLBACK TRANSACTION if that check fails.
What I have so far:
BEGIN TRANSACTION

-- create new combinable order group
INSERT INTO CombinableOrders DEFAULT VALUES

-- update orders found into new group
UPDATE Orders
SET Orders.CombinableOrder_Id = SCOPE_IDENTITY()
FROM Orders AS Orders
INNER JOIN @Ids AS Ids
    ON Orders.Id = Ids.Id
    AND Orders.[RowVersion] = Ids.[RowVersion]

-- if the rows updated don't match the rows found, there must be a concurrency issue; roll back
IF (@@ROWCOUNT != @FoundCount)
BEGIN
    ROLLBACK TRANSACTION
    SET @Updated = -1
END
ELSE
    COMMIT
From the above, I'm filtering the UPDATE with the stored [RowVersion]; this will skip any records that have since been changed (hopefully).
However, I'm not quite sure if I'm using transactions or optimistic concurrency with regard to TIMESTAMP correctly, or if there are better ways to achieve my desired goals.
It's difficult to understand what logic you are trying to implement.
But, if you absolutely must perform several non-atomic actions in a procedure and make sure that the whole block of code is not executed again while it is running (for example, by another user), consider using sp_getapplock.
Places a lock on an application resource.
Your procedure may look similar to this:
CREATE PROCEDURE [dbo].[YourProcedure]
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarLockResult int;
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'UniqueStringFor_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- perform your complex processing
            -- populate table with IDs
            -- update other tables using IDs
            -- ...
        END;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
When you SELECT the data, try using the HOLDLOCK and UPDLOCK hints while inside an explicit transaction. It's going to hurt the concurrency of OTHER transactions, but not yours.
http://msdn.microsoft.com/en-us/library/ms187373.aspx
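For example, applied to the question's tables, the SELECT that fills @Ids might look like this (a minimal sketch; the WHERE clause is an assumption, since the real probing logic isn't shown):

BEGIN TRANSACTION

-- UPDLOCK takes update locks on the rows read; HOLDLOCK (SERIALIZABLE)
-- keeps them until the transaction ends, so these rows can't change
-- between this SELECT and the later UPDATE.
INSERT INTO @Ids (id, [RowVersion])
SELECT o.Id, o.[RowVersion]
FROM Orders AS o WITH (UPDLOCK, HOLDLOCK)
WHERE o.CombinableOrder_Id IS NULL  -- hypothetical candidate filter

-- ... INSERT INTO CombinableOrders and the UPDATE from the question go here ...

COMMIT TRANSACTION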
I need to use existing global temp tables, or create and fill them if they don't exist.
Example of one table:
BEGIN TRANSACTION
if OBJECT_ID('tempdb..##myTable') IS NULL
begin
    create table ##myTable (
        --stuff
    )
end
COMMIT TRANSACTION

BEGIN TRANSACTION
if (select count(*) from ##myTable) = 0
begin
    insert into ##myTable
    --select stuff
end
COMMIT TRANSACTION
Sometimes it works, and sometimes the error "Table ##myTable already exists" shows up. I am the only one who uses these global temp tables.
This is a classic race condition. Multiple sessions can execute the if OBJECT_ID('tempdb..##myTable') IS NULL query at the same time. If the table doesn't exist, both will attempt to create the table and only one will succeed.
One way to address the issue is with an application lock to serialize the code block among multiple sessions. For example:
SET XACT_ABORT ON; --best practice with explicit T-SQL transactions
BEGIN TRANSACTION;

EXEC sp_getapplock
     @Resource = 'create ##myTable'
    ,@LockMode = 'Exclusive'
    ,@LockOwner = 'Transaction';

if OBJECT_ID('tempdb..##myTable') IS NULL
begin
    create table ##myTable (
        --stuff
    );
end;

COMMIT TRANSACTION; --also releases app lock
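The second block (the count check followed by the insert) has the same race, and the same app lock can serialize it too. A sketch, reusing the same resource name so creation and population serialize with each other (the source query is elided, as in the question):

SET XACT_ABORT ON;
BEGIN TRANSACTION;

EXEC sp_getapplock
     @Resource = 'create ##myTable' --same resource name as above
    ,@LockMode = 'Exclusive'
    ,@LockOwner = 'Transaction';

if (select count(*) from ##myTable) = 0
begin
    insert into ##myTable
    --select stuff
end;

COMMIT TRANSACTION;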
I have a stored procedure that calls child stored procedures to parse JSON that is coming in from the client. The problem is that the JSON parsing mechanism is bottlenecking the application because it works row by row. I had an idea to turn it into a sort of queueing structure, where I could have x jobs running this same parent stored procedure and work through the records more quickly. I created a proof of concept for this, but it's periodically running into deadlocking issues that I am at a loss as to how to resolve. I'd first like to see if the way I'm handling the transactions and the general code is a reasonable method; then I plan to approach my DBAs to see if they can help in actually resolving the deadlocks, because I won't have the access in production to be able to do anything.
NOTE
Please ignore any syntax or spelling issues or the lack of variable declarations. I cut out a lot of unnecessary code to keep this as simple as possible while trying to make sure the overarching strategy was still in place. Please consider this "pseudo-code". I realize that having the actual queries run in the child procs would be valuable in sorting out the cause, but I'm mainly interested in finding out whether the way I'm using transactions in the nested procedures is correct, and whether the transaction isolation level I'm using is right. Also, in the way of solutions, I'm wondering if programmatically re-processing the deadlocks is a valid workaround, or if I should be aiming for a different solution.
So this is the parent procedure:
CREATE PROCEDURE [dbo].[ParentProc]
    @BatchSize int = 5
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON; --http://www.sommarskog.se/error_handling/Part1.html#jumpXACT_ABORT
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; --mandated by DBAs unless I provide good reason not to use it.

    /*
    I opted to use UPDLOCK and READPAST hints in this proc based on doing some research online
    to determine how best to handle a queue processing system, which is essentially what we're doing.
    For reference: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/
    */

    DECLARE @i bigint = 1;
    DECLARE @RowKey bigint;
    DECLARE @msg nvarchar(4000);

    BEGIN TRY
        --this transaction just marks the records with a guid that this instance of the proc will be handling.
        BEGIN TRANSACTION
            DECLARE @ProcessID varchar(36) = (SELECT NEWID());

            ;with cte as (
                SELECT TOP (@BatchSize) ProcessID, UpdateDate
                FROM dbo.MainJSONTable
                WHERE IsProcessed = 0
                    AND ProcessID IS NULL
                    AND ProcessingFailed = 0
                ORDER BY CreateDate ASC
            )
            UPDATE cte WITH (UPDLOCK, READPAST)
            SET ProcessID = @ProcessID,
                UpdateDate = GETUTCDATE();

            SET @BatchSize = (SELECT COUNT(*) FROM dbo.MainJSONTable WHERE IsProcessed = 0 AND ProcessingFailed = 0 AND ProcessID = @ProcessID);
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF (@@TRANCOUNT > 0)
        BEGIN
            ROLLBACK TRANSACTION;
            RETURN;
        END
    END CATCH;

    WHILE @i <= @BatchSize
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION
                SET @RowKey = (SELECT TOP 1 RowKey FROM dbo.MainJSONTable WHERE IsProcessed = 0 AND ProcessingFailed = 0 AND ProcessID = @ProcessID ORDER BY CreateDate ASC);

                IF (@RowKey IS NULL)
                BEGIN
                    RETURN;
                END

                BEGIN TRY
                    EXEC dbo.DoStuffInChildProc @RowKey;
                END TRY
                BEGIN CATCH
                    SET @msg = error_message();
                    RAISERROR (@msg, 16, 1);
                END CATCH

                UPDATE dbo.MainJSONTable
                SET IsProcessed = 1
                WHERE RowKey = @RowKey
                    AND ProcessID = @ProcessID;
            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            IF (@@TRANCOUNT > 0)
            BEGIN
                ROLLBACK TRANSACTION;
            END

            EXEC Logging.SetLogEntry @@PROCID, @msg;

            UPDATE dbo.MainJSONTable
            SET ProcessingFailed = 1
            WHERE RowKey = @RowKey
                AND ProcessID = @ProcessID;
        END CATCH;

        SET @i = @i + 1;
    END
END
And here is an example of the child proc that is called. There are multiple child procs called, but this is how they are all handled:
CREATE PROCEDURE [dbo].[DoStuffInChildProc]
    @RowKey bigint
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON; --http://www.sommarskog.se/error_handling/Part1.html#jumpXACT_ABORT
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; --mandated by DBAs unless I provide good reason not to use it.

    BEGIN TRY
        --before this I get some dynamic sql ready.
        EXEC sp_executesql @FinalSQL;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
        DECLARE @msg NVARCHAR(200) = CONCAT('[dbo].[DoStuffInChildProc] generated an error while processing RowKey ', @RowKey);
        EXEC Logging.SetLogEntry @@PROCID, @msg, @FinalSQL;
        RAISERROR(@msg, 16, 1);
        RETURN 1;
    END CATCH;
END
The error in my log is for the parent procedure:
Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
UPDATE
This is the table create statement, and I just realized there currently aren't any indexes on the table other than the clustered primary key. All of the child procs select the Result field, which contains the JSON to be parsed. Should I be committing the child transactions as soon as they're finished, or just bubbling up to the parent proc?
CREATE TABLE dbo.MainJSONTable (
    TableID bigint IDENTITY(1,1) NOT NULL,
    ParentTableFK bigint NULL,
    Result nvarchar(MAX),
    IsProcessed bit NOT NULL,
    ProcessingFailed bit NOT NULL,
    ProcessID varchar(36) NULL
)
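For what it's worth, the queue-pick queries above filter on IsProcessed, ProcessingFailed and ProcessID and order by CreateDate, so a narrow filtered index is the kind of thing worth raising with the DBAs. A hypothetical sketch, not from the original post (the index name and column choices are assumptions):

-- Hypothetical supporting index so queue picks can seek instead of scanning
-- the clustered index (which drags in the wide JSON Result column).
CREATE NONCLUSTERED INDEX IX_MainJSONTable_Queue
ON dbo.MainJSONTable (CreateDate)
INCLUDE (ProcessID)
WHERE IsProcessed = 0 AND ProcessingFailed = 0;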
I've got a project that is trying to apply DDD (Domain Driven Design). Currently, we've got something like this:
begin tran
try
    _manager.CreateNewEmployee(newEmployeeCmd);
    tran.Commit();
catch
    rollback tran
Internally, the CreateNewEmployee method uses a domain service for checking if there's already an employee with the memberId. Here's some pseudo code:
void CreateNewEmployee(NewEmployeeCmd cmd)
    if(_duplicateMember.AlreadyRegistered(cmd.MemberId))
        throw duplicate
    // extra stuff
    saveNewEmployee()
end
Now, in the end, it's as if we have the following SQL instructions executed (pseudo code again):
begin sql tran
    select count(*) from table where memberId = @memberId and status = 1 -- active
    --some time goes by
    insert into table ...
end
Now, when I started looking at the code, I noticed that it was using the default SQL Server locking level. In practice, that means that something like this could happen:
--thread 1
(1) select ... --assume it returns 0
--thread 2
(2) select ... --nothing found
(3) insert recordA
--thread 1
(4) insert recordA --same as before
(5) commit tran
--thread 2
(6) commit tran
So, we could end up having repeated records. I've tried playing with the transaction isolation levels, but the only way I've managed to make it work as intended was by changing the SELECT that checks if there's already a record in the table. I ended up using a table lock hint, which instructs SQL Server to maintain the lock until the end of the transaction; that was the only way I managed to get a lock when the SELECT starts (changing the other isolation levels still wouldn't do what I needed, since they all allowed the select to run).
So the table lock is held from the beginning until the end of the transaction. In practice, that means that step (2) will block until thread 1 ends its job.
Is there a better option for this kind of scenario (one that doesn't depend on using, say, indexes)?
Thanks.
Luis
You need to get the proper locks on the initial SELECT, which you can do with the locking hints WITH (UPDLOCK, SERIALIZABLE). Once you do that, thread 2 will wait for thread 1 to finish if thread 2 is using the same key range in its WHERE clause.
You could use the Sam Saffron upsert approach.
For example:
create procedure dbo.Employee_getset_byName (@Name nvarchar(50), @MemberId int output) as
begin
    set nocount, xact_abort on;
    begin tran;

    select @MemberId = Id
    from dbo.Employee with (updlock, serializable) /* hold key range for @Name */
    where Name = @Name;

    if @@rowcount = 0 /* if we still do not have an Id for @Name */
    begin;
        /* keep one of the following two blocks, depending on how Id is generated */

        /* for a sequence */
        set @MemberId = next value for dbo.IdSequence; /* get next sequence value */
        insert into dbo.Employee (Name, Id)
        values (@Name, @MemberId);

        /* for identity */
        insert into dbo.Employee (Name)
        values (@Name);
        set @MemberId = scope_identity();
    end;

    commit tran;
end;
go
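Calling it might look like this (a usage sketch; the name is a made-up input):

declare @id int;

exec dbo.Employee_getset_byName
     @Name = N'Jane Doe'      -- hypothetical input
    ,@MemberId = @id output;

select @id as MemberId;       -- the existing or newly created Id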
I've got a trigger:
CREATE TRIGGER tgr_incheck_vlucht
ON PassagierVoorVlucht
AFTER INSERT, UPDATE
AS
BEGIN
    IF @@ROWCOUNT = 0 BEGIN RETURN END
    SET NOCOUNT ON

    BEGIN TRY
        IF EXISTS
            (SELECT *
             FROM inserted I
             WHERE EXISTS (SELECT *
                           FROM PassagierVoorVlucht P
                           INNER JOIN Vlucht V ON P.vluchtnummer = V.vluchtnummer
                           WHERE I.inchecktijdstip >= vertrektijdstip))
        BEGIN
            RAISERROR('The check-in time must be before the arrival time', 16, 1)
        END
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 BEGIN ROLLBACK TRANSACTION END
        DECLARE @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE(),
                @errorSeverity INT = ERROR_SEVERITY(),
                @errorState INT = ERROR_STATE()
        RAISERROR (@ErrorMessage, @errorSeverity, @errorState)
    END CATCH
END
Now I have written some test statements:
INSERT INTO PassagierVoorVlucht
VALUES(850, 5316, 1, '2002-01-01 13:37:00.000', 21),
(1002, 5316, 1, '2004-01-01 13:37:00.000', 21),
(1601, 5316, 1, '2004-05-01 13:37:00.000', 21),
(1602, 5316, 1, '2004-05-01 13:37:00.000', 21)
The trigger works for only ONE inserted row at a time, not for the whole block. How can I write the trigger so that it can handle multiple inserts?
The answer is you cannot... Fundamentally, triggers are fired for each insertion, not a batch of code. This is why it is called a trigger, or a response.
Furthermore, if your table has an identity column, triggers will not prevent gaps in the values, since identity values work just like sequences: allocated at the row level, with the trigger firing only after the statement completes (unless INSTEAD OF).
Perhaps helpful in explaining triggers is the following from MSDN:
CREATE TRIGGER (Transact-SQL)
Although a TRUNCATE TABLE statement is in effect a DELETE statement, it does not activate a trigger because the operation does not log individual row deletions. However, only those users with permissions to execute a TRUNCATE TABLE statement need be concerned about inadvertently circumventing a DELETE trigger this way.
Notice the part about individual row deletions, as well as the point about scope. Perhaps a clearer picture of your requirements would make the solution obvious.
Now, you may use an INSTEAD OF trigger to fire before the attempted code executes, but that likely is not going to be a solution unless you have some staging table setup or similar for it to work with; an INSTEAD OF trigger replaces the entire DML or DDL statement.
If you wish to protect the integrity of your data, consider the use of stored or pre-planned procedures.
If that is unacceptable or impractical, other constraints such as referential keys, or staging tables whose contents can later be imported (depending on how timely the data must be), could be options.
To do this, you can store all the inserted data in a temporary table inside your trigger and then loop over that table with a WHILE loop.
Here is an example where we want to add all the inserted users to an audit table, using a trigger for insert:
alter trigger tr_userData_forInsert
on userData
for insert
AS
BEGIN
    declare @id int, @name varchar(20)

    select *
    into #inserted_data
    from inserted

    while (exists (select id from #inserted_data))
    begin
        select top 1 @id = id, @name = full_name
        from #inserted_data

        insert into loggs
        values (concat('a user with ', @id, ' and name of ', @name,
                       ' is added at ', getdate()))

        delete from #inserted_data
        where id = @id
    end
end
I have already used this in more than one situation and it works perfectly.
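For reference, the same audit insert can usually be written set-based, with no temp table or loop; a sketch against the same userData/loggs tables as above:

alter trigger tr_userData_forInsert
on userData
for insert
as
begin
    set nocount on;

    -- one INSERT handles every row in the inserted pseudo-table at once
    insert into loggs
    select concat('a user with ', id, ' and name of ', full_name,
                  ' is added at ', getdate())
    from inserted;
end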
I am working on a system that uses multiple threads to read, process and then update database records. Threads run in parallel and try to pick records by calling a SQL Server stored procedure.
They call this stored procedure multiple times per second, looking for unprocessed records, and sometimes pick the same record up.
I try to prevent this from happening this way:
UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO #ExportedIDs
WHERE ID IN (SELECT TOP(@ArraySize) GD.ID
             FROM dbo.GameData GD
             WHERE GD.Exported IS NULL
             ORDER BY GD.ID ASC)
The idea here is to mark a record as exported first, using an UPDATE with OUTPUT (remembering the record id), so no other thread can pick it up again. Once a record is marked as exported, I can do some extra calculations and pass the data to the external system, trusting that no other thread will pick up the same record in the meantime, since the UPDATE is meant to secure the record first.
Unfortunately, it doesn't seem to be working, and the application sometimes picks the same record twice anyway.
How can I prevent this?
Kind regards
Mariusz
I think you should be able to do this atomically using a common table expression. (I'm not 100% certain about this, and I haven't tested, so you'll need to verify that it works for you in your situation.)
;WITH cte AS
(
    SELECT TOP(@ArrayCount)
        ID, Exported, ExportExpires, ExportSession
    FROM dbo.GameData WITH (READPAST)
    WHERE Exported IS NULL
    ORDER BY ID
)
UPDATE cte
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT INSERTED.ID INTO #ExportedIDs
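Note that the OUTPUT target must already exist when this runs; a minimal setup sketch (int for the ID column is an assumption about dbo.GameData):

-- #ExportedIDs must be created before the UPDATE ... OUTPUT statement
CREATE TABLE #ExportedIDs (ID int NOT NULL PRIMARY KEY);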
I have a similar setup and I use sp_getapplock. My application runs many threads, and they call a stored procedure to get the ID of the element that has to be processed. sp_getapplock guarantees that the same ID will not be chosen by two different threads.
I have a MyTable with a list of IDs that my application checks in an infinite loop using many threads. For each ID there are two datetime columns: LastCheckStarted and LastCheckCompleted. They are used to determine which ID to pick. The stored procedure picks the ID that wasn't checked for the longest period. There is also a hard-coded period of 20 minutes - the same ID can't be checked more often than every 20 minutes.
CREATE PROCEDURE [dbo].[GetNextIDToCheck]
-- Add the parameters for the stored procedure here
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarID int = NULL;
        DECLARE @VarLockResult int;
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'SomeUniqueName_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Find ID that wasn't checked for the longest period
            SELECT TOP 1
                @VarID = ID
            FROM
                dbo.MyTable
            WHERE
                LastCheckStarted <= LastCheckCompleted
                -- this ID is not being checked right now
                AND LastCheckCompleted < DATEADD(minute, -20, GETDATE())
                -- last check was done more than 20 minutes ago
            ORDER BY LastCheckCompleted;

            -- Start checking
            UPDATE dbo.MyTable
            SET LastCheckStarted = GETDATE()
            WHERE ID = @VarID;
            -- There is no need to explicitly verify if we found anything.
            -- If @VarID is null, no rows will be updated
        END;

        -- Return found ID, or no rows if nothing was found,
        -- or failed to acquire the lock
        SELECT
            @VarID AS ID
        WHERE
            @VarID IS NOT NULL;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
The second procedure is called by an application when it finishes checking the found ID.
CREATE PROCEDURE [dbo].[SetCheckComplete]
-- Add the parameters for the stored procedure here
@ParamID int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarLockResult int;
        EXEC @VarLockResult = sp_getapplock
            @Resource = 'SomeUniqueName_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Completed checking the given ID
            UPDATE dbo.MyTable
            SET LastCheckCompleted = GETDATE()
            WHERE ID = @ParamID;
        END;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
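From the calling side, the pair might be used like this (a sketch; INSERT ... EXEC captures the procedure's single-row result set):

DECLARE @Found TABLE (ID int);
DECLARE @ID int;

-- pick the next ID to check (returns no rows if nothing qualifies)
INSERT INTO @Found (ID)
EXEC dbo.GetNextIDToCheck;

SELECT @ID = ID FROM @Found;

IF @ID IS NOT NULL
BEGIN
    -- ... perform the actual check in the application here ...
    EXEC dbo.SetCheckComplete @ParamID = @ID;
END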
The original UPDATE does not work because multiple transactions might first execute the IN clause and find the same set of rows, then update the same rows multiple times and overwrite each other.
LukeH's answer is best, accept it.
You can also fix it by adding AND Exported IS NULL to cancel double updates.
Or, make this SERIALIZABLE. This will lead to some blocking and deadlocking. This can safely be handled by timeouts and retry in case of deadlock. SERIALIZABLE is always safe for all workloads but it might block/deadlock more often.
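A sketch of the question's UPDATE with that extra guard (same names as in the question):

UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO #ExportedIDs
WHERE Exported IS NULL -- re-checked at update time, so a second session skips rows another session already took
  AND ID IN (SELECT TOP(@ArraySize) GD.ID
             FROM dbo.GameData GD
             WHERE GD.Exported IS NULL
             ORDER BY GD.ID ASC)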