I have a stored procedure that calls child stored procedures to parse JSON coming in from the client. The problem is that the JSON parsing mechanism is bottlenecking the application because it works row by row. My idea was to turn it into a sort of queueing structure, where x jobs run this same parent stored procedure in parallel and work through the records more quickly. I created a proof of concept for this, but it periodically runs into deadlocking issues that I'm at a loss as to how to resolve. I'd first like to see whether the way I'm handling the transactions, and the code in general, is a reasonable approach; then I plan to approach my DBAs to see if they can help actually resolve the deadlocks, because I won't have the access in production to do anything myself.
NOTE
Please ignore any syntax or spelling issues, or variables that aren't declared. I cut out a lot of unnecessary code to keep this as simple as possible while making sure the overarching strategy is still in place, so please consider this "pseudo-code". I realize that having the actual queries run in the child procs would be valuable in sorting out the cause, but I'm mainly interested in whether the way I'm using transactions in the nested procedures is correct, and whether the transaction isolation level I'm using is right. Also, in the way of solutions, I'm wondering if programmatically re-processing the deadlocks is a valid workaround, or if I should be shooting for a different solution.
So this is the parent procedure:
CREATE PROCEDURE [dbo].[ParentProc]
@BatchSize int = 5
AS
BEGIN
SET NOCOUNT ON;
SET XACT_ABORT ON; --http://www.sommarskog.se/error_handling/Part1.html#jumpXACT_ABORT
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; --mandated by DBAs unless I provide good reason not to use it.
/*
I opted to use UPDLOCK and READPAST hints in this proc based on doing some research online to determine how best to handle a queue processing system, which is essentially what we're doing.
For reference: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/
*/
BEGIN TRY
--this transaction just marks the records with a guid that this instance of the proc will be handling.
BEGIN TRANSACTION
DECLARE @ProcessID varchar(36) = NEWID();
;with cte as (
    SELECT TOP (@BatchSize) ProcessID, UpdateDate
    FROM dbo.MainJSONTable WITH (UPDLOCK, READPAST) --table hints go on the table reference inside the CTE, not on the CTE name in the UPDATE
    WHERE IsProcessed = 0
    AND ProcessID IS NULL
    AND ProcessingFailed = 0
    ORDER BY CreateDate ASC
)
UPDATE cte
SET ProcessID = @ProcessID,
    UpdateDate = GETUTCDATE();
DECLARE @i bigint = 1;
SET @BatchSize = (SELECT COUNT(*) FROM dbo.MainJSONTable WHERE IsProcessed = 0 AND ProcessingFailed = 0 AND ProcessID = @ProcessID);
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
IF (@@TRANCOUNT > 0)
BEGIN
ROLLBACK TRANSACTION;
RETURN;
END
END CATCH;
WHILE @i <= @BatchSize
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION
        SET @RowKey = (SELECT TOP 1 RowKey FROM dbo.MainJSONTable WHERE IsProcessed = 0 AND ProcessingFailed = 0 AND ProcessID = @ProcessID ORDER BY CreateDate ASC);
        IF (@RowKey IS NULL)
        BEGIN
            COMMIT TRANSACTION; --don't RETURN with the transaction still open
            RETURN;
        END
        BEGIN TRY
            EXEC dbo.DoStuffInChildProc @RowKey;
        END TRY
        BEGIN CATCH
            SET @msg = ERROR_MESSAGE();
            RAISERROR (@msg, 16, 1);
        END CATCH
        UPDATE dbo.MainJSONTable
        SET IsProcessed = 1
        WHERE RowKey = @RowKey
        AND ProcessID = @ProcessID;
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF (@@TRANCOUNT > 0)
        BEGIN
            ROLLBACK TRANSACTION;
        END
        EXEC Logging.SetLogEntry @@PROCID, @msg;
        UPDATE dbo.MainJSONTable
        SET ProcessingFailed = 1
        WHERE RowKey = @RowKey
        AND ProcessID = @ProcessID;
    END CATCH;
    SET @i = @i + 1;
END
END
And here is an example of the child proc that is called. There are multiple child procs called, but this is how they are all handled:
CREATE PROCEDURE [dbo].[DoStuffInChildProc]
@RowKey bigint
AS
BEGIN
SET NOCOUNT ON;
SET XACT_ABORT ON; --http://www.sommarskog.se/error_handling/Part1.html#jumpXACT_ABORT
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; --mandated by DBAs unless I provide good reason not to use it.
BEGIN TRY
    --before this I get some dynamic SQL ready.
    EXEC sp_executesql @FinalSQL;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
    DECLARE @msg NVARCHAR(200) = CONCAT('[dbo].[DoStuffInChildProc] generated an error while processing RowKey ', @RowKey);
    EXEC Logging.SetLogEntry @@PROCID, @msg, @FinalSQL;
    RAISERROR(@msg, 16, 1);
    RETURN 1;
END CATCH;
END
The error in my log is for the parent procedure:
Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
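Since the message itself says to rerun the transaction, one workaround I'm considering is a retry wrapper around the parent proc. A minimal sketch (the retry count and delay are placeholders, and this assumes the deadlock error bubbles up to the caller rather than being swallowed by the proc's own CATCH blocks):
DECLARE @Retry int = 0;
WHILE @Retry < 3
BEGIN
    BEGIN TRY
        EXEC dbo.ParentProc @BatchSize = 5;
        BREAK; --success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() = 1205 --deadlock victim: safe to rerun
        BEGIN
            SET @Retry = @Retry + 1;
            WAITFOR DELAY '00:00:01'; --brief back-off before retrying
        END
        ELSE
            THROW; --anything else is a real error
    END CATCH
END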
UPDATE
This is the table create statement, and I just realized there currently aren't any indexes on the table other than the clustered index on the primary key. All of the child procs select the Result field, which contains the JSON to be parsed. Should I be committing the child transactions as soon as they're finished, or just bubbling up to the parent proc?
CREATE TABLE dbo.MainJSONTable (
    TableID bigint IDENTITY(1,1) NOT NULL, --clustered primary key
    ParentTableFK bigint NULL,
    Result nvarchar(MAX),
    IsProcessed bit NOT NULL,
    ProcessingFailed bit NOT NULL,
    ProcessID varchar(36) NULL
    --RowKey, CreateDate, and UpdateDate are referenced by the procs but were trimmed from this definition
);
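If indexing turns out to be part of the answer, something like this filtered index is what I have in mind (the name and shape are guesses; it assumes the CreateDate and ProcessID columns the procs reference but this trimmed DDL omits):
CREATE NONCLUSTERED INDEX IX_MainJSONTable_UnprocessedQueue
ON dbo.MainJSONTable (CreateDate)
INCLUDE (ProcessID)
WHERE IsProcessed = 0 AND ProcessingFailed = 0;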
Related
I need to use existing global temp tables, or create and fill them if they don't exist.
Example of one table
BEGIN TRANSACTION
if OBJECT_ID('tempdb..##myTable') IS NULL
begin
create table ##myTable (
--stuff
)
end
COMMIT TRANSACTION
BEGIN TRANSACTION
if (select count(*) from ##myTable) = 0
begin
insert into ##myTable
--select stuff
end
COMMIT TRANSACTION
Sometimes it works, and sometimes the error "Table ##myTable already exists" shows up. I am the only one who uses these global temp tables.
This is a classic race condition. Multiple sessions can execute the if OBJECT_ID('tempdb..##myTable') IS NULL query at the same time. If the table doesn't exist, both will attempt to create the table and only one will succeed.
One way to address the issue is with an application lock to serialize the code block among multiple sessions. For example:
SET XACT_ABORT ON; --best practice with explicit T-SQL transactions
BEGIN TRANSACTION;
EXEC sp_getapplock
    @Resource = 'create ##myTable'
    ,@LockMode = 'Exclusive'
    ,@LockOwner = 'Transaction';
if OBJECT_ID('tempdb..##myTable') IS NULL
begin
create table ##myTable (
--select stuff
);
end;
COMMIT TRANSACTION; --also releases app lock
I use the script below to insert order transactions manually. The script processes one order at a time (via the @orderId variable). I have a list of 200 orders; is there a way I can process all of them using a single script?
DECLARE @return_value int, @exceptionId bigint, @createDate datetime
EXEC @return_value = [dbo].[uspInsertException]
    @exceptionTypeCode = N'CreateCustomerAccount',
    @exceptionSource = N'SOPS',
    @exceptionCode = N'PUSH2EQ',
    @exceptionDescription = N'CreateCustomerAccount exception MANUALLY pushed to EQ',
    @request = N'',
    @response = N'',
    @orderId = 227614128,
    @sourceSystem = N'OMS',
    @exceptionStatusCode = N'Open',
    @actorId = 1,
    @exceptionSubTypeCode = NULL,
    @exceptionId = @exceptionId OUTPUT,
    @createDate = @createDate OUTPUT
SELECT @exceptionId as N'@exceptionId', @createDate as N'@createDate'
SELECT 'Return Value' = @return_value
Absolutely it can be done. The best way I have found is building nested classes in your application and then passing them to SQL, where you can shred the data with OPENXML or XPath, depending on the size of the XML.
Depending on your needs you can also use a web service that holds the classes and the code to connect to the database. The application references the classes in the web service and passes the data, in the class-hierarchy format, to the web service, which then parses it and sends it as a full block of XML to the database, where a stored procedure shreds and inserts it. If you use this method, make sure your C# classes are serializable.
You can easily retrieve data from the database as part of the stored proc by using FOR XML, and I would recommend wrapping the work in a transaction so that you don't end up with half of a file inserted when an error occurs.
If you need some code samples, provide a better description of how you are passing your data to the database.
CREATE PROCEDURE [dbo].[sp_InsertExceptions]
-- Add the parameters for the stored procedure here
@pXML XML
AS
BEGIN
SET XACT_ABORT ON;
BEGIN TRY
BEGIN TRANSACTION;
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @XML AS XML, @hDoc AS INT
print convert(varchar(max), @pXML)
EXEC sp_xml_preparedocument @hDoc OUTPUT, @pXML --the original referenced a #pRI variable here; the procedure's parameter is @pXML
-- {Put your shredding code here}
EXEC sp_xml_removedocument @hDoc
COMMIT TRANSACTION;
EXECUTE sp_GetErrors --This stored procedure is used to retrieve data
--previously inserted
END TRY
BEGIN CATCH
-- Execute error retrieval routine.
EXECUTE usp_GetErrorInfo; --This stored procedure gets your error
--information and can also store it in the
--database to track errors
-- Test XACT_STATE:
-- If 1, the transaction is committable.
-- If -1, the transaction is uncommittable and should
-- be rolled back.
-- XACT_STATE = 0 means that there is no transaction and
-- a commit or rollback operation would generate an error.
-- Test whether the transaction is uncommittable.
IF (XACT_STATE()) = -1
BEGIN
PRINT
N'The transaction is in an uncommittable state.' +
'Rolling back transaction.'
ROLLBACK TRANSACTION;
END;
-- Test whether the transaction is committable.
IF (XACT_STATE()) = 1
BEGIN
PRINT
N'The transaction is committable.' +
'Committing transaction.'
COMMIT TRANSACTION;
END;
END CATCH;
END
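For illustration, a hypothetical caller could then collect all 200 order IDs into a single XML document and invoke the procedure once. The element names here are made up; match them to whatever your shredding code expects:
DECLARE @orders XML = N'<Orders>
    <Order id="227614128" />
    <Order id="227614129" />
    <!-- ...remaining orders... -->
</Orders>';

EXEC dbo.sp_InsertExceptions @pXML = @orders;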
I am working on a system that uses multiple threads to read, process, and then update database records. The threads run in parallel and try to pick records by calling a SQL Server stored procedure.
They call this stored procedure looking for unprocessed records multiple times per second, and sometimes pick up the same record.
I try to prevent this happening this way:
UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO #ExportedIDs
WHERE ID IN ( SELECT TOP(@ArraySize) GD.ID
              FROM dbo.GameData GD
              WHERE GD.Exported IS NULL
              ORDER BY GD.ID ASC)
The idea here is to mark a record as exported first, using an UPDATE with OUTPUT (remembering the record id), so no other thread can pick it up again. Once the record is marked as exported, I can do some extra calculations and pass the data to the external system, hoping that no other thread picks up the same record in the meantime, since the UPDATE is meant to secure the record first.
Unfortunately it doesn't seem to be working, and the application sometimes picks up the same record twice anyway.
How to prevent it?
Kind regards
Mariusz
I think you should be able to do this atomically using a common table expression. (I'm not 100% certain about this, and I haven't tested, so you'll need to verify that it works for you in your situation.)
;WITH cte AS
(
    SELECT TOP(@ArrayCount)
        ID, Exported, ExportExpires, ExportSession
    FROM dbo.GameData WITH (READPAST)
    WHERE Exported IS NULL
    ORDER BY ID
)
UPDATE cte
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT INSERTED.ID INTO #ExportedIDs
I have a similar set up and I use sp_getapplock. My application runs many threads and they call a stored procedure to get the ID of the element that has to be processed. sp_getapplock guarantees that the same ID would not be chosen by two different threads.
I have a MyTable with a list of IDs that my application checks in an infinite loop using many threads. For each ID there are two datetime columns: LastCheckStarted and LastCheckCompleted. They are used to determine which ID to pick. Stored procedure picks an ID that wasn't checked for the longest period. There is also a hard-coded period of 20 minutes - the same ID can't be checked more often than every 20 minutes.
CREATE PROCEDURE [dbo].[GetNextIDToCheck]
-- Add the parameters for the stored procedure here
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarID int = NULL;
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
    @Resource = 'SomeUniqueName_app_lock',
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 60000,
    @DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
    -- Acquired the lock
    -- Find ID that wasn't checked for the longest period
    SELECT TOP 1
        @VarID = ID
    FROM
        dbo.MyTable
    WHERE
        LastCheckStarted <= LastCheckCompleted
        -- this ID is not being checked right now
        AND LastCheckCompleted < DATEADD(minute, -20, GETDATE())
        -- last check was done more than 20 minutes ago
    ORDER BY LastCheckCompleted;
    -- Start checking
    UPDATE dbo.MyTable
    SET LastCheckStarted = GETDATE()
    WHERE ID = @VarID;
    -- There is no need to explicitly verify if we found anything.
    -- If @VarID is null, no rows will be updated
END;
-- Return found ID, or no rows if nothing was found,
-- or failed to acquire the lock
SELECT
    @VarID AS ID
WHERE
    @VarID IS NOT NULL
;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
The second procedure is called by an application when it finishes checking the found ID.
CREATE PROCEDURE [dbo].[SetCheckComplete]
-- Add the parameters for the stored procedure here
@ParamID int
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
    @Resource = 'SomeUniqueName_app_lock',
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 60000,
    @DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
    -- Acquired the lock
    -- Completed checking the given ID
    UPDATE dbo.MyTable
    SET LastCheckCompleted = GETDATE()
    WHERE ID = @ParamID;
END;
END;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
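For completeness, the calling sequence a worker follows might look like this from T-SQL (a sketch; in the real system the actual check happens in application code between the two calls):
DECLARE @Work TABLE (ID int);

-- Claim the next ID; returns an empty result set if nothing is due
INSERT INTO @Work EXEC dbo.GetNextIDToCheck;

DECLARE @ID int = (SELECT TOP (1) ID FROM @Work);
IF @ID IS NOT NULL
BEGIN
    -- ...do the actual check here...
    EXEC dbo.SetCheckComplete @ParamID = @ID;
END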
It does not work because multiple transactions might first execute the IN clause and find the same set of rows, then update multiple times and overwrite each other.
LukeH's answer is best, accept it.
You can also fix it by adding AND Exported IS NULL to cancel double updates.
Or, make this SERIALIZABLE. This will lead to some blocking and deadlocking. This can safely be handled by timeouts and retry in case of deadlock. SERIALIZABLE is always safe for all workloads but it might block/deadlock more often.
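Applied to the statement from the question, the extra predicate is just a re-check on the outer UPDATE (a sketch):
UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO #ExportedIDs
WHERE ID IN ( SELECT TOP(@ArraySize) GD.ID
              FROM dbo.GameData GD
              WHERE GD.Exported IS NULL
              ORDER BY GD.ID ASC)
AND Exported IS NULL; --re-checked under the update's own locks, so two sessions can't both claim the row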
I have written a procedure like the one below:
ALTER PROCEDURE [dbo].[CountrySave]
(
@CountryId uniqueidentifier,
@CountryName nvarchar(max)
)
AS
begin tran
if exists (select * from Country where CountryID = @CountryId)
begin
    update Country set
        CountryID = @CountryId,
        CountryName = @CountryName
    where CountryID = @CountryId
end
else
begin
    insert INTO Country(CountryID, CountryName) values
    (NewID(), @CountryName)
end
end
It throws "Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements. Previous count = 0, current count = 1.
A transaction that was started in a MARS batch is still active at the end of the batch. The transaction is rolled back." error message when executed!!!
Please Help...
Add COMMIT TRAN
ALTER PROCEDURE [dbo].[CountrySave]
@CountryId uniqueidentifier,
@CountryName nvarchar(max)
AS
BEGIN
BEGIN TRY
BEGIN TRAN
if exists (select * from Country where CountryID = @CountryId)
begin
    update Country
    set CountryID = @CountryId,
        CountryName = @CountryName
    where CountryID = @CountryId;
end
end
else
begin
insert INTO Country(CountryID, CountryName)
values(NewID(),#CountryName)
end
COMMIT TRAN
END TRY
BEGIN CATCH
/* Error occured log it */
ROLLBACK
END CATCH
END
The error message is fairly clear. When you open (begin) a transaction, you will need to do something at the end of it as well.
So either you ROLLBACK the transaction (in case one of the statements within the transaction fails), or you COMMIT the transaction in order to actually implement all changes your statements made.
From MSDN:
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
More information: https://msdn.microsoft.com/en-us/library/ms188929.aspx
Your problem is that you begin a transaction but never commit it or roll it back.
Try this structure for your procedure, worked very well for me in the past:
CREATE PROCEDURE [dbo].SomeProc
(@Parameter INT)
AS
BEGIN
--if you want to be to only active transaction then uncomment this:
--IF @@TRANCOUNT > 0
--BEGIN
-- RAISERROR('Other Transactions are active at the moment - Please try again later',16,1)
--END
BEGIN TRANSACTION
BEGIN TRY
/*
DO SOMETHING
*/
COMMIT TRANSACTION
END TRY
BEGIN CATCH
--Custom Error could be raised here
--RAISERROR('Something bad happened when doing something',16,1)
ROLLBACK TRANSACTION
END CATCH
END
I have a stored procedure which does a lot of probing of the database to determine if some records should be updated
Each record (Order) has a TIMESTAMP called [RowVersion]
I store the candidate record ids and RowVersions in a table variable called @Ids
DECLARE @Ids TABLE (id int, [RowVersion] binary(8))
I get the count of candidates with the following
DECLARE @FoundCount int
SELECT @FoundCount = COUNT(*) FROM @Ids
Since records may change between when I SELECT and when I eventually try to UPDATE, I need a way to check concurrency and ROLLBACK TRANSACTION if that check fails.
What i have so far
BEGIN TRANSACTION
-- create new combinable order group
INSERT INTO CombinableOrders DEFAULT VALUES
-- update orders found into new group
UPDATE Orders
SET Orders.CombinableOrder_Id = SCOPE_IDENTITY()
FROM Orders AS Orders
INNER JOIN #Ids AS Ids
ON Orders.Id = Ids.Id
AND Orders.[RowVersion] = Ids.[RowVersion]
-- if the number of rows updated doesn't match the number of rows found, there must be a concurrency issue, so roll back
IF (@@ROWCOUNT != @FoundCount)
BEGIN
    ROLLBACK TRANSACTION
    SET @Updated = -1
END
END
ELSE
COMMIT
From the above, I'm filtering the UPDATE with the stored [RowVersion]; this will skip any records that have since been changed (hopefully).
However, I'm not quite sure whether I'm using transactions or optimistic concurrency with regard to TIMESTAMP correctly, or if there are better ways to achieve my desired goals.
It's difficult to understand what logic you are trying to implement.
But, if you absolutely must perform several non-atomic actions in a procedure and make sure that the whole block of code is not executed again while it is running (for example, by another user), consider using sp_getapplock.
Places a lock on an application resource.
Your procedure may look similar to this:
CREATE PROCEDURE [dbo].[YourProcedure]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
    @Resource = 'UniqueStringFor_app_lock',
    @LockMode = 'Exclusive',
    @LockOwner = 'Transaction',
    @LockTimeout = 60000,
    @DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
-- Acquired the lock
-- perform your complex processing
-- populate table with IDs
-- update other tables using IDs
-- ...
END;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
When you SELECT the data, try using HOLDLOCK and UPDLOCK while inside of an explicit transaction. It's going to mess with the concurrency of OTHER transactions but not yours.
http://msdn.microsoft.com/en-us/library/ms187373.aspx
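A minimal sketch of that suggestion, reusing the @Ids table variable from the question (the candidate criteria in the WHERE clause are an assumption for this sketch, as they were elided in the original):
BEGIN TRANSACTION;

-- UPDLOCK takes update locks on the candidate rows as they are read;
-- HOLDLOCK keeps those locks (serializable semantics) until COMMIT,
-- so the rows can't change between the SELECT and the UPDATE.
INSERT INTO @Ids (id, [RowVersion])
SELECT Id, [RowVersion]
FROM Orders WITH (UPDLOCK, HOLDLOCK)
WHERE CombinableOrder_Id IS NULL; --candidate criteria: an assumption for this sketch

-- ...the INSERT INTO CombinableOrders and the UPDATE from the question go here...

COMMIT TRANSACTION;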