execute stored procedure atomically - sql-server

I have created a "queue" of sorts in SQL, and I want to be able to set an item as invisible to semi-simulate an Azure-like queue (instead of deleting it immediately; in the event the worker fails to process it, it will appear automatically in the queue again for another worker to fetch).
As per the recommendation from this SO question: Is T-SQL Stored Procedure Execution 'atomic'?
I wrapped Begin Tran and Commit around my spDeQueue procedure, but I'm still running into duplicate pulls from my test agents. (They are all trying to empty a queue of 10 simultaneously and I'm getting duplicate reads, which I shouldn't.)
This is my sproc
ALTER PROCEDURE [dbo].[spDeQueue]
    @invisibleDuration int = 30,
    @priority int = null
AS
BEGIN
    begin tran

    declare @now datetime = GETDATE()

    -- get the queue item
    declare @id int =
    (
        select top 1
            [Id]
        from
            [Queue]
        where
            ([InvisibleUntil] is NULL or [InvisibleUntil] <= @now)
            and (@priority is NULL or [Priority] = @priority)
        order by
            [Priority],
            [Created]
    )

    -- set the invisible and viewed count
    update
        [Queue]
    set
        [InvisibleUntil] = DATEADD(second, @invisibleDuration, @now),
        [Viewed] = [Viewed] + 1
    where
        [Id] = @id

    -- fetch the entire item
    select
        *
    from
        [Queue]
    where
        [Id] = @id

    commit

    return 0
END
What should I do to ensure this acts atomically and prevents duplicate dequeues?
Thanks

Your transaction (i.e. the statements between 'begin tran' and 'commit') is atomic in the sense that either all the statements will be committed to the database, or none of them.
It appears you have transactions mixed up with synchronization / mutually exclusive execution.
Have a read into transaction isolation levels which should help enforce sequential execution - repeatable read might do the trick. http://en.wikipedia.org/wiki/Isolation_(database_systems)
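For what it's worth, a common alternative (not what the answer above suggests) is to make the dequeue itself a single atomic statement: an UPDATE through a CTE with READPAST/UPDLOCK hints and an OUTPUT clause, so each worker claims and returns a row in one step. A rough sketch, assuming the [Queue] columns and the @invisibleDuration/@priority parameters from the question:
-- Sketch only: assumes the same [Queue] table and procedure parameters as above.
DECLARE @now datetime = GETDATE();

;WITH nextItem AS
(
    SELECT TOP 1 *
    FROM [Queue] WITH (ROWLOCK, UPDLOCK, READPAST)   -- skip rows already locked by other workers
    WHERE ([InvisibleUntil] IS NULL OR [InvisibleUntil] <= @now)
      AND (@priority IS NULL OR [Priority] = @priority)
    ORDER BY [Priority], [Created]
)
UPDATE nextItem
SET [InvisibleUntil] = DATEADD(second, @invisibleDuration, @now),
    [Viewed] = [Viewed] + 1
OUTPUT inserted.*;    -- returns the claimed row to the caller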

Related

SQL Server Service Broker - Ways to improve SQL execution framework

Below is an outline of a SQL execution framework design using Service Broker that I have been playing with. I've outlined the process, asked some questions throughout (highlighted using block quotes), and would be interested in hearing any advice on the design.
Overview
I have an ETL operation that needs to take data out of 5 databases and move it into 150 using select/insert statements or stored procedures. The result is about 2,000 individual queries, taking between 1 second and 1 hour each.
Each SQL query inserts data only. There is no need for data to be returned.
The operation can be broken up into 3 steps:
Pre-ETL
ETL
Post-ETL
The queries in each step can be executed in any order, but the steps have to stay in order.
Method
I am using Service Broker for asynchronous/parallel execution.
Any advice on how to tune Service Broker (e.g. any specific options to look at, or a guide for setting the number of queue workers)?
Service Broker Design
Initiator
The initiator sends an XML message containing the SQL query to the Unprocessed queue, with an activation stored procedure called ProcessUnprocessedQueue. This process is wrapped in a try/catch in a transaction, rolling back the transaction when there is an exception.
ProcessUnprocessedQueue
ProcessUnprocessedQueue passes the XML to procedure ExecSql
ExecSql - SQL Execution and Logging
ExecSql then handles the SQL execution and logging:
The XML is parsed, along with any other data about the execution that is going to be logged
Before the execution, a logging entry is inserted
If the transaction is started in the initiator, can I ensure the log entry insert is always committed if the outer transaction in the initiator is rolled back?
Something like SAVE TRANSACTION is not valid here, correct?
Should I not manipulate the transaction here, execute the query in a try/catch and, if it goes to the catch, insert a log entry for the exception and throw the exception since it is in the middle of the transaction?
The query is executed
Alternative Logging Solution?
I need to log:
The SQL query executed
Metadata about the operation
The time it takes for each process to finish
This is why I insert one row at the start and one at the end of the process
Any exceptions, if they exist
Would it be better to have an In-Memory OLTP table that contains the query information? So, I would INSERT a row before the start of an operation and then do an UPDATE or INSERT to log exceptions and execution times. After the batch is done, I would then archive the data into a disk-based table to prevent the In-Memory table from getting too big.
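If the In-Memory OLTP route were chosen, the log table might look roughly like the sketch below. The table and column names here are made up for illustration, and the database would need a MEMORY_OPTIMIZED_DATA filegroup first:
-- Hypothetical memory-optimized logging table (requires a MEMORY_OPTIMIZED_DATA filegroup).
CREATE TABLE dbo.EtlExecutionLog
(
    LogId        int IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
    SqlQuery     nvarchar(max)  NOT NULL,   -- the SQL statement that was executed
    BatchStep    varchar(20)    NOT NULL,   -- Pre-ETL / ETL / Post-ETL
    StartedAt    datetime2      NOT NULL,
    CompletedAt  datetime2      NULL,       -- updated when the query finishes
    ErrorMessage nvarchar(4000) NULL        -- populated from the CATCH block, if any
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);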
ProcessUnprocessedQueue - Manually process the results
After the execution, ProcessUnprocessedQueue gets back an updated version of the XML (to determine if the execution was successful, or other data about the transaction, for post-processing) and then sends that message to the ProcessedQueue, which does not have an activation procedure, so it can be manually processed (I need to know when a batch of queries has finished executing).
Processing the Queries
Since the ETL can be broken out into 3 steps, I create 3 XML variables where I will add all of the queries that are needed in the ETL operation, so I will have something like this:
#preEtlQueue xml
200 queries
#etlQueue xml
1500 queries
#postEtlQueue xml
300 queries
Why XML?
The XML queue variable is passed between different stored procedures as an OUTPUT parameter; each procedure updates its values and/or adds SQL queries to it. This variable needs to be written and read, so an alternative could be something like a global temp table or a persistent table.
I then process the XML variables:
Use a cursor to loop through the queries and send them to the service broker service.
Each group of queries contained in the XML variable is sent under the same conversation_group_id.
Values such as the to/from service, message type, etc. are all stored in the XML variable.
After the messages are sent to Service Broker, use a while loop to continuously check the ProcessedQueue until all the messages have been processed.
This implements a timeout to avoid an infinite loop
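For comparison, the polling loop over ProcessedQueue can be written around WAITFOR (RECEIVE ...), which blocks until a message arrives or the timeout expires instead of spinning; a rough sketch, with the queue name taken from the description above:
-- Hypothetical receive loop for ProcessedQueue; names assumed from the description above.
DECLARE @handle uniqueidentifier, @msgType sysname, @body xml;

WHILE 1 = 1
BEGIN
    WAITFOR
    (
        RECEIVE TOP (1)
            @handle  = conversation_handle,
            @msgType = message_type_name,
            @body    = CAST(message_body AS xml)
        FROM ProcessedQueue
    ), TIMEOUT 5000;          -- wait up to 5 seconds for the next message

    IF @@ROWCOUNT = 0
        BREAK;                -- timed out with nothing to receive

    -- ... record the processed result / check whether the whole batch is done ...
END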
I'm thinking of redesigning this. Should I add an activation procedure on ProcessedQueue and then have that procedure insert the processed results into a physical table? If I do it this way, I wouldn't be able to use RECEIVE instead of a WHILE loop to check for processed items. Does that have any disadvantages?
I haven't built anything as massive as what you are doing now, but I will give you what worked for me, and some general opinions...
My preference is to avoid In-Memory OLTP and write everything to durable tables and keep the message queue as clean as possible
Use fastest possible hard drives in the server, write speed equivalent of NVMe or faster with RAID 10 etc.
I grab every message off the queue as soon as it hits and write it to a table I have named "mqMessagesReceived" (see code below, my all-purpose MQ handler named mqAsyncQueueMessageOnCreate)
I use a trigger in the "mqMessagesReceived" table that does a lookup to find which StoredProcedure to execute to process each unique message (see code below)
Each message has an identifier (in my case, I'm using the originating table name that wrote a message to the queue) and this identifier is used as a key for a lookup query run inside the trigger of the mqMessagesReceived table, to figure out which subsequent stored procedure needs to be run to process each received message correctly.
Before sending a message on the MQ, you can build a generic variable on the calling side (e.g. if a trigger is putting messages onto the MQ):
SELECT @tThisTableName = OBJECT_NAME(parent_object_id)
FROM sys.objects
WHERE sys.objects.name = OBJECT_NAME(@@PROCID)
  AND SCHEMA_NAME(sys.objects.schema_id) = OBJECT_SCHEMA_NAME(@@PROCID);
A configuration table holds the lookup data for matching a table name with the stored procedure that needs to be run to process the MQ data that arrived and was written to the mqMessagesReceived table.
Here is the definition of that lookup table
CREATE TABLE [dbo].[mqMessagesConfig](
[ID] [int] IDENTITY(1,1) NOT NULL,
[tSourceTableReceived] [nvarchar](128) NOT NULL,
[tTriggeredStoredProcedure] [nvarchar](128) NOT NULL,
CONSTRAINT [PK_mqMessagesConfig] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Here is the activation stored procedure that gets run as a message hits the queue
CREATE PROCEDURE [dbo].[mqAsyncQueueMessageOnCreate]
AS
BEGIN
    SET NOCOUNT ON

    DECLARE
        @h UNIQUEIDENTIFIER,
        @t sysname,
        @b varbinary(200),
        @hand VARCHAR(36),
        @body VARCHAR(2000),
        @sqlcleanup nvarchar(MAX)

    -- Get all of the messages on the queue
    -- the WHILE loop is infinite, until BREAK is received when we get a null handle
    WHILE 1=1
    BEGIN
        SET @h = NULL

        --Note the semicolon..!
        ;RECEIVE TOP(1)
            @h = conversation_handle,
            @t = message_type_name,
            @b = message_body
        FROM mqAsyncQueue

        --No message found (handle is now null)
        IF @h IS NULL
        BEGIN
            -- all messages are now processed, but we still have the @hand variable saved from processing the last message
            SET @sqlcleanup = 'EXEC [mqConversationsClearOne] @handle = N' + char(39) + @hand + char(39) + ';';
            EXECUTE(@sqlcleanup);
            BREAK
        END

        --mqAsyncMessage message type received
        ELSE IF @t = 'mqAsyncMessage'
        BEGIN
            SET @hand = CONVERT(varchar(36), @h);
            SET @body = CONVERT(varchar(2000), @b);
            INSERT mqMessagesReceived (tMessageType, tMessageBody, tMessageBinary, tConversationHandle)
            VALUES (@t, @body, @b, @hand);
        END

        --unknown message type was received that we dont understand
        ELSE
        BEGIN
            INSERT mqMessagesReceived (tMessageBody, tMessageBinary)
            VALUES ('Unknown message type received', CONVERT(varbinary(MAX), 'Unknown message type received'))
        END
    END
END
CREATE PROCEDURE [dbo].[mqConversationsClearOne]
    @handle varchar(36)
AS
    -- Note: you can check the queue by running this query
    -- SELECT * FROM sys.conversation_endpoints
    -- SELECT * FROM sys.conversation_endpoints WHERE NOT([State] = 'CO')
    -- CO = conversing [State]
    DECLARE @getid CURSOR
        ,@sql NVARCHAR(MAX)
        ,@conv_id NVARCHAR(100)
        ,@conv_handle NVARCHAR(100)

    -- want to create a chain of statements like this, one per conversation
    -- END CONVERSATION 'FE851F37-218C-EA11-B698-4CCC6AD00AE9' WITH CLEANUP;
    -- END CONVERSATION 'A4B4F603-208C-EA11-B698-4CCC6AD00AE9' WITH CLEANUP;
    SET @getid = CURSOR FOR
        SELECT [conversation_id], [conversation_handle]
        FROM sys.conversation_endpoints
        WHERE conversation_handle = @handle;

    OPEN @getid
    FETCH NEXT FROM @getid INTO @conv_id, @conv_handle
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @sql = 'END CONVERSATION ' + char(39) + @conv_handle + char(39) + ' WITH CLEANUP;'
        EXEC sys.sp_executesql @stmt = @sql;
        FETCH NEXT FROM @getid INTO @conv_id, @conv_handle --, @conv_service
    END
    CLOSE @getid
    DEALLOCATE @getid
and the table named "mqMessagesReceived" has this trigger
CREATE TRIGGER [dbo].[mqMessagesReceived_TriggerUpdate]
ON [dbo].[mqMessagesReceived]
AFTER INSERT
AS
BEGIN
    DECLARE
        @strMessageBody nvarchar(4000),
        @strSourceTable nvarchar(128),
        @strSourceKey nvarchar(128),
        @strConfigStoredProcedure nvarchar(4000),
        @sqlRunStoredProcedure nvarchar(4000),
        @strErr nvarchar(4000)

    SELECT @strMessageBody = ins.tMessageBody FROM INSERTED ins;
    SELECT @strSourceTable = (select txt_Value from dbo.fn_ParseText2Table(@strMessageBody,'|') WHERE Position=2);
    SELECT @strSourceKey = (select txt_Value from dbo.fn_ParseText2Table(@strMessageBody,'|') WHERE Position=3);

    -- look in mqMessagesConfig to find the name of the final stored procedure
    -- to run against the SourceTable
    -- e.g. @strConfigStoredProcedure = mqProcess-tblLabDaySchedEventsMQ
    SELECT @strConfigStoredProcedure =
        (select tTriggeredStoredProcedure from dbo.mqMessagesConfig WHERE tSourceTableReceived = @strSourceTable);

    SET @sqlRunStoredProcedure = 'EXEC [' + @strConfigStoredProcedure + '] @iKey = ' + @strSourceKey + ';';
    EXECUTE(@sqlRunStoredProcedure);

    INSERT INTO [mqMessagesProcessed]
    (
        [tMessageBody],
        [tSourceTable],
        [tSourceKey],
        [tTriggerStoredProcedure]
    )
    VALUES
    (
        @strMessageBody,
        @strSourceTable,
        @strSourceKey,
        @sqlRunStoredProcedure
    );
END
Also, just some general SQL Server tuning advice that I found I also had to do (for dealing with a busy database)
By default there is just one single TempDB file per SQL Server, and TempDB has an initial size of 8MB.
However, TempDB gets reset back to the initial 8MB size every time the server reboots, and this company was rebooting the server every weekend via cron/taskscheduler.
The problem we saw was a slow database and lots of record locks, but only first thing Monday morning when everyone was hammering the database at once as they began their work-week.
When TempDB gets automatically re-sized, it is "locked" and therefore nobody at all can use that single TempDB (which is why the SQL Server was regularly becoming non-responsive).
By Friday the TempDB had grown to over 300MB.
So, following the best-practice recommendation, I created one TempDB file per vCPU (8 TempDB files in total), distributed them across the two available hard drives on that server and, most importantly, set their initial size to more than we need (200MB each is what I chose).
This fixed the problem with the SQL Server slowdown and record locking that was experienced every Monday morning.
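For reference, that file layout can be scripted with ALTER DATABASE; the sketch below uses placeholder file names, paths and sizes that would need adjusting per server:
-- Hypothetical example: enlarge the primary TempDB file and add extra files,
-- one per vCPU, spread across the available drives. Paths and sizes are placeholders.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 200MB, FILEGROWTH = 64MB);

ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'D:\SQLData\tempdb2.ndf', SIZE = 200MB, FILEGROWTH = 64MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'E:\SQLData\tempdb3.ndf', SIZE = 200MB, FILEGROWTH = 64MB);
-- ...repeat up to one data file per vCPU...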

SQL Server: using table lock hint in select for ensuring correctness?

I've got a project that is trying to apply DDD (Domain Driven Design). Currently, we've got something like this:
begin tran
try
_manager.CreateNewEmployee(newEmployeeCmd);
tran.Commit();
catch
rollback tran
Internally, the CreateNewEmployee method uses a domain service for checking if there's already an employee with the memberId. Here's some pseudo code:
void CreateNewEmployee(NewEmployeeCmd cmd)
if(_duplicateMember.AlreadyRegistered(cmd.MemberId) )
throw duplicate
// extra stuff
saveNewEmployee()
end
Now, in the end, it's as if we have the following SQL instructions executed (pesudo code again):
begin sql tran
select count(*) from table where memberId=#memberId and status=1 -- active
--some time goes by
insert into table ...
end
Now, when I started looking at the code, I noticed that it was using the default SQL Server locking level. In practice, that means that something like this could happen:
--thread 1
(1)select ... --assume it returns 0
--thread 2
(2)select ... ---nothing found
(3)insert recordA
--thread 1
(4)insert record --same as before
(5) commit tran
--thread 1
(6) commit tran
So, we could end up having repeated records. I've tried playing with the transaction isolation levels, but the only way I've managed to make it work as intended was by changing the select that is used to check if there's already a record in the table. I've ended up using a table lock hint which instructs SQL Server to maintain a lock until the end of the transaction. That was the only way I've managed to get a lock when the select starts (changing the other isolation levels still wouldn't do what I needed, since they all allowed the select to run).
So, I've ended up using a table lock which is held from the beginning until the end of the transaction. In practice, that means that step (2) will block until thread 1 ends its job.
Is there a better option for this kind of scenario (that doesn't depend on using, say, indexes)?
Thanks.
Luis
You need to get the proper locks on the initial select, which you can do with the locking hints with (updlock, serializable). Once you do that, thread 2 will wait for thread 1 to finish if thread 2 is using the same key range in its where.
You could use the Sam Saffron upsert approach.
For example:
create procedure dbo.Employee_getset_byName (@Name nvarchar(50), @MemberId int output) as
begin
    set nocount, xact_abort on;

    begin tran;

    select @MemberId = Id
    from dbo.Employee with (updlock, serializable) /* hold key range for @Name */
    where Name = @Name;

    if @@rowcount = 0 /* if we still do not have an Id for @Name */
    begin;
        /* use either the sequence variant or the identity variant below, not both */

        /* for a sequence */
        set @MemberId = next value for dbo.IdSequence; /* get next sequence value */
        insert into dbo.Employee (Name, Id)
        values (@Name, @MemberId);

        /* for identity */
        insert into dbo.Employee (Name)
        values (@Name);
        set @MemberId = scope_identity();
    end;

    commit tran;
end;
go
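For illustration, calling the procedure above might look like this (the name value is made up):
-- Hypothetical usage of the upsert procedure above.
DECLARE @MemberId int;
EXEC dbo.Employee_getset_byName @Name = N'Jane Doe', @MemberId = @MemberId OUTPUT;
SELECT @MemberId AS MemberId;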

How to prevent multi threaded application to read this same Sql Server record twice

I am working on a system that uses multiple threads to read, process and then update database records. Threads run in parallel and try to pick records by calling a SQL Server stored procedure.
They call this stored procedure looking for unprocessed records multiple times per second and sometimes pick the same record up.
I try to prevent this from happening this way:
UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO @ExportedIDs
WHERE ID IN ( SELECT TOP(@ArraySize) GD.ID
              FROM dbo.GameData GD
              WHERE GD.Exported IS NULL
              ORDER BY GD.ID ASC)
The idea here is to set a record as exported first, using an UPDATE with OUTPUT (remembering the record ID), so no other thread can pick it up again. When the record is set as exported, I can then do some extra calculations and pass the data to the external system, hoping that no other thread will pick this same record again in the meantime, since the UPDATE is meant to secure the record first.
Unfortunately it doesn't seem to be working, and the application sometimes picks the same record twice anyway.
How to prevent it?
Kind regards
Mariusz
I think you should be able to do this atomically using a common table expression. (I'm not 100% certain about this, and I haven't tested, so you'll need to verify that it works for you in your situation.)
;WITH cte AS
(
    SELECT TOP(@ArrayCount)
        ID, Exported, ExportExpires, ExportSession
    FROM dbo.GameData WITH (READPAST)
    WHERE Exported IS NULL
    ORDER BY ID
)
UPDATE cte
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT INSERTED.ID INTO @ExportedIDs
I have a similar set up and I use sp_getapplock. My application runs many threads and they call a stored procedure to get the ID of the element that has to be processed. sp_getapplock guarantees that the same ID would not be chosen by two different threads.
I have a MyTable with a list of IDs that my application checks in an infinite loop using many threads. For each ID there are two datetime columns: LastCheckStarted and LastCheckCompleted. They are used to determine which ID to pick. Stored procedure picks an ID that wasn't checked for the longest period. There is also a hard-coded period of 20 minutes - the same ID can't be checked more often than every 20 minutes.
CREATE PROCEDURE [dbo].[GetNextIDToCheck]
-- Add the parameters for the stored procedure here
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarID int = NULL;
        DECLARE @VarLockResult int;

        EXEC @VarLockResult = sp_getapplock
            @Resource = 'SomeUniqueName_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Find ID that wasn't checked for the longest period
            SELECT TOP 1
                @VarID = ID
            FROM
                dbo.MyTable
            WHERE
                LastCheckStarted <= LastCheckCompleted
                -- this ID is not being checked right now
                AND LastCheckCompleted < DATEADD(minute, -20, GETDATE())
                -- last check was done more than 20 minutes ago
            ORDER BY LastCheckCompleted;

            -- Start checking
            UPDATE dbo.MyTable
            SET LastCheckStarted = GETDATE()
            WHERE ID = @VarID;
            -- There is no need to explicitly verify if we found anything.
            -- If @VarID is null, no rows will be updated
        END;

        -- Return found ID, or no rows if nothing was found,
        -- or failed to acquire the lock
        SELECT
            @VarID AS ID
        WHERE
            @VarID IS NOT NULL
        ;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
The second procedure is called by an application when it finishes checking the found ID.
CREATE PROCEDURE [dbo].[SetCheckComplete]
-- Add the parameters for the stored procedure here
    @ParamID int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarLockResult int;

        EXEC @VarLockResult = sp_getapplock
            @Resource = 'SomeUniqueName_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Completed checking the given ID
            UPDATE dbo.MyTable
            SET LastCheckCompleted = GETDATE()
            WHERE ID = @ParamID;
        END;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
It does not work because multiple transactions might first execute the IN clause and find the same set of rows, then update multiple times and overwrite each other.
LukeH's answer is best, accept it.
You can also fix it by adding AND Exported IS NULL to the outer UPDATE's WHERE clause to prevent double updates.
Or, make this SERIALIZABLE. This will lead to some blocking and deadlocking. This can safely be handled by timeouts and retry in case of deadlock. SERIALIZABLE is always safe for all workloads but it might block/deadlock more often.
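As a sketch, that second fix applied to the statement from the question (everything else unchanged) looks like this:
-- @Now, @Expire, @ExportSession, @ArraySize and @ExportedIDs assumed declared as in the question.
UPDATE dbo.GameData
SET Exported = @Now,
    ExportExpires = @Expire,
    ExportSession = @ExportSession
OUTPUT Inserted.ID INTO @ExportedIDs
WHERE Exported IS NULL          -- re-checked at update time, so a row already claimed by another session is skipped
  AND ID IN ( SELECT TOP(@ArraySize) GD.ID
              FROM dbo.GameData GD
              WHERE GD.Exported IS NULL
              ORDER BY GD.ID ASC);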

How can I set a lock inside a stored procedure?

I've got a long-running stored procedure on a SQL server database. I don't want it to run more often than once every ten minutes.
Once the stored procedure has run, I want to store the latest result in a LatestResult table, against a time, and have all calls to the procedure return that result for the next ten minutes.
That much is relatively simple, but we've found that, because the procedure checks the LatestResult table and updates it, that large userbases are getting a number of deadlocks, when two users call the procedure at the same time.
In a client-side/threading situation, I would solve this by using a lock, having the first user lock the function, the second user encounters the lock, waiting for the result, the first user finishes their procedure call, updates the LatestResult table, and unlocks the second user, who then picks up the result from the LatestResult table.
Is there any way to accomplish this kind of locking in SQL Server?
EDIT:
This is basically how the code looks without its error checking calls:
DECLARE @LastChecked AS DATETIME
DECLARE @LastResult AS NUMERIC(18,2)

SELECT TOP 1 @LastChecked = LastRunTime, @LastResult = LastResult FROM LastResult

DECLARE @ReturnValue AS NUMERIC(18,2)

IF DATEDIFF(n, @LastChecked, GetDate()) >= 10 OR NOT @LastResult = 0
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount,0)),0)) FROM Transactions WHERE ISNULL(DeletedFlag,0) = 0 GROUP BY GroupID ORDER BY ABS(ISNULL(SUM(ISNULL(Amount,0)),0))
    UPDATE LastResult SET LastRunTime = GETDATE(), LastResult = @ReturnValue
    SELECT @ReturnValue
END
ELSE
BEGIN
    SELECT @LastResult
END
I'm not really sure what's going on with the grouping, but I've found a test system where execution time is coming in around 4 seconds.
I think there's some work scheduled to archive some of these records and boil them down to running totals, which will probably help things, given that there are several million rows in that four-second table...
This is a valid opportunity to use an Application Lock (see sp_getapplock and sp_releaseapplock) as it is a lock taken out on a concept that you define, not on any particular rows in any given table. The idea is that you create a transaction, then create this arbitrary lock that has an identifier, and other processes will wait to enter that piece of code until the lock is released. This works just like lock() at the app layer. The @Resource parameter is the label of the arbitrary "concept". In more complex situations, you can even concatenate a CustomerID or something in there for more granular locking control.
DECLARE @LastChecked DATETIME,
        @LastResult NUMERIC(18,2);
DECLARE @ReturnValue NUMERIC(18,2);

BEGIN TRANSACTION;

EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

SELECT TOP 1 -- not sure if this helps the optimizer on a 1 row table, but seems ok
       @LastChecked = LastRunTime,
       @LastResult = LastResult
FROM LastResult;

IF (DATEDIFF(MINUTE, @LastChecked, GETDATE()) >= 10 OR @LastResult <> 0)
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount, 0)), 0))
    FROM Transactions
    WHERE DeletedFlag = 0
       OR DeletedFlag IS NULL;

    UPDATE LastResult
    SET LastRunTime = GETDATE(),
        LastResult = @ReturnValue;
END;
ELSE
BEGIN
    SET @ReturnValue = @LastResult; -- This is always 0 here
END;

SELECT @ReturnValue AS [ReturnValue];

EXEC sp_releaseapplock @Resource = 'check_timing';

COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation) so put in the usual TRY / CATCH. But, this does allow you to manage the situation.
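As a rough outline of that TRY / CATCH (one reasonable shape, not the only one), the locked block could be wrapped like this:
BEGIN TRY
    BEGIN TRANSACTION;
    EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

    -- ... the SELECT / IF / UPDATE logic from the answer above ...

    EXEC sp_releaseapplock @Resource = 'check_timing';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;    -- rolling back also releases a transaction-owned applock
    THROW;                       -- surface the original error to the caller
END CATCH;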
If there are any concerns regarding contention on this process, there shouldn't be much as the lookup done right after locking the resource is a SELECT from a single-row table and then an IF statement that (ideally) just returns the last known value if the 10-minute timer hasn't elapsed. Hence, most calls should process rather quickly.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.

Record locking and concurrency issues

My logical schema is as follows:
A header record can have multiple child records.
Multiple PCs can be inserting Child records, via a stored procedure that accepts details about the child record, and a value.
When a child record is inserted, a header record may need to be inserted if one doesn't exist with the specified value.
You only ever want one header record inserted for any given "value". So if two child records are inserted with the same "Value" supplied, the header should only be created once. This requires concurrency management during inserts.
Multiple PCs can be querying unprocessed header records, via a stored procedure
A header record needs to be queried if it has a specific set of child records, and the header record is unprocessed.
You only ever want one PC to query and process each header record. There should never be an instance where a header record and its children are processed by more than one PC. This requires concurrency management during selects.
So basically my header query looks like this:
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT TOP 1
*
INTO
#unprocessed
FROM
Header h WITH (READPAST, UPDLOCK)
JOIN
Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
JOIN
Child part2 ON part1.HeaderID = part2.HeaderID AND
WHERE
h.Processed = 0x0;
UPDATE
Header
SET
Processed = 0x1
WHERE
HeaderID IN (SELECT [HeaderID] FROM #unprocessed);
SELECT * FROM #unprocessed
COMMIT TRAN
So the above query ensures that concurrent queries never return the same record.
I think my problem is on the insert query. Here's what I have:
DECLARE @HeaderID INT

BEGIN TRAN

--Create header record if it doesn't exist, otherwise get its HeaderID
MERGE INTO
    Header WITH (HOLDLOCK) as target
USING
    (
        SELECT
            [Value] = @Value, --stored procedure parameter
            [HeaderID]
    ) as source ([Value], [HeaderID]) ON target.[Value] = source.[Value] AND
                                         target.[Processed] = 0
WHEN MATCHED THEN
    UPDATE SET
        --Get the ID of the existing header
        @HeaderID = target.[HeaderID],
        [LastInsert] = sysdatetimeoffset()
WHEN NOT MATCHED THEN
    INSERT
    (
        [Value]
    )
    VALUES
    (
        source.[Value]
    )

--Get new or existing ID
SELECT @HeaderID = COALESCE(@HeaderID, SCOPE_IDENTITY());

--Insert child with the new or existing HeaderID
INSERT INTO
    [Correlation].[CorrelationSetPart]
    (
        [HeaderID],
        [Name]
    )
VALUES
    (
        @HeaderID,
        @Name --stored procedure parameter
    );
My problem is that the insertion query is often blocked by the above selection query, and I'm receiving timeouts. The selection query is called by a broker, so it can be called fairly quickly. Is there a better way to do this? Note, I have control over the database schema.
To answer the second part of the question
You only ever want one PC to query and process each header
record. There should never be an instance where a header record and
its children are processed by more than one PC
Have a look at sp_getapplock.
I use app locks within the similar scenario. I have a table of objects that must be processed, similar to your table of headers. The client application runs several threads simultaneously. Each thread executes a stored procedure that returns the next object for processing from the table of objects. So, the main task of the stored procedure is not to do the processing itself, but to return the first object in the queue that needs processing.
The code may look something like this:
CREATE PROCEDURE [dbo].[GetNextHeaderToProcess]
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY
        DECLARE @VarHeaderID int = NULL;
        DECLARE @VarLockResult int;

        EXEC @VarLockResult = sp_getapplock
            @Resource = 'GetNextHeaderToProcess_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000,
            @DbPrincipal = 'public';

        IF @VarLockResult >= 0
        BEGIN
            -- Acquired the lock
            -- Find the most suitable header for processing
            SELECT TOP 1
                @VarHeaderID = h.HeaderID
            FROM
                Header h
                JOIN Child part1 ON part1.HeaderID = h.HeaderID AND part1.Name = 'XYZ'
                JOIN Child part2 ON part1.HeaderID = part2.HeaderID
            WHERE
                h.Processed = 0x0
            ORDER BY ....;
            -- sorting is optional, but often useful
            -- for example, order by some timestamp to process oldest/newest headers first

            -- Mark the found Header to prevent multiple processing.
            UPDATE Header
            SET Processed = 2 -- in progress. Another procedure that performs the actual processing should set it to 1 when processing is complete.
            WHERE HeaderID = @VarHeaderID;
            -- There is no need to explicitly verify if we found anything.
            -- If @VarHeaderID is null, no rows will be updated
        END;

        -- Return found Header, or no rows if nothing was found, or failed to acquire the lock
        SELECT
            @VarHeaderID AS HeaderID
        WHERE
            @VarHeaderID IS NOT NULL
        ;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
    END CATCH;
END
This procedure should be called from the procedure that does actual processing. In my case the client application does the actual processing, in your case it may be another stored procedure. The idea is that we acquire the app lock for the short time here. Of course, if the actual processing is fast you can put it inside the lock, so only one header can be processed at a time.
Once the lock is acquired we look for the most suitable header to process and then set its Processed flag. Depending on the nature of your processing you can set the flag to 1 (processed) right away, or set it to some intermediary value, like 2 (in progress) and then set it to 1 (processed) later. In any case, once the flag is not zero the header will not be chosen for processing again.
These app locks are separate from normal locks that DB puts when reading and updating rows and they should not interfere with inserts. In any case, it should be better than locking the whole table as you do WITH (UPDLOCK).
Returning to the first part of the question
You only ever want one header record inserted for any given "value".
So if two child records are inserted with the same "Value" supplied,
the header should only be created once.
You can use the same approach: acquire app lock in the beginning of the inserting procedure (with some different name than the app lock used in querying procedure). Thus you would guarantee that inserts happen sequentially, not simultaneously. BTW, in practice most likely inserts can't happen simultaneously anyway. The DB would perform them sequentially internally. They will wait for each other, because each insert locks a table for update. Also, each insert is written to transaction log and all writes to transaction log are also sequential. So, just add sp_getapplock to the beginning of your inserting procedure and remove that WITH (HOLDLOCK) hint in the MERGE.
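To make that concrete, a minimal sketch of how the start of such an inserting procedure could look (the lock name and timeout are arbitrary placeholders):
-- Hypothetical shape of the inserting procedure with an app lock serializing header creation.
BEGIN TRANSACTION;
BEGIN TRY
    DECLARE @VarLockResult int;
    EXEC @VarLockResult = sp_getapplock
        @Resource = 'InsertHeaderAndChild_app_lock',   -- different name from the querying procedure's lock
        @LockMode = 'Exclusive',
        @LockOwner = 'Transaction',
        @LockTimeout = 60000;

    IF @VarLockResult >= 0
    BEGIN
        -- MERGE / insert logic from the question goes here,
        -- with the WITH (HOLDLOCK) hint removed from the MERGE.
    END;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
END CATCH;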
The caller of the GetNextHeaderToProcess procedure should handle correctly the situation when procedure returns no rows. This can happen if the lock acquisition timed out, or there are simply no more headers to process. Usually the processing part simply retries after a while.
Inserting procedure should check if the lock acquisition failed and retry the insert or report the problem to the caller somehow. I usually return the generated identity ID of the inserted row (the ChildID in your case) to the caller. If procedure returns 0 it means that insert failed. The caller may decide what to do.
