Why is my UPDATE stored procedure executed multiple times? - sql-server

I use stored procedures to manage a warehouse. PDA scanners scan the added stock and send it in bulk (when plugged back in) to a SQL Server 2016 database.
The database is fairly remote (another country), so some queries are occasionally delayed, but this particular one is problematic: even though the stock table is fine, I have had problems updating the occupancy of the warehouse spots. The PDA tracks the items added to every spot as a SMALLINT, then sends this value back to the stored procedure below.
PDA "send_spots" query:
SELECT spot, added_items FROM spots WHERE change=1
Stored procedure:
CREATE PROCEDURE [dbo].[update_spots]
    @spot VARCHAR(10),
    @added_items SMALLINT
AS
BEGIN
    BEGIN TRAN
    UPDATE storage_spots
    SET remaining_capacity = remaining_capacity - @added_items
    WHERE storage_spot = @spot
    IF @@ROWCOUNT <> 1
    BEGIN
        ROLLBACK TRAN
        RETURN -1
    END
    ELSE
    BEGIN
        COMMIT TRAN
        RETURN 0
    END
END
GO
If the remaining_capacity value is 0, the PDAs can't add more items to that spot on the next round. But with this process, I was getting negative values because the query apparently ran twice (and so subtracted @added_items twice).
How is that even possible, and how could I fix it? From what I understood, the transaction should be cancelled (ROLLBACK) if the number of affected rows is != 1, but maybe it's something else.
EDIT: current solution, with the help of @Zero:
CREATE PROCEDURE [dbo].[update_spots]
    @spot VARCHAR(10),
    @added_racks SMALLINT
AS
BEGIN
    -- Recover the current capacity of the spot
    DECLARE @orig_capacity SMALLINT
    SELECT TOP 1 @orig_capacity = remaining_capacity
    FROM storage_spots
    WHERE storage_spot = @spot

    -- Test if a double is present in the logs by comparing dates (last 10 seconds)
    DECLARE @is_double BIT = 0
    SELECT @is_double = CASE WHEN EXISTS (SELECT *
                                          FROM spot_logs
                                          WHERE log_timestamp >= DATEADD(SECOND, -10, GETDATE())
                                            AND storage_spot = @spot
                                            AND delta = @added_racks)
                             THEN 1 ELSE 0 END
    BEGIN
        BEGIN TRAN
        UPDATE storage_spots
        SET remaining_capacity = @orig_capacity - @added_racks
        WHERE storage_spot = @spot
        IF @@ROWCOUNT <> 1 OR @is_double <> 0
            -- If double, rollback the UPDATE
            ROLLBACK TRAN
        ELSE
            -- If no double, commit the UPDATE
            COMMIT TRAN
        -- Write log
        INSERT INTO spot_logs
            (storage_spot, former_capacity, new_capacity, delta, log_timestamp, double_detected)
        VALUES
            (@spot, @orig_capacity, @orig_capacity - @added_racks, @added_racks, GETDATE(), @is_double)
    END
END
GO

I was thinking about possible causes (and a way to trace them) and then it hit me - you have no value validation!
Here's a simple example to illustrate the problem:
Spot | capacity
---------------
x1 | 1
Update spots set capacity = capacity - 2 where spot = 'X1'
Your scanner most likely reported a higher quantity than you had capacity to take in.
I am not sure how your business logic goes, but you need to perform something along the lines of
Update spots set capacity = capacity - @added_items where spot = 'X1' and capacity >= @added_items
if @@rowcount <> 1
    rollback;
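For illustration, here is a rough sketch of that guard folded back into the procedure from the question (table and column names are taken from the question; the SET NOCOUNT line and the return codes are my assumptions):

-- Sketch only: a guarded version of the original procedure.
-- remaining_capacity can never go negative because the WHERE clause refuses the update.
CREATE PROCEDURE [dbo].[update_spots]
    @spot VARCHAR(10),
    @added_items SMALLINT
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE storage_spots
    SET remaining_capacity = remaining_capacity - @added_items
    WHERE storage_spot = @spot
      AND remaining_capacity >= @added_items   -- validation: refuse updates that would go negative

    IF @@ROWCOUNT <> 1
        RETURN -1   -- spot not found, or not enough capacity left

    RETURN 0
END
GO

A single UPDATE statement is atomic on its own, so the explicit BEGIN TRAN / COMMIT from the original procedure is not strictly needed here.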
EDIT: a few methods to trace your problem without implementing validation:
Create a logging table with a timestamp, user id (the user that is connected to the DB), session id, old value, new value, and delta value (added items).
Option 1:
Log all updates that change the value from positive to negative (at least until you figure out the problem).
The drawback of this option is that it will not register double calls that do not result in a negative capacity.
Option 2 (logging identical updates):
Create a script that creates a global temporary table and, once every minute or so, deletes records from that table whose timestamps are older than... let's say 10 minutes (play around with the numbers).
This temporary table should hold the data passed to your update statement, so 'spot', 'added_items' + 'timestamp' (for tracking).
Now the crucial part: when you call your update statement, check whether a similar record exists in the temporary table (same spot and added_items, and where the current timestamp is between timestamp and [timestamp + 1 second] - again, play around with the numbers). If such a record exists, log that update; if not, add it to the temporary table. A rough sketch follows below.
This will register identical updates that arrive within 1 second of each other (or whatever time-frame you choose).
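A rough sketch of that idea, assuming a global temp table named ##update_trace and a permanent log table named suspected_duplicates (both names are placeholders, not from the question), and assuming @spot and @added_items are the procedure parameters already in scope:

-- Sketch only: duplicate detection via a global temp table (##update_trace is a placeholder name)
IF OBJECT_ID('tempdb..##update_trace') IS NULL
    CREATE TABLE ##update_trace (
        storage_spot VARCHAR(10),
        added_items  SMALLINT,
        logged_at    DATETIME DEFAULT GETDATE())

-- Purge old entries (run this from a scheduled job, or inline as shown here)
DELETE FROM ##update_trace WHERE logged_at < DATEADD(MINUTE, -10, GETDATE())

IF EXISTS (SELECT 1 FROM ##update_trace
           WHERE storage_spot = @spot
             AND added_items = @added_items
             AND GETDATE() BETWEEN logged_at AND DATEADD(SECOND, 1, logged_at))
BEGIN
    -- Suspected duplicate call: record it instead of applying the UPDATE
    INSERT INTO suspected_duplicates (storage_spot, added_items, detected_at)
    VALUES (@spot, @added_items, GETDATE())
END
ELSE
BEGIN
    INSERT INTO ##update_trace (storage_spot, added_items) VALUES (@spot, @added_items)
    -- ... run the normal UPDATE here ...
END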

I found here an alternative SQL query that does the update the way I need, but with a temporary value using DECLARE. Would it work better in my case, or is my initial query correct?
Initial query:
UPDATE storage_spots
SET remaining_capacity = remaining_capacity - @added_items
WHERE storage_spot = @spot
Alternative query:
DECLARE @orig_capacity SMALLINT
SELECT TOP 1 @orig_capacity = remaining_capacity
FROM storage_spots
WHERE storage_spot = @spot
UPDATE storage_spots
SET remaining_capacity = @orig_capacity - @added_items
WHERE storage_spot = @spot
Also, should I get rid of the ROLLBACK/COMMIT instructions?

Related

How to convert row-by-row execution into a SET-based approach in SQL

I'm working on a huge piece of SQL code and unfortunately it has a CURSOR which contains another two nested CURSORs within it (three cursors in total inside one stored procedure), handling millions of rows to DELETE, UPDATE and INSERT. This takes a whole lot of time because of the row-by-row execution, and I wish to convert it into a SET-based approach.
Many articles say that the use of CURSORs is not recommended and that the alternative is to use WHILE loops instead, so I tried replacing the three CURSORs with three WHILE loops, nothing more. I get the same result, but there is no improvement in performance; it took the same time as it did with the CURSORs.
Below is the basic structure of the code I'm working on (I will try to keep it as simple as possible), with comments explaining what the steps are supposed to do.
DECLARE @projects TABLE (
    ProjectID INT,
    fieldA INT,
    fieldB INT,
    fieldC INT,
    fieldD INT)

INSERT INTO @projects
SELECT ProjectID, fieldA, fieldB, fieldC, fieldD
FROM ProjectTable

DECLARE projects1 CURSOR LOCAL FOR /* First cursor - fetch the rows from ProjectTable */
    SELECT ProjectID FROM @projects
OPEN projects1
FETCH NEXT FROM projects1 INTO @ProjectID
WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        BEGIN TRAN
        DELETE FROM T_PROJECTGROUPSDATA td
        WHERE td.ID = @ProjectID

        DECLARE datasets CURSOR FOR /* Second cursor - this will get the 'CollectionDate' field from datasetsTable for every project fetched by the cursor above */
            SELECT DataID, GroupID, CollectionDate
            FROM datasetsTable
            WHERE datasetsTable.projectID = @ProjectID /* let's say this will fetch ten records for a single projectID */
        OPEN datasets
        FETCH NEXT FROM datasets INTO @DataID, @GroupID, @CollectionDate
        WHILE @@FETCH_STATUS = 0
        BEGIN
            DECLARE period CURSOR FOR /* Third cursor - this will process the records from another table called period with the @CollectionDate fetched above */
                SELECT ID, dbo.fn_GetEndOfPeriod(ID)
                FROM T_PERIODS
                WHERE DATEDIFF(dd, @CollectionDate, dbo.fn_GetEndOfPeriod(ID)) >= 0 /* let's say this will fetch 20 records for a single @CollectionDate fetched above */
                ORDER BY [YEAR], [Quarter]
            OPEN period
            FETCH NEXT FROM period INTO @PeriodID, @EndDate
            WHILE @@FETCH_STATUS = 0
            BEGIN
                IF EXISTS (some conditions No - 1)
                BEGIN
                    BREAK
                END
                IF EXISTS (some conditions No - 2)
                BEGIN
                    FETCH NEXT FROM period INTO @PeriodID, @EndDate
                    CONTINUE
                END
                /* Get the appropriate ID from the T_UPLOADS table for the current projectID and periodID fetched */
                SET @UploadID = (SELECT ID FROM T_UPLOADS u WHERE u.project_savix_ID = @ProjectID AND u.PERIOD_ID = @PeriodID AND u.STATUS = 3)
                /* Update some fields in the T_UPLOADS table for the current projectID and periodID fetched */
                UPDATE T_uploads
                SET fieldA = mp.fieldA, fieldB = mp.fieldB
                FROM @projects mp
                WHERE T_UPLOADS.ID = @UploadID AND mp.ProjectID = @ProjectID
                /* Insert some records into the T_PROJECTGROUPSDATA table for the current projectID and periodID fetched */
                INSERT INTO T_PROJECTGROUPSDATA tpd (fieldA, fieldB, fieldC, fieldD, uploadID)
                SELECT fieldA, fieldB, fieldC, fieldD, @UploadID
                FROM @projects
                WHERE tpd.DataID = @DataID
                FETCH NEXT FROM period INTO @PeriodID, @EndDate
            END
            CLOSE period
            DEALLOCATE period
            FETCH NEXT FROM datasets INTO @DataID, @GroupID, @CollectionDate, @Status, @Createdate
        END
        CLOSE datasets
        DEALLOCATE datasets
        COMMIT
    END TRY
    BEGIN CATCH
        -- Error handling
        IF @@TRANCOUNT > 0
            ROLLBACK
    END CATCH
    FETCH NEXT FROM projects1 INTO @ProjectID, @FAID
END
CLOSE projects1
DEALLOCATE projects1
SELECT 1 AS success
Please suggest any methods to rewrite this code to follow a SET-based approach.
Since the table structure and expected-result sample data are not provided, here are a few quick things I see that can be improved (some of these are already mentioned by others above):
A WHILE loop is also a cursor. So, changing to a WHILE loop is not going to make things any faster.
Use a LOCAL FAST_FORWARD cursor unless you need to backtrack to a record. This would make the execution much faster.
Yes, I agree that having a SET-based approach would be the fastest in most cases; however, if you must store an intermediate result set somewhere, I would suggest using a temp table instead of a table variable. A temp table is the 'lesser evil' of these two options. Here are a few reasons why you should try to avoid using a table variable:
Since SQL Server does not have any prior statistics on the table variable while building the execution plan, it will always assume that only one record will be returned by the table variable, and the storage engine will accordingly assign only that much memory for executing the query. But in reality, there could be millions of records that the table variable holds during execution. If that happens, SQL Server is forced to spill the data to disk during execution (and you will see lots of PAGEIOLATCH waits in sys.dm_os_wait_stats), making the queries much slower.
One way to get around the above issue is to provide the statement-level hint OPTION (RECOMPILE) at the end of each query where the table variable is used. This forces SQL Server to construct the execution plan of those queries at runtime, so the low-memory-allocation issue can be avoided. However, the downside is that SQL Server will no longer be able to take advantage of an already cached execution plan for that stored procedure, and will require a recompilation every time, which degrades performance to some extent. So, unless you know that the data in the underlying table changes frequently or that the stored procedure itself is not frequently executed, this approach is not recommended by Microsoft MVPs.
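For example, a sketch of the hint applied to one of the queries above (reusing the @projects table variable and T_UPLOADS from the question; treat it as an illustration, not a drop-in fix):

-- Sketch: OPTION (RECOMPILE) makes the plan reflect the real row count of @projects
SELECT u.ID, u.fieldA, u.fieldB
FROM T_UPLOADS u
INNER JOIN @projects p ON p.ProjectID = u.project_savix_ID
OPTION (RECOMPILE)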
Blindly replacing a cursor with a WHILE loop is not a recommended option: it will not improve your performance and might even have a negative impact on it.
When you define a cursor using DECLARE C CURSOR, you are in fact going to create a SCROLL cursor, which specifies that all fetch options (FIRST, LAST, PRIOR, NEXT, RELATIVE, ABSOLUTE) are available.
When you need just FETCH NEXT as the scroll option, you can declare the cursor as FAST_FORWARD.
Here is the quote about the FAST_FORWARD cursor from the Microsoft docs:
Specifies that the cursor can only move forward and be scrolled from
the first to the last row. FETCH NEXT is the only supported fetch
option. All insert, update, and delete statements made by the current
user (or committed by other users) that affect rows in the result set
are visible as the rows are fetched. Because the cursor cannot be
scrolled backward, however, changes made to rows in the database after
the row was fetched are not visible through the cursor. Forward-only
cursors are dynamic by default, meaning that all changes are detected
as the current row is processed. This provides faster cursor opening
and enables the result set to display updates made to the underlying
tables. While forward-only cursors do not support backward scrolling,
applications can return to the beginning of the result set by closing
and reopening the cursor.
So you can declare your cursors using DECLARE <CURSOR NAME> CURSOR FAST_FORWARD FOR ... and you will get noticeable improvements.
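For example, the outer cursor from the question could be declared like this (a sketch; only the declaration changes, the loop body stays the same):

DECLARE projects1 CURSOR LOCAL FAST_FORWARD FOR
    SELECT ProjectID FROM @projects

OPEN projects1
FETCH NEXT FROM projects1 INTO @ProjectID
-- ... same WHILE @@FETCH_STATUS = 0 loop as before ...
CLOSE projects1
DEALLOCATE projects1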
I think all the cursor code above can be simplified to something like this:
DROP TABLE IF EXISTS #Source;
SELECT DISTINCT p.ProjectID,p.fieldA,p.fieldB,p.fieldC,p.fieldD,u.ID AS [UploadID]
INTO #Source
FROM ProjectTable p
INNER JOIN DatasetsTable d ON d.ProjectID = p.ProjectID
INNER JOIN T_PERIODS s ON DATEDIFF(DAY,d.CollectionDate,dbo.fn_GetEndOfPeriod(s.ID)) >= 0
INNER JOIN T_UPLOADS u ON u.project_savix_ID = p.ProjectID AND u.PERIOD_ID = s.ID AND u.STATUS = 3
WHERE NOT EXISTS (some conditions No - 1)
AND NOT EXISTS (some conditions No - 2)
;
UPDATE u SET u.fieldA = s.fieldA, u.fieldB = s.fieldB
FROM T_UPLOADS u
INNER JOIN #Source s ON s.UploadID = u.ID
;
INSERT INTO T_PROJECTGROUPSDATA (fieldA,fieldB,fieldC,fieldD,uploadID)
SELECT DISTINCT s.fieldA,s.fieldB,s.fieldC,s.fieldD,s.UploadID
FROM #Source s
;
DROP TABLE IF EXISTS #Source;
Also, it would be nice to know the details of the "some conditions No" predicates, as the query can differ depending on them.

How can I set a lock inside a stored procedure?

I've got a long-running stored procedure on a SQL server database. I don't want it to run more often than once every ten minutes.
Once the stored procedure has run, I want to store the latest result in a LatestResult table, against a time, and have all calls to the procedure return that result for the next ten minutes.
That much is relatively simple, but we've found that, because the procedure both checks the LatestResult table and updates it, large user bases are getting a number of deadlocks when two users call the procedure at the same time.
In a client-side/threading situation, I would solve this by using a lock: the first user locks the function, the second user encounters the lock and waits for the result, the first user finishes their procedure call, updates the LatestResult table, and releases the lock, and the second user then picks up the result from the LatestResult table.
Is there any way to accomplish this kind of locking in SQL Server?
EDIT:
This is basically how the code looks without its error checking calls:
DECLARE @LastChecked AS DATETIME
DECLARE @LastResult AS NUMERIC(18,2)

SELECT TOP 1 @LastChecked = LastRunTime, @LastResult = LastResult FROM LastResult

DECLARE @ReturnValue AS NUMERIC(18,2)

IF DATEDIFF(n, @LastChecked, GETDATE()) >= 10 OR NOT @LastResult = 0
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount,0)),0)) FROM Transactions WHERE ISNULL(DeletedFlag,0) = 0 GROUP BY GroupID ORDER BY ABS(ISNULL(SUM(ISNULL(Amount,0)),0))
    UPDATE LastResult SET LastRunTime = GETDATE(), LastResult = @ReturnValue
    SELECT @ReturnValue
END
ELSE
BEGIN
    SELECT @LastResult
END
I'm not really sure what's going on with the grouping, but I've found a test system where execution time is coming in around 4 seconds.
I think there's some work scheduled to archive some of these records and boil them down to running totals, which will probably help things, given that there are several million rows behind that four-second query...
This is a valid opportunity to use an Application Lock (see sp_getapplock and sp_releaseapplock), as it is a lock taken out on a concept that you define, not on any particular rows in any given table. The idea is that you create a transaction, then create this arbitrary lock that has an identifier, and other processes will wait to enter that piece of code until the lock is released. This works just like lock() at the app layer. The @Resource parameter is the label of the arbitrary "concept". In more complex situations, you can even concatenate a CustomerID or something in there for more granular locking control.
DECLARE @LastChecked DATETIME,
        @LastResult NUMERIC(18,2);
DECLARE @ReturnValue NUMERIC(18,2);

BEGIN TRANSACTION;

EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

SELECT TOP 1 -- not sure if this helps the optimizer on a 1 row table, but seems ok
       @LastChecked = LastRunTime,
       @LastResult = LastResult
FROM LastResult;

IF (DATEDIFF(MINUTE, @LastChecked, GETDATE()) >= 10 OR @LastResult <> 0)
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount, 0)), 0))
    FROM Transactions
    WHERE DeletedFlag = 0
       OR DeletedFlag IS NULL;

    UPDATE LastResult
    SET LastRunTime = GETDATE(),
        LastResult = @ReturnValue;
END
ELSE
BEGIN
    SET @ReturnValue = @LastResult; -- This is always 0 here
END;

SELECT @ReturnValue AS [ReturnValue];

EXEC sp_releaseapplock @Resource = 'check_timing';

COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation) so put in the usual TRY / CATCH. But, this does allow you to manage the situation.
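A minimal sketch of that TRY / CATCH wrapping, reusing the lock name from the example above (the THROW at the end assumes SQL Server 2012 or later; use RAISERROR on older versions):

BEGIN TRY
    BEGIN TRANSACTION;
    EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

    -- ... the SELECT / UPDATE logic shown above ...

    EXEC sp_releaseapplock @Resource = 'check_timing';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;   -- rolling back also releases a transaction-owned applock
    THROW;
END CATCH;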
If there are any concerns regarding contention on this process, there shouldn't be much, as the lookup done right after locking the resource is a SELECT from a single-row table, followed by an IF statement that (ideally) just returns the last known value if the 10-minute timer hasn't elapsed. Hence, most calls should process rather quickly.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.

Deleting 1 million rows in SQL Server

I am working on a client's database and there are about 1 million rows that need to be deleted due to a bug in the software. Is there an efficient way to delete them besides:
DELETE FROM table_1 where condition1 = 'value' ?
Here is a structure for a batched delete as suggested above. Do not try 1M at once...
The size of the batch and the WAITFOR delay are obviously quite variable, and would depend on your server's capabilities, as well as your need to mitigate contention. You may need to manually delete some rows, measuring how long they take, and adjust your batch size to something your server can handle. As mentioned above, anything over 5000 rows can cause lock escalation (which I was not aware of).
This would be best done after hours... but 1M rows is really not a lot for SQL to handle. If you watch your messages in SSMS, it may take a while for the print output to show, but it will appear after several batches; just be aware it won't update in real time.
Edit: Added a stop time @MAXRUNTIME & @BSTOPATMAXTIME. If you set @BSTOPATMAXTIME to 1, the script will stop on its own at the desired time, say 8:00 AM. This way you can schedule it nightly to start at, say, midnight, and it will stop before production hours at 8 AM.
Edit: This answer is pretty popular, so I have added RAISERROR in lieu of PRINT, per the comments.
DECLARE @BATCHSIZE INT, @WAITFORVAL VARCHAR(8), @ITERATION INT, @TOTALROWS INT, @MAXRUNTIME VARCHAR(8), @BSTOPATMAXTIME BIT, @MSG VARCHAR(500)

SET DEADLOCK_PRIORITY LOW;
SET @BATCHSIZE = 4000
SET @WAITFORVAL = '00:00:10'
SET @MAXRUNTIME = '08:00:00' -- 8AM
SET @BSTOPATMAXTIME = 1 -- ENFORCE 8AM STOP TIME
SET @ITERATION = 0 -- LEAVE THIS
SET @TOTALROWS = 0 -- LEAVE THIS

WHILE @BATCHSIZE > 0
BEGIN
    -- IF @BSTOPATMAXTIME = 1, THEN WE'LL STOP THE WHOLE JOB AT A SET TIME...
    IF CONVERT(VARCHAR(8), GETDATE(), 108) >= @MAXRUNTIME AND @BSTOPATMAXTIME = 1
    BEGIN
        RETURN
    END

    DELETE TOP(@BATCHSIZE)
    FROM SOMETABLE
    WHERE 1=2 -- PLACEHOLDER PREDICATE: REPLACE WITH YOUR ACTUAL DELETE CONDITION

    SET @BATCHSIZE = @@ROWCOUNT
    SET @ITERATION = @ITERATION + 1
    SET @TOTALROWS = @TOTALROWS + @BATCHSIZE
    SET @MSG = 'Iteration: ' + CAST(@ITERATION AS VARCHAR) + ' Total deletes:' + CAST(@TOTALROWS AS VARCHAR)

    RAISERROR (@MSG, 0, 1) WITH NOWAIT
    WAITFOR DELAY @WAITFORVAL
END
BEGIN TRANSACTION

DoAgain:
DELETE TOP (1000)
FROM <YourTable>

IF @@ROWCOUNT > 0
    GOTO DoAgain

COMMIT TRANSACTION
Or maybe try this solution from Uri Dimant:
WHILE 1 = 1
BEGIN
    DELETE TOP(2000)
    FROM Foo
    WHERE <predicate>;
    IF @@ROWCOUNT < 2000 BREAK;
END
(Link: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/b5225ca7-f16a-4b80-b64f-3576c6aa4d1f/how-to-quickly-delete-millions-of-rows?forum=transactsql)
Here is something I have used:
If the bad data is mixed in with the good-
INSERT INTO #table
SELECT columns
FROM old_table
WHERE statement to exclude bad rows
TRUNCATE old_table
INSERT INTO old_table
SELECT columns FROM #table
Not sure how good this would be, but what if you do it like below (provided table_1 is a standalone table, i.e. not referenced by any other table):
create a duplicate of table_1, say table_1_dup
insert into table_1_dup select * from table_1 where condition1 <> 'value';
drop table table_1
exec sp_rename 'table_1_dup', 'table_1'
If you cannot afford to get the database out of production while repairing, do it in small batches. See also: How to efficiently delete rows while NOT using Truncate Table in a 500,000+ rows table
If you are in a hurry and need the fastest way possible:
take the database out of production
drop all non-clustered indexes and triggers (see the sketch after this list)
delete the records (or if the majority of records is bad, copy+drop+rename the table)
(if applicable) fix the inconsistencies caused by the fact that you dropped triggers
re-create the indexes and triggers
bring the database back in production
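A minimal sketch of the index/trigger steps above, using DISABLE/REBUILD rather than a full drop and re-create (IX_table_1_condition1 is an assumed index name, not from the question):

-- Sketch only: IX_table_1_condition1 is a placeholder for your non-clustered index
ALTER INDEX IX_table_1_condition1 ON dbo.table_1 DISABLE;
DISABLE TRIGGER ALL ON dbo.table_1;

DELETE FROM dbo.table_1 WHERE condition1 = 'value';

ENABLE TRIGGER ALL ON dbo.table_1;
ALTER INDEX IX_table_1_condition1 ON dbo.table_1 REBUILD;   -- rebuilding re-enables the disabled index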

TSQL Insert Transaction Log Filling up

I need to generate some test data. This insert is 800,000 x 1,000 rows. I know that's a lot, but this is a real application where the random value will be a calculated number.
How can I break this up so the transaction log does not fill up?
insert into sIDcrossMatch
select
docSVsys1.sID, docSVsys2.sID, Abs(Checksum(NewId())) % 100 As RandomInteger
from docSVsys as docSVsys1
join docSVsys as docSVsys2
on docSVsys1.sID <> docSVsys2.sID
where docSVsys1.sID < 1000
order by docSVsys1.sID, docSVsys2.sID
Inserting the rows for a single docSVsys1.sID works without filling up the transaction log.
Since this is your test database, make sure you set your Recovery Model to Simple first, and then let the log grow as much as you can provide space for (add more files if needed). And be sure that you understand the consequences of these settings.
As the next step (or the first step, if you can't change the recovery model and allow the log to grow), split your insert statement into multiple insert statements by adding a WHERE clause like this:
insert #1: where docSVsys1.sID % 2 = 0 /* % means modulo */
insert #2: where docSVsys1.sID % 2 = 1 /* % means modulo */
If that is not enough, increase the divisor (2) and add more insert statements. The idea behind multiple inserts is to use less log space at a time and to reuse log space. A sketch of the first batch is shown below.
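As a sketch, the first of the split inserts could look like this (the second batch is identical except for % 2 = 1):

-- Batch 1 of 2: the modulo filter keeps each insert, and its log usage, smaller
INSERT INTO sIDcrossMatch
SELECT docSVsys1.sID, docSVsys2.sID, ABS(CHECKSUM(NEWID())) % 100 AS RandomInteger
FROM docSVsys AS docSVsys1
JOIN docSVsys AS docSVsys2
    ON docSVsys1.sID <> docSVsys2.sID
WHERE docSVsys1.sID < 1000
  AND docSVsys1.sID % 2 = 0
ORDER BY docSVsys1.sID, docSVsys2.sID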
Or, if possible for you, use SSIS and have one source component with your select query and one destination component (don't forget to set the batch size).
As @usr pointed out, this is just too much data. I should have done the math up front. The square of a big number is a really big number. This is more a job for Azure Table Storage. The following appears to work. I gave the check to Filip.
DECLARE @sID INT

DECLARE sID_cursor CURSOR FOR
    SELECT TOP 10 sID
    FROM docSVsys
    ORDER BY sID
OPEN sID_cursor
FETCH NEXT FROM sID_cursor INTO @sID
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @sID
    INSERT INTO sIDcrossMatch
    SELECT @sID, docSVsys.sID, ABS(CHECKSUM(NEWID())) % 100 AS RandomInteger
    FROM docSVsys
    WHERE docSVsys.sID <> @sID
    ORDER BY docSVsys.sID;
    CHECKPOINT;
    FETCH NEXT FROM sID_cursor INTO @sID
END
CLOSE sID_cursor;
DEALLOCATE sID_cursor;

SQL Server - Implementing sequences

I have a system which requires I have IDs on my data before it goes to the database. I was using GUIDs, but found them to be too big to justify the convenience.
I'm now experimenting with implementing a sequence generator which basically reserves a range of unique ID values for a given context. The code is as follows;
ALTER PROCEDURE [dbo].[Sequence.ReserveSequence]
    @Name varchar(100),
    @Count int,
    @FirstValue bigint OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- Ensure the parameters are valid
    IF (@Name IS NULL OR @Count IS NULL OR @Count < 0)
        RETURN -1;

    -- Reserve the sequence
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    BEGIN TRANSACTION

    -- Get the sequence ID, and the last reserved value of the sequence
    DECLARE @SequenceID int;
    DECLARE @LastValue bigint;
    SELECT TOP 1 @SequenceID = [ID], @LastValue = [LastValue]
    FROM [dbo].[Sequences]
    WHERE [Name] = @Name;

    -- Ensure the sequence exists
    IF (@SequenceID IS NULL)
    BEGIN
        -- Create the new sequence
        INSERT INTO [dbo].[Sequences] ([Name], [LastValue])
        VALUES (@Name, @Count);

        -- The first reserved value of a sequence is 1
        SET @FirstValue = 1;
    END
    ELSE
    BEGIN
        -- Update the sequence
        UPDATE [dbo].[Sequences]
        SET [LastValue] = @LastValue + @Count
        WHERE [ID] = @SequenceID;

        -- The sequence start value will be the last previously reserved value + 1
        SET @FirstValue = @LastValue + 1;
    END

    COMMIT TRANSACTION
END
The 'Sequences' table is just an ID, Name (unique), and the last allocated value of the sequence. Using this procedure I can request N values in a named sequence and use these as my identifiers.
This works great so far - it's extremely quick since I don't have to constantly ask for individual values, I can just use up a range of values and then request more.
The problem is that at extremely high frequency, calling the procedure concurrently can sometimes result in a deadlock. I have only found this to occur when stress testing, but I'm worried it'll crop up in production. Are there any notable flaws in this procedure, and can anyone recommend any way to improve on it? It would be nice to do this without transactions, for example, but I do need it to be 'thread safe'.
MS themselves offer a solution and even they say it locks/deadlocks.
If you want to add some lock hints, you'd reduce concurrency for your high loads.
Options:
You could develop against the "Denali" CTP which is the next release
Use IDENTITY and the OUTPUT clause like everyone else
Adopt/modify the solutions above
On DBA.SE there is "Emulate a TSQL sequence via a stored procedure": see dportas' answer which I think extends the MS solution.
I'd recommend sticking with the GUIDs if, as you say, this is mostly about composing data ready for a bulk insert (it's simpler than what I present below).
As an alternative, could you work with a restricted count? Say, 100 ID values at a time? In that case, you could have a table with an IDENTITY column, insert into that table, return the generated ID (say, 39), and then your code could assign all values between 3900 and 3999 (i.e. multiply up by your assumed granularity) without consulting the database server again.
Of course, this could be extended to allocating multiple IDs in a single call - provided that you're okay with some IDs potentially going unused. E.g. you need 638 IDs, so you ask the database to assign you 7 new ID values (which implies that you've reserved 700 values), use the 638 you want, and the remaining 62 never get assigned.
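A rough sketch of that idea; the IdBlocks table and the block size of 100 are assumptions for illustration, not part of the original system:

-- Hypothetical block-allocation table: one row per reserved block of 100 IDs
CREATE TABLE dbo.IdBlocks (
    BlockID    BIGINT IDENTITY(1,1) PRIMARY KEY,
    ReservedAt DATETIME DEFAULT GETDATE());
GO

DECLARE @BlockSize INT = 100, @BlockID BIGINT, @FirstValue BIGINT;

INSERT INTO dbo.IdBlocks DEFAULT VALUES;          -- reserve one block
SET @BlockID = SCOPE_IDENTITY();

SET @FirstValue = @BlockID * @BlockSize;          -- e.g. block 39 owns IDs 3900..3999
SELECT @FirstValue AS FirstValue, @FirstValue + @BlockSize - 1 AS LastValue;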
Can you get some kind of deadlock trace? For example, enable trace flag 1222 as shown here. Duplicate the deadlock. Then look in the SQL Server log for the deadlock trace.
Also, you might inspect what locks are taken out in your code by inserting a call to exec sp_lock or select * from sys.dm_tran_locks immediately before the COMMIT TRANSACTION.
Most likely you are observing a conversion deadlock. To avoid them, you want to make sure that your table is clustered and has a PK, but this advice is specific to 2005 and 2008 R2, and they can change the implementation, rendering this advice useless. Google up "Some heap tables may be more prone to deadlocks than identical tables with clustered indexes".
Anyway, if you observe an error during stress testing, it is likely that sooner or later it will occur in production as well.
You may want to use sp_getapplock to serialize your requests. Google up "Application Locks (or Mutexes) in SQL Server 2005". Also I described a few useful ideas here: "Developing Modifications that Survive Concurrency".
I thought I'd share my solution. It doesn't deadlock, nor does it produce duplicate values. An important difference between this and my original procedure is that it doesn't create the sequence if it doesn't already exist:
ALTER PROCEDURE [dbo].[ReserveSequence]
(
    @Name nvarchar(100),
    @Count int,
    @FirstValue bigint OUTPUT
)
AS
BEGIN
    SET NOCOUNT ON;

    IF (@Count <= 0)
    BEGIN
        SET @FirstValue = NULL;
        RETURN -1;
    END

    DECLARE @Result TABLE ([LastValue] bigint)

    -- Update the sequence last value, and get the previous one
    UPDATE [Sequences]
    SET [LastValue] = [LastValue] + @Count
    OUTPUT INSERTED.LastValue INTO @Result
    WHERE [Name] = @Name;

    -- Select the first value
    SELECT TOP 1 @FirstValue = [LastValue] + 1 FROM @Result;
END
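For completeness, one possible way to call it (the sequence name here is just an example):

DECLARE @First BIGINT;

EXEC [dbo].[ReserveSequence]
    @Name       = N'Orders',       -- example sequence name
    @Count      = 50,
    @FirstValue = @First OUTPUT;

SELECT @First AS FirstReservedValue;   -- stays NULL if the sequence row does not exist yet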
