SQL Server - Implementing sequences - sql-server

I have a system which requires I have IDs on my data before it goes to the database. I was using GUIDs, but found them to be too big to justify the convenience.
I'm now experimenting with implementing a sequence generator which basically reserves a range of unique ID values for a given context. The code is as follows;
ALTER PROCEDURE [dbo].[Sequence.ReserveSequence]
#Name varchar(100),
#Count int,
#FirstValue bigint OUTPUT
AS
BEGIN
SET NOCOUNT ON;
-- Ensure the parameters are valid
IF (#Name IS NULL OR #Count IS NULL OR #Count < 0)
RETURN -1;
-- Reserve the sequence
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION
-- Get the sequence ID, and the last reserved value of the sequence
DECLARE #SequenceID int;
DECLARE #LastValue bigint;
SELECT TOP 1 #SequenceID = [ID], #LastValue = [LastValue]
FROM [dbo].[Sequences]
WHERE [Name] = #Name;
-- Ensure the sequence exists
IF (#SequenceID IS NULL)
BEGIN
-- Create the new sequence
INSERT INTO [dbo].[Sequences] ([Name], [LastValue])
VALUES (#Name, #Count);
-- The first reserved value of a sequence is 1
SET #FirstValue = 1;
END
ELSE
BEGIN
-- Update the sequence
UPDATE [dbo].[Sequences]
SET [LastValue] = #LastValue + #Count
WHERE [ID] = #SequenceID;
-- The sequence start value will be the last previously reserved value + 1
SET #FirstValue = #LastValue + 1;
END
COMMIT TRANSACTION
END
The 'Sequences' table is just an ID, Name (unique), and the last allocated value of the sequence. Using this procedure I can request N values in a named sequence and use these as my identifiers.
This works great so far - it's extremely quick since I don't have to constantly ask for individual values, I can just use up a range of values and then request more.
The problem is that at extremely high frequency, calling the procedure concurrently can sometimes result in a deadlock. I have only found this to occur when stress testing, but I'm worried it'll crop up in production. Are there any notable flaws in this procedure, and can anyone recommend any way to improve on it? It would be nice to do with without transactions for example, but I do need this to be 'thread safe'.

MS themselves offer a solution and even they say it locks/deadlocks.
If you want to add some lock hints then you'd reduce concurrency for your high loads
Options:
You could develop against the "Denali" CTP which is the next release
Use IDENTITY and the OUTPUT clause like everyone else
Adopt/modify the solutions above
On DBA.SE there is "Emulate a TSQL sequence via a stored procedure": see dportas' answer which I think extends the MS solution.

I'd recommend sticking with the GUIDs, if as you say, this is mostly about composing data ready for a bulk insert (it's simpler than what I present below).
As an alternative, could you work with a restricted count? Say, 100 ID values at a time? In that case, you could have a table with an IDENTITY column, insert into that table, return the generated ID (say, 39), and then your code could assign all values between 3900 and 3999 (e.g. multiply up by your assumed granularity) without consulting the database server again.
Of course, this could be extended to allocating multiple IDs in a single call - provided that your okay with some IDs potentially going unused. E.g. you need 638 IDs - so you ask the database to assign you 7 new ID values (which imply that you've allocated 700 values), use the 638 you want, and the remaining 62 never get assigned.

Can you get some kind of deadlock trace? For example, enable trace flag 1222 as shown here. Duplicate the deadlock. Then look in the SQL Server log for the deadlock trace.
Also, you might inspect what locks are taken out in your code by inserting a call to exec sp_lock or select * from sys.dm_tran_locks immediately before the COMMIT TRANSACTION.

Most likely you are observing a conversion deadlock. To avoid them, you want to make sure that your table is clustered and has a PK, but this advice is specific to 2005 and 2008 R2, and they can change the implementation, rendering this advice useless. Google up "Some heap tables may be more prone to deadlocks than identical tables with clustered indexes".
Anyway, if you observe an error during stress testing, it is likely that sooner or later it will occur in production as well.
You may want to use sp_getapplock to serialize your requests. Google up "Application Locks (or Mutexes) in SQL Server 2005". Also I described a few useful ideas here: "Developing Modifications that Survive Concurrency".

I thought I'd share my solution. I doesn't deadlock, nor does it produce duplicate values. An important difference between this and my original procedure is that it doesn't create the queue if it doesn't already exist;
ALTER PROCEDURE [dbo].[ReserveSequence]
(
#Name nvarchar(100),
#Count int,
#FirstValue bigint OUTPUT
)
AS
BEGIN
SET NOCOUNT ON;
IF (#Count <= 0)
BEGIN
SET #FirstValue = NULL;
RETURN -1;
END
DECLARE #Result TABLE ([LastValue] bigint)
-- Update the sequence last value, and get the previous one
UPDATE [Sequences]
SET [LastValue] = [LastValue] + #Count
OUTPUT INSERTED.LastValue INTO #Result
WHERE [Name] = #Name;
-- Select the first value
SELECT TOP 1 #FirstValue = [LastValue] + 1 FROM #Result;
END

Related

SQL Server Memory Optimized Table - poor performance compared to temporary table

I'm trying to benchmark memory optimized tables in Microsoft SQL Server 2016 with classic temporary tables.
SQL Server version:
Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64) Mar 18 2018 09:11:49
Copyright (c) Microsoft Corporation
Developer Edition (64-bit) on Windows 10 Enterprise 10.0 <X64> (Build 17134: ) (Hypervisor)
I'm following steps described here: https://learn.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/faster-temp-table-and-table-variable-by-using-memory-optimization?view=sql-server-ver15.
CrudTest_TempTable 1000, 100, 100
go 1000
versus
CrudTest_memopt_hash 1000, 100, 100
go 1000
What this test does?
1000 inserts
100 random updates
100 random deletes
And this is repeated 1000 times.
First stored procedure that uses classic temporary tables takes about 6 seconds to run.
Second stored procedure takes at least 15 seconds and usually errors out:
Beginning execution loop
Msg 3998, Level 16, State 1, Line 3
Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Msg 701, Level 17, State 103, Procedure CrudTest_memopt_hash, Line 16 [Batch Start Line 2]
There is insufficient system memory in resource pool 'default' to run this query.
I have done following optimizations (before it was even worse):
hash index includes both Col1 and SpidFilter
doing everything in single transaction makes it works faster (however it would be nice to run without it)
I'm generating random ids - without it records from every iteration ended up in the same buckets
I haven't created natively compiled SP yet since my results are awful.
I have plenty of free RAM on my box and SQL Server can consume it - in different scenarios it allocates much memory but in this test case it simply errors out.
For me these results mean that memory optimized tables cannot replace temporary tables. Do you have similar results or am I doing something wrong?
The code that uses temporary tables is:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
DROP PROCEDURE IF EXISTS CrudTest_TempTable;
GO
CREATE PROCEDURE CrudTest_TempTable
#InsertsCount INT, #UpdatesCount INT, #DeletesCount INT
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRAN;
CREATE TABLE #tempTable
(
Col1 INT NOT NULL PRIMARY KEY CLUSTERED,
Col2 NVARCHAR(4000),
Col3 NVARCHAR(4000),
Col4 DATETIME2,
Col5 INT NOT NULL
);
DECLARE #cnt INT = 0;
DECLARE #currDate DATETIME2 = GETDATE();
WHILE #cnt < #InsertsCount
BEGIN
INSERT INTO #tempTable (Col1, Col2, Col3, Col4, Col5)
VALUES (#cnt,
'sdkfjsdjfksjvnvsanlknc kcsmksmk ms mvskldamvks mv kv al kvmsdklmsdkl mal mklasdmf kamfksam kfmasdk mfksamdfksafeowa fpmsad lak',
'msfkjweojfijm skmcksamepi eisjfi ojsona npsejfeji a piejfijsidjfai spfdjsidjfkjskdja kfjsdp fiejfisjd pfjsdiafjisdjfipjsdi s dfipjaiesjfijeasifjdskjksjdja sidjf pajfiaj pfsdj pidfe',
#currDate, 100);
SET #cnt = #cnt + 1;
END
SET #cnt = 0;
WHILE #cnt < #UpdatesCount
BEGIN
UPDATE #tempTable SET Col5 = 101 WHERE Col1 = cast ((rand() * #InsertsCount) as int);
SET #cnt = #cnt + 1;
END
SET #cnt = 0;
WHILE #cnt < #DeletesCount
BEGIN
DELETE FROM #tempTable WHERE Col1 = cast ((rand() * #InsertsCount) as int);
SET #cnt = #cnt + 1;
END
COMMIT;
END
GO
The objects used in the in-memory test are :
DROP PROCEDURE IF EXISTS CrudTest_memopt_hash;
GO
DROP SECURITY POLICY IF EXISTS tempTable_memopt_hash_SpidFilter_Policy;
GO
DROP TABLE IF EXISTS tempTable_memopt_hash;
GO
DROP FUNCTION IF EXISTS fn_SpidFilter;
GO
CREATE FUNCTION fn_SpidFilter(#SpidFilter smallint)
RETURNS TABLE
WITH SCHEMABINDING , NATIVE_COMPILATION
AS
RETURN
SELECT 1 AS fn_SpidFilter
WHERE #SpidFilter = ##spid;
GO
CREATE TABLE tempTable_memopt_hash
(
Col1 INT NOT NULL,
Col2 NVARCHAR(4000),
Col3 NVARCHAR(4000),
Col4 DATETIME2,
Col5 INT NOT NULL,
SpidFilter SMALLINT NOT NULL DEFAULT (##spid),
INDEX ix_SpidFiler NONCLUSTERED (SpidFilter),
INDEX ix_hash HASH (Col1, SpidFilter) WITH (BUCKET_COUNT=100000),
CONSTRAINT CHK_SpidFilter CHECK ( SpidFilter = ##spid )
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);
GO
CREATE SECURITY POLICY tempTable_memopt_hash_SpidFilter_Policy
ADD FILTER PREDICATE dbo.fn_SpidFilter(SpidFilter)
ON dbo.tempTable_memopt_hash
WITH (STATE = ON);
GO
And the stored procedure that uses them is:
CREATE PROCEDURE CrudTest_memopt_hash
#InsertsCount INT, #UpdatesCount INT, #DeletesCount int
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRAN;
DECLARE #cnt INT = 0;
DECLARE #currDate DATETIME2 = GETDATE();
DECLARE #IdxStart INT = CAST ((rand() * 1000) AS INT);
WHILE #cnt < #InsertsCount
BEGIN
INSERT INTO tempTable_memopt_hash(Col1, Col2, Col3, Col4, Col5)
VALUES (#IdxStart + #cnt,
'sdkfjsdjfksjvnvsanlknc kcsmksmk ms mvskldamvks mv kv al kvmsdklmsdkl mal mklasdmf kamfksam kfmasdk mfksamdfksafeowa fpmsad lak',
'msfkjweojfijm skmcksamepi eisjfi ojsona npsejfeji a piejfijsidjfai spfdjsidjfkjskdja kfjsdp fiejfisjd pfjsdiafjisdjfipjsdi s dfipjaiesjfijeasifjdskjksjdja sidjf pajfiaj pfsdj pidfe',
#currDate, 100);
SET #cnt = #cnt + 1;
END
SET #cnt = 0;
WHILE #cnt < #UpdatesCount
BEGIN
UPDATE tempTable_memopt_hash
SET Col5 = 101
WHERE Col1 = #IdxStart + cast ((rand() * #InsertsCount) as int);
SET #cnt = #cnt + 1;
END
SET #cnt = 0;
WHILE #cnt < #DeletesCount
BEGIN
DELETE FROM tempTable_memopt_hash
WHERE Col1 = #IdxStart + cast ((rand() * #InsertsCount) as int);
SET #cnt = #cnt + 1;
END
DELETE FROM tempTable_memopt_hash;
COMMIT;
END
GO
Index stats:
table index total_bucket_count empty_bucket_count empty_bucket_percent avg_chain_length max_chain_length
[dbo].[tempTable_memopt_hash] PK__tempTabl__3ED0478731BB5AF0 131072 130076 99 1 3
UPDATE
I'm including my final test cases and sql code for creating procedures, tables, etc. I've performed test on empty database.
SQL Code: https://pastebin.com/9K6SgAqZ
Test cases: https://pastebin.com/ckSTnVqA
My last run looks like this (temp table is the fastest one when it comes to tables, but I am able to achieve fastest times using memory optimized table variable):
Start CrudTest_TempTable 2019-11-18 10:45:02.983
Beginning execution loop
Batch execution completed 1000 times.
Finish CrudTest_TempTable 2019-11-18 10:45:09.537
Start CrudTest_SpidFilter_memopt_hash 2019-11-18 10:45:09.537
Beginning execution loop
Batch execution completed 1000 times.
Finish CrudTest_SpidFilter_memopt_hash 2019-11-18 10:45:27.747
Start CrudTest_memopt_hash 2019-11-18 10:45:27.747
Beginning execution loop
Batch execution completed 1000 times.
Finish CrudTest_memopt_hash 2019-11-18 10:45:46.100
Start CrudTest_tableVar 2019-11-18 10:45:46.100
Beginning execution loop
Batch execution completed 1000 times.
Finish CrudTest_tableVar 2019-11-18 10:45:47.497
IMHO, the test in OP cannot show the advantages of memory-optimized tables
because the greatest advantage of these tables is that they are lock-and-latch free, this means your update/insert/delete do not take locks at all that permits concurrent changes to these tables.
But the test made does not include concurrent changes at all, the code shown make all the changes in one session.
Another observation: hash index defined on the table is wrong as you search only on one column and hash index is defined on two columns. Hash index on two columns means that hash function is applied to both arguments, but you search only on one column so hash index just cannot be used.
Do you think by using mem opt tables I can get performance
improvements over temp tables or is it just for limiting IO on tempdb?
Memory-optimized tables are not supposed to substitute temporary tables, as already mentioned, you'll see the profit in highly concurrent OLTP environment, while as you guess temporary table is visible only to your session, there is no concurrency at all.
Eliminate latches and locks. All In-Memory OLTP internal data structures are latch- and lock-free. In-Memory OLTP uses a new
multi-version concurrency control (MVCC) to provide transaction
consistency. From a user standpoint, it behaves in a way similar to
the regular SNAPSHOT transaction isolation level; however, it does
not use locking under the hood. This schema allows multiple sessions
to work with the same data without locking and blocking each other
and improves the scalability of the system allowing fully utilize
modern multi-CPU/multi-core hardware.
Cited book: Pro SQL Server Internals by Dmitri Korotkevitch
What do you think about the title "Faster temp table and table
variable by using memory optimization"
I opened this article and see these examples (in the order they are in the article)
A. Basics of memory-optimized table variables
B. Scenario: Replace global tempdb ##table
C. Scenario: Replace session tempdb #table
A. I use table variables only in cases they contain very few rows. Why should I even take care about this few rows?
B. Replace global tempdb ##table . I just don't use them at all.
C. Replace session tempdb #table. As already mentioned, session tempdb #table is not visible to any other session, so what is the gain? That the data don't go to the disk? May be you shuold think about fastest SSD disk for your tempdb if you really have problems with tempdb? Starting with 2014 tempdb objects don't necessarily goes to disk even in case of bulk inserts, in any case I have even RCSI enabled on my databases and have no problems with tempdb.
Likely will not see performance improvement, only in very special applications. SQL dabbled with things like 'pin table' in the past but the optimizer, choosing what pages are in memory based on real activity, is probably as good as it gets for almost all cases. This has been performance tuned over decades. I think that 'in memory' is more a marketing touchpoint than any practical use. Prove me wrong please.

Why is my UPDATE stored procedure executed multiple times?

I use stored procedures to manage a warehouse. PDA scanners scan the added stock and send it in bulk (when plugged back) to a SQL database (SQL Server 2016).
The SQL database is fairly remote (other country), so there's sometimes delay in some queries, but this particular one is problematic: even if the stock table is fine, I had some problems with updating the occupancy of the warehouse spots. The PDA tracks the added items in every spot as a SMALLINT, then send back this value to the stored procedure below.
PDA "send_spots" query:
SELECT spot, added_items FROM spots WHERE change=1
Stored procedure:
CREATE PROCEDURE [dbo].[update_spots]
#spot VARCHAR(10),
#added_items SMALLINT
AS
BEGIN
BEGIN TRAN
UPDATE storage_spots
SET remaining_capacity = remaining_capacity - #added_items
WHERE storage_spot=#spot
IF ##ROWCOUNT <> 1
BEGIN
ROLLBACK TRAN
RETURN - 1
END
ELSE
BEGIN
COMMIT TRAN
RETURN 0
END
END
GO
If the remaining_capacity value is 0, the PDAs can't add more items to it on next round. But with this process, I had negative values because the query allegedly ran two times (so subtracted #added_items two times).
Is there a way for that to be possible? How could I fix it? From what I understood the transaction should be cancelled (ROLLBACK) if the affected rows are != 1, but maybe that's something else.
EDIT: current solution with the help of #Zero:
CREATE PROCEDURE [dbo].[update_spots]
#spot VARCHAR(10),
#added_racks SMALLINT
AS
BEGIN
-- Recover current capacity of the spot
DECLARE #orig_capacity SMALLINT
SELECT TOP 1
#orig_capacity = remaining_capacity
FROM storage_spots
WHERE storage_spot=#spot
-- Test if double is present in logs by comparing dates (last 10 seconds)
DECLARE #is_double BIT = 0
SELECT #is_double = CASE WHEN EXISTS(SELECT *
FROM spot_logs
WHERE log_timestamp >= dateadd(second, -10, getdate()) AND storage_spot=#spot AND delta=#added_racks)
THEN 1 ELSE 0 END
BEGIN
BEGIN TRAN
UPDATE storage_spots
SET remaining_capacity= #orig_capacity - #added_racks
WHERE storage_spot=#spot
IF ##ROWCOUNT <> 1 OR #is_double <> 0
-- If double, rollback UPDATE
ROLLBACK TRAN
ELSE
-- If no double, commit UPDATE
COMMIT TRAN
-- write log
INSERT INTO spot_logs
(storage_spot, former_capacity, new_capacity, delta, log_timestamp, double_detected)
VALUES
(#spot, #orig_capacity, #orig_capacity-#added_racks, #added_racks, getdate(), #is_double)
END
END
GO
I was thinking about possible causes (and a way to trace them) and then it hit me - you have no value validation!
Here's a simple example to illustrate the problem:
Spot | capacity
---------------
x1 | 1
Update spots set capacity = capacity - 2 where spot = 'X1'
Your scanner most likely gave you higher quantity than you had capacity to take in.
I am not sure how your business logic goes, but you need to perform something in lines of
Update spots set capacity = capacity - #added_items where spot = 'X1' and capacity >= #added_items
if ##rowcount <> 1;
rollback;
EDIT: few methods to trace your problem without implementing validation:
Create a logging table (with timestamp, user id (user, that is connected to DB), session id, old value, new value, delta value (added items).
Option 1:
Log all updates that change value from positive to negative (at least until you figure out the problem).
The drawback of this option is that it will not register double calls that do not result in a negative capacity.
Option 2: (logging identical updates):
Create script that creates a global temporary table and deletes records, from that table timestamps older than... let's say 10 minutes once every minute or so (play around with numbers).
This temporary table should hold the data passed to your update statement so 'spot', 'added_items' + 'timestamp' (for tracking).
Now the crucial part: When you call your update statement check if a similar record exists in the temporary table (same spot and added_items and where current timestamp is between timestamp and [timestamp + 1 second] - again play around with numbers). If a record like that exist log that update, if not add it to temporary table.
This will register identical updates that go within 1 second of each other (or whatever time-frame you choose).
I found here an alternative SQL query that does the update the way I need, but with a temporary value using DECLARE. Would it work better in my case, or is my initial query correct?
Initial query:
UPDATE storage_spots
SET remaining_capacity = remaining_capacity - #added_items
WHERE storage_spot=#spot
Alternative query:
DECLARE #orig_capacity SMALLINT
SELECT TOP 1 #orig_capacity = remaining_capacity
FROM storage_spots
WHERE spot=#spot
UPDATE Products
SET remaining_capacity = #orig_capacity - #added_items
WHERE spot=#spot
Also, should I get rid of the ROLLBACK/COMMIT instructions?

Need help understanding why I get dirty reads on code that has same locking during execution

CAVEAT: This is legacy code, we know we should be using an actual SQL sequence, the goal here is to understand why one method is providing different results from another because we can't just change to a sequence on the fly at this time.
ISSUE: We are using a single valued table as a sequence object (created pre-2012) to generate and return what needs to be a guaranteed unique, incrementing number. The stored procedure in place works but is churning high CPU and causing severe blocking under high load; The remediation looks as though it should work but does not. CPU and blocking is relieved with new code but we can't seem to be able to prevent dirty reads and this is resulting in duplication.
Here's how the table looks:
if OBJECT_ID('dbo.Sequence') is not null
drop table dbo.Sequence;
create table dbo.Sequence (number int Primary Key);
insert into dbo.Sequence values (1);
GO
This stored proc uses serializable locking and gets accurate results, but it doesn't perform well due to the blocking:
CREATE OR ALTER PROC dbo.usp_GetSequence_Serializable #AddSomeNumber INT = 1 AS
BEGIN
declare #return int; --value to be returned and declared within sproc
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
Select #return = number from dbo.Sequence with (updlock) --get initial value plus one
Update dbo.Sequence set number = #return + #AddSomeNumber --increment it
COMMIT TRANSACTION
Select #return + #AddSomeNumber as nextid
END
GO
Here's our faster version that avoids serializable, but it's getting duplicate values back from time to time:
CREATE OR ALTER PROC dbo.usp_GetSequence_DefaultIsolationLevel #AddSomeNumber INT = 1 AS
BEGIN
declare #return int; --value to be returned and declared within sproc
update dbo.Sequence set #return = number = number + #AddSomeNumber --we tried interchanging various locking hints with no change in results (likely due to an exclusive being taken already)
Select #return as nextid
END
GO
What can we do to get the faster performance of the second (non-serializable) proc, while avoiding duplicate generated IDs?
In largish databases times will arise when you are unable to avoid deadlocks in your code. The database will not always look ahead on all things to avoid contention. In these rare and special cases you can make use of the high performance locking used by SQL Server itself --> SP_GETAPPLOCK(). You can structure your code in such a way to give exclusive access to a critical section in your stored procedure and serialize access to that. SP_GETAPPLOCK is definitely not a bad thing when all traditional lock hints fail.
DECLARE #MyTimeoutMiliseconds INT =5000--Wait only five seconds max then timeout
BEGIN TRAN
EXEC #LockRequestResult=SP_GETAPPLOCK 'MyCriticalWork','Exclusive','Transaction',#MyTimeoutMiliseconds
IF(#LockRequestResult>=0)BEGIN
/*
DO YOUR CRITICAL READS AND WRITES HERE
You may prefer try finally to release
*/
COMMIT TRAN -- <--Releases the lock!
END ELSE
ROLLBACK TRAN --<--Releases the lock!

Table Variable inside cursor, strange behaviour - SQL Server

I observed a strange thing inside a stored procedure with select on table variables. It always returns the value (on subsequent iterations) that was fetched in the first iteration of cursor. Here is some sample code that proves this.
DECLARE #id AS INT;
DECLARE #outid AS INT;
DECLARE sub_cursor CURSOR FAST_FORWARD
FOR SELECT [TestColumn]
FROM testtable1;
OPEN sub_cursor;
FETCH NEXT FROM sub_cursor INTO #id;
WHILE ##FETCH_STATUS = 0
BEGIN
DECLARE #Log TABLE (LogId BIGINT NOT NULL);
PRINT 'id: ' + CONVERT (VARCHAR (10), #id);
INSERT INTO Testtable2 (TestColumn)
OUTPUT inserted.[TestColumn] INTO #Log
VALUES (#id);
IF ##ERROR = 0
BEGIN
SELECT TOP 1 #outid = LogId
FROM #Log;
PRINT 'Outid: ' + CONVERT (VARCHAR (10), #outid);
INSERT INTO [dbo].[TestTable3] ([TestColumn])
VALUES (#outid);
END
FETCH NEXT FROM sub_cursor INTO #id;
END
CLOSE sub_cursor;
DEALLOCATE sub_cursor;
However, while I was posting the code on SO and tried various combinations, I observed that removing top from the below line, gives me the right values out of table variable inside a cursor.
SELECT TOP 1 #outid = LogId FROM #Log;
which would make it like this
SELECT #outid = LogId FROM #Log;
I am not sure what is happening here. I thought TOP 1 on table variable should work, thinking that a new table is created on every iteration of the loop. Can someone throw light on the table variable scoping and lifetime.
Update: I have the solution to circumvent the strange behavior here.
As a solution, I have declared the table at the top before the loop and deleting all rows at the beginning of the loop.
There are numerous things a bit off with this code.
First off, you roll back your embedded transaction on error, but I never see you commit it on success. As written, this will leak a transaction, which could cause major issues for you in the following code.
What might be confusing you about the #Log table situation is that SQL Server doesn't use the same variable scoping and lifetime rules as C++ or other standard programming languages. Even when declaring your table variable in the cursor block you will only get a single #Log table which then lives for the remainder of the batch, and which gets multiple rows inserted into it.
As a result, your use of TOP 1 is not really meaningful, since there's no ORDER BY clause to impose any sort of deterministic ordering on the table. Without that, you get whatever order SQL Server sees fit to give you, which in this case appears to be the insertion order, giving you the first inserted element of that log table every time you run the SELECT.
If you truly want only the last ID value, you will need to provide some real ordering criterion for your #Log table -- some form of autonumber or date field alongside the data column that can be used to provide the proper ordering for what you want to do.

How do you write a recursive stored procedure

I simply want a stored procedure that calculates a unique id (that is separate from the identity column) and inserts it. If it fails it just calls itself to regenerate said id. I have been looking for an example, but cant find one, and am not sure how I should get the SP to call itself, and set the appropriate output parameter. I would also appreciate someone pointing out how to test this SP also.
Edit
What I have now come up with is the following (Note I already have an identity column, I need a secondary id column.
ALTER PROCEDURE [dbo].[DataInstance_Insert]
#DataContainerId int out,
#ModelEntityId int,
#ParentDataContainerId int,
#DataInstanceId int out
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
WHILE (#DataContainerId is null)
EXEC DataContainer_Insert #ModelEntityId, #ParentDataContainerId, #DataContainerId output
INSERT INTO DataInstance (DataContainerId, ModelEntityId)
VALUES (#DataContainerId, #ModelEntityId)
SELECT #DataInstanceId = scope_identity()
END
ALTER PROCEDURE [dbo].[DataContainer_Insert]
#ModelEntityId int,
#ParentDataContainerId int,
#DataContainerId int out
AS
BEGIN
BEGIN TRY
SET NOCOUNT ON;
DECLARE #ReferenceId int
SELECT #ReferenceId = isnull(Max(ReferenceId)+1,1) from DataContainer Where ModelEntityId=#ModelEntityId
INSERT INTO DataContainer (ReferenceId, ModelEntityId, ParentDataContainerId)
VALUES (#ReferenceId, #ModelEntityId, #ParentDataContainerId)
SELECT #DataContainerId = scope_identity()
END TRY
BEGIN CATCH
END CATCH
END
In CATCH blocks you must check the XACT_STATE value. You may be in a doomed transaction (-1) and in that case you are forced to rollback. Or your transaction may had already had rolled back and you should not continue to work under the assumption of an existing transaction. For a template procedure that handles T-SQL exceptions, try/catch blcoks and transactions correctly, see Exception handling and nested transactions
Never, under any languages, do recursive calls in exception blocks. You don't check why you hit an exception, therefore you don't know if is OK to try again. What if the exception is 652, read-only filegroup? Or your database is at max size? You'll re-curse until you'll hit stackoverflow...
Code that reads a value, makes a decision based on that value, then writes something is always going to fail under concurrency unless properly protected. You need to wrap the SELECT and INSERT in a transaction and your SELECT must be under SERIALISABLE isolation level.
And finally, ignoring the blatantly wrong code in your post, here is how you call a stored procedure passing in OUTPUT arguments:
exec DataContainer_Insert #SomeData, #DataContainerId OUTPUT;
Better yet, why not make UserID an identity column instead of trying to re-implement an identity column manually?
BTW: I think you meant
VALUES (#DataContainerId + 1 , SomeData)
Why not use the:
NewId()
T SQL function? (assuming sql server 2005/2008)
that sp will never ever do a successful insert, you have an identity property on the DataContainer table but you are inserting the ID, in that case you will need to set identity_insert on but then scope_identity() won't work
A PK violation also might not be trapped so you might also need to check for XACT_STATE()
why are you messing around with max, use scope_identity() and be done with it

Resources