TSQL ID generation - sql-server

I have a question regarding locking in TSQL. Suppose I have the following table:
A(int id, varchar name)
where id is the primary key, but is NOT an identity column.
I want to use the following pseudocode to insert a value into this table:
lock (A)
uniqueID = GenerateUniqueID()
insert into A values (uniqueID, somename)
unlock(A)
How can this be accomplished in T-SQL? The next id should be computed while table A is locked, so that other sessions cannot run the same operation at the same time and get the same id.

If you have custom logic that you want to apply when generating the ids, wrap it in a user-defined function and use that function as the default for the column. This should reduce concurrency issues in much the same way as the built-in id generators do, by deferring generation to the point of insert and piggybacking on the insert's locking behavior.
create table ids (id int, somval varchar(20))
Go
Create function GenerateUniqueID()
returns int as
Begin
declare @ret int
select @ret = max(isnull(id,1)) * 2 from ids
if @ret is null set @ret = 2
return @ret
End
go
alter table ids add Constraint DF_IDS Default(dbo.GenerateUniqueID()) for Id

There are really only three ways to go about this.
1. Change the ID column to be an IDENTITY column, which auto-increments by some value on each insert.
2. Change the ID column to be a GUID with a default constraint of NEWID() or NEWSEQUENTIALID(). Then you can insert your own value or let the table generate one for you on each insert.
3. On each insert, start a transaction, then get the next available ID using something like select max(id)+1. Do this in a single SQL statement if possible, to limit the chance of a collision.
On the whole, most people prefer option 1. It's fast, easy to implement, and most people understand it.
I tend to go with option 2 in the apps I work on, simply because we tend to scale out (and up) our databases. This means we routinely have apps in a multi-master situation. Be aware that using random GUIDs as primary keys can leave your indexes heavily fragmented.
I'd stay away from option 3 unless you just don't have a choice. In that case I'd look at how the data model is structured anyway, because there's bound to be something wrong.
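As a rough illustration of option 3 done in a single statement, the sketch below reads the current maximum and inserts the new row in one INSERT ... SELECT. It assumes the table A(id, name) from the question. The UPDLOCK and HOLDLOCK hints are an addition of this sketch, intended to keep two sessions from reading the same maximum; they are not part of the answer above.

```sql
-- Sketch only: assumes table A (id int primary key, name varchar) from the question.
BEGIN TRAN;
INSERT INTO A (id, name)
SELECT ISNULL(MAX(id), 0) + 1, 'somename'   -- ISNULL covers the empty-table case
FROM A WITH (UPDLOCK, HOLDLOCK);            -- hold the range until the transaction ends
COMMIT TRAN;
```

This still serializes concurrent inserts on the top of the key range, which is part of why the answer treats option 3 as a last resort.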

You can use the NEWID() function, and then you do not need any locking mechanism.
You can make the column an IDENTITY column, and then you do not need any locking mechanism.
If you generate these IDs manually and there is a chance that parallel calls could generate the same ID, then use something like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
DECLARE @NextID INT
SET @NextID = dbo.GenerateUniqueID()
-- keep generating until the candidate ID is not already in use
WHILE EXISTS (SELECT ID FROM A WHERE ID = @NextID)
BEGIN
SET @NextID = dbo.GenerateUniqueID()
END
INSERT INTO A (ID, name) VALUES (@NextID, 'content')
COMMIT TRANSACTION

@Markus, you should look at using either IDENTITY or NEWID() as noted in the other answers. If you absolutely can't, here's an option for you...
DECLARE @NewID INT
BEGIN TRAN
-- exclusive table lock is held until COMMIT; ISNULL handles an empty table
SELECT @NewID = ISNULL(MAX(ID), 0) + 1
FROM TableA WITH (TABLOCKX)
INSERT TableA
(ID, OtherFields)
VALUES (@NewID, OtherFields)
COMMIT TRAN

If you're using SQL 2005+, you can use the OUTPUT clause to do what you're asking, without any kind of lock. (The table test1 simulates the table you're inserting into, and since OUTPUT ... INTO needs a table rather than a scalar variable to hold the results, #result will do that.)
create table test1 (test INT)
create table #result (LastValue INT)
insert into test1
output INSERTED.test into #result(LastValue)
select dbo.GenerateUniqueID()
select LastValue from #result

Just to update an old post: SQL Server 2012 added a feature called Sequence. A sequence is created in much the same way as a function, and you can specify its range, direction (ascending or descending) and rollover point. After that you can use NEXT VALUE FOR to generate the next value in the range.
See the following documentation from Microsoft:
http://technet.microsoft.com/en-us/library/ff878091.aspx
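To make the Sequence suggestion concrete, here is a minimal sketch for SQL Server 2012 or later. The sequence name dbo.A_IdSeq and its range are made up for illustration, and the table A(id, name) is the one from the question.

```sql
-- Hypothetical sequence name; range and NO CYCLE chosen only for illustration.
CREATE SEQUENCE dbo.A_IdSeq
    AS INT
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    MAXVALUE 1000000
    NO CYCLE;
GO
-- Each call to NEXT VALUE FOR hands out a distinct value, with no explicit table lock.
INSERT INTO A (id, name)
VALUES (NEXT VALUE FOR dbo.A_IdSeq, 'somename');
```

As with IDENTITY, a value consumed by a transaction that later rolls back is not handed out again, so gaps are possible.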

Related

Lock database table for just a couple of statements

Suppose a table in SQLServer with this structure:
TABLE t (Id INT PRIMARY KEY)
Then I have a stored procedure, called constantly, that among other things inserts data into this table:
BEGIN TRAN
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t)
INSERT t VALUES (@Id)
...
-- Stuff that takes a long time to complete
...
COMMIT
The problem with this approach is that I sometimes get a primary key violation, because two or more procedure calls read the same Id and try to insert it into the table.
I have been able to solve this problem by adding a TABLOCK hint to the SELECT statement:
DECLARE @Id INT = (SELECT MAX(Id) + 1 FROM t WITH (TABLOCK))
The problem now is that successive calls to the procedure must wait for the currently executing transaction to complete before starting their work, so only one procedure runs at a time.
Is there any advice or trick to hold the lock only while the SELECT and INSERT statements execute?
Thanks.
TABLOCK is a terrible idea, since you're serialising all the calls (no concurrency).
Note that with an SP you will retain all the locks granted over the run until the SP completes.
So you want to minimise locks except for where you really need them.
Unless you have a special case, use an internally generated id:
CREATE TABLE t (Id INT IDENTITY PRIMARY KEY)
This improves performance and concurrency, since you are not dependent on external tables to manage the id.
If you have existing data you can (re)set the start value using DBCC:
DBCC CHECKIDENT ('t', RESEED, 100)
If you need to inject rows with a value preassigned, use:
SET IDENTITY_INSERT t ON
(and off again afterwards, resetting the seed as required).
[Consider whether you want this value to be the primary key, or simply unique.
In many cases where you need to reference a table's PK as an FK, you'll want it as the PK for simplicity of joins, but using a business-readable value (e.g. Accounting Code or OrderNo+OrderLine) as the key is completely valid: that's just modelling.]
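The IDENTITY_INSERT step described above can be sketched as follows. The table t_demo and its Note column are hypothetical, used here so that the example does not clash with the table t from the question.

```sql
-- Hypothetical demo table, separate from t above.
CREATE TABLE t_demo (Id INT IDENTITY PRIMARY KEY, Note VARCHAR(50));

INSERT INTO t_demo (Note) VALUES ('generated');          -- Id assigned by SQL Server

SET IDENTITY_INSERT t_demo ON;
-- An explicit column list is required while IDENTITY_INSERT is ON.
INSERT INTO t_demo (Id, Note) VALUES (100, 'preassigned');
SET IDENTITY_INSERT t_demo OFF;

INSERT INTO t_demo (Note) VALUES ('generated again');    -- continues after the highest Id
SELECT Id, Note FROM t_demo ORDER BY Id;
```

Only one table per session can have IDENTITY_INSERT switched on at a time, so it is worth turning it off again as soon as the preassigned rows are in.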

SQL Server add auto serial id to additional field during insert

I have an insert statement in a stored procedure, on a table whose primary key is a serial id. I want to be able to populate an additional field in the same table, during the same insert statement, with the serial id used for the primary key. Is this possible?
Unfortunately this is a solution already in place... I just have to implement it.
Regards
I can't imagine a reason why you would want a copy of the key in another column. But in order to do it, I think you'll need to follow your insert with a statement that gets the value of the identity key, and then an update that puts that value in the other column. Since you're already in a stored procedure, it's probably OK to have a few extra statements instead of doing it all in one.
DECLARE @ID INT;
INSERT INTO TABLE_THINGY (Name, Address) VALUES ('Joe Blow', '123 Main St');
SET @ID = SCOPE_IDENTITY();
UPDATE TABLE_THINGY SET IdCopy = @ID WHERE ID = @ID
If it's important that this be done every single time, you might want to create a Trigger to do it; beware, however, that many people hate triggers because of the obfuscation and difficulty in debugging, among other reasons.
http://blog.sqlauthority.com/2007/03/25/sql-server-identity-vs-scope_identity-vs-ident_current-retrieve-last-inserted-identity-of-record/
I agree, it is odd that you would replicate the key within the same table. That said, you could use a trigger, so that existing insert statements are not affected.
The trigger below is an AFTER INSERT trigger, so technically it runs just after the insert. If you truly wanted it to happen as part of the insert itself, you would use an INSTEAD OF INSERT trigger and replicate the logic used to create the serial id field into the new field.
CREATE TRIGGER triggerName ON dbo.tableName
AFTER INSERT
AS
BEGIN
-- join to the inserted pseudo-table so that multi-row inserts are handled too
UPDATE t
SET newField = i.SerialId
FROM dbo.tableName AS t
JOIN inserted AS i ON t.SerialId = i.SerialId
END
GO
You could have a computed column that just returns the id column.
CREATE TABLE dbo.Products
(
ProductID int IDENTITY (1,1) NOT NULL
, OtherProductID AS ProductID
);
Having said that, data should only live in one place and to duplicate it in the same table is just a wrong design.
No, you cannot use the same insert statement for identity Id and copy that auto generated Id to the same row.
Multi-Statement using OUTPUT inserted or Trigger is your best bet.

Using @@identity or output when inserting into SQL Server view?

(forgive me - I'm new to both StackOverflow & SQL)
Tl;dr - When using @@identity (or any other option such as scope_identity or an output variable), is it possible to also use a view? Here is an example of a stored procedure using @@identity:
--SNIP--
DECLARE @AID INT
DECLARE @BID INT
INSERT INTO dbo.A (oct1)
VALUES
(@oct1)
SELECT @AID = @@IDENTITY;
INSERT INTO dbo.B (duo1)
VALUES
(@duo2)
SELECT @BID = @@IDENTITY
INSERT INTO dbo.tblAB (AID, BID)
VALUES
(@AID, @BID)
GO
Longer:
When inserting into a table, you can capture the identity value that was just generated using @@identity. This is useful if you want to insert into tables A and B, capture the identity values, then insert into table AB relating A to B. Obviously this is for purposes of data normalization.
Let's say you were to abstract the DB schema with a view that performs inner joins on your tables to make the data easier to work with. How would you populate the cross-reference tables properly in that case? Can it be done the same way, and if so, how?
Avoid using @@IDENTITY or SCOPE_IDENTITY() if your system is using parallel plans, as there is a nasty bug. Please refer to:
http://connect.microsoft.com/SQL/feedback/ViewFeedback.aspx?FeedbackID=328811
A better way to fetch the inserted identity ID is to use the OUTPUT clause.
CREATE TABLE tblTest
(
Sno INT IDENTITY(1,1) NOT NULL,
FirstName VARCHAR(20)
)
DECLARE @pk TABLE (ID INT)
INSERT INTO tblTest(FirstName)
OUTPUT INSERTED.Sno INTO @pk
SELECT 'sample'
SELECT * FROM @pk
EDIT:
It would work with Views as well. Please see the sample below. Hope this is what you were looking for.
CREATE VIEW v1
AS
SELECT sno, firstname FROM tbltest
GO
DECLARE @pk TABLE (ID INT)
INSERT INTO v1(FirstName)
OUTPUT INSERTED.Sno INTO @pk
SELECT 'sample'
SELECT ID FROM @pk
@@IDENTITY returns the last IDENTITY value produced on a connection, regardless of the table that produced the value, and regardless of the scope of the statement that produced the value.
SCOPE_IDENTITY() returns the last IDENTITY value produced on a connection by a statement in the same scope, regardless of the table that produced the value. Like @@IDENTITY it is limited to the current session, but it is also limited to the current scope.
Although the issue with both of these has been fixed by Microsoft, I would suggest you go with OUTPUT, and yes, it can be used with a view as well.
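The scope difference described above shows up as soon as a trigger inserts into a second table with its own identity column. The sketch below uses made-up names (dbo.Parent, dbo.AuditLog, dbo.trParentAudit) purely for illustration.

```sql
-- Hypothetical tables: the audit table starts its identity at 1000 to make the difference visible.
CREATE TABLE dbo.Parent   (Id INT IDENTITY(1,1) PRIMARY KEY, Name VARCHAR(20));
CREATE TABLE dbo.AuditLog (LogId INT IDENTITY(1000,1) PRIMARY KEY, ParentId INT);
GO
CREATE TRIGGER dbo.trParentAudit ON dbo.Parent
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- This insert runs in the trigger's scope and generates its own identity value.
    INSERT INTO dbo.AuditLog (ParentId) SELECT Id FROM inserted;
END
GO
INSERT INTO dbo.Parent (Name) VALUES ('sample');

SELECT SCOPE_IDENTITY() AS ScopeIdentity,  -- 1: the value generated for dbo.Parent
       @@IDENTITY       AS LastIdentity;   -- 1000: the value generated inside the trigger
```

An OUTPUT clause on the insert into dbo.Parent would return the dbo.Parent value directly, which is why it sidesteps the question of scope.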

Handling max(ID) in a concurrent environment

I am new to web application programming and handling concurrency using an RDBMS like SQL Server. I am using SQL Server 2005 Express Edition.
I am generating employee code in which the last four digits come from this query:
SELECT max(ID) FROM employees WHERE district = 'XYZ';
I don't understand how to handle the issues that might arise from concurrent connections. Many users can read the same max(ID), and by the time one user clicks "Save Record", the ID might already have been taken by another user.
How should I handle this issue?
Here are two ways of doing what you want. The fact that you might end up with unique constraint violation on EmpCode I will leave you to worry about :).
1. Use scope_identity() to get the last inserted ID and use that to calculate EmpCode.
Table definition:
create table Employees
(
ID int identity primary key,
Created datetime not null default getdate(),
DistrictCode char(2) not null,
EmpCode char(10) not null default left(newid(), 10) unique
)
Add one row to Employees. This should be done in a transaction, so that you are never left with the default random value from left(newid(), 10) in EmpCode:
declare @ID int
insert into Employees (DistrictCode) values ('AB')
set @ID = scope_identity()
update Employees
set EmpCode = cast(year(Created) as char(4))+DistrictCode+right(10000+@ID, 4)
where ID = @ID
2. Make EmpCode a computed column.
Table definition:
create table Employees
(
ID int identity primary key,
Created datetime not null default getdate(),
DistrictCode char(2) not null,
EmpCode as cast(year(Created) as char(4))+DistrictCode+right(10000+ID, 4) unique
)
Add one row to Employees:
insert into Employees (DistrictCode) values ('AB')
It is a bad idea to use MAX, because with a proper locking mechanism, you will not be able to insert rows in multiple threads for the same district.
If it is OK for you that you can only create one user at a time, and if your tests show that the MAX scales up even with a lot of users per district, it may be ok to use it.
Long story short: when dealing with identities you should rely on IDENTITY as much as possible. Really.
But if it is not possible, one solution is to handle IDs in a separate table.
Create Table DistrictID (
DistrictCode char(2),
LastID Int,
Constraint PK_DistrictCode Primary Key Clustered (DistrictCode)
);
Then you increment the LastID counter. If you want to create many users in parallel threads, it is important that incrementing the ID happens in a transaction separate from the user-creation transaction. That way only the ID generation is serialized.
The code can look like this:
Create Procedure usp_GetNewId(@DistrictCode char(2), @NewId Int Output)
As
Set NoCount On;
Set Transaction Isolation Level Repeatable Read;
Begin Tran;
Select @NewId = LastID From DistrictID With (XLock) Where DistrictCode = @DistrictCode;
Update DistrictID Set LastID = LastID + 1 Where DistrictCode = @DistrictCode;
Commit Tran;
The Repeatable Read isolation level and the XLOCK hint are the minimum you need to prevent two threads from getting the same ID.
If the table does not contain all districts, you will need to change Repeatable Read to Serializable and branch between the Update and an Insert.
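One way the Serializable variant with an Insert branch might look is sketched below. The procedure name usp_GetNewIdOrCreate is hypothetical, and starting a new district at 1 is an assumption of this sketch; it also assumes LastID is never NULL in existing rows.

```sql
-- Sketch only; usp_GetNewIdOrCreate is a hypothetical name.
CREATE PROCEDURE usp_GetNewIdOrCreate (@DistrictCode CHAR(2), @NewId INT OUTPUT)
AS
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET @NewId = NULL;
BEGIN TRAN;
-- Under SERIALIZABLE the XLOCK also covers the key range when the row does not exist yet.
SELECT @NewId = LastID FROM DistrictID WITH (XLOCK) WHERE DistrictCode = @DistrictCode;
IF @NewId IS NULL
BEGIN
    -- District not seen before: hand out 1 and store 2 as the next value (assumed starting point).
    SET @NewId = 1;
    INSERT INTO DistrictID (DistrictCode, LastID) VALUES (@DistrictCode, 2);
END
ELSE
    UPDATE DistrictID SET LastID = LastID + 1 WHERE DistrictCode = @DistrictCode;
COMMIT TRAN;
GO
-- Usage: fetch an ID first, then create the employee in its own transaction.
DECLARE @Id INT;
EXEC usp_GetNewIdOrCreate @DistrictCode = 'AB', @NewId = @Id OUTPUT;
SELECT @Id AS NewId;
```

Like the original procedure, this returns the stored value and then advances it, so LastID effectively holds the next value to hand out.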
This can be done through Transaction Isolation Levels. For instance, if you specify SERIALIZABLE as the level then other transactions will be blocked so that you aren't running into this problem.
If I did not understand your question correctly, please let me know.

Using a trigger to simulate a second identity column in SQL Server 2005

I have various reasons for needing to implement, in addition to the identity column PK, a second, concurrency safe, auto-incrementing column in a SQL Server 2005 database. Being able to have more than one identity column would be ideal, but I'm looking at using a trigger to simulate this as close as possible to the metal.
I believe I have to use a serializable isolation level transaction in the trigger. Do I go about this the same way I would use such a transaction in a normal SQL query?
It is a non-negotiable requirement that the business meaning of the second incrementing column remain separated from the behind the scenes meaning of the first, PK, incrementing column.
To put things as simply as I can, if I create JobCards '0001', '0002', and '0003', then delete JobCards '0002' and '0003', the next Jobcard I create must have ID '0002', not '0004'.
Just an idea, if you have 2 "identity" columns, then surely they would be 'in sync' - if not exactly the same value, then would differ by a constant value. If so, then why not add the "second identity" column as a COMPUTED column, which offsets the primary identity? Or is my logic flawed here?
Edit : As per Martin's comment, note that your calc might need to be N * id + C, where N is the Increment and C the offset / delta - excuse my rusty maths.
For example:
ALTER TABLE MyTable ADD OtherIdentity AS Id * 2 + 1;
Edit
Note that in SQL Server 2012 and later you can use independent sequences to create two or more independently incrementing columns in the same table.
Note: OP has edited the original requirement to include reclaiming sequences (noting that identity columns in SQL do not reclaim used ID's once deleted).
I would disallow all the deletes from this table altogether. Instead of deleting, I would mark rows as available or inactive. Instead of inserting, I would first search if there are inactive rows, and reuse the one with the smallest ID if they exist. I would insert only if there are no available rows already in the table.
Of course, I would serialize all inserts and deletes with sp_getapplock.
You can use a trigger to disallow all deletes, it is simpler than filling gaps.
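The reuse-or-insert approach, serialized with sp_getapplock, could be sketched roughly like this. The table dbo.JobCards, its columns and the lock resource name are all assumptions made for the example.

```sql
-- Sketch only. Assumes a hypothetical table:
--   dbo.JobCards (JobCardId INT PRIMARY KEY, Title VARCHAR(50), IsActive BIT)
DECLARE @Title VARCHAR(50), @Id INT, @rc INT;
SET @Title = 'New job card';

BEGIN TRAN;
-- Serialize all allocators on one named application lock, held until commit or rollback.
EXEC @rc = sp_getapplock @Resource = 'JobCardAllocation',
                         @LockMode = 'Exclusive',
                         @LockOwner = 'Transaction',
                         @LockTimeout = 5000;
IF @rc < 0   -- negative return codes mean the lock was not granted
BEGIN
    ROLLBACK TRAN;
    RAISERROR('Could not acquire the allocation lock.', 16, 1);
    RETURN;
END

-- Reuse the inactive row with the smallest ID, if there is one.
SELECT @Id = MIN(JobCardId) FROM dbo.JobCards WHERE IsActive = 0;

IF @Id IS NOT NULL
    UPDATE dbo.JobCards SET Title = @Title, IsActive = 1 WHERE JobCardId = @Id;
ELSE
    INSERT INTO dbo.JobCards (JobCardId, Title, IsActive)
    SELECT ISNULL(MAX(JobCardId), 0) + 1, @Title, 1 FROM dbo.JobCards;
COMMIT TRAN;
```

The "delete" path would take the same application lock and set IsActive = 0 instead of removing the row.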
A solution to this issue from "Inside Microsoft SQL Server 2008: T-SQL Querying" is to create another table with a single row that holds the current max value.
CREATE TABLE dbo.Sequence(
val int
)
Then, to allocate a range of sufficient size for your insert:
CREATE PROC dbo.GetSequence
@val AS int OUTPUT,
@n as int =1
AS
UPDATE dbo.Sequence
SET @val = val = val + @n;
SET @val = @val - @n + 1;
This will block other concurrent attempts to increment the sequence until the first transaction commits.
For a non blocking solution that doesn't handle multi row inserts see my answer here.
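A short usage sketch for the procedure above follows. The seeding step is an assumption: the table needs exactly one row before the first call, and starting it at 0 makes the first allocated value 1.

```sql
-- Seed the single row once (assumption: numbering starts at 1).
INSERT INTO dbo.Sequence (val) VALUES (0);
GO
DECLARE @first INT;
-- Reserve three consecutive values; @first receives the first of them.
EXEC dbo.GetSequence @val = @first OUTPUT, @n = 3;
SELECT @first AS FirstValue, @first + 2 AS LastValue;   -- 1 and 3 on the first call
```

If the call runs inside a larger transaction, the row lock on dbo.Sequence is held until that transaction ends, which is the blocking behavior the answer mentions.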
This is probably a terrible idea, but it works in at least a limited use scenario
Just use a regular identity and reseed on deletes.
create table reseedtest (
a int identity(1,1) not null,
name varchar(100)
)
insert reseedtest values('erik'),('john'),('selina')
select * from reseedtest
go
CREATE TRIGGER TR_reseedtest_D ON reseedtest FOR DELETE
AS
BEGIN TRAN
DECLARE @a int
SET @a = (SELECT TOP 1 a FROM reseedtest WITH (TABLOCKX, HOLDLOCK))
--anyone know another way to lock a table besides doing something to it?
DBCC CHECKIDENT(reseedtest, reseed, 0)
DBCC CHECKIDENT(reseedtest, reseed)
COMMIT TRAN
GO
delete reseedtest where a >= 2
insert reseedtest values('katarina'),('david')
select * from reseedtest
drop table reseedtest
This won't work if you are deleting from the "middle of the stack" as it were, but it works fine for deletes from the incrementing end.
Reseeding once to 0 then again is just a trick to avoid having to calculate the correct reseed value.
If you never delete from the table, you could create a view with a column computed using ROW_NUMBER().
Also, a SQL Server identity can get out of sync with a user-generated one, depending on the use of rollback.
