Setting version column in append only table - sql-server

We have a table that will store versions of records.
The columns are:
Id (Guid)
VersionNumber (int)
Title (nvarchar)
Description (nvarchar)
etc...
Saving an item will insert a new row into the table with the same Id and an incremented VersionNumber.
I am not sure how best to generate the sequential VersionNumber values. My initial thought is to:
SELECT @NewVersionNumber = MAX(VersionNumber) + 1
FROM VersionTable
WHERE Id = @ObjectId
And then use @NewVersionNumber in my insert statement.
If I use this method, do I need to set my transaction as serializable to avoid concurrency issues? I don't want to end up with duplicate VersionNumbers for the same Id.
Is there a better way to do this that doesn't make me use serializable transactions?

In order to avoid concurrency issues (or in your specific case duplicate inserts) you could create a Compound Key as the Primary Key for your table, consisting of the ID and VersionNumber columns. This would then enforce a unique constraint on the key column.
Subsequently your insert routine/logic can be devised to handle or rather CATCH an insert error due to a duplicate key and then simply re-issue the insert process.
It may also be worth mentioning that unless you specifically need to use a GUID i.e. because of working with SQL Server Replication or multiple data sources, that you should consider using an alternative data type such as BIGINT.
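A minimal sketch of the constraint-plus-retry idea described above. The column sizes, the constraint name, the retry limit of 5, and the sample parameter values are assumptions for illustration only; error 2627 is the number SQL Server raises for a primary key or unique constraint violation.

```sql
-- Assumed table definition; the composite primary key enforces uniqueness
-- of (Id, VersionNumber). Column sizes are illustrative.
CREATE TABLE VersionTable (
    Id            uniqueidentifier NOT NULL,
    VersionNumber int              NOT NULL,
    Title         nvarchar(100)    NULL,
    Description   nvarchar(400)    NULL,
    CONSTRAINT PK_VersionTable PRIMARY KEY (Id, VersionNumber)
);
GO

DECLARE @ObjectId uniqueidentifier = NEWID();    -- hypothetical input
DECLARE @Title nvarchar(100) = N'Example title'; -- hypothetical input
DECLARE @Attempt int = 1;

WHILE 1 = 1
BEGIN
    BEGIN TRY
        -- MAX() over zero rows yields NULL, so the first version becomes 1.
        INSERT INTO VersionTable (Id, VersionNumber, Title)
        SELECT @ObjectId, ISNULL(MAX(VersionNumber), 0) + 1, @Title
        FROM VersionTable
        WHERE Id = @ObjectId;

        BREAK;  -- insert succeeded, leave the retry loop
    END TRY
    BEGIN CATCH
        -- Re-raise anything that is not a duplicate-key error,
        -- and give up after 5 attempts.
        IF ERROR_NUMBER() <> 2627 OR @Attempt >= 5
            THROW;
        SET @Attempt += 1;
    END CATCH
END
```

THROW needs SQL Server 2012 or later; on older versions RAISERROR would have to be used in the CATCH block instead.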

I had thought that the following single insert statement would avoid concurrency issues, but after Heinzi's excellent answer to my question here it turns out that this is not safe at all:
Insert Into VersionTable
(Id, VersionNumber, Title, Description, ...)
Select @ObjectId, max(VersionNumber) + 1, @Title, @Description
From VersionTable
Where Id = @ObjectId
I'm leaving it just for reference. Of course this would work with either table hints or a transaction isolation level of Serializable, but overall the best solution is to use a constraint.
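For reference, the table-hint variant mentioned above might look roughly like this. It is a sketch, not a tested solution: the parameter values are made up, and it assumes VersionTable has an index that starts with Id, so that the range lock covers only the rows for that Id.

```sql
DECLARE @ObjectId uniqueidentifier = NEWID();        -- hypothetical input
DECLARE @Title nvarchar(100) = N'Example title';     -- hypothetical input
DECLARE @Description nvarchar(400) = N'Example';     -- hypothetical input

-- UPDLOCK takes update locks while reading, and HOLDLOCK applies
-- serializable semantics to this one table reference, so a concurrent
-- session running the same statement for the same Id waits until this
-- statement's transaction completes.
INSERT INTO VersionTable (Id, VersionNumber, Title, Description)
SELECT @ObjectId, ISNULL(MAX(VersionNumber), 0) + 1, @Title, @Description
FROM VersionTable WITH (UPDLOCK, HOLDLOCK)
WHERE Id = @ObjectId;
```

Because the read and the insert happen in a single statement, no explicit transaction is needed here. Keeping the unique constraint on (Id, VersionNumber) is still a sensible safety net.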

Related

Triggers: Tracking ID Updates

For a trigger that is tracking UPDATEs to a table, two temp tables may be referenced: deleted and inserted. Is there a way to cross-reference the two w/o using an INNER JOIN on their primary key?
I am trying to maintain referential integrity without foreign keys (don't ask), so I'm using triggers. I want UPDATEs to the primary key in table A to be reflected in the "foreign key" of look-up table B, and for this to happen when an UPDATE affects multiple records in table A.
All UPDATE trigger examples that I've seen hinge on joining the inserted and deleted tables to track changes; and they use the updated table's ID field (primary key) to set the join. But if that ID field (GUID) is the changed field in a record (or set of records), is there a good way to track those changes, so that I can enforce those changes in the corresponding look-up table?
I've just had this issue (or rather, a similar one), myself, hence the resurrection...
My eventual approach was to simply disallow updates to the PK field precisely because it would break the trigger. Thankfully, I had no business case to support updating the primary key column (these were surrogate IDs, anyway), so I could get away with it.
SQL Server offers the UPDATE function, for use within triggers, to check for this edge case:
CREATE TRIGGER your_trigger
ON your_table
INSTEAD OF UPDATE
AS BEGIN
IF UPDATE(pk1) BEGIN
ROLLBACK
DECLARE @proc SYSNAME, @table SYSNAME
SELECT TOP 1
@proc = OBJECT_NAME(@@PROCID)
,@table = OBJECT_NAME(parent_id)
FROM sys.triggers
WHERE object_id = @@PROCID
RAISERROR ('Trigger %s prevents UPDATE of table %s due to locked primary key', 16, -1, @proc, @table) WITH NOWAIT
END
ELSE UPDATE t SET
col1 = i.col1
,col2 = i.col2
,col3 = i.col3
FROM your_table t
INNER JOIN inserted i ON t.pk1 = i.pk1
END
GO
(Note that the above is untested, and probably contains all manner of issues with regards to XACT_STATE or TRIGGER_NESTLEVEL -- it's just there to demonstrate the principle)
It gets a bit messy, though, so I would definitely consider code generation for this, to handle changes to the table during development (maybe even done by a DDL trigger on CREATE/ALTER table).
If you have a composite primary key, you can use IF UPDATE(pk1) OR UPDATE(pk2)... or do some bitwise work with the COLUMNS_UPDATED function, which will give you a bitmask based on the column ordinal (but I'm not going to cover that here -- see MSDN/BOL).
The other (simpler) option is to DENY UPDATE ON your_table(pk) TO public, but remember that any member of sysadmins (and probably dbo) will not honour this.
I'm with @Aaron, without a primary key you're stuck. If you have DDL privileges to add a trigger, can't you add an auto-increment column while you're at it? If you'd like, it doesn't even need to be the PK.

How to emulate a BEFORE INSERT trigger in T-SQL / SQL Server for super/subtype (Inheritance) entities? [duplicate]

This is on Azure.
I have a supertype entity and several subtype entities, the latter of which need to obtain their foreign keys from the primary key of the supertype entity on each insert. In Oracle, I use a BEFORE INSERT trigger to accomplish this. How would one accomplish this in SQL Server / T-SQL?
DDL
CREATE TABLE super (
super_id int IDENTITY(1,1)
,subtype_discriminator char(4) CHECK (subtype_discriminator IN ('SUB1', 'SUB2'))
,CONSTRAINT super_id_pk PRIMARY KEY (super_id)
);
CREATE TABLE sub1 (
sub_id int IDENTITY(1,1)
,super_id int NOT NULL
,CONSTRAINT sub_id_pk PRIMARY KEY (sub_id)
,CONSTRAINT sub_super_id_fk FOREIGN KEY (super_id) REFERENCES super (super_id)
);
I wish for an insert into sub1 to fire a trigger that actually inserts a value into super and uses the super_id generated to put into sub1.
In Oracle, this would be accomplished by the following:
CREATE TRIGGER sub_trg
BEFORE INSERT ON sub1
FOR EACH ROW
DECLARE
v_super_id int; -- Ignore the fact that I could have used super_id_seq.CURRVAL
BEGIN
INSERT INTO super (super_id, subtype_discriminator)
VALUES (super_id_seq.NEXTVAL, 'SUB1')
RETURNING super_id INTO v_super_id;
:NEW.super_id := v_super_id;
END;
Please advise on how I would simulate this in T-SQL, given that T-SQL lacks the BEFORE INSERT capability?
Sometimes a BEFORE trigger can be replaced with an AFTER one, but this doesn't appear to be the case in your situation, for you clearly need to provide a value before the insert takes place. So, for that purpose, the closest functionality would seem to be the INSTEAD OF trigger one, as @marc_s has suggested in his comment.
Note, however, that, as the names of these two trigger types suggest, there's a fundamental difference between a BEFORE trigger and an INSTEAD OF one. While in both cases the trigger is executed at the time when the action determined by the statement that's invoked the trigger hasn't taken place, in case of the INSTEAD OF trigger the action is never supposed to take place at all. The real action that you need to be done must be done by the trigger itself. This is very unlike the BEFORE trigger functionality, where the statement is always due to execute, unless, of course, you explicitly roll it back.
But there's one other issue to address actually. As your Oracle script reveals, the trigger you need to convert uses another feature unsupported by SQL Server, which is that of FOR EACH ROW. There are no per-row triggers in SQL Server either, only per-statement ones. That means that you need to always keep in mind that the inserted data are a row set, not just a single row. That adds more complexity, although that'll probably conclude the list of things you need to account for.
So, it's really two things to solve then:
replace the BEFORE functionality;
replace the FOR EACH ROW functionality.
My attempt at solving these is below:
CREATE TRIGGER sub_trg
ON sub1
INSTEAD OF INSERT
AS
BEGIN
DECLARE @new_super TABLE (
super_id int
);
INSERT INTO super (subtype_discriminator)
OUTPUT INSERTED.super_id INTO @new_super (super_id)
SELECT 'SUB1' FROM INSERTED;
INSERT INTO sub1 (super_id)
SELECT super_id FROM @new_super;
END;
This is how the above works:
The same number of rows as being inserted into sub1 is first added to super. The generated super_id values are stored in a temporary storage (a table variable called @new_super).
The newly inserted super_ids are now inserted into sub1.
Nothing too difficult really, but the above will only work if you have no other columns in sub1 than those you've specified in your question. If there are other columns, the above trigger will need to be a bit more complex.
The problem is to assign the new super_ids to every inserted row individually. One way to implement the mapping could be like below:
CREATE TRIGGER sub_trg
ON sub1
INSTEAD OF INSERT
AS
BEGIN
DECLARE @new_super TABLE (
rownum int IDENTITY (1, 1),
super_id int
);
INSERT INTO super (subtype_discriminator)
OUTPUT INSERTED.super_id INTO @new_super (super_id)
SELECT 'SUB1' FROM INSERTED;
WITH enumerated AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rownum
FROM inserted
)
INSERT INTO sub1 (super_id, other columns)
SELECT n.super_id, i.other columns
FROM enumerated AS i
INNER JOIN @new_super AS n
ON i.rownum = n.rownum;
END;
As you can see, an IDENTITY(1,1) column is added to @new_super, so the temporarily inserted super_id values will additionally be enumerated starting from 1. To provide the mapping between the new super_ids and the new data rows, the ROW_NUMBER function is used to enumerate the INSERTED rows as well. As a result, every row in the INSERTED set can now be linked to a single super_id and thus complemented to a full data row to be inserted into sub1.
Note that the order in which the new super_ids are inserted may not match the order in which they are assigned. I considered that a no-issue. All the new super rows generated are identical save for the IDs. So, all you need here is just to take one new super_id per new sub1 row.
If, however, the logic of inserting into super is more complex and for some reason you need to remember precisely which new super_id has been generated for which new sub row, you'll probably want to consider the mapping method discussed in this Stack Overflow question:
Using merge..output to get mapping between source.id and target.id
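A rough sketch of how that MERGE ... OUTPUT mapping could be applied to the trigger above. It relies on the fact that the OUTPUT clause of MERGE (unlike that of INSERT) may reference source columns. The payload column, the #rows temp table and the trigger name are hypothetical; payload stands in for whatever other columns sub1 has.

```sql
-- A table can have only one INSTEAD OF INSERT trigger, so this is meant
-- to replace sub_trg, not to sit next to it.
CREATE TRIGGER sub_trg_mapped
ON sub1
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Materialise the incoming rows once, so that each row keeps the same
    -- rownum in both statements below. "payload" is a hypothetical column.
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rownum, i.payload
    INTO #rows
    FROM inserted AS i;

    DECLARE @map TABLE (
        rownum   bigint PRIMARY KEY,
        super_id int NOT NULL
    );

    -- ON 1 = 0 never matches, so every source row is inserted into super,
    -- and OUTPUT records which source row produced which new super_id.
    MERGE INTO super AS tgt
    USING #rows AS src
        ON 1 = 0
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (subtype_discriminator) VALUES ('SUB1')
    OUTPUT src.rownum, inserted.super_id INTO @map (rownum, super_id);

    INSERT INTO sub1 (super_id, payload)
    SELECT m.super_id, r.payload
    FROM #rows AS r
    INNER JOIN @map AS m
        ON m.rownum = r.rownum;
END;
```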
While Andriy's proposal will work well for INSERTs of a small number of records, full table scans will be done on the final join as both 'enumerated' and '@new_super' are not indexed, resulting in poor performance for large inserts.
This can be resolved by specifying a primary key on the @new_super table, as follows:
DECLARE @new_super TABLE (
rownum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
super_id int
);
This will result in the SQL optimizer scanning through the 'enumerated' table but doing an indexed join on @new_super to get the new key.

Creating a SQL Server trigger to transition from a natural key to a surrogate key

Backstory
At work we're planning on deprecating a Natural Key column in one of our primary tables. The project consists of 100+ applications that link to this table/column; 400+ stored procedures that reference this column directly; and a vast array of common tables between these applications that also reference this column.
The Big Bang and Start from Scratch methods are out of the picture. We're going to deprecate this column one application at a time, certify the changes, and move on to the next... and we've got a lengthy target goal to make this effort practical.
The problem I have is that a lot of these applications have shared stored procedures and tables. If I completely convert all of Application A's tables/stored procedures Application B and C will be broken until converted. These in turn may break applications D, E, F...Etc. I've already got a strategy implemented for Code classes and Stored Procedures, the part I'm stuck on is the transitioning state of the database.
Here's a basic example of what we have:
Users
---------------------------
Code varchar(32) natural key
Access
---------------------------
UserCode varchar(32) foreign key
AccessLevel int
And we're aiming now just for transitional state like this:
Users
---------------------------
Code varchar(32)
Id int surrogate key
Access
---------------------------
UserCode varchar(32)
UserID int foreign key
AccessLevel int
The idea being during the transitional phase un-migrated applications and stored procedures will still be able to access all the appropriate data and new ones can start pushing to the correct columns -- Once the migration is complete for all stored procedures and applications we can finally drop the extra columns.
I wanted to use SQL Server's triggers to automatically intercept any new Insert/Update's and do something like the following on each of the affected tables:
CREATE TRIGGER tr_Access_Sync
ON Access
INSTEAD OF INSERT(, UPDATE)
AS
BEGIN
DECLARE @code AS varchar(32)
DECLARE @id AS int
SET @code = (SELECT inserted.code FROM inserted)
SET @id = (SELECT inserted.id FROM inserted)
-- This is a migrated application; find the appropriate legacy key
IF @code IS NULL AND @id IS NOT NULL
SELECT @code = Code FROM Users WHERE Users.Id = @id
-- This is a legacy application; find the appropriate surrogate key
IF @id IS NULL AND @code IS NOT NULL
SELECT @id = Id FROM Users WHERE Users.Code = @code
-- Impossible code:
UPDATE inserted SET inserted.code=@code, inserted.id=@id
END
Question
The 2 huge problems I'm having so far are:
I can't do an "AFTER INSERT" because NULL constraints will make the insert fail.
The "impossible code" I mentioned is how I'd like to cleanly proxy the original query; If the original query has x, y, z columns in it or just x, I ideally would like the same trigger to do these. And if I add/delete another column, I'd like the trigger to remain functional.
Anyone have a code example where this could be possible, or even an alternate solution for keeping these columns properly filled even when only one of values is passed to SQL?
Tricky business...
OK, first of all: this trigger will NOT work in many circumstances:
SET @code = (SELECT inserted.code FROM inserted)
SET @id = (SELECT inserted.id FROM inserted)
The trigger can be called with a set of rows in the Inserted pseudo-table - which one are you going to pick here?? You need to write your trigger in such a fashion that it will work even when you get 10 rows in the Inserted table. If a SQL statement inserts 10 rows, your trigger will not be fired ten times - one for each row - but only once for the whole batch - you need to take that into account!
Second point: I would try to make the ID's IDENTITY fields - then they'll always get a value - even for "legacy" apps. Those "old" apps should provide a legacy key instead - so you should be fine there. The only issue I see and don't know how you handle those are inserts from an already converted app - do they provide an "old-style" legacy key as well? If not - how quickly do you need to have such a key?
What I'm thinking about would be a "cleanup job" that would run over the table and get all the rows with a NULL legacy key and then provide some meaningful value for it. Make this a regular stored procedure and execute it every e.g. day, four hours, 30 minutes - whatever suits your needs. Then you don't have to deal with triggers and all the limitations they have.
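Such a cleanup job could be sketched as a stored procedure along these lines. The procedure name is made up, the table and column names follow the transitional schema in the question, and it assumes that both Access.UserCode and Access.UserID are nullable during the transition.

```sql
CREATE PROCEDURE dbo.usp_Access_BackfillKeys   -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    -- Rows written by legacy applications: fill in the surrogate key.
    UPDATE a
    SET a.UserID = u.Id
    FROM Access AS a
    INNER JOIN Users AS u
        ON u.Code = a.UserCode
    WHERE a.UserID IS NULL;

    -- Rows written by migrated applications: fill in the legacy key.
    UPDATE a
    SET a.UserCode = u.Code
    FROM Access AS a
    INNER JOIN Users AS u
        ON u.Id = a.UserID
    WHERE a.UserCode IS NULL;
END;
```

Both updates are set-based, so the procedure handles any number of incomplete rows per run and can be scheduled at whatever interval suits, for example through a SQL Server Agent job.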
Wouldn't it be possible to make the schema changes 'bigbang' but create views over the top of those tables that 'hide' the change?
I think you might find you are simply putting off the breakages to a later point in time: "We're going to deprecate this column one application at a time" - it might be my naivety but I can't see how that's ever going to work.
Surely, a worse mess can occur when different applications are doing things differently?
After sleeping on the problem, this seems to be the most generic/re-usable solution I could come up with within the SQL syntax. It works fine even if both columns have a NOT NULL constraint, even if you don't reference the "other" column at all in your insert.
CREATE TRIGGER tr_Access_Sync
ON Access
INSTEAD OF INSERT
AS
BEGIN
/*-- Create a temporary table to modify because "inserted" is read-only */
/*-- "temp" is actually "#temp" but it throws off stackoverflow's syntax highlighting */
SELECT * INTO temp FROM inserted
/*-- If for whatever reason the secondary table has its own identity column */
/*-- we need to get rid of it from our #temp table to do an Insert later with identities on */
ALTER TABLE temp DROP COLUMN oneToManyIdentity
UPDATE temp
SET
UserCode = ISNULL(UserCode, (SELECT UserCode FROM Users U WHERE U.UserID = temp.UserID)),
UserID = ISNULL(UserID, (SELECT UserID FROM Users U WHERE U.UserCode = temp.UserCode))
INSERT INTO Access SELECT * FROM temp
END

Transact-SQL / Check if a name already exists

Simple question here.
Context: A Transact-SQL table with an int primary key, and a name that must also be unique (even though it's not a primary key). Let's say:
TableID INT,
TableName NVARCHAR(50)
I'm adding new rows to this table through a stored procedure (and, thus, specifying TableName with a parameter).
Question: What's the best/simplest way to verify whether the provided TableName parameter already exists in the table, and to prevent a new row from being added if it does?
Is it possible to do this directly within my AddNewRow stored procedure?
If you're using SQL Server 2008 then you could use a MERGE statement in your sproc:
MERGE INTO YourTable AS target
USING (VALUES (@tableName)) AS source (TableName)
ON target.TableName = source.TableName
WHEN NOT MATCHED THEN
INSERT (TableName) VALUES (TableName);
You should still ensure that the TableName column has a UNIQUE constraint.
Add a unique constraint on TableName and handle the error if you try to insert a duplicate.
This avoids any issues with concurrent transactions inserting a duplicate in between you reading that it is not there and trying your insert.
See this related question.
I would prefer using a unique constraint on the column and also explicitly checking for existence before inserting.
Handling an exception after a failed insert still increments the identity value, if the table has an identity column.
Secondly, the exception can be avoided by checking for existence before the insert; raising and handling an exception is otherwise the more expensive operation.
IF EXISTS (SELECT TOP(1) ColName FROM MyTable WHERE ColName=@myParameter)
A unique constraint is also enforced through a unique index (nonclustered by default), so the existence check is fast as well.
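Putting the two suggestions together (check first, with the unique constraint as the final safeguard), the AddNewRow procedure might be sketched like this. The table name MyTable, the constraint name and the return codes are assumptions, and TableID is assumed to be an IDENTITY column so that it does not need to be supplied.

```sql
-- Assumed one-time setup:
-- ALTER TABLE MyTable ADD CONSTRAINT UQ_MyTable_TableName UNIQUE (TableName);

CREATE PROCEDURE dbo.AddNewRow
    @TableName nvarchar(50)
AS
BEGIN
    SET NOCOUNT ON;

    -- Cheap pre-check: avoids raising an error in the common case.
    IF EXISTS (SELECT 1 FROM MyTable WHERE TableName = @TableName)
        RETURN 1;   -- hypothetical code for "name already exists"

    BEGIN TRY
        INSERT INTO MyTable (TableName) VALUES (@TableName);
    END TRY
    BEGIN CATCH
        -- Another session may have inserted the same name between the check
        -- and the insert; 2601 and 2627 are the duplicate-key error numbers.
        IF ERROR_NUMBER() IN (2601, 2627)
            RETURN 1;
        THROW;
    END CATCH

    RETURN 0;       -- row added
END;
```

The pre-check alone is not safe under concurrency, which is why the CATCH block still handles the duplicate-key error.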

TSQL ID generation

I have a question regarding locking in TSQL. Suppose I have the following table:
A(int id, varchar name)
where id is the primary key, but is NOT an identity column.
I want to use the following pseudocode to insert a value into this table:
lock (A)
uniqueID = GenerateUniqueID()
insert into A values (uniqueID, somename)
unlock(A)
How can this be accomplished in terms of T-SQL? The computation of the next id should be done with the table A locked, in order to prevent other sessions from doing the same operation at the same time and getting the same id.
If you have custom logic that you want to apply in generating the ids, wrap it up into a user-defined function, and then use that function as the default for the column. This should reduce concurrency issues, much as the built-in id generators do, by deferring the generation to the point of insert and piggybacking on the insert's locking behavior.
create table ids (id int, somval varchar(20))
Go
Create function GenerateUniqueID()
returns int as
Begin
declare @ret int
select @ret = max(isnull(id,1)) * 2 from ids
if @ret is null set @ret = 2
return @ret
End
go
alter table ids add Constraint DF_IDS Default(dbo.GenerateUniqueID()) for Id
There are really only three ways to go about this.
Change the ID column to be an IDENTITY column where it auto increments by some value on each insert.
Change the ID column to be a GUID with a default constraint of NEWID() or NEWSEQUENTIALID(). Then you can insert your own value or let the table generate one for you on each insert.
On each insert, start a transaction. Then get the next available ID using something like select max(id)+1 . Do this in a single sql statement if possible in order to limit the possibility of a collision.
On the whole, most people prefer option 1. It's fast, easy to implement, and most people understand it.
I tend to go with option 2 with the apps I work on simply because we tend to scale out (and up) our databases. This means we routinely have apps with a multi-master situation. Be aware that using GUIDs as primary keys can mean your indexes are routinely trashed.
I'd stay away from option 3 unless you just don't have a choice. In which case I'd look at how the datamodel is structured anyway because there's bound to be something wrong.
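For illustration, options 1 and 2 could look roughly like this. The table names A_identity and A_guid, the constraint name and the column sizes are made up so that both variants can be shown side by side.

```sql
-- Option 1: IDENTITY column, the server assigns the next integer.
CREATE TABLE A_identity (
    id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    name varchar(50) NOT NULL
);

INSERT INTO A_identity (name) VALUES ('somename');
SELECT SCOPE_IDENTITY() AS new_id;   -- id generated by the insert above

-- Option 2: GUID key with a default. NEWSEQUENTIALID() can only be used
-- in a DEFAULT constraint; NEWID() could be used here as well.
CREATE TABLE A_guid (
    id   uniqueidentifier NOT NULL
         CONSTRAINT DF_A_guid_id DEFAULT NEWSEQUENTIALID()
         PRIMARY KEY,
    name varchar(50) NOT NULL
);

-- OUTPUT returns the generated key without a second query.
INSERT INTO A_guid (name)
OUTPUT inserted.id
VALUES ('somename');
```

Neither variant needs an explicit lock, because the key is generated as part of the insert itself.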
You use the NEWID() function and you do not need any locking mechanism
You tell a column to be IDENTITY and you do not need any locking mechanism
If you generate these IDs manually and there is a chance that parallel calls could generate the same ID, then use something like this:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
DECLARE @NextID int = dbo.GenerateUniqueID()
WHILE EXISTS (SELECT id FROM A WHERE id = @NextID)
BEGIN
SET @NextID = dbo.GenerateUniqueID()
END
INSERT INTO A (id, name) VALUES (@NextID, 'content')
COMMIT TRANSACTION
@Markus, you should look at using either IDENTITY or NEWID() as noted in the other answers. If you absolutely can't, here's an option for you...
DECLARE @NewID INT
BEGIN TRAN
SELECT @NewID = ISNULL(MAX(ID), 0) + 1
FROM TableA WITH (TABLOCKX)
INSERT TableA
(ID, OtherFields)
VALUES (@NewID, OtherFields)
COMMIT TRAN
If you're using SQL2005+, you can use the OUTPUT clause to do what you're asking, without any kind of lock (the table test1 simulates the table you're inserting into, and since OUTPUT INTO needs a table rather than a scalar variable to hold the results, the temp table #result will do that):
create table test1 (test INT)
create table #result (LastValue INT)
insert into test1
output INSERTED.test into #result (LastValue)
select dbo.GenerateUniqueID()
select LastValue from #result
Just to update an old post: since SQL Server 2012 it is possible to use a feature called Sequence. Sequences are created in much the same way as a function, and it is possible to specify the range, direction (ascending or descending) and rollover point. After that, you can use the NEXT VALUE FOR expression to generate the next value in the range.
See the following documentation from Microsoft.
http://technet.microsoft.com/en-us/library/ff878091.aspx
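A short sketch of how a sequence could be used for the table A from the question. The sequence name, the constraint name and the column size are assumptions.

```sql
-- Table A as described in the question: id is the key, but not an IDENTITY.
CREATE TABLE A (
    id   int NOT NULL PRIMARY KEY,
    name varchar(50) NULL
);
GO

CREATE SEQUENCE dbo.A_id_seq AS int
    START WITH 1
    INCREMENT BY 1;
GO

-- Variant 1: fetch the next value explicitly, then insert.
DECLARE @NextID int = NEXT VALUE FOR dbo.A_id_seq;
INSERT INTO A (id, name) VALUES (@NextID, 'somename');
GO

-- Variant 2: let a default constraint draw from the sequence.
ALTER TABLE A
    ADD CONSTRAINT DF_A_id DEFAULT (NEXT VALUE FOR dbo.A_id_seq) FOR id;
GO

INSERT INTO A (name) VALUES ('another name');
```

Sequence values are handed out independently of transactions, so no table lock is needed, but a rolled-back insert leaves a gap in the numbering.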
