Difference between Identity and Sequence/Trigger for autoincrement - database

I'm new to APEX/Oracle DB and just found out that you would use either a sequence + trigger (usually for versions < 12c) or an identity column (versions >= 12c).
Which is better practice, and what are the differences between the two?
Thanks :)

One big difference is in dealing with parent-child insertions - here you first need to insert the parent, and then use the generated ID value from the parent table as a foreign key in the child table's inserts.
In those instances, with an identity column you either need to be able to use the RETURNING clause to get back the just-inserted ID (not supported in all middleware), or you do the insert of the parent record and then query to get the ID that was created so that you can use it as the FK value in the child table. If your table does not have a natural key to easily identify the just-inserted row - this may be problematic.
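As a rough illustration of the RETURNING route, here is a minimal Oracle sketch. The tables parent_t and child_t and their columns are hypothetical names invented for this example, and IDENTITY requires 12c or later.

```sql
-- Hypothetical demo tables; IDENTITY columns need Oracle 12c or later.
CREATE TABLE parent_t (
  id   NUMBER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  name VARCHAR2(100) NOT NULL
);

CREATE TABLE child_t (
  id        NUMBER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  parent_id NUMBER NOT NULL REFERENCES parent_t (id),
  note      VARCHAR2(100)
);

DECLARE
  l_parent_id parent_t.id%TYPE;
BEGIN
  -- Insert the parent and capture the generated ID in the same statement.
  INSERT INTO parent_t (name)
  VALUES ('Example parent')
  RETURNING id INTO l_parent_id;

  -- Use the captured ID as the foreign key of the child row.
  INSERT INTO child_t (parent_id, note)
  VALUES (l_parent_id, 'First child');

  COMMIT;
END;
/
```

Whether your middleware can bind the RETURNING output variable is exactly the open question raised above; inside PL/SQL it works as shown.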
On the other hand, for those situations, if you do not use IDENTITY you instead first do a SELECT on the sequence to get the next incremental value, and then use that directly in your parent and child insert statements. This is a more portable solution, and is compatible with all Oracle versions if you may need to do an install to an earlier version of Oracle for a given client. In that case you don't have the trigger do the select from the sequence to set the value - you do it yourself.
Yes, it is an extra round-trip to the DB to get the sequence.nextval, but if your middleware doesn't support the RETURNING clause you're going to be doing that round trip to get the inserted ID anyway, and almost certainly using a more expensive query.
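The sequence-first approach could look roughly like the sketch below. All object names (parent_s, child_s and the two sequences) are made up for illustration; a client application would run the SELECT on the sequence itself and bind the value into both inserts, for which the PL/SQL variable stands in here.

```sql
-- Hypothetical demo objects; nothing here depends on 12c features.
CREATE SEQUENCE parent_s_seq;
CREATE SEQUENCE child_s_seq;

CREATE TABLE parent_s (
  id   NUMBER PRIMARY KEY,
  name VARCHAR2(100) NOT NULL
);

CREATE TABLE child_s (
  id        NUMBER PRIMARY KEY,
  parent_id NUMBER NOT NULL REFERENCES parent_s (id),
  note      VARCHAR2(100)
);

DECLARE
  l_parent_id parent_s.id%TYPE;
BEGIN
  -- Step 1: fetch the next value up front (the extra round trip).
  SELECT parent_s_seq.NEXTVAL INTO l_parent_id FROM dual;

  -- Step 2: use that value directly in the parent insert ...
  INSERT INTO parent_s (id, name)
  VALUES (l_parent_id, 'Example parent');

  -- ... and again as the foreign key in the child insert.
  INSERT INTO child_s (id, parent_id, note)
  VALUES (child_s_seq.NEXTVAL, l_parent_id, 'First child');

  COMMIT;
END;
/
```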
Also, if you have a bunch of PL/SQL library code that manipulates data using the very convenient %ROWTYPE conventions, and if your IDENTITY column is set to GENERATED ALWAYS, then you can start running into problems on inserts as noted here. Something to be aware of if thinking of switching to IDENTITY columns underneath an existing code base.
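One way this can show up, as far as I can tell, is a record-based insert: a %ROWTYPE record supplies a value (here NULL) for every column, including the identity column. The table names below are hypothetical, and the exact error is my expectation rather than something taken from the linked note.

```sql
-- Hypothetical demo table with a GENERATED ALWAYS identity column.
CREATE TABLE rowtype_demo (
  id   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name VARCHAR2(100)
);

DECLARE
  l_row rowtype_demo%ROWTYPE;
BEGIN
  l_row.name := 'x';
  -- The record insert passes l_row.id (NULL) explicitly, which a
  -- GENERATED ALWAYS column should reject (I would expect ORA-32795).
  INSERT INTO rowtype_demo VALUES l_row;
END;
/

-- A possible way out: let the identity fill in only when NULL is supplied.
CREATE TABLE rowtype_demo2 (
  id   NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY PRIMARY KEY,
  name VARCHAR2(100)
);
```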

There is a third alternative to the two mentioned in the question (IDENTITY column and sequence + trigger): namely, create a sequence and set it as the default on the column (this also requires 12c or later), e.g.:
CREATE SEQUENCE my_sequence;
CREATE TABLE my_table
( my_column NUMBER DEFAULT my_sequence.nextval NOT NULL
, my_other_column DATE DEFAULT SYSDATE NOT NULL
);

Related

SQL: How to have a non-unique "identity column", with automatic SET behaviour?

The idea
So, the title may seem somewhat vague but this is what I would like to have (in Microsoft SQL Server, recent version), to be used from an ASP.NET C# application:
A table with an ordinary primary key, defined as an "official" identity column
some other columns
An additional "logical identity" column
The additional "logical identity" column should have the following properties:
be of type integer
not strictly unique (multiple rows can have the same "logical identity")
mandatory
immutable (once set, it may never change). However DELETE of the row must be allowed.
When not provided at INSERT, set to a not yet used value
The last point is probably the hardest to achieve, so that's the question:
The question
How can I enforce (preferably at the database level) that a mandatory value is always set to a not-yet-used value when it is not provided by the INSERT script?
The thoughts
What I have considered so far:
Having a normal "identity" on that column is not possible because it's not unique among the existing values
Having a random value is not possible, because it must be unique for new values
Extending the SaveChanges method would be problematic, because it would have to query the database
Maybe a database triggered function, but I would hope that there are easier solutions
The context
On some occasions, especially when an additional row with the same "logical identity" is inserted, the application already defines the "logical identity", and that value should be used.
Currently, when the application sets a value as "logical ID" it will be among the existing values. Thus, I could force the database to accept only INSERTed values that already exist at least once. This would help when the database is required to provide new, unique values.
However, if this is some sort of new item, the system should provide a new "logical identity" on the fly, while inserting. It must be guaranteed that no existing value is reused for this.
I will use Entity Framework (Version 6) as my ORM.
If the above requirements are not met, an exception should be thrown on the "Add"
If such a value would be changed, an exception should be thrown on the "Update"
One option is with a SEQUENCE value assigned with a DEFAULT constraint. The immutable requirement is the biggest challenge because SQL Server doesn't provide a declarative way to specify a read-only column so one needs a trigger implementation.
Below is example DDL. I don't know if this technique will pose challenges with EF.
CREATE SEQUENCE SQ_Example_Sequence
    AS int START WITH 1 INCREMENT BY 1;
GO
CREATE TABLE dbo.Example(
     IdentityColumn int NOT NULL IDENTITY
        CONSTRAINT PK_Example PRIMARY KEY CLUSTERED
    ,SomeDataColumn int
    ,SequenceColumn int NOT NULL
        CONSTRAINT DF_Example_SequenceColumn DEFAULT NEXT VALUE FOR SQ_Example_Sequence
);
GO
CREATE TRIGGER TR_Example_Update
    ON dbo.Example FOR UPDATE
AS
IF EXISTS(SELECT 1
          FROM inserted
          JOIN deleted ON inserted.IdentityColumn = deleted.IdentityColumn
          WHERE inserted.SequenceColumn <> deleted.SequenceColumn
          )
BEGIN
    THROW 50000, 'SequenceColumn value cannot be changed', 16;
END;
GO
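A short usage sketch against the dbo.Example table above may make the behaviour clearer. The literal values are arbitrary, and I have not checked how EF maps any of this.

```sql
-- No SequenceColumn supplied: the DEFAULT takes the next sequence value.
INSERT INTO dbo.Example (SomeDataColumn) VALUES (10);

-- SequenceColumn supplied: the application reuses an existing logical ID.
INSERT INTO dbo.Example (SomeDataColumn, SequenceColumn)
SELECT 11, SequenceColumn
FROM dbo.Example
WHERE SomeDataColumn = 10;

-- Changing another column is allowed by the trigger.
UPDATE dbo.Example SET SomeDataColumn = 12 WHERE SomeDataColumn = 11;

-- Changing SequenceColumn should be rejected by the trigger (error 50000).
-- Kept as the last statement because the error aborts the batch.
UPDATE dbo.Example SET SequenceColumn = SequenceColumn + 1000 WHERE SomeDataColumn = 10;
```

Note that nothing here stops the application from supplying a value the sequence has not reached yet; that would need an additional check if it matters.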

SQL Server re-uses the same IDENTITY Id twice

I hope the question is not too generic.
I have a table Person that has a PK Identity column Id.
Via C#, I insert new entries for Person and the Id get set to 1,2,3 for the 3 persons added.
Also via C#, I perform all deletions of the persons with Id=1,2,3 so that there's no Person in the Table anymore.
Afterwards, I run some change scripts (I can't post them as they are too long) also on Table Person.
I don't do any RESEED.
Now the fun:
If I call SELECT IDENT_CURRENT('Person') it shows 3 instead of 4.
If I do an insert of Person again, I get a Person with the Id 3 added instead of Id 4.
Any idea why and how this can happen?
EDIT
I think I found the explanation of my question:
While performing DB changes via SQL Server Management Studio, the Designer creates
a temp table Tmp_Person and moves the data from Person into it. Afterwards it renames Tmp_Person to Person. Since this is a new table, the identity counter starts again from the beginning.
An IDENTITY property doesn't guarantee uniqueness. That's what a PRIMARY KEY or UNIQUE INDEX is for. This is covered in the documentation in the remarks section, along with other intended behaviour. CREATE TABLE (Transact-SQL) IDENTITY (Property) - Remarks:
The identity property on a column does not guarantee the following:
Uniqueness of the value - Uniqueness must be enforced by using a PRIMARY KEY or UNIQUE constraint or UNIQUE index.
Consecutive values within a transaction - A transaction inserting multiple rows is not guaranteed to get consecutive values for the rows because other concurrent inserts might occur on the table. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
Consecutive values after server restart or other failures - SQL Server might cache identity values for performance reasons and some of the assigned values can be lost during a database failure or server restart. This can result in gaps in the identity value upon insert. If gaps are not acceptable then the application should use its own mechanism to generate key values. Using a sequence generator with the NOCACHE option can limit the gaps to transactions that are never committed.
Reuse of values - For a given identity property with specific seed/increment, the identity values are not reused by the engine. If a particular insert statement fails or if the insert statement is rolled back then the consumed identity values are lost and will not be generated again. This can result in gaps when the subsequent identity values are generated.
These restrictions are part of the design in order to improve performance, and because they are acceptable in many common situations. If you cannot use identity values because of these restrictions, create a separate table holding a current value and manage access to the table and number assignment with your application.
Emphasis mine for this question.
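If you want to inspect or correct the identity value after a table rebuild like the one described in the question, something along these lines should work. It is only a sketch; the reseed value 3 is taken from the question's scenario and is not a general recommendation.

```sql
-- Compare the identity value SQL Server will continue from
-- with the highest Id actually present in the table.
SELECT IDENT_CURRENT('Person') AS CurrentIdentity,
       MAX(Id)                 AS MaxExistingId
FROM Person;

-- Report the current identity value without changing it.
DBCC CHECKIDENT ('Person', NORESEED);

-- Explicitly set the current identity value. What the next insert
-- receives depends on whether rows were ever inserted into this table,
-- so check the DBCC CHECKIDENT documentation before relying on it.
DBCC CHECKIDENT ('Person', RESEED, 3);
```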

Autoincrement in Entity Framework 5 without identity column in database

I have not been able to find any appropriate solution for my problem, so here's my question for you:
In Entity Framework (5.0), how can I set up an ID column (PK) to be auto-incremented when no identity column is defined in the actual database (SQL Server 2005)?
I have seen the StoreGeneratedPattern, but not sure how this would work without identity in the db. The manual approach would be to manually populate the POCO with MAX(id)+1, but that feels like a hack and I'm worried that it will introduce problems in a multi-threaded environment where multiple requests may insert records to my table at the "same" time.
Note that I do not have the possibility to alter the table schema in the database.
What's the best way to solve this?
If one instance of your application is the only thing inserting rows into this table, then the MAX(Id) + 1 hack is probably good enough. Otherwise, you'll need to alter the database schema to generate these values on insert -- either by using IDENTITY or by re-inventing the wheel using triggers, sprocs, etc.
Whatever your solution, it should guarantee that a duplicate key will never be generated -- even if a transaction happens to roll back one or more inserts.
If nothing else inserts into the table, you should be able to alter Id to an identity column without breaking compatibility.
FYI: Entity Framework's StoreGeneratedPattern (or DatabaseGeneratedOption) only specifies how values are handled on insert and update. Using Identity tells EF that the value is expected to be generated by the database on insert. Computed means it's generated on both insert and update.
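If the schema really cannot change and more than one writer exists, one conceivable way to make MAX(Id) + 1 safe is to take an exclusive table lock for the duration of the insert. This is my own sketch rather than something the answer proposes: dbo.MyTable and its columns are hypothetical, it serializes all inserts into the table, and it would have to be run as raw SQL instead of through EF's normal insert path.

```sql
-- Hypothetical table: dbo.MyTable (Id int PRIMARY KEY, Name nvarchar(50)).
SET XACT_ABORT ON;          -- roll back the whole transaction on error
BEGIN TRANSACTION;

DECLARE @NextId int;

-- TABLOCKX + HOLDLOCK keeps other writers out until COMMIT,
-- so no second session can compute the same MAX(Id) + 1.
SELECT @NextId = ISNULL(MAX(Id), 0) + 1
FROM dbo.MyTable WITH (TABLOCKX, HOLDLOCK);

INSERT INTO dbo.MyTable (Id, Name)
VALUES (@NextId, N'example');

COMMIT TRANSACTION;
```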

Id of object before insertion into database (Linq to SQL)

From what I gather, Linq to SQL doesn't actually execute any database commands (including the opening of database connections) until the SubmitChanges() method is called. If this is the case, I would like to increase the efficiency of a few methods. Is it possible for me to retrieve the ID of an object before inserting it? I'd rather not call SubmitChanges() twice, if it's possible for me to know the value of the ID before it's actually inserted into the database. From a logical point of view, it would only make sense to have to open a connection to the database in order to find out the value, but does an insertion procedure also have to take place?
Thanks
The usual technique to solve this is to generate a unique identifier in the application layer (such as a GUID) and use this as the ID. That way you do not have to retrieve the ID on a subsequent call.
Of course, using a GUID as a primary key can have its drawbacks. If you decide to go this way, look up COMB GUID.
Well, here is the problem: you somehow get the ID BEFORE inserting into the database and do some processing with it. If another thread does the same at the same time and gets the same ID, you've got a conflict.
I.e. I don't think there is an easy way of doing this.
I don't necessarily recommend this, but have seen it done. You can calculate your own ID as an integer using a stored procedure and a table to hold the value of the next ID. The stored procedure selects the value from the table to return it, then increments the value. The table will look something like the following
Create Table Keys(
name varchar(128) not null primary key,
nextID int not null
)
Note that if you select and then update in two separate batches, two callers can be handed the same key. Both steps need to be performed as a single atomic transaction.
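One way to get that atomicity in T-SQL is to read and increment in a single UPDATE statement. The procedure name dbo.GetNextId, its parameters and the 'Invoice' key name are hypothetical; the sketch assumes the Keys table above lives in the dbo schema.

```sql
-- Seed row for a hypothetical key name.
INSERT INTO dbo.Keys (name, nextID) VALUES ('Invoice', 1);
GO
CREATE PROCEDURE dbo.GetNextId
    @name   varchar(128),
    @nextID int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- A single UPDATE both reads and increments, so two callers cannot
    -- receive the same value. "SET @var = column" captures the value
    -- as it was before this update.
    UPDATE dbo.Keys
    SET @nextID = nextID,
        nextID  = nextID + 1
    WHERE name = @name;
END;
GO
-- Usage: allocate one ID for the 'Invoice' key.
DECLARE @id int;
EXEC dbo.GetNextId @name = 'Invoice', @nextID = @id OUTPUT;
SELECT @id AS AllocatedId;
```

If no row exists for the given name, @nextID simply stays NULL, so a caller should check for that.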

Integration services and identity columns

I am a bit of an SSIS newbie and while the whole system seems straightforward, I don't conceptually understand the process I need to go through in this scenario:
Need to map Invoice and InvoiceLine tables from a source database to two equivalent tables in a destination database - with different identity values.
For each invoice inserted across, I need to get the identity it was assigned and then insert all its lines referencing that new identity
There is a surrogate key on the invoices (the invoice number), however these might also clash with invoice numbers in the target system, hence they would also have to be renumbered.
This must be a common scenario in integration - is there a common solution?
Chris KL - you are correct that this is harder than one would expect. I have three methods for this, which work in different situations:
If the data you are loading is small (hundreds or thousands but not hundreds OF thousands) then you can do this: use an OLEDB command that performs one insert for each parent row and returns the identity value back; then downstream from that join the output from that to the child rows, and insert them. Advantage: intuitive. Disadvantage: scales badly. This method is documented on the web and should be easy to find with a search.
If we are talking about a bigger system where you need bulk loading, then there are two other flavors:
a. If you have exclusive access to the table during the load (really exclusive, enforced in some way) then you can grab the max existing ID from the table, use an SSIS script task to number the rows starting above that max id, then Set Identity Insert On, stuff them in, and Set Identity Insert Off. You then have those script-generated keys in SSIS to assign to the child rows. Advantage: fast and simple, one trip to the DB. Disadvantage: possible errors if some other process inserts into your table at the same time. Brittle.
b. If you don't have exclusive access, then the only way I know of is with a round trip to the DB, thus: Insert all parent rows but keep track of a key for them that is not the identity column (a business key, for example). In a second dataflow, process the child records by using a Lookup transform that uses the business key to fetch the parent ID. Make sure the lookup is tuned appropriately vs. caching, and that the business key is indexed.
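Expressed in plain T-SQL rather than SSIS components, method (b) amounts to roughly the following. Every table and column name is invented for the example, and the sketch assumes the source invoice number is unique within the load and is stored alongside the new identity so that it can serve as the lookup key.

```sql
-- Hypothetical source tables.
CREATE TABLE dbo.SrcInvoice (
    InvoiceId     int PRIMARY KEY,
    InvoiceNumber varchar(20) NOT NULL,
    CustomerName  nvarchar(100)
);
CREATE TABLE dbo.SrcInvoiceLine (
    InvoiceLineId int PRIMARY KEY,
    InvoiceId     int NOT NULL,
    Amount        decimal(12,2)
);

-- Hypothetical destination tables with their own identity values.
CREATE TABLE dbo.DestInvoice (
    InvoiceId           int IDENTITY PRIMARY KEY,
    SourceInvoiceNumber varchar(20) NOT NULL,   -- business key kept for the lookup
    CustomerName        nvarchar(100)
);
CREATE INDEX IX_DestInvoice_Source ON dbo.DestInvoice (SourceInvoiceNumber);
CREATE TABLE dbo.DestInvoiceLine (
    InvoiceLineId int IDENTITY PRIMARY KEY,
    InvoiceId     int NOT NULL REFERENCES dbo.DestInvoice (InvoiceId),
    Amount        decimal(12,2)
);
GO
-- Step 1: load all parents; the destination assigns new identity values.
INSERT INTO dbo.DestInvoice (SourceInvoiceNumber, CustomerName)
SELECT s.InvoiceNumber, s.CustomerName
FROM dbo.SrcInvoice AS s;

-- Step 2: load the children, looking up the new parent ID via the business key.
INSERT INTO dbo.DestInvoiceLine (InvoiceId, Amount)
SELECT d.InvoiceId, l.Amount
FROM dbo.SrcInvoiceLine AS l
JOIN dbo.SrcInvoice  AS s ON s.InvoiceId = l.InvoiceId
JOIN dbo.DestInvoice AS d ON d.SourceInvoiceNumber = s.InvoiceNumber;
```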
OK, this is a good news / bad news situation I'm afraid. First the good news and a bit of background which you may know but I'll put it down in case you don't.
You generally can't insert anything into IDENTITY columns. Of course, like everything else in life there are times when you need to and that can be done with the IDENTITY_INSERT option.
SET IDENTITY_INSERT MyTable ON
INSERT INTO MyTable (
MyIdCol,
Etc…
)
SELECT SourceIdCol,
Etc…
FROM MySourceTable
SET IDENTITY_INSERT MyTable OFF
Now, you say that you have surrogate keys in the target but then you say that they may clash. So I'm a little confused… Are you using the keys from the source (e.g. IDENTITY columns) or are you generating new keys in the target? I would strongly advise against trying to merge the keyspaces in a single key column. If you need to retain the keys then I would suggest a multi-field key using something like SourceSystemId to keep them unique.
Finally the bad news: SSIS doesn't provide a simple means of using the IDENTITY_INSERT option. The only way I've been able to do it is by turning it on in a SQL task that executes before the insert task. You should be able to pass the table name into the script as a variable. Make sure to include another SQL task afterwards to turn it off, because you can only use it on one table at a time.

Resources