Avoiding gaps in an identity column - sql-server

I have a table in MS SQL Server 2008 whose primary key is set to increment automatically. If I delete a row from this table and then insert new rows, numbering continues from the next identity value, which creates a gap in the identity values. My program requires all the identities (keys) to be in sequence.
For example, the Assignment table has 16 rows with sequential identities (1-16). If I delete the row at the 16th position
Delete From Assignment Where assignment_id=16;
and after this operation when I insert a new row
Insert into Assignment(assignment_title)Values('myassignment');
Rather than assigning 16 as the primary key of this new row, it assigns 17.
How can I solve this problem?

Renumbering primary key values is not good database management practice. I suggest you keep the primary key as it is and create a separate index column holding the values you need renumbered. Then create a trigger that runs a routine to renumber every row in the order you expect, finding the gaps and filling them with values incremented from the previous row's value.

This is SQL Server's standard behaviour. If you deleted the row with ID=8 in your example, you would still have a gap.
All you could do is write a function, say getSmallestFreeID, in SQL Server that you call for every insert and that returns the smallest unassigned ID. But you would have to take great care with transactions and ACID.
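The "smallest unassigned ID" idea can be sketched as follows. This is only an illustration, assuming Python's sqlite3 module as a stand-in for SQL Server (so the key is a plain integer column, not an IDENTITY column), and the helper names smallest_free_id and insert_assignment are made up for this example. The lookup and the insert share one transaction so that two writers cannot pick the same free ID.

```python
import sqlite3

def smallest_free_id(conn):
    """Return the smallest positive assignment_id that is not in use."""
    (free_id,) = conn.execute(
        """
        SELECT CASE
            WHEN NOT EXISTS (SELECT 1 FROM Assignment WHERE assignment_id = 1)
                THEN 1
            ELSE (SELECT MIN(a.assignment_id) + 1
                  FROM Assignment AS a
                  WHERE NOT EXISTS (SELECT 1 FROM Assignment AS b
                                    WHERE b.assignment_id = a.assignment_id + 1))
        END
        """
    ).fetchone()
    return free_id

def insert_assignment(conn, title):
    """Pick the smallest free id and insert the row under one write lock."""
    conn.execute("BEGIN IMMEDIATE")  # other writers wait until COMMIT/ROLLBACK
    try:
        new_id = smallest_free_id(conn)
        conn.execute(
            "INSERT INTO Assignment (assignment_id, assignment_title) VALUES (?, ?)",
            (new_id, title),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return new_id

# isolation_level=None means we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Assignment (assignment_id INTEGER PRIMARY KEY, assignment_title TEXT)"
)
for n in range(1, 17):
    conn.execute("INSERT INTO Assignment VALUES (?, ?)", (n, f"assignment {n}"))

conn.execute("DELETE FROM Assignment WHERE assignment_id = 8")
refilled = insert_assignment(conn, "myassignment")  # fills the gap at 8
appended = insert_assignment(conn, "another one")   # no gap left, so 17
```

Serializing every insert like this is the price of the approach: writers queue up behind each other, which is the "great care with transactions" the answer warns about.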

The behavior you desire isn't possible without some post processing logic to renumber the rows.
Consider this scenario:
Session 1 begins a transaction, inserts a row (id=16), but doesn't commit yet.
Session 2 begins a transaction, inserts a row (id=17) and commits.
Session 1 rolls back.
Whether 16 will or will not exist in the table is decided only after 17 has been committed.
And you can't renumber these in a trigger; you'll get deadlocks.
What you probably need to do is query the data and add a row number that is a sequential integer.
Gaps in identity values aren't a problem.
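A row number computed at query time might look like the sketch below. It assumes Python's sqlite3 module (SQLite 3.25 or later, for window functions) as a stand-in for SQL Server; ROW_NUMBER() OVER (ORDER BY ...) is written the same way in T-SQL. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Assignment (assignment_id INTEGER PRIMARY KEY, assignment_title TEXT)"
)
# Stored keys have gaps (3 and 5..16 are missing).
conn.executemany(
    "INSERT INTO Assignment VALUES (?, ?)",
    [(1, "first"), (2, "second"), (4, "fourth"), (17, "myassignment")],
)

# seq is gap-free no matter which keys exist; the stored key never changes.
rows = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY assignment_id) AS seq,
           assignment_id,
           assignment_title
    FROM Assignment
    ORDER BY assignment_id
    """
).fetchall()

for seq, assignment_id, title in rows:
    print(seq, assignment_id, title)
```

The sequential number exists only in the result set, so deletes and rolled-back inserts cannot leave holes in it.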

Well, I recently faced the same problem: I need the ID values in an external C# application in order to retrieve files named exactly after the ID.
Here is what I did to avoid the identity property: I entered the id values manually, because it was a small table. If that is not an option in your case, use a SEQUENCE (available in SQL Server 2012 and later).
Use UPDATE instead of DELETE, so that rows are overwritten rather than removed and the id values stay in order.

Related

SQL Server re-uses the same IDENTITY Id twice

I hope the question is not too generic.
I have a table Person that has a PK Identity column Id.
Via C#, I insert new entries for Person and the Id gets set to 1, 2, 3 for the 3 persons added.
Also via C#, I perform all deletions of the persons with Id=1,2,3 so that there's no Person in the Table anymore.
Afterwards, I run some change scripts (I can't post them as they are too long) also on Table Person.
I don't do any RESEED.
Now the fun:
If I call SELECT IDENT_CURRENT('Person') it shows 3 instead of 4.
If I do an insert of Person again, I get a Person with the Id 3 added instead of Id 4.
Any idea why and how this can happen?
EDIT
I think I found the explanation of my question:
While performing DB changes via SQL Server Management Studio, the Designer creates a temp table Tmp_Person and moves the data from Person into it. Afterwards it renames Tmp_Person to Person. Since this is a new table, the identity starts again from the beginning.
An IDENTITY property doesn't guarantee uniqueness. That's what a PRIMARY KEY or UNIQUE INDEX is for. This is covered in the remarks section of the documentation, along with other intended behaviour. CREATE TABLE (Transact-SQL) IDENTITY (Property) - Remarks:
The identity property on a column does not guarantee the following:
Uniqueness of the value - Uniqueness must be enforced by using a PRIMARY KEY or UNIQUE constraint or UNIQUE index.
Consecutive values within a transaction - A transaction inserting multiple rows is not guaranteed to get consecutive values for the rows because other concurrent inserts might occur on the table. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
Consecutive values after server restart or other failures - SQL Server might cache identity values for performance reasons and some of the assigned values can be lost during a database failure or server restart. This can result in gaps in the identity value upon insert. If gaps are not acceptable then the application should use its own mechanism to generate key values. Using a sequence generator with the NOCACHE option can limit the gaps to transactions that are never committed.
Reuse of values - For a given identity property with specific seed/increment, the identity values are not reused by the engine. If a particular insert statement fails or if the insert statement is rolled back then the consumed identity values are lost and will not be generated again. This can result in gaps when the subsequent identity values are generated.
These restrictions are part of the design in order to improve performance, and because they are acceptable in many common situations. If you cannot use identity values because of these restrictions, create a separate table holding a current value and manage access to the table and number assignment with your application.
Emphasis mine for this question.
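The last remark in the quoted documentation, a separate table holding a current value, can be sketched roughly as below. This is a minimal illustration using Python's sqlite3 module instead of SQL Server; the KeyCounter table and the insert_person helper are hypothetical names. The counter update and the row insert sit in the same transaction, so a failed insert hands its number back instead of burning it.

```python
import sqlite3

# isolation_level=None means we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE KeyCounter (TableName TEXT PRIMARY KEY, CurrentValue INTEGER NOT NULL)"
)
conn.execute("INSERT INTO KeyCounter VALUES ('Person', 0)")

def insert_person(conn, name):
    """Allocate the next key and insert the row in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the counter
    try:
        conn.execute(
            "UPDATE KeyCounter SET CurrentValue = CurrentValue + 1 "
            "WHERE TableName = 'Person'"
        )
        (new_id,) = conn.execute(
            "SELECT CurrentValue FROM KeyCounter WHERE TableName = 'Person'"
        ).fetchone()
        conn.execute("INSERT INTO Person (Id, Name) VALUES (?, ?)", (new_id, name))
        conn.execute("COMMIT")
        return new_id
    except Exception:
        conn.execute("ROLLBACK")  # the counter increment is undone as well
        raise

first_id = insert_person(conn, "Ann")
second_id = insert_person(conn, "Bob")
try:
    insert_person(conn, None)  # violates NOT NULL, so the transaction rolls back
    failed = False
except sqlite3.IntegrityError:
    failed = True
third_id = insert_person(conn, "Cid")  # gets 3; no number was lost
```

The trade-off is the one the documentation hints at: every insert has to wait for the counter row, which is exactly the serialization IDENTITY avoids.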

SQL Server : primary key auto increment - what about deleted rows and free key values?

I'm kind of new to SQL and databases and there's one thing that bothers me.
I'm using SQL Server for my ASP.NET MVC project and my database and its tables were auto-generated by Entity Framework using a code-first approach.
I have a table for book collections - just CollectionId and Name columns.
During development I've made many inserts and deletes in this table, and right now it has 10 rows with Ids 1 to 10 (the initial entries). But when I add a new one, its Id is set to 37. Obviously there were entries with Ids up to 36 in the past, but they are now gone and these numbers seem to be free.
Why, then, does a new entry not get Id 11? Is it a kind of limitation or maybe a security feature?
Thank you for answers.
This is the default behavior of an identity column. Whenever rows are deleted, gaps are left in the identity values.
Remarks from MSDN (IDENTITY):
If an identity column exists for a table with frequent deletions, gaps can occur between identity values. If this is a concern, do not use the IDENTITY property. However, to ensure that no gaps have been created or to fill an existing gap, evaluate the existing identity values before explicitly entering one with SET IDENTITY_INSERT ON.
In addition to the other answer, it also has to do with server performance. The server typically caches a group of IDs in memory to make assignment much faster, since the next number otherwise has to be stored on disk somewhere. So if the server allocates 100 numbers at a time, it only has to write to disk once for every 100 uses (inserts) of the identity.
Trying to track and refill gaps in the sequence would take up a lot of time.
If you create a new table, insert a single row, kill the server and restart, you'll find the next insert will most likely show a gap as large as the number of cached values.
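The caching argument can be made concrete with a toy model. The class below is not SQL Server's implementation, just a pure-Python sketch of the idea; the block size of 100 comes from the example in the answer, and the CachedIdentity name and its methods are invented here.

```python
class CachedIdentity:
    """Toy allocator: reserves values in blocks, one 'disk write' per block."""

    def __init__(self, cache_size=100):
        self.cache_size = cache_size
        self.persisted_high = 0  # highest value recorded "on disk"
        self.last_issued = 0     # lives in memory only, lost on a crash
        self.disk_writes = 0

    def next_value(self):
        if self.last_issued == self.persisted_high:
            # Cache exhausted: reserve another block with a single disk write.
            self.persisted_high += self.cache_size
            self.disk_writes += 1
        self.last_issued += 1
        return self.last_issued

    def crash_and_restart(self):
        # Unused cached values are abandoned; resume after the persisted mark.
        self.last_issued = self.persisted_high

# One hundred inserts cost a single disk write.
busy = CachedIdentity(cache_size=100)
values = [busy.next_value() for _ in range(100)]
writes_for_100 = busy.disk_writes

# New table, one insert, kill the server, insert again.
fresh = CachedIdentity(cache_size=100)
before_crash = fresh.next_value()  # 1
fresh.crash_and_restart()
after_crash = fresh.next_value()   # 101: values 2..100 were never used
```

In this model the gap after a crash is always the unused remainder of the current block, which matches the behaviour the answer describes.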

why sql server increment the Identity specification?

I am using SQL Server 2012. In my database I have set the primary key on userid, and I have set Identity Specification to Yes, Is Identity to Yes, Identity Increment to 1 and Identity Seed to 1.
I inserted 5 users and their userids are 1, 2, 3, 4, 5. I am sure that I haven't done any other insert since, and no other stored procedure or trigger uses this table; it is a brand-new table. Now, when I inserted a 6th user, it got userid 1001,
the 7th got 1002 and the 8th got 2002.
Why does userid jump like this?
Usually gaps occur when:
1. records are deleted.
2. an error occurs when attempting to insert a new record (e.g. a not-null constraint error); the identity value is skipped and lost.
3. somebody has inserted/updated it with an explicit value (e.g. using the identity_insert option).
4. the increment value is more than 1.
The identity property on a column does not guarantee the following:
Uniqueness of the value – Uniqueness must be enforced by using a PRIMARY KEY or UNIQUE constraint or UNIQUE index.
Consecutive values within a transaction – A transaction inserting multiple rows is not guaranteed to get consecutive values for the rows because other concurrent inserts might occur on the table. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
Consecutive values after server restart or other failures –SQL Server might cache identity values for performance reasons and some of the assigned values can be lost during a database failure or server restart. This can result in gaps in the identity value upon insert. If gaps are not acceptable then the application should use a sequence generator with the NOCACHE option or use their own mechanism to generate key values.
Reuse of values – For a given identity property with specific seed/increment, the identity values are not reused by the engine. If a particular insert statement fails or if the insert statement is rolled back then the consumed identity values are lost and will not be generated again. This can result in gaps when the subsequent identity values are generated.
Also,
If an identity column exists for a table with frequent deletions, gaps can occur between identity values. If this is a concern, do not use the IDENTITY property. However, to make sure that no gaps have been created or to fill an existing gap, evaluate the existing identity values before explicitly entering one with SET IDENTITY_INSERT ON.
Also, check the identity column properties, in particular the Identity Increment value. It should be 1.
Open your table in design view.
Now check that the Identity Seed and Identity Increment values are correct. If not, you must correct them.
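To "evaluate the existing identity values", as the quoted remark puts it, one option is to list the missing ranges. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later) as a stand-in; the LEAD window function used here is also available in SQL Server 2012. The users table is a hypothetical reconstruction of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT)")
# The userids reported in the question.
conn.executemany(
    "INSERT INTO users (userid, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in (1, 2, 3, 4, 5, 1001, 1002, 2002)],
)

# Pair each userid with the next one; a difference above 1 marks a gap.
gaps = conn.execute(
    """
    SELECT userid + 1 AS gap_start, next_id - 1 AS gap_end
    FROM (SELECT userid,
                 LEAD(userid) OVER (ORDER BY userid) AS next_id
          FROM users) AS t
    WHERE next_id - userid > 1
    ORDER BY gap_start
    """
).fetchall()

for gap_start, gap_end in gaps:
    print(f"missing {gap_start}..{gap_end}")
```

This only reports where the holes are; whether they came from deletes, failed inserts or cached values cannot be told from the table contents alone.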

How can I get current autoincrement value

How can I get the last autoincrement value of a specific table right after I open the database? It's not last_insert_rowid(), because there is no insertion transaction. In other words, I want to know in advance which number autoincrement will choose when inserting a new row into a particular table.
It depends on how the autoincremented column has been defined.
If the column definition is INTEGER PRIMARY KEY AUTOINCREMENT, then SQLite will keep the largest ID in an internal table called sqlite_sequence.
If the column definition does NOT contain the keyword AUTOINCREMENT, SQLite will use its ‘regular’ routine to determine the new ID. From the documentation:
The usual algorithm is to give the newly created row a ROWID that is one larger than the largest ROWID in the table prior to the insert. If the table is initially empty, then a ROWID of 1 is used. If the largest ROWID is equal to the largest possible integer (9223372036854775807) then the database engine starts picking positive candidate ROWIDs at random until it finds one that is not previously used. If no unused ROWID can be found after a reasonable number of attempts, the insert operation fails with an SQLITE_FULL error. If no negative ROWID values are inserted explicitly, then automatically generated ROWID values will always be greater than zero.
I remember reading that, for columns without AUTOINCREMENT, the only surefire way to determine the next ID is to VACUUM the database first; that will reset all ID counters to the largest existing ID for that table + 1. But I can’t find that quote anymore, so this may no longer be true.
That said, I agree with slash_rick_dot that fetching auto-incremented IDs beforehand is a bad idea, especially if there’s even a remote chance that another process might write to the database at the same time.
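For what it's worth, the two cases described in this answer can be compared side by side with Python's sqlite3 module. The table names with_auto and without_auto and the last_autoincrement helper are invented for this sketch, and the value read this way is only a snapshot that another writer may invalidate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE with_auto (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
conn.execute("CREATE TABLE without_auto (id INTEGER PRIMARY KEY, name TEXT)")

# Insert three rows into each table, then delete the newest one.
for table in ("with_auto", "without_auto"):
    for name in ("a", "b", "c"):
        conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))
    conn.execute(f"DELETE FROM {table} WHERE id = 3")

def last_autoincrement(conn, table):
    """Largest id ever handed out, or None if sqlite_sequence has no row for the table."""
    row = conn.execute(
        "SELECT seq FROM sqlite_sequence WHERE name = ?", (table,)
    ).fetchone()
    return row[0] if row else None

recorded = last_autoincrement(conn, "with_auto")          # remembers the deleted 3
not_tracked = last_autoincrement(conn, "without_auto")    # no AUTOINCREMENT, no row
max_plain = conn.execute("SELECT MAX(id) FROM without_auto").fetchone()[0]

next_auto = conn.execute("INSERT INTO with_auto (name) VALUES ('d')").lastrowid
next_plain = conn.execute("INSERT INTO without_auto (name) VALUES ('d')").lastrowid
```

With AUTOINCREMENT the deleted id 3 is not handed out again, while the plain INTEGER PRIMARY KEY table reuses it because 3 is one more than the largest remaining ROWID.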
Different databases implement auto-increment differently. But as far as I know, none of them will answer the question you are asking.
The auto increment feature is intended to create a unique ID for a newly added table row. If a row hasn't been inserted yet, then the feature hasn't produced the id.
And it makes sense... If you did get the next auto increment number, what would you do with it? Likely the intent is to assign it as the primary key of the not-yet-inserted new row. But between the time you got the id, and the time you used it to insert the row, the database could have used that id to insert a row for another process.
Your choices are this: manage the creation of ids yourself, or wait until rows are inserted before using their auto-created identifiers.
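The second choice, waiting until the row exists, usually comes down to reading the generated id straight after the insert. A small sketch with Python's sqlite3 module follows; the orders and order_lines tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, customer TEXT)")
conn.execute("CREATE TABLE order_lines (order_id INTEGER REFERENCES orders(id), item TEXT)")

with conn:  # commits on success, rolls back if an exception escapes
    cur = conn.execute("INSERT INTO orders (customer) VALUES (?)", ("alice",))
    order_id = cur.lastrowid  # id the database actually assigned to this row
    conn.executemany(
        "INSERT INTO order_lines (order_id, item) VALUES (?, ?)",
        [(order_id, "pen"), (order_id, "ink")],
    )

line_count = conn.execute(
    "SELECT COUNT(*) FROM order_lines WHERE order_id = ?", (order_id,)
).fetchone()[0]
```

Because the id is read after the insert, no other process can have claimed it in between.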

Does SQL Server guarantee sequential inserting of an identity column?

In other words, is the following "cursoring" approach guaranteed to work:
1. retrieve rows from DB
2. save the largest ID from the returned records for later, e.g. in LastMax
3. later, "SELECT * FROM MyTable WHERE Id > {0}", LastMax
In order for that to work, I have to be sure that every row I didn't get in step 1 has an Id greater than LastMax. Is this guaranteed, or can I run into weird race conditions?
Guaranteed as in absolutely under no circumstances whatsoever could you possibly get a value that might be less than or equal to the current maximum value? No, there is no such guarantee. That said, the circumstances under which that scenario could happen are limited:
Someone turns IDENTITY_INSERT on and inserts an explicit value.
Someone reseeds the identity column.
Someone changes the sign of the increment value (i.e. instead of +1 it is changed to -1)
Assuming none of these circumstances, you are safe from race conditions creating a situation where the next value is lower than an existing value. That said, there is no guarantee that the rows will be committed in the order of their identity values. For example:
Open a transaction, insert into your table with an identity column. Let's say it gets the value 42.
Insert and commit into the same table another value. Let's say it gets value 43.
Until the first transaction is committed, 43 exists but 42 does not. The identity column is simply reserving a value, it is not dictating the order of commits.
I think this can go wrong depending on the duration of transactions
Consider the following sequence of events:
1. Transaction A starts
2. Transaction A performs insert - this creates a new entry in the identity column
3. Transaction B starts
4. Transaction B performs insert - this creates a new entry in the identity column
5. Transaction B commits
6. Your code performs its select and sees the identity value from the 2nd transaction
7. Transaction A commits
The row inserted by Transaction A will never be found by your code. It was not yet committed when step 6 was performed. And when the next query is performed it will not be found, because it has a lower value in the identity column than the query is looking for.
It could work if you perform the query with a read-uncommitted isolation mode
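The sequence of events above can be replayed with a small pure-Python model. It is not a database, only a sketch in which an identity value is reserved at insert time and the row becomes visible at commit time; the Table class and its methods are invented for illustration, and the values 42 and 43 follow the earlier answer's example.

```python
class Table:
    """Toy table: identity values are reserved on insert, rows appear on commit."""

    def __init__(self, first_identity=42):
        self.next_identity = first_identity
        self.committed = {}  # id -> payload, the only rows readers can see

    def begin_insert(self, payload):
        # Reserve an identity value now; the row is not visible yet.
        row_id = self.next_identity
        self.next_identity += 1
        return row_id, payload

    def commit(self, pending):
        row_id, payload = pending
        self.committed[row_id] = payload

    def select_greater_than(self, last_max):
        return sorted(i for i in self.committed if i > last_max)

table = Table()
last_max = 0
seen = []

pending_a = table.begin_insert("row from A")  # A reserves 42 and stays open
pending_b = table.begin_insert("row from B")  # B reserves 43
table.commit(pending_b)                       # B commits first

batch = table.select_greater_than(last_max)   # the poller sees only 43
seen += batch
last_max = max(batch)

table.commit(pending_a)                       # A commits late, with id 42

batch = table.select_greater_than(last_max)   # WHERE Id > 43 finds nothing
seen += batch
missed = sorted(set(table.committed) - set(seen))
```

After the run, row 42 is committed but was never returned to the poller, which is the failure mode the answer describes.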
Identities will always follow the increment that defines the identity:
IDENTITY [(seed ,increment)] http://msdn.microsoft.com/en-us/library/aa933196(SQL.80).aspx
which can be positive or negative (you can have it increment forward or backwards). If you set your identity to increment forward, your identity values will always be larger than the previous ones, but some values may be missing if you roll back an INSERT.
Yes, if you set your identity increment to a positive value your loop logic will work.
The only time records might get inserted that you wouldn't get would be if someone turns identity insert on and manually inserts a record with a skipped id (or in some cases a negative number). This is a fairly rare occurrence and would generally only be done by a system admin, for instance to reinsert an accidentally deleted record.
The only thing that SQL Server guarantees is that your IDENTITY column will always be incremented.
Things to consider though:
If an INSERT fails, the IDENTITY value is incremented anyway;
If a rollback occurs, the IDENTITY column does not return to its previous value;
Which explains why SQL Server doesn't guarantee sequential IDENTITY values.
There is a way to reset an IDENTITY column using the DBCC CHECKIDENT command. But before doing so, please consider the following:
Ensure your IDENTITY column is not referenced by any other table, as the foreign keys would not be updated along with it, which means big trouble ahead;
You might use the SET IDENTITY_INSERT ON/OFF instruction so that you can manually specify the IDENTITY value while INSERTing a row (never forget to turn it off afterward).
An IDENTITY column is one of the most important elements in a DBMS and should never be changed.
Here is a link that should help you: Understanding IDENTITY columns
EDIT: What you seem to be doing should work, as the IDENTITY column behind LastMax will always increment for each INSERTed row. So:
1) Selecting rows from data table;
2) Saving LastMax state;
3) Selecting rows where Id > LastMax.
Step 3) will only select rows whose IDENTITY value is greater than LastMax, that is, rows inserted since LastMax was saved.

Resources