Say I have an MSSQL table with two columns: an int ID column that's the identity column and some other datetime or whatever column. Say the table has 10 records with IDs 1-10. Now I delete the record with ID = 5.
Are there any scenarios where another record will "fill-in" that missing ID? I.e. when would a record be inserted and given an ID of 5?
No, unless you specifically enable identity inserts (typically done when copying tables with identity columns) and insert a row manually with an ID of 5. SQL Server keeps track of the last identity value generated for each table with an identity column and increments that value to obtain the next one on insert.
Only if you manually override the generated identity value by using the SET IDENTITY_INSERT command and then do an insert with ID = 5.
Otherwise SQL Server will always increment to a higher number, and missing slots are never reused.
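The behaviour described in these answers can be checked with a short script. This is a minimal sketch, assuming a scratch database; the table name dbo.IdentityGapDemo is hypothetical, and GO 10 relies on the batch-repeat feature of SSMS/sqlcmd.

```sql
-- Hypothetical table; any table with an IDENTITY column behaves the same way.
CREATE TABLE dbo.IdentityGapDemo
(
    ID INT IDENTITY(1, 1) NOT NULL,
    created_at DATETIME NOT NULL DEFAULT (GETDATE())
);
GO

-- Run the insert ten times so the rows receive IDs 1 through 10.
INSERT INTO dbo.IdentityGapDemo DEFAULT VALUES;
GO 10

DELETE FROM dbo.IdentityGapDemo WHERE ID = 5;

-- A normal insert continues from the last generated value (10), so it gets 11, not 5.
INSERT INTO dbo.IdentityGapDemo DEFAULT VALUES;

-- Only an explicit insert puts 5 back; the column list is required here.
SET IDENTITY_INSERT dbo.IdentityGapDemo ON;
INSERT INTO dbo.IdentityGapDemo (ID, created_at) VALUES (5, GETDATE());
SET IDENTITY_INSERT dbo.IdentityGapDemo OFF;

SELECT ID FROM dbo.IdentityGapDemo ORDER BY ID;
```

Because the explicit value 5 is lower than the current identity value, the counter is left alone and later inserts keep counting upward from 11.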
One scenario not already mentioned where another record will "fill-in" missing IDENTITY values is when the IDENTITY is reseeded. Example (SQL Server 2008):
CREATE TABLE Test
(
ID INTEGER IDENTITY(1, 1) NOT NULL,
data_col INTEGER NOT NULL
);
INSERT INTO Test (data_col)
VALUES (1), (2), (3), (4);
DELETE
FROM Test
WHERE ID BETWEEN 2 AND 3;
DBCC CHECKIDENT ('Test', RESEED, 1)
INSERT INTO Test (data_col)
VALUES (5), (6), (7), (8);
SELECT T1.ID, T1.data_col
FROM Test AS T1
ORDER
BY data_col;
The results are:
ID data_col
1 1
4 4
2 5
3 6
4 7
5 8
This shows that not only are the 'holes' filled in with new auto-generated values, but values that were auto-generated before the reseed are reused and can even duplicate existing IDENTITY values.
I am working on a use case where I need to implement a surrogate key. I have a column ID that should auto-increment by 1 but when I use merge it skips the 2 to 4 sequence.
create or replace table auto_increment(id int primary key autoincrement start 1 increment 1,name varchar(50),city varchar(50));
create or replace table emp(name varchar(50),city varchar(50));
create or replace stream emp_stream on table emp;
insert into emp values('salman','mumbai'),('akshay','pune'),('aamir','mumbai');
merge into auto_increment as a
using emp_stream as e
on e.name=a.name
when matched then update
set a.name=e.name,
a.city=e.city
when not matched then insert (name,city) values(e.name,e.city);
select * from auto_increment;
ID
1
2
3
insert into emp values('aamir','chennai'),('akshay','mumbai'),('ranjikant','chennai'),('mahesh babu','hyderabad');
merge into auto_increment as a
using emp_stream as e
on e.name=a.name
when matched then update
set a.name=e.name,
a.city=e.city
when not matched then insert (name,city) values(e.name,e.city);
select * from auto_increment;
ID
1
2
3
6
7
Why has it skipped 4 and 5? When I use merge again, it gives more gaps in the sequence.
It's already answered here:
MERGE command results in gaps in sequence numbers
Per the Snowflake documentation, Snowflake does not guarantee there will be no gaps in sequences.
https://docs.snowflake.net/manuals/user-guide/querying-sequences.html.
I can say that the Snowflake development team is working on improving sequences for MERGE statements.
In SQL Server, I have created a Table with an ID column that I have made an IDENTITY COLUMN,
EmployeeID int NOT NULL IDENTITY(100,10) PRIMARY KEY
It is my understanding, when I use the IDENTITY feature, it auto increments the EmployeeID. What I don't know/not sure is:
Is that IDENTITY number created, unique?
Does SQL search the entire column in the table to confirm the number created does not already exist?
Can I override that auto increment number manually?
If I did manually override that number, would the number I enter be checked to make sure it is not a duplicate/existing ID number?
Thanks for any help provided.
Is that IDENTITY number created, unique?
Not by itself. Generated values do not repeat as long as nobody reseeds the column or inserts explicit values, but the IDENTITY property does not enforce uniqueness; a PRIMARY KEY or UNIQUE constraint (such as the PRIMARY KEY on your EmployeeID) does.
Does SQL search the entire column in the table to confirm the number created does not already exist?
No, it doesn't need to. The property simply increments the last generated value.
Can I override that auto increment number manually?
Yes, you can. You have to use SET IDENTITY_INSERT TABLENAME ON
If I did manually override that number, would the number I enter be checked to make sure it is not a duplicate/existing ID number?
No, the IDENTITY property won't take care of that; you have to make sure a constraint (PRIMARY KEY or UNIQUE) is in place to catch duplicates.
Below is a simple demo to prove that
create table #temp
(
id int identity(1,1)
)
insert into #temp
default values
go 3
select * from #temp--now id column has 3
set identity_insert #temp on
insert into #temp (id)
values(4)
set identity_insert #temp off
select * from #temp--now id column has 4
insert into #temp
default values
go
select * from #temp--now id column has 5,next value from the last highest
Updating info from comments:
Identity columns can contain gaps once you reseed them; also, you can't update their values.
I am using SQL Server 2012.
I have two tables, tblPerson and tblGender.
tblPerson has 4 columns
ID
Name
Email
GenderID (foreign Key)
tblGender has 2 columns
ID
Gender
tblGender has only two entries, male and female, with IDs 1 and 2.
Now, if I insert bad data into the GenderID column, like 3 or 4, it rejects the value but still increments the identity column value, so when I insert another row, even a valid one, it gets the next ID number.
How can I solve this problem?
First, as was already mentioned, this is normal behavior for an IDENTITY column. SQL Server generates the identity value for the row first and only then rejects the row because of the failed constraint; the generated value is not given back.
My advice is as follows:
1. Leave your ID column with gaps.
2. For a strictly sequential 'EmployeeID' number you can use a Sequence, whose next value you fetch into a variable and then insert: https://msdn.microsoft.com/en-us/library/ff878058(v=sql.110).aspx
Then you should have no gaps in your 'EmployeeID', provided every value you fetch is actually inserted.
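A minimal sketch of the sequence suggestion, assuming SQL Server 2012 or later. The object names (dbo.EmployeeNumberSeq, dbo.EmployeeDemo) are hypothetical and not from the original question. Note that a number taken from a sequence is consumed even if the following insert fails, so this narrows the window for gaps rather than ruling them out.

```sql
-- Hypothetical sequence; NO CACHE avoids losing cached numbers on an unexpected restart.
CREATE SEQUENCE dbo.EmployeeNumberSeq
    AS INT
    START WITH 1
    INCREMENT BY 1
    NO CACHE;
GO

-- Hypothetical table: the identity column stays the surrogate key and may have gaps,
-- while EmployeeNumber is the user-facing sequential number.
CREATE TABLE dbo.EmployeeDemo
(
    ID INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    EmployeeNumber INT NOT NULL UNIQUE,
    Name VARCHAR(50) NOT NULL
);
GO

-- Fetch the next number into a variable, then use it in the insert.
DECLARE @next INT = NEXT VALUE FOR dbo.EmployeeNumberSeq;

INSERT INTO dbo.EmployeeDemo (EmployeeNumber, Name)
VALUES (@next, 'John');

SELECT ID, EmployeeNumber, Name FROM dbo.EmployeeDemo;
```

Validating the data (for example, checking that the GenderID exists) before fetching the number keeps failed inserts from using up sequence values.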
I'll try to explain in simple terms, leaving out the whys and wherefores of how this occurred.
Currently there are 2 databases that need to be merged. They have the same tables etc. In some cases lookup tables are identical, in some cases they are not, and in some cases records in one database have different identity values from their equivalents in the other DB. So it's a mess.
Let us say that on one of the databases we update all the identity values by adding 10,000 to them and update the related records. Then we could import the data as is, and yes, in some cases lookups would have the same value twice with different identities.
The question is not about the above mess :). I want to know: after re-enabling the identity column we will have seed values of
1,2,3,4,5 etc and 10001, 10002, 10003 etc. Should more rows be inserted and they continue from, let's just say, 9999, will the identity column use 10,000 and then 10,004, or will SQL Server complain on the next insert that the identity value is already used?
I just tested this with simple INSERTs: you have to enable IDENTITY_INSERT first for each table you want to import data into (only one table per session can have it ON at a time)
SET IDENTITY_INSERT table ON
Then you can insert your data with their original identity column values (which you'll need in order to maintain the references correctly)
After
SET IDENTITY_INSERT table OFF
SQL Server continues the sequence with the highest element plus one, so in your case (after inserting IDs 10001, 10002, 10003) it would continue with 10004.
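To confirm where the counter stands after such an import, the current identity value can be compared with the highest stored value. A small sketch, assuming a hypothetical table dbo.MyTable with an identity column named ID:

```sql
-- dbo.MyTable and its ID column are placeholder names for the imported table.
SELECT IDENT_CURRENT('dbo.MyTable') AS current_identity,
       (SELECT MAX(ID) FROM dbo.MyTable) AS max_id;

-- Reports the current identity value and the current column value
-- without changing either of them.
DBCC CHECKIDENT ('dbo.MyTable', NORESEED);
```

If the two numbers agree, the next generated value will be one increment above the highest imported ID.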
It's important to realise that, although they frequently appear together, IDENTITY and PRIMARY KEY are two orthogonal concepts¹. So, to the question as asked, the answer is no - an IDENTITY column will quite happily provide a value that has already been used in the same column:
set nocount on
go
create table II (
ID int IDENTITY(1,1) not null,
Value varchar(10) not null
)
insert into II(Value) values ('abc'),('def')
set identity_insert II on
insert into II(ID,Value) values (6,'ghi')
set identity_insert II off
select * from II
insert into II(Value) values ('jkl')
select * from II
GO
dbcc checkident (II, RESEED, 5);
GO
insert into II(Value) values ('mno'),('pqr')
select * from II
Results:
ID Value
----------- ----------
1 abc
2 def
6 ghi
ID Value
----------- ----------
1 abc
2 def
6 ghi
7 jkl
Checking identity information: current identity value '7'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
ID Value
----------- ----------
1 abc
2 def
6 ghi
7 jkl
6 mno
7 pqr
Whereas a PRIMARY KEY will complain if you attempt to insert a duplicate value:
create table III (
ID int IDENTITY(1,1) not null PRIMARY KEY,
Value varchar(10) not null
)
insert into III(Value) values ('abc'),('def')
set identity_insert III on
insert into III(ID,Value) values (6,'ghi')
set identity_insert III off
select * from III
insert into III(Value) values ('jkl')
select * from III
GO
dbcc checkident (III, RESEED, 5);
GO
insert into III(Value) values ('mno'),('pqr')
select * from III
go
(The only difference from the previous script is the table name and the addition of PRIMARY KEY)
Results:
ID Value
----------- ----------
1 abc
2 def
6 ghi
ID Value
----------- ----------
1 abc
2 def
6 ghi
7 jkl
Checking identity information: current identity value '7'.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Msg 2627, Level 14, State 1, Line 1
Violation of PRIMARY KEY constraint 'PK__III__3214EC27FCCBBCB7'. Cannot insert duplicate key in object 'dbo.III'. The duplicate key value is (6).
The statement has been terminated.
ID Value
----------- ----------
1 abc
2 def
6 ghi
7 jkl
¹ The third concept that is frequently conflated with these two is that of the Clustered Index. It's perfectly possible for a table to have a Primary Key, an Identity Column and a Clustered Index that have no columns in common.
I'm using SQL Server 2008.
As per Microsoft, http://msdn.microsoft.com/en-us/library/ms188059.aspx
when I execute the following
set identity_insert MyTable on
-- insert statements here
set identity_insert MyTable off
the identity of the column is set to the maximum value. Can I avoid this?
Consider the following scenario.
My table has 2 rows as follows:
id, name, comm
1, John, 232.43
2, Alex, 353.52
Now, using the above code, when I insert
10, Smith, 334.23
as per the above link, SQL Server automatically sets the identity to 10. So for newly inserted records (without using identity_insert on), id automatically starts with 11.
I want the identity value to be 3 after using identity_insert on/off.
Please help.
Here's a test table for this discussion
create table t4721736 ( id int identity primary key, name varchar(10), comm money )
insert t4721736 select 'John', 232.43 -- id=1
insert t4721736 select 'Alex', 353.52 -- id=2
-- check contents
select * from t4721736
-- do all this in a transaction
BEGIN TRAN
-- dummy insert
insert t4721736 select 'dummy', null
-- get what the id should be
declare @resetto bigint
set @resetto = scope_identity()
-- remove dummy record
delete t4721736 where id = @resetto
-- perform the insert(s)
set identity_insert t4721736 on;
insert t4721736(id,name,comm) select 10000000, 'Smith', 334.23;
set identity_insert t4721736 off;
-- reset the identity
set @resetto = @resetto - 1 -- it needs to be 1 prior
DBCC CHECKIDENT(t4721736, RESEED, @resetto)
COMMIT
This assumes you fully understand (I believe you do) that it will fail as soon as the generated range runs into the records with the nominated IDs. SQL Server won't automatically skip over IDs that already have records attached.
"that will not be a problem, because when I insert using identity_insert on, the value of id will be greater than 10 million, so there will not be any problem of clashing"
To see how this fails, shortcut the process by changing the "10000000" into "10" in the code above. Then, follow up with these:
-- inspect contents, shows records 1,2,10
select * from t4721736
-- next, insert 7 more records, bringing the id up to 9
insert t4721736 select 'U3', 0
insert t4721736 select 'U4', 0
insert t4721736 select 'U5', 0
insert t4721736 select 'U6', 0
insert t4721736 select 'U7', 0
insert t4721736 select 'U8', 0
insert t4721736 select 'U9', 0
Finally, try the next insert below
insert t4721736 select 'U10', 0
You can reset the seed value using DBCC CHECKIDENT:
DBCC CHECKIDENT ("MyTable", RESEED, 3);
GO
However, you have inserted a record with an Id of 10, so yes, the next one will indeed be 11.
This is documented for the command:
If the current identity value for a table is less than the maximum identity value stored in the identity column, it is reset using the maximum value in the identity column.
You can't have it both ways. Either have the lowest ID be the value of the base seed, or not.
If these rows you're inserting are special/magic rows (so they need specific IDs), have you considered making these rows have negative ID values? That way there's no conflict, and the IDENTITY value will not be reset by your adding them.
If it's some other reason why you need to insert these rows with vastly different ID values, perhaps you could expand your question to provide some info on that - we may be able to offer better solutions.
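The negative-ID suggestion can be sketched as follows. The table name dbo.NegativeIdDemo is hypothetical; the point shown is that an explicit value lower than the current identity value leaves the counter alone.

```sql
-- Hypothetical table mirroring the shape of the one in the question.
CREATE TABLE dbo.NegativeIdDemo
(
    id INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    name VARCHAR(10) NOT NULL,
    comm MONEY NULL
);

INSERT INTO dbo.NegativeIdDemo (name, comm)
VALUES ('John', 232.43), ('Alex', 353.52);          -- generated ids 1 and 2

-- The "special" row gets a negative ID, which is below the current identity value,
-- so SQL Server does not move the counter.
SET IDENTITY_INSERT dbo.NegativeIdDemo ON;
INSERT INTO dbo.NegativeIdDemo (id, name, comm) VALUES (-10, 'Smith', 334.23);
SET IDENTITY_INSERT dbo.NegativeIdDemo OFF;

-- The next ordinary insert continues at 3.
INSERT INTO dbo.NegativeIdDemo (name, comm) VALUES ('Next', 0);

SELECT id, name, comm FROM dbo.NegativeIdDemo ORDER BY id;
```

Since generated IDs only grow upward from the seed, they never run into the negative range, so no reseed or dummy row is needed.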
Another way to get around the "planted bug" dilemma is to create your own identity generator procedure and tracking table. The table holds a table name and the value that the next ID should be, so you can reset it to any value at any time. The procedure includes logic to check whether the next generated key already exists; if it does, it increments the key until it finds an ID that does not exist in the table and passes that back to you. This would have to be used on all inserts to work correctly, which is possible with a trigger. The downside is more processing overhead than using a negative number, as Damien_The_Unbeliever suggests.
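A rough sketch of such a generator, under stated assumptions: every name here (dbo.Person, dbo.IdTracker, dbo.GetNextPersonId) is hypothetical, the target table uses a plain INT key instead of IDENTITY, and the procedure is called directly, not from a trigger. It is one possible shape for the idea, not a hardened implementation.

```sql
-- Hypothetical target table: IDs are supplied by the procedure, not by IDENTITY.
CREATE TABLE dbo.Person
(
    id INT NOT NULL PRIMARY KEY,
    name VARCHAR(10) NOT NULL
);

-- Hypothetical tracking table: one row per table, holding the next ID to hand out.
CREATE TABLE dbo.IdTracker
(
    table_name SYSNAME NOT NULL PRIMARY KEY,
    next_id INT NOT NULL
);

INSERT INTO dbo.IdTracker (table_name, next_id) VALUES (N'dbo.Person', 1);
GO

CREATE PROCEDURE dbo.GetNextPersonId
    @new_id INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    SET @new_id = NULL;

    -- Read the counter and advance it in a single statement;
    -- @new_id receives the value from before the update.
    UPDATE dbo.IdTracker
    SET @new_id = next_id,
        next_id = next_id + 1
    WHERE table_name = N'dbo.Person';

    -- Keep advancing while the candidate ID is already taken.
    WHILE EXISTS (SELECT 1 FROM dbo.Person WHERE id = @new_id)
    BEGIN
        UPDATE dbo.IdTracker
        SET @new_id = next_id,
            next_id = next_id + 1
        WHERE table_name = N'dbo.Person';
    END
END;
GO

-- Usage: plant a row manually, reset the counter onto it, then ask for an ID.
INSERT INTO dbo.Person (id, name) VALUES (3, 'Smith');
UPDATE dbo.IdTracker SET next_id = 3 WHERE table_name = N'dbo.Person';

DECLARE @id INT;
EXEC dbo.GetNextPersonId @new_id = @id OUTPUT;   -- 3 is taken, so 4 comes back
INSERT INTO dbo.Person (id, name) VALUES (@id, 'Next');
```

The existence check and the caller's insert are separate steps, so a row planted by another session in between can still collide; the primary key on dbo.Person remains the final safeguard.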