Resetting primary key without deleting/truncating table

I have a table with a primary key. For reasons I can't identify, newly inserted rows get key values with large gaps, like this:
Pk_Col Some_Other_Col
1 A
2 B
3 C
1002 D
1003 E
1901 F
Is there any way I can reset my table like below, without deleting/ truncating the table?
Pk_Col Some_Other_Col
1 A
2 B
3 C
4 D
5 E
6 F

You can't update the IDENTITY column so DELETE/INSERT is the only way. You can reseed the IDENTITY column and recreate the data, like this:
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
INSERT INTO dbo.tbl (Some_Other_Col)
SELECT Some_Other_Col
FROM (DELETE FROM tbl OUTPUT deleted.*) d;
That assumes there are no foreign keys referencing this data.

If you really, really want to have neat identity values you can write a cursor (slow but maintainable) or investigate any number of "how can I find gaps in my sequence" questions on SO and perform an UPDATE accordingly (runs faster but tricky to get right). This becomes much harder once foreign keys point back to this table. Be prepared to re-run this script any time data is put into, or removed from, this table.
Edit: IDENTITY columns cannot be updated per se. You can, however, SET IDENTITY_INSERT dbo.MyTable ON;, INSERT a row with the desired IDENTITY value and the values from the other columns of an existing row, then DELETE the existing row. The net effect on the data is the same as an UPDATE.
The only sensible reason to do this is if your table has about two billion rows and you're about to run out of integers for your identity column. If that's the case you have a whole world of other stuff to worry about, too.
But seriously - listen to @Damien, don't worry about it.
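A minimal sketch of the IDENTITY_INSERT approach from the edit above. It assumes the table is dbo.MyTable with the two columns from the question (Pk_Col as the identity column, Some_Other_Col as the data) and that nothing references the row through a foreign key; treat it as an illustration, not a tested script.

```sql
-- Assumed table: dbo.MyTable (Pk_Col int IDENTITY PRIMARY KEY, Some_Other_Col ...)
-- Goal: move the row with Pk_Col = 1002 to Pk_Col = 4.
BEGIN TRANSACTION;

SET IDENTITY_INSERT dbo.MyTable ON;

-- Copy the existing row under the desired key.
-- An explicit column list is required while IDENTITY_INSERT is ON.
INSERT INTO dbo.MyTable (Pk_Col, Some_Other_Col)
SELECT 4, Some_Other_Col
FROM dbo.MyTable
WHERE Pk_Col = 1002;

SET IDENTITY_INSERT dbo.MyTable OFF;

-- Remove the original row; the net effect is the same as an UPDATE of the key.
DELETE FROM dbo.MyTable
WHERE Pk_Col = 1002;

COMMIT TRANSACTION;
```

Only one table per session can have IDENTITY_INSERT switched on at a time, so switch it off again as soon as the copy is done.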

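ALTER TABLE #temp1
DROP CONSTRAINT PK_Id;
ALTER TABLE #temp1
DROP COLUMN Id;
-- Re-adding the column renumbers the existing rows from 1 (in no guaranteed order)
ALTER TABLE #temp1
ADD Id int IDENTITY(1,1);
Try this one. Note that it drops the primary key constraint, so re-create it afterwards if you still need it.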

Related

Resetting the primary key to 1

I have a script for a Microsoft SQL Server database which has hundreds of tables, and the tables contain data as well. This is the database of a web application. What I want to do is delete the previous records and reset the primary key to 1 or 0.
I have tried
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
but it does not work for me, as in most of the tables the primary key is not an identity column.
I cannot truncate the table, as its primary key is used as an FK in many other tables.
I have also tried adding the identity specification to the primary key of the table, running the CHECKIDENT query, and then changing it back to a non-identity column, but when records are added again the key continues from where it left off.
Making changes in the code is not an option for me.
Please help.

I am not sure what the main objective of your question is. If you need to truncate a lot of tables and change their structure to add an identity property, why can't you disable the FKs? In the past I have used a standard process to rebuild a table and migrate all its data. It involves a group of steps, listed below.
Steps:
1) Disable the FKs so you can alter the structure of your tables. You can find a solution for this task in the following question:
Temporarily disable all foreign key constraints
2) Alter the table to add the new identity property; this is a classic ALTER TABLE xxxxxx process.
3) Execute the statement you posted previously:
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
Try following this path, and if you have any problems just ask.
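The steps above could look roughly like this for a single parent/child pair. The name dbo.child_tbl is hypothetical (it stands for any table with an FK to dbo.tbl), and the sketch assumes dbo.tbl already has an identity column.

```sql
-- Hypothetical names: dbo.tbl is the parent, dbo.child_tbl references it.

-- Stop enforcing the FK and CHECK constraints defined on the child.
ALTER TABLE dbo.child_tbl NOCHECK CONSTRAINT ALL;

-- DELETE rather than TRUNCATE: TRUNCATE is refused while an FK
-- references the table, even if that FK is disabled.
DELETE FROM dbo.child_tbl;
DELETE FROM dbo.tbl;

-- Reset the identity counter (fails if dbo.tbl has no identity column).
DBCC CHECKIDENT ('dbo.tbl', RESEED, 0);

-- Re-enable the constraints and re-validate the existing rows.
ALTER TABLE dbo.child_tbl WITH CHECK CHECK CONSTRAINT ALL;
```

Re-enabling WITH CHECK makes SQL Server validate the data again, so the constraints stay trusted by the optimizer.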

You cannot truncate a table that is referenced by a foreign key. You should remove the relationship first.

My understanding of this question:
You have a database with tables that you want to empty and next have them use primary key values starting at 0 or 1.
Some of these tables use an identity value and you already have a solution for those. (You can find out which columns are identity columns by querying the sys.columns view and looking at the is_identity column.)
Some tables do not use an identity but get their pk values from an unknown source, which we can't modify.
The only solution I see is creating (or modifying) an AFTER INSERT trigger on those tables that subtracts an offset from the new pk value.
E.g.: your "hidden generator" will generate a next value 5254, but you want the next pk value to become one:
CREATE TRIGGER trg_sometable_ai
ON sometable
AFTER INSERT
AS
BEGIN
    -- Shift each newly inserted key down by the fixed offset
    UPDATE st
    SET st.pk_col = st.pk_col - 5253
    FROM sometable AS st
    INNER JOIN INSERTED AS i
        ON i.pk_col = st.pk_col
END
You'll have to determine the next value and thus the "subtract value" for each table.
If the code also inserts child records into tables with a foreign key to this table, and uses the previously generated value, you have to modify those triggers as well...
This is a "last resort" solution and something I would recommend against in any scenario that has other options. Manipulating primary key values is generally not a good idea.
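For the sys.columns lookup mentioned in this answer, a query along these lines should list every identity column in the current database. The column aliases are my own choice.

```sql
-- List all user-table columns that have the IDENTITY property.
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
       OBJECT_NAME(c.object_id)        AS table_name,
       c.name                          AS column_name
FROM sys.columns AS c
JOIN sys.tables  AS t
    ON t.object_id = c.object_id      -- restrict to user tables
WHERE c.is_identity = 1
ORDER BY schema_name, table_name;
```

Tables that do not show up here are the ones whose key values come from somewhere else.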

Converting int primary key to bigint in Sql Server

We have a production table with 770 million rows and change. We want(/need?) to change the Primary ID column from int to bigint to allow for future growth (and to avoid the sudden stop when the 32bit integer space is exhausted)
Experiments in DEV have shown that this is not as simple as altering the column as we would need to drop the index and then re-create it. So far in DEV (which is a bit humbler than PROD) the dropping of the index has not finished after 1 and a half hours. This table is hit 24/7 and having it offline for such a long time is not an option.
Has anyone else had to deal with a similar situation? How did you get it done?
Are there alternatives?
Edit: Additional Info:
The Primary key is clustered.

You could attempt a staged approach.
1) Create a new bigint column
2) Create an insert trigger to keep new entries in sync between the 2 columns
3) Execute an update to populate all the empty values in the bigint column with the converted value
4) Change the primary index on the table from your old id column to the new one
5) Point any FK's and queries to use the new column
6) Change the new column to become your identity column and remove the insert trigger from #2
7) Delete the old ID column
You should end up spreading the pain out over these 7 steps instead of hitting it all at once.
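A rough sketch of the first three steps of that staged approach. The table name dbo.OrderHistory, the column names ID and NewID, the trigger name, and the batch size are all assumptions for illustration; the GO separators assume SSMS or sqlcmd.

```sql
-- Step 1: add the new, nullable bigint column.
ALTER TABLE dbo.OrderHistory ADD NewID bigint NULL;
GO

-- Step 2: keep newly inserted rows in sync.
CREATE TRIGGER trg_OrderHistory_sync
ON dbo.OrderHistory
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE oh
    SET oh.NewID = i.ID
    FROM dbo.OrderHistory AS oh
    INNER JOIN inserted AS i
        ON i.ID = oh.ID;
END;
GO

-- Step 3: backfill existing rows in small batches so each
-- transaction (and its log usage) stays short.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) dbo.OrderHistory
    SET NewID = ID
    WHERE NewID IS NULL;

    SET @rows = @@ROWCOUNT;
END;
```

Finding the next batch with WHERE NewID IS NULL can get slow on a table this size; walking the ID range in fixed windows is a possible alternative.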

Create a parallel table with the longer data type for new rows and UNION the results?

What I had to do was copy the data into a new table with the desired structure (primary/clustered key only; add non-clustered indexes and FKs once the copy is complete). If you don't have the room, you could bcp the data out and back in. You may need an application outage to make this happen.
What doesn't work: alter table Orderhistory alter column ID bigint, because of the primary key. Don't drop the key and then alter the column, as you will just fill your log file and it will take much longer than the copy/bcp.
Never use the SSMS table designer to change a column property; it copies the table into a temp table and then does a rename once done. Look up the alter table alter column syntax and use it, and possibly defragment once complete if you widened a column that sits in the middle of the table.

Creating a Primary Key on a temp table - When?

I have a stored procedure that is working with a large amount of data. I have that data being inserted in to a temp table. The overall flow of events is something like
CREATE TABLE #TempTable (
Col1 NUMERIC(18,0) NOT NULL, --This will not be an identity column.
Col2 INT NOT NULL,
Col3 BIGINT,
Col4 VARCHAR(25) NOT NULL
--Etc...
--
--Create primary key here?
)
INSERT INTO #TempTable
SELECT ...
FROM MyTable
WHERE ...
INSERT INTO #TempTable
SELECT ...
FROM MyTable2
WHERE ...
--
-- ...or create primary key here?
My question is: when is the best time to create a primary key on my #TempTable table? I theorized that I should create the primary key constraint/index after I insert all the data, because otherwise the index has to be reorganized as each row is inserted. But I realized that my underlying assumption might be wrong...
In case it is relevant, the data types shown above are the real ones. In the #TempTable table, Col1 and Col4 will make up my primary key.
Update: In my case, I'm duplicating the primary key of the source tables. I know that the fields that will make up my primary key will always be unique. I have no concern about a failed alter table if I add the primary key at the end.
Though, this aside, my question still stands as which is faster assuming both would succeed?

This depends a lot.
If you make the primary key index clustered after the load, the entire table will be re-written as the clustered index isn't really an index, it is the logical order of the data. Your execution plan on the inserts is going to depend on the indexes in place when the plan is determined, and if the clustered index is in place, it will sort prior to the insert. You will typically see this in the execution plan.
If you make the primary key a simple constraint, it will be a regular (non-clustered) index and the table will simply be populated in whatever order the optimizer determines and the index updated.
I think the overall quickest performance (of this process to load temp table) is usually to write the data as a heap and then apply the (non-clustered) index.
However, as others have noted, the creation of the index could fail. Also, the temp table does not exist in isolation. Presumably there is a best index for reading the data from it for the next step. This index will need to either be in place or created. This is where you have to make a tradeoff of speed here for reliability (apply the PK and any other constraints first) and speed later (have at least the clustered index in place if you are going to have one).
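To make the two options concrete, here is a small sketch using only the two key columns from the question (Col1 and Col4). The table names #TempA and #TempB and the sample rows are made up for illustration.

```sql
-- Option A: declare the clustered key up front; rows are placed in key order as they arrive.
CREATE TABLE #TempA (
    Col1 numeric(18,0) NOT NULL,
    Col4 varchar(25)   NOT NULL,
    PRIMARY KEY CLUSTERED (Col1, Col4)
);
INSERT INTO #TempA (Col1, Col4) VALUES (2, 'b'), (1, 'a');

-- Option B: load into a heap first, then add a non-clustered key.
CREATE TABLE #TempB (
    Col1 numeric(18,0) NOT NULL,
    Col4 varchar(25)   NOT NULL
);
INSERT INTO #TempB (Col1, Col4) VALUES (2, 'b'), (1, 'a');

-- Builds a separate index; the table itself stays a heap.
-- Fails here if the loaded data contains duplicate keys.
ALTER TABLE #TempB ADD PRIMARY KEY NONCLUSTERED (Col1, Col4);
```

Which one wins depends on the data volume and on how the next step reads the temp table, so it is worth comparing execution plans for both.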

If the recovery model of your database is set to simple or bulk-logged, SELECT ... INTO ... UNION ALL may be the fastest solution. SELECT .. INTO is a bulk operation and bulk operations are minimally logged.
eg:
-- first, create the table
SELECT ...
INTO #TempTable
FROM MyTable
WHERE ...
UNION ALL
SELECT ...
FROM MyTable2
WHERE ...
-- now, add a non-clustered primary key:
-- this will *not* recreate the table in the background
-- it will only create a separate index
-- the table will remain stored as a heap
ALTER TABLE #TempTable ADD PRIMARY KEY NONCLUSTERED (NonNullableKeyField)
-- alternatively:
-- this *will* recreate the table in the background
-- and reorder the rows according to the primary key
-- CLUSTERED key word is optional, primary keys are clustered by default
ALTER TABLE #TempTable ADD PRIMARY KEY CLUSTERED (NonNullableKeyField)
Otherwise, Cade Roux had good advice re: before or after.

You may as well create the primary key before the inserts - if the primary key is on an identity column then the inserts will be done sequentially anyway and there will be no difference.

Even more important than performance considerations, if you are not ABSOLUTELY, 100% sure that you will have unique values being inserted into the table, create the primary key first. Otherwise the primary key will fail to be created.
This prevents you from inserting duplicate/bad data.

If you add the primary key when creating the table, the first insert will be free (no checks required.) The second insert just has to see if it's different from the first. The third insert has to check two rows, and so on. The checks will be index lookups, because there's a unique constraint in place.
If you add the primary key after all the inserts, every row has to be matched against every other row. So my guess is that adding a primary key early on is cheaper.
But maybe Sql Server has a really smart way of checking uniqueness. So if you want to be sure, measure it!
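One way to measure it, as a sketch: load the same generated rows into a table with the key declared first and into one where the key is added afterwards, and compare elapsed times. The table names, the row count of 100000, and the use of sys.all_objects as a row source are arbitrary choices, not anything from the question.

```sql
SET NOCOUNT ON;
DECLARE @t0 datetime2;

-- Case 1: primary key exists before the load.
CREATE TABLE #Early (Id int NOT NULL PRIMARY KEY, Val varchar(25) NOT NULL);
SET @t0 = SYSDATETIME();
INSERT INTO #Early (Id, Val)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
SELECT DATEDIFF(millisecond, @t0, SYSDATETIME()) AS pk_first_ms;

-- Case 2: load into a heap, then add the primary key.
CREATE TABLE #Late (Id int NOT NULL, Val varchar(25) NOT NULL);
SET @t0 = SYSDATETIME();
INSERT INTO #Late (Id, Val)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
ALTER TABLE #Late ADD PRIMARY KEY (Id);
SELECT DATEDIFF(millisecond, @t0, SYSDATETIME()) AS pk_last_ms;

DROP TABLE #Early;
DROP TABLE #Late;
```

Run it several times and with your real data shape; generated sequential keys are close to a best case for the key-first variant.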

I was wondering if I could improve a very very "expensive" stored procedure entailing a bunch of checks at each insert across tables and came across this answer. In the Sproc, several temp tables are opened and reference each other. I added the Primary Key to the CREATE TABLE statement (even though my selects use WHERE NOT EXISTS statements to insert data and ensure uniqueness) and my execution time was cut down SEVERELY. I highly recommend using the primary keys. Always at least try it out even when you think you don't need it.

I don't think it makes any significant difference in your case:
either you pay the penalty a little bit at a time, with each single insert
or you'll pay a larger penalty after all the inserts are done, but only once
When you create it up front before the inserts start, you could potentially catch PK violations as the data is being inserted, if the PK value isn't system-created.
But other than that - no big difference, really.
Marc

I wasn't planning to answer this, since I'm not 100% confident on my knowledge of this. But since it doesn't look like you are getting much response ...
My understanding is a PK is a unique index and when you insert each record, your index is updated and optimized. So ... if you add the data first, then create the index, the index is only optimized once.
So, if you are confident your data is clean (without duplicate PK data) then I'd say insert, then add the PK.
But if your data may have duplicate PK data, I'd say create the PK first, so it will bomb out ASAP.

When you add the PK on table creation, the insert check costs about Tn comparisons (where Tn is the n-th triangular number, 1 + 2 + 3 + ... + n = n(n+1)/2), because when you insert the x-th row it is checked against the x - 1 previously inserted rows.
When you add the PK after inserting all the values, the check costs about n^2 comparisons, because each of the n rows is checked against all n existing rows.
The first one does roughly half the work, although both Tn and n^2 grow as O(n^2). This model also assumes every check is a linear scan; with a unique index each check is a lookup instead.
P.S. Example: if you insert 5 rows it is 1 + 2 + 3 + 4 + 5 = 15 operations vs 5^2 = 25 operations

How to insert a new record to table A when table A depends on table B and vice versa

I'm not sure if this is well designed; if it's not, please advise me on how to do this.
I'm using SQL Server 2008.
I have:
TableA (TableA_ID int identity PK, Value varchar(10), TableB_ID FK not null)
TableB (TableB_ID int identity PK, Value varchar(10), TableA_ID FK not null)
The goal is simple:
TableA can have rows only if there is at least 1 row in TableB associated with TableA;
And for each row in TableB, there must be a row associated with it in TableA;
TableA is the "Parent Table", and TableB is the "Children's table", it's something like, a parent should have 1 or more children, and each child can have only 1 parent.
Is this right?
The problem I'm having is with the INSERT statement: if this design is correct, how should I do the INSERT? Temporarily disable the constraints?
Thanks!

TableA (TableA_ID int identity PK, Value varchar(10))
TableB (TableB_ID int identity PK, Value varchar(10), TableA_ID not null)
As a parent, table A does not need to reference table B, since table B already requires a matching row in table A. This is called a one-to-many relationship.
so in table a you might have these values:
1 a
2 b
3 c
and in table b you could have these:
1 asdf 1
2 sdfg 1
3 pof 2
4 dfgbsd 3
now you can make a query to show the data from table a with this:
select b.TableB_ID, b.Value, a.TableA_ID, a.Value
from TableB b
inner join TableA a
on b.TableA_ID=a.TableA_ID

The parents don't depend on the children. You need to remove your reference to Table B in Table A.

You have a circular dependency. These don't really work well for declarative enforcement, you would have to disable the constraints every time you wanted to insert.

That's an unusual requirement. If I was stuck with it (and I would really push back to make sure it was indeed a requirement) I would design it this way:
Make a regular foreign key from table a to table b with a the parent and b the child.
Add a trigger to table a that inserts a record to table b if one does not exist when a table a record is inserted. Add another trigger to table b that deletes the table a record if the last related record in table b is deleted.
Alternatively, you could put the inserts to both tables in a stored proc. Remove all insert rights to the table except through the proc. You would still need the foreign key relationship from table a to table b and the trigger on table b to ensure that if the last record is deleted the table a record is deleted. But you could do away with the trigger on table a in this case.
I would use the first scenario unless there is information in table b that cannot be found from the trigger on table a, say one or more required fields that don't have a value you can figure out from table a.
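The stored-proc variant could be sketched like this, assuming the one-way design where only TableB carries a foreign key to TableA. The procedure name and parameter names are hypothetical.

```sql
-- Inserts a parent row and its first child row as one unit of work.
CREATE PROCEDURE dbo.InsertParentWithChild
    @ParentValue varchar(10),
    @ChildValue  varchar(10)
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;   -- roll back the whole transaction on any error

    BEGIN TRANSACTION;

    INSERT INTO dbo.TableA (Value)
    VALUES (@ParentValue);

    -- Identity value generated by the insert above, in this scope.
    DECLARE @ParentId int = SCOPE_IDENTITY();

    INSERT INTO dbo.TableB (Value, TableA_ID)
    VALUES (@ChildValue, @ParentId);

    COMMIT TRANSACTION;
END;
```

Because both inserts commit together, no other session ever sees a parent without at least one child, provided direct INSERT rights on the tables are revoked as described above.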

I would put the inserts into a proc: disable the constraints, insert the data, enable the constraints. You may need to make sure that this is the only transaction going on whilst the constraints are disabled though.
That could be achieved by making the isolation level SERIALIZABLE, but that in turn could massacre your concurrency.
Kev

How to increment (or reserve) IDENTITY value in SQL Server without inserting into table

Is there a way to reserve or skip or increment value of identity column?
I have two tables joined in a one-to-one relationship. The first one has an IDENTITY PK column, and the second one an int PK (not IDENTITY). I used to insert into the first, get the ID, and insert into the second. And it works OK.
Now I need to insert values in second table without inserting into first.
Now, how to increment IDENTITY seed, so I can insert it into second table, but leave "hole" in ID's of first table?
EDIT: More info
This works:
-- I need new seed number, but not table row
-- so I will insert foo row, get id, and delete it
DECLARE @NewID int;
INSERT INTO TABLE1 (SomeRequiredField) VALUES ('foo');
SET @NewID = SCOPE_IDENTITY();
DELETE FROM TABLE1 WHERE ID=@NewID;
-- Then I can insert in TABLE2
INSERT INTO TABLE2 (ID, Field1, Field2) VALUES (@NewID, 'Value', 'Value');
Once again - this works.
Question is can I get ID without inserting into table?
DBCC needs owner rights; is there a clean user callable SQL to do that?

This situation will make your overall data structure very hard to understand. If there is not a relationship between the values, then break the relationship.
There are ways to get around this to do what you are looking for, but typically it is in a distributed environment and not done because of what appears to be a data model change.

Then it's no longer a one-to-one relationship.
Just break the PK constraint.

Use a DBCC CHECKIDENT statement.

This article from SQL Server Books Online discusses the use of the DBCC CHECKIDENT method to update the identity seed of a table.
From that article:
This example forces the current identity value in the jobs table to a value of 30.
USE pubs
GO
DBCC CHECKIDENT (jobs, RESEED, 30)
GO

I would look into the OUTPUT INTO feature if you are using SQL Server 2005 or greater. This would allow you to insert into your primary table, and take the IDs assigned at that time to create rows in the secondary table.
I am assuming that there is a foreign key constraint enforced - because that would be the only reason you would need to do this in the first place.
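A minimal sketch of the OUTPUT INTO idea, reusing the TABLE1/TABLE2 names from the question. The column names Field1 and Field2 and the table variable @NewIds are placeholders I made up.

```sql
-- Holds the identity values generated by the insert into TABLE1.
DECLARE @NewIds TABLE (ID int NOT NULL);

-- Insert into the primary table and capture the new IDs in one statement.
INSERT INTO TABLE1 (SomeRequiredField)
OUTPUT inserted.ID INTO @NewIds (ID)
VALUES ('foo');

-- Use the captured IDs to create the matching rows in the secondary table.
INSERT INTO TABLE2 (ID, Field1, Field2)
SELECT ID, 'Value', 'Value'
FROM @NewIds;
```

Unlike SCOPE_IDENTITY(), this also works when the first INSERT adds several rows at once, since every generated ID lands in @NewIds.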

How do you plan on matching them up later? I would not put records into the second table without a record in the first; that is why it is set up in a foreign key relationship - to stop that sort of action. Just why do you not want to insert records into the first table anyway? If we knew more about the type of application and why this is necessary we might be able to guide you to a solution.

this might help
SET IDENTITY_INSERT [ database_name . [ schema_name ] . ] table { ON | OFF }
http://msdn.microsoft.com/en-us/library/aa259221(SQL.80).aspx
It allows explicit values to be inserted into the identity column of a table.