Is it safe to add IDENTITY PK Column to existing SQL SERVER table?

After rebuilding all of the tables in one of my SQL SERVER databases, into a new database, I failed to set the 'ID' column to IDENTITY and PRIMARY KEY for many of the tables. Most of them have data.
I discovered this T-SQL, and have successfully implemented it for a couple of the tables already. The new/replaced ID column contains the same values from the previous column (simply because they were from an auto-incremented column in the table I imported from), and my existing stored procedures all still work.
Alter Table ExistingTable
Add NewID Int Identity(1, 1)
Go
Alter Table ExistingTable Drop Column ID
Go
Exec sp_rename 'ExistingTable.NewID', 'ID', 'Column'
--Then open the table in Design View, and set the new/replaced column as the PRIMARY KEY
--I understand that I could set the PK when I create the new IDENTITY column
The new/replaced ID column is now the last column in the table, and so far, I haven't run into issues with the ASP.Net/C# data access objects that call the stored procedures.
As mentioned, each of these tables had no PRIMARY KEY (nor FOREIGN KEY) set. With that in mind, are there any additional steps I should take to ensure the integrity of the database?
I ran across this SO post, which suggests that I should run the 'ALTER TABLE REBUILD' statement, but since there was no PK already set, do I really need to do this?
Ultimately, I just want to be sure I'm not creating issues that won't appear until later in the game, and be sure the methods I'm implementing are sound, logical, and ensure data integrity.
I suppose it might be a better option to DROP/RECREATE the table with the proper PK/IDENTITY column, and I could write some T-SQL to dump the existing data into a TEMP table, then drop/recreate, and re-populate the new table with data from the TEMP table. I specifically avoided this option as it seems much more aggressive, and I don't fully understand what it means for the Stored Procedures/Functions, etc., that depend on these tables.
Here is an example of one of the tables I've performed this on. You can see the NewID values are identical to the original ID.

Give this a go; it's rummaged up from a script we used a few years ago in a similar situation, though I can't remember which version of SQL Server it was used against. If it works out for your scenario, you can adapt it to your tables.
SELECT MAX(Id)+1 FROM CauseCodes -- run and use value below
CREATE TABLE [dbo].[CauseCodesW](
[ID] [int] NOT NULL IDENTITY(put_maxplusone_here,1),
[Code] [varchar](50) NOT NULL,
[Description] [varchar](500) NULL,
[IsActive] [bit] NOT NULL
)
ALTER TABLE CauseCodes SWITCH TO CauseCodesW;
DROP TABLE CauseCodes;
EXEC sp_rename 'CauseCodesW','CauseCodes';
ALTER TABLE CauseCodes ADD CONSTRAINT PK_CauseCodes_Id PRIMARY KEY CLUSTERED (Id);
SELECT * FROM CauseCodes;
You can now find any tables that have FKs to this table and recreate those relationships.
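As a rough sketch of the follow-up checks, the queries below list foreign keys that reference the table and report the identity state after the switch. They assume the table lives in the dbo schema. Note that SQL Server will not drop a table that is still referenced by a foreign key, so the first query is most useful before the DROP TABLE step.

```sql
-- List foreign keys in other tables that reference dbo.CauseCodes.
SELECT fk.name                                  AS ForeignKeyName,
       OBJECT_SCHEMA_NAME(fk.parent_object_id)  AS ReferencingSchema,
       OBJECT_NAME(fk.parent_object_id)         AS ReferencingTable
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID(N'dbo.CauseCodes');

-- Report the current identity value and the current maximum ID
-- without changing anything (NORESEED only reports).
DBCC CHECKIDENT ('dbo.CauseCodes', NORESEED);
```

If the reported identity value is lower than the maximum ID, the seed used in the CREATE TABLE was too small.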

Related

Recommended SQL Server table design for file import and processing

I have a scenario where files will be uploaded into a database table (dbo.FileImport) with each line of the file in a new row. Each row will contain the line data and the name of the file it came from. The file names are unique, but a file may contain a few million lines. Data from multiple files may exist in the table at one time.
Each file is processed and the results are stored in a separate table. After processing the data related to the file, the data is deleted from the import table to keep the table from growing indefinitely.
The table structure is as follows:
CREATE TABLE [dbo].[FileImport] (
[Id] BIGINT IDENTITY (1, 1) NOT NULL,
[FileName] VARCHAR (100) NOT NULL,
[LineData] NVARCHAR (300) NOT NULL
);
During the processing the data for the relevant file is loaded with the following query:
SELECT [LineData] FROM [dbo].[FileImport] WHERE [FileName] = @FileName
And then deleted with the following statement:
DELETE FROM [dbo].[FileImport] WHERE [FileName] = @FileName
My question is pertaining to the table design with regard to performance and longevity...
Is it necessary to have the [Id] column if I never use it (I am concerned about running out of numbers in the Identity eventually too)?
Should I add a PRIMARY KEY Constraint to the [Id] column?
Should I have a CLUSTERED or NONCLUSTERED index for the [FileName] column?
Should I be making use of NOLOCK whenever I query this table (it is updated very regularly)?
Would there be concern of fragmentation with the continual adding and deleting of data to/from this table? If so, how should I handle this?
Any advice or thoughts would be much appreciated. Opinionated designs are welcome ;-)
Update 2017-12-10
I failed to mention that the lines of a file may not be unique. So please take this into account if this affects the recommendation.
An example script in the answer would be an added bonus! ;-)

Is it necessary to have the [Id] column if I never use it (I am
concerned about running out of numbers in the Identity eventually
too)?
It is not necessary to have an unused column. This is not a relational table and will not be referenced by a foreign key so one could make the argument a primary key is unnecessary.
I would not be concerned about running out of 64-bit integer values. bigint can hold a positive value of up to 9,223,372,036,854,775,807. Even if you load 1 billion rows a second, it would take roughly 292 years to run out of values.
Should I add a PRIMARY KEY Constraint to the [Id] column?
I would create a composite clustered primary key on FileName and ID. That would provide an incremental value to facilitate retrieving rows in the order of insertion and the FileName leftmost key column would benefit your queries greatly.
Should I have a CLUSTERED or NONCLUSTERED index for the [FileName]
column?
See above.
Should I be making use of NOLOCK whenever I query this table (it is
updated very regularly)?
No. Assuming you query by FileName, only the rows requested will be touched with the suggested primary key.
Would there be concern of fragmentation with the continual adding and
deleting of data to/from this table? If so, how should I handle this?
Incremental keys avoid fragmentation.
EDIT:
Here's the suggested DDL for the table:
CREATE TABLE dbo.FileImport (
FileName VARCHAR (100) NOT NULL
, RecordNumber BIGINT NOT NULL IDENTITY
, LineData NVARCHAR (300) NOT NULL
, CONSTRAINT PK_FileImport PRIMARY KEY CLUSTERED(FileName, RecordNumber)
);
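A minimal sketch of how the load and cleanup queries could look against this table. The file name value and the batch size of 10,000 are assumptions for illustration, and deleting in batches is an optional addition (not part of the answer above) meant to keep each transaction small when a file has millions of lines.

```sql
DECLARE @FileName VARCHAR(100) = 'example.txt';  -- hypothetical file name

-- Read the lines of one file in insertion order; the clustered key
-- (FileName, RecordNumber) lets this run as a range seek.
SELECT LineData
FROM dbo.FileImport
WHERE FileName = @FileName
ORDER BY RecordNumber;

-- Remove the processed file in batches until no rows remain.
WHILE 1 = 1
BEGIN
    DELETE TOP (10000)
    FROM dbo.FileImport
    WHERE FileName = @FileName;

    IF @@ROWCOUNT = 0 BREAK;
END;
```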

Here is a rough sketch of how I would do it:
CREATE TABLE [FileImport].[FileName] (
[FileId] BIGINT IDENTITY (1, 1) NOT NULL,
[FileName] VARCHAR (100) NOT NULL
);
go
alter table [FileImport].[FileName]
add constraint pk_FileName primary key nonclustered (FileId)
go
create clustered index cix_FileName on [FileImport].[FileName]([FileName])
go
CREATE TABLE [FileImport].[LineData] (
[FileId] BIGINT NOT NULL, -- must match the type of [FileImport].[FileName].[FileId]
[LineDataId] BIGINT IDENTITY (1, 1) NOT NULL,
[LineData] NVARCHAR (300) NOT NULL,
constraint fk_LineData_FileName foreign key (FileId) references [FileImport].[FileName](FileId)
);
go
alter table [FileImport].[LineData]
add constraint pk_LineData primary key clustered (FileId, LineDataId)
go
This adds some normalization so you don't have to store the full file name on every row. You don't have to do it: if you prefer, keep FileName in the second table instead of FileId and cluster the index on (FileName, LineDataId). But since we are using a relational database ...
No need for any additional indexes - tables are sorted by the right keys
Should I be making use of NOLOCK whenever I query this table (it is
updated very regularly)?
If your data means anything to you, don't use it. As a matter of fact, if you have to use it, something is really wrong with your DB architecture. With this indexing, SQL Server will use a Seek operation, which is very fast.
Would there be concern of fragmentation with the continual adding and
deleting of data to/from this table? If so, how should I handle this?
You can set up a maintenance job that rebuilds your indexes and run it nightly with SQL Server Agent (or whatever scheduler you use).
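A sketch of what such a maintenance step might contain, assuming the [FileImport].[LineData] table from the DDL above. The first query only reports fragmentation so you can decide whether a rebuild is worthwhile; the second statement does the rebuild.

```sql
-- Report fragmentation for every index on the table
-- ('LIMITED' is the cheapest scan mode).
SELECT i.name AS IndexName,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'FileImport.LineData'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;

-- Rebuild all indexes on the table, for example from an Agent job step.
ALTER INDEX ALL ON [FileImport].[LineData] REBUILD;
```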

Error occurred while changing is Identity to no in SQL Server

I have to change the ID column from auto-increment to explicitly defined values. For this I go to
database -> tables -> mytable -> design. There I set Is Identity (under Identity Specification) to No. But when I click save it throws an error saying:
Saving changes is not permitted. The changes you have made require the following tables to
be dropped and re-created....
Is there any way to do it without dropping the table? I searched for this error and found the suggestion to run the following query:
SET IDENTITY_INSERT mytable ON
GO
But when I try to insert from code, it throws this error:
Cannot insert explicit value for identity column in table 'mytable' when IDENTITY_INSERT is set to OFF
Is there any way to get out of this problem?

Once identity, always identity. You cannot change the identity property on a column. Technically, you could use IDENTITY_INSERT to get around it, but this requires setting the option on every single insert you do (this setting doesn't persist over sessions). This is probably not what you want.
Your only alternative, if recreating the table isn't an option, is to create a new column that isn't an identity column, then drop the old one:
ALTER TABLE MyTable ADD NotAnID INT NULL;
GO
BEGIN TRANSACTION
UPDATE MyTable SET NotAnID = ID;
ALTER TABLE MyTable ALTER COLUMN NotAnID INT NOT NULL;
ALTER TABLE MyTable DROP COLUMN ID;
EXECUTE sp_rename 'MyTable.NotAnID', 'ID', 'COLUMN';
COMMIT;
This assumes your identity column is NOT NULL (as it usually is), that ID is not the primary key, that it isn't participating in foreign key constraints, and that you want the new column to take place of the old one.
If ID is the primary key, this exercise gets more involved because you need to drop the primary key constraint and recreate it -- which has its own challenges. Doubly so if it's also the clustered index. In this case, you are probably better off recreating the table anyway, because recreating the clustered index means the whole table is rewritten -- this will almost certainly interrupt production work, so you may as well let SSMS do the tough work for you. To allow that, go to Tools -> Options -> Designers and uncheck "Prevent saving changes that require table re-creation".
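For the IDENTITY_INSERT route mentioned at the start of this answer, a minimal sketch looks like the following. The Name column and the inserted values are hypothetical. The setting applies only to the session that runs it, which is why turning it on in a query window has no effect on inserts made from application code on another connection.

```sql
-- Hypothetical table shape: dbo.mytable (ID int IDENTITY, Name nvarchar(50)).
-- Only one table per session can have IDENTITY_INSERT set to ON at a time.
SET IDENTITY_INSERT dbo.mytable ON;

-- An explicit column list is required when supplying the identity value.
INSERT INTO dbo.mytable (ID, Name)
VALUES (42, N'example');

SET IDENTITY_INSERT dbo.mytable OFF;
```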

Is there a way to update primary key Identity specification Increment 1 without dropping Foreign Keys?

I am trying to change a primary key Id to identity to increment 1 on each entry. But the column has been referenced already by other tables. Is there any way to set primary key to auto increment without dropping the foreign keys from other tables?

If the table isn't that large, generate a script to create an identical table, but change the generated schema to:
CREATE TABLE MYTABLE_NEW (
PK INT PRIMARY KEY IDENTITY(1,1),
COL1 TYPEx,
COL2 TYPEx,
COLn
...)
1. Set your database to single-user mode, or make sure no one is in the database or tables you're changing, or change the table you need to change to READ/ONLY.
2. Import your data into MYTABLE_NEW from MYTABLE using SET IDENTITY_INSERT ON.
3. Script your foreign key constraints and save them, in case you need to back out of your change later and/or re-implement them.
4. Drop all the constraints from MYTABLE.
5. Rename MYTABLE to MYTABLE_SAV.
6. Rename MYTABLE_NEW to MYTABLE.
7. Run the constraint scripts to re-implement the constraints on MYTABLE.
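The import and the two renames from the list above can be sketched as follows. The dbo schema and the column names PK, COL1, COL2 are taken from the placeholder DDL or assumed; scripting, dropping, and recreating the constraints is left out.

```sql
-- Copy the rows, keeping the existing key values.
SET IDENTITY_INSERT dbo.MYTABLE_NEW ON;

INSERT INTO dbo.MYTABLE_NEW (PK, COL1, COL2)   -- explicit column list is required
SELECT PK, COL1, COL2
FROM dbo.MYTABLE;

SET IDENTITY_INSERT dbo.MYTABLE_NEW OFF;

-- Swap the tables; the old data stays available as MYTABLE_SAV.
EXEC sp_rename 'dbo.MYTABLE', 'MYTABLE_SAV';
EXEC sp_rename 'dbo.MYTABLE_NEW', 'MYTABLE';
```

After the copy, new rows continue from the highest inserted key, because inserting an explicit value larger than the current identity value moves the current identity value up to it.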
P.S.
You did ask if there was a way to not drop the foreign key constraints. Here's something to try on your test system. On Step 4 run
ALTER TABLE MYTABLE NOCHECK CONSTRAINT ALL
and on Step 7 run ALTER TABLE MYTABLE CHECK CONSTRAINT ALL. I've not tried this myself; it would be interesting to see if this actually works on renamed tables.
You can script all this ahead of time on a test SQL Server or even a copy of the database staged on a production server--to make implementation day a no-brainer and gauge your SLAs for any change control procedures for your company.
You can do a similar methodology by deleting the primary key and re-adding it back, but you'll need to have the same data inserted in the new column before you delete the old column. So you'll be deleting and inserting schema and inserting primary key data with this approach. I like to avoid touching a production table if at all possible and having MYTABLE_SAV around in case "anything" unexpected occurs is a comfort to me personally--as I can tell management "the production data was not touched". But some tables are simply too large for this approach to be worthwhile and, also, tastes and methodologies differ largely from DBA to DBA.

How do you add a unique primary key field automatically in SQL Server?

I am using SQL Server 2012 and need to add a column with a unique primary key. I am about to load several hundred thousand records BULK and just discovered repetition in the field I was going to use. Have seen SEQUENCE and GUID. Need some guidance on the best choice and how to go about setting this up so that the key field is populated during the bulk load.

When you create your table in which you want to insert information create an IDENTITY column. That will serve as an auto-populating column with a unique number for each record.
Here is a link that might help you.
If you have already created your table, adapt this query to your table name and run it to add the new column:
ALTER TABLE mytable
ADD unique_id INT IDENTITY (1,1)

Just a slight update on what’s already posted that includes details for adding primary key constraint
alter table database.schema.table_t
add ID_column int identity(1,1) primary key
If the table already has a primary key, drop it before you execute this statement.

SQL Server: how to constrain a table to contain a single row?

I want to store a single row in a configuration table for my application. I would like to enforce that this table can contain only one row.
What is the simplest way to enforce the single row constraint ?

You make sure one of the columns can only contain one value, and then make that the primary key (or apply a uniqueness constraint).
CREATE TABLE T1(
Lock char(1) not null,
/* Other columns */,
constraint PK_T1 PRIMARY KEY (Lock),
constraint CK_T1_Locked CHECK (Lock='X')
)
I have a number of these tables in various databases, mostly for storing config. It's a lot nicer knowing that, if the config item should be an int, you'll only ever read an int from the DB.

I usually use Damien's approach, which has always worked great for me, but I also add one thing:
CREATE TABLE T1(
Lock char(1) not null DEFAULT 'X',
/* Other columns */,
constraint PK_T1 PRIMARY KEY (Lock),
constraint CK_T1_Locked CHECK (Lock='X')
)
Adding the "DEFAULT 'X'", you will never have to deal with the Lock column, and won't have to remember which was the lock value when loading the table for the first time.
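To make the behaviour concrete, here is a small sketch of the same pattern with one real column. The table name dbo.AppConfig and the MaxRetries column are hypothetical.

```sql
CREATE TABLE dbo.AppConfig (
    Lock       char(1) NOT NULL CONSTRAINT DF_AppConfig_Lock DEFAULT 'X',
    MaxRetries int     NOT NULL,              -- hypothetical config value
    CONSTRAINT PK_AppConfig PRIMARY KEY (Lock),
    CONSTRAINT CK_AppConfig_Locked CHECK (Lock = 'X')
);

-- First insert succeeds; Lock is filled in by the default.
INSERT INTO dbo.AppConfig (MaxRetries) VALUES (3);

-- A second insert fails with a primary key violation,
-- because Lock can only ever be 'X'.
INSERT INTO dbo.AppConfig (MaxRetries) VALUES (5);

-- Changing the configuration is done with an update instead.
UPDATE dbo.AppConfig SET MaxRetries = 5;
```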

You may want to rethink this strategy. In similar situations, I've often found it invaluable to leave the old configuration rows lying around for historical information.
To do that, you actually have an extra column creation_date_time (date/time of insertion or update) and an insert or insert/update trigger which will populate it correctly with the current date/time.
Then, in order to get your current configuration, you use something like:
select * from config_table order by creation_date_time desc fetch first row only
(depending on your DBMS flavour).
That way, you still get to maintain the history for recovery purposes (you can institute cleanup procedures if the table gets too big but this is unlikely) and you still get to work with the latest configuration.
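In SQL Server, the "latest row" query above would use TOP instead of FETCH FIRST. A sketch, reusing the config_table and creation_date_time names from this answer:

```sql
-- Return only the most recently created configuration row.
SELECT TOP (1) *
FROM config_table
ORDER BY creation_date_time DESC;
```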

You can implement an INSTEAD OF Trigger to enforce this type of business logic within the database.
The trigger can contain logic to check if a record already exists in the table and if so, ROLLBACK the Insert.
Now, taking a step back to look at the bigger picture, I wonder if perhaps there is an alternative and more suitable way for you to store this information, perhaps in a configuration file or environment variable for example?
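A rough sketch of such a trigger follows. The table dbo.AppSettings, its SettingValue column, and the trigger name are all hypothetical; the trigger rejects any insert once a row exists, and also rejects multi-row inserts into an empty table.

```sql
CREATE TABLE dbo.AppSettings (
    SettingValue int NOT NULL   -- hypothetical config value
);
GO

CREATE TRIGGER dbo.trg_AppSettings_SingleRow
ON dbo.AppSettings
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the insert if a row already exists,
    -- or if the statement tries to add more than one row.
    IF EXISTS (SELECT 1 FROM dbo.AppSettings)
       OR (SELECT COUNT(*) FROM inserted) > 1
    BEGIN
        RAISERROR('Only one row is allowed in this table.', 16, 1);
        ROLLBACK TRANSACTION;
        RETURN;
    END;

    -- Otherwise perform the insert that the trigger replaced.
    INSERT INTO dbo.AppSettings (SettingValue)
    SELECT SettingValue FROM inserted;
END;
GO
```

Updates are unaffected, so application code can still change the single row.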

I know this is very old, but instead of thinking big, sometimes it is better to think small. Use an identity integer like this:
Create Table TableWhatever
(
keycol int primary key not null identity(1,1)
check(keycol =1),
Col2 varchar(7)
)
This way, each time you try to insert another row, the check constraint raises an error and prevents the insert, since the identity primary key won't accept any value but 1.

Here's a solution I came up with for a lock-type table which can contain only one row, holding a Y or N (an application lock state, for example).
Create the table with one column. I put a check constraint on the one column so that only a Y or N can be put in it. (Or 1 or 0, or whatever)
Insert one row in the table, with the "normal" state (e.g. N means not locked)
Then create an INSERT trigger on the table that only has a SIGNAL (DB2) or RAISERROR (SQL Server) or RAISE_APPLICATION_ERROR (Oracle). This makes it so application code can update the table, but any INSERT fails.
DB2 example:
create table PRICE_LIST_LOCK
(
LOCKED_YN char(1) not null
constraint PRICE_LIST_LOCK_YN_CK check (LOCKED_YN in ('Y', 'N') )
);
--- do this insert when creating the table
insert into PRICE_LIST_LOCK
values ('N');
--- once there is one row in the table, create this trigger
CREATE TRIGGER ONLY_ONE_ROW_IN_PRICE_LIST_LOCK
NO CASCADE
BEFORE INSERT ON PRICE_LIST_LOCK
FOR EACH ROW
SIGNAL SQLSTATE '81000' -- arbitrary user-defined value
SET MESSAGE_TEXT='Only one row is allowed in this table';
Works for me.

I use a bit field for primary key with name IsActive.
So there can be 2 rows at most, and the SQL to get the valid row is:
select * from Settings where IsActive = 1
if the table is named Settings.

The easiest way is to define the ID field as a computed column with the constant value 1 (or any other number), then put a unique index on the ID.
CREATE TABLE [dbo].[SingleRowTable](
[ID] AS ((1)),
[Title] [varchar](50) NOT NULL,
CONSTRAINT [IX_SingleRowTable] UNIQUE NONCLUSTERED
(
[ID] ASC
)
) ON [PRIMARY]

You can write a trigger on the insert action on the table. Whenever someone tries to insert a new row into the table, the insert trigger code removes the latest row.

Old question but how about using IDENTITY(MAX,1) of a small column type?
CREATE TABLE [dbo].[Config](
[ID] [tinyint] IDENTITY(255,1) NOT NULL,
[Config1] [nvarchar](max) NOT NULL,
[Config2] [nvarchar](max) NOT NULL
)

IF NOT EXISTS ( select * from table )
BEGIN
-- Your insert statement
END

Here we can also use a hidden value that stays the same after the first entry in the database. Example:
Student Table:
Id:int
firstname:char
Here, in the entry form, we always specify the same value for the id column. Because of the primary key constraint, every insert after the first one is rejected, so the table has only one row forever, without writing any lock logic.
Hope this helps!
