SQL Server: how to constrain a table to contain a single row? - sql-server

I want to store a single row in a configuration table for my application. I would like to enforce that this table can contain only one row.
What is the simplest way to enforce the single row constraint ?

You make sure one of the columns can only contain one value, and then make that the primary key (or apply a uniqueness constraint).
CREATE TABLE T1(
Lock char(1) not null,
/* Other columns */,
constraint PK_T1 PRIMARY KEY (Lock),
constraint CK_T1_Locked CHECK (Lock='X')
)
I have a number of these tables in various databases, mostly for storing config. It's a lot nicer knowing that, if the config item should be an int, you'll only ever read an int from the DB.

I usually use Damien's approach, which has always worked great for me, but I also add one thing:
CREATE TABLE T1(
Lock char(1) not null DEFAULT 'X',
/* Other columns */,
constraint PK_T1 PRIMARY KEY (Lock),
constraint CK_T1_Locked CHECK (Lock='X')
)
Adding the "DEFAULT 'X'", you will never have to deal with the Lock column, and won't have to remember which was the lock value when loading the table for the first time.

You may want to rethink this strategy. In similar situations, I've often found it invaluable to leave the old configuration rows lying around for historical information.
To do that, you actually have an extra column creation_date_time (date/time of insertion or update) and an insert or insert/update trigger which will populate it correctly with the current date/time.
Then, in order to get your current configuration, you use something like:
select * from config_table order by creation_date_time desc fetch first row only
(depending on your DBMS flavour).
That way, you still get to maintain the history for recovery purposes (you can institute cleanup procedures if the table gets too big but this is unlikely) and you still get to work with the latest configuration.

You can implement an INSTEAD OF Trigger to enforce this type of business logic within the database.
The trigger can contain logic to check if a record already exists in the table and if so, ROLLBACK the Insert.
Now, taking a step back to look at the bigger picture, I wonder if perhaps there is an alternative and more suitable way for you to store this information, perhaps in a configuration file or environment variable for example?

I know this is very old but instead of thinking BIG sometimes better think small use an identity integer like this:
Create Table TableWhatever
(
keycol int primary key not null identity(1,1)
check(keycol =1),
Col2 varchar(7)
)
This way each time you try to insert another row the check constraint will raise preventing you from inserting any row since the identity p key won't accept any value but 1

Here's a solution I came up with for a lock-type table which can contain only one row, holding a Y or N (an application lock state, for example).
Create the table with one column. I put a check constraint on the one column so that only a Y or N can be put in it. (Or 1 or 0, or whatever)
Insert one row in the table, with the "normal" state (e.g. N means not locked)
Then create an INSERT trigger on the table that only has a SIGNAL (DB2) or RAISERROR (SQL Server) or RAISE_APPLICATION_ERROR (Oracle). This makes it so application code can update the table, but any INSERT fails.
DB2 example:
create table PRICE_LIST_LOCK
(
LOCKED_YN char(1) not null
constraint PRICE_LIST_LOCK_YN_CK check (LOCKED_YN in ('Y', 'N') )
);
--- do this insert when creating the table
insert into PRICE_LIST_LOCK
values ('N');
--- once there is one row in the table, create this trigger
CREATE TRIGGER ONLY_ONE_ROW_IN_PRICE_LIST_LOCK
NO CASCADE
BEFORE INSERT ON PRICE_LIST_LOCK
FOR EACH ROW
SIGNAL SQLSTATE '81000' -- arbitrary user-defined value
SET MESSAGE_TEXT='Only one row is allowed in this table';
Works for me.

I use a bit field for primary key with name IsActive.
So there can be 2 rows at most and and the sql to get the valid row is:
select * from Settings where IsActive = 1
if the table is named Settings.

The easiest way is to define the ID field as a computed column by value 1 (or any number ,....), then consider a unique index for the ID.
CREATE TABLE [dbo].[SingleRowTable](
[ID] AS ((1)),
[Title] [varchar](50) NOT NULL,
CONSTRAINT [IX_SingleRowTable] UNIQUE NONCLUSTERED
(
[ID] ASC
)
) ON [PRIMARY]

You can write a trigger on the insert action on the table. Whenever someone tries to insert a new row in the table, fire away the logic of removing the latest row in the insert trigger code.

Old question but how about using IDENTITY(MAX,1) of a small column type?
CREATE TABLE [dbo].[Config](
[ID] [tinyint] IDENTITY(255,1) NOT NULL,
[Config1] [nvarchar](max) NOT NULL,
[Config2] [nvarchar](max) NOT NULL

IF NOT EXISTS ( select * from table )
BEGIN
///Your insert statement
END

Here we can also make an invisible value which will be the same after first entry in the database.Example:
Student Table:
Id:int
firstname:char
Here in the entry box,we have to specify the same value for id column which will restrict as after first entry other than writing lock bla bla due to primary key constraint thus having only one row forever.
Hope this helps!

Related

Recommended SQL Server table design for file import and processing

I have a scenario where files will be uploaded into a database table (dbo.FileImport) with each line of the file in a new row. Each row will contain the line data and the name of the file it came from. The file names are unique but may contain a few million lines. Multiple file's data may exist in the table at one time.
Each file is processed and the results are stored in a separate table. After processing the data related to the file, the data is deleted from the import table to keep the table from growing indefinitely.
The table structure is as follows:
CREATE TABLE [dbo].[FileImport] (
[Id] BIGINT IDENTITY (1, 1) NOT NULL,
[FileName] VARCHAR (100) NOT NULL,
[LineData] NVARCHAR (300) NOT NULL
);
During the processing the data for the relevant file is loaded with the following query:
SELECT [LineData] FROM [dbo].[FileImport] WHERE [FileName] = #FileName
And then deleted with the following statement:
DELETE FROM [dbo].[FileImport] WHERE [FileName] = #FileName
My question is pertaining to the table design with regard to performance and longevity...
Is it necessary to have the [Id] column if I never use it (I am concerned about running out of numbers in the Identity eventually too)?
Should I add a PRIMARY KEY Constraint to the [Id] column?
Should I have a CLUSTERED or NONCLUSTERED index for the [FileName] column?
Should I be making use of NOLOCK whenever I query this table (it is updated very regularly)?
Would there be concern of fragmentation with the continual adding and deleting of data to/from this table? If so, how should I handle this?
Any advice or thoughts would be much appreciated. Opinionated designs are welcome ;-)
Update 2017-12-10
I failed to mention that the lines of a file may not be unique. So please take this into account if this affects the recommendation.
An example script in the answer would be an added bonus! ;-)
Is it necessary to have the [Id] column if I never use it (I am
concerned about running out of numbers in the Identity eventually
too)?
It is not necessary to have an unused column. This is not a relational table and will not be referenced by a foreign key so one could make the argument a primary key is unnecessary.
I would not be concerned about running out of 64-bit integer values. bigint can hold a positive value of up to 36,028,797,018,963,967. It would take centuries to run out of values if you load 1 billion rows a second.
Should I add a PRIMARY KEY Constraint to the [Id] column?
I would create a composite clustered primary key on FileName and ID. That would provide an incremental value to facilitate retrieving rows in the order of insertion and the FileName leftmost key column would benefit your queries greatly.
Should I have a CLUSTERED or NONCLUSTERED index for the [FileName]
column?
See above.
Should I be making use of NOLOCK whenever I query this table (it is
updated very regularly)?
No. Assuming you query by FileName, only the rows requested will be touched with the suggested primary key.
Would there be concern of fragmentation with the continual adding and
deleting of data to/from this table? If so, how should I handle this?
Incremental keys avoid fragmentation.
EDIT:
Here's the suggested DDL for the table:
CREATE TABLE dbo.FileImport (
FileName VARCHAR (100) NOT NULL
, RecordNumber BIGINT NOT NULL IDENTITY
, LineData NVARCHAR (300) NOT NULL
CONSTRAINT PK_FileImport PRIMARY KEY CLUSTERED(FileName, RecordNumber)
);
Here is a rough sketch how I would do it
CREATE TABLE [FileImport].[FileName] (
[FileId] BIGINT IDENTITY (1, 1) NOT NULL,
[FileName] VARCHAR (100) NOT NULL
);
go
alter table [FileImport].[FileName]
add constraint pk_FileName primary key nonclustered (FileId)
go
create clustered index cix_FileName on [FileImport].[FileName]([FileName])
go
CREATE TABLE [FileImport].[LineData] (
[FileId] VARCHAR (100) NOT NULL,
[LineDataId] BIGINT IDENTITY (1, 1) NOT NULL,
[LineData] NVARCHAR (300) NOT NULLL.
constraint fk_LineData_FileName foreign key (FileId) references [FileImport].[FileName](FIleId)
);
alter table [FileImport].[LineData]
add constraint pk_FileName primary key clustered (FileId, LineDataId)
go
This is with some normalization so you don't have to reference your full file name every time - you probably don't have to do (in case you prefer not to and just move FileName to second table instead of the FileId and cluster your index on (FileName, LeneDataId)) it but since we are using relational database ...
No need for any additional indexes - tables are sorted by the right keys
Should I be making use of NOLOCK whenever I query this table (it is
updated very regularly)?
If your data means anything to you, don't use it, It's a matter in fact, if you have to use it - something really wrong with your DB architecture. The way it is indexed SQL Server will use Seek operation which is very fast.
Would there be concern of fragmentation with the continual adding and
deleting of data to/from this table? If so, how should I handle this?
You can set up a maintenance job that rebuilds your indexes and run it nightly with Agent (or what ever)

Auto increment primary key in SQL Server Management Studio 2012

How do I auto increment the primary key in a SQL Server database table? I've had a look through the forum but can't see how to do this.
I've looked at the properties but can't see an option. I saw an answer where you go to the Identity specification property and set it to yes and set the Identity increment to 1, but that section is grayed out and I can't change the no to yes.
There must be a simple way to do this but I can't find it.
Make sure that the Key column's datatype is int and then setting identity manually, as image shows
Or just run this code
-- ID is the name of the [to be] identity column
ALTER TABLE [yourTable] DROP COLUMN ID
ALTER TABLE [yourTable] ADD ID INT IDENTITY(1,1)
the code will run, if ID is not the only column in the table
image reference fifo's
When you're creating the table, you can create an IDENTITY column as follows:
CREATE TABLE (
ID_column INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
...
);
The IDENTITY property will auto-increment the column up from number 1. (Note that the data type of the column has to be an integer.) If you want to add this to an existing column, use an ALTER TABLE command.
Edit:
Tested a bit, and I can't find a way to change the Identity properties via the Column Properties window for various tables. I guess if you want to make a column an identity column, you HAVE to use an ALTER TABLE command.
You have to expand the Identity section to expose increment and seed.
Edit: I assumed that you'd have an integer datatype, not char(10). Which is reasonable I'd say and valid when I posted this answer
Expand your database, expand your table right click on your table and select design from dropdown.
Now go Column properties below of it scroll down and find Identity Specification, expand it and you will find Is Identity make it Yes. Now choose Identity Increment right below of it give the value you want to increment in it.
CREATE TABLE Persons (
Personid int IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
The MS SQL Server uses the IDENTITY keyword to perform an auto-increment feature.
In the example above, the starting value for IDENTITY is 1, and it will increment by 1 for each new record.
Tip: To specify that the "Personid" column should start at value 10 and increment by 5, change it to IDENTITY(10,5).
To insert a new record into the "Persons" table, we will NOT have to specify a value for the "Personid" column (a unique value will be added automatically):
Perhaps I'm missing something but why doesn't this work with the SEQUENCE object? Is this not what you're looking for?
Example:
CREATE SCHEMA blah.
GO
CREATE SEQUENCE blah.blahsequence
START WITH 1
INCREMENT BY 1
NO CYCLE;
CREATE TABLE blah.de_blah_blah
(numbers bigint PRIMARY KEY NOT NULL
......etc
When referencing the squence in say an INSERT command just use:
NEXT VALUE FOR blah.blahsequence
More information and options for SEQUENCE
When you're using Data Type: int you can select the row which you want to get autoincremented and go to the column properties tag. There you can set the identity to 'yes'. The starting value for autoincrement can also be edited there. Hope I could help ;)
I had this issue where I had already created the table and could not change it without dropping the table so what I did was:
(Not sure when they implemented this but had it in SQL 2016)
Right click on the table in the Object Explorer:
Script Table as > DROP And CREATE To > New Query Editor Window
Then do the edit to the script said by Josien; scroll to the bottom where the CREATE TABLE is, find your Primary Key and append IDENTITY(1,1) to the end before the comma. Run script.
The DROP and CREATE script was also helpful for me because of this issue. (Which the generated script handles.)
You can use the keyword IDENTITY as the data type to the column along with PRIMARY KEY constraint when creating the table.
ex:
StudentNumber IDENTITY(1,1) PRIMARY KEY
In here the first '1' means the starting value and the second '1' is the incrementing value.
If the table is already populated it is not possible to change a column to IDENTITY column or convert it to non IDENTITY column. You would need to export all the data out then you can change column type to IDENTITY or vice versa and then import data back.
I know it is painful process but I believe there is no alternative except for using sequence as mentioned in this post.
Be carefull like if you want the ID elements to be contigius or not. As SQLSERVER ID can jump by 1000 .
Examle: before restart ID=11
after restart , you insert new row in the table, then the id will be 1012.
You could do the following: New Table Creation:
-- create new table with Column ID which is Primary Key and Auto Increment --
CREATE TABLE titles(
id INT NOT NULL IDENTITY(1,1) PRIMARY KEY, --Primary Key with Auto-Increment --
keyword VARCHAR(260),
status VARCHAR(10),
);
If you Table Already exists and need to make the changes to ID column to be auto-increment and Primary key, then see below:
ALTER TABLE table DROP COLUMN id; // drop the existing ID in the table
ALTER TABLE table ADD id int IDENTITY(1, 1) NOT NULL; // add new column ID with auto-increment
ALTER TABLE table ADD CONSTRAINT PK_ident_test PRIMARY KEY CLUSTERED (id); // make it primary key

Mysql auto increment alternatives

I have the below table.
create table mytable(
physical_id int auto_increment,
logical_id int,
data varchar(20),
version_start_date datetime,
version_end_date datetime,
primary key(physical_id),
unique key(logical_id,version_start_date, version_end_date)
);
The idea behind the schema is, I want to keep track of modification to
every row and find the valid row on any particular date by checking the
version_start_date and version_end_date. I want my logical id to be
auto_increment, but mysql allows only one id to be auto_increment.
So, I want to set logical_id to physical_id, when creating a new row. I
am able to do it using the trigger.
delimiter $$
create trigger myTrigger before insert on mytable for each row begin set
new.logical_id = (select auto_increment from information_schema.tables
where table_schema = database() and table_name = 'mytable') ; end$$
delimiter ;
Few other options I checked out are, http://feedblog.org/2007/06/20/portable-sequence-generation-with-mysql/ and http://www.redhat.com/docs/en-US/JBoss_Hibernate/3.2.4.sp01.cp03/html/Reference_Guide/Native_SQL-Custom_SQL_for_create_update_and_delete.html
The problem with these approaches is, I have to create a new sequence table and keep inserting a record into that table.
Is there a better alternative?
Thank you
Bala
-- Update
I am not sure why #tpdi, why I need parent, when I can just emulate this with below table.
create table logical_id_seq (
logical_id int auto_increment,
primary key(logical_id)
);
create table mytable (
physical_id int auto_increment,
logical_id int not null references parent(logical_id),
data varchar(20),
version_start_date datetime not null,
version_end_date datetime not null,
primary key(physical_id),
foreign key (logical_id) references logical_id_seq(logical_id),
unique key (logical_id,version_start_date,version_end_date)
);
This is why I prefer Oracle/PostgreSQL sequences to MySQL's auto_increment and SQL Server's IDENTITY - you'd be able to define multiple sequences for this.
Creating a separate table solely for a column in order to use auto_increment for an additional auto_increment is the best solution I can think of, accessed via trigger or stored procedure.
Why wouldn't you want to have a table with the current version, and then an additional table for a history? Then, you use an insert into the current version table to generate your ID, and only query the history table when you actually need the history information.
Now I understand your issue :). Changing my answer.
You will need to use some trigger so that whenever some row is edited logical id is incremented by 1 finding max of logical id available in table.
create table parent (
logical_id int auto_increment,
physical_id int null references mytable(id)
);
create table mytable(
physical_id int auto_increment,
logical_id not null references parent(logical_id),
data varchar(20),
version_start_date datetime,
version_end_date datetime,
primary key(physical_id)
);
To insert a new instance of an existing record, insert into mytable ...., then capture new mytable.id, and update parent's physical_id with it (possibly in a trigger).
Now, to look up the current record, look up the logical id in parent, and use parent.physical_id to join to the correct record in "mytable"
Parent points to the current valid record (of all records) in "mytable".
To find all instances of a logical record, use the logical_id in "mytable".
To insert a wholly new record, first insert into parent to get the new logical_id, then insert the data in "mytable"; this is why we allow parent.physical_id to be nullable.
As to the OP's update: yes, you can "emulate" by just "stealing" a sequence, but that doesn't model what's really going on. "Stealing" the sequence is just an implementation detail; having two tables as I've shown above makes explicit what's really going on, namely: I have a current state (which parent points to) in "mytable", and 0, 1 or many prior states in "mytable". It's a cleaner separation of concerns, a table that tells you "what's current for any id", and a table that tells you "the data of that current state, and the data of prior states".

Creating a Primary Key on a temp table - When?

I have a stored procedure that is working with a large amount of data. I have that data being inserted in to a temp table. The overall flow of events is something like
CREATE #TempTable (
Col1 NUMERIC(18,0) NOT NULL, --This will not be an identity column.
,Col2 INT NOT NULL,
,Col3 BIGINT,
,Col4 VARCHAR(25) NOT NULL,
--Etc...
--
--Create primary key here?
)
INSERT INTO #TempTable
SELECT ...
FROM MyTable
WHERE ...
INSERT INTO #TempTable
SELECT ...
FROM MyTable2
WHERE ...
--
-- ...or create primary key here?
My question is when is the best time to create a primary key on my #TempTable table? I theorized that I should create the primary key constraint/index after I insert all the data because the index needs to be reorganized as the primary key info is being created. But I realized that my underlining assumption might be wrong...
In case it is relevant, the data types I used are real. In the #TempTable table, Col1 and Col4 will be making up my primary key.
Update: In my case, I'm duplicating the primary key of the source tables. I know that the fields that will make up my primary key will always be unique. I have no concern about a failed alter table if I add the primary key at the end.
Though, this aside, my question still stands as which is faster assuming both would succeed?
This depends a lot.
If you make the primary key index clustered after the load, the entire table will be re-written as the clustered index isn't really an index, it is the logical order of the data. Your execution plan on the inserts is going to depend on the indexes in place when the plan is determined, and if the clustered index is in place, it will sort prior to the insert. You will typically see this in the execution plan.
If you make the primary key a simple constraint, it will be a regular (non-clustered) index and the table will simply be populated in whatever order the optimizer determines and the index updated.
I think the overall quickest performance (of this process to load temp table) is usually to write the data as a heap and then apply the (non-clustered) index.
However, as others have noted, the creation of the index could fail. Also, the temp table does not exist in isolation. Presumably there is a best index for reading the data from it for the next step. This index will need to either be in place or created. This is where you have to make a tradeoff of speed here for reliability (apply the PK and any other constraints first) and speed later (have at least the clustered index in place if you are going to have one).
If the recovery model of your database is set to simple or bulk-logged, SELECT ... INTO ... UNION ALL may be the fastest solution. SELECT .. INTO is a bulk operation and bulk operations are minimally logged.
eg:
-- first, create the table
SELECT ...
INTO #TempTable
FROM MyTable
WHERE ...
UNION ALL
SELECT ...
FROM MyTable2
WHERE ...
-- now, add a non-clustered primary key:
-- this will *not* recreate the table in the background
-- it will only create a separate index
-- the table will remain stored as a heap
ALTER TABLE #TempTable ADD PRIMARY KEY NONCLUSTERED (NonNullableKeyField)
-- alternatively:
-- this *will* recreate the table in the background
-- and reorder the rows according to the primary key
-- CLUSTERED key word is optional, primary keys are clustered by default
ALTER TABLE #TempTable ADD PRIMARY KEY CLUSTERED (NonNullableKeyField)
Otherwise, Cade Roux had good advice re: before or after.
You may as well create the primary key before the inserts - if the primary key is on an identity column then the inserts will be done sequentially anyway and there will be no difference.
Even more important than performance considerations, if you are not ABSOLUTELY, 100% sure that you will have unique values being inserted into the table, create the primary key first. Otherwise the primary key will fail to be created.
This prevents you from inserting duplicate/bad data.
If you add the primary key when creating the table, the first insert will be free (no checks required.) The second insert just has to see if it's different from the first. The third insert has to check two rows, and so on. The checks will be index lookups, because there's a unique constraint in place.
If you add the primary key after all the inserts, every row has to be matched against every other row. So my guess is that adding a primary key early on is cheaper.
But maybe Sql Server has a really smart way of checking uniqueness. So if you want to be sure, measure it!
I was wondering if I could improve a very very "expensive" stored procedure entailing a bunch of checks at each insert across tables and came across this answer. In the Sproc, several temp tables are opened and reference each other. I added the Primary Key to the CREATE TABLE statement (even though my selects use WHERE NOT EXISTS statements to insert data and ensure uniqueness) and my execution time was cut down SEVERELY. I highly recommend using the primary keys. Always at least try it out even when you think you don't need it.
I don't think it makes any significant difference in your case:
either you pay the penalty a little bit at a time, with each single insert
or you'll pay a larger penalty after all the inserts are done, but only once
When you create it up front before the inserts start, you could potentially catch PK violations as the data is being inserted, if the PK value isn't system-created.
But other than that - no big difference, really.
Marc
I wasn't planning to answer this, since I'm not 100% confident on my knowledge of this. But since it doesn't look like you are getting much response ...
My understanding is a PK is a unique index and when you insert each record, your index is updated and optimized. So ... if you add the data first, then create the index, the index is only optimized once.
So, if you are confident your data is clean (without duplicate PK data) then I'd say insert, then add the PK.
But if your data may have duplicate PK data, I'd say create the PK first, so it will bomb out ASAP.
When you add PK on table creation - the insert check is O(Tn) (where Tn is "n-th triangular number", which is 1 + 2 + 3 ... + n) because when you insert x-th row, it's checked against previously inserted "x - 1" rows
When you add PK after inserting all the values - the checker is O(n^2) because when you insert x-th row, it's checked against all n existing rows.
First one is obviously faster since O(Tn) is less than O(n^2)
P.S. Example: if you insert 5 rows it is 1 + 2 + 3 + 4 + 5 = 15 operations vs 5^2 = 25 operations

How do I create a unique constraint that also allows nulls?

I want to have a unique constraint on a column which I am going to populate with GUIDs. However, my data contains null values for this columns. How do I create the constraint that allows multiple null values?
Here's an example scenario. Consider this schema:
CREATE TABLE People (
Id INT CONSTRAINT PK_MyTable PRIMARY KEY IDENTITY,
Name NVARCHAR(250) NOT NULL,
LibraryCardId UNIQUEIDENTIFIER NULL,
CONSTRAINT UQ_People_LibraryCardId UNIQUE (LibraryCardId)
)
Then see this code for what I'm trying to achieve:
-- This works fine:
INSERT INTO People (Name, LibraryCardId)
VALUES ('John Doe', 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA');
-- This also works fine, obviously:
INSERT INTO People (Name, LibraryCardId)
VALUES ('Marie Doe', 'BBBBBBBB-BBBB-BBBB-BBBB-BBBBBBBBBBBB');
-- This would *correctly* fail:
--INSERT INTO People (Name, LibraryCardId)
--VALUES ('John Doe the Second', 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA');
-- This works fine this one first time:
INSERT INTO People (Name, LibraryCardId)
VALUES ('Richard Roe', NULL);
-- THE PROBLEM: This fails even though I'd like to be able to do this:
INSERT INTO People (Name, LibraryCardId)
VALUES ('Marcus Roe', NULL);
The final statement fails with a message:
Violation of UNIQUE KEY constraint 'UQ_People_LibraryCardId'. Cannot insert duplicate key in object 'dbo.People'.
How can I change my schema and/or uniqueness constraint so that it allows multiple NULL values, while still checking for uniqueness on actual data?
What you're looking for is indeed part of the ANSI standards SQL:92, SQL:1999 and SQL:2003, ie a UNIQUE constraint must disallow duplicate non-NULL values but accept multiple NULL values.
In the Microsoft world of SQL Server however, a single NULL is allowed but multiple NULLs are not...
In SQL Server 2008, you can define a unique filtered index based on a predicate that excludes NULLs:
CREATE UNIQUE NONCLUSTERED INDEX idx_yourcolumn_notnull
ON YourTable(yourcolumn)
WHERE yourcolumn IS NOT NULL;
In earlier versions, you can resort to VIEWS with a NOT NULL predicate to enforce the constraint.
SQL Server 2008 +
You can create a unique index that accept multiple NULLs with a WHERE clause. See the answer below.
Prior to SQL Server 2008
You cannot create a UNIQUE constraint and allow NULLs. You need set a default value of NEWID().
Update the existing values to NEWID() where NULL before creating the UNIQUE constraint.
SQL Server 2008 And Up
Just filter a unique index:
CREATE UNIQUE NONCLUSTERED INDEX UQ_Party_SamAccountName
ON dbo.Party(SamAccountName)
WHERE SamAccountName IS NOT NULL;
In Lower Versions, A Materialized View Is Still Not Required
For SQL Server 2005 and earlier, you can do it without a view. I just added a unique constraint like you're asking for to one of my tables. Given that I want uniqueness in column SamAccountName, but I want to allow multiple NULLs, I used a materialized column rather than a materialized view:
ALTER TABLE dbo.Party ADD SamAccountNameUnique
AS (Coalesce(SamAccountName, Convert(varchar(11), PartyID)))
ALTER TABLE dbo.Party ADD CONSTRAINT UQ_Party_SamAccountName
UNIQUE (SamAccountNameUnique)
You simply have to put something in the computed column that will be guaranteed unique across the whole table when the actual desired unique column is NULL. In this case, PartyID is an identity column and being numeric will never match any SamAccountName, so it worked for me. You can try your own method—be sure you understand the domain of your data so that there is no possibility of intersection with real data. That could be as simple as prepending a differentiator character like this:
Coalesce('n' + SamAccountName, 'p' + Convert(varchar(11), PartyID))
Even if PartyID became non-numeric someday and could coincide with a SamAccountName, now it won't matter.
Note that the presence of an index including the computed column implicitly causes each expression result to be saved to disk with the other data in the table, which DOES take additional disk space.
Note that if you don't want an index, you can still save CPU by making the expression be precalculated to disk by adding the keyword PERSISTED to the end of the column expression definition.
In SQL Server 2008 and up, definitely use the filtered solution instead if you possibly can!
Controversy
Please note that some database professionals will see this as a case of "surrogate NULLs", which definitely have problems (mostly due to issues around trying to determine when something is a real value or a surrogate value for missing data; there can also be issues with the number of non-NULL surrogate values multiplying like crazy).
However, I believe this case is different. The computed column I'm adding will never be used to determine anything. It has no meaning of itself, and encodes no information that isn't already found separately in other, properly defined columns. It should never be selected or used.
So, my story is that this is not a surrogate NULL, and I'm sticking to it! Since we don't actually want the non-NULL value for any purpose other than to trick the UNIQUE index to ignore NULLs, our use case has none of the problems that arise with normal surrogate NULL creation.
All that said, I have no problem with using an indexed view instead—but it brings some issues with it such as the requirement of using SCHEMABINDING. Have fun adding a new column to your base table (you'll at minimum have to drop the index, and then drop the view or alter the view to not be schema bound). See the full (long) list of requirements for creating an indexed view in SQL Server (2005) (also later versions), (2000).
Update
If your column is numeric, there may be the challenge of ensuring that the unique constraint using Coalesce does not result in collisions. In that case, there are some options. One might be to use a negative number, to put the "surrogate NULLs" only in the negative range, and the "real values" only in the positive range. Alternately, the following pattern could be used. In table Issue (where IssueID is the PRIMARY KEY), there may or may not be a TicketID, but if there is one, it must be unique.
ALTER TABLE dbo.Issue ADD TicketUnique
AS (CASE WHEN TicketID IS NULL THEN IssueID END);
ALTER TABLE dbo.Issue ADD CONSTRAINT UQ_Issue_Ticket_AllowNull
UNIQUE (TicketID, TicketUnique);
If IssueID 1 has ticket 123, the UNIQUE constraint will be on values (123, NULL). If IssueID 2 has no ticket, it will be on (NULL, 2). Some thought will show that this constraint cannot be duplicated for any row in the table, and still allows multiple NULLs.
For people who are using Microsoft SQL Server Manager and want to create a Unique but Nullable index you can create your unique index as you normally would then in your Index Properties for your new index, select "Filter" from the left hand panel, then enter your filter (which is your where clause). It should read something like this:
([YourColumnName] IS NOT NULL)
This works with MSSQL 2012
When I applied the unique index below:
CREATE UNIQUE NONCLUSTERED INDEX idx_badgeid_notnull
ON employee(badgeid)
WHERE badgeid IS NOT NULL;
every non null update and insert failed with the error below:
UPDATE failed because the following SET options have incorrect settings: 'ARITHABORT'.
I found this on MSDN
SET ARITHABORT must be ON when you are creating or changing indexes on computed columns or indexed views. If SET ARITHABORT is OFF, CREATE, UPDATE, INSERT, and DELETE statements on tables with indexes on computed columns or indexed views will fail.
So to get this to work correctly I did this
Right click [Database]-->Properties-->Options-->Other
Options-->Misscellaneous-->Arithmetic Abort Enabled -->true
I believe it is possible to set this option in code using
ALTER DATABASE "DBNAME" SET ARITHABORT ON
but i have not tested this
It can be done in the designer as well
Right click on the Index > Properties to get this window
Create a view that selects only non-NULL columns and create the UNIQUE INDEX on the view:
CREATE VIEW myview
AS
SELECT *
FROM mytable
WHERE mycolumn IS NOT NULL
CREATE UNIQUE INDEX ux_myview_mycolumn ON myview (mycolumn)
Note that you'll need to perform INSERT's and UPDATE's on the view instead of table.
You may do it with an INSTEAD OF trigger:
CREATE TRIGGER trg_mytable_insert ON mytable
INSTEAD OF INSERT
AS
BEGIN
INSERT
INTO myview
SELECT *
FROM inserted
END
It is possible to create a unique constraint on a Clustered Indexed View
You can create the View like this:
CREATE VIEW dbo.VIEW_OfYourTable WITH SCHEMABINDING AS
SELECT YourUniqueColumnWithNullValues FROM dbo.YourTable
WHERE YourUniqueColumnWithNullValues IS NOT NULL;
and the unique constraint like this:
CREATE UNIQUE CLUSTERED INDEX UIX_VIEW_OFYOURTABLE
ON dbo.VIEW_OfYourTable(YourUniqueColumnWithNullValues)
In my experience - if you're thinking a column needs to allow NULLs but also needs to be UNIQUE for values where they exist, you may be modelling the data incorrectly. This often suggests you're creating a separate sub-entity within the same table as a different entity. It probably makes more sense to have this entity in a second table.
In the provided example, I would put LibraryCardId in a separate LibraryCards table with a unique not-null foreign key to the People table:
CREATE TABLE People (
Id INT CONSTRAINT PK_MyTable PRIMARY KEY IDENTITY,
Name NVARCHAR(250) NOT NULL,
)
CREATE TABLE LibraryCards (
LibraryCardId UNIQUEIDENTIFIER CONSTRAINT PK_LibraryCards PRIMARY KEY,
PersonId INT NOT NULL
CONSTRAINT UQ_LibraryCardId_PersonId UNIQUE (PersonId),
FOREIGN KEY (PersonId) REFERENCES People(id)
)
This way you don't need to bother with a column being both unique and nullable. If a person doesn't have a library card, they just won't have a record in the library cards table. Also, if there are additional attributes about the library card (perhaps Expiration Date or something), you now have a logical place to put those fields.
Maybe consider an "INSTEAD OF" trigger and do the check yourself? With a non-clustered (non-unique) index on the column to enable the lookup.
As stated before, SQL Server doesn't implement the ANSI standard when it comes to UNIQUE CONSTRAINT. There is a ticket on Microsoft Connect for this since 2007. As suggested there and here the best options as of today are to use a filtered index as stated in another answer or a computed column, e.g.:
CREATE TABLE [Orders] (
[OrderId] INT IDENTITY(1,1) NOT NULL,
[TrackingId] varchar(11) NULL,
...
[ComputedUniqueTrackingId] AS (
CASE WHEN [TrackingId] IS NULL
THEN '#' + cast([OrderId] as varchar(12))
ELSE [TrackingId_Unique] END
),
CONSTRAINT [UQ_TrackingId] UNIQUE ([ComputedUniqueTrackingId])
)
You can create an INSTEAD OF trigger to check for specific conditions and error if they are met. Creating an index can be costly on larger tables.
Here's an example:
CREATE TRIGGER PONY.trg_pony_unique_name ON PONY.tbl_pony
INSTEAD OF INSERT, UPDATE
AS
BEGIN
IF EXISTS(
SELECT TOP (1) 1
FROM inserted i
GROUP BY i.pony_name
HAVING COUNT(1) > 1
)
OR EXISTS(
SELECT TOP (1) 1
FROM PONY.tbl_pony t
INNER JOIN inserted i
ON i.pony_name = t.pony_name
)
THROW 911911, 'A pony must have a name as unique as s/he is. --PAS', 16;
ELSE
INSERT INTO PONY.tbl_pony (pony_name, stable_id, pet_human_id)
SELECT pony_name, stable_id, pet_human_id
FROM inserted
END
You can't do this with a UNIQUE constraint, but you can do this in a trigger.
CREATE TRIGGER [dbo].[OnInsertMyTableTrigger]
ON [dbo].[MyTable]
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
DECLARE #Column1 INT;
DECLARE #Column2 INT; -- allow nulls on this column
SELECT #Column1=Column1, #Column2=Column2 FROM inserted;
-- Check if an existing record already exists, if not allow the insert.
IF NOT EXISTS(SELECT * FROM dbo.MyTable WHERE Column1=#Column1 AND Column2=#Column2 #Column2 IS NOT NULL)
BEGIN
INSERT INTO dbo.MyTable (Column1, Column2)
SELECT #Column2, #Column2;
END
ELSE
BEGIN
RAISERROR('The unique constraint applies on Column1 %d, AND Column2 %d, unless Column2 is NULL.', 16, 1, #Column1, #Column2);
ROLLBACK TRANSACTION;
END
END
CREATE UNIQUE NONCLUSTERED INDEX [UIX_COLUMN_NAME]
ON [dbo].[Employee]([Username] ASC) WHERE ([Username] IS NOT NULL)
WITH (ALLOW_PAGE_LOCKS = ON, ALLOW_ROW_LOCKS = ON, PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, STATISTICS_NORECOMPUTE = OFF, ONLINE = OFF,
MAXDOP = 0) ON [PRIMARY];
this code if u make a register form with textBox and use insert and ur textBox is empty and u click on submit button .
CREATE UNIQUE NONCLUSTERED INDEX [IX_tableName_Column]
ON [dbo].[tableName]([columnName] ASC) WHERE [columnName] !=`''`;

Resources