Unique index or unique key? - sql-server

What is the difference between a unique index and a unique key?

The unique piece is not where the difference lies. The index and key are not the same thing, and are not comparable.
A key is a data column, or several columns, forced to be unique by a constraint, either a primary key or an explicitly defined unique constraint. An index, by contrast, is a structure that stores data locations for faster retrieval.
From the docs:
Unique Index
Creates a unique index on a table or view. A unique index is one in which no two rows are permitted to have the same index key value. A clustered index on a view must be unique.
Unique key (Constraint)
You can use UNIQUE constraints to make sure that no duplicate values are entered in specific columns that do not participate in a primary key. Although both a UNIQUE constraint and a PRIMARY KEY constraint enforce uniqueness, use a UNIQUE constraint instead of a PRIMARY KEY constraint when you want to enforce the uniqueness of a column, or combination of columns, that is not the primary key.

This MSDN article comparing the two is what you're after. The terminology is such that "constraint" is ANSI, but in SQL Server you can't disable a Unique Constraint...
For most purposes, there's no difference: the constraint is implemented as an index under the covers. The MSDN article backs this up. The difference is in the metadata, for things like:
tweaking FILLFACTOR
INCLUDE, which provides more efficient covering indexes than a composite constraint
a filtered index, which acts like a constraint over a subset of rows (for example, to ignore multiple NULLs)
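To illustrate the filtered-index point, here is a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for SQL Server; SQLite calls this a partial index, and the table and index names are made up for the example. Uniqueness is enforced only for the rows that match the WHERE clause of the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT, active INTEGER)")
# Partial (filtered) unique index: only rows with active = 1 must have a unique email.
conn.execute(
    "CREATE UNIQUE INDEX ux_account_active_email ON account (email) WHERE active = 1"
)

# Two inactive rows with the same email are outside the filter, so both are accepted.
conn.execute("INSERT INTO account (email, active) VALUES ('a@example.com', 0)")
conn.execute("INSERT INTO account (email, active) VALUES ('a@example.com', 0)")
# The first active row with that email is accepted as well.
conn.execute("INSERT INTO account (email, active) VALUES ('a@example.com', 1)")

# A second active row with the same email violates the filtered unique index.
try:
    conn.execute("INSERT INTO account (email, active) VALUES ('a@example.com', 1)")
    second_active_rejected = False
except sqlite3.IntegrityError:
    second_active_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(second_active_rejected, row_count)
```

This also shows why a filtered unique index cannot be read as a table-wide guarantee: duplicates outside the filter are perfectly legal.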

"Unique key" is a tautology. A Key (AKA "Candidate Key") is logical feature of the database - a constraint that enforces the uniqueness of a set of attributes in a table.
An index is a physical level feature intended to optimise performance in some way. There are many types of index.

Unique Key: A constraint that imposes a limitation on the database: it does not allow duplicate values. For example, a column chosen as the primary key must be both NOT NULL and UNIQUE.
Unique Index: An index that improves performance when executing queries against your database. A unique index also disallows duplicate values, i.e. no two rows can have the same index key value.
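A small sketch of the two declarations side by side, assuming SQLite via Python's sqlite3 module rather than SQL Server (the table and index names are hypothetical). In this engine both forms reject duplicates, and the constraint shows up as an automatically created unique index, which matches the "implemented as an index under the covers" description; details of the metadata differ between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Uniqueness declared as a constraint on the table.
conn.execute("CREATE TABLE by_constraint (code TEXT, CONSTRAINT uq_code UNIQUE (code))")
# Uniqueness declared as a separate unique index.
conn.execute("CREATE TABLE by_index (code TEXT)")
conn.execute("CREATE UNIQUE INDEX ux_code ON by_index (code)")


def rejects_duplicate(table):
    """Insert the same value twice and report whether the second insert fails."""
    conn.execute(f"INSERT INTO {table} (code) VALUES ('A1')")
    try:
        conn.execute(f"INSERT INTO {table} (code) VALUES ('A1')")
    except sqlite3.IntegrityError:
        return True
    return False


constraint_rejects = rejects_duplicate("by_constraint")
index_rejects = rejects_duplicate("by_index")

# PRAGMA index_list returns one row per index; column 1 is the name, column 2 the unique flag.
constraint_indexes = conn.execute("PRAGMA index_list('by_constraint')").fetchall()
explicit_indexes = conn.execute("PRAGMA index_list('by_index')").fetchall()
print(constraint_rejects, index_rejects)
print(constraint_indexes[0][1], explicit_indexes[0][1])
```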

Here are a few key differences:
Purpose:
Unique Key: Ensures data integrity at the table level, so that no duplicates can be entered in the table. Its purpose is integrity rather than query speed, although in SQL Server the constraint is enforced through a unique index that the optimizer can still use. (Its purpose differs from that of a primary key, which uniquely identifies each record for data operations such as update and delete. In complex tables a unique key can be a combination of several columns, and it would be inefficient to use it to identify records for transactions. Hence the primary key is a quick way of identifying a particular record in the table, while a unique key guarantees that no two records have the same key attributes.)
Unique Index: Ensures uniqueness of data at the index level; it cannot guarantee uniqueness at the table level, e.g. in the case of a filtered index. It is used for query planning and data retrieval and thus speeds up queries, depending on the columns used or queried.
Filter Option:
Unique Key: Filter option is not available
Unique Index: Filter option is available
Storage Option:
Unique Key: Filegroup only
Unique Index: Filegroup or partition
Icon:
Unique Key: Icon is a vertical key
Unique Index: Icon is a b-tree

Both the key (aka keyword) and the index are identifiers of a table row.
An index, though, is a parallel identification structure containing a pointer to the identified row, while keys are in situ field members.
A key, as an identifier, implies uniqueness (a constraint) and NOT NULL (a constraint).
NULL makes no sense as an identifier (null cannot identify anything), and neither does a non-unique identifying value.
A non-clustered index can contain real data without serving as an identifier of real data, and so can be non-unique [1].
It is an unfortunate practice, followed by most previous answers here, to call a key or index (an identifier) a constraint (a rule or restriction).
Keys are used in context of:
alternate aka secondary aka candidate keys, can be multiple
composite key (a few fields combined)
primary key (superkey), natural or surrogate key, only one, really used for referential integrity
foreign key
A foreign key is the key in another table (where it is the primary key), not the referencing column to which the term is frequently applied. That usage comes from the confusing shortening of the term "foreign key constraint" to just "foreign key".
A primary key constraint really implies NOT NULL and UNIQUE constraints, plus the fact that the referenced column (or combination of columns) is an identifier. It is unfortunately shortened to "primary key" or "primary key constraint", although it is both a key and a constraint and cannot accurately be called by either name alone.
Update: my related question:
[1] UNIQUE argument for INDEX creation - what's for?

The functionality is more or less the same; it depends on your use case.
Suppose you do not want to permit duplicate rows based on CUSTOMER_ID and TEAM_NAME.
In that case you can use either:
UNIQUE INDEX idx_customer_id_name (CUSTOMER_ID,TEAM_NAME)
UNIQUE KEY unique_key_customer_id_name (CUSTOMER_ID,TEAM_NAME)
But you should consider how often you fetch records based on CUSTOMER_ID and TEAM_NAME. If you do so often, use a unique index, as it helps retrieve records faster; otherwise go with a unique key, as it avoids the overhead of index-based fetching.

Related

Unique clustered index vs primary key

I have a table with a composite primary key of 7 fields, but the table allows duplicate entries for the primary key. Later I noticed it also has a unique clustered index on 10 fields, including the 7 of the primary key. Is that the reason the system allows duplicate primary key data to be inserted?
If so, I cannot think of a reason for creating a unique index with additional fields that are not much used for searching, other than a limit on the number of fields in a composite key. I looked for an answer but found nothing about such a limitation. Can someone please help? I am using Sybase.
Performance-wise, a clustered key, set correctly (that is, not including all the fields you deem unique), gives you the most advantage.
A 7-column PK is a strange construct; if you need to guard uniqueness, I would go for a combination of a clustered index plus a unique constraint.

Primary keys, index and constrains in SQL

Please correct me if I'm wrong, and kindly point me to articles on this concept.
When we create a primary key, a unique index, a clustered index, and a NOT NULL constraint are automatically created on that column in the background.
Does this also mean that if we create a NOT NULL constraint, an index (clustered or nonclustered) and a unique index on a column, then that column becomes a primary key?
I want to understand the core concept and the relation between primary keys, indexes and constraints.
The primary key is the one that is declared as the "primary" key. Just having the characteristics doesn't make a key "primary". It has to be explicitly declared as such.
Different databases implement primary keys in different ways. Although primary keys are usually implemented with a clustered unique index, that is not a requirement.
The primary key is exactly what its name suggests: "primary". Any other column or group of columns can be declared both unique and not null. That does not make them primary keys. In some databases, you could even define another column or group of columns as not null, unique and clustered -- without that being the primary key.
In summary:
You can have any number of unique indexes on a table.
You can have any number of unique indexes on non-NULL columns on a table.
You can have at most one clustered index. In almost all cases, this would be the primary key, but that is not required in all databases.
You can have at most one primary key. In almost all cases, this would be clustered, although that is not required in all databases.
For more detail, you should refer to the documentation of the database you are using.
If you have multiple columns comprising non-NULL, unique keys, then only one is "primary" -- that one that has been explicitly declared as primary.
Why would you have a non-clustered primary key? I can give one scenario. Imagine a database where UUIDs are the keys for rows. The company does not want to use auto-generated sequence numbers, because they provide information in the number.
However, UUIDs are remarkably bad candidates for clustered indexes, because inserts are almost never at the end. In this case, you might want to design the table with a clustered auto-generated sequential key, to speed inserts. You might make this key the primary key. But, you want all foreign key references to use the UUID -- and you want all foreign key references to be to the primary key of the table.
No.
All the columns could be given NOT NULL, a nonclustered index and UNIQUE, but only ONE column (or one combination of columns) can be the PK.
Also, UNIQUE allows NULL while PRIMARY KEY does not.
You might be talking about a candidate key; here is the reference:
https://www.techopedia.com/definition/21/candidate-key

What is the difference between a primary key and a index key

Can anyone tell me the difference between a primary key and an index key, and when to use which?
A primary key is a special kind of index in that:
there can be only one;
it cannot be nullable; and
it must be unique.
You tend to use the primary key as the most natural unique identifier for a row (such as social security number, employee ID and so forth, although there is a school of thought that you should always use an artificial surrogate key for this).
Indexes, on the other hand, can be used for fast retrieval based on other columns. For example, an employee database may have your employee number as the primary key but it may also have an index on your last name or your department.
Both of these indexes (last name and department) would disallow NULLs (probably) and allow duplicates (almost certainly), and they would be useful to speed up queries looking for anyone with (for example) the last name 'Corleone' or working in the 'HitMan' department.
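The effect of such a secondary index can be sketched as follows. This assumes SQLite through Python's sqlite3 module, with a made-up employee table and index name; the exact wording of EXPLAIN QUERY PLAN output varies between SQLite versions, so the example only checks whether the index name appears in the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_no INTEGER PRIMARY KEY,
        last_name   TEXT NOT NULL,
        department  TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO employee (last_name, department) VALUES (?, ?)",
    [("Corleone", "HitMan"), ("Corleone", "Legal"), ("Hagen", "Legal")],
)

query = "SELECT * FROM employee WHERE last_name = 'Corleone'"


def plan(sql):
    # The human-readable plan step is the last column of each EXPLAIN QUERY PLAN row.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index on last_name yet, so the table is scanned
conn.execute("CREATE INDEX ix_employee_last_name ON employee (last_name)")
plan_after = plan(query)   # the non-unique index can now be used for the lookup

# The index is non-unique: duplicate last names are allowed and both rows are found.
matches = conn.execute(query).fetchall()
print(plan_before)
print(plan_after)
print(len(matches))
```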
A key (minimal superkey) is a set of attributes, the values of which are unique for every tuple (every row in the table at some point in time).
An index is a performance optimisation feature that enables data to be accessed faster.
Keys are frequently good candidates for indexing and some DBMSs automatically create indexes for keys, but that doesn't have to be so.
The phrase "index key" mixes these two quite different words and might be best avoided if you want to avoid any confusion. "Index key" is sometimes used to mean "the set of attributes in an index". However the set of attributes in question are not necessarily a key because they may not be unique.
Oracle Database enforces a UNIQUE key or PRIMARY KEY integrity constraint on a table by creating a unique index on the unique key or primary key. This index is automatically created by the database when the constraint is enabled.
You can create indexes explicitly (outside of integrity constraints) using the SQL statement CREATE INDEX .
Indexes can be unique or non-unique. Unique indexes guarantee that no two rows of a table have duplicate values in the key column (or columns). Non-unique indexes do not impose this restriction on the column values.
Use the CREATE UNIQUE INDEX statement to create a unique index.
Specifying the Index Associated with a Constraint
If you require more explicit control over the indexes associated with UNIQUE and PRIMARY KEY constraints, the database lets you:
1. Specify an existing index that the database is to use to enforce the constraint
2. Specify a CREATE INDEX statement that the database is to use to create the index and enforce the constraint
These options are specified using the USING INDEX clause.
Example:
CREATE TABLE a (
a1 INT PRIMARY KEY USING INDEX (create index ai on a (a1)));
http://docs.oracle.com/cd/B28359_01/server.111/b28310/indexes003.htm
Other responses are defining the Primary Key, but not the Primary Index.
A Primary Index isn't an index on the Primary Key.
A Primary Index is your table's data structure, but only if your data structure is ordered by the Primary Key, thus allowing efficient lookups without requiring a separate data structure to look up records by the Primary Key.
All databases (that I'm aware of) have a Primary Key.
Not all databases have a Primary Index. Most of those that don't build a secondary index on the Primary Key by default.

Index vs. Unique Key in SQL Server 2008

I have the following table that serves to join 3 tables:
ClientID int
BlogID int
MentionID int
Assuming that queries will always come via ClientID, I can create 1 multi-column index (ClientID, BlogID, MentionID).
The question is, should I create it as a clustered index or a unique key? I understand a clustered index stores the data on its leaf nodes. Of course, in this case, the index is the data, so I don't know if SQL Server will duplicate the data or not. Be that as it may, I can't find anything on MSDN about the significance of using "unique key".
How does this differ from Type = Index & IsUnique = yes?
Can someone tell me the advantages each way?
A clustered index is "the table itself": its index nodes are arranged in a tree, and its leaf nodes contain the row data. A clustered index doesn't have to be declared unique (though it usually is); if it is not unique, the server implicitly adds a "uniquifier" to the index, so that each row is uniquely identified.
Other indexes store the clustered index value in their leaf nodes (and possibly some other columns, if they are included with the INCLUDE clause in the CREATE INDEX statement).
Any index may be declared unique, in which case the server performs an additional check to prevent duplicate values from getting into the table.
It seems you are asking for the difference among:
MYTABLE
id integer primary key autoincrement
clientid integer
blogid integer
mentionid integer
-- with a unique composite index on (clientid, blogid, mentionid) and three foreign key constraints
and
MYTABLE
clientid
blogid
mentionid
-- with a composite primary key on (clientid, blogid, mentionid) and three foreign key constraints
and
MYTABLE
id integer primary key autoincrement
clientid integer
blogid integer
mentionid integer
-- with an index on clientid and also an index on blogid and the three foreign key constraints
In the first, you have the index on the integer primary key and also the alternative unique index on the triad. In the second, you have only the unique index on the triadic primary key. In the third, you have a unique index on the integer primary key and two other non-unique indexes, one on clientid and the other on blogid.
The performance gain with the second option's marginally greater efficiency would be de minimis, and so I'd base the decision on other factors. The third is the most flexible in terms of queries and offers greater simplicity of coding; it offers the benefit of indexes on client and blog both, in case you wanted to have a query with blog, not client, in the WHERE clause. As for coding, some GUI tools and middleware have trouble with multi-part primary keys, and your update/insert/delete logic will be simpler when it has to deal with a single integer PK column. I have found that code simplicity and ease of maintenance are far better things than a few seconds or only a few fractions of seconds of improvement in query response time.
A unique index, a unique key and a unique constraint are basically the same thing. They result in an index that enforces uniqueness.
Clustered means that the index becomes the table itself. It's good to have a clustered index, otherwise the table hangs around in an unordered heap.
Unique and clustered are unrelated properties. You can combine them in any way you like. So in your case, I'd create a unique clustered index. The normal way to do that is by creating the index as a clustered primary key.
The data will not be duplicated if you create a clustered unique index on your three columns.
The unique clustered index will be the data - and the index at the same time :-)
Since this is a three-way join table, this clustered index probably does make a lot of sense. I'd say: go for it!
UNIQUE INDEX and UNIQUE CONSTRAINT are somewhat different concepts.
UNIQUE CONSTRAINT is a logical concept and means "make sure this column is unique, no matter how"
UNIQUE INDEX is a physical concept and means "create a B-Tree index on this column and fail whenever duplicates are inserted there"
The latter implies the former but not vice versa.
For instance, in Oracle, if you have a non-unique index on col1:
CREATE UNIQUE INDEX (col1) will fail and say "these columns are already indexed"
ALTER TABLE ADD CONSTRAINT UNIQUE(col1) will succeed and use the existing index to police the constraint.
Use CONSTRAINT if you just want the column to be unique and INDEX if you know a B-Tree index is what you want (to speed up searches etc).

Can I have multiple primary keys in a single table?

Can I have multiple primary keys in a single table?
A Table can have a Composite Primary Key which is a primary key made from two or more columns. For example:
CREATE TABLE userdata (
userid INT,
userdataid INT,
info char(200),
primary key (userid, userdataid)
);
Update: Here is a link with a more detailed description of composite primary keys.
You can only have one primary key, but you can have multiple columns in your primary key.
You can also have Unique Indexes on your table, which will work a bit like a primary key in that they will enforce unique values, and will speed up querying of those values.
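A short sketch of both statements, assuming SQLite via Python's sqlite3 module as the engine. It reuses the userdata table from the answer above; the two_pks table is a hypothetical name introduced only to show the failure. Each column of the composite key may repeat on its own, the pair may not, and a second PRIMARY KEY clause is rejected outright.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE userdata (
        userid     INT NOT NULL,
        userdataid INT NOT NULL,
        info       CHAR(200),
        PRIMARY KEY (userid, userdataid)
    )
""")
conn.execute("INSERT INTO userdata VALUES (1, 1, 'first row')")
conn.execute("INSERT INTO userdata VALUES (1, 2, 'same userid, new userdataid')")
conn.execute("INSERT INTO userdata VALUES (2, 1, 'same userdataid, new userid')")

# Repeating the full (userid, userdataid) pair violates the composite primary key.
try:
    conn.execute("INSERT INTO userdata VALUES (1, 1, 'duplicate pair')")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

# Declaring two separate primary keys fails when the table is defined.
try:
    conn.execute("CREATE TABLE two_pks (a INT PRIMARY KEY, b INT PRIMARY KEY)")
    second_pk_rejected = False
except sqlite3.OperationalError:
    second_pk_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM userdata").fetchone()[0]
print(duplicate_pair_rejected, second_pk_rejected, row_count)
```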
A table can have multiple candidate keys. Each candidate key is a column or set of columns that are UNIQUE, taken together, and also NOT NULL. Thus, specifying values for all the columns of any candidate key is enough to determine that there is one row that meets the criteria, or no rows at all.
Candidate keys are a fundamental concept in the relational data model.
It's common practice, if multiple keys are present in one table, to designate one of the candidate keys as the primary key. It's also common practice to cause any foreign keys to the table to reference the primary key, rather than any other candidate key.
I recommend these practices, but there is nothing in the relational model that requires selecting a primary key among the candidate keys.
This is the answer to both the main question and @Kalmi's question:
What would be the point of having multiple auto-generating columns?
This code below has a composite primary key. One of its columns is auto-incremented. This will work only in MyISAM. InnoDB will generate an error "ERROR 1075 (42000): Incorrect table definition; there can be only one auto column and it must be defined as a key".
DROP TABLE IF EXISTS `test`.`animals`;
CREATE TABLE `test`.`animals` (
`grp` char(30) NOT NULL,
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`name` char(30) NOT NULL,
PRIMARY KEY (`grp`,`id`)
) ENGINE=MyISAM;
INSERT INTO animals (grp,name) VALUES
('mammal','dog'),('mammal','cat'),
('bird','penguin'),('fish','lax'),('mammal','whale'),
('bird','ostrich');
SELECT * FROM animals ORDER BY grp,id;
Which returns:
+--------+----+---------+
| grp    | id | name    |
+--------+----+---------+
| bird   |  1 | penguin |
| bird   |  2 | ostrich |
| fish   |  1 | lax     |
| mammal |  1 | dog     |
| mammal |  2 | cat     |
| mammal |  3 | whale   |
+--------+----+---------+
(Have been studying these, a lot)
Candidate keys - A minimal column combination required to uniquely identify a table row.
Compound keys - 2 or more columns.
Multiple Candidate keys can exist in a table.
Primary KEY - Only one of the candidate keys that is chosen by us
Alternate keys - All other candidate keys
Both Primary Key & Alternate keys can be Compound keys
Sources:
https://en.wikipedia.org/wiki/Superkey
https://en.wikipedia.org/wiki/Candidate_key
https://en.wikipedia.org/wiki/Primary_key
https://en.wikipedia.org/wiki/Compound_key
As noted by the others it is possible to have multi-column primary keys.
It should be noted however that if you have some functional dependencies that are not introduced by a key, you should consider normalizing your relation.
Example:
Person(id, name, email, street, zip_code, area)
There can be a functional dependency id -> name, email, street, zip_code, area.
But often a zip_code is associated with an area, and thus there is an internal functional dependency zip_code -> area.
Thus one may consider splitting it into another table:
Person(id, name, email, street, zip_code)
Area(zip_code, name)
So that it is consistent with the third normal form.
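The split can be sketched with SQLite through Python's sqlite3 module. This is only an illustration: the column types and the sample rows are invented, and the table names follow the Person/Area example above. Because the area name is stored once per zip_code, a single update is seen by every person with that zip_code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- zip_code -> name lives in its own table, keyed by zip_code
    CREATE TABLE area (
        zip_code TEXT PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- person keeps only the attributes that depend directly on id
    CREATE TABLE person (
        id       INTEGER PRIMARY KEY,
        name     TEXT,
        email    TEXT,
        street   TEXT,
        zip_code TEXT REFERENCES area (zip_code)
    );
    INSERT INTO area VALUES ('10001', 'Old Name');
    INSERT INTO person VALUES (1, 'Ann', 'ann@example.com', '1 Main St', '10001');
    INSERT INTO person VALUES (2, 'Bob', 'bob@example.com', '2 Main St', '10001');
""")

# The area name is stored once, so one UPDATE is enough to correct it everywhere.
conn.execute("UPDATE area SET name = 'New Name' WHERE zip_code = '10001'")

areas = conn.execute("""
    SELECT p.name, a.name
    FROM person AS p
    JOIN area AS a ON a.zip_code = p.zip_code
    ORDER BY p.id
""").fetchall()
print(areas)
```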
Primary Key is very unfortunate notation, because of the connotation of "Primary" and the subconscious association in consequence with the Logical Model. I thus avoid using it. Instead I refer to the Surrogate Key of the Physical Model and the Natural Key(s) of the Logical Model.
It is important that the Logical Model for every Entity have at least one set of "business attributes" which comprise a Key for the entity. Boyce, Codd, Date et al refer to these in the Relational Model as Candidate Keys. When we then build tables for these Entities their Candidate Keys become Natural Keys in those tables. It is only through those Natural Keys that users are able to uniquely identify rows in the tables; as surrogate keys should always be hidden from users. This is because Surrogate Keys have no business meaning.
However the Physical Model for our tables will in many instances be inefficient without a Surrogate Key. Recall that non-covered columns for a non-clustered index can only be found (in general) through a Key Lookup into the clustered index (ignore tables implemented as heaps for a moment). When our available Natural Key(s) are wide this (1) widens the width of our non-clustered leaf nodes, increasing storage requirements and read accesses for seeks and scans of that non-clustered index; and (2) reduces fan-out from our clustered index increasing index height and index size, again increasing reads and storage requirements for our clustered indexes; and (3) increases cache requirements for our clustered indexes, chasing other indexes and data out of cache.
This is where a small Surrogate Key, designated to the RDBMS as "the Primary Key" proves beneficial. When set as the clustering key, so as to be used for key lookups into the clustered index from non-clustered indexes and foreign key lookups from related tables, all these disadvantages disappear. Our clustered index fan-outs increase again to reduce clustered index height and size, reduce cache load for our clustered indexes, decrease reads when accessing data through any mechanism (whether index scan, index seek, non-clustered key lookup or foreign key lookup) and decrease storage requirements for both clustered and nonclustered indexes of our tables.
Note that these benefits only occur when the surrogate key is both small and the clustering key. If a GUID is used as the clustering key the situation will often be worse than if the smallest available Natural Key had been used. If the table is organized as a heap then the 8-byte (heap) RowID will be used for key lookups, which is better than a 16-byte GUID but less performant than a 4-byte integer.
If a GUID must be used due to business constraints then the search for a better clustering key is worthwhile. If for example a small site identifier and a 4-byte "site-sequence-number" are feasible, then that design might give better performance than a GUID as Surrogate Key.
If the consequences of a heap (hash join perhaps) make that the preferred storage then the costs of a wider clustering key need to be balanced into the trade-off analysis.
Consider this example:
ALTER TABLE Persons
ADD CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
where the tuple "(P_Id,LastName)" requires a uniqueness constraint, and may be a lengthy Unicode LastName plus a 4-byte integer, it would be desirable to (1) declaratively enforce this constraint as "ADD CONSTRAINT pk_PersonID UNIQUE NONCLUSTERED (P_Id,LastName)" and (2) separately declare a small Surrogate Key to be the "Primary Key" of a clustered index. It is worth noting that Anita possibly only wishes to add the LastName to this constraint in order to make that a covered field, which is unnecessary in a clustered index because ALL fields are covered by it.
The ability in SQL Server to designate a Primary Key as nonclustered is an unfortunate historical circumstance, due to a conflation of the meaning "preferred natural or candidate key" (from the Logical Model) with the meaning "lookup key in storage" from the Physical Model. My understanding is that originally SYBASE SQL Server always used a 4-byte RowID, whether into a heap or a clustered index, as the "lookup key in storage" from the Physical Model.
A primary key is the key that uniquely identifies a record and is used in all indexes. This is why you can't have more than one. It is also generally the key that is used in joining to child tables but this is not a requirement. The real purpose of a PK is to make sure that something allows you to uniquely identify a record so that data changes affect the correct record and so that indexes can be created.
However, you can put multiple fields in one primary key (a composite PK). This will make your joins slower (especially if they are larger string type fields) and your indexes larger, but it may remove the need to do joins in some of the child tables, so as far as performance and design go, take it on a case by case basis. When you do this, each field itself is not unique, but the combination of them is. If one or more of the fields in a composite key should also be unique, then you need a unique index on it. It is likely though that if one field is unique, this is a better candidate for the PK.
Now at times, you have more than one candidate for the PK. In this case you choose one as the PK or use a surrogate key (I personally prefer surrogate keys for this instance). And (this is critical!) you add unique indexes to each of the candidate keys that were not chosen as the PK. If the data needs to be unique, it needs a unique index whether it is the PK or not. This is a data integrity issue. (Note this is also true anytime you use a surrogate key; people get into trouble with surrogate keys because they forget to create unique indexes on the candidate keys.)
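The surrogate-key pitfall described above can be sketched like this, assuming SQLite via Python's sqlite3 module and a hypothetical employee table whose natural candidate key is badge_no. With only the surrogate PK in place the duplicate is accepted; once a unique index covers the candidate key, it is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Surrogate primary key only: nothing protects the natural candidate key badge_no yet.
conn.execute("""
    CREATE TABLE employee (
        id       INTEGER PRIMARY KEY,
        badge_no TEXT NOT NULL,
        name     TEXT
    )
""")
conn.execute("INSERT INTO employee (badge_no, name) VALUES ('B-100', 'Ann')")
conn.execute("INSERT INTO employee (badge_no, name) VALUES ('B-100', 'Bob')")
duplicates_before = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE badge_no = 'B-100'"
).fetchone()[0]

# Clean up the bad row, then enforce the candidate key with a unique index.
conn.execute("DELETE FROM employee WHERE name = 'Bob'")
conn.execute("CREATE UNIQUE INDEX ux_employee_badge_no ON employee (badge_no)")

try:
    conn.execute("INSERT INTO employee (badge_no, name) VALUES ('B-100', 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicates_before, duplicate_rejected)
```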
There are occasionally times when you want more than one surrogate key (which are usually the PK if you have them). In this case what you want isn't more PK's, it is more fields with autogenerated keys. Most DBs don't allow this, but there are ways of getting around it. First consider if the second field could be calculated based on the first autogenerated key (Field1 * -1 for instance) or perhaps the need for a second autogenerated key really means you should create a related table. Related tables can be in a one-to-one relationship. You would enforce that by adding the PK from the parent table to the child table and then adding the new autogenerated field to the table and then whatever fields are appropriate for this table. Then choose one of the two keys as the PK and put a unique index on the other (the autogenerated field does not have to be a PK). And make sure to add the FK to the field that is in the parent table. In general if you have no additional fields for the child table, you need to examine why you think you need two autogenerated fields.
Some people use the term "primary key" to mean exactly an integer column that gets its values generated by some automatic mechanism. For example AUTO_INCREMENT in MySQL or IDENTITY in Microsoft SQL Server. Are you using primary key in this sense?
If so, the answer depends on the brand of database you're using. In MySQL, you can't do this, you get an error:
mysql> create table foo (
id int primary key auto_increment,
id2 int auto_increment
);
ERROR 1075 (42000): Incorrect table definition;
there can be only one auto column and it must be defined as a key
In some other brands of database, you are able to define more than one auto-generating column in a table.
Having two primary keys at the same time is not possible. But (assuming you have not confused this with a composite key) maybe what you need is to make one attribute unique.
CREATE TABLE t1 (
c1 int NOT NULL,
c2 int NOT NULL UNIQUE,
...,
PRIMARY KEY (c1)
);
Note, however, that in a relational database a 'super key' is a subset of attributes that uniquely identifies a tuple or row in a table. A 'key' is a 'super key' with the additional property that removing any attribute from it makes it no longer a 'super key' (or simply, a 'key' is a minimal super key). If there is more than one key, all of them are candidate keys. We select one of the candidate keys as the primary key. That is why talking about multiple primary keys for one relation or table is a contradiction.
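The superkey and minimal-superkey definitions can be made concrete with a small Python sketch. The function names and the sample rows are invented for illustration. One caveat: a key is a property of the design, holding for every valid state of the table, so uniqueness within one sample of rows can only suggest candidate keys, not prove them.

```python
from itertools import combinations


def is_superkey(rows, columns):
    """True if no two rows share the same values for the given columns."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_columns):
    """Return the minimal superkeys of this sample, smallest column sets first."""
    keys = []
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(all_columns, size):
            # A superset of an already found key is a superkey, but not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys


rows = [
    {"id": 1, "email": "ann@example.com", "first": "Ann", "last": "Lee"},
    {"id": 2, "email": "bob@example.com", "first": "Bob", "last": "Lee"},
    {"id": 3, "email": "ann2@example.com", "first": "Ann", "last": "Kim"},
]
keys = candidate_keys(rows, ["id", "email", "first", "last"])
print(keys)
```

In this sample three candidate keys are found; a designer would pick one of them as the primary key and enforce the others with unique constraints or unique indexes.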
Others have given good technical answers, better than I could.
I can only add this to the topic:
If you want something that is not allowed or acceptable, that is a good reason to take a step back.
Understand the core of why it's not acceptable.
Dig deeper into documentation, journal articles, the web, etc.
Analyze and review the current design and point out its major flaws.
Consider and test every step during the new design.
Always look forward and try to create an adaptive solution.
Hope this helps someone.
Yes, it's possible in SQL,
but we can't set more than one primary key in MS Access.
I don't know about the other databases.
CREATE TABLE CHAPTER (
BOOK_ISBN VARCHAR(50) NOT NULL,
IDX INT NOT NULL,
TITLE VARCHAR(100) NOT NULL,
NUM_OF_PAGES INT,
PRIMARY KEY (BOOK_ISBN, IDX)
);

Resources