Can anyone tell me the difference between a primary key and an index key, and when to use which?
A primary key is a special kind of index in that:
there can be only one;
it cannot be nullable; and
it must be unique.
You tend to use the primary key as the most natural unique identifier for a row (such as social security number, employee ID and so forth, although there is a school of thought that you should always use an artificial surrogate key for this).
Indexes, on the other hand, can be used for fast retrieval based on other columns. For example, an employee database may have your employee number as the primary key but it may also have an index on your last name or your department.
Both of these indexes (last name and department) would disallow NULLs (probably) and allow duplicates (almost certainly), and they would be useful to speed up queries looking for anyone with (for example) the last name 'Corleone' or working in the 'HitMan' department.
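As a rough illustration of the employee example, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever DBMS you actually use. The table, column and index names are made up for the example, and the planner output shown is SQLite's, not that of any other engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,   -- the one primary key
        last_name   TEXT NOT NULL,
        department  TEXT NOT NULL
    );
    -- Ordinary (non-unique) indexes for fast lookups on other columns.
    CREATE INDEX idx_employees_last_name  ON employees (last_name);
    CREATE INDEX idx_employees_department ON employees (department);
""")

conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Corleone", "HitMan"), (2, "Corleone", "Legal"), (3, "Hagen", "Legal")],
)

# Duplicate last names are fine in a non-unique index.
corleones = conn.execute(
    "SELECT employee_id FROM employees WHERE last_name = ? ORDER BY employee_id",
    ("Corleone",),
).fetchall()

# A duplicate primary key value is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (1, 'Clemenza', 'HitMan')")
    duplicate_pk_rejected = False
except sqlite3.IntegrityError:
    duplicate_pk_rejected = True

# Ask the planner how it would run the last-name lookup.
plan = " ".join(
    row[-1]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE last_name = 'Corleone'"
    )
)
print(plan)
```

The printed plan should mention idx_employees_last_name, which is the index doing the "fast retrieval" part; the primary key is what rejects the second row with employee_id 1.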
A key (minimal superkey) is a set of attributes, the values of which are unique for every tuple (every row in the table at some point in time).
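To make "minimal superkey" concrete, here is a small sketch in plain Python. The function names and sample rows are hypothetical, and it only checks one snapshot of the data, whereas a declared key is a rule that must hold for every state the table is allowed to be in.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of attrs is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

rows = [
    {"employee_id": 1, "last_name": "Corleone", "department": "HitMan"},
    {"employee_id": 2, "last_name": "Corleone", "department": "Legal"},
    {"employee_id": 3, "last_name": "Hagen", "department": "Legal"},
]

id_is_key = is_key(rows, ("employee_id",))
id_and_name_is_superkey = is_superkey(rows, ("employee_id", "last_name"))
id_and_name_is_key = is_key(rows, ("employee_id", "last_name"))  # not minimal
name_is_key = is_key(rows, ("last_name",))
```

Note that (last_name, department) happens to be unique in these three rows as well, which is exactly why a key has to be declared by the designer rather than inferred from the current data.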
An index is a performance optimisation feature that enables data to be accessed faster.
Keys are frequently good candidates for indexing and some DBMSs automatically create indexes for keys, but that doesn't have to be so.
The phrase "index key" mixes these two quite different words and might be best avoided if you want to avoid any confusion. "Index key" is sometimes used to mean "the set of attributes in an index". However the set of attributes in question are not necessarily a key because they may not be unique.
Oracle Database enforces a UNIQUE key or PRIMARY KEY integrity constraint on a table by creating a unique index on the unique key or primary key. This index is automatically created by the database when the constraint is enabled.
You can create indexes explicitly (outside of integrity constraints) using the SQL statement CREATE INDEX.
Indexes can be unique or non-unique. Unique indexes guarantee that no two rows of a table have duplicate values in the key column (or columns). Non-unique indexes do not impose this restriction on the column values.
Use the CREATE UNIQUE INDEX statement to create a unique index.
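The unique versus non-unique behaviour can be tried out with a short sketch. This one uses SQLite through Python's sqlite3 module rather than Oracle, so treat it as an illustration of the concept only; the table and index names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (email TEXT, city TEXT);
    CREATE UNIQUE INDEX idx_accounts_email ON accounts (email);  -- unique
    CREATE INDEX idx_accounts_city ON accounts (city);           -- non-unique
""")

conn.execute("INSERT INTO accounts VALUES ('a@example.com', 'Oslo')")
# Same city, different email: allowed by the non-unique index.
conn.execute("INSERT INTO accounts VALUES ('b@example.com', 'Oslo')")

try:
    # Same email again: the unique index rejects the row.
    conn.execute("INSERT INTO accounts VALUES ('a@example.com', 'Bergen')")
    duplicate_email_rejected = False
except sqlite3.IntegrityError:
    duplicate_email_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```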
Specifying the Index Associated with a Constraint
If you require more explicit control over the indexes associated with UNIQUE and PRIMARY KEY constraints, the database lets you:
1. Specify an existing index that the database is to use to enforce the constraint
2. Specify a CREATE INDEX statement that the database is to use to create the index and enforce the constraint
These options are specified using the USING INDEX clause.
Example:
CREATE TABLE a (
a1 INT PRIMARY KEY USING INDEX (create index ai on a (a1)));
http://docs.oracle.com/cd/B28359_01/server.111/b28310/indexes003.htm
Other responses are defining the Primary Key, but not the Primary Index.
A Primary Index isn't an index on the Primary Key.
A Primary Index is your table's data structure, but only if that data structure is ordered by the Primary Key, thus allowing efficient lookups without requiring a separate data structure to look up records by the Primary Key.
All databases (that I'm aware of) have a Primary Key.
Not all databases have a Primary Index. Most of those that don't build a secondary index on the Primary Key by default.
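One way to see the "secondary index on the Primary Key by default" case is SQLite, used here via Python's sqlite3 module purely as an example engine. An ordinary SQLite table is stored by an internal row ID, so a TEXT primary key gets its own automatically created index. The table name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT)")

# No CREATE INDEX was issued, yet the catalog lists an index for the table.
auto_indexes = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'products'"
    )
]
print(auto_indexes)
```

Other engines make different choices here, so check the documentation of the one you use.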
Please correct me if I'm wrong, and kindly point me to articles on this concept.
When we create a primary key, in the background a unique index, a clustered index, and a not null constraint are automatically created on that column.
Does this also mean that if we create a not null constraint, a [clustered or non-clustered] index and a unique index on a column, then that column becomes a primary key?
I want to understand the core concept/relation between primary key, index and constraints.
The primary key is the one that is declared as the "primary" key. Just having the characteristics doesn't make a key "primary". It has to be explicitly declared as such.
Different databases implement primary keys in different ways. Although primary keys are usually implemented with a clustered unique index, that is not a requirement.
The primary key is exactly what its name suggests: "primary". Any other column or group of columns can be declared both unique and not null. That does not make them primary keys. In some databases, you could even define another column or group of columns as not null, unique and clustered -- without that being the primary key.
In summary:
You can have any number of unique indexes on a table.
You can have any number of unique indexes on non-NULL columns on a table.
You can have at most one clustered index. In almost all cases, this would be the primary key, but that is not required in all databases.
You can have at most one primary key. In almost all cases, this would be clustered, although that is not required in all databases.
For more detail, you should refer to the documentation of the database you are using.
If you have multiple columns comprising non-NULL, unique keys, then only one is "primary" -- that one that has been explicitly declared as primary.
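A quick way to check the "only the declared one is primary" point is to ask the catalog. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (it has no clustered/non-clustered option, so only the key part is shown); the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,   -- unique and not null, but not primary
        ssn   TEXT NOT NULL UNIQUE    -- same here
    );
""")

# PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(person)").fetchall()
primary_key_columns = [name for _, name, _, _, _, pk in columns if pk]
not_null_columns = [name for _, name, _, notnull, _, _ in columns if notnull]
```

Here email and ssn have all the characteristics of a key, yet only id is reported as the primary key, because only id was declared as such.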
Why would you have a non-clustered primary key? I can give one scenario. Imagine a database where UUIDs are the keys for rows. The company does not want to use auto-generated sequence numbers, because they provide information in the number.
However, UUIDs are remarkably bad candidates for clustered indexes, because inserts are almost never at the end. In this case, you might want to design the table with a clustered auto-generated sequential key, to speed inserts. You might make this key the primary key. But, you want all foreign key references to use the UUID -- and you want all foreign key references to be to the primary key of the table.
No.
Any number of columns can be declared NOT NULL with a unique non-clustered index, but only ONE column (or set of columns) can be the PK.
Also, a UNIQUE constraint allows NULL while a Primary Key does not.
You might be talking about Candidate Key, here is the ref:
https://www.techopedia.com/definition/21/candidate-key
How do we specify which key is used for building the index for a
database in SQL?
In most if not all RDBMS, is the search key used for building the index for a database always
the primary key?
From Database Management Systems, 3rd Edition, by Raghu
Ramakrishnan, and Johannes Gehrke
In principle, we can use any key, not just the primary
key, to refer to a tuple. However, using the primary key is
preferable because it is what the DBMS expects -- this is the
significance of designating a particular candidate key as a
primary key -- and optimizes for. For example, the DBMS may create
an index with the primary key fields as the search key, to make the
retrieval of a tuple given its primary key value efficient.
Thanks.
That depends on which RDBMS you are using. It will be something like
CREATE INDEX index_name ON table_name(key_name).
YES and NO.
a) When you create a table, the RDBMS will generally create an index for it on the primary key you specify in your CREATE TABLE statement. If you don't specify a primary key, some engines (MySQL's InnoDB, for example) will pick a unique, non-null key for you, OR create an internal key (probably an int type) as the primary key for the table.
b) Sometimes the query pattern shows that keys other than the primary key are used frequently (in WHERE clauses, for example); it is then worth building additional indexes on those keys.
There are two aspects to your question:
I'm trying to interpret your questions. Perhaps what you need to understand is that there can be more than one index into a table?
Let's say you have a Customers table with 3 columns, CustomerID, LastName, and FirstName.
You create an index using a specific CREATE INDEX or ALTER TABLE command where you specify the specific columns you want to have included in the index. You do not have to have the primary key as part of an index; for example, you may create an index on a table of customers by their last name and first name to speed name searches while still having a different primary key like customerID. Here's some SQL-like syntax.
CREATE INDEX customer_name_idx ON Customers(LastName, FirstName)
This index doesn't include any primary keys nor does it require a primary key to function properly. Internally, it will likely point to some internal row IDs that only the DBMS cares about.
I'm trying to understand what you mean here as well.
A DBMS can return a result regardless of the presence of an index; an index just makes it more efficient if your query matches up nicely with an index.
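Here is a small sketch of that point, using the Customers table from above in SQLite via Python's sqlite3 module (an assumption made for the sake of a runnable example; the sample rows and the helper name are invented). The same query returns the same rows before and after the index exists; only the plan changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "Smith", "Ann"), (2, "Jones", "Bob"), (3, "Smith", "Carl")],
)

QUERY = "SELECT CustomerID FROM Customers WHERE LastName = 'Smith'"

def run_and_explain(connection):
    """Return the query result plus SQLite's description of how it ran."""
    rows = connection.execute(QUERY).fetchall()
    plan = " ".join(r[-1] for r in connection.execute("EXPLAIN QUERY PLAN " + QUERY))
    return rows, plan

rows_before, plan_before = run_and_explain(conn)   # no index yet: full scan
conn.execute("CREATE INDEX customer_name_idx ON Customers (LastName, FirstName)")
rows_after, plan_after = run_and_explain(conn)     # index available

print(plan_before)
print(plan_after)
```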
Designating a column as a primary key provides benefits such as enforced uniqueness, and possibly some performance benefits for enforcing other foreign key constraints.
As your quote says though, there is no written rule that says a primary key must also be an index. MySQL, and probably many other DBMSes, creates an index automatically on the table's primary key as it makes sense to do so from a technical level.
Anyway, I hope this makes sense and I hope I can clarify better if you have other questions.
In SQL Server, I have a non nullable column with a unique clustered index on it.
If I make this column a Primary Key the exact same index is created automatically plus
the column is recognized as a Primary Key.
I understand the abstract/semantic difference.
(Primary Key identifies the entity, while any other column with this index may not.
For example, a Person can have Email field which is Unique,Non-nullable... but can be changed)
But what bothers me is the actual difference when it comes to the DB engine itself.
What will happen if I just create an Id column, make it non-nullable, create a unique clustered index for it, make it Identity Increment, but without the Primary Key constraint?
In what scenarios does the Primary Key constraint come into play?
(I've looked at many related questions before asking this, but all the answers I saw ended up with an abstract/theoretical explanation).
Nothing will be different really. You specify PRIMARY KEY to relay your intentions, not so that the engine does anything differently. When constructing a query plan, the optimizer will still use the uniqueness for all of its properties, and will still use the clustered index for all of its properties, regardless of whether you technically created it as a PRIMARY KEY. When creating a FOREIGN KEY, you can still reference the column(s) specified as unique (clustered or not). The difference is solely in the metadata (sys.indexes.is_primary_key) and in SSMS' representation to you (oh and the fact that you can create a unique clustered index on a NULLable column, but you can't create a PRIMARY KEY on that column).
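The foreign key part can be sketched outside SQL Server too. The example below uses SQLite through Python's sqlite3 module as a stand-in (so no clustered index, and the table names are made up): the parent table has no PRIMARY KEY at all, only a NOT NULL column with a unique index, and a foreign key still references it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- No PRIMARY KEY declared: only a NOT NULL column with a unique index.
    CREATE TABLE departments (code TEXT NOT NULL);
    CREATE UNIQUE INDEX idx_departments_code ON departments (code);

    CREATE TABLE staff (
        name      TEXT,
        dept_code TEXT REFERENCES departments (code)
    );
    INSERT INTO departments VALUES ('LEGAL');
    INSERT INTO staff VALUES ('Hagen', 'LEGAL');   -- matches a parent row
""")

try:
    # No such department: the foreign key rejects the row.
    conn.execute("INSERT INTO staff VALUES ('Nobody', 'MISSING')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

staff_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```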
In fact there are many cases where you want to completely separate the clustered index from the PRIMARY KEY. If you have a table where the PK is a GUID, for example, and you are typically running date range queries against the table, you are probably better off having the PK be non-clustered and have a clustered index on a naturally increasing column (the datetime column) - both to minimize page splits on heavy insert activity and also to best assist date range queries. The non-clustered index will be perfectly fine for looking up individual GUIDs. (I wanted to mention that because a lot of people think the primary key has to be clustered. Not true.)
Also interesting to note that if you create a PRIMARY KEY constraint, then create a unique clustered index with the same name using DROP_EXISTING, the is_primary_key column will still be 1 and Object Explorer will still show the index name under Keys.
Here is one scenario - a lot of code-to-data mapping frameworks look at the database metadata (what are the primary keys, foreign keys, etc.) to determine how code is executed. For example, Hibernate requires a primary key.
A typical scenario might be generating a where clause for an update.
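To give a feel for what such a framework does, here is a toy sketch. It is not how Hibernate or any real mapper is implemented; build_update is a hypothetical helper, and SQLite's PRAGMA table_info (via Python's sqlite3 module) stands in for whatever metadata API a real framework would read.

```python
import sqlite3

def build_update(connection, table, new_values):
    """Build an UPDATE statement keyed on the table's declared primary key.

    Hypothetical helper for illustration; identifiers are interpolated
    unescaped, so it is only safe with trusted table and column names.
    """
    info = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk > 0 marks key columns.
    pk_columns = [r[1] for r in sorted(info, key=lambda r: r[5]) if r[5] > 0]
    if not pk_columns:
        raise ValueError(f"{table} has no declared primary key")
    set_columns = [c for c in new_values if c not in pk_columns]
    set_clause = ", ".join(f"{c} = :{c}" for c in set_columns)
    where_clause = " AND ".join(f"{c} = :{c}" for c in pk_columns)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute("CREATE TABLE log_line (message TEXT)")  # no primary key declared
conn.execute("INSERT INTO person VALUES (1, 'old@example.com')")

values = {"id": 1, "email": "new@example.com"}
statement = build_update(conn, "person", values)
conn.execute(statement, values)
updated_email = conn.execute("SELECT email FROM person WHERE id = 1").fetchone()[0]
```

The email column is just as unique as id, but the helper has no way of knowing it should identify the row; the declared primary key is the only signal it reads. For log_line, which declares no primary key, it gives up with a ValueError.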
What is the difference between a unique index and a unique key?
The unique piece is not where the difference lies. The index and key are not the same thing, and are not comparable.
A key is a data column, or several columns, that are forced to be unique with a constraint, either primary key or explicitly defined unique constraint. Whereas an index is a structure for storing data location for faster retrieval.
From the docs:
Unique Index
Creates a unique index on a table or
view. A unique index is one in which
no two rows are permitted to have the
same index key value. A clustered
index on a view must be unique
Unique key (Constraint)
You can use UNIQUE constraints to make
sure that no duplicate values are
entered in specific columns that do
not participate in a primary key.
Although both a UNIQUE constraint and
a PRIMARY KEY constraint enforce
uniqueness, use a UNIQUE constraint
instead of a PRIMARY KEY constraint
when you want to enforce the
uniqueness of a column, or combination
of columns, that is not the primary
key.
This MSDN article comparing the two is what you're after. The terminology is such that "constraint" is ANSI, but in SQL Server you can't disable a Unique Constraint...
For most purposes, there's no difference - the constraint is implemented as an index under the covers. The MSDN article backs this up--the difference is in the meta-data, for things like:
tweaking FILLFACTOR
INCLUDE provides more efficient covering indexes (composite constraint)
A filtered index acts like a constraint over a subset of rows (for example, to ignore multiple NULLs).
"Unique key" is a tautology. A Key (AKA "Candidate Key") is a logical feature of the database - a constraint that enforces the uniqueness of a set of attributes in a table.
An index is a physical level feature intended to optimise performance in some way. There are many types of index.
Unique Key: It is a constraint which imposes a limitation on the database: it will not allow duplicate values. For example, if you want to select one column as the primary key, it should be NOT NULL and UNIQUE.
Unique Index: It is an index which improves performance when executing queries on your database. A unique index also does not allow duplicate values in the index, i.e. no two rows will have the same index key value.
Here are few key differences:
Purpose:
Unique Key: Ensures integrity of data at the table level, so that no duplicates can be entered in the table. Its purpose is integrity rather than query speed, although the index that enforces it under the covers can still be used by the optimizer. (Its purpose differs from that of a primary key: the primary key uniquely identifies each record for data operations such as update and delete. In complex tables, a unique key can be a combination of several columns, and it would be inefficient to use it to identify records for transactions. Hence the primary key is a quick way of identifying a particular record in the table, while a unique key guarantees that no two records have the same key attributes.)
Unique Index: Ensures uniqueness of data at index level, cannot guarantee uniqueness at the table level e.g. in case of filtered index. Is used for query planning and fetching data and thus speeds up queries depending on columns used / queried.
Filter Option:
Unique Key: Filter option is not available
Unique Index: Filter option is available
Storage Option:
Unique Key: Filegroup only
Unique Index: Filegroup or partition
Icon:
Unique Key: Icon is a vertical key
Unique Index: Icon is a b-tree
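The filter option is the easiest of these differences to demonstrate. The sketch below uses SQLite's partial indexes through Python's sqlite3 module as an analogue of a SQL Server filtered index (the syntax and details differ between the two engines, and the table is made up). It shows why a filtered unique index does not guarantee uniqueness for the table as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT NOT NULL, active INTEGER NOT NULL);
    -- Filtered (partial) unique index: uniqueness applies to active rows only.
    CREATE UNIQUE INDEX idx_users_active_name ON users (username) WHERE active = 1;
""")

conn.execute("INSERT INTO users VALUES ('vito', 1)")
conn.execute("INSERT INTO users VALUES ('vito', 0)")  # outside the filter: allowed
conn.execute("INSERT INTO users VALUES ('vito', 0)")  # still allowed

try:
    conn.execute("INSERT INTO users VALUES ('vito', 1)")  # second active 'vito'
    active_duplicate_rejected = False
except sqlite3.IntegrityError:
    active_duplicate_rejected = True

vito_rows = conn.execute(
    "SELECT COUNT(*) FROM users WHERE username = 'vito'"
).fetchone()[0]
```

The table ends up with three 'vito' rows, which a plain unique key on username would never have allowed.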
Both a key and an index are identifiers of a table row.
An index, though, is a parallel identification structure containing a pointer to the identified row, while keys are fields stored in the row itself.
A key, as an identifier, implies uniqueness (a constraint) and NOT NULL (a constraint).
A NULL makes no sense as an identifier (NULL cannot identify anything), and neither does a non-unique identifying value.
A non-clustered index can contain real data without serving as an identifier of that data, and so can be non-unique [1]
It is an unfortunate practice that a key or index (an identifier) is called a constraint (a rule or restriction), as most previous answers here have done.
Keys are used in context of:
alternate aka secondary aka candidate keys, can be multiple
composite key (a few fields combined)
primary key (superkey), natural or surrogate key, only one, really used for referential integrity
foreign key
A foreign key is a key in another table (where it is the primary key); it is not even a key in the table from which it is usually referred to as one. Such use is explained by the confusing shortening of the term "foreign key constraint" to just "foreign key".
A primary key constraint really implies NOT NULL and UNIQUE constraints plus the fact that the column (or combination of columns) is an identifier. Unfortunately it gets called either just "primary key" or just "primary key constraint", while it is really both at once, and neither term alone covers it.
Update:
My related question:
[1]
UNIQUE argument for INDEX creation - what's for?
The functionality is more or less the same; it depends on your use case.
Suppose you want to prevent duplicate rows based on CUSTOMER_ID and TEAM_NAME.
In that case you can use both:
UNIQUE INDEX idx_customer_id_name (CUSTOMER_ID,TEAM_NAME)
UNIQUE KEY unique_key_customer_id_name (CUSTOMER_ID,TEAM_NAME)
But you should consider how often you fetch records based on CUSTOMER_ID and TEAM_NAME. If you do so often, you should use a unique index, as it helps retrieve records faster; otherwise you should go with a unique key, as it avoids the overhead of fetching through an index.
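For what it's worth, here is a sketch of the two declarations side by side. It runs on SQLite via Python's sqlite3 module, not on the engine the syntax above comes from, so it only shows that both forms enforce the same rule on the (CUSTOMER_ID, TEAM_NAME) pair; it says nothing about relative speed. The table names and the helper are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Variant 1: uniqueness declared as a table constraint.
    CREATE TABLE teams_by_constraint (
        CUSTOMER_ID INTEGER,
        TEAM_NAME   TEXT,
        CONSTRAINT unique_key_customer_id_name UNIQUE (CUSTOMER_ID, TEAM_NAME)
    );
    -- Variant 2: uniqueness declared as a separate unique index.
    CREATE TABLE teams_by_index (CUSTOMER_ID INTEGER, TEAM_NAME TEXT);
    CREATE UNIQUE INDEX idx_customer_id_name
        ON teams_by_index (CUSTOMER_ID, TEAM_NAME);
""")

def duplicate_pair_rejected(table):
    """Insert (1, 'red'), (1, 'blue'), then (1, 'red') again; report if the repeat fails."""
    conn.execute(f"INSERT INTO {table} VALUES (1, 'red')")
    conn.execute(f"INSERT INTO {table} VALUES (1, 'blue')")  # same customer, new team: fine
    try:
        conn.execute(f"INSERT INTO {table} VALUES (1, 'red')")
        return False
    except sqlite3.IntegrityError:
        return True

results = {t: duplicate_pair_rejected(t) for t in ("teams_by_constraint", "teams_by_index")}
```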
What meaning does the concept of a primary key have to the database engine of SQL Server? I don't mean the clustered/nonclustered index created on the "ID" column, I mean the constraint object "primary key". Does it matter if it exists or not?
Alternatives:
alter table add primary key clustered
alter table create clustered index
Does it make a difference?
In general, a KEY is a column (or combination of columns) that uniquely identifies each row in the table. It is possible to have multiple KEYs in a table (for example, you might have a Person table where the social security number and an auto-increasing number are both KEYs).
The database designer chooses one of these KEYs to be the PRIMARY KEY. Conceptually, it does not matter which KEY is chosen as the PRIMARY KEY. However, since the PRIMARY KEY is usually used to refer to entries in this table from other tables (through FOREIGN KEYs), choosing a good PRIMARY KEY can be relevant w.r.t. (a) performance and (b) maintainability:
(a) Since the primary key will usually be used in JOINs, the index on the primary key (its size, its distribution, ...) is much more relevant to performance than other indexes.
(b) Since the primary key is used as a foreign key in other tables, changing the primary key value is always a hassle, since all the foreign key values in the other tables need to be modified as well.
A PRIMARY KEY is a constraint - this is a logical object that says something about the rules that your data must adhere to. An index is an access structure - it says something about the way the machine can search through the data. To implement a PRIMARY KEY, most RDBMS-es use an index.
Some RDBMS-es (e.g. MySQL) do not make the distinction between a PRIMARY KEY or UNIQUE constraint and the index that is used to help implement it. But Oracle, for example, does: in Oracle you can do something like ALTER TABLE t DROP PRIMARY KEY KEEP INDEX. This is useful if you want to change the definition of the primary key (for example, you are replacing a natural primary key with a surrogate primary key) but you still want to have a unique constraint on the original primary key columns without rebuilding the index. That makes sense if the index is very large and would take considerable time and resources to rebuild.
From what I can see, MS SQL does not make the distinction. I mean a tool like Management Studio does display "Keys", "Indexes" and "Constraints" in different folders, but changing the name of one immediately changes the name of the corresponding objects in the other folders. So I think the distinction is not really present in this case.