Is primary key also super key and candidate key? - database

Is the primary key also a super key and a candidate key? Their definitions are lengthy but I wonder if this is true?
Please note that I'm not asking if they are the same term. I'm just asking in one direction, not the other way round.

Super Key - is a set of one or more columns that can be used to identify a record uniquely in a table
Candidate Key – can be any column or a combination of columns that can qualify as a unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as a Primary Key. You can think of this as the "shortest" super key or minimal super key
Primary Key – is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.
For a Candidate Key to qualify as a Primary Key, it should be unique and non-null.
So, basically a primary key is just one of the candidate keys, which is a just a minimal super key.

According to dry definitions:
Your primary key is a super key by definition - you can not have two rows with the same primary key.
However, the primary key is not a natural constraint of your business, but an artificial constraint in your data store: for example, you could set a person's birthday as the primary key in your table, and never have two people who were born on the same day. That would be silly, but possible. In that case, the primary key of the table is not a super key of the domain.
However, your primary key is not necessarily a candidate key - you can add redundant columns to your primary key.

Different set of attributes which are able to identify any row in the database is known as super key. And minimal super key is termed as candidate key i.e. among set of super keys one with minimum number of attributes. Primary key could be any key which is able to identify a specific row in database in a unique manner. from this thread
And typing all three keys in google gives about 2,480,000 results

It depends.
The Primary key is the main key the table uses to identify between different elements. It is chosen from the candidate keys.
The candidate keys are all the keys that COULD be the primary key. All the keys that are unique and can be differentiated upon in the table.
The super key is a primary key with additional attributes, this extra information is used to uniquely identify an instance of the entity set.

A candidate key is the most minimal subset of fields that uniquely identifies a tuple. For example if you have a candidate key on the column "user_id" and "pet_id" you'll never have more than 1 tuple with the same user_id and pet_id and neither user_id nor pet_id individually will work as a unique identifier for the tuple.
A super key is a set of fields that contains a key. Using the above example where the combination of "user_id" and "pet_id" uniquely identifies a tuple if we added "pet_name" (which is not key because we can have multiple pets named "fluffy") it would be a super key. Basically it's like a candidate key without the "minimal subset of fields" constraint.
A primary key is a candidate key that you tell the DB to optimize on. There might be multiple ways of referring to a unique tuple (ie. multiple candidate keys) but you can specify one when you're creating the table that you will use the most frequently.

Yes we can simply say that a candiate key is primary key but it must be unique.

Related

what is difference between primary and unique key?

What is the difference between Primary and Unique key (in MySQL)? How these can be considered as a foreign key? please explain.
I tried creating a database table and don't know how to make a primary key as a foreign key. Does it take Joins concept where the separating attribute automatically create a foreign key?
A table in MySQL can only have one Primary Key at most, while you can create as many unique Keys or indexes as you want.
Also a primary key is not nullable while a unique key can have NULL as value.
But the biggest difference is the purpose:
You want to have a primary key because you need an identifier
The unique Key/Index on the other hand is usefull to controll values that are automatically getting inserted into your table (e.g. to avoid duplicates where none are allowed)
If you want to use a column as a forgein key, you need to define it as a primary key first. Unique Constraint can not be related with other tables as a foreign key.

Primary keys, index and constrains in SQL

Please correct if im wrong. And kindly point me to articles on this concept.
When we create a primary key, in the background there is automatically a unique index, clustered index, and a not null constraint created on that coloumn.
Does this also mean that if we create a not null constraint, [clustered index or non clustered index] and unique index on a column, then that column becomes a primary key?
I want to understand the core concept/relation between primary key, index and constrains.
The primary key is the one that is declared as the "primary" key. Just having the characteristics doesn't make a key "primary". It has to be explicitly declared as such.
Different databases implement primary keys in different ways. Although primary keys are usually implemented with a clustered unique index, that is not a requirement.
The primary key is exactly what its name suggests: "primary". Any other column or group of columns can be declared both unique and not null. That does not make them primary keys. In some databases, you could even define another column or group of columns as not null, unique and clustered -- without that being the primary key.
In summary:
You can have any number of unique indexes on a table.
You can have any number of unique indexes on non-NULL columns on a table.
You can have at most one clustered index. In almost all cases, this would be the primary key. But is not required in all databases.
You can have at most one primary key. In almost all cases, this would be clustered, although that is not required in all databases.
For more detail, you should refer to the documentation of the database you are using.
If you have multiple columns comprising non-NULL, unique keys, then only one is "primary" -- that one that has been explicitly declared as primary.
Why would you have a non-clustered primary key? I can give one scenario. Imagine a database where UUIDs are the keys for rows. The company does not want to use auto-generated sequence numbers, because they provide information in the number.
However, UUIDs are remarkably bad candidates for clustered indexes, because inserts are almost never at the end. In this case, you might want to design the table with a clustered auto-generated sequential key, to speed inserts You might make this key the primary key. But, you want all foreign key references to use the UUID -- and you want all foreign key references to be to the primary key of the table.
No.
All the columns could be added with Not null and Non-clustered index and Unique But only ONE column could be PK.
And the Unique allows NULL while Primary Key does not.
You might be talking about Candidate Key, here is the ref:
https://www.techopedia.com/definition/21/candidate-key

Does a foreign key need to reference a primary key or candidate key?

I am having two conflicting definitions of foreign key.
From Wikipedia
the foreign key is defined in a second table, but it refers to the primary key in the first table.
From my lecture notes:
Foreign key does not have to match a primary key but must match a candidate key in some relation
Which is which? Does a foreign key need to reference a primary key or candidate key?
In the relational model of data, a foreign key must reference a candidate key.
In almost all SQL dbms, a foreign key must reference a candidate key.
In MySQL, a foreign key can reference just about anything.
Additionally, MySQL requires that the referenced columns be indexed for performance reasons. However, the system does not enforce a requirement that the referenced columns be UNIQUE or be declared NOT NULL.
Emphasis added.
This is a Bad Thing, IMHO.
The rule that a FK must reference a "primary" key was imposed by the older versions of the SQL standard. That rule has been relaxed by now (and it is now permitted for a FK to reference any candidate key), but it may be the case that some products still go by the old rule, just as it may be the case that some textbooks or other sources have not yet been updated to reflect the new state of affairs.
I don't know precisely when the rule was relaxed in the standard, but imo it must certainly have been no later than 2003.
A foreign key must refer to an unique key (a primary key is unique), because if it doesn't, it cans be the reference of 2 lines, and it's just impossible for a foreign key. Then you can have your primary key, but an unique key who is not a primary key and do a foreign key on it. And your unique key must be NOT NULL
Foreign keys have to be linked to candidate keys because, if join between tables was made using foreign key, the FK linked to an attribute that was not a candidate key could result in multiple values being retrieved where only one value should be retrieved.
Practically, the foreign key has nothing to do with the primary key tag of another table, if it points to a unique column (not necessarily a primary key) of another table then too, it would be a foreign key. So, a correct definition of the foreign key would be:
Foreign keys are the columns of a table that points to the candidate key of another table.

What is the use of a candidate key if a tuple in database can be identified by a primary key?

I couldn't find an clear answer on this during my research. Any help will be appreciated.
A candidate key in relational theory is a column (or columns) that contains unique values, and may be used to uniquely identify a row; a primary key is a candidate key that has been identified as the method of identifying the rows for purposes of building relationships.
Depending on the design, a table may have multiple candidate keys, but only one of them is identified as the primary key. For example, you may have a table of parts for resale; it is possible that the vendor identification for the parts may be unique and different from your own unique resale ids. this table would have two candidate keys, either of which could be used as a primary key (or a third surrogate instead).

2 primary keys in a table

Making a primary key in a table in database is fine. Making a Composite Primary is also fine. But why cant I have 2 primary keys in a table? What kind of problems may occur if we have 2 primary keys.
Suppose I have a Students table. I don't want Roll No. and Names of each student to be unique. Then why can't I create 2 primary keys in a table? I don't see any logical problem in it now. But definitely I am missing a serious issue that's the reason it does not exist.
I am new in databases, so don't have much idea. It may also create a technical issue rather. Will be happy if someone can educate me on this.
Thanks.
You can create a UNIQUE constraint for both columns UNIQUE(roll,name).
The PK is unique by definition, cause it is used to identify a row from the others, for example, when a foreign key references that table, it is referencing the PK.
If you need another column to 'act' like a PK, give it the attributes unique and not null.
Well, this is simply by definition. There can not be two "primary" conditions, just like there can not be two "latest" versions.
Every table can contain more than one unique keys, but if you decide to have a primary key, this is just one of these unique keys, the "one" you deem the "most important", which identifies every record uniquely.
If you have a table and come to the conclusion that your primary key does not uniquely identify each record (also meaning that there can't be two records with the same values for the primary key), you have chosen the wrong primary key, as by definition, the fields of the primary key must uniquely define each record.
That, however, does not mean there can be no other combination of fields uniquely identifying the record! This is where a second feature kicks in: referential integrity.
You can "link" tables using their primary key. For example: If you have a Customer table and an Orders table, where the Customers table has a primary key on the customer number and the Orders table has a primary key on the order number and the customer number, that means:
Every customer can be identified uniquely by his customer number
Every order is uniquely identified by the order number and the customer number
You can then link the two tables on the customer number. The DB system then ensures several things, among which is the fact that you can not remove a customer who has orders in your database without first removing the orders. Otherwise, you would have orders without being able to find out the customer data, which would violate your database's referential integrity.
If you had two primary keys, the system would not know on which to ensure referential integrity, so you'd have to tell the system which key to use - which would make one of the primary keys more important, which would make it the "primary key" (!) of the primary keys.
You can have multiple candidate keys in a table but by convention only one key per table is called "primary". That's just a convention though and it doesn't make any real difference to the function of the keys. A primary key is no different to any other candidate key. If you find it convenient to call more than one key "primary" then I suggest you do so. In my opinion (I'm not the only one) the idea of designating a "primary" key at all is essentially an outdated concept of very little importance in database design.
You might be interested to know that early papers on the relational database model (e.g. by E.F.Codd, the relational model's inventor) actually used the term "primary key" to describe all the keys of a relation and not just one. So there is a perfectly good precedent for multiple primary keys per table. The idea of designating exactly one primary key is more recent and probably came into common use through the popularity of ER modelling techniques.
Create an unique index on the 2nd attribute (Names), it's almost the same as primary key with another name.
From Wikipedia (http://en.wikipedia.org/wiki/Unique_key):
A table can have at most one primary key, but more than one unique
key. A primary key is a combination of columns which uniquely specify
a row. It is a special case of unique keys. One difference is that
primary keys have an implicit NOT NULL constraint while unique keys do
not. Thus, the values in unique key columns may or may not be NULL,
and in fact such a column may contain at most one NULL fields.
Another difference is that primary keys must be defined using another
syntax.

Resources