Can I have a foreign key with fewer columns than the primary/unique key it references? (The ORA-02270 error says no.)
You can use fewer columns, but the columns you do use have to have a UNIQUE constraint on them.
And, of course, if you can put a UNIQUE constraint on just part of the primary key, then you have to ask yourself a question: "Why does my primary key have more columns than it needs in the first place?"
A FK isn't necessarily the PK of the other table. It can be, but it doesn't have to.
So you can use as FK as many columns as you need...
From Oracle:
The foreign key in the child table will generally reference a primary key in the parent table.
Edit: the original question was about number of columns, not about using the incomplete PK as a FK.
The FK must reference something that is unique in the other table. So you'll have to use columns that have a UNIQUE constraint, or the PK of the other table (since the PK must also be unique). A PK often spans multiple columns precisely because each column, alone, is not unique, while the combination of the columns is.
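As a rough illustration of that rule, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. The table and column names (parent, child, region, code) are made up for the example. One difference to keep in mind: Oracle rejects the constraint when it is created (ORA-02270), whereas SQLite accepts the DDL and only reports a "foreign key mismatch" once you write to the child table.

```python
import sqlite3

# Hypothetical schemas: both have a two-column primary key (region, code).
PK_ONLY = "CREATE TABLE parent (region TEXT, code TEXT, PRIMARY KEY (region, code))"
WITH_UNIQUE = (
    "CREATE TABLE parent (region TEXT, code TEXT UNIQUE, PRIMARY KEY (region, code))"
)

def try_child_insert(parent_ddl: str) -> str:
    """Build parent and child, insert a matching child row, report the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute(parent_ddl)
    conn.execute("INSERT INTO parent (region, code) VALUES ('EU', 'X1')")
    # The child references only `code`, i.e. fewer columns than the parent's PK.
    conn.execute(
        "CREATE TABLE child (code TEXT, FOREIGN KEY (code) REFERENCES parent (code))"
    )
    try:
        conn.execute("INSERT INTO child (code) VALUES ('X1')")
        return "ok"
    except sqlite3.Error as exc:
        return f"rejected: {exc}"
    finally:
        conn.close()

print(try_child_insert(PK_ONLY))      # rejected: `code` alone is not declared unique
print(try_child_insert(WITH_UNIQUE))  # ok: `code` has its own UNIQUE constraint
```

The second schema also shows the point made above: once `code` is unique on its own, the two-column primary key is wider than it needs to be.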
Let's say I create a table Clients. I define a primary key and a set of constraints, such as:
NOT NULL
Length > 5
UPPERCASE
and so on..
Now, I create another table, with a foreign key to Clients primary key.
Should I create the same CONSTRAINTS for the foreign key?
If I don't it wouldn't matter, since the value won't exist on the primary table in the first place:
Example: I don't create the constraints on the foreign key, and I try to add a value that is shorter than 5 characters and is lowercase... The database will not find that value in the parent table, so the row will not be stored. So what is the point of setting the same set of constraints on the child table?
At least you should keep the Foreign key column as not null. Otherwise, you can have many NULL values coming into the child table.
Again, as @Dale Burrell mentioned, the PRIMARY KEY should be system generated, to enforce uniqueness. If you are going to create a clustered index on the primary key column, it should be a narrow, incrementing, non-null, unique value to get good performance.
You don't need the length or uppercase constraints on the foreign key. They'll be "implicitly checked" by the foreign key constraint (as you say, because the referenced data cannot exist).
But for nullability, it's a choice. In SQL Server, if one or more columns in the referencing table of a foreign key are NULL, the constraint isn't enforced.
So here, the question should instead be - are there rows in the other table which should validly not reference a row in the Clients table?
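Both points can be checked with a small sketch. It uses Python's built-in sqlite3 rather than SQL Server, and the tables (clients, orders) and the sample value 'ACME01' are invented for the example. SQLite happens to treat NULLs in a foreign key the same way as described above, but verify the behaviour on your own engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

# Parent: the length and uppercase rules live here only.
conn.execute("""
    CREATE TABLE clients (
        code TEXT PRIMARY KEY NOT NULL
            CHECK (length(code) > 5 AND code = upper(code))
    )
""")
# Child: nullable foreign key, no CHECK constraints of its own.
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        client_code TEXT REFERENCES clients (code)
    )
""")
conn.execute("INSERT INTO clients (code) VALUES ('ACME01')")

def insert_order(client_code):
    """Try to insert one order row and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO orders (client_code) VALUES (?)", (client_code,))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = {
    "valid": insert_order("ACME01"),       # exists in clients
    "short lowercase": insert_order("abc"),  # cannot exist in clients, so the FK fails
    "null": insert_order(None),            # NULL skips the FK check entirely
}
print(results)
```

The NULL row is the one to think about: it is accepted even though it references nothing, which is why nullability is the only constraint worth deciding on for the child column.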
Others have advised that the PK should be system generated. Whilst I agree that it's often useful to do so, don't forget to also enforce the constraints on the real data. E.g. even if this column doesn't end up being your PK, maybe it needs a unique constraint on it to ensure that you don't end up with duplicates in your data.
Please correct me if I'm wrong, and kindly point me to articles on this concept.
When we create a primary key, a unique index, a clustered index, and a NOT NULL constraint are automatically created on that column in the background.
Does this also mean that if we create a not null constraint, [clustered index or non clustered index] and unique index on a column, then that column becomes a primary key?
I want to understand the core concept/relation between primary keys, indexes and constraints.
The primary key is the one that is declared as the "primary" key. Just having the characteristics doesn't make a key "primary". It has to be explicitly declared as such.
Different databases implement primary keys in different ways. Although primary keys are usually implemented with a clustered unique index, that is not a requirement.
The primary key is exactly what its name suggests: "primary". Any other column or group of columns can be declared both unique and not null. That does not make them primary keys. In some databases, you could even define another column or group of columns as not null, unique and clustered -- without that being the primary key.
In summary:
You can have any number of unique indexes on a table.
You can have any number of unique indexes on non-NULL columns on a table.
You can have at most one clustered index. In almost all cases, this would be the primary key, but that is not required in all databases.
You can have at most one primary key. In almost all cases, this would be clustered, although that is not required in all databases.
For more detail, you should refer to the documentation of the database you are using.
If you have multiple columns comprising non-NULL, unique keys, then only one is "primary" -- the one that has been explicitly declared as primary.
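To see that "unique and not null" does not turn a column into the primary key, here is a small sketch using Python's built-in sqlite3. The users table and its columns are hypothetical, and SQLite has no clustered/non-clustered distinction, so this only illustrates the declaration part of the argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,      -- explicitly declared primary
        email TEXT NOT NULL UNIQUE      -- same characteristics, not declared primary
    )
""")

# PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
pk_flags = {row[1]: row[5] for row in conn.execute("PRAGMA table_info(users)")}
print(pk_flags)  # only `id` carries a non-zero pk flag

# The UNIQUE column still rejects duplicates, exactly like a key would.
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```

So `email` behaves like a key (a candidate or alternate key), but the catalog only marks `id` as the primary one.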
Why would you have a non-clustered primary key? I can give one scenario. Imagine a database where UUIDs are the keys for rows. The company does not want to use auto-generated sequence numbers, because they provide information in the number.
However, UUIDs are remarkably bad candidates for clustered indexes, because inserts are almost never at the end. In this case, you might want to design the table with a clustered, auto-generated sequential key to speed inserts, while declaring the UUID as the non-clustered primary key. That way all foreign key references use the UUID, and all foreign key references still point to the primary key of the table.
No.
All of those columns could be given NOT NULL, a non-clustered index and UNIQUE, but a table can have only ONE primary key.
And the Unique allows NULL while Primary Key does not.
You might be talking about Candidate Key, here is the ref:
https://www.techopedia.com/definition/21/candidate-key
I'm not asking HOW to do this, but if it's what I SHOULD be doing.
Two employees can be working on the same job. So of course, both FKs, EmployeeID and JobID, can have a MANY relationship in an "Employee_Jobs" table.
Let's take Employee A, Employee B, Job A and Job B. All of the following would be acceptable:
A A
A B
B A
B B
What would NOT be acceptable is a duplicate of a combination of these two PKs... since we cannot have for example, [Employee A working on Job A] twice.
So would it be correct to say that the only way to manage this is to make the combination of the two PKs, EmployeeID and JobID, a Unique, non-clustered index?
I tried to think of how to instead, break this up to more tables but I keep getting back to this same problem.
Yes, not only is it appropriate, but in fact, the combination of these two attributes should be the PRIMARY KEY.
And in any other table whose rows have a logical attribute representing the work done by an employee on a job (or the contribution of the employee to a job, or the association of an employee with a job in any way), the FK in that table should be a composite foreign key consisting of these same two columns, EmployeeId and JobId.
If you are using a surrogate key on this table to simplify joins and the definition of foreign keys in other tables, then by all means continue to do so, but keep the two-column natural key in this table, as either a unique index or an alternate key (a key is a key: anything that is declared or defined to be unique), so as to ensure data integrity in this table. In fact, to make it clear to users of the schema, when this situation comes up I generally make the composite natural key the PRIMARY KEY, and define the surrogate (which is used in joins and in other tables' FKs) as an alternate key or unique index. This is pretty much only a semantic distinction, as they provide almost identical functionality. But because data integrity is more important to me than join syntax and foreign key structure, to me the natural key is the PRIMARY key.
Yes, in that case you should consider making both fields the primary key, specifically a composite (or compound) primary key like the one below, which will ensure that the combination of the two fields is unique.
primary key (EmployeeID , JobID)
You mentioned a unique, non-clustered index, but in SQL Server declaring both fields as the primary key will by default create a unique clustered index on them, unless you specify NONCLUSTERED or the table already has a clustered index.
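A minimal sketch of the composite key in action, using Python's built-in sqlite3 (so no clustered index is involved; only the uniqueness behaviour is shown). The Employee_Jobs table and the values A and B come from the question. The text column types and the assign helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee_Jobs (
        EmployeeID TEXT NOT NULL,
        JobID      TEXT NOT NULL,
        PRIMARY KEY (EmployeeID, JobID)   -- uniqueness applies to the pair
    )
""")

def assign(employee_id: str, job_id: str) -> bool:
    """Return True if the assignment was stored, False if it was a duplicate."""
    try:
        conn.execute(
            "INSERT INTO Employee_Jobs (EmployeeID, JobID) VALUES (?, ?)",
            (employee_id, job_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

# The four acceptable combinations, followed by a repeat of the first one.
attempts = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B"), ("A", "A")]
outcomes = [assign(e, j) for e, j in attempts]
print(outcomes)  # [True, True, True, True, False]
```

Each column on its own may repeat freely. Only the repeated pair is refused.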
In a SQL Server db, what is the difference between a Primary Key and an Identity column? A column can be a primary key without being an identity. A column cannot, however, be an identity without being a primary key.
In addition to the differences, what do a PK and Identity column offer that just a PK column doesn't?
A column can definitely be an identity without being a PK.
An identity is simply an auto-increasing column.
A primary key is the unique column or columns that define the row.
These two are often used together, but there's no requirement that this be so.
This answer is more of WHY identity and primary key than WHAT they are since Joe has answered WHAT correctly above.
An identity is a value your SQL controls. Identity is a row function. It is sequential, either increasing or decreasing in value, at least in SQL Server. It should never be modified and gaps in the value should be ignored. Identity values are very useful in linking table B to table A since the value is never duplicated. The identity is not the best choice for a clustered index in every case. If a table contains audit data, the clustered index may be better created on the date occurred, as it will answer the question "what happened between today and four days ago" with less work because the records for those dates are sequential in the data pages.
A primary key makes the column or columns in a row unique. Primary key is a column function. Only one primary key may be defined on any table, but multiple unique indexes may be created, which simulate the primary key. Clustering the primary key is not always the correct choice. Consider a phone book. If the phone book is clustered by the primary key (phone number), the query to return the phone numbers on "First Street" will be very costly.
The general rules I follow for identity and primary key are:
Always use an identity column
Create the clustered index on the column or columns which are used in range lookups
Keep the clustered index narrow since the clustered index is added to the end of every other index
Create primary key and unique indexes to reject duplicate values
Narrow keys are better
Create an index for every column or columns used in joins
These are my GENERAL rules.
A key (also known as a candidate key) is any set of attributes that has the properties of uniqueness and minimality; a primary key is the candidate key that has been designated as primary. That means the key column or columns are constrained to be unique. In other words the DBMS won't permit any two rows to have the same set of values for those attributes.
The IDENTITY property effectively creates an auto-incrementing default value for a column. That column does not have to be unique though, so an IDENTITY column isn't necessarily a key.
However, an IDENTITY column is typically intended to be used as a key and therefore it usually has a uniqueness constraint on it to ensure that duplicates are not permitted.
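Major differences between a primary key and an identity column
Primary key:
A primary key cannot have duplicate values.
In SQL Server it creates a clustered index on the table by default, unless NONCLUSTERED is specified or the table already has a clustered index.
It can be set on columns of almost any data type.
We need to provide the primary key value when inserting into the table, unless the column has a default or is also an identity column.
Identity column:
An identity column can contain duplicate values unless a PRIMARY KEY or UNIQUE constraint is also placed on it.
It can only be set on exact numeric columns with a scale of 0, such as int, bigint, smallint, tinyint, or decimal/numeric with a scale of 0.
There is no need to insert values into the identity column. They are generated automatically from the seed and increment.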
EDITS MADE BASED ON FEEDBACK
A key is unique to a row. It's a way of identifying a row. Rows may have none, one, or several keys. These keys may consist of one or more columns.
Keys are indexes with a unique constraint. This differentiates them from non-key indexes.
Any index with multiple columns is called a "composite index".
Traditionally, a primary key is viewed as the main key that uniquely identifies a row. There may only be one of these.
Depending on the table's design, one may have no primary key.
A primary key is just that - a "prime key". It's the main one that specifies the unique identity of a row. Depending on a table's design, this can be a misnomer and multiple keys express the uniqueness.
In SQL Server, a primary key may be clustered. This means the remaining columns are attached to this key at the leaf level of the index. In other words, once SQL Server has found the key, it has also found the row (to be clear, this is because of the clustered aspect).
An identity column is simply a method of generating a unique ID for a row.
These two are often used together, but this is not a requirement.
You can use IDENTITY not only with integers, but also with any numeric data type that has a scale of 0.
A primary key column could have a scale, but it's not required.
IDENTITY, combined with a PRIMARY KEY or UNIQUE constraint, lets you provide a simple unique row identifier.
A primary key is about uniqueness: it prevents duplicate values across all records in the same column (or columns). Identity provides increasing numbers in a column without you having to insert them.
Both features could be on a single column or on different ones.
What is the difference between linking two tables so that the PK of one is an FK in the other, where the FK column is not itself a primary key (so it does not have the gold key icon),
and having the PK of one table also be the PK of another table?
Am I right to think that the second option is for a many-to-many relationship?
Thanks
FK means that any value in our table should be present in the foreign table.
Since the column in the foreign table should be declared as a PK or a UNIQUE key, this means it can be present only once in the foreign table.
PK means that any value in our table should be present only once.
Combined together, they mean that any value should be present only once both in our table and in foreign table.
This is a (0-1):1 relationship.
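A small sketch of that combination, again using Python's built-in sqlite3. The person and passport tables are hypothetical names chosen for the example: passport.person_id is both the primary key of passport and a foreign key to person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE passport (
        person_id INTEGER PRIMARY KEY REFERENCES person (id),  -- PK and FK at once
        number    TEXT NOT NULL
    )
""")
conn.executemany("INSERT INTO person (id) VALUES (?)", [(1,), (2,)])

def add_passport(person_id: int, number: str) -> bool:
    """Return True if the passport row was stored, False if a constraint refused it."""
    try:
        conn.execute(
            "INSERT INTO passport (person_id, number) VALUES (?, ?)",
            (person_id, number),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first = add_passport(1, "P-001")    # person 1 exists and has no passport yet
second = add_passport(1, "P-002")   # refused by the PK: at most one row per person
orphan = add_passport(3, "P-003")   # refused by the FK: person 3 does not exist
print(first, second, orphan)        # True False False
```

Person 2 ends up with no passport at all, which is the "0" side of the (0-1):1 relationship.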