Understanding 3NF FamCation - database

Would these be in 3NF because from my understanding I it doesn't have any transitive or partial dependency. BookID is the primary key, while CondoID, GuestID are foreign keys. Or would this not be 3NF because of the foreign keys? Honestly I am really confused, this is a small part of a big 10 entity FamCation.
Booking(BookID, CondoID, GuestID, StartDate, EndDate)

Related

How do primary keys work in junction tables for a DBMS? How can a composite key be a primary key?

In a DBMS we have
Superkey - An attribute or a set of attributes that uniquely identifies a row in a table.
Candidate Key - An attribute or set of attributes that uniquely identify identifies a row in a table. The difference between the superkey and a candidate key is no subset of a candidate key can itself be a candidate key.
Primary key - A chosen candidate key that become the attribute to uniquely identify a row.
If we want to identify a many to many relation between two tables we can define a junction table such as:
Tables:
Author(AuthorID, FirstName, LastName) -- AuthorID is primary key
Book(BookID, BookTitle) -- BookID is primary key
To create the relation between both:
AuthorBook(authorID, BookID) -- together authorID and BookID are primary key
I am thinking bookID and authorID are both primary keys in their own respect.
Since a candidate key (and therefore a primary key) must not have a subset containing a candidate key, how can authorID plus BookID be a primary key? This seems to break the definition of a primary key.
I understand this may be the difference between real world an theory but as the DBMS textbooks I have read seem to define junction tables this way and define primary keys this way it seems like there is a disconnect there.
Am I misunderstanding this concept?
When we use one of those terms we have to be talking about a given table (variable, value or expression). The superkeys, CKs & PKs of a table are not determined by roles its attributes play in other tables. They are determined by what valid values can arise for the table under the given business rules.
Superkey - An attribute or a set of attributes that uniquely identifies a row in a database.
A superkey of a given table can be defined as a set of attributes that "uniquely identifies a row" of the table. (Not database.) Although that quoted phrase is a kind of shorthand that isn't a very clear description if you don't already know what it means.
A superkey of a given table can be defined as a set of attributes whose subrow values can only appear once in the table. Or as a set of attributes that functionally determines every set of attributes in the table.
When a superkey has just one attribute we can sloppily talk about that attribute being a superkey.
Candidate Key - An attribute or set of attributes that uniquely identifies a row in a database.
It's true that every CK (candidate key) of a certain table is a superkey of that table. But you mean that a set of attributes is by definition a superkey when/iff that and some other condition(s) hold. But you don't clearly say that when you write this section.
The difference between the superkey and a candidate key is no subset of a candidate key can itself be a candidate key.
No. A set is a subset of itself so a CK is a subset of itself so a CK always has a subset that is a CK--itself. What you mean is, no proper/smaller subset. Then your statement is true. But also true and more important is that no proper/smaller subset of a CK is a superkey.
You don't actually define "CK" in this paragraph. A CK of a given table can be defined as a superkey of that table that contains no proper/smaller subset that is a superkey of that table.
Primary key - A chosen candidate key that becomes the attribute to uniquely identify a row.
No. The PK (primary key) of a given table is defined as the one CK of that table that you decided to call the PK. (Not attribute.)
Note that CKs & PKs are superkeys. PKs don't matter to relational theory.
To create the relation between both:
AuthorBook(authorID, BookID) -- together authorID and BookID are primary key
What the superkeys & CKs are & so what the PK can be is determined by the FDs (functional dependencies) that hold in the table. But if you are presuming that this is a many to many table then it takes an authorID-bookID pair to uniquely identify a row, so there can only be one CK, {authorID, bookID}. So that is the only possible PK. So {authorID} & {bookID} cannot be superkeys or CKs or PKs.
You can see this by looking at examples & applying the definitions.
authorID bookID
1 a
1 b
Here authorID does not uniquely identify a row. So it can't be a superkey. So it can't be a CK. So it can't be a PK.
textbooks I have read seem to define junction tables this way and define primary keys this way
No, they don't.
However they do say that certain sets of attributes & subsets of superkeys, CKs & PKs in the junction table are FKs (foreign keys) in the junction table referencing those other tables where they are CKs (which might be PKs) of/in those other tables.
A FK of a given table can be defined as a certain set of attributes in the table whose subrow values must appear as certain CK subrows in a certain other table.
But since you say this is a junction table, presumably {authorID} is a FK to an author table where its values appear under a CK/PK & {bookID} is a FK to a book table where its values appear under a CK/PK. So FK {authorID} in AuthorBook referencing {authorID} in Author & FK {bookID} in AuthorBook referencing {bookID} in Book.
PS PK & other terms mean something else in SQL. A declared SQL PK can have a smaller SQL UNIQUE declared within it. SQL "uniqueness" itself is defined in terms of SQL NULL. It's reasonable to say that an SQL PK is more reminiscent a relational superkey than it is reminiscent of a relational PK. Similarly a SQL FK is more reminiscent of what we could reasonably call a relational foreign superkey than a relational foreign key.

Normalization, Primary Key Dependency Candidate Key

In the process of decomposition to normalize a relation.
If I reach the point where all attributes in a relation depend on the primary key, can I assume that they will all depend entirely on the different candidate keys?
If that is not a case can you please give me an example of a case where all attributes depend on the primary key, but some of them depend on the part of other candidate keys.
I'm starting learning databases
Surrogate primary IDs make an example really easy:
(row_id PK, student_id, course_id, student_name)
where row_id and (student_id, course_id) are candidate keys and student_id -> student_name. Of course the row_id trivially determines any other attributes if it's an auto-incremented number.

Database relation in 3NF?

I have following relation. A company has several employees. Each employee is defined by its employee number ENr and he is living on an address EAddress with a ZipCode ZZipCode. The City with the ZipCode is an own table because otherwise there is redundancy in table Employee. Therefore ZZipCode is a foreign key in Employee.
A Group is defined by its GGroupId, therefore that is the primary key. Each group has one group leader which can be any employee. Therefore ENr is a foreign key.
Each employee can work on none, one or more groups. For this reason the table GroupMember exists where the the tuple ENr and GGroupID define the primary key and both are foreign keys (I cannot do both, bold and italic).
And last, a product is defined by its product id PId and is associated to a group GGroupID.
Well here are the relations for that written description.
Employe(ENr, EName, EGender, EAddress, ZZipCode, ESocNr,
ESalery) Group(GGroupId, GName, GCostNr, ENr)
GroupMember(ENr, GGroupID) #both members are foreign keys
too! Product(PId, PName, PPrice, GGRoupId)
Zip(ZZipCode, ZCityName, SStateID) State(SStateID,
SStateName)
For clarification: bold members are primary keys and italic members are foreign keys.
I tried to put that relation into 3NF. Can anyone confirm that this is right?
This seems to be good and normalised. I dont see any further division of the tables.

Does this database design fulfill 2NF or 3NF?

Ok so i have 2 tables. First table's name is owner with
owner_id primary key
first_name
last_name
and i have another table car with
car_reg_no primary key
car_type
car_make
car_model
car_origin
owner_id foreign key
Is this design in 2NF or 3NF or neither ?
AFAIK, 2NF, due to interdependence of the fields of the car table. You would need a third table for car_type which lists make, model and origin, and a foreign car_type_id in the car table.
A 3NF means its in 2NF and there are no transitive functional dependencies. In a slightly more understandable terms: "all attributes depend on key, whole key, and nothing but the key".
The first table fulfills all that, so it's in 3NF.
The second table needs some analysis: are there functional dependencies on non-keys? Specifically, can there be the same car model belonging to a different make?
If yes, then the functional dependency car_model -> car_make does not exist, and the table is in 3NF (unless some other dependency violates 3NF - see the comment on car_origin below).
It no, then there there is car_model -> car_make which violates 3NF.
Also, what's the meaning of car_origin? If it functionally depends on a non-key, this could also violate the 3NF.

Normalisation--2NF vs 3NF

We say 2NF is "the whole key" and 3NF "nothing but the key".
Referencing this answer by Smashery:
What are database normal forms and can you give examples?
The example used for 3NF is exactly the same as 2NF--it's a field which is dependent on only one key attribute. How is the example for 3NF different from the one for 2NF?
Suppose that some relation satisifies a non-trivial functional dependency of the form A->B, where B is a nonprime attribute.
2NF is violated if A is not a superkey but is a proper subset of a candidate key
3NF is violated if A is not a superkey
You have spotted that the 3NF requirement is just a special case (but not really so special) of the 2NF requirement. 2NF in itself is not very important. The important issue is whether A is a superkey, not whether A just happens to be some part of a candidate key.
Since you ask very specific question about an answer for existing so question here is an explanation of that (and basically I'll say what dportas already said in his answer, but in more words).
The examples of design that is not in 2NF and not in 3NF are not the same.
Yes, the dependency in both cases is on a single field.
However, in non 2NF example:
dependency is on the part of the primary key
while in non 3NF example (which is in 2NF):
dependency is on a field that is not a part of the primary key (and also notice that in that example it does satisfy 2NF; this is to show that even if you check for 2NF you should also check for 3NF)
In both cases to normalize you would create additional table which would not exhibit update anomalies (example of update anomaly: in 2NF example, what happens if you update Coursename for IT101|2009-2, but not for IT101|2009-1? You get inconsistent=meaningless=unusable data).
So, if you memorize the key, the whole key and nothing but the key, which covers both 2NF and 3NF, that should work for you in practice when normalizing. The distinction between 2NF and 3NF might seem subtle to you (question if in the additional dependency the attribute(s) on which the data is dependent are part of candidate key or not) - and, well, it is - so just accept it.
2NF allows non-prime attributes to be functionally dependent on non-prime attributes
but
3NF allows non-prime attributes to be functionally dependent only on super key
Thus,when a table is in 3NF it is in 2NF and 3NF is stricter than 2NF
Hope this helps...
You have achieved the 3rd NF when there are no relations between the key and other columns that don't depend on it.
Not sure my professor would have said that like this but this is what it is.
If you're "in the field". Forget about the definitions. Look for "best practices". One is DRY : Don't Repeat Yourself.
If you follow that principle, you already master everything you need for NF.
Here is an example.
Your table has the following schema:
PERSONS : id, name, age, car make, car model
Age and name are related to the person entry (=> id) but the model depends to the car and not the person.
Then, you would split it in two tables:
PERSONS : id, name, age, car_models_id (references CAR_MODELS.id)
CAR_MODELS : id, name, car_makes_id (references CAR_MAKES.id)
CAR_MAKES : id, name
You can have replication in 2FN but not in 3FN anymore.
Normalization is all about non-replication, consistency, and from another point of view foreign keys and JOINs.
The more normalized the better for data but not for performance nor understanding if it gets really too complicated.
2NF follows the partial dependency whereas 3NF follows the transitive functional dependency. It is important to know that the 3NF must be in 2NF and support transitive functional dependency.
First, we have to know the tools we work with:
candidate key attribute;
non candidate key attribute;
partial dependency;
full dependency;
Candidate Key Attribute
A candidate key attribute is any column or combination of columns that can be/form primary key. You can have many candidate keys, but you will pick only one of these to be primary key. Still, any candidate key attribute is important in 2NF. No need to be primary key, or any key, it is enough to be a candidate key attribute. 2NF refers to CANDIDATE KEY. Those that say key or primary key instead of "candidate key" add to the confusion.
Non Candidate Key Attribute
Any column that can't be primary key and can't be part of the primary key.
Partial Dependency
Partial dependency arrives when there is a candidate key formed by MORE THAN ONE column, AND a non candidate key attribute depends only on A column that constitutes the candidate key.
Full Dependency
Any non candidate key attribute, if depends on a candidate key, then depends on the WHOLE candidate key. If the candidate key is formed by more than one column, then the dependent column must depend on any column that forms the candidate key.
Now you have the tools to understand 2NF and 3NF.
2NF does not allow partial dependency. If you find a non candidate key attribute that is partially dependent on a candidate key attribute, you must beak partial dependency to make it full dependency. So 2NF allows a non candidate key attribute to be full dependent on a candidate key attribute that is not primary key. It is just a possible primary key, if you pick it, but you are not forced to pick it. 2NF is compliant only by that.
Let's say you have it in 2NF. All non candidate key attributes are full dependent on candidate key attributes. But a non candidate key attribute is full dependent on a candidate key attribute that you did not pick it to be primary key. 3NF do not allow it. All full dependencies must be with primary key (at this point you picked a primary key already).

Resources