Ok so i have 2 tables. First table's name is owner with
owner_id primary key
first_name
last_name
and i have another table car with
car_reg_no primary key
car_type
car_make
car_model
car_origin
owner_id foreign key
Is this design in 2NF or 3NF or neither ?
AFAIK, 2NF, due to interdependence of the fields of the car table. You would need a third table for car_type which lists make, model and origin, and a foreign car_type_id in the car table.
A 3NF means its in 2NF and there are no transitive functional dependencies. In a slightly more understandable terms: "all attributes depend on key, whole key, and nothing but the key".
The first table fulfills all that, so it's in 3NF.
The second table needs some analysis: are there functional dependencies on non-keys? Specifically, can there be the same car model belonging to a different make?
If yes, then the functional dependency car_model -> car_make does not exist, and the table is in 3NF (unless some other dependency violates 3NF - see the comment on car_origin below).
It no, then there there is car_model -> car_make which violates 3NF.
Also, what's the meaning of car_origin? If it functionally depends on a non-key, this could also violate the 3NF.
Related
When my instructor taught my SQL class about 2NF, they mentioned that it's violated if there's partial dependency - that is, when a table has a composite key and a non-key column depends only on one of the keys, and not all of the columns that comprise the PK.
If there's an entity with a single-columned PK, and there's a non-key attribute that does not depend on this PK, does it mean that it's in 2NF because the entity does not have a composite key and partial dependency is not possible, and would therefore never be violated (an attribute is only either dependent or not dependent on the PK)?
Thank you!
I'm certainly not an expert on this subject but to quote GeeksforGeeks:
"Second Normal Form applies to relations with composite keys, that is, relations with a primary key composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF." (https://www.geeksforgeeks.org/second-normal-form-2nf/)
So, at least according to them, the answer is yes.
I want to normalize the table
So far i have done this:
is it my 3Nf correct?
Order(orderNo,orderDate,customerId)
Customer(customerId,customerName,customerCurrentAddress,customerPermanentAddress)
Product(productId,productName,Qty,categoryId)
Category(categoryId,categoryName)
Sub-Category(subCatId,subCatName,categoryId)
Given table here:
https://drive.google.com/file/d/1F7cQjjxz9rnY6RHaGtZXuwVGFEHBVgNo/view
You have to make a junction table between Order and Product in order to keep track of orders that contain more than one product.
Order(orderNo,orderDate,customerId)
Order_Details(orderNo,productId,Qty)
Customer(customerId,customerName,customerCurrentAddress,customerPermanentAddress)
Product(productId,productName,Qty,categoryId)
Category(categoryId,categoryName)
Sub-Category(subCatId,subCatName,categoryId)
Base on the definition from Wiki: a table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table. Most of your tables only have 2 columns, so they satisfied this. For Customer(customerId,customerName,customerCurrentAddress,customerPermanentAddress), the candidate key is customerId, and the other columns completely depend on the whole candidate key, then it's OK.
For 3NF, Wiki said: all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes. As you see, your tables are all satisfied.
I am trying to normalize the following table. I want to go from the UNF form to 3NF form. I want to know, what do you do at the 1NF stage? It says it's where you remove the repetitive columns or groups (ex. ManagerID, ManagerName). This is considered repetitive because it's leads to the same data.
The Unnormalized data table has the following columns
CustomerRental(CustNo,CustName,PropNo,PAddress,RentStart,RentFinish,Rent,OwnerNo,OName)
There are no repeating columns/fields and each cell has a single value, but there is not a primary key. The functional dependencies I see in the table are:
{CustNo}->{Cname}
{PropNo}->{Paddress,RentStart,RentFinish,Rent,OwnerNo,Oname}
{CustNo,PropNo}->
{Paddress,RentStart,RentFinish,Rent,OwnerNo,OName,CustName}
{OwnerNo,PropNo}->{Rent,Paddress,Oname,RentInfo}
The primary key I picked was a composite key, CustNo + PropNo. Since it has a primary key, the table is in 1NF form, correct? This is what I thought, but the answer excludes CustNo and CustName from the table. They are in their own table.
From the above, I normalized it 2NF. At this stage, you are supposed to ensure that all non-prime attributes are fully dependent on the primary key. This is not the case. These are the functional dependencies in the table:
{OwnerNo}->{Oname}
{CustNo}->{CustName}
{PropNo}->{Paddress,Rent,OwnerNo,Oname}
I moved these values out of the table to create three new tables in 2NF form:
Customers(CustNo(PK),CustName)
Property(PropNo(PK),Paddress,City,Rent,OwnerNo,OwnerName)
Rentals(RentalNo(PK),CustNo,OwnerNo,PropNo,RentStart,RentFinish)
Now the main table, Rentals, is in 2NF form. It has a primary key, RentalNo, and each of the non-prime attributes depends on it.
I think that there is a transitive dependency on it. You can find OwnerNo through the PropNo. So, to make it comply with 3NF rules, you have to move the OwnerNo to its own table to create these tables:
Customers(CustNo,CustName)
Property(PropNo,Paddress,City,Rent)
Owners(OwnerNo,OwnerName)
Rentals(RentalNo,CustNo,PropNo,RentStart,RentFinish)
Is this correct? I read that at the 1NF stage, you are supposed to remove repetitive columns (ex. OwnerNo,OwnerName). Is this true? Why or why not?
The picture showing my tables is here:
Normalized Tables
We don't normalize to a NF (normal form) by going through lower NFs between it and 1NF. We use a proven algorithm for the NF we want. Find one in a published academic textbook. (If that doesn't describe the reference(s) you were told to use, find one that it does & quote it.)
Pay close attention to the terms and steps. Details matter. Eg you will need to know all the FDs (functional dependencies) that hold, not just some of them. Eg whenever some FDs hold, all the ones generated by Armstrong's axioms hold. Eg PKs (primary keys) are irrelevant, CKs (candidate keys) matter. Eg every table has a CK. Eg normalization to higher NFs does not change column names. So already your question does not reflect a correct process.
You really need to read & quote the reference(s) you were told to use in order to get to "1NF", because "1NF" is in the eye of the beholder. Normalization to higher NFs works on any relation.
We say 2NF is "the whole key" and 3NF "nothing but the key".
Referencing this answer by Smashery:
What are database normal forms and can you give examples?
The example used for 3NF is exactly the same as 2NF--it's a field which is dependent on only one key attribute. How is the example for 3NF different from the one for 2NF?
Suppose that some relation satisifies a non-trivial functional dependency of the form A->B, where B is a nonprime attribute.
2NF is violated if A is not a superkey but is a proper subset of a candidate key
3NF is violated if A is not a superkey
You have spotted that the 3NF requirement is just a special case (but not really so special) of the 2NF requirement. 2NF in itself is not very important. The important issue is whether A is a superkey, not whether A just happens to be some part of a candidate key.
Since you ask very specific question about an answer for existing so question here is an explanation of that (and basically I'll say what dportas already said in his answer, but in more words).
The examples of design that is not in 2NF and not in 3NF are not the same.
Yes, the dependency in both cases is on a single field.
However, in non 2NF example:
dependency is on the part of the primary key
while in non 3NF example (which is in 2NF):
dependency is on a field that is not a part of the primary key (and also notice that in that example it does satisfy 2NF; this is to show that even if you check for 2NF you should also check for 3NF)
In both cases to normalize you would create additional table which would not exhibit update anomalies (example of update anomaly: in 2NF example, what happens if you update Coursename for IT101|2009-2, but not for IT101|2009-1? You get inconsistent=meaningless=unusable data).
So, if you memorize the key, the whole key and nothing but the key, which covers both 2NF and 3NF, that should work for you in practice when normalizing. The distinction between 2NF and 3NF might seem subtle to you (question if in the additional dependency the attribute(s) on which the data is dependent are part of candidate key or not) - and, well, it is - so just accept it.
2NF allows non-prime attributes to be functionally dependent on non-prime attributes
but
3NF allows non-prime attributes to be functionally dependent only on super key
Thus,when a table is in 3NF it is in 2NF and 3NF is stricter than 2NF
Hope this helps...
You have achieved the 3rd NF when there are no relations between the key and other columns that don't depend on it.
Not sure my professor would have said that like this but this is what it is.
If you're "in the field". Forget about the definitions. Look for "best practices". One is DRY : Don't Repeat Yourself.
If you follow that principle, you already master everything you need for NF.
Here is an example.
Your table has the following schema:
PERSONS : id, name, age, car make, car model
Age and name are related to the person entry (=> id) but the model depends to the car and not the person.
Then, you would split it in two tables:
PERSONS : id, name, age, car_models_id (references CAR_MODELS.id)
CAR_MODELS : id, name, car_makes_id (references CAR_MAKES.id)
CAR_MAKES : id, name
You can have replication in 2FN but not in 3FN anymore.
Normalization is all about non-replication, consistency, and from another point of view foreign keys and JOINs.
The more normalized the better for data but not for performance nor understanding if it gets really too complicated.
2NF follows the partial dependency whereas 3NF follows the transitive functional dependency. It is important to know that the 3NF must be in 2NF and support transitive functional dependency.
First, we have to know the tools we work with:
candidate key attribute;
non candidate key attribute;
partial dependency;
full dependency;
Candidate Key Attribute
A candidate key attribute is any column or combination of columns that can be/form primary key. You can have many candidate keys, but you will pick only one of these to be primary key. Still, any candidate key attribute is important in 2NF. No need to be primary key, or any key, it is enough to be a candidate key attribute. 2NF refers to CANDIDATE KEY. Those that say key or primary key instead of "candidate key" add to the confusion.
Non Candidate Key Attribute
Any column that can't be primary key and can't be part of the primary key.
Partial Dependency
Partial dependency arrives when there is a candidate key formed by MORE THAN ONE column, AND a non candidate key attribute depends only on A column that constitutes the candidate key.
Full Dependency
Any non candidate key attribute, if depends on a candidate key, then depends on the WHOLE candidate key. If the candidate key is formed by more than one column, then the dependent column must depend on any column that forms the candidate key.
Now you have the tools to understand 2NF and 3NF.
2NF does not allow partial dependency. If you find a non candidate key attribute that is partially dependent on a candidate key attribute, you must beak partial dependency to make it full dependency. So 2NF allows a non candidate key attribute to be full dependent on a candidate key attribute that is not primary key. It is just a possible primary key, if you pick it, but you are not forced to pick it. 2NF is compliant only by that.
Let's say you have it in 2NF. All non candidate key attributes are full dependent on candidate key attributes. But a non candidate key attribute is full dependent on a candidate key attribute that you did not pick it to be primary key. 3NF do not allow it. All full dependencies must be with primary key (at this point you picked a primary key already).
someone told me the following table isn't fit for the second database normalization. but i don't know why? i am a newbie of database design, i have read some tutorials of the 3NF. but to the 2NF and 3NF, i can't understand them well. expect someone can explain it for me. thank you,
+------------+-----------+-------------------+
pk pk row
+------------+-----------+-------------------+
A B C
+------------+-----------+-------------------+
A D C
+------------+-----------+-------------------+
A E C
+------------+-----------+-------------------+
Your question cannot be answered properly unless you state what dependencies are supposed to be satisfied here. You appear to have two attributes with the same name (pk), in which case this table doesn't even satisfy 1NF because it doesn't qualify as a relation.
About your example: that table doesn't fit the second database normalization (with your sample data, I presume that the C depends only on A). The second normalization form requires that:
No non-prime attribute in the table is functionally dependent on a
proper subset of a candidate key
(Wikipedia)
So the C depends on "A", which is a subset of your primary key. Your primary key is a special superkey. (dportas point out the fact that it can't be called candidate key, since it's not minimal).
Let's say more about the second normalization form. Transform your example a little for easy understanding, presume that there's a table CUSTOMER(customer_id, customer_name, address). A super key is a sub-set of your properties which uniquely determine a tube. In this case, there are 3 super key: (customer_id) ; (customer_id, customer_name) ; (customer_id, customer_name, address). (Customer name may be the same for 2 people)
In your case, you have determined (customer_id, customer_name) be the Primary Key. It violated the second form rules; since it only needs customer_id to determine uniquely a tube in your database. For the sake of theory accuration, the problem here raised from the choice of primary key(it's not a candidate key), though the same argument can be applied to show the redundance. You may find some useful example here.
The third normal form states that:
Every non-prime attribute is
non-transitively dependent on every
candidate key in the table
Let give it an example. Changing the previous table to fit the second form, now we have the table CUSTOMER(customer_id,customer_name, city, postal_code), with customer_id is primary key.
Clearly enough, "postal_code" depends on the "city" of customer. This is where it violated the third rule: postal_code depends on city, city depends on customer_id. That means postal_code transitively depends on customer_id, so that table doesn't fit the third normal form.
To correct it, we need to eliminate the transitive dependence. So we split the table into 2 table: CUSTOMER(customer_id, customer_name, city) and CITY(city, postal_code). This prevent the redundance of having too many tubes with the same city & postal_code.