Is it necessary to normalise a table in order? - database

Is it necessary to normalize a table in order, i.e. first convert it to 1NF, then 2NF, then 3NF and so on, or can we skip some of the steps?
Example:
R(ABCDE)
AB->C,C->D,B->E
It is only in 1NF, not in 2NF, because B->E is a partial dependency (C->D is a transitive one).
So first I would have to convert it into 2NF and then into 3NF (by the rule, I think).
But I can convert it directly into 3NF without going through 2NF.
For 3NF:
AB->C is fine.
C->D and B->E are not.
So I can make new tables
R1(ABC), R2(CD), R3(BE) (candidate keys are AB, C and B in the respective tables).
AB->C, C->D and B->E now each have a key as determinant, so the result is in 3NF.
So is it necessary to follow the order or not?

Some lower normal forms don't apply. Sometimes a relation is already in 3NF before you do anything to it. You can skip normal forms that don't apply and you can jump straight to 3NF (or higher, if applicable) directly. It is not necessary to do each step before proceeding to the next.
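As a rough check of the example in the question, here is a minimal Python sketch. The helper names (`fd`, `closure`, `candidate_keys`, `is_3nf`) are made up for illustration, and attributes are assumed to be single letters. It computes the candidate key of R(ABCDE) and tests the original relation and the three-way split against the usual 3NF condition.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if closure(combo, fds) == set(attrs) and not any(k <= set(combo) for k in keys):
                keys.append(set(combo))
    return keys

def is_3nf(attrs, fds):
    """3NF test: each non-trivial X -> A has X a superkey or A prime."""
    prime = set().union(*candidate_keys(attrs, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) != set(attrs) and (rhs - lhs) - prime:
            return False
    return True

F = [fd("AB", "C"), fd("C", "D"), fd("B", "E")]
keys = candidate_keys("ABCDE", F)
print(["".join(sorted(k)) for k in keys])   # ['AB']
print(is_3nf("ABCDE", F))                   # False

# The direct decomposition from the question, each part with its own FD.
parts = [("ABC", [fd("AB", "C")]), ("CD", [fd("C", "D")]), ("BE", [fd("B", "E")])]
print(all(is_3nf(a, f) for a, f in parts))  # True
```

No 2NF step appears anywhere: the split is checked against the 3NF condition directly.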

Related

Exactly what is 2NF and 3NF?

What's the main point of Normalization?
I mean, if a relation is not in 2NF, it is because of a partial dependency, i.e. a non-key attribute depends on part of a candidate key.
So, let's say, for a relation R(A,B,C) with FDs:
AB->C, B->C
Clearly, AB is the candidate key and B->C is the partial dependency.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
Now, if a relation is not in 3NF, it is because a non-key attribute depends on another non-key attribute, that is to say,
if FDs for a relation R(A,B,C) are:
A->B,B->C
Clearly, A is the key and B->C shows transitive dependency, so not in 3NF.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
So, what's the exact difference?
I mean, why such a marked distinction? Essentially in both of the cases the action is same.
Decompose the relation using the dependency where the determinant (B here) is either PART of a key or not.
Why have separate terms like partial dependency or transitive dependency?
Why not just see if there exists a dependency wherein a non-prime attribute is determined by something which is NOT a candidate key (no matter whether it is a partial key or another non-prime attribute)?
Why can't we implement a method like this:
1NF -- all elements are in atomic form
X NF -- if there is any dependency of the form non_key -> non_prime_attribute(s), decompose the relation so that one of the new relations has this particular "non_key" as its key together with those non_prime_attributes
BCNF -- for every dependency of the form X->Y, X is a superkey
Can we have such NF condition format? Does it combine all the conditions?
So, what's the exact difference?
2NF is not 3NF & definitions of 2NF are not definitions of 3NF. There isn't any particular semantic or syntactic structural similarity that would leave some kind of "difference" other than that a 2NF relation can have the sort of problem FD (functional dependency) that violates 3NF, which a 3NF relation doesn't have. You can find definitions all over the place. You almost give them correctly here yourself. But a NF (normal form) is a condition, not a process. What do you mean "actions are the same"? Being in 3NF implies being in 2NF, so naturally decomposing to 3NF also gives 2NF. But there are relations that are in 2NF but not in 3NF, and there may be decompositions of a relation to 2NF that don't get to 3NF. Those decompositions remove all problem partial FDs without removing all problem transitive FDs.
(Because 3NF is always achievable and there are no other disadvantages compared to 2NF, 2NF isn't even useful. It's just a condition that was discovered first that is not as strong as 3NF.)
(3NF is frequently defined in terms of 2NF plus no transitive dependencies of non-prime attributes on CKs, but actually no such FDs implies no partial FDs of non-prime attributes on CKs, hence 2NF, so the first condition is redundant.)
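To make the two examples from the question concrete, here is a small Python sketch (the function names are invented here, and attributes are assumed to be single letters). It labels each listed FD whose determinant is not a superkey and whose dependent attribute is non-prime as partial, when the determinant is a proper subset of a candidate key, or transitive otherwise. It only inspects the FDs as listed, which is enough for these two small cases.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if closure(combo, fds) == set(attrs) and not any(k <= set(combo) for k in keys):
                keys.append(set(combo))
    return keys

def violations(attrs, fds):
    """Return (partial, transitive) lists of (determinant, attribute) pairs."""
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    partial, transitive = [], []
    for lhs, rhs in fds:
        lhs = set(lhs)
        if closure(lhs, fds) == set(attrs):
            continue  # determinant is a superkey, nothing to report
        for a in sorted(set(rhs) - lhs - prime):
            pair = ("".join(sorted(lhs)), a)
            if any(lhs < k for k in keys):
                partial.append(pair)      # determinant is part of a key
            else:
                transitive.append(pair)   # determinant is not part of any key
    return partial, transitive

print(violations("ABC", [("AB", "C"), ("B", "C")]))  # ([('B', 'C')], [])
print(violations("ABC", [("A", "B"), ("B", "C")]))   # ([], [('B', 'C')])
```

The second relation has no partial dependency at all, so it is in 2NF while still failing 3NF, which is the case the answer describes.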
Why not just see if there exists a dependency wherein a non-prime attribute is determined by something which is NOT a candidate key
Why should that condition be helpful? It is not a description of just getting rid of the problem FDs of 2NF & 3NF--that's what putting into 3NF does.
Getting rid of non-trivial FDs that are not determined by superkeys happens to give BCNF. It implies 2NF & 3NF. But it is different from both of them. A BCNF relation exhibits no FD-based update anomalies. It is always achievable. However, 3NF is always achievable while "preserving FDs", whereas BCNF is not. In some cases, for a FD that held in the original to be enforced in a view/query that reconstructs it, via constraints on its components, we need an EQD (equality dependency) constraint. That says two column sets have the same set of subrow values, which is more expensive to enforce than a FD. Either you have BCNF & an EQD & fewer update anomalies or you have 3NF/EKNF & a FD & certain update anomalies.
The NF that really matters is 5NF, which implies BCNF, with no update anomalies & with other benefits. (We might then decide to denormalize for performance reasons.)
PS Normalization to a given NF does not necessarily involve normalization to lower NFs.
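The single "determinant must be a superkey" condition discussed above can be sketched in a few lines of Python. The names `closure` and `is_bcnf` are made up for this illustration, and attributes are again single letters. For a relation given together with its own FD list, checking the listed FDs is sufficient.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(attrs, fds):
    """BCNF test: every non-trivial listed FD has a superkey as determinant."""
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) == set(attrs)
               for lhs, rhs in fds)

partial_case = [("AB", "C"), ("B", "C")]      # B is part of the key AB
transitive_case = [("A", "B"), ("B", "C")]    # B is a non-key attribute

print(is_bcnf("ABC", partial_case))     # False
print(is_bcnf("ABC", transitive_case))  # False

# After splitting off (B, C) with key B, every piece passes.
print(is_bcnf("BC", [("B", "C")]))      # True
print(is_bcnf("AB", []))                # True
print(is_bcnf("AB", [("A", "B")]))      # True
```

Both of the question's examples fail for the same reason, B -> C with B not a superkey, without any need to say whether the dependency is partial or transitive.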
It almost sounds as though you want to know why they called these two normal forms by different names instead of inventing just one form that covers both cases. If that's not the case, please ignore this answer.
Part of the answer is that the forms weren't discovered at the same time. And part of the answer is that the problem with 1NF that gave rise to 2NF is not the same as the problem with 2NF that gave rise to 3NF, even though they both exhibit harmful redundancy.
What might satisfy you a little more is BCNF. BCNF was defined after 3NF, as a stricter version of it, and before 4NF. It has to be placed between 3NF and 4NF, because it is more restrictive than 3NF but less restrictive than 4NF. It is named after its authors rather than numbered, so its name is "out of sequence", so to speak.
In BCNF, every (non trivial) determinant is a candidate key. That seems to be what you are looking for. I conjecture that any relation that is in 1NF and where every determinant is a candidate key, could be shown to be in 2NF and 3NF. But the proof is beyond me.
2NF and 3NF are essentially historical concepts and your question is a reasonable one. There is no real reason to apply them in practical database design because better tools exist today.
When it comes to teaching there is possibly some justification for mentioning 2NF and 3NF. Doing so allows students to explore the concepts involved (as you have done) while also teaching them a bit about the origins and rationale of design theory. In school maths lessons I was taught long division and differentiation from first principles. No one uses those techniques in practice, they are just teaching aids.
Before checking for 2NF, the relation should be in 1NF. In simple words, a relation in 2NF has only full dependencies of non-prime attributes on its candidate keys, no partial ones. A dependency x -> y is full if removing any attribute from x breaks it; if y is still determined after removing some attribute from x, the dependency is partial. To check for 3NF, first check for 2NF. A relation in 3NF has no transitive dependencies of non-prime attributes on a key: x -> z must not hold only by way of x -> y and y -> z, where y is not a key.
Solution for 2NF: create a new relation for each partial dependency, and keep its determinant in the previous relation as a foreign key referencing the new relation's primary key.
Solution for 3NF: create one relation for x -> y and another for y -> z, and add keys to both relations.

When does BCNF not preserve functional dependencies, and should I then use 3NF?

When is BCNF not able to preserve functional dependencies?
When is a 3NF decomposition desired instead of a BCNF decomposition preserving functional dependencies?
Please explain with an example.
I saw this question but it does not answer my question:
Decomposition that does not preserve functional dependency
When is BCNF not able to preserve functional dependencies?
Turns out this question is problematic in a certain way that "ok you defined 'prime number' but when is a number prime?" is, but "ok you defined 'simplest form of a fraction' but when is a fraction in simplest form?" isn't. Definition(s) say "when". But what you mean is something like, multiple conditions apply so what more simple/intuitive definition or non-brute-force algorithm characterizes this? But it has been shown that (informally) there is no non-exponential/non-exhaustive algorithm to enumerate BCNF decompositions that do/don't preserve FDs (functional dependencies).
When is a 3NF decomposition desired instead of a BCNF decomposition [not] preserving functional dependencies?
If a 3NF design is not in BCNF then it preserves a FD that is not out of a superkey and so cannot be declaratively enforced in most SQL DBMSs. But the BCNF design, not having preserved the FD, needs a constraint enforced that is equivalent to two SQL FK (foreign key) constraints to each other, which cannot be declaratively enforced in most SQL DBMSs. Since there's nothing special about cycles that prevents DBMSs from enforcing them and the two designs can represent each other, there isn't any reason per se why a DBMS couldn't support both.
There's a similar mental complexity for these two design forms--3NF plus FDs not out of CKs vs BCNF plus extra equality dependencies. But since the 3NF relation is the join of its BCNF components, the meaning of a 3NF tuple is the AND/conjunction of the meanings of the BCNF components. Since a user implicitly knows this and should be explicitly told it, and since constraints are not needed to query or modify a database (they're for integrity), the BCNF design is in some sense simpler. But if the user is always wanting to update both components then the 3NF design is in some sense simpler.
Thus, in case we are not able to get a dependency-preserving BCNF decomposition, it is generally preferable to opt for BCNF, since checking functional dependencies other than primary key constraints is difficult in SQL.
-- Database System Concepts 6th Edition (2011) by Silberschatz, Korth & Sudarshan
You can find an example facing this choice in most textbooks, and dozens are online in pdf. It must involve overlapping (composite) CKs (candidate keys).
The meaning of an SJT tuple (s,j,t)--simplified notation--is that student s is taught subject j by teacher t. The following constraints apply:
For each subject, each student of that subject is taught by only one teacher
Each teacher teaches only one subject (but each subject is taught by several teachers).
[...] From the first constraint, we have the FD {S,J} → T. From the second constraint, we have the FD T → J.
-- An Introduction to Database Systems 8th Edition (2004) by Date
(A 3NF design can suffer from further problems that could be eliminated by further decomposing the BCNF design to higher normal forms. This is why we should always decompose to 5NF then if desired explicitly denormalize. So any non-BCNF 3NF table should have come from such a denormalization.)
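Date's SJT example can be checked mechanically. The sketch below uses hypothetical helper names and a brute-force projection of FDs onto each component, so it is exponential and only suitable for toy schemas. It finds the two overlapping candidate keys, shows that T -> J violates BCNF, and shows that splitting on T -> J loses {S,J} -> T.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if closure(combo, fds) == set(attrs) and not any(k <= set(combo) for k in keys):
                keys.append(set(combo))
    return keys

def project(fds, sub):
    """Non-trivial FDs that hold on the attribute subset sub (brute force)."""
    out = []
    for size in range(1, len(sub) + 1):
        for combo in combinations(sorted(sub), size):
            rhs = (closure(combo, fds) & set(sub)) - set(combo)
            if rhs:
                out.append((frozenset(combo), frozenset(rhs)))
    return out

F = [("SJ", "T"), ("T", "J")]
print(["".join(sorted(k)) for k in candidate_keys("SJT", F)])  # ['JS', 'ST']

# T -> J violates BCNF because T alone is not a superkey.
print(closure("T", F) == set("SJT"))  # False

# Split on T -> J and collect the FDs that survive in each component.
kept = project(F, "TJ") + project(F, "ST")
print([("".join(sorted(l)), "".join(sorted(r))) for l, r in kept])  # [('T', 'J')]
print("T" in closure("SJ", kept))  # False: {S,J} -> T is not preserved
```

Every attribute of SJT is prime, which is why the undecomposed relation is in 3NF despite the BCNF violation.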

Database Normalization Process to 3NF CustomerRental for CustNo, PropNo, OwnerNo, etc

I am trying to normalize the following table. I want to go from UNF to 3NF. What do you do at the 1NF stage? I read that it is where you remove repeating columns or groups (e.g. ManagerID, ManagerName). These are considered repetitive because they lead to the same data.
The Unnormalized data table has the following columns
CustomerRental(CustNo,CustName,PropNo,PAddress,RentStart,RentFinish,Rent,OwnerNo,OName)
There are no repeating columns/fields and each cell has a single value, but there is not a primary key. The functional dependencies I see in the table are:
{CustNo}->{Cname}
{PropNo}->{Paddress,RentStart,RentFinish,Rent,OwnerNo,Oname}
{CustNo,PropNo}->{Paddress,RentStart,RentFinish,Rent,OwnerNo,OName,CustName}
{OwnerNo,PropNo}->{Rent,Paddress,Oname,RentInfo}
The primary key I picked was a composite key, CustNo + PropNo. Since it has a primary key, the table is in 1NF form, correct? This is what I thought, but the answer excludes CustNo and CustName from the table. They are in their own table.
From the above, I normalized it to 2NF. At this stage, you are supposed to ensure that all non-prime attributes are fully dependent on the primary key. This is not the case here. These are the functional dependencies in the table:
{OwnerNo}->{Oname}
{CustNo}->{CustName}
{PropNo}->{Paddress,Rent,OwnerNo,Oname}
I moved these values out of the table to create three new tables in 2NF form:
Customers(CustNo(PK),CustName)
Property(PropNo(PK),Paddress,City,Rent,OwnerNo,OwnerName)
Rentals(RentalNo(PK),CustNo,OwnerNo,PropNo,RentStart,RentFinish)
Now the main table, Rentals, is in 2NF form. It has a primary key, RentalNo, and each of the non-prime attributes depends on it.
I think that there is a transitive dependency on it. You can find OwnerNo through the PropNo. So, to make it comply with 3NF rules, you have to move the OwnerNo to its own table to create these tables:
Customers(CustNo,CustName)
Property(PropNo,Paddress,City,Rent)
Owners(OwnerNo,OwnerName)
Rentals(RentalNo,CustNo,PropNo,RentStart,RentFinish)
Is this correct? I read that at the 1NF stage, you are supposed to remove repetitive columns (ex. OwnerNo,OwnerName). Is this true? Why or why not?
We don't normalize to a NF (normal form) by going through the lower NFs between it and 1NF. We use a proven algorithm for the NF we want. Find one in a published academic textbook. (If that doesn't describe the reference(s) you were told to use, find one that does & quote it.)
Pay close attention to the terms and steps. Details matter. Eg you will need to know all the FDs (functional dependencies) that hold, not just some of them. Eg whenever some FDs hold, all the ones generated by Armstrong's axioms hold. Eg PKs (primary keys) are irrelevant, CKs (candidate keys) matter. Eg every table has a CK. Eg normalization to higher NFs does not change column names. So already your question does not reflect a correct process.
You really need to read & quote the reference(s) you were told to use in order to get to "1NF", because "1NF" is in the eye of the beholder. Normalization to higher NFs works on any relation.
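As one example of such an algorithm, here is a hedged Python sketch of textbook-style 3NF synthesis (minimal cover, one relation per determinant, plus a key relation if needed). The FD list is an assumption based on a plausible reading of the question: rent dates depend on customer and property together, and the owner name depends on the owner number. It is not a confirmed set of FDs for this table, and the function names are made up for illustration.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: one attribute per right-hand side.
    cover = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    # Step 2: drop left-hand attributes that are not needed.
    reduced = []
    for lhs, rhs in cover:
        for attr in sorted(lhs):
            if rhs <= closure(lhs - {attr}, cover):
                lhs = lhs - {attr}
        reduced.append((lhs, rhs))
    # Step 3: drop FDs that follow from the remaining ones.
    result = list(dict.fromkeys(reduced))
    for item in list(result):
        rest = [f for f in result if f != item]
        if item[1] <= closure(item[0], rest):
            result = rest
    return result

def find_key(attrs, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.discard(a)
    return key

def synthesize_3nf(attrs, fds):
    cover = minimal_cover(fds)
    groups = {}
    for lhs, rhs in cover:  # one schema per distinct determinant
        groups.setdefault(lhs, set(lhs)).update(rhs)
    schemas = list(groups.values())
    if not any(closure(s, cover) == set(attrs) for s in schemas):
        schemas.append(find_key(attrs, cover))  # keep the join lossless
    return [s for s in schemas if not any(s < t for t in schemas)]

ATTRS = ["CustNo", "CustName", "PropNo", "PAddress", "RentStart",
         "RentFinish", "Rent", "OwnerNo", "OName"]
FDS = [  # assumed FDs, see the note above
    (["CustNo"], ["CustName"]),
    (["PropNo"], ["PAddress", "Rent", "OwnerNo", "OName"]),
    (["OwnerNo"], ["OName"]),
    (["CustNo", "PropNo"], ["RentStart", "RentFinish"]),
]
for schema in synthesize_3nf(ATTRS, FDS):
    print(sorted(schema))
```

Under these assumed FDs the result has four relations, and the property relation keeps OwnerNo as its link to the owner relation. No 1NF-to-2NF step is involved, and no column is renamed or invented.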

Can a 2NF database already be in 3NF?

I'm doing a homework question where I have to convert a database to 1NF, 2NF and 3NF. I have gotten to 2NF and it does not appear to have any transitive dependencies. Does that mean that it is already in 3NF?
Yes. When a relation (variable or value) is in a given normal form it can also be in higher normal forms at the same time. (But beware that sometimes people sloppily say that a relation is in a given normal form but they mean that it's in that normal form but also not any higher one.)
Being in a normal form is a property of a relation. The way they are named, 1-2-3-BCNF-4-5 are stricter and stricter conditions. So when a relation meets one of those conditions it meets all the preceding ones and it might meet later ones. You happen to have a 2NF relation that is also a 3NF relation. Or to put that another way, you have a 3NF relation that, like every 3NF relation, is also in 2NF. You just happened to notice that it was in 2NF before you noticed it was in 3NF.
Yes, unless you missed a transitive functional dependency.

Identifying the Boyce Codd Normal Form

I'm trying to get my head around the differences between 3NF and BCNF and I think I'm getting there but it would be great if anyone can help out.
The following is a series of relations in the 3rd normal form (helpfully stolen from Identifying Functional Dependencies which in turn took them from Connolly & Begg's Database Systems):
Client {clientNo(PK), clientName}
Owner {ownerNo(PK), ownerName}
Property {propertyNo (PK), propertyAddress, rent}
ClientRental {clientNo(PK), propertyNo(PK), rentStart, rentFinish, ownerNo(FK)}
Each property has only one owner and clients can rent those properties. Assume rent is fixed for each property.
So my question is: Are these also in the BCNF?
My hunch is the ClientRental relation is not because PropertyNo->ownerNo. So PropertyNo is a determinant in a functional dependency but it isn't a superkey.
Am I anywhere near the right ballpark?
The short, informal way to express the difference is that, in BCNF, every "arrow" for every functional dependency is an "arrow" out of a candidate key. For a relation that's in 3NF, but isn't in BCNF, there will be at least one "arrow" out of something besides a candidate key.
Wikipedia entry for 3NF table not meeting BCNF
A common misconception is that you can normalize to 2NF and no higher, then to 3NF and no higher, then to BCNF and no higher. In fact, fixing a partial key dependency in order to reach 2NF often leaves you with all relations in 5NF. That is, you went from one relation in 2NF to multiple relations in 5NF without stopping at BCNF in between.
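For the ClientRental relation in the question, a closure-based BCNF decomposition can be sketched as follows. The FD list is my reading of the question (the composite key determines everything, and propertyNo determines ownerNo), and `bcnf_decompose` is a hypothetical helper that tries every attribute subset, so it is only meant for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_decompose(attrs, fds):
    """Split on any determinant that is not a superkey of the current schema."""
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for combo in combinations(sorted(attrs), size):
            x = frozenset(combo)
            x_plus = closure(x, fds) & attrs  # closure restricted to this schema
            if x_plus != x and x_plus != attrs:
                # x determines something here but not everything: BCNF violation
                left = bcnf_decompose(x_plus, fds)
                right = bcnf_decompose(x | (attrs - x_plus), fds)
                return left + right
    return [attrs]

CLIENT_RENTAL = ["clientNo", "propertyNo", "rentStart", "rentFinish", "ownerNo"]
FDS = [  # assumed from the question's description
    (["clientNo", "propertyNo"], ["rentStart", "rentFinish", "ownerNo"]),
    (["propertyNo"], ["ownerNo"]),
]
parts = bcnf_decompose(CLIENT_RENTAL, FDS)
for part in parts:
    print(sorted(part))
```

With these FDs the only "arrow" that does not start at a candidate key is propertyNo -> ownerNo, so the sketch splits off (propertyNo, ownerNo) and leaves (clientNo, propertyNo, rentStart, rentFinish).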

Resources