I am having a hard time understanding third normal form (3NF).
3NF: 2NF + no transitions
So, for example, if I have:
A -> B
B -> C
Then the above is sort of a transition relation and hence won't be in 3 NF?
Am I understanding it correctly?
But the answer by paxdiablo to "What exactly does database normalization do?" says:
Third normal form (3NF) - 2NF and every non-key column in a table depends on nothing but the key.
According to this, it will be in 3NF. Where am I going wrong?
A relation is in 3NF if it is in 2NF and:
either each attribute depends on a key,
or, if an attribute depends on a non-key, then it is prime.
(being prime means that it belongs to a key).
See for instance Wikipedia.
A relation is in Boyce-Codd normal form if only the first condition holds, that is:
each attribute depends on a key
So, in your example, if the relation has only three attributes A, B and C and the two dependencies, it is not in 3NF, since C is not prime and depends on B, which is not a key. On the other hand, if there are other attributes, and C is a key or part of a key, then it could be in 3NF (but this depends on the other functional dependencies, which should satisfy the above conditions).
The 2NF says that each non-prime attribute depends on each whole candidate key, and not on part of it. For instance, if a relation has attributes A, B and C, the only key is AB, and B -> C, then this relation is not in 2NF.
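The conditions above can be checked mechanically. The following is a minimal Python sketch, not a reference implementation: the helper names (closure, candidate_keys, is_3nf, is_bcnf) are invented for this illustration, attributes are single letters, and it assumes the listed dependencies are the only ones specified for the relation.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys

def is_bcnf(attributes, fds):
    """Every given FD is trivial or has a superkey as determinant."""
    return all(set(rhs) <= set(lhs) or closure(lhs, fds) == set(attributes)
               for lhs, rhs in fds)

def is_3nf(attributes, fds):
    """Every given FD has a superkey determinant or adds only prime attributes."""
    prime = {a for key in candidate_keys(attributes, fds) for a in key}
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue
        if not (set(rhs) - set(lhs)) <= prime:
            return False
    return True

# The example from the question: R(A, B, C) with A -> B and B -> C.
R = "ABC"
F = [("A", "B"), ("B", "C")]
keys = candidate_keys(R, F)
print(keys, is_3nf(R, F), is_bcnf(R, F))
```

For this example the only candidate key is A, so C is not prime and B -> C breaks both checks.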
The two-part 3NF definition you are reaching for is:
2NF holds and every non-prime attribute of R is non-transitively dependent on every superkey. (X transitively determines Z when there's a Y where X → Y and Y → Z and not Y → X.)
The other definition of 3NF is:
For every non-trivial FD X → Y, either X is a superkey or the attributes in Y but not in X are prime. (X → Y is trivial when X contains Y.)
Then BCNF is:
For every non-trivial FD X → Y, X is a superkey
See this answer.
If your example's only columns are A, B and C and your two FDs form a minimal cover, then the only candidate key is A. C is non-prime and depends on the non-superkey B, so the relation is not in 3NF (or BCNF).
You are (mis)using terms so sloppily that your sentences don't mean anything. Learn the terms and how they are used in their definitions to refer to various things and use them that way in reference to appropriate things. And get your definitions from a (reputable) textbook.
I have a relation R(a,b,c,d) where (a,b) is a primary key so I have the determinant a,b -> c,d.
In addition to that I have the following determinants: a,c -> b,d and a,d -> b.
Let's accept that this relation is in 3NF.
I'm wondering whether it's in BCNF or not. I was using a definition of BCNF that says:
a relation is in BCNF if it's in 3NF and there's no determinant X -> Y such that X consists of non-key attributes and Y is part (or all) of the key
which is not applicable in my case for the determinant a,d -> b for example. Another definition is that
A relation, R, is in BCNF iff for every nontrivial FD (X->A) satisfied
by R the following condition is true:
(a) X is a superkey for R
which left me undecided insofar as in a,d->b it's clear that (a,d) is not a superkey (nor a key), but we have (a,d) clearly a key to the relation R!
So, my question is:
Is the relation R in BCNF or not, and why?
And what's the right process to determine whether a relation is or isn't in BCNF?
About terminology
You say:
I have the determinant a,b -> c,d
This is wrong terminology. a,b -> c,d is a functional dependency (sometimes abbreviated FD), which has a determinant a,b (sometimes called the left-hand side, LHS, of the FD) and a determinate c,d (sometimes called the right-hand side, RHS, of the FD). This terminology is used because the values of the attributes a,b uniquely determine the values of the attributes c,d.
About the key
The information that:
(a,b) is a primary key
can be irrelevant when normalizing a relation if you have enough information about the functional dependencies. From those dependencies you can calculate the candidate keys: sets of attributes that uniquely determine all the attributes of the relation and from which you cannot remove any attribute while keeping this property (in other words, minimal sets of attributes that uniquely determine all the attributes of the relation). The information about the primary key can be relevant when you have only partial information about the functional dependencies that hold in a relation, but in your case all the (candidate) keys can be derived from the functional dependencies.
In your example, for instance, there are three candidate keys:
1. a, b
2. a, c
3. a, d
You can verify this fact by computing the closure of the attributes of a candidate key to see if it contains all the attributes. For instance, let's try to calculate the closure of a,d (called a,d *):
1. a,d * = a,d
2. a,d * = a,d,b (since a, d -> b)
3. a,d * = a,d,b,c (since a, b -> c, d)
So a,d is a candidate key (which is also a superkey, i.e. a set of attributes that determines all the attributes of the relation).
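The three closure steps above can be reproduced with a short Python sketch. The function name closure_steps is hypothetical, and the dependencies are written as pairs of strings of single-letter attribute names.

```python
def closure_steps(attrs, fds):
    """Return the successive attribute sets reached while closing attrs."""
    current = set(attrs)
    steps = [sorted(current)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD only when its determinant is covered and it adds something.
            if set(lhs) <= current and not set(rhs) <= current:
                current |= set(rhs)
                steps.append(sorted(current))
                changed = True
    return steps

# Dependencies from the question: a,b -> c,d ; a,c -> b,d ; a,d -> b
F = [("ab", "cd"), ("ac", "bd"), ("ad", "b")]
steps = closure_steps("ad", F)
for number, step in enumerate(steps, start=1):
    print(number, ",".join(step))
```

Running the same function on "a" or on "d" alone never gets past the starting set, which is why no attribute can be dropped from a,d.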
About the BCNF
There are different definitions of BCNF. Using for instance the second one that you cited, all the three dependencies have a determinant which is a candidate key (and so a superkey), and so the relation is in BCNF.
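As a quick check of that conclusion, one could list the dependencies whose determinant is not a superkey. This is only a sketch under the assumption that the three dependencies are all that is given; the name bcnf_violations is made up for this example.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial given FDs whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and closure(lhs, fds) != set(attributes)]

R = "abcd"
F = [("ab", "cd"), ("ac", "bd"), ("ad", "b")]
violations = bcnf_violations(R, F)
print(violations)  # an empty list means no given FD violates BCNF
```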
My doubt is about a given set of functional dependencies F = { AE -> BCD, B -> E }. Is this in BCNF or 3NF? It's a question from a test I took recently and I would say that it is in 3NF, but my teacher said it's neither 3NF nor BCNF. (I believe that is an error.)
I have obtained as candidate keys AE and AB, and as in the first functional dependency the left side is a candidate key and in B -> E, E is contained in a candidate key, so it is in 3NF.
Is this in BCNF, 3NF or neither?
Assuming that all the attributes of the relation are A B C D and E, and that the only dependencies given are the two described (F), you are correct. Since the (only) candidate keys are indeed A E and A B, and since the functional dependency B → E has a determinant which is not a superkey, the relation is not in BCNF. Given one of the definitions of BCNF: “for all the non-trivial dependencies X → Y of F+, X is a superkey”, there is a theorem that shows that a necessary and sufficient condition for this is that the property of being a superkey holds for all the dependencies in F.
On the other hand, since E is a prime attribute, i.e. an attribute of a candidate key, the dependency B → E does not violate 3NF, so the relation is in 3NF. This, again, given one of the definitions of 3NF: “for all the non-trivial dependencies X → A in F+, X is a superkey or A is a prime attribute”, is due to a theorem that says that this condition is equivalent to checking, “for each functional dependency X → A1,...,An in F, and for each i in {1..n}, either Ai belongs to X, or X is a superkey, or Ai is prime”. And this is satisfied by the two dependencies of F.
You need to use a definition of a NF when you claim/show that a relation is in it.
You don't actually say what all the attributes are. I'll assume the attributes are A through E. Otherwise, the CKs (candidate keys) are not what you say.
You are right in your argument against BCNF. You are using the definition that all determinants of non-trivial FDs (functional dependencies) are superkeys. You found a counterexample FD B → E.
If it were an either-or question re BCNF vs 3NF you could stop there.
in the first functional dependency the left side is a candidate key and in B -> E, E is contained in a candidate key
You don't show that the table meets the conditions of either of the following definitions (from Wikipedia, and they happen to be correct) that a table is in 3NF if and only if:
1. both of the following conditions hold:
   - The relation is in 2NF
   - Every non-prime attribute is non-transitively dependent on every [candidate] key
2. for each of its functional dependencies X → A, at least one of the following conditions holds:
   - X contains A
   - X is a superkey
   - each attribute in A-X is prime
You seem to be using definition 2 (but not saying so). You show bullet 2 holds for AE → BCD. Pointing out that E is prime in B → E seems to be part of showing that E-B is all prime. But you need to show every FD satisfies a bullet. Note that more FDs hold than the given ones. Armstrong's axioms tell you what all the FDs are.
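For a relation this small, every FD that holds can be examined by brute force instead of reasoning only about the given ones. The sketch below is one possible way to do it in Python, assuming the attributes are exactly A through E; it enumerates every attribute set X, uses the closure of X to obtain each non-trivial X → A in F+, and records which ones break BCNF or 3NF. All names in it are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def all_subsets(attributes):
    """Yield every subset of attributes, smallest first."""
    items = sorted(attributes)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

R = set("ABCDE")
F = [("AE", "BCD"), ("B", "E")]

superkeys = [X for X in all_subsets(R) if closure(X, F) == R]
keys = [X for X in superkeys if not any(Y < X for Y in superkeys)]
prime = set().union(*keys)

bcnf_violations = []
nf3_violations = []
for X in all_subsets(R):
    x_closure = closure(X, F)
    if x_closure == R:
        continue  # X is a superkey, so no FD with determinant X can violate
    for A in sorted(x_closure - X):  # each non-trivial X -> A in F+
        fd = ("".join(sorted(X)), A)
        bcnf_violations.append(fd)
        if A not in prime:
            nf3_violations.append(fd)

print(keys, bcnf_violations, nf3_violations)
```

Every violating dependency found this way has E on the right-hand side, and E is prime, so the 3NF list stays empty while the BCNF list does not.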
In practice it can be easier to show a schema is in 3NF by applying a 3NF algorithm.
I have a question regarding functional dependencies.
I understand that functional dependency means that the value of an attribute can be determined by the value of another attribute.
Suppose we have this table
|A|B|C|D|
Here A and B are the primary keys.
Is it correct to say that both C and D are functionally dependent on both A and B ?
You are saying “A and B are the primary keys”, but this phrase is ambiguous. Do you mean “the primary key is A B” or “there are two candidate keys, A and B”? (Note that a relation in a relational database can have only a single primary key, but many candidate keys.)
Given the definition of a (candidate) key, that is that it determines all the other attributes and that you cannot remove any attribute without losing this property, in the first case you can say that:
A B -> C D
or, which is equivalent, that:
A B -> C
A B -> D
(so C and D depend on the combination of A and B), while in the second case, you have that:
A -> C D
B -> C D
or, which is equivalent, that:
A -> C
A -> D
B -> C
B -> D
(that is, C and D are functionally dependent both on A and on B).
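The difference between the two cases can be seen on sample data. The sketch below is only an illustration: the rows are invented, and the helper name determines is hypothetical. It tests whether equal values on the left-hand attributes always come with equal values on the right-hand attributes. Note that a functional dependency is a constraint on every valid state of the table, so a single set of rows can refute a dependency but cannot prove it.

```python
def determines(rows, lhs, rhs):
    """True if rows never pair one lhs value with two different rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Invented sample rows in which the pair (A, B) is unique.
rows = [
    {"A": 1, "B": 1, "C": "x", "D": "p"},
    {"A": 1, "B": 2, "C": "y", "D": "q"},
    {"A": 2, "B": 1, "C": "y", "D": "q"},
]

print(determines(rows, ["A", "B"], ["C", "D"]))  # consistent with A B -> C D
print(determines(rows, ["A"], ["C", "D"]))       # A alone does not determine C D here
print(determines(rows, ["B"], ["C", "D"]))       # neither does B alone
```

These rows fit the first reading (the key is the combination A B) and rule out the second one (A and B each being a key).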
"S (functionally) determines T" means that all appearances of a particular subtuple value for attribute set S have the same subtuple value for attribute set T. If we say an attribute X is determining or determined then it's understood that we really mean that set {X} is determining/determined.
A superkey is a set of attributes that determines every attribute. A CK (candidate key) is a superkey that contains no smaller superkey. There can be many CKs. One CK can be chosen as PK (primary key). (PKs play no role in relational theory.)
Since there can only be one PK, it's odd that you talk about a relation value or variable having more than one. Maybe you mean two CKs. Maybe you mean a 2-attribute PK.
It happens that if every subtuple value for a set of attributes appears just once then it is a superkey. (Each single-attribute superkey is a CK unless {} is the CK, which happens when the relation is limited to one tuple.) So it determines all attributes. But in general the dependencies tell us what the superkeys & CKs are.
So if each of A and B is a CK then each determines C and D, ie {C} and {D}. And if {A,B} is a PK then it determines C and D, ie {C} and {D}. It happens that if both T1 and T2 are determined by S then T1 U T2 is too. So either way, the CK(s) here determine(s) {C,D} also.
PS There is an ambiguity in English where it is not clear whether "both C and D are functionally dependent" means that C is dependent and D is dependent or that {C,D} is dependent. Similarly for "are functionally dependent on both A and B". So it is clearer to say "the set ..." rather than just using "both" and/or "and".
Let R be a relation with schema R(X,Y,Z)
and its FDs are
{XY -> Z, Z -> Y}
I am not able to decompose it into BCNF,
because r1(Z,Y), r2(Z,X) will lose the FD XY -> Z, and
R(X,Y,Z) itself is not a solution, since Z -> Y shows that Z should be a key.
How can I do this?
Not every conversion into BCNF is dependency-preserving.
We only need to give a counterexample. Consider the following schema:
R(a, b, c) with ab->c and c->b
Clearly the above schema is in 3NF, because ab->c has a superkey as its determinant and, for c->b, we
can see that b-c=b, which is a subset of a candidate key (such a dependency is also allowed in 3NF).
But the above schema is not in BCNF, because c->b is neither trivial nor a dependency whose determinant is a superkey.
So we decompose the above schema, keeping the decomposition lossless.
The only possible lossless decomposition is ac and cb (because their intersection c is a key of the second table).
But clearly the dependency ab->c is lost.
Hence, proved.
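Both claims in this counterexample can be checked with a small Python sketch. It uses two standard tests as I understand them: a binary decomposition is lossless when the common attributes determine all of one side, and a dependency is preserved when its right-hand side can be reached by repeatedly closing within each sub-schema. The function names are invented for this illustration.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """The shared attributes must determine all of r1 or all of r2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

def is_preserved(fd, decomposition, fds):
    """Check fd using only dependencies that live inside one sub-schema."""
    lhs, rhs = fd
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            gained = closure(reached & set(schema), fds) & set(schema)
            if not gained <= reached:
                reached |= gained
                changed = True
    return set(rhs) <= reached

F = [("ab", "c"), ("c", "b")]
decomposition = ["ac", "cb"]

print(is_lossless_binary("ac", "cb", F))          # the split on c
print(is_preserved(("ab", "c"), decomposition, F))
print(is_preserved(("c", "b"), decomposition, F))
```

The other two ways of splitting the three attributes into two pairs (ab with bc, ab with ac) fail the lossless test, which matches the statement that ac and cb is the only option.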
I'm trying to produce a 3NF and BCNF decomposition of a schema. I have been looking at the algorithms but I am very confused at how to do this.
If I have my minimal cover say: F' = {A->F, A->G, CF->A, BG->C} and I have identified one candidate key for the relation, say it is A. Then what exactly do I do?
I have been looking at examples, one which has the following:
F = {A → AB,A → AC,A → B,A → C,B → BC}
Minimal cover: F′ = {A → B,B → C}
And the final result was: (AB,A → B), (BC,B → C). How did they get to this?
If I have my minimal cover say: F' = {A->F, A->G, CF->A, BG->C} and I
have identified one candidate key for the relation, say it is A. Then
what exactly do I do?
F' is not a minimal cover under the convention that each determinant appears only once: you have to combine A->F and A->G into A->FG.
Even worse, A cannot be a candidate key given F', since B does not belong to the closure of A. A possible candidate key would be AB.
For 3NF you start with creating tables for each one of the dependencies in F', i.e.,
R1(A,F,G) R2(A,C,F) R3(B,C,G)
Next you check whether one of the tables contains a candidate key. Since B appears only on the left side of the dependencies, B must be part of every candidate key. The only table with B is R3 and it does not contain a candidate key (check it!). Hence, we add a new table R4 with a candidate key as attributes
R4(A,B)
Finally, we check whether the set of attributes of one of the tables is contained in the set of attributes of another table. This is not the case for our running example.
Hence, our 3NF decomposition is
R1(A,F,G) R2(A,C,F) R3(B,C,G) R4(A,B)
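The three steps just described can be sketched in Python as follows. This is an illustrative outline of the synthesis procedure, not a definitive implementation: it assumes the input is already a minimal cover, the function names are made up, and the key it adds is simply the first smallest superkey found, which happens to be AB here (other candidate keys exist).

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def first_candidate_key(attributes, fds):
    """Smallest attribute set (in sorted order) whose closure is everything."""
    items = sorted(attributes)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            if closure(combo, fds) == set(attributes):
                return set(combo)

def synthesize_3nf(attributes, cover):
    # Step 1: one table per determinant, merging FDs that share a left side.
    grouped = {}
    for lhs, rhs in cover:
        grouped.setdefault(frozenset(lhs), set()).update(rhs)
    schemas = [set(lhs) | rhs for lhs, rhs in grouped.items()]
    # Step 2: add a candidate key if no table already contains one.
    if not any(closure(s, cover) == set(attributes) for s in schemas):
        schemas.append(first_candidate_key(attributes, cover))
    # Step 3: drop tables contained in another table (and exact duplicates).
    result = []
    for s in schemas:
        if not any(s < t for t in schemas) and s not in result:
            result.append(s)
    return result

cover = [("A", "F"), ("A", "G"), ("CF", "A"), ("BG", "C")]
tables = synthesize_3nf("ABCFG", cover)
print([sorted(t) for t in tables])
```

Applied to the other example from the question, the cover {A → B, B → C} over A, B, C, the same function yields the tables AB and BC, and no extra key table is needed because AB already contains the key A.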
For BCNF you start with R(A,B,C,F,G) and look for BCNF violations.
For instance A->FG is a violation of BCNF because this dependency is not trivial and A is not a superkey. Hence we split R into
R1(A,F,G) and R2(A,B,C)
None of the relations obtained contains BCNF violations, so the process stops here.
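A rough Python sketch of this decomposition is shown below. It assumes the usual recursive approach: find an attribute set X inside the current table that determines something there without being a superkey of that table, split on it, and repeat. The function names are hypothetical, and the result depends on which violation is found first; with attributes tried in alphabetical order, the first one is A -> FG, as in the text.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(schema, fds):
    """Return (X, determined) for some X that breaks BCNF in schema, or None."""
    items = sorted(schema)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            x = set(combo)
            x_closure = closure(x, fds)
            determined = (x_closure & schema) - x
            # Violation: X determines something in the table but not all of it.
            if determined and not schema <= x_closure:
                return x, determined
    return None

def decompose_bcnf(schema, fds):
    violation = find_violation(schema, fds)
    if violation is None:
        return [schema]
    x, determined = violation
    # Split into X plus what it determines, and the rest together with X.
    return (decompose_bcnf(x | determined, fds)
            + decompose_bcnf(schema - determined, fds))

F = [("A", "F"), ("A", "G"), ("CF", "A"), ("BG", "C")]
tables = decompose_bcnf(set("ABCFG"), F)
print([sorted(t) for t in tables])
```

Unlike the 3NF result, this decomposition does not keep CF -> A or BG -> C inside a single table, which illustrates the dependency-preservation trade-off of BCNF.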