I am not clear on the concept of normalization. The problem below has a cycle in the FDs among the prime attributes (PNo -> PName -> PNo), so which normal form is the relation in? Do we have to consider both keys when checking the normal form?
Suppose we have WORKS_ON as follows: WORKS_ON(ESSN, PNo, PName*, Hours)
FDs (suppose):
{ESSN, PNo} --> Hours
{ESSN, PName} --> Hours
PNo --> PName
PName --> PNo
Keys: {ESSN, PNo} and {ESSN, PName}
You have to consider every candidate key when you're evaluating FDs and determining normal forms.
For example, if a non-prime attribute is dependent on just part of any candidate key, the relation is not in 2NF. Think about it for a minute. It wouldn't make logical sense for the normal form to depend on which candidate key you chose, would it? Because then you could "change" the normal form just by evaluating against different candidate keys.
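A minimal sketch of that idea in Python, applied to the WORKS_ON relation from the question. The function names (`closure`, `candidate_keys`, `highest_normal_form`) are my own and not from any library, and the check only looks at the FDs as given, which is a common shortcut for testing a whole relation. Every candidate key is found first, and the prime attributes are taken from all of them.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Enumerate every minimal attribute set whose closure is all attributes."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def highest_normal_form(attributes, fds):
    """Classify the relation as 1NF, 2NF, 3NF or BCNF using ALL candidate keys."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    nontrivial = [(lhs, rhs) for lhs, rhs in fds if not rhs <= lhs]
    # BCNF: every nontrivial FD has a superkey on the left.
    if all(closure(lhs, fds) == attributes for lhs, _ in nontrivial):
        return "BCNF"
    # 3NF: the left side is a superkey, or every new attribute on the right is prime.
    if all(closure(lhs, fds) == attributes or (rhs - lhs) <= prime
           for lhs, rhs in nontrivial):
        return "3NF"
    # 2NF: no non-prime attribute depends on a proper part of ANY candidate key.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if (closure(part, fds) - set(part)) - prime:
                    return "1NF"
    return "2NF"

WORKS_ON = {"ESSN", "PNo", "PName", "Hours"}
WORKS_ON_FDS = [
    ({"ESSN", "PNo"}, {"Hours"}),
    ({"ESSN", "PName"}, {"Hours"}),
    ({"PNo"}, {"PName"}),
    ({"PName"}, {"PNo"}),
]

print(candidate_keys(WORKS_ON, WORKS_ON_FDS))
print(highest_normal_form(WORKS_ON, WORKS_ON_FDS))
```

Under the stated FDs this reports both keys and classifies WORKS_ON as 3NF: PNo and PName are both prime, so the PNo/PName cycle does not break 2NF or 3NF, but neither PNo nor PName is a superkey, so BCNF fails.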
I was doing homework for DB class.
One of the questions still bugs me, even though I got the answer (I think).
The question was simple.
Given the relation R(A,B,C,D,E) and the functional dependencies F = {AB -> C, D -> E, DE -> B}:
1. Is R in 2NF?
2. Is R in 3NF?
3. Is R in BCNF?
I thought since there's no A and D on right-hand side of all FDs in F, A and D must be part of Candidate keys.
So I checked the Closure of AD, and I got AD+ : {A,B,C,D,E}.
That means that AD is super key.
Also, since both A and D must be part of Candidate key and AD cannot be reduced(no closure of subset of AD is {A,B,C,D,E}), AD is a candidate key and only possible candidate key. (Am I doing this right?)
With candidate key AD, D->E is partial dependence on candidate key AD.
So it violates the condition of 2NF.
Does the FD DE -> B also violate 2NF?
If so, why?
Is it because D -> E gives us D -> DE, so DE -> B is equivalent to D -> B, and it is D -> B that violates 2NF?
Or
does DE -> B itself violate 2NF, without any conversion, because D appears on its left-hand side?
I get really confused by an FD XY -> Z where X is part of a candidate key and Y and Z are non-prime attributes,
because I can't say whether it violates 2NF or not. I think it does, but I can't clearly say why.
I've been looking for examples and explanations and clips for hours but I haven't got any satisfying answer.
It would be fine if I didn't care about the specific reason and just wanted the credit, but I can't bear to take that attitude.
Also, since both A and D must be part of Candidate key and AD cannot be reduced(no closure of subset of AD is {A,B,C,D,E}), AD is a candidate key and only possible candidate key. (Am I doing this right?)
Yes
Is this D->B is violating 2NF ?
Yes. B is a non-prime attribute, D is a proper subset of the candidate key, and D -> B holds because it is implied by D -> E and DE -> B. A relation in 2NF cannot have a dependency whose determinant is a proper subset of a candidate key and whose dependent attribute is non-prime.
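The reasoning can be checked mechanically with attribute closures. This is only a sketch in Python; the helper name `closure` and the variable names are mine, and the FDs are the ones given in the question.

```python
def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

R = set("ABCDE")
FDS = [("AB", "C"), ("D", "E"), ("DE", "B")]

# AD determines everything, but neither A nor D alone does, so AD is a key.
ad_closure = closure("AD", FDS)
a_closure = closure("A", FDS)
d_closure = closure("D", FDS)

key = set("AD")
non_prime = R - key

# For each proper part of the key, list the non-prime attributes it determines.
# Any non-empty list is a partial dependency, i.e. a 2NF violation.
partial = {part: sorted((closure(part, FDS) - {part}) & non_prime)
           for part in sorted(key)}

print(sorted(ad_closure))  # closure of AD
print(partial)             # non-prime attributes determined by A and by D
```

D alone determines both E and B, so D -> B is an implied dependency, and it is D -> E and D -> B, rather than DE -> B as written, that show the partial dependence on the key AD.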
I have a set of functional dependencies:
V = {A, B, C, D, E, F}, F = {AB → CD, ABDE → F, BC → A, C → DF}
Candidate keys are: {ABE, BCE}
Canonical cover is: {AB→ C, BC→ A, C→ DF} [This is what I think, might be wrong]
However, as you can see, E, an attribute of every candidate key, does not appear in my canonical cover, and as far as I know candidate keys should be the same in the canonical cover.
If you consider the augmentation rule from Armstrong's axioms, we can say it is correct, but I am confused. Does attribute E have to be represented in the canonical cover?
You say:
as far as I know candidate keys should be the same in the canonical cover
This is not true. An attribute that does not appear on the right-hand side of any functional dependency in a canonical cover must be present in every candidate key: it cannot be derived from any other set of attributes, and a candidate key must determine all the attributes. Your canonical cover and candidate keys are correct.
Note that an attribute that appears in no functional dependency at all (neither on the left nor on the right), like E in your example, is a special case of the above: it does not appear on any right-hand side, so it must be present in every candidate key.
Finally, note that this can be considered a “symptom” of something wrong in the relation and in fact the schema is not in 3NF or BCNF.
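For anyone who wants to verify the cover, here is a rough Python sketch of the textbook minimal-cover procedure (split right-hand sides, drop extraneous left-hand attributes, drop redundant FDs, regroup). The function names are my own, and the result of such a procedure can depend on the order in which FDs are processed, so treat it as one possible cover rather than the only one.

```python
def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split every right-hand side into single attributes.
    work = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop left-hand attributes that are not needed to derive the right side.
    for i, (lhs, a) in enumerate(work):
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and a in closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, a)
    # Step 3: drop FDs that follow from the remaining ones.
    i = 0
    while i < len(work):
        lhs, a = work[i]
        rest = work[:i] + work[i + 1:]
        if a in closure(lhs, rest):
            work = rest
        else:
            i += 1
    # Step 4: regroup FDs that share a left-hand side.
    cover = {}
    for lhs, a in work:
        cover.setdefault("".join(sorted(lhs)), set()).add(a)
    return {lhs: "".join(sorted(rhs)) for lhs, rhs in cover.items()}

V = set("ABCDEF")
F = [("AB", "CD"), ("ABDE", "F"), ("BC", "A"), ("C", "DF")]

cover = minimal_cover(F)
# Attributes that appear on no right-hand side must be in every candidate key.
in_every_key = V - set().union(*(set(rhs) for _, rhs in F))

print(cover)
print(sorted(in_every_key))
```

With the FDs from the question this reproduces the cover {AB → C, BC → A, C → DF}, and it reports B and E as attributes that no dependency can derive, which is why both appear in ABE and in BCE.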
Well, when I try to do Bernstein synthesis on this relation (ABCDEF), I have to use the basis {AB→C, BC→A, C→DF}. The relations formed from the functional dependencies, R1(ABC) and R2(CDF), contain no candidate key, so I need to add one. I was wondering whether E has to be added, since the basis doesn't contain E and the synthesis works from the basis. That's why I was a little confused. But I think we need to add E, since we are synthesizing from the original R(ABCDEF), so the result should be R1(ABC), R2(CDF) and R3(ABCE). R3 contains all candidate keys.
Consider a database relation of student records as follows:
Student (I,G,P,M,S,Y,E,L,R,C)
(a) Show how to derive two candidate keys for Student, or justify why you cannot do so.
(b) What normal form is Student in? Show working that justifies your answer.
(c) If F contained MSY→LRCE instead of PMSY→LRCE, what would this imply about paper names? (i.e., the values of M)
(d) Find a minimal cover (i.e, an irreducible set of functional dependencies) for Student.
(e) Find a decomposition of Student into third normal form (3NF).
I am stuck on the first question, about the candidate keys. I know that the candidate keys must be a subset of (I,P,M,S,Y,L,R), since these appear on the left-hand side of the functional dependencies above and determine all of the remaining attributes. We can remove M, which is determined by P, but then I got confused about how to reduce these attributes to a minimal set, especially with complex functional dependencies such as PMSY→LRCE. Thanks for any solutions and suggestions.
I won't do your homework but as a hint on (a);
F:IGPMSYELRC->IGPMSYELRC
always holds. By virtue of F:P->M you can remove M and get
F:IGPSYELRC->IGPMSYELRC
now apply F:R->C to get
F:IGPSYELR->IGPMSYELRC .
Repeat this until you cannot remove any attributes from the left-hand side.
Then you have a candidate key.
Trying the removals in a different order may yield other candidate keys.
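The same procedure as a small Python sketch. The Student dependencies are not listed in this thread, so the example below runs on an assumed stand-in: the relation ABCDEF with F = {AB → CD, ABDE → F, BC → A, C → DF}. The names `closure` and `reduce_to_key` are mine. Start from all attributes and drop each one whose removal still leaves a superkey.

```python
def closure(attrs, fds):
    """Return the attribute closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def reduce_to_key(attributes, fds, order):
    """Shrink the trivial superkey (all attributes) to a candidate key.

    `order` is the sequence in which attributes are tried for removal.
    An attribute is dropped only if the rest still determines everything.
    """
    key = set(attributes)
    for attr in order:
        if closure(key - {attr}, fds) == set(attributes):
            key.discard(attr)
    return key

ATTRS = "ABCDEF"
FDS = [("AB", "CD"), ("ABDE", "F"), ("BC", "A"), ("C", "DF")]

key_1 = reduce_to_key(ATTRS, FDS, "ABCDEF")  # try to drop A first
key_2 = reduce_to_key(ATTRS, FDS, "CABDEF")  # try to drop C first

print(sorted(key_1))
print(sorted(key_2))
```

One pass is enough to reach a minimal key, because an attribute that could not be removed from a larger set cannot be removed from a smaller one either. The two removal orders end at different candidate keys, which is the point of the last hint.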
I have a problem with the second normal form. The rule says: “A relation is in second normal form when it is in 1NF and there is no such non-key attribute that depends on part of the candidate key, but on the entire candidate key.” (Neeraj Sharma, 2010) My problem is with "the candidate key". Does it mean only the primary key of the relation, or every possible candidate key?
Thank you for your help
It counts for any candidate key. If it counted only for the primary key, simply adding a surrogate id would be enough to put any table into 3NF. However, that wouldn't help to ensure that each fact is recorded once only and independent of other facts.
Trying to clear your doubt by an example:
According to 2NF "Partial Dependencies are not allowed in a relation."
Assume this relation: R(A,B,C,D)
Let's suppose this relation has two candidate keys: AB and AC. (B alone cannot be listed as a third one: if B were a key, AB would be a superkey but not a candidate key.)
First write down all the attributes that are present in any of the candidate keys; these are called prime attributes. All other attributes are called non-prime attributes.
Here:
Prime Attributes (3)= {A,B,C}
Non Prime Attributes (1)={D}
According to 2NF, no FD may have this form:
"Part of any candidate key ---> Non-prime attribute" (a partial dependency)
For example:
C ---> D is not allowed in 2NF, because C is part of the candidate key AC and D is a non-prime attribute.
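As a rough illustration, the rule can be written as a small Python check. The function name `violates_2nf` is hypothetical, and the keys used are AB and AC from the example. It only tests a single stated FD against the keys and does not derive implied dependencies.

```python
def violates_2nf(lhs, rhs, candidate_keys):
    """True if lhs -> rhs is a partial dependency onto a non-prime attribute."""
    lhs, rhs = set(lhs), set(rhs)
    # Prime attributes are collected from ALL candidate keys.
    prime = set().union(*(set(key) for key in candidate_keys))
    non_prime_targets = rhs - prime - lhs
    # `<` on sets means "proper subset", i.e. only part of some candidate key.
    is_part_of_a_key = any(lhs < set(key) for key in candidate_keys)
    return bool(non_prime_targets) and is_part_of_a_key

KEYS = ["AB", "AC"]

print(violates_2nf("C", "D", KEYS))   # part of AC determines non-prime D
print(violates_2nf("A", "C", KEYS))   # right side is prime
print(violates_2nf("AB", "D", KEYS))  # left side is a whole key
```

Only the first call reports a violation. A -> C is fine for 2NF because C is prime, and AB -> D is fine because AB is an entire candidate key.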
Hope this helps. For more detail, you can also refer to: Detailed explanation of Normal forms
I was learning about database normalization, join dependencies and 5NF, and I had a hard time. Can anyone give me some practical examples of the multivalued dependency rule:
MVD3: (transitivity) If X ↠ Y and Y ↠ Z, then X ↠ (Z − Y).
Functional dependency / Normalization theory and the normal forms up to and including BCNF, were developed on the hypothesis of all data attributes (columns/types/...) being "atomic" in a certain sense. That "certain sense" has long been deprecated by now, but essentially it boiled down to the notion that "a single cell value in a table could not itself hold a multiplicity of values". Think, a textual CSV list of ISBN numbers, a table appearing as a value in a cell in a table (truly nested tables), ...
Now imagine an example with courses, professors, and study books used as course material. Imagine all of that modeled in a single 3-column table which says that "Professor (P) teaches course (C) and uses book (B) as course material." If there can be more than one book (B) used for any given course (Cn) and there can be more than one course (C) taught by any given professor (Pn) and there can be more than one professor (P) teaching any given course (Cn), then this table is clearly all-key (key is the full set of attributes {P,C,B} ).
This means that this table satisfies BCNF.
But now imagine that there is a rule to the effect that "the set of books used for any given course (Cn) must be the same, regardless of which professor teaches it.".
In the days when normalization was developed to the form in which it is now commonly known, it was not allowed to have table columns (relation attributes) that were themselves tables (relations). (Because such a design was considered a violation of 1NF, a notion which is now considered suspect.)
Imagine for a moment that we are indeed allowed to model relation attributes to be of type relation. Then we could model our 3-column table (/relation) as follows : "Professor (P) teaches course (C) and uses THE SET OF BOOKS (SB) as course material.". Attribute SB would no longer be an ISBN number, as in the previous and more obvious design, but it would be a (probably unary) RELATION holding the entire set of ISBN numbers. If we draw our design like that, and we then consider our rule that "all professors use the same set of books for the same course", then we see that this rule is now expressible as an FD from (C) to (SB) !!! And this means that we have a violation of a lower NF on our hands !!!
4NF and 5NF arose out of this kind of problem (where the appearance of a single attribute value, course ID (C), requires the appearance of A MULTITUDE of rows, one per ISBN number (B)) being recognised quite early on, but without the solution that is currently regarded as the best one, relation-valued attributes (RVAs), being recognised as valid. So 4NF and 5NF were created as "new and further normal forms", where the then-existing definitions of 2NF, 3NF and BCNF would already have been sufficient for dealing with the situation at hand, had RVAs been recognised as a valid design approach.
To support that claim, let's look at what would be done to eliminate the NF violation in our {P,C,SB} design with the FD C->SB :
We would split the table into two separate tables {P,C} and {C,SB} with keys {P,C} and {C}, respectively. Both tables satisfy BCNF.
But we still have this SB attribute that holds a set of ISBN numbers. Dealing with this can be done by applying a technique like "UNGROUPING". Applying this to our {C,SB} table would get us a {C,B} table, where B are the ISBN book numbers (or whatever identifier you like to use in your database), and the key to the table is {C,B}. This is exactly the same design we would get if we eliminated the 4/5 NF violation !!!
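To make the {P,C} / {C,B} design concrete, here is a small Python sketch with made-up sample rows (the professor, course and book names are purely illustrative, and the helper names are mine). It checks that the multivalued dependency C ↠ B holds in the 3-column table, and that the table is exactly the join of its two projections.

```python
from itertools import product

def project(rows, indexes):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in indexes) for row in rows}

def join_on_course(pc_rows, cb_rows):
    """Natural join of {P,C} and {C,B} on the course column."""
    return {(p, c, b) for (p, c) in pc_rows for (c2, b) in cb_rows if c == c2}

def mvd_c_to_b_holds(rows):
    """C ->> B: any two rows with the same course can swap books and stay in the table."""
    return all((p1, c1, b2) in rows
               for (p1, c1, _), (_, c2, b2) in product(rows, repeat=2)
               if c1 == c2)

# Hypothetical data: every professor of a course uses that course's full book set.
teaches = {("Smith", "DB"), ("Jones", "DB"), ("Smith", "OS")}
books = {("DB", "Date"), ("DB", "Ullman"), ("OS", "Tanenbaum")}
pcb = join_on_course(teaches, books)

lossless = join_on_course(project(pcb, (0, 1)), project(pcb, (1, 2))) == pcb

# Removing one row breaks the rule: Jones would teach DB without one of its books.
broken = pcb - {("Jones", "DB", "Ullman")}
broken_rejoined = join_on_course(project(broken, (0, 1)), project(broken, (1, 2)))

print(mvd_c_to_b_holds(pcb), lossless)
print(mvd_c_to_b_holds(broken), broken_rejoined == broken)
```

When the rule holds, the single table stores nothing beyond what the two projections already hold, so the split loses no information. When a row is missing, the dependency fails and re-joining the projections brings the missing row back, which shows that the 3-column table was carrying the constraint implicitly.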
You might also want to take a look at Multivalue Dependency violation?