How to identify the correct steps to complete 3NF? - database

This is an example from a textbook:
Consider the relation R(A, B, C, D, E) with FDs AB -> C, C -> B, and A -> D.
We get that the keys are ABE and ACE, since their closures are ABE+ = ACE+ = ABCDE.
How do you check minimality? I know that AB+ = ABD, and the textbook says that because AB+ does not include C, it is minimal. C+ = AB and A+ = AD are also minimal, but I do not know why.
Also, do we have to find all the FD's besides the ones given to check whether to perform 3-NF or not?
We then check if AB -> C can be split into A -> C and B -> C. We notice that these do not stand on their own, so AB -> C is not splittable.
We are left with the final relations: S1(ABC), S2(BC), S3(AD) and the key (since it is not present) S4(ABE) (or S4(ACE)). We then remove S2 because it's a subset of S1.
If it is in 3NF and there are no violations, then why do they split the original relation into S1(A, B, C), S2(A, D), and S4(A, B, E)?
Book name and page: Ullman's Database Systems page 103

How do you check minimality?
The authors don't use the word minimality here. To check for the minimal basis, follow the procedure in the first two paragraphs of example 3.27. It boils down to
". . . verify that we cannot eliminate any of the given dependencies."
". . . verify that we cannot eliminate any attributes from a left side."
Also, do we have to find all the FD's besides the ones given to check whether to perform 3-NF or not?
That question doesn't really make sense. 3NF isn't something you perform. The example in the textbook has to do with the synthesis algorithm for 3NF schemas. The synthesis algorithm decomposes a relation R into relations that are all in at least 3NF.
The synthesis algorithm operates on the FDs you've been given. In an academic setting, as you might find in a textbook, the assumption is that you've been given enough information to solve the problem. In real-world applications, you might be given a set of FDs from a business analyst. Don't assume the analyst has given you enough information; look for more FDs.
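The synthesis step for the textbook example can be sketched in Python. This is only a minimal sketch, not the book's code: the names `closure` and `synthesize_3nf` are made up here, and it assumes the FDs already form a minimal basis and that one key (ABE) has been found by hand.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def synthesize_3nf(attributes, minimal_basis, key):
    """One relation per FD, add a key if none is covered, drop subsets."""
    relations = [set(lhs | rhs) for lhs, rhs in minimal_basis]
    # If no relation is a superkey of R, add one relation holding a key.
    if not any(closure(rel, minimal_basis) >= set(attributes) for rel in relations):
        relations.append(set(key))
    # Drop relations that are strictly contained in another relation.
    kept = []
    for rel in relations:
        if not any(rel < other for other in relations) and rel not in kept:
            kept.append(rel)
    return kept


# FDs from the question: AB -> C, C -> B, A -> D
FDS = [
    (frozenset("AB"), frozenset("C")),
    (frozenset("C"), frozenset("B")),
    (frozenset("A"), frozenset("D")),
]
result = ["".join(sorted(rel)) for rel in synthesize_3nf("ABCDE", FDS, "ABE")]
print(result)
```

Under those assumptions the sketch ends with the relations ABC, AD and ABE: BC is dropped because it is contained in ABC, and ABE is added because none of the FD-based relations contains a key.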
We then check if AB -> C can be split into A -> C and B -> C, we notice that these do not stand on their own so AB -> C is not splittable.
No. You verify (not notice) that you can't eliminate any attributes from a left side. Eliminating A leaves B->C; eliminating B leaves A->C. Neither of these is implied by the three original FDs. So you can't eliminate any attributes from a left side.
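Both checks quoted from example 3.27 come down to attribute closures, and can be sketched in Python as follows. The helper names (`closure`, `redundant_fds`, `extraneous_left_attributes`) are hypothetical, chosen only for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def redundant_fds(fds):
    """FDs that can be eliminated because the other FDs already imply them."""
    found = []
    for i, (lhs, rhs) in enumerate(fds):
        others = fds[:i] + fds[i + 1:]
        if rhs <= closure(lhs, others):
            found.append((lhs, rhs))
    return found


def extraneous_left_attributes(fds):
    """(attribute, FD) pairs where the attribute can be dropped from the left side."""
    found = []
    for lhs, rhs in fds:
        for attr in lhs:
            if len(lhs) > 1 and rhs <= closure(lhs - {attr}, fds):
                found.append((attr, (lhs, rhs)))
    return found


# FDs from the question: AB -> C, C -> B, A -> D
FDS = [
    (frozenset("AB"), frozenset("C")),
    (frozenset("C"), frozenset("B")),
    (frozenset("A"), frozenset("D")),
]
print(redundant_fds(FDS))
print(extraneous_left_attributes(FDS))
```

With AB -> C left out, the closure of AB under the other two FDs is ABD, which does not contain C. That is the AB+ = ABD computation from the question, and it is why AB -> C cannot be eliminated.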
If [the original relation] is in 3NF and there are no violations . . .
The original relation is not in 3NF. It's not even in 2NF. (A->D)
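To see the 2NF violation mechanically, one can look for a proper subset of a key that determines a non-prime attribute. The sketch below assumes the two keys from the question (ABE and ACE); the function name `partial_dependencies` is made up for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(keys, attributes, fds):
    """Pairs (part_of_key, non_prime_attribute) that violate 2NF."""
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                for attr in closure(subset, fds) & non_prime:
                    found.add(("".join(subset), attr))
    return found


FDS = [
    (frozenset("AB"), frozenset("C")),
    (frozenset("C"), frozenset("B")),
    (frozenset("A"), frozenset("D")),
]
violations = partial_dependencies([set("ABE"), set("ACE")], "ABCDE", FDS)
print(("A", "D") in violations)
```

D is the only non-prime attribute here, and A on its own (a proper part of both keys) determines it, which is the A->D violation mentioned above.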

Related

BCNF Decompositions and Lossless joins for Databases

Hey all I have an assignment that says:
Let R(ABCD) be a relation with functional dependencies
A → B, C → D, AD → C, BC → A
Which of the following is a lossless-join decomposition of R into Boyce-Codd Normal Form (BCNF)?
I have been researching and watching videos on youtube and I cannot seem to find how to start this. I think I'm supposed to break it down to subschemas and then fill out a table to find which one is lossless, but I'm having trouble getting started with that. Any help would be appreciated!
Your question
Which of the following is a lossless-join decomposition of R into
Boyce-Codd Normal Form (BCNF)?
suggests that you have a set of options and must choose the one that is a lossless decomposition. Since you have not listed the options, I will first (PART A) decompose the relation into BCNF (first to 3NF, then to BCNF) and then (PART B) show how to check whether a given decomposition is a lossless-join decomposition. If you only want to know how to check whether a given BCNF decomposition is lossless, jump directly to PART B of my answer.
PART A
To convert a relation R and a set of functional dependencies (FD's) into 3NF you can use Bernstein's Synthesis. To apply Bernstein's Synthesis:
First we make sure the given set of FD's is a minimal cover.
Second we take each FD and make it its own sub-schema.
Third we try to combine those sub-schemas.
For example in your case:
R = {A,B,C,D}
FD's = {A->B,C->D,AD->C,BC->A}
First we check whether the FD's form a minimal cover (singleton right-hand side, no extraneous left-hand side attribute, no redundant FD).
Singleton RHS: All the given FD's already have a singleton RHS.
No extraneous LHS attribute: None of the FD's has an extraneous LHS attribute that needs to be removed.
No redundant FD's: There is no redundant FD.
Hence the given set of FD's is already a minimal cover.
Second we make each FD its own sub-schema. So now we have - (the keys for each relation are in bold)
R1={A,D,C}
R2={B,C,A}
R3={C,D}
R4={A,B}
Third we see if any of the sub-schemas can be combined or dropped. R3 = {C,D} is contained in R1 and R4 = {A,B} is contained in R2, and R1 and R2 together already cover all the attributes of R, hence R3 and R4 can be omitted. So now we have -
S1 = {A,D,C}
S2 = {B,C,A}
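This is in 3NF. To check for BCNF, we check whether any of these relations (S1, S2) violates the BCNF condition (i.e. for every nontrivial functional dependency X->Y that holds in the relation, the left-hand side X has to be a superkey of that relation). Here both do: C->D holds in S1 but C+ = {C,D} does not include A, and A->B holds in S2 but A+ = {A,B} does not include C. So S1 and S2 are in 3NF but not in BCNF, and reaching BCNF would require splitting further on these two FD's.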
PART B
When you apply Bernstein Synthesis as above to decompose R, the decomposition is always dependency preserving. Now the question is: is the decomposition lossless? To check that we can use the following method:
Create a table as shown in figure 1, with the number of rows equal to the number of decomposed relations and the number of columns equal to the number of attributes in the original R.
We put an a in every column whose attribute is present in the respective decomposed relation, as in figure 1 (row S1 gets an a in columns A, C and D; row S2 gets an a in columns A, B and C). Now we go through all the FD's {C->D, A->B, AD->C, BC->A} one by one and add an a wherever possible. For example, the first FD is C->D. Since both rows have an a in column C and there is an empty slot in the second row of column D, we put an a there, as shown in the right part of the image. We stop as soon as one of the rows is completely filled with a's, which indicates that the decomposition is lossless. If we go through all the FD's and none of the rows of our table gets completely filled with a's, then it is a lossy decomposition.
Also, note that if it is a lossy decomposition we can always make it lossless by adding one more relation, consisting of all the attributes of the primary key, to our set of decomposed relations.
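The table method described above can be sketched in Python. Note the assumption: this sketch uses the usual textbook formulation of the test (often called the chase), where empty slots hold placeholder symbols and two rows that agree on the left side of an FD are made equal on its right side. The function name `is_lossless` and the placeholder naming are made up for this example.

```python
def is_lossless(attributes, decomposition, fds):
    """Chase test: True if joining the sub-relations gives back R."""
    # One row per sub-relation: "a" where the attribute is present,
    # a unique placeholder (the "empty slot") where it is not.
    rows = [
        {attr: "a" if attr in rel else f"b{i}_{attr}" for attr in attributes}
        for i, rel in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    # r1 and r2 agree on lhs, so they must agree on rhs.
                    for y in rhs:
                        if r1[y] == r2[y]:
                            continue
                        # Prefer "a" as the surviving symbol.
                        old, new = (r2[y], r1[y]) if r1[y] == "a" else (r1[y], r2[y])
                        for row in rows:
                            if row[y] == old:
                                row[y] = new
                        changed = True
    return any(all(value == "a" for value in row.values()) for row in rows)


FDS = [
    (set("C"), set("D")),
    (set("A"), set("B")),
    (set("AD"), set("C")),
    (set("BC"), set("A")),
]
print(is_lossless("ABCD", [set("ADC"), set("BCA")], FDS))
print(is_lossless("ABCD", [set("AB"), set("CD")], FDS))
```

For S1 = {A,D,C} and S2 = {B,C,A} the FD C->D fills the second row completely, so the test reports a lossless decomposition. Splitting R into {A,B} and {C,D} instead leaves no row complete, so that decomposition is lossy.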
I suggest you see this video for more examples of this method. There is also another way to check for a lossless-join decomposition, which involves relational algebra.

Is this Lossy Join and Dependency Preserving

I am reading about functional dependencies and normalization in my database management subject, and I came across this example.
Given relation R(A,B,C,D), which one is a lossy-join but dependency-preserving BCNF decomposition?
a. A ->B, B -> CD
b. A -> B, B -> C, C->D
c. AB -> C, C -> AD
d. A -> BCD
Now the answer given is option c.
How can option c. be a lossy decomposition? If you do ABC union CAD = ABCD, this satisfies the first condition.
If we do ABC intersection CAD = AC, that is perfectly fine, since C is in AC and C is the key of (CAD) by C -> AD. That also satisfies the second condition. Am I making any mistake in understanding this concept?
Usually for a Normalisation/decomposition exercise, you are given:
The full relation and its attributes. [yes: R(A, B, C, D)]
The Functional dependencies. [yes? it looks like a., b., c., d. are possible sets of Fun Deps.]
The proposed decomposition. [Often named R1, R2, etc. I don't see those. I can't interpret option d. to be proposing a decomposition.]
Perhaps your post has missed out part of the exercise? Perhaps the exercise wants you to decide which decomp preserves the dependencies in BCNF? (But results in a lossy join.)
[edited in response to Nikhil's comment] Note that the list of FD's alone doesn't amount to a decomposition: the FD C -> AD is short-hand for C -> A, C -> D. Does that mean two decomposing relations? No, because A and C are already in the FD AB -> C. So we have R1 = (A, B, C), R2 = (C, D). But I don't know if that is what the exercise is asking. Think about it. What does option d. mean in terms of decompositions?
Perhaps the exercise is asking (for example): given a proposed decomposition into R1 = (A, B) and R2 = (B, C, D), which of the sets of FD's would give a lossy decomposition?
There's a worked example here: http://en.wikipedia.org/wiki/Lossless-Join_Decomposition.
It points to a previous question, Lossless Join Property.
And there are further references.
By the way, options a., b., include the same Fun Deps as option d., by the transitivity of dependencies (Armstrong's Axioms http://en.wikipedia.org/wiki/Armstrong%27s_axioms see also http://en.wikipedia.org/wiki/Heath%27s_theorem). This is a clue.
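For a decomposition into exactly two relations, the usual test is that the shared attributes must functionally determine all of R1 or all of R2. A minimal Python sketch of that test follows; the function names are made up for this example, and the decompositions tried are the ones mentioned in this thread, not ones given by the exercise.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def binary_split_is_lossless(r1, r2, fds):
    """True if (R1 intersect R2) determines all of R1 or all of R2."""
    reachable = closure(set(r1) & set(r2), fds)
    return set(r1) <= reachable or set(r2) <= reachable


# Option a: A -> B, B -> CD.  Option c: AB -> C, C -> AD.
OPTION_A = [(set("A"), set("B")), (set("B"), set("CD"))]
OPTION_C = [(set("AB"), set("C")), (set("C"), set("AD"))]

print(binary_split_is_lossless("ABC", "ACD", OPTION_C))
print(binary_split_is_lossless("AB", "BCD", OPTION_A))
print(binary_split_is_lossless("AB", "BCD", OPTION_C))
```

Under option c the split into (A, B, C) and (C, A, D) passes, because the common attributes AC determine all of CAD, which agrees with the questioner's calculation. The hypothetical split into (A, B) and (B, C, D) passes under option a but fails under option c, since B alone determines nothing there.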

boyce codd and finding candidate keys

I desperately need help with understanding Boyce-Codd and finding the candidate keys.
I found a link here http://djitz.com/neu-mscs/how-to-find-candidate-keys/ which I have understood for the most part, but I get stuck.
E.g.
(A B C D E F)
A B → C D E
B C D → A
B C E → A D
B D → E
Right, as far as I understand from the link, you find the common sets from the left, which is only B, and the common sets from the right, of which there are none.
Now where do I go from here? I know all candidate sets will have B in them, but I need guidance on finding the candidate sets after that. Can someone explain in simple language?
The linked article isn't written particularly well. (That's an observation, not a criticism. The author's first language isn't English.) I'll try to rewrite the algorithm. This isn't me telling you how to do this. It's my interpretation of how the original author is telling you to do this.
1. Identify the attributes that are on neither the left side nor right side of any FD.
2. Identify the attributes that are only on the right side of any FD.
3. Identify the attributes that are only on the left side of any FD.
4. Combine the attributes from steps 1 and 3.
5. Compute the closure of the attributes from step 4. If the closure comprises all the attributes, then the attributes from step 4 make up the only candidate key. (No matter how many candidate keys there are, every one of them must contain these attributes.)
6. Identify the attributes not included in step 4 and step 2.
7. Compute the closure of the attributes from step 4 plus every possible combination of attributes from step 6.
So for the FDs you posted, you'd end up with this.
1. {F}
2. {}
3. {B}
4. {BF}
5. The closure of {BF} is {BF}. That's not all the attributes. (But every candidate key must contain {BF}.)
6. {ACDE}
7. Compute the closure of these sets of attributes.
{ABF}
{CBF}
{DBF}
{EBF}
{ACBF}
{ADBF}
{AEBF}
{CDBF}
{CEBF}
{DEBF}
{ACDBF}
{ADEBF}
{CDEBF}
If I got those combinations right, every candidate key will be found among the possibilities in step 7. In your example, there are 3 candidate keys.
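The seven steps can be sketched in Python as below. This is my own reading of the procedure rather than code from the linked article; `closure` and `candidate_keys` are hypothetical names, and the sketch skips any combination that already contains a smaller key, so only minimal keys are reported.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    attributes = set(attributes)
    left = set().union(*(lhs for lhs, _ in fds))
    right = set().union(*(rhs for _, rhs in fds))
    core = attributes - right          # steps 1, 3 and 4
    optional = sorted(left & right)    # step 6
    if closure(core, fds) >= attributes:   # step 5
        return [frozenset(core)]
    keys = []
    for size in range(1, len(optional) + 1):   # step 7, smallest sets first
        for combo in combinations(optional, size):
            candidate = core | set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= attributes:
                keys.append(frozenset(candidate))
    return keys


# FDs from the question: AB -> CDE, BCD -> A, BCE -> AD, BD -> E
FDS = [
    (set("AB"), set("CDE")),
    (set("BCD"), set("A")),
    (set("BCE"), set("AD")),
    (set("BD"), set("E")),
]
keys = candidate_keys("ABCDEF", FDS)
print(sorted("".join(sorted(key)) for key in keys))
```

For the FDs in the question this reports the three candidate keys ABF, BCDF and BCEF.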
http://www.sroede.nl/projects/fdhelper.aspx
This would help: just put in your relation and FDs, then click Generate at the bottom.

Database Relational Homework help

The problem: Consider a relation R with five attributes ABCDE. You are given the following dependencies:
A->B
BC->E
ED->A
List all the keys for R.
The teacher gave us the keys, which are ACD, BCD and CDE, and we need to show the work to get to them.
The first two I solved.
For BCD, I took the transitive of 2 with 3 to get (BC->E)D->A => BCD->A.
And for ACD, I did the transitive of 1 with 4 (BCD), to get (A->B)CD->A => ACD->A.
But I can't figure out how to get CDE.
So it seems I did it wrong, after googling I found this answer
methodology to find keys:
consider attribute sets α containing: a. the determinant attributes of F (i.e. A, BC,
ED) and b. the attributes NOT contained in the determined ones (i.e. C,D). Then
do the attribute closure algorithm:
if α+ superset R then α -> R
Three keys: CDE, ACD, BCD
Source
From what I can tell, since C and D are not on the right side of any dependency, the keys are the left sides with CD prepended to them. Can anyone explain to me in more detail why?
To get the keys, you start with one of the dependencies and use inference to extend the set.
Let me have a go in simple English; you can easily find the formal definitions on the net.
E.g. start with 3).
ED -> A
(knowing E and D, I know A)
ED ->AB
(knowing E and D, I know A, by knowing A, I know B as well)
Still, C cannot be known, and I have now used all the rules except BC->E.
So I add C to the left-hand side, i.e.
CDE -> AB
So, by knowing C, D and E, you will know A and B as well.
Hence CDE is a key for your relation ABCDE. Repeat the same process, starting with the other rules, until they are exhausted.
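The same reasoning can be checked with an attribute closure. The sketch below is only an illustration with made-up helper names: a set is a key if its closure is all of ABCDE and removing any single attribute breaks that.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """True if attrs determines everything and no attribute can be dropped."""
    attrs, all_attrs = set(attrs), set(all_attrs)
    if closure(attrs, fds) != all_attrs:
        return False
    return all(closure(attrs - {a}, fds) != all_attrs for a in attrs)


# A -> B, BC -> E, ED -> A
FDS = [(set("A"), set("B")), (set("BC"), set("E")), (set("ED"), set("A"))]

for candidate in ["ACD", "BCD", "CDE", "CD", "ABCD"]:
    print(candidate, is_candidate_key(candidate, "ABCDE", FDS))
```

ACD, BCD and CDE pass. CD alone fails because its closure is just CD, and ABCD fails because it still determines everything after dropping A, so it is not minimal.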

Question about relation normalization

Let's consider, for instance, the following relation:
R (A,B,C,D,E,F)
where the bold denotes that it is a primary key attribute
with
F = {AB->DE, D->E}
Now, this looks to be in first normal form. It can't be in third normal form as I have a transitive dependency, and it cannot be in second normal form as not all non-key attributes depend on the whole primary key.
So my questions are:
I don't know what to make of F and C. I don't have any functional dependency info on them! F doesn't depend on anything? If that is the case, I can't think of any solution to get R into second normal form without taking it out!
What about C? C also suffers from the problem of not being referred to in the functional dependencies list. What to do about it?
My attempt to get R into the 2nd normal form would be something like:
R(A,B,D)
R' (D,E)
but as stated earlier, I don't have a clue what to do with C and F. Are they redundant, so I simply take them out and the above attempt is all I have to do to get it into second normal form (and third!)?
Thanks
Given the definition of R that { A, B, C } is the primary key, then there is inherently a functional dependency:
ABC → ABCDEF
That says that the values of A, B and C inherently determine or control the values of D, E and F as well as the trivial fact that they determine their own values.
You have a few additional dependencies, identified by the set F (which is distinct from the attribute F - the notation is not very felicitous, and could be causing confusion*):
AB → DE
D → E
As you rightly diagnose, the system is in 1NF (because 1NF really means "it is a table"). It is not in 2NF or 3NF or BCNF etc because of the transitive dependency and because some of the attributes only depend on part of the key.
You are right that you will end up with the following two relations as part of your decomposition:
R1(D, E)
R2(A, B, D)
You also need the third relation:
R3(A, B, C, F)
From these, you can recreate the original relation R using joins. The set of relations { R1, R2, R3 } is a non-loss decomposition of the original relation R.
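As an illustration of the non-loss claim, the sketch below projects a few rows onto R1, R2 and R3 and joins them back. The sample rows are invented for this example (chosen to satisfy AB → DE, D → E and the key { A, B, C }), and the helper names are hypothetical.

```python
ATTRS = ["A", "B", "C", "D", "E", "F"]

# Made-up rows that satisfy AB -> DE, D -> E and the key {A, B, C}.
R = [dict(zip(ATTRS, values)) for values in [
    (1, 1, 1, 10, 100, "x"),
    (1, 1, 2, 10, 100, "y"),
    (2, 1, 1, 20, 200, "z"),
    (2, 2, 1, 10, 100, "x"),
]]


def project(rows, attrs):
    """Projection onto attrs, with duplicate rows removed."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in distinct]


def natural_join(left, right):
    """Combine every pair of rows that agree on their shared attributes."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


def as_tuples(rows):
    """Order-independent view of a relation, for comparison."""
    return {tuple(row[a] for a in ATTRS) for row in rows}


R1 = project(R, ["D", "E"])
R2 = project(R, ["A", "B", "D"])
R3 = project(R, ["A", "B", "C", "F"])

rebuilt = natural_join(natural_join(R3, R2), R1)
print(as_tuples(rebuilt) == as_tuples(R))
```

Joining R3 with R2 on (A, B) and then with R1 on D gives back exactly the original rows, with no extra ones.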
* If the F identifying the set of subsidiary functional dependencies is intended to be the same as the attribute F, then there is something very weird about the definition of that attribute. I'd need to see sample data for the relation R to have a chance of knowing how to interpret it.
I think the primary key of R is set wrong. If F isn't functionally related to anything, it has to be part of the key.
So you have R( ABCF DE) which is now in the first normal form (with F = {AB->DE, D->E}). Now you can change it to the second normal form. DE isn't dependent on the whole key (partial dependency), so you put it in another relation to get to second normal form:
R( ABCF ) F = {}
R1( #AB DE) F = {AB->DE}
Now this relation doesn't have any transitive dependencies so it is already in third normal form.
F doesn't depend on anything?
No, you just haven't been given any explicit information about it in the form
{something -> F}
And essentially the same can be said for C. You're expected to infer the other dependencies by applying Armstrong's axioms. (Probably.)
Think about how to finish this:
Given R (A,B,C,D,E,F)
{ABC -> ?}
[Later . . . I see that Jonathan Leffler has broken the suspense, so I'll just finish this.]
{ABC -> DEF} (By definition) therefore,
{ABC -> F} (By decomposition. Here's where F and C come in. And this is your third relation. ).

Resources