Decomposition to BCNF and set of super key - database

So I have this set of relation
AB->CDEF
G->H,I
ABJ->K
C->L
How should I decompose this? I am so confused. Am I supposed to find the set of super key first?

We can first convert the relation R to 3NF and then to BCNF.
To convert a relation R and a set of functional dependencies(FD's) into 3NF you can use Bernstein's Synthesis. To apply Bernstein's Synthesis -
First we make sure the given set of FD's is a minimal cover
Second we take each FD and make it its own sub-schema.
Third we try to combine those sub-schemas
For example in your case:
R = {A,B,C,D,E,F,G,H,I,J,K,L}
FD's = {AB->CDEF,G->HI,ABJ->K,C->L}
First we check whether the FD's is a minimal cover (singleton right-hand side , no extraneous left-hand side attribute, no redundant FD)
Singleton RHS: We write the FD's with singleton RHS. So now we have FD's as {AB->C, AB->D, AB->E, AB->F, G->H, G->I, ABJ->K, C->L}
No extraneous LHS attribute: We remove the extraneous LHS attribute if any. There are no extraneous LHS attributes here.
No redundant FD's: We remove the redundant dependencies if any. Now FD's are {AB->C, AB->D, AB->E, AB->F, G->H, G->I, ABJ->K, C->L}
Second we make each FD its own sub-schema. So now we have - (the keys for each relation are in bold)
R1={A,B,C}
R2={A,B,D}
R3={A,B,E}
R4={A,B,F}
R5={G,H}
R6={G,I}
R7={A,B,J,K}
R8={C,L}
Third we combine all sub-schemas with the same LHS. So now we have -
S1 = {A,B,C,D,E,F}
S2 = {G,H,I}
S3 = {A,B,J,K}
S4 = {C,L}
Since none of the above decomposed relations contain contain a key of R, we need to create an additional relation schema that contains attributes that form of a key of R. This is to ensure lossless join decomposition that preserves dependencies. So we add -
S5 = {A,B,G,J}
ABGJ is the key of the original relation R
This is in 3NF. Now to check for BCNF we check if any of these relations (S1,S2,S3,S4,S5) violate the conditions of BCNF (i.e. for every functional dependency X->Y the left hand side (X) has to be a superkey) . In this case none of these violate BCNF and hence it is also decomposed to BCNF.
Note - The importance of some of these steps may not be clear in this example. Have a look at other examples here and here.

Related

Is the following decomposition lossless and dependency preserving?

In R(A,B,C, D),
let the dependencies be
A->B
B->C
C->D
D-> B
the decomposition of R into (A,B), (B,C) and (B,D) is lossless or dependency preserving?
My attempt : (A,B) and (B,C) can be combined lossless-ly because of B->C. However, for (A,B,C) and (B,D), B does not form a key for either. Hence the decomposition in lossy.
Also for dependency preserving, the relation (C-D) can't be gotten from any of the decomposed relations and hence the decomposition is not dependency preserving.
However the answer given is that the decomposition is both lossless and dependency preserving. So where am I wrong?
Also the key for relation R is only {A} isn't it?
You say:
However, for (A,B,C) and (B,D), B does not form a key for either. Hence the decomposition in lossy.
This is wrong, since B is a key for (B, D). We can see this by computing B+ from the original dependencies, assuming that they form a cover of the relation.
B+ = B
B+ = BC (for the dependency B->C)
B+ = BCD (for the dependency C->D)
so, since D is included in B+, we have that B->D can be derived from the original set of dependencies, and in the decomposition (B, D) B is a key (as it is D).
To be dependency preserving we must check that the union of all the projected dependencies of the decomposition is a cover of the original set of dependencies. Since three covers of the decomposed relations are, respectively, {A->B}, {C->B, B->C} and {D->B, B->D}, by uniting those three sets you can easily derive also D->C as well as C->D, so the dependencies are preserved.
Finally, yes {A} is the unique candidate key of the original relation.

Candidate keys after canonical cover

I have a set of functional dependencies:
V = {ABCDEF} F = {AB → CD,ABDE → F,BC → A,C → DF}
Candidate keys are: {ABE, BCE}
Canonical cover is: {AB→ C, BC→ A, C→ DF} [This is what I think, might be wrong]
However, as you can see an attribute of candidate key, E, is not in my canonical cover and as far as I know candidate keys should be same in the canonical cover.
If you consider Augmentation rule from Armstrong calculus we can say it is correct but I am confused. Does attribute E have to be represented in the canonical cover?
You say:
as far as I know candidate keys should be same in the canonical cover
This is not true. On the contrary, if an attribute does not belong to any right part of the functional dependencies of a canonical cover, it must be present in any candidate key (this is because it cannot be derived from any other subset of attributes, so, since a candidate key must determines all the attributes, it should be present in any key). Your canonical cover and candidate keys are correct.
Note that if an attribute does not belong to any functional dependency (both in the left and right part), as E in your example, this is a special case of above (it does not belong the a right part side), and must be present in any candidate key.
Finally, note that this can be considered a “symptom” of something wrong in the relation and in fact the schema is not in 3NF or BCNF.
Well, when I try to do Bernnstein synthesis from this relation (ABCDEF) I have to use basis: {AB→C,BC→A,C→DF} I need to add candidate keys since no candidate key exist when we form a relation from functional dependencies : R1(ABC) and R2(CDF) and I was wondering if we need to add E here since our basis doesn't contain E and we consider basis when we do synthesis. That's why I was little confused. But, I think we need to add E since we are doing a synthesis from original R(ABCDEF) so it should be R1(ABC), R(CDF) and R3( ABCE). R3 contains all candidate keys.

BCNF Decompositions and Lossless joins for Databases

Hey all I have an assignment that says:
Let R(ABCD) be a relation with functional dependencies
A → B, C → D, AD → C, BC → A
Which of the following is a lossless-join decomposition of R into Boyce-Codd Normal Form (BCNF)?
I have been researching and watching videos on youtube and I cannot seem to find how to start this. I think I'm supposed to break it down to subschemas and then fill out a table to find which one is lossless, but I'm having trouble getting started with that. Any help would be appreciated!
Your question
Which of the following is a lossless-join decomposition of R into
Boyce-Codd Normal Form (BCNF)?
suggests that you have a set of options and you have to choose which one of those is a lossless decomposition but since you have not mentioned the options I would first (PART A) decompose the relation into BCNF ( first to 3NF then BCNF ) and then (PART B) illustrate how to check whether this given decomposition is a lossless-join decomposition or not. If you are just interested in knowing how to check whether a given BCNF decomposition is lossless or not jump directly to PART B of my answer.
PART A
To convert a relation R and a set of functional dependencies(FD's) into 3NF you can use Bernstein's Synthesis. To apply Bernstein's Synthesis -
First we make sure the given set of FD's is a minimal cover
Second we take each FD and make it its own sub-schema.
Third we try to combine those sub-schemas
For example in your case:
R = {A,B,C,D}
FD's = {A->B,C->D,AD->C,BC->A}
First we check whether the FD's is a minimal cover (singleton right-hand side , no extraneous left-hand side attribute, no redundant FD)
Singleton RHS: All the given FD's already have singleton RHS.
No extraneous LHS attribute: None of the FD's have extraneous LHS attribute that needs to e removed.
No redundant FD's: There is no redundant FD.
Hence the given set of FD's is already a minimal cover.
Second we make each FD its own sub-schema. So now we have - (the keys for each relation are in bold)
R1={A,D,C}
R2={B,C,A}
R3={C,D}
R4={A,B}
Third we see if any of the sub-schemas can be combined. We see that R1 and R2 already have all the attributes of R and hence R3 and R4 can be omitted. So now we have -
S1 = {A,D,C}
S2 = {B,C,A}
This is in 3NF. Now to check for BCNF we check if any of these relations (S1,S2) violate the conditions of BCNF (i.e. for every functional dependency X->Y the left hand side (X) has to be a superkey) . In this case none of these violate BCNF and hence it is also decomposed to BCNF.
PART B
When you apply Bernstein Synthesis as above to decompose R the decomposition is always dependency preserving. Now the question is, is the decomposition lossless? To check that we can follow the following method :
Create a table as shown in figure 1, with number of rows equal to the number of decomposed relations and number of column equal to the number of attributes in our original given R.
We put a in all the attributes that our present in the respective decomposed relation as in figure 1. Now we go through all the FD's {C->D,A->B,AD->C,BC->A} one by one and add a whenever possible. For example, first FD is C->D. Since both the rows in column C has a and there is an empty slot in second row of column D we put a a there as shown in the right part of the image. We stop as soon as one of the rows is completely filled with a which indicates that it is a lossless decomposition. If we go through all the FD's and none of the rows of our table get completely filled with a then it is a lossy decomposition.
Also, note if it is a lossy decomposition we can always make it lossless by adding one more relation to our set of decomposed relations consisting of all attributes of the primary key.
I suggest you see this video for more examples of this method. Also other way to check for lossless join decomposition which involves relational algebra.

Specific scenario regarding BCNF decomposition

Say I have a relation ABCD with FD's (A->D and AB -> ABCD)
Will a decomposed relation ABC be in BCNF? According to the second FD, AB form a key and is therefore in BCNF, but if you only look at the FD A -> D, is the relation no longer in BCNF then?
If you decompose a given relation schema (to which given dependencies apply), the next task is to determine, for each individual dependency in the original set :
(a) which (if any) of the new, decomposed, schemas does it apply to ?
(b) how has the decomposition affected the very definition of the FD ?
Question (a) applies to your original A->D dependency.
Question (b) applies, sort of, to your original AB->ABCD dependency. I say "sort of" because that version is quite "overstated". Given that A->D was already a given, it could just as well just say AB->C.

how to produce a third normal form and BCNF decompositions

I'm trying to produce a 3NF and BCNF decomposition of a schema. I have been looking at the algorithms but I am very confused at how to do this.
If I have my minimal cover say: F' = {A->F, A->G, CF->A, BG->C) and I have identified one candidate key for the relation, say it is A. Then what exactly do I do?
I have been looking at examples, one which has the following:
F = {A → AB,A → AC,A → B,A → C,B → BC}
Minimal cover: F′ = {A → B,B → C}
And the final result was: (AB,A → B), (BC,B → C). How did they get to this?
If I have my minimal cover say: F' = {A->F, A->G, CF->A, BG->C) and I
have identified one candidate key for the relation, say it is A. Then
what exactly do I do?
F' is not a minimal cover: you have to combine A->F and A->G to A->FG
Even worth A cannot be a candidate key given F' since B does not belong yo the closure of A. A possible candidate key would be AB.
For 3NF you start with creating tables for each one of the dependencies in F', i.e.,
R1(A,F,G) R2(A,C,F) R3(B,C,G)
Next you check whether one of the tables contains a candidate key. Since B appears only on the left side of the dependencies, B should always be a part of a candidate key. The only table with B is R3 and it does not contain candidate keys (check it!). Hence, we add a new table R4 with a candidate key as attributes
R4(A,B)
Finally, we check whether the set of attributes of one of the tables is contained in the set of attributes of another table. This is not the case for our running example.
Hence, our 3NF decomposition is
R1(A,F,G) R2(A,C,F) R3(B,C,G) R4(A,B)
For BCNF you start with R(A,B,C,F,G) and look for BCNF violations.
For instance A->FG is a violation of BCNF because this dependency is not trivial and A is not a superkey. Hence we split R into
R1(A,F,G) and R2(A,B,C)
None of the relations obtained contains BCNF violations, so the process stops here.

Resources