The entire question is:
The relation R(A, B, C, D, E) with F = {A→B ; A→C ; B,D→A} is decomposed considering the functional dependency A→B from the beginning.
What is the functional dependency that cannot be preserved by such a decomposition?
It seems to me that I have insufficient data to answer this question. Firstly, doesn't knowing which FD cannot be preserved depend on the normal form we are using? Secondly, the decomposition is unfinished; doesn't the preservation depend on which decomposition we use?
The question is certainly poorly phrased. Presumably F is supposed to be a cover for the FDs that hold in R--ie the FDs that hold in R are the ones implied by the ones in F. It's not clear what is meant by "decomposed considering the functional dependency A → B from the beginning". We would reasonably say that "decomposing considering a FD" means decomposing introducing a component with the FD's attributes. I suspect that they are trying to say, if you binary decompose to component AB plus some other component(s) then what FD "cannot be preserved by such a decomposition"?
When we losslessly decompose we really mean we are only interested in decompositions where all the components are smaller than the original & no component is a subset of another. So in a decomposition having component AB the other components would join to either ACDE or BCDE.
Look at the remaining given FDs: A → C & B, D → A. For each of those projections/components, if it were losslessly joinable with AB then the FDs that are implied by those per Armstrong's axioms & that have all their attributes in it will hold in it. (And no others will hold in it.) But there's one of those two FDs that can't hold in either of the two projections. So that's the one that "cannot be preserved by such a decomposition".
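The check described above can be tried mechanically. The following is a minimal sketch, assuming single-letter attribute names and FDs written as (left, right) string pairs; the helper names `closure` and `preserved` are made up for this illustration and use the usual textbook preservation test rather than anything stated in the question.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(fd, components, fds):
    """True if fd can be enforced using only the FDs visible inside each component."""
    lhs, rhs = fd
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for comp in components:
            comp = set(comp)
            # What the part of z inside this component determines inside it.
            gained = closure(z & comp, fds) & comp
            if not gained <= z:
                z |= gained
                changed = True
    return set(rhs) <= z


F = [("A", "B"), ("A", "C"), ("BD", "A")]

# Try both candidate shapes: component AB plus ACDE, and AB plus BCDE.
for other in ("ACDE", "BCDE"):
    lost = [f"{l}->{r}" for l, r in F if not preserved((l, r), ["AB", other], F)]
    print("AB +", other, "loses", lost)
```

The FD that is reported as lost for both shapes is the one the question is after.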
Firstly, doesn't knowing which FD cannot be preserved depend on the normal form we are using?
They seem to be trying to ask about a decomposition of a certain nature, so the resulting NFs don't matter.
Secondly, the decomposition is unfinished; doesn't the preservation depend on which decomposition we use?
They are telling you that a certain FD "cannot be preserved by such a decomposition". So we can expect that as long as we have "such a decomposition" the answer doesn't further "depend on which decomposition we use".
(Note that a decomposition to 2NF or 3NF can fail to preserve FDs. But there's always a decomposition that doesn't fail.)
This is an example from a textbook:
Consider the relation R (A ,B ,C ,D ,E ) with FD’s AB -> C,
C -> B, and A -> D.
We get that the keys are ABE and ACE, since their closures are ABE+ = ACE+ = ABCDE.
How do you check minimality? I know that AB+=ABD and the textbook says that because AB+ does not include C. Then it is minimal. C+=AB and A+=AD are also minimal. But I do not know why. How do you check minimality?
Also, do we have to find all the FD's besides the ones given to check whether to perform 3-NF or not?
We then check if AB -> C can be split into A -> C and B -> C, we notice that these do not stand on their own so AB -> C is not splittable.
We are left with the final relations: S1(ABC), S2(BC), S3(AD) and the key (since not present) S4(ABE) (or S4(ABC)). We then remove S2 because it's a subset of S1.
If it is in 3NF and there are no violations, then why do they split the original relation into: S1(A, B, C), S2(A, D), and S4(A, B, E).
Book name and page: Ullman's Database Systems page 103
How do you check minimality?
The authors don't use the word minimality here. To check for the minimal basis, follow the procedure in the first two paragraphs of example 3.27. It boils down to
". . . verify that we cannot eliminate any of the given dependencies."
". . . verify that we cannot eliminate any attributes from a left side."
Also, do we have to find all the FD's besides the ones given to check whether to perform 3-NF or not?
That question doesn't really make sense. 3NF isn't something you perform. The example in the textbook has to do with the synthesis algorithm for 3NF schemas. The synthesis algorithm decomposes a relation R into relations that are all in at least 3NF.
The synthesis algorithm operates on the FDs you've been given. In an academic setting, as you might find in a textbook, the assumption is that you've been given enough information to solve the problem. In real-world applications, you might be given a set of FDs from a business analyst. Don't assume the analyst has given you enough information; look for more FDs.
We then check if AB -> C can be split into A -> C and B -> C, we notice that these do not stand on their own so AB -> C is not splittable.
No. You verify (not notice) that you can't eliminate any attributes from a left side. Eliminating A leaves B->C; eliminating B leaves A->C. Neither of these is implied by the three original FDs. So you can't eliminate any attributes from a left side.
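Both verifications come down to attribute closures. Here is a minimal sketch, assuming single-letter attributes and FDs as (left, right) string pairs; the function names are invented for the illustration and are not from the textbook.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def redundant_fds(fds):
    """FDs that already follow from the remaining ones."""
    found = []
    for i, (lhs, rhs) in enumerate(fds):
        rest = fds[:i] + fds[i + 1:]
        if set(rhs) <= closure(lhs, rest):
            found.append((lhs, rhs))
    return found


def extraneous_lhs_attrs(fds):
    """(fd, attribute) pairs where the attribute can be dropped from the left side."""
    found = []
    for lhs, rhs in fds:
        for a in lhs:
            reduced = set(lhs) - {a}
            # The shortened FD must follow from the full original set.
            if reduced and set(rhs) <= closure(reduced, fds):
                found.append(((lhs, rhs), a))
    return found


F = [("AB", "C"), ("C", "B"), ("A", "D")]
print(redundant_fds(F))         # FDs that could be eliminated
print(extraneous_lhs_attrs(F))  # left-side attributes that could be eliminated
```

If both lists come back empty, the given set is already a minimal basis in the sense of the two quoted checks.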
If [the original relation] is in 3NF and there are no violations . . .
The original relation is not in 3NF. It's not even in 2NF. (A->D)
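To see why A->D already breaks 2NF, it helps to compute the candidate keys. The sketch below is a brute-force illustration, assuming single-letter attributes; `candidate_keys` and `partial_dependencies` are hypothetical helper names, and the second one only inspects the given FDs rather than every implied FD.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


def partial_dependencies(attrs, fds):
    """Given FDs from a proper subset of a key to a non-prime attribute."""
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    found = []
    for lhs, rhs in fds:
        nonprime_rhs = set(rhs) - prime - set(lhs)
        if nonprime_rhs and any(set(lhs) < k for k in keys):
            found.append((lhs, rhs))
    return found


F = [("AB", "C"), ("C", "B"), ("A", "D")]
print(candidate_keys("ABCDE", F))
print(partial_dependencies("ABCDE", F))
```

D is the only attribute outside every key, and A is a proper subset of both keys, which is exactly the 2NF violation.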
Hey all I have an assignment that says:
Let R(ABCD) be a relation with functional dependencies
A → B, C → D, AD → C, BC → A
Which of the following is a lossless-join decomposition of R into Boyce-Codd Normal Form (BCNF)?
I have been researching and watching videos on youtube and I cannot seem to find how to start this. I think I'm supposed to break it down to subschemas and then fill out a table to find which one is lossless, but I'm having trouble getting started with that. Any help would be appreciated!
Your question
Which of the following is a lossless-join decomposition of R into
Boyce-Codd Normal Form (BCNF)?
suggests that you have a set of options and have to choose which one of them is a lossless decomposition. Since you have not mentioned the options, I will first (PART A) decompose the relation into BCNF (first to 3NF, then to BCNF) and then (PART B) illustrate how to check whether a given decomposition is a lossless-join decomposition. If you are only interested in how to check whether a given BCNF decomposition is lossless, jump directly to PART B of my answer.
PART A
To convert a relation R and a set of functional dependencies(FD's) into 3NF you can use Bernstein's Synthesis. To apply Bernstein's Synthesis -
First we make sure the given set of FD's is a minimal cover
Second we take each FD and make it its own sub-schema.
Third we try to combine those sub-schemas
For example in your case:
R = {A,B,C,D}
FD's = {A->B,C->D,AD->C,BC->A}
First we check whether the FD's form a minimal cover (singleton right-hand side, no extraneous left-hand side attribute, no redundant FD)
Singleton RHS: All the given FD's already have singleton RHS.
No extraneous LHS attribute: None of the FD's has an extraneous LHS attribute that needs to be removed.
No redundant FD's: There is no redundant FD.
Hence the given set of FD's is already a minimal cover.
Second we make each FD its own sub-schema. So now we have the following (the key of each relation, the left-hand side of its FD, is listed first):
R1={A,D,C}
R2={B,C,A}
R3={C,D}
R4={A,B}
Third we see if any of the sub-schemas can be combined or dropped. R3={C,D} is a subset of R1 and R4={A,B} is a subset of R2, so their FD's can be enforced in R1 and R2, and R3 and R4 can be omitted. So now we have -
S1 = {A,D,C}
S2 = {B,C,A}
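The three steps above can be sketched in a few lines. This is a simplified illustration, not a full implementation of Bernstein's algorithm: it assumes the input is already a minimal cover with distinct left-hand sides, uses single-letter attributes, and the names `synthesize_3nf` and `some_key` are made up here.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def some_key(attrs, fds):
    """One candidate key, found by shrinking the full attribute set."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.remove(a)
    return key


def synthesize_3nf(attrs, minimal_cover):
    # Step 2: one sub-schema per FD.
    schemas = []
    for lhs, rhs in minimal_cover:
        schema = frozenset(lhs) | frozenset(rhs)
        if schema not in schemas:
            schemas.append(schema)
    # Step 3: drop sub-schemas that are contained in another one.
    schemas = [s for s in schemas if not any(s < t for t in schemas)]
    # Keep the join lossless: add a key relation if no sub-schema is a superkey.
    if not any(closure(s, minimal_cover) == set(attrs) for s in schemas):
        schemas.append(frozenset(some_key(attrs, minimal_cover)))
    return schemas


F = [("A", "B"), ("C", "D"), ("AD", "C"), ("BC", "A")]
for schema in synthesize_3nf("ABCD", F):
    print(sorted(schema))
```

For the FD's of this question the sketch ends with the same two relations, {A,C,D} and {A,B,C}, and adds no extra key relation because each of them is already a superkey of R.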
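This is in 3NF. Now to check for BCNF we check whether any of these relations (S1, S2) violates the BCNF condition, i.e. for every non-trivial functional dependency X->Y that holds in the relation, the left-hand side X has to be a superkey of that relation. Here both relations fail the check: C->D holds in S1, but C+ = {C,D} does not contain A, and A->B holds in S2, but A+ = {A,B} does not contain C. So S1 and S2 are in 3NF but not in BCNF. Splitting S1 on C->D gives {C,D} and {A,C}, and splitting S2 on A->B gives {A,B} and {A,C}, so a BCNF decomposition is {A,B}, {A,C}, {C,D}. It is lossless, but it no longer preserves AD->C and BC->A.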
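A BCNF check on a sub-schema has to look at every FD that holds in it, including implied ones, so it is easy to get wrong by hand. The following brute-force sketch assumes single-letter attributes; `bcnf_violations` is a hypothetical helper that enumerates the left-hand sides inside one sub-schema.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial FDs X -> Y inside schema where X is not a superkey of schema."""
    schema = set(schema)
    found = []
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            determined = closure(x, fds)
            gained = (determined & schema) - x
            if gained and not schema <= determined:
                found.append(("".join(combo), "".join(sorted(gained))))
    return found


F = [("A", "B"), ("C", "D"), ("AD", "C"), ("BC", "A")]
for schema in ("ACD", "ABC", "AB", "AC", "CD"):
    print(schema, bcnf_violations(schema, F))
```

Run on S1 and S2, this sketch reports C->D in S1 and A->B in S2, so it is worth double-checking any claim that a synthesized schema is already in BCNF. The two-attribute relations {A,B}, {A,C} and {C,D} come back with no violations.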
PART B
When you apply Bernstein Synthesis as above to decompose R the decomposition is always dependency preserving. Now the question is, is the decomposition lossless? To check that we can follow the following method :
Create a table as shown in figure 1, with the number of rows equal to the number of decomposed relations and the number of columns equal to the number of attributes in the original relation R.
We put the symbol "a" in every column whose attribute is present in the respective decomposed relation, as in figure 1. Now we go through the FD's {C->D,A->B,AD->C,BC->A} one by one and add an "a" whenever possible. For example, the first FD is C->D. Both rows have an "a" in column C, the first row has an "a" in column D, and the second row has an empty slot there, so we put an "a" in that slot, as shown in the right part of the image. We stop as soon as one of the rows is completely filled with "a", which indicates that the decomposition is lossless. If we keep going through the FD's until nothing more can be added and no row is completely filled with "a", then it is a lossy decomposition.
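The table method can be sketched as code. This version is the usual textbook chase, which fills the empty slots with placeholder symbols so that two rows can also be made equal on a non-"a" value; that detail is an addition of this sketch, and the function name `is_lossless` is made up.

```python
def is_lossless(attrs, schemas, fds):
    """Chase test: True if joining the schemas always gives back the original relation."""
    # One row per decomposed relation: "a" where the attribute is present,
    # otherwise a placeholder that is unique to that row and column.
    rows = [
        {x: ("a" if x in schema else f"b{i}{x}") for x in attrs}
        for i, schema in enumerate(schemas)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    # The rows agree on the left side, so make them agree on the right side.
                    for y in rhs:
                        if r1[y] == r2[y]:
                            continue
                        if r1[y] == "a":
                            old, new = r2[y], "a"
                        else:
                            old, new = r1[y], r2[y]
                        for row in rows:
                            if row[y] == old:
                                row[y] = new
                        changed = True
    return any(all(v == "a" for v in row.values()) for row in rows)


F = [("C", "D"), ("A", "B"), ("AD", "C"), ("BC", "A")]
print(is_lossless("ABCD", ["ADC", "BCA"], F))
print(is_lossless("ABCD", ["AB", "CD"], F))
```

For {A,D,C} and {B,C,A} the second row fills up after applying C->D, as in the worked example. The split into {A,B} and {C,D} never produces a full row, since the two relations share no attribute.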
Also, note that if it is a lossy decomposition we can always make it lossless by adding one more relation to our set of decomposed relations, consisting of all attributes of a candidate key of R.
I suggest you see this video for more examples of this method. There is also another way to check for a lossless-join decomposition, which involves relational algebra.
Sorry for asking a question one might consider basic.
Suppose we have a relation R(A,B,C,D,E) with multivalued dependencies:
A->>B
B->>D.
Relation R doesn't have any functional dependencies.
Next, suppose we decompose R into 4NF.
My considerations:
Since we don't have any functional dependencies, the only key is all attributes (A,B,C,D,E). There are two ways we can decompose our relation R:
R1(A,B) R2(A,C,D,E)
R3(B,D) R4(A,B,C,E)
My question is - are these 2 decompositions final? Looks like they are since there are no nontrivial multivalued dependencies left. Or am I missing something?
Relation R doesn't have any functional dependencies.
You mean, non-trivial FDs (functional dependencies). (There must always be trivial FDs.)
Assuming that the MVDs (multivalued dependencies) holding in R are those in the transitive closure of {A ↠ B, B ↠ D}:
In 1 R1(A,B) R2(A,C,D,E), we can reconstruct R as R1 JOIN R2 and both R1 & R2 are in 4NF and their join will satisfy A ↠ B. If some component contained all the attributes of the other MVD then we could further decompose it per that MVD. And we would know that, given some alleged values for all components, their alleged reconstruction of R by joining would satisfy both MVDs. But here there is no such component. So we can't further decompose. And we know that an alleged reconstruction of R by joining satisfies A ↠ B but we would still have to check whether B ↠ D. We say that the MVD B ↠ D is "not preserved" and the decomposition to R1 & R2 "does not preserve MVDs".
In 2 R3(B,D) R4(A,B,C,E), we can reconstruct R as R3 JOIN R4 and both R3 & R4 are in 4NF and the join will satisfy B ↠ D. Now some component contains all the attributes of the other MVD so we can further decompose it per that MVD. And we know that, given some alleged values for all components, their alleged reconstruction of R by joining satisfies both MVDs. That component is R4, which we can further decompose, reconstructing as AB JOIN ACE. And we know that an alleged reconstruction of R by joining satisfies both A ↠ B & B ↠ D. Because the MVDs in the original appear in a component, we say these decompositions "preserve MVDs".
PS 1 The 4NF decomposition must be to three components
MVDs always come in pairs. Suppose MVD X ↠ Y holds in a relation with attributes S, normalized to components XY & X(S-Y). Notice that S-XY is the set of non-X non-Y attributes, and X(S-Y) = X(S-XY). Then there is also an MVD X ↠ S-XY, normalized to components X(S-XY) & X(S-(S-XY)), ie X(S-XY) & XY, ie X(S-Y) & XY. Why? Notice that both MVDs give the same component pair. Ie both MVDs describe the same condition, that S = XY JOIN X(S-XY). So when an MVD holds, that partner holds too. We can write the condition expressed by each of the MVDs using the special explicit & symmetrical notation X ↠ Y | S-XY.
We say a JD (join dependency) of some components of S holds if and only if they join to S. So if S = XY JOIN X(S-Y) = XY JOIN X(S-XY) then the JD *{XY, X(S-XY)} holds. Ie the condition that both MVDs describe is that JD. So a certain MVD and a certain binary JD correspond. That's one way of seeing why normalizing an MVD away involves a 2-way join and why MVDs come in pairs. The JDs that cause a 4NF relation to not be in 5NF are those that do not correspond to MVDs.
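The correspondence between an MVD, its partner and a binary JD can be tried on a small instance. This is a minimal sketch on made-up data, assuming a relation is a set of tuples stored as frozensets of (attribute, value) pairs; all helper names are hypothetical.

```python
def make_relation(attrs, tuples):
    """Build a relation as a set of frozensets of (attribute, value) pairs."""
    return {frozenset(zip(attrs, t)) for t in tuples}


def project(rel, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r1, r2):
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            # Tuples combine when they agree on every shared attribute.
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


def mvd_holds(rel, all_attrs, x, y):
    """X ->> Y holds in rel iff rel = project(XY) JOIN project(X(S-Y))."""
    s, x, y = set(all_attrs), set(x), set(y)
    return natural_join(project(rel, x | y), project(rel, x | (s - y))) == rel


S = "ABC"
r = make_relation(S, [(1, "x", "p"), (1, "x", "q"), (1, "y", "p"),
                      (1, "y", "q"), (2, "x", "p")])
broken = r - make_relation(S, [(1, "y", "q")])

# A ->> B and its partner A ->> C describe the same binary JD *{AB, AC},
# so they hold or fail together.
print(mvd_holds(r, S, "A", "B"), mvd_holds(r, S, "A", "C"))
print(mvd_holds(broken, S, "A", "B"), mvd_holds(broken, S, "A", "C"))
```

Both MVDs project onto the same pair of components, which is why the two calls in each line always agree.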
Your example involves two MVDs that aren't partners & neither otherwise holds as a consequence of the other, so you know that the final form of a lossless decomposition will involve two joins, one for each MVD pair.
PS 2 Ambiguity of "Suppose we have a relation with these multi-valued dependencies"
When decomposing per FDs (functional dependencies) we are usually given a canonical/minimal cover for the relation, ie a set in a certain form whose transitive closure under Armstrong's axioms (set of FDs that must consequently hold) holds all the FDs in the relation. This is frequently forgotten when we are told that some FDs hold. We must either be given a canonical/minimal cover for the relation or be given an arbitrary set and be told that the FDs that hold in the relation are the ones in its transitive closure. If we're just given a set of FDs that hold, we know that the ones in its transitive closure hold, but there might be others. So in general we can't normalize.
Here you give some MVDs that hold. But they aren't the only ones, because each has a partner. Moreover others might (and here do) consequently hold. (Eg X ↠ Y and Y ↠ Z implies X ↠ Z-Y holds.) But you don't say that they form a canonical or minimal cover. One way to get a canonical form for MVDs (a unique one per each transitive closure, hopefully more concise!) would be a minimal cover (one that can't lose any MVDs and still have the same transitive closure) augmented by the partner of each MVD. (Whereas for FDs the standard canonical form is minimal.) You also don't say "the MVDs that hold are those in the transitive closure of these". You just say that those MVDs hold. So maybe some not in the transitive closure do too. So your example can't be solved. We can guess that you probably mean that this is a minimal cover. (It's not canonical.) Or that the MVDs that hold in the relation are those in the transitive closure of the given ones. (Which in this case are then a minimal cover.)
A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X ->> Y, X is a superkey, that is, X is either a candidate key or a superset of one.
In your first decomposition (1, with R1 and R2), B->>D is not preserved, so the decomposition is not dependency preserving. It is also not in 4NF, because A is not a superkey of the second table.
The second decomposition (2, with R3 and R4), on the other hand, is dependency preserving and lossless-join, with B and ACE as the primary keys of the respective tables. It is still not in 4NF, because the dependency A->>B holds in the second table and A is not a superkey. You have to decompose the second table further into two tables, {A B} and {A C E}.
So if I follow your reasoning (suraj3), are R1(A,B) and R2(B,C,D,E) a correct decomposition? I think this will preserve the MVD B->>D.
Say I have a relation ABCD with FD's (A->D and AB -> ABCD)
Will a decomposed relation ABC be in BCNF? According to the second FD, AB form a key and is therefore in BCNF, but if you only look at the FD A -> D, is the relation no longer in BCNF then?
If you decompose a given relation schema (to which given dependencies apply), the next task is to determine, for each individual dependency in the original set :
(a) which (if any) of the new, decomposed, schemas does it apply to ?
(b) how has the decomposition affected the very definition of the FD ?
Question (a) applies to your original A->D dependency.
Question (b) applies, sort of, to your original AB->ABCD dependency. I say "sort of" because that version is quite "overstated". Given that A->D was already a given, it could just as well just say AB->C.
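Both points can be checked with attribute closures. The sketch below assumes single-letter attributes and FDs as (left, right) string pairs; `equivalent` and `project_fds` are hypothetical helpers, and the projection is a brute-force enumeration that is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def equivalent(f, g):
    """True if every FD of f follows from g and every FD of g follows from f."""
    def implied_by(fds, fd):
        return set(fd[1]) <= closure(fd[0], fds)
    return all(implied_by(g, fd) for fd in f) and all(implied_by(f, fd) for fd in g)


def project_fds(schema, fds):
    """Non-trivial FDs X -> (X+ within schema) for proper subsets X of schema."""
    schema = set(schema)
    found = []
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            gained = (closure(combo, fds) & schema) - set(combo)
            if gained:
                found.append(("".join(combo), "".join(sorted(gained))))
    return found


original = [("A", "D"), ("AB", "ABCD")]
shorter = [("A", "D"), ("AB", "C")]

print(equivalent(original, shorter))  # is AB->ABCD just an overstated AB->C?
print(project_fds("ABC", original))   # which FDs still apply to ABC?
```

Under these assumptions A->D has no counterpart in ABC because D is gone, and the only non-trivial FD that survives is AB->C, whose left side is a key of ABC.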
Is this statement correct?
"A prime attribute can be transitively dependent on a key in a BCNF relation"?
According to me it is wrong.
If it is wrong, then what is the normal form of the given relation
R(A,B,C,D) and its functional dependency set is { AB->C ,AB->D , CD->A ,CD->B ,AB->CD }
A BCNF relation can satisfy a transitive FD like A->B->C only if A and B are both superkeys or if either A->B or B->C is trivial.
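For the relation in the question, the superkey condition can be checked directly. This sketch relies on the standard result that, for the whole relation, it is enough to test the given FDs rather than every implied one; the helper name `bcnf_offenders` is invented, and attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_offenders(attrs, fds):
    """Given non-trivial FDs whose left side is not a superkey of the relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attrs)
    ]


F = [("AB", "C"), ("AB", "D"), ("CD", "A"), ("CD", "B"), ("AB", "CD")]
print(bcnf_offenders("ABCD", F))
```

An empty list means every left-hand side (here AB and CD) is a superkey, so under these assumptions the relation is in BCNF.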
Grammatical errors aside, the statement is strictly correct; it's just not very interesting or useful. Normally we are interested in whether a relation satisfies any non-superkey, non-trivial FDs, which are the ones that BCNF prohibits. I suggest you recheck the quotation to make sure you have it right.