Which non-trivial functional dependencies hold in the following table?
Can anyone please explain the rules step by step?
A B C
------------
a1 b2 c1
a2 b1 c6
a3 b2 c4
a1 b2 c5
a2 b1 c3
a1 b2 c7
I'll start with a disclaimer: my knowledge of functional dependencies is limited to what is explained in the Wikipedia article, and I currently have neither the need nor the inclination to study it further.
However, since OP asked for clarification, I'll attempt to clarify how I obtained the seemingly correct answer that I posted in the comments.
First off, this is Wikipedia's definition:
Given a relation R, a set of attributes X in R is said to
functionally determine another set of attributes Y, also in R,
(written X → Y) if, and only if, each X value is associated with
precisely one Y value; R is then said to satisfy the functional
dependency X → Y.
Additionally, Wikipedia states that:
A functional dependency FD: X → Y is called trivial if Y is a
subset of X.
Taking these definitions, I arrive at the following two non-trivial functional dependencies for the given relation:
A → B
C → {A, B}
Identifying these was a completely inductive process. Rather than applying a series of rules, formulas and calculations, I looked at the presented data and searched for those constraints that satisfy the above definitions.
In this case:
A → B
There are three possible values presented for A: a1, a2 and a3. Looking at the corresponding values for B, you'll find the following combinations: a1 → b2, a2 → b1, and a3 → b2. In other words, every value of A is associated with precisely one B value, which satisfies the definition.
C → {A, B}
The same reasoning goes for this dependency. In this case, identifying it is a little bit easier as the values for C are unique in this relation. In this sense, C could be considered as a key. In database terms, a candidate key is exactly that: a minimal set of attributes that uniquely identifies every tuple.
Undoubtedly, there's a way to mathematically derive the functional dependencies from the data, but for simple cases like this, the inductive process seems to work just fine.
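The "look at the data" process described above can also be done mechanically. The following is a minimal sketch, not a standard library routine: the names `ROWS`, `holds` and `nontrivial_fds` are made up for this illustration, and it only reports dependencies that hold in this particular set of rows.

```python
from itertools import combinations

# The table from the question, one dict per row.
ROWS = [
    {"A": "a1", "B": "b2", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c6"},
    {"A": "a3", "B": "b2", "C": "c4"},
    {"A": "a1", "B": "b2", "C": "c5"},
    {"A": "a2", "B": "b1", "C": "c3"},
    {"A": "a1", "B": "b2", "C": "c7"},
]

def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first time x is seen, remember its y; a different y later
        # means x is associated with more than one value, so the FD fails.
        if seen.setdefault(x, y) != y:
            return False
    return True

def nontrivial_fds(rows, attrs):
    """List every FD X -> A (single attribute A not in X) the rows satisfy."""
    found = []
    for size in range(1, len(attrs)):
        for lhs in combinations(attrs, size):
            for a in attrs:
                if a not in lhs and holds(rows, lhs, (a,)):
                    found.append((lhs, a))
    return found

for lhs, a in nontrivial_fds(ROWS, "ABC"):
    print(",".join(lhs), "->", a)
```

For this table the loop reports A -> B, C -> A, C -> B, A,C -> B and B,C -> A. Whether they hold for every possible instance of the relation cannot be decided from six rows.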
So, non-trivial functional dependencies in the above table are:
1. A->B
2. A,C->B
3. B,C->A
4. C->A,B
I'm revising for coming exams, and I am having trouble understanding an example question regarding the closure of attributes. Here is the problem:
AB→C
BE→I
E→C
CI→D
Find the closure of the set of attributes BE, explaining each step.
I've found many explanations of the method of closure when the given step is a single entity type, say 'C', using Armstrong axioms, but I don't understand how to answer for 'BE'.
First, you are confusing two very different things, attributes and entity types. Briefly, entity types are used to describe the real world entities that are modelled in a database schema. Attributes describe facts about such entities. For instance an entity type Person could have as attributes Family Name, Date of Birth, etc.
So the question is how to compute the closure of a set of attributes. You can apply Armstrong's axioms, trying at each step to apply one of them for as long as possible, but you can also simplify the computation by using the following very simple algorithm (if you google "algorithm closure set attributes" you will find many descriptions of it):
We want to find X+, the closure of the set of attributes X.
To find it, first assign X to X+.
Then repeat the following while X+ changes:
If there is a functional dependency W → V such that W ⊆ X+ and V ⊈ X+,
add the attributes of V to X+.
So in your case, given:
AB → C
BE → I
E → C
CI → D
to compute BE+ we can proceed in this way:
1. BE+ = BE
2. BE+ = BEI (because of BE → I)
3. BE+ = BEIC (because of E → C)
4. BE+ = BEICD (because of CI → D)
No other dependency can be used to modify BE+, so the algorithm terminates and the result is BCDEI. In terms of Armstrong's axioms, step 1 is due to Reflexivity, while steps 2 to 4 are due to a combination of Transitivity and Augmentation.
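The algorithm above is short enough to sketch in a few lines of Python. This is only an illustration under some assumptions: attributes are single characters, each dependency is a `(lhs, rhs)` pair of strings, and the names `closure` and `FDS` are chosen for this example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)          # step 1: X+ starts as X
    changed = True
    while changed:               # repeat while X+ changes
        changed = False
        for lhs, rhs in fds:
            # W -> V applies when W is inside X+ and V adds something new
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

FDS = [("AB", "C"), ("BE", "I"), ("E", "C"), ("CI", "D")]
print("".join(sorted(closure("BE", FDS))))  # BCDEI
```

Note that AB → C is never used, because A is not in BE+ at any point.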
Given these functional dependencies for
R: {A,B,C,D,E,F}
AC->EF
E->CD
C->ADEF
BDF->ACD
I got this as the canonical cover:
E->C
C->ADEF
BF->C
And then broke it down to Boyce Codd Normal Form:
Relation 1: {C,A,D,E,F}
Relation 2: {B,F,C}
I figured that this is lossless and dependency preserving. But is this true, given that the original functional dependency BDF->ACD is no longer in any of my relations? If I go by my calculated canonical cover, then all my functional dependencies are preserved.
So the question is: is this decomposition to BCNF dependency preserving?
A decomposition preserves the dependencies if and only if the union of the projection of the dependencies on the decomposed relations is a cover of the dependencies of the relation.
So, to know whether a decomposition preserves the dependencies, it is not sufficient to check whether the dependencies of one particular cover have been preserved (for instance by checking whether some decomposed relation contains all the attributes of each dependency). For instance, in a relation R(ABC) with a cover F = {A→B, B→C, C→A}, one could think that in the decomposition R1(AB) and R2(BC) the dependency C→A is not preserved. But if you project F on AB you obtain A→B and B→A, and projecting it on BC you obtain B→C and C→B, so from their union you can also derive C→A (from C→B and B→A).
The check is not simple, even though polynomial algorithms exist that perform this task (for instance, one is described in J. Ullman, Principles of Database Systems, Computer Science Press, 1983).
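As a rough illustration, here is a sketch of one commonly described polynomial-time test; it may differ in detail from the algorithm in the cited book. For each dependency X → Y it grows a set Z, starting from X, using only what each decomposed relation can "see", and then checks whether Y ended up in Z. Attributes are assumed to be single characters, and the names `closure` and `preserves` are made up for this example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves(fds, decomposition):
    """True if every FD in fds follows from the projections on the decomposition."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # What the part of z inside Ri determines, restricted to Ri.
                gained = closure(z & ri, fds) & ri
                if not gained <= z:
                    z |= gained
                    changed = True
        if not set(rhs) <= z:
            return False
    return True

F = [("A", "B"), ("B", "C"), ("C", "A")]
print(preserves(F, ["AB", "BC"]))  # True: C -> A is still derivable
```

The point of the design is that the projections themselves are never materialised, which is what keeps the test polynomial.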
Assuming the dependencies that you have given form a cover of the dependencies of the relation, the canonical cover that you have found is incorrect. In fact BF -> C cannot be derived from the original dependencies.
For this reason, your decomposition is not correct, since R2(BCF) is not in BCNF (actually, it is not in 2NF).
One possible canonical cover of R is the following:
BDF → C
C → A
C → E
C → F
E → C
E → D
Following the analysis algorithm, there are two possible decompositions in BCNF (according to the dependencies chosen for elimination). One is:
R1 = (ACDEF)
R2 = (BC)
while the other is:
R1 = (ACDEF)
R3 = (BE)
(note that BC and BE are candidate keys of the original relation, together with BDF).
A cover of the dependencies in R1 is:
C → A
C → E
C → F
E → C
E → D
while both in R2 and R3 no non-trivial dependencies hold.
From this, we can conclude that neither decomposition preserves the dependencies; for instance, the following dependency (and all those derived from it) cannot be obtained:
BDF → C
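The candidate keys mentioned above (BC, BE and BDF) can be double-checked by brute force. The sketch below assumes single-character attribute names and uses the canonical cover given in this answer; `closure`, `candidate_keys` and `COVER` are names chosen for this illustration, and the exhaustive search is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    all_attrs = set(attrs)
    keys = []
    # Trying smaller sets first guarantees that every key found is minimal.
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # superset of a known key, so not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys

COVER = [("BDF", "C"), ("C", "A"), ("C", "E"),
         ("C", "F"), ("E", "C"), ("E", "D")]
keys = candidate_keys("ABCDEF", COVER)
print(sorted("".join(sorted(k)) for k in keys))  # ['BC', 'BDF', 'BE']
```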
I have the following relation and I need to normalize it to 4NF.
Relation
First I've tried to find all the FD's and MVD's that hold.
AB ->> C (MVD)
C -> D (FD)
D -> E (FD)
ABC -> F (FD)
Next, using these dependencies I've managed to find the candidate key: ABC.
Let me know if what I've done so far is right. Also, is it ok to have a multivalued dependency in 4NF? Like AB ->> C and ABC -> F?
Thanks.
In general, dependencies describe important constraints on the data. For instance, a functional dependency X → A means that a certain value of X uniquely determines a certain value of A (that is, each time we find a certain value of X in a tuple, we always find the same value of A). Constraints of this kind cannot be inferred from a (few) rows of a table when the meaning of the data is unknown.
At best, we can infer a set of possible functional dependencies that hold in that particular instance of the table, hoping (but without any particular reason) that they will hold in every instance of the table. That is the only condition under which we can “normalize” the relation (rather than simply finding a non-redundant way of storing a particular instance of it).
In your case, since the table has very few rows, many functional dependencies appear to hold in it, including at least the following:
F → AB
E → AD
D → AE
C → ADE
B → A
EF → ABCD
DF → ABCE
CF → ABDE
CB → ADEF
(while ABC → F can be derived from CB → ADEF, and AB →→ C does not hold).
And if we applied a normalization algorithm to that instance (for example the synthesis algorithm for 3NF), we would decompose the relation into an excessive number of subschemas:
R1(AB), R2(BCF), R3(CD), R4(ADE), R5(CEF),
five relations for a table with six attributes!
I am reading about functional dependencies and normalization in a database management course, and I came across this example.
Given the relation R(A,B,C,D), which one is a lossy-join but dependency-preserving BCNF decomposition?
a. A ->B, B -> CD
b. A -> B, B -> C, C->D
c. AB -> C, C -> AD
d. A -> BCD
The answer given is option c.
How can option c be a lossy decomposition? The union of ABC and CAD is ABCD, which satisfies the first condition.
The intersection of ABC and CAD is AC, which is fine, since C is a key of (CAD) by C -> AD; this satisfies the second condition. Am I making a mistake in understanding this concept?
Usually for a Normalisation/decomposition exercise, you are given:
The full relation and its attributes. [yes: R(A, B, C, D)]
The Functional dependencies. [yes? it looks like a., b., c., d. are possible sets of Fun Deps.]
The proposed decomposition. [Often named R1, R2, etc. I don't see those. I can't interpret option d. to be proposing a decomposition.]
Perhaps your post has missed out part of the exercise? Perhaps the exercise wants you to decide which decomp preserves the dependencies in BCNF? (But results in a lossy join.)
[edited in response to Nikhil's comment] Note that the list of FD's alone doesn't amount to a decomposition: the FD C -> AD is short-hand for C -> A, C -> D. Does that mean two decomposing relations? No, because A and C are already in the FD AB -> C. So we have R1 = (A, B, C), R2 = (C, D). But I don't know if that is what the exercise is asking. Think about it. What does option d. mean in terms of decompositions?
Perhaps the exercise is asking (for example): given a proposed decomposition into R1 = (A, B) and R2 = (B, C, D), which of the sets of FD's would give a lossy decomposition?
There's a worked example here: http://en.wikipedia.org/wiki/Lossless-Join_Decomposition.
It points to a previous question, Lossless Join Property.
And there are further references.
By the way, options a., b., include the same Fun Deps as option d., by the transitivity of dependencies (Armstrong's Axioms http://en.wikipedia.org/wiki/Armstrong%27s_axioms see also http://en.wikipedia.org/wiki/Heath%27s_theorem). This is a clue.
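For a decomposition into exactly two relations, the condition the asker describes (the common attributes must functionally determine all of R1 or all of R2) can be checked mechanically. The sketch below is only an illustration: the decompositions are the ones guessed at in this thread, not ones stated by the exercise, attributes are assumed to be single characters, and `closure` and `lossless_binary` are names invented here.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    r1, r2 = set(r1), set(r2)
    common = closure(r1 & r2, fds)
    return r1 <= common or r2 <= common

OPTION_A = [("A", "B"), ("B", "CD")]
OPTION_C = [("AB", "C"), ("C", "AD")]

# The asker's split for option c: the common part AC determines ACD.
print(lossless_binary("ABC", "ACD", OPTION_C))  # True
# The hypothetical split R1 = (A, B), R2 = (B, C, D) from this answer:
print(lossless_binary("AB", "BCD", OPTION_A))   # True, since B -> CD
print(lossless_binary("AB", "BCD", OPTION_C))   # False, B determines nothing
```

So under these assumptions the asker's reasoning for the split ABC / ACD is sound; whether that split is the one the exercise intends is a separate question.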
I was going through the conditions for a minimum cover of a set of functional dependencies.
Here, it is mentioned that the right hand side can have only single attribute. So {A1A2 → B1B2} is not possible. It should be split as {A1A2 → B1, A1A2 → B2}.
But in the DBMS book by Korth, the following condition is given:
Each left side of a functional dependency in Fc is unique. That is, there are no
two dependencies A1 → B1 and A2 → B2 in Fc such that A1 = A2.
So, according to this {A1A2 → B1, A1A2 → B2} is not possible. The dependencies should be combined as {A1A2 → B1B2} to avoid repetition.
Please clarify which is correct.
This seems to me to be a difference in notation, and nothing more. These two sets of FDs are equivalent.
{A1A2 → B1, A1A2 → B2}
{A1A2 → B1B2}
Most of the automated tools I've used express the minimum cover as you see in the first set. Your text seems to prefer the second set.
The two different expressions have no effect on reducibility or coverage or closure, which are the real issues in computing a minimum cover. You could argue that the first version, which has no more than one non-prime attribute on the right-hand side, is better because it's closer to a decomposition in 6NF.
But you should use the version your text and your professor require, keeping in mind that it's a false requirement. It's false in the sense that changing the notation from the second version to the first has no effect on whether you've actually found the minimum cover, and it has no substantial effect on the work you need to do to compute the minimum cover.
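The equivalence claimed above can be checked with attribute closures: two sets of FDs are equivalent when each one implies every dependency of the other. The following is a small sketch under the assumption that each dependency is a pair of Python sets of attribute names; `closure`, `covers` and `equivalent` are names chosen for this example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def covers(f, g):
    """True if every dependency in g can be derived from f."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)

split = [({"A1", "A2"}, {"B1"}), ({"A1", "A2"}, {"B2"})]
combined = [({"A1", "A2"}, {"B1", "B2"})]
print(equivalent(split, combined))  # True
```

Splitting or combining right-hand sides never changes any closure, which is why the choice between the two forms is purely notational.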