Reflexivity in functional dependencies - database

I'm taking a class on databases and I'm doing an assignment on functional dependencies. As an example of taking given dependencies and deriving other non-trivial dependencies using Armstrong's Axioms, the TA wrote this and I can't wrap my head around it.
Considering the relation R(c,p,h,s,e,n) and F the set of functional dependencies {1. c->p, 2. hs->c, 3. hp->s, 4. ce->n, 5. he->s}:
Iteration 1:
From F, we can build F1
6. hs->p (transitivity: 1+2)
7. hc->s (pseudo-transitivity: 1+3)
8. hp->c
1. hp->hs (reflexivity 3)
2. hp->c (transitivity: 8.1+2)
9. he->c
1. he->hs (reflexivity: 4)
2. he->c (transitivity: 9.1+2)
I understand most of it fine except the cases where 'reflexivity' is used (using quotes because that's pretty far from the definition of reflexivity in my textbook). Can anyone tell me how that's reflexivity? Also, how do I know when an iteration is over? Couldn't you find an infinity of ways to rewrite functional dependencies?

These are the classical Armstrong’s Axioms (see for instance Wikipedia):
Reflexivity: If Y ⊆ X then X → Y
Augmentation: If X → Y then XZ → YZ for any Z
Transitivity: If X → Y and Y → Z, then X → Z
So in your example, to derive hp → c you can proceed in the following way:
1. hp → s (given)
2. hp → hs (by augmentation of 1 adding h)
3. hs → c (given)
4. hp → c (by transitivity of 2 + 3)
Note that to produce hp → hs from hp → s the axiom to use is Augmentation, in which the role of Z is taken by h, and not Reflexivity; the same axiom is needed to derive he → c. By Reflexivity you can only derive trivial dependencies such as hp → hp, hp → p, or hp → h.
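As a sanity check (not part of the TA's exercise), both hp → c and he → c can be confirmed mechanically by computing the attribute closure of each left-hand side under F; here is a short Python sketch, with all function and variable names my own:

```python
# Attribute closure: repeatedly apply FDs whose left side is already covered.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

# F from the question: c->p, hs->c, hp->s, ce->n, he->s
F = [({'c'}, {'p'}), ({'h', 's'}, {'c'}), ({'h', 'p'}, {'s'}),
     ({'c', 'e'}, {'n'}), ({'h', 'e'}, {'s'})]

print('c' in closure({'h', 'p'}, F))  # True: hp -> c holds
print('c' in closure({'h', 'e'}, F))  # True: he -> c holds
```

X → Y follows from F exactly when Y ⊆ X+, so the closure test replaces a manual axiom-by-axiom derivation.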
You are also asking:
How do I know when an iteration is over? Couldn't you find an infinity of ways to rewrite functional dependencies?
The Armstrong’s Axioms can be applied to a set of functional dependencies only a finite number of times before no new functional dependency is produced. This is simple to show, since the number of attributes is finite and, given n attributes, you can have at most 2^n * 2^n different functional dependencies (you can have any subset of the attributes on both the left and the right part, of course including the trivial dependencies).
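The bound can also be checked by brute force; a small Python sketch (the attribute names are those of the relation R from the question):

```python
from itertools import chain, combinations

# Every FD pairs one subset of the attributes (left side) with another
# (right side), so with n attributes there are at most 2^n * 2^n FDs.
attrs = ['c', 'p', 'h', 's', 'e', 'n']

def all_subsets(xs):
    return list(chain.from_iterable(combinations(xs, k)
                                    for k in range(len(xs) + 1)))

n_fds = len(all_subsets(attrs)) ** 2
print(n_fds)  # 4096 = 2^6 * 2^6
```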

A name doesn't tell you anything except what somebody decided to call something.
The trivial FD X -> X holds in any relation with the attributes in X, i.e. a set of attributes in a relation functionally determines itself. That's reasonably called reflexive. It happens that a set also functionally determines every subset of itself, and "reflexivity" was chosen as the name for that more general rule; the more general rule in turn was chosen as one of a set of sufficient but non-redundant rules.
Armstrong's axioms have been shown to be sound & complete. Sound means they only generate implied FDs. Complete means that if you keep applying the axioms until you don't get any new FDs from any of them, then you get all the FDs that can be derived from the original set, i.e. all those that must also hold whenever the original ones hold. Any textbook tells you that you can generate such a closure of a set of FDs by doing just that.
There are also sound & complete axiom sets for FDs + MVDs. But there aren't for FDs + JDs.

Related

How to deal with combined entity types when computing the closure of a set of attributes

I'm revising for coming exams, and I am having trouble understanding an example question regarding the closure of attributes. Here is the problem:
AB→C
BE→I
E→C
CI→D
Find the closure of the set of attributes BE, explaining each step.
I've found many explanations of the method of closure when the given set is a single entity type, say 'C', using Armstrong axioms, but I don't understand how to answer for 'BE'.
First, you are confusing two very different things, attributes and entity types. Briefly, entity types are used to describe the real world entities that are modelled in a database schema. Attributes describe facts about such entities. For instance an entity type Person could have as attributes Family Name, Date of Birth, etc.
So the question is how to compute the closure of a set of attributes. You can apply the Armstrong’s axioms, trying at each step to apply one of them, for as long as possible, but you can also simplify the computation by using the following, very simple, algorithm (and if you google "algorithm closure set attributes" you find a lot of descriptions of it):
We want to find X+, the closure of the set of attributes X.
To find it, first assign X to X+.
Then repeat the following while X+ changes:
If there is a functional dependency W → V such that W ⊆ X+ and V ⊈ X+,
add V to X+.
So in your case, given:
AB → C
BE → I
E → C
CI → D
to compute BE+ we can proceed in this way:
1. BE+ = BE
2. BE+ = BEI (because of BE → I)
3. BE+ = BEIC (because of E → C)
4. BE+ = BEICD (because of CI → D)
No other dependency can be used to modify BE+, so the algorithm terminates and the result is BCDEI. In terms of Armstrong’s axioms, step 1 is due to Reflexivity, while steps 2 to 4 are due to a combination of Transitivity and Augmentation.
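The algorithm above is straightforward to code; here is a minimal Python sketch (function and variable names are my own) reproducing the BE+ computation:

```python
# Closure of a set of attributes under FDs given as (left, right) pairs:
# keep absorbing right sides whose left side already lies inside X+.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

fds = [({'A', 'B'}, {'C'}), ({'B', 'E'}, {'I'}),
       ({'E'}, {'C'}), ({'C', 'I'}, {'D'})]

print(''.join(sorted(closure({'B', 'E'}, fds))))  # BCDEI
```

The loop terminates because X+ only ever grows and is bounded by the full attribute set.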

Dependency preservation, based of original functional dependencies or canonical cover?

Given these functional dependencies for
R: {A,B,C,D,E,F}
AC->EF
E->CD
C->ADEF
BDF->ACD
I got this as the canonical cover:
E->C
C->ADEF
BF->C
And then broke it down to Boyce Codd Normal Form:
Relation 1: {C,A,D,E,F}
Relation 2: {B,F,C}
I figured that this is lossless and dependency preserving? But is this true, since from the original functional dependencies BDF->ACD is no longer in any of my relations. But if I go from my calculated canonical cover then all my functional dependencies are preserved.
So that question is: Is this decomposition to BCNF dependency preserving?
A decomposition preserves the dependencies if and only if the union of the projection of the dependencies on the decomposed relations is a cover of the dependencies of the relation.
So, to know if a decomposition preserves or not the dependencies it is not sufficient to check if the dependencies of a particular cover have been preserved or not (for instance by looking if some decomposed relation has all the attributes of the dependency). For instance, in a relation R(ABC) with a cover F = {A→B, B→C, C→A} one could think that in the decomposition R1(AB) and R2(BC) the dependency C→A is not preserved. But if you project F on AB you obtain A→B, B→A, projecting it on BC you obtain B→C, C→B, so from their union you can derive also C→A.
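The R(ABC) example can be verified mechanically: project F on each decomposed schema by computing attribute closures, take the union of the projections, and check that C → A still follows. A Python sketch of that check (all names mine):

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def project(fds, schema):
    # Projection of the FDs on schema: for each non-empty X subset of
    # schema, keep the FD X -> (X+ intersected with schema).
    proj = []
    for k in range(1, len(schema) + 1):
        for x in combinations(sorted(schema), k):
            proj.append((set(x), closure(set(x), fds) & set(schema)))
    return proj

F = [({'A'}, {'B'}), ({'B'}, {'C'}), ({'C'}, {'A'})]
union = project(F, {'A', 'B'}) + project(F, {'B', 'C'})
print('A' in closure({'C'}, union))  # True: C -> A is preserved
```

Note that projecting onto AB yields B → A (because B+ = ABC under F), which is exactly the dependency that lets C → A be recovered from the union.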
The check is not simple, even though polynomial algorithms exist that perform this task (for instance, one is described in J. Ullman, Principles of Database Systems, Computer Science Press, 1983).
Assuming the dependencies that you have given form a cover of the dependencies of the relation, the canonical cover that you have found is incorrect. In fact BF -> C cannot be derived from the original dependencies.
For this reason, your decomposition is not correct, since R2(BCF) is not in BCNF (actually, it is not in 2NF).
One possible canonical cover of R is the following:
BDF → C
C → A
C → E
C → F
E → C
E → D
Following the analysis algorithm, there are two possible decompositions in BCNF (according to the dependencies chosen for elimination). One is:
R1 = (ACDEF)
R2 = (BC)
while the other is:
R1 = (ACDEF)
R3 = (BE)
(note that BC and BE are candidate keys of the original relation, together with BDF).
A cover of the dependencies in R1 is:
C → A
C → E
C → F
E → C
E → D
while both in R2 and R3 no non-trivial dependencies hold.
From this, we can conclude that both decompositions do not preserve the dependencies; for instance the following dependency (and all those derived from it) cannot be obtained:
BDF → C

Database normalization - 4NF

I have the following relation and I need to normalize it to 4NF.
(relation instance shown as an image in the original question)
First I've tried to find all the FD's and MVD's that hold.
AB ->> C (MVD)
C -> D (FD)
D -> E (FD)
ABC -> F (FD)
Next, using these dependencies I've managed to find the candidate key: ABC.
Let me know if what I've done so far is right. Also, is it ok to have a multivalued dependency in 4NF? Like AB ->> C and ABC -> F?
Thanks.
In general dependencies describe important constraints on the data; for instance a functional dependency X → A means that a certain value of X determines uniquely a certain value of A (that is, each time we find a certain value of X in a tuple, we always find the same value of A). Such constraints cannot be inferred from a few rows of a table, in which the meaning of the data is unknown.
At best, we can infer a set of possible functional dependencies holding in that particular instance of the table, hoping (but without any particular reason) that those functional dependencies will hold on every instance of the table, which is the only condition under which we can “normalize” the relation (instead of simply finding a non-redundant way of storing a particular instance of that table).
In your case, for instance, since the table has very few rows, many functional dependencies could be seen as holding in it, for instance at least the following:
F → AB
E → AD
D → AE
C → ADE
B → A
EF → ABCD
DF → ABCE
CF → ABDE
CB → ADEF
(while ABC → F can be derived from CB → ADEF, and AB →→ C does not hold).
And if we applied a normalization algorithm to that instance (for instance the synthesis algorithm for 3NF), we would decompose the relation into an excessive number of subschemas:
R1(AB), R2(BCF), R3(CD), R4(ADE), R5(CEF),
five relations for a table with six attributes!

Deriving functional dependencies of minimal cover

Given the set of dependencies AB->C, BD->EF, AD->GH, A->I, H->J.
How would you find a minimal cover? By applying a process described in a book I get: AB->C, A->I, BD->EF, AD->GH, H->J instead of AB->CI, BD->EF, AD->GHIJ. Is it possible to combine AB->C and A->I into AB->CI and get rid of A->I?
The functional dependencies of a minimal cover of a set of dependencies F must satisfy four conditions:
They must be a cover of F (of course!)
Each right part has only one attribute
Each left part must not have extraneous attributes (that is attributes such that original dependency can be derived even if we remove them)
No dependency of the cover is redundant (i.e. can be derived from the remaining dependencies).
So this means that a minimal cover of the example is the following (note that there can be more than one minimal cover, that is, more than one set satisfying the above conditions):
{ AB → C
AD → G
AD → H
A → I
BD → E
BD → F
H → J }
Of course to this set you can apply the Armstrong’s axioms to derive many other dependencies (for instance AD → GH, AB → CI, AD → GHIJ, ABD → EJ, etc.) but these are not part of any minimal cover of F (that is, they do not satisfy the above definition).
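One can check mechanically that the minimal cover above and the original set are equivalent covers: every FD X → Y of one set must satisfy Y ⊆ X+ computed under the other set. A Python sketch of that check (names my own):

```python
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

original = [({'A', 'B'}, {'C'}), ({'B', 'D'}, {'E', 'F'}),
            ({'A', 'D'}, {'G', 'H'}), ({'A'}, {'I'}), ({'H'}, {'J'})]
minimal = [({'A', 'B'}, {'C'}), ({'A', 'D'}, {'G'}), ({'A', 'D'}, {'H'}),
           ({'A'}, {'I'}), ({'B', 'D'}, {'E'}), ({'B', 'D'}, {'F'}),
           ({'H'}, {'J'})]

# X -> Y holds under a set of FDs iff Y is a subset of X+.
assert all(r <= closure(l, minimal) for l, r in original)
assert all(r <= closure(l, original) for l, r in minimal)
print("the two sets are equivalent covers")
```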

algorithm for computing closure of a set of FDs

I'm looking for an easy to understand algorithm to compute (by hand) a closure of a set of functional dependencies.
Some sources, including my instructor, say I should just play with Armstrong's axioms and see what I can get. To me, that's not a systematic way of doing it (i.e. it's easy to miss something).
Our course textbook (DB systems - the complete book, 2nd ed) doesn't give an algorithm for this either.
To put it in a more "systematic" fashion, this could be the algorithm you are looking for, using the first three Armstrong's Axioms:
Closure = S
Loop
For each F in S, apply the reflexivity and augmentation rules
Add the new FDs to Closure
For each pair of FDs in S, apply the transitivity rule
Add the new FDs to Closure
Until Closure doesn't change any further
Which I took from this presentation notes. However, I found the following approach way easier to implement:
/*F is a set of FDs */
F⁺ = empty set
for each attribute set X
compute the closure X⁺ of X on F
for each attribute A in X⁺
add to F⁺ the FD: X -> A
return F⁺
which is taken from here
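A Python sketch of this second approach, using a plain attribute-closure routine (names mine; the toy FDs A → B and B → C are mine as well, not from either question):

```python
from itertools import combinations

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

def fd_closure(attributes, fds):
    # F+: for every non-empty attribute set X, emit X -> A for each A in X+.
    fplus = []
    for k in range(1, len(attributes) + 1):
        for x in combinations(sorted(attributes), k):
            for a in sorted(closure(set(x), fds)):
                fplus.append((set(x), {a}))
    return fplus

fds = [({'A'}, {'B'}), ({'B'}, {'C'})]
fplus = fd_closure({'A', 'B', 'C'}, fds)
print(len(fplus))  # 17 single-attribute-right FDs, trivial ones included
```

The outer enumeration over all subsets is exponential, which is unavoidable: F+ itself can be exponentially large.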
The closure α+ is the set of all attributes functionally determined by α under a set F of FDs.
E.g., suppose we have these starting FDs and we want to compute all the closures using the algorithm presented below:
A -> BC
AC -> D
D -> B
AB -> D
Closures calculated with the algorithm presented below:
A+ = (A, B, C, D)
B+ = (B)
Similarly we can calculate C+, D+, (AC)+, (AB)+ and so on...
There is a very easy algorithm to calculate the closure of a set of attributes α, though:
RESULT <- α
MODIFIED <- TRUE
WHILE(MODIFIED) {
MODIFIED <- FALSE
∀ FD A->B IN F DO{
IF A ⊆ RESULT {
RESULT <- (RESULT) ∪ (B)
MODIFIED <- TRUE IF RESULT CHANGES
}
}
}
If by "play" he meant an exhaustive search, then in no way is this non-systematic ;) A simple solution could look like an iterative expansion*) of the set of dependencies on a rule-by-rule basis: just a queue of items to be revisited and a few (two?) loops. Have you tried it?
Btw. googling around I immediately found http://www.cs.sfu.ca/CourseCentral/354/zaiane/material/notes/Chapter6/node12.html - but I cannot verify whether it is reasonable because the battery in my laptop is literally going down!
*) Simply: apply the rules iteratively as long as anything changed in the previous iteration. When applying ALL of them does not change anything in the current state (e.g. (A -> ABCDEF) += (ADEF) ==> (A -> ABCDEF), so no new symbol was added to the set), then no further expansion can grow it, so that's the point at which to assume the no-longer-growing set is complete.
