Determining Candidate Keys from Functional Dependencies

Determining Candidate Keys from Functional Dependencies - database

If I Have R(E, F, G, H), what would be the candidate keys from these functional dependencies?
FD1: EF -> G
FD2: EF -> H
FD3: G -> E
FD4: H -> F
My thought process was that EF would be considered a candidate key, since EF -> G and EF -> H, therefore EF+ = {E, F, G, H}. Could I say the same in saying that GH is also a candidate key, since G -> E, H -> F, therefore GH -> EF and GH+ = {E, F, G, H}? Would there be any other candidate keys?

The schema has four candidate keys: EF, EH, FG, GH. You can easily verify this fact by computing the closure of each pair of attributes, and noting that it contains all the attributes.
The question is naturally how to find them. The trivial method is simply to try the closure of all the subsets of attributes of the relation, but this is obviously inefficient, being an exponential process.
There are more efficient algorithms to find all the candidate keys, but they are quite complex. There are simple heuristics that can help in reducing the complexity of the solution, without using a formal algorithm.
First, you should start from a canonical cover, otherwise these heuristics cannot be applied (in your example you have already a canonical cover). The first step is that you can exclude any attribute that appears only in the right hand sides of the dependencies (not in this case), and consider that all the attributes appearing only in left hand sides must be always part of any key (also not in this case).
Then, you can start from the left hand sides of the dependencies, and compute their closures to see if those sets of attributes can determine all the others. If this is not the case, you can add the other attributes, one at time, and again compute the closure of the resulting set, stopping considering those attributes when you have found a key or the set includes a subset already considered.
For instance, from EF you have found that you can determine all the other attributes, so this is a candidate key. Then, considering G, you can add E, noting that EG+ = EG, so this is not a candidate key, then add H, noting that GH+ = EFGH, so this is a candidate key, and finally add F, finding that FG is a candidate key. Of course, when a set of attributes is a candidate key you do not add to it other attributes. Another set of tests starts with H, first HE (which produces a candidate key), then HF, which do not produce a candidate key. At this point we should check if adding an attribute to EG or to HF we obtain a candidate key, but we can safely stop here since we will obtain just a superset of a set already considered (like EGF, for instance, that contains GF).

Related

Candidate keys after canonical cover

I have a set of functional dependencies:
V = {ABCDEF} F = {AB → CD,ABDE → F,BC → A,C → DF}
Candidate keys are: {ABE, BCE}
Canonical cover is: {AB→ C, BC→ A, C→ DF} [This is what I think, might be wrong]
However, as you can see an attribute of candidate key, E, is not in my canonical cover and as far as I know candidate keys should be same in the canonical cover.
If you consider Augmentation rule from Armstrong calculus we can say it is correct but I am confused. Does attribute E have to be represented in the canonical cover?

You say:
as far as I know candidate keys should be same in the canonical cover
This is not true. On the contrary, if an attribute does not belong to any right part of the functional dependencies of a canonical cover, it must be present in any candidate key (this is because it cannot be derived from any other subset of attributes, so, since a candidate key must determines all the attributes, it should be present in any key). Your canonical cover and candidate keys are correct.
Note that if an attribute does not belong to any functional dependency (both in the left and right part), as E in your example, this is a special case of above (it does not belong the a right part side), and must be present in any candidate key.
Finally, note that this can be considered a “symptom” of something wrong in the relation and in fact the schema is not in 3NF or BCNF.

Well, when I try to do Bernnstein synthesis from this relation (ABCDEF) I have to use basis: {AB→C,BC→A,C→DF} I need to add candidate keys since no candidate key exist when we form a relation from functional dependencies : R1(ABC) and R2(CDF) and I was wondering if we need to add E here since our basis doesn't contain E and we consider basis when we do synthesis. That's why I was little confused. But, I think we need to add E since we are doing a synthesis from original R(ABCDEF) so it should be R1(ABC), R(CDF) and R3( ABCE). R3 contains all candidate keys.

How can I find candidate keys?

Example:
Let R = (A, B, C, D)
Let F = {C -> AD, AB -> C}
Then how can I find the candidate keys?
The answer is {AB, BC}
Why?

Given a relation schema R with a set of attributes T and a non-empty set of non-trivial functional dependencies F describing a certain set of constraints that are assumed to hold in that schema:
Every attribute that does not appear in the right part of a FD in F must be present in any candidate key.
Every attribute that does not appear in the left part of a FD in F cannot be present in any candidate key.
To find all the candidate keys, for all the other attributes, you should try to add to the attributes of 1 above every possible combination of them, and see if the closure determines all the attributes of the relation (and such that you cannot remove any attribute from the combination without losing this property).
Note that, if the set F is empty, the only candidate key is constituted by all the attributes T.
In practice there are algorithms that can be relatively efficient (since the problem of finding all the keys is in the general case exponential).
A simple approach is to start from a canonical cover of the functional dependencies, in this case for instance from:
{ A B → C
C → A
C → D }
and after finding the attributes that must be present in any candidate key (in this case B), try to add to them the left hand side of the dependencies (in this case both AB, that is A, and C) (in any order, and possibly combining them) and compute the closure to see if they determine all the attributes. When you discover that some set of attributes determines all the relation attributes, you have found a candidate key (and it is not necessary to add other attributes to it). In your example:
(A B)+ = A B C D
(B C)+ = A B C D
So A B and B C are candidate keys (since you cannot remove any attribute to both of them without losing the property of determining all the other attributes). And since there are no other attributes (a part from D that cannot be present in a candidate key), you know that you have found all the candidate keys.

Understanding candidate key

Consider R(A,B,C,D,E)
F = {BC->AE, A->D, D->C, ABD->E}.
I need to find all candidate key of the schema.
I know that BA,BC,BD are the keys, but i want to know how do discover them.
I saw some answers in candidate keys from functional dependencies = but i didn't fully understand them.
form what they suggest, I got L={B}, M={A,C,D}, R={E}
Now i need to add from M one at a time to L.
I start with A, i get BA. So BA->A, BA->B (trivial) and because A->D so BA->D and because D->C we get BA->C.
But, how we get E?

adapting the answer from https://stackoverflow.com/a/14595217/3591273
Since we have the functional dependencies: BC->AE, A->D, D->C, ABD->E, we have the following superkeys:
ABCDE (All attributes is always a super key)
ABCD (We can get attribute E through ABD -> E)
ABC (Just add D through A -> D)
ABD (Just add C through D -> C)
AB (We can get D through A -> D, and then we can get C through D -> C)
BC (We can get E through BC -> E, and then we can get C through D -> C)
BD (We can get C through D -> C, and then we can get AE through BC -> AE)
(One trick here to realize, is that since B never appears on the right side of a functional dependency, every key must include B, ie key B is independent and cannot be derived from other keys)
Now that we have all our super keys, we can see that only the last
three are candidate keys. Since the first four can all be trimmed
down. But we cannot take any attributes away from the last three
superkeys and still have them remain a superkey.
so the minimal keys are AB, BC, BD
update
this was a reduction approach, i.e succesively reduce the trivial superkey by use of functional dependencies, but one can take the opposite road and use an augment approach, i.e start with single trivial keys and augment them with other keys wrt dependency relations untill keys become superflous

Database candidate keys from functional dependencies - specific technicality

I am working with a relational database's set of attributes and set of functional dependencies and have a specific question about which keys would be considered candidate keys of this schema.
The set of attributes I am working with is:
R = (A, B, C, D, E, F, G, H)
And the set of functional dependencies are:
F = { AC -> B, AB -> C, AD -> E, C -> D, BC -> A, E -> G, ABE -> D, FG -> E}
So here's what I am trying to figure out: Would this set of attributes have any candidate keys since H is not determined/mentioned at all in the set of functional dependencies?
By definition, candidate keys determine everything else, correct? If H is not determined by anything but itself, would there still be any candidate keys in this set?
Any insight is appreciated. Thanks!

Recall (Wikipedia) that
In the relational model of databases, a candidate key of a relation is
a minimal superkey for that relation; that is, a set of attributes
such that the relation does not have two distinct tuples (i.e. rows or
records in common database language) with the same values for these
attributes (which means that the set of attributes is a superkey)
there is no proper subset of these attributes for which (1) holds
(which means that the set is minimal).
Hence,
So here's what I am trying to figure out: Would this set of attributes have any candidate keys since H is not determined/mentioned at all in the set of functional dependencies?
This simply means that H will be contained in every candidate key R might have. For instance, ACFH is a candidate key. You can infer B because of AC->B, D because of C->D, E because of AD->E, and G because of E->G. On the other hand, you cannot infer F from ACH, H from ACF, C from AFH and A from CFH.

Question about relation normalization

Let's consider, for instance, the following relation:
R (A,B,C,D,E,F)
where the bold denotes that it is a primary key attribute
with
F = {AB->DE, D->E}
Now, this looks to be in the first normal form. It can't be on the third normal form as I have a transitive dependency and it cannot be in the second form as not all non-key attributes depend on the whole primary key.
So my questions are:
I don't know what to make of F and C. I don't have any functional dependency info on them! F doesn't depend on anything? If that is the case, I can't think of any solution to get R into the 2nd normal form without taking it out!
What about C? C also suffers from the problem of not being referred on the functional dependencies list. What to do about it?
My attempt to get R into the 2nd normal form would be something like:
R(A,B,D)
R' (D,E)
but as stated earlier, I don't have a clue of what to do of C and F. Are they redundant so I simply take them out and the above attempt is all I have to do to get it into the 2nd form (and 3rd!)?
Thanks

Given the definition of R that { A, B, C } is the primary key, then there is inherently a functional dependency:
ABC → ABCDEF
That says that the values of A, B and C inherently determine or control the values of D, E and F as well as the trivial fact that they determine their own values.
You have a few additional dependencies, identified by the set F (which is distinct from the attribute F - the notation is not very felicitous, and could be causing confusion*):
AB → DE
D → E
As you rightly diagnose, the system is in 1NF (because 1NF really means "it is a table"). It is not in 2NF or 3NF or BCNF etc because of the transitive dependency and because some of the attributes only depend on part of the key.
You are right that you will end up with the following two relations as part of your decomposition:
R1(D, E)
R2(A, B, D)
You also need the third relation:
R3(A, B, C, F)
From these, you can recreate the original relation R using joins. The set of relations { R1, R2, R3 } is a non-loss decomposition of the original relation R.
* If the F identifying the set of subsidiary functional dependencies is intended to be the same as the attribute F, then there is something very weird about the definition of that attribute. I'd need to see sample data for the relation R to have a chance of knowing how to interpret it.

I think the primary key of R is set wrong. If F isn't functionally related to anything it has to be a part of the key
So you have R( ABCF DE) which is now in the first normal form (with F = {AB->DE, D->E}) Now you can change it to the second normal form. DE isn't dependant on the whole key (partial dependency) so you put it in another relation to get to second normal form:
R( ABCF ) F = {}
R1( #AB DE) F = {AB->DE}
Now this relation doesn't have any transitive dependencies so it is already in third normal form.

F doesn't depend on anything?
No, you just haven't been given any explicit information about it in the form
{something -> F}
And essentially the same can be said for C. You're expected to infer the other dependencies by applying Armstrong's axioms. (Probably.)
Think about how to finish this:
Given R (A,B,C,D,E,F)
{ABC -> ?}
[Later . . . I see that Jonathan Leffler has broken the suspense, so I'll just finish this.]
{ABC -> DEF} (By definition) therefore,
{ABC -> F} (By decomposition. Here's where F and C come in. And this is your third relation. ).

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight