Understanding candidate key


Consider R(A,B,C,D,E)
F = {BC->AE, A->D, D->C, ABD->E}.
I need to find all candidate keys of the schema.
I know that BA, BC and BD are the keys, but I want to know how to discover them.
I saw some answers in "candidate keys from functional dependencies", but I didn't fully understand them.
From what they suggest, I got L={B}, M={A,C,D}, R={E} (attributes that appear only on left sides, on both sides, and only on right sides, respectively).
Now I need to add attributes from M to L, one at a time.
I start with A and get BA. So BA->A and BA->B (trivial); because A->D, BA->D; and because D->C, BA->C.
But how do we get E?

Adapting the answer from https://stackoverflow.com/a/14595217/3591273:
Since we have the functional dependencies BC->AE, A->D, D->C, ABD->E, we have the following superkeys:
ABCDE (the set of all attributes is always a superkey)
ABCD (we can get E through ABD -> E)
ABC (just add D through A -> D)
ABD (just add C through D -> C)
AB (we can get D through A -> D, then C through D -> C, then E through ABD -> E)
BC (we can get A and E through BC -> AE, then D through A -> D)
BD (we can get C through D -> C, then A and E through BC -> AE)
(One trick here is to realize that, since B never appears on the right side of a functional dependency, every key must include B; i.e. attribute B is independent and cannot be derived from other attributes.)
Looking at these superkeys, we can see that only the last three are candidate keys. The first four can all be trimmed down, but we cannot take any attribute away from the last three and still have them remain superkeys.
So the minimal keys are AB, BC and BD.
Update:
This was a reduction approach, i.e. successively reduce the trivial superkey by use of the functional dependencies. One can also take the opposite road and use an augmentation approach, i.e. start with single attributes and augment them with other attributes according to the dependencies, stopping once any further attribute would be superfluous.
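
Both approaches rest on computing attribute closures. A minimal Python sketch of that computation follows; it is an illustration rather than part of the linked answer, and the function name `closure` and the encoding of each dependency as a pair of strings are assumptions made for the example.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # A dependency fires once its whole left side is in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# F = {BC->AE, A->D, D->C, ABD->E}, each FD written as (left side, right side)
FDS = [("BC", "AE"), ("A", "D"), ("D", "C"), ("ABD", "E")]

for candidate in ("AB", "BC", "BD", "B"):
    print(candidate, "->", "".join(sorted(closure(candidate, FDS))))
```

For AB, the attribute E is reached through ABD -> E (or BC -> AE) once D and C have been derived, which is the step the question was stuck on. B alone determines nothing beyond itself.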

Related

Determining Candidate Keys from Functional Dependencies

If I have R(E, F, G, H), what would be the candidate keys from these functional dependencies?
FD1: EF -> G
FD2: EF -> H
FD3: G -> E
FD4: H -> F
My thought process was that EF would be considered a candidate key, since EF -> G and EF -> H, therefore EF+ = {E, F, G, H}. Could I say the same in saying that GH is also a candidate key, since G -> E, H -> F, therefore GH -> EF and GH+ = {E, F, G, H}? Would there be any other candidate keys?

The schema has four candidate keys: EF, EH, FG, GH. You can easily verify this fact by computing the closure of each pair of attributes, and noting that it contains all the attributes.
The question is naturally how to find them. The trivial method is simply to try the closure of all the subsets of attributes of the relation, but this is obviously inefficient, being an exponential process.
There are more efficient algorithms to find all the candidate keys, but they are quite complex. There are simple heuristics that can help in reducing the complexity of the solution, without using a formal algorithm.
First, you should start from a canonical cover, otherwise these heuristics cannot be applied (in your example you already have a canonical cover). As a first step, you can exclude any attribute that appears only in right hand sides of the dependencies (there is none in this case), and note that every attribute that never appears in a right hand side, for example one appearing only in left hand sides, must be part of every key (again, there is none in this case).
Then, you can start from the left hand sides of the dependencies, and compute their closures to see if those sets of attributes determine all the others. If this is not the case, you can add the other attributes, one at a time, and again compute the closure of the resulting set. You stop extending a set once it is a key or once it contains a key that has already been found.
For instance, from EF you have found that you can determine all the other attributes, so this is a candidate key. Then, considering G, you can add E, noting that EG+ = EG, so this is not a candidate key; then add H, noting that GH+ = EFGH, so this is a candidate key; and finally add F, finding that FG is a candidate key. Of course, when a set of attributes is a candidate key you do not add other attributes to it. Another set of tests starts with H: first HE (which produces a candidate key), then HF, which does not produce a candidate key. At this point we should check whether adding an attribute to EG or to HF gives a candidate key, but we can safely stop here since we would only obtain a superset of a key already found (like EGF, for instance, which contains GF).
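
The "trivial method" mentioned above (try the closure of every subset) is easy to write down for a relation this small. The sketch below is one possible way to do it, not the answer's own algorithm: it walks the subsets in order of size and skips any set that contains a key already found, so only minimal keys are reported. The names `closure` and `candidate_keys` are made up for the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Exhaustive search, smallest subsets first (exponential in general)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it cannot be minimal
            if closure(subset, fds) == set(attributes):
                keys.append(subset)
    return keys


# FD1..FD4 from the question, each written as (left side, right side)
FDS = [("EF", "G"), ("EF", "H"), ("G", "E"), ("H", "F")]
print(["".join(sorted(k)) for k in candidate_keys("EFGH", FDS)])
```

For the four attributes here this tests at most 15 subsets; the heuristics described in the answer matter once the relation has many more attributes.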

How can I find candidate keys?

Example:
Let R = (A, B, C, D)
Let F = {C -> AD, AB -> C}
Then how can I find the candidate keys?
The answer is {AB, BC}
Why?

Given a relation schema R with a set of attributes T and a non-empty set of non-trivial functional dependencies F describing a certain set of constraints that are assumed to hold in that schema:
1. Every attribute that does not appear in the right part of any FD in F must be present in every candidate key.
2. Every attribute that appears in the right part of some FD in F but in no left part cannot be present in any candidate key.
3. To find all the candidate keys, you should try adding to the attributes of 1 above every possible combination of the remaining attributes, and see if the closure determines all the attributes of the relation (and such that you cannot remove any attribute from the combination without losing this property).
Note that, if the set F is empty, the only candidate key is constituted by all the attributes T.
In practice there are algorithms that can be relatively efficient (although the problem of finding all the keys is, in the general case, exponential).
A simple approach is to start from a canonical cover of the functional dependencies, in this case for instance from:
{ A B → C
C → A
C → D }
and after finding the attributes that must be present in any candidate key (in this case B), try to add to them the left hand sides of the dependencies (in this case AB, which adds A, and C), in any order and possibly combining them, and compute the closure to see if they determine all the attributes. When you discover that some set of attributes determines all the relation attributes, you have found a candidate key (and it is not necessary to add other attributes to it). In your example:
(A B)+ = A B C D
(B C)+ = A B C D
So A B and B C are candidate keys (since you cannot remove any attribute from either of them without losing the property of determining all the other attributes). And since there are no other attributes to try (apart from D, which cannot be present in a candidate key), you know that you have found all the candidate keys.
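
The rules above can be sketched in Python as follows. This is an illustrative sketch under the stated assumptions (non-trivial dependencies, each written as a pair of strings); the helper names `classify` and `candidate_keys` are invented for the example and do not come from the answer.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def classify(attributes, fds):
    """Split attributes into: in every key, in no key, still undecided."""
    left = set().union(*(set(lhs) for lhs, _ in fds))
    right = set().union(*(set(rhs) for _, rhs in fds))
    must = set(attributes) - right   # never determined by anything
    never = right - left             # determined, but determine nothing
    maybe = set(attributes) - must - never
    return must, never, maybe


def candidate_keys(attributes, fds):
    """Extend the mandatory attributes with combinations of the undecided ones."""
    must, _, maybe = classify(attributes, fds)
    keys = []
    for size in range(len(maybe) + 1):
        for extra in combinations(sorted(maybe), size):
            subset = must | set(extra)
            if any(key <= subset for key in keys):
                continue  # contains a key already found
            if closure(subset, fds) == set(attributes):
                keys.append(subset)
    return keys


# Canonical cover from the answer: AB -> C, C -> A, C -> D
FDS = [("AB", "C"), ("C", "A"), ("C", "D")]
print(classify("ABCD", FDS))
print(["".join(sorted(k)) for k in candidate_keys("ABCD", FDS)])
```

Here B ends up in the mandatory group, D in the excluded group, and only A and C remain to be combined with B, which is why so few closures have to be computed.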

Candidate Keys on Functional Dependencies?

The relation R=(A,B,C,D,E) and functional dependencies F are given as follows:
F={A->BC, CD->E, B->D, E->A}
E, BC and CD can be candidate keys, but B cannot.
Could anyone show me how this is calculated? I googled it but couldn't understand more than I knew before.

You can find all the attributes that depend on a given set of attributes by computing the closure of that set under the functional dependencies. Let me demonstrate:
A -> ABC -> ABCD -> ABCDE
A determines BC (given) as well as itself (trivially) therefore A -> ABC. Add the fact that B -> D to get ABC -> ABCD. Finally, add CD -> E to get ABCD -> ABCDE. We stop here because we've determined the whole relation, therefore A is a candidate key.
You should verify that, starting from E, BC and CD, you can indeed determine the whole relation.
Starting from B, we get:
B -> BD
and that's it. The rest of the relation can't be determined from BD, so B is not a candidate key.
A more visual way of doing it is to sketch the functional dependencies as a graph, with an arrow for each dependency.
Starting from any set of attributes, try finding a path to every other attribute by following the arrows. You can only get to E if you start at E or have visited both C and D.
From B, you can reach D, but without C, you're not allowed to go to E, which also excludes A. So B can't be a candidate key.
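
The chains written above (A -> ABC -> ABCD -> ABCDE and B -> BD) can be reproduced mechanically. The following Python sketch records every intermediate attribute set while the closure grows; `closure_steps` is a made-up name, and the order of the intermediate sets depends on the order in which the dependencies are listed.

```python
def closure_steps(attrs, fds):
    """Return the successive attribute sets reached while building the closure."""
    current = set(attrs)
    steps = ["".join(sorted(current))]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Follow an "arrow" only when every attribute on its left is known.
            if set(lhs) <= current and not set(rhs) <= current:
                current |= set(rhs)
                steps.append("".join(sorted(current)))
                changed = True
    return steps


# F = {A->BC, CD->E, B->D, E->A}
FDS = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

print(" -> ".join(closure_steps("A", FDS)))
print(" -> ".join(closure_steps("B", FDS)))
```

Running the same function on E, BC and CD is a quick way to do the verification suggested above.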

Database Relational Homework help

The problem: "Consider a relation R with five attributes ABCDE. You are given the following dependencies:
1. A->B
2. BC->E
3. ED->A
List all the keys for R."
The teacher gave us the keys, which are ACD, BCD and CDE,
and we need to show the work to get to them.
The first two I solved.
For BCD, I used the transitivity of 2 with 3 to get (BC->E)D->A => BCD->A.
And for ACD, I used the transitivity of 1 with 4 (BCD), to get (A->B)CD->A => ACD->A.
But I can't figure out how to get CDE.
So it seems I did it wrong. After googling I found this answer:
Methodology to find keys:
consider attribute sets α containing: a. the determinant attributes of F (i.e. A, BC, ED) and b. the attributes NOT contained in the determined ones (i.e. C, D). Then do the attribute closure algorithm:
if α+ ⊇ R then α -> R
Three keys: CDE, ACD, BCD
Source
From what I can tell, since C and D are not on the right side of any of the dependencies, the keys are the left sides with CD prepended to them. Can anyone explain to me in more detail why this is so?

To get the keys, you start with one of the dependencies and use inference to extend the set.
Let me have a go in simple English; you can find the formal definitions on the net easily.
E.g. start with 3).
ED -> A
(knowing E and D, I know A)
ED -> AB
(knowing E and D, I know A; by knowing A, I know B as well)
Still, C cannot be known, and I have used all the rules now except BC->E.
So I add C to the left hand side, i.e.
CDE -> AB
So, by knowing C, D and E, you will know A and B as well.
Hence CDE determines every attribute of your relation ABCDE, and since that no longer holds if you drop C, D or E, CDE is a key. You repeat the same process, starting with the other rules, until they are exhausted.
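
To check the teacher's three keys, it is enough to test two things for each set: its closure covers all of ABCDE, and removing any single attribute breaks that. A small Python sketch of this check follows; the function names are assumptions made for the example.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_candidate_key(attrs, all_attrs, fds):
    """True if `attrs` determines everything and no attribute can be dropped."""
    attrs, everything = set(attrs), set(all_attrs)
    if closure(attrs, fds) != everything:
        return False  # not even a superkey
    # Dropping one attribute at a time suffices, because closures only
    # grow when attributes are added.
    return all(closure(attrs - {a}, fds) != everything for a in attrs)


# A->B, BC->E, ED->A
FDS = [("A", "B"), ("BC", "E"), ("ED", "A")]

for candidate in ("CDE", "ACD", "BCD", "CD", "ABCD"):
    print(candidate, is_candidate_key(candidate, "ABCDE", FDS))
```

CD on its own fails the first test (its closure is just CD), and ABCD fails the second one because BCD is already a key.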

Question about relation normalization

Let's consider, for instance, the following relation:
R (A,B,C,D,E,F)
where A, B and C are the primary key attributes,
with
F = {AB->DE, D->E}
Now, this looks to be in first normal form. It can't be in third normal form as I have a transitive dependency, and it cannot be in second normal form as not all non-key attributes depend on the whole primary key.
So my questions are:
I don't know what to make of F and C. I don't have any functional dependency info on them! F doesn't depend on anything? If that is the case, I can't think of any way to get R into second normal form without taking it out!
What about C? C also suffers from the problem of not being mentioned in the functional dependencies list. What to do about it?
My attempt to get R into second normal form would be something like:
R(A,B,D)
R' (D,E)
but as stated earlier, I don't have a clue what to do with C and F. Are they redundant, so that I simply take them out and the above attempt is all I have to do to get it into second (and third!) normal form?
Thanks

Given the definition of R that { A, B, C } is the primary key, then there is inherently a functional dependency:
ABC → ABCDEF
That says that the values of A, B and C inherently determine or control the values of D, E and F as well as the trivial fact that they determine their own values.
You have a few additional dependencies, identified by the set F (which is distinct from the attribute F - the notation is not very felicitous, and could be causing confusion*):
AB → DE
D → E
As you rightly diagnose, the system is in 1NF (because 1NF really means "it is a table"). It is not in 2NF or 3NF or BCNF etc because of the transitive dependency and because some of the attributes only depend on part of the key.
You are right that you will end up with the following two relations as part of your decomposition:
R1(D, E)
R2(A, B, D)
You also need the third relation:
R3(A, B, C, F)
From these, you can recreate the original relation R using joins. The set of relations { R1, R2, R3 } is a non-loss decomposition of the original relation R.
* If the F identifying the set of subsidiary functional dependencies is intended to be the same as the attribute F, then there is something very weird about the definition of that attribute. I'd need to see sample data for the relation R to have a chance of knowing how to interpret it.
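
The claim that R can be recreated from R1, R2 and R3 can be tried out on a small instance. The Python sketch below projects some sample rows onto the three relations and joins them back. The rows are hypothetical data invented for this illustration (chosen so that ABC -> DEF, AB -> DE and D -> E hold), and a check on one instance illustrates the non-loss property without proving it.

```python
# Hypothetical sample rows for R(A, B, C, D, E, F); the values are invented.
ROWS = [
    {"A": 1, "B": 1, "C": 1, "D": "x", "E": "p", "F": 10},
    {"A": 1, "B": 1, "C": 2, "D": "x", "E": "p", "F": 20},
    {"A": 1, "B": 2, "C": 1, "D": "y", "E": "q", "F": 30},
    {"A": 2, "B": 1, "C": 1, "D": "x", "E": "p", "F": 10},
]


def project(rows, attrs):
    """Projection onto `attrs`, with duplicate rows removed."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in seen]


def natural_join(left, right):
    """Join two lists of rows on the attribute names they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined


def as_set(rows):
    """Order-independent representation of a list of rows."""
    return {tuple(sorted(row.items())) for row in rows}


r1 = project(ROWS, "DE")     # R1(D, E)
r2 = project(ROWS, "ABD")    # R2(A, B, D)
r3 = project(ROWS, "ABCF")   # R3(A, B, C, F)

rebuilt = natural_join(natural_join(r3, r2), r1)
print(as_set(rebuilt) == as_set(ROWS))
```

Leaving out R3 would lose the C and F columns entirely, which is why the two-relation attempt in the question is not enough.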

I think the primary key of R is set wrong. If F isn't functionally related to anything, it has to be a part of the key.
So you have R( ABCF DE), which is in first normal form (with F = {AB->DE, D->E}). Now you can change it to second normal form. DE isn't dependent on the whole key (partial dependency), so you put it in another relation to get to second normal form:
R( ABCF ) F = {}
R1( #AB DE) F = {AB->DE, D->E}
R1 still contains the transitive dependency AB->D->E, so to reach third normal form it has to be split once more, into R1( #AB D) and R2( #D E).

F doesn't depend on anything?
No, you just haven't been given any explicit information about it in the form
{something -> F}
And essentially the same can be said for C. You're expected to infer the other dependencies by applying Armstrong's axioms. (Probably.)
Think about how to finish this:
Given R (A,B,C,D,E,F)
{ABC -> ?}
[Later . . . I see that Jonathan Leffler has broken the suspense, so I'll just finish this.]
{ABC -> DEF} (By definition) therefore,
{ABC -> F} (By decomposition. Here's where F and C come in. And this is your third relation. ).
