Can a candidate key be implied by other attributes? - database

Say I have the relational schema R(A,B,C,D,E) and one functional dependency A->BCDE. Since the closure of A is ABCDE (i.e. every attribute), it is a superkey; since it is the smallest key not containing any other key, it is also a candidate key.
What if we then add the FD B->A - does this mean that B is a candidate key, or does it mean that A is no longer a candidate key?
My tutor was working through an example and said that a way to determine candidate keys from a set of FDs was to find any attribute that doesn't appear on the RHS of any FD (i.e. any (set of) attribute(s) that isn't implied by any other attributes). Is this necessarily true? If an attribute implies all others but is itself implied by some other set of attributes, can it be a candidate key?

If A->BCDE and B->A then B->A->BCDE, so B determines every attribute just as A does (and A->B->A). Therefore A and B are both candidate keys of R.
Suppose you have a relation R and a set of dependencies F. If you must infer the keys of R only from what is in F then any attribute of R that doesn't appear on the RHS of any dependency in F must be a prime attribute (i.e. part of a candidate key). I expect that is what your teacher meant. That doesn't mean that prime attributes may never appear on the RHS. They may do if there are multiple candidate keys.
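One way to check this mechanically is to compute attribute closures. The following is only a minimal sketch, not something from the original answer: the helper name closure and the encoding of each FD as a (lhs, rhs) pair of strings are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole LHS is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCDE")
fds = [("A", "BCDE"), ("B", "A")]  # the two FDs from the question

a_plus = closure("A", fds)
b_plus = closure("B", fds)
print("A+ =", "".join(sorted(a_plus)))
print("B+ =", "".join(sorted(b_plus)))
print("A and B both determine all of R:", a_plus == R and b_plus == R)
```

Both closures come out as the full attribute set, and a single attribute has no non-empty proper subset, so A and B are both minimal.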

Can a candidate key be implied by other attributes?
Candidate keys are implied by the dependencies. That's probably what you meant anyway.
My tutor was working through an example and said that a way to determine candidate keys from a set of FDs was to find any attribute that doesn't appear on the RHS of any FD (i.e. any (set of) attribute(s) that isn't implied by any other attributes).
That doesn't determine candidate keys. It determines the columns that must be part of every candidate key.

Related

Does union of candidate keys together form a candidate key?

Relation R consists of columns {A,B,C,D}. A uniquely defines a tuple. So does B. A and B are candidate keys, since they are minimal. What about a set {A,B}? {A,B} together uniquely defines a tuple, but it is not minimal.
What is the term for {A,B}? Usually non-minimal candidate keys are called super keys. Is there a special name for a union of candidate keys?
EDIT:
Excuse me for imprecise question. It can indeed be clearer. As far as I understand, key == candidate key == minimal set of attributes that uniquely define a tuple.
The union of candidate keys K1 and K2 yields a candidate key iff K1=K2, that is, if they are in fact the very same key. In all other cases, it will by definition yield a (super)key that isn't irreducible, and if your definition of "candidate key" is that it is an irreducible (super)key then (the result of) that union is obviously no longer a candidate key.
As for keys terminology, I think most respectable (there are others too) textbooks stick to the convention:
"superkey" = just any key
"candidate key" = irreducible (=minimal) superkey
non-minimal superkey = "proper superkey" (and not just "superkey" as you stated)
"key" is supposed to be used as a synonym for "candidate key", but the linguistics of the word cause it to often also be used with the meaning of "just any key". Beware!
And no, I don't think there is a special term for the particular kind of proper superkey that happens to be a union of two (or more) candidate keys. There is no useful purpose in ever knowing such a thing about a key.
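To make the distinction concrete, here is a small sketch that tests superkey-ness and irreducibility. It is an illustration under assumptions: the FDs A->BCD and B->ACD are my encoding of "A uniquely defines a tuple, so does B", and the helper names closure, is_superkey and is_candidate_key are hypothetical.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    return closure(attrs, fds) == set(relation)


def is_candidate_key(attrs, relation, fds):
    """A superkey is a candidate key if dropping any one attribute breaks it."""
    attrs = set(attrs)
    if not is_superkey(attrs, relation, fds):
        return False
    # Supersets of superkeys are superkeys, so testing the subsets obtained
    # by removing a single attribute is enough to establish irreducibility.
    return not any(is_superkey(attrs - {a}, relation, fds) for a in attrs)


R = "ABCD"
fds = [("A", "BCD"), ("B", "ACD")]  # assumed encoding of the question

for k in ("A", "B", "AB"):
    print(k, "superkey:", is_superkey(k, R, fds),
          "candidate key:", is_candidate_key(k, R, fds))
```

Under these assumptions {A,B} is reported as a superkey but not as a candidate key, which is what "proper superkey" means.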
Candidate key:
A candidate key is a combination of attributes that can be uniquely used to identify a database record without referring to any other data.
The word "candidate" actually means that these keys are candidates for primary key selection, so yes, it is up to you which candidate key you choose as the primary key.

Can a table be in 3NF with no primary keys?

1.
A table is automatically in 3NF if one of the following holds:
(i) If a relation consists of two attributes.
(ii) If 2NF table consists of only one non key attribute.
2.
If X → A is a dependency, then the table is in 3NF, if one of the following conditions exists:
(i) If X is a superkey
(ii) If A is a part of superkey
I got the above claims from this site.
I think that in both the claims, 2nd subpoint is wrong.
The first one says that a table in 2NF will be in 3NF if we have all non-key attributes and the table is in 2NF.
Consider the example R(A,B,C) with dependency A->B.
Here we have no candidate key, so all attributes are non-prime attributes and the relation is not in 3NF but in 2NF.
The second one says that for a dependency of the form X->A if A is part of a super key then it's in 3NF.
Consider the example R(A,B,C) with dependencies A->B, B->C . Here a CK is {A}. Now one of the super keys can be AC and the RHS of FD B->C contains part of AC but still the above relation R is not in 3NF.
I think it should be A should be part of a candidate key and not super key.
Am I correct?
Also can a particular relation be in 1NF, 3NF or 2NF if there are no functional dependencies present?
A CK (candidate key) is a superkey that contains no smaller superkey. A superkey is a unique set of attributes. A relation is a set of tuples. So every relation has a superkey, the set of all attributes. So it has at least one CK.
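As a rough check of that argument, the sketch below enumerates attribute subsets by increasing size and keeps the superkeys that contain no smaller key. It is a deliberately brute-force illustration (exponential in the number of attributes), and the names closure, candidate_keys and fmt are assumptions of mine, not anything from a textbook or library.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute force: try subsets from smallest to largest."""
    attrs = sorted(relation)
    keys = []
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            # A superset of an already found key cannot be minimal.
            if any(k <= candidate for k in keys):
                continue
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys


def fmt(keys):
    return ["".join(sorted(k)) for k in keys]


print(fmt(candidate_keys("ABC", [("A", "B")])))              # the question's first example
print(fmt(candidate_keys("ABC", [("A", "B"), ("B", "C")])))  # the question's second example
print(fmt(candidate_keys("ABC", [])))                        # no non-trivial FDs at all
```

For R(A,B,C) with only A->B this yields the single key AC, so the claim "here we have no candidate key" does not hold; with no FDs the whole heading ABC is the key.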
A FD (functional dependency) holds by definition when each value of a determining set of attributes appears always with the same value for its determined set. Every relation value or variable satisfies "trivial" FDs, the ones where the determined set is a subset of the determining set. Every set of attributes determines {}. So every relation satisfies at least one FD. However, the correct forms of definitions typically specifically talk about non-trivial FDs. Don't use the web, use textbooks, of which dozens are free online, although not all are well-written. Many textbooks also forget about FDs where the determinant and/or determined set is {}.
Your first point is not a correct definition of 3NF. Since it's phrased "if..." instead of "if and only if", maybe it's not trying to be a definition. However, it is still wrong. (i) is wrong because a relation with two attributes is not in 3NF if one is a CK and the other has the same value in every tuple, i.e. it is determined by {}.
Similarly the second point is not a proper definition and also even if you treat it as only a consequence of 3NF (if...) it's false. It would be a definition if it used if and only if and talked about an FD that holds and it said it was a non-trivial FD and some other things were fixed.
Since those are neither correct definitions nor correct implications, there's an unlimited number of ways to disprove them. Read a book (or my posts) and get correct definitions.
Some comments re your reasoning:
First one says that, a table in 2NF will be in 3NF if we have all non key attributes and table is in 2NF.
I have no idea why you think that.
Here we have no candidate key
There's always one or more CKs. You need to read a definition of CK. There are also non-brute-force algorithms for finding them all.
Second one says that, for the dependency of form X->A if A is part of super key then it's in 3NF.
I have no idea why you think that.
A should be part of candidate key and not super key.
A correct definition like the second point does normally say "... or (ii) A-X is part of a CK". But I can't follow your reasoning.
Sound reasoning involves starting from assumptions and writing new statements that we know are true because we applied a definition, a previously proved statement (theorem) or a sound rule of reasoning, eg from 'A implies B' and 'A' we can derive 'B'. You seem to need to read about how to do that.

2NF and 3NF Normalization

I seem to have a strange problem when doing normalization problems. When I'm given relations with actual names I can figure these out easily but when I'm given letters it seems to be a lot harder.
For the following problem I don't know why it's not 3NF and why it is 2NF.
Given R (A, B, C, D, E, F)
FDs = {AB->C, DBE->A, BC->D, BE->F, F->D}
So for 2NF all the right hand side attributes must be fully functionally dependent on the left hand side attributes. For 3NF either all the left hand side attributes must be superkeys or the right hand attributes must be prime attributes.
I tried drawing this out, but I can't even find a candidate key. Can anyone help me determine why this is not 3NF? Also, what is the candidate key here? Since I don't see any attribute that has a closure equal to the original relation.
I seem to have a strange problem when doing normalization problems.
When I'm given relations with actual names I can figure these out
easily but when I'm given letters it seems to be a lot harder.
Yes, it's less intuitive with letters. I will tell you a neat method which you can follow to determine the candidate keys in such situations:
Make three columns, left (L), middle (M) and right (R). The left column holds the attributes that appear only on the left-hand side of the given functional dependencies. In our case these are B and E, since they never appear on the right-hand side of any given FD. The middle column holds the attributes that appear on both the left and the right side of the given FDs, so A, C, D and F go there. The right column holds the attributes that occur only on the right-hand side of the FDs (never on the LHS of any given FD); here it is empty. So we have:
L   | M       | R
B,E | A,C,D,F | -
Now that you have this table remember the following rules: (these are very intuitive)
Attributes in the left(L) column are always part of the candidate keys
Attributes in the right(R) column are never part of the candidate keys
Attributes in the middle(M) column may or may not be a part of the candidate keys.
So in our case we start by checking whether BE is a candidate key. We find that the closure of BE consists of all the attributes of the relation R, so BE is a candidate key. (Note: if BE had not been a candidate key, we would have taken attributes from the middle (M) column one by one, combined each with BE and checked the closure, e.g. BEA, BEC, BED, ...)
Since every candidate key must contain both B and E, any other key would be a superset of BE and hence not minimal. So we have only 1 candidate key, BE. Our prime attributes are {B,E} and our non-prime attributes are {A,C,D,F}.
We know that 3NF is violated if the RHS is a non-prime attribute and the LHS is not a superkey. The given FDs are:
(1) AB->C
(2) DBE->A
(3) BC->D
(4) BE->F
(5) F->D
We note that in all of these FDs the RHS is a non-prime attribute. So for 3NF the LHS of each must be a superkey. AB, BC and F are not superkeys, so (1), (3) and (5) violate this and the relation is not in 3NF. (Note: in (2) D on the LHS is an extraneous attribute, so the FD reduces to BE->A and hence (2) does not violate the 3NF rule.)
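The whole procedure can be sketched in a few lines of Python. Treat this as an illustration under assumptions rather than a general normal-form checker: the helper closure and the (lhs, rhs) string encoding of FDs are my own, and taking the prime attributes to be exactly the L column is valid here only because BE turns out to be the sole candidate key.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCDEF")
fds = [("AB", "C"), ("DBE", "A"), ("BC", "D"), ("BE", "F"), ("F", "D")]

lhs_attrs = set().union(*(set(lhs) for lhs, _ in fds))
rhs_attrs = set().union(*(set(rhs) for _, rhs in fds))

left = lhs_attrs - rhs_attrs    # only ever on a left-hand side
middle = lhs_attrs & rhs_attrs  # on both sides
right = rhs_attrs - lhs_attrs   # only ever on a right-hand side
print("L:", sorted(left), "M:", sorted(middle), "R:", sorted(right))

print("BE is a key:", closure(left, fds) == R)
prime = set(left)  # assumption: BE is the only candidate key

# 2NF: no non-prime attribute may depend on a proper subset of the key.
partial = [a for a in sorted(prime) if closure(prime - {a}, fds) - prime]
print("2NF holds:", not partial)

# 3NF: each non-trivial FD needs a superkey LHS or only prime attributes on the RHS.
violations = [
    (lhs, rhs) for lhs, rhs in fds
    if closure(lhs, fds) != R and (set(rhs) - set(lhs)) - prime
]
print("3NF violations:", violations)
```

Neither B nor E alone determines anything else, which is why the relation is in 2NF, while AB->C, BC->D and F->D are flagged as the 3NF violations.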

Can trivial superkey be considered as candidate key?

Suppose relation R(A,B,C,D) exists with no functional dependency. So what should be considered as its candidate key? Clearly any individual attribute or proper subset of all attributes cannot be a candidate key because by no means they can identify non prime attributes. So can ABCD be considered as candidate key? Or this relation will not have any candidate key?
Suppose relation R(A,B,C,D) exists with no functional dependency. So can ABCD be considered as candidate key?
Yes, the key [1] comprises all attributes together.
This is quite rare in practice, though. It mostly happens with junction/link tables that implement a many-to-many (or many-to-many-to-many etc.) relationship.
Or this relation will not have any candidate key?
A relation must have at least one key, otherwise it's not a relation [2].
A relation is a set, and any given object either belongs to a set or doesn't; it cannot belong multiple times (unlike with a multiset). Without at least one key, the same tuple would be able to belong multiple times.
[1] Just saying "key" is synonymous with "candidate key".
[2] At the very least, all attributes, taken together, can be considered a key (as in your case).
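The set argument can be illustrated on a sample relation value. The rows below are hypothetical data invented for this sketch, and the helper is_unique is my own name. Note the limitation: uniqueness in one sample cannot prove that something is a key (keys are constraints on every valid value of the relation), but a duplicate in a sample is enough to rule a set of attributes out.

```python
from itertools import combinations

attrs = ("A", "B", "C", "D")

# Hypothetical sample rows, stored as a set because a relation is a set of tuples.
rows = {
    (1, 1, 1, 1),
    (1, 1, 1, 2),
    (1, 1, 2, 1),
    (1, 2, 1, 1),
    (2, 1, 1, 1),
}
rows.add((1, 1, 1, 1))  # adding an existing tuple changes nothing
print("number of tuples:", len(rows))


def is_unique(cols, rows):
    """True if no two rows agree on all of the columns in `cols`."""
    idx = [attrs.index(c) for c in cols]
    projected = [tuple(r[i] for i in idx) for r in rows]
    return len(projected) == len(set(projected))


unique_sets = [
    cols
    for n in range(len(attrs) + 1)
    for cols in combinations(attrs, n)
    if is_unique(cols, rows)
]
print("attribute sets with no duplicates in this sample:", unique_sets)
```

In this sample every proper subset of the heading has duplicates, so only the full set of attributes can serve as a key.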

Determining candidate keys with functional dependencies simple

Let R(A,B,C,D,E) be a relation schema and F = {A→C, B→D, C→E, E→A}. Find all candidate keys.
I believe that there exist no CKs in this set, because I cannot map B or D to any other attribute apart from B -> D. Does this mean that there are no candidate keys? I am able to map A to all other attributes besides B and D, though.
The first step in normalization is to find all of the keys of a relation. Here are some facts that can help find the keys:
If an attribute is in none of the FDs, then it is in every key.
If an attribute occurs on the right-hand side of an FD, but never occurs on the left-hand side, then it is never in a key.
If an attribute occurs on the left-hand side of an FD, but never occurs on the right-hand side, then it is in every key.
If an attribute occurs both on the right-hand side of an FD and on the left-hand side of an FD, then one cannot say anything about the attribute.
To find the keys, identify which attributes are in each of the cases above. The ones in the first and third cases must be in every key. Call this set of attributes the core. Compute the attributes that are determined by the core. This is called the closure of the core. If all of the attributes are in the closure of the core, then the core is not only a key, it is also the only key. If the closure of the core is not the entire set of attributes, then some will be missing. Write down this set of attributes, and remove any attribute that is in the second set above (i.e., it occurs on the right-hand side of an FD, but never occurs on the left-hand side). These are the exterior attributes. To get a key one must add one or more exterior attributes to the core. Accordingly, add them to the core, first one at a time, then two at a time, and so on, until every key has been found.
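The procedure described above can be sketched as follows. This is one possible reading of it, not a reference implementation: the function names closure and candidate_keys and the (lhs, rhs) string encoding of FDs are assumptions for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    relation = set(relation)
    lhs = set().union(*(set(l) for l, _ in fds))
    rhs = set().union(*(set(r) for _, r in fds))
    core = relation - rhs   # cases 1 and 3: in every key
    never = rhs - lhs       # case 2: in no key
    core_closure = closure(core, fds)
    if core_closure == relation:
        return [core]       # the core is the only key
    # Exterior attributes: missing from the closure of the core and not excluded.
    exterior = sorted(relation - core_closure - never)
    keys = []
    for size in range(1, len(exterior) + 1):
        for extra in combinations(exterior, size):
            candidate = core | set(extra)
            if any(k <= candidate for k in keys):
                continue    # contains a smaller key, so not minimal
            if closure(candidate, fds) == relation:
                keys.append(candidate)
    return keys


fds = [("A", "C"), ("B", "D"), ("C", "E"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
print(["".join(sorted(k)) for k in keys])
```

For the FDs in the question the core is {B}, its closure is {B,D}, the exterior attributes are A, C and E, and adding each of them to the core in turn already gives a key.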
There are three candidate keys.
B doesn't appear on the right-hand side of any functional dependency. That means B must be part of every candidate key. I think that alone doesn't guarantee there is at least one candidate key, but it should be clear from inspection that AB is one of the three candidate keys here.
Your textbook should include at least one algorithm for determining the set of all candidate keys. If you're lucky, it includes one algorithm suitable for paper and pencil, and another suitable for automation by programming.
Since B doesn't appear on any right-hand side, B must be part of every candidate key. A, C and E occur on both sides, so each of them can form a superkey with B. Checking the closures, AB, BC and BE are superkeys, and as a candidate key is a minimal superkey (B alone determines only D), AB, BC and BE are the candidate keys.
Every attribute that appears strictly only on the left-hand side across all the functional dependencies must form part of each of the candidate keys.
The next step is to check whether such an attribute alone can generate, or determine, all the attributes in the schema. If yes, then that attribute is a candidate key in its original stand-alone form. If not, group that attribute with each of the other attributes, one at a time, two at a time and so on. All such minimal sets whose closure covers the entire set of attributes are the candidate keys.

Resources