I am currently studying Database Management and we were introduced to the idea of functional dependence:
Let A and B be attributes in a relation. B is considered functionally dependent on A if and only if for every A value you can determine a B value. i.e. A -> B
My question:
If this is the case, then given A, B, and C; if C can be evaluated arithmetically using A and B, can you consider C to be functionally dependent on A and B?
That is, (A/B) = C <=> AB -> C
As an example:
Say I have a table containing online order information. It includes the attributes: PROD_PRICE, QTY, and TOTAL_PRICE.
Seeing as the total price can be established by multiplying PROD_PRICE by the QTY is it accurate to say that PROD_PRICE QTY → TOTAL_PRICE?
If this is the case, then given A, B, and C; if C can be evaluated arithmetically using A and B, can you consider C to be functionally dependent on A and B?
Yes, by the definition of functional dependency. In general, a functional dependency has a set of attributes (the “determinant”) that determines another set (the “determinate”): whenever two tuples in an instance of the relation have the same value for the determinant, they must also have the same value for the determinate. This is certainly true when the determinate is obtained as the value of an expression over the determinant, as in your example.
Note that in general you do not know which function produces the determinate from the determinant; you simply know that such a function exists. The functional dependency concept captures this fact, which is very important when representing data.
So, if for every row of a table we have c = a/b, then ab -> c holds. But when, for instance, PRODUCT_ID -> PRODUCT_NAME holds, there is no arithmetic formula that derives the latter from the former; the function is only the association recorded in the data.
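A minimal sketch of how the definition can be checked against a concrete table instance. The helper name `fd_holds` and the sample rows are assumptions made up for illustration, not part of the question. Note that a check like this can only refute a dependency on the given data; it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular list of dict rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it,
        # so a later row with the same key but a different value is a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical order rows (invented for this example).
orders = [
    {"PROD_PRICE": 5, "QTY": 2, "TOTAL_PRICE": 10},
    {"PROD_PRICE": 5, "QTY": 3, "TOTAL_PRICE": 15},
    {"PROD_PRICE": 5, "QTY": 2, "TOTAL_PRICE": 10},
    {"PROD_PRICE": 2, "QTY": 5, "TOTAL_PRICE": 10},
]

print(fd_holds(orders, ["PROD_PRICE", "QTY"], ["TOTAL_PRICE"]))  # True
print(fd_holds(orders, ["TOTAL_PRICE"], ["QTY"]))                # False
```

The second call fails because a total of 10 appears with two different quantities, so the dependency only runs in one direction.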
Suppose we have a table with 3 columns A,B and C
A B C
---------------
1 2 3
2 4 5
4 6 7
n 5 n
Here 'n' means null.
Can we say that A -> B and A -> C? I know the definition of functional dependencies but I'm just confused in the case of null values.
If null is considered a value, then the answer is yes. A -> B, C holds in the given data. However, to be a value imposes certain requirements. All operators applicable to the domain (e.g. integers) like equality, addition, less than, and so on, must be well-defined in the presence of nulls.
If null is not a value, then the answer is more complicated. Functional dependencies, strictly speaking, apply to relations. If a table represents a relation, then we can refer to functional dependencies in the table. However, a symbol that represents the absence of a value is metadata, not data. It allows multiple union-incompatible relations to be represented by a single table. In this case, we can't apply the concept of functional dependency to the table since it's not clear which relation we're talking about.
Further confusing things, SQL DBMSs don't handle nulls consistently. In some cases, they're handled like values, in others like the absence of values. If you want to understand and describe a table logically, the best option is to decompose it into a set of null-free relations, and then to analyze each of those parts independently.
In the case of your example table, we run into a problem if null isn't a value. The last row has no unique identifier (it can't be B:4 since another row has B:4 as well) and we can't determine anything from a lack of information. The example can't be decomposed into a set of relations without discarding that row.
If we change the last row to have B:5 instead, then we decompose it into two relations: R1 = {(A:1, B:2, C:3), (A:2, B:4, C:5), (A:4, B:6, C:7)} and R2 = {(B:2), (B:4), (B:6), (B:5)}. We can say A -> B, C holds in R1 but not in R2, which has no A or C attribute at all.
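One possible reading of that decomposition, sketched in Python with `None` standing in for null. The function name `null_free_relations` and the rule it uses (one relation per distinct set of non-null attributes, populated by projecting every row that is non-null on that set) are assumptions chosen to reproduce R1 and R2 above, not a standard algorithm.

```python
def null_free_relations(rows):
    """Split a table with nulls (None) into null-free relations.

    Returns a dict mapping a tuple of attribute names to a set of tuples.
    """
    # Every distinct set of attributes that is non-null in at least one row.
    shapes = {
        frozenset(a for a, v in row.items() if v is not None) for row in rows
    }
    relations = {}
    for shape in shapes:
        attrs = tuple(sorted(shape))
        # Project every row that has values for all attributes in this shape.
        relations[attrs] = {
            tuple(row[a] for a in attrs)
            for row in rows
            if all(row[a] is not None for a in attrs)
        }
    return relations


table = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 4, "C": 5},
    {"A": 4, "B": 6, "C": 7},
    {"A": None, "B": 5, "C": None},  # the row with nulls, using B:5
]

rels = null_free_relations(table)
for attrs in sorted(rels):
    print(attrs, sorted(rels[attrs]))
```

Each resulting relation can then be analyzed for functional dependencies on its own, since none of them contains a null.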
Example:
Let R = (A, B, C, D)
Let F = {C -> AD, AB -> C}
Then how can I find the candidate keys?
The answer is {AB, BC}
Why?
Given a relation schema R with a set of attributes T and a non-empty set of non-trivial functional dependencies F describing a certain set of constraints that are assumed to hold in that schema:
1) Every attribute that does not appear in the right part of a FD in F must be present in any candidate key.
2) Every attribute that does not appear in the left part of a FD in F cannot be present in any candidate key.
To find all the candidate keys, take the remaining attributes, add every possible combination of them to the attributes from 1) above, and check whether the closure of the result contains all the attributes of the relation (keeping only those combinations from which no attribute can be removed without losing this property).
Note that, if the set F is empty, the only candidate key is constituted by all the attributes T.
In practice there are algorithms that can be relatively efficient, even though the problem of finding all the keys is exponential in the general case.
A simple approach is to start from a canonical cover of the functional dependencies, in this case for instance from:
{ A B → C
C → A
C → D }
and, after finding the attributes that must be present in any candidate key (in this case B), try adding to them the left-hand sides of the dependencies (here AB, which contributes A, and C), in any order and possibly in combination, and compute the closure to see whether the result determines all the attributes. When you discover that some set of attributes determines all the attributes of the relation, you have found a candidate key (and there is no need to add further attributes to it). In your example:
(A B)+ = A B C D
(B C)+ = A B C D
So A B and B C are candidate keys (since you cannot remove any attribute from either of them without losing the property of determining all the other attributes). And since there are no other attributes to try (apart from D, which cannot be present in any candidate key), you know that you have found all the candidate keys.
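The procedure above can be sketched as a small brute-force search. This is only an illustration under stated assumptions: attributes are single characters, each FD is a (left, right) pair of strings, and the function names `closure` and `candidate_keys` are invented here. It is exponential in the number of optional attributes, so it suits textbook-sized examples only.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all candidate keys as a sorted list of strings."""
    all_attrs = set(attributes)
    lhs_attrs = set().union(*(set(lhs) for lhs, _ in fds))
    rhs_attrs = set().union(*(set(rhs) for _, rhs in fds))
    # Attributes never on a right-hand side are in every key.
    must = all_attrs - rhs_attrs
    # Attributes never on a left-hand side are in no key, so only
    # attributes that occur on both sides remain to be tried.
    optional = sorted((all_attrs - must) & lhs_attrs)
    keys = []
    # Trying smaller combinations first guarantees minimality.
    for size in range(len(optional) + 1):
        for combo in combinations(optional, size):
            candidate = must | set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains an already-found key, so not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(frozenset(candidate))
    return sorted("".join(sorted(key)) for key in keys)


fds = [("C", "AD"), ("AB", "C")]
print(candidate_keys("ABCD", fds))  # ['AB', 'BC']
```

Here B is forced into every key, D is excluded, and only A and C are tried as additions, which mirrors the reasoning in the answer.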
I have a question regarding functional dependencies.
I understand that functional dependency means that the value of an attribute can be determined by the value of another attribute.
Suppose we have this table
|A|B|C|D|
Here A and B are the primary keys.
Is it correct to say that both C and D are functionally dependent on both A and B ?
You say “A and B are the primary keys”, but this phrase is ambiguous. Do you mean “the primary key is A B” or “there are two candidate keys, A and B”? (Note that a relation in a relational database can have only a single primary key, but many candidate keys.)
Given the definition of a (candidate) key, that is that it determines all the other attributes and that you cannot remove any attribute without losing this property, in the first case you can say that:
A B -> C D
or, which is equivalent, that:
A B -> C
A B -> D
(so C and D depend on the combination of A and B), while in the second case, you have that:
A -> C D
B -> C D
or, which is equivalent, that:
A -> C
A -> D
B -> C
B -> D
(that is, C and D are functionally dependent both on A and on B).
"S (functionally) determines T" means that all appearances of a particular subtuple value for attribute set S have the same subtuple value for attribute set T. If we say an attribute X is determining or determined then it's understood that we really mean that set {X} is determining/determined.
A superkey is a set of attributes that determines every attribute. A CK (candidate key) is a superkey that contains no smaller superkey. There can be many CKs. One CK can be chosen as PK (primary key). (PKs play no role in relational theory.)
Since there can only be one PK, it's odd that you talk about a relation value or variable having more than one. Maybe you mean two CKs. Maybe you mean a 2-attribute PK.
It happens that if every subtuple value for a set of attributes appears just once then it is a superkey. (Each single-attribute superkey is a CK unless {} is the CK, which happens when the relation is limited to one tuple.) So it determines all attributes. But in general the dependencies tell us what the superkeys & CKs are.
So if each of A and B is a CK then each determines C and D, i.e. {C} and {D}. And if {A,B} is a PK then it determines C and D, i.e. {C} and {D}. It happens that if both T1 and T2 are determined by S then T1 U T2 is too. So either way, the CK(s) here determine(s) {C,D} also.
PS There is an ambiguity in English where it is not clear whether "both C and D are functionally dependent" means that C is dependent and D is dependent or that {C,D} is dependent. Similarly for "are functionally dependent on both A and B". So it is clearer to say "the set ..." rather than just using "both" and/or "and".
I have the following table
Case ( referenceID, startDate, endDate, caseDetail, caseType, caseTypeRate,
lawyerName, lawyerContact, clientID, clientName, clientAddress, clientContact,
serviceProvided, serviceDate, serviceCost,
otherPartyID, otherPartyName, otherPartyContact )
My FDs are
referenceID-->caseDetail
referenceID-->caseType
referenceID-->serviceProvided
lawyerContact-->lawyerName
clientID-->clientName
Am I correct, or are there more? I'm still a bit unsure of how it works after reading the theory. I need clear examples. How do I determine the MVDs as well?
Functional dependency: if for each value of X there is exactly one value of Y, then Y is functionally dependent on X, written as follows.
X -> Y
Multivalued dependency: if one value of X can be associated with a set of several Y values, and that set does not depend on the values of the remaining attributes, then Y is multivalued-dependent on X, written as follows.
X ->-> Y
Loosely speaking, a functional dependency expressed as x -> y means, "When I know any value of x, I know one and only one value of y." So the value of x determines one and only one value of y.
To determine whether a functional dependency exists, you ask yourself the question, "If I know any value for x, do I know one and only one value for y?", and then answer it.
In your case, I'd guess that most of these additional functional dependencies will hold. It's hard to tell for sure, since there's no sample data, and since I don't know what the columns mean. (Trying to determine functional dependencies based solely on column names is very risky. Here, "startDate" could mean just about anything.)
referenceID -> startDate
referenceID -> endDate
referenceID -> caseType
referenceID -> caseTypeRate
clientID -> clientName
clientID -> clientAddress
clientID -> clientContact
otherPartyID -> otherPartyName
otherPartyID -> otherPartyContact
There are others.
Wikipedia has a concise example of a multi-valued dependency.
Here is a good example of how to determine a MVD: https://web.archive.org/web/20140212170321/https://www.cs.oberlin.edu/~jdonalds/311/lecture08.html.
Basically, follow this algorithm:
1) Figure out if A determines a set of values for B,
2) Figure out if A determines a set of values for C, and then
3) Determine if B and C are independent of each other.
Here A, B, and C are sets of attributes. If the conditions are satisfied, then
A -->> B and A -->> C are MVDs.
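The three steps can be checked mechanically on sample data using the usual "swap" test: whenever two rows agree on A, the row that takes its B values from one and its remaining values from the other must also be in the table. The sketch below is an illustration only; the function name `mvd_holds` and the course/book/lecturer rows are invented for this example.

```python
def mvd_holds(rows, x, y):
    """Return True if the MVD x ->> y holds in this list of dict rows."""
    if not rows:
        return True
    attrs = list(rows[0])
    present = {tuple(row[a] for a in attrs) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Take the x and y values from t1 and everything else from t2.
                mixed = tuple(
                    t1[a] if (a in x or a in y) else t2[a] for a in attrs
                )
                if mixed not in present:
                    return False
    return True


# Hypothetical data: books and lecturers are independent for each course.
teaching = [
    {"course": "DB", "book": "Date", "lecturer": "Ann"},
    {"course": "DB", "book": "Date", "lecturer": "Bob"},
    {"course": "DB", "book": "Ullman", "lecturer": "Ann"},
    {"course": "DB", "book": "Ullman", "lecturer": "Bob"},
]

print(mvd_holds(teaching, ["course"], ["book"]))       # True
print(mvd_holds(teaching[:3], ["course"], ["book"]))   # False
```

Dropping the last row breaks the independence of books and lecturers, so the second call reports that the MVD no longer holds. As with functional dependencies, sample data can refute an MVD but cannot establish it for the schema.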
Let's consider, for instance, the following relation:
R (A,B,C,D,E,F)
where A, B and C are the primary key attributes
with
F = {AB->DE, D->E}
Now, this looks to be in first normal form. It can't be in third normal form, as I have a transitive dependency, and it cannot be in second normal form, as not all non-key attributes depend on the whole primary key.
So my questions are:
I don't know what to make of F and C. I don't have any functional dependency info on them! F doesn't depend on anything? If that is the case, I can't think of any solution to get R into the 2nd normal form without taking it out!
What about C? C also suffers from the problem of not being referred on the functional dependencies list. What to do about it?
My attempt to get R into the 2nd normal form would be something like:
R(A,B,D)
R' (D,E)
but as stated earlier, I don't have a clue what to do with C and F. Are they redundant, so that I simply take them out and the above attempt is all I have to do to get it into 2nd normal form (and 3rd!)?
Thanks
Given the definition of R that { A, B, C } is the primary key, then there is inherently a functional dependency:
ABC → ABCDEF
That says that the values of A, B and C inherently determine or control the values of D, E and F as well as the trivial fact that they determine their own values.
You have a few additional dependencies, identified by the set F (which is distinct from the attribute F - the notation is not very felicitous, and could be causing confusion*):
AB → DE
D → E
As you rightly diagnose, the system is in 1NF (because 1NF really means "it is a table"). It is not in 2NF or 3NF or BCNF etc because of the transitive dependency and because some of the attributes only depend on part of the key.
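The partial dependency behind the 2NF violation can be found by computing the closure of every proper subset of the key. The following is a rough sketch that assumes { A, B, C } is the only candidate key, that attributes are single characters, and that each FD is a (left, right) pair of strings; the names `closure` and `partial_dependencies` are invented for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds, attributes):
    """Map each proper subset of the key to the non-key attributes it determines."""
    key = set(key)
    non_key = set(attributes) - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found["".join(subset)] = "".join(sorted(determined))
    return found


# The key dependency plus the two dependencies given in the question.
fds = [("ABC", "DEF"), ("AB", "DE"), ("D", "E")]
print(partial_dependencies("ABC", fds, "ABCDEF"))  # {'AB': 'DE'}
```

Only AB determines any non-key attributes on its own, which is why D and E are the attributes that have to move out of the original relation.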
You are right that you will end up with the following two relations as part of your decomposition:
R1(D, E)
R2(A, B, D)
You also need the third relation:
R3(A, B, C, F)
From these, you can recreate the original relation R using joins. The set of relations { R1, R2, R3 } is a non-loss decomposition of the original relation R.
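To illustrate the non-loss claim, here is a small sketch that projects a sample instance of R onto R1, R2 and R3 and joins the pieces back together. The sample rows are invented (chosen so that AB → DE and D → E hold and ABC is unique), and `project` and `natural_join` are hypothetical helper names written for this example.

```python
def project(rows, attrs):
    """Projection onto attrs, with duplicate rows removed."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Natural join of two lists of dict rows on their common attributes."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical instance of R(A, B, C, D, E, F).
R = [
    {"A": 1, "B": 1, "C": 1, "D": 10, "E": 100, "F": "x"},
    {"A": 1, "B": 1, "C": 2, "D": 10, "E": 100, "F": "y"},
    {"A": 1, "B": 2, "C": 1, "D": 20, "E": 200, "F": "x"},
    {"A": 2, "B": 1, "C": 1, "D": 10, "E": 100, "F": "z"},
]

R1 = project(R, "DE")
R2 = project(R, "ABD")
R3 = project(R, "ABCF")

rejoined = natural_join(natural_join(R3, R2), R1)
print(as_set(rejoined) == as_set(R))  # True
```

The join neither loses rows nor invents spurious ones on this data, which is what a non-loss decomposition promises for every instance that satisfies the dependencies.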
* If the F identifying the set of subsidiary functional dependencies is intended to be the same as the attribute F, then there is something very weird about the definition of that attribute. I'd need to see sample data for the relation R to have a chance of knowing how to interpret it.
I think the primary key of R is set wrong. If F isn't functionally related to anything, it has to be part of the key.
So you have R( ABCF DE), which is now in first normal form (with F = {AB->DE, D->E}). Now you can change it to second normal form. DE isn't dependent on the whole key (a partial dependency), so you put it in another relation to get to second normal form:
R( ABCF ) F = {}
R1( #AB DE) F = {AB->DE}
Now this relation doesn't have any transitive dependencies so it is already in third normal form.
F doesn't depend on anything?
No, you just haven't been given any explicit information about it in the form
{something -> F}
And essentially the same can be said for C. You're expected to infer the other dependencies by applying Armstrong's axioms. (Probably.)
Think about how to finish this:
Given R (A,B,C,D,E,F)
{ABC -> ?}
[Later . . . I see that Jonathan Leffler has broken the suspense, so I'll just finish this.]
{ABC -> DEF} (By definition) therefore,
{ABC -> F} (By decomposition. Here's where F and C come in. And this is your third relation. ).