All the examples of partial dependency I have seen have only one attribute as a primary key.
A,B,C -> D,E,F
A,B-> D (composite PK)
Can I say "A,B->D" is a partial dependency "A,B,C -> D,E,F"?
As far as I can tell, you are asking whether the decomposition can produce a new table with a composite PK, since the examples you have seen so far use only single-attribute keys.
Yes. The idea is that if you can uniquely determine one or more non-primary attribute(s), by a part (whether a single attribute or many) of the primary key, then you should create a separate table for that.
So, if A,B -> D, then create another table of {A, B, D} and remove duplication from the bigger table.
This link shows the single table broken to two after removing this redundancy, though it too uses a single attribute as the key. Still, you'd get the idea by looking at the table initially and after it was divided.
TL;DR Your title question is ill-formed re FDs, and your first sentence is ill-formed re FDs, and what you ask about saying is ill-formed re FDs. (Please rephrase your question to make sense re FDs by using terminology properly. Please make the connections between your sentences clear and explain your examples.)
We say that a FD (functional dependency) X -> Y holds or that Y is functionally dependent on X or that X functionally determines Y or that Y is functionally determined by X. We often leave the "functionally" out. We say that X -> Y is partial when some smaller/proper subset of X also determines Y. The terminology "partial" arises because Y is dependent on only part of X. A FD that is not partial is full. Notice that CKs (candidate keys) are not involved.
A,B,C -> D,E,F
A,B-> D (composite PK)
Normalization involves CKs. The only way that it involves PKs (primary keys) is that there is a tradition irrelevant to normalization to pick one CK to call "the PK". Of course, if there's only one CK then it's the only choice for PK and we can call it the PK.
I'll assume that A-F are the only attributes of some relation value or variable. If these two FDs (and those in their closure and no others) hold, then {A,B} is not a CK. {A,B,C} is the only CK.
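For what it's worth, the claim about the CKs can be checked mechanically. This is only a sketch in Python, assuming that A-F are the only attributes and that the two FDs above (plus whatever they imply) are the only ones that hold. The helper names closure and candidate_keys are my own, not from any library.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Brute force: every minimal attribute set whose closure is all_attrs."""
    keys = []
    attrs = sorted(all_attrs)
    for size in range(len(attrs) + 1):  # smallest sets first
        for combo in combinations(attrs, size):
            candidate = set(combo)
            # A superset of a key already found cannot be minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(all_attrs):
                keys.append(candidate)
    return keys

R = set("ABCDEF")
FDS = [("ABC", "DEF"), ("AB", "D")]
keys = candidate_keys(R, FDS)
print(closure("AB", FDS))  # the attributes A, B and D only
print(keys)                # a single key made of A, B and C
```

With these FDs the closure of {A,B} is {A,B,D}, so {A,B} is not even a superkey, and the search finds {A,B,C} as the only CK.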
Can I say "A,B->D" is a partial dependency "A,B,C -> D,E,F"?
It's never the case that an FD "is" a different FD. Maybe you mean "is a partial FD of". But it doesn't make sense to say that a FD "is a partial FD of" some FD. I don't know whether your "Can I say" means "Is it true that" or "Does it make sense to say". But it doesn't make sense to say that. I don't know what you are trying to ask.
But here are some things that are true:
{A,B,C} -> {D,E,F} is not partial because no smaller/proper subset of {A,B,C} determines {D,E,F}. {A,B} -> {D} is not partial because no smaller/proper subset of {A,B} determines {D}. But {A,B,C} -> {D} is partial because smaller/proper subset {A,B} of {A,B,C} determines {D}.
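The three statements above can be reproduced with a small test for partiality. This is a sketch in Python under the same assumption as before, namely that the two given FDs and their consequences are the only ones that hold. The function is_partial is a name I made up for illustration, and it assumes the FD it is asked about actually holds.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_partial(lhs, rhs, fds):
    """True if some proper subset of lhs (including {}) already determines rhs."""
    for size in range(len(lhs)):  # sizes 0 .. len(lhs)-1 give the proper subsets
        for part in combinations(sorted(lhs), size):
            if set(rhs) <= closure(part, fds):
                return True
    return False

FDS = [("ABC", "DEF"), ("AB", "D")]
print(is_partial("ABC", "DEF", FDS))  # False
print(is_partial("AB", "D", FDS))     # False
print(is_partial("ABC", "D", FDS))    # True, because {A,B} determines D
```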
A relation is in 2NF when no non-prime attribute is partially dependent on a CK. Ie when all non-prime attributes are fully functionally dependent on every CK. Since non-prime attribute {D} is partially dependent on CK {A,B,C}, this relation value or variable is not in 2NF.
In 2NF, can a partial dependency have a composite primary key?
This also doesn't make sense. A partial FD doesn't "have" a PK or CK. I don't know what you are trying to ask. Maybe you mean "determinant" not "PK".
All the examples of partial dependency I have seen have only one attribute as a primary key.
This also doesn't make sense since dependencies don't "have" PKs or CKs. Maybe you are trying to say, all the examples of partial FDs you have seen are in relations that have only one attribute as a CK. Maybe you are trying to say that all the examples of partial FDs you have seen have only one attribute as a determinant.
The only time that a partial FD can have a single-attribute determinant is when {} is its subset determinant. Ie if {J} -> Y is partial then {} -> Y. This is when the Y subrow value is the same in every tuple. So you have probably never thought about a partial dependency not having a composite determinant. (So you probably didn't mean "determinant" for "PK".)
(I also can't connect any of these three quotes to each other or your examples.)
Related
What's the main point of Normalization?
I mean if a relation is not in 2NF, it is because of partial dependency i.e. a non key attribute is dependent on a part of a candidate key.
So, let's say, for a relation R(A,B,C) with FDs:
AB->C, B->C
Clearly, AB is the candidate key and B->C is the partial dependency.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
Now, if a relation is not in 3NF, it is because a non key attribute is dependent on another non key attribute i.e. to say
if FDs for a relation R(A,B,C) are:
A->B,B->C
Clearly, A is the key and B->C shows transitive dependency, so not in 3NF.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
So, what's the exact difference?
I mean, why such a marked distinction? Essentially in both of the cases the action is same.
Decompose the relation using the dependency where the determinant (B here) is either PART of a key or not.
Why have separate terms like partial dependency or transitive dependency?
Why not just see if there exists a dependency wherein a non-prime attribute is determined by something which is NOT a candidate key (no matter whether it is a partial key or another non-prime attribute)?
Why can't we implement a method like this:
1NF -- having all elements in the atomic form
XNF -- if there's any dependency of the form non_key -> non_prime_attribute(s), decompose the relation with one of the new relations having this particular "non_key" as the key with those non_prime_attributes
BCNF -- where for all the dependencies of the form X->Y, X is a superkey
Can we have such NF condition format? Does it combine all the conditions?
So, what's the exact difference?
2NF is not 3NF & definitions of 2NF are not definitions of 3NF. There isn't any particular semantic or syntactic structural similarity that would leave some kind of "difference" other than that a 2NF relation can have the sort of problem FD (functional dependency) that violates 3NF that a 3NF relation doesn't have. You can find definitions all over the place. You almost give them correctly here yourself. But a NF (normal form) is a condition, not a process. What do you mean "actions are the same"? Being in 3NF implies being in 2NF, so naturally decomposing to 3NF also gives 2NF. But there are relations that are in 2NF but not in 3NF, and there may be decompositions for a relation to 2NF that don't get to 3NF. Those decompositions will involve a removal of all problem partial FDs that does not result in the removal of all problem transitive FDs.
(Because 3NF is always achievable and there are no other disadvantages compared to 2NF, 2NF isn't even useful. It's just a condition that was discovered first that is not as strong as 3NF.)
(3NF is frequently defined in terms of 2NF plus no transitive dependencies of non-prime attributes on CKs, but actually no such FDs implies no partial FDs of non-prime attributes on CKs, hence 2NF, so the first condition is redundant.)
Why not just see if there exists a dependency wherein a non-prime attribute is determined by something which is NOT a candidate key
Why should that condition be helpful? It is not a description of just getting rid of the problem FDs of 2NF & 3NF--that's what putting into 3NF does.
Getting rid of non-trivial FDs that are not determined by superkeys happens to give BCNF. It implies 2NF & 3NF. But it is different from both of them. A BCNF relation exhibits no FD-based update anomalies. It is always achievable. However 3NF is always achievable while "preserving FDs", whereas BCNF is not. There are cases where in order for a FD that held in the original to be enforced in a view/query that gives it via constraints on its components we need an EQD (equality dependency) constraint. That says two column sets have the same set of subrow values, which is more expensive to enforce than a FD. Either you have BCNF & an EQD & fewer update anomalies or you have 3NF/EKNF & a FD & certain update anomalies.
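To make the BCNF condition concrete, here is a rough Python sketch that applies it to the two R(A,B,C) examples from the question. It assumes the listed FDs (and their consequences) are the only ones that hold, and it inspects only the listed FDs. That is enough to test the relation they were given for, but not the components of a decomposition. The name bcnf_violations is mine.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Listed FDs that are non-trivial and whose determinant is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) == set(all_attrs)
        if not trivial and not superkey:
            bad.append((lhs, rhs))
    return bad

# The "partial dependency" example: AB -> C, B -> C
partial_case = bcnf_violations("ABC", [("AB", "C"), ("B", "C")])
# The "transitive dependency" example: A -> B, B -> C
transitive_case = bcnf_violations("ABC", [("A", "B"), ("B", "C")])
print(partial_case)     # [('B', 'C')]
print(transitive_case)  # [('B', 'C')]
```

Both examples fail the same single test, which is the sense in which the BCNF condition covers the two cases at once.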
The NF that really matters is 5NF, which implies BCNF, with no update anomalies & with other benefits. (We might then decide to denormalize for performance reasons.)
PS Normalization to a given NF does not necessarily involve normalization to lower NFs.
It almost sounds as though you want to know why they called these two normal forms by different names instead of inventing just one form that covers both cases. If that's not the case, please ignore this answer.
Part of the answer is that the forms weren't discovered at the same time. And part of the answer is that the problem with 1NF that gave rise to 2NF is not the same as the problem with 2NF that gave rise to 3NF, even though they both exhibit harmful redundancy.
What might satisfy you a little more is BCNF. BCNF was actually defined after 3NF but before 4NF, as a stricter variant of 3NF rather than as a new numbered form. It sits between 3NF and 4NF, because it is more restrictive than 3NF but less restrictive than 4NF.
In BCNF, every (non trivial) determinant is a candidate key. That seems to be what you are looking for. I conjecture that any relation that is in 1NF and where every determinant is a candidate key, could be shown to be in 2NF and 3NF. But the proof is beyond me.
2NF and 3NF are essentially historical concepts and your question is a reasonable one. There is no real reason to apply them in practical database design because better tools exist today.
When it comes to teaching there is possibly some justification for mentioning 2NF and 3NF. Doing so allows students to explore the concepts involved (as you have done) while also teaching them a bit about the origins and rationale of design theory. In school maths lessons I was taught long division and differentiation from first principles. No one uses those techniques in practice, they are just teaching aids.
Before checking for 2NF, the relation should be in 1NF. In simple words, a relation in 2NF has only full dependencies of non-key attributes on its key, with no partial dependencies. A full dependency means that if x gives y, then after removing any element of x, what remains no longer determines y. If what remains after removing some element of x still determines y, it is a partial dependency. For 3NF the relation must first be in 2NF, and in 3NF there should be no transitive dependencies, that is, if x gives z, there should be no y such that x gives y and y gives z.
Solution for 2NF: create a table for the partial dependencies and add a foreign key in the new relation that refers to the primary key of the previous relation.
Solution for 3NF: create a relation for each of x gives y and y gives z. Add keys to the relations.
1.
A table is automatically in 3NF if one of the following holds:
(i) If a relation consists of two attributes.
(ii) If 2NF table consists of only one non key attribute.
2.
If X → A is a dependency, then the table is in 3NF, if one of the following conditions exists:
(i) If X is a superkey
(ii) If A is a part of superkey
I got the above claims from this site.
I think that in both the claims, 2nd subpoint is wrong.
The first one says that a table in 2NF will be in 3NF if we have all non-key attributes and the table is in 2NF.
Consider the example R(A,B,C) with dependency A->B.
Here we have no candidate key, so all attributes are non-prime attributes and the relation is not in 3NF but in 2NF.
The second one says that for a dependency of the form X->A if A is part of a super key then it's in 3NF.
Consider the example R(A,B,C) with dependencies A->B, B->C . Here a CK is {A}. Now one of the super keys can be AC and the RHS of FD B->C contains part of AC but still the above relation R is not in 3NF.
I think it should be A should be part of a candidate key and not super key.
Am I correct?
Also can a particular relation be in 1NF, 3NF or 2NF if there are no functional dependencies present?
A CK (candidate key) is a superkey that contains no smaller superkey. A superkey is a unique set of attributes. A relation is a set of tuples. So every relation has a superkey, the set of all attributes. So it has at least one CK.
A FD (functional dependency) holds by definition when each value of a determining set of attributes appears always with the same value for its determined set. Every relation value or variable satisfies "trivial" FDs, the ones where the determined set is a subset of the determining set. Every set of attributes determines {}. So every relation satisfies at least one FD. However, the correct forms of definitions typically specifically talk about non-trivial FDs. Don't use the web, use textbooks, of which dozens are free online, although not all are well-written. Many textbooks also forget about FDs where the determinant and/or determined set is {}.
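The definition of a FD holding in a relation value can be turned directly into a check over rows. The following Python sketch uses made-up sample data (a column A that is unique and a column B that is constant). The function name fd_holds and the row format (a list of dicts) are my own choices for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in `rows` (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value;
        # any later row with the same lhs value must agree with it.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"A": 1, "B": "k"},
    {"A": 2, "B": "k"},
    {"A": 3, "B": "k"},
]
print(fd_holds(rows, ["A"], ["B"]))       # True: A is unique here
print(fd_holds(rows, ["A", "B"], ["A"]))  # True: trivial FD
print(fd_holds(rows, ["A"], []))          # True: every set determines {}
print(fd_holds(rows, [], ["B"]))          # True: B is the same in every row
print(fd_holds(rows, ["B"], ["A"]))       # False: same B, different A
```

Note that a check like this only says what holds in one relation value. Whether a FD holds in a relation variable depends on every value the variable is allowed to take.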
Your first point is not a correct definition of 3NF. Since it's phrased "if..." instead of "if and only if", maybe it's not trying to be a definition. However, it is still wrong. (i) is wrong because a relation with two attributes is not in 3NF if one is a CK and the other has the same value in every tuple, ie it is determined by {}.
Similarly the second point is not a proper definition and also even if you treat it as only a consequence of 3NF (if...) it's false. It would be a definition if it used if and only if and talked about an FD that holds and it said it was a non-trivial FD and some other things were fixed.
Since those are neither correct definitions nor correct implications, there's an unlimited number of ways to disprove them. Read a book (or my posts) and get correct definitions.
Some comments re your reasoning:
First one says that, a table in 2NF will be in 3NF if we have all non key attributes and table is in 2NF.
I have no idea why you think that.
Here we have no candidate key
There's always one or more CKs. You need to read a definition of CK. There are also non-brute-force algorithms for finding them all.
Second one says that, for the dependency of form X->A if A is part of super key then it's in 3NF.
I have no idea why you think that.
A should be part of candidate key and not super key.
A correct definition like the second point does normally say "... or (ii) A-X is part of a CK". But I can't follow your reasoning.
Sound reasoning involves starting from assumptions and writing new statements that we know are true because we applied a definition, a previously proved statement (theorem) or a sound rule of reasoning, eg from 'A implies B' and 'A' we can derive 'B'. You seem to need to read about how to do that.
I was studying functional dependencies and normalization and I've come across a question. The original question is below:
"Given the relation R = {v,w,x,y,z} and functional dependency set {v->w,y->z,yz->v,wx->z} find BCNF composition and check if dependency preservation holds."
First I tried to find minimal cover and came up with this:
Minimal Cover:
v -> w
y -> z
y -> v
wx -> z
Then I tried to find candidate keys, and came up with only one candidate key:
Candidate Keys:
xy
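The cover and the key above can be sanity-checked with attribute closures. This is a sketch in Python, assuming the FD set from the quoted exercise. The names follows_from, ORIGINAL and COVER are just labels I chose.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def follows_from(fds, lhs, rhs):
    """True if lhs -> rhs is implied by `fds`."""
    return set(rhs) <= closure(lhs, fds)

ORIGINAL = [("v", "w"), ("y", "z"), ("yz", "v"), ("wx", "z")]
COVER = [("v", "w"), ("y", "z"), ("y", "v"), ("wx", "z")]
ALL = set("vwxyz")

# The two sets are equivalent if each implies every FD of the other.
same_closure = (all(follows_from(COVER, l, r) for l, r in ORIGINAL)
                and all(follows_from(ORIGINAL, l, r) for l, r in COVER))

# An FD is redundant if the remaining FDs already imply it.
redundant = [fd for fd in COVER
             if follows_from([g for g in COVER if g != fd], *fd)]

# A left side is reducible if dropping one attribute still gives the FD.
reducible = [(l, r) for l, r in COVER for a in l
             if follows_from(COVER, l.replace(a, ""), r)]

xy_is_superkey = closure("xy", COVER) == ALL
smaller_keys = [k for k in ("x", "y") if closure(k, COVER) == ALL]

print(same_closure, redundant, reducible)  # True [] []
print(xy_is_superkey, smaller_keys)        # True []
```

Since x and y never appear on a right-hand side, every key must contain both, so xy is the only candidate key.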
Then I started to check normal forms:
1st Normal Form: check
2nd Normal Form:
I thought the below dependencies are violating 2nd normal form:
1) y -> z
2) y -> v
3) wx -> z
The first two were easy to solve. However, I've never seen an example of the 3rd where the left-hand side is a composite of prime and non-prime attributes. How do we solve this kind of situation? Do we make a new relation for the 3rd making w and x primary key?
If I solve that part, the 3rd and BC normal forms will be easy I guess.
Whether one considers a FD (functional dependency) to "violate 2NF" depends on one's definition of 2NF. A common definition of 2NF is, no FDs hold where a non-prime attribute is partially functionally dependent on a CK (candidate key). So are the violating FDs the ones where a non-prime attribute is partially functionally dependent on a CK? Or the ones where a non-prime attribute is functionally dependent on a proper subset of a CK, by which the preceding FDs are partial? Or both? And/or others? Or what? The fact is that it isn't individual FDs that violate NFs but the set of all FDs that hold. If you want to talk about individual FDs violating then you need to give a definition for 2NF & then give & justify a definition of violating FD based on how the definition talks about such FDs.
The following uses the 2NF definition above & talks about "bad" FDs explicitly disallowed by that definition, where a non-prime attribute is partially functionally dependent on a CK.
Those three FDs are not bad. A FD is partial when its right hand side is functionally determined by a proper/smaller subset of its left hand side. None of those three FDs are partial dependencies on a CK (candidate key). None of them are even partial, because none has a right hand side that is determined by a subset of the left hand side (determinant). And none of them are even on a CK, because none of them have a CK as their left hand side.
You might consider the first two to "violate 2NF" per a 2NF definition that there are no FDs with left side a proper subset of a CK & right side a non-prime attribute. That definition explicitly disallows those FDs. So we do not have 2NF.
However the FDs xy->z & xy->v are partial, because proper/smaller subsets of xy determine z & v. And they are bad: xy is a CK and z & v are non-prime attributes so both have a non-prime attribute partially dependent on a CK. So we do not have 2NF.
wx->z isn't bad. And it doesn't "violate 2NF" per a 2NF definition that there are no FDs with left side a proper subset of a CK & right side a non-prime attribute.
It doesn't matter whether "the left-hand side is a composite of prime and non-prime attributes". What matters is what is mentioned in your definitions. (It happens that you will never see such "an example" of a bad or "violating" FD, because both of those require left-hand sides with only CK attributes.)
Read some academic definitions for partial FD & 2NF. (Many textbooks/presentations/courses are free online.) Memorize and apply definitions, theorems and algorithms exactly. You seem to not understand numerous things:
Being in BCNF implies being in all lower NFs. Getting to BCNF does not require going through lower NFs.
Examples of decompositions you have seen are not presentations of decomposition algorithms.
We don't normalize via successive NFs. We use an algorithm for the NF we want. (Going through lower NFs can even mean good higher-NF designs become unavailable.)
When some FDs hold, all the ones implied by them by Armstrong's axioms also hold.
To determine CKs & NFs it's not enough to know that some FDs hold, we need to know what FDs hold & what FDs don't hold. You need to know a closure or cover of FDs.
Each time we decompose we get new relations & sets of FDs & CKs for each.
The FDs that hold in a component are all those of the original whose attributes are in it. (Those of a closure, not just those of a cover.)
A FD is partial when its right hand side is functionally determined by a proper/smaller subset of its left hand side.
A common 2NF definition explicitly disallows partial FDs of non-prime attributes on CKs.
"Violating FD" is not a helpful term, refer to the things that definitions mention.
I read the following example: relation A(X,Y,Z,P,Q,R) with the following functional dependencies.
Why is this in 1NF?
Could anyone help me?
The diagram is not normal notation. I suppose that arrows point to determined attributes of FDs. I suppose that an arrow that doesn't come from a box means a FD with just one determinant attribute and an arrow that comes from a box means a FD with the boxed attributes as determinant attributes. Find out what the diagram notation means.
If so then the functional dependencies are Y → Z, XYZ → QR and P → QRX.
To show what normal form the relation is in we need to know what definitions you were given for normal forms. It happens that this relation is not in 2nd normal form. So it isn't in any higher normal form. So it is only in 1st normal form. So the only normal form definition we need to know is the one you were given for 2NF. That definition usually involves candidate keys. If so then we need to know what definition you were given for candidate key. The definition of 2NF can involve full and partial FDs. If yours does then we need to know what definitions of full and/or partial FD you were given. Give the definitions.
The only CK is PY because it determines every other attribute but no proper subset of it does. It is the only CK because there is no other such set of attributes. To justify this we need to reference the rules you were given for deriving FDs and CKs. (Eg this includes how we went from one FD list to another.) Give the rules.
But there are then also determinants Y (from Y → Z) and P (from P → QRX) that are proper subsets of that candidate key. So each of non-prime attributes Z, Q, R and X is partially dependent on a CK. But to be in 2NF there must be no non-prime attributes partially dependent on a candidate key. Ie every non-prime attribute must be fully functionally dependent on all CKs. So A is not in 2NF. So the highest normal form it is in is 1NF.
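The reasoning above can be replayed in a few lines of Python. This is a sketch only: it assumes my reading of the diagram (Y → Z, XYZ → QR, P → QRX) and takes PY as the proposed CK, then checks it rather than searching for keys. The variable names are mine.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ATTRS = set("XYZPQR")
FDS = [("Y", "Z"), ("XYZ", "QR"), ("P", "QRX")]  # assumed reading of the diagram
KEY = "PY"

proper_subsets = [c for n in range(len(KEY)) for c in combinations(KEY, n)]
is_superkey = closure(KEY, FDS) == ATTRS
# Minimal: no proper subset of the key determines everything.
is_minimal = all(closure(s, FDS) != ATTRS for s in proper_subsets)

# PY is the only CK (P and Y never appear on a right-hand side),
# so the non-prime attributes are simply everything outside it.
non_prime = ATTRS - set(KEY)

# Map each non-prime attribute to a proper subset of the key that determines it.
partial = {a: "".join(s) for s in proper_subsets
           for a in closure(s, FDS) & non_prime}

print(is_superkey, is_minimal)  # True True
print(sorted(partial.items()))  # [('Q', 'P'), ('R', 'P'), ('X', 'P'), ('Z', 'Y')]
print(not partial)              # False, so A is not in 2NF
```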
The picture doesn't make its meaning very clear in my opinion because it seems to be mixing two different notations for functional dependencies (FDs). Any answer will depend on how you want to interpret the diagram.
I'd hazard a guess that the diagram is supposed to indicate the following set of FDs: XY->Z, Y->QR, P->QRX. If that's correct then the possible candidate key respecting that set of FDs would be {Y,P}. If my interpretation of the diagram is correct then both Y and P are determinants in their own right. Since Y and P are proper subsets of a candidate key of A we can conclude that A violates 2NF and therefore the highest normal form that A can satisfy is 1NF.
Update: Your new picture specifies some dependencies. Collecting the determinant terms together we can summarize as:
P->XQR
XY->QR
Y->Z
I assume these are supposed to be the dependencies actually satisfied by A. On the left-hand side we have P, X and Y so PXY will be a superkey of A. P->X therefore the candidate key (minimal superkey) can only be PY. P->XQR and Y->Z are both FDs with determinants that are proper subsets of the candidate key (PY) and that means those dependencies both violate 2NF. Recap: 2NF prohibits any FD where the left-hand side is a proper subset of a candidate key. So 1NF is the highest normal form of A.
As per 1NF, no two rows of a relation may repeat the same values and no column may hold more than one value in a row. This can increase redundancy, as there will be columns with the same data repeating in many rows.
Name  ID  Course
A     1   Computer
B     2   Arts
C     3   Computer
Here the Course column has repeated values. But no row has a column that holds two values. Hence it is in 1NF.
1NF has the fewest restrictions. So a relation in any other form, such as 2NF or 3NF, is by default also in 1NF.
Consider this analogy
1NF = Living Beings
2NF = Mammals
3NF = Humans
All mammals/2NF are by default living beings/1NF, and so on.
To satisfy the functional dependency X → Y, it is essential that each X value be associated with only one Y value. And thus it satisfies the 1NF criteria which does not allow multiple values in a row for a column.
From the Database Management Systems book: given the relation SNLRWH (each letter denotes an attribute) and the following functional dependencies:
S->SNLRWH (S is the PK)
R->W
My attempt:
First, it is not 3NF: for the second FD, neither R contains W, nor R contains a key, nor W is part of a key.
Second, it is/not 2NF. If we examine the second FD, W is dependent on R, which in turn is not part of a key. STUCK.
2NF is violated if some proper subset of a candidate key appears as a determinant on the left hand side of one of your (non-trivial) dependencies. Ask yourself whether any of your determinants is a subset of a candidate key.
Usually 2NF is violated only when a relation has a composite key - a key with more than one attribute. It is technically possible for a relation with only simple keys (single attribute keys) to violate 2NF if the empty set (∅) happens to be a determinant. Such cases are fairly unusual and rarely thought worthy of consideration because they are so obviously "wrong". For completeness, here's a fun example of that special case. In the following relation Circumference and Diameter are both candidate keys. The dependency in violation of 2NF is ∅ -> Pi, the ratio of the circumference to the diameter.
2NF has to do with partial key dependencies. In order for a relation to fail the test for 2NF, the relation has to have at least one candidate key that has at least two columns.
Since your relation has only one candidate key, and that candidate key has only one column, you can't possibly have a partial key dependency. It passes the test for 2NF.
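Both tests for the SNLRWH example can be spelled out in Python. This is a sketch under the assumptions stated in the question: the two listed FDs are the only ones given, and S is the only candidate key. It checks only the listed FDs, and the variable names are my own.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

ATTRS = set("SNLRWH")
FDS = [("S", "SNLRWH"), ("R", "W")]
KEY = "S"            # assumption: S is the only candidate key
PRIME = set(KEY)

# 2NF: look for non-prime attributes determined by a proper subset of the key.
# The only proper subset of {S} is the empty set.
partially_dependent = set()
for size in range(len(KEY)):
    for part in combinations(KEY, size):
        partially_dependent |= closure(part, FDS) - PRIME

# 3NF: each listed FD X -> A needs A in X, or X a superkey, or A prime.
breaks_3nf = []
for lhs, rhs in FDS:
    if closure(lhs, FDS) == ATTRS:  # determinant is a superkey
        continue
    for attr in sorted(set(rhs) - set(lhs) - PRIME):
        breaks_3nf.append((lhs, attr))

print(partially_dependent)  # set()
print(breaks_3nf)           # [('R', 'W')]
```

Nothing is partially dependent on the key, so the relation passes the 2NF test, while R->W is what keeps it out of 3NF.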