Normalization Theory Explanation needed - database

I'm looking at a specific example of a relation with a composite primary key. Based on its functional dependencies, I know it is in 1NF. While normalizing it to 3NF I came across a situation I have not yet encountered. I followed the steps for all partial dependencies and transitive dependencies, but the last step of normalizing to 3NF requires you to create a relation that contains the primary key and all non-prime attributes dependent on it.
In my specific case, I have the primary key, but no full functional dependencies on it. Do I make a table containing only my composite primary key? Or do I not make one at all?
I am not confusing composite keys with primary keys. See my comment below for why I believe my question is different from that one.

It is perfectly legitimate to have a relation that consists of a composite key and no other attributes. It's not only theoretically valid, but also it happens in the real world.
In such a situation, that relation is merely asserting the existence of something identified by the composite key. A user of the data would use it to test for existence rather than for the kind of lookups that a relation with non-key attributes is typically used for.
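As a minimal sketch of that existence-test usage (the `enrollment` relation and its attribute names are hypothetical, not taken from the question), a key-only relation behaves like a plain set of tuples:

```python
# Hypothetical key-only relation ENROLLMENT(student_id, course_id).
# The composite key is the whole tuple; there are no non-key attributes.
enrollment = {
    (1, "DB101"),
    (1, "OS201"),
    (2, "DB101"),
}

def is_enrolled(student_id, course_id):
    """Existence test: is this key combination asserted by the relation?"""
    return (student_id, course_id) in enrollment

print(is_enrolled(1, "OS201"))  # True
print(is_enrolled(2, "OS201"))  # False
```

There is nothing to look up beyond the key itself; the only question the relation answers is whether a given combination is present.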

FDs (functional dependencies) have nothing to do with 1NF, no matter which of the various meanings for "1NF" you are using. So it's not clear what you're trying to say about 1NF. A relation by definition has a value for each attribute of each tuple. Something that resembles a relation but has a "list of values" where an attribute value of a tuple should be is not a relation, so CKs (candidate keys) & FDs do not apply. If you define a "1NF relation" as one without certain data types (because of some fuzzy application-dependent received wisdom about "atomicity", or in Codd's sense of having no relation-valued attributes) then satisfaction does not depend on whether FDs hold on the design with that data type. (Moreover if the "normalized" "atomic"-attributed version of such a "non-1NF" "non-atomic"-attributed design satisfies a FD then the original has a certain constraint, but it's not a FD constraint.)
FDs that aren't partial are full. The only partial FDs that matter on the way to 2NF & 3NF are partial FDs of non-prime attributes on CKs. When these are gone you have 2NF. (From "followed the steps for all partial dependencies and transitive dependencies" it sounds like your plan is to decompose to 2NF then to 3NF.) Partial FDs just aren't mentioned in a definition of 3NF that requires 2NF. Also, definitions for 3NF and the common algorithm for putting a relation into 3NF just don't make use of partial FDs.
There can also be other partial FDs. They just don't matter. In particular, all the FDs of attributes on proper superkeys are partial. Just follow the definitions for determining what normal form(s) a relation is in and follow the algorithms for putting a relation into a normal form. This goes for all definitions and algorithms. There is no point in worrying that every property you notice might be "bad".
PS You shouldn't put a relation into 3NF by first putting it into 2NF. That can exclude some good 3NF decompositions of the original from being found. Use an algorithm for 3NF. (The usual one for 3NF actually generates decompositions in the slightly stronger EKNF (Elementary Key Normal Form).)
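To make that concrete, here is a rough sketch of the textbook synthesis approach to 3NF, assuming single-letter attribute names and a made-up relation R(A,B,C,D) with FDs A->C and B->D (neither comes from the question). All function names are my own. When no synthesized relation contains a candidate key, the last step adds a relation made of just a key, which is the key-only table the question asks about.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Split right-hand sides into single attributes, dropping trivial parts.
    cover = [(frozenset(l), frozenset([a])) for l, r in fds for a in r if a not in l]
    # Shrink a determinant when the rest of it still determines the target.
    for i, (lhs, rhs) in enumerate(cover):
        for a in sorted(lhs):
            if rhs <= closure(lhs - {a}, cover):
                lhs = lhs - {a}
        cover[i] = (lhs, rhs)
    # Drop FDs that follow from the remaining ones.
    cover = list(dict.fromkeys(cover))
    for fd in list(cover):
        rest = [f for f in cover if f != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover

def find_key(attrs, fds):
    """Shrink the full attribute set down to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        if closure(key - {a}, fds) == set(attrs):
            key.discard(a)
    return key

def synthesize_3nf(attrs, fds):
    attrs = frozenset(attrs)
    cover = minimal_cover(fds)
    # One schema per distinct determinant in the minimal cover.
    groups = {}
    for lhs, rhs in cover:
        groups.setdefault(lhs, set(lhs)).update(rhs)
    schemas = list(dict.fromkeys(frozenset(s) for s in groups.values()))
    # If no schema is a superkey of the original, add a key-only schema.
    if not any(closure(s, cover) == attrs for s in schemas):
        schemas.append(frozenset(find_key(attrs, cover)))
    # Drop schemas strictly contained in another schema.
    return [s for s in schemas if not any(s < t for t in schemas)]

result = synthesize_3nf("ABCD", [("A", "C"), ("B", "D")])
print(["".join(sorted(s)) for s in result])  # ['AC', 'BD', 'AB']
```

In this example the composite key {A,B} has no non-prime attribute fully dependent on it, and the algorithm still emits the relation (A,B).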

Related

Exactly what is 2NF and 3NF?

What's the main point of Normalization?
I mean if a relation is not in 2NF, it is because of partial dependency i.e. a non key attribute is dependent on a part of a candidate key.
So, let's say, for a relation R(A,B,C) with FDs:
AB->C, B->C
Clearly, AB is the candidate key and B->C is the partial dependency.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
Now, if a relation is not in 3NF, it is because a non key attribute is dependent on another non key attribute i.e. to say
if FDs for a relation R(A,B,C) are:
A->B,B->C
Clearly, A is the key and B->C shows transitive dependency, so not in 3NF.
Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.
So, what's the exact difference?
I mean, why such a marked distinction? Essentially in both of the cases the action is same.
Decompose the relation using the dependency where the determinant (B here) is either PART of a key or not.
Why have separate terms like partial dependency or transitive dependency?
Why not just see, if there exists a dependency wherein a non prime attribute is determined by a something which is NOT a candidate key( no matter whether it is a partial key or another non prime attribute )
Why can't we implement a method like this:
1NF -- having all elements in the atomic form
XNF -- if there's any dependency of the form non_key -> non_prime_attribute(s), decompose the relation with one of the new relations having this particular "non_key" as the key with those non_prime_attributes
BCNF -- where for all the dependencies of the form X->Y, X is a superkey
Can we have such NF condition format? Does it combine all the conditions?
So, what's the exact difference?
2NF is not 3NF & definitions of 2NF are not definitions of 3NF. There isn't any particular semantic or syntactic structural similarity that would leave some kind of "difference" other than that a 2NF relation can have the sort of problem FD (functional dependency) that violates 3NF that a 3NF relation doesn't have. You can find definitions all over the place. You almost give them correctly here yourself. But a NF (normal form) is a condition, not a process. What do you mean "actions are the same"? Being in 3NF implies being in 2NF, so naturally decomposing to 3NF also gives 2NF. But there are relations that are in 2NF but not in 3NF, and there may be decompositions for a relation to 2NF that don't get to 3NF. Those decompositions will involve a removal of all problem partial FDs that does not result in the removal of all problem transitive FDs.
(Because 3NF is always achievable and there are no other disadvantages compared to 2NF, 2NF isn't even useful. It's just a condition that was discovered first that is not as strong as 3NF.)
(3NF is frequently defined in terms of 2NF plus no transitive dependencies of non-prime attributes on CKs, but actually no such FDs implies no partial FDs of non-prime attributes on CKs, hence 2NF, so the first condition is redundant.)
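A small sketch of that point, using the two R(A,B,C) examples from the question: a single 3NF test (for each listed FD X -> A, either X is a superkey or A is prime) flags both the "partial" and the "transitive" case. The helper names and the labelling of each violation are my own, and the check only looks at the FDs as listed.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Brute force: minimal attribute sets whose closure is everything."""
    attrs, keys = set(attrs), []
    for size in range(len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            c = set(combo)
            if not any(k <= c for k in keys) and closure(c, fds) == attrs:
                keys.append(c)
    return keys

def violations_3nf(attrs, fds):
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    found = []
    for lhs, rhs in fds:
        for a in sorted(set(rhs) - set(lhs)):
            # Violation: determinant is not a superkey and target is non-prime.
            if closure(lhs, fds) != set(attrs) and a not in prime:
                kind = "partial" if any(set(lhs) < k for k in keys) else "transitive"
                found.append(("".join(sorted(lhs)), a, kind))
    return found

print(violations_3nf("ABC", [("AB", "C"), ("B", "C")]))  # [('B', 'C', 'partial')]
print(violations_3nf("ABC", [("A", "B"), ("B", "C")]))   # [('B', 'C', 'transitive')]
```

The label only records whether the offending determinant happens to sit inside a candidate key; the test that finds the violation is the same in both cases.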
Why not just see, if there exists a dependency wherein a non prime attribute is determined by a something which is NOT a candidate key
Why should that condition be helpful? It is not a description of just getting rid of the problem FDs of 2NF & 3NF--that's what putting into 3NF does.
Getting rid of non-trivial FDs that are not determined by superkeys happens to give BCNF. It implies 2NF & 3NF. But it is different from both of them. A BCNF relation exhibits no FD-based update anomalies. It is always achievable. However 3NF is always achievable while "preserving FDs", whereas BCNF is not. There are cases where in order for a FD that held in the original to be enforced in a view/query that gives it via constraints on its components we need an EQD (equality dependency) constraint. That says two column sets have the same set of subrow values, which is more expensive to enforce than a FD. Either you have BCNF & an EQD & fewer update anomalies or you have 3NF/EKNF & a FD & certain update anomalies.
The NF that really matters is 5NF, which implies BCNF, with no update anomalies & with other benefits. (We might then decide to denormalize for performance reasons.)
PS Normalization to a given NF does not necessarily involve normalization to lower NFs.
It almost sounds as though you want to know why they called these two normal forms by different names instead of inventing just one form that covers both cases. If that's not the case, please ignore this answer.
Part of the answer is that the forms weren't discovered at the same time. And part of the answer is that the problem with 1NF that gave rise to 2NF is not the same as the problem with 2NF that gave rise to 3NF, even though they both exhibit harmful redundancy.
What might satisfy you a little more is BCNF. BCNF was actually discovered later than 4NF, so that name was already in use. But BCNF has to be placed between 3NF and 4NF, because it is more restrictive than 3NF but less restrictive than 4NF. So it was discovered "out of sequence", so to speak.
In BCNF, every (non trivial) determinant is a candidate key. That seems to be what you are looking for. I conjecture that any relation that is in 1NF and where every determinant is a candidate key, could be shown to be in 2NF and 3NF. But the proof is beyond me.
2NF and 3NF are essentially historical concepts and your question is a reasonable one. There is no real reason to apply them in practical database design because better tools exist today.
When it comes to teaching there is possibly some justification for mentioning 2NF and 3NF. Doing so allows students to explore the concepts involved (as you have done) while also teaching them a bit about the origins and rationale of design theory. In school maths lessons I was taught long division and differentiation from first principles. No one uses those techniques in practice, they are just teaching aids.
Before checking for 2NF, the relation should be in 1NF. In simple terms, a 2NF relation has only full dependencies of non-prime attributes on its keys, with no partial dependencies. A dependency "x gives y" is full if, after removing any element of x, what remains no longer determines y. If some element can be removed from x and the rest still determines y, it is a partial dependency. For 3NF we first have to check for 2NF; in addition, 3NF should not have any transitive dependencies, i.e. cases where x gives z only because x gives y and y gives z.
Solution for 2NF: create a table for the partial dependencies and add a foreign key in the new relation that refers to the primary key of the previous relation.
Solution for 3NF: create one relation for x gives y and another for y gives z. Add keys to the relations.

Can a table be in 3NF with no primary keys?

1.
A table is automatically in 3NF if one of the following holds:
(i) If a relation consists of two attributes.
(ii) If 2NF table consists of only one non key attribute.
2.
If X → A is a dependency, then the table is in 3NF, if one of the following conditions exists:
(i) If X is a superkey
(ii) If A is a part of superkey
I got the above claims from this site.
I think that in both the claims, 2nd subpoint is wrong.
The first one says that a table in 2NF will be in 3NF if we have all non-key attributes and the table is in 2NF.
Consider the example R(A,B,C) with dependency A->B.
Here we have no candidate key, so all attributes are non-prime attributes and the relation is not in 3NF but in 2NF.
The second one says that for a dependency of the form X->A if A is part of a super key then it's in 3NF.
Consider the example R(A,B,C) with dependencies A->B, B->C . Here a CK is {A}. Now one of the super keys can be AC and the RHS of FD B->C contains part of AC but still the above relation R is not in 3NF.
I think it should be A should be part of a candidate key and not super key.
Am I correct?
Also can a particular relation be in 1NF, 3NF or 2NF if there are no functional dependencies present?
A CK (candidate key) is a superkey that contains no smaller superkey. A superkey is a unique set of attributes. A relation is a set of tuples. So every relation has a superkey, the set of all attributes. So it has at least one CK.
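A brute-force sketch of that definition, applied to the question's R(A,B,C) with A->B (the function names are mine, and this is the exhaustive method rather than one of the smarter algorithms):

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Superkeys that contain no smaller superkey, found smallest first."""
    attrs, keys = set(attrs), []
    for size in range(len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            c = set(combo)
            if not any(k <= c for k in keys) and closure(c, fds) == attrs:
                keys.append(c)
    return keys

# R(A,B,C) with A->B: the only CK is {A,C}.
print([sorted(k) for k in candidate_keys("ABC", [("A", "B")])])  # [['A', 'C']]
# With no non-trivial FDs at all, the set of all attributes is the CK.
print([sorted(k) for k in candidate_keys("ABC", [])])            # [['A', 'B', 'C']]
```

The search always finds at least one key, because the full attribute set is a superkey of any relation.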
A FD (functional dependency) holds by definition when each value of a determining set of attributes appears always with the same value for its determined set. Every relation value or variable satisfies "trivial" FDs, the ones where the determined set is a subset of the determining set. Every set of attributes determines {}. So every relation satisfies at least one FD. However, the correct forms of definitions typically specifically talk about non-trivial FDs. Don't use the web, use textbooks, of which dozens are free online, although not all are well-written. Many textbooks also forget about FDs where the determinant and/or determined set is {}.
Your first point is not a correct definition of 3NF. Since it's phrased "if..." instead of "if and only if", maybe it's not trying to be a definition. However, it is still wrong. (i) is wrong because a relation with two attributes is not in 3NF if one is a CK and the other has the same value in every tuple, i.e. it is determined by {}.
Similarly the second point is not a proper definition and also even if you treat it as only a consequence of 3NF (if...) it's false. It would be a definition if it used if and only if and talked about an FD that holds and it said it was a non-trivial FD and some other things were fixed.
Since those are neither correct definitions nor correct implications, there's an unlimited number of ways to disprove them. Read a book (or my posts) and get correct definitions.
Some comments re your reasoning:
First one says that, a table in 2NF will be in 3NF if we have all non key attributes and table is in 2NF.
I have no idea why you think that.
Here we have no candidate key
There's always one or more CKs. You need to read a definition of CK. There are also non-brute-force algorithms for finding them all.
Second one says that, for the dependency of form X->A if A is part of super key then it's in 3NF.
I have no idea why you think that.
A should be part of candidate key and not super key.
A correct definition like the second point does normally say "... or (ii) A-X is part of a CK". But I can't follow your reasoning.
Sound reasoning involves starting from assumptions and writing new statements that we know are true because we applied a definition, a previously proved statement (theorem) or a sound rule of reasoning, eg from 'A implies B' and 'A' we can derive 'B'. You seem to need to read about how to do that.

When does BCNF not preserve functional dependencies, and should I then use 3NF?

When is BCNF not able to preserve functional dependencies?
When is a 3NF decomposition desired instead of a BCNF decomposition preserving functional dependencies?
Please explain with an example.
I saw this question but it does not answer my question:
Decomposition that does not preserve functional dependency
When is BCNF not able to preserve functional dependencies?
Turns out this question is problematic in a certain way that "ok you defined 'prime number' but when is a number prime?" is, but "ok you defined 'simplest form of a fraction' but when is a fraction in simplest form?" isn't. Definition(s) say "when". But what you mean is something like, multiple conditions apply so what more simple/intuitive definition or non-brute-force algorithm characterizes this? But it has been shown that (informally) there is no non-exponential/non-exhaustive algorithm to enumerate BCNF decompositions that do/don't preserve FDs (functional dependencies).
When is a 3NF decomposition desired instead of a BCNF decomposition [not] preserving functional dependencies?
If a 3NF design is not in BCNF then it preserves a FD that is not out of a superkey and so cannot be declaratively enforced in most SQL DBMSs. But the BCNF design, not having preserved the FD, needs a constraint enforced that is equivalent to two SQL FK (foreign key) constraints to each other, which cannot be declaratively enforced in most SQL DBMSs. Since there's nothing special about cycles that prevents DBMSs from enforcing them and the two designs can represent each other, there isn't any reason per se why a DBMS couldn't support both.
There's a similar mental complexity for these two design forms--3NF plus FDs not out of CKs vs BCNF plus extra equality dependencies. But since the 3NF relation is the join of its BCNF components, the meaning of a 3NF tuple is the AND/conjunction of the meanings of the BCNF components. Since a user implicitly knows this and should be explicitly told it, and since constraints are not needed to query or modify a database (they're for integrity), the BCNF design is in some sense simpler. But if the user is always wanting to update both components then the 3NF design is in some sense simpler.
Thus, in case we are not able to get a dependency-preserving BCNF decomposition, it is generally preferable to opt for BCNF, since checking functional dependencies other than primary key constraints is difficult in SQL.
-- Database System Concepts 6th Edition (2011) by Silberschatz, Korth & Sudarshan
You can find an example facing this choice in most textbooks, and dozens are online in pdf. It must involve overlapping (composite) CKs (candidate keys).
The meaning of an SJT tuple (s,j,t)--simplified notation--is that student s is taught subject j by teacher t. The following constraints apply:
For each subject, each student of that subject is taught by only one teacher
Each teacher teaches only one subject (but each subject is taught by several teachers).
[...] From the first constraint, we have the FD {S,J} → T. From the second constraint, we have the FD T → J.
-- An Introduction to Database Systems 8th Edition (2004) by Date
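Here is a sketch of the trade-off on that SJT example. The two FDs come from the quote; the decomposition into (T,J) and (S,T) is my assumption of the BCNF split one would normally pick, and the preservation test is the standard closure-based one that never computes projected FD sets explicitly.

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, schemas, fds):
    """Can `fd` be enforced using only FDs local to the component schemas?"""
    lhs, rhs = fd
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            gained = closure(z & set(schema), fds) & set(schema)
            if not gained <= z:
                z |= gained
                changed = True
    return set(rhs) <= z

attrs = "SJT"
fds = [("SJ", "T"), ("T", "J")]

# BCNF requires every non-trivial determinant to be a superkey; T is not.
bcnf_violations = [(l, r) for l, r in fds if closure(l, fds) != set(attrs)]
print(bcnf_violations)  # [('T', 'J')]

# Assumed BCNF decomposition of SJT.
decomposition = ["TJ", "ST"]
print(preserved(("T", "J"), decomposition, fds))   # True
print(preserved(("SJ", "T"), decomposition, fds))  # False
```

SJT itself is in 3NF because J is part of the candidate key {S,J}, so the choice is between keeping {S,J} -> T as an FD on one table or enforcing it across the two BCNF components.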
(A 3NF design can suffer from further problems that could be eliminated by further decomposing the BCNF design to higher normal forms. This is why we should always decompose to 5NF then if desired explicitly denormalize. So any non-BCNF 3NF table should have come from such a denormalization.)

Can a 2NF database already be in 3NF?

I'm doing a homework question where I have to convert a database to 1NF, 2NF and 3NF. I have gotten to 2NF and it does not appear to have any transitive dependencies. Does that mean that it is already in 3NF?
Yes. When a relation (variable or value) is in a given normal form it can also be in higher normal forms at the same time. (But beware that sometimes people sloppily say that a relation is in a given normal form but they mean that it's in that normal form but also not any higher one.)
Being in a normal form is a property of a relation. The way they are named, 1-2-3-BCNF-4-5 are stricter and stricter conditions. So when a relation meets one of those conditions it meets all the preceding ones and it might meet later ones. You happen to have a 2NF relation that is also a 3NF relation. Or to put that another way, you have a 3NF relation that, like every 3NF relation, is also in 2NF. You just happened to notice that it was in 2NF before you noticed it was in 3NF.
Yes, unless you missed a transitive functional dependency.

Understanding Database Normalization - Second Normal Form(2NF)

I have been learning Normalization from "Fundamentals of Database Systems by Elmasri and Navathe (6th edition)" and I am having trouble understanding the following part about 2NF.
The following image is an example given under 2NF in the textbook
The candidate key is {SSN,Pnumber}
The dependencies are
SSN,Pnumber -> hours, SSN -> ename, pnumber->pname, pnumber -> plocation
The formal Definition:
A relation schema R is in 2NF if every nonprime attribute A in R is
fully functionally dependent on the primary key of R.
for example in the above picture:
if suppose, I define an additional functional dependency SSN -> hours, then taking the two functional dependencies,
{SSN,Pnumber} -> hours and SSN -> hours
the relation won't be in 2NF, because SSN -> hours is now a partial functional dependency, as SSN is a proper subset of the given candidate key {SSN,Pnumber}.
Looking at the relation and its general definition of 2NF, I presume that the above relation is in 2NF.
As far as my understanding goes, this is how I understand what 2NF is:
A relation is in 2NF if one cannot find a proper subset (prime attributes)
of the on the left hand side (candidate key) of a functional dependency
which defines the NPA(non prime attribute).
My first question is, Why is the above relation not in 2NF? (The textbook has considered the above relation as not in 2NF)
There are, however, informal guidelines (steps, per the textbook, that a person who does not know normalization can take to reduce redundancy) defined at the beginning of this chapter, which are:
■ Making sure that the semantics of the attributes is clear in the schema
■ Reducing the redundant information in tuples
■ Reducing the NULL values in tuples
■ Disallowing the possibility of generating spurious tuples
The guideline mentioned is as follows:
My second question is, If the above steps described are taken into account, and consider why the following relation is not in 2NF, do you assume the following functional dependencies, which are,
{SSN,Pnumber} -> Pname
{SSN,Pnumber} -> Plocation
{SSN,Pnumber} -> Ename
making the decomposition of the relation correct? If the functional dependencies assumed are incorrect, then what are the factors leading for the relation to not satisfy 2NF condition?
Looked at from a general point of view ... because the table contains more than one primary attribute and the information stored concerns both employee and project information, one can point out that those need to be separated. As Pnumber is a primary attribute of the composite key, the redundancy can somehow be intuitively guessed. This is because the semantics of the attributes are known to us.
what if the attributes were replaced with A,B,C,D,E,F
My Third question is, Are functional dependencies pre-determined based on "functionalities of database and a database designer having domain knowledge of the attributes" ?
Because based on the data and relation state at a given point the functional dependencies can change: one that was valid in one state can become invalid in another. In general this can be said for any non-primary attribute determining a non-primary attribute.
The formal definition :
A functional dependency, denoted by X → Y, between two sets of
attributes X and Y that are subsets of R specifies a constraint on the
possible tuples that can form a relation state r of R. The constraint is
that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must
also have t1[Y] = t2[Y].
So won't predefining a functional dependency be wrong, as one cannot generalize the relation state at any given point?
Pardon me if my basic understanding of things is flawed to begin with.
Why is the above relation not in 2NF?
Your original/first/informal "definition" of 2NF is garbled and not helpful. Even the quote from the textbook is wrong since 2NF is not defined in terms of "the PK (primary key)" but rather in terms of all the CKs (candidate keys). (Their definition makes sense if there is only one CK.)
A table is in 2NF when there are no partial dependencies of non-prime attributes on CKs. Ie when no determinant of a non-prime attribute is a proper/smaller subset of a CK. Ie when every non-prime attribute is fully functionally dependent on every CK.
Here the only CK is {Ssn, Pnumber}. But there are FDs (functional dependencies) out of {Ssn} and {Pnumber}, both of which are smaller subsets of the CK. So the original table is not in 2NF.
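A short sketch of that check on the textbook's FDs, taking {Ssn, Pnumber} as the only CK as stated above (the function names are mine). It lists each proper subset of the key that determines a non-prime attribute:

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, attrs, fds):
    """Proper subsets of `key` that determine non-prime attributes.

    Assumes `key` is the only candidate key, so every attribute
    outside it is non-prime.
    """
    non_prime = set(attrs) - set(key)
    found = {}
    for size in range(len(key)):
        for part in combinations(sorted(key), size):
            determined = closure(part, fds) & non_prime
            if determined:
                found[part] = sorted(determined)
    return found

attrs = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

print(partial_dependencies({"Ssn", "Pnumber"}, attrs, fds))
# {('Pnumber',): ['Plocation', 'Pname'], ('Ssn',): ['Ename']}
```

A non-empty result means the table is not in 2NF. Only Hours depends on the whole key.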
If the above statement is taken into account, do you assume the following functional dependencies
so won't the same process of the decomposition shown based on the informal way alone be difficult each time such a case arrives?
A table holds the rows that make some predicate (statement template parameterized by column names) into a true proposition (statement). Given the business rules, only certain business situations can arise. Then given the table predicates, which give table values from a business situation, only certain database values can arise. That leads to certain tables having certain FDs.
However, given some FDs that hold, we can formally use Armstrong's axioms to get all other FDs that must also hold. So we can use both informal and formal ways to find which FDs hold and don't hold.
There are also shorthand rules that follow from the axioms. Eg if a set of attributes has a different subrow value in each tuple then so does every superset of it. Eg if a FD holds then every superset of its determinant determines every subset of its determined set. Eg every superset of a superkey is a superkey & no proper subset of a CK is a CK. There are also algorithms.
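In practice, deciding whether a given FD follows from others is usually done with attribute closure instead of applying the axioms one at a time. A minimal sketch, with made-up FDs A->BC and C->D over single-letter attributes:

```python
def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs must hold whenever all of `fds` hold."""
    return set(rhs) <= closure(lhs, fds)

fds = [("A", "BC"), ("C", "D")]

print(implies(fds, "A", "D"))   # True: A -> C and C -> D
print(implies(fds, "AE", "B"))  # True: superset of a determinant, subset of its determined set
print(implies(fds, "D", "A"))   # False: nothing determines A from D
```

The second call is an instance of the shorthand rule about supersets of determinants and subsets of determined sets.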
Are functional dependencies pre-determined based on "functionalities of database and a database designer having domain knowledge of the attributes" ?
When normalizing we are concerned with the FDs that hold no matter what the business situation is, ie what the database state is. Each table for each business can have its own particular FDs per the table predicate & the possible business situations.
PS Do "make sense" of formal things in terms of the real world when their definitions are in terms of the real world. Eg applying a predicate to all possible situations to get all possible table values. But once you have the necessary formal information, only use formal definitions and procedures. Eg determining that a FD holds for a table because it holds in every possible table value.
so would any general table be in 2NF based on a solo condition of a table having a composite primary key?
There are tables in 5NF (hence too all lower NFs) with all sorts of mixes of composite & non-composite CKs. PKs don't matter.
It is frequently wrongly said that having no composite CKs guarantees 2NF. A table without composite keys and where {} does not determine any attribute is in 2NF. But if {} determines an attribute then it's a proper/smaller subset of any/every CK with any attributes. {} determines an attribute when every row has to have the same value for that attribute.
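A sketch of that corner case, on a made-up R(A,B) whose only CK is the single attribute {A}. An FD with an empty determinant is written here as `("", "B")`; that encoding and the function names are my own.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    attrs, keys = set(attrs), []
    for size in range(len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            c = set(combo)
            if not any(k <= c for k in keys) and closure(c, fds) == attrs:
                keys.append(c)
    return keys

def is_2nf(attrs, fds):
    """No proper subset of any CK (including {}) determines a non-prime attribute."""
    keys = candidate_keys(attrs, fds)
    non_prime = set(attrs) - set().union(*keys)
    for key in keys:
        for size in range(len(key)):
            for part in combinations(sorted(key), size):
                if closure(part, fds) & non_prime:
                    return False
    return True

print(is_2nf("AB", [("A", "B")]))             # True
print(is_2nf("AB", [("A", "B"), ("", "B")]))  # False: {} -> B, and {} is a proper subset of {A}
```

The second relation has no composite key at all, yet it fails 2NF because B is constant across all rows.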
Why is the above relation in 2NF?
EP1, EP2, and EP3 are in 2NF because, for each one, the key identifies the non-key. No part of any key identifies any part of any non-key. That is what is meant by for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].
By contrast, you might say EMP_PROJ is over-specified. If ssn identifies ename (as the text says it does), then the combination of {ssn, pnumber} is too much. There exists a subset of the key {ssn,pnumber} that identifies a part of the non-key, {ename}. That situation does not occur in a table conforming to 2NF, as EP1, EP2, and EP3 illustrate.
Are functional dependencies ... based on ... domain knowledge of the attributes?
Emphatically, yes! That's all they're based on. The DBMS is just a logic machine. The ideas of "employee" and "hours" don't exist for it. The database designer chooses to define tables that model some real-world universe of discourse, and imposes meaning on the columns. He gives names to the attributes (above) in X and Y. He decides which columns serve to identify a row based on what is true about the universe being modeled.
if a table has a composite primary key, regardless of the functional dependencies is not in 2NF?
No. Remember, 2NF is defined in terms of FDs. What could it mean to speak of conforming to 2NF "regardless" of them?
The number of columns in the key is immaterial. It's some set, X, identifying the complement, Y.
I'm not sure if I thoroughly understand your questions, but I'll give a try to explain.
Your first statement about 2NF:
a relation is in 2NF if one cannot find a proper subset on the left hand side of a functional dependency which defines the NPA
is correct, as well as your supposition
if {SSN,Pnumber} -> hours and SSN -> hours then this relation wont be in 2NF
because that means you could determine 'hours' from 'SSN' alone, so using the composite key {SSN,Pnumber} to determine 'hours' would be redundant, and thus violates the 2NF requirements.
What you call the left hand side of an FD is usually called the determinant; when it determines every attribute of the relation it is a key. You use the key to find the related data. In order to save space (and reduce complexity), you should always try to find a minimal key, and break up larger tables into smaller ones if possible, so you do not have to save information in more places than necessary. This is what normalization to the normal forms is all about. It has been studied for about half a century now, substantial theory on the matter has been developed, and some rules have crystallized from it, like 1NF, 2NF, 3NF etc.
Your second question confuses me a lot, because from what you are saying, it seems you already understand this.
Could there be some confusion about the FD's? From the figure, it seems to me as they are defined like this:
{SSN,Pnumber} -> hours
{SSN} -> ename
{Pnumber} -> Pname,Plocation
Just like the three lower tables are modeled, together they add up to the relation (table) modeled above.
So, in the first table, you would need the composite key {SSN,Pnumber} to access any data in the relation (search in the table), while that clearly is not necessary for most of the fields.
Now, I'm not sure about what purpose that table would fulfill in real life. While that is not formally necessary, as long as the FD's are given, it might be easier to imagine why the design will benefit from normalization.
So let's say it's about recording work hours per employee per project in some organization. SSN identifies the employee (whose name is also stored as ename because it is easier to remember, but could be duplicated), and Pnumber identifies the project, whose name and location are also stored for much the same reason.
Then if you as a manager need to register that an employee worked another few hours on some project, you would use your manager app on your device, which in turn will update the tables seamlessly (you cannot expect managers to understand the logic of normalization).
Behind the scenes, however, it would amount to some query, in SQL that would be an 'INSERT' statement which added another row to the relevant table(s).
Now you can see that in the above table, you would have to insert all six attributes, while with the normalized tables below, you will only need to add a row to table EP1, consisting of three attributes. In a large organization with thousands of employees delivering their worksheets every week, that will quickly become a huge difference in storage requirements. That has a number of benefits, perhaps the most significant being search speed.
Your third question I don't understand at all, I'm afraid. In a way you could say FDs are predetermined once you have decided what data you will save in your database. The FDs are not supposed to change. When modeled in the DB, they will not change. If you later find you need to alter the design, then that will be new relations with new FDs.
The text you seem to be quoting from somewhere simply says that if you have the FD X -> Y (X gives or determines Y) then if you have any two tuples (records) in that relation (table) that have the same value of X, they must also have the same value of Y. Or in our example, if Pnumber somewhere is given the value of 888, Pname is 'Battleship' and Plocation is 'Kitchen Sink', then if somewhere else (some other record) the Pnumber 888 is used then also there Pname must be 'Battleship' and Plocation must be 'Kitchen Sink', because Pname and Plocation are functionally dependent on Pnumber.
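That definition translates almost word for word into a check over a set of rows. A minimal sketch using the Pnumber 888 example (the Ssn values and the 'Submarine' row are invented for illustration):

```python
def fd_holds(rows, lhs, rhs):
    """Check t1[X] = t2[X] implies t1[Y] = t2[Y] for all pairs of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X; any later mismatch breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"Ssn": "111", "Pnumber": 888, "Pname": "Battleship", "Plocation": "Kitchen Sink"},
    {"Ssn": "222", "Pnumber": 888, "Pname": "Battleship", "Plocation": "Kitchen Sink"},
]
print(fd_holds(rows, ["Pnumber"], ["Pname", "Plocation"]))  # True

# A hypothetical row that reuses Pnumber 888 with a different Pname.
bad_rows = rows + [
    {"Ssn": "333", "Pnumber": 888, "Pname": "Submarine", "Plocation": "Kitchen Sink"},
]
print(fd_holds(bad_rows, ["Pnumber"], ["Pname", "Plocation"]))  # False
```

Note that a check like this can only show that one particular state violates an FD. It cannot establish that the FD holds in every state, which is why FDs are declared from the meaning of the data and not read off the current rows.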
Now that was almost another chapter in your textbook, or what? Hope it helps, because it took me some time to write :-)
A table can be said to be in 2NF, if the primary key is composed of multiple columns, and that if for each row these columns were concatenated together into a single string, then the resulting column would qualify as the primary key. Alternatively a single column primary key will also qualify as 2NF.
In this case the same employee could have multiple phone numbers (PNUMBER), so you cannot have a compound primary key that includes the phone number.

Resources