Third Normal Form DB - database

I have a table with these columns:
person_id, name, age
person_id is the primary key.
Does age depend on both name and person_id, or only on person_id?
If I want the table to be in 3NF, should I decompose it into two tables?

It depends only on person_id, so you don't need to decompose the table.
And if name is an alternate key (that would be very strange), name is unique as well, and again you wouldn't need to decompose your table.
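To make "depends on" concrete, here is a minimal sketch of a functional dependency check over sample rows. The helper name fd_holds and the sample people are made up for illustration. Note that sample data can only refute a dependency, never prove it; whether person_id determines age is ultimately a statement about the design, not about three rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values maps to exactly one combination of rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data: two different people share a name.
people = [
    {"person_id": 1, "name": "Ann", "age": 30},
    {"person_id": 2, "name": "Ann", "age": 41},
    {"person_id": 3, "name": "Bob", "age": 30},
]

id_determines_age = fd_holds(people, ["person_id"], ["age"])
name_determines_age = fd_holds(people, ["name"], ["age"])
```

With this data the check on person_id passes and the check on name fails, because the two people named Ann have different ages.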

Related

Make an entry mandatory if it exists in a table

I have the following design question:
I have an athlete entity, and an athlete can have many nationalities, so I have a second table called countries. There is a many-to-many relationship between athlete and countries, so I created another table, athlete_country, to resolve it.
My question: is there a way to make an athlete_country entry mandatory for every entry in the athlete table?
I am working with PostgreSQL. Is there a way to do this in another database server?
No, it is not possible to do it this way, for a logical reason: the athlete_country table references the athlete table, and if you add a back reference (you can in fact create one), then with ordinary, immediately checked constraints you will not be able to insert a row into either table, because each row would have to reference a row in the other table that hasn't been inserted yet.
The solution is to use a many-to-one relationship in addition to the many-to-many relationship you have described. For example, you can add a "primary_country" field to the athlete table that references the country table directly. If that field is also declared NOT NULL, you can be sure that every athlete is related to at least one country, specified in "primary_country", and optionally to other countries listed in the athlete_country table.
create table country(id serial primary key, name text);
-- primary_country is NOT NULL so that every athlete has at least one country
create table athlete(id serial primary key, name text, primary_country int not null references country(id));
create table athlete_country(athlete_id int references athlete(id), country_id int references country(id), primary key (athlete_id, country_id));
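A quick way to see the effect is to try a few inserts. The sketch below uses SQLite through Python's sqlite3 module rather than PostgreSQL (an assumption made so it runs without a server), and it assumes primary_country is declared NOT NULL. The helper try_insert and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE athlete (
        id INTEGER PRIMARY KEY,
        name TEXT,
        primary_country INTEGER NOT NULL REFERENCES country(id)
    );
    CREATE TABLE athlete_country (
        athlete_id INTEGER REFERENCES athlete(id),
        country_id INTEGER REFERENCES country(id),
        PRIMARY KEY (athlete_id, country_id)
    );
    INSERT INTO country (id, name) VALUES (1, 'Spain');
""")

def try_insert(sql):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# No country at all: violates NOT NULL.
no_country = try_insert("INSERT INTO athlete (name) VALUES ('Ana')")
# A country that does not exist: violates the foreign key.
bad_country = try_insert("INSERT INTO athlete (name, primary_country) VALUES ('Ana', 99)")
# A valid primary country: accepted.
good = try_insert("INSERT INTO athlete (name, primary_country) VALUES ('Ana', 1)")
```

Only the last insert succeeds, so no athlete row can exist without a country.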

Normalize table to 3rd normal form

This is obviously a homework question. I can't understand my professor and have no idea what he said during the lecture. I need step-by-step instructions to normalize the following table, first into 1NF, then 2NF, then 3NF.
I appreciate any help and instruction.
Okay, I hope I remember all of them correctly, let's start...
Rules
To make them very short (and not very precise, just to give you a first idea of what it's all about):
NF1: A table cell must not contain more than one value.
NF2: NF1, plus all non-primary-key columns must depend on all primary key columns.
NF3: NF2, plus non-primary key columns may not depend on each other.
Instructions
NF1: find table cells containing more than one value, put those into separate columns.
NF2: find columns depending on fewer than all primary key columns, and put them into another table which has only those primary key columns they really depend on.
NF3: find columns which depend on other non-primary-key columns, in addition to depending on the primary key. Put the dependent columns into another table.
Examples
NF1
A column "state" has values like "WA, Washington". NF1 is violated, because that's two values, abbreviation and name.
Solution: To fulfill NF1, create two columns, STATE_ABBREVIATION and STATE_NAME.
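As a rough sketch of that split, assuming every value has the form "abbreviation, name" with a single separating comma (the second sample row and the helper name split_state are made up for illustration):

```python
rows = [{"state": "WA, Washington"}, {"state": "OR, Oregon"}]

def split_state(row):
    """Turn the combined "state" value into two single-valued columns."""
    # Split on the first comma only, then trim surrounding whitespace.
    abbreviation, name = (part.strip() for part in row["state"].split(",", 1))
    return {"STATE_ABBREVIATION": abbreviation, "STATE_NAME": name}

normalized = [split_state(r) for r in rows]
```

Each resulting row holds exactly one value per column, which is what NF1 asks for.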
NF2
Imagine you've got a table with these 4 columns, expressing international names of car models:
COUNTRY_ID (numeric, primary key)
CAR_MODEL_ID (numeric, primary key)
COUNTRY_NAME (varchar)
CAR_MODEL_NAME (varchar)
The table may have these two data rows:
Row 1: COUNTRY_ID=1, CAR_MODEL_ID=5, COUNTRY_NAME=USA, CAR_MODEL_NAME=Fox
Row 2: COUNTRY_ID=2, CAR_MODEL_ID=5, COUNTRY_NAME=Germany, CAR_MODEL_NAME=Polo
That says model "Fox" is called "Fox" in the USA, but the same car model is called "Polo" in Germany (I don't remember if that's actually true).
NF2 is violated, because the country name does not depend on both car model ID and country ID, but only on the country ID.
Solution: To fulfill NF2, move COUNTRY_NAME into a separate table "COUNTRY" with columns COUNTRY_ID (primary key) and COUNTRY_NAME. To get a result set including the country name, you'll need to connect the two tables using a JOIN.
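Here is a minimal sketch of that decomposition and the JOIN, run on SQLite through Python's sqlite3 module. The table name CAR_MODEL_NAMES and the column types are assumptions; the data rows are the two from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COUNTRY (
        COUNTRY_ID INTEGER PRIMARY KEY,
        COUNTRY_NAME TEXT
    );
    -- CAR_MODEL_NAME stays here: it depends on the whole key (country and model).
    CREATE TABLE CAR_MODEL_NAMES (
        COUNTRY_ID INTEGER REFERENCES COUNTRY(COUNTRY_ID),
        CAR_MODEL_ID INTEGER,
        CAR_MODEL_NAME TEXT,
        PRIMARY KEY (COUNTRY_ID, CAR_MODEL_ID)
    );
    INSERT INTO COUNTRY VALUES (1, 'USA'), (2, 'Germany');
    INSERT INTO CAR_MODEL_NAMES VALUES (1, 5, 'Fox'), (2, 5, 'Polo');
""")

# The JOIN brings the country name back into the result set.
rows = conn.execute("""
    SELECT c.COUNTRY_NAME, m.CAR_MODEL_ID, m.CAR_MODEL_NAME
    FROM CAR_MODEL_NAMES AS m
    JOIN COUNTRY AS c ON c.COUNTRY_ID = m.COUNTRY_ID
    ORDER BY m.COUNTRY_ID
""").fetchall()
```

The country name is now stored once per country instead of once per car model name.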
NF3
Say you've got a table with these columns, expressing climatic conditions of states:
STATE_ID (varchar, primary key)
CLIME_ID (foreign key, ID of a climate zone like "desert", "rainforest", etc.)
IS_MOSTLY_DRY (bool)
NF3 is violated, because IS_MOSTLY_DRY only depends on the CLIME_ID (let's at least assume that), but not on the STATE_ID (primary key).
Solution: to fulfill NF3, put the column IS_MOSTLY_DRY into the climate zone table.
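Sketched the same way on SQLite via Python's sqlite3 module, the NF3 fix could look like this. The table names CLIMATE_ZONE and STATE, the column types, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CLIMATE_ZONE (
        CLIME_ID TEXT PRIMARY KEY,
        IS_MOSTLY_DRY INTEGER          -- 1 = true, 0 = false
    );
    CREATE TABLE STATE (
        STATE_ID TEXT PRIMARY KEY,
        CLIME_ID TEXT REFERENCES CLIMATE_ZONE(CLIME_ID)
    );
    INSERT INTO CLIMATE_ZONE VALUES ('desert', 1), ('rainforest', 0);
    INSERT INTO STATE VALUES ('AZ', 'desert'), ('NV', 'desert');
""")

# The flag is looked up through the climate zone instead of being stored per state.
dry_flags = conn.execute("""
    SELECT s.STATE_ID, z.IS_MOSTLY_DRY
    FROM STATE AS s
    JOIN CLIMATE_ZONE AS z ON z.CLIME_ID = s.CLIME_ID
    ORDER BY s.STATE_ID
""").fetchall()
```

Both sample states get their flag from the single 'desert' row, so the fact cannot be recorded inconsistently.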
Here are some thoughts regarding the actual table given in the exercise:
I apply the above-mentioned NF rules without challenging the primary key columns. But they actually don't make sense, as we will see later.
NF1 isn't violated, each cell holds just one value.
NF2 is violated by EMP_NM and all the phone numbers, because all of these columns don't depend on the full primary key. They all depend on EMP_ID (PK), but not on DEPT_CD (PK). I assume that phone numbers stay the same when an employee moves to another department.
NF2 is also violated by DEPT_NM, because DEPT_NM does not depend on the full primary key. It depends on DEPT_CD, but not on EMP_ID.
NF2 is also violated by all the skill columns, because they are not department- but only employee-specific.
NF3 is violated by SKILL_NM, because the skill name only depends on the skill code, which is not even part of the composite primary key.
SKILL_YRS violates NF3, because it depends on a primary key member (EMP_ID) and a non-primary key member (SKILL_CD). So it is partly dependent on a non-primary-key attribute.
So if you remove all columns which violate NF2 or NF3, only the primary key remains (EMP_ID and DEPT_CD). That remaining part violates the given business rules: this structure would allow an employee to work in multiple departments at the same time.
Let's review it from a distance. Your data model is about employees, departments, skills and the relationships between these entities. If you normalize that, you'll end up with one table for the employees (containing DEPT_CD as a foreign key), one for the departments, one for the skills, and another one for the relationship between employees and skills, holding the "skill years" for each tuple of EMP_ID and SKILL_CD (my teacher would have called the latter an "associative entity").
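The four tables described above could be sketched as follows, using SQLite through Python's sqlite3 module. The table names and column types are assumptions, and the phone number columns are left out because their names are not given here; the remaining column names come from the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE department (DEPT_CD TEXT PRIMARY KEY, DEPT_NM TEXT);
    CREATE TABLE employee (
        EMP_ID INTEGER PRIMARY KEY,
        EMP_NM TEXT,
        -- a single column here means one department per employee
        DEPT_CD TEXT NOT NULL REFERENCES department(DEPT_CD)
    );
    CREATE TABLE skill (SKILL_CD TEXT PRIMARY KEY, SKILL_NM TEXT);
    -- the "associative entity" between employees and skills
    CREATE TABLE employee_skill (
        EMP_ID INTEGER REFERENCES employee(EMP_ID),
        SKILL_CD TEXT REFERENCES skill(SKILL_CD),
        SKILL_YRS INTEGER,
        PRIMARY KEY (EMP_ID, SKILL_CD)
    );
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
```

Because DEPT_CD is an ordinary column of employee rather than part of its key, an employee can no longer be assigned to two departments at once.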
Looking at the first two rows in your table, and looking at which columns are tagged "PK" in that table, and assuming that "PK" stands for "Primary Key", and looking at the values that appear for those two columns in those two rows, I would recommend that your professor get the hell out of database teaching and not come back until he has educated himself properly on the subject.
This exercise cannot be taken seriously because the problem statement itself contains hopelessly contradictory information.
(Observe that as a consequence, there simply is not any such thing as a "good" or "right" answer to this question !!!)
Another oversimplified answer coming up.
In a 3NF relational table, every nonkey value is determined by the key, the whole key, and nothing but the key (so help me Codd ;)).
1NF: The key. This means that if you specify the key value and a named column, there will be at most one value at the intersection of the row and the column. A multivalue, like a series of values separated by commas, is disallowed, because you can't get directly to the value with just a key and a column name.
2NF: The whole key. If a column that is not part of the key is determined by a proper subset of the key columns, then 2NF is being violated.
3NF: And nothing but the key. If a column is determined by some set of non key columns, then 3NF is being violated.
A table is in 3NF only if it is in 2NF, has no transitive dependencies, and all of its non-key attributes depend on the primary key.
Transitive dependency:
R = (A, B, C).
If A -> B and B -> C, then A -> C.
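The transitive rule can be checked mechanically with the standard attribute closure computation. The sketch below assumes single-letter attribute names, so a string like "A" or "AB" can stand for a set of attributes; the function name closure is my own.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("A", "B"), ("B", "C")]  # A -> B and B -> C
a_closure = closure("A", fds)
b_closure = closure("B", fds)
```

Here a_closure contains C, which is the transitive dependency A -> C, while b_closure does not contain A.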

Database relation in 3NF?

I have the following relations. A company has several employees. Each employee is identified by an employee number ENr and lives at an address EAddress with a zip code ZZipCode. The city for each zip code is in its own table, because otherwise there would be redundancy in the Employee table. Therefore ZZipCode is a foreign key in Employee.
A group is identified by its GGroupId, which is therefore the primary key. Each group has one group leader, who can be any employee. Therefore ENr is a foreign key.
Each employee can work in zero, one or more groups. For this reason the table GroupMember exists, where the tuple of ENr and GGroupID forms the primary key and both are also foreign keys (I cannot do both, bold and italic).
And last, a product is identified by its product ID PId and is associated with a group GGroupID.
Well here are the relations for that written description.
Employe(ENr, EName, EGender, EAddress, ZZipCode, ESocNr, ESalery)
Group(GGroupId, GName, GCostNr, ENr)
GroupMember(ENr, GGroupID) #both members are foreign keys too!
Product(PId, PName, PPrice, GGRoupId)
Zip(ZZipCode, ZCityName, SStateID)
State(SStateID, SStateName)
For clarification: bold members are primary keys and italic members are foreign keys.
I tried to put these relations into 3NF. Can anyone confirm that this is right?
This looks good and normalised. I don't see any way to divide the tables further.

Database Design for student tracking system

EDIT: my revised Entity relationship diagram
A student can have many contact times, but this does not relate to which course they are on, so the courseID in tblContact was unnecessary. So I used two primary keys in tblStudent, relating to the grade for a particular tutor-marked assignment (TMA) and the course a particular student is on with that TMA.
Phew
http://i.imgur.com/cf3td.png
/Edit
My Old ERD
Note that StudID and CourseID form a compound primary key.
My question: should I have StudID and CourseID in tblContact, or should I just have StudID? Because I'm using a compound primary key, I thought I should have both values in tblContact and tblStudentTMA.
Is this right?
The answer depends on whether the contact is related to a course or not.
If it relates to a course then you need some way of identifying the Course from the contact, but you can link to the tblCourse table from tblContact.
My preference for a many-to-many table is to use a separate primary key (in your example, StudentCourseID) that is an identity column. This removes the need to store multiple foreign keys in a related table.
The primary key in tblContact has to have at least two columns. One of them has to be StudID.
It has to have at least two columns, because you need to store more than one contact per student. One of the columns has to be StudID to guarantee the contact row refers to an actual student. The second column probably needs to be DateOfContact.
A primary key {StudID, DateOfContact} allows one contact per student per day. If you use {StudID, TimeOfContact} instead--use a timestamp instead of a date--you can have more than one contact per student per day.
In addition to that, if every row in tblContact must refer to both a student and to one of that student's courses, then you should probably include CourseID in the primary key. You also need a foreign key reference from tblContact (StudID, CourseID) to tblStudentCourse (StudID, CourseID).
If it's not necessary for every row in tblContact to refer to a course, then tblContact.CourseID should be nullable, and it shouldn't be part of the primary key. But you should still have a foreign key reference from tblContact (StudID, CourseID) to tblStudentCourse (StudID, CourseID).
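Here is a rough sketch of the nullable-CourseID variant, run on SQLite through Python's sqlite3 module. The column types, the sample rows and the helper try_contact are assumptions, and the timestamp is stored as text for simplicity. A separate reference from tblContact.StudID to tblStudent is included because a composite foreign key is not checked when CourseID is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE tblStudent (StudID INTEGER PRIMARY KEY);
    CREATE TABLE tblStudentCourse (
        StudID INTEGER REFERENCES tblStudent (StudID),
        CourseID INTEGER,
        PRIMARY KEY (StudID, CourseID)
    );
    CREATE TABLE tblContact (
        StudID INTEGER NOT NULL REFERENCES tblStudent (StudID),
        TimeOfContact TEXT NOT NULL,   -- timestamp, so several contacts per day are possible
        CourseID INTEGER,              -- nullable: a contact need not concern a course
        PRIMARY KEY (StudID, TimeOfContact),
        FOREIGN KEY (StudID, CourseID) REFERENCES tblStudentCourse (StudID, CourseID)
    );
    INSERT INTO tblStudent VALUES (1);
    INSERT INTO tblStudentCourse VALUES (1, 10);
""")

def try_contact(stud_id, when, course_id):
    """Insert one contact row and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO tblContact VALUES (?, ?, ?)", (stud_id, when, course_id))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_contact(1, "2024-01-05 09:00", None),  # no course: allowed
    try_contact(1, "2024-01-05 14:00", 10),    # second contact that day, enrolled course: allowed
    try_contact(1, "2024-01-06 09:00", 99),    # student 1 is not on course 99: foreign key fails
    try_contact(1, "2024-01-05 09:00", 10),    # same student and time again: primary key fails
]
```

The first two inserts go through and the last two are rejected, which matches the rules described above.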

Database column naming for foreign key

Should I signal the foreign key in a database column name?
FKOrder vs. FK_Order vs. Order
The short answer is no: don't put "FK" in the names of foreign key columns. You can still signal the intent of the column, though. Here's how I do it:
Naming foreign key columns
It depends on your naming convention for the target of the FK. If you have Id, then I'd prepend the table name when creating FK columns.
Example 1:
For table User with PK Id and table Workitem with user ID FK, I'd call the column Workitem.UserId.
If there were more than one FK between the same tables, I'd make this clear in the column name:
Example 2:
For table User with PK Id and table Workitem with "assigned to user ID" and "created by user ID" FKs, I'd call the columns Workitem.CreatedByUserId and Workitem.AssignedToUserId.
If your naming convention for PKs is more like UserId, then you'd factor that into the above examples so as not to end up with UserUserId.
Naming foreign key constraints
This is mine:
FK_childtablename_[differentiator]parenttablename
The differentiator is used when there is more than one FK between the same two tables (e.g. CreatedByUserId and AssignedToUserId). Often I use the child table's column name for this.
Example 1:
Given tables: Workitem and User
Where Workitem has CreatedByUserId and AssignedToUserId
Foreign key names are FK_Workitem_CreatedByUser and FK_Workitem_AssignedToUser
I use double-underscores if tables/columns have underscores in the name:
Example 2:
Given tables: work_item and user
Where work_item has created_by_user_id and assigned_to_user_id
Foreign key names are FK_work_item__created_by_user and FK_work_item__assigned_to_user
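If you generate constraint names in migration scripts, the pattern is easy to automate. This is only a sketch of the convention as I read it: the function name fk_name is made up, and the rule "use a double underscore whenever either table name contains an underscore" is my interpretation of the two examples.

```python
def fk_name(child, parent, differentiator=""):
    """Build a name of the form FK_childtablename_[differentiator]parenttablename."""
    # Use a double underscore when a table name itself contains underscores,
    # so the boundary between child and parent parts stays visible.
    sep = "__" if "_" in child or "_" in parent else "_"
    return f"FK_{child}{sep}{differentiator}{parent}"

pascal = fk_name("Workitem", "User", "AssignedTo")
snake = fk_name("work_item", "user", "created_by_")
plain = fk_name("Workitem", "User")  # only one FK between the tables, no differentiator
```

The first two calls give FK_Workitem_AssignedToUser and FK_work_item__created_by_user, and the last gives FK_Workitem_User.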
It is usual to name foreign key fields with an ID prefix (IDORDER, IDPERSON, ...). If you have a table called PERSONS and another called CITIES, and each person is in a certain city, then CITIES has an IDCITY field (K), and PERSONS has an IDPERSON field (K) and another field IDCITY (FK).
Hope this answers your question. I mean, a key is only foreign when it appears in a table other than its own. But it's good practice to always give the same field the same name, even when it appears in another table as a foreign key.
You shouldn't.
If a column becomes a foreign key later, you will have to change the column name, breaking all the scripts that are using it.
If there are multiple foreign keys, you don't know which column belongs to which key, so the only information you gain is that the column is a foreign key, but you already know it by looking at the keys.
Usually I name the foreign key column the same as the primary key, so I know immediately where the key maps.
I normally use the same name as the referenced column in the table holding the FK.
Only if this is potentially confusing (or a column with this name already exists, say id) would I be more explicit. In such a case, I add the entity type name before the rest, say ProductId.
My style is slightly different:
fk_table_column
e.g. fk_user_id is a foreign key to the User table on the id column. I do not use any capital letters.
