Identifying normalisation form - database

Here's the relational schema I have:
OfficeCustomer(office, customer, employee)
office, customer -> employee
employee -> office
This is my current analysis:
The minimal superkey is (office, customer), and the non-key attribute employee is fully functionally dependent on (office, customer). So this should be at least 2NF.
Is (office, customer) -> employee -> office considered transitive?
I need help confirming the highest normal form for the above relational table. Thanks.

There is more than one candidate key.
That's the key to the rest of the answer.
(I'm assuming this is homework, so I won't spoonfeed any more and let you first think it through.)
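For anyone who wants to check the hint mechanically, here is a minimal sketch that enumerates candidate keys by computing attribute closures. The helper names `closure` and `candidate_keys` are made up for illustration; only the schema and the two dependencies come from the question.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """Brute-force search, smallest subsets first, so only minimal keys are kept."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = frozenset(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == schema:
                keys.append(attrs)
    return keys

SCHEMA = frozenset({"office", "customer", "employee"})
FDS = [
    (frozenset({"office", "customer"}), frozenset({"employee"})),
    (frozenset({"employee"}), frozenset({"office"})),
]

KEYS = candidate_keys(SCHEMA, FDS)
for key in KEYS:
    print(sorted(key))
```

The brute-force search is exponential in the number of attributes, which is fine for a three-attribute exercise. Once the keys are known, the remaining step is to check each dependency's left and right side against them.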


Relationships between 3 Entities in a ternary relationship

We are tasked with creating a database that has three entities: team, user, and course. A course will have multiple students and multiple teams in it. A student and a professor can each belong to many courses. However, a user of type student can belong to only one team in a specific course, though they can belong to other teams in different courses. We are currently trying to figure out how to represent this relationship. We are also leaning towards team being a weak entity that depends on course. So far we have two versions of how we believe the entities and relationships should look. Would someone be able to tell us whether we are on the right track? The weak entity is throwing us off. We are also a bit confused about the cardinality of the ternary relationship.
We only put primary keys in the diagram to simplify it.
A user has the following attributes: name, primary key (userID), userType (one of admin, student, teacher), and email.
A course has the following attributes: course name, primary key (course id), start date, and end date.
A team is a weak entity with the following attributes: course id, team number. Primary key (course id, team number).
Thank you to anyone who may be able to help.
IMO, the course table should not have both team and user linked to it. Only team should be linked to it, to specify which course the team is for. My ERD would look something like this:
Team_member is an associative entity used to resolve the many-to-many relationship between team and user, since each user can belong to many teams and each team can have many members. It should have a composite key made up of user_id and team_id to record each member within a team, and team should have a foreign key course_id to specify its course.
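The tables described above could be sketched as follows, using Python's built-in `sqlite3` module. The table and column names (`app_user`, `team_member`, and so on) and the sample rows are assumptions for illustration, not taken from the missing diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE course (course_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE app_user (
    user_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    user_type TEXT NOT NULL
);
CREATE TABLE team (
    team_id   INTEGER PRIMARY KEY,
    course_id INTEGER NOT NULL REFERENCES course(course_id)
);
-- Associative entity: composite key of user_id and team_id.
CREATE TABLE team_member (
    user_id INTEGER NOT NULL REFERENCES app_user(user_id),
    team_id INTEGER NOT NULL REFERENCES team(team_id),
    PRIMARY KEY (user_id, team_id)
);
INSERT INTO course VALUES (1, 'Databases');
INSERT INTO app_user VALUES (1, 'Ann', 'student');
INSERT INTO team VALUES (1, 1);
INSERT INTO team VALUES (2, 1);
""")

def try_insert(user_id, team_id):
    """Return True if the membership row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO team_member VALUES (?, ?)", (user_id, team_id))
        return True
    except sqlite3.IntegrityError:
        return False

first = try_insert(1, 1)                   # new membership
duplicate = try_insert(1, 1)               # same pair again
same_course_other_team = try_insert(1, 2)  # second team in the same course
print(first, duplicate, same_course_other_team)
```

Note that the composite key alone blocks duplicate memberships but still accepts the same student in two teams of one course. If the "one team per course" rule from the question must be enforced in the schema, one possible approach (an assumption, not part of this answer) is to carry course_id into team_member and add a unique constraint on (user_id, course_id).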

3NF Normalization and Decomposition

I am currently in a DB class and working through Normalization, and am running into some trouble. Am hoping I can get some assistance working through this. I have searched for the last 30 min and haven't found anything that helps solve my question, but hopefully I'm not searching for the wrong things.
The question is as follows:
Considering the universal Relation
EMPLOYEE (ID, First, Last, Team, Dept, Salary)
With the following set F of functional dependencies
ID -> First
ID -> Last
First, Last -> ID
Last -> Team
ID -> Dept
ID -> Salary
Salary -> Dept
Identify the candidate keys and construct a decomposition of EMPLOYEE into relations in 3NF that preserves dependencies.
For the candidate keys, I am struggling because when doing an edge diagram, there are incoming dependencies for every single attribute. There are no attributes that do not appear on the RHS of the dependencies. What I think may be confusing me is that while ID does determine everything, First, Last determines ID. So would ID and First, Last both be a candidate key?
I know for the decomposition, Last -> Team and Salary -> Dept are transitive, but ID has the direct dependencies ID -> Dept and ID -> Salary already given.
Does that mean I only need two tables,
(ID, First, Last, Salary)
and
(Last, Team)?
Or based on the candidate keys question above, do I need
(ID, First, Last)
(ID, Salary, Dept)
(Last, Team)
Let me know if any additional info is needed. Thank you.
So would ID and First, Last both be a candidate key?
ID is a candidate key and Last, First is probably a composite index. It's too common for people to have the same name.
The third normal form can be summed up in one sentence. "The columns in the table depend on the key, the whole key, and nothing but the key, so help me Codd."
So, let's take a look at your original description.
EMPLOYEE (ID, First, Last, Team, Dept, Salary)
First, Last, and Salary would be based on the employee id. One of your dependencies (Salary -> Dept) implies that everyone with the same salary is in the same department. I don't agree, but whatever.
An employee is on one team, and one team can have one or more employees. This is a one to many relationship, which implies a foreign key to a Team table from the Employee table.
The same holds for the employee / department relationship. Another foreign key to a Department table from the Employee table.
There doesn't seem to be any relationship between the Team table and the Department table.
Salary is a weird field. I'd say it belongs in the Employee table, but the Salary -> Dept relationship is confusing me.
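Going purely by the dependencies as given in the question, both points of confusion (the candidate keys and the Salary -> Dept dependency) can be checked with an attribute-closure computation. This is a minimal sketch; the tuple representation of the dependencies and the helper names `closure` and `is_redundant` are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is (lhs_tuple, single_rhs_attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

ALL = {"ID", "First", "Last", "Team", "Dept", "Salary"}
FDS = [
    (("ID",), "First"),
    (("ID",), "Last"),
    (("First", "Last"), "ID"),
    (("Last",), "Team"),
    (("ID",), "Dept"),
    (("ID",), "Salary"),
    (("Salary",), "Dept"),
]

# A set of attributes is a superkey when its closure is the whole relation.
id_is_key = closure({"ID"}, FDS) == ALL
name_is_key = closure({"First", "Last"}, FDS) == ALL

def is_redundant(fd, fds):
    """An FD is redundant if the remaining FDs already imply it."""
    lhs, rhs = fd
    rest = [other for other in fds if other != fd]
    return rhs in closure(lhs, rest)

redundant = [fd for fd in FDS if is_redundant(fd, FDS)]
print(id_is_key, name_is_key, redundant)
```

Under these dependencies, ID -> Dept is the only one implied by the others (via ID -> Salary and Salary -> Dept). With it dropped, the standard synthesis approach of grouping the remaining dependencies by left-hand side would suggest (ID, First, Last, Salary), (Last, Team) and (Salary, Dept). Testing each dependency against the full set, as done here, is a shortcut that works because only one is redundant; a general minimal-cover routine would remove them one at a time.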

Relational Database: When do we need to add more entities?

We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement. In this case, if we don't create 2 entities for the above requirement, we need to add two more attributes to each employee, which can lead to anomalies later. Therefore, we need 2 entities as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer. How many entity/ies do you need to model the mentoring program?
The lecturer's answer is 1. With that, we have to add 2 more attributes for each employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), don't we?
My question is: why can't we create a new entity named MENTORING, similar to TEAM in Case 1? And why can we only do that if this is a many-to-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK is pairNo&employeeId) -> 2 entities
Thank you in advance
First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there are some issues with your implementation. TEAM's PK shouldn't be (teamId, employeeId); it should be only teamId. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have at most one leader).
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there are some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal is to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
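As a rough sketch of the MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) table suggested above, here is a version using Python's built-in `sqlite3` module. The `employee` table, the sample rows, and the CHECK constraint that stops an employee from mentoring themselves are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE employee (employeeId INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE mentoring (
    -- One row per pair: the PK on mentee and UNIQUE on mentor make it one-to-one.
    menteeEmployeeId INTEGER PRIMARY KEY REFERENCES employee(employeeId),
    mentorEmployeeId INTEGER NOT NULL UNIQUE REFERENCES employee(employeeId),
    CHECK (menteeEmployeeId <> mentorEmployeeId)  -- assumption, not in the answer
);
INSERT INTO employee VALUES (1, 'Ana');
INSERT INTO employee VALUES (2, 'Ben');
INSERT INTO employee VALUES (3, 'Cy');
""")

def pair(mentee_id, mentor_id):
    """Return True if the pairing is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO mentoring VALUES (?, ?)", (mentee_id, mentor_id))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    pair(2, 1),  # Ben is mentored by Ana
    pair(3, 1),  # Ana already mentors someone
    pair(2, 3),  # Ben already has a mentor
    pair(3, 3),  # self-mentoring
]
print(results)
```

No pairNo or mentorRole column is needed: each pair is identified by the mentee, and the role is implied by which column an employee appears in.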
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values, a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.

Should student be a weak entity in DBMS?

I have the following database for a student portal project I am building. I'm new to databases, but I know the concepts quite well. I want to ask whether, in my diagram, student should be a weak entity, since it depends on the department. If there is no department, then there won't be any students in that department.
Apart from my main question, I am a bit confused about the ATTENDANCE and GRADES tables. Have I related them correctly, and are their attributes sufficient and correct? I know I'm asking a lot, but can you review my diagram and suggest improvements, even if that means redoing it from scratch?
Thanks.
Student doesn't need to be a weak entity set. While weak entity sets imply an existence dependency, existence dependencies don't imply weak entity sets. Total participation is possible for regular entity sets too.
Instead of looking at existence dependencies, look at identification. Weak entity sets can't be identified by their own attributes alone; they depend on a foreign key (usually in combination with a weak key) for identity. When an entity set has an independent identity like Roll ID (surrogate IDs are always independent), it's a regular entity set.
You seem to be confusing entity sets with tables, perhaps due to the mixed notation you're using. If I read your model correctly, Grades is a relationship between Student and Courses since it has a primary key that consists of two foreign keys. However, your diagram only links it to Student via an unnecessary has relationship.
You also have embedded relationships in your tables, e.g. Courses has a Department FK, but you didn't link the two in the diagram. Enrolls requires its own table, but you don't show one unlike for the other many-to-many relationships in your diagram.
Attendance, like Grades, represents a relationship between Student and Courses. You show an association with Department but don't indicate an FK. While in original ER notation we never indicate foreign keys as attributes, in your diagram this is inconsistent with most of the rest of your tables.
Edit:
Here's an example of how to represent Grades as a relationship between Student and Courses. I used original ER notation since I don't have a tool that implements your notation.
The Attendance table should be linked to Course and Student, not to Department as shown.

If I have multiple candidate keys, which one should be the primary key, and how do I justify the choice?

Given:
R = { Account , Analyst , Assets, Broker, Client, Commission, Company, Dividend, Exchange, Investment, Office, Profile, Return, Risk_profile, Stock, Volume}
and a set of functional dependencies:
F = {fd1, fd2, fd3, fd4, fd5, fd6, fd7, fd8, fd9, fd10, fd11}
where:
fd1: Client -> Office
fd2: Stock -> Exchange, Dividend
fd3: Broker -> Profile
fd4: Company -> Stock
fd5: Client -> Risk_profile, Analyst
fd6: Analyst -> Broker
fd7: Stock, Broker -> Investment, Volume
fd8: Stock -> Company
fd9: Investment, Commission -> Return
fd10: Stock, Broker -> Client
fd11: Account -> Assets
These are the candidate keys:
(Account, Commission, Analyst, Company)
(Account, Commission, Analyst, Stock)
(Account, Commission, Broker, Company)
(Account, Commission, Broker, Stock)
(Account, Commission, Client, Company)
(Account, Commission, Client, Stock)
(Q) Select a primary key and justify your choice.
I selected
(Account, Commission, Broker, Stock) as the primary key.
I chose it because it has the most direct dependencies compared to the other ones, e.g. more attributes are directly functionally dependent on it.
Please check whether my answer is correct.
I'm waiting for your answer.
thank you
I would create a dummy unique id to identify the row and link it to other tables as I have had consistent bad experience with compound keys. A single id field just works a lot better.
I suspect they are just evil.
For all the relevant possible keys, I would recommend creating unique indexes, which give the advantages (guaranteed uniqueness and fast retrieval) and none of the disadvantages (do not get me started).
I also suspect that the key fields proposed might change from time to time. You really want your key to be immutable, since it will be used as a reference.
In logical database design, singling out a candidate key to be "primary" at the expense of all the others becoming "secondary" is a completely arbitrary and artificial choice.
That is why Date ditched the notion of "primary key" over 15 years ago. Every key corresponds to a certain uniqueness rule, and no single key is "more unique" than any of the others. Period.
Database systems should never have been built in a way that forces the database designer to make such insignificant choices. The reason they do force such choices is partly historical (there was a time when the primary/secondary distinction was believed to bear some relevance), and partly that the big dogs in DBMS land are quite happy NOT improving their existing cashflow-generating systems to any relational extent.
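The six candidate keys listed in the question can be checked against fd1 to fd11 with a short script. This is only a sketch: the helper names `closure` and `is_candidate_key` are made up for illustration, and the attribute spellings follow R (Investment, Assets).

```python
from itertools import combinations

R = {
    "Account", "Analyst", "Assets", "Broker", "Client", "Commission",
    "Company", "Dividend", "Exchange", "Investment", "Office", "Profile",
    "Return", "Risk_profile", "Stock", "Volume",
}

FDS = [
    ({"Client"}, {"Office"}),                           # fd1
    ({"Stock"}, {"Exchange", "Dividend"}),              # fd2
    ({"Broker"}, {"Profile"}),                          # fd3
    ({"Company"}, {"Stock"}),                           # fd4
    ({"Client"}, {"Risk_profile", "Analyst"}),          # fd5
    ({"Analyst"}, {"Broker"}),                          # fd6
    ({"Stock", "Broker"}, {"Investment", "Volume"}),    # fd7
    ({"Stock"}, {"Company"}),                           # fd8
    ({"Investment", "Commission"}, {"Return"}),         # fd9
    ({"Stock", "Broker"}, {"Client"}),                  # fd10
    ({"Account"}, {"Assets"}),                          # fd11
]

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_candidate_key(key, schema, fds):
    """A candidate key determines every attribute and has no smaller superkey inside it."""
    if closure(key, fds) != schema:
        return False
    smaller = combinations(sorted(key), len(key) - 1)
    return all(closure(set(subset), fds) != schema for subset in smaller)

LISTED_KEYS = [
    {"Account", "Commission", "Analyst", "Company"},
    {"Account", "Commission", "Analyst", "Stock"},
    {"Account", "Commission", "Broker", "Company"},
    {"Account", "Commission", "Broker", "Stock"},
    {"Account", "Commission", "Client", "Company"},
    {"Account", "Commission", "Client", "Stock"},
]

results = [is_candidate_key(key, R, FDS) for key in LISTED_KEYS]
print(results)
```

Because every candidate key determines every attribute of R by definition, counting how many attributes depend on a key cannot separate one candidate from another, which fits the point above that the choice is not a logical one.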
