Relational Database: When do we need to add more entities? - database

We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement? In this cases, if we don't create 2 entities for above requirement, we need to add two more attributes for each employee which can lead to anomaly issues later. Therefore, we need to have 2 entities as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer." How many entity/ies do you need to model the mentoring program?
The Answer from Lecturer is 1. With that, we have to add 2 more attributes for each Employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), doesn't it?
My question is why can't we create a new Entity named MENTORING which will be similar to TEAM in Q1? And why we can only do that if this is a many-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK is pairNo&employeeId) -> 2 entities
Thank you in advance

First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there's some issues with your implementation. TEAM's PK shouldn't be teamId, employeeId, it should be only teamId. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have only one leader at most).
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there's some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal was to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values, a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.

Related

What is the right way to use Associative Entity?

This is the description:
Draw an Entity-Relationship diagram for Poke-Hospital which provides
medical service to pokemon.
Each pokemon has an appointment with one of the nurse Joys. In
addition to recording the name, type and trainer of each pokemon, the
system needs to keep track of the multiple types of sickness being
diagnosed for the pokemon. During an appointment, the nurse will
always prescribe medicine. It is required to record the date, time and
dosage of the medicine. A pokemon may need to take more than one
medicine at a time. Each medicine is stored with its name, brand and
cost of purchase. There is no restriction on the amount of medicine to
be prescribed by any nurse.
Within an appointment, a pokemon may need to undergo procedures such
as a surgery and/or diagnosis. Each procedure requires different type
of rooms and a list of equipment. The date, time and the actual room
of the procedure need to be recorded.
A procedure may be performed by more than one nurse. A nurse is
involved in the procedure based on the training skills that she has
completed. Not all nurses are qualified to perform procedures.
Name, pager number as well as office number for each nurse most be
known. Your diagram should show the entities, relationships and their
attributes, and the cardinality of any relationships. Mark the best
primary key for each entity by underlining it.
This is my solution:
Here are my questions:
Should I use Have Appointment as associative entity?
Should I remove 2 relationships Undergo and Prescribe and connect 2
entities Procedure and Appointment Medicine directly to Have
Appointment associative entity? Will the ERD still right then?
If it's wrong, what about the same as question 2 and I turn the Have
Appointment associative entity into a relationship?
I feel really confused about the difference between using associative entity with a relationship (like in this post Enrollment with Teach and Teacher: When to use Associative entities?) and using ternary relationship (connect Teacher directly to Enrollment relationship instead of changing Enrollment to an associative entity and have the Teach relationship).
Should I use Have Appointment as associative entity?
No, I believe it should be a regular entity set. You gave it its own identity - the ID primary key - which I agree with, but that should've corresponded with a change in element type. Associative entity sets (AES) are relationships first, which means they're identified by the (keys of the) entity sets that they relate.
This is a topic that's widely confused, since AES in the entity-relationship model are different than in the network data model. The latter is intuitively more familiar to developers, since it's essentially a model based on records and pointers, but since it only supports directed binary relationships, anything more complicated - many-to-many relationships as well as ternary and higher relationships - need to be represented as AES. In this model, AES are identified by a surrogate ID, since composite keys generally aren't supported either.
The entity-relationship model supports n-ary relationships and composite keys, and so doesn't need AES nearly as frequently. One situation that can't be represented by regular entity sets and n-ary relationships is when a relationship needs to be the subject of a further relationship.
For example, let's look at the relationship between Procedure and Nurse to represent the nurses involved in a procedure.
I prefer the look-across convention for cardinality indicators - a nurse can perform 0 or more procedures, while a procedure requires 1 or more nurses. Anyway, the relationship Perform here is identified by the composite primary key (ProcedureID, NurseID).
Now, if we wanted to track the equipment used by each nurse in the performance of the procedure, we might think a simple ternary relationship would do the trick:
but that relationship would be identified by (ProcedureID, NurseID, EquipmentID), preventing us from recording nurses that assisted in the procedure without using any equipment. What we need is two separate relationships:
(ProcedureID, NurseID)
((ProcedureID, NurseID), EquipmentID)
with an FK constraint from the second to the first to prevent nurses not assisting in the procedure from handling the equipment.
Back to Have Appointment - it's not a relationship between pokemon and nurses (a pokemon can see the same nurse multiple times), it's an event that involves pokemon, nurses, procedures and medicine. It's best handled as a regular entity set with relationships to the other four. As for identity, I imagine a pokemon or nurse can only have one appointment at a time, so we could choose (PokemonID, DateTime) or (NurseID, DateTime) as a natural key. However, in practice we usually identify events by a surrogate ID since events span an interval which most DBMSs can't handle effectively as a primary key.
Should I remove 2 relationships Undergo and Prescribe and connect 2 entities Procedure and Appointment Medicine directly to Have Appointment associative entity? Will the ERD still right then?
No, I think you should add relationships between Pokemon and Have Appointment, and between Nurse and Have Appointment, after converting the AES to a regular entity set.
If it's wrong, what about the same as question 2 and I turn the Have Appointment associative entity into a relationship?
Answered above.

foreign key or null value to link two long distanced tables

Consider the following database example :
A clinichas many articles
The relationship between supplier and article is many to many ( (1,n) - (1,n))
Let's say I have the clinic's id and I want to retrieve all it's suppliers, what's the best way to do it? is it by creating a "null" article for each supplier in article_supplier OR by creating a foreign key in supplier that references the appropriate clinic?
The latter solution may seem the simplest and easiest but what happens when there is a big chain of tables? Do I keep adding a foreign key each time I need a list of something? e.g :
list of medicines a clinic uses
list of prescriptions a doctor gave
...
If it makes a difference, I am using Laravel's Eloquent ORM
A table represents a relationship among values. (Some values identify entities.) A database can't be used until we are told what each table--base or query result--means: what business/application relationship its rows satisfy. (Ie, what is its (characteristic) predicate.) What are the relationships for your tables?
To query we express a relationship in terms of base relationships then express its table in terms of the corresponding base tables. Eg a join returns rows satisfying one relationship and another. So again, we need to know tables' relationships in business/application terms.
Cardinalities & constraints are properties of relationships given what situations can arise. They aren't needed to update or query. They can guide design & are used for integrity.
When, given a clinic, you talk about "its" suppliers, you do not say what you mean. "Has", "its", "for", "references", "appropriate"--all mean nothing--they refer to related entities, but they don't say how they are related in terms of the business/application. This design contains no explicit relationship on clinics & articles. If it did, you've said nothing by which we could put the right rows in or see the rows & know about the situation. Still, you could then derive clinic-supplier rows where the supplier is "for" some article a given clinic "has". But is that the relationship you mean?? Eg if you want pairs where the clinic is allowed to "have" the supplier "for" some articles, that's a new relationship/table that cannot be derived from what you have.
creating a "null" article for each supplier in article_supplier
That relationship/table is a certain combination of article_supplier & the relationship/table just described. But it is simpler to just have those two.
creating a foreign key in supplier that references the appropriate clinic
That would mean that if a supplier "has" more than one clinic then there can be rows in the new version that differ only by all the other columns; normalization theory says that's worse than the original design and a clinic_supplier relationship/table. And it means that if a supplier "has" no clinic you would need something like a nullable FK (foreign key).
So you likely want a clinic_supplier table. But you should post a new question in which you actually say what business/application relationships you are talking about.
Your question & the following are essentially duplicates in that the basic principles/notions to obviously apply to answer them are all the same:
How do I find relations between tables that are long-distance related?
Required to join 2 tables with their FKs in a 3rd table
Best Solution - Ternary or Binary Relationship
You need to read an information modeling & database design textbook.
In your design :
Supplier is a table.
Articles is a table.
Since Supplier supplies articles, their relationship is expressed in terms of a table that links them. So, there is a supplier_article table.
Clinic is a table.
You mention Clinic uses articles.
By the same principle as above, the relationship between Clinic and Article should be expressed in the form of a table. This should be clinic_article table.
Though you haven't mentioned, you must look at other dependencies that arise. For example, a clinic performs procedures and procedures use articles.
by creating a foreign key in supplier that references the appropriate
clinic
This would assume that one supplier can supply only to one clinic forever. Even if this is true today, it won't be true tomorrow, because you might want to plan for multiple clinics in the same location or using the same application or database.

how to implement one to one relationship if the relation has attributes?

I have two tables in my database, Department and Academic_staff. the primary key in the Department table is depId and the primary key in the Academic_staff table is aNo.
Each department is managed by only one member of academic staff, so the relation between the two tables is one-to-one.
I need to record the date when someone of the academic staff starts managing a department, so the relation must have it's own attribute(mStartDate).
How can I implement this new attribute?
At first I was thinking to create a new table with three attributes (depId, aNo, mStartDate) and make two relations between the new table and the other two tables, but I then realized that it's not many-to-many relationship.
So how can I add the attribute mStartDate to the one-to-one relation between the two tables?
There's more than one relation between the two tables, and some of those relations are one-to-many (the department employs more than one academic staff), so I can't merge the two tables.
Your proposed new table (which I shall refer to as DepartmentManagement) could in principle record the history of managers for each department, in which case it would be a many-to-many (temporal) relationship between Department and Academic.
However, if you want to record only the current manager, it's reasonable to "absorb" DepartmentManager into the Department table, giving two columns there (Manager_aNo and Manager_StartDate). Conceptually the object "DepartmentManagement" still exists, but it's absorbed, it doesn't have its own table.
You could also absorb it in the other direction (into Academic) but that wouldn't allow an Academic ever to manage more than one department. You might not need that now, but in principle it's more likely than having a Department with two managers.
Departments
depId
fk_aNo (Unique )
*****- Primary key (depId, fk_aNo)*****
academicStuffs
aNo (PK)
newTable
depId --- Important "Both depId,aNo are from Departments, be sure"
aNo ---
mStartDate
- Primary key (depId, aNo)
Constraints--> a academicStuff can't start to manage a department more
than one time. If this structure proper for you and you want to enable
a aacademic stuff manage a department more than one time inform me.

Entity Relationship Diagram. How does the IS A relationship translate into tables?

I was simply wondering, how an ISA relationship in an ER diagram would translate into tables in a database.
Would there be 3 tables? One for person, one for student, and one for Teacher?
Or would there be 2 tables? One for student, and one for teacher, with each entity having the attributes of person + their own?
Or would there be one table with all 4 attributes and some of the squares in the table being null depending on whether it was a student or teacher in the row?
NOTE: I forgot to add this, but there is full coverage for the ISA relationship, so a person must be either a studen or a teacher.
Assuming the relationship is mandatory (as you said, a person has to be a student or a teacher) and disjoint (a person is either a student or a teacher, but not both), the best solution is with 2 tables, one for students and one for teachers.
If the participation is instead optional (which is not your case, but let's put it for completeness), then the 3 tables option is the way to go, with a Person(PersonID, Name) table and then the two other tables which will reference the Person table, e.g.
Student(PersonID, GPA), with PersonID being PK and FK referencing Person(PersonID).
The 1 table option is probably not the best way here, and it will produce several records with null values (if a person is a student, the teacher-only attributes will be null and vice-versa).
If the disjointness is different, then it's a different story.
there are 4 options you can use to map this into an ER,
option 1
Person(SIN,Name)
Student(SIN,GPA)
Teacher(SIN,Salary)
option 2 Since this is a covering relationship, option 2 is not a good match.
Student(SIN,Name,GPA)
Teacher(SIN,Name,Salary)
option 3
Person(SIN,Name,GPA,Salary,Person_Type)
person type can be student/teacher
option 4
Person(SIN,Name,GPA,Salary,Student,Teacher) Student and Teacher are bool type fields, it can be yes or no,a good option for overlapping
Since the sub classes don't have much attributes, option 3 and option 4 are better to map this into an ER
This answer could have been a comment but I am putting it up here for the visibility.
I would like to address a few things that the chosen answer failed to address - and maybe elaborate a little on the consequences of the "two table" design.
The design of your database depends on the scope of your application and the type of relations and queries you want to perform. For example, if you have two types of users (student and teacher) and you have a lot of general relations that all users can part take, regardless of their type, then the two table design may end up with a lot of "duplicated" relations (like users can subscribe to different newsletters, instead of having one M2M relationship table between "users" and newsletters, you'll need two separate tables to represent that relation). This issue worsens if you have three different types of users instead of two, or if you have an extra layer of IsA in your hierarchy (part-time vs full-time students).
Another issue to consider - the types of constraints you want to implement. If your users have emails and you want to maintain a user-wide unique constraint on emails, then the implementation is trickier for a two-table design - you'll need to add an extra table for every unique constraint.
Another issue to consider is just duplications, generally. If you want to add a new common field to users, you'll need to do it multiple times. If you have unique constraints on that common field, you'll need a new table for that unique constraint too.
All of this is not to say that the two table design isn't the right solution. Depending on the type of relations, queries and features you are building, you may want to pick one design over the other, like is the case for most design decisions.
It depends entirely on the nature of the relationships.
IF the relationship between a Person and a Student is 1 to N (one to many), then the correct way would be to create a foreign key relationship, where the Student has a foreign key back to the Person's ID Primary Key Column. Same thing for the Person to Teacher relationship.
However, if the relationship is M to N (many to many), then you would want to create a separate table containing those relationships.
Assuming your ERD uses 1 to N relationships, your table structure ought to look something like this:
CREATE TABLE Person
(
sin bigint,
name text,
PRIMARY KEY (sin)
);
CREATE TABLE Student
(
GPA float,
fk_sin bigint,
FOREIGN KEY (fk_sin) REFERENCES Person(sin)
);
and follow the same example for the Teacher table. This approach will get you to 3rd Normal Form most of the time.

Creating the database from the Entity Framework 0..1 to 1 is NOT NULL

I just started with the Entity Framework and started to design the model first. So in my model there is a Person who can have a PrivateTelephone, so I created an 0..1 to 1 association. As the picture below shows.
So far so good. But when I generate the database the [PrivateTelephone] is set to NOT NULL. Why can't this be just NULL?
It is becauser your relations are defined in reverse order. You should have 1 on Person and 0..1 on Telecom to specify that Person is the principal which can have one or zero phones. In your mapping you say that Telecom is principal which can have one or zero persons but person must have Telecom. It will also lead to reverse problem due to your incorrect mapping. You have six one-to-one relations to Telecom but if you reverse them as you demand you will say that all six relations (all six FKs in Telecom) will be NOT NULL = each record will have to participate in all six relations.
One-to-one relation is very special and should be used rarely. You should instead have one-to-many relation from Person to Telecom with a new column in Telecom specifying the type.
When using one-to-one relation you must have FK in dependent table configured with unique index. EF doesn't support unique indices so when you model one-to-one relation in model first it is still one-to-many in database and if database is used by another application it can break your application.
Also avoid unnecessary inheritance. Do you need Person as separate entity? = is there any instance which is only person and not employee? Are there more derived types from person? If not you don't need person and if yes it still doesn't mean that base person is a good idea. The same is true with employee. Inheritance has its own rules in EF and when using model first it will by default creates TPT inheritance = the worst one because it results in very complex and slow database queries.

Resources