What is the right way to use Associative Entity? - database

This is the description:
Draw an Entity-Relationship diagram for Poke-Hospital which provides
medical service to pokemon.
Each pokemon has an appointment with one of the nurse Joys. In
addition to recording the name, type and trainer of each pokemon, the
system needs to keep track of the multiple types of sickness being
diagnosed for the pokemon. During an appointment, the nurse will
always prescribe medicine. It is required to record the date, time and
dosage of the medicine. A pokemon may need to take more than one
medicine at a time. Each medicine is stored with its name, brand and
cost of purchase. There is no restriction on the amount of medicine to
be prescribed by any nurse.
Within an appointment, a pokemon may need to undergo procedures such
as a surgery and/or diagnosis. Each procedure requires different type
of rooms and a list of equipment. The date, time and the actual room
of the procedure need to be recorded.
A procedure may be performed by more than one nurse. A nurse is
involved in the procedure based on the training skills that she has
completed. Not all nurses are qualified to perform procedures.
Name, pager number as well as office number for each nurse most be
known. Your diagram should show the entities, relationships and their
attributes, and the cardinality of any relationships. Mark the best
primary key for each entity by underlining it.
This is my solution:
Here are my questions:
Should I use Have Appointment as associative entity?
Should I remove 2 relationships Undergo and Prescribe and connect 2
entities Procedure and Appointment Medicine directly to Have
Appointment associative entity? Will the ERD still right then?
If it's wrong, what about the same as question 2 and I turn the Have
Appointment associative entity into a relationship?
I feel really confused about the difference between using associative entity with a relationship (like in this post Enrollment with Teach and Teacher: When to use Associative entities?) and using ternary relationship (connect Teacher directly to Enrollment relationship instead of changing Enrollment to an associative entity and have the Teach relationship).

Should I use Have Appointment as associative entity?
No, I believe it should be a regular entity set. You gave it its own identity - the ID primary key - which I agree with, but that should've corresponded with a change in element type. Associative entity sets (AES) are relationships first, which means they're identified by the (keys of the) entity sets that they relate.
This is a topic that's widely confused, since AES in the entity-relationship model are different than in the network data model. The latter is intuitively more familiar to developers, since it's essentially a model based on records and pointers, but since it only supports directed binary relationships, anything more complicated - many-to-many relationships as well as ternary and higher relationships - need to be represented as AES. In this model, AES are identified by a surrogate ID, since composite keys generally aren't supported either.
The entity-relationship model supports n-ary relationships and composite keys, and so doesn't need AES nearly as frequently. One situation that can't be represented by regular entity sets and n-ary relationships is when a relationship needs to be the subject of a further relationship.
For example, let's look at the relationship between Procedure and Nurse to represent the nurses involved in a procedure.
I prefer the look-across convention for cardinality indicators - a nurse can perform 0 or more procedures, while a procedure requires 1 or more nurses. Anyway, the relationship Perform here is identified by the composite primary key (ProcedureID, NurseID).
Now, if we wanted to track the equipment used by each nurse in the performance of the procedure, we might think a simple ternary relationship would do the trick:
but that relationship would be identified by (ProcedureID, NurseID, EquipmentID), preventing us from recording nurses that assisted in the procedure without using any equipment. What we need is two separate relationships:
(ProcedureID, NurseID)
((ProcedureID, NurseID), EquipmentID)
with an FK constraint from the second to the first to prevent nurses not assisting in the procedure from handling the equipment.
Back to Have Appointment - it's not a relationship between pokemon and nurses (a pokemon can see the same nurse multiple times), it's an event that involves pokemon, nurses, procedures and medicine. It's best handled as a regular entity set with relationships to the other four. As for identity, I imagine a pokemon or nurse can only have one appointment at a time, so we could choose (PokemonID, DateTime) or (NurseID, DateTime) as a natural key. However, in practice we usually identify events by a surrogate ID since events span an interval which most DBMSs can't handle effectively as a primary key.
Should I remove 2 relationships Undergo and Prescribe and connect 2 entities Procedure and Appointment Medicine directly to Have Appointment associative entity? Will the ERD still right then?
No, I think you should add relationships between Pokemon and Have Appointment, and between Nurse and Have Appointment, after converting the AES to a regular entity set.
If it's wrong, what about the same as question 2 and I turn the Have Appointment associative entity into a relationship?
Answered above.

Related

Relational Database: Reusing the same table in a different interpretation

Problem description
I am currently working on a project which requires a relational database for storage.
After thinking about the data and its relations for a while I ran into a quite repetitive problem:
I encountered a common data schema for entity A which contains some fields e.g. name, description, value. This entity is connected with entity B in multiple n-1 relations. So entity B has n entities A in relation rel1 and n entities A in relation rel2.
Now I am trying to break down this datamodel into a schema for a relational database (e.g. Postgres, MySQL).
After some research, I have not really found "the best" solution for this particular problem.
Some similar questions I have found so far:
Stackoverflow
DBA Stackexchange
My ideas
So I have thought about possible solutions which I am going to present here:
1. Duplicate table
The relationship from entity B to entity A has a certain meaning to it. So it is possible to create multiple tables (1 per relationship). This would solve all immediate problems but essentially duplicate the tables which means that changes now have to be reflected to multiple tables (e.g. a new column).
2. Introduce a type column
Instead of multiple relationships, I could just say "Entity B is connected with n entity A". Additionally, I would add a type column that then tells me to which relation entity A belongs. I am not exactly sure how this is represented with common ORMs like Spring-Hibernate and if this introduces additional problems that I am currently unaware of.
3. Abstract the common attributes of entity A
Another option is to create a ADetails entity, which bundles all attributes of entity A.
Then I would create two entities that represent each relationship and which are connected to the ADetails entity in a 1-to-1 relationship. This would solve the interpretation problem of the foreign key but might be too much overhead.
My Question
In the context of a medium-large-sized project, are any of these solutions viable?
Are there certain Cons that rule out one particular approach?
Are there other (better) options I haven't thought about?
I appreciate any help on this matter.
Edit 1 - PPR (Person-Party-Role)
Thanks for the suggestion from AntC. PPR Description
I think the described situation matches my problem.
Let's break it down:
Entity B is an event. There exists only one event for the given participants to make this easier. So the relationship from event to participant is 1-n.
Entity A can be described as Groups, People, Organization but given my situation they all have the same attributes. Hence, splitting them up into separate tables felt like the wrong idea.
To explain the situation with the class diagram:
An Event (Entity B) has a collection of n Groups (Entity A), n People (Entity A) and n Organizations (Entity A).
If I understand correctly the suggestion is the following:
In my case the relationship between Event and Participant is 1-n
The RefRoles table represents the ParticipantType column that descibes to which relationship the Participant belongs (is it a customer or part of the service for the event for example)
Because all my Groups, People and Organizations have the same attributes the only table required at this point is the Participant table
If there are individual attributes in the future I would introduce a new table (e.g. People) that references the Participant in a 1-1 relationship.
If there are multiple tables going to be added, the foreign key of the multiple 1-1 relationship is mutually exclusive (so there can only be one Group/Person/Organization for a participant)
Solution suggested by AntC and Christian Beikov
Splitting up the tables does make sense while keeping the common attributes in one table.
At the moment there are no individual attributes but the type column is not required anymore because the foreign keys can be used to see which relationship the entity belongs to.
I have created a small example for this:
There exist 3 types (previously type column) of people for an event: Staff, VIP, Visitor
The common attributes are mapped in a 1-1-relationship to the person table.
To make it simple: Each Person (Staff, VIP, Visitor) can only participate in one event. (Would be n-m-relationship in a more advanced example)
The database schema would be the following:
This approach is better than the type column in my opinion.
It also solves having to interprete the entity based on its type in the application later on. It is also possible to resolve a type column in an ORM (see this question) but this approach avoids the struggle if the ORM you are using does not support resolving it.
IMO since you already use dedicated terms for these objects, they probably will diverge and splitting up a table afterwards is quite some work, also on the code side, so I would suggest you map dedicated entities/tables from the beginning.

Relational Database: When do we need to add more entities?

We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement? In this cases, if we don't create 2 entities for above requirement, we need to add two more attributes for each employee which can lead to anomaly issues later. Therefore, we need to have 2 entities as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer." How many entity/ies do you need to model the mentoring program?
The Answer from Lecturer is 1. With that, we have to add 2 more attributes for each Employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), doesn't it?
My question is why can't we create a new Entity named MENTORING which will be similar to TEAM in Q1? And why we can only do that if this is a many-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK is pairNo&employeeId) -> 2 entities
Thank you in advance
First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there's some issues with your implementation. TEAM's PK shouldn't be teamId, employeeId, it should be only teamId. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have only one leader at most).
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there's some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal was to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values, a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.

How to put this in a E/R diagram?

Have a simple question, but I think I am overthinking it. I need to make an E/R diagram out of this:
Substantial fees are due every calendar year. Fee payments must be
made via a bank transfer, mentioning the member number and the
membership year it applies to. The database should store the date of
payment.
I am ignoring calendar year, as I think it is not relevant for the E/R diagram. I have an entity called "Members" which I like to "Fee" via *"payed via the relationship (diamond symbol) a bank transfer"*.
Now, my question is: should "member number" and "membership" be part of the "fee" entity or the "member" entity? Or both? Because I am thinking to add a new relationship to "fee" giving it the name "consists of" and then link "member number" and "membership", but I don't know whether that's good or not.
And what to do with the last sentence? "The database should store the date of payment."? Can I ignore it?
From your description I got:
You have entity sets Members and Payments
Members are identified by a member_number
Payments have attributes date, amount and membership_year
Obviously, we also need:
Payments have an attribute amount
How are we going to identify Payments? No combination of the listed attributes are uniquely identifying in my opinion. A Member could make two identical Payments on the same date with the same amount, for the same membership year, e.g. if they accidentally only paid half of the annual fee at first then made a second payment to correct.
Let's introduce a surrogate key:
Payments are identified by a payment_id
We also need a relationship between the two entity sets:
Each Payment is associated with a single Member
Each Member can make multiple Payments
We can put this info into an ER diagram:
To derive a table diagram, Chen's original method implemented every entity relation (entity key and attributes) and relationship relation (relationship keys (i.e. related entity keys) and relationship attributes) as separate tables:
However, it's common practice to denormalize tables with the same primary key:
I recommend you study Chen's paper The Entity-Relationship Model - Toward a Unified View of Data. Codd's paper A Relational Model of Data for Large Shared Databanks provides valuable background.

Implementating Generalizations When At The Logical Design Stage Of Database Design?

I am designing a database and as i do not have much experience in this subject, i am faced with a problem which i do not know how to go about solving.
In my conceptual model i have an object known as "Vehicle" which the customer orders and the stock system monitors. This supertype has two subtypes "Motorcar" and "Motorcycle". The user can order one or the other or even both.
Now that i am at the logical design stage, i need to know how i can have the system allow for two different types of products. The problem i have is that if i put each of the objects separate attributes into the same relation, then i will have columns that are of no use to some objects.
For example, if i just have a generic table holding both "Motorcars" and "Motorcycles" which i call "Vehicles" and all of their attributes, the cars will not need some of the motorcycle attributes and the motorcycle will not need all of the car attributes.
Is there a way to solve this issue?
The decision will need to be guided by the amount of shared information. I would start by identifying all the attributes and the rules about them.
If the majority of information is shared, you might not split into multiple tables. On the other hand, you can always split tables and then join into a view for ease of use.
For instance, you might have a vehicle table with only share information, and then a motorcar table with a foreign key to the vehicles table and a motorcycle table with a foreign key to the vehicles table. There is a certain difficulty ensuring that you don't have a motorocar row AND a motorcycle row referring to the same vehicle, and so there are other possibilities to mitigate that - but all that is unnecessary if the majority of information is common, you just have unused columns in a single vehicles table. You can even enforce with constraints to ensure that columns are NULL for types where they should not be filled in.

a layman's term for identifying relationship

There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation

Resources