Here is the scenario.
Two completely different entities are each independently related to a third entity in the same way. How do we represent this in an ERD (or an Enhanced ER diagram)?
Ex:
STUDENT "BORROWS" BOOK (from the library)
DEPARTMENT "BORROWS" BOOK (from the same library).
If I define the 'BORROWS' relationship twice, the diagram looks awkward and clumsy, and the implementation becomes more complex as well.
At the same time, I cannot declare a ternary relationship, since STUDENT and DEPARTMENT are not related to each other within a relationship instance.
However, I couldn't find a better way.
How do I solve it?
If Wikipedia is to be believed, Enhanced ER permits inheritance. Why don't you have a BORROWER entity (with the appropriate relationship), and have STUDENT and DEPARTMENT subclass that?
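To make the BORROWER idea concrete, here is a minimal sketch of how it might land in tables. It uses Python's built-in sqlite3 purely so it can run standalone; every table and column name (borrower_id, borrows and so on) is made up for illustration and is not from the question.

```python
import sqlite3

# Hypothetical schema: all names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE borrower   (borrower_id INTEGER PRIMARY KEY);
-- STUDENT and DEPARTMENT are subtypes: they share the supertype's key.
CREATE TABLE student    (borrower_id INTEGER PRIMARY KEY REFERENCES borrower(borrower_id),
                         name TEXT NOT NULL);
CREATE TABLE department (borrower_id INTEGER PRIMARY KEY REFERENCES borrower(borrower_id),
                         name TEXT NOT NULL);
CREATE TABLE book       (book_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- BORROWS is defined once, against the supertype.
CREATE TABLE borrows    (borrower_id INTEGER NOT NULL REFERENCES borrower(borrower_id),
                         book_id     INTEGER NOT NULL REFERENCES book(book_id),
                         PRIMARY KEY (borrower_id, book_id));

INSERT INTO borrower VALUES (1), (2);
INSERT INTO student VALUES (1, 'Alice');
INSERT INTO department VALUES (2, 'Physics');
INSERT INTO book VALUES (10, 'Database Systems'), (11, 'Optics');
INSERT INTO borrows VALUES (1, 10), (2, 11);
""")

# One query lists every loan, whichever subtype the borrower is.
rows = conn.execute("""
SELECT COALESCE(s.name, d.name) AS borrower, b.title
FROM borrows br
JOIN book b            ON b.book_id = br.book_id
LEFT JOIN student s    ON s.borrower_id = br.borrower_id
LEFT JOIN department d ON d.borrower_id = br.borrower_id
ORDER BY br.borrower_id
""").fetchall()
print(rows)
```

The point is only that the relationship appears once in the diagram and once in the schema; whether the subtypes should be disjoint is a separate decision.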
I've been having a similar issue - where a company or a person can order a product.
You've got an order, that can belong to either a person, or a company - so what do you link the relationship to? I'm thinking orders will have a companyId, and a personId foreign key, but how do you make them exclusive? The data returned won't necessarily be the same - a company doesn't have a first name / last name field for example.
I guess it could be done by returning a single name: for a person, build the string out of firstname / lastname, and for a company, use the companyname field.
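One way the "exclusive" part might be enforced is a CHECK constraint that demands exactly one of the two foreign keys. This is only a sketch under assumptions: it uses Python's sqlite3 so it runs on its own, and the names orders, person_id, company_id and so on are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person  (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE company (company_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    person_id  INTEGER REFERENCES person(person_id),
    company_id INTEGER REFERENCES company(company_id),
    -- Exactly one of the two foreign keys must be set (neither both nor none).
    CHECK ((person_id IS NULL) <> (company_id IS NULL))
);
INSERT INTO person  VALUES (1, 'Ada', 'Lovelace');
INSERT INTO company VALUES (1, 'Acme Ltd');
INSERT INTO orders  VALUES (100, 1, NULL), (101, NULL, 1);
""")

# An order that names both a person and a company violates the CHECK.
try:
    conn.execute("INSERT INTO orders VALUES (102, 1, 1)")
    both_rejected = False
except sqlite3.IntegrityError:
    both_rejected = True

# Build one display name, from first/last name or from the company name.
names = conn.execute("""
SELECT o.order_id,
       COALESCE(p.first_name || ' ' || p.last_name, c.company_name) AS customer_name
FROM orders o
LEFT JOIN person p  ON p.person_id  = o.person_id
LEFT JOIN company c ON c.company_id = o.company_id
ORDER BY o.order_id
""").fetchall()
print(both_rejected, names)
```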
We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement. In this case, if we don't create 2 entities for the above requirement, we need to add two more attributes to each employee, which can lead to anomalies later. Therefore, we need 2 entities, as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) "The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer." How many entities do you need to model the mentoring program?
The lecturer's answer is 1. With that, we have to add 2 more attributes to each employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), don't we?
My question is why can't we create a new Entity named MENTORING which will be similar to TEAM in Q1? And why we can only do that if this is a many-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) MENTORING (PK is pairNo&employeeId) -> 2 entities
Thank you in advance
First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there are some issues with your implementation. TEAM's PK shouldn't be (teamId, employeeId); it should be teamId alone. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have at most one leader).
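As a rough sketch of that two-table layout (run through Python's sqlite3 just so it is self-contained; the sample ids are invented), it might look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE team (
    teamId     INTEGER PRIMARY KEY,
    -- The leader, if any. One row per team means at most one leader per team.
    employeeId INTEGER REFERENCES employee(employeeId)
);
CREATE TABLE employee (
    employeeId INTEGER PRIMARY KEY,
    teamId     INTEGER REFERENCES team(teamId)   -- membership; not part of the PK
);

INSERT INTO team (teamId) VALUES (1);                 -- no leader yet
INSERT INTO employee VALUES (10, 1), (11, 1), (12, NULL);
UPDATE team SET employeeId = 10 WHERE teamId = 1;     -- the elected leader
""")

members = conn.execute(
    "SELECT employeeId FROM employee WHERE teamId = 1 ORDER BY employeeId"
).fetchall()
leader = conn.execute("SELECT employeeId FROM team WHERE teamId = 1").fetchone()[0]
print(members, leader)
```

Note that nothing in this sketch forces the leader to be a member of the team they lead, or enforces the "more than 5 members" rule; those would need extra constraints or application logic.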
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there are some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal was to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
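Here is a small sketch of the MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) table, using Python's sqlite3 so it can be run as-is. The employee names and the self-mentoring CHECK are my own additions for illustration, not part of the requirements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employeeId INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE mentoring (
    -- PK: each mentee appears once. UNIQUE: each mentor appears once.
    menteeEmployeeId INTEGER PRIMARY KEY REFERENCES employee(employeeId),
    mentorEmployeeId INTEGER NOT NULL UNIQUE REFERENCES employee(employeeId),
    CHECK (menteeEmployeeId <> mentorEmployeeId)   -- assumption: nobody mentors themselves
);
INSERT INTO employee VALUES (1, 'Old hand'), (2, 'New hire'), (3, 'Another new hire');
INSERT INTO mentoring VALUES (2, 1);
""")

# Employee 1 already mentors employee 2, so a second mentee is refused.
try:
    conn.execute("INSERT INTO mentoring VALUES (3, 1)")
    second_mentee_rejected = False
except sqlite3.IntegrityError:
    second_mentee_rejected = True

pairs = conn.execute("SELECT menteeEmployeeId, mentorEmployeeId FROM mentoring").fetchall()
print(second_mentee_rejected, pairs)
```

Each pair is recorded exactly once, and the role is implied by the column, so there is no mentorRole or pairNo to keep consistent.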
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values; a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.
I have two ERD examples involving subtypes. I cannot seem to find any definitive information online or in textbooks on connecting other entities to subtypes and how far you can inherit keys from subtypes, if at all. Those with good eyes may notice that I recently asked a similar question regarding subtypes, but it was for a different scenario and so far I only received a referral to another question that only explains the basics of subtypes which I do not need - I feel this is a more advanced topic to solve.
My specific issue is that I need to know whether the bridging entity called ENROLMENT is allowed to inherit the PK/FK from the STUDENT entity, a subtype of PATRON. If so, are the PatronNumber and/or StudentNumber attributes allowed?
The two ERD examples are slightly different. Version 1 uses PatronNumber from the subtype STUDENT. Version 2 includes another PK called StudentNumber. Is it OK to add this as a PK, and can ENROLMENT reference it? Which is better, if either?
Cheers!
The first version is to be preferred, for the reason that with a single value, PatronNumber, you can obtain all the information about the student with a single join, while in the second case you need to perform two joins.
Imagine, for instance, that you need to know the names of all the students enrolled in course number 3: in the first case you can simply perform a join between Enrollment and Patron, while in the second case you need a join between Enrollment and Student and then between Student and Patron.
If your application requires explicitly a StudentNumber different from PatronNumber, you can simply add the attribute to the Student, and declare it unique.
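A minimal sketch of the first version, with StudentNumber kept as a unique attribute rather than a key. It runs on Python's sqlite3; since the diagrams aren't reproduced here, the columns other than PatronNumber and StudentNumber (Name, CourseNumber) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Patron  (PatronNumber INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Student (
    PatronNumber  INTEGER PRIMARY KEY REFERENCES Patron(PatronNumber),
    StudentNumber TEXT UNIQUE            -- optional alternate identifier, not the PK
);
CREATE TABLE Enrolment (
    PatronNumber INTEGER NOT NULL REFERENCES Student(PatronNumber),
    CourseNumber INTEGER NOT NULL,       -- a Course table is omitted for brevity
    PRIMARY KEY (PatronNumber, CourseNumber)
);
INSERT INTO Patron    VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Student   VALUES (1, 'S-001'), (2, 'S-002');
INSERT INTO Enrolment VALUES (1, 3), (2, 5);
""")

# Names of students enrolled in course 3: a single join, straight to Patron.
names = conn.execute("""
SELECT p.Name
FROM Enrolment e
JOIN Patron p ON p.PatronNumber = e.PatronNumber
WHERE e.CourseNumber = 3
""").fetchall()
print(names)
```

The foreign key still points at Student, so only students can be enrolled, yet the query never has to touch the Student table to get the name.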
My assignment is to draw an ER model (by hand) using Chen notation using the specifications below:
http://i57.tinypic.com/73ff2f.png
If you have questions about these specs, I'll play the role of the
client who will resolve them.
The database will serve a university.
Students have id's, names and gpa's. They must have exactly one major,
but they could have minors as well. Each major or minor is a
department which has a unique name and a phone number. For each
student with a minor, we record the date she signed up for it. Faculty
members are associated with a unique department and have id's, names
and office locations. Each internship is held by a particular student
at a particular company and is supervised by a particular faculty
member. We also keep track of the last term in which that student
registered under that advisor for an internship at that company.
Students may have many internships over time. A given faculty member
may supervise many students at a given company, and she may supervise
a given student at several companies. However, for a given student and
company, there can be only one faculty advisor.
Students, Departments,
Faculty and Companies should be your entity types. Internship should
be a ternary relationship type. The specs should also lead you to some
binary relationship types. Don't add any ingredients to this mix other
than what appear in the specs.
Below is my work:
http://i60.tinypic.com/28rf7tf.jpg
Can anyone please help as I really need a better understanding of this (my professor is AWFUL at explaining this).
You missed (per your assignment's last paragraph) a department entity type. (Box.)
You missed 'Faculty members are associated with a unique department'. That's a relationship between those two entity types. (Diamond with lines to those boxes.)
You could have those major and minor entity types that are 1:1 with departments. (Your present boxes with each a line to its own diamond each with a line to department.) But (per your assignment's last paragraph not listing them as entities) you could have major being a relationship 'student [s] has a major in department [d]' and similarly for minor. (Lines from student to each of two diamonds each with a line to department.) But the assignment actually says 'each major or minor is a department' so that's major as 'student [s] has major department [d]' and similarly for minor. (Same picture.)
Per your assignment's last paragraph you should make internship a ternary relationship. (Under Chen it's a relationship diamond (possibly with its own properties) formed by 3 lines to entity type rectangles (possibly with their own properties) rather than an entity box.) However, it's not clear exactly when your assignment considers that an internship holds. (It tells us what relationships hold; it's just not clear which one it wants to call interning.) (Although we can look for interpretations consistent with it being ternary.) One is 'student [s] interns at company [c] supervised by faculty member [f]'. But since 'for a given student and company, there can be only one faculty advisor' that notion of internship is more simply characterized by a binary relationship 'student [s] interns at company [c]'. But then you still need a relationship 'faculty member [f] advises student [s] at a company [c]'. So I will suggest that your assignment expects the former. We can add property term. (This is more reasonably called a relationship on student, company, faculty member and date; but E-RM considers relationships to be on entities. Although it all depends on your class's method's particulars.)
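If it helps to see the constraint outside the diagram, here is a rough relational sketch of that ternary relationship, run through Python's sqlite3. The table and column names and the term values are invented; the only thing taken from the spec is that a (student, company) pair has exactly one faculty advisor, which is why that pair is the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY);
CREATE TABLE company (company_id INTEGER PRIMARY KEY);
CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY);
-- 'student [s] interns at company [c] supervised by faculty member [f]'
CREATE TABLE internship (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    company_id INTEGER NOT NULL REFERENCES company(company_id),
    faculty_id INTEGER NOT NULL REFERENCES faculty(faculty_id),
    last_term  TEXT,                         -- property of the relationship
    PRIMARY KEY (student_id, company_id)     -- one advisor per student and company
);
INSERT INTO student VALUES (1), (2);
INSERT INTO company VALUES (1), (2);
INSERT INTO faculty VALUES (1), (2);
INSERT INTO internship VALUES
    (1, 1, 1, '2014 Fall'),     -- student 1 at company 1 under faculty 1
    (1, 2, 1, '2015 Spring'),   -- same student and advisor, another company
    (2, 1, 1, '2015 Spring');   -- same advisor and company, another student
""")

# A second advisor for student 1 at company 1 breaks the key.
try:
    conn.execute("INSERT INTO internship VALUES (1, 1, 2, '2015 Fall')")
    second_advisor_rejected = False
except sqlite3.IntegrityError:
    second_advisor_rejected = True

internship_count = conn.execute("SELECT COUNT(*) FROM internship").fetchone()[0]
print(second_advisor_rejected, internship_count)
```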
(The possibility of multiple reasonable variations is why you should propose a particular design fully handling a particular specification in a SO question.)
A problem with the E-R Model [sic] is that it introduces needless distinctions between entities, relationships and properties. There is really no distinction between a relationship instance and an entity. Eg: Here we could just as well have an internship be per above an entity in a 4-way relationship plus property. Eg: Your assignment says 'each major or minor is a department'. But a major or minor isn't a department. A major or minor could be considered a subject, which would be the subject after which a department is named or the subject of the degree offered by a department. Or we could just have relationships in which a department participates but the relationship is about that department's subject or name or degree being a major or minor.
(If an internship as relationship participated in its own relationships I don't know how your instructor's particular method would keep the further lines organized. Some methods add internship entities (box) 1:1 with relationships (diamond); then some methods specially associate the entity type with the relationship as a reification while some make the relationship 4-way by including the reified entity type. Eg 'internship [i] is student [s] at company [c] and ...'.)
(Correctly speaking there are entity types vs relationships and entities vs relationship instances. But the assignment talks of relationship "types".)
Re E-RM see this answer and this one. Also the E-RM wiki page section 'Entity–relationship modeling'. (Which correctly mentions misinterpretations of Chen's E-RM & E-RDs by some related modeling and diagramming methods and tools and even some presentations of E-RM itself. But the 'Overview' is nonsense.)
Re E-RM problems see this.
There are a couple of questions around asking for the difference between identifying and non-identifying relationships in relational databases.
My question is, can you think of a simpler term for this jargon? I understand that technical terms have to be specific and unambiguous, but having an 'alternative name' might help students relate more easily to the concept behind them.
We actually want to use a more lay term in our own database modeling tool, so that first-time users without much computer science background can learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship.
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
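A small sketch of that distinction, using Python's sqlite3 so it runs standalone. The column names (user_id, phone, state_code) and the is_identifying helper are hypothetical; the helper just checks whether any foreign key column is also part of the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE States (state_code TEXT PRIMARY KEY);
CREATE TABLE Users (
    user_id INTEGER PRIMARY KEY,
    state   TEXT REFERENCES States(state_code)   -- non-identifying: FK outside the PK
);
CREATE TABLE PhoneNumbers (
    user_id INTEGER NOT NULL REFERENCES Users(user_id),
    phone   TEXT NOT NULL,
    PRIMARY KEY (user_id, phone)                  -- identifying: FK is part of the PK
);
""")

def is_identifying(table):
    """Return True if some foreign key column of `table` is part of its primary key."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk_cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0}
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    fk_cols = {row[3] for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    return bool(pk_cols & fk_cols)

phone_is_child = is_identifying("PhoneNumbers")
users_is_child = is_identifying("Users")
print(phone_is_child, users_is_child)
```

So in the lay vocabulary above, PhoneNumbers is a child of Users, while Users merely references States.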
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice versa!
One could use connection.
You have a connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation
For a database assignment I have to model a system for a school. Part of the requirements is to model information for staff, students and parents.
In the UML class diagram I have modelled this as those three classes being subtypes of a person type. This is because they will all require information on, among other things, address data.
My question is: how do I model this in the database (mysql)?
Thoughts so far are as follows:
Create a monolithic person table that contains all the information for each type and will have lots of null values depending on what type is being stored. (I doubt this would go down well with the lecturer unless I argued the case very convincingly).
A person table with three foreign keys which reference the subtypes but two of which will be null - in fact I'm not even sure if that makes sense or is possible?
According to this wikipage about django it's possible to implement the primary key on the subtypes as follows:
"id" integer NOT NULL PRIMARY KEY REFERENCES "supertype" ("id")
Something else I've not thought of...
So for those who have modelled inheritance in a database before; how did you do it? What method do you recommend and why?
Links to articles/blog posts or previous questions are more than welcome.
Thanks for your time!
UPDATE
Alright thanks for the answers everyone. I already had a separate address table so that's not an issue.
Cheers,
Adam
4 tables staff, students, parents and person for the generic stuff.
Staff, students and parents have foreign keys that each refer back to Person (not the other way around).
Person has field that identifies what the subclass of this person is (i.e. staff, student or parent).
EDIT:
As pointed out by HLGM, addresses should exist in a separate table, as any person may have multiple addresses. (However - I'm about to disagree with myself - you may wish to deliberately constrain addresses to one per person, limiting the choices for mailing lists etc).
Well I think all approaches are valid and any lecturer who marks down for shoving it in one table (unless the requirements are specific to say you shouldn't) is removing a viable strategy due to their own personal opinion.
I highly recommend that you check out the documentation on NHibernate as this provides different approaches for performing the above. Which I will now attempt to poorly parrot.
Your options:
1) One table with all the data that has a "discriminator" column. This column states what kind of person the person is. This is viable in simple scenarios and in (seriously) high-performance ones where the joins would hurt too much.
2) Table per class, which will lead to duplication of columns but will avoid joins again, so it's simple and a lil faster (although only a lil, and indexing mitigates this in most scenarios).
3) "Proper" inheritance. The normalised version. You are almost there but your key is in the wrong place IMO. Your Employee table should contain a PersonId so you can then do:
select employee.id, person.name from employee inner join person on employee.personId = person.personId
To get all the names of employees where name is only specified on the person table.
I would go for #3.
Your goal is to impress a lecturer, not a PM or customer. Academics tend to dislike nulls and might (subconsciously) penalise you for using the other methods (which rely on nulls).
And you don't necessarily need that django extension (PRIMARY KEY ... REFERENCES ...) You could use an ordinary FOREIGN KEY for that.
"So for those who have modelled inheritance in a database before; how did you do it? What method do you recommend and why?"
Methods 1 and 3 are good. The differences are mostly in what your use cases are.
1) adaptability -- which is easier to change? Several separate tables with FK relations to the parent table.
2) performance -- which requires fewer joins? One single table.
Rats. No design accomplishes both.
Also, there's a third design in addition to your mono-table and FK-to-parent.
Three separate tables with some common columns (usually copy-and-paste of the superclass columns among all subclass tables). This is very flexible and easy to work with. But, it requires a union of the three tables to assemble an overall list.
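For what the union looks like in practice, here is a minimal sketch with three independent tables, run via Python's sqlite3. The column names and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The common columns (here just name) are repeated in every table.
CREATE TABLE staff   (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL);
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, gpa REAL);
CREATE TABLE parent  (id INTEGER PRIMARY KEY, name TEXT NOT NULL, emergency_contact TEXT);
INSERT INTO staff   VALUES (1, 'Ms Smith', 30000);
INSERT INTO student VALUES (1, 'Tom', 3.5);
INSERT INTO parent  VALUES (1, 'Mr Jones', '555-0100');
""")

# An overall list of people needs a union over all three tables.
everyone = conn.execute("""
SELECT 'staff' AS role, name FROM staff
UNION ALL
SELECT 'student', name FROM student
UNION ALL
SELECT 'parent', name FROM parent
ORDER BY role
""").fetchall()
print(everyone)
```

Note that the ids are only unique within each table here, which is one reason the role has to travel with each row in the combined list.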
OO databases go through the same stuff and come up with pretty much the same options.
If the point is to model subclasses in a database, you probably are already thinking along the lines of the solutions I've seen in real OO databases (leaving fields empty).
If not, you might think about creating a system that doesn't use inheritance in this way.
Inheritance should always be used quite sparingly, and this is probably a pretty bad case for it.
A good guideline is to never use inheritance unless you actually have code that does different things to the field of a "Parent" class than to the same field in a "Child" class. If business code in your class doesn't specifically refer to a field, that field absolutely shouldn't cause inheritance.
But again, if you are in school, that may not match what they are trying to teach...
The "correct" answer for the purposes of an assignment is probably #3 :
Person
PersonId Name Address1 Address2 City Country
Student
PersonId StudentId GPA Year ..
Staff
PersonId StaffId Salary ..
Parent
PersonId ParentId ParentType EmergencyContactNumber ..
Where PersonId is always the primary key, and also a foreign key in the last three tables.
I like this approach because it makes it easy to represent the same person having more than one role. A teacher could very well also be a parent, for example.
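A quick sketch of the multiple-roles point, using a cut-down version of the tables above (only a few of the columns) and Python's sqlite3 so it runs on its own. The sample person is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Person  (PersonId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- In each role table PersonId is both the primary key and a foreign key.
CREATE TABLE Student (PersonId INTEGER PRIMARY KEY REFERENCES Person(PersonId), GPA REAL);
CREATE TABLE Staff   (PersonId INTEGER PRIMARY KEY REFERENCES Person(PersonId), Salary REAL);
CREATE TABLE Parent  (PersonId INTEGER PRIMARY KEY REFERENCES Person(PersonId),
                      EmergencyContactNumber TEXT);

-- One person, two roles: a teacher who is also a parent.
INSERT INTO Person VALUES (1, 'Pat');
INSERT INTO Staff  VALUES (1, 42000);
INSERT INTO Parent VALUES (1, '555-0100');
""")

roles = conn.execute("""
SELECT p.Name,
       s.PersonId  IS NOT NULL AS is_staff,
       pa.PersonId IS NOT NULL AS is_parent,
       st.PersonId IS NOT NULL AS is_student
FROM Person p
LEFT JOIN Staff s    ON s.PersonId  = p.PersonId
LEFT JOIN Parent pa  ON pa.PersonId = p.PersonId
LEFT JOIN Student st ON st.PersonId = p.PersonId
""").fetchall()
print(roles)
```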
I suggest five tables
Person
Student
Staff
Parent
Address
Why - because people can have multiple addresses and people can also have multiple roles, and the information you want for staff is different from the information you need to store for a parent or student.
Further, you may want to store name as last_name, Middle_name, first_name, Name_suffix (like jr.) instead of as just name. Believe me, you will want to be able to search on last_name! Name is not unique, so you will need to make sure you have a unique surrogate primary key.
Please read up about normalization before trying to design a database. Here is a source to start with:
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx
Super type Person should be created like this:
CREATE TABLE Person(PersonID int primary key, Name varchar ... etc ...)
All Sub types should be created like this:
CREATE TABLE IF NOT EXISTS Staffs (
    StaffId INT NOT NULL,
    PRIMARY KEY (StaffId),
    CONSTRAINT FK_StaffId FOREIGN KEY (StaffId) REFERENCES Person(PersonId)
);
CREATE TABLE IF NOT EXISTS Students (
    StudentId INT NOT NULL,
    PRIMARY KEY (StudentId),
    CONSTRAINT FK_StudentId FOREIGN KEY (StudentId) REFERENCES Person(PersonId)
);
CREATE TABLE IF NOT EXISTS Parents (
    PersonID INT NOT NULL,
    PRIMARY KEY (PersonID),
    CONSTRAINT FK_PersonID FOREIGN KEY (PersonID) REFERENCES Person(PersonId)
);
The foreign key in the subtype tables Staffs, Students and Parents adds two conditions:
A Person row cannot be deleted until the corresponding subtype row has
been deleted. For example, if a row in the Students table refers to a
row in the Person table, the Person row cannot be deleted without first
deleting the Students row, which is very important. In object terms:
once a Student object has been created, the base Person object cannot
be deleted until the Student object is deleted.
Every subtype has a NOT NULL foreign key, which ensures that each
subtype row always has an existing base row. For example, to create a
Student object you must first create the Person object.
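Both conditions can be checked with a short experiment. This sketch uses Python's sqlite3 instead of MySQL so that it runs standalone (SQLite needs foreign keys switched on explicitly); the is_rejected helper and the sample row are my own, and the Person table is trimmed to two columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Students (
    StudentId INTEGER NOT NULL PRIMARY KEY,
    CONSTRAINT FK_StudentId FOREIGN KEY (StudentId) REFERENCES Person(PersonId)
);
INSERT INTO Person   VALUES (1, 'Tom');
INSERT INTO Students VALUES (1);
""")

def is_rejected(sql):
    """Run one statement; return True if it fails with a constraint error."""
    try:
        with conn:              # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A Student cannot exist without its Person.
orphan_student_rejected = is_rejected("INSERT INTO Students VALUES (2)")
# A Person cannot be deleted while its Student row still exists.
delete_person_rejected = is_rejected("DELETE FROM Person WHERE PersonId = 1")
# Once the Student row is gone, the Person can be deleted.
with conn:
    conn.execute("DELETE FROM Students WHERE StudentId = 1")
delete_after_child_removed_rejected = is_rejected("DELETE FROM Person WHERE PersonId = 1")

print(orphan_student_rejected, delete_person_rejected, delete_after_child_removed_rejected)
```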