Mapping a ternary relationship to the relational model (Employes, Customer, Project) - database

I would like to convert this segment of an ER-Diagram to a relational model. We have a ternary relationship and what it says is the following:
1 Customer gives 1 Project to -> multiple Developers
1 Customer assigns 1 Developer with -> multiple Projects
1 Developer is assigned 1 Project by -> ONE Customer
A proposed solution would be this:
Assignment(EmployeeID, CustomerID, ProjectID)
where the primary key is composed of EmployeeID, CustomerID and ProjectID.
And all of those attributes are foreign keys, each one refering to its respective entity.
But this solution is plain wrong as it doesn't express the same thing as the ER-Diagram. We have a composed primary key, so that means that the COMBINATION of those three things is UNIQUE. That implies that I can have the same ProjectID, with the same EmployeeID but given by a different CustomerID (which I do not want).
How do I resolve this?
EDIT: As many users find that the bullet points haven't clarified anything, I will give a short textual description of the concept of the relation:
A single customer can give away one or more projects
A single project can be given by ONE SINGLE CUSTOMER
Each project can be finished by one or more developers
Each developer can work on multiple projects (regardless of the customer by which the project was given)
For that purpose, I have concluded that it would be better to use two separate binary relations instead of a single ternary. See my answer below.

When ternary relationships are expressed in a relational model, each of the entity sets with a "many" cardinality indicator becomes part of the primary key. In other words, I read your relationship as expressing the functional dependency (EmployeeID, ProjectID) -> CustomerID which will be physically represented as Assignment (EmployeeID PK/FK, ProjectID PK/FK, CustomerID FK).

the COMBINATION of those three things is UNIQUE. That implies that I can have the same ProjectID, with the same EmployeeID but given by a different CustomerID (which I do not want).
The triplets being unique does not imply that--clearly, the triplets can be unique at the same time as certain combinations of rows are absent. On the other hand it doesn't enforce their absence. But the cardinality constraints do. What they say is what the bullets (try to) say--that only certain situations/states can arise. The bullets are not "what the relationship says"--either in the sense of what rows actually form the relationship/table in a given situation/state or in the sense of what a row says about the situation when it is in the relationship/table.
In this kind of diagram a diamond denotes an n-ary business or application relation(ship) or association and its corresponding table. A line in such a diagram represents a participation by an entity type and its corresponding FK (foreign key) (sadly, called a "relationship" in pseudo-ER methods.) A constraint is a restriction on what instances/rows can appear in a relationship/table. Each instance/row in a relationship/table "says" that that row of values satisfies the relationship. Constraints "say" there are limitations on what values can be so related over all situations/states. Cardinalities are constraints that say something about how many times values and/or combinations of values can appear in a relationship.
There are two main cardinality conventions, look-across & look-here. In look-across a number/range says how many of the entities of the type it is near can participate with one subrow of entities of the other entity types, ie how many times some subrow of the others can participate/be in the relationship/table. (Chen's original ER meaning.) In look-here a number/range says how many subrows of the other entity types can appear with an entity of the type it is near, ie how many times a nearby entity can participate/be in the relationship/table. (Look-here isn't very useful for relationships with arity > 2.)
We have a ternary relationship and what it says is the following:
What the relationship diamond says is that you are recording the rows (EmployeeID, CustomerID, ProjectID) where (something like) developer EmployeeID is assigned by customer CustomerID to project ProjectID. What the cardinalities say is that only certain sets of instances/rows can satisfy that relationship in any given situation/state.
1 Customer gives 1 Project to -> multiple Developers
1 Customer assigns 1 Developer with -> multiple Projects
1 Developer is assigned 1 Project by -> ONE Customer
Your bulleted constraints are not clear. Numbers have been stuck in front of entity types--almost as one would put id values in to get what that row of id values says when in the relationship/table--but the almost-sentences produced, which also have unexplained arrows, don't mean anything. Maybe you are trying to say, for a given customer-project subrow value there can be multiple developer values, etc? That would give the look-across cardinalities in the diagram. But you haven't said that.

As I have already mentioned in the question in the first place, the description of the relation is the following:
A single customer can give away one or more projects
A single project can be given by ONE SINGLE CUSTOMER
Each project can be finished by one or more developers
Each developer can work on multiple projects (regardless of the customer by which the project was given)
The problem is in the ER-Diagram itself: it does not exactly represent the description above. The problem lies in the constraint that a single project can be given by one single customer. That's why it would make more sense to model that with two separate binary relationships instead using a ternary one.
That being said, the relationship between Customer and Project should be a 1:n relationship, while the relationship between Project and Developer should be a m:n relationship. Mapping those relationships gives us the following:
Customer(CustomerID) with Primary Key=CustomerID
Project (ProjectID, CustomerID) with Primary Key=CustomerID and Foreign Key=CustomerID referencing the Customer
Developer(DeveloperID) with PK=DeveloperID
ProjectDevelopment (ProjectID, DeveloperID) with PK={ProjectID, DeveloperID)

Related

Relational Database: When do we need to add more entities?

We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement? In this cases, if we don't create 2 entities for above requirement, we need to add two more attributes for each employee which can lead to anomaly issues later. Therefore, we need to have 2 entities as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer." How many entity/ies do you need to model the mentoring program?
The Answer from Lecturer is 1. With that, we have to add 2 more attributes for each Employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), doesn't it?
My question is why can't we create a new Entity named MENTORING which will be similar to TEAM in Q1? And why we can only do that if this is a many-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK is pairNo&employeeId) -> 2 entities
Thank you in advance
First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there's some issues with your implementation. TEAM's PK shouldn't be teamId, employeeId, it should be only teamId. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have only one leader at most).
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there's some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal was to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values, a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.

foreign key or null value to link two long distanced tables

Consider the following database example :
A clinichas many articles
The relationship between supplier and article is many to many ( (1,n) - (1,n))
Let's say I have the clinic's id and I want to retrieve all it's suppliers, what's the best way to do it? is it by creating a "null" article for each supplier in article_supplier OR by creating a foreign key in supplier that references the appropriate clinic?
The latter solution may seem the simplest and easiest but what happens when there is a big chain of tables? Do I keep adding a foreign key each time I need a list of something? e.g :
list of medicines a clinic uses
list of prescriptions a doctor gave
...
If it makes a difference, I am using Laravel's Eloquent ORM
A table represents a relationship among values. (Some values identify entities.) A database can't be used until we are told what each table--base or query result--means: what business/application relationship its rows satisfy. (Ie, what is its (characteristic) predicate.) What are the relationships for your tables?
To query we express a relationship in terms of base relationships then express its table in terms of the corresponding base tables. Eg a join returns rows satisfying one relationship and another. So again, we need to know tables' relationships in business/application terms.
Cardinalities & constraints are properties of relationships given what situations can arise. They aren't needed to update or query. They can guide design & are used for integrity.
When, given a clinic, you talk about "its" suppliers, you do not say what you mean. "Has", "its", "for", "references", "appropriate"--all mean nothing--they refer to related entities, but they don't say how they are related in terms of the business/application. This design contains no explicit relationship on clinics & articles. If it did, you've said nothing by which we could put the right rows in or see the rows & know about the situation. Still, you could then derive clinic-supplier rows where the supplier is "for" some article a given clinic "has". But is that the relationship you mean?? Eg if you want pairs where the clinic is allowed to "have" the supplier "for" some articles, that's a new relationship/table that cannot be derived from what you have.
creating a "null" article for each supplier in article_supplier
That relationship/table is a certain combination of article_supplier & the relationship/table just described. But it is simpler to just have those two.
creating a foreign key in supplier that references the appropriate clinic
That would mean that if a supplier "has" more than one clinic then there can be rows in the new version that differ only by all the other columns; normalization theory says that's worse than the original design and a clinic_supplier relationship/table. And it means that if a supplier "has" no clinic you would need something like a nullable FK (foreign key).
So you likely want a clinic_supplier table. But you should post a new question in which you actually say what business/application relationships you are talking about.
Your question & the following are essentially duplicates in that the basic principles/notions to obviously apply to answer them are all the same:
How do I find relations between tables that are long-distance related?
Required to join 2 tables with their FKs in a 3rd table
Best Solution - Ternary or Binary Relationship
You need to read an information modeling & database design textbook.
In your design :
Supplier is a table.
Articles is a table.
Since Supplier supplies articles, their relationship is expressed in terms of a table that links them. So, there is a supplier_article table.
Clinic is a table.
You mention Clinic uses articles.
By the same principle as above, the relationship between Clinic and Article should be expressed in the form of a table. This should be clinic_article table.
Though you haven't mentioned, you must look at other dependencies that arise. For example, a clinic performs procedures and procedures use articles.
by creating a foreign key in supplier that references the appropriate
clinic
This would assume that one supplier can supply only to one clinic forever. Even if this is true today, it won't be true tomorrow, because you might want to plan for multiple clinics in the same location or using the same application or database.

Linking an address table to multiple other tables

I have been asked to add a new address book table to our database (SQL Server 2012).
To simplify the related part of the database, there are three tables each linked to each other in a one to many fashion: Company (has many) Products (has many) Projects and the idea is that one or many addresses will be able to exist at any one of these levels. The thinking is that in the front-end system, a user will be able to view and select specific addresses for the project they specify and more generic addresses relating to its parent product and company.
The issue now if how best to model this in the database.
I have thought of two possible ideas so far so wonder if anyone has had a similar type of relationship to model themselves and how they implemented it?
Idea one:
The new address table will additionally contain three fields: companyID, productID and projectID. These fields will be related to the relevant tables and be nullable to represent company and product level addresses. e.g. companyID 2, productID 1, projectID NULL is a product level address.
My issue with this is that I am storing the relationship information in the table so if a project is ever changed to be related to a different product, the data in this table will be incorrect. I could potentially NULL all but the level I am interested in but this will make getting parent addresses a little harder to get
Idea two:
On the address table have a typeID and a genericID. genericID could contain the IDs from the Company, Product and Project tables with the typeID determining which table it came from. I am a little stuck how to set up the necessary constraints to do this though and wonder if this is going to get tricky to deal with in the future
Many thanks,
I will suggest using Idea one and preventing Idea two.
Second Idea is called Polymorphic Association anti pattern
Objective: Reference Multiple Parents
Resulting side effect: Using dual-purpose foreign key will violating first normal form (atomic issue), loosing referential integrity
Solution: Simplify the Relationship
The simplification of the relationship could be obtained in two ways:
Having multiple null-able forging keys (idea number 1): That will be
simple and applicable if the tables(product, project,...) that using
the relation are limited. (think about when they grow up to more)
Another more generic solution will be using inheritance. Defining a
new entity as the base table for (product, project,...) to satisfy
Addressable. May naming it organization-unit be more rational. Primary key of this organization_unit table will be the primary key of (product, project,...). Other collections like Address, Image, Contract ... tables will have a relation to this base table.
It sounds like you could use Junction tables http://en.wikipedia.org/wiki/Junction_table.
They will give you the flexibility you need to maintain your foreign key restraints, as well as share addresses between levels or entities if that is desired.
One for Company_Address, Product_Address, and Project_Address

Entity Relationship Diagram. How does the IS A relationship translate into tables?

I was simply wondering, how an ISA relationship in an ER diagram would translate into tables in a database.
Would there be 3 tables? One for person, one for student, and one for Teacher?
Or would there be 2 tables? One for student, and one for teacher, with each entity having the attributes of person + their own?
Or would there be one table with all 4 attributes and some of the squares in the table being null depending on whether it was a student or teacher in the row?
NOTE: I forgot to add this, but there is full coverage for the ISA relationship, so a person must be either a studen or a teacher.
Assuming the relationship is mandatory (as you said, a person has to be a student or a teacher) and disjoint (a person is either a student or a teacher, but not both), the best solution is with 2 tables, one for students and one for teachers.
If the participation is instead optional (which is not your case, but let's put it for completeness), then the 3 tables option is the way to go, with a Person(PersonID, Name) table and then the two other tables which will reference the Person table, e.g.
Student(PersonID, GPA), with PersonID being PK and FK referencing Person(PersonID).
The 1 table option is probably not the best way here, and it will produce several records with null values (if a person is a student, the teacher-only attributes will be null and vice-versa).
If the disjointness is different, then it's a different story.
there are 4 options you can use to map this into an ER,
option 1
Person(SIN,Name)
Student(SIN,GPA)
Teacher(SIN,Salary)
option 2 Since this is a covering relationship, option 2 is not a good match.
Student(SIN,Name,GPA)
Teacher(SIN,Name,Salary)
option 3
Person(SIN,Name,GPA,Salary,Person_Type)
person type can be student/teacher
option 4
Person(SIN,Name,GPA,Salary,Student,Teacher) Student and Teacher are bool type fields, it can be yes or no,a good option for overlapping
Since the sub classes don't have much attributes, option 3 and option 4 are better to map this into an ER
This answer could have been a comment but I am putting it up here for the visibility.
I would like to address a few things that the chosen answer failed to address - and maybe elaborate a little on the consequences of the "two table" design.
The design of your database depends on the scope of your application and the type of relations and queries you want to perform. For example, if you have two types of users (student and teacher) and you have a lot of general relations that all users can part take, regardless of their type, then the two table design may end up with a lot of "duplicated" relations (like users can subscribe to different newsletters, instead of having one M2M relationship table between "users" and newsletters, you'll need two separate tables to represent that relation). This issue worsens if you have three different types of users instead of two, or if you have an extra layer of IsA in your hierarchy (part-time vs full-time students).
Another issue to consider - the types of constraints you want to implement. If your users have emails and you want to maintain a user-wide unique constraint on emails, then the implementation is trickier for a two-table design - you'll need to add an extra table for every unique constraint.
Another issue to consider is just duplications, generally. If you want to add a new common field to users, you'll need to do it multiple times. If you have unique constraints on that common field, you'll need a new table for that unique constraint too.
All of this is not to say that the two table design isn't the right solution. Depending on the type of relations, queries and features you are building, you may want to pick one design over the other, like is the case for most design decisions.
It depends entirely on the nature of the relationships.
IF the relationship between a Person and a Student is 1 to N (one to many), then the correct way would be to create a foreign key relationship, where the Student has a foreign key back to the Person's ID Primary Key Column. Same thing for the Person to Teacher relationship.
However, if the relationship is M to N (many to many), then you would want to create a separate table containing those relationships.
Assuming your ERD uses 1 to N relationships, your table structure ought to look something like this:
CREATE TABLE Person
(
sin bigint,
name text,
PRIMARY KEY (sin)
);
CREATE TABLE Student
(
GPA float,
fk_sin bigint,
FOREIGN KEY (fk_sin) REFERENCES Person(sin)
);
and follow the same example for the Teacher table. This approach will get you to 3rd Normal Form most of the time.

a layman's term for identifying relationship

There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation

Resources