DBMS: ER Diagram - One to many relationship with descriptive attribute - database

I have the following ERD:
ER Diagram
I need to convert this into relational schema but I have some doubts on where the attribute "since" would belong. I know how 1-M relationships work but I am confused by the attribute since. Does a new table Company-Employee need to be created where it would hold the Company id, emp id and since?
Please let me know thank you.

As you have diagrammed it, the Since attribute describes the relationship, not the employee or the company. There's nothing wrong with that, although some experts maintain that attributes must describe entities and not relationships.
When you go to design a relational schema, both entities and relationships become relations (or, as I prefer, "tables"). Your only question seems to be whether there has to be a junction table between the Employee table and the Company table.
You can get away with just two tables. You can stuff both the CompanyId and the Since columns into the Employee table, because they will be single valued. There may be other considerations that lead you to decompose the two tables, but you don't have to do it just to represent the data.

Related

Relational Database: When do we need to add more entities?

We had a discussion today related to W3 lecture case study about how many entities we need for each situation. And I have some confusion as below:
Case 1) An employee is assigned to be a member of a team. A team with more than 5 members will have a team leader. The members of the team elect the team leader. List the entity(s) which you can identify in the above statement? In this cases, if we don't create 2 entities for above requirement, we need to add two more attributes for each employee which can lead to anomaly issues later. Therefore, we need to have 2 entities as below:
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK teamId&employeeId) -> 2 entities
Case 2) The company also introduced a mentoring program, whereby a new employee will be paired with someone who has been in the company longer." How many entity/ies do you need to model the mentoring program?
The Answer from Lecturer is 1. With that, we have to add 2 more attributes for each Employee, mentorRole (Mentor or Mentee) and pairNo (to distinguish between different pairs and to know who mentors whom), doesn't it?
My question is why can't we create a new Entity named MENTORING which will be similar to TEAM in Q1? And why we can only do that if this is a many-many relationship?
EMPLOYEE (PK is employeeId) (0-M)----------------(0-1) TEAM (PK is pairNo&employeeId) -> 2 entities
Thank you in advance
First of all, about terminology: I use entity to mean an individual person, thing or event. You and I are two distinct entities, but since we're both members of StackOverflow, we're part of the same entity set. Entity sets are contrasted with value sets in the ER model, while the relational model has no such distinction.
While you're right about the number of entity sets, there's some issues with your implementation. TEAM's PK shouldn't be teamId, employeeId, it should be only teamId. The EMPLOYEE table should have a teamId foreign key (not part of the PK) to indicate team membership. The employeeId column in the TEAM table could be used to represent the team leader and is dependent on the teamId (since each team can have only one leader at most).
With only one entity set, we would probably represent team membership and leadership as:
EMPLOYEE(employeeId PK, team, leader)
where team is some team name or number which has to be the same for team members, and leader is a true/false column to indicate whether the employee in that row is the leader of his/her team. A problem with this model is that we can't ensure that a team has only one leader.
Again, there's some issues with the implementation. I don't see the need to identify pairs apart from the employees involved, and having a mentorRole (mentor or mentee) indicates that the association will be recorded for both mentor and mentee. This is redundant and creates an opportunity for inconsistency. If the goal was to represent a one-to-one relationship, there are better ways. I suggest a separate table MENTORING(menteeEmployeeId PK, mentorEmployeeId UQ) (or possibly a unique but nullable mentorEmployeeId in the EMPLOYEE table, depending on how your DBMS handles nulls in unique indexes).
The difference between the two cases is that teams can have any number of members and one leader, which is most effectively implemented by identifying teams separately from employees, whereas mentorship is a simpler association that is sufficiently identified by either of the two people involved (provided you consistently use the same role as identifier). You could create a separate entity set for mentoring, with relationships to the employees involved - it might look like my MENTORING table but with an additional surrogate key as PK, but there's no need for the extra identifier.
And why we can only do that if this is a many-many relationship?
What do you mean? Your examples don't contain a many-to-many relationship and we don't create additional entity sets for many-to-many relationships. If you're thinking of so-called "bridge" tables, you've got some concepts mixed up. Entity sets aren't tables. An entity set is a set of values, a table represents a relation over one or more sets of values. In Chen's original method, all relationships were represented in separate tables. It's just that we've gotten used to denormalizing simple one-to-one and one-to-many relationships into the same tables as entity attributes, but we can't do the same for many-to-many binary relationships or ternary and higher relationships in general.

foreign key or null value to link two long distanced tables

Consider the following database example :
A clinichas many articles
The relationship between supplier and article is many to many ( (1,n) - (1,n))
Let's say I have the clinic's id and I want to retrieve all it's suppliers, what's the best way to do it? is it by creating a "null" article for each supplier in article_supplier OR by creating a foreign key in supplier that references the appropriate clinic?
The latter solution may seem the simplest and easiest but what happens when there is a big chain of tables? Do I keep adding a foreign key each time I need a list of something? e.g :
list of medicines a clinic uses
list of prescriptions a doctor gave
...
If it makes a difference, I am using Laravel's Eloquent ORM
A table represents a relationship among values. (Some values identify entities.) A database can't be used until we are told what each table--base or query result--means: what business/application relationship its rows satisfy. (Ie, what is its (characteristic) predicate.) What are the relationships for your tables?
To query we express a relationship in terms of base relationships then express its table in terms of the corresponding base tables. Eg a join returns rows satisfying one relationship and another. So again, we need to know tables' relationships in business/application terms.
Cardinalities & constraints are properties of relationships given what situations can arise. They aren't needed to update or query. They can guide design & are used for integrity.
When, given a clinic, you talk about "its" suppliers, you do not say what you mean. "Has", "its", "for", "references", "appropriate"--all mean nothing--they refer to related entities, but they don't say how they are related in terms of the business/application. This design contains no explicit relationship on clinics & articles. If it did, you've said nothing by which we could put the right rows in or see the rows & know about the situation. Still, you could then derive clinic-supplier rows where the supplier is "for" some article a given clinic "has". But is that the relationship you mean?? Eg if you want pairs where the clinic is allowed to "have" the supplier "for" some articles, that's a new relationship/table that cannot be derived from what you have.
creating a "null" article for each supplier in article_supplier
That relationship/table is a certain combination of article_supplier & the relationship/table just described. But it is simpler to just have those two.
creating a foreign key in supplier that references the appropriate clinic
That would mean that if a supplier "has" more than one clinic then there can be rows in the new version that differ only by all the other columns; normalization theory says that's worse than the original design and a clinic_supplier relationship/table. And it means that if a supplier "has" no clinic you would need something like a nullable FK (foreign key).
So you likely want a clinic_supplier table. But you should post a new question in which you actually say what business/application relationships you are talking about.
Your question & the following are essentially duplicates in that the basic principles/notions to obviously apply to answer them are all the same:
How do I find relations between tables that are long-distance related?
Required to join 2 tables with their FKs in a 3rd table
Best Solution - Ternary or Binary Relationship
You need to read an information modeling & database design textbook.
In your design :
Supplier is a table.
Articles is a table.
Since Supplier supplies articles, their relationship is expressed in terms of a table that links them. So, there is a supplier_article table.
Clinic is a table.
You mention Clinic uses articles.
By the same principle as above, the relationship between Clinic and Article should be expressed in the form of a table. This should be clinic_article table.
Though you haven't mentioned, you must look at other dependencies that arise. For example, a clinic performs procedures and procedures use articles.
by creating a foreign key in supplier that references the appropriate
clinic
This would assume that one supplier can supply only to one clinic forever. Even if this is true today, it won't be true tomorrow, because you might want to plan for multiple clinics in the same location or using the same application or database.

Do all relational database designs require a junction or associative table for many-to-many relationship?

I'm new to databases and trying to understand why a junction or association table is needed when creating a many-to-many relationship.
Most of what I'm finding on Stackoverflow and elsewhere describe it in either highly technical relational theory terms or it's just described as 'that's the way it's done' without qualifying why.
Are there any relational database designs out there that support having a many-to-many relationship without the use of an association table? Why is it not possible to have, for example, a column on on table that holds the relationships to another and vice a versa.
For example, a Course table that holds a list of courses and a Student table that holds a bunch of student info — each course can have many students and each student can take many classes.
Why is it not possible to have a column on each row in either table (possibly in csv format) that contains the relationships to the others in a list or something similar?
In a relational database, no column holds more than a single value in each row. Therefore, you would never store data in a "CSV format" -- or any other multiple value system -- in a single column in a relational database. Making repeated columns that hold instances of the same item (Course1, Course2, Course3, etc) is also not allowed. This is the very first rule of relational database design and is referred to as First Normal Form.
There are very good reasons for the existence of these rules (it is enormously easier to verify, constrain, and query the data) but whether or not you believe in the benefits the rules are, none-the-less, part of the definition of relational databases.
I do not know the answer to your question, but I can answer a similar question: Why do we use a junction table for many-to-many relationships in databases?
First, if the student table keeps track of which courses the student is in and the course keeps track of which students are in it, then we have duplication. This can lead to problems. What if a student knows it is in a course, but the course doesn't know that it has that student. Every time you made a course change you would have to make sure to change it in both tables. Inevitably this will not happen every time and the data will become inconsistent.
Second, where would we store this information? A list is not a possible type for a field in a database. So do we put a course column in the student table? No, because that would only allow each student to take one course, a many-to-one relationship from students to courses. Do we put a student column in the courses table? No, because then we have one student in each course.
What does work is having a new table that has one student and one course per row. This tells us that a student is in a class without duplicating any data.
"Junction tables" come from ER/ORM presentations/methods/products that don't really understand the relational model.
In the relational model (and in original ER information modeling) application relationships are represented by relations/tables. Each table holds tuples of values that are in that relationship to each other, ie that are so related, ie that satisfy that relationship, ie that participate in the relationship.
A relationship is expressed independently of any particular situation as a predicate, a fill-in-the-(named-)blanks statement. Rows that fill in the named blanks to give a true statement from the predicate in a particular situation go in the table. We pick sufficient predicates (hence base tables) to describe every situation. Both many-to-1 and many-to-many application relationships get tables.
The reason why you don't see a lot of many-to-many relationships along with columns about the participants rather than about their participation in the relationship is that such tables are better split into ones about the participants and one for the relationship. Eg columns in a many-to-many table that are about participants 1. can't say anything about entities that don't participate and 2. say the same thing about an entity every time it participates. Information modeling techniques that focus on identifying independent entity types first then relationships between them tend to lead to designs with few such problems. The reason why you don't see many-to-many relationships in two tables is that that is redundant and susceptible to the error of the tables disagreeing. The problem with collection-valued columns (sequences/lists/arrays) is that you cannot generically query about their parts using usual query notation and implementation because the DBMS doesn't see the parts organized into a table.
See this recent answer or this one.

Differentiation between 'Entity' and 'Table'

Can someone tell me the easy way to explain the differentiation between an entity and a table in database?
Entity is a logical concept of relational database model. And table is used to express it, but there is a slight difference. Table expresses not only entities, but also relations.
For example, assume that you want to make a database of projects and employees of a company. Entity is a unit of information that has meanings by itself. In this case, there will be two entities - "Project" and "Employee". Each entity has its own attributes.
In relational DB model, there is another idea, 'relation'. If employees participate in several projects, then we can say that there is a relation with a name 'works_on'.
Sometimes, relation can have its own attribute. In this case, 'works_on' relation can have attribute 'start_date' and so on. And if this relation is M:N relation(Like this case: In project 1, there are 5 employees. Employee A works on two projects.), then you have to make an extra table to express this relation. (If you don't make an extra table when the relation is M:N, then you have to insert too many duplicated rows into both 'Project' and 'Employee' table.)
CREATE TABLE works_on(
employee,
project_id,
start_date
)
An entity resides in a table, it is a single set of information, i.e: if you have a database of employees, then an employee is an entity. A table is a group of fields with certain parameters.
Basically everything is stored in a table, entities goes into tables.
In a relational database the concept is the same. An entity is a table.
In OOP (Oriented Object Programming) there is a nice article in Oracle docs:
In general terms, entity objects encapsulate the business policy and
data for
the logical structure of the business, such as product lines,
departments, sales, and regions
business documents, such as invoices, change orders, and service
requests
physical items, such as warehouses, employees, and equipment
Another way of looking at it is that an entity object stores the
business logic and column information for a database table (or view,
synonym, or snapshot). An entity object caches data from a database
and provides an object-oriented representation of it.
Depending on how you want to work, you can create entity objects from
existing database tables (reverse generation) or define entity objects
and use them to create database tables (forward generation).
There is little difference between an entity and a table, but first we should define the concepts of “tuple” and “attribute”. I want to express those concepts with a table.
Here is an example of a table. As you can see it has some columns. Each column represents an attribute and each line represents a tuple(row). This is the employee table. In this example, the table is being used for representing an entity which is employee. However, if we create a new table named superior to specify the superiority relationship between those employees, it would be a table that represents a relation. In summary, we can use tables for representing both the entities and relations so entities and tables are not the same.

a layman's term for identifying relationship

There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation

Resources