I have to create an ER diagram based on a relational schema.
There is a table of players, and a table of zones. A player can 'live' in many zones, and each zone is owned by one or more players.
I've come up with this simple ER diagram but I'm not sure having relationships going each way is allowed?
Cheers
Yes, that is a perfectly good Entity Relation Diagram. (I am not responding as to whether it makes sense or not: you still need to resolve the Relations and Cardinality.)
Using the correct terms helps people understand exactly what you are discussing, and which level you are discussing. Loose talk results in much more volume in the discussion, and time wasted in clarifying what you meant by which term. Not good for productive technical endeavours.
At this early stage, it is normal to model Entities and Relations (not Attributes), that's why it is called an ER diagram; we are nowhere near modelling the data. The Relations are relevant, and that's why you are detailing and evaluating their nature in the diamonds and Cardinality. The goal is to clarify the true Entities, and their Relations to each other. Many-to-many relations remain as relations. The ERD is purely Logical, there is no Physical.
Once you have some confidence with that, that you have gotten the Entities and Relations right, you move onto a Data Model (which includes Attributes). Still at a Logical level, the n::n relations remain as relations.
As you progress, you may show further detail, such as Domain for each Attribute. That's the DataType, but at the Logical level, just as the terms are Entity = Table and Attribute = Column, Domain = DataType.
.
When you get to the Physical level, the Data Model has Tables; Columns; DataTypes.
And n::n Relations are manifested as the Associative Tables.
.
The idea is, as long as you are working through the prescribed steps, at (1), the content in the diamonds will determine (expose) if they need to be stored, and the diamond is thus promoted to an Entity; otherwise it remains a Relation.
There is a junction table called lives-in in the relational schema I've been given. However, I thought when mapping a relational schema [back] to an ER diagram a junction table becomes a relationship?
The Relational term is Associative table.
Yes. If it is a pure n::n Table (containing nothing but the two FKs to the PKs of the parent Tables), at the ERD level, which is Logical only, it is a Relation.
If it has Columns other than the two FKs, it is an Entity.
Since there's a many-to-many relationship between [Players] and [Zones] you have to add a junction table (called for ex. [PlayersZones]). The notation itself is correct (Chen notation), though I prefer the Crow's Foot Notation.
I am not able to see your images (blocked!) so I'll just try to describe the "correct" design. If a player living in a zone doesn't necessarily mean they own it, you should have four tables:
PLAYER (playerid, <other fields>)
ZONE (zoneid, <other fields>
PLAYER_ZONE(playerid, lives_in_zoneid)
ZONE_OWNER (zoneid, owner_playerid)
Otherwise three tables would suffice.
Related
Before starting to develop anything, I made a draft of a ER diagram using Chen's notation, when developing the web page I had to change some stuff on the database, and at the end I got this ER diagram:
Between "booking" and "staff" tables I have a relational table "assign".
But between "booking" and "parts" I have this "cost" which I believe its the relational one, but I dont know how to represent it on Chen's notation diagram. Can someone give me a help? :)
Thanks.
In this physical ERD diagram, which shows how primary and foreign keys are used to implement the relationships.
From the conceptual point of view:
assign implements a many-to-many relation between staff and booking;
cost implements a many-to-many relation between booking and part that provides additional information about the combination between booking and parts.
In Chen's notation:
assign would be represented with a simple losange for relationship and a cardinality M, N.
cost would also be representad with a many-to-many relationship losange. But in addition, you'd show the relationship's attributes (e.g. quantity, cost, description) as additional ellipses connected to the relationship
You could also consider to use associative entities instead of realationships, especially for cost. It has the advantage of suggesting that there's a table behind. But it's not required in your model, unless cost could have relationships with other tables (which would be easy since there's a cost_id)
I am working on a practice questions for ERD, and I was wondering what the correct approach is for modelling either or relationships.
For example, in a Taekwondo school, you will have customer accounts, which will represent and pay for one or many students. The account is owned by either a parent, or a the student himself. Therefore the account owner is either a parent or a student. What is the best way to represent a relationship like this?
Here is what I came up with, but I am unsure if this conforms to best practice:
1 Clarification
Representing an either-or relationship in Crows foot ERD
The diagram you have is a good start. Note:
that is not ERD. That is way more detail than an ERD can handle
ERD does not have a Crows Foot, that is IEEE notation
Ultimately, you need a data model that has the detail required for an implementation (way more than ERD). That is why I said your diagram is a good start, it is moving in that direction. However, we have a Standard for Relational Data Modelling: IDEF1X, the Standard for modelling Relational databases since 1993, available since 1984 before it was elevated to a standard.
Evidently both Dr E F Codd's Relational Model, and the diagrammatic method for modelling Relational databases is suppressed.
The relationship symbol, especially the cardinality, in IEEE notation is better (more easily understood) than IDEF1X, therefore most people use that. All data modelling tools, such as ERwin, implement IDEF1X, and allow either IDEF1X or IEEE notation for relationships.
2 Request
The diagram as intended is illegal. Why ? Because you have one relationship going "out" of Person, to two tables. Not possible. You are asking how to represent such a relationship in a data model (not possible in ERD). The answer is, that is an OR Gate is logical terms, a Subtype in Relational terms.
Please inspect these answers for overview and detail. Follow the links for implementation details and code:
How can I relate a primary key field to multiple tables?
Structuring database relationships for tracking different variations of app settings
How do I get around this relational database design smell?
Subtypes can be:
Exclusive (the Basetype must be one of the Subtypes), or
Non-Exclusive (the Basetype must be any [more than one] of the Subtypes).
From Role it appears to be Exclusive. What you call Role is a Discriminator in IDEF1X.
That is best practice for Relational databases.
Relational Data Model
This is best practice for for data models (this level of detail shows attribute name only).
Of course, all my data models are rendered in IDEF1X.
My IDEF1X Introduction is essential reading for beginners.
ParentId, StudentId, OwnerId are all RoleNames (Relational term)of PersonId. This makes the context of the FK explicit.
3 Correction
but I am unsure if this conforms to best practice
Since you are concerned, there is one other issue. There is a mistake in your model, it is one of the common errors that happen when one stamps id on every file. Such a practice cripples the modelling exercise, and makes it prone to various errors. (I understand that you are taught that crippling method.)
Since a Person can have 0-or-1 Account, and the Person PK (which is unique to a Person), is a FK in Account, it can be the PK in Account.
AccountId is not necessary: it is 100% redundant, one additional field and one additional index, that can be eliminated.
In ER diagrams, is it possible to relate two weak entities each other? If possible, how can uniquely identify records in them?
It's certainly possible. Consider the following ER diagram in which invoices are composed of lines, and receipts are decomposed into corresponding lines which are allocated to invoice lines. Multiple receipt lines can be allocated to the same InvoiceLine. It's perhaps a bit contrived but it'll serve as an example.
The InvoiceLine entity set is identified by (InvoiceNumber, LineNumber). Similarly, the ReceiptLine entity set is identified by (ReceiptNumber, LineNumber).
The determinant of a relationship between any entity sets is a combination of the determinants of the entity sets in many-roles. It doesn't matter whether the entity sets are weak or regular, or whether you have two or more entity sets involved in the relationship. In the case of 1:1 (or 1:1:1, etc) relationship, any of the entity sets involved can be used as a determinant.
In our example, ReceiptLine is the only entity set in a many-role (indicated by an N next to the Paid relationship diamond). This means the relationship is determined by the determinant of ReceiptLine, which is (ReceiptNumber, LineNumber).
If we translate our ER diagram to a tabular model, we get the following:
I translated it directly to help you see the correspondence between the diagrams, but in practice we could denormalize the Paid relationship relation into the ReceiptLine entity relation for a simpler physical model. That can only be done for relationships with a single determining entity set, so it's important that you understand the general approach first.
I'm new to databases and trying to understand why a junction or association table is needed when creating a many-to-many relationship.
Most of what I'm finding on Stackoverflow and elsewhere describe it in either highly technical relational theory terms or it's just described as 'that's the way it's done' without qualifying why.
Are there any relational database designs out there that support having a many-to-many relationship without the use of an association table? Why is it not possible to have, for example, a column on on table that holds the relationships to another and vice a versa.
For example, a Course table that holds a list of courses and a Student table that holds a bunch of student info — each course can have many students and each student can take many classes.
Why is it not possible to have a column on each row in either table (possibly in csv format) that contains the relationships to the others in a list or something similar?
In a relational database, no column holds more than a single value in each row. Therefore, you would never store data in a "CSV format" -- or any other multiple value system -- in a single column in a relational database. Making repeated columns that hold instances of the same item (Course1, Course2, Course3, etc) is also not allowed. This is the very first rule of relational database design and is referred to as First Normal Form.
There are very good reasons for the existence of these rules (it is enormously easier to verify, constrain, and query the data) but whether or not you believe in the benefits the rules are, none-the-less, part of the definition of relational databases.
I do not know the answer to your question, but I can answer a similar question: Why do we use a junction table for many-to-many relationships in databases?
First, if the student table keeps track of which courses the student is in and the course keeps track of which students are in it, then we have duplication. This can lead to problems. What if a student knows it is in a course, but the course doesn't know that it has that student. Every time you made a course change you would have to make sure to change it in both tables. Inevitably this will not happen every time and the data will become inconsistent.
Second, where would we store this information? A list is not a possible type for a field in a database. So do we put a course column in the student table? No, because that would only allow each student to take one course, a many-to-one relationship from students to courses. Do we put a student column in the courses table? No, because then we have one student in each course.
What does work is having a new table that has one student and one course per row. This tells us that a student is in a class without duplicating any data.
"Junction tables" come from ER/ORM presentations/methods/products that don't really understand the relational model.
In the relational model (and in original ER information modeling) application relationships are represented by relations/tables. Each table holds tuples of values that are in that relationship to each other, ie that are so related, ie that satisfy that relationship, ie that participate in the relationship.
A relationship is expressed independently of any particular situation as a predicate, a fill-in-the-(named-)blanks statement. Rows that fill in the named blanks to give a true statement from the predicate in a particular situation go in the table. We pick sufficient predicates (hence base tables) to describe every situation. Both many-to-1 and many-to-many application relationships get tables.
The reason why you don't see a lot of many-to-many relationships along with columns about the participants rather than about their participation in the relationship is that such tables are better split into ones about the participants and one for the relationship. Eg columns in a many-to-many table that are about participants 1. can't say anything about entities that don't participate and 2. say the same thing about an entity every time it participates. Information modeling techniques that focus on identifying independent entity types first then relationships between them tend to lead to designs with few such problems. The reason why you don't see many-to-many relationships in two tables is that that is redundant and susceptible to the error of the tables disagreeing. The problem with collection-valued columns (sequences/lists/arrays) is that you cannot generically query about their parts using usual query notation and implementation because the DBMS doesn't see the parts organized into a table.
See this recent answer or this one.
There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation