Representing an either-or relationship in Crows foot ERD - database

I am working on a practice question for ERD, and I was wondering what the correct approach is for modelling either-or relationships.
For example, in a Taekwondo school, you will have customer accounts, which will represent and pay for one or many students. The account is owned by either a parent or the student himself. Therefore the account owner is either a parent or a student. What is the best way to represent a relationship like this?
Here is what I came up with, but I am unsure if this conforms to best practice:

1 Clarification
Representing an either-or relationship in Crows foot ERD
The diagram you have is a good start. Note:
that is not ERD. That is way more detail than an ERD can handle
ERD does not have a Crows Foot, that is IEEE notation
Ultimately, you need a data model that has the detail required for an implementation (way more than ERD). That is why I said your diagram is a good start, it is moving in that direction. However, we have a Standard for Relational Data Modelling: IDEF1X, the Standard for modelling Relational databases since 1993, available since 1984 before it was elevated to a standard.
Evidently both Dr E F Codd's Relational Model, and the diagrammatic method for modelling Relational databases, are suppressed.
The relationship symbol, especially the cardinality, in IEEE notation is better (more easily understood) than IDEF1X, therefore most people use that. All data modelling tools, such as ERwin, implement IDEF1X, and allow either IDEF1X or IEEE notation for relationships.
2 Request
The diagram as intended is illegal. Why ? Because you have one relationship going "out" of Person, to two tables. Not possible. You are asking how to represent such a relationship in a data model (not possible in ERD). The answer is, that is an OR Gate in logical terms, a Subtype in Relational terms.
Please inspect these answers for overview and detail. Follow the links for implementation details and code:
How can I relate a primary key field to multiple tables?
Structuring database relationships for tracking different variations of app settings
How do I get around this relational database design smell?
Subtypes can be:
Exclusive (the Basetype must be one of the Subtypes), or
Non-Exclusive (the Basetype must be any [more than one] of the Subtypes).
From Role it appears to be Exclusive. What you call Role is a Discriminator in IDEF1X.
That is best practice for Relational databases.
Relational Data Model
This is best practice for data models (this level of detail shows attribute name only).
Of course, all my data models are rendered in IDEF1X.
My IDEF1X Introduction is essential reading for beginners.
ParentId, StudentId, OwnerId are all RoleNames (Relational term) of PersonId. This makes the context of the FK explicit.
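To make the Exclusive Subtype concrete, here is a minimal sketch using Python's sqlite3 module. It is not the implementation from the linked answers: it uses one common technique (a composite FK that carries the Discriminator into each Subtype), and the column list, the Role codes 'P' and 'S', and the helper names are assumptions for illustration. It only guarantees that a Person cannot appear in the wrong Subtype; it does not force a Subtype row to exist.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    """Hypothetical Person basetype with exclusive Parent/Student subtypes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript("""
        CREATE TABLE Person (
            PersonId INTEGER PRIMARY KEY,
            Name     TEXT NOT NULL,
            Role     TEXT NOT NULL CHECK (Role IN ('P', 'S')),  -- Discriminator
            UNIQUE (PersonId, Role)           -- target of the subtype FKs
        );
        CREATE TABLE Parent (
            ParentId INTEGER PRIMARY KEY,     -- RoleName of PersonId
            Role     TEXT NOT NULL CHECK (Role = 'P'),
            FOREIGN KEY (ParentId, Role) REFERENCES Person (PersonId, Role)
        );
        CREATE TABLE Student (
            StudentId INTEGER PRIMARY KEY,    -- RoleName of PersonId
            Role      TEXT NOT NULL CHECK (Role = 'S'),
            FOREIGN KEY (StudentId, Role) REFERENCES Person (PersonId, Role)
        );
    """)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_schema()
conn.execute("INSERT INTO Person (PersonId, Name, Role) VALUES (1, 'Kim', 'P')")

# Person 1 is declared a Parent, so the Parent row is accepted ...
parent_ok = try_insert(conn, "INSERT INTO Parent (ParentId, Role) VALUES (1, 'P')")
# ... and a Student row for the same Person is rejected by the composite FK.
student_ok = try_insert(conn, "INSERT INTO Student (StudentId, Role) VALUES (1, 'S')")
```

The Discriminator is repeated in each Subtype only so that the FK can check it; a platform that supports function-based CHECK constraints can avoid that extra column.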
3 Correction
but I am unsure if this conforms to best practice
Since you are concerned, there is one other issue. There is a mistake in your model, it is one of the common errors that happen when one stamps id on every file. Such a practice cripples the modelling exercise, and makes it prone to various errors. (I understand that you are taught that crippling method.)
Since a Person can have 0-or-1 Account, and the Person PK (which is unique to a Person), is a FK in Account, it can be the PK in Account.
AccountId is not necessary: it is 100% redundant, one additional field and one additional index, that can be eliminated.
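A small sketch of that point, again with Python's sqlite3 module. The table layout and the Balance column are assumptions for illustration, not taken from the diagram. OwnerId is the Person PK migrated into Account, where it serves as both FK and PK, so no separate AccountId is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Person (
        PersonId INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL
    );
    CREATE TABLE Account (
        -- OwnerId is a RoleName of PersonId: FK to Person and PK of Account
        OwnerId INTEGER PRIMARY KEY REFERENCES Person (PersonId),
        Balance INTEGER NOT NULL DEFAULT 0   -- hypothetical attribute
    );
""")


def try_insert(sql: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO Person (PersonId, Name) VALUES (1, 'Kim')")

first_account = try_insert("INSERT INTO Account (OwnerId) VALUES (1)")
# The PK rejects a second Account for the same Person (0-or-1 Account per Person).
second_account = try_insert("INSERT INTO Account (OwnerId) VALUES (1)")
# The FK rejects an Account whose owner does not exist.
orphan_account = try_insert("INSERT INTO Account (OwnerId) VALUES (2)")
```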

Related

Supertype/subtype Notation for ERD

This is more of a notation and 'proper procedure' type of question than anything.
Please see below an image of a few relations in my Enhanced ERD logical model. A patient can be an OUTPATIENT or a RESIDENT, but there are no attributes which are specific to OUTPATIENTS or RESIDENTS. There are relationships which are specific to the subtypes though, as only OUTPATIENTS can be associated with visits and only RESIDENTs can be associated with beds.
I am in the process of converting this to a physical data model. Obviously it makes sense to not have OUTPATIENT or RESIDENT tables and only a PATIENT table which contains a discriminator for the type of patient.
But what is the proper way to model this?
How do I now model the relationships to VISITS and BEDS while still maintaining the constraint that the discriminator must be of a certain value to qualify for those relationships?
Do I just forget about representing this constraint in the physical data model and make sure it's implemented in the code when the tables are created?
Or is there a notation for physical data models which represents this type of constraint?
Section of CareCenter schema in Extended ERD
I have done much searching and cannot seem to find anything about this. All of the material I have found talks about creating subtypes for the purpose of isolating attributes specific to a subtype and not relationships specific to a subtype.
Advice or reference to data you have found that I was not able to is greatly appreciated!
(If you are really trying to make sense of my section of EERD it may be helpful to know that PATIENT is a subtype of a PERSON supertype.)
1  Modelling & Notation
1.1  ERD
is pre-Relational, 1960’s.  It cannot handle Relational Keys, which means it is hopeless for Relational Data Modelling.  In the Relational paradigm, the Relational Key (which is composite) is central, therefore the identity of each entity cannot be analysed, or modelled, or defined, in ERD.
There is no definition in ERD for the Relational concepts of Independent/Dependent tables, or Identifying/Non-Identifying relations, as it is meaningless without a Relational Key, which leads to much confusion when extending ERD and attempting to add those.  Further, as you have found, it has no notation of Domain/Datatype; Subtype; etc.
ERD never was a Standard.  Since it is un-useable, each person who attempts to use it for an SQL implementation has to “extend” ERD, and that results in a million notations, all of which are different and incomplete.  And which have to be explained to the reader.  Whereas a Standard needs no explanation because it is complete and documented, once.
Technically, ERD is not a model (which implies a mathematical, logical basis).  The semantics are primitive and nowhere near complete.  In fact, it is hopeless for modelling, period, even for pre-Relational filing systems.
1.2  IDEF1X
is the Standard for Relational Data Modelling, available since the 1980's, a Standard since 1993.  As such it is complete, whereas an extended ERD will never be complete, no matter how much you extend it.
The academics and authors of "textbooks" are clueless: as evidenced, they are 50 years behind the industry (definition) and 40 years behind (implementation on SQL platforms).  They are stuck in 1960's Record Filing Systems, which is physical, characterised by a RecordID, and they market it as "relational".  
Whereas Codd's Relational Model is completely logical, with a mathematical foundation, and provides far more Integrity; Power; and Speed.
To use ERD at all, you have to extend it, using some private notation, as you have done.  Instead of moving incrementally and painfully in the direction of IDEF1X, I suggest you just switch to it, and obtain the full benefit.  You may find this IDEF1X Introduction useful.
1.3  Logical vs Physical Data Model
There is a lot of nonsense written about the distinction.
The Logical model simply progresses, in iterations, to the point where it is stable, and then it is the Physical, which can be implemented on a specific SQL platform.  That is, there is no “convert” process.
In good Data Modelling tools, such as ERwin, it is one file, not two or three, and the Logical vs Physical is simply different views of that one file.  Eg. Domain in the Logical is DataType in the Physical. The Physical is of course specific to the target platform, eg. BOOLEAN in one is BIT in another.  If you are not using a Data Modelling tool, or using a poor one, sure, you will have separate files and you have to deal with the attendant synchronisation problems.
But what is the proper way to model this? How do I now model the relationships to visits and beds while still maintaining the constraint that the discriminator must be of a certain value to qualify for those relationships?
In this regard, the question is not about Logical vs Physical DM, all aspects re the question are implemented in both.
Yes, it is about notation. There is no notation problem, or difference (Logical vs Physical) in IDEF1X, because it is complete.
Do I just forget about representing this constraint in the physical data model
No, they are drawn in both, they are implemented in the DDL.
and make sure it's implemented in the code when the tables are created?
If you use a Data Modelling tool, it squirts out SQL that is specific to the target platform. Otherwise, sure, you have to write your own DDL and make sure it is correct. In any case, the SQL is the same (not counting the difference in SQL flavours).
Caveat.  The pretend SQLs (all freeware “sqls” and Oracle) are not SQL compliant, their use of the term is not correct.  They cannot implement ordinary SQL features such as Constraints for Subtypes or ACID Transactions; etc.
Or is there a notation for physical data models which represents this type of constraint?
No, there is no difference in the notation in IDEF1X. Your question appears to be due to your extensions to ERD. First, the ERD is not useable for Relational data modelling, and cannot cope with Relational Keys or Subtypes.  Second, your extensions, good as they may be, do not have the ordinary Relational notation that IDEF1X has. Again, just switch to IDEF1X.
2  Codd’s Relational Model
As distinct from the variety of primitive nonsense written by the academics and in textbooks, misleadingly marketed as “relational”.
2.1  Subtype
I have done much searching and cannot seem to find anything about this. All of the material I have found talks about creating subtypes for the purpose of isolating attributes specific to a subtype and not relationships specific to a subtype.
There is no problem at all with a Subtype that has no attributes, same as there is no problem at all with a row that has no attributes.  Keep in mind that each entity is a Fact (one fact in one place), and the Fact is established by the Relational Key, to which the attributes are quite secondary (Codd’s 3NF properly understood).  Thus Resident and OutPatient are discrete Facts, whether each Subtype has attributes or not; whether the Fact exists for supporting a Foreign Key or not, is a separate issue.
Advice or reference to data you have found that I was not able to is greatly appreciated
You may find this Subtype document useful.  For examples, go to my profile, and look up any answers that interest you.
If you require even further detail, there is a long discourse regarding Subtypes and notation, that I had with the single academic who is trying to cross the great chasm between academia and reality in this field, who recently "found" IDEF1X from my data models.  I use a corrected form of IDEF1X (it was written by an academic), using the pre-existing IEEE notation when it is more precise.  The discourse goes into the whys and wherefores of the original IDEF1X vs the corrected form.  It is long at 70 posts, and there is a document that summarises it. Just ask.
Obviously it makes sense to not have OUTPATIENT or RESIDENT tables and only a PATIENT table which contains a discriminator for the type of patient.
No.  Each Subtype is a separate table, in the Logical models (first) and Physical (last), and the DDL. The physical is merely the implementation level of the Logical, you should not have anything in the Physical that is not in the Logical (you do not want to implement a thing that is not logical, not semantic; not Relational (which is absolutely logical, and unlimited)).
Consider that the database may be expanded in the future, and you may have attributes in the Subtypes. 
- If the cluster is Exclusive, the Basetype table must have a Discriminator. 
- If it is Non-Exclusive, there is no Discriminator.
Supertype means something quite different, the academics use terms loosely and incorrectly. Eg. the notion of Superkey is hysterical, and anti-Relational.
2.2  Data Model
Here is the logical model in IDEF1X notation, showing attributes, not domains.  
I have corrected a few errors: given the level of modelling that you have demonstrated, I don't think they need a full explanation.
Person Subtype is Non-Exclusive (no Discriminator)
Patient Subtype is Exclusive (needs a Discriminator)
That is to be used in your code to determine the Subtype, otherwise JOIN to the Subtype
Since Resident::Bed is 1::1, the attributes (Bed FK) can be located in Resident.  
This treatment ensures that the Bed that a Patient may be assigned to, exists.
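As a rough illustration of how a relationship can be specific to one Subtype, here is a sketch in Python's sqlite3 module. It is an assumption-laden approximation of the model, not the DDL a modelling tool would generate: the names PatientType and BedNo, the codes 'R' and 'O', and the composite-FK technique for the Discriminator are all mine. The Bed FK lives only in Resident, so an OutPatient has no way to be assigned a Bed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Bed (
        BedNo INTEGER PRIMARY KEY
    );
    CREATE TABLE Patient (
        PatientId   INTEGER PRIMARY KEY,
        PatientType TEXT NOT NULL CHECK (PatientType IN ('R', 'O')),  -- Discriminator
        UNIQUE (PatientId, PatientType)
    );
    CREATE TABLE Resident (
        PatientId   INTEGER PRIMARY KEY,
        PatientType TEXT NOT NULL CHECK (PatientType = 'R'),
        BedNo       INTEGER NOT NULL REFERENCES Bed (BedNo),  -- the Bed must exist
        FOREIGN KEY (PatientId, PatientType)
            REFERENCES Patient (PatientId, PatientType)
    );
    CREATE TABLE OutPatient (                 -- a Subtype with no attributes
        PatientId   INTEGER PRIMARY KEY,
        PatientType TEXT NOT NULL CHECK (PatientType = 'O'),
        FOREIGN KEY (PatientId, PatientType)
            REFERENCES Patient (PatientId, PatientType)
    );
""")


def try_insert(sql: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO Bed (BedNo) VALUES (101)")
conn.executemany(
    "INSERT INTO Patient (PatientId, PatientType) VALUES (?, ?)",
    [(1, "R"), (2, "O"), (3, "R")],
)

resident_ok = try_insert("INSERT INTO Resident VALUES (1, 'R', 101)")
outpatient_ok = try_insert("INSERT INTO OutPatient VALUES (2, 'O')")
# Patient 2 is an OutPatient, so it cannot be given a Resident row (and thus a Bed).
outpatient_in_bed = try_insert("INSERT INTO Resident VALUES (2, 'R', 101)")
# Patient 3 is a Resident, but Bed 999 does not exist.
missing_bed = try_insert("INSERT INTO Resident VALUES (3, 'R', 999)")
```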
Consider:
When an OutPatient visits the CareCenter, is not the purpose to obtain a treatment of some kind, which must be recorded ?
Is not the treatment obtained under a Physician’s control, and shouldn’t the treatment details be recorded ?
Therefore an OutPatient obtains a Treatment, same as a Resident, and it is common, in the Basetype.
Visit can be eliminated
(again, whether the treatment is received by a Resident or OutPatient regards the Subtype).
The data model in a PDF.
2.3  Predicate
The Predicates can be read directly from the graphic model, the evaluation of such provides an excellent feedback loop to the modelling process.  Please read them and verify.
Eg. the Predicate Each Bed accommodates 0-to-n Residents would cause a brawl that can be avoided.
Again, the academics and authors do not understand the Relational Model, and thus they are clueless about Predicates. For a good introduction, refer to Relational Table Naming Convention, the Relationship, Verb Phrase section at the top, and the Predicate section at the end.
2.4  Null
Nulls in a Relational database are a clear indication of a Normalisation error. I have removed them.
3  Outstanding
The academics and authors understand only 1960's physical Record Filing Systems (placed in an SQL container for convenience), thus they understand only Referential Integrity.  They do not understand Codd's Relational Model, thus they cannot understand, and they cannot teach, Relational Integrity, which is logical, and provides far more data integrity than 50-year-obsolete filing systems.
Your model allows any Physician to treat any Patient, which is typical for a RFS, if you follow the literature, but sub-normal for Relational.
I doubt that that is what you want in a database.  I think you want only the treating Physician, the ProviderNo to treat the Patient.  
As the model progresses, you may wish to ensure that a Bed is assigned to one Resident only. I didn’t model it because I need to be told: is admission and bed assignment two administrative steps or one ?
Do you not require lookup tables for Speciality and TreatmentName ?
Data Modelling is an iterative exercise: it is only when a model is erected, and contemplated, that the issues are exposed, which leads to the next iteration.

Either or relational algebra enterprise constraint

I need to define a constraint where tuples in a booking table can only have a value in musician (foreign key attribute from musician table) or actor (foreign key attribute from actor table), and must have one of these, but not both. At first I came up with this solution -
1. select any tuple from booking, call it x;
2. project x's musician column, call it y;
3. project x's actor column, call it z;
4. count(y) + count(z) = 1;
This works but also unintentionally imposes the constraint that the 'empty' booking's musician and actor columns cannot contain an empty string. How can I fix this issue?
P.S. I'm aware that count() isn't always part of relational algebra but I am permitted to use it for this purpose.
Problem
The obstacles you are facing are these:
no clear separation between data analysis and problem or process analysis
resorting to relational calculus or any other theoretical concept to solve a practical (eg. data modelling) problem.
you are making assumptions on dependencies (or experiencing problems with) where the referred thing is not yet clearly defined
Solution
The solutions are:
first, model the data, and only as data, without regard to what you need to do in any given Process
the Data Model should reflect reality, the real world.
understand and appreciate the theory, but implement using practical methods. That is, straight Relational Data Modelling using the Standard for Relational Data Modelling, IDEF1X.
btw, "There are many RAs" is incorrect: there is just one Relational Calculus, by Dr E F Codd. Sure, there are many pretenders after him, but Codd's RA is the only one that is complete; resolved; universally known; and accepted. philipxy is one of those, they hate Codd.
finish the Data Model properly. Define the referred thing reasonably, before attempting to define the dependent thing.
Before you can model a Booking for exclusively {Actor|Musician}, you need to model {Actor|Musician} ... which is a Person
a Person can be {Actor|Musician|Both}, ie. non-exclusive
but the Booking for {Actor|Musician} needs to be exclusive.
Data Model
Easily modelled in the Relational paradigm. As a consequence, the SELECT is simple and straight-forward.
The Data Model in IDEF1X/ER Level (not ERD) is:
Notice how it is not a RA issue, but a Data Modelling issue. In two hierarchic locations.
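Since the diagram is not reproduced here, the following is only my reading of the two Subtype clusters described above, sketched with Python's sqlite3 module. All table and column names, the BookingType codes 'A' and 'M', and the composite-FK technique are assumptions. Person is Non-Exclusive (no Discriminator), Booking is Exclusive (Discriminator).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Person   (PersonId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- Non-Exclusive cluster: a Person may be Actor, Musician, or both.
    CREATE TABLE Actor    (ActorId    INTEGER PRIMARY KEY REFERENCES Person (PersonId));
    CREATE TABLE Musician (MusicianId INTEGER PRIMARY KEY REFERENCES Person (PersonId));

    -- Exclusive cluster: a Booking is for an Actor or a Musician, never both.
    CREATE TABLE Booking (
        BookingId   INTEGER PRIMARY KEY,
        BookingType TEXT NOT NULL CHECK (BookingType IN ('A', 'M')),  -- Discriminator
        UNIQUE (BookingId, BookingType)
    );
    CREATE TABLE BookingActor (
        BookingId   INTEGER PRIMARY KEY,
        BookingType TEXT NOT NULL CHECK (BookingType = 'A'),
        ActorId     INTEGER NOT NULL REFERENCES Actor (ActorId),
        FOREIGN KEY (BookingId, BookingType)
            REFERENCES Booking (BookingId, BookingType)
    );
    CREATE TABLE BookingMusician (
        BookingId   INTEGER PRIMARY KEY,
        BookingType TEXT NOT NULL CHECK (BookingType = 'M'),
        MusicianId  INTEGER NOT NULL REFERENCES Musician (MusicianId),
        FOREIGN KEY (BookingId, BookingType)
            REFERENCES Booking (BookingId, BookingType)
    );
""")


def try_insert(sql: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO Person (PersonId, Name) VALUES (1, 'Sam')")
# Non-Exclusive: the same Person is accepted in both Subtypes.
is_actor = try_insert("INSERT INTO Actor (ActorId) VALUES (1)")
is_musician = try_insert("INSERT INTO Musician (MusicianId) VALUES (1)")

conn.execute("INSERT INTO Booking (BookingId, BookingType) VALUES (10, 'A')")
# Exclusive: Booking 10 is declared an Actor booking, so only that Subtype row fits.
actor_booking = try_insert("INSERT INTO BookingActor VALUES (10, 'A', 1)")
musician_booking = try_insert("INSERT INTO BookingMusician VALUES (10, 'M', 1)")
```

With this structure there are no Nulls and no empty strings to count: the musician or actor for a Booking is found by a plain JOIN to the one Subtype that the Discriminator names.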
Note
The Standard for Relational Data Modelling since 1993 is IDEF1X. For those unfamiliar with the Standard, refer to the short IDEF1X Introduction.
For full definition and usage considerations re Subtypes, refer to Subtype Definition.

Would this data model be considered correct

I'm new to data modelling and have started following tutorials to learn more.
I am trying to create a model for a hypothetical scenario and am struggling to validate what I have created to see if it is what would be considered a correct data model.
Essentially all I'm trying to do is correctly store data in a normalised form. In my scenario there are 3 types of people and each share some attributes and have one set of contact details each.
Does the below data model look feasible?
The relationship between person and one of defendant, magistrate, or staff-member is a case of the class/subclass pattern. There are two common ways of modeling this pattern in relational tables.
One way is called "Class Table Inheritance". You can find out more by visiting this tag: class-table-inheritance or by searching the web for Martin Fowler's treatment of the same subject. Your design resembles this design.
Another way is called "Single Table Inheritance", which you can also research the same way. single-table-inheritance. It's simpler, and works ok in some cases. You deal with fewer joins, but you deal with more NULLS.
Many people who go for class table inheritance also apply a technique called "Shared Primary Key". shared-primary-key. Using this technique, Defendant, Magistrate, and Staff_Member would each use a copy of person_id as the primary key. This primary key also functions as a foreign key. Shared primary key enforces the one-to-one nature of the IS-A relationships that exist in this case.
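Here is a minimal sketch of Class Table Inheritance with a Shared Primary Key, using Python's sqlite3 module. The non-key columns (full_name, case_number, court, job_title) are invented for illustration, since your actual attributes are in the diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        full_name TEXT NOT NULL
    );
    -- Each subclass table reuses person_id as its own primary key,
    -- and that same column is a foreign key back to person.
    CREATE TABLE defendant (
        person_id   INTEGER PRIMARY KEY REFERENCES person (person_id),
        case_number TEXT NOT NULL
    );
    CREATE TABLE magistrate (
        person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
        court     TEXT NOT NULL
    );
    CREATE TABLE staff_member (
        person_id INTEGER PRIMARY KEY REFERENCES person (person_id),
        job_title TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO person (person_id, full_name) VALUES (1, 'A. Smith')")
conn.execute("INSERT INTO defendant (person_id, case_number) VALUES (1, 'C-42')")

# Shared attributes and subclass attributes are brought together with one join.
defendants = conn.execute("""
    SELECT p.full_name, d.case_number
    FROM person AS p
    JOIN defendant AS d ON d.person_id = p.person_id
""").fetchall()

# The primary key keeps the IS-A relationship one-to-one:
# a second defendant row for the same person is rejected.
try:
    conn.execute("INSERT INTO defendant (person_id, case_number) VALUES (1, 'C-43')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```

Note that the shared key on its own does not stop one person from being both a defendant and a magistrate; if the three types must be mutually exclusive, that needs an additional constraint.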
If you want to go further in data modeling, you might want to learn ER modeling as a distinct data model from the relational model. What you've done here is essentially to use ER diagramming to diagram a relational model. There's nothing wrong with that, but it obscures a whole new field of study, generally called conceptual data modeling.
If you generate an ER model at the conceptual level, you don't attempt to implement it in terms of tables. There is a diagramming convention in ER that goes under the name "generalization/specialization" that allows you to depict a class/subclass situation, while remaining silent on how it's going to be implemented.
Conceptual data models have an area of usefulness, in addition to relational data modeling. What makes conceptual data models useful is precisely the fact that they present the information requirements without stating how those requirements are going to be met.
Once you are proficient at creating conceptual data models, it's not hard to convert one of them to a relational model.
This may be more than you bargained for, but since you are taking on learning modeling, I thought I'd survey some of the field for you.

ER diagram, is this allowed?

I have to create an ER diagram based on a relational schema.
There is a table of players, and a table of zones. A player can 'live' in many zones, and each zone is owned by one or more players.
I've come up with this simple ER diagram but I'm not sure having relationships going each way is allowed?
Cheers
Yes, that is a perfectly good Entity Relation Diagram. (I am not responding as to whether it makes sense or not: you still need to resolve the Relations and Cardinality.)
Using the correct terms helps people understand exactly what you are discussing, and which level you are discussing. Loose talk results in much more volume in the discussion, and time wasted in clarifying what you meant by which term. Not good for productive technical endeavours.
At this early stage, it is normal to model Entities and Relations (not Attributes), that's why it is called an ER diagram; we are nowhere near modelling the data. The Relations are relevant, and that's why you are detailing and evaluating their nature in the diamonds and Cardinality. The goal is to clarify the true Entities, and their Relations to each other. Many-to-many relations remain as relations. The ERD is purely Logical, there is no Physical.
Once you have some confidence with that, that you have gotten the Entities and Relations right, you move onto a Data Model (which includes Attributes). Still at a Logical level, the n::n relations remain as relations.
As you progress, you may show further detail, such as the Domain for each Attribute. Domain is the Logical-level term for DataType, just as Entity is the Logical-level term for Table, and Attribute for Column.
When you get to the Physical level, the Data Model has Tables; Columns; DataTypes.
And n::n Relations are manifested as the Associative Tables.
The idea is, as long as you are working through the prescribed steps, at (1), the content in the diamonds will determine (expose) if they need to be stored, and the diamond is thus promoted to an Entity; otherwise it remains a Relation.
There is a junction table called lives-in in the relational schema I've been given. However, I thought when mapping a relational schema [back] to an ER diagram a junction table becomes a relationship?
The Relational term is Associative table.
Yes. If it is a pure n::n Table (containing nothing but the two FKs to the PKs of the parent Tables), at the ERD level, which is Logical only, it is a Relation.
If it has Columns other than the two FKs, it is an Entity.
Since there's a many-to-many relationship between [Players] and [Zones] you have to add a junction table (called for ex. [PlayersZones]). The notation itself is correct (Chen notation), though I prefer the Crow's Foot Notation.
I am not able to see your images (blocked!) so I'll just try to describe the "correct" design. If a player living in a zone doesn't necessarily mean they own it, you should have four tables:
PLAYER (playerid, <other fields>)
ZONE (zoneid, <other fields>)
PLAYER_ZONE(playerid, lives_in_zoneid)
ZONE_OWNER (zoneid, owner_playerid)
Otherwise three tables would suffice.
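For what it's worth, the four tables above could be sketched like this with Python's sqlite3 module. The name columns stand in for "<other fields>" and are my assumption, as is the choice of a composite primary key on each junction table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE PLAYER (playerid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE ZONE   (zoneid   INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Junction table for "player lives in zone" (many-to-many).
    CREATE TABLE PLAYER_ZONE (
        playerid        INTEGER NOT NULL REFERENCES PLAYER (playerid),
        lives_in_zoneid INTEGER NOT NULL REFERENCES ZONE (zoneid),
        PRIMARY KEY (playerid, lives_in_zoneid)
    );
    -- Junction table for "zone is owned by player" (many-to-many).
    CREATE TABLE ZONE_OWNER (
        zoneid         INTEGER NOT NULL REFERENCES ZONE (zoneid),
        owner_playerid INTEGER NOT NULL REFERENCES PLAYER (playerid),
        PRIMARY KEY (zoneid, owner_playerid)
    );
""")

conn.executemany("INSERT INTO PLAYER VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO ZONE VALUES (?, ?)", [(10, "North"), (20, "South")])
conn.executemany("INSERT INTO PLAYER_ZONE VALUES (?, ?)", [(1, 10), (1, 20)])
conn.executemany("INSERT INTO ZONE_OWNER VALUES (?, ?)", [(10, 1), (10, 2)])

# A player can live in many zones ...
zones_of_player_1 = [row[0] for row in conn.execute(
    "SELECT lives_in_zoneid FROM PLAYER_ZONE WHERE playerid = 1 "
    "ORDER BY lives_in_zoneid")]
# ... and a zone can be owned by more than one player.
owners_of_zone_10 = [row[0] for row in conn.execute(
    "SELECT owner_playerid FROM ZONE_OWNER WHERE zoneid = 10 "
    "ORDER BY owner_playerid")]

# The composite primary key rejects recording the same pair twice.
try:
    conn.execute("INSERT INTO PLAYER_ZONE VALUES (1, 10)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```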

a layman's term for identifying relationship

There are a couple of questions around asking for the difference between, or an explanation of, identifying and non-identifying relationships in a relational database.
My question is, can you think of a simpler term for this jargon? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind it.
We actually want to use a more lay term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship.
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
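To illustrate the distinction, here is a small sketch with Python's sqlite3 module. The column names (user_id, phone_no, state_code) and the classify_relationships helper are hypothetical. The helper applies the rule from the example above: a foreign key column that is part of the table's own primary key makes the relationship identifying, otherwise it is non-identifying.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE States (
        state_code TEXT PRIMARY KEY
    );
    CREATE TABLE Users (
        user_id INTEGER PRIMARY KEY,
        state   TEXT NOT NULL REFERENCES States (state_code)  -- FK outside the PK
    );
    CREATE TABLE PhoneNumbers (
        user_id  INTEGER NOT NULL REFERENCES Users (user_id), -- FK inside the PK
        phone_no TEXT NOT NULL,
        PRIMARY KEY (user_id, phone_no)
    );
""")


def classify_relationships(conn: sqlite3.Connection, table: str) -> dict:
    """Map each referenced table to 'identifying' or 'non-identifying'."""
    # PRAGMA table_info: column 1 is the name, column 5 is its position in the PK (0 = not in PK).
    pk_columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")
                  if row[5] > 0}
    kinds = {}
    # PRAGMA foreign_key_list: column 2 is the parent table, column 3 the child column.
    for row in conn.execute(f"PRAGMA foreign_key_list({table})"):
        parent, child_column = row[2], row[3]
        kinds[parent] = "identifying" if child_column in pk_columns else "non-identifying"
    return kinds


phone_kinds = classify_relationships(conn, "PhoneNumbers")
user_kinds = classify_relationships(conn, "Users")
```

So in lay terms, PhoneNumbers would be shown as a child of Users, while Users merely references States.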
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation

Resources