I am reading the book Database Processing, 14th Edition. Chapter 5 discusses min and max cardinality. My question is mainly about min cardinality. For example, the book says that a department has one or more employees, while an employee belongs to zero or more departments. The figure is as follows:
I have drawn the following diagram to understand how to place the oval and vertical bar for min cardinality, and the N and 1 for max cardinality.
It seems to me that both min and max cardinalities are reversed in the figure provided by the book. If an employee may not belong to a department, why is the oval placed next to department? A similar question applies to max cardinality.
I would deeply appreciate it if someone could help me understand this and recommend detailed tutorials, as the explanation in the book is not sufficient for me. Note that I have read other tutorials, books, etc. that use different notations, but I need to learn the notation from the above book.
I am having some trouble getting my head around cardinality in ER diagramming. I am linking an example I found to help me explain where I am getting confused.
http://www.postgresqltutorial.com/download/dvd-rental-er-diagram/#
Question 1:
The cardinality between Customer and Rental is 0:1. So that means a customer can take out zero or one rentals. I would have thought the customer would be able to take out 1 or many rentals (1:*) because a customer means that they are taking out a rental (can't be a customer if you are not spending any money) and that a customer could take out many rentals.
Question 2:
The same goes for the Staff to Payment relationship, which has 0:1 cardinality. I would have thought that a staff member would make at least one payment, because payments are necessary for the rental transaction. And in reverse (one payment can be made by one and only one staff member): just to clarify, is this because logically a payment is a transaction that can only be made by one person at a time?
I agree with you. The same thing occurs on both sides of film_category, which I believe represents a many-to-many relationship based on the primary key. I think the diagram was drawn incorrectly.
Note that there's no such thing as 0:1 cardinality, but rather 0/1:1. Also, despite what the site and diagram say, the diagram is a table diagram and not an ER diagram. The notation used doesn't support or distinguish all the concepts from the Entity-Relationship model. Proper ER diagrams use Chen's notation or something equivalent.
I'm a newbie to data warehousing and I've been reading articles and watching videos on the principles but I'm a bit confused as to how I would take the design below and convert it into a star schema.
In all the examples I've seen the fact table references the dim tables, so I'm assuming the questionId and responseId would be part of the fact table? Any advice would be much appreciated.
I can't see the image at the moment (blocked by my firewall at the office), but I'll try to give you some ideas.
The general idea is to organize your measurable 'facts' into what are called fact tables. There are 3 main types of facts, but that is a topic for a different day (I'd be happy to go into this if needed). Each of these facts is what you'd see in the center of a typical 'star schema'. The other attributes within the fact tables are typically FK references to the dimension tables.
Regarding dimensions, these are groups of attributes that share commonality (the most notable being a calendar dimension). This is important because when you're doing analysis across multiple facts the dimensions are what you use to connect them.
Consider this simple example: a product is ordered and then shipped. We could have 2 transaction facts. The first records the order (qty ordered: measure; type of product ordered: dimension; transaction date: dimension). The second records the shipment (qty shipped: measure; product type: dimension; ship date: dimension). This simple schema could be used to answer questions like 'how many products, by product type, were ordered last quarter but not shipped'.
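The ordered-versus-shipped question can be sketched roughly as follows, using Python's built-in `sqlite3`. All table and column names (`dim_product`, `fact_order`, `fact_shipment`, `quarter`) are hypothetical, and the two date dimensions are simplified to a plain quarter label. Each fact is aggregated on its own and the results are then joined through the shared product dimension, which avoids double counting.

```python
import sqlite3

# Hypothetical star schema: two transaction facts sharing one product dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_type TEXT NOT NULL
);
CREATE TABLE fact_order (
    product_id  INTEGER NOT NULL REFERENCES dim_product(product_id),
    quarter     TEXT NOT NULL,      -- stand-in for a calendar dimension key
    qty_ordered INTEGER NOT NULL    -- measure
);
CREATE TABLE fact_shipment (
    product_id  INTEGER NOT NULL REFERENCES dim_product(product_id),
    quarter     TEXT NOT NULL,      -- stand-in for a calendar dimension key
    qty_shipped INTEGER NOT NULL    -- measure
);
INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO fact_order VALUES (1, '2023Q4', 10), (1, '2023Q4', 3), (2, '2023Q4', 5);
INSERT INTO fact_shipment VALUES (1, '2023Q4', 8), (2, '2023Q4', 5);
""")

# Aggregate each fact separately, then join the totals on the shared dimension.
QUERY = """
SELECT o.product_type, o.ordered - COALESCE(s.shipped, 0) AS outstanding
FROM (SELECT p.product_type, SUM(f.qty_ordered) AS ordered
      FROM fact_order f JOIN dim_product p ON p.product_id = f.product_id
      WHERE f.quarter = ?
      GROUP BY p.product_type) AS o
LEFT JOIN (SELECT p.product_type, SUM(f.qty_shipped) AS shipped
           FROM fact_shipment f JOIN dim_product p ON p.product_id = f.product_id
           WHERE f.quarter = ?
           GROUP BY p.product_type) AS s
       ON s.product_type = o.product_type
ORDER BY o.product_type
"""
outstanding = conn.execute(QUERY, ("2023Q4", "2023Q4")).fetchall()
print(outstanding)
```

With the sample rows, 13 widgets were ordered and 8 shipped, so 5 are outstanding; all 5 gadgets shipped.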
Hopefully this helps you get started.
Usually a fact table is used to aggregate measures - which are always numeric. Examples would be: sales dollars, distances, weights, number of items sold.
The type of data you drew here doesn't have any cut-and-dried "measure", so you need to decide what you want to measure. Is it the number of answers per question? Is it how many responses per sample?
This is often called an Event Fact table (if you want to search for other examples). And you need some sort of reporting requirements before you can turn it into a star schema. So it isn't an easy answer...
It's easy :) Responses is the fact table; everything else is a dimension. Your schema is already a star design, because the fact table connects directly to every dimension. As a counterexample, suppose you redesigned it so that addresses were stored in a separate table related to sample. You would then have to add the address table's id to the responses table to keep a star schema.
Database Design
Is this a bad design for a relational database? I don't see anyone doing examples that look like this.
But considering that an interview is comprised of all the different tables I have linked to the interview table it seems valid.
The exception is OH Number (Oral History Number). An oral history from one narrator may be comprised of different interviews conducted on different dates. Each individual interview is assigned a unique ID, and together they make up a series that is assigned 1 OH Number.
I'm also thinking of putting "Interviewer, Indexer, and Transcriptionist" in the same table.
I created the following mock-up for you given the details you have provided. I believe this will be a good starting place. You have an interview object and a person object. You have a joining table of InterviewPerson. This allows you to have one to many person objects per interview.
I want the database to be robust enough that if a researcher called in
and wanted all the interviews conducted by John Doe, on Race
Relations, I could pull a query for it.
To do that, you would join both the Interview table and the Person table to the InterviewPerson table, and then filter the joined rows on Person.firstName, Person.lastName, and Interview.topic (or title).
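As a rough sketch of that join, here is a small example using Python's built-in `sqlite3`. The table names and the `firstName`, `lastName`, and `topic` columns come from the description above; the key columns (`interviewId`, `personId`) and the `role` column are assumptions added for illustration, since the mock-up image is not available here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Interview (interviewId INTEGER PRIMARY KEY, topic TEXT NOT NULL);
CREATE TABLE Person (
    personId  INTEGER PRIMARY KEY,
    firstName TEXT NOT NULL,
    lastName  TEXT NOT NULL
);
-- Joining table; the role column is a hypothetical way to keep interviewers,
-- indexers and transcriptionists in one Person table.
CREATE TABLE InterviewPerson (
    interviewId INTEGER NOT NULL REFERENCES Interview(interviewId),
    personId    INTEGER NOT NULL REFERENCES Person(personId),
    role        TEXT NOT NULL,
    PRIMARY KEY (interviewId, personId, role)
);
INSERT INTO Interview VALUES (1, 'Race Relations'), (2, 'Labor History'),
                             (3, 'Race Relations');
INSERT INTO Person VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Roe');
INSERT INTO InterviewPerson VALUES
    (1, 1, 'Interviewer'), (2, 1, 'Interviewer'),
    (3, 2, 'Interviewer'), (3, 1, 'Narrator');
""")

# Interviews conducted by a given person on a given topic.
QUERY = """
SELECT i.interviewId, i.topic
FROM Interview i
JOIN InterviewPerson ip ON ip.interviewId = i.interviewId
JOIN Person p ON p.personId = ip.personId
WHERE p.firstName = ? AND p.lastName = ? AND i.topic = ?
  AND ip.role = 'Interviewer'
ORDER BY i.interviewId
"""
interviews = conn.execute(QUERY, ("John", "Doe", "Race Relations")).fetchall()
print(interviews)
```

Interview 3 is excluded even though John Doe appears in it, because there he is the narrator rather than the interviewer.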
Please note, this is a rough draft but should be a good general idea and start.
Database Design Redux
This is what I came up with based on your suggestions.
I've volunteered to create a registration system for my local church's education ministry. It should be able to register new students and keep track of their progress. Here are the requirements I've managed to gather:
The educational institution offers several courses.
Courses have a name and description.
Courses are organized in levels. There are several courses per level.
Courses also have requirements (i.e. other courses that need to be taken first).
A student graduates from a level when they have passed all courses of that level.
If a student cannot pass a course, he may repeat it as many times as he wants/needs.
Students can only take one course per semester.
An inactive student is one that isn't enrolled in the current semester.
Teachers will teach only one course per semester. Teachers may teach a different course each semester.
There could be semesters a teacher doesn't teach.
Now, this is my relational model.
![https://dl.dropbox.com/u/10900918/rmodels.jpg][1]
My questions are:
Are there any tables missing?
Looking at the semester + semester_code_description: is this the best way to do this? Under the assumption that a year has 2 semesters and that each semester has the same start and end months (i.e. semester 1: Aug - Dec, semester 2: Jan - May), is the semester_code_description table really necessary?
How could I improve the design?
Sorry I didn't include any arrows. The program I'm using is a mess.
Thanks so much for your valuable time in advance.
1) Nice job on your design. I don't see any missing tables - it looks like you covered all of your requirements.
2) The semester_description table makes sense to me; whether or not you need it depends on whether you plan to do anything with that data.
3) The requirement "students can only take one course per semester" would imply that the Has_Taken relationship's primary key should be (student_id, semester_id). As it stands now, I could insert two different courses for the same student and semester. Similarly for the Has_Teached relationship.
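A minimal sketch of point 3, using Python's built-in `sqlite3`: with (student_id, semester_id) as the primary key, the database itself rejects a second course for the same student and semester. The `course_id` column and the sample values are assumptions; the real Has_Taken table may carry other columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Has_Taken (
    student_id  INTEGER NOT NULL,
    semester_id INTEGER NOT NULL,
    course_id   INTEGER NOT NULL,          -- assumed column
    PRIMARY KEY (student_id, semester_id)  -- one course per student per semester
)
""")

conn.execute("INSERT INTO Has_Taken VALUES (1, 1, 101)")

# A second course in the same semester violates the primary key.
try:
    conn.execute("INSERT INTO Has_Taken VALUES (1, 1, 102)")
    second_insert_rejected = False
except sqlite3.IntegrityError:
    second_insert_rejected = True

# Repeating the same course in a later semester is still allowed.
conn.execute("INSERT INTO Has_Taken VALUES (1, 2, 101)")
row_count = conn.execute("SELECT COUNT(*) FROM Has_Taken").fetchone()[0]
print(second_insert_rejected, row_count)
```

The same key shape, (teacher_id, semester_id), would apply to Has_Teached.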
Some other thoughts:
The "last_whatever" columns in some of your tables will force some extra processing on your actual application. You will need some mechanism to monitor/update those. Another option would be to derive them from your tables. I can get a student's last_semester by finding the semester with the max year/code.
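Deriving last_semester instead of storing it could look roughly like this, again with Python's built-in `sqlite3`. The `semester` columns (`year`, `code`) follow the "max year/code" wording above, but the exact schema is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE semester (
    semester_id INTEGER PRIMARY KEY,
    year        INTEGER NOT NULL,
    code        INTEGER NOT NULL
);
CREATE TABLE Has_Taken (
    student_id  INTEGER NOT NULL,
    semester_id INTEGER NOT NULL REFERENCES semester(semester_id),
    PRIMARY KEY (student_id, semester_id)
);
INSERT INTO semester VALUES (1, 2022, 1), (2, 2022, 2), (3, 2023, 1);
INSERT INTO Has_Taken VALUES (1, 1), (1, 3), (2, 2);
""")


def last_semester(student_id):
    """Return the semester_id with the highest (year, code) the student enrolled in."""
    row = conn.execute(
        """
        SELECT s.semester_id
        FROM Has_Taken h
        JOIN semester s ON s.semester_id = h.semester_id
        WHERE h.student_id = ?
        ORDER BY s.year DESC, s.code DESC
        LIMIT 1
        """,
        (student_id,),
    ).fetchone()
    return row[0] if row else None  # None: the student never enrolled


print(last_semester(1), last_semester(2), last_semester(99))
```

Nothing has to be kept in sync this way; the cost is one extra query whenever the value is needed.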
One last consideration, how stable are these courses/descriptions/levels? I worked at a university for several years and our courses would change on a semester basis, forcing us to save an entire copy of course records for each change because we want a student's record to reflect what they actually took at that time.
Here's a little example in your app. Let's say I graduated level 1. Then a year later, the church adds a new course (Course A) to level 1. I will effectively be un-graduated b/c now there are level 1 courses I don't have (Course A).
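That effect can be reproduced with a small sketch in Python's built-in `sqlite3`, assuming graduation is computed as "no course in the level is missing a pass". The `course` and `passed` tables and their columns are hypothetical stand-ins for whatever the real model uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    level_id  INTEGER NOT NULL
);
CREATE TABLE passed (
    student_id INTEGER NOT NULL,
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO course VALUES (1, 'Course X', 1), (2, 'Course Y', 1);
INSERT INTO passed VALUES (7, 1), (7, 2);
""")


def has_graduated(student_id, level_id):
    """True when no course of the level lacks a pass for this student."""
    missing = conn.execute(
        """
        SELECT COUNT(*)
        FROM course c
        WHERE c.level_id = ?
          AND NOT EXISTS (SELECT 1 FROM passed p
                          WHERE p.course_id = c.course_id
                            AND p.student_id = ?)
        """,
        (level_id, student_id),
    ).fetchone()[0]
    return missing == 0


graduated_before = has_graduated(7, 1)

# A year later the church adds Course A to level 1.
conn.execute("INSERT INTO course VALUES (3, 'Course A', 1)")
graduated_after = has_graduated(7, 1)
print(graduated_before, graduated_after)
```

Storing the graduation as its own dated record, or versioning courses per semester, are two ways to keep history from shifting like this.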
This may not matter to you if your courses are pretty stable. Good luck!
I am having a hard time understanding what is the difference between the Max and Min cardinalities when trying to design a database.
Remember cardinality is always a relationship to another thing.
Max Cardinality (Cardinality)
Always 1 or many. If Class A has a relationship to Package B with a max cardinality of one, then at most one occurrence of this class can exist in the package. Conversely, a Package could have a max cardinality of N, which would mean it can contain N classes.
Min Cardinality (Optionality)
Simply means "required." It's always 0 or 1: 0 means zero or more (optional), 1 means one or more (mandatory).
There are tons of good articles out there that explain this, including some that explain how to properly diagram it. Another thing you can search for is Cardinality/Optionality (OMG terms), which explains the same thing: Optionality is "Min", Cardinality is "Max".
From http://www.databasecentral.info/FAQ.htm
Q: I can see how maximum cardinality is used when creating relationships between data tables. However, I don't see how minimal cardinality applies to database design. What am I missing?
A: You are correct in noticing that maximum cardinality is a more important characteristic of a relationship than minimum cardinality is. All minimum cardinality tells you is the minimum allowed number of rows a table must have in order for the relationship to be meaningful. For example, a basketball TEAM must have at least five PLAYERS, or it is not a basketball team. Thus the minimum cardinality on the PLAYER side is five and the minimum cardinality on the TEAM side is one.
One can argue that a person cannot be a player unless she is on a team, and thus the minimum cardinality of TEAM is mandatory. Similarly an organization cannot be a basketball team unless it has at least five players. The minimum cardinality of PLAYERS is mandatory also. One could argue in the opposite direction too. When a player quits a team, does it cease to be a team until a replacement is recruited? It cannot engage in any games, but does it cease to be a team? This is an example of the fact that each individual situation must be evaluated on its own terms. What is truth in THIS particular instance? The next time a similar situation arises, the decision might be different, due to different circumstances.
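As a hedged illustration of the basketball example: a `NOT NULL` foreign key can express "every player must be on a team" (minimum of one on the TEAM side), but an ordinary foreign key cannot express "at least five players". One option is a check query that lists the teams breaking the rule. The sketch below uses Python's built-in `sqlite3`; the table names, column names, and sample teams are made up.

```python
import sqlite3

MIN_PLAYERS = 5  # minimum cardinality on the PLAYER side

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (team_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE player (
    player_id INTEGER PRIMARY KEY,
    -- NOT NULL: a player must belong to a team (minimum of one on the TEAM side)
    team_id   INTEGER NOT NULL REFERENCES team(team_id)
);
INSERT INTO team VALUES (1, 'Hawks'), (2, 'Owls'), (3, 'Rams');
""")

# Hawks get five players, Owls four, Rams none.
conn.executemany(
    "INSERT INTO player (team_id) VALUES (?)",
    [(1,)] * 5 + [(2,)] * 4,
)

# Teams that do not meet the minimum; LEFT JOIN keeps teams with zero players.
understaffed = [
    name
    for (name,) in conn.execute(
        """
        SELECT t.name
        FROM team t
        LEFT JOIN player p ON p.team_id = t.team_id
        GROUP BY t.team_id, t.name
        HAVING COUNT(p.player_id) < ?
        ORDER BY t.name
        """,
        (MIN_PLAYERS,),
    )
]
print(understaffed)
```

Whether such a team should be rejected outright or merely flagged is exactly the judgment call described above.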
Agree with other answers, here's a slightly different view. Think in terms of optionality and multiplicity. Take an example: Person has Address.
Optionality asks: Does every Person need to have an Address? If so the relationship is unconditional - which means minimum cardinality is 1. If not, then min cardinality is 0.
Multiplicity asks: Can any given Person have more than one Address? If not, the maximum cardinality is 1. If so the maximum cardinality is >1. In most cases it's unbounded, usually denoted N or *.
Both are important. Non-optional associations make for simpler code since there's no need to test for existence before de-referencing: e.g.
a = person.address()
instead of
if (person.address() != null) {
    a = person.address()
}
Addresses are a good example of why Multiplicity is important. Too many business applications assume each person has exactly one address - and so can't cope when people have e.g. holiday homes.
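To make optionality and multiplicity concrete, here is a small Python sketch. The class names are invented for illustration: one model treats the address as optional with at most one (0..1), the other as mandatory with possibly many (1..N).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Address:
    city: str


@dataclass
class OptionalAddressPerson:
    """Min cardinality 0, max cardinality 1: the address may be absent."""
    name: str
    address: Optional[Address] = None


@dataclass
class RequiredAddressesPerson:
    """Min cardinality 1, max cardinality N: at least one address, possibly more."""
    name: str
    addresses: List[Address]

    def __post_init__(self):
        if not self.addresses:
            raise ValueError("at least one address is required")


def city_or_none(person: OptionalAddressPerson) -> Optional[str]:
    # Optional association: existence has to be tested before de-referencing.
    return person.address.city if person.address is not None else None


def primary_city(person: RequiredAddressesPerson) -> str:
    # Mandatory association: no existence test needed.
    return person.addresses[0].city


ann = OptionalAddressPerson("Ann")
bob = RequiredAddressesPerson("Bob", [Address("Oslo"), Address("Bergen")])
print(city_or_none(ann), primary_city(bob), len(bob.addresses))
```

The second model copes with a holiday home without any schema change, which is the point about multiplicity made above.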
It is possible to further constrain the cardinality, e.g. a car engine has between 2 and 12 cylinders. However, those constraints are often not very stable (Bugatti now offers a 16-cylinder engine). So the important questions are optionality and multiplicity.
hth.
Let's work with an example -
Student takes Class. Here both Student and Class are entities. A school may or may not have students enrolled in a particular semester. Think of a school offering courses in the summer semester when no student is interested in joining. So Student's cardinality can be (0,N). But if a Class is running, it should have at least 1 student registered, so its cardinality should be (1,N). In short, check whether the entity's participation in the relationship is partial or total; that decides its cardinality in the relationship.
Hope it helps.
Maximum Cardinality:
1 to 1, 1 to many, many to many, many to 1
Minimum Cardinality:
Optional to Mandatory, Optional to Optional, Mandatory to Optional, Mandatory to Mandatory
To your question, 'what is the use of optionality in database design?':
It becomes very helpful in the scenarios like the following.
When you design 2 tables with a 1-to-1 relation, it can be hard to decide which table should hold the foreign key. The decision is easy if one table has optionality 1 and the other has optionality 0: the foreign key should be present in the former. There are many other uses for it as well.
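Here is one way to read that rule, sketched with Python's built-in `sqlite3`. It assumes that "optionality 1" refers to the table whose every row must have a partner. The `person` and `passport` tables are hypothetical: a person may have no passport, but every passport must belong to exactly one person, so the foreign key goes in `passport` as `NOT NULL UNIQUE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE passport (
    passport_id INTEGER PRIMARY KEY,
    -- NOT NULL: a passport must have a person (mandatory side)
    -- UNIQUE:   at most one passport per person (keeps the relation 1-to-1)
    person_id   INTEGER NOT NULL UNIQUE REFERENCES person(person_id)
);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');  -- Bob has no passport: allowed
INSERT INTO passport VALUES (10, 1);
""")


def is_rejected(sql, params=()):
    """Run a statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


no_owner = is_rejected("INSERT INTO passport VALUES (11, NULL)")
second_passport = is_rejected("INSERT INTO passport VALUES (12, 1)")
unknown_owner = is_rejected("INSERT INTO passport VALUES (13, 99)")
print(no_owner, second_passport, unknown_owner)
```

Placing the key in `person` instead would force a nullable column there, since not every person has a passport.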
Hope it helps.
Maximum Cardinality:- one-one, one-many, many-many
Minimum Cardinality:- zero or one
This link describes my answer, why it is so, what's the representation,
and what it is.