Fan trap and chasm trap

Can anyone tell me what a chasm trap is? Perhaps a fan trap too, as I'm not too clear on it. Also, please provide easy-to-understand examples (in Chen notation).
My understanding thus far: a fan trap is M:1:1:M, which suggests that the paths between entities are ambiguous.
For example, if one M represents Student and the other M represents School, then it is ambiguous because we don't know which student studies at which school.
However, I cannot grasp what a chasm trap is.
Also, how can I identify the traps and then fix them?

Based on Connolly & Begg:
A fan trap occurs when a model represents a relationship between entity types, but the path between certain entity occurrences is ambiguous.
Example:
(Staff)-1:N-has-1:1-(Division)-1:1-operates-1:N-(Branch)
In this model it may be impossible to determine which branch a member of staff belongs to when that member of staff belongs to a division that has more than one branch.
Restructuring the model resolves the trap:
(Division)-1:1-operates-1:N-(Branch)-1:1-has-1:N-(Staff)
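The fan trap above can be reproduced with a few rows of data. This is a minimal sketch using Python's built-in sqlite3 module; the table names, column names and sample rows are invented for illustration and do not come from the textbook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Fan-trap structure: Staff and Branch both hang off Division.
    CREATE TABLE division (div_no TEXT PRIMARY KEY);
    CREATE TABLE staff  (staff_no  TEXT PRIMARY KEY,
                         div_no    TEXT NOT NULL REFERENCES division);
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY,
                         div_no    TEXT NOT NULL REFERENCES division);

    INSERT INTO division VALUES ('D1');
    INSERT INTO staff  VALUES ('S1', 'D1');
    INSERT INTO branch VALUES ('B1', 'D1'), ('B2', 'D1');
""")

def branches_for(staff_no):
    """Follow Staff -> Division -> Branch, the only path this model offers."""
    rows = conn.execute(
        """SELECT b.branch_no
           FROM staff s JOIN branch b ON b.div_no = s.div_no
           WHERE s.staff_no = ?
           ORDER BY b.branch_no""",
        (staff_no,),
    ).fetchall()
    return [r[0] for r in rows]

# Two candidate branches come back: the model cannot say which one S1 works at.
ambiguous = branches_for("S1")

# Restructured model: Staff references Branch, and Branch references Division.
conn.executescript("""
    CREATE TABLE staff_fixed (staff_no  TEXT PRIMARY KEY,
                              branch_no TEXT NOT NULL REFERENCES branch);
    INSERT INTO staff_fixed VALUES ('S1', 'B2');
""")
resolved = [r[0] for r in conn.execute(
    "SELECT branch_no FROM staff_fixed WHERE staff_no = 'S1'")]

print(ambiguous, resolved)
```

In the restructured tables the division of a member of staff is still reachable through the branch, so nothing is lost by moving the relationship.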
A chasm trap occurs when a model suggests a relationship between entity types, but no path exists between certain entity occurrences.
Example:
(Branch)-1:1-has-1:N-(Staff)-0:1-oversees-0:N-(PropertyForRent)
Because participation in the relationship between Staff and PropertyForRent is optional (0:1 on the Staff side), some properties have no member of staff overseeing them, so the path from Branch to PropertyForRent may not exist. The solution is a direct relationship between Branch and PropertyForRent with mandatory participation.
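The chasm trap can be sketched the same way. Again this is only an illustration with Python's sqlite3 module; the table and column names and the sample rows are assumptions, not part of the textbook example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY);
    CREATE TABLE staff  (staff_no  TEXT PRIMARY KEY,
                         branch_no TEXT NOT NULL REFERENCES branch);
    -- staff_no is nullable: a property need not be overseen by anyone.
    CREATE TABLE property_for_rent (property_no TEXT PRIMARY KEY,
                                    staff_no    TEXT REFERENCES staff);

    INSERT INTO branch VALUES ('B1');
    INSERT INTO staff  VALUES ('S1', 'B1');
    INSERT INTO property_for_rent VALUES ('P1', 'S1'), ('P2', NULL);
""")

# The only route from a property to its branch goes through Staff.
via_staff = conn.execute("""
    SELECT p.property_no, s.branch_no
    FROM property_for_rent p
    LEFT JOIN staff s ON s.staff_no = p.staff_no
    ORDER BY p.property_no
""").fetchall()

# Fix: a direct, mandatory relationship between Branch and PropertyForRent.
conn.executescript("""
    CREATE TABLE property_fixed (property_no TEXT PRIMARY KEY,
                                 branch_no   TEXT NOT NULL REFERENCES branch,
                                 staff_no    TEXT REFERENCES staff);
    INSERT INTO property_fixed VALUES ('P1', 'B1', 'S1'), ('P2', 'B1', NULL);
""")
direct = conn.execute(
    "SELECT property_no, branch_no FROM property_fixed ORDER BY property_no"
).fetchall()

print(via_staff)  # P2 has no branch: the chasm
print(direct)     # every property now has a branch
```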

In simple words, in both cases (fan and chasm) the query returns more rows (result sets) than there really are. How to identify them:
FAN -> 1-N-N: the tables are related one -> many -> many.
CHASM -> N-1-N: one table has a one-to-many relation to two or more tables.
LOOP -> the joins between the tables form a closed circle (in this case some rows will definitely be lost).
There is no special technique for identifying them, but when you create a Universe you have to keep your eyes open: if you see any of these situations while developing the Universe, there will always be a problem. Rectify it by applying aliases and contexts.
Once all the problems are solved at the Universe level, you are good to go for reporting. With practice you will get to know these well.
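The N-1-N case described above can be illustrated outside any reporting tool. The sketch below uses Python's sqlite3 module with invented tables (customer, orders, payments) to show how one query across both "many" sides multiplies rows. Running one query per "many" side and combining the results afterwards, which is my understanding of what a context arranges, avoids the inflation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    CREATE TABLE payments (pay_id   INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);

    INSERT INTO customer VALUES (1);
    INSERT INTO orders   VALUES (1, 1, 100), (2, 1, 50);
    INSERT INTO payments VALUES (1, 1, 30), (2, 1, 30), (3, 1, 30);
""")

# One query across both "many" sides: 2 orders x 3 payments = 6 joined rows,
# so every order is counted 3 times and every payment twice.
naive = conn.execute("""
    SELECT SUM(o.amount), SUM(p.amount)
    FROM customer c
    JOIN orders   o ON o.cust_id = c.cust_id
    JOIN payments p ON p.cust_id = c.cust_id
""").fetchone()

# One query per "many" side gives the true totals.
orders_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE cust_id = 1").fetchone()[0]
payments_total = conn.execute(
    "SELECT SUM(amount) FROM payments WHERE cust_id = 1").fetchone()[0]

print(naive, orders_total, payments_total)
```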

A fan trap occurs when three tables are joined so that each relates to the next one-to-many. That is, for tables A, B and C, table A links to table B one-to-many and table B relates to table C one-to-many again: A-->B-->C.
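A small sketch of the A-->B-->C chain, using Python's sqlite3 module. The column names b_total and c_qty are hypothetical measures added only to make the inflation visible: a value stored at the B level is repeated once per matching C row when all three tables are joined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a_id INTEGER PRIMARY KEY);
    CREATE TABLE b (b_id INTEGER PRIMARY KEY, a_id INTEGER, b_total INTEGER);
    CREATE TABLE c (c_id INTEGER PRIMARY KEY, b_id INTEGER, c_qty INTEGER);

    INSERT INTO a VALUES (1);
    INSERT INTO b VALUES (10, 1, 100);
    INSERT INTO c VALUES (100, 10, 2), (101, 10, 3);
""")

# Joining all three tables repeats the single B row once per C row.
inflated = conn.execute("""
    SELECT SUM(b.b_total), SUM(c.c_qty)
    FROM a
    JOIN b ON b.a_id = a.a_id
    JOIN c ON c.b_id = b.b_id
""").fetchone()

# Aggregating C down to one row per B first keeps the B-level measure intact.
correct = conn.execute("""
    SELECT SUM(b.b_total), SUM(cq.qty)
    FROM b
    JOIN (SELECT b_id, SUM(c_qty) AS qty FROM c GROUP BY b_id) cq
      ON cq.b_id = b.b_id
""").fetchone()

print(inflated, correct)
```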

Related

Is it Important to Understand Each Normal Form

I have been studying database design and programming for quite some time now, but I still can't fully grasp each individual normal form (1NF, 2NF, 3NF).
Since any data in Third Normal Form is automatically in Second and First Normal Form as well, can the whole process be accomplished less tediously by fully normalizing the data from the start? I can do this easily by arranging the data so that the columns in each table, other than the primary key, depend only on the whole primary key.
How important is it to understand each individual normal form if we can simply fully normalize the data less tediously by doing what I have described?
EDIT: What I'm ultimately asking is: Is it important to go through the steps of each normal form when normalizing data, or is it appropriate to just go to Third Normal Form seeing as the result is ultimately the same?
I highly recommend understanding each normal form. You will not always have the perfect scenario, and knowing each normal form helps you investigate and understand the problems in an existing database design, if there are any.
Going step by step through the different normal forms will help you figure out why we do this, which is to achieve the goals specified by E. F. Codd.
The objectives of normalization were stated as follows:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies.
2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.
3. To make the relational model more informative to users.
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
Here is an image to help you understand the different normal forms better.
P.S. BCNF is actually 3.5NF not 4NF
It's true that a table in 3NF is also in 2NF and 1NF. However, the condition for 3NF is not merely that all the data depends only on the whole candidate key. It also requires that the table is already in 2NF, meaning that every attribute that is not part of the candidate key fully depends on the candidate key, and in 1NF, meaning that every column is atomic. So yes, it is important to understand every normal form if you want a table in 3NF.
I'll try to explain the normal forms to you:
1NF
1NF states that every column has to be atomic. This means there shouldn't be multiple items of data in one column. For example, someone's address shouldn't be stored in one column, but should be split into the country, the state, the street and so on. Each of these pieces of data should then be stored in its own column.
2NF
2NF states that every attribute that is not part of the candidate key must be identifiable only by the whole candidate key. For example, you shouldn't store books and printing labels in one table, because the name of the book would then depend only on the id of the book, while the printing label's name would depend only on the id of the printing label, and neither would depend on the whole candidate key.
3NF
3NF states nearly the same as 2NF: no column is allowed to depend on a column that is not a candidate key. For example, you shouldn't store the ISBN of a book and an id of the book in the same table with only the id as the candidate key, because you would only need the ISBN to find the name of the book.
If this doesn't explain the matter well enough, there's a lot of information online regarding the normal forms (like Wikipedia).
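The books-and-labels example for 2NF can be checked mechanically on sample data. The helper below is a hypothetical illustration in plain Python: it tests whether one set of columns determines another in a given set of rows. The rows themselves are invented, and passing the check only shows the sample is consistent with a dependency, not that the dependency is a rule of the business.

```python
def determines(rows, lhs, rhs):
    """Return True if the columns in `lhs` functionally determine `rhs` in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # The first value seen for a key is remembered; any later mismatch
        # means the same key maps to two different values.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical table whose candidate key is (book_id, label_id).
rows = [
    {"book_id": 1, "label_id": 7, "book_name": "Dune", "label_name": "Ace"},
    {"book_id": 1, "label_id": 8, "book_name": "Dune", "label_name": "Chilton"},
    {"book_id": 2, "label_id": 7, "book_name": "Emma", "label_name": "Ace"},
]

# Each name depends on only part of the key: a partial dependency (2NF problem).
partial_on_book = determines(rows, ["book_id"], ["book_name"])
partial_on_label = determines(rows, ["label_id"], ["label_name"])
# By contrast, the book alone does not determine the label name.
label_from_book = determines(rows, ["book_id"], ["label_name"])

print(partial_on_book, partial_on_label, label_from_book)
```

Splitting the table into one table per determinant (books keyed by book_id, labels keyed by label_id, plus the pairing table) removes both partial dependencies.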
It is not that being in 3NF puts a table in 1NF and 2NF. Rather, a table must already be in 1NF before it can be in 2NF, and the same goes for 3NF: to normalise to 3NF, the table has to satisfy 1NF and 2NF first.
1NF states that no multivalued attribute should be present.
2NF states that no non-prime attribute should be partially dependent on a candidate key.
3NF states that there should be no transitive dependency.
thank you
The only NF (normal form) that matters is 5NF.
A relation (value or variable) is in 5NF when for every way it can be losslessly decomposed the components can be joined back in some order where the common columns of each join are a superkey of the original. (Fagin's PJ/NF paper's membership algorithm.)
This allows a table to be the join of others with overlapping meanings but without update anomalies. (Although update anomalies cease at ETNF, between 4NF & 5NF.)
Anyway if you wanted a lower NF you should normalize to 5NF then denormalize. The main reason people settle for lower NFs is ignorance. There are certain costs & benefits, but people don't know or address them--code must restrict updates to account for the problematic update anomalies. Normalization to a given NF is not done by going through lower NFs; one uses an appropriate algorithm for the NF one wants. (This is made clear by most textbooks, although some wrongly say to move through lower NFs, but putting into a lower NF can prevent good higher-NF versions of the original from turning up later.)
PS There is no single notion of 1NF, and all it has in common with higher NFs is that both seek "better" designs.
From what I recall of the process, it's a method that you follow to get to a state where the storage and search facilities of the database are fully optimised. Yes, 3NF does encapsulate the rules below it (1NF and 2NF), but it is far easier to unpick the data if you start at the easier forms of normalization and see whether your data is in an efficient format for storage in an RDBMS or SQL-based database. Jumping straight in at a higher normal form makes the whole process harder and more intimidating for beginners, and can lead them to analyse the data incorrectly. To be honest, it will make hard work of difficult data structures that are not just the usual invoice, invoice lines and address data that you tend to deal with day in and day out. Going through the process of normalization sometimes has value in unpicking data structures that were not obvious from the start, which not only makes your data more efficient but also helps you reason about what you are trying to accomplish.

Proper name for an intermediate table between others two intermediate

I have 4 entities: Event, Message, Flow and Document.
Event table stores a limited (seeded) number of records. Message has many events and each event can be related to many messages. The name event_message was given for the intermediate table.
As you can see, the convention for intermediate tables is: {tablename}_{tablename}.
Flow table stores a limited (seeded) number of records. Message has many flows and each flow can be related to many messages. The name flow_message was given for the intermediate table.
A document is created on each relation between Flow and Message (each record on flow_message).
The issue starts here:
Each event on a message has different documents per flow. That is, for each new record in the intermediate table flow_message, each record in the intermediate table event_message has a new related document.
To solve this, I created an intermediate table between event_message and flow_message named event_message_flow_message.
Is this correct (in some conventional way)? Is this modeling correct?
How should I properly model and name an intermediate table derived from two other intermediate tables?
I also wish there was some convention. Since I do not know any official convention, I invented mine. The important thing is to respect the convention you choose.
So I would change the event_message_flow_message to rel_eventmessage_flowmessage.
But for me your convention is pretty nice.
It's hard to make a recommendation because your model seems a bit odd to me. You have 1:1 relationships between both DOCUMENT and FLOW_MESSAGE and DOCUMENT and EVENT_MESSAGE_FLOW_MESSAGE. It's hard to reconcile this in my mind with the many-to-one relationships to EVENT_MESSAGE_FLOW_MESSAGE. If your relationships to DOCUMENT are really 1:1 (mandatory), then why keep documents in a separate table?
To address your question about table naming: I would argue that the {table}_{table} convention for naming intersection tables is not a best practice but rather a fallback for cases where you can't think of a better name.
The best practice is for names of tables to reflect the business name of the thing which is recorded / described by the data in the table. It's not always possible to do this, especially for intersection tables. Intersection tables represent many-to-many relationships, and relationships are often difficult to describe with a noun.
In your case, I don't think that your convention is actually making things especially easy to understand. I'd probably try to simplify with something like MESSAGE_DOCUMENT or even just DOCUMENT - since these seem to be 1:1 related in any case.
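One possible shape for such a table, sketched with Python's sqlite3 module. This is an assumption about the intended model, not a confirmed design: it takes the MESSAGE_DOCUMENT name suggested above, stores the document directly on the intersection row, and uses two composite foreign keys that share message_id so that the event and the flow must belong to the same message. All column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE event   (event_id   INTEGER PRIMARY KEY);
    CREATE TABLE flow    (flow_id    INTEGER PRIMARY KEY);
    CREATE TABLE message (message_id INTEGER PRIMARY KEY);

    CREATE TABLE event_message (
        event_id   INTEGER NOT NULL REFERENCES event,
        message_id INTEGER NOT NULL REFERENCES message,
        PRIMARY KEY (event_id, message_id)
    );
    CREATE TABLE flow_message (
        flow_id    INTEGER NOT NULL REFERENCES flow,
        message_id INTEGER NOT NULL REFERENCES message,
        PRIMARY KEY (flow_id, message_id)
    );
    -- One document per (event, flow) pair of the same message.
    CREATE TABLE message_document (
        event_id   INTEGER NOT NULL,
        flow_id    INTEGER NOT NULL,
        message_id INTEGER NOT NULL,
        body       TEXT,
        PRIMARY KEY (event_id, flow_id, message_id),
        FOREIGN KEY (event_id, message_id)
            REFERENCES event_message (event_id, message_id),
        FOREIGN KEY (flow_id, message_id)
            REFERENCES flow_message (flow_id, message_id)
    );

    INSERT INTO event   VALUES (1);
    INSERT INTO flow    VALUES (1);
    INSERT INTO message VALUES (1), (2);
    INSERT INTO event_message VALUES (1, 1);
    INSERT INTO flow_message  VALUES (1, 1), (1, 2);
""")

def try_insert(event_id, flow_id, message_id):
    """Return True if the document row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO message_document VALUES (?, ?, ?, 'doc')",
                (event_id, flow_id, message_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert(1, 1, 1)        # event 1 and flow 1 both belong to message 1
mismatch = try_insert(1, 1, 2)  # event 1 was never attached to message 2
print(ok, mismatch)
```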

When is it okay to have a relationship loop in my database?

Possible duplicate of Can I avoid a relation loop in my database design?, but I'd like to get a broader answer than for that specific design.
The goal in this case is to store automated testing data as it’s generated. A portion of the relationship diagram is shown below.
A variable number of tests may be run on each build, hence the direct one-to-many relationship between Builds and Sessions.
Each build is made of several hundred parts, and each part number may be used on several hundred builds, hence the many-to-many relationship between Builds and DT_Parts, associated through LT_HeaderParts.
If an assembly error is found during testing, a part or parts may be switched out and the unit retested. Instead of duplicating hundreds of part records on each retest, I implement PartsChangeLog to document any changes made after a given session.
PartsChangeLog uses DT_Parts as a dictionary to save memory by storing integers instead of the varchar(20) part_number.
LT_HeaderParts and PartsChangeLog both appear to have valid, non-redundant reasons for using DT_Parts, yet this setup creates a reference loop and poses the danger of creating a false many-to-many bridge from build_id to session_id that would yield incorrect relationships.
Is this an okay structure? Why or why not?
Trying to answer the actual title question "When is it okay to have a relationship loop in my database?".
One part of the answer is that it depends on the intended usage of the schema/diagram per se. Is it intended as a conceptual model, with the purpose of illustrating business concepts ? Then basically you can highlight just any relationship you like. By which I mean you can highlight anything in the form of a relationship if you think that relationship is of interest to the intended business audience. Or is it intended as a logical db schema ?
In that case it mostly depends on the precise "semantics" of the relationships. If two relationships are saying things that are semantically distinct, then you can bet your ... that both will be relevant to the business being modeled and that you should be keeping both.
The simplest example of such a loop is a bill-of-materials structure. Such structures have a single "parts" entity, with a many-to-many relationship of "containment". This "containment" relationship gets instantiated as a "containment" entity with two relationships to the "parts" entity. Each of these two relationships has different semantics (one saying "the containing part must be a known part" and the other saying "the contained part must be a known part") and so they should definitely be kept both.
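A bill-of-materials structure of this kind might look as follows in Python's sqlite3 module. The table names follow the description above; the quantity column and the sample parts are assumptions added so the structure can be queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE part (part_no TEXT PRIMARY KEY);
    -- Two relationships to the same entity, each with its own meaning.
    CREATE TABLE containment (
        containing_part TEXT NOT NULL REFERENCES part,  -- must be a known part
        contained_part  TEXT NOT NULL REFERENCES part,  -- must be a known part
        quantity        INTEGER NOT NULL,
        PRIMARY KEY (containing_part, contained_part)
    );

    INSERT INTO part VALUES ('bike'), ('wheel'), ('spoke');
    INSERT INTO containment VALUES ('bike', 'wheel', 2), ('wheel', 'spoke', 32);
""")

# Walk the structure with a recursive query, multiplying quantities on the way down.
exploded = conn.execute("""
    WITH RECURSIVE bom(part_no, qty) AS (
        SELECT contained_part, quantity
        FROM containment
        WHERE containing_part = 'bike'
        UNION ALL
        SELECT c.contained_part, bom.qty * c.quantity
        FROM containment c
        JOIN bom ON c.containing_part = bom.part_no
    )
    SELECT part_no, qty FROM bom ORDER BY part_no
""").fetchall()

print(exploded)
```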
What you have is two sets of parts associated with a session: build parts (session -> build ->> part) and changed parts (session ->> partschangelog -> part). As the answer to the question linked by JJ32 explains, consistency is the main concern in these situations. In this case, I suspect the set of changed parts should be a subset of the build parts, but your schema doesn't enforce this.
One way of enforcing it is via controlled redundancy. If you include build_id in PartsChangeLog as a non-prime attribute (and modify the foreign key reference to Sessions accordingly), you can create two composite foreign key constraints referencing LT_HeaderParts (for build_id, part_added and build_id, part_removed).
This eliminates the possibility of associating inconsistent session_id and build_id via the many-to-many bridge; though if no parts were changed, there won't be such a bridge. That's understandable, our goal is not to replace the direct mapping between session_id and build_id, only to ensure consistency. The rest is up to the query developer.
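The controlled-redundancy approach can be sketched with Python's sqlite3 module. Table names follow the question; the column names part_id, part_added and part_removed, the UNIQUE constraint on Sessions, and the sample rows are assumptions made so the constraints can be demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Builds   (build_id INTEGER PRIMARY KEY);
    CREATE TABLE DT_Parts (part_id INTEGER PRIMARY KEY, part_number TEXT UNIQUE);
    CREATE TABLE LT_HeaderParts (
        build_id INTEGER NOT NULL REFERENCES Builds,
        part_id  INTEGER NOT NULL REFERENCES DT_Parts,
        PRIMARY KEY (build_id, part_id)
    );
    CREATE TABLE Sessions (
        session_id INTEGER PRIMARY KEY,
        build_id   INTEGER NOT NULL REFERENCES Builds,
        UNIQUE (session_id, build_id)  -- target for the composite foreign key below
    );
    CREATE TABLE PartsChangeLog (
        session_id   INTEGER NOT NULL,
        build_id     INTEGER NOT NULL,  -- the controlled redundancy
        part_added   INTEGER NOT NULL,
        part_removed INTEGER NOT NULL,
        FOREIGN KEY (session_id, build_id)
            REFERENCES Sessions (session_id, build_id),
        FOREIGN KEY (build_id, part_added)
            REFERENCES LT_HeaderParts (build_id, part_id),
        FOREIGN KEY (build_id, part_removed)
            REFERENCES LT_HeaderParts (build_id, part_id)
    );

    INSERT INTO Builds   VALUES (1), (2);
    INSERT INTO DT_Parts VALUES (1, 'P-100'), (2, 'P-200'), (3, 'P-300');
    INSERT INTO LT_HeaderParts VALUES (1, 1), (1, 2), (2, 3);
    INSERT INTO Sessions VALUES (10, 1);
""")

def log_change(session_id, build_id, added, removed):
    """Return True if the change is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO PartsChangeLog VALUES (?, ?, ?, ?)",
                         (session_id, build_id, added, removed))
        return True
    except sqlite3.IntegrityError:
        return False

consistent = log_change(10, 1, 2, 1)    # both parts belong to build 1
foreign_part = log_change(10, 1, 3, 1)  # part 3 belongs to build 2 only
wrong_build = log_change(10, 2, 3, 3)   # session 10 ran on build 1, not build 2
print(consistent, foreign_part, wrong_build)
```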

uncertainty in developing a database model

I'm trying to develop a database model for candidates, their registered exams, and the results of the exams once they have been taken.
This is what I've done so far. However, I'm unsure whether I'm on the right track, especially from the examination table to the examination result table.
How easy will it be to write SQL insert code to populate examinationresult for a particular candidate?
The examination types are categorised into science, art and social science. They each have 4 components.
Note on Progression
Given the fact that the Question changes substantially (in clarifying the requirement, not in scope) in response to my Response and TRD, this is going to take some back-and-forth. Let's identify Steps: your Step numbers are odd, starting from 1; mine, in response, are even. Parts of previous Response Steps have become obsolete, they may no longer make sense.
I would suggest a bounty, except for the fact that you have few points.
Response Step 2 to Initial Question & Step 1 Diagram
This is what I've done so far.
You have done some good work, but it is too early for assigning PKs. Besides, assigning an ID on every file as a starting point will cripple the modelling process, the result will not be a database. You have to model the data (not the database) first, then assign Keys when the entities are clear and stable. So drop all your IDs and PKs and model the data, as data. Forget about what you want to do with the data (ie. forget the app).
how easy will it be to right write an insert sql code for examinationresult population for a particular candidate
Right now you can't. You have no relationship between Candidate and Examination[Result]. That is not a problem because the modelling is incomplete at this stage, when it is complete the code will be simple.
The entity Course is implied, but it is missing.
however im unsure if am on the right track especially from the examination table to the examination result table
You are on the right track with some of the other files, but the Examination cluster needs work. This will take a bit of back-and-forth. Once you answer the questions in the comments, we can proceed.
The main issue is this: how is Examination identified.
An ID field does not identify anything, nor does it provide uniqueness in the data, which is required if you want data integrity. IDs result in a Record Filing System with no integrity, however, it appears you want a database with data integrity. Is that correct ?
Go back to the user and discuss how courses and components are identified, what codes they use, etc. Those are the natural Keys that they use to identify their data, that they will enter into the system when they need look something up, or to enter examination results.
Eg. It is not reasonable to contemplate an Examination that exists independently (as you have modelled it). People do not go to a hall and sit for any old exam. The exam exists only in the context of a course, they sit for an exam for a course.
Then the course, and not the exam, has components, which are examined. And each course has a different number of components.
Eg. a Course which is identified as ENG101 for English Literature year 1
And then the components within that. Eg. 2b Short essay on poetry.
They may need to identify the year and semester of the course as well, in which case, you need a CourseOffering per semester.
Consider this, as a discussion point. Courier is example data, blue is Key, green is non-key:
TRD Step 2
Response Step 4
Response to Question & Description
This is what I've done so far.
My previous response still applies:
You have done some good work, but it is too early for assigning PKs. Besides, assigning an ID on every file as a starting point will cripple the modelling process, the result will not be a database. You have to model the data (not the database) first, then assign Keys when the entities are clear and stable. So drop all your IDs and PKs and model the data, as data. Forget about what you want to do with the data (ie. forget the app).
You have not addressed that issue, which I identified in your Step 1 Diagram, in your Step 3 Diagram. It appears, from the evidence, that you might be happy with IDs as "Primary Keys" (they aren't), despite the hindrance having been identified to you. That means your understanding of the data is crippled, and the progress of your diagrams will be slow.
My previous response still applies:
An ID field does not identify anything, nor does it provide uniqueness in the data, which is required if you want data integrity. IDs result in a Record Filing System with no integrity, however, it appears you want a database with data integrity. Is that correct ?
You must answer these questions, otherwise your design cannot proceed. These are severe errors that must be corrected. One cannot build on, or progress, a foundation that contains severe errors.
1. Could you please confirm, you do want a Relational Database, with the integrity and performance that Relational Databases are capable of, that is easy to code against, as opposed to a Record Filing System, with no integrity or speed, that will be difficult to code against. Correct ?
2. If [1] is correct. Since ID fields as "Primary Keys" do not provide row uniqueness, which is demanded for a Relational Database, how exactly, do you intend to provide the required row uniqueness ? Alternately, are you happy to have an RFS that is full of duplicate rows (each with an unique record ID) ?
how easy will it be to right write an insert sql code for examinationresult population for a particular candidate
My previous response still applies:
Right now you can't. You have no relationship between Candidate and Examination[Result]. That is not a problem because the modelling is incomplete at this stage, when it is complete the code will be simple.
Ok, in your Step 3 Diagram, you have drawn a line between Candidate file and the ExaminationResult file (as opposed to, inserting a relationship in a database).
In a record filing system, sure, you can just draw a line between any two files, insert the relevant ID field, and hey presto, you have "linked" or "connected" or "mapped" the two files.
But database design (as opposed to file design) does not progress like that, you cannot just draw a line between any two objects, insert the relevant ID field, and hey presto, create a database relationship. No. There is no basis, no integrity, in the dashed line that you have drawn. Eg. in your Step 3 Diagram, any Candidate can be related to any Examination[Result].
That is "normal" or "ordinary" in record filing systems, but in a database, it is something to be recognised and understood as an error, and thus prevented. Because we expect integrity in a database, and because it can be prevented, easily.
however im unsure if am on the right track especially from the examination table to the examination result table
My previous response still applies:
You are on the right track with some of the other files, but the Examination cluster needs work. This will take a bit of back-and-forth. Once you answer the questions in the comments, we can proceed.
The main issue is this: how is Examination identified.
An ID field does not identify a row (it identifies a record, which has no relevance whatsoever in a database).
The same two problems (a) lack of a valid identifier, and (b) lack of row uniqueness, exists with your Candidate, Component and ExaminationResult files.
Response to Diagram as a Diagram (as opposed to the content)
You have improved it over your Step 1 Diagram, and in response to my Response Step 2, great. But the relationships (most of them) are still incorrect. And the basis of Candidate::Examination is still not resolved.
It appears to me that you are not clear about the notation (notches; circles; crows feet) and precisely what they mean at the parent and child ends. So you need to learn that first, and then draw the diagram, rather than the other way round.
It is great that you are using a Notation that is meaningful, and many details are shown (many people don't, they draw nice-looking diagrams that lack the detail required for a full understanding of the model). That means that every notch; circle; crows foot, has specific meaning, and must be drawn correctly, in order to convey that meaning to the reader.
Entities do not exist in isolation, there must always be a parent first, in order for the child to be a child of the parent. There is no such thing as "equal". Dependency is always in one direction.
Your relationships that are 1-and-only-1 on one side, and 1-and-only-1 on the other side, are incorrect, they indicate a Normalisation error. The field in the subordinate record can be Normalised into the ordinate record.
Eg. AdmissionLetter is not a separate file, some form of AdmissionLetter identifier (not an ID field) should be located in Candidate.
Eg. Title::Candidate is a drawing error, it should be 1 at the Title end and 0-to-many at the Candidate end.
In a data model, bold (by convention) means a migrated Foreign Key. The Primary Key that is migrated is not bold.
Response to Diagram Content
From your replies, the term Subject trumps the term Component; Category trumps various loosely-identified elements into one clear entity.
It is not reasonable to contemplate an Examination that exists independently (as you have modelled it).
People do not go to a hall and sit for any old exam, any old Subject. The exam exists only in the context of a Subject, they sit for an exam for a Subject.
I accept that the Examination is one sitting, for four Subjects
I accept that the four Subjects are defined by a Category.
I accept that the Candidate is registered for a Category.
Thus the exam exists only in the context of a Subject, which exists only in the context of a Category, and the Candidate sits for an exam which is a Category, which contains four (the number does not matter) Subjects.
Having resolved that, two questions remain:
Do you need to record an Examination as an event, independent of the Candidates who sit in that event. Eg. Examination(Location, DateTime) ?
Does the Examination event examine Candidates in one, or more than one, Category ?
The notion of four Subjects that are implemented as four repeated fields in one record breaks First Normal Form, which demands that repeating fields are Normalised into separate records in a child file.
Therefore, for both your Component and ExaminationResult files, that issue needs to be resolved.
Note that the fact that that problem is repeated in two separate files is a second alarm that it is an error.
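What moving repeated fields into child records means can be shown with a few lines of plain Python. The record layout, field names and scores below are invented for illustration; the actual files in the diagram are not visible here.

```python
def unpivot_scores(record, subject_columns):
    """Turn one wide record with repeated score fields into one row per subject."""
    return [
        {"candidate": record["candidate"], "subject": col, "score": record[col]}
        for col in subject_columns
    ]

# One record with four repeated fields, as in the ExaminationResult file.
wide = {"candidate": "C001", "physics": 71, "chemistry": 64, "biology": 80, "maths": 58}

# Four child rows, one per Subject; a fifth Subject would need no new column.
rows = unpivot_scores(wide, ["physics", "chemistry", "biology", "maths"])
for row in rows:
    print(row)
```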
I have clarified the Category/Subject issues for you, and resolved the Normalisation error.
I have given simple identifiers for Categories and Subjects.
If you do not implement that, you will not have integrity between the Candidate and the Subject they are being Examined for. As well, you will suffer various problems when you get to the coding stage.
I have no idea what you are trying to do with exComp, therefore I have no response. Perhaps you can say a few words about it.
Thus far, there is still no reasonable way of relating Candidates to Examinations or ExaminationResults. That is, it has no basis, nothing has been defined as the basis for the relationship, and thus the relationship has no integrity.
On the basis of what I have been able to ascertain thus far, there must be some sort of registration for an exam. Otherwise you would not know that a Candidate is sitting for an exam.
When the Candidate registers, they register for an exam, and that exam is defined (and therefore constrained) by a Category. Otherwise any Candidate can sit for any exam, which I believe, you would like to prevent.
Further, the [four] exam Subjects that they sit for, should be constrained by the Category that they registered for.
You do want to ensure that you do not record an Economics exam result for a Candidate who is registered for Science, correct ?
I have determined that the basis of an exam is the Registration. That is the event, the fact, the recording of which, establishes that a Candidate will sit for an exam.
The identifier virtually jumps out at you, it is CategoryCode plus CandidateID. Voila! we have row uniqueness. Magnifique! we have integrity.
Now the integrity of ExaminationResult can be implemented: it is constrained to the CandidateRegistration::Category and to the Category::Subject.
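The constraints described above might be declared as follows. This is a sketch with Python's sqlite3 module, not the model from the diagram: only CategoryCode and CandidateID are named in the text, so every other table name, column name, code value and row is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Category  (CategoryCode TEXT PRIMARY KEY);
    CREATE TABLE Subject (
        CategoryCode TEXT NOT NULL REFERENCES Category,
        SubjectCode  TEXT NOT NULL,
        PRIMARY KEY (CategoryCode, SubjectCode)
    );
    CREATE TABLE Candidate (CandidateID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- The registration is the basis of the exam: Category plus Candidate.
    CREATE TABLE Registration (
        CategoryCode TEXT    NOT NULL REFERENCES Category,
        CandidateID  INTEGER NOT NULL REFERENCES Candidate,
        PRIMARY KEY (CategoryCode, CandidateID)
    );
    CREATE TABLE ExaminationResult (
        CategoryCode TEXT    NOT NULL,
        CandidateID  INTEGER NOT NULL,
        SubjectCode  TEXT    NOT NULL,
        Score        INTEGER NOT NULL,
        PRIMARY KEY (CategoryCode, CandidateID, SubjectCode),
        FOREIGN KEY (CategoryCode, CandidateID)
            REFERENCES Registration (CategoryCode, CandidateID),
        FOREIGN KEY (CategoryCode, SubjectCode)
            REFERENCES Subject (CategoryCode, SubjectCode)
    );

    INSERT INTO Category  VALUES ('SCI'), ('SOC');
    INSERT INTO Subject   VALUES ('SCI', 'PHY'), ('SOC', 'ECO');
    INSERT INTO Candidate VALUES (1, 'Example Candidate');
    INSERT INTO Registration VALUES ('SCI', 1);
""")

def record_result(category, candidate_id, subject, score):
    """Return True if the result row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO ExaminationResult VALUES (?, ?, ?, ?)",
                         (category, candidate_id, subject, score))
        return True
    except sqlite3.IntegrityError:
        return False

science_ok = record_result("SCI", 1, "PHY", 70)       # registered Category, valid Subject
econ_in_science = record_result("SCI", 1, "ECO", 55)  # Economics is not a Science Subject
unregistered = record_result("SOC", 1, "ECO", 55)     # not registered for Social Science
print(science_ok, econ_in_science, unregistered)
```

With keys like these, the insert for a particular candidate is a single statement, and the database itself refuses an Economics result for a Candidate registered for Science.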
To be Resolved: Do you need to identify the fact of a Candidate registering for an examination (RegistrationDate, AdmissionLetter or whatever) vs the fact that the Candidate sat for the examination (eg. ExaminationDate) ? A sort of roll call.
Right now, I have modelled that as a single fact with no differentiation, and the table is called Examination because you seem to be focussed on that.
Predicate
These days, people seem to be throwing themselves at drawing a diagram, without understanding either the basics of a Relational Database, or of the exercise of modelling data. Predictably, that results in an ill-defined diagram (many relevant details are omitted) [gratefully, your diagram has some definition], and it produces a record filing system with no integrity, no relational power, no speed, instead of a Relational Database with integrity, power, and speed.
One concept that is often missing is Predicates. A competent reader can read a good data model, and ascertain the Predicates, because they are drawn in the model, in the form of notation, but a novice doesn't understand the notation, or the relevance of the various items, and therefore will miss the Predicates. In sum, the Predicates are all the constraints that are placed on the data:
Row Identification:
The basis of its existence, and how it is Identified: Independent (square corners); or Dependent (round corners)
Row Uniqueness: Primary and Alternate Keys (note, IDs are not Keys)
Relationships between rows:
Identifying (solid lines); or Non-identifying (dashed lines)
Meaning, relevance, purpose: the all-important Verb Phrase
Further, a novice cannot determine the Predicates when there is no diagram, or when the diagram is poor, or when they are designing the filing system and drawing the diagram themselves. Thus they do not identify the relevant Predicates in their diagram.
Predicates are very important during the modelling exercise, in that as well as the model expressing the Predicates, the Predicates confirm the accuracy of the model, it is a feedback loop. It is an essential part of the modelling exercise. Since I am executing the modelling task for you, I am working out the Predicates as I perform that task, they are obvious to me. But they may not be obvious to you.
When the data model is published, and ready for discussion with the users, these Predicates are incorporated into it. They come under the heading of Business Rules, they form a part of that, because that is the way the user perceives them. Consequently, during the walkthroughs and discussions, the Predicates (as well as the other stated Business Rules) are either confirmed or denied by the user. They need to be stated explicitly, because unlike the technically educated developer, the user cannot be expected to read all the relevant Predicates from the notation in a good data model.
In this situation, I am the modeller, and you are the "user". Thus I have decided to provide the Predicates for you, explicitly. So that you can confirm or deny them, and thus we can progress the modelling exercise. Once you get used to reading the Predicates from a good data model, you will not need to have them declared explicitly for you. Again, Predicates are very important because they verify (or not) the accuracy of the model. So please read them carefully and comment on any Predicates that you do not completely agree with, or that you do not understand.
Of course, it is not necessary to explicitly declare all the Predicates, there are just too many, we declare just the more relevant ones, that relate to:
(a) rows (tables), the basis for their existence
(b) their identification
(c) all dependencies
(d) relationships, both sides (one side is the Verb Phrase).
Step 4 TRD
I have implemented all the above, as detailed. Please consider this TRD as a discussion platform for the next iteration, and comment. Courier indicates example data, blue indicates Key values, green indicates non-key values:
Response Step 6 to Chat Step 5
All issues discussed have been resolved, and implemented in the model. Sorry, I do not have time right now to post details, this simply identifies the updated models.
Entity-Relation and full Predicates on page 1
All resolved issues have been implemented.
Predicates
Now that it is stable, I am now giving you the second side of the Relation Predicates (child-to-parent). And now that you understand them, I have deleted the repeated, annoying "Each" that is demanded for novices.
Entity-Relation-Key on page 2
Now that the TRD is stable, we are ready to proceed to Determination of Keys
(Second only to Normalisation, Key Determination is a critical part of the modelling exercise. The two tasks are normally performed side-by-side, they are inseparable, I have already determined the keys. In this case, given the limitations of the communication media, I am presenting it as a sequential step).
Here, I use an Extension to the IDEF1X Notation that allows me to concentrate on the components that are relevant to the task; I expect that it is self-explanatory. Only the Key columns are given. Foreign Keys are not Bold (as they are in the DM). All of that is intended to make it easy on the eye.
Most tables have one Key (Primary). Where there are two Keys (Primary and Alternate), the AK is below the line.
This is my recommendation for the Keys, as requested, for your review.
Step 6 TRD and TRK 6

One-to-many relationships in ER diagram

I am trying to show the following in the ER diagram:
There are instructors and courses, a course is taught by only one instructor
whereas an instructor can give many courses.
My question is: is there any difference between the two diagrams? In other words, does it matter which line we turn into an arrow, or does only the direction of the arrow matter?
Also, if we think about the mapping cardinalities; is it 1 to many or many to 1? If we think in terms of courses, then it is many to one but if we think in terms of instructors, then it is one to many. How do we decide this?
Thank you.
In ER diagrams, arrows are not used to denote a relationship. Some instructors use an arrow when they want to work out the cardinality, but that is only an aid to arriving at it (1:1, 1:M or N:M).
I have attached the ER diagram for this in Chen notation and also in Crow's Foot notation; you can use either of them.
Deciding the cardinality of a relationship is a practical matter; there is no hard and fast rule for obtaining it. Start from one side of the relationship, take one tuple (instance), and see how many tuples from the other entity participate in the relationship with it. Then do the same from the other side. You then know the number of tuples from each entity that participate in the relationship. If you think in terms of set theory and functions in mathematics (i.e. the set of Instructors, the set of Courses and the set of Teaches relationships), this is easy; if you do not have a mathematical background, just think of the practical scenario.
For Example
a) For 1 instructor he or she can teach Many (M) courses
b) For 1 Course there is only 1 instructor
So on the instructor side there is always 1 in both a) and b), but on the course side there is M in a) and 1 in b); therefore the Instructor:Course cardinality is 1:M.
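The two-sided check described in a) and b) can be sketched in Python. This is only an illustration: the function name `cardinality` and the sample names are made up for the example, and sample data can only show the cardinality that has been observed, not the business rule that the model is meant to enforce.

```python
from collections import defaultdict

def cardinality(pairs):
    """Classify a binary relationship from sample (left, right) pairs."""
    right_per_left = defaultdict(set)   # e.g. courses seen for each instructor
    left_per_right = defaultdict(set)   # e.g. instructors seen for each course
    for left, right in pairs:
        right_per_left[left].add(right)
        left_per_right[right].add(left)
    # Largest number of partners seen for any single instance on each side.
    max_right = max((len(s) for s in right_per_left.values()), default=0)
    max_left = max((len(s) for s in left_per_right.values()), default=0)
    left_label = "1" if max_left <= 1 else "N"
    right_label = "1" if max_right <= 1 else "M"
    return f"{left_label}:{right_label}"

# Hypothetical sample data: (instructor, course) pairs.
teaches = [("Smith", "Databases"), ("Smith", "Algorithms"), ("Jones", "Networks")]
print(cardinality(teaches))  # 1:M
```

Reading the pairs from the instructor side gives "many courses", reading them from the course side gives "one instructor", which is the same 1:M conclusion as above.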
I don't think the other answer is fully correct.
I would say that one should use arrows, and one should use a notation that gives a meaningful name to each direction of the relationship. In this case it will be "teaches" in one direction, and "is taught by" in the other. Either use arrows next to the names or put the name near to the entity to which it refers. You could use one line (with two arrow heads) or two lines (with one arrow each).
I would also suggest that cardinality is just one kind of constraint, and the notation should reflect that. For example, the two names for the relationship could be "teaches (many)" and "is taught by (exactly one)". The point is you might have "teaches (one or two)" or "is taught by (exactly two)" and so on.
It is better to be explicit and clear about exactly what your constraints really are.
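As a rough illustration of treating cardinality as an explicit constraint, the sketch below checks a minimum and maximum number of links for each instance on one side of the relationship. The helper `check_role`, its parameters and the sample data are assumptions made for this example; they do not come from any particular modelling tool.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

def check_role(instances: Iterable[str], links: Iterable[Tuple[str, str]],
               side: int, minimum: int, maximum: Optional[int]) -> List[str]:
    """Return the instances whose link count falls outside [minimum, maximum].

    side is 0 for the first element of each link, 1 for the second.
    maximum=None means "no upper bound".
    """
    counts = Counter(link[side] for link in links)
    bad = []
    for item in instances:
        n = counts[item]  # Counter returns 0 for instances with no links
        if n < minimum or (maximum is not None and n > maximum):
            bad.append(item)
    return bad

# Hypothetical sample data.
instructors = ["Smith", "Jones"]
courses = ["Databases", "Algorithms", "Networks"]
teaches = [("Smith", "Databases"), ("Smith", "Algorithms"), ("Jones", "Networks")]

# "is taught by (exactly one)": no course violates the constraint.
print(check_role(courses, teaches, side=1, minimum=1, maximum=1))
# "teaches (one or two)": no instructor violates the constraint.
print(check_role(instructors, teaches, side=0, minimum=1, maximum=2))
# "is taught by (exactly two)": every course violates it with this data.
print(check_role(courses, teaches, side=1, minimum=2, maximum=2))
```

Stating the bounds as numbers makes it obvious that "one" and "many" are just two of the possible constraints.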
The two diagrams have exactly opposite cardinalities.
🔸Simple clean line means many.
🔸Arrow means one.
If we consider both diagrams to have the same cardinality, then many-to-many would have to be represented, following the second convention, as (please assume a diamond for the relationship set and rectangles for the entity sets)
INSTRUCTOR <---- TEACHES -----> COURSE
which is actually meaningless.
If we consider the two diagrams to have opposite cardinalities, then many-to-many would be represented, following the second convention, as (please assume a diamond for the relationship set and rectangles for the entity sets)
INSTRUCTOR ----- TEACHES ------ COURSE
A line with no explicit arrow is always read as many-to-many, so this is correct (but only if we consider the two diagrams to have opposite cardinalities).
Consider an 'employee' entity set and 'department' entity set, having relationship set as 'manage'.
Employee-------------Manage--------------------Department
(entity set) (Relationship set) (entity set)
A one-to-many relationship means that one entity of the Employee set can be associated with more than one entity of the Department set, but an entity of the Department set can be associated with at most one entity of the Employee set.
That means that if there is a one-to-many relationship between the Employee and Department entity sets, each employee can manage more than one department, and at the same time each department is managed by at most one employee.
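One common way to implement such a one-to-many relationship in tables is to place a foreign key on the "many" side. The sketch below uses Python's built-in sqlite3 module; the table names, column names and sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- A single nullable column: each department has at most one manager.
    manager_id    INTEGER REFERENCES employee(employee_id)
);
""")

conn.execute("INSERT INTO employee VALUES (1, 'Avery'), (2, 'Blake')")
conn.execute(
    "INSERT INTO department VALUES "
    "(10, 'Sales', 1), (20, 'Support', 1), (30, 'Research', NULL)"
)

# One employee may manage several departments (or none).
rows = conn.execute("""
    SELECT e.name, COUNT(d.department_id)
    FROM employee e
    LEFT JOIN department d ON d.manager_id = e.employee_id
    GROUP BY e.employee_id
    ORDER BY e.employee_id
""").fetchall()
print(rows)  # [('Avery', 2), ('Blake', 0)]

# A department cannot point at an employee that does not exist.
try:
    conn.execute("INSERT INTO department VALUES (40, 'Legal', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

Because `manager_id` is a single column in `department`, a department can never have two managers, while nothing stops the same employee from appearing in many department rows.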

Resources