Here is the background:
Bottles are acquired from the suppliers by placing orders. Some bottle types may be ordered from more than
one supplier. Each order involves only a single supplier but may include more than one bottle type. Usually
orders are filled completely by the suppliers, but occasionally an order must be filled with multiple shipments,
due to a back-order condition at the supplier. WWWC maintains careful records of what quantities are
ordered and what quantities are received, as well as when the bottles are ordered and when they are
received, and the actual price charged for the bottles.
The conceptual model of bottle is: Bottle{ID, Capacity, Shape, Material, Color, Cost, Quantity}
The conceptual model of Supplier is: Supplier{ID, Name, Phone#, Address, Contact_Name}.
Till now, I know that the relationship between Bottles and Suppliers are many-to-many.
Here is the photo of the E-R relationship, just omit attributes(you can get them from the conceptual model above).
As far as I know, converting from ER relationship to logical diagram under the many-to-many relationship, I need to create another table to represent the relationship.
So I create another table called: Purchase, which contains follow attributes: SID(Supplier ID), BID(Bottle ID), Ordered_Quantity, Received_Quantity, When_Ordered, When_Received.
My Question is: How to use lines to connect those three tables to establish the relationship?
I think you mean something like in the image below. Crow's foot notation is used here but more notations are possible such as idef1x. Most ER modeling tools support multiple of these notation techniques. This example is made with Dezign
Related
This is an problem about drawing ERD in one of my course:
A local startup is contemplating launching Jungle, a new one stop
online eCommerce site.
As they have very little experience designing and implementing
databases, they have asked you to help them design a database for
tracking their operations.
Jungle will sell a range of products, and they will need to track
information such as the name and price for each. In order to sell as
many products as possible, Jungle would like to display short reviews
alongside item listings. To conserve space, Jungle will only keep
track of the three most recent reviews for each product. Of course, if
an item is new (or just unpopular), it may have less than three
reviews stored.
Each time a customer buys something on Jungle, their details will be
stored for future access. Details collected by Jungle include
customer’s names, addresses, and phone numbers. Should a customer buy
multiple items on Jungle, their details can then be reused in future
transactions.
For maximum convenience, Jungle would also like to record credit card
information for its users. Details stored include the account and BSB
numbers. When a customer buys something on Jungle, the credit card
used is then linked to the transaction. Each customer may be linked to
one or more credit cards. However, as some users do not wish to have
their credit card details recorded, a customer may also be linked to
no credit cards. For such transactions, only the customer and product
will be recorded.
And this is the solution:
The problem is the Buys action connect with 3 others entities: Product, Customer, and Card. I find this very hard to read and understand.
Is an action involving more than 2 entities common in production? If it is, how should I understand and use it? Or if it's not, what is the better way of design for this problem?
While the bulk of relationships in practice are binary relationships, ternary and higher relationships are normal elements of the entity-relationship model. Some examples are supplies (supplier_id, product_id, region_id) or enrolled (student_id, course_id, semester_id). However, they often get converted into entity sets via the introduction of a surrogate identifier, due to dislike of composite keys or confusion with network data models in which only directed binary relationships are supported.
Reading cardinality indicators on non-binary relationships are a common source of confusion. See my answer to designing relationship between vehicle,customer and workshop in erd diagram for more info on how I handle this.
Your solution has some problems. First, Buys is indicated as an associative entity, but is used like a ternary relationship with an optional role. Neither is correct in my opinion. See my answer to When to use Associative entities? for an explanation of associative entities in the ER model.
Modeling a purchase transaction as a relationship is usually a mistake, since relationships are identified by the (keys of the) entities they relate. If (CustomerID, ProductID) is identifying, then a customer can buy a product only once, and only one product per transaction. Adding a date/time into the relationship's key is better, but still problematic. Adding a surrogate identifier and turning it into a regular entity set is almost certainly the best course of action.
Second, the Crow's foot cardinality indicators are unclear. It looks like customers and products are optional in the Buys relationship, or even as if multiple customers could be involved in the same transaction. There are three different concepts involved here - optionality, participation and cardinality - which should preferably be indicated in different ways. See my answer to is optionality (mandatory, optional) and participation (total, partial) are same? for more on the topic.
A card is optional for a purchase transaction. From the description, it sounds as if cards may participate totally, meaning we won't store information about a card unless it's used in a transaction. Furthermore, only a single card can be related to each transaction.
A customer is required for a purchase transaction, and it sounds like customers may participate totally, meaning we won't store information about customers unless they purchase something. Only a single customer can be related to each transaction.
Products are required for a purchase transaction, and since we'll offer products before they're bought, products will participate partially in transactions. However, multiple products can be related to each transaction.
I would represent transactions for this problem with something like the following structure:
I'm not saying converting a ternary or higher relationship into an entity set is always the right thing to do, but in this case it is.
Physically, that would require two tables to represent (not counting Customer, Product, Card or ProductReview) since we can denormalize TransactionCustomer and TransactionCard into Transaction, but TransactionProduct is a many-to-many relationship and requires its own table (as do ternary and higher relationships).
Transaction (TransactionID PK, TransactionDateTime, CustomerID, CardID nullable)
TransactionProduct (TransactionID PK, ProductID PK, Quantity, Price)
I have been studying datawarehouse in the last couple days, particularly, i have been reading The Data Wharehouse Toolkit - The Definitive Guide to Dimensional Modeling by Kimball and Ross.
Uppon that reading, i came to the 1st exapmle where there is a sales fact and it related to a product dimension, as you can see in the bellow image:
I think i can grasp the gist of how this relationship allows us to rotate the "cube" slicing and dicing data, however this is where i get lost:
In this example and many others on the web product is a one-to-one relationship with sales, which is fine i guess for most cases. But this generates a sales registtry for at least each kind of product that was in one sale.
So supposing i bought 1 banana, 2 apples and 1 orange, this would yield at least 3 sales registry. Again, which is fine i guess as it is storing the sale's ticket ID in the sales fact, we still can relate all itens in a given sale.
However if this was an use case: relate products on sales say i want to get every sale that had a banana and get stuff like: how many items each of these sales had, their price cost, their profit, stuff like that...
Wouldn't be better if the fact-product relation were Fact-one_to_many-Product relationship? Where fact would hold the sale's ticket ID and products would have its foreign key referencing where they are from or something?
I reckon these metrics should be in the fact table, and not in the product table as i think i would want. So, is this me not fighting my urge to normalize it or does it make sense in the way i would want to do that kind of filtering -> [given all sales with X product, get data from other products in the same sale].
If i were to follow the guidelines, product dimension would have one registry for every exclusive kind of product the store would have correct? And all the measurements i want i would store it on the fact itself, like price cost, sales price, profit, etc...
On the other hand, if i were to one-to-many product dimension would have many copies of each product. Which is bad, i think. However, i think it would give me better queries in that regard.
As you can see, i'm a beginer and really in the early stages of this path, so if you would endulge me in a Explain Like I'm Five kind of answer I would appreciate.
EDITED:
Sorry #Nick.McDermaid, you are right. I meant from the perspective of the sales fact where for every sale fact i will have only one product, but are correct that for one product it can have N sales related. And so, we have one record of product in the database for every different product on our store. This is the right way to do it, how to rightfully model it. Also, the many indicator is the "sales quantity" i'm guessing.
Anyhow, while this allows for slicing and dicing when/if we have sales as the point of view, but what if i want to for example:
Get all sales that had a banana in it, with all the other items in those sales. We can still do it with this structure but its harder than if the products were repeated and we had the sale id as a foreign key in the product table.
Cuz ultimetly i want to get all the sales(and products within that sale) that had a banana. And then take metrics out of them.
What you are somewhat hinting at would be a degenerate dimension, consisting of the sales id/invoice #/purchase order # of the transaction that took place. The whole purpose of a degenerate dimension is to group items that are related by a meaningless piece of data. For example, a PO # of A1234 is meaningless on its own, it doesn't tell you anything about the purchase. However, it can be used to identify other meaningful data, such as the date of purchase of the products for the customer. In that context, the PO # is defined by the collection of the entities it brings together to describe an event.
Another critical concept in data-warehousing is the abstraction of the schema in the database from the model in the cube. You don't join and group data in a cube model. You slice and filter. There are no foreign keys in a cube model. Those are used in the underlying data schema, but all of that work is handled behind the scenes of the cube model.
So I'm making an E/R diagram based on drugs. It states that each drug is produced by a given pharmaceutical company and the trade name of the drug is identified among the products of the given pharmaceutical company. So here's the E/R diagram I drew up:
Now the biggest question I have about this is, are these relationships supposed to be one to many or many to many? Each one relationship is represented by an arrow (where the pointed arrow means at most one and the rounded arrow means exactly one). I first assumed that a single drug identified by a single trade name would come from just one pharmaceutical company but would it be possible for a single drug to come from multiple pharmaceutical company's? I'm also not sure if it's supposed to be a 3 way relationship or not.
Not sure if this is really a technical question you can find the answer to here. It would probably be wise to further clarify with your client, but from pure wording I would assume.
1.) 1 Drug - 1 Trade Name - 1 Company
2.) 1 Company has Many Drugs
From general knowledge of US drugs, different companies have their unique versions of drugs with the same active ingredient, but these are all filed under different trade names, maintaining 1 trade name - 1 company relationship.
For example, ibuprofen (generic) is sold under both Advil and Motrin (separate trade names).
In this style of ER diagram, Chen's original, the diamond denotes a ternary
"relationship" type, aka association type, among/on the three participant "entity" types symbolized by the boxes. As in an application relationship/association, as in "Entity-Relationship Model". The lines showing participations correspond to FKs (foreign keys).
In such a diagram each line gets labeled by the number or range giving the number of entities in each entity set which is allowed in a relationship set. The table for the relationship would have a FK for each line. Per Chen it would be described as (in order company-name-drug) (at-most-1)-to-(exactly-1)-to-N relationship (assuming the unlabeled line means any number). There is a style with a cardinality at each end of a line.
Misunderstandings/misrepresentations/misappropriations of Chen style by older & newer methods & products (although quite mainstream) lead to different so-called ER diagrams.
One such style only shows entity type boxes with relationships shown by connecting lines labeled by relationship names. The 1:many relationships can be implemented by a FK attribute in one of the entity type tables, although they needn't be, and although that's contrary to Chen ER modeling, which would use a table. Typically, for n-ary relationships for n>2, instead of just having three line segments connect at a point the point is replaced by a box for what in Chen is an "associative entity" type. The lines would then be participations/FKs under Chen. All lines now represent 1:many relationships. Other so-called ER diagrams just have boxes for tables and lines for FKs and don't even have relationships on entities in the Chen sense. The use of lines that only ever denote 1:many relationships and/or FKs lead to lines and FKs being (wrongly but ubiquitously) called "relationships". (Which seems to be how you understand the word.)
The wikipedia entry on E-R modeling (and E-R diagrams) is currently reasonable.
Having trouble figuring out relations in this scenario:
I want to create a checkbox list for income types. The UI will present as "What types of income do you receive?". The choices, to keep things simple, could be full-time, part-time and retirement.
Part of me thinks this is a one-to-many relation, and thereby won't necessitate an association table because one individual can have one or more income types. However, taking things literally, "full-time" employment can relate to many individuals. In this case, I won't be showing a summary table of how many of the individuals are "full-time", I am just dealing with one person and determining what their employment status is.
But I don't think of "full-time" as an entity, like, for example, actors and movies - where many actors can be in many movies and many movies can have many different actors.
I guess what's tripping me up is that a user can select more than one option, as opposed to a radio-button list or drop down list.
In this case, which is it?
many-to-many: Person to Employment Type.
Many Persons may share a single Employment Type.
A single Person may have several Employment Types.
Having said that, I've no idea how rich is your business model, but I'd attach Employment Type to an entity called Employment that would refer Employment Type by a many-to-one association (rather than referring it straight from Person).
From my point of view this is a many-to-many relationship.
Full-Time is an entity (suppose a INCOME_TYPES table), exactly like an actor or a movie.
Since you tell us, you won't showing the things income-type-side but only individual-side, there are two alternatives:
De-normalize your schema and put 3-fields in the INDIVIDUALS table. This is not very nice.
If you do some of the things code-side, you can use a bitmask.
for example, 1 is for Full-time, 2 is for Part-Time and 4 is for retirement.
It depends on whether you have the income type as a separate table or whether it is just a string.
For separate table it is many-to-many: Each person has multiple income types. Each income type has multiple persons.
There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation