I have an entity called Event, and an event can be of one or several types.
My question is how to model this relationship, because an event can have more than one type. It can't be an attribute or an inheritance relationship. It could be a weak/strong entity pair, since events do not exist without the body, but I am not sure.
Example:
An event is both a workshop and a conference.
Thanks for your help.
A rule of thumb to keep in mind when designing a database: each entity gets its own table. This is the basis of good relational database design.
If you can use neither a flag (attribute) nor inheritance, then you would have to artificially keep two entities, one for workshop and another for conference, keep an FK for each one, and possibly add a flag or trigger to ensure you are not using both FKs at the same time.
OR
You could use some intermediary entity, but I understand this would be some sort of inheritance, because this proxy entity would act like a "super" entity for both workshop and conference (like an "event"...)
The first option is bad in terms of maintenance; I would not recommend it.
The second option is IMO more "intuitive".
If both entities are very close to one another and will be linked to several other entities in the same way, I think you could just use a flag to differentiate them, with the risk that, if anything changes for one of them in the future, you will have to refactor your schema, which is usually painful and risky. It's a premature optimization, and as we know, premature optimization is the root of all evil. So keeping them as separate entities may be a good option.
If all possible Types that can be associated with an Entity are defined in a list (e.g. within a single table), then you can use the standard many-to-many relationship pattern:
TABLE EVENTS
EventId, primary key
TABLE TYPES
TypeId, primary key
TABLE EVENTTYPES
EventId, foreign key to Events
TypeId, foreign key to Types
...with the primary key on {EventId, TypeId}
This pattern was useful: multivalued attributes.
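As a minimal sketch of the three tables above, assuming SQLite through Python's standard sqlite3 module: the snake_case table and column names, the `name` columns, and the helper functions are illustrative assumptions, not part of the original answer.

```python
import sqlite3


def build_event_type_db():
    """Create an in-memory database with the many-to-many pattern."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE events (
            event_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL          -- assumed column
        );
        CREATE TABLE types (
            type_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE    -- assumed column
        );
        -- Junction table: one row per (event, type) pair.
        CREATE TABLE event_types (
            event_id INTEGER NOT NULL REFERENCES events(event_id),
            type_id  INTEGER NOT NULL REFERENCES types(type_id),
            PRIMARY KEY (event_id, type_id)
        );
    """)
    return conn


def types_of_event(conn, event_id):
    """Return the type names attached to one event, sorted by name."""
    rows = conn.execute(
        "SELECT t.name FROM event_types et "
        "JOIN types t ON t.type_id = et.type_id "
        "WHERE et.event_id = ? ORDER BY t.name",
        (event_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_event_type_db()
conn.execute("INSERT INTO events VALUES (1, 'Data Modeling Day')")
conn.executemany("INSERT INTO types VALUES (?, ?)",
                 [(1, "workshop"), (2, "conference")])
# The same event is linked to both types.
conn.executemany("INSERT INTO event_types VALUES (?, ?)", [(1, 1), (1, 2)])
print(types_of_event(conn, 1))
```

The composite primary key keeps the same type from being attached to the same event twice, while still allowing any number of distinct types per event.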
http://www.tomjewett.com/dbdesign/dbdesign.php?page=hobbies.php
I think we agree that there is a correspondence between composition and cascading delete on one side, and aggregation and nullify-on-delete on the other, when we delete the whole instance in a whole/part relationship.
But what if there is no whole / part relationship between two classes:
I understand that we can only use composition and aggregation in cases where a whole/part hierarchy occurs (Car - Wheels, Apartment - Rooms) and not in cases where this hierarchy does not occur (e.g. Car - Driver classes).
So, how should we represent in UML this situation where there are deletion consequences in the database (nullify or cascading) but no "whole / part" relation?
Do we agree on the initial assumption?
The UML literature frequently refers to part-whole relationships regarding aggregation/composition. However, the definitions in the UML standard have evolved (see UML 2.5.1):
Sometimes a Property is used to model circumstances in which one instance is used to group together a set of instances; this is called aggregation. (...)
Shared: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.
Composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects.
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
In other words, there is no precise semantic specified for the "aggregation" (i.e. shared aggregation) that would make a difference from a simple association: shared aggregation is a modeling placebo.
The relationship between database constraints and UML modeling is therefore not as straightforward as you would assume.
Close match?
Moreover, there is no general one-to-one mapping between a database schema and a UML model. More than one database schema could be used to implement the same UML class diagram. Conversely, more than one UML diagram may represent the design that is implemented by a given database schema. So the best we can do here is to consider close matches.
In your database, the table with the FOREIGN KEY constraint would correspond to a potential component in a composition, or an element of a shared aggregation, or an associated instance in a simple association:
An ON DELETE CASCADE could help to implement a composite aggregation: it's the only way in SQL to implement the kind of lifecycle management that you would expect in a composition, where the components are deleted when the composite is. It could just as well implement an ordinary association, if some business rules/contracts (e.g. UML postconditions) require such a related deletion.
An ON DELETE SET NULL could help to implement a shared aggregation, if its semantics are defined as you mean: if the aggregate is deleted, its elements are not deleted, and can therefore be shared. But it could just as well implement any ordinary association, since the deletion of an associated instance would not trigger a deletion either, and the constraint would keep referential integrity clean.
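The two delete rules can be compared side by side in a small sketch. It assumes SQLite via Python's sqlite3 module and borrows the Car - Wheels and Car - Driver pairs from the question; the table and column names are hypothetical.

```python
import sqlite3


def delete_car_demo():
    """Delete a car and report what happens to its wheels and its driver."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE car (car_id INTEGER PRIMARY KEY);

        -- Close match for a composition: wheels are deleted with the car.
        CREATE TABLE wheel (
            wheel_id INTEGER PRIMARY KEY,
            car_id   INTEGER NOT NULL
                     REFERENCES car(car_id) ON DELETE CASCADE
        );

        -- Close match for a shared aggregation or plain association:
        -- the driver row survives, only the link is cleared.
        CREATE TABLE driver (
            driver_id INTEGER PRIMARY KEY,
            car_id    INTEGER
                      REFERENCES car(car_id) ON DELETE SET NULL
        );

        INSERT INTO car VALUES (1);
        INSERT INTO wheel VALUES (10, 1), (11, 1);
        INSERT INTO driver VALUES (20, 1);
        DELETE FROM car WHERE car_id = 1;
    """)
    wheels_left = conn.execute("SELECT COUNT(*) FROM wheel").fetchone()[0]
    driver_car = conn.execute(
        "SELECT car_id FROM driver WHERE driver_id = 20"
    ).fetchone()[0]
    return wheels_left, driver_car


print(delete_car_demo())  # (0, None)
```

Note that SET NULL requires the referencing column to be nullable, which is one reason it cannot be used when the multiplicity on the referenced end is exactly 1.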
I agree that composition means cascading delete, because according to UML the whole is responsible for the existence of the parts. A normal association means you can delete any object without affecting any other objects that might have a link to it. UML doesn't define semantics for aggregation, so it will behave in the same way as an association. But even if we take into account domain-specific semantics for aggregation, I don't think there are examples where this is changed.
However, if you have an association with a multiplicity of 1 on one end, you cannot delete the object on this end, because the objects that have been linked to it would be invalid afterwards. This has nothing to do with composition or aggregation.
So, the remaining question is how to express cascading delete if there is no whole-part relationship. Are there really examples where this happens? I don't see why Car - Driver could not be a whole-part relationship. Please bear in mind that we are not talking about real cars or real people. We are talking about a software system in which we want to represent knowledge about the real world for a specific purpose. And if the purpose is to issue boarding cards for cars and their drivers on a ferry, it makes perfect sense to view them as a composition.
Problem description
I am currently working on a project which requires a relational database for storage.
After thinking about the data and its relations for a while I ran into a quite repetitive problem:
I encountered a common data schema for entity A which contains some fields e.g. name, description, value. This entity is connected with entity B in multiple n-1 relations. So entity B has n entities A in relation rel1 and n entities A in relation rel2.
Now I am trying to break down this datamodel into a schema for a relational database (e.g. Postgres, MySQL).
After some research, I have not really found "the best" solution for this particular problem.
Some similar questions I have found so far:
Stackoverflow
DBA Stackexchange
My ideas
So I have thought about possible solutions which I am going to present here:
1. Duplicate table
The relationship from entity B to entity A has a certain meaning to it. So it is possible to create multiple tables (1 per relationship). This would solve all immediate problems but essentially duplicate the tables which means that changes now have to be reflected to multiple tables (e.g. a new column).
2. Introduce a type column
Instead of multiple relationships, I could just say "Entity B is connected with n entity A". Additionally, I would add a type column that then tells me to which relation entity A belongs. I am not exactly sure how this is represented with common ORMs like Spring-Hibernate and if this introduces additional problems that I am currently unaware of.
3. Abstract the common attributes of entity A
Another option is to create an ADetails entity, which bundles all attributes of entity A.
Then I would create two entities that represent each relationship and which are connected to the ADetails entity in a 1-to-1 relationship. This would solve the interpretation problem of the foreign key but might be too much overhead.
My Question
In the context of a medium-large-sized project, are any of these solutions viable?
Are there certain Cons that rule out one particular approach?
Are there other (better) options I haven't thought about?
I appreciate any help on this matter.
Edit 1 - PPR (Person-Party-Role)
Thanks for the suggestion from AntC. PPR Description
I think the described situation matches my problem.
Let's break it down:
Entity B is an event. There exists only one event for the given participants to make this easier. So the relationship from event to participant is 1-n.
Entity A can be described as Groups, People, Organization but given my situation they all have the same attributes. Hence, splitting them up into separate tables felt like the wrong idea.
To explain the situation with the class diagram:
An Event (Entity B) has a collection of n Groups (Entity A), n People (Entity A) and n Organizations (Entity A).
If I understand correctly the suggestion is the following:
In my case the relationship between Event and Participant is 1-n
The RefRoles table represents the ParticipantType column that describes which relationship the Participant belongs to (for example, is it a customer or part of the service for the event?)
Because all my Groups, People and Organizations have the same attributes the only table required at this point is the Participant table
If there are individual attributes in the future I would introduce a new table (e.g. People) that references the Participant in a 1-1 relationship.
If multiple such tables are added, the foreign keys of the 1-1 relationships are mutually exclusive (so there can only be one Group/Person/Organization for a participant)
Solution suggested by AntC and Christian Beikov
Splitting up the tables does make sense while keeping the common attributes in one table.
At the moment there are no individual attributes but the type column is not required anymore because the foreign keys can be used to see which relationship the entity belongs to.
I have created a small example for this:
There exist 3 types (previously type column) of people for an event: Staff, VIP, Visitor
The common attributes are mapped in a 1-1-relationship to the person table.
To make it simple: Each Person (Staff, VIP, Visitor) can only participate in one event. (Would be n-m-relationship in a more advanced example)
The database schema would be the following:
This approach is better than the type column in my opinion.
It also avoids having to interpret the entity based on its type in the application later on. It is also possible to resolve a type column in an ORM (see this question), but this approach avoids the struggle if the ORM you are using does not support resolving it.
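A schema along the lines described above might look like the following sketch. It assumes SQLite via Python's sqlite3 module; every table name, column name, and helper here is an illustrative guess based on the description (Staff, VIP, Visitor, each 1-1 with Person and n-1 with Event), not the exact schema from the post.

```python
import sqlite3

# Hypothetical role tables, one per former value of the type column.
ROLE_TABLES = ("staff", "vip", "visitor")


def build_participant_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce FKs in SQLite
    conn.executescript("""
        CREATE TABLE event (
            event_id INTEGER PRIMARY KEY,
            title    TEXT NOT NULL
        );
        -- Common attributes live in one place.
        CREATE TABLE person (
            person_id   INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            description TEXT
        );
    """)
    for table in ROLE_TABLES:
        # person_id as primary key gives the 1-1 link to person;
        # event_id gives the n-1 link to event.
        conn.execute(f"""
            CREATE TABLE {table} (
                person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
                event_id  INTEGER NOT NULL REFERENCES event(event_id)
            )
        """)
    return conn


def roles_of(conn, person_id):
    """Infer a person's role(s) from which role table holds a row."""
    return [
        table for table in ROLE_TABLES
        if conn.execute(
            f"SELECT 1 FROM {table} WHERE person_id = ?", (person_id,)
        ).fetchone()
    ]


conn = build_participant_db()
conn.execute("INSERT INTO event VALUES (1, 'Launch party')")
conn.execute("INSERT INTO person VALUES (1, 'Alex', NULL)")
conn.execute("INSERT INTO vip VALUES (1, 1)")
print(roles_of(conn, 1))  # ['vip']
```

One caveat with this sketch: nothing in it stops the same person from appearing in two role tables at once. If roles must be mutually exclusive, that needs an extra constraint or a trigger.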
IMO since you already use dedicated terms for these objects, they probably will diverge and splitting up a table afterwards is quite some work, also on the code side, so I would suggest you map dedicated entities/tables from the beginning.
I noticed in one of my exercises that an Order (Attributes OrderID, description...etc) requires a buyer, seller, and an account number which are stored as other entities. I'm wondering why Order is stored as an entity rather than an associative entity.
The idea of an associative entity is that it is something we normally wouldn’t identify as an entity but which we need in order to link things together, see https://en.m.wikipedia.org/wiki/Associative_entity
Here the Order is a business entity in itself so the term wouldn’t apply. Entities can be artifacts of business processes as well as concrete things.
In all implementations of SQL (that I know of), there isn't a separate type of entity for "Associative Entities". They are simply regular entities with Foreign Key constraints.
There's quite a bit of information that can be "lost" in translation when going from something like an ER Diagram to actual DB schemas, but all you can do is try your best to reconcile the two.
Edit
Terminology changes too: Entities become Tables, and Attributes become Columns
Say I have two or more vastly different objects that are each represented by a table in the DB. Call these Article, Book, and so on. Now say I want to add a commenting feature to each of these objects. The comments will behave exactly the same for each object, so ideally I would like to represent them in one table.
However, I don't know a good way to do this. The ways I know how to do this are:
Create a comment table per object. So have Article_comments, Book_comments, and so on. Each will have a foreign key column to the appropriate object.
Create one global comment table. Have a comment_type that references "Book" or "Article". Have a foreign key column per object that is nullable, and use the comment_type to determine which foreign key to use.
Either of the above ways will require a model/db update every time a new object is added. Is there a better way?
There is one other strategy: inherit¹ different kinds of "commentable" objects from one common table, then connect comments to that table.
All 3 strategies are valid and have their pros and cons:
Separate comment tables are clean but require repetition in DML and possibly client code. Also, it's impossible to enforce a common key on them, unless you employ some form of inheritance, which begs the question: why not go straight for (3) in the first place?
One comment table with multiple FKs will have a lot of NULLs (which may or may not be a problem storage and cache-wise) and requires adding a new column to the comments table whenever a new kind of "commentable" object is added to the database. BTW, you don't necessarily need the comment_type - it can be inferred from what field is non-NULL.
Inheritance is not directly supported by current relational DBMSes, which brings its own set of engineering tradeoffs. On the up side, it could enable easy addition of new kinds of commentable objects without changing the rest of the model.
¹ Aka category, subclassing, generalization hierarchy... For more on inheritance, take a look at the "Subtype Relationships" section of the ERwin Methods Guide.
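The third strategy can be sketched as follows, assuming SQLite via Python's sqlite3 module. The `commentable` supertype table, the column names, and the helper functions are hypothetical names chosen for illustration.

```python
import sqlite3


def build_comment_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce FKs in SQLite
    conn.executescript("""
        -- Supertype: one row per object that can receive comments.
        CREATE TABLE commentable (commentable_id INTEGER PRIMARY KEY);

        -- Subtypes reuse the supertype key as their own primary key.
        CREATE TABLE article (
            article_id INTEGER PRIMARY KEY
                       REFERENCES commentable(commentable_id),
            title      TEXT NOT NULL
        );
        CREATE TABLE book (
            book_id INTEGER PRIMARY KEY
                    REFERENCES commentable(commentable_id),
            title   TEXT NOT NULL
        );

        -- A single comment table pointing at the supertype only.
        CREATE TABLE comment (
            comment_id     INTEGER PRIMARY KEY,
            commentable_id INTEGER NOT NULL
                           REFERENCES commentable(commentable_id),
            body           TEXT NOT NULL
        );
    """)
    return conn


def add_commentable(conn, table, title):
    """Insert the supertype row first, then the subtype row with the same id."""
    if table not in ("article", "book"):
        raise ValueError(f"unknown subtype table: {table}")
    new_id = conn.execute("INSERT INTO commentable DEFAULT VALUES").lastrowid
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (new_id, title))
    return new_id


def add_comment(conn, commentable_id, body):
    conn.execute(
        "INSERT INTO comment (commentable_id, body) VALUES (?, ?)",
        (commentable_id, body),
    )


conn = build_comment_db()
article_id = add_commentable(conn, "article", "Schema design notes")
book_id = add_commentable(conn, "book", "A database primer")
add_comment(conn, article_id, "Nice article")
add_comment(conn, book_id, "Nice book")
```

Adding a new kind of commentable object then means adding one subtype table; the comment table itself stays unchanged.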
I personally think your first option is best, but I'll throw this option in for style points:
Comments have a natural structure to them. You have a first comment, maybe comments about a comment. It's a tree of comments really.
What if you added one field to each object that points to the root of the comment tree? Then you can say, "Retrieve the comment tree for article 123", take the root, and construct the tree from the one comment table.
Note: I still like option 1 best. =)
This is probably a simple problem for an experienced database developer, but I'm struggling... I have trouble translating a certain ER diagram to a DB model, any help is appreciated.
I have a setup similar to slide 17 of this presentation:
http://www.cbe.wwu.edu/misclasses/mis421s04/presentations/supersubtype.ppt
Slide 17 shows an ER diagram with an Employee supertype having an Employee Type attribute and as subtypes the Employee Types themselves (Hourly, Salaried and Consultant), which is very similar to my design situation.
In my case, suppose Salaried Employees are the only ones who can be bosses of other employees, and I want to indicate whether a certain Salaried Employee is the boss of an Hourly Employee, a Salaried Employee and/or a Consultant (any, none or all of them). How could that be designed in a database model, considering that these are one-to-many relationships?
I can put a PK-FK relationship between them, which would result in all tables having two FKs (like Consultant having FK_Employee and FK_SalariedEmployee) and SalariedEmployee referencing itself, but I keep thinking that might not be the wisest solution, although I'm not sure why (integrity issues?).
Is this an acceptable solution, or is there a better one?
Thanks in advance for any help!
Your case looks like an instance of the design pattern known as “Generalization Specialization” (Gen-Spec for short). The gen-spec pattern is familiar to object oriented programmers. It’s covered in tutorials when teaching about inheritance and subclasses.
The design of SQL tables that implement the gen-spec pattern can be a little tricky. Database design tutorials often gloss over this topic. But it comes up again and again in practice.
If you search the web on “generalization specialization relational modeling” you’ll find several useful articles that teach you how to do this. You’ll also be pointed to several times this topic has come up before in this forum.
The articles generally show you how to design a single table to capture all the generalized data and one specialized table for each subclass that will contain all the data specific to that subclass. The interesting part involves the primary key for the subclass tables. You won’t use the autonumber feature of the DBMS to populate the sub class primary key. Instead, you’ll program the application to propagate the primary key value obtained for the generalized table to the appropriate subclass table.
This creates a two way association between the generalized data and the specialized data. A simple view for each specialized subclass will collect generalized and specialized data together. It’s easy once you get the hang of it, and it performs fairly well.
In your specific case, declaring the "boss of" FK to reference the PK in the Salaried Employees table will be enough to do the trick. This will produce the two way association you want, and also prevent employees who are not salaried from being referenced as bosses.
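A minimal sketch of this layout, assuming SQLite via Python's sqlite3 module. It places the "boss of" FK on the generalized employee table, which is one possible reading of the advice above; the table names, column names, and the `hire` helper are hypothetical.

```python
import sqlite3

# Hypothetical subtype tables, one per employee type.
SUBTYPE_TABLES = ("salaried_employee", "hourly_employee", "consultant")


def build_employee_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce FKs in SQLite
    conn.executescript("""
        -- Generalized table. The "boss of" FK points at the salaried
        -- subtype table, so only salaried employees can be bosses.
        CREATE TABLE employee (
            employee_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            boss_id     INTEGER REFERENCES salaried_employee(employee_id)
        );
        -- Specialized tables share the key of the generalized table.
        CREATE TABLE salaried_employee (
            employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id)
        );
        CREATE TABLE hourly_employee (
            employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id)
        );
        CREATE TABLE consultant (
            employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id)
        );
    """)
    return conn


def hire(conn, name, subtype, boss_id=None):
    """Insert the generalized row, then propagate its key to the subtype."""
    if subtype not in SUBTYPE_TABLES:
        raise ValueError(f"unknown subtype table: {subtype}")
    employee_id = conn.execute(
        "INSERT INTO employee (name, boss_id) VALUES (?, ?)",
        (name, boss_id),
    ).lastrowid
    conn.execute(f"INSERT INTO {subtype} (employee_id) VALUES (?)",
                 (employee_id,))
    return employee_id


conn = build_employee_db()
ann = hire(conn, "Ann", "salaried_employee")
bob = hire(conn, "Bob", "hourly_employee", boss_id=ann)
try:
    # Bob is hourly, so he cannot be referenced as a boss.
    hire(conn, "Cy", "consultant", boss_id=bob)
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Here the generalized key is generated once and then reused in the subtype table, as described above, rather than letting each subtype table generate its own key.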