I created a class diagram for a system and now I have to model it into a real system. This means converting it to a database.
Now there is a base class which has just a few attributes, but there are many classes that inherit from it. Now my checklist for converting says I have to create a table for every class.
I don't know how to handle the inheritance. I can see that associations are done with PKs and FKs, but what about subclasses?
Is there some article which handles that or is there someone who can explain it to me?
You have three alternatives to translate class hierarchies into relational tables:
- Create a table for the superclass only: all attributes and associations of the subclasses are moved to the superclass table, where they may take a NULL value.
- Create tables for the subclasses only: all attributes and associations of the superclass are repeated in each subclass table.
- Create tables for both the superclass and each of the subclasses. In this case, the PK of each subclass table is at the same time a FK to the superclass table. This ensures that every identifier in a subclass table corresponds to an existing identifier in the superclass table, and a join between the two tables recovers the full information of the element.
The best strategy depends on the problem (for instance, the number of attributes in each class, the number of levels in the hierarchy, or whether the hierarchy is disjoint).
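As a minimal sketch of the third alternative, the snippet below uses Python's built-in sqlite3 module. The `vehicle` and `truck` tables and their columns are hypothetical names chosen for illustration only; they are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE vehicle (                 -- superclass: shared attributes (hypothetical)
    id    INTEGER PRIMARY KEY,
    brand TEXT NOT NULL
);
CREATE TABLE truck (                   -- subclass: only its own attributes
    id         INTEGER PRIMARY KEY REFERENCES vehicle(id),  -- PK and FK at once
    payload_kg INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO vehicle (id, brand) VALUES (1, 'Volvo')")
conn.execute("INSERT INTO truck (id, payload_kg) VALUES (1, 18000)")

# A join between both tables recovers the full information of the element.
full_truck = conn.execute(
    "SELECT v.brand, t.payload_kg FROM truck t JOIN vehicle v ON v.id = t.id"
).fetchone()

# A subclass row without a matching superclass row is rejected.
try:
    conn.execute("INSERT INTO truck (id, payload_kg) VALUES (99, 500)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The first two alternatives need no FK between class tables: the first collapses everything into `vehicle` with nullable columns, and the second repeats `brand` in every subclass table.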
Usually, you design your data model (database tables, PKs, FKs, etc.) in parallel with your actual class diagram. After identifying all the candidate classes and the dependencies between them, you will probably go on to design the sequence diagrams. By that time, your data model should have been finalized.
How to represent in UML deletion consequences (nullify or cascading) in non-(whole / part) relations?

I think we agree that there is a correspondence between composition and cascading delete on one side, and between aggregation and nullify on delete on the other, in case we delete the whole instance in a whole / part relationship.
But what if there is no whole / part relationship between two classes:
I understand that we can only use composition and aggregation in cases where the whole / part hierarchy occurs (Car - Wheels, Apartment - Rooms) and not in cases where this hierarchy does not occur (e.g. Car - Driver classes).
So, how should we represent in UML this situation where there are deletion consequences in the database (nullify or cascading) but no "whole / part" relation?
Do we agree on the initial assumption?
The UML literature frequently refers to part-whole relationships regarding aggregation/composition. However, the definitions in the UML standard have evolved (see UML 2.5.1):
Sometimes a Property is used to model circumstances in which one instance is used to group together a set of instances; this is called aggregation. (...)
Shared: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.
Composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects.
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
In other words, there is no precise semantic specified for the "aggregation" (i.e. shared aggregation) that would make a difference from a simple association: shared aggregation is a modeling placebo.
The relationship between database constraints and UML modeling is therefore not as straightforward as you would assume.
Close match?
Moreover, there is no general one-to-one mapping between a database schema and a UML model. More than one database schema could be used to implement the same UML class diagram. And conversely, more than one UML diagram may represent the design that is implemented by a given database schema. So the best we can do here is to consider close matches.
In your database, the table with the FOREIGN KEY constraint would correspond to a potential component in a composition, or an element of a shared aggregation, or an associated instance in a simple association:
- An ON DELETE CASCADE could help to implement a composite aggregation: it is the only way in SQL to implement the kind of lifecycle management that you would expect in a composition, where the components are deleted when the composite is. It could just as well implement an ordinary association, if some business rules/contracts (e.g. UML postconditions) require such a related deletion.
- An ON DELETE SET NULL could help to implement a shared aggregation, if its semantics were defined as you mean: if the aggregate is deleted, its elements are not deleted, and could therefore be shared. But it could just as well implement any ordinary association, since the deletion of an associated instance would not trigger a deletion either, and the constraint keeps referential integrity clean.
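To make the two FOREIGN KEY behaviours concrete, here is a small sketch using Python's built-in sqlite3 module. The `car`, `wheel` and `driver` tables are illustrative assumptions based on the examples in the question, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE car (id INTEGER PRIMARY KEY);

-- Composition-like lifecycle: wheels disappear together with their car.
CREATE TABLE wheel (
    id     INTEGER PRIMARY KEY,
    car_id INTEGER NOT NULL REFERENCES car(id) ON DELETE CASCADE
);

-- Plain association (or shared aggregation): the driver survives, the link is cleared.
CREATE TABLE driver (
    id     INTEGER PRIMARY KEY,
    car_id INTEGER REFERENCES car(id) ON DELETE SET NULL
);

INSERT INTO car VALUES (1);
INSERT INTO wheel VALUES (10, 1), (11, 1);
INSERT INTO driver VALUES (20, 1);
""")

conn.execute("DELETE FROM car WHERE id = 1")

wheels_left = conn.execute("SELECT COUNT(*) FROM wheel").fetchone()[0]
drivers_left = conn.execute("SELECT COUNT(*) FROM driver").fetchone()[0]
driver_car = conn.execute("SELECT car_id FROM driver WHERE id = 20").fetchone()[0]
```

After the delete, no wheels remain, while the driver row is kept with its `car_id` set to NULL. Note that ON DELETE SET NULL requires the FK column to be nullable, which corresponds to a multiplicity of 0..1 on the car end.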
I agree that composition means cascading delete, because according to UML the whole is responsible for the existence of the parts. A normal association means that you can delete any object without affecting any other objects that might have a link to it. UML doesn't define semantics for aggregation, so it will behave in the same way as a normal association. But even if we take into account domain-specific semantics for aggregation, I don't think there are examples where this is changed.
However, if you have an association with a multiplicity of 1 on one end, you cannot delete the object on this end, because the objects that have been linked to it would be invalid afterwards. This has nothing to do with composition or aggregation.
So, the remaining question is how to express cascading delete if there is no whole-part relationship. Are there really examples where this happens? I don't see why Car - Driver could not be a whole-part relationship. Please bear in mind that we are not talking about real cars or real people. We are talking about a software system in which we want to represent knowledge about the real world for a specific purpose. And if the purpose is to issue boarding cards for cars and their drivers on a ferry, it makes perfect sense to view them as a composition.

How to correctly show M:N relationship in conceptual data model?

I am trying to create a conceptual data model and I don't know how to properly show an M:N relationship, which by default should not be included, but you can still assign verbs and directions between abstract entities. So let's say we have "Projects" and we have "Project roles": how do I properly show the relationships? Can I have 2 arrows as shown in the picture, or do I have to add a join table and ?? I can't wrap my head around this..
Thank you so much in advance :)
A conceptual data (or information) model can be created with a suitable modeling language, such as ER diagrams or UML class diagrams. Both languages have a concept and a visual notation for many-to-many associations (or relationship types). Simply follow their definitions. Since there is no standard for ER diagrams, it's easier/preferable to go with UML.
For showing a many-to-many association between two classes (representing entity types), you draw a connection line and annotate it with an asterisk ("*") at both ends.
Notice that a join table is a database implementation, and not a modeling concept.
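To illustrate that distinction, here is a sketch of what the implementation level might look like once the conceptual "*" to "*" association is mapped to tables, using Python's built-in sqlite3 module. The table and column names (`project`, `project_role`, `project_has_role`) are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE project      (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project_role (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- The join table exists only in the database schema, not in the conceptual model.
CREATE TABLE project_has_role (
    project_id INTEGER NOT NULL REFERENCES project(id),
    role_id    INTEGER NOT NULL REFERENCES project_role(id),
    PRIMARY KEY (project_id, role_id)
);

INSERT INTO project VALUES (1, 'Website'), (2, 'App');
INSERT INTO project_role VALUES (1, 'Manager'), (2, 'Tester');
INSERT INTO project_has_role VALUES (1, 1), (1, 2), (2, 2);
""")

# One project has many roles ...
roles_of_website = [row[0] for row in conn.execute(
    "SELECT r.name FROM project_has_role pr "
    "JOIN project_role r ON r.id = pr.role_id "
    "WHERE pr.project_id = 1 ORDER BY r.name")]

# ... and one role appears in many projects.
projects_with_tester = [row[0] for row in conn.execute(
    "SELECT p.name FROM project_has_role pr "
    "JOIN project p ON p.id = pr.project_id "
    "WHERE pr.role_id = 2 ORDER BY p.name")]
```

In the conceptual diagram, all of this is still a single line between the two classes with an asterisk at each end.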

How to model a list to different entities efficiently?

Given the model below:
CustomerType1(id, telephone, address)
CustomerType2(id, telephone, name)
OrderType1(id, timestamp, customerType1.id, comments, enum1)
OrderType2(id, timestamp, customerType2.id, comments)
OrderType3(id, timestamp, name)
How would I model the following?
OrderList(id, OrderType.id, ..)
OrderItem(OrderList.id, MenuItem.id)
A. Would I need 3 different types of OrderLists in order to adapt to the orderTypes?
OrderList1(id, OrderType1.id, ..)
OrderItem1(OrderList1.id, MenuItem.id)
OrderList2(id, OrderType2.id, ..)
OrderItem2(OrderList2.id, MenuItem.id)
OrderList3(id, OrderType3.id, ..)
OrderItem3(OrderList3.id, MenuItem.id)
B. Would 3 definitions of a relationship between orderLists and OrderTypes be better?
OrderList_Type1(orderList.id, orderType1.id)
OrderList_Type2(orderList.id, orderType2.id)
OrderList_Type3(orderList.id, orderType3.id)
This seems like a really inefficient way to store data and I just feel like I've modelled this really incorrectly (although it still makes sense, it might not be good for scaling/efficiency?). Is there a better way to model this?
Note: the given model can be changed but it would still have to contain the same information.
1. Your UML model is ok
From the point of view of UML class diagram your model and the OrderList, OrderItem extensions you want to add are clear and unambiguous and I don't see any modeling question there.
To avoid excessive copy/pastes I have only added 2 parent classes named ...Base. This is a common OOP modelling technique.
Drawn as UML class diagram your model looks like this:
2. For the physical implementation I would choose B
As for the implementation "model" of this model from the two choices you gave ((A) many copy/pastes, (B) somehow normalize and minimize the schema) I would go the (B) path drawn below.
It is how one of our company's software systems models class inheritance in the relational language and it works and it works quite well.
In our system most of the necessary glue code is automatically generated. The main piece is the generated OrderType2View, which joins the corresponding fields from the parent table OrderTypeBase and translates all DML operations (e.g. an insert) into DML operations on both OrderType2 and OrderTypeBase, adding the correct OrderTypeClassId to every record in the parent table. That makes it easy to tell which child table actually contains the specific part of a record.
Thanks to the generator we can easily extend the model with other parent classes (the inheritance hierarchy and the number of joined tables can be of any depth) and still enable some older code to treat them as their general parents - without caring about the details.
I don't know if there are better ways, given the (A) or (B) I would choose (B) because it is a design that works (I have seen it :)
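As a rough sketch of the (B) design described above (not the actual generated code of that system), the following uses Python's built-in sqlite3 module. Only OrderTypeBase, OrderType2, OrderType2View and OrderTypeClassId come from the answer; the snake_case spelling, the `ordered_at` column (standing in for timestamp) and the sample values are assumptions. The view here is read-only; translating DML through it, as the generator does, is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE order_type_base (
    id                  INTEGER PRIMARY KEY,
    order_type_class_id INTEGER NOT NULL,   -- tells which child table holds the rest
    ordered_at          TEXT NOT NULL
);
CREATE TABLE order_type2 (
    id          INTEGER PRIMARY KEY REFERENCES order_type_base(id),
    customer_id INTEGER NOT NULL,
    comments    TEXT
);
-- Joins parent and child fields back into one logical record.
CREATE VIEW order_type2_view AS
    SELECT b.id, b.ordered_at, c.customer_id, c.comments
    FROM order_type_base b JOIN order_type2 c ON c.id = b.id;

-- A single OrderList/OrderItem pair serves every order type via the base table.
CREATE TABLE order_list (
    id       INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES order_type_base(id)
);
CREATE TABLE order_item (
    order_list_id INTEGER NOT NULL REFERENCES order_list(id),
    menu_item_id  INTEGER NOT NULL
);

INSERT INTO order_type_base VALUES (1, 2, '2024-01-01T12:00:00');
INSERT INTO order_type2 VALUES (1, 7, 'no onions');
INSERT INTO order_list VALUES (100, 1);
INSERT INTO order_item VALUES (100, 5), (100, 6);
""")

order_row = conn.execute("SELECT * FROM order_type2_view WHERE id = 1").fetchone()
item_count = conn.execute(
    "SELECT COUNT(*) FROM order_item i "
    "JOIN order_list l ON l.id = i.order_list_id WHERE l.order_id = 1"
).fetchone()[0]
```

Because `order_list` references the base table, neither three OrderList variants nor three link tables are needed in this sketch.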

Supertype/subtype db design with subtype cross-link

This is probably a simple problem for an experienced database developer, but I'm struggling... I have trouble translating a certain ER diagram to a DB model, any help is appreciated.
I have a setup similar to slide 17 of this presentation:
Slide 17 shows an ER diagram with an Employee supertype having an Employee Type attribute and as subtypes the Employee Types themselves (Hourly, Salaried and Consultant), which is very similar to my design situation.
In my case, suppose Salaried Employees are the only ones that can be bosses of other employees and I wanted to somehow indicate if a certain Salaried employee is the boss of the Hourly and/or Salaried Employee and/or Consultant (either, none or both), how could that be designed in a database model, also considering these are one-to-many relationships?
I can put a PK-FK relationship between them, which would result in all tables having two FKs (like Consultant having FK_Employee and FK_SalariedEmployee) and SalariedEmployee referencing itself, but I keep thinking that might not be the wisest solution... although I'm not sure why (integrity issues?).
Is this an acceptable solution or is there a better one?
Thanks in advance for any help!
Your case looks like an instance of the design pattern known as “Generalization Specialization” (Gen-Spec for short). The gen-spec pattern is familiar to object oriented programmers. It’s covered in tutorials when teaching about inheritance and subclasses.
The design of SQL tables that implement the gen-spec pattern can be a little tricky. Database design tutorials often gloss over this topic. But it comes up again and again in practice.
If you search the web on “generalization specialization relational modeling” you’ll find several useful articles that teach you how to do this. You’ll also be pointed to several times this topic has come up before in this forum.
The articles generally show you how to design a single table to capture all the generalized data and one specialized table for each subclass that will contain all the data specific to that subclass. The interesting part involves the primary key for the subclass tables. You won’t use the autonumber feature of the DBMS to populate the sub class primary key. Instead, you’ll program the application to propagate the primary key value obtained for the generalized table to the appropriate subclass table.
This creates a two way association between the generalized data and the specialized data. A simple view for each specialized subclass will collect generalized and specialized data together. It’s easy once you get the hang of it, and it performs fairly well.
In your specific case, declaring the "boss of" FK to reference the PK in the Salaried Employees table will be enough to do the trick. This will produce the two way association you want, and also prevent employees who are not salaried from being referenced as bosses.
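One way this could look, sketched with Python's built-in sqlite3 module: here the "boss of" FK is placed once on the generalized `employee` table, which is one possible reading of the advice above. All table and column names, and the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE employee (                       -- generalized data
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    employee_type TEXT NOT NULL,
    boss_id       INTEGER REFERENCES salaried_employee(id)  -- NULL means no boss
);
CREATE TABLE salaried_employee (              -- specialized data, shares the PK
    id            INTEGER PRIMARY KEY REFERENCES employee(id),
    annual_salary INTEGER NOT NULL
);
CREATE TABLE hourly_employee (
    id          INTEGER PRIMARY KEY REFERENCES employee(id),
    hourly_rate INTEGER NOT NULL
);
""")

# The application propagates the PK from the generalized to the specialized table.
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 'Salaried', NULL)")
conn.execute("INSERT INTO salaried_employee VALUES (1, 90000)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 'Hourly', 1)")  # Ann is the boss
conn.execute("INSERT INTO hourly_employee VALUES (2, 30)")

# Bob is not salaried, so he cannot be referenced as a boss.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Cy', 'Consultant', 2)")
    non_salaried_boss_rejected = False
except sqlite3.IntegrityError:
    non_salaried_boss_rejected = True

reports_of_ann = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE boss_id = 1 ORDER BY name")]
```

With a single nullable `boss_id` on the generalized table, the subtype tables do not each need a second FK, and a salaried employee can still be the boss of another salaried employee.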

How to get my SQL DB to match my Domain Driven Design

Okay, I'll be straight with you guys: I'm not sure exactly how Domain Driven my Design is, but I did start by building Model objects and ignoring the persistence layer altogether. Now I'm having difficulty deciding the best way to build my tables in SQL Server to match the models.
I'm building a web application in ASP.NET MVC, although I don't think the platform matters that much. I have the following object model hierarchy:
Property - has properties such as Address and Postcode
which have one or more
Case - inherits from PropertyObject
Quote - inherits from PropertyObject
which have one or more
Message - simple class that has properties Reference, Text and SentDate
Case and Quote have a lot of similar properties, so I also have a PropertyObject abstract base class that they inherit from. So Property has an Items property of type List which can contain both Case and Quote objects.
So essentially, I can have a Property that has a few Quotes and Cases and a load of Messages that can belong to either of those.
A PropertyObject has a Reference property (and therefore so do Quote and Case) so any Message object can be related back to a Quote OR Case by its Reference property.
I'm thinking of using the Entity Framework to get my Models in and out of the database.
My initial thoughts were to have four tables: Property, Case, Quote and Message.
They'd all have their own sequential IDs, and the Case and Quote would be related back to Property by a PropertyID field.
The only way I can think of to relate a Message table back to the Case and Quote tables is to have both a RelationID and RelationType field, but there's no obvious way to tell SQL server how that relationship works, so I won't have any referential integrity.
Any ideas, suggestions, help?
I am assuming Property doesn't also inherit from PropertyObject.
Given that, these four tables (Property, Case, Quote and Message) amount to a Table per Concrete Class (TPC) inheritance strategy, which I generally don't recommend.
My recommendation is that you use either:
- Table per Hierarchy (TPH): Case and Quote are stored in the same table, with one column used as a discriminator and nullable columns for the properties that are not shared.
- Table per Type (TPT): add a PropertyObject table with the shared fields, plus Case and Quote tables with just the extra fields for those types.
Both of these strategies will allow you to maintain referential integrity and are supported by most ORMs.
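As a sketch of the Table per Type option at the SQL level (independent of any particular ORM), the following uses Python's built-in sqlite3 module. The table names `case_object` and `quote_object` (chosen because CASE is an SQL keyword), the columns `status`, `amount` and `body`, and the sample data are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE property (
    id INTEGER PRIMARY KEY, address TEXT, postcode TEXT
);
CREATE TABLE property_object (            -- shared fields of Case and Quote
    id          INTEGER PRIMARY KEY,
    property_id INTEGER NOT NULL REFERENCES property(id),
    reference   TEXT NOT NULL UNIQUE
);
CREATE TABLE case_object (                -- only the extra Case fields (hypothetical)
    id     INTEGER PRIMARY KEY REFERENCES property_object(id),
    status TEXT
);
CREATE TABLE quote_object (               -- only the extra Quote fields (hypothetical)
    id     INTEGER PRIMARY KEY REFERENCES property_object(id),
    amount INTEGER
);
CREATE TABLE message (                    -- one real FK, whatever the subtype is
    id                 INTEGER PRIMARY KEY,
    property_object_id INTEGER NOT NULL REFERENCES property_object(id),
    body               TEXT NOT NULL,
    sent_date          TEXT
);

INSERT INTO property VALUES (1, '1 High Street', 'AB1 2CD');
INSERT INTO property_object VALUES (1, 1, 'C-001'), (2, 1, 'Q-001');
INSERT INTO case_object VALUES (1, 'open');
INSERT INTO quote_object VALUES (2, 1500);
INSERT INTO message VALUES (1, 1, 'About the case', '2024-01-01'),
                           (2, 2, 'About the quote', '2024-01-02');
""")

# Messages relate back to a Case or a Quote through the shared table.
references = [row[0] for row in conn.execute(
    "SELECT po.reference FROM message m "
    "JOIN property_object po ON po.id = m.property_object_id ORDER BY m.id")]

# A message pointing at nothing is rejected, so referential integrity holds.
try:
    conn.execute("INSERT INTO message VALUES (3, 999, 'orphan', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

This replaces the RelationID/RelationType pair from the question with one enforceable FK.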
see this for more: Tip 12 - How to choose an inheritance strategy
Hope this helps
Ahhh... Abstraction.
The trick with DDD is to recognize that abstraction is not always your friend. In some cases, too much abstraction leads to a too-complex relational model.
You don't always need inheritance. Indeed, the major purpose of inheritance is to reuse code. Reusing a structure can be important, but less so.
You have a prominent is-a pair of relationships: Case IS-A Property and Quote IS-A Property.
You have several ways to implement class hierarchies and "is-a" relationships.
- You can use type discriminators, as you've suggested, to show which subclass a row really is. This works when you often have to produce a union of the various subclasses. If you need all properties, a union of CaseProperty and QuoteProperty, then this can work out.
- You do not have to rely on inheritance; you can have disjoint tables for each set of relationships: CaseProperty and QuoteProperty. You'd also have CaseMessage and QuoteMessage, to follow the distinction forward.
- You can have common features in a common table, and separate features in a separate table, and do a join to reconstruct a single object. So you might have a Property table with common features of all properties, plus CaseProperty and QuoteProperty with the unique features of each subclass of Property. This is similar to what you're proposing with Case and Quote having foreign keys to Property.
- You can flatten a polymorphic class hierarchy into a single table and use a type discriminator and NULLs. A master Property table has a type discriminator for Case and Quote. Attributes of Case are nulled for rows that are supposed to be a Quote. Similarly, attributes of Quote are nulled for rows that are supposed to be a Case.
Your question "[how] to relate a Message table back to the Case and Quote tables" stems from a polymorphic set of subclasses. In this case, the best solution might be this.
Message has an FK reference to Property.
Property has a type discriminator to separate Quote from Case. The Quote and Case class definitions both map to Property, but rely on a type discriminator, and (usually) different sets of columns.
The point is that the responsibility for Property, CaseProperty and QuoteProperty belongs to that class hierarchy, and not Message.
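A minimal sketch of that suggestion, using Python's built-in sqlite3 module. The table is called `property` to follow the naming in this answer; the discriminator column `kind`, the subclass-specific columns `case_status` and `quote_amount`, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE property (
    id           INTEGER PRIMARY KEY,
    kind         TEXT NOT NULL CHECK (kind IN ('Case', 'Quote')),  -- discriminator
    reference    TEXT NOT NULL UNIQUE,
    case_status  TEXT,       -- hypothetical Case attribute, NULL for quotes
    quote_amount INTEGER     -- hypothetical Quote attribute, NULL for cases
);
CREATE TABLE message (
    id          INTEGER PRIMARY KEY,
    property_id INTEGER NOT NULL REFERENCES property(id),
    body        TEXT NOT NULL
);

INSERT INTO property VALUES (1, 'Case',  'C-001', 'open', NULL),
                            (2, 'Quote', 'Q-001', NULL,   1500);
INSERT INTO message VALUES (1, 1, 'About the case'), (2, 2, 'About the quote');
""")

# Message needs no knowledge of subclasses; the discriminator lives in property.
kinds = [row[0] for row in conn.execute(
    "SELECT p.kind FROM message m JOIN property p ON p.id = m.property_id "
    "ORDER BY m.id")]

# An unknown discriminator value is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO property VALUES (3, 'Invoice', 'I-001', NULL, NULL)")
    bad_kind_rejected = False
except sqlite3.IntegrityError:
    bad_kind_rejected = True
```

The trade-off is the set of nullable columns: the database cannot easily require `case_status` for cases only, so that rule stays with the class hierarchy.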
This is where the DDD concept of Services would come in. The Repository for each of your concrete classes only persists that entity, not the related objects.
So you have Property(), which is the base for your CaseProperty() : Property(). This special entity is accessed via CasePropertyService(). That is where you would do your JOINs and such to the related tables in order to generate your CaseProperty() special entity (which is not really Case() and Property() on their own, but a combination).
OT: This is my workaround for the .NET limitation that a class cannot inherit from multiple classes. DDD is meant to be a guideline to the overall understanding of your domain. I often give my DDD outline to friends, and have them try to figure out what it does/represents. If it looks clean and they figure it out, it's clean. If your friends look at it and say, "I have no idea what you are trying to persist here," then go back to the drawing board.
But, there's a catch about using any ORM to persist storage of DDD objects (linq, EntityFramework, etc). Have a look at my answer over here:
Stackoverflow: Question about Repositories and their Save methods for domain objects
The catch is all objects must have an identity in the database for ORM. So, this helps you plan your DB structure.
I have recently moved away from using an ORM to control direct access, and just have a clean DDD layer. I let my repositories and services control access to the DB layer, and use Velocity to entity-cache my objects. This actually works very well for: 1) DB performance, since you can design the database however is most efficient without coupling it to your domain objects through a direct ORM representation, and 2) a much cleaner domain model, with no forced identities on Value Objects and such. Free!
