Related
I think we agree that there is a correspondance between composition and delete cascading on one side and aggregation and nullify on delete on the other, in case we delete the whole instance in a whole / part relationship.
But what if there is no whole / part relationship between two classes:
I understand that we can only use composition and aggregation in cases where the whole / part hierarchy occurs: Car - Wheels, Apartment - Rooms and not in cases where this hierarchy does not occurs (e.g. Car - Driver classes).
So, how should we represent in UML this situation where there are deletion consequences in the database (nullify or cascading) but no "whole / part" relation?
Do we agree on the initial assumption?
The UML literature frequently refers to part-whole relationships regarding aggregation/composition. However, the definitions in the UML standard have evolved (see UML 2.5.1):
Sometimes a Property is used to model circumstances in which one instance is used to group together a set of instances; this is called aggregation. (...)
Shared: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.
Composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects.
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
In other words, there is no precise semantic specified for the "aggregation" (i.e. shared aggregation) that would make a difference from a simple association: shared aggregation is a modeling placebo.
The relationship between database constraints and UML modeling are therefore not as straightforward as you would assume.
Close match?
Moreover, there is no general one-to-one mapping between a database schema and an UML model. More than one database schema could be used to implement the same UML class diagram. And conversely, more than one UML diagram may represent the design that is implemented by a given database schema. So the best we can do here, is to consider close-matches.
In your database, the table with the FOREIGN KEY constraint would correspond to a potential component in a composition, or an element of a shared aggregation, or an associated instance in a simple association :
a ON DELETE CASCADE could help to implement a composite aggregation: it's the only way in SQL to implement the kind of lifecycle management that you would expect in a composition: the components would be deleted when the composite is. It could as well implement an ordinary association, if some business rules/contracts (e.g. UML post conditions) would require such a related deletion.
a ON DELETE SET NULL could help to implement a shared aggregation, if its smeantics would be defined as you mean: if the aggregate is deleted, its elements would not be deleted, and could therefore be shared. But it could as well implement any ordinary association, since the deletion of an associated instance would not trigger a deletion either and the constraint would allow to maintain a clean referential integrity.
I agree, that composition means cascading delete, because according to UML the whole is responsible for the existence of the parts. A normal association means, you can delete any object without affecting any other objects that might have a link to it. UML doesn't define semantics for aggregation, so they will behave in the same way. But even if we take into account domain specific semantics for aggregation, I don't think there are examples where this is changed.
However, if you have an association with a multiplicity of 1 on one end, you cannot delete the object on this end, because the objects that have been linked to it would be invalid afterwards. This has nothing to do with composition or aggregation.
So, the remaining question is, how to express cascading delete if there is no whole-part relationship? Are there really examples where this happens? I don't see that Car - Driver, could not be in a whole-part relationship. Please bear in mind, that we are not talking about real cars or real people. We are talking about a software system that we want to represent knowledge about the real world for a specific purpose. And if the purpose is to issue boarding cards for cars and their drivers on a ferry, it makes perfect sense to view them as a composition.
If we have an ER diagram and we want to convert to a relational schema we follow a specific procedure (eg Elmashri& Navathe book).
What is not clear to me is if there is a difference when there is a cardinality of 1:1 vs 1:N. How is this difference represented in the relational schema?
For example in the following figure from Elmashri if we focus on the relation between Department-Project that has a cardinality of 1:N we take the following relational schema.
If the cardinality was 1:1 would there be a difference?
And to ask more directly: In the following figure, if somebody was giving me only the left part of it with the relational schema, how would I say if the 2 relations (in black and red circle) are 1:1 or 1:N?
Re mapping from ER & pseudo-ER to relational
That is a Chen style of ER diagram. Diamonds denote relation(ship)s/associations & have corresponding tables/relations. Lines denote participations & have corresponding FKs. Cardinalites are about diamonds ie relation(ship)s/associations ie tables/relations. They tell you certain things about what combinations of entities can participate in a relation(ship)/association.
Your textbook is Fundamentals of Database Systems 7th edition by Elmashri & Navathe. It follows the above usage & explains how to map X:Y:etc & explains how it mapped your example in particular. That DDL can arise from mapping the ERD to obvious "cross reference"/"relationship" tables then combining them. But your textbook just gives multiple ways to map X:Ys without clearly explaining how the shortcut ways arise from the obvious ways plus combining.
Eg: The table that is the value of query select Dnum, Pnum from PROJECT represents the binary relationship "for some Pname & Plocation, department Dnum controls project Pnum"--which is presumably true exactly when "department Dnum controls project Pnum"--which is relationship CONTROLS on a DEPARTMENT & a PROJECT. The ER diagram says that the corresponding binary relation is 1:N in Dnum:Pnum & N:1 in Pnum:Dnum. Because CONTROLS has that particular 1 in its cardinality, its obvious table can be combined with the PROJECT entity table into table PROJECT. If it were M:N then combining would have certain problems so your text gives a different mapping--but under the previous presumption it still could be. And if you did that or just used the obvious mapping then your relational design would be the same for both 1:N & M:N. 1:1 allows yet other combinations.
Different design & diagramming methods have different conventions for cardinalities & mapping to relations. Many so-called ER methods are not really ER in that they are trivially ER--everything is a (possibly associative) entity. All lines involve a 1 or 1-or-0 at one or both ends--because they are about certain implicit binary relationships & tables associated with FKs--like when obvious Chen tables are combined as above. Also there are look-here & look-away cardinalities & other conventions like explicit null FKs. All depends on what textbook/reference/method are you using.
Your DDL diagram "relation" is a FK--wrongly but ubiquitously called a "relationship" by pseudo-ER methods. You are not using "X:Y" meaningfully in the DDL diagram. You are using it more or less how a pseudo-ER diagram would label a FK. But the cardinality is about the relation(ship)/association represented by a certain projection of the FK's referencing table.
Re mapping from relational to ER & pseudo-ER
If you started from a Chen ER design & used only the obvious mappings then you would have tables 1:1 with entities & relationships & the entity tables would be the ones with no FKs. But your textbook has multiple mapping choices that involve combining ER diagram entity & relationship tables into other tables, in multiple ways, so the tables in the database don't tell you the diagram entities & relationships that they arose from.
Relational designs are more general than Chen ER designs--tables represent relations on 0 or more values. (Every superkey of every base table & query result identifies some entity.) So not all reasonable relational designs correspond to Chen ER designs. Whereas one benefit of the pseudo-ER methods is that they are really recording DDL and not distinguishing between entities & relationships. But if they do arise from entities & relationships that's not recorded in the design. So you can't map from such a relational/pseudo-ER design back to those entities & relationships.
You wouldn't know the cardinality from DDL constraints unless you put them in --which you should, preferably declaratively but otherwise triggers. FKs & CKs (via PKs & UNIQUE NOT NULL) are enough to express Chen cardinality constraints for binary relation(ship)s/associations, but not n-ary. Pseudo-ER methods may or may not address constraints beyond PKs & CKs. And Chen ER designs can have problem constraints that must be addressed through general relational design principles--so really they are only provisional. (And--unnecessary.)
What is the difference between an entity relationship model and a relational model?
So I'm making an E/R diagram based on drugs. It states that each drug is produced by a given pharmaceutical company and the trade name of the drug is identified among the products of the given pharmaceutical company. So here's the E/R diagram I drew up:
Now the biggest question I have about this is, are these relationships supposed to be one to many or many to many? Each one relationship is represented by an arrow (where the pointed arrow means at most one and the rounded arrow means exactly one). I first assumed that a single drug identified by a single trade name would come from just one pharmaceutical company but would it be possible for a single drug to come from multiple pharmaceutical company's? I'm also not sure if it's supposed to be a 3 way relationship or not.
Not sure if this is really a technical question you can find the answer to here. It would probably be wise to further clarify with your client, but from pure wording I would assume.
1.) 1 Drug - 1 Trade Name - 1 Company
2.) 1 Company has Many Drugs
From general knowledge of US drugs, different companies have their unique versions of drugs with the same active ingredient, but these are all filed under different trade names, maintaining 1 trade name - 1 company relationship.
For example, ibuprofen (generic) is sold under both Advil and Motrin (separate trade names).
In this style of ER diagram, Chen's original, the diamond denotes a ternary
"relationship" type, aka association type, among/on the three participant "entity" types symbolized by the boxes. As in an application relationship/association, as in "Entity-Relationship Model". The lines showing participations correspond to FKs (foreign keys).
In such a diagram each line gets labeled by the number or range giving the number of entities in each entity set which is allowed in a relationship set. The table for the relationship would have a FK for each line. Per Chen it would be described as (in order company-name-drug) (at-most-1)-to-(exactly-1)-to-N relationship (assuming the unlabeled line means any number). There is a style with a cardinality at each end of a line.
Misunderstandings/misrepresentations/misappropriations of Chen style by older & newer methods & products (although quite mainstream) lead to different so-called ER diagrams.
One such style only shows entity type boxes with relationships shown by connecting lines labeled by relationship names. The 1:many relationships can be implemented by a FK attribute in one of the entity type tables, although they needn't be, and although that's contrary to Chen ER modeling, which would use a table. Typically, for n-ary relationships for n>2, instead of just having three line segments connect at a point the point is replaced by a box for what in Chen is an "associative entity" type. The lines would then be participations/FKs under Chen. All lines now represent 1:many relationships. Other so-called ER diagrams just have boxes for tables and lines for FKs and don't even have relationships on entities in the Chen sense. The use of lines that only ever denote 1:many relationships and/or FKs lead to lines and FKs being (wrongly but ubiquitously) called "relationships". (Which seems to be how you understand the word.)
The wikipedia entry on E-R modeling (and E-R diagrams) is currently reasonable.
What is the difference between logical data model and conceptual data model?
In the conceptual data model you worry only about the high level design - what tables should exist and the connections between them. In this phase you recognize entities in your model and the relationships between them.
The logical model comes after the conceptual modeling when you explicitly define what the columns in each table are. While writing the logical model, you might also take into consideration the actual database system you're designing for, but only if it affects the design (i.e., if there are no triggers you might want to remove some redundancy column etc.)
There is also physical model which elaborates on the logical model and assigns each column with it's type/length etc.
Here is a good table and picture that describes each of the three levels.
|----------------------|------------|---------|----------|
| Feature | Conceptual | Logical | Physical |
|----------------------|------------|---------|----------|
| Entity Names | X | X | |
| Entity Relationships | X | X | |
| Attributes | | X | |
| Primary Keys | | X | X |
| Foreign Keys | | X | X |
| Table Names | | | X |
| Column Names | | | X |
| Column Data Types | | | X |
|----------------------|------------|---------|----------|
In this table you can see the difference between each model:
See http://www.1keydata.com/datawarehousing/data-modeling-levels.html for more information and some data model examples.
These terms are unfortunately overloaded with several possible definitions. According to the ANSI-SPARC "three schema" model for instance, the Conceptual Schema or Conceptual Model consists of the set of objects in a database (tables, views, etc) in contrast to the External Schema which are the objects that users see.
In the data management professions and especially among data modellers / architects, the term Conceptual Model is frequently used to mean a semantic model whereas the term Logical Model is used to mean a preliminary or virtual database design. This is probably the usage you are most likely to come across in the workplace.
In academic usage and when describing DBMS architectures however, the Logical level means the database objects (tables, views, tables, keys, constraints, etc), as distinct from the Physical level (files, indexes, storage). To confuse things further, in the workplace the term Physical model is often used to mean the design as implemented or planned for implementation in an actual database. That may include both "physical" and "logical" level constructs (both tables and indexes for example).
When you come across any of these terms you really need to seek clarification on what is being described unless the context makes it obvious.
For a discussion of these differences, check out Data Modelling Essentials by Simsion and Witt for example.
Logical Database Model
Logical database modeling is required for compiling business requirements and representing the requirements as a model. It is mainly associated with the gathering of business needs rather than the database design. The information that needs to be gathered is about organizational units, business entities, and business processes.
Once the information is compiled, reports and diagrams are made, including these:
ERD–Entity relationship diagram shows the relationship between different categories of data and shows the different categories of data required for the development of a database.
Business process diagram–It shows the activities of individuals within the company. It shows how the data moves within the organization based on which application interface can be designed.
Feedback documentation by users.
Logical database models basically determine if all the requirements of the business have been gathered. It is reviewed by developers, management, and finally the end users to see if more information needs to be gathered before physical modeling starts.
Physical Database Model
Physical database modeling deals with designing the actual database based on the requirements gathered during logical database modeling. All the information gathered is converted into relational models and business models. During physical modeling, objects are defined at a level called a schema level. A schema is considered a group of objects which are related to each other in a database.
Tables and columns are made according to the information provided during logical modeling. Primary keys, unique keys, and foreign keys are defined in order to provide constraints. Indexes and snapshots are defined. Data can be summarized, and users are provided with an alternative perspective once the tables have been created.
Physical database modeling depends upon the software already being used in the organization. It is software specific. Physical modeling includes:
Server model diagram–It includes tables and columns and different relationships that exist within a database.
Database design documentation.
Feedback documentation of users.
Summary:
1.Logical database modeling is mainly for gathering information about business needs and does not involve designing a database; whereas physical database modeling is mainly required for actual designing of the database.
2.Logical database modeling does not include indexes and constraints; the logical database model for an application can be used across various database software and implementations; whereas physical database modeling is software and hardware specific and has indexes and constraints.
3.Logical database modeling includes; ERD, business process diagrams, and user feedback documentation; whereas physical database modeling includes; server model diagram, database design documentation, and user feedback documentation.
Read more: Difference Between Logical and Physical Database Model | Difference Between | Logical vs Physical Database Model http://www.differencebetween.net/technology/software-technology/difference-between-logical-and-physical-database-model/#ixzz3AxPVhTlg
Conceptual Schema - covers entities and relationships. Should be created first. Contrary to some of the other answers; tables are not defined here. For example a 'many to many' table is not included in a conceptual data model but is defined as a 'many to many' relationship between entities.
Logical Schema - Covers tables, attributes, keys, mandatory role constraints, and referential integrity with no regards to the physical implementation. Things like indexes are not defined, attribute types should be kept logical, e.g. text instead of varchar2. Should be created based on the conceptual schema.
I need to produce both a logical model and a conceptual model. All the explanations here are really vague. The link posted above just shows the difference being that a conceptual model is a logical model without fields. Ok fine, I don't mention the name of the database. It appears to be totally redundant.
I really don't know what 'semantic' means. can someone explain what I would do differently using 'english' and possibly post a link to better examples than a picture that shows one picture that has fields and one that does not. The buzzwords are all well and good, but its so vague its not useful to practically implement.
do I do anything other than take my logical model (which is basically my physical model reversed engineered out of the DB, click a button in said tools and the images look a little different and then take off the data types).
From what i can practically see (and without buzzwords)
physical model: actually tables. The little pictures have data types in them and named pk/fk constraints
Logical Model: click the little button my tool (using Oracles SQL Developer Data Modeller, I dont have an erwin license and 2010 visio no longer reverse engineers out of the DB), and then the images on the screen change slightly. The data types are gone and the names of the constraints are gone, then the colors of the table representations changes to purple (so now I call them entities).
ok. so what would my Conceptual model look like other then: exact same thing as my logical model minus the fields. I would think there is more to it than this. Reciting that its a 'semantic' representation of data sounds real nice and fancy, but doesn't make sense to someone who has not made one of these before.
This is an old question and maybe this comes way too late, but I don't see one very important aspect necessary to answering the question.
That is, the TARGET audience for the data model. The Conceptual Data Model is the model generated from business analysis, from interviews with the BUSINESS about their data. It is not so much "high level" as it is the business's understanding of their data, business rules captured in the relationships between "candidate" entities. At this point, you are capturing the things of importance to the business (Employee, Customer, Contract, Account, etc.) and the relationships between them.
The final Conceptual Data Model may be somewhat abstract -- for instance, treating Individuals and Organizations entering into a contract as subtypes of a "Party", Contractors and Permanent Employees as subtypes of an Employee, even Employees and Customers subtypes of "Person" -- but it is a document that a data modeler develops from discussions with the business SMEs and presents to the business for validation.
The Logical Data Model is not just "more detail" -- where useful and important, a Conceptual Data Model may well have attributes included -- it is the ARCHITECTURE document, the model that is presented to the software analysts/engineers to explain and specify the data requirements. It will resolve many-to-many relationships to association tables and will define all attributes, with examples and constraints, so that code can be written against the architecture.
The Physical model is that Logical Model generated specifically for a particular environment, such as SQL Server or Teradata or Oracle or whatever. It will have keys, indexes, partitions, or whatever is needed to implement, based on sizing, access frequency, security constraints, etc.
So, if you are being asked to develop a Conceptual Data Model, you are being asked to design the solution (or part of it) from scratch, getting your information from the business. There's more to it, but I hope that answers the question.
First of all, a data model is an abstraction tool and a database model (or scheme/diagramm) is a modeling result.
Conceptual data model is DBMS-independent and covers functional/domain design area. The most known conceptual data model is "Entity-Relationship". Normally, you can reuse the conceptual scheme to produce different logical schemes not only relational.
Logical data model is intended to be implemented by some DBMS and corresponds mostly to the conceptual level of ANSI/SPARC architecture (proposed in 1975); this point gives some collisions of terminology. Zachman Framework tried to resolve this kind of collision ten years later introducing conceptual, logical and physical models.
There are many logical data models, and the most known is relational one.
So main differences of conceptual data model are the focusing on the domain and DBMS-independence whereas logical data model is the most abstract level of concrete DBMS you plan to use. Note that contemporary DBMS support several logical models at the same time.
You can also have a look to my book and to the article for more details.
Most answers here are strictly related to notations and syntax of the data models at different levels of abstraction. The key difference has not been mentioned by anyone. Conceptual models surface concepts. Concepts relate to other concepts in a different way that an Entity relates to another Entity at the Logical level of abstraction. Concepts are closer to Types. Usually at Conceptual level you display Types of things (this does not mean you must use the term "type" in your naming convention) and relationships between such types. Therefore, the existence of many-to-many relationships is not the rule but rather the consequence of the relationships between type-wise elements. In Logical Models Entities represent one instance of that thing in the real world. In Conceptual models it is not expected the description of an instance of an Entity and their relationships but rather the description of the "type" or "class" of that particular Entity.
Examples:
- Vehicles have Wheels and Wheels are used in Vehicles. At Conceptual level this is a many-to-many relationship
- A particular Vehicle (a car by instance), with one specific registration number have 5 wheels and each particular wheel, each one with a serial number is related to only that particular car. At Logical level this is a one-to-many relationship.
Conceptual covers "types/classes". Logical covers "instances".
I would add another comment about databases. I agree with one of the colleagues who commented above that Conceptual and Logical models have absolutely nothing about databases. Conceptual and Logical models describe the real world from a data perspective using notations such as ER or UML. Database vendors, smartly, designed their products to follow the same philosophy used to logically model the World and them created Relational Databases, making everyone's lifes easier. You can describe your organisation's data landscape at all the levels using Conceptual and Logical model and never use a relational database.
Well I guess this is my 2 cents...
logical data model
A logical data model describes the data in as much detail as possible, without regard to how they will be physical implemented in the database. Features of a logical data model include:
· Includes all entities and relationships among them.
· All attributes for each entity are specified.
· The primary key for each entity is specified.
· Foreign keys (keys identifying the relationship between different entities) are specified.
· Normalization occurs at this level.
conceptual data model
A conceptual data model identifies the highest-level relationships between the different entities. Features of conceptual data model include:
· Includes the important entities and the relationships among them.
· No attribute is specified.
· No primary key is specified.
There are couples of questions around asking for difference / explanation on identifying and non-identifying relationship in relationship database.
My question is, can you think of a simpler term for these jargons? I understand that technical terms have to be specific and unambiguous though. But having an 'alternative name' might help students relate more easily to the concept behind.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that some UK university that teaches ER modeling, uses the term "entity subtype" for the very same thing that I always used to name "entity supertype", and vice-versa !
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation