I have the following ER diagrams:
customer starts a transaction with account
Bank updates watch_list
I'm new to ER diagrams and I want to add an alert system to a banking process. A customer entity can start a transaction with his/her bank account in the transact_with relationship. The relationship has attributes such as counterpart_name and counterpart_country. If the counterpart's name in a remittance is the same as a name on the watch_list the bank keeps, the DB creates a new row in a table named Alert. How can I model that Alert entity and its relationships with the other entities?
Since counterpart_name is an attribute of a relationship, relating that attribute to the watch_list entity seems to turn it into a ternary relationship, but I don't want watch_list to be related to the customer and account entities in the normal transaction process. Any suggestions on this, please?
ERD won't help you because it doesn't capture rules. See here: https://en.wikipedia.org/wiki/Entity%E2%80%93relationship_model
Of course, if you want to create an ALERT table, then ERD is fine.
The "if" part which fires a trigger (or whatever) could be modeled by a UML sequence diagram (for example).
Put another way, the ALERT table is data, the "if" is control and they are served by different diagram types. Good luck
You can and should relate the ALERT entity with the entity WATCH_LIST and TRANSACT_WITH:
If these relationships did not exist, you would not know which watch list element the transaction is related to, and you could not use the score mentioned on the watchlist. Moreover, what's the benefit of having alerts if you don't know which transaction must be inspected/monitored?
The fact that the ALERT is not systematic but conditional can be documented with an optional relationship.
The matching of the transaction with the watchlist is based on some criteria, but the relationship with the alert would be based on the id.
The ER diagrams show the structure of the entities and the relationships. They do not describe the processes or the behaviors. Typically, with an ERD, you'd use some DFD to explain what data is consumed by the monitoring process that would generate the ALERT records. And the IF would be documented in the flowchart or pseudocode that documents this process. On the other hand, nothing prevents you from documenting this informally in a comment within your ERD.
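To make the split between structure and process concrete, here is a minimal sketch in Python with sqlite3. It is only an illustration under assumptions: the key columns (watch_id, transaction_id, alert_id), the function record_transaction and the sample names are hypothetical, and the customer and account entities are left out for brevity. The tables carry the structure (ALERT references both the transaction and the watch list entry by id), while the name comparison lives in the process.

```python
import sqlite3

# Hypothetical schema: only counterpart_name, counterpart_country, watch_list
# and Alert come from the question; every key column is an assumption.
SCHEMA = """
CREATE TABLE watch_list (
    watch_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    country  TEXT
);
CREATE TABLE transact_with (
    transaction_id      INTEGER PRIMARY KEY,
    counterpart_name    TEXT NOT NULL,
    counterpart_country TEXT
);
-- Optional relationship: most transactions never get an alert row.
CREATE TABLE alert (
    alert_id       INTEGER PRIMARY KEY,
    transaction_id INTEGER NOT NULL REFERENCES transact_with(transaction_id),
    watch_id       INTEGER NOT NULL REFERENCES watch_list(watch_id)
);
"""

def record_transaction(conn, counterpart_name, counterpart_country):
    """Insert a transaction, then create one alert per matching watch-list row."""
    cur = conn.execute(
        "INSERT INTO transact_with (counterpart_name, counterpart_country) "
        "VALUES (?, ?)",
        (counterpart_name, counterpart_country),
    )
    transaction_id = cur.lastrowid
    # The "if" lives here, in the process, not in the ER structure.
    matches = conn.execute(
        "SELECT watch_id FROM watch_list WHERE name = ?", (counterpart_name,)
    ).fetchall()
    for (watch_id,) in matches:
        # The alert is tied to the match by id, not by name.
        conn.execute(
            "INSERT INTO alert (transaction_id, watch_id) VALUES (?, ?)",
            (transaction_id, watch_id),
        )
    return transaction_id

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO watch_list (name, country) VALUES ('ACME Shell Ltd', 'XX')")
clean_id = record_transaction(conn, "Jane Doe", "DE")
flagged_id = record_transaction(conn, "ACME Shell Ltd", "XX")
alerts = conn.execute("SELECT transaction_id, watch_id FROM alert").fetchall()
```

Only the second transaction produces an alert row, and that row points at both the transaction and the watch list entry, so an investigator can navigate from the alert to either side.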
Unrelated remarks:
As you have lots of attributes on the relation TRANSACT_WITH, and since a relation is not supposed to have attributes, I understand that it is in reality an associative entity.
The matching of the watchlist solely on the base of the name (without even considering the country) might lead to a high number of false positives.
UML would allow you to constrain the ER-relationship (UML-association) with the ER-entity (UML-Class) ALERT, and to express the conditionality in a very precise manner.
Problem description
I am currently working on a project which requires a relational database for storage.
After thinking about the data and its relations for a while I ran into a quite repetitive problem:
I encountered a common data schema for entity A which contains some fields e.g. name, description, value. This entity is connected with entity B in multiple n-1 relations. So entity B has n entities A in relation rel1 and n entities A in relation rel2.
Now I am trying to break down this datamodel into a schema for a relational database (e.g. Postgres, MySQL).
After some research, I have not really found "the best" solution for this particular problem.
Some similar questions I have found so far:
Stackoverflow
DBA Stackexchange
My ideas
So I have thought about possible solutions which I am going to present here:
1. Duplicate table
The relationship from entity B to entity A has a certain meaning to it. So it is possible to create multiple tables (1 per relationship). This would solve all immediate problems but essentially duplicate the tables which means that changes now have to be reflected to multiple tables (e.g. a new column).
2. Introduce a type column
Instead of multiple relationships, I could just say "Entity B is connected with n entity A". Additionally, I would add a type column that then tells me to which relation entity A belongs. I am not exactly sure how this is represented with common ORMs like Spring-Hibernate and if this introduces additional problems that I am currently unaware of.
3. Abstract the common attributes of entity A
Another option is to create an ADetails entity, which bundles all attributes of entity A.
Then I would create two entities that represent each relationship and which are connected to the ADetails entity in a 1-to-1 relationship. This would solve the interpretation problem of the foreign key but might be too much overhead.
My Question
In the context of a medium-large-sized project, are any of these solutions viable?
Are there certain Cons that rule out one particular approach?
Are there other (better) options I haven't thought about?
I appreciate any help on this matter.
Edit 1 - PPR (Person-Party-Role)
Thanks for the suggestion from AntC. PPR Description
I think the described situation matches my problem.
Let's break it down:
Entity B is an event. There exists only one event for the given participants to make this easier. So the relationship from event to participant is 1-n.
Entity A can be described as Groups, People, Organization but given my situation they all have the same attributes. Hence, splitting them up into separate tables felt like the wrong idea.
To explain the situation with the class diagram:
An Event (Entity B) has a collection of n Groups (Entity A), n People (Entity A) and n Organizations (Entity A).
If I understand correctly the suggestion is the following:
In my case the relationship between Event and Participant is 1-n
The RefRoles table represents the ParticipantType column that describes to which relationship the Participant belongs (is it a customer or part of the service for the event, for example)
Because all my Groups, People and Organizations have the same attributes the only table required at this point is the Participant table
If there are individual attributes in the future I would introduce a new table (e.g. People) that references the Participant in a 1-1 relationship.
If several such tables are added later, their 1-1 foreign keys to Participant must be mutually exclusive (so a participant can be only one of Group/Person/Organization)
Solution suggested by AntC and Christian Beikov
Splitting up the tables does make sense while keeping the common attributes in one table.
At the moment there are no individual attributes but the type column is not required anymore because the foreign keys can be used to see which relationship the entity belongs to.
I have created a small example for this:
There exist 3 types (previously type column) of people for an event: Staff, VIP, Visitor
The common attributes are mapped in a 1-1-relationship to the person table.
To make it simple: Each Person (Staff, VIP, Visitor) can only participate in one event. (Would be n-m-relationship in a more advanced example)
The database schema would be the following:
This approach is better than the type column in my opinion.
It also avoids having to interpret the entity based on its type in the application later on. It is also possible to resolve a type column in an ORM (see this question), but this approach avoids the struggle if the ORM you are using does not support resolving it.
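Since the schema picture is not reproduced here, the following is a rough sketch of what such a schema could look like, written in Python with sqlite3. All table and column names (event, person, staff, vip, visitor and their keys) and the helper functions are assumptions for illustration, not the original diagram. Note that this sketch does not enforce that a person appears in only one of the role tables; that would need an extra constraint or an application-level check.

```python
import sqlite3

ROLES = ("staff", "vip", "visitor")  # one table per relationship, replacing the type column

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL
);
-- Common attributes are stored once, in person.
CREATE TABLE person (
    person_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    description TEXT
);
""")
for role in ROLES:
    # UNIQUE(person_id) keeps the link to person 1-1; event_id makes event-to-role 1-n.
    conn.execute(f"""
        CREATE TABLE {role} (
            {role}_id INTEGER PRIMARY KEY,
            person_id INTEGER NOT NULL UNIQUE REFERENCES person(person_id),
            event_id  INTEGER NOT NULL REFERENCES event(event_id)
        )""")

def add_participant(conn, role, event_id, name, description=None):
    """Create the shared person row, then the role-specific row pointing at it."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    person_id = conn.execute(
        "INSERT INTO person (name, description) VALUES (?, ?)", (name, description)
    ).lastrowid
    # Interpolating the table name is safe only because role was checked above.
    conn.execute(
        f"INSERT INTO {role} (person_id, event_id) VALUES (?, ?)",
        (person_id, event_id),
    )
    return person_id

def names_for(conn, role, event_id):
    """Names of everyone attached to the event through the given role table."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    rows = conn.execute(
        f"SELECT p.name FROM {role} r JOIN person p ON p.person_id = r.person_id "
        "WHERE r.event_id = ? ORDER BY p.name",
        (event_id,),
    ).fetchall()
    return [name for (name,) in rows]

conn.execute("INSERT INTO event (title) VALUES ('Launch party')")
add_participant(conn, "staff", 1, "Ann")
add_participant(conn, "vip", 1, "Bob")
add_participant(conn, "visitor", 1, "Cy")
add_participant(conn, "visitor", 1, "Dee")
```

The role a person plays is read from the table the row sits in, so no type column has to be interpreted by the application.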
IMO since you already use dedicated terms for these objects, they probably will diverge and splitting up a table afterwards is quite some work, also on the code side, so I would suggest you map dedicated entities/tables from the beginning.
I have a problem, it is as follows:
An internet store wants to send e-mails to customers. It wants to keep a database to record what messages have been sent to which customer. Suppose an e-mail has a message id (M-id), a subject (Subject), and message body (Body). The customer is identified by his e-mail address, other customer information includes name (Name), gender (Gender), and address (Address).
When the internet store sends an email to the customer, SendDate is recorded.
Now we're supposed to draw an ERD with the information given above and then draw a relational database schema.
Based on the bold statement, I drew the following ERD, not knowing what to do with cardinality and participation:
The answer to the problem is this:
Note that Send is a weak entity, and that there is full participation between email and contains, sends and customer.
Q1: Why can't I use a ternary relationship for this example?
Q2: Regardless of this problem, in a ternary relationship, how do we determine the cardinality and participation?
Q3: How does one arrive to the final answer?
Q1: Why can't I use a ternary relationship for this example?
The question indicates that e-mail is recorded on behalf of a single internet store. There's no need to specify it in each association, the entire database belongs to the store.
If you were modeling e-mail sent to customers at multiple internet stores, a ternary relationship would be appropriate.
Q2: Regardless of this problem, in a ternary relationship, how do we determine the cardinality and participation?
The cardinality of each role in a relationship is the number of values in that role that can be associated with each valid combination of the other roles. E.g. if you have a relationship (A, B, C), then the cardinality of A is the number of values from A that can appear for each valid combination of (B, C). If (B, C) is a superkey then the cardinality of A is one.
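As a small illustration of that definition, the sketch below counts, for one role, how many distinct values occur per combination of the other roles in a sample set of relationship instances. The function name role_cardinality and the sample data are hypothetical. Keep in mind that a cardinality constraint is a statement about all valid instances, so a count over sample data can only show what is observed, not what the model permits.

```python
from collections import defaultdict

def role_cardinality(tuples, role_index):
    """Largest number of distinct values seen in one role for any single
    combination of the remaining roles."""
    values_per_combo = defaultdict(set)
    for row in tuples:
        others = row[:role_index] + row[role_index + 1:]
        values_per_combo[others].add(row[role_index])
    return max((len(values) for values in values_per_combo.values()), default=0)

# Hypothetical instances of a ternary relationship (A, B, C).
# Every (B, C) pair occurs once, so (B, C) acts as a superkey in this sample.
rel = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a2", "b2", "c1"),
    ("a1", "b2", "c2"),
]

card_a = role_cardinality(rel, 0)  # values of A per (B, C)
card_b = role_cardinality(rel, 1)  # values of B per (A, C)
card_c = role_cardinality(rel, 2)  # values of C per (A, B)
```

In this sample A comes out as one, matching the superkey remark, while B and C each come out as two (many).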
Participation is simpler: for each role, must all the values in the associated entity set necessarily participate in the relationship, or can some exist independently? I suggest you also see my answer to this question: is optionality (mandatory, optional) and participation (total, partial) are same?.
Q3: How does one arrive to the final answer?
I disagree with the final answer you posted. In the ER model, weak entity sets can't have multiple identifying relationships, and usually have a weak key. I suspect the author may be using some network data model concepts (such as conflating relationships with foreign key constraints and/or thinking only entities can have attributes).
My own answer to the question would look like this:
I am getting confused about entity and entity set in DBMS.
Does a set of entities form an entity set, just like a set of Student objects forms an array of Students?
Should we compare a table in a relational database to an entity set or to an entity?
If I compare an entity set to a table, can I then compare an entity to a record in the table? If I am wrong, please correct me.
I have gone through some books and blogs regarding this. Sometimes an entity is compared with a table in an RDBMS and sometimes with an entity set. Which is true? I have not been able to find a proper explanation.
Please provide examples and a clear explanation. Thanks in advance!
There are various descriptions of the terms, and unfortunately blogs, tutorials, enterprise framework documentation and diagramming software tend to conflate the concepts. For more rigorous definitions, consult academic papers and books by the founders of the field.
An entity is a thing which can be distinctly identified, like a specific person, company, or event. Entities are identified by values in a database, e.g. I (an entity in the real world) am represented by the number 532721 in StackOverflow's database.
An entity set is a set of similar things, like a set of persons, companies or events. An example would be all the users on StackOverflow. Entities and entity sets are conceptual and not directly contained in databases. StackOverflow's database talks about its users, those users don't actually live in the database.
A table is a data structure which represents a predicate. A predicate is a fact type, a generic statement with placeholders for values. Records contain values for those placeholders that make the predicate true, thus records represent propositions about entities in the world. Another way to view it is that a table represents a set of attributes and relations on one or more entity sets. Remember attributes are just binary relations.
For example, a table USER (UserId PK, UserName UQ, Reputation, PhotoId UQ) can be understood as saying "There exists in the world a user identified by a number UserId and unique name UserName who has a score of Reputation points and exclusively uses the photo identified as PhotoId as avatar". Each corresponding record represents a known fact about a user and an image.
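The reading of a table as a predicate and of records as propositions can be sketched in a few lines of Python. This is only an illustration: the record values and the helper to_propositions are made up, and the sentence template paraphrases the one above.

```python
# Predicate (fact type) that the USER table stands for; placeholders are column names.
USER_PREDICATE = (
    "There exists a user identified by {UserId} with unique name {UserName} "
    "who has a score of {Reputation} points and uses photo {PhotoId} as avatar."
)

# Hypothetical records; each one supplies values that make the predicate true.
user_records = [
    {"UserId": 1, "UserName": "alice", "Reputation": 120, "PhotoId": 10},
    {"UserId": 2, "UserName": "bob", "Reputation": 15, "PhotoId": 11},
]

def to_propositions(predicate, records):
    """Substitute each record's values into the predicate's placeholders."""
    return [predicate.format(**record) for record in records]

propositions = to_propositions(USER_PREDICATE, user_records)
```

Each resulting sentence is a statement about the world that the database asserts to be true; the users themselves remain outside the database.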
I recommend you read Codd's paper "A Relational Model of Data for Large Shared Data Banks" and Chen's paper "The Entity-Relationship Model - Toward a Unified View of Data". They're shorter and more focused than a whole book, and can be easily found online.
Is it things like tables? Or would it also include things like constraints, stored procedures, packages, etc.?
I've looked around the internet, but finding elementary answers to elementary questions is sometimes a little difficult.
That's quite a general question!
Basically, all types that the database system itself offers, like NUMERIC, VARCHAR etc., or that the programming language of choice offers (int, string etc.) would be considered "atomic" data(base) types.
Anything that you - based on your program's or business' requirements - build from that, business objects and so forth, are entities.
Tables, constraints and so forth are database-internal objects needed to store and retrieve data, but those are generally not considered "entities". The data stored in your tables, once retrieved and converted into an object, is an entity.
Marc
In the entity relationship world an entity is something that may exist independently and so there is often a one-to-one relationship between entities and database tables. However, this mapping is an implementation decision: For example, an ER diagram may contain three entities: Triangle, Square and Circle and these could potentially be modelled as a single table: Shape.
Also note that some database tables may represent relationships between entities.
This seems helpful: http://en.wikipedia.org/wiki/Entity-relationship_model
In a database an entity is a table. The table represents whatever real world concept you are trying to model (person, transaction, event).
Constraints can represent relationships between entities. These would be foreign keys. They also enforce rules like: first_name cannot be blank (null); a transaction must have 1 or more items; an event must have a date time.
Stored Procedures / Packages / Triggers could handle more complex relationships and/or they can handle business rules, just depends on what it's doing.
It kind of depends on how you think about it and how you model your problem domain. Most of the time when you hear about entities, they are database tables (one or many) mapped onto object classes. So it's not really an entity until it's been queried for and turned into a class instance.
But again, it depends on your modeling methodology, and there are multiple :-)
This thread is demonstrating one reason why it is difficult to find "elementary answers to elementary questions". Certain words have been used by different programming paradigms to mean different things (try asking a bunch of OO programmers what the difference between a Class and an Object is sometime).
Here's my take on it.
I first came across Entity as a modelling term in SSADM (ask your dad). In that context an Entity is used to model a logical clump of data during the requirements gathering / analysis phase. The relationships between entities were modelled using Entity Relationship diagrams, and the profile of an Entity was modelled using Entity Life Histories. ELH diagrams were very useful in COBOL systems but utterly horrible in relational databases. ERDs on the other hand continue to be useful to this day.
During the design and implementation phases the Entities get resolved into database tables, objects or records in a COBOL input file. In the course of that process a logical entity may get split across multiple tables, or several entities may get squidged into a single table, or there may be a one-to-one mapping. Sometimes an entity is resolved away entirely or lingers on as a view or a stored procedure.
My answer is obviously a little late, but here it is as defined in a database certification text book:
Entity: A uniquely identifiable element about which data is stored in a database.
and to clear up entity and table confusion,
Entity is not a table. Tables can be called "tables" or "relations"; the words are synonymous.
We'd need to know some context. One thing people sometimes do when analysing data in preparation for designing a database is to create an Entity Relationship Diagram, where you consider what data items you are managing and their relationships.
I wonder if that's the context you mean?
If so perhaps a read of this article would get you started?
Entities are "things of significance" to the users/business/enterprise/problem domain.
Update:
See this article in my blog in which I try to cover the subject in more detail:
What is entity-relationship model?
An entity is a term from the entity-relationship model.
A relational model (your database schema) is one of the ways to implement the ER model.
Relational tables represent relations between simple types like integers and strings, which, in their turn, can represent everything: entities, attributes, relationships.
You cannot tell what it is from the relational structure alone; you need to see the ER model.
For table persons,
id name surname
1 John Smith
id, name and surname are entities in the real world and may or may not represent entities in the underlying ER model.
The fact that a record exists in the table means that these entities are in the following relation: "person 1 has name John and has surname Smith".
In the example above, the entity is defined by id (from the model's point of view).
If a person changes his name from John to Jack, the person remains the same (again, from the model's point of view), but gets related to another name.
In the example above, name and surname can be treated as attributes (as opposed to entities), but again, you need to see the ER model which this schema implements to tell what they are.
In some ER-to-relational model mappings, an entity should be defined in a table referenceable with a FOREIGN KEY to be considered an entity (which should constrain its domain).
However, this constraint can exist but not be represented in a database (due to technological limitations or something else).
Like, we cannot keep a list of all possible names, but the name of ##$^# is most probably a non-name, hence, it does not belong to the domain of names.
Therefore, an attribute is an entity which can participate in a relationship but cannot be contained in a domain-defining table.
For instance, the table prices:
good_id price
defines relationships between the set of goods (which is defined by the table goods) and the set of real numbers (which cannot be contained in a table since it's not even countable).
Still each price (like $2.00) is a real-world entity just as well.
There are a couple of questions around asking for the difference between, or an explanation of, identifying and non-identifying relationships in relational databases.
My question is, can you think of simpler terms for this jargon? I understand that technical terms have to be specific and unambiguous. But having an 'alternative name' might help students relate more easily to the concept behind them.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
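The distinction can be checked mechanically. Below is a minimal sketch in Python with sqlite3, assuming hypothetical column names (user_id, phone, state_code) for the three tables named above. The helper is_identifying is an illustration only: it reports whether the foreign key columns pointing at the parent are all part of the child's primary key.

```python
import sqlite3

SCHEMA = """
CREATE TABLE States (state_code TEXT PRIMARY KEY);
CREATE TABLE Users (
    user_id INTEGER PRIMARY KEY,
    state   TEXT REFERENCES States(state_code)  -- non-identifying: FK outside the PK
);
CREATE TABLE PhoneNumbers (
    user_id INTEGER NOT NULL REFERENCES Users(user_id),
    phone   TEXT NOT NULL,
    PRIMARY KEY (user_id, phone)                 -- identifying: FK is part of the PK
);
"""

def is_identifying(conn, table, parent):
    """True if every FK column from `table` to `parent` belongs to `table`'s primary key."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk_cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0}
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fk_cols = {
        row[3]
        for row in conn.execute(f"PRAGMA foreign_key_list({table})")
        if row[2] == parent
    }
    return bool(fk_cols) and fk_cols <= pk_cols

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
phone_is_child = is_identifying(conn, "PhoneNumbers", "Users")
user_is_child = is_identifying(conn, "Users", "States")
```

With this schema PhoneNumbers comes out as a child of Users, while Users merely references States.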
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
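In relational terms this usually ends up as a composite primary key. The following is a small sketch in Python with sqlite3; the table and column names and the sample rows are assumptions for illustration. It shows that item number 1 may appear in several orders, that the same {order number, item number} pair is rejected, and that a line item cannot exist without its order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE orders (order_number INTEGER PRIMARY KEY);
CREATE TABLE line_item (
    order_number INTEGER NOT NULL REFERENCES orders(order_number),
    item_number  INTEGER NOT NULL,
    product_id   TEXT NOT NULL,
    unit_price   REAL NOT NULL,
    quantity     INTEGER NOT NULL,
    -- The weak entity's key combines the owner's key with the partial key.
    PRIMARY KEY (order_number, item_number)
);
""")

def try_insert(row):
    """Return True if the line item was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO line_item VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

conn.executemany("INSERT INTO orders VALUES (?)", [(100,), (101,)])
first_ok = try_insert((100, 1, "widget", 2.50, 4))
other_order_ok = try_insert((101, 1, "gadget", 9.99, 1))  # item 1 again, different order
duplicate_ok = try_insert((100, 1, "widget", 2.50, 1))    # same {order, item} pair
orphan_ok = try_insert((999, 1, "widget", 2.50, 1))       # no such order
```

The first two inserts succeed and the last two are rejected, which is the "belongs to exactly one order" rule expressed as constraints.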
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that a UK university that teaches ER modeling uses the term "entity subtype" for the very same thing that I have always called "entity supertype", and vice versa!
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation