Designing entity relations - theoretical or practical solutions?

I have the following issue with designing entity schemas.
Let's say that I've created a schema full of many-to-many relations, because it seemed like a reasonable choice at first, but during implementation those relations are not needed.
For example, theoretically each county can have many lakes and each lake can be located in many counties. But my database has no lakes crossing borders of the counties. Is it still reasonable to use many-to-many relations? It'll basically create a junction table that serves no need, because I can represent it with one-to-many relationships.
I have a geographical database that I initially thought would contain a lot of many-to-many relations, but in practice such relations are needed in only a few tables.

First of all: It's related to your Software Functional Requirements.
Your System Analysts should decide on that. And System Analysts should follow System Owners, End Users, and other Stakeholders within the organization.
If you write a project for an organization or company, you should ask them.
Secondly: In database design, if even a few countries can share a lake, you should use a many-to-many relationship. The reason is the extensibility of your project. I don't think many-to-many is much harder to work with than one-to-many.
Thirdly: If only a few countries can share a lake, I think you can use this data modeling technique:
You can use a combination of one-to-many and many-to-many.
Add the primary key of Country as an FK to Lake (for the one-to-many relationship).
Add a new table such as Country_Lakes with FKs to Country and Lake (for the many-to-many relationship).
How to detect the lake type (shared or not): if the Country FK in Lake is NULL, the lake is shared, and you can get all of its countries from Country_Lakes.
This design leaves some NULLs in the FK column, but only for the few shared lakes.
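The hybrid design described above could be sketched roughly as follows, using Python's built-in sqlite3 module. The column names, the helper countries_of_lake, and the sample rows are assumptions made for illustration. Only the table names Country, Lake, and Country_Lakes come from the answer.

```python
import sqlite3

# Hypothetical schema: column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE Country (
    country_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE Lake (
    lake_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- NULL means the lake is shared and its countries are in Country_Lakes
    country_id INTEGER REFERENCES Country(country_id)
);
CREATE TABLE Country_Lakes (
    country_id INTEGER NOT NULL REFERENCES Country(country_id),
    lake_id    INTEGER NOT NULL REFERENCES Lake(lake_id),
    PRIMARY KEY (country_id, lake_id)
);
"""


def countries_of_lake(conn, lake_id):
    """Return the country names for a lake, whichever way it is stored."""
    row = conn.execute(
        "SELECT country_id FROM Lake WHERE lake_id = ?", (lake_id,)
    ).fetchone()
    if row is None:
        return []
    if row[0] is not None:
        # Ordinary lake: the single country is stored on the Lake row.
        query = "SELECT name FROM Country WHERE country_id = ?"
        params = (row[0],)
    else:
        # Shared lake: the countries live in the junction table.
        query = """SELECT c.name FROM Country c
                   JOIN Country_Lakes cl ON cl.country_id = c.country_id
                   WHERE cl.lake_id = ? ORDER BY c.name"""
        params = (lake_id,)
    return [name for (name,) in conn.execute(query, params)]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO Country VALUES (?, ?)", [(1, "Country A"), (2, "Country B")]
)
conn.execute("INSERT INTO Lake VALUES (10, 'Inner Lake', 1)")
conn.execute("INSERT INTO Lake VALUES (11, 'Border Lake', NULL)")
conn.executemany("INSERT INTO Country_Lakes VALUES (?, ?)", [(1, 11), (2, 11)])
```

Every query that needs the countries of a lake has to go through a branch like the one in countries_of_lake, which is the cost of mixing the two relationship styles.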

If you might be using it, keep it. Creating many-to-many relationships afterwards is far more complicated.
If you won't ever use it, remove it.

Related

Complicated database design

We have a database design problem in our company. We are trying to figure out the best way to design the database to store transactional data. I need expert advice on the best relational design to achieve it. Problem: We have different kinds of “Entities” in our system, for example Customers, Services, Dealers, etc. These Entities transfer funds between each other. We need to store the history of the transfers in the database.
Solutions:
One table of transfers and another table to keep “Accounts” information. There are three tables “Customers”, “Services”, “Dealers”. There is another table “Accounts”. An account can be related to any of the “Entities” mentioned above; it means (and that’s the requirement) that logically there should be a one-to-one relationship to/from Entities and Accounts. However, we can only store the Account_ID in the Entities table, but we cannot store the foreign key of Entities in Accounts table. Here the problem happens in terms of database design. Because if there is a customer’s account, it is not restricted by the database design to not be stored in Services table etc. Now we can keep all transfers in one table only since Accounts are unified among all the entities.
Keep the balance information in the primary Entities tables and use separate tables for the transfers. Here, each kind of transfer between entities gets its own table. For example, a transfer between a Customer and a Service provider will be stored in a table called “Spending”. Another table, called “Commission”, will hold transfers between Services and Dealers, and so on. In this case, we are not storing all the transfers of funds in a single table, but the foreign keys are properly defined, since the tables “Spending” and “Commission” each sit between two specific entities.
According to the best practices, which one of the above given solutions is correct, and why?
If you are simply looking for schemas that claim to deal with cases like yours, there is a website with hundreds of published schemas. Some of these pertain to storing transaction data concerning customers and suppliers. You can take one of these and adapt it.
http://www.databaseanswers.org/data_models/
If your question is about how to relate accounts to business contacts, read on.
Customers, Services, and Dealers are all sub classes of some super class that I'll call Contacts. There are two well known design patterns for modeling sub classes in database tables. And there is a technique called Shared primary Key that can be used with one of them to good advantage.
Take a look at the info and the questions grouped under these three tags:
single-table-inheritance class-table-inheritance shared-primary-key
If you use class table inheritance and shared primary key, you will end up with four tables pertaining to contacts: Contacts, Customers, Dealers, and Services. Every entry in Contacts will have a corresponding entry in one of the three subclass tables.
An FK in the accounts table, let's call it Accounts.ContactID will not only reference a row in Contacts, but also a row in whichever of Customers, Dealers, Services pertains to the case at hand.
This may work out well for you. Alternatively, single table inheritance works out well in some of the simpler cases. It depends on details about your data and your intended use of it.
You can give the Accounts table three FK columns, one each to Customers, Dealers, and Services, and that will solve the problem. Alternatively, you can make three tables of accounting data, one for each type of entity. This is a case where several designs each solve the task. To decide, you need to weigh the pros and cons in terms of code complexity, performance, and other system requirements. For example, one table will be simpler to code, but three tables may give better performance in the SQL database.
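As a rough sketch of class table inheritance with a shared primary key, here is one way the Contacts layout could look in SQLite, driven from Python. Only the table names and Accounts.ContactID come from the answer. The other columns, the UNIQUE constraint (one possible way to express the one-to-one requirement from the question), and the helpers add_customer and open_account are assumptions.

```python
import sqlite3

# Hypothetical columns; only the table names and Accounts.ContactID are from the answer.
SCHEMA = """
CREATE TABLE Contacts (
    ContactID INTEGER PRIMARY KEY
);
-- Each subclass table reuses ContactID as both primary key and foreign key
CREATE TABLE Customers (
    ContactID INTEGER PRIMARY KEY REFERENCES Contacts(ContactID),
    FullName  TEXT NOT NULL
);
CREATE TABLE Dealers (
    ContactID INTEGER PRIMARY KEY REFERENCES Contacts(ContactID),
    Region    TEXT NOT NULL
);
CREATE TABLE Services (
    ContactID INTEGER PRIMARY KEY REFERENCES Contacts(ContactID),
    Category  TEXT NOT NULL
);
CREATE TABLE Accounts (
    AccountID INTEGER PRIMARY KEY,
    -- UNIQUE is an assumption that models one account per contact
    ContactID INTEGER NOT NULL UNIQUE REFERENCES Contacts(ContactID),
    Balance   INTEGER NOT NULL DEFAULT 0
);
"""


def add_customer(conn, full_name):
    """Insert the generic row first, then reuse its key in the subclass table."""
    contact_id = conn.execute("INSERT INTO Contacts DEFAULT VALUES").lastrowid
    conn.execute("INSERT INTO Customers VALUES (?, ?)", (contact_id, full_name))
    return contact_id


def open_account(conn, contact_id):
    """Create an account for any kind of contact and return its AccountID."""
    cursor = conn.execute(
        "INSERT INTO Accounts (ContactID) VALUES (?)", (contact_id,)
    )
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
customer_id = add_customer(conn, "Example Customer")
account_id = open_account(conn, customer_id)
```

Because the key values are shared, Accounts.ContactID can be joined directly to Customers, Dealers, or Services without going through Contacts first.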

Would this data model be considered correct

I'm new to data modelling and have started following tutorials to learn more.
I am trying to create a model for a hypothetical scenario and am struggling to validate what I have created to see if it is what would be considered a correct data model.
Essentially, all I'm trying to do is store data correctly in a normalised form. In my scenario there are 3 types of people; they share some attributes, and each has one set of contact details.
Does the below data model look feasible?
The relationship between person and one of defendant, magistrate, or staff-member is a case of the class/subclass pattern. There are two common ways of modeling this pattern in relational tables.
One way is called "Class Table Inheritance". You can find out more by visiting this tag: class-table-inheritance or by searching the web for Martin Fowler's treatment of the same subject. Your design resembles this design.
Another way is called "Single Table Inheritance", which you can also research the same way. single-table-inheritance. It's simpler, and works ok in some cases. You deal with fewer joins, but you deal with more NULLS.
Many people who go for class table inheritance also apply a technique called "Shared Primary Key". shared-primary-key. Using this technique, Defendant, Magistrate, and Staff_Member would each use a copy of person_id as the primary key. This primary key also functions as a foreign key. Shared primary key enforces the one-to-one nature of the IS-A relationships that exist in this case.
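To make the contrast concrete, here is a small sketch of both layouts in SQLite, driven from Python. The attribute columns (case_number, court), the Person_STI table name, and the helper is_rejected are hypothetical, since the original diagram is not reproduced here. Staff_Member is omitted to keep the sketch short.

```python
import sqlite3

SCHEMA = """
-- Class table inheritance with a shared primary key
CREATE TABLE Person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE Defendant (
    person_id   INTEGER PRIMARY KEY REFERENCES Person(person_id),
    case_number TEXT NOT NULL   -- hypothetical attribute
);
CREATE TABLE Magistrate (
    person_id INTEGER PRIMARY KEY REFERENCES Person(person_id),
    court     TEXT NOT NULL     -- hypothetical attribute
);

-- Single table inheritance, where unused subtype columns stay NULL
CREATE TABLE Person_STI (
    person_id   INTEGER PRIMARY KEY,
    person_type TEXT NOT NULL CHECK (person_type IN ('defendant', 'magistrate')),
    name        TEXT NOT NULL,
    case_number TEXT,
    court       TEXT
);
"""


def is_rejected(conn, sql, params):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

conn.execute("INSERT INTO Person VALUES (1, 'Person One')")
conn.execute("INSERT INTO Defendant VALUES (1, 'CASE-1')")

# A second Defendant row for the same person violates the primary key.
duplicate_rejected = is_rejected(
    conn, "INSERT INTO Defendant VALUES (?, ?)", (1, "CASE-2")
)
# A Defendant row without a matching Person violates the foreign key.
orphan_rejected = is_rejected(
    conn, "INSERT INTO Defendant VALUES (?, ?)", (2, "CASE-3")
)

# The same person in the single-table layout carries a NULL court column.
conn.execute(
    "INSERT INTO Person_STI VALUES (1, 'defendant', 'Person One', 'CASE-1', NULL)"
)
```

Note that the shared key on its own does not stop the same person_id from appearing in both Defendant and Magistrate. If the subclasses must be mutually exclusive, that needs an extra constraint or application logic.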
If you want to go further in data modeling, you might want to learn ER modeling as a distinct data model from the relational model. What you've done here is essentially to use ER diagramming to diagram a relational model. There's nothing wrong with that, but it obscures a whole new field of study, generally called conceptual data modeling.
If you generate an ER model at the conceptual level, you don't attempt to implement it in terms of tables. There is a diagramming convention in ER that goes under the name "generalization/specialization" that allows you to depict a class/subclass situation, while remaining silent on how it's going to be implemented.
Conceptual data models have an area of usefulness, in addition to relational data modeling. What makes conceptual data models useful is precisely the fact that they present the information requirements without stating how those requirements are going to be met.
Once you are proficient at creating conceptual data models, it's not hard to convert one of them to a relational model.
This may be more than you bargained for, but since you are taking on learning modeling, I thought I'd survey some of the field for you.

Is there a relationship between Database Tables and Object Oriented Classes?

Every time I program I notice this relationship between classes and tables, or am I imagining it?
You can have a class per database table or a table per class, i.e.:
tables: customer, products, order.
classes: customer, products, order, which may have methods such as addRecord, deleteRecord, updateRecord.
What is this called? Object-relational? I am not a DBA.
It all depends on the type of database you're using. If you're using an object oriented database (OODB), then there is no relationship, as the objects and the persisted data are the same thing. For example, if you have a Customer class, and you save it in an OODB, then that instance of the customer is what is stored in the DB.
If you are using a relational database, then the class instances, and the persisted representation of them in the DB, can be the same thing, but many times they aren't. This is because most folks use normalization to represent their data in an efficient way (in a relational DB). This means, instead of having a table per class, you can have a class represented by more than one table. In the Customer example, the tables might now be Customer (with Name, date of birth, and other properties), and Order (with order pointing to products in yet another table). The reason for this has to do with cardinality, and the ability for Customers to have more than one order. When your business logic needs this information from the DB, the data access layer's job is to map the data (called ORM) from the DB into your classes.
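A hand-written version of that mapping step might look like the sketch below. It is not how any particular ORM library works. The dataclasses, the load_customer function, and the schema (with the order table named Orders to avoid the reserved word ORDER) are assumptions for illustration.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: int
    product: str


@dataclass
class Customer:
    customer_id: int
    name: str
    orders: list = field(default_factory=list)


# Hypothetical normalized schema: one class (Customer) spans several tables.
SCHEMA = """
CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Product  (product_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE Orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES Customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES Product(product_id)
);
"""


def load_customer(conn, customer_id):
    """Map one Customer row plus its Orders rows onto a single Customer object."""
    row = conn.execute(
        "SELECT customer_id, name FROM Customer WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        return None
    customer = Customer(*row)
    rows = conn.execute(
        """SELECT o.order_id, p.title FROM Orders o
           JOIN Product p ON p.product_id = o.product_id
           WHERE o.customer_id = ? ORDER BY o.order_id""",
        (customer_id,),
    )
    customer.orders = [Order(order_id, title) for order_id, title in rows]
    return customer


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO Customer VALUES (1, 'Example Customer')")
conn.executemany("INSERT INTO Product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", [(1, 1, 2), (2, 1, 1)])
loaded = load_customer(conn, 1)
```

Here the Customer object is assembled from three tables, which is the sense in which one class can be represented by more than one table.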
If you are using yet another type of DB, then there will be a different relationship between the classes (domain model) and what's persisted in the DB.
But, as far as having a name for this relationship? No, there is no name.
In addition to Bob's answer:
In object modeling, the relationship between classes and subclasses is taken care of by inheritance, and object modelers know how to use inheritance to good advantage. The relational data model and by extension the SQL databases do not implement inheritance for you. You have to design tables to give you some of the same results.
In ER (Entity-Relationship) modeling, the corresponding concept is called generalization/specialization. This tells you how to model a class/subclass relationship, but it doesn't tell you how to design the tables when you go to build your database.
There are three techniques that are pretty well understood that can be really helpful when dealing with classes and subclasses. Here are their tags: single-table-inheritance class-table-inheritance shared-primary-key. Unfortunately, many tutorials on database design never cover these techniques. They can be enormously useful to people who know object modeling and want to come up to speed on relational modeling.

Supertype/subtype db design with subtype cross-link

This is probably a simple problem for an experienced database developer, but I'm struggling... I have trouble translating a certain ER diagram to a DB model, any help is appreciated.
I have a setup similar to slide 17 of this presentation:
http://www.cbe.wwu.edu/misclasses/mis421s04/presentations/supersubtype.ppt
Slide 17 shows an ER diagram with an Employee supertype having an Employee Type attribute and as subtypes the Employee Types themselves (Hourly, Salaried and Consultant), which is very similar to my design situation.
In my case, suppose Salaried Employees are the only ones that can be bosses of other employees and I wanted to somehow indicate if a certain Salaried employee is the boss of the Hourly and/or Salaried Employee and/or Consultant (either, none or both), how could that be designed in a database model, also considering these are one-to-many relationships?
I can put a PK-FK relationship between them, which would result in all tables having two FKs (for example, Consultant having FK_Employee and FK_SalariedEmployee) and SalariedEmployee referencing itself, but I keep thinking that might not be the wisest solution... although I'm not sure why (integrity issues?).
Is this an acceptable solution, or is there a better one?
Thanks in advance for any help!
Your case looks like an instance of the design pattern known as “Generalization Specialization” (Gen-Spec for short). The gen-spec pattern is familiar to object oriented programmers. It’s covered in tutorials when teaching about inheritance and subclasses.
The design of SQL tables that implement the gen-spec pattern can be a little tricky. Database design tutorials often gloss over this topic. But it comes up again and again in practice.
If you search the web on “generalization specialization relational modeling” you’ll find several useful articles that teach you how to do this. You’ll also be pointed to several times this topic has come up before in this forum.
The articles generally show you how to design a single table to capture all the generalized data and one specialized table for each subclass that will contain all the data specific to that subclass. The interesting part involves the primary key for the subclass tables. You won’t use the autonumber feature of the DBMS to populate the sub class primary key. Instead, you’ll program the application to propagate the primary key value obtained for the generalized table to the appropriate subclass table.
This creates a two way association between the generalized data and the specialized data. A simple view for each specialized subclass will collect generalized and specialized data together. It’s easy once you get the hang of it, and it performs fairly well.
In your specific case, declaring the "boss of" FK to reference the PK in the Salaried Employees table will be enough to do the trick. This will produce the two way association you want, and also prevent employees who are not salaried from being referenced as bosses.
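One possible reading of this advice, sketched with SQLite from Python: the boss FK is placed on the generalized Employee table (an assumption, so that any kind of employee can have a boss) and points at the SalariedEmployee primary key. Column names, the view name, and the helper try_add_employee are hypothetical. The Consultant table is left out for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    -- boss-of link that may be NULL and may only point at a salaried employee
    boss_id     INTEGER REFERENCES SalariedEmployee(employee_id)
);
CREATE TABLE SalariedEmployee (
    employee_id INTEGER PRIMARY KEY REFERENCES Employee(employee_id),
    salary      INTEGER NOT NULL
);
CREATE TABLE HourlyEmployee (
    employee_id INTEGER PRIMARY KEY REFERENCES Employee(employee_id),
    hourly_rate INTEGER NOT NULL
);
-- One view per subclass gathers generalized and specialized columns
CREATE VIEW SalariedEmployeeView AS
SELECT e.employee_id, e.name, e.boss_id, s.salary
FROM Employee e JOIN SalariedEmployee s ON s.employee_id = e.employee_id;
"""


def try_add_employee(conn, employee_id, name, boss_id):
    """Insert a generalized Employee row and report whether the boss FK allowed it."""
    try:
        conn.execute(
            "INSERT INTO Employee VALUES (?, ?, ?)", (employee_id, name, boss_id)
        )
    except sqlite3.IntegrityError:
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# The key obtained for the generalized row is propagated to the subclass row.
conn.execute("INSERT INTO Employee VALUES (1, 'Salaried Boss', NULL)")
conn.execute("INSERT INTO SalariedEmployee VALUES (1, 5000)")
conn.execute("INSERT INTO Employee VALUES (2, 'Hourly Worker', 1)")
conn.execute("INSERT INTO HourlyEmployee VALUES (2, 20)")
```

With this layout, an attempt to name employee 2 (hourly) as someone's boss fails at the database level, because 2 has no row in SalariedEmployee.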

a layman's term for identifying relationship

There are a couple of questions around asking for the difference between, or an explanation of, identifying and non-identifying relationships in relational databases.
My question is, can you think of a simpler term for this jargon? I understand that technical terms have to be specific and unambiguous, but having an 'alternative name' might help students relate more easily to the concept behind it.
We actually want to use a more layman term in our own database modeling tool, so that first-time users without much computer science background could learn faster.
cheers!
I often see child table or dependent table used as a lay term. You could use either of those terms for a table with an identifying relationship.
Then say a referencing table is a table with a non-identifying relationship.
For example, PhoneNumbers is a child of Users, because a phone number has an identifying relationship with its user (i.e. the primary key of PhoneNumbers includes a foreign key to the primary key of Users).
Whereas the Users table has a state column that is a foreign key to the States table, making it a non-identifying relationship. So you could say Users references States, but is not a child of it per se.
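The difference shows up directly in the table definitions. The sketch below, using SQLite from Python, assumes some column names (label, number, state_code) and adds a hypothetical helper, relationship_kind, that classifies a foreign key by checking whether its columns are part of the child's primary key.

```python
import sqlite3

# Table names follow the example above; column names are assumptions.
SCHEMA = """
CREATE TABLE States (state_code TEXT PRIMARY KEY);
CREATE TABLE Users (
    user_id INTEGER PRIMARY KEY,
    -- Non-identifying, because state is a plain FK outside the Users key
    state   TEXT REFERENCES States(state_code)
);
CREATE TABLE PhoneNumbers (
    -- Identifying, because the parent key is part of the child primary key
    user_id INTEGER NOT NULL REFERENCES Users(user_id),
    label   TEXT NOT NULL,
    number  TEXT NOT NULL,
    PRIMARY KEY (user_id, label)
);
"""


def relationship_kind(conn, child, parent):
    """Classify the FK from child to parent by inspecting SQLite's metadata."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk_cols = {
        row[1] for row in conn.execute(f"PRAGMA table_info({child})") if row[5] > 0
    }
    # foreign_key_list rows: (id, seq, table, from, to, ...)
    fk_cols = {
        row[3]
        for row in conn.execute(f"PRAGMA foreign_key_list({child})")
        if row[2] == parent
    }
    if not fk_cols:
        return "none"
    return "identifying" if fk_cols <= pk_cols else "non-identifying"


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

The table names are interpolated into the PRAGMA statements, so this helper is only suitable for trusted, hard-coded names.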
I think belongs to would be a good name for the identifying relationship.
A "weak entity type" does not have its own key, just a "partial key", so each entity instance of this weak entity type has to belong to some other entity instance so it can be identified, and this is an "identifying relationship". For example, a landlord could have a database with apartments and rooms. A room can be called kitchen or bathroom, and while that name is unique within an apartment, there will be many rooms in the database with the name kitchen, so it is just a partial key. To uniquely identify a room in the database, you need to say that it is the kitchen in this particular apartment. In other words, the rooms belong to apartments.
I'm going to recommend the term "weak entity" from ER modeling.
Some modelers conceptualize the subject matter as being made up of entities and relationships among entities. This gives rise to Entity-Relationship Modeling (ER Modeling). An attribute can be tied to an entity or a relationship, and values stored in the database are instances of attributes.
If you do ER modeling, there is a kind of entity called a "weak entity". Part of the identity of a weak entity is the identity of a stronger entity, to which the weak one belongs.
An example might be an order in an order processing system. Orders are made up of line items, and each line item contains a product-id, a unit-price, and a quantity. But line items don't have an identifying number across all orders. Instead, a line item is identified by {item number, order number}. In other words, a line item can't exist unless it's part of exactly one order. Item number 1 is the first item in whatever order it belongs to, but you need both numbers to identify an item.
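A minimal sketch of the line item example, using SQLite from Python. The table names, the integer price column, and the helper add_line_item are assumptions. The point it illustrates is the composite key {order number, item number}.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Orders (order_number INTEGER PRIMARY KEY);
CREATE TABLE LineItems (
    order_number INTEGER NOT NULL REFERENCES Orders(order_number),
    item_number  INTEGER NOT NULL,
    product_id   INTEGER NOT NULL,
    unit_price   INTEGER NOT NULL,  -- assumed to be stored in cents
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (order_number, item_number)
);
"""


def add_line_item(conn, order_number, product_id, unit_price, quantity):
    """Append a line item, numbering it from 1 within its own order."""
    (next_item,) = conn.execute(
        "SELECT COALESCE(MAX(item_number), 0) + 1 FROM LineItems"
        " WHERE order_number = ?",
        (order_number,),
    ).fetchone()
    conn.execute(
        "INSERT INTO LineItems VALUES (?, ?, ?, ?, ?)",
        (order_number, next_item, product_id, unit_price, quantity),
    )
    return next_item


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO Orders VALUES (?)", [(100,), (101,)])

first = add_line_item(conn, 100, 7, 250, 2)   # item 1 of order 100
second = add_line_item(conn, 100, 8, 999, 1)  # item 2 of order 100
other = add_line_item(conn, 101, 7, 250, 5)   # item 1 of order 101
```

Item number 1 now exists twice in LineItems, once per order, and a line item for an order that does not exist is rejected by the foreign key.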
It's easy to turn an ER model into a relational model. It's also easy for people who are experts in the data but know nothing about databases to get used to an ER model of the data they understand.
There are other modelers who argue vehemently against the need for ER modeling. I'm not one of them.
Nothing, absolutely nothing in the kind of modeling where one encounters things such as "relationships" (ER, I presume) is "technical", "precise" or "unambiguous". Nor can it be.
A) ER modeling is always and by necessity informal, because it can never be sufficient to capture/express the entire definition of a database.
B) There are so many different ER dialects out there that it is just impossible for all of them to use exactly the same terms with exactly the same meaning. Recently, I even discovered that a UK university that teaches ER modeling uses the term "entity subtype" for the very same thing that I have always called "entity supertype", and vice versa!
One could use connection.
You have Connection between two tables, where the IDs are the same.
That type of thing.
how about
Association
Link
Correlation
