How to do a good entity design with recursive child relationships - database

We need to build a category hierarchy, like
Clothing
    Men
        Suits
            Classic
            Modern
            Business
            Party
        Trousers
        Ties
        Beachwear
    Women
        Dresses
            Gala
            Evening
            Simple
        Skirts
            Long
            Mid-size
            Short
            Mini
        Blouses
        Beachwear
I designed the Category entity to have a belongsTo parent relationship only (I think I will need a children property too?). So far, a product would be attached to one category only.
Now the client is saying he thinks it should be a many-to-many relationship, and he thinks we should attach a product to all the categories in the chain (like ProductA should be attached to categories "Mid-size"-"Skirts"-"Women"-"Clothing" each). To me this sounds like overkill and a lot of redundancy. I should have all the other relationships through the parent chain available.
However, I wonder whether this is realistic, as these relationships need to be traversed and thus result in additional queries.
What would be a good design for clothing categories? That it be a hierarchy seems to be required by the client (there are also tags).

(like ProductA should be attached to categories "Mid-size"-"Skirts"-"Women"-"Clothing" each).
It is a standard requirement in e-commerce applications for one product to belong to several categories. It makes perfect sense to any manager.
You should implement tree traversal using efficient tree functions, not Eloquent's parent-child relationships. Check, for example, this Laravel package.
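To illustrate the asker's point that a product attached to a single leaf category is still reachable from every ancestor, here is a minimal sketch that walks the tree in one recursive query instead of one ORM lookup per level. It uses Python's sqlite3 module purely for illustration. The table and column names (categories, products, parent_id, category_id) and the helper products_under are assumptions, not taken from the question, and the Laravel package mentioned above may well work differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES categories(id)  -- NULL for the root
    );
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );
    INSERT INTO categories (id, name, parent_id) VALUES
        (1, 'Clothing', NULL),
        (2, 'Men',      1),
        (3, 'Women',    1),
        (4, 'Skirts',   3),
        (5, 'Mid-size', 4),
        (6, 'Ties',     2);
    -- Each product is attached to exactly one (leaf) category.
    INSERT INTO products (id, name, category_id) VALUES
        (1, 'ProductA', 5),
        (2, 'ProductB', 6);
""")

def products_under(category_name):
    """Return names of products in the category or any of its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM categories WHERE name = ?
            UNION ALL
            SELECT c.id FROM categories c JOIN subtree s ON c.parent_id = s.id
        )
        SELECT p.name
        FROM products p JOIN subtree s ON p.category_id = s.id
        ORDER BY p.name
    """, (category_name,)).fetchall()
    return [name for (name,) in rows]

women_products = products_under("Women")
all_products = products_under("Clothing")
```

Looking categories up by name is only for brevity here; in the hierarchy from the question "Beachwear" appears twice, so real code would look up by id.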

Related

I'm unable to normalize my Product table as I have 4 different product types

So because I have 4 different product types (books, magazines, gifts, food) I can't just put all products in one "products" table without having a bunch of null values. So I decided to break each product up into their own tables but I know this is just wrong (https://c1.staticflickr.com/1/742/23126857873_438655b10f_b.jpg).
I also tried creating an EAV model for this (https://c2.staticflickr.com/6/5734/23479108770_8ae693053a_b.jpg), but I got stuck as I'm not sure how to link the publishers and authors tables.
I know this question has been asked a lot but I don't understand ANY of the answers I've seen. I think this is because I'm a very visual learner and this makes it hard to understand what's being talked about when not a lot of information is given.
Your model is on the right track, except that the product name should be sufficient; you don't need gift name, book name, etc. What you put in those tables is the information that is specific to that type of product and that the other products don't need. The Product table contains all the common fields. I would use productid in the child tables rather than renaming it giftID, magazineID, etc. It is easier to remember what things are called when you are consistent in naming them.
Now to be practical, you put as much as you can into the product table especially if you are going to do calculations. I prefer the child tables in this specific case to have what is mostly display information. So product contains the product name, the cost, the type of product, the units the product is sold in etc. The stuff that generally is needed to calculate the cost of an order or to have a report of what was ordered. There may be one or two fields that can contain nulls, but it simplifies the calculation type queries so much it might be worth it.
The meat of the descriptive details though would go in the child table for the type of product. These would usually only be referenced when displaying the product in the shopping area and only one at a time, so you can use the product type to let you only join to the one child table you need for display. So while the order cares about the product number and name and cost calculations, it probably doesn't need to go line by line describing the book ISBN number or the megapixels in a camera. But the description page of the product does need those things.
This approach is not purely relational, although it mostly is, but it does group the information by the meanings of the data and how they will be used, which will make the database easier to understand and query. I am a big fan of relational tables because databases just work better when they hit at least the third normal form, but sometimes you can go too far for practicality, so the meaning of the data and the way you are grouping to use the data (and not just for the user interface, but for later reporting as well) is almost always one of my considerations in design.
Breaking each product type into its own table is fine - let the child tables use the same id as the parent Product table, and create views for the child tables that join with Product.
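A minimal sketch of that layout, assuming a common product table, one child table that reuses the parent's id as its own primary key, and a view that joins the two. It uses Python's sqlite3 module for illustration only; the column names (product_type, price, isbn, page_count) and the sample values are made up, and only the book child table is shown, with magazines, gifts and food following the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    -- Common fields, including what order calculations need.
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        product_type TEXT NOT NULL,
        price        REAL NOT NULL
    );
    -- Type-specific fields; the primary key is also a foreign key to product.
    CREATE TABLE book (
        product_id INTEGER PRIMARY KEY REFERENCES product(product_id),
        isbn       TEXT NOT NULL,
        page_count INTEGER
    );
    -- The view presents a book as one flat row.
    CREATE VIEW book_view AS
        SELECT p.product_id, p.name, p.price, b.isbn, b.page_count
        FROM product p JOIN book b ON b.product_id = p.product_id;
""")

# Insert the general row first, then reuse its key for the specialized row.
cur = conn.execute(
    "INSERT INTO product (name, product_type, price) VALUES (?, ?, ?)",
    ("Example Book", "book", 12.5),
)
product_id = cur.lastrowid
conn.execute(
    "INSERT INTO book (product_id, isbn, page_count) VALUES (?, ?, ?)",
    (product_id, "hypothetical-isbn", 320),
)

book_row = conn.execute("SELECT name, price, isbn FROM book_view").fetchone()

# A book row without a matching product row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO book (product_id, isbn) VALUES (999, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Order and reporting queries can stay on the product table alone; the view is only needed when a product page has to show the type-specific details.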
Your case is a classic case of types and subtypes. This is often called class/subclass in object modeling and generalization/specialization in ER modeling. It's a well understood pattern. There are known techniques for dealing with this pattern.
Visit the following tags, and read the description under the info tab (presented as "learn more"). Also look over the questions grouped under these tags.
single-table-inheritance class-table-inheritance shared-primary-key
If you want to read in more depth, use these buzzwords to search for articles on the web.
You've already discovered and discarded single table inheritance on your own. Other answers have pointed you at shared primary key. Class table inheritance involves a single table for generalized data as well as the four specialized tables. Shared primary key is generally used in conjunction with class table inheritance.

Data Modeling for consumer goods

A company is trying to build a system that breaks consumer goods (soft drinks, detergents, beauty products, etc.) down to their very basic components. The aim is to be able to break down all the characteristics of a product into as many enumerable quantities as possible. For instance, a soft drink will have the properties flavor, calories, color, cost, etc. Do note that the products will come from a huge variety of segments, that not all properties will be applicable to all products (detergents don't have calories), and that similar-sounding properties are not necessarily the same (a detergent with a lime fragrance is different from a lime-flavored soft drink). Also, search is expected to be fast and the database needs to understand relationships between products. Suggest only a data model for the same.
The feature you highlight, that not all properties describe all products, is a classic feature of a class/subclass situation. Or, if you prefer, type/subtype.
Dealing with just that feature of the problem, I'm going to call your attention to the EER (Extended Entity Relationship) model if you want to model your understanding of the subject matter. The EER has a way of depicting what it calls a generalization/specialization pattern. That's a good search term to find detailed descriptions of it. This will adequately depict what you've said you're after.
A word of caution, however. The majority of ER models you'll see here on SO are design models, not conceptual models. That is, they reflect the intent of designing tables made up of columns and rows, with keys and foreign keys, to contain the relevant data.
What I'm recommending is the EER model for a very different purpose. It's to depict the way the data looks to the subject matter expert, not the way the data looks to the database designer. That distinction is lost on those who have never learned the difference between analysis and design.
If your project is a major one, it's worth spending an appropriate amount of time on a detailed analysis of the subject matter before moving on to design. Understanding the problem before you try to solve it is key to successful work on big projects.
Once you have a good conceptual model that captures the analysis, the choice of a data model to reflect the design will depend on what kind of database you've decided to build. It might be relational, it might be multidimensional, it might be unstructured. It depends. The analysis, however, will be more useful if it's implementation independent.

Database Design for ECommerce project (Should I use EAV Approach)

I am about to design my first e-commerce database.
What I have found is that most e-commerce websites have a Category, then a SubCategory, then another SubCategory, and so on. The depth of the SubCategory nesting is not fixed: one Category may have six nested SubCategories while another has a different number.
Now, all the products have attributes associated with them.
My question is: do these websites keep adding tables for nested subcategories and keep adding columns for the attributes in the database,
OR
do they apply something called the "EAV" model (if I am right) to solve this problem? Or do they keep adding columns and/or tables and also keep updating the web pages? On many sites I have found that a new category has appeared.
(If they use the EAV model, then the website's performance is impacted, isn't it?)
Since this is my first ECommerce project please provide some valuable suggestions of yours.
Thanks,
Any help is appreciated.
What you need is a combination of EAV for product features and nested sets for product categories.
While I certainly agree that EAV is almost always a bad choice, one application where EAV is the perfect choice is for handling product attributes in an online catalog.
Think about how websites show product attributes... The attributes of products are always shown as a vertical list with two columns: "Attribute" | "Value". Sometimes these lists show side-by-side comparisons of multiple products. EAV works perfectly for doing this kind of thing. The things that make EAV meaningless and inefficient for most applications are exactly what makes EAV meaningful and efficient for product attributes in an online catalog.
One of the reasons why everyone always says "EAV is EVIL!" is that the attributes in EAV are "meaningless" insofar as the column name (i.e. meaning of the attribute) is table-driven and is therefore not defined by the schema. The whole point of schemas is to give your model meaning, so this point is well taken. However, in the case of an online product catalog, the meaning of product attributes is really unimportant to the system itself. The only reason your catalog system cares about product attributes is to dump them in a list or possibly in a product comparison matrix. Therefore EAV doesn't happen to be evil in this particular case.
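As a rough sketch of the two uses described above (a two-column "Attribute | Value" list and a side-by-side comparison), here is an EAV table queried from Python's sqlite3 module. The table name product_attribute, its columns, the helper functions and the sample cameras are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per (product, attribute); the attribute name is data, not schema.
    CREATE TABLE product_attribute (
        product_id INTEGER NOT NULL REFERENCES product(id),
        attribute  TEXT NOT NULL,
        value      TEXT NOT NULL,
        PRIMARY KEY (product_id, attribute)
    );
    INSERT INTO product VALUES (1, 'Camera X'), (2, 'Camera Y');
    INSERT INTO product_attribute VALUES
        (1, 'Megapixels', '20'), (1, 'Weight', '400 g'),
        (2, 'Megapixels', '24'), (2, 'Zoom',   '5x');
""")

def attribute_list(product_id):
    """Rows for the vertical 'Attribute | Value' list on a product page."""
    return conn.execute(
        "SELECT attribute, value FROM product_attribute "
        "WHERE product_id = ? ORDER BY attribute",
        (product_id,),
    ).fetchall()

def comparison(product_ids):
    """Map each attribute to one value per product (None where it is missing)."""
    matrix = {}
    for column, product_id in enumerate(product_ids):
        for attribute, value in attribute_list(product_id):
            row = matrix.setdefault(attribute, [None] * len(product_ids))
            row[column] = value
    return matrix

camera_x_attributes = attribute_list(1)
side_by_side = comparison([1, 2])
```

Neither function needs to know what any attribute means, which is the property the answer relies on.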
For product categories, you want a nested set model, as I described in the answer to this question. Nested sets give you very quick retrieval along with the ability to traverse multiple levels of an unbalanced hierarchy at the expense of some precalculation effort at edit time.
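For readers who have not met nested sets before, here is a generic sketch of the idea; it is not the linked answer. Each category stores a left and right number assigned by a depth-first walk, so a whole subtree or a whole ancestor chain comes back from a single range query. The names category, lft and rgt, and the hand-assigned numbers, are assumptions for the example, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        lft  INTEGER NOT NULL,
        rgt  INTEGER NOT NULL
    );
    -- Numbers come from a depth-first walk: lft on the way in, rgt on the way out.
    INSERT INTO category (name, lft, rgt) VALUES
        ('Clothing',  1, 12),
        ('Men',       2,  5),
        ('Ties',      3,  4),
        ('Women',     6, 11),
        ('Skirts',    7, 10),
        ('Mid-size',  8,  9);
""")

def subtree(name):
    """The category and all of its descendants, in tree order."""
    rows = conn.execute("""
        SELECT child.name
        FROM category parent
        JOIN category child ON child.lft BETWEEN parent.lft AND parent.rgt
        WHERE parent.name = ?
        ORDER BY child.lft
    """, (name,)).fetchall()
    return [n for (n,) in rows]

def path_to(name):
    """The chain of categories from the root down to the given category."""
    rows = conn.execute("""
        SELECT parent.name
        FROM category child
        JOIN category parent ON child.lft BETWEEN parent.lft AND parent.rgt
        WHERE child.name = ?
        ORDER BY parent.lft
    """, (name,)).fetchall()
    return [n for (n,) in rows]

women_subtree = subtree("Women")
mid_size_path = path_to("Mid-size")
```

The "precalculation effort at edit time" is the renumbering of lft and rgt whenever a category is inserted, moved or deleted, which this sketch leaves out.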

Supertype/subtype db design with subtype cross-link

This is probably a simple problem for an experienced database developer, but I'm struggling... I have trouble translating a certain ER diagram to a DB model, any help is appreciated.
I have a setup similar to slide 17 of this presentation:
http://www.cbe.wwu.edu/misclasses/mis421s04/presentations/supersubtype.ppt
Slide 17 shows an ER diagram with an Employee supertype having an Employee Type attribute and as subtypes the Employee Types themselves (Hourly, Salaried and Consultant), which is very similar to my design situation.
In my case, suppose Salaried Employees are the only ones that can be bosses of other employees and I wanted to somehow indicate if a certain Salaried employee is the boss of the Hourly and/or Salaried Employee and/or Consultant (either, none or both), how could that be designed in a database model, also considering these are one-to-many relationships?
I can put a PK-FK relationship between them, which would result in all tables having two FKs (like Consultant having FK_Employee and FK_SalariedEmployee) and SalariedEmployee referencing itself, but I keep thinking that might not be the wisest solution... although I'm not sure why (integrity issues?).
Is this an acceptable solution, or is there a better one?
Thanks in advance for any help!
Your case looks like an instance of the design pattern known as “Generalization Specialization” (Gen-Spec for short). The gen-spec pattern is familiar to object oriented programmers. It’s covered in tutorials when teaching about inheritance and subclasses.
The design of SQL tables that implement the gen-spec pattern can be a little tricky. Database design tutorials often gloss over this topic. But it comes up again and again in practice.
If you search the web on “generalization specialization relational modeling” you’ll find several useful articles that teach you how to do this. You’ll also be pointed to several times this topic has come up before in this forum.
The articles generally show you how to design a single table to capture all the generalized data and one specialized table for each subclass that will contain all the data specific to that subclass. The interesting part involves the primary key for the subclass tables. You won't use the autonumber feature of the DBMS to populate the subclass primary key. Instead, you'll program the application to propagate the primary key value obtained for the generalized table to the appropriate subclass table.
This creates a two way association between the generalized data and the specialized data. A simple view for each specialized subclass will collect generalized and specialized data together. It’s easy once you get the hang of it, and it performs fairly well.
In your specific case, declaring the "boss of" FK to reference the PK in the Salaried Employees table will be enough to do the trick. This will produce the two way association you want, and also prevent employees who are not salaried from being referenced as bosses.
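A small sketch of that last point, using Python's sqlite3 module. Placing the boss_id column on the general employee table is an assumption made here so that any type of employee can have a boss; the answer only says which table the foreign key should reference. All table, column and employee names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        employee_type TEXT NOT NULL,
        -- The boss must exist in the salaried subtype table, not just in employee.
        boss_id       INTEGER REFERENCES salaried_employee(employee_id)
    );
    CREATE TABLE salaried_employee (
        employee_id   INTEGER PRIMARY KEY REFERENCES employee(employee_id),
        annual_salary REAL NOT NULL
    );
    CREATE TABLE hourly_employee (
        employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
        hourly_rate REAL NOT NULL
    );
""")

def add_employee(name, employee_type, boss_id=None):
    """Insert the general row and return its key for use in a subtype table."""
    cur = conn.execute(
        "INSERT INTO employee (name, employee_type, boss_id) VALUES (?, ?, ?)",
        (name, employee_type, boss_id),
    )
    return cur.lastrowid

alice = add_employee("Alice", "salaried")
conn.execute("INSERT INTO salaried_employee VALUES (?, ?)", (alice, 50000.0))

bob = add_employee("Bob", "hourly", boss_id=alice)  # a salaried boss is accepted
conn.execute("INSERT INTO hourly_employee VALUES (?, ?)", (bob, 20.0))

# Bob is hourly, so he has no row in salaried_employee and cannot be a boss.
try:
    add_employee("Carol", "consultant", boss_id=bob)
    non_salaried_boss_rejected = False
except sqlite3.IntegrityError:
    non_salaried_boss_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The subtype tables take their key from the employee row instead of generating their own, as described earlier in the answer.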

database relationship to many tables and representation in entity framework

2 part question:
1st
What is the best way to set up a table/relationship structure given the following scenario: I have many tables that store different kinds of data (i.e. books, movies, magazines - each as different tables) and then one table that stores reviews that can link to any of the table types. So a row in the review table can link to either the books or the magazines table.
How I have it now is that there is a 3rd table that defines the available tables and gives them an ID number. There ends up being no true relationship stored that goes from Reviews to Books. Is this the best way to do this?
2nd
How do I represent the fake relationship in Entity Framework? I can do a query that would join the 3 tables, but is there a way to model it in the table mapping instead?
The other way to think of it is to consider BOOKS, MOVIES, MAGAZINES as sub-types of REVIEWABLE_ITEMS. They probably share some common characteristics - without knowing more about your problem domain it would be hard to be sure. The advantage of this approach is that you can model REVIEWS as a dependency of REVIEWABLE_ITEMS, giving you both a single table for Reviews and an enforceable relationship.
edit
Yes, this is just like extending types in the OO paradigm. You don't say which flavour of database you're intending to use but this article by Joe Celko shows how to do it in SQL Server. The exact same implementation works in Oracle, and I expect it would work in most other RDBMS products too.
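A minimal sketch of the sub-type idea, assuming a reviewable_item supertype table that books and movies share their key with, and a single review table that references it. It runs on Python's sqlite3 module; every table, column and sample value is an assumption for illustration, and it is not taken from the Celko article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE reviewable_item (
        item_id   INTEGER PRIMARY KEY,
        item_type TEXT NOT NULL,
        title     TEXT NOT NULL
    );
    CREATE TABLE book (
        item_id INTEGER PRIMARY KEY REFERENCES reviewable_item(item_id),
        author  TEXT NOT NULL
    );
    CREATE TABLE movie (
        item_id  INTEGER PRIMARY KEY REFERENCES reviewable_item(item_id),
        director TEXT NOT NULL
    );
    -- One review table with a real, enforceable foreign key.
    CREATE TABLE review (
        review_id INTEGER PRIMARY KEY,
        item_id   INTEGER NOT NULL REFERENCES reviewable_item(item_id),
        body      TEXT NOT NULL
    );
    INSERT INTO reviewable_item VALUES
        (1, 'book',  'Some Book'),
        (2, 'movie', 'Some Movie');
    INSERT INTO book  VALUES (1, 'A. Author');
    INSERT INTO movie VALUES (2, 'D. Director');
    INSERT INTO review (item_id, body) VALUES
        (1, 'Great read'), (2, 'Too long'), (1, 'Would reread');
""")

reviews = conn.execute("""
    SELECT i.item_type, i.title, r.body
    FROM review r JOIN reviewable_item i ON i.item_id = r.item_id
    ORDER BY r.review_id
""").fetchall()

# A review pointing at a non-existent item is rejected by the database itself.
try:
    conn.execute("INSERT INTO review (item_id, body) VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```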
It really depends on how you want to access/view the reviews.
I would implement one table for each kind of review: one for books, one for movies, etc. with a one-to-many relationship for each of them (between books and book reviews, movies and movie reviews, etc). If you need all the reviews in one table, create a view which selects all reviews with a UNION ALL.
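Sketched out, that alternative could look like the following: one review table per item type, each with its own foreign key, plus a view that stacks them with UNION ALL. Python's sqlite3 module is used for illustration, and the table, column and view names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- One review table per kind of item, each with a plain one-to-many FK.
    CREATE TABLE book_review (
        id      INTEGER PRIMARY KEY,
        book_id INTEGER NOT NULL REFERENCES book(id),
        body    TEXT NOT NULL
    );
    CREATE TABLE movie_review (
        id       INTEGER PRIMARY KEY,
        movie_id INTEGER NOT NULL REFERENCES movie(id),
        body     TEXT NOT NULL
    );

    -- The view tags each row with its type so the item_id stays unambiguous.
    CREATE VIEW all_reviews AS
        SELECT 'book'  AS item_type, book_id  AS item_id, body FROM book_review
        UNION ALL
        SELECT 'movie' AS item_type, movie_id AS item_id, body FROM movie_review;

    INSERT INTO book  VALUES (1, 'Some Book');
    INSERT INTO movie VALUES (1, 'Some Movie');
    INSERT INTO book_review  (book_id,  body) VALUES (1, 'Great read');
    INSERT INTO movie_review (movie_id, body) VALUES (1, 'Too long');
""")

all_reviews = conn.execute(
    "SELECT item_type, item_id, body FROM all_reviews ORDER BY item_type, item_id"
).fetchall()
```

The trade-off against a shared supertype table is that every new reviewable type needs a new review table and a new branch in the view.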
Either you include a concept "reviewable", which can be either a book or a magazine or ... , and to which "reviews" can refer, or else you can have a concept "review", which can either be a "book review" that can reference a book, or a "magazine review" that can reference a magazine, or a "newspaper review" that can reference a newspaper, ...
Because truly relational systems do not exist, you cannot avoid making one of those two abstractions explicit in your database design. (Unless, perhaps, you are willing to implement a lot of trigger code.)
