Does allowing a category to have multiple parents make sense? Are there alternatives? - database

Short question: How should product categories that appear under multiple categories be managed? Is it a bad practice to do so at all?
Background info:
We have a product database with categories like this:
Products
- Arts and Crafts Supplies
  - Glue
  - Paper Clips
  - Construction Paper
- Office Supplies
  - Glue
  - Paper Clips
Note that glue and paper clips are assigned to both categories. And although they appear in two different spots in this category tree, they have the same category ID in the database. Why? Two reasons:
Categories are assigned attributes - for example, a paper clip could have a weight, a material, a color, etc.
Products assigned to the glue category are displayed under both Arts and Crafts Supplies and Office Supplies. That is to be expected, since both placements use the same category ID in the database.
This allows us to manage a single category with its attributes and assigned products, but place it in multiple spots within the category tree.
We are using the nested set model, so the db structure we use to support this is:
Category
----------
CategoryID
CategoryName
CategoryTree
------------
CategoryTreeID
CategoryID
Lft
Rgt
So there's a 1:M between Category and CategoryTree because there can be multiple instances of a given category within the category tree.
Is there a simpler way to model this that would allow a product category to display under multiple categories?
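For reference, the structure described above can be sketched with Python's built-in sqlite3 module. The table and column names come from the question; the Lft/Rgt numbering, the sample IDs, and the helper `paths_to` are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (
    CategoryID   INTEGER PRIMARY KEY,
    CategoryName TEXT NOT NULL
);
CREATE TABLE CategoryTree (
    CategoryTreeID INTEGER PRIMARY KEY,
    CategoryID     INTEGER NOT NULL REFERENCES Category(CategoryID),
    Lft            INTEGER NOT NULL,
    Rgt            INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO Category VALUES (?, ?)", [
    (1, "Products"), (2, "Arts and Crafts Supplies"), (3, "Office Supplies"),
    (4, "Glue"), (5, "Paper Clips"), (6, "Construction Paper"),
])

# One row per placement in the tree: Glue (4) and Paper Clips (5) appear twice.
conn.executemany(
    "INSERT INTO CategoryTree (CategoryID, Lft, Rgt) VALUES (?, ?, ?)", [
        (1, 1, 16),
        (2, 2, 9), (4, 3, 4), (5, 5, 6), (6, 7, 8),
        (3, 10, 15), (4, 11, 12), (5, 13, 14),
    ])

def paths_to(category_id):
    """Return one breadcrumb string per placement of the category."""
    placements = conn.execute(
        "SELECT Lft, Rgt FROM CategoryTree WHERE CategoryID = ? ORDER BY Lft",
        (category_id,)).fetchall()
    paths = []
    for lft, rgt in placements:
        # Ancestors (and the node itself) are the rows whose range encloses it.
        rows = conn.execute(
            """SELECT c.CategoryName
               FROM CategoryTree t
               JOIN Category c ON c.CategoryID = t.CategoryID
               WHERE t.Lft <= ? AND t.Rgt >= ?
               ORDER BY t.Lft""", (lft, rgt)).fetchall()
        paths.append(" > ".join(name for (name,) in rows))
    return paths

glue_paths = paths_to(4)
```

Here `glue_paths` holds two breadcrumbs, one under each parent, while the Category table still has a single Glue row.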

I don't see anything wrong with this as long as it is true that all Glue is appropriate for both Office Supplies and craft supplies.

What you have is a good way, though why not simplify the 2nd table like so:
Category
----------
ID
Name
SubCategory
------------
ID
CategoryID
SubCategoryID
Though for the future I would beware of sharing child categories between the two root categories. Sometimes it is better to create a unique categorization of products for consistency, which is easier to manage for you and potentially easier to navigate for the customer. Otherwise, you have the issue that if you're on the Glue page coming from office supplies, then do you show the other path as well? If not, you will have two identical pages, except for the path, which is an issue for SEO. If you do, then the user may get confused.

The most famous example of this is Google Mail, where the classification is done this way. Google is famous for the usability of their products ...
I believe a word other than "parent" would be preferable, since "parent" suggests an XToOne relationship only...
Maybe you could say that a Product has many Categories, so the relationship would be ManyToMany, and only the display would start from Categories to reach the Products...
This would highlight a problem: if you don't limit the number of categories, and you display the categories with sub-categories and so on, you could end up with:
a huge list of categories and products, with many duplications
a big depth (probably unreadable)
The interesting part is highlighting the problem, then imagining a solution that works for the end user.

It may well be necessary for a category to have multiple parents. However, no matter which parent you find a category under, its subcategories should remain the same.
I've seen real systems that implemented precisely this logic and worked fine.
edit
To answer your question, I don't think the model I'm suggesting is as restrictive as you imagine. Basically, a given branch of the tree may be found under more than one parent branch, but wherever it is found, it has the same children. Nothing about this prevents you from cherry-picking some children of one branch and also making them children of another.
So, for example, you could include the glues category under both office supplies and hobby supplies, and if you added "Crazy Glue (Suppository Edition)" under glues, it would show up in both. If you have items that might be grouped together logically but need to be separated by their use, you can still do that. You might put mucilage and paste under the category of hobby adhesives, which goes under the hobby root, but not under the office root. Or you could do that and simultaneously have a combined category that's used internally by your buyers. What you can't do is forget to include that new type of glue in all of the relevant categories once you've added it wherever it belongs in your business model ontology.
In short, you lose very little with this restriction, but gain a bit of structure to help avoid the problem of having to manage each item individually.
edit
Assuming I've made a convincing case for the model itself, there's still the issue of implementation. There are lots of options, but here's one way to go:
There is a CatalogItem table containing a synthetic primary key, the label, optional description/detail text, and an optional SKU (or equivalent). You then have a many-to-many CatalogItemJoin table with child and parent IDs, both sides constrained to the CatalogItem table.
An item that appears as a parent is a category, so it should not have a SKU. An item that appears only as a child is a product, so it should have a SKU. It's fine for any item to have more than one parent; that just means that it's in multiple categories. Likewise, there's no problem with multiple children per parent; that would be the typical case of a category with a few products in it. However, given a category's ID, its children will be the same regardless of what parent category led you there. The other constraint is that you'll want to avoid loops.
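A minimal sketch of that layout, using Python's sqlite3 module, might look like the following. CatalogItem and CatalogItemJoin are named in the answer; the column names, the sample SKU, and the helpers `add_item`, `link`, `descendants`, and `children` are hypothetical. The loop check is done in application code here, which is only one of several possible ways to enforce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CatalogItem (
    ItemID INTEGER PRIMARY KEY,
    Label  TEXT NOT NULL,
    SKU    TEXT              -- NULL for categories, set for products
);
CREATE TABLE CatalogItemJoin (
    ParentID INTEGER NOT NULL REFERENCES CatalogItem(ItemID),
    ChildID  INTEGER NOT NULL REFERENCES CatalogItem(ItemID),
    PRIMARY KEY (ParentID, ChildID)
);
""")

def add_item(label, sku=None):
    cur = conn.execute(
        "INSERT INTO CatalogItem (Label, SKU) VALUES (?, ?)", (label, sku))
    return cur.lastrowid

def descendants(item_id):
    """All items reachable below item_id, via a recursive CTE."""
    rows = conn.execute(
        """WITH RECURSIVE sub(id) AS (
               SELECT ChildID FROM CatalogItemJoin WHERE ParentID = ?
               UNION
               SELECT j.ChildID
               FROM CatalogItemJoin j JOIN sub ON j.ParentID = sub.id
           )
           SELECT id FROM sub""", (item_id,)).fetchall()
    return {row[0] for row in rows}

def link(parent_id, child_id):
    """Add child under parent, refusing any link that would close a loop."""
    if parent_id == child_id or parent_id in descendants(child_id):
        raise ValueError("link would create a loop")
    conn.execute(
        "INSERT INTO CatalogItemJoin (ParentID, ChildID) VALUES (?, ?)",
        (parent_id, child_id))

def children(item_id):
    rows = conn.execute(
        """SELECT c.Label
           FROM CatalogItemJoin j
           JOIN CatalogItem c ON c.ItemID = j.ChildID
           WHERE j.ParentID = ?
           ORDER BY c.Label""", (item_id,)).fetchall()
    return [row[0] for row in rows]

office = add_item("Office Supplies")
hobby = add_item("Hobby Supplies")
glues = add_item("Glues")
crazy = add_item("Crazy Glue", sku="CG-001")  # hypothetical SKU

link(office, glues)
link(hobby, glues)
link(glues, crazy)  # added once, visible under both roots
```

Because the children of Glues are stored once, `descendants(office)` and `descendants(hobby)` both include the new product, and a call such as `link(glues, office)` raises `ValueError`.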

Related

I'm unable to normalize my Product table as I have 4 different product types

Because I have 4 different product types (books, magazines, gifts, food), I can't just put all products in one "products" table without having a bunch of null values. So I decided to break each product type into its own table, but I know this is just wrong (https://c1.staticflickr.com/1/742/23126857873_438655b10f_b.jpg).
I also tried creating an EAV model for this (https://c2.staticflickr.com/6/5734/23479108770_8ae693053a_b.jpg), but I got stuck as I'm not sure how to link the publishers and authors tables.
I know this question has been asked a lot, but I don't understand ANY of the answers I've seen. I think this is because I'm a very visual learner, which makes it hard to understand what's being talked about when not a lot of information is given.
Your model is on the right track, except that the product name should be sufficient: you don't need a gift name, book name, etc. What you put in those tables is the information specific to that type of product that the other products don't need. The Product table contains all the common fields. I would use productid in the child tables rather than renaming it giftID, magazineID, etc. It is easier to remember what things are called when you are consistent in naming them.
Now to be practical, you put as much as you can into the product table especially if you are going to do calculations. I prefer the child tables in this specific case to have what is mostly display information. So product contains the product name, the cost, the type of product, the units the product is sold in etc. The stuff that generally is needed to calculate the cost of an order or to have a report of what was ordered. There may be one or two fields that can contain nulls, but it simplifies the calculation type queries so much it might be worth it.
The meat of the descriptive details though would go in the child table for the type of product. These would usually only be referenced when displaying the product in the shopping area and only one at a time, so you can use the product type to let you only join to the one child table you need for display. So while the order cares about the product number and name and cost calculations, it probably doesn't need to go line by line describing the book ISBN number or the megapixels in a camera. But the description page of the product does need those things.
This approach is not purely relational, although it mostly is, but it does group the information by the meaning of the data and how it will be used, which makes the database easier to understand and query. I am a big fan of relational tables because databases just work better when they hit at least third normal form, but sometimes you can go too far for practicality. So the meaning of the data and the way you will use it (not just for the user interface, but for later reporting as well) is almost always one of my considerations in design.
Breaking each product type into its own table is fine - let the child tables use the same id as the parent Product table, and create views for the child tables that join with Product
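As a rough illustration of the shared-ID idea with a view on top, here is a sketch using Python's sqlite3 module. Only two of the four child tables are shown, and every column name, the integer-cents price, and the sample rows are assumptions rather than part of the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Common fields live in Product; each child table reuses ProductID as its key.
CREATE TABLE Product (
    ProductID   INTEGER PRIMARY KEY,
    ProductName TEXT NOT NULL,
    ProductType TEXT NOT NULL
        CHECK (ProductType IN ('book', 'magazine', 'gift', 'food')),
    PriceCents  INTEGER NOT NULL
);
CREATE TABLE Book (
    ProductID INTEGER PRIMARY KEY REFERENCES Product(ProductID),
    ISBN      TEXT NOT NULL
);
CREATE TABLE Food (
    ProductID  INTEGER PRIMARY KEY REFERENCES Product(ProductID),
    ExpiryDate TEXT NOT NULL
);
-- The view hides the join, so display code can read books as one table.
CREATE VIEW BookView AS
    SELECT p.ProductID, p.ProductName, p.PriceCents, b.ISBN
    FROM Product p JOIN Book b ON b.ProductID = p.ProductID;
""")

conn.execute("INSERT INTO Product VALUES (1, 'Example Novel', 'book', 1299)")
conn.execute("INSERT INTO Book VALUES (1, '000-0000000000')")
conn.execute("INSERT INTO Product VALUES (2, 'Example Jam', 'food', 450)")
conn.execute("INSERT INTO Food VALUES (2, '2030-01-01')")

# Display pages read the type-specific view ...
book_rows = conn.execute(
    "SELECT ProductName, ISBN FROM BookView").fetchall()

# ... while order calculations only ever touch the common Product table.
(order_total,) = conn.execute(
    "SELECT SUM(PriceCents) FROM Product WHERE ProductID IN (1, 2)").fetchone()
```

The order total needs no joins and no nullable type-specific columns, which is the practical benefit the answers above describe.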
Your case is a classic case of types and subtypes. This is often called class/subclass in object modeling and generalization/specialization in ER modeling. It's a well understood pattern. There are known techniques for dealing with this pattern.
Visit the following tags, and read the description under the info tab (presented as "learn more"). Also look over the questions grouped under these tags.
single-table-inheritance class-table-inheritance shared-primary-key
If you want to read in more depth, use these buzzwords to search for articles on the web.
You've already discovered and discarded single table inheritance on your own. Other answers have pointed you at shared primary key. Class table inheritance involves a single table for generalized data as well as the four specialized tables. Shared primary key is generally used in conjunction with class table inheritance.

Database Design for ECommerce project (Should I use EAV Approach)

I am about to design my first e-commerce database.
What I have found is that most e-commerce websites have a Category, then a SubCategory, then another SubCategory, and so on. The depth of SubCategory is not fixed: one Category may have six nested SubCategories while another has a different number.
All the products have attributes associated with them.
My question is: do these websites keep adding tables for nested subcategories and keep adding columns for the attributes in the database,
OR
do they apply something called the "EAV" model (if I am right) to solve this problem? Or do they keep adding columns and tables and keep updating the web pages? On many sites I have noticed new categories appearing.
(If they use the EAV model, then website performance is impacted, isn't it?)
Since this is my first e-commerce project, please provide some suggestions.
Thanks,
Any help is appreciated.
What you need is a combination of EAV for product features and nested sets for product categories.
While I certainly agree that EAV is almost always a bad choice, one application where EAV is the perfect choice is for handling product attributes in an online catalog.
Think about how websites show product attributes... The attributes of products are always shown as a vertical list with two columns: "Attribute" | "Value". Sometimes these lists show side-by-side comparisons of multiple products. EAV works perfectly for doing this kind of thing. The things that make EAV meaningless and inefficient for most applications are exactly what makes EAV meaningful and efficient for product attributes in an online catalog.
One of the reasons why everyone always says "EAV is EVIL!" is that the attributes in EAV are "meaningless" insofar as the column name (i.e. the meaning of the attribute) is table-driven and is therefore not defined by the schema. The whole point of schemas is to give your model meaning, so this point is well taken. However, in the case of an online product catalog, the meaning of product attributes is really unimportant to the system itself. The only reason your catalog system cares about product attributes is to dump them in a list or possibly in a product comparison matrix. Therefore EAV doesn't happen to be evil in this particular case.
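To make the "dump them in a list or a comparison matrix" use concrete, here is a small sketch using Python's sqlite3 module. The table ProductAttribute, its columns, the sample data, and the helpers `attribute_list` and `comparison_matrix` are all assumptions for illustration, not a schema taken from any particular catalog system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (
    ProductID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL
);
-- EAV: one row per (entity, attribute, value) instead of one column each.
CREATE TABLE ProductAttribute (
    ProductID INTEGER NOT NULL REFERENCES Product(ProductID),
    Attribute TEXT NOT NULL,
    Value     TEXT NOT NULL,
    PRIMARY KEY (ProductID, Attribute)
);
""")

conn.executemany("INSERT INTO Product VALUES (?, ?)",
                 [(1, "Paper Clips"), (2, "Glue")])
conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?)", [
    (1, "Color", "Silver"), (1, "Material", "Steel"),
    (2, "Color", "White"), (2, "Volume", "100 ml"),
])

def attribute_list(product_id):
    """The two-column 'Attribute | Value' list for one product page."""
    return conn.execute(
        """SELECT Attribute, Value FROM ProductAttribute
           WHERE ProductID = ? ORDER BY Attribute""",
        (product_id,)).fetchall()

def comparison_matrix(product_ids):
    """Map each attribute to one value per product (None where missing)."""
    matrix = {}
    for column, product_id in enumerate(product_ids):
        for attribute, value in attribute_list(product_id):
            row = matrix.setdefault(attribute, [None] * len(product_ids))
            row[column] = value
    return matrix

matrix = comparison_matrix([1, 2])
```

Adding a new attribute such as "Weight" is just another row; neither the schema nor the display code has to change.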
For product categories, you want a nested set model, as I described in the answer to this question. Nested sets give you very quick retrieval along with the ability to traverse multiple levels of an unbalanced hierarchy at the expense of some precalculation effort at edit time.

database design scenario

What's the right way to do this:
I have the following relationship between the entities RAW_MATERIAL_PRODUCT and FINISHED_PRODUCT: a FINISHED_PRODUCT has to be made of one or more raw material products, and a raw material product may be part of a finished product (so many-to-many). I have an intersection entity, which I called ASSEMBLY, that tells me exactly which raw material products a finished product is made of.
Good. Now I need to sell the finished products and compute the production cost. The PRODUCT_OUT entity comes in, which can contain only one FINISHED_PRODUCT, and a FINISHED_PRODUCT may be part of multiple PRODUCT_OUT.
It would be easy if, for example, Finished Product A was always made of 3 pieces of Raw Material Product a1, 2 of a2 etc. Problem is that the quantities may change.
The stock of a Raw Material Product is computed as
TotalIn - TotalOut
so I can't put a quantity attribute in ASSEMBLY, because I would get incorrect data when calculating the stock (if quantities are changed).
My only idea is to give up the FINISHED_PRODUCT entity and make a join between PRODUCT_OUT and RAW_MATERIAL_PRODUCT, with the intersection entity containing a quantity attribute. But this seems kind of stupid because almost all the time a FINISHED_PRODUCT is made of the same RAW_MATERIAL_PRODUCTS.
Is there a better way?
I'm not 100% sure I understand, but it sounds like the recipe can change, and your model needs to account for this?
But this seems kind of stupid because almost all the time a
FINISHED_PRODUCT is made of the same RAW_MATERIAL_PRODUCTS.
Almost all the time, or all the time? I think that's a pretty critical question.
It seems to me that when you change the recipe, you should create a new FINISHED_PRODUCT row, which has a different set of RAW_MATERIAL_PRODUCTS based on the association in the ASSEMBLY table.
If you want to group different recipes of the same FINISHED_PRODUCT together (kind of like versioning!), create a FINISHED_PRODUCT_TYPE table with a 1:m relationship to the FINISHED_PRODUCT table.
Edit (quote from comment):
I totally agree with you it should be a different product but if i add
one screw to a product i can't really name it Product A with 1 extra
screw. And it seems this can happen. I didn't quite get the use of
creating a FINISHED_PRODUCT_TYPE table. Could you please explain?
Sure. So your FINISHED_PRODUCT_TYPE defines the name of the product, and possibly some other data (description, category, etc.). Then each row in FINISHED_PRODUCT is essentially a "version" of that product. So "Product A" would only exist in one place, a row in the FINISHED_PRODUCT_TYPE table, but there could be one or many versions of it in the FINISHED_PRODUCT table.
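The versioning idea can be sketched as follows with Python's sqlite3 module. The four entity names come from the discussion; the column names, the Version counter, and the helpers `new_version` and `recipe_of` are hypothetical. The Quantity column on ASSEMBLY is also an assumption: it relies on a version's recipe never being edited after creation, so old PRODUCT_OUT rows keep pointing at the quantities that were actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FINISHED_PRODUCT_TYPE (
    TypeID INTEGER PRIMARY KEY,
    Name   TEXT NOT NULL UNIQUE      -- "Product A" exists exactly once
);
CREATE TABLE FINISHED_PRODUCT (
    ProductID INTEGER PRIMARY KEY,
    TypeID    INTEGER NOT NULL REFERENCES FINISHED_PRODUCT_TYPE(TypeID),
    Version   INTEGER NOT NULL,
    UNIQUE (TypeID, Version)
);
CREATE TABLE RAW_MATERIAL_PRODUCT (
    RawID INTEGER PRIMARY KEY,
    Name  TEXT NOT NULL
);
CREATE TABLE ASSEMBLY (
    ProductID INTEGER NOT NULL REFERENCES FINISHED_PRODUCT(ProductID),
    RawID     INTEGER NOT NULL REFERENCES RAW_MATERIAL_PRODUCT(RawID),
    Quantity  INTEGER NOT NULL,
    PRIMARY KEY (ProductID, RawID)
);
""")

conn.execute("INSERT INTO FINISHED_PRODUCT_TYPE VALUES (1, 'Product A')")
conn.executemany("INSERT INTO RAW_MATERIAL_PRODUCT VALUES (?, ?)",
                 [(1, "Screw"), (2, "Panel")])

def new_version(type_id, recipe):
    """Create the next version of a product type with its own recipe."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(Version), 0) FROM FINISHED_PRODUCT WHERE TypeID = ?",
        (type_id,)).fetchone()
    cur = conn.execute(
        "INSERT INTO FINISHED_PRODUCT (TypeID, Version) VALUES (?, ?)",
        (type_id, latest + 1))
    product_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO ASSEMBLY (ProductID, RawID, Quantity) VALUES (?, ?, ?)",
        [(product_id, raw_id, qty) for raw_id, qty in recipe.items()])
    return product_id

def recipe_of(product_id):
    return dict(conn.execute(
        "SELECT RawID, Quantity FROM ASSEMBLY WHERE ProductID = ?",
        (product_id,)).fetchall())

v1 = new_version(1, {1: 3, 2: 2})   # 3 screws, 2 panels
v2 = new_version(1, {1: 4, 2: 2})   # one extra screw becomes version 2
```

Both versions share the single "Product A" name row, so adding a screw never requires inventing a new product name.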

Web Store - Linking to specific categories without hard-coding IDs

Say I have an online catalog which contains (for our purposes) 2 tables:
Category
id
name
Product
id
category_id
name
If I wanted to link to a specific category -- say for instance, the most important category for any given section of the site -- without hard-coding IDs, what would be the best practice for this? Would I use some type of "SLUG" column that is assigned upon category creation and can't be modified?
Thanks!
It can be accomplished several ways, as is the case with most things, but creating a column to contain a url-friendly version of the category name is one option. I can't say it's the "best" solution though, because it depends on the situation.
What's the problem with giving an ID number in the URL though (e.g. /22-blah-blah-blah)? I prefer doing it that way so I can change the title part without breaking people's bookmarks. Not sure how that goes for SEO, but it doesn't seem to have hurt me so far.
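Both ideas are easy to sketch in Python: a slug derived from the category name, and an ID-prefixed path where only the leading number is used for the lookup. The function names and the exact slug rules below are assumptions for illustration, not the behavior of any specific framework.

```python
import re

def slugify(name):
    """Lower-case the name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def category_path(category_id, name):
    """Build a path such as /22-office-supplies."""
    return f"/{category_id}-{slugify(name)}"

def parse_category_id(path):
    """Read only the leading ID; the title part after it is ignored."""
    match = re.match(r"^/(\d+)(?:-|$)", path)
    return int(match.group(1)) if match else None
```

With this scheme `parse_category_id("/22-office-supplies")` and `parse_category_id("/22-renamed-title")` both give 22, so old bookmarks keep working after a rename.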
My CMS has a routing table that encompasses multiple entity types (products, pages, etc.) rather than relying on multiple tables for routing information.

How to setup organization specific data elements about shared items?

First post, please be kind.
NOTE: I have reviewed entry #20856 (how to implement tagging) but feel this is different due to the fact that the tags method I'm considering is organization specific in my app. I’m hoping someone can confirm the direction I’m going or point out some other options.
(background) We are building a web application that gives different organizations visibility to their inventory in different locations. The database stores users, organizations, sites, and items and there are links from sites and items to organizations that allow us to determine which items / sites to show to which users (based on their organization).
It is common for two (or more) organizations to want to use the portal to check on the stock status of (for example) Widget A in the Los Angeles Warehouse. That part is fine. However, the different organizations also track unique information about Widget A. For example, Org 1 wants to track the color, volume, and primary vendor for each item. Org 2 wants to track Color, Stock Type, Inventory Cycle, Buyer Code for each item. I want to avoid a situation where I have to have a table loaded with all these possible fields and then figure out which organizations use which fields.
I’m considering using something along the lines of tags, but adding a tag category, and having the tag category be defined at the Org level. So, the basic table structure would be something like:
Table: OrgTagCategory
Fields: OrgId, TagCategoryId, CategoryTitle
Table: OrgTag
Fields: OrgId, TagCategoryId, TagId, TagTitle
Table: OrgItemTag
Fields: OrgId, ItemId, TagId
Then, when the user logged in, the main dashboard grid would include their organization's item fields as columns. So, from the example above, Org 1 would see Item#, Description (shown for all), Color, Volume, and Primary Vendor. Org 2 would see Item#, Description, Color, Stock Type, Inventory Cycle, and Buyer Code.
Am I overthinking this or is there a simpler method of doing this that I’m missing? All thoughts / feedback sincerely appreciated.
That should be no problem, but you're storing the OrgId redundantly. Also it seems like there could be some overlap (probably a lot of overlap, realistically) between tags and orgs.
Here's how I'd do it:
Table: OrgTag
Fields: OrgId, TagId
Table: Tag
Fields: TagId, TagTitle
Table: ItemTag
Fields: ItemId, TagId
This way each org is associated with the tags it's interested in, but you don't have redundant tags. A given tag that's used by multiple orgs just gets a bunch of rows in OrgTag, instead of multiple rows in Tag with the same TagTitle.
You'd only need a table OrgTagCategory if there were multiple tag categories per org. But you haven't described this extra association as a requirement.
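A quick sketch of those three tables, using Python's sqlite3 module, shows how a shared tag is stored once and still resolved per organization. Table and column names follow the answer; the sample IDs, titles, and the helper `tags_for` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tag (
    TagId    INTEGER PRIMARY KEY,
    TagTitle TEXT NOT NULL UNIQUE
);
CREATE TABLE OrgTag (
    OrgId INTEGER NOT NULL,
    TagId INTEGER NOT NULL REFERENCES Tag(TagId),
    PRIMARY KEY (OrgId, TagId)
);
CREATE TABLE ItemTag (
    ItemId INTEGER NOT NULL,
    TagId  INTEGER NOT NULL REFERENCES Tag(TagId),
    PRIMARY KEY (ItemId, TagId)
);
""")

conn.executemany("INSERT INTO Tag VALUES (?, ?)",
                 [(1, "Color"), (2, "Volume"), (3, "Stock Type")])
# Org 1 tracks Color and Volume; Org 2 tracks Color and Stock Type.
conn.executemany("INSERT INTO OrgTag VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 3)])
# Item 100 (say, Widget A) carries all three tags.
conn.executemany("INSERT INTO ItemTag VALUES (?, ?)",
                 [(100, 1), (100, 2), (100, 3)])

def tags_for(org_id, item_id):
    """Tags on the item that this organization has chosen to track."""
    rows = conn.execute(
        """SELECT t.TagTitle
           FROM ItemTag it
           JOIN OrgTag ot ON ot.TagId = it.TagId
           JOIN Tag t     ON t.TagId = it.TagId
           WHERE ot.OrgId = ? AND it.ItemId = ?
           ORDER BY t.TagTitle""", (org_id, item_id)).fetchall()
    return [title for (title,) in rows]
```

"Color" exists as a single Tag row even though both organizations use it; each organization simply has its own OrgTag row pointing at it.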
Based on your description I sketched a simplified model and combined it with the observation pattern. This should enable you to track various item properties and user preferences for viewing them. Admittedly, the Preference table may grow large, but the data has to be stored somewhere anyway, and you can retrieve it using SQL, which simplifies the business layer.
- Organization and person are types of users. User table has columns common to all users, while Organization and Person tables have columns specific to each one.
- A stock item (widget class) can be found at several sites (warehouses); a site stores many items.
- One item belongs to one user; a user can own many items.
- Measurement and trait are types of observations. Measurement is a numeric observation, like height. Trait is a descriptive observation, like color.
- An observation is of a specific type (height, weight, color), there can be many observations of the same type.
- One item (widget class) can have many observations, an observation relates to one item only.
- A user can prefer to display many observations; an observation may be preferred (to display) by many users.
UPDATE
We could simplify user's subscription to item details (observations) by tagging observation type, for example height, weight, width would be tagged with: all, dimensions, physical. Some other tags would be: accounting_interest, tracking_specific, etc. A user would then subscribe to tags only. Tags (could) form a hierarchy with ALL at the top.
- One observation type (height, weight, color) can have many tags, one tag belongs to many observation types.
- Each tag may have a parent tag forming a hierarchy.
- A user stores preferences for a set of tags that she usually monitors.
UPDATE 2
Now we can sort out who is who and who owns what. In this modification a user (now a person) can work for more than one organization (having several part-time jobs or contracts). An item now belongs to an organization. A logged-in user can see all items from all organizations that she works for.
My first quick thought on this would be that if this is limited to 'showing' particular fields to particular Orgs on the Dashboard, then it is better to handle it on the App side. If there's any other usage of 'tagging', then please clarify.
Here's a simple approach -
You can store a field [OrgDashboardFields] in the Org master table or the OrgItem table. This will be a comma (',') separated list of fields to be shown on the dashboard. At run-time fetch the [OrgDashboardFields] field and parse the comma separated list in the app and make the Dashboard Grid behave accordingly.
Or, if there's a dynamic-query framework then based on the [OrgDashboardFields] field you can create a dynamic SQL-query and get the desired result which is purely Org specific.
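A minimal sketch of that approach in Python follows. The Item table, its column names, and the helper functions are hypothetical. Because column names cannot be bound as SQL parameters, the sketch checks every requested field against a fixed whitelist before building the query string; that check is an added precaution, not something the answer above specifies.

```python
import sqlite3

BASE_FIELDS = ["ItemNumber", "Description"]          # shown to every org
ALLOWED_FIELDS = {"Color", "Volume", "PrimaryVendor",
                  "StockType", "InventoryCycle", "BuyerCode"}

def dashboard_columns(org_dashboard_fields):
    """Parse the comma-separated [OrgDashboardFields] value into columns."""
    requested = [f.strip() for f in org_dashboard_fields.split(",") if f.strip()]
    unknown = [f for f in requested if f not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError(f"unknown dashboard fields: {unknown}")
    return BASE_FIELDS + requested

def dashboard_query(org_dashboard_fields):
    """Build the org-specific SELECT from whitelisted column names only."""
    columns = ", ".join(dashboard_columns(org_dashboard_fields))
    return f"SELECT {columns} FROM Item"

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Item (
    ItemNumber TEXT, Description TEXT, Color TEXT, Volume TEXT,
    PrimaryVendor TEXT, StockType TEXT, InventoryCycle TEXT, BuyerCode TEXT)""")
conn.execute("""INSERT INTO Item VALUES
    ('W-A', 'Widget A', 'Red', '2 L', 'Acme', 'Bulk', 'Monthly', 'B7')""")

# Org 1's stored setting, as it might appear in the Org master table.
org1_rows = conn.execute(
    dashboard_query("Color, Volume, PrimaryVendor")).fetchall()
```

Note that this still requires one wide Item table holding every possible field, which is the situation the question hoped to avoid; the setting only controls which columns each organization sees.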

Resources