Problems while designing a database to manage all kinds of products, like Amazon - database

First of all, sorry for my bad English. I need some help: I want to design a database for a website, like a mini Amazon. This database will manage every kind of product (TVs, cars, computers, books, video games, pencils, tables, pants...), and each product must also have some properties (which will be indexed). For example, if the product is a book, the properties will be something like genre, year, and author. If the product is a TV, the properties will be something like size, color, and year. And if the product is a car, the properties will be something like year, color, and model. So, this is my idea:
One table to manage departments (like electronics, books...)
One table to manage the categories of the departments; this table will be a child of the previous one. If the department is electronics, the categories here will be audio, TV and video, games... (each category belongs to one department; the relationship is one department to many categories)
One table to manage the products (each product belongs to one category, the relationship is one category to many products)
One table to manage properties (like year, color, genre, model...)
One table to link products with properties; this table will be called ProductProperties
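The five tables listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, and every table name, column name, and sample row is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    department_id INTEGER NOT NULL REFERENCES departments(id),
    name TEXT NOT NULL
);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES categories(id),
    name TEXT NOT NULL
);
CREATE TABLE properties (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE product_properties (
    product_id INTEGER NOT NULL REFERENCES products(id),
    property_id INTEGER NOT NULL REFERENCES properties(id),
    value TEXT NOT NULL,
    PRIMARY KEY (product_id, property_id)
);
-- Index to support lookups by property and value.
CREATE INDEX idx_pp_property_value ON product_properties (property_id, value);

INSERT INTO departments VALUES (1, 'Electronics'), (2, 'Books');
INSERT INTO categories VALUES (1, 1, 'TV and video'), (2, 2, 'Novels');
INSERT INTO products VALUES (1, 1, 'Example TV'), (2, 2, 'Example novel');
INSERT INTO properties VALUES (1, 'year'), (2, 'color'), (3, 'author');
INSERT INTO product_properties VALUES
    (1, 1, '2010'), (1, 2, 'black'),
    (2, 1, '2010'), (2, 3, 'Example Author');
""")


def products_with(prop, value):
    """Return the names of products whose property `prop` equals `value`."""
    rows = conn.execute("""
        SELECT p.name FROM products p
        JOIN product_properties pp ON pp.product_id = p.id
        JOIN properties pr ON pr.id = pp.property_id
        WHERE pr.name = ? AND pp.value = ?
        ORDER BY p.id
    """, (prop, value)).fetchall()
    return [r[0] for r in rows]


print(products_with("year", "2010"))
print(products_with("color", "black"))
```

Note that every property value is stored as text in this sketch, which is one of the trade-offs of a generic properties table.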
I'm not sure if this is the best way. The database will be huge, and I will develop it on MySQL. But I think this is not the best way: this article talks about "Database Abstraction: Aggregation and Generalization" http://cs-exhibitions.uni-klu.ac.at/index.php?id=433, in other words generic objects (I think), but that approach is old (1970s). This article, http://www.simple-talk.com/sql/database-administration/ten-common-database-design-mistakes/, says in the section "One table to hold all domain values" that this is the wrong way... I'm saying all of this because of the ProductProperties table: I don't know whether to create this table or to create specific tables for each kind of product.
Do you have any suggestions? Or do you have a better idea?
Thanks in advance, take care!!!

"1. One table to manage departments (like electronics, books...)"
"2. One table to manage categories of the departments, this table will be a child of the previous. If the department is electronics, here will be audio, tv and video, games... (each category belongs to one department, the relationship is one department to many categories)"
Why? One table, categories, forming a hierarchy. More flexible.
"3. One table to manage the products (each product belongs to one category, the relationship is one category to many products)"
Why? Allow m:n here: a product can be in many categories.
"Im not sure if this is the best way, the database will be huge"
Ah, no. Sorry. Nontrivial, yes. Huge? No. Just to give you an idea of huge: I have a DB where I am adding 1.2 billion rows PER DAY to a specific table, on average. THIS is big. You end up with what, 100,000 items? Not even worth mentioning.
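The two structural suggestions in this answer (a single self-referencing categories table instead of departments plus categories, and an m:n link between products and categories) might look something like the sketch below. It assumes SQLite via Python's sqlite3 module rather than MySQL, and all table names, column names, and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table holds the whole hierarchy; top-level nodes play the role of departments.
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES categories(id),  -- NULL for a top-level node
    name TEXT NOT NULL
);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Junction table: one product may appear in many categories (m:n).
CREATE TABLE product_categories (
    product_id INTEGER NOT NULL REFERENCES products(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (product_id, category_id)
);
INSERT INTO categories VALUES
    (1, NULL, 'Electronics'), (2, 1, 'Games'), (3, NULL, 'Toys');
INSERT INTO products VALUES (1, 'Example console');
INSERT INTO product_categories VALUES (1, 2), (1, 3);
""")


def categories_of(product_id):
    """Return the names of all categories a product is filed under."""
    rows = conn.execute("""
        SELECT c.name FROM categories c
        JOIN product_categories pc ON pc.category_id = c.id
        WHERE pc.product_id = ?
        ORDER BY c.id
    """, (product_id,)).fetchall()
    return [r[0] for r in rows]


print(categories_of(1))
```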

Pablo89, what you describe is very close to what the AdventureWorks sample database for SQL Server does. There are many examples of using AdventureWorks on the web, from web applications to reporting to BI.
Download and install SQL Server Express 2008 R2. Download and install the sample database for the above product. Inspect the database design for AdventureWorks.
Use AdventureWorks as the example in any questions you post.
I use AdventureWorks because I use SQL Server. I am not saying it is better than other database products; I suggest it because I know AdventureWorks.

I do not think any database can work fast with 500,000,000 items. The complete tree of product categories for amazon.com contains 51,000 nodes (amazoncategories.info). Also, the data is updated hourly, so saved product information can become incorrect. I think the optimal way is to store only the category tree and get the product data at runtime using Amazon's API.

Related

Splitting orderDetail into two tables for a database?

I will try my best to phrase this in a way that makes sense. I am working on a database project for my beginning database management course that uses a fictional scenario of a bookstore owner who wants me to create a database for them.
Essentially, the tables (or entities) that I have come up with are as follows:
Customer,
Product,
Order,
orderLine,
Book,
Author,
Publisher
To put it simply, I need to configure this so that I can track sales of both books and other non-book items. The issue is that when I tried to have just one products table, books have a bunch of attributes that other items (such as bakery items) do not have. If I put books with other items, there would be a whole lot of empty cells where there is no author/publisher/genre. From what my textbook has taught me so far, a composite table is needed for an orderDetail-type table, where the orderNumber and productNumber would combine. But here, I would need to somehow combine two separate KEY attributes (for books and other items) into one order table, or use some other method. This is especially confusing to me since some customers might buy a combination of books and other items in a single order, or they might only buy one type of thing. I was thinking that the ISBN would be an excellent identifier key for the Book table. What kind of configuration would I need to track orders like this?
Add one more table, product-props, and store the attributes in that table. You can keep all products in one table, i.e. books and other items alike. You can move author and publisher into it as props as well.
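A minimal sketch of that suggestion, assuming SQLite through Python's sqlite3 module: one product table for books and non-book items, a product_props table for the book-only attributes, and an order line keyed on (order_number, product_number). The column names, prices, and the placeholder ISBN are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_number INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL
);
-- Attributes that only some products have (author, isbn, ...) live here.
CREATE TABLE product_props (
    product_number INTEGER NOT NULL REFERENCES product(product_number),
    prop_name TEXT NOT NULL,
    prop_value TEXT NOT NULL,
    PRIMARY KEY (product_number, prop_name)
);
-- Composite key: one row per product per order, whatever the product type.
CREATE TABLE order_line (
    order_number INTEGER NOT NULL,
    product_number INTEGER NOT NULL REFERENCES product(product_number),
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_number, product_number)
);
INSERT INTO product VALUES (1, 'Example book', 1500), (2, 'Muffin', 300);
INSERT INTO product_props VALUES
    (1, 'author', 'Example Author'), (1, 'isbn', '0000000000000');
INSERT INTO order_line VALUES (100, 1, 1), (100, 2, 2);
""")


def order_lines(order_number):
    """Return (name, quantity, author) per line; author is None for non-books."""
    return conn.execute("""
        SELECT p.name, ol.quantity, a.prop_value AS author
        FROM order_line ol
        JOIN product p ON p.product_number = ol.product_number
        LEFT JOIN product_props a
               ON a.product_number = p.product_number AND a.prop_name = 'author'
        WHERE ol.order_number = ?
        ORDER BY p.product_number
    """, (order_number,)).fetchall()


for line in order_lines(100):
    print(line)
```

With this layout a single order can mix books and bakery items, because the order line only ever references the shared product key.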

One product multiple markets database design

I have an abstract task at this point in time:
We have a list of products for country A and a list of products for country B, they sometimes are the same, sometimes products from country A are different to what is offered in country B.
So my current task is to create and elaborate on multiple solutions for database design. Definitely to make it scalable for 17,000 - 1,000,000 products in the future.
What should be taken into consideration when developing the products database?
A few thoughts:
=> A product table should be specific to each market, for example AUSTRALIAN_PRODUCTS, US_PRODUCTS, where the id of a product is a unique identifier
=> To have multiple schemas, one for each market
=> To have multiple database instances, one for each market
When you say "multiple schemas" or "multiple DB instances", I think you mean "unique per country". I would avoid that, as it gets very complex in the long term. Assuming that the same product_id might occur for different products in two different countries, I think you will find that keeping to a single DB with a combined, unique key of country_code, product_id would be most flexible in the long run.
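The combined key described above might be sketched like this, assuming SQLite via Python's sqlite3 module. The table layout, the helper function, and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    country_code TEXT NOT NULL,
    product_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    -- The same product_id may exist in two countries, but only once per country.
    PRIMARY KEY (country_code, product_id)
);
INSERT INTO products VALUES
    ('AU', 1, 'Example kettle'), ('US', 1, 'Example toaster');
""")


def insert_product(country_code, product_id, name):
    """Insert a product; return False if the (country, id) pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO products VALUES (?, ?, ?)",
                (country_code, product_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_product("AU", 2, "Example lamp"))
print(insert_product("AU", 1, "Duplicate id in the same country"))
```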
A few remarks:
"Few thoughts are:
A product table should be specific for each market, for example AUSTRALIAN_PRODUCTS, US_PRODUCTS, where id of a product is a unique identifier"
I wouldn't do such a thing. Basic normalization will give you a table structure like:
Country
Product
CountryProduct
"To have multiple schemas for each market"
The only benefit of this approach would be security: a separate schema per country isolates one country from another. I don't see why you should go for this approach.
"To have multiple database instances for each market"
Same as the 2nd option: multiple database instances would only bring isolation, with no benefit for performance or anything else.
The only benefit you could have later on is that you could take one database off one server and move it to another for scalability.
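As a rough illustration of the Country / Product / CountryProduct structure named in this answer, here is a sketch using Python's sqlite3 module. The answer gives only the table names, so the columns and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Junction table: which product is offered in which country.
CREATE TABLE country_product (
    country_code TEXT NOT NULL REFERENCES country(code),
    product_id INTEGER NOT NULL REFERENCES product(id),
    PRIMARY KEY (country_code, product_id)
);
INSERT INTO country VALUES ('AU', 'Australia'), ('US', 'United States');
INSERT INTO product VALUES (1, 'Shared product'), (2, 'US-only product');
INSERT INTO country_product VALUES ('AU', 1), ('US', 1), ('US', 2);
""")


def products_in(country_code):
    """Return the names of the products offered in one country."""
    rows = conn.execute("""
        SELECT p.name FROM product p
        JOIN country_product cp ON cp.product_id = p.id
        WHERE cp.country_code = ?
        ORDER BY p.id
    """, (country_code,)).fetchall()
    return [r[0] for r in rows]


print(products_in("AU"))
print(products_in("US"))
```

A product sold in both markets is stored once and linked twice, so shared attributes do not have to be duplicated per country.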

SQL - Designing a Phone book database with Hierarchical model (master-client)

I just joined this site and this is my first question; I hope it complies with the Stack Overflow question policy.
I'm designing a DB for a phone book which has the following abilities:
Contacts have 2 types (Company or Person) -> ContactType
And I want each contact to have as many emails, phone numbers, and addresses as it wants.
And I want to specify which Person works in which Company, so I can show not only a Company's contact details but also a list of its employees, their jobs in that Company, and their contacts (CoEmpJob table).
I have designed a DB diagram, which is shown in the link below. Is it well structured, or can I achieve what I want in some better way?
Thanks in advance.
My Phone Book Design
As the design stands, you're missing a few things, such as a Companies table and a ContactTypes table. There seems to be no foreign key in the CoEmpJob table linking to the Contacts table.
In the Phones table, I personally wouldn't use a prefix field (unless you wish to display contacts by phone prefix). In that case every phone number is guaranteed to be unique, so the PhoneNum field becomes the primary key and the PhoneID field is unnecessary. But you might have a husband and wife in the same database: whilst they almost certainly have different mobile numbers, they almost certainly share the same home phone number! In that case, your design is correct.
I don't know how many people have more than one address (I would think very few, if any), which means that the fields of the Address table could be moved into the Contacts table.
(Added)
As regards the companies, if you want to specify which Person works in which Company, then you will need a companies table (missing) and a join table (CoEmpJob). In the real world, this design would also require more tables - a join table can show which contacts are connected to which companies and what their current jobs are, but people change jobs (and companies) and so such a design would not store any history. Also, it is customary to link people (employees) to a department - and it is possible that one person can be connected to more than one department at a time, meaning that you will need another join table. This can get very complicated - it depends on what you want.
Your comment suggests that you want to store company data in the contacts table - this is a very bad idea; they should be kept separate.
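Since the diagram itself is not reproduced here, the following is only a guess at how a separate companies table and a CoEmpJob join table could fit together. It uses Python's sqlite3 module, and every column name and sample row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Join table: which contact currently holds which job at which company.
CREATE TABLE co_emp_job (
    company_id INTEGER NOT NULL REFERENCES companies(id),
    contact_id INTEGER NOT NULL REFERENCES contacts(id),
    job_title TEXT NOT NULL,
    PRIMARY KEY (company_id, contact_id)
);
INSERT INTO contacts VALUES (1, 'Example Person A'), (2, 'Example Person B');
INSERT INTO companies VALUES (1, 'Example Co');
INSERT INTO co_emp_job VALUES (1, 1, 'Engineer'), (1, 2, 'Accountant');
""")


def employees_of(company_id):
    """Return (full_name, job_title) for everyone linked to a company."""
    return conn.execute("""
        SELECT c.full_name, j.job_title
        FROM co_emp_job j
        JOIN contacts c ON c.id = j.contact_id
        WHERE j.company_id = ?
        ORDER BY c.id
    """, (company_id,)).fetchall()


print(employees_of(1))
```

As noted above, a join table like this records only the current job; keeping history would need extra columns or tables.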

Database design for a product aggregator

I'm trying to design a database for a product aggregator. Each product has information about where it comes from, what it costs, what type of thing it is, price, color, etc. Users need to be able to search and filter results based on any of those product attributes. I also expect to have a large number of users. My initial thought was to have one big table with every product in it, with a column for each piece of information and an index on anything I need to be able to search by, but I think this might be inefficient with a lot of users pounding on this one table. My other thought was to organize the database to promote a tree-like navigation of tables, but because you can search by anything, I'm not sure how I would organize the tables.
Any thoughts on some good practices?
One table of products - databases are designed to have lots of users pounding on tables.
(from the comments)
You need to model your data. This comes from looking at all the data you have and determining what is related to what (a table is called a relation because all the attributes in a row are related to a candidate key). You haven't really given enough information about the scope of the data (unstructured?) you have on these products and how it varies. Are you going to have difficulties because Shoes have brand, model, size and color, but Desks only have brand, model and finish? All this is going to inform your data model. Typically you have one products table, and other things link to it.
Some of those attributes will be foreign keys to lookup tables; others (price) would be simple scalars. With appropriate indexing you'll be fine. For advanced analytics, consider a dimensionally modeled star schema, but perhaps not for your live transaction system; it depends on what your data flow/workflow/transactions are. Or consider applying some of its principles in your transactional database. Ralph Kimball is a good source of information on dimensional modeling.
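To make the "one products table, lookup tables, appropriate indexing" advice concrete, here is a small sketch using Python's sqlite3 module. The brands lookup table, the column names, and the sample rows are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE brands (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    brand_id INTEGER NOT NULL REFERENCES brands(id),  -- foreign key to a lookup table
    price_cents INTEGER NOT NULL                      -- simple scalar
);
-- Index the columns users are expected to filter on.
CREATE INDEX idx_products_brand ON products (brand_id);
CREATE INDEX idx_products_price ON products (price_cents);
INSERT INTO brands VALUES (1, 'BrandA'), (2, 'BrandB');
INSERT INTO products VALUES
    (1, 'Shoe', 1, 5000), (2, 'Desk', 2, 20000), (3, 'Sandal', 1, 2500);
""")


def search(brand_name, max_price_cents):
    """Return product names for one brand at or below a price, cheapest first."""
    rows = conn.execute("""
        SELECT p.name FROM products p
        JOIN brands b ON b.id = p.brand_id
        WHERE b.name = ? AND p.price_cents <= ?
        ORDER BY p.price_cents
    """, (brand_name, max_price_cents)).fetchall()
    return [r[0] for r in rows]


print(search("BrandA", 6000))
```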
I don't see any need for a tree structure here. You can do it with a single table.
If you insist on a tree structure with a hierarchy, here is an example to get you started.
For text-based search, and for ease of startup and design, I strongly recommend Apache Solr. The Solr API is easy to use (especially with JSON). Databases do text search poorly, so I would instead recommend that you just make sure the database responds to primary/unique key queries properly; those are the fields you should index.
One table for the products, and another table for the product category hierarchy (you don't specifically say you have this but "tree-like navigation of tables" makes me think you might).
I can see you might be concerned about over-indexing causing problems if you plan to index almost every column. In that case, it might be best to index on the top 5 or 10 columns you think users are likely to search for, unless it's possible for a user to search on ANY column. In that case you might want to look at building a data warehouse. Maybe you'll want to look into data cubes to see if those will help...?
For hierarchical data, you need a PRODUCT_CATEGORY table looking something like this:
ID
PARENT_ID
NAME
Some sample data:
ID  PARENT_ID  NAME
1   NULL       ROOT
2   1          SOCKS
3   1          HELICOPTER PARTS
4   2          ARGYLE
Some SQL engines (such as Oracle) allow you to write recursive queries to traverse the hierarchy in a single query. In this example, the root of the tree has a PARENT_ID of NULL, but if you don't want this column to be nullable, I've also seen -1 used for the same purposes.
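As an illustration of such a recursive query, the sketch below loads the sample data above and walks from one category up to the root. It swaps in SQLite's WITH RECURSIVE (through Python's sqlite3 module) rather than Oracle syntax, and the helper name path_to_root is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_category (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES product_category(id),  -- NULL marks the root
    name TEXT NOT NULL
);
INSERT INTO product_category VALUES
    (1, NULL, 'ROOT'), (2, 1, 'SOCKS'), (3, 1, 'HELICOPTER PARTS'), (4, 2, 'ARGYLE');
""")


def path_to_root(category_id):
    """Return category names from the root down to the given category."""
    rows = conn.execute("""
        WITH RECURSIVE ancestors(id, parent_id, name, depth) AS (
            -- Start at the requested category...
            SELECT id, parent_id, name, 0 FROM product_category WHERE id = ?
            UNION ALL
            -- ...and repeatedly step to its parent until parent_id is NULL.
            SELECT c.id, c.parent_id, c.name, a.depth + 1
            FROM product_category c
            JOIN ancestors a ON c.id = a.parent_id
        )
        SELECT name FROM ancestors ORDER BY depth DESC
    """, (category_id,)).fetchall()
    return [r[0] for r in rows]


print(" > ".join(path_to_root(4)))  # prints "ROOT > SOCKS > ARGYLE"
```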

database relationship to many tables and representation in entity framework

Two-part question:
1st
What is the best way to set up a table/relationship structure given the following scenario: I have many tables that store different kinds of data (i.e. books, movies, magazines, each in a different table) and then one table that stores reviews that can link to any of the table types. So a row in the review table can link to either the books or the magazines table.
How I have it now is that there is a 3rd table that defines the available tables and gives them an ID number. There ends up being no true relationship stored that goes from Reviews to Books. Is this the best way to do this?
2nd
How do I represent the fake relationship in Entity Framework? I can do a query that would join the 3 tables, but is there a way to model it in the table mapping instead?
The other way to think of it is to consider BOOKS, MOVIES, MAGAZINES as sub-types of REVIEWABLE_ITEMS. They probably share some common characteristics - without knowing more about your problem domain it would be hard to be sure. The advantage of this approach is that you can model REVIEWS as a dependency of REVIEWABLE_ITEMS, giving you both a single table for Reviews and an enforceable relationship.
edit
Yes, this is just like extending types in the OO paradigm. You don't say which flavour of database you're intending to use but this article by Joe Celko shows how to do it in SQL Server. The exact same implementation works in Oracle, and I expect it would work in most other RDBMS products too.
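The sub-type idea could be sketched as follows. This is not taken from the linked article; it is a minimal illustration using SQLite through Python's sqlite3 module, with hypothetical table and column names. Each book or movie first gets a row in reviewable_items, and reviews reference that shared table, so the relationship is enforced by a real foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
-- Supertype: one row per thing that can be reviewed.
CREATE TABLE reviewable_items (id INTEGER PRIMARY KEY, item_type TEXT NOT NULL);
-- Subtypes share the supertype's key.
CREATE TABLE books (
    item_id INTEGER PRIMARY KEY REFERENCES reviewable_items(id),
    title TEXT NOT NULL
);
CREATE TABLE movies (
    item_id INTEGER PRIMARY KEY REFERENCES reviewable_items(id),
    title TEXT NOT NULL
);
-- A single reviews table with an enforceable relationship.
CREATE TABLE reviews (
    id INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES reviewable_items(id),
    body TEXT NOT NULL
);
INSERT INTO reviewable_items VALUES (1, 'book'), (2, 'movie');
INSERT INTO books VALUES (1, 'Example book');
INSERT INTO movies VALUES (2, 'Example movie');
""")


def add_review(item_id, body):
    """Insert a review; return False if item_id is not a reviewable item."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO reviews (item_id, body) VALUES (?, ?)", (item_id, body)
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(add_review(1, "Good read"))
print(add_review(99, "Review of nothing"))
```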
It really depends on how you want to access/view the reviews.
I would implement one table for each kind of review: one for books, one for movies, etc., with a one-to-many relationship for each of them (between books and book reviews, movies and movie reviews, etc.). If you need all the reviews in one table, create a view which selects all reviews with a UNION ALL.
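A minimal sketch of the per-type review tables with a combined view, assuming SQLite via Python's sqlite3 module. The table, column, and view names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- One review table per kind of item, each with a plain one-to-many link.
CREATE TABLE book_reviews (
    id INTEGER PRIMARY KEY,
    book_id INTEGER NOT NULL REFERENCES books(id),
    body TEXT NOT NULL
);
CREATE TABLE movie_reviews (
    id INTEGER PRIMARY KEY,
    movie_id INTEGER NOT NULL REFERENCES movies(id),
    body TEXT NOT NULL
);
-- A view that presents all reviews as if they lived in one table.
CREATE VIEW all_reviews AS
    SELECT 'book' AS item_type, book_id AS item_id, body FROM book_reviews
    UNION ALL
    SELECT 'movie' AS item_type, movie_id AS item_id, body FROM movie_reviews;
INSERT INTO books VALUES (1, 'Example book');
INSERT INTO movies VALUES (1, 'Example movie');
INSERT INTO book_reviews (book_id, body) VALUES (1, 'Good read');
INSERT INTO movie_reviews (movie_id, body) VALUES (1, 'Nice film');
""")


def all_reviews():
    """Return every review as (item_type, item_id, body)."""
    return conn.execute(
        "SELECT item_type, item_id, body FROM all_reviews ORDER BY item_type, item_id"
    ).fetchall()


for row in all_reviews():
    print(row)
```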
Either you include a concept "reviewable", which can be either a book or a magazine or ... , and to which "reviews" can refer, or else you can have a concept "review", which can either be a "book review" that can reference a book, or a "magazine review" that can reference a magazine, or a "newspaper review" that can reference a newspaper, ...
Because truly relational systems do not exist, you cannot avoid making one of those two abstractions explicit in your database design. (Unless, perhaps, you are willing to implement a lot of trigger code.)
