Order, OrderItem, Product. Write Product details to OrderItem? - sql-server

This is not a question about a particular technical issue, more a general question about database design. Nonetheless, the tech stack is: ASP.NET MVC, SQL DB.
I've inherited a system recently which has an Order-->Order Item-->Product concept i.e. an order has many order items and each order item is associated with one product.
The Order Item stores the following when saved to the DB:
OrderID
ProductID
Qty
Unit Cost (written from Product)
Total NET (calculated based on Qty * Unit Cost)
VAT
Total GROSS
The Order Item viewed through the UI looks like this:
Qty | ProdCode | ProdDescription | Unit Cost | Total NET | VAT | Total Gross
Qty, Unit Cost and Totals are all pulled into the UI from the Order Item and the Product Code and Product Description are pulled from the associated Product record.
All seems sensible enough.
So my question revolves around which data should be written from the Product vs which data should be referenced from the Product. Specifically, the Unit Cost is written from the Product when the Order Item is created. I assume this is because Product prices change and you don't want this change to apply to old orders. Fine, makes sense.
My question is: should the same logic also apply to Product Code and Product Description? And if not, why not?
To my mind, it seems like the Product Code and Description should also be written to the Order Item, i.e. the Product Code and Description are as liable to change as the Product Unit Cost. In that case, if you were to go back and look at an old order, the Prod Code and Description on the order would appear to be different from what was originally ordered, which seems wrong to me.
The developer who built the system is no longer available to discuss his thinking when he designed it.
The system is working fine and there have been no complaints. However that is mainly because there have never been any updates to Prod Codes/Descriptions even though they are available to edit for various users.
I'd be interested to hear people's thoughts on it before I go making wholesale changes. Is this a common scenario, or am I worrying about nothing?

There are several aspects to take into consideration in such cases.
Let's start with the fact that the orders MUST be immutable.
Since the product code and description are not immutable, at first glance it would seem to make sense to keep them in the orders table.
However, this may cause a massive amount of duplicated data, for each product in each order.
Another approach is to make the product code and description immutable as well, so that they can never be edited.
One more approach is to let the administrators edit the product code and description, but instead of updating the row in the products table you simply mark it as history (you will need to add a status column for that, of course) and add a new row for that product with the new code and description.
This solution will allow you to keep the integrity of the orders while allowing your administrator users to edit whatever they want, and keep the bare minimum of data, especially if the changes to the product code or descriptions are as sparse as you wrote.
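The "mark as history and add a new row" approach can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 so that it runs anywhere; the table names, the product_key column (a stable identity shared by all versions of a product) and the edit_product helper are assumptions made for the example, not part of the system described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_row_id INTEGER PRIMARY KEY,              -- one row per version (hypothetical)
    product_key    INTEGER NOT NULL,                 -- stable identity across versions
    code           TEXT    NOT NULL,
    description    TEXT    NOT NULL,
    status         TEXT    NOT NULL DEFAULT 'current' -- 'current' or 'history'
);
CREATE TABLE order_item (
    order_id       INTEGER NOT NULL,
    product_row_id INTEGER NOT NULL REFERENCES product(product_row_id),
    qty            INTEGER NOT NULL
);
""")

def edit_product(conn, product_key, new_code, new_description):
    """Retire the current row and insert a new one instead of updating in place."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE product SET status = 'history' "
            "WHERE product_key = ? AND status = 'current'",
            (product_key,),
        )
        cur = conn.execute(
            "INSERT INTO product (product_key, code, description) VALUES (?, ?, ?)",
            (product_key, new_code, new_description),
        )
    return cur.lastrowid

# An order is placed against the original version of the product.
old_row_id = conn.execute(
    "INSERT INTO product (product_key, code, description) VALUES (1, 'WID-01', 'Widget')"
).lastrowid
conn.execute("INSERT INTO order_item VALUES (100, ?, 2)", (old_row_id,))
conn.commit()

# An administrator later "edits" the product.
new_row_id = edit_product(conn, 1, "WID-01B", "Widget (revised)")

# The old order still resolves to the code and description it was placed with.
old_order_view = conn.execute(
    "SELECT p.code, p.description FROM order_item oi "
    "JOIN product p ON p.product_row_id = oi.product_row_id "
    "WHERE oi.order_id = 100"
).fetchone()

# New orders would pick up the row marked 'current'.
current_view = conn.execute(
    "SELECT code, description FROM product "
    "WHERE product_key = 1 AND status = 'current'"
).fetchone()
```

The order item keeps pointing at the row it was created against, so nothing is duplicated per order, and a new product row is only created when someone actually edits the code or description.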

Related

How to relate a product dimension with a sales fact

I have been studying data warehousing over the last couple of days; in particular, I have been reading The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by Kimball and Ross.
In that reading, I came to the first example, where there is a sales fact related to a product dimension, as you can see in the image below:
I think I can grasp the gist of how this relationship allows us to rotate the "cube", slicing and dicing data. However, this is where I get lost:
In this example and many others on the web, product has a one-to-one relationship with sales, which is fine, I guess, for most cases. But this generates a sales record for at least each kind of product that was in one sale.
So supposing I bought 1 banana, 2 apples and 1 orange, this would yield at least 3 sales records. Again, that is fine, I guess: as the sale's ticket ID is stored in the sales fact, we can still relate all the items in a given sale.
However, suppose this were a use case: relating products within sales. Say I want to get every sale that had a banana and get things like how many items each of these sales had, their cost price, their profit, and so on.
Wouldn't it be better if the fact-product relation were a Fact-one_to_many-Product relationship, where the fact would hold the sale's ticket ID and the products would have a foreign key referencing the sale they came from, or something like that?
I reckon these metrics should be in the fact table, and not in the product table as I think I would want. So, is this me not fighting my urge to normalize, or does it make sense given the way I would want to do that kind of filtering -> [given all sales with X product, get data from other products in the same sale]?
If I were to follow the guidelines, the product dimension would have one record for every distinct kind of product the store has, correct? And all the measurements I want would be stored on the fact itself, like cost price, sales price, profit, etc.
On the other hand, if I were to go one-to-many, the product dimension would have many copies of each product, which is bad, I think. However, I think it would give me better queries in that regard.
As you can see, I'm a beginner and really in the early stages of this path, so I would appreciate it if you would indulge me with an Explain Like I'm Five kind of answer.
EDITED:
Sorry #Nick.McDermaid, you are right. I meant from the perspective of the sales fact, where for every sales fact I will have only one product, but you are correct that one product can have N related sales. And so we have one record per product in the database for every different product in our store. This is the right way to model it. Also, I'm guessing the "many" indicator is the sales quantity.
Anyhow, this allows for slicing and dicing when we have sales as the point of view, but what if I want to, for example:
Get all sales that had a banana in them, with all the other items in those sales. We can still do it with this structure, but it's harder than if the products were repeated and we had the sale ID as a foreign key in the product table.
Because ultimately I want to get all the sales (and the products within each sale) that had a banana, and then take metrics out of them.
What you are somewhat hinting at would be a degenerate dimension, consisting of the sales id/invoice #/purchase order # of the transaction that took place. The whole purpose of a degenerate dimension is to group items that are related by a meaningless piece of data. For example, a PO # of A1234 is meaningless on its own, it doesn't tell you anything about the purchase. However, it can be used to identify other meaningful data, such as the date of purchase of the products for the customer. In that context, the PO # is defined by the collection of the entities it brings together to describe an event.
Another critical concept in data-warehousing is the abstraction of the schema in the database from the model in the cube. You don't join and group data in a cube model. You slice and filter. There are no foreign keys in a cube model. Those are used in the underlying data schema, but all of that work is handled behind the scenes of the cube model.
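To make the degenerate-dimension idea concrete at the level of the underlying relational schema (not the cube model), here is a minimal sketch using Python's built-in sqlite3. The table and column names (sales_fact, product_dim, ticket_id) and the sample rows are assumptions for illustration; they are not taken from the Kimball and Ross example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL          -- one row per distinct product
);
CREATE TABLE sales_fact (
    ticket_id   INTEGER NOT NULL,       -- degenerate dimension: no table of its own
    product_key INTEGER NOT NULL REFERENCES product_dim(product_key),
    quantity    INTEGER NOT NULL
);
INSERT INTO product_dim VALUES (1, 'banana'), (2, 'apple'), (3, 'orange');
INSERT INTO sales_fact VALUES
    (1, 1, 1), (1, 2, 2), (1, 3, 1),    -- ticket 1: banana, apples, orange
    (2, 2, 3),                          -- ticket 2: apples only
    (3, 1, 2), (3, 3, 4);               -- ticket 3: bananas, oranges
""")

# Every line of every ticket that contains at least one banana.
banana_sales = conn.execute("""
    SELECT f.ticket_id, p.product_name, f.quantity
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    WHERE f.ticket_id IN (
        SELECT f2.ticket_id
        FROM sales_fact f2
        JOIN product_dim p2 ON p2.product_key = f2.product_key
        WHERE p2.product_name = 'banana'
    )
    ORDER BY f.ticket_id, p.product_name
""").fetchall()

# Items per banana-containing ticket, computed from the same fact rows.
items_per_ticket = {}
for ticket_id, _name, quantity in banana_sales:
    items_per_ticket[ticket_id] = items_per_ticket.get(ticket_id, 0) + quantity
```

The product dimension still holds one row per product; the ticket ID on the fact rows is what groups the lines of a sale, so no copies of products are needed to answer the "sales that had a banana" question.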

Database Design: How to prevent referential integrity violation? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
OK, maybe it's not the best title for the question, but this is the case.
I'm working on a project that already has an e-commerce system, and part of the database looks like this. Everything works perfectly.
The problem comes with the references: if a user buys a product, the shopping cart is closed, but if the product is then deleted or its price changes, the order becomes totally corrupted.
I've read this text -> Database Design for Real-World E-Commerce Systems
but I can't see the solution here.
What is the best way to do this? How do big companies deal with this problem?
I mean what I need is to store all the details of an order with the data it had at the purchase moment.
There are different ways to solve this. One approach is to have a price history table rather than a price column that changes periodically. When you create an order you create it for a given price and given product. When you need to change the price of the product, instead of changing the value of the price column, you enter a new record in your price history table so future orders can then take the new price. Another approach is to decouple the product price information from the order. Rather than take the price from the product table, you have a column for unit price in the order table and the current value for the price is saved there.
As far as deleting products, it depends again on your situation. Generally it's not a good idea to delete rows that are needed for historical information. So if you no longer want to sell a product, rather than delete the record, you could have a column that has the availability of the product set to false. So previous orders would still relate to that product but new orders wouldn't be able to add it.
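Two of the ideas above, copying the unit price onto the order line and flagging a product as unavailable instead of deleting it, can be sketched together. This is a minimal illustration with Python's built-in sqlite3; the table names, the is_available column and the add_order_line helper are hypothetical, and prices are stored as integer cents purely to avoid rounding issues in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    name         TEXT    NOT NULL,
    price_cents  INTEGER NOT NULL,             -- current price only
    is_available INTEGER NOT NULL DEFAULT 1    -- 0 = no longer sold, row is kept
);
CREATE TABLE order_line (
    order_id         INTEGER NOT NULL,
    product_id       INTEGER NOT NULL REFERENCES product(product_id),
    qty              INTEGER NOT NULL,
    unit_price_cents INTEGER NOT NULL          -- price copied at purchase time
);
INSERT INTO product (product_id, name, price_cents) VALUES (1, 'Kettle', 2500);
""")

def add_order_line(conn, order_id, product_id, qty):
    """Copy the current price onto the order line; refuse unavailable products."""
    row = conn.execute(
        "SELECT price_cents FROM product WHERE product_id = ? AND is_available = 1",
        (product_id,),
    ).fetchone()
    if row is None:
        raise ValueError("product is not available for new orders")
    conn.execute(
        "INSERT INTO order_line VALUES (?, ?, ?, ?)",
        (order_id, product_id, qty, row[0]),
    )

add_order_line(conn, 1, 1, 2)

# Later the price changes and the product is withdrawn from sale.
conn.execute("UPDATE product SET price_cents = 2900, is_available = 0 WHERE product_id = 1")

# The old order total is computed from the stored unit price, not the live one.
old_total = conn.execute(
    "SELECT SUM(qty * unit_price_cents) FROM order_line WHERE order_id = 1"
).fetchone()[0]

# New orders for the withdrawn product are rejected.
try:
    add_order_line(conn, 2, 1, 1)
    rejected = False
except ValueError:
    rejected = True
```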
Every instance of every field in a table should have a 1:1 relationship with that instance of the table.
The problem is that the Price has a 1:1 relationship with the Product, which is good. But it should also have a 1:1 relationship with the Cart. And, since the price can change over time, it does not.
Two possible solutions:
1. Put a Timestamp on the Purchase, and keep all HistoricalPrices, then select the proper Price for the time of the purchase. This has the advantage of being able to tell exactly why the price changed, but can be extra work.
2. Add a PurchasePrice field to the Shopping_Cart_Products table. Assign the PurchasePrice value for that instance of that table at the time the purchase is made.
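The timestamp-plus-price-history option can be sketched as a lookup of the most recent price that was in effect at the time of purchase. This is a minimal illustration with Python's built-in sqlite3; the price_history table, its valid_from column and the price_at helper are assumed names, and timestamps are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price_history (
    product_id  INTEGER NOT NULL,
    price_cents INTEGER NOT NULL,
    valid_from  TEXT    NOT NULL,       -- ISO-8601 timestamp, sorts chronologically
    PRIMARY KEY (product_id, valid_from)
);
INSERT INTO price_history VALUES
    (1, 1000, '2024-01-01T00:00:00'),
    (1, 1200, '2024-06-01T00:00:00');
""")

def price_at(conn, product_id, purchased_at):
    """Return the price in effect at purchased_at, or None if none existed yet."""
    row = conn.execute(
        "SELECT price_cents FROM price_history "
        "WHERE product_id = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (product_id, purchased_at),
    ).fetchone()
    return None if row is None else row[0]

price_in_march = price_at(conn, 1, "2024-03-15T12:00:00")
price_in_july = price_at(conn, 1, "2024-07-01T09:00:00")
price_before_launch = price_at(conn, 1, "2023-12-31T00:00:00")
```

A price change is then an insert into price_history rather than an update, and old purchases keep resolving to the price that applied when they were made.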
You have a few choices if the products change while they're sitting in carts:
1. You update the cart and notify the user of the reason why.
2. You put all the information you need into the cart and work with that during checkout instead of the live product data.
Solution 1 requires you to do a sanity check at least once at the end of the checkout, to see if the cart is still valid. Solution 2 means that people may buy products that are somehow outdated.

multiple stores (sId), multiple products(pId) different prices. how do I design an efficient database

Right now, I am designing the database, as such I don't have any code. I am looking to use sql server, asp.net if that is relevant.
I have a big number of stores and a big number of products too, both in some thousands. For the same pId, prices may vary by sId. I would build it like this:
1. one "store" table containing fields (sId, name, location),
2. one "products" table containing fields (pId, name, size, category, sub-category) and
3. "max(sId)" number of price tables containing fields (pId, mrp, availability).
where max(sId) is the total number of stores.
I would rather not make "max(pId)" number of tables containing fields (sId, mrp, availability) as I need to provide a UI to each store so that they can update the details about product prices and availability at their respective stores. I also need to display some products of a particular store but I never need to display some stores for any specific product. That is, search for stores by product is not required, but listing of products by store would be required.
Is this a good way or can I do better?
You appear to be on the right track and I will offer some recommendations. Although there is no requirement to display some stores for any specific products, you should always think about how the requirements will change and how your system can handle that. Build your system so that you can answer questions like these easily - What stores have product ABC priced under $3/piece?
Store table should contain, as you mentioned, information about stores. Take Aaron Bertrand's comment seriously. Name the fields in a way that the next developer can read and figure out what they are. Use StoreID instead of sId.
StoreID  StoreName         ...other fields
-------  ----------------
1        North Chicago
2        East Los Angeles
Product table should contain information about products. It would be better to store category and sub-category into a different table.
ProductID  ProductName  ...other fields
---------  -----------
1          Bread
2          Soap
Categories can live in their own table with a hierarchical structure. See Hierarchical Data and how to use the hierarchyid data type. This may help in finding out the depth of each top-level category and help management decide if they are going overboard with categorization and unknowingly making life miserable for everybody, including themselves.
Many-to-many ProductCategory table can link products to categories. Also keep a history table. When a product's category is changed, keep track of what it was and what it is set to. It may help in answering questions such as - How many products were moved from Agriculture to Construction category in the last 6 months?
Many-to-many StoreProductPrice can bring together store and product and a price can be defined there. Also remember - prices may differ by customers also. Some customers may get discounts at a certain level. Although this may be too much to discuss here, it should be kept in the back of the mind in case a requirement to support customer discount structure comes up.
StoreProductID  StoreID  ProductID  Price
--------------  -------  ---------  -----
1               1        1          $4.00
2               1        2          $1.00
3               2        1          $4.05
4               2        2          $1.02
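The three tables above can be tried out with a small sketch. It uses Python's built-in sqlite3 rather than SQL Server so that it runs anywhere; the PriceCents column (integer cents instead of a money type) and the two helper functions are assumptions for the example, while the sample rows mirror the tables shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Store   (StoreID   INTEGER PRIMARY KEY, StoreName   TEXT NOT NULL);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT NOT NULL);
CREATE TABLE StoreProductPrice (
    StoreProductID INTEGER PRIMARY KEY,
    StoreID        INTEGER NOT NULL REFERENCES Store(StoreID),
    ProductID      INTEGER NOT NULL REFERENCES Product(ProductID),
    PriceCents     INTEGER NOT NULL,
    UNIQUE (StoreID, ProductID)          -- at most one price per store and product
);
INSERT INTO Store   VALUES (1, 'North Chicago'), (2, 'East Los Angeles');
INSERT INTO Product VALUES (1, 'Bread'), (2, 'Soap');
INSERT INTO StoreProductPrice VALUES
    (1, 1, 1, 400), (2, 1, 2, 100), (3, 2, 1, 405), (4, 2, 2, 102);
""")

def products_for_store(conn, store_id):
    """The required listing: products and prices for one store."""
    return conn.execute(
        "SELECT p.ProductName, spp.PriceCents "
        "FROM StoreProductPrice spp "
        "JOIN Product p ON p.ProductID = spp.ProductID "
        "WHERE spp.StoreID = ? ORDER BY p.ProductID",
        (store_id,),
    ).fetchall()

def stores_selling_under(conn, product_name, max_cents):
    """The 'future' question: which stores price a product below a threshold."""
    return conn.execute(
        "SELECT s.StoreName, spp.PriceCents "
        "FROM StoreProductPrice spp "
        "JOIN Store s   ON s.StoreID   = spp.StoreID "
        "JOIN Product p ON p.ProductID = spp.ProductID "
        "WHERE p.ProductName = ? AND spp.PriceCents < ? "
        "ORDER BY s.StoreID",
        (product_name, max_cents),
    ).fetchall()

east_la_products = products_for_store(conn, 2)
cheap_soap = stores_selling_under(conn, "Soap", 300)
cheap_bread = stores_selling_under(conn, "Bread", 300)
```

With a single price table, both directions of the question are one query each; with one price table per store, the second query would have to visit every table.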
Availability of the product should be handled through the inventory management database table(s). For example, you may have a master table of Warehouse and a master table of Location. Bringing them together would be a WarehouseLocation table. A WarehouseProduct table may bring together warehouse, product and units available.
Alternatively, your production or procurement facility might be dumping data into a ProcuredProduct table. Your manufacturing unit might be putting locks on a subset of products while building something out of them. Your sales unit might be putting locks on a subset of products they are trying to sell. In other words, your products may be continually getting allocated. You may run queries to find out the availability of a certain product, and that can be a little taxing. During any such allocation, the number of available units can be updated in a single table (which contains calculated available products that you can comfortably rely on).
So...depending on your customer's needs, the system you are building can get fairly complicated. I am recommending that you think about these things and keep your database structure flexible to anticipated changes. Normalization is a good thing, and de-normalization has its place also. Use them wisely.

database design scenario

What's the right way to do this:
I have the following relationship between the entities RAW_MATERIAL_PRODUCT and FINISHED_PRODUCT: a Finished Product has to be made of one or more Raw Material Products, and a Raw Material Product may be part of a Finished Product (so a many-to-many relationship). I have an intersection entity, which I called ASSEMBLY, that tells me exactly which Raw Material Products a Finished Product is made of.
Good. Now I need to sell the Finished Products and compute the production cost. The PRODUCT_OUT entity comes in, which can contain only one FINISHED PRODUCT, and a FINISHED PRODUCT may be part of multiple PRODUCT_OUT.
It would be easy if, for example, Finished Product A was always made of 3 pieces of Raw Material Product a1, 2 of a2 etc. Problem is that the quantities may change.
The stock of a Raw Material Product is computed as
TotalIn - TotalOut
so I can't put a quantity attribute in ASSEMBLY, because I would get incorrect data when calculating the stock (if quantities are changed).
My only idea is to give up the FINISHED_PRODUCT entity and make a join between PRODUCT_OUT and RAW_MATERIAL_PRODUCT, with the intersection entity containing a quantity attribute. But this seems kind of stupid because almost all the time a FINISHED_PRODUCT is made of the same RAW_MATERIAL_PRODUCTS.
Is there a better way?
I'm not 100% sure I understand, but it sounds like essentially the recipe can change, and your model needs to account for this?
But this seems kind of stupid because almost all the time a
FINISHED_PRODUCT is made of the same RAW_MATERIAL_PRODUCTS.
Almost all the time, or all the time? I think that's a pretty critical question.
It seems to me that when you change the recipe, you should create a new FINISHED_PRODUCT row, which has a different set of RAW_MATERIAL_PRODUCTS based on the association in the ASSEMBLY table.
If you want to group different recipes of the same FINISHED_PRODUCT together (kind of like versioning!), create a FINISHED_PRODUCT_TYPE table with a 1:m relationship to the FINISHED_PRODUCT table.
Edit (quote from comment):
I totally agree with you it should be a different product but if i add
one screw to a product i can't really name it Product A with 1 extra
screw. And it seems this can happen. I didn't quite get the use of
creating a FINISHED_PRODUCT_TYPE table. Could you please explain?
Sure. So your FINISHED_PRODUCT_TYPE defines the name of the product, and possibly some other data (description, category, etc.). Then each row in FINISHED_PRODUCT is essentially a "version" of that product. So "Product A" would only exist in one place, a row in the FINISHED_PRODUCT_TYPE table, but there could be one or many versions of it in the FINISHED_PRODUCT table.
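A minimal sketch of this type-and-version layout, using Python's built-in sqlite3, might look like the following. The column names, the version number and the quantity column on ASSEMBLY are assumptions for the example; in particular, it assumes a recipe is fixed once a version row exists, so a changed recipe always becomes a new FINISHED_PRODUCT row rather than an update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE finished_product_type (
    type_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL                 -- "Product A" lives only here
);
CREATE TABLE finished_product (
    finished_product_id INTEGER PRIMARY KEY,
    type_id             INTEGER NOT NULL REFERENCES finished_product_type(type_id),
    version             INTEGER NOT NULL,
    UNIQUE (type_id, version)
);
CREATE TABLE assembly (
    finished_product_id INTEGER NOT NULL REFERENCES finished_product(finished_product_id),
    raw_material        TEXT    NOT NULL,
    quantity            INTEGER NOT NULL, -- fixed for a given version (assumption)
    PRIMARY KEY (finished_product_id, raw_material)
);
INSERT INTO finished_product_type VALUES (1, 'Product A');
INSERT INTO finished_product VALUES (10, 1, 1), (11, 1, 2);
INSERT INTO assembly VALUES
    (10, 'a1', 3), (10, 'a2', 2),                   -- version 1
    (11, 'a1', 3), (11, 'a2', 2), (11, 'screw', 1); -- version 2: one extra screw
""")

def recipe(conn, type_name, version):
    """Raw materials and quantities for one version of a product type."""
    return conn.execute(
        "SELECT a.raw_material, a.quantity "
        "FROM finished_product_type t "
        "JOIN finished_product fp ON fp.type_id = t.type_id "
        "JOIN assembly a ON a.finished_product_id = fp.finished_product_id "
        "WHERE t.name = ? AND fp.version = ? "
        "ORDER BY a.raw_material",
        (type_name, version),
    ).fetchall()

recipe_v1 = recipe(conn, "Product A", 1)
recipe_v2 = recipe(conn, "Product A", 2)
```

A PRODUCT_OUT row would then reference a specific finished_product_id, so stock consumed by past sales can be computed from the version that was actually sold, while the product is still displayed under the single name "Product A".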

db design questions (how to deal with groups of products)

I would be grateful if somebody could help me to find an elegant solution to this database design problem. There is a company with a lot of different products (P1,P2,P3,P4) and a lot of customers (C1, C2, C3, C4). Now they have a simple database table to deal with orders, something like
20101027 C2 P1 qty status
20101028 C1 P2 qty status
Now I would like to create groups of products (e.g. (P1+P3+P4) and (P2+P3)) that could be purchased together for a reduced price. What is the best way to represent such groups in a database system? Dealing with these groups as individual products doesn't work, because I need the functionality of replacing, adding or removing products from the groups. So I need to keep the currently given table of products.
Thanks for reading. I hope I will get some help.
Add a new table product_group_promotions, with an ID, name and discount price. Then create a table product_group_promotions_products that links products to product group promotions. This will contain a product group ID and a product ID. This way, you can place one product in multiple groups, and let groups contain multiple products (of course).
Jan's answer is correct but incomplete.
You'll also need start and end dates of the promotion. You'll probably want to enter next week promotions so they are ready but not apply them until appropriate.
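The two tables from the first answer, extended with the start and end dates suggested here, can be sketched with Python's built-in sqlite3. The column names (discount_price_cents, starts_on, ends_on), the sample dates and the applicable_promotions helper are assumptions for illustration; the sketch only decides which promotions a basket qualifies for on a given date and deliberately ignores how the discount is then applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_group_promotions (
    id                   INTEGER PRIMARY KEY,
    name                 TEXT    NOT NULL,
    discount_price_cents INTEGER NOT NULL,
    starts_on            TEXT    NOT NULL,  -- ISO dates, inclusive
    ends_on              TEXT    NOT NULL
);
CREATE TABLE product_group_promotions_products (
    promotion_id INTEGER NOT NULL REFERENCES product_group_promotions(id),
    product_id   TEXT    NOT NULL,
    PRIMARY KEY (promotion_id, product_id)
);
INSERT INTO product_group_promotions VALUES
    (1, 'Bundle A', 2500, '2010-10-25', '2010-10-31'),
    (2, 'Bundle B', 1500, '2010-11-01', '2010-11-07');
INSERT INTO product_group_promotions_products VALUES
    (1, 'P1'), (1, 'P3'), (1, 'P4'),
    (2, 'P2'), (2, 'P3');
""")

def applicable_promotions(conn, basket, on_date):
    """Names of promotions active on on_date whose products are all in the basket."""
    basket = set(basket)
    promos = conn.execute(
        "SELECT id, name FROM product_group_promotions "
        "WHERE starts_on <= ? AND ends_on >= ? ORDER BY id",
        (on_date, on_date),
    ).fetchall()
    names = []
    for promo_id, name in promos:
        members = {
            row[0]
            for row in conn.execute(
                "SELECT product_id FROM product_group_promotions_products "
                "WHERE promotion_id = ?",
                (promo_id,),
            )
        }
        if members and members <= basket:  # every group member was bought
            names.append(name)
    return names

full_basket = ["P1", "P2", "P3", "P4"]
this_week = applicable_promotions(conn, full_basket, "2010-10-27")
next_week = applicable_promotions(conn, full_basket, "2010-11-02")
partial = applicable_promotions(conn, ["P1", "P3"], "2010-10-27")
```

Because group membership is just rows in the link table, adding or removing a product from a group is an insert or delete, and next week's promotion can be entered ahead of time without taking effect early.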
Discount price may not be enough either. You will also need to get business rules from the business people as to how to apply the discount. It could be a percentage, a free item, or a fixed amount. If it is a percentage, do you distribute the discount evenly, proportionally, on the cheapest product, or on the most expensive? If it is a free item, which one in the set is free? It could also be a fixed amount: $10 off if you buy x, y, and z. Is the discount applied more than once? If someone buys 5x of P2 and P3, do they get the discount on all of them or just the first ones? Is there a limit over a time period? As in the previous example, if you don't give me the discount on all 5, I would just fill out 5 orders of 1 each and get the discount you were trying to prevent. If so, you'd have to go back through previous purchases by that customer to see if they've already received that discount.
You can see how ugly this can get. I would clarify with the business EXACTLY what they plan to use this new feature for and run through these use cases with them.
As Q asked, if the basket of purchased items is large enough, there could be more than one discount possible. Do you have to determine which one to give, or do you present a list of choices back to the UI and then recalculate?
This is why I have mercy on department stores who screw this stuff up. It's not simple.
On the surface this is simple sounding but in reality it's very complex.
"Dealing with these groups as individual products doesn't work, because I need the functionality of replacing, adding or removing products from the groups."
Here are several things you need to look at:
Current Inventory vs Past Orders
How do you deal with Price Changes on P1, P2, P3
How do you handle adding a new product or products to an existing group
How do you handle removing a product or products from an existing group
In my opinion you need two sets of tables.
Tables that make up your current inventory
Tables that record what customers purchased (historical data tables)
If you need to reconstruct a customer's purchase from six months ago, you cannot rely on the current data, given the fact that a grouping may not look the same today as it did six months ago. So, if you do not already have a set of historical data tables for customer records, I recommend you create them.
With a set of historical data tables that house what was bought by the customer, you can pretty much do whatever you want to the current inventory data: change prices, regroup products, make products obsolete, temporarily suspend a product, etc.
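The split between current tables and historical tables can be sketched as a copy made at purchase time. This is a minimal illustration with Python's built-in sqlite3; the group_member and purchase_history tables and the record_purchase helper are hypothetical names, and the sketch assumes the whole group is bought as one unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_member (               -- current data: free to change
    group_name   TEXT    NOT NULL,
    product_code TEXT    NOT NULL,
    price_cents  INTEGER NOT NULL
);
CREATE TABLE purchase_history (           -- historical data: written once
    purchase_id  INTEGER NOT NULL,
    purchased_on TEXT    NOT NULL,
    group_name   TEXT    NOT NULL,
    product_code TEXT    NOT NULL,
    price_cents  INTEGER NOT NULL
);
INSERT INTO group_member VALUES ('Bundle', 'P2', 500), ('Bundle', 'P3', 700);
""")

def record_purchase(conn, purchase_id, purchased_on, group_name):
    """Copy the group's current members and prices into the history table."""
    conn.execute(
        "INSERT INTO purchase_history "
        "SELECT ?, ?, group_name, product_code, price_cents "
        "FROM group_member WHERE group_name = ?",
        (purchase_id, purchased_on, group_name),
    )

record_purchase(conn, 1, "2010-10-27", "Bundle")

# Afterwards the current data is regrouped and repriced.
conn.execute("DELETE FROM group_member WHERE product_code = 'P3'")
conn.execute("UPDATE group_member SET price_cents = 650 WHERE product_code = 'P2'")
conn.execute("INSERT INTO group_member VALUES ('Bundle', 'P4', 300)")

# The purchase can still be reconstructed exactly as it was made.
purchase_1 = conn.execute(
    "SELECT product_code, price_cents FROM purchase_history "
    "WHERE purchase_id = 1 ORDER BY product_code"
).fetchall()
current_bundle = conn.execute(
    "SELECT product_code, price_cents FROM group_member "
    "WHERE group_name = 'Bundle' ORDER BY product_code"
).fetchall()
```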
