Order/Invoice/Payment database modeling - database

I am designing an e-commerce website that has the following scenario:
A customer can purchase items and create an order.
The order may have unknown fee that will be added after the customer
pays the total amount of the items. That is, the customer pays
certain amount first. The order adds some fee and changes the total.
And the customer pays again for the difference. But the two (or
more) payments are associated with the same order.
(Optional) The customer can submit a single payment for multiple
orders.
Currently, I have an Order table and each order may consist of multiple OrderLineItems (simplified schema):
Order
=====
customer
line_items
total
status
OrderLineItem
=============
price
quantity
order
product
A payment is associated with an order (simplified schema):
Payment
=======
order
payment_account
total
result
It seems to be very hard to support the multiple payments for a single order scenario in the current implementation. I reckon that I have to introduce immutable invoices in the system, and the payment should be associated with an invoice instead of an order. However, I would need some help with the order/invoice/payment modeling for the above scenario. Some specific questions I have:
An order and an invoice look very similar to me (e.g. both have
items and totals). What is the major difference in typical
e-commerce systems?
How should I model invoices for my scenario? Should I have
OrderLineItems for an Order AND InvoiceLineItems for an Invoice?
Some preliminary thoughts: I will have multiple invoices associated
with a certain order. Whenever the order changes the total, I have
to somehow calculate the difference and send a new/immutable invoice
to the customer. Then, the customer can pay and the payment will be
associated with the invoice.
Would love to hear some advice. Much appreciated. Thanks!

There are a lot of questions here, I'll try to address as many as I can. A lot of the issues are the business model rather than the data model.
Firstly, you are right that you need an immutable invoice. Once an invoice is created you cannot alter it, the standard way of handling a change in invoice is to issue a credit note and a new invoice.
Think about the relations between the tables: the Order does not need to hold the lineItems as these are referenced the other way, i.e.
Order
=====
orderId
customerId
status
OrderLineItem
=============
orderLineItemId
orderId
product
price
quantity
So to view the order you join the tables on orderId. There's also no need to store the total as that is calculated from the join.
Try not to duplicate your data: your invoice will reference orderLineItems, but not necessarily the same ones in the order. For example, the customer orders an A and a B, but B is out of stock. You ship A and create an invoice that references orderLineItemId for A. So your invoice tables might look like:
invoice
=======
invoiceId
status
invoiceLineItem
===============
invoiceId
orderLineItemId
quantity
In other words there's no need to have the details of the item. You may be wondering why you don't just add an invoiceId to the orderLineItem table - the reason is the customer might order 10 of item A, but you only ship 8 of them, the other two will go on a subsequent invoice.
Payments are not made against orders, they are against invoices. So your payments table should reference the invoiceId.
Then you touch on reconciliation. If everything was perfect, the customer would make a payment against a particular invoice, even if a partial payment. In reality this is going to be your biggest headache. Suppose the customer has several outstanding invoices for amounts x, y and z. If they make a payment of p, which do you allocate it against? Perhaps in date order, so that if p>x the remainder is allocated against y. What if p=z? Perhaps the customer is intending to pay z now but is disputing y and has mislaid x? How you deal with these sorts of things is up to you, but I can say that most invoicing systems do it very badly indeed.

Related

Redundant relation: Is this a violation of database normalization?

I have a table with products that I offer. For each product ever sold, an entry is created in the ProductInstance table. This refers to this instance of the product and contains information such as the next due date (if the product is to be billed monthly) and other information relevant to this instance (e.g. personal branding).
For understanding: The products are service contracts. The template of the contract is stored in the product table (e.g. "Monthly lawn mowing"). The product instance is then e.g. "Monthly lawn mowing in sample street" and contains information like the size of the garden or something specific to this instance of the service instead of the general product.
An invoice is created for a product instance either one time or recurring. An Invoice may consists of several entries. Each entry is represented by an element in the InvoiceEntry table. This is linked to the ProductInstance to create the reference to the invoice.
I want to extend the database with purchase orders. To do this, a record is created in the Order table. This contains a relation to the customer and e.g. the order date. The single products of the order are mapped by an OrderEntry. The initial invoice generated for the order is linked via the field "invoice_id" in the table order. The invoice items from the initial order are created per OrderEntry and create one InvoiceEntry each. However, I want the ProductInstance to be created only after the invoice is paid. Therefore the OrderEntry has a relation to the product and not only to the ProductInstance. Once the order has been created, the instance is created and linked to the OrderEntry.
I see the problem that the relation between Order and Invoice is doubled: once Order <-> Invoice and once Order <-> OrderEntry <-> InvoiceEntry <-> Invoice.
And for the Product: OrderEntry <-> Product and OrderEntry <-> ProductInstance <-> Product.
Model of the described database
My question is if this "duplicate" relation is problematic, or could cause problems later. One case that feels messy to me is, what should I do if I want to upgrade the ProductInstance later (to an other product [e.g. upgrade to bigger service])? The order would still show the old product_id but the instance would point to a new product_id.
This is a nice example of real-life messy requirements, where the 'pure' theory of normalisation has to be tempered by compromises. There's no 'slam-dunk right' approach; there's some definitely 'wrong' approaches -- your proposed schema exhibits some of those. I suspect there's not even a 'best' approach. Thank you for expanding the description of the business context -- especially for the ProductInstance table.
But still your description won't support legally required behaviour:
An invoice is created for a product instance either one time or recurring. An Invoice may consists of several entries. Each entry is represented by an element in the InvoiceEntry table.
... I want the ProductInstance to be created only after the invoice is paid.
An invoice represents an indebtedness from customer to supplier. It applies at one date only, not "recurring". (So leaving out the Invoice date has exactly got in the way of you "thinking about relations".) A recurring or cyclical billing arrangement would be represented by something like a 'contract' table, from which an Invoice is generated by some scheduled process.
Or ... your "recurring" means the invoice is paid once up-front for a recurring service(?) Still you need an Invoice date. The terms of service/its recurrence would be on the ProductInstance table.
I can see no merit in delaying recording the ProductInstance 'til after invoice payment. Where are you going to hold the terms of service in the meantime? If you're raising an invoice, your auditors/the statutory authorities will want you to provide records of what the indebtedness relates to. Create ProductInstance ab initio and put a status on it. (Or in the application look up the Invoice's paid status before actually providing the service.)
There's something else about Invoices you're currently failing to capture -- and that has also lead you to a wrong design: in general there is more making up the total $ value of an invoice than product lines, such as discounts applying to the invoice overall rather than particular products; delivery charges; installation costs or inspection/certification; taxes (local/State/Federal).
From your description perhaps the only one applying is taxes. ("in this world nothing can be said to be certain, except death and taxes.") And taxes are not specific to products/no product_instance_id is applicable on an InvoiceEntry.
For this reason, on ERP schemas in general, there is no foreign key declared from InvoiceEntry to Product/Instance. (In your case you might get away with product_instance_id being nullable, but yeuch.) There might be a system-generated XRef text column, which contains different content according to what the InvoiceEntry represents, but any referencing can't be declared to the schema. (There might be a 'fully normalised' way to represent that with an auxiliary linkage table, but maintaining that in step adds too much complexity to the application.)
I see the problem that the relation between Order and Invoice is doubled: once Order <-> Invoice and once Order <-> OrderEntry <-> InvoiceEntry <-> Invoice.
Again think about the business sequence of operations that generate these records: ordering happens as a prelude to invoicing. You can't put an invoice_id on Order, because you haven't created the Invoice yet. You might put the order_id on Invoice. But here you're again in the situation that not all Invoices arrive via Orders -- some might be cash sales/immediate delivery. (You could make order_id nullable, but yeuch.) For this reason on ERP schemas in general, there is no foreign key declared from Invoice to Order, etc, etc.
And the same thinking with OrderEntry <-> InvoiceEntry: your proposed schema has the sequencing wrong/the reference points the wrong way. (And not every InvoiceEntry will have corresponding OrderEntry.)
On OrderEntry, having all of (OrderEntry)id and product_id and product_instance_id seems to me to give you way too many opportunities for tangling it all up. Can an Order have multiple Entrys for the same product_id? -- why/how? Can it have multiple Entrys for the same product_instance_id? -- why/how? Can there be a product_instance_id which refers to a different product_id than OrderEntry.product_id? This is exactly the sort of risk for confusing entanglement that normalisation aims to remove/reduce.
The customer is ordering a ProductInstance: mowing a particular size of garden at a particular address, fortnightly on a Tuesday afternnon. So OrderEntry.product_instance_id is what you want; .product_id is wrong. So (again) you need to create ProductInstance at time of recording the Order. Furthermore I strongly suspect you don't need an id on OrderEntry; instead give it a compound key (order_entry_id, product_instance_id). [**]
[**] I see you're using 'eloquent'. I suspect this is requiring id on every table. So you're not even using a relational database, this is some sort of Object-Relational hybrid. Insisting on a dedicated single id as key on every table is toxic. It has lead schema designers astray every time I get called in to help -- as here. Please if you can at all avoid it, don't do that.

database work flow analysis

I have analyzed my as below and I have two point I am confused about
1- currently when I insert an items within orders I give order_ID as a PK
so each item within the order as its own PK... and same customer_id (FK) for all Item within the order so ... in that case the invoice number is same as the customer_ID
Is that what should think goes on or there is something wrong with this work flow ?
2-in some cases I don't need to record the customer information I just want to
insert the orders without their customer info.. I don't have clear idea about how think should happen :S
3- IF I want to apply a discount on some customer orders where the discount should I allow user to able to apply on the item per orders level ?
or on the whole order ? and where the discount column should be stored
1 - This design seems fine for the problem. You are stating that for every product ordered, this is the customer line. You can pick the ID, name, tax/fiscal number, address, etc.
2 - If you don't need any kind of customer information, make the customer_id on the Orders table accept NULL. It is the cleanest way to do it.
If for some reason NULLs are not an option or you want to keep on the database some basic anonymous customer data, you can create a line on customers for anonymous users (Ex.: ID: 1 / Name: ANONYMOUS ...) and place that ID on the order line.
3 - Placing the discount per product ordered might be the better idea.
If you want to apply a discount for the complete order, you just need to place that discount on every single order line.
If you want to apply a discount for a single product, you just need to place that discount on that product line.
If you want to apply a discount for a single product with a limited ammount of quantity (Ex.: Discount of 50% but limited to 1 purchase), and the customers buys more than that limit, you just need to place 2 orders for the same product. One with the discount and max quantity and the other without discount and the rest of the quantity.
Placing this on the order level wouldn't work for single product discounts.
This were answers to your questions. I would also question your design in points like:
If the customer changes address, is your order supposed to change too? If not, you should use the normal approach for orders, which is to have a order header table, with fixed information, like the address and the foreign key to the customer, and a order line table, like your orders line, but with a foreign key to the order header. That also avoids repetition.
Are you expecting up to 2 phone numbers? If you do, the design is ok. If you don't know how many phone numbers to expect, maybe a table with phone numbers, with foreign keys to the customer might be a better approach.

SQL Server Database Design - Seperate Table for Sale and Purchase

I am building a new business application for my personal business which has close to ~100 transactions of sale and purchase per day. I am thinking of having Separate tables to record the sale and purchase with another linked table for Items that were sold and a seperate linked table with items that were purchased.
Example:
**SaleTable**
InvoiceNo
TotalAmt
**SaleTableDetail**
LinkedInvNo
ProductID
Quantity
Amount
etc.,
would this design be better or would it be more efficient to have one transactiontable with a column stating sale or purchase?
-From an App/Database/Query/Reporting Perspective
An invoice is not the same as a sales order. An invoice is a request for payment. A sales order is an agreement to sell products to a party at a price on a date.
A sales order is almost exactly the same as a purchase order, except you are the customer, and a sales order line item can reference a purchase order line item. You can put them in separate tables, but you should probably use Table Inheritance (CTI, extending from an abstract Order). Putting them in the same table with a "type" column is called Single Table Inheritance and is nice and simple.
Don't store totals in your operational db. You can put them in your analytic db though (warehouse).
You are starting small, thats a quick way to do. But, I am sure, very shortly you will run into differences between sale and purchase transactions, some fields will describe only a sale and some fields that will be applicable only for purchases.
In due course, you may want to keep track of modifications or a modification audit. Then you start having multiple rows for the same transaction with fields indicating obsoletion or you have to move history records to another table.
Also, consider the code-style-overhead in all your queries, you got to mention the Transaction Type as sale or purchase for simple queries.
It would be better to design your database with a model that maps business reality closest. At the highest level, everything may abstract to a "transaction", with date, amount and some kind of tag to indicate amount is paid or received against what context. That shouldn't mean we can have a table with Tag, Date, Amount, PayOrReceive to handle all the diverse transactions.

Same Fact Table Column; Records with Multiple Reasons

I am in a situation similar to the one below:
Think for instance we need to store customer sales in a fact table (under a data warehouse built with dimensional modelling). I have sales, discounts related to the sale, sales returns and cancellations to be stored.
Do you think it would be advisable to store sales for a day to a customer in a particular product (when the day is the grain) as a positive value while the returns and discounts are stored as minuses?
Also if a discount is enforced to a customer at a level other than the product (for instance brand), do you think it is alright to persist it with a key particularly assigned to the brand (product is the grain) while the product column being given an N/A, for the particular record?
Thanks in advance.
If your sales are considered a good thing (I'm assuming they are) then recording sales as positive numbers makes perfect sense. Any transaction that reduces sales (i.e. discounts and returns) should therefore be recorded as negative numbers. This will make reporting your sales very natural.
If you have diffent dimensions that might account for a record, you should populate the dimensions that make sense. So yes, attribute a discount to a brand rather than a product if that is what happened in your business transaction. This way your reporting will be able to look at all discounts, at discounts for particular products and discounts for entire brands. If your fact table shows the most direct "cause" of the discount (product or brand) then your reports will be more useful than if you link the fact to brand through a relationship to product.

db design questions (how to deal with groups of products)

I would be grateful if somebody could help me to find an elegant solution to this database design problem. There is a company with a lot of different products (P1,P2,P3,P4) and a lot of customers (C1, C2, C3, C4). Now they have a simple database table to deal with orders, something like
20101027 C2 P1 qty status
20101028 C1 P2 qty status
Now I would like to create groups of products ( eg. (P1+P3+p4) and (P2+P3)) that could be purchase together for a reduced price. What is the best way to represent such groups in a database system? Dealing with these groups as individual products doesn't work, because I need the functionality of replacing, adding or removing products from the groups. So I need to keep the currently given table of products.
Thanks for reading. I hope I will get some help.
Add a new table product_group_promotions, with an ID, name and discount price. Then create a table product_group_promotions_products that links products to product group promotions. This will contain a product group ID and a product ID. This way, you can place one product in multiple groups, and let groups contain multiple products (of course).
Jan's answer is correct but incomplete.
You'll also need start and end dates of the promotion. You'll probably want to enter next week promotions so they are ready but not apply them until appropriate.
Discount price may not be enough either. You also will need to get business rules from business people as to how to apply the discount. It could be a percentage or a free item or a fixed amount. If a percentage do you distribute the discount evenly, proportionally, on the cheapest product, the most expensive? If a free item, which one in the set is free. It could also be a fixed amount, $10 off if you buy x, y, and z. Is the discount applied more than once. If someone buys 5x of P2 and P3 do they get the discount on all of them or just the first ones. Is there a limit over a time period. As in the past example, if you don't give me the discount on all 5, I would just fill out 5 orders of 1 each and get the discount you were trying to prevent. If so you'd have to go back through previous purchase by that customer to see if they've received that discount.
You can see how ugly this can get. I would clarify with the business EXACTLY what they plan to use this new feature for and run through these use cases with them.
As Q, asked, if the basket of purchased items is large enough there could be more than one discount possible. Do you have to determine what to give, do you present a list of choices back to the UI... and the recalculate.
This is why I have mercy on department stores who screw this stuff up. It's not simple.
On the surface this is simple sounding but in reality it's very complex.
"Dealing with these groups as individual products doesn't work, because I need the functionality of replacing, adding or removing products from the groups."
Here are several things you need to look at:
Current Inventory vs Past Orders
How do you deal with Price Changes on P1, P2, P3
How do you handle adding a new product or products to an existing group
How do you handle removing a product or products from an existing group
In my opinion you need two sets of tables.
Tables that make up your current inventory
Tables that record what customers purchased (historical data tables)
If you need to reconstruct a customers purchase from six months ago, you can not rely on the current data givin the fact that a grouping may not look the same today as it did six months ago. So, if you do not already have a set of historical data tables for customer records then, I recommend you create them.
With a set of historical data tables that house what was bought by the customer you can pretty much do what every you want to the current invetory data. Change prices, regroup products, make products obsolete, Temporarily suspend a product, etc.

Resources