I could really use some insights on choosing between the following two database layouts.
Layout #1 | Layout #2
|
CUSTOMERS | CUSTOMERS
id int pk | id int pk
info char | info char
|
ORDERS | ORDERS
id int pk | id int pk
customerid int fk | customerid int fk
date timedate | date timedate
|
DETAILS | INVOICES
id int pk | id int pk
orderid int fk | orderid int fk
date timedate | date timedate
description char |
amount real | DETAILS
period int | id int pk
| invoiceid int fk
| date timedate
| description char
| amount real
This is a billing application for a small business, a sole proprietor. The first layout has no separate table for invoices, relying instead on the field 'period' in DETAILS for the billing cycle number. The second layout introduces a table specifically for invoices.
Specifically in this application, at what point do you see Layout #1 breaking, or what kind of things will get harder and harder as the amount of data increases? In the case of Layout #2, what does the added flexibility/complexity mean in practical terms? What are the implications for 30-60-90 aging? I'm sure that will be necessary at some point.
More generally, this seems to be a general case of whether you track/control something through a field in a table or a whole new table, yet it's not really a normalization issue, is it? How do you generally make the choice?
Given the previous comments, this is how I would approach it:
CUSTOMERS
id int pk
info char
CASES
id int pk
customerid int fk
dateOpened datetime
dateClosed datetime
status int <- open, closed, final billed, etc.
BillPeriod int <- here is where you determine how often to bill the client.
BillStartDate datetime <- date that billings should start on.
BILLING
billingid int pk
caseid int fk
userid int fk <- id of person who is charging to this case. i.e. the lawyer.
invoicedetailid fk <- nullable, this will make it easier to determine if this particular item has been invoiced or not.
amount money
billdate datetime
billingcode int fk <- associate with some type of billing code table so you know what this is: time, materials, etc.
description char
INVOICES
invoiceid int pk
customerid int FK
invoicedate datetime
amount money <- sum of all invoice details
status int <- paid, unpaid, collection, etc..
discount money <- sum of all invoice details discounts
invoicetotal <- usually amount - discount.
INVOICEDETAILS
invoicedetailid int PK
invoiceid int FK
billingid int FK
discount money <- amount of a discount, if any
===========
In the above you open a "case" and associate it with a customer. On an ongoing basis one or more people apply Billings to the case.
Once the combination of the bill start date and period have elapsed, then the system will create a new Invoice with Details copied from the Billing table. It should do this based on those details that have not already been billed. You should lock the billing record from future changes once it has been invoiced.
You might have to change "BillPeriod" to some other type of field if you need different triggers. For example, period is just one "trigger" to create an invoice.
One might include sending an invoice when you hit a certain dollar amount. This could be configured at the customer or case level. Another option is capping expenditures. For example putting a cap value at the case level which would prevent billings from going over the cap; or at the very least causing an alert to be sent to the relevant parties.
I'm not entirely sure why "period" is attached to the item instead of the order itself. Layout #1 seems to imply that you can have an open "order" that is made up of "details" which might be added and paid for over a period of years. This seems very wrong and should make accounting a nightmare. Layout #2 really isn't much better.
Generally speaking an Order is comprised of a single transaction with a purchase or contract date. That transaction might encompass multiple detail items, but it's still one transaction. It represents a single agreement made at a certain point in time between the buyer and seller. If new items are purchased, a new order is created... With this in mind, neither table structure works.
Regarding Invoices. An order might have one or more invoices attached to it. The goal of invoices is to have payments applied against them. For small transactions there is a one-to-one relationship between invoices and orders.
In larger transactions, you might have multiple invoices that get applied to a single order. For example if you have contracted to make "3 easy payments of $199.99 ..." . In this case you would have 3 invoices for $199.99 each applied to a single order whose total was $599.97; and each due at different time periods.
An Invoice table should then have at minimum the Order Id, Invoice Number, Invoice Date, Invoiced Amount, Date Due, Transaction Id (for credit card), Check Number (obvious), Amount Received and Date Received fields.
If you want to get fancy and support more of the real world, then you would additionally have a Payments table which stored the Invoice Number, Amount Received (or refunded), Date Received, Transaction ID, and Check Number. If you go this route, remove those fields from the Invoice table.
Now, if you need to support recurring charges (for example, internet hosting), then you would have a different table called "Contracts" and "ContractDetails" or something similar. These tables would store the contract details (similar to orders and order details, but includes date start, date end, and recurring period). When the next billing period is hit, then the details would be used to create an order and the appropriate invoices generated..
Since you're doing legal billing, I'd suggest you spend some time looking at the features of Sage Timeslips. Lawyers don't behave like other people; accounting software for lawywers doesn't behave like other accounting software. It's the nature of the business.
They have a 30-day free trial, and you can probably learn a lot from the help files and documentation.
Besides, reverse-engineering database design from the user interface is good practice.
Related
I am designing an e-commerce website that has the following scenario:
A customer can purchase items and create an order.
The order may have unknown fee that will be added after the customer
pays the total amount of the items. That is, the customer pays
certain amount first. The order adds some fee and changes the total.
And the customer pays again for the difference. But the two (or
more) payments are associated with the same order.
(Optional) The customer can submit a single payment for multiple
orders.
Currently, I have an Order table and each order may consist of multiple OrderLineItems (simplified schema):
Order
=====
customer
line_items
total
status
OrderLineItem
=============
price
quantity
order
product
A payment is associated with an order (simplified schema):
Payment
=======
order
payment_account
total
result
It seems to be very hard to support the multiple payments for a single order scenario in the current implementation. I reckon that I have to introduce immutable invoices in the system, and the payment should be associated with an invoice instead of an order. However, I would need some help with the order/invoice/payment modeling for the above scenario. Some specific questions I have:
An order and an invoice look very similar to me (e.g. both have
items and totals). What is the major difference in typical
e-commerce systems?
How should I model invoices for my scenario? Should I have
OrderLineItems for an Order AND InvoiceLineItems for an Invoice?
Some preliminary thoughts: I will have multiple invoices associated
with a certain order. Whenever the order changes the total, I have
to somehow calculate the difference and send a new/immutable invoice
to the customer. Then, the customer can pay and the payment will be
associated with the invoice.
Would love to hear some advice. Much appreciated. Thanks!
There are a lot of questions here, I'll try to address as many as I can. A lot of the issues are the business model rather than the data model.
Firstly, you are right that you need an immutable invoice. Once an invoice is created you cannot alter it, the standard way of handling a change in invoice is to issue a credit note and a new invoice.
Think about the relations between the tables: the Order does not need to hold the lineItems as these are referenced the other way, i.e.
Order
=====
orderId
customerId
status
OrderLineItem
=============
orderLineItemId
orderId
product
price
quantity
So to view the order you join the tables on orderId. There's also no need to store the total as that is calculated from the join.
Try not to duplicate your data: your invoice will reference orderLineItems, but not necessarily the same ones in the order. For example, the customer orders an A and a B, but B is out of stock. You ship A and create an invoice that references orderLineItemId for A. So your invoice tables might look like:
invoice
=======
invoiceId
status
invoiceLineItem
===============
invoiceId
orderLineItemId
quantity
In other words there's no need to have the details of the item. You may be wondering why you don't just add an invoiceId to the orderLineItem table - the reason is the customer might order 10 of item A, but you only ship 8 of them, the other two will go on a subsequent invoice.
Payments are not made against orders, they are against invoices. So your payments table should reference the invoiceId.
Then you touch on reconciliation. If everything was perfect, the customer would make a payment against a particular invoice, even if a partial payment. In reality this is going to be your biggest headache. Suppose the customer has several outstanding invoices for amounts x, y and z. If they make a payment of p, which do you allocate it against? Perhaps in date order, so that if p>x the remainder is allocated against y. What if p=z? Perhaps the customer is intending to pay z now but is disputing y and has mislaid x? How you deal with these sorts of things is up to you, but I can say that most invoicing systems do it very badly indeed.
I'm designing a database that will hold a list of transactions. There are two types of transactions, I'll name them credit (add to balance) and debit (take from balance).
Credit transactions most probably will have an expiry, after which this credit balance is no longer valid, and is lost.
Debit transactions must store from which credit transaction they come from.
There is always room for leniency with expiry dates. It does not need to be exact (till the rest of the day for example).
My friend and I have came up with two different solutions, but we can't decide on which to use, maybe some of you folks can help us out:
Solution 1:
3 tables: Debit, Credit, DebitFromCredit
Debit: id | time | amount | type | account_id (fk)
Credit: id | time | amount | expiry | amount_debited | accountId (fk)
DebitFromCredit: amount | debit_id (fk) | credit_id (fk)
In table Credit, amount_debited can be updated whenever a debit transaction occurs.
When a debit transaction occurs, DebitFromCredit holds information of which credit transaction(s) has this debit transaction been withdrawn.
There is a function getBalance(), that will get the balance according to expiry date, amount and amount_debited. So there is no physical storage of the balance; it is calculated every time.
There is also a chance to add a cron job that will check expired transactions and possibly add a Debit transaction with "expired" as a type.
Solution 2
3 tables: Transactions, CreditInfo, DebitInfo
Transactions: id | time | amount (+ or -) | account_id (fk)<br />
CreditInfo: trans_id (fk) | expiry | amount | remaining | isConsumed<br />
DebitInfo: trans_id (fk) | from_credit_id (fk) | amount<br />
Table Account adds a "balance" column, which will store the balance. (another possibility is to sum up the rows in transactions for this account).
Any transaction (credit or debit) is stored in table transactions, the sign of the amount differentiates between them.
On credit, a row is added to creditInfo.
On debit one or more rows are added to DebitInfo (to handle debiting from multiple credits, if needed). Also, Credit info row updates the column "remaining".
A cron job works on CreditInfo table, and whenever an expired row is found, it adds a debit record with the expired amount.
Debate
Solution 1 offers distinction between the two tables, and getting data is pretty simple for each. Also, as there is not a real need for a cron job (except if to add expired data as a debit), getBalance() gets the correct current balance. Requires some kind of join to get data for reporting. No redundant data.
Solution 2 holds both transactions in one table, with + and - for amounts, and no updates are occurring to this table; only inserts. Credit Info is being updated on expiry (cron job) or debiting. Single table query to get data for reporting. Some redundancy.
Choice?
Which solution do you think is better? Should the balance be stored physically or should it be calculated (considering that it might be updated with cron jobs)? Which one would be faster?
Also, if you guys have a better suggestion, we'd love to hear it as well.
Which solution do you think is better?
Solution 2. A transaction table with just inserts is easier for financial auditing.
Should the balance be stored physically or should it be calculated (considering that it might be updated with cron jobs)?
The balance should be stored physically. It's much faster than calculating the balance by reading all of the transaction rows every time you need it.
I am IT student that has passed a course called databases, pardon my inexperience.
I made this using MySQL workbench can send you model via email to you do not lose time recreating the model from picture.
This schema was made in 10 minutes. Its holding transactions for a common shop.
Schema explanation
I have a person who can have multiple phones and addresses.
Person makes transactions when he is making a transaction,
You input card name e.g. american express,
card type credit or debit (MySQL workbench does not have domains or constrains as power-designer as far as i know so i left field type as varchar) should have limited input of string debit or credit,
Card expiry date e.g. 8/12,
Card number e.g. 1111111111
Amount for which to decrease e.g. 20.0
time-stamp of transaction
program enters it when entering data
And link it to person that has made the transsaction
via person_idperson field.
e.g.
1
While id 1 in table person has name John Smith
What all this offers:
transaction is uniquely using a card. Credit or Debit cant be both cant be none.
Speed less table joins more speed for system has.
What program requires:
Continuous comparion of fields variable exactTimestamp is less then variable cardExpiery when entering a transaction.
Constant entering of card details.
Whats system does not have
Saving amount that is used in the special field, however that can be accomplished with Sql query
Saving amount that remains, I find that a personal information, What I mean is you come to the shop and shopkeeper asks me how much money do you still have?
system does not tie person to the card explicitly.
The person must be present and use the card with card details, and keeping anonymity of the person. (Its should be a query complex enough not to type by hand by an amateur e.g. shopkeeper) , if i want to know which card person used last time i get his last transaction and extract card fields.
I hope you think of this just as a proposition.
I'm designing a database application where data is going to change over time. I want to persist historical data and allow my users to analyze it using SQL Server Analysis Services, but I'm struggling to come up with a database schema that allows this. I've come up with a handful of schemas that could track the changes (including relying on CDC) but then I can't figure out how to turn that schema into a working BISM within SSAS. I've also been able to create a schema that translates nicely in to a BISM but then it doesn't have the historical capabilities I'm looking for. Are there any established best practices for doing this sort of thing?
Here's an example of what I'm trying to do:
I have a fact table called Sales which contains monthly sales figures. I also have a regular dimension table called Customers which allows users to look at sales figures broken down by customer. There is a many-to-many relationship between customers and sales representatives so I can make a reference dimension called Responsibility that refers to the customer dimension and a Sales Representative reference dimension that refers to the Responsibility dimension. I now have the Sales facts linked to Sales Representatives by the chain of reference dimensions Sales -> Customer -> Responsibility -> Sales Representative which allows me to see sales figures broken down by sales rep. The problem is that the Sales facts aren't the only things that change over time. I also want to be able to maintain a history of which Sales Representative was Responsible for a Customer at the time of a particular Sales fact. I also want to know where the Sale Representative's office was located at the time of a particular sales fact, which may be different than his current location. I might also what to know the size of a customer's organization at the time of a particular Sales fact, also which might be different than it is currently. I have no idea how to model this in an BISM-friendly way.
You mentioned that you currently have a fact table which contains monthly sales figures. So one record per customer per month. So each record in this fact table is actually an aggregation of individual sales "transactions" that occurred during the month for the corresponding dimensions.
So in a given month, there could be 5 individual sales transactions for $10 each for customer 123...and each individual sales transaction could be handled by a different Sales Rep (A, B, C, D, E). In the fact table you describe there would be a single record for $50 for customer 123...but how do we model the SalesReps (A-B-C-D-E)?
Based on your goals...
to be able to maintain a history of which Sales Representative was Responsible for a Customer at the time of a particular Sales fact
to know where the Sale Representative's office was located at the time of a particular sales fact
to know the size of a customer's organization at the time of a particular Sales fact
...I think it would be easier to model at a lower granularity...specifcally a sales-transaction fact table which has a grain of 1 record per sales transaction. Each sales transaction would have a single customer and single sales rep.
FactSales
DateKey (date of the sale)
CustomerKey (customer involved in the sale)
SalesRepKey (sales rep involved in the sale)
SalesAmount (amount of the sale)
Now for the historical change tracking...any dimension with attributes for which you want to track historical changes will need to be modeled as a "Slowly Changing Dimension" and will therefore require the use of "Surrogate Keys". So for example, in your customer dimension, Customer ID will not be the primary key...instead it will simply be the business key...and you will use an arbitrary integer as the primary key...this arbitrary key is referred to as a surrogate key.
Here's how I'd model the data for your dimensions...
DimCustomer
CustomerKey (surrogate key, probably generated via IDENTITY function)
CustomerID (business key, what you will find in your source systems)
CustomerName
Location (attribute we wish to track historically)
-- the following columns are necessary to keep track of history
BeginDate
EndDate
CurrentRecord
DimSalesRep
SalesRepKey (surrogate key)
SalesRepID (business key)
SalesRepName
OfficeLocation (attribute we wish to track historically)
-- the following columns are necessary to keep track of historical changes
BeginDate
EndDate
CurrentRecord
FactSales
DateKey (this is your link to a date dimension)
CustomerKey (this is your link to DimCustomer)
SalesRepKey (this is your link to DimSalesRep)
SalesAmount
What this does is allow you to have multiple records for the same customer.
Ex. CustomerID 123 moves from NC to GA on 3/5/2012...
CustomerKey | CustomerID | CustomerName | Location | BeginDate | EndDate | CurrentRecord
1 | 123 | Ted Stevens | North Carolina | 01-01-1900 | 03-05-2012 | 0
2 | 123 | Ted Stevens | Georgia | 03-05-2012 | 01-01-2999 | 1
The same applies with SalesReps or any other dimension in which you want to track the historical changes for some of the attributes.
So when you slice the sales transaction fact table by CustomerID, CustomerName (or any other non-historicaly-tracked attribute) you should see a single record with the facts aggregated across all transactions for the customer. And if you instead decide to analyze the sales transactions by CustomerName and Location (the historically tracked attribute), you will see a separate record for each "version" of the customer location corresponding to the sales amount while the customer was in that location.
By the way, if you have some time and are interested in learning more, I highly recommend the Kimball bible "The Data Warehouse Toolkit"...which should provide a solid foundation on dimensional modeling scenarios.
The established best practices way of doing what you want is a dimensional model with slowly changing dimensions. Sales reps are frequently used to describe the usefulness of SCDs. For example, sales managers with bonuses tied to the performance of their teams don't want their totals to go down if a rep transfers to a new territory. SCDs are perfect for tracking this sort of thing (and the situations you describe) and allow you to see what things looked like at any point historically.
Spend some time on Ralph Kimball's website to get started. The first 3 articles I'd recommend you read are Slowly Changing Dimensions, Slowly Changing Dimensions Part 2, and The 10 Essential Rules of Dimensional Modeling.
Here are a few things to focus on in order to be successful:
You are not designing a 3NF transactional database. Get comfortable with denormalization.
Make sure you understand what grain means and explicitly define the grain of your database.
Do not use natural keys as keys, and do not bake any intelligence into your surrogate keys (with the exception of your time keys).
The goals of your application should be query speed and ease of understanding and navigation.
Understand type 1 and type 2 slowly changing dimensions and know where to use them.
Make sure you have a sponsor on the business side with the power to "break ties". You will find different people in the organization with different definitions of the same thing, and you need an enforcer with the power to make decisions. To see what I mean, ask 5 different people in your organization to define "customer" or "gross profit". You'll be lucky to get 2 people to define either the same way.
Don't try to wing it. Read the The Data Warehouse Lifecycle Toolkit and embrace the ideas, even if they seem strange at first. They work.
OLAP is powerful and can be life changing if implemented skillfully. It can be an absolute nightmare if it isn't.
Have fun!
I am wondering what is the best way to make bank transaction table.
I know that user can have many accounts so I add AccountID instead of UserID, but how do I name the other, foreign account. And how do I know if it is incoming or outgoing transaction. I have an example here but I think it can be done better so I ask for your advice.
In my example I store all transactions in one table and add bool isOutgoing. So if it is set to true than I know that user sent money to ForeignAccount if it's false then I know that ForeignAccount sent money to user.
My example
Please note that this is not for real bank, of course. I am just trying things out and figuring best practices.
My opinion:
make the ID not null, Identity(1,1) and primary key
UserAccountID is fine. Dont forget to create the FK to the Accounts table;
You could make the foreignAccount a integer as well if every transaction is between 2 accounts and both accounts are internal to the organization
Do not create Nvarchar fields unless necessary (the occupy twice as much space) and don't create it 1024. If you need more than 900 chars, use varchar(max), because if the column is less than 900 you can still create an index on it
create the datetime columns as default getdate(), unless you can create transactions on a different date that the actual date;
Amount should be numeric, not integer
usually, i think, you would see a column to reflect DEBIT, or CREDIT to the account, not outgoing.
there are probably several tables something like these:
ACCOUNT
-------
account_id
account_no
account_type
OWNER
-------
owner_id
name
other_info
ACCOUNT_OWNER
--------------
account_id
owner_id
TRANSACTION
------------
transaction_id
account_id
transaction_type
amount
transaction_date
here you would get 2 records for transactions - one showing a debit, and one for a credit
if you really wanted, you could link these two transactions in another table
TRANSACTION_LINK
----------------
transaction_id1
transaction_id2
I'd agree with the comment about the isOutgoing flag - its far too easy for an insert/update to incorrectly set this (although the name of the column is clear, as a column it could be overlooked and therefore set to a default value).
Another approach for a transaction table could be along the lines of:
TransactionID (unique key)
OwnerID
FromAccount
ToAccount
TransactionDate
Amount
Alternatively you can have a "LocalAccount" and a "ForeignAccount" and the sign of the Amount field represents the direction.
If you are doing transactions involving multiple currencies then the following columns would be required/considered
Currency
AmountInBaseCcy
FxRate
If involving multiple currencies then you either want to have an fx rate per ccy combination/to a common ccy on a date or store it per transaction - that depends on how it would be calculated
I think what you are looking for is how to handle a many-tomany relationship (accounts can have multiple owners, owners can have mulitple accounts)
You do this through a joining table. So you have account with all the details needed for an account, you have user for all teh details needed for a user and then you have account USer which contains just the ids from both the other two tables.
I need a little help designing a database that a bar would use for drinks. My app handles drink orders in 2 ways: A customer can order drinks and pay right away, or they can start a tab. This is my table relationship so far:
Table Order: Table Tabs:
------------ ------------
[PK] OrderId - 1 [PK] TabId
ItemName / TabName
QtyOrdered / Total
TabId *- PaymentType
Archive
Its setup so 1 tab can have many drinks on it. Thats great for tabs, but customers can choose not to setup a tab. My question is, how can i modify these two tables to support that? Or do I have to create a new table?
Thanks in advance.
Your "item" is the order line-item.
"My brother and I will each have a Guiness and his wife would like a gin-and-tonic."
There are three (EDIT: items, but two) line-items on that order. To identify them as part of the same order [if you should need to do so], you would need two tables:
the OrderHeader
(orderheaderid, orderdatetime, customerid)
and
the OrderDetail
(id, orderheaderid, drinkid, howmany, extendedprice, paymentamountapplied).
If you do not need to identify the drinks as belonging to the same order, you could abandon the two-table structure, but then you could have only one kind of drink per order.
ORDER
orderid, orderdatetime, customerid, drinkid, howmany, extendedprice, paymentapplied.
Extended price is howmany * the drink's price at time of the order. You store this value in your order table so you can change the price of the drink in the DRINKS table without affecting the tab amount owed (in case the tab lifetime is longer than a day -- i.e. customers can run a monthly or quarterly tab).
HowMany can default to 1. The amount owed on a drink line-item is (extendedprice - paymentamountapplied). PaymentAmountApplied defaults to 0 (i.e. default is to run a tab).
If you do not need to track the kind of drink being ordered (i.e. you don't care to use your database to discover that on Tuesday nights you sell a lot more gins-and-tonic than Guinesses, for some reason):
ORDER
orderid, orderdatetime, customerid, ordertotal, paymentapplied.
Your barkeep would simply enter the total amount of the order, calculated outside your system, without reference to a DRINKS table in the database -- maybe the barkeep would look at a chalkboard on the wall.
You need a CUSTOMERS table, a DRINKS table, an ORDERSHEADER table, an ORDERDETAIL table (each drink on the tab is a line-item). You could/should have a separate PAYMENTS table. And you would allocate payments to the line-items of the ORDERDETAIL (your tab). The "tab" is the set of all line-items on the order(s) that have no payment applied to them.