Calculating the selling profit of different incoming goods - database

I'm making a retail distribution system for a store, where this store is the main distribution store that will distribute goods to other stores.
I have a problem. Let's say on January 10th an item comes in with a quantity of 10 at a price of $5/pc. Then on January 13th another 20 items come in at a price of $6/pc. Now I have 30 pcs of the item in the warehouse.
| Date   | Quantity | Price |
| ------ | -------- | ----- |
| 10 Jan | 10       | $5    |
| 13 Jan | 20       | $6    |
Then an order came in for a total of 25 pcs. Under the FIFO principle, I take the 10 items that came in on the 10th, then take 15 of the items that came in on the 13th.
What is the best way to calculate the profit margin, where each incoming batch has its own selling price depending on its purchase price?
I need help and advice on the database structure and logic for this system.
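A minimal sketch of one possible lot-based structure (all table and column names here are illustrative, not an established schema): each incoming batch is kept as its own row, and each sale line records which lot it was drawn from, so the margin can be computed per lot.

```sql
-- Illustrative sketch: each purchase batch is a separate lot row,
-- and each sale line records which lot it consumed (FIFO).
CREATE TABLE purchase_lot (
    lot_id        INT PRIMARY KEY,
    item_id       INT NOT NULL,
    received_date DATE NOT NULL,
    quantity      INT NOT NULL,           -- quantity received
    qty_remaining INT NOT NULL,           -- decreased as the lot is consumed
    unit_cost     DECIMAL(10,2) NOT NULL
);

CREATE TABLE sale_line (
    sale_line_id INT PRIMARY KEY,
    sale_id      INT NOT NULL,
    lot_id       INT NOT NULL REFERENCES purchase_lot (lot_id),
    quantity     INT NOT NULL,
    unit_price   DECIMAL(10,2) NOT NULL   -- selling price locked in at sale time
);

-- Profit per sale: selling price minus the cost of the lots consumed.
SELECT s.sale_id,
       SUM(s.quantity * (s.unit_price - p.unit_cost)) AS profit
FROM sale_line s
JOIN purchase_lot p ON p.lot_id = s.lot_id
GROUP BY s.sale_id;
```

The FIFO allocation itself (oldest lot with qty_remaining > 0 first) would be application logic; the 25-piece order above would then produce two sale_line rows, 10 pcs against the 10 Jan lot and 15 pcs against the 13 Jan lot, each carrying its own cost basis.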
Thank You

Related

How to deal with Variable data over time in associations

In linked models (let's say a drink transaction, a waiter, and a restaurant), when you want to display data, you look for information in your linked content:
Where was that beer bought?
Fetch Drink transaction => Fetch its Waiter => Fetch this waiter's Restaurant: this is where the beer was purchased.
So at time T, when I display all transactions, I fetch my data following associations, thus I can display this:

| TransactionID | Waiter | Restaurant      |
| ------------- | ------ | --------------- |
| 1             | Julius | Caesar's palace |
| 2             | Cleo   | Moe's tavern    |
Let's say now that my waiter is moved to another restaurant.
If I refresh this table, the result will be
| TransactionID | Waiter | Restaurant   |
| ------------- | ------ | ------------ |
| 1             | Julius | Moe's tavern |
| 2             | Cleo   | Moe's tavern |

But we know that transaction n°1 was made in Caesar's palace!
Solution 1
Don't modify the waiter Julius, but clone it.
Upside: I keep an association between models, and can still filter on every field of every associated model.
Downside: Every modification on every model duplicates content, which adds up to a LOT as time passes.
Solution 2
Keep a copy of the current state of your associated models when you create the transaction.
Upside : I don't duplicate the contents.
Downside: You can no longer use the fields of your content to display, sort or filter, as your original and real data lives inside, let's say, a JSON field. So if you use MySQL, you have to filter your data by making plain-search queries into that field.
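To make Solution 2 concrete, here is a minimal sketch (illustrative names; MySQL 5.7+ JSON syntax): the associated state is snapshotted into a JSON column when the transaction is created, and JSON path expressions allow at least basic filtering.

```sql
-- Illustrative sketch of Solution 2: snapshot the associated state
-- into a JSON column at the moment the transaction is created.
CREATE TABLE drink_transaction (
    id        INT PRIMARY KEY,
    waiter_id INT NOT NULL,
    snapshot  JSON NOT NULL  -- e.g. {"waiter": "Julius", "restaurant": "Caesar's palace"}
);

-- Filtering is still possible, but only through JSON path expressions:
SELECT id
FROM drink_transaction
WHERE snapshot->>'$.restaurant' = 'Caesar''s palace';
```

MySQL can also index a generated column extracted from the JSON, which softens this downside somewhat.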
What is your solution?
[EDIT]
The problem goes further, as it's not only a matter when association changes : a simple modification on an associated model causes a problem too.
What I mean :
What's the amount of this order?
Fetch Drink transaction => Fetch its product => Fetch this product's Price => Multiply by order quantity: this is the total amount of the order.
So at time T, when I display all transactions, I fetch my data following associations, thus I can display this:

| TransactionID | Qty | ProductId |
| ------------- | --- | --------- |
| 1             | 2   | 1         |

| ProductID | Title | Price |
| --------- | ----- | ----- |
| 1         | Beer  | 3     |

==> Amount of order n°1: 6.
Let's say now that the beer costs 2.50.
If I refresh this table, the result will be
| TransactionID | Qty | ProductId |
| ------------- | --- | --------- |
| 1             | 2   | 1         |

| ProductID | Title | Price |
| --------- | ----- | ----- |
| 1         | Beer  | 2.50  |

==> Amount of order n°1: 5.
So, once again, the 2 solutions are available: do I clone the beer product when its price is changed? Do I save a copy of beer in my order when the order is made? Do you have any third solution?
I can't just add an "amount" attribute to my orders: yes, it would solve that problem (partially), but it's not a scalable solution, as many other attributes will be in the same situation and I can't multiply attributes like this.
Event Sourcing
This is a good use case for Event Sourcing. Martin Fowler wrote a very good article about it, I advise you to read it.
there are times when we don't just want to see where we are, we also want to know how we got there.
The idea is to never overwrite data but instead create immutable transactions for everything you want to keep a history of. In your case you'll have WaiterRelocationEvents and PriceChangeEvents. You can recreate the state at any given time by applying every event in order.
If you don't use Event Sourcing, you lose information. Often it's acceptable to forget historic information, but sometimes it's not.
Lambda Architecture
As you don't want to recalculate everything on every single request, it's advisable to implement a Lambda Architecture. That architecture is often explained with BigData technology and frameworks, but you could implement it with Plain Old Java and CronJobs.
It consists of three parts: Batch Layer, Service Layer and Speed Layer.
The Batch Layer regularly calculates an aggregated version of the data, for example you'll calculate the monthly income once per day. So the current month's income will change every night until the month is over.
But now you want to know the income in real time. Therefore you add a Speed Layer, which will apply all events of the current date immediately. Now if a request for the current month's income arrives, you add up the last result of the Batch Layer and the Speed Layer.
The Service Layer allows more advanced queries by combining multiple batch results and the Speed Layer results into one query. For example, you can calculate the year's income by summing the monthly incomes.
But as said before, only use the Lambda approach if you need the data often and fast, because it adds extra complexity. Calculations which are rarely needed should be run on the fly. For example: which waiter creates the most income on Saturday evenings?
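A minimal sketch of the batch/speed combination in SQL, with illustrative table names (monthly_income is filled nightly by the batch job; order_events holds the raw events of the current day):

```sql
-- Current month's income: last night's batch result
-- plus today's events from the Speed Layer.
SELECT
    (SELECT COALESCE(SUM(income), 0)
       FROM monthly_income
      WHERE month = '2016-11')
  + (SELECT COALESCE(SUM(price * quantity), 0)
       FROM order_events
      WHERE event_date = CURRENT_DATE) AS income_so_far;
```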
Example
Restaurants:
| Timestamp | Id | Name |
| ---------- | -- | --------------- |
| 2016-01-01 | 1 | Caesar's palace |
| 2016-11-01 | 2 | Moe's tavern |
Waiters:
| Timestamp | Id | Name | FirstRestaurant |
| ---------- | -- | -------- | --------------- |
| 2016-01-01 | 11 | Julius | 1 |
| 2016-11-01 | 12 | Cleo | 2 |
WaiterRelocationEvents:
| Timestamp | WaiterId | RestaurantId |
| ---------- | -------- | ------------ |
| 2016-06-01 | 11 | 2 |
Products:
| Timestamp | Id | Name | FirstPrice |
| ---------- | -- | -------- | ---------- |
| 2016-01-01 | 21 | Beer | 3.00 |
PriceChangeEvent:
| Timestamp | ProductId | NewPrice |
| ---------- | --------- | -------- |
| 2016-11-01 | 21 | 2.50 |
Orders:
| Timestamp | Id | ProductId | Quantity | WaiterId |
| ---------- | -- | --------- | -------- | -------- |
| 2016-06-14 | 31 | 21 | 2 | 11 |
Now let's get all information about order 31.
get order 31
  get price of product 21 at 2016-06-14
    get last PriceChangeEvent before the date, or use FirstPrice if none exists
  calculate the total price by multiplying the retrieved price with the quantity
  get waiter 11
    get the waiter's restaurant at 2016-06-14
      get last WaiterRelocationEvent before the date, or use FirstRestaurant if none exists
    get the restaurant name by the retrieved restaurant id of the waiter
As you can see, it becomes complicated, so you should only keep a history of data that is actually useful.
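A sketch of the point-in-time price lookup from the steps above, against the example tables (MySQL-flavored syntax; COALESCE falls back to FirstPrice when no PriceChangeEvent precedes the date):

```sql
-- Price of product 21 as of 2016-06-14: the most recent
-- PriceChangeEvent on or before that date, else FirstPrice.
SELECT COALESCE(
    (SELECT e.NewPrice
       FROM PriceChangeEvent e
      WHERE e.ProductId = 21
        AND e.Timestamp <= '2016-06-14'
      ORDER BY e.Timestamp DESC
      LIMIT 1),
    (SELECT p.FirstPrice FROM Products p WHERE p.Id = 21)
) AS price_at_date;
```

For order 31 this returns 3.00, since the only price change happened later (2016-11-01), so the order's total is 2 × 3.00 = 6.00.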
I wouldn't involve the relocation events in the calculation. They could be stored, but I would store the restaurant id and the waiter id in the order directly.
The price history on the other hand could be interesting, to check whether orders went down after a price change. Here you could use the Lambda Architecture to calculate a full order with prices from the raw order and the price history.
Summary
Decide which data you want to keep the history of.
Implement Event Sourcing for that data.
Use the Lambda Architecture to speed up commonly used queries.
I like the question as it raises something very straightforward and also something more subtle.
The common principle in both cases is that ‘History must not change’, meaning that if we run a query over a specified past date range today, the results are the same as when we run that same query at any point in the future.
Waiters Case
When a waiter changes restaurants we must not change the history of sales. If waiter Julius sells a drink yesterday in restaurant 1 and then switches to sell more drinks today in restaurant 2, we must retain those details.
Thus we want to be able to answer queries such as ‘how many drinks has Julius sold in restaurant 1’ and ‘how many drinks has Julius sold in all restaurants’.
To achieve this you have to abstract away from Julius as a waiter by bringing in a concept of staff. Julius is a member of staff. Staff work as waiters. When working in restaurant 1 Julius is waiter A and when he works in another restaurant he is waiter B, but always the same member of staff – Julius. With an entity ‘Staff’ the queries can be answered easily.
Upside: No loss of historic data or excessive duplication.
Downside: A new entity Staff must be managed. But the waiter table content is reduced, making the net data-storage overhead low.
In summary - abstract data subject to change into a new entity and refer back to it from transactions.
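A minimal sketch of this abstraction (illustrative names): one stable staff row per person, one waiter row per restaurant assignment, and transactions pointing at the assignment.

```sql
-- Illustrative sketch: Julius is one staff row; each restaurant
-- assignment is a separate waiter row referencing that staff row.
CREATE TABLE restaurant (
    restaurant_id INT PRIMARY KEY,
    name          VARCHAR(100) NOT NULL
);

CREATE TABLE staff (
    staff_id INT PRIMARY KEY,
    name     VARCHAR(100) NOT NULL
);

CREATE TABLE waiter (
    waiter_id     INT PRIMARY KEY,
    staff_id      INT NOT NULL REFERENCES staff (staff_id),
    restaurant_id INT NOT NULL REFERENCES restaurant (restaurant_id)
);

CREATE TABLE drink_transaction (
    transaction_id INT PRIMARY KEY,
    waiter_id      INT NOT NULL REFERENCES waiter (waiter_id)
);

-- 'How many drinks has Julius sold in restaurant 1?'
SELECT COUNT(*)
FROM drink_transaction t
JOIN waiter w ON w.waiter_id = t.waiter_id
JOIN staff  s ON s.staff_id  = w.staff_id
WHERE s.name = 'Julius' AND w.restaurant_id = 1;
```

Dropping the restaurant_id condition answers the ‘all restaurants’ variant of the query.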
Value of Order Case
The extended use case regarding ‘what is the value of this order’ is more involved. I work in cross-currency transactions, where the value the observer (user) sees in the price list changes from day to day as currency fluctuations occur.
But there are good reasons to lock the order value in place. For example invoice processing systems have tolerance for a small difference between their expected invoice value and that of the submitted invoice, but any large difference can lead to late payment whilst invoice handlers check the issue. Also, if customers run reports on their historic purchases then the values of those orders must remain consistent despite fluctuations in currency rates over time.
The solution is to save into the order line:
the value of the product in the customer's currency,
or the rate between the customer and supplier currency,
but ideally do both, to avoid rounding errors.
What this does is provide a statement that ‘on the date this order was placed, line 1 cost $44.56 at an exchange rate of 1.1 $/£’. Having this data locked in allows you to invoice to the customer's expectation and provide consistent spend reports over time.
Upside: Consistent historic data. Fast database performance, as no look-ups are required against historic rate tables.
Downside: Some data duplication. However, trading this off against the overhead of storing and indexing historic rates, it is possibly an upside.
Regarding adding 'amount' to your order table - you have to do this if you want to achieve a consistent data history. If you only work in one currency, then amount is the only additional storage concern, and by adding this one attribute you have protected history. Your other alternative is to store a historic cost table for drinks, so you know that in January beer was $1, in February it was $1.10, etc., and then store the cost-table key in the transaction so that you can look up the cost if anyone asks about a historic order. But the overhead of storing the key PLUS the indexes needed to make this practicable will outweigh the storage cost of cloning 'amount' onto the order record.
In summary - clone cost data that will change over time.
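A minimal sketch of an order line that locks both values in place, as suggested above (illustrative names):

```sql
-- Illustrative sketch: the charged amount and the exchange rate
-- are cloned onto the order line when the order is placed.
CREATE TABLE order_line (
    order_line_id   INT PRIMARY KEY,
    order_id        INT NOT NULL,
    product_id      INT NOT NULL,
    quantity        INT NOT NULL,
    amount_customer DECIMAL(12,2) NOT NULL,  -- value in the customer's currency
    fx_rate         DECIMAL(12,6) NOT NULL   -- customer/supplier rate at order time
);

-- Historic spend reports never touch the rate tables again:
SELECT order_id, SUM(amount_customer) AS order_total
FROM order_line
GROUP BY order_id;
```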

SQL Server database design for evaluations

I'm designing this employee evaluation web page, and was wondering if my current database design is the correct one or if it could be improved.
This is my current design
Table Agenda:
+--------------+----------+----------+-----------+------+-------+-------+
| idEvaluation | Location | Employee | #Employee | Date | Date1 | Date2 |
+--------------+----------+----------+-----------+------+-------+-------+
Date is the date scheduled for the evaluation to be performed.
Date1 and Date2 define a period of time used to retrieve some metrics from another database.
Table Evaluations:
+--------------+---------+------------+------+----------+
| idEvaluation | Manager | Department | Date | Comments |
+--------------+---------+------------+------+----------+
Table Scores:
+--------------+----------+-------+
| idEvaluation | idFactor | Score |
+--------------+----------+-------+
idFactor relates to another table which contains the factor and a description of it. Like I said, is this a correct design?
My concern is this: currently there are 60 employees, 11 managers and 12 factors, and each employee is evaluated twice a year by every manager. In the Agenda table there's not much trouble, since there's only one record per evaluation (60 employees = 60 records). However, in the Evaluations table there are 11 records for every evaluation, so it grows to 660 records (60 employees * 11 managers = 660), and the Scores table gets even bigger, since there are 12 factors for every evaluation: 7,920 records (660 evaluations * 12 factors each = 7,920).
Is this normal? Am I doing it wrong? Any input is appreciated.
EDIT
Location, Employee, #Employee, Manager and Department are loaded automatically by the vb.net page; they are "imported" from an Active Directory and checked before insertion, so duplicate names, misspelled names, and that sort of thing are not an issue.
The main idea is that you don't want to repeat string literals.
So if you have

| id | Department |
| -- | ---------- |
| 1  | Sales      |
| 2  | IT         |
| 3  | Admin      |
Instead of repeating "Sales" many times, you only use the id 1, which is smaller, so things also get faster.
Second, if you have users

| id | user           |
| -- | -------------- |
| 1  | Jhon Alexander |
| 2  | Maria Jhonson  |
If Jhon decides to change his name, you will have to check all tables and change the name everywhere. There is also the problem that if two people have the same name, you won't know which one you are evaluating.
So go for separate tables and use the ID.
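A minimal sketch of the separated tables with foreign keys (illustrative names; "user" is often a reserved word, hence app_user):

```sql
-- Illustrative sketch: string literals live once in lookup tables,
-- and evaluations reference them by id.
CREATE TABLE department (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE app_user (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE evaluation (
    id            INT PRIMARY KEY,
    user_id       INT NOT NULL REFERENCES app_user (id),
    department_id INT NOT NULL REFERENCES department (id)
);

-- A rename is now a single-row update instead of a scan of every table:
UPDATE app_user SET name = 'John Alexander' WHERE id = 1;
```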

How to design a Db table for attendance

I am currently working on a school management system but can't seem to figure out the best way to design my student attendance table.
INFO
School runs for 14 weeks and class holds 5 times a week. Students in the school can number up to 2000 per term, meaning attendance records can reach 14 x 5 x 2000 = 140,000 per term.
I am developing the application for a desktop using VB.Net and MS Access.
PROGRESS SO FAR
I have so far designed something that I am skeptic about.
table name: attendance
| id | std_id | att_week | att_date  | status |
| -- | ------ | -------- | --------- | ------ |
| 1  | 0001   | 1        | 29/9/2015 | yes    |
| 2  | 0002   | 1        | 29/9/2015 | yes    |
I easily found out that designing it like this can yield 140,000 rows in a term.
I also thought of making the week days column names, but that would easily result in 14 x 5 = 70 columns.
What is the best way to design this table?
Friend, I think you should construct your table like this:
The table would accept only the absentees:
| id | student_id | class | date       |
| -- | ---------- | ----- | ---------- |
| 1  | 11         | 7a    | 11/11/2020 |
| 2  | 21         | 6b    | 10/12/2020 |
and so on.....
You could easily retrieve details like:
1] total absentees per class
2] total absences of a student in a date range
3] a per-day attendance report for a student, easily prepared from this data
Also, this would be extremely fast due to the smaller number of records, especially if you index on the class column and partition the tables by date range.
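A sketch of the first two reports against such an absentee-only table (names from the table above; date-literal and bracket-quoting syntax varies between MS Access, SQL Server and other engines):

```sql
-- 1] Total absentees per class:
SELECT class, COUNT(*) AS total_absent
FROM attendance
GROUP BY class;

-- 2] Total absences of one student in a date range:
SELECT COUNT(*) AS days_absent
FROM attendance
WHERE student_id = 11
  AND [date] BETWEEN '2020-11-01' AND '2020-12-31';
```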
Thank You!

How to optimize large database requests

I am working with a database that contains information (measurements) about ships. The ships send an update with their position, fuel use, etc. So an entry in the database looks like this
| measurement_id | ship_id | timestamp | position | fuel_use |
| key | f_key | dd-mm-yy hh:ss| lat-lon | in l/km |
A new one of these entries gets added for every ship every second so the amount of entries in the database gets large very fast.
What I need for the application I am working on is not the information for one second but rather cumulative data for 1 minute, 1 day, or even 1 year. For example the total fuel use over a day, the distance traveled in a year, or the average fuel use per day over a month.
Getting and calculating that from this raw data is unfeasible: you would have to fetch 31.5 million records from the server just to calculate the distance traveled in a year.
What I thought would be the smart thing to do is to combine entries into one bigger entry. For example, take 60 measurements and combine them into one 1-minute measurement entry in a separate table, by averaging the fuel use and summing the distance traveled between consecutive entries. A minute entry would then look like this.
| min_measurement_id | ship_id | timestamp | position | distance_traveled | fuel_use |
| new key |same ship| dd-mm-yy hh| avg lat-lon | sum distance_traveled | avg fuel_use |
This process could then be repeated to work with hours, days, months and years. This way a query for a week could be answered by requesting only 7 entries, or 168 entries if I want hourly detail. Those look like far more usable numbers to me.
The new tables can be filled by querying the original database every 10 minutes, that data then fills the minute table, which in turn updates the hours table, etc.
However this seems to be a lot of management and duplication of almost the same data, with constantly the same operation being done.
So what I am interested in is whether there is some way of structuring this data. Could it be organized hierarchically (after all, seconds, minutes and days are pretty hierarchical), or are there other ways to optimize this?
This is the first time I am using a database this size so I also did not really know what to look for on the internet.
Aggregates are common in data warehouses so your approach to group data is fine. Yes, you are duplicating some of the data, but you'll get the speed benefit.
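A minimal sketch of the minute-level rollup described in the question (PostgreSQL-flavored; other engines have similar date-truncation functions; names are illustrative, and summing the distance between consecutive positions would need a window function or an application-side pass, so it is omitted here):

```sql
-- Roll raw per-second measurements up into per-minute rows.
INSERT INTO minute_measurements (ship_id, minute_start, avg_fuel_use)
SELECT ship_id,
       date_trunc('minute', "timestamp") AS minute_start,
       AVG(fuel_use)                     AS avg_fuel_use
FROM measurements
GROUP BY ship_id, date_trunc('minute', "timestamp");
```

The same pattern repeats upward: the hour table is filled from the minute table, the day table from the hour table, and so on.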

Database schema design for financial forecasting

I need to develop a web app that allows companies to forecast financials.
The app has different screens: one for defining employee salaries, another for sales projections, etc.
Basically, turn an Excel financial forecast model into an app.
The question is: what would be the best way to design the database so that financial reports (e.g. a profit and loss statement or balance sheet) can be quickly generated?
Assuming the forecast period is 5 years, would you have a table with 5 years * 12 months = 60 fields per row? Is that performant enough?
Would you use DB triggers to recalculate salary expenses whenever a single employee's data is changed?
I'd think it would be better to store each month's forecast in its own row in a table that looks like this
month forecast
----- --------
1 30000
2 31000
3 28000
... ...
60 52000
Then you can use the aggregate functions to calculate forecast reports, discounted cash flow, etc. For example, if you want the un-discounted total for just 4 years:
SELECT SUM(forecast) FROM FORECASTS WHERE month >= 1 AND month <= 48
For salary expenses, I would think that a view that does the calculations on the fly (or a "materialized view", if your DB engine supports them) should have sufficient performance, unless we're talking about a giant number of employees or a really slow DB.
Maybe have a salary history table that a trigger populates when employee data changes or payroll runs:
employeeId month Salary
---------- ----- ------
1 1 4000
2 1 3000
3 1 5000
1 2 4100
2 2 3100
3 2 4800
... ... ...
Then again, you can use SUM or other aggregate functions to get to the reported data.
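A sketch of such an aggregate over the salary history table above (columns from that table; the table name salary_history is an assumption):

```sql
-- Total salary expense per forecast month, straight off the history table:
SELECT month, SUM(Salary) AS total_salary_expense
FROM salary_history
GROUP BY month
ORDER BY month;
```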
