Is there a way to duplicate a formula with a circular reference from an Excel file into SQL Server? My client uses an Excel file to calculate a Selling Price. The Selling Price field is Costs / (1 - Projected Margin) = 6.5224 / (1 - 0.6) = 16.3060. One of the numbers that goes into the costs is commission, which is defined as Selling Price times a commission rate.
Costs = 6.5224
Projected Margin = 60%
Commissions = 16.3060(Selling Price) * .10(Commission Rate) = 1.6306 (which is part of the 6.5224)
They get around the circular reference issue in Excel because Excel lets them check an Enable Iterative Calculation option, which stops the iterations after 100 passes.
Is this possible using SQL Server 2005?
Thanks
Don
This is a business problem, not an IT one, so it follows that you need a business solution, not an IT one. It doesn't sound like you're working for a particularly astute customer. Essentially, you're feeding the commission back into the costs and recalculating commission 100 times. So the salesman is earning commission based on their commission?!? Seriously? :-)
I would try persuading them to calculate costs and commissions separately. In professional organisations with good accounting practices where I've worked before, these costs are often broken down into operating and non-operating or raw materials costs, which should improve their understanding of their business. To report total costs later on, add commission and raw materials costs. No circular loops and good accounting reports.
At banks where I've worked these costs are often called things like Cost (no commissions or fees), Net Cost (Cost + Commission) and then, bizarrely, Net Net Cost (Cost + Commission + Fees). Depending on the business model, cost breakdowns can get quite interesting.
Here are 2 sensible options you might suggest for them to calculate the selling price.
Option 1: If you're going to calculate margin to exclude commission then
Price before commission = Cost + (Cost * (1 - Projected Margin))
Selling price = Price before commission + (Price before commission * Commission)
Option 2: If your client insists on calculating margin to include commission (which it sounds like they might want to do) then
Cost price = Cost + (Cost * Commission)
Profit per Unit or Contribution per Unit = Cost price * (1-Projected Margin)
Selling Price = Cost Price + Profit per Unit
This is sensible in accounting terms and a doddle to implement with SQL or any other software tool. It also means your customer has a way of analysing their sales to highlight per unit costs and per unit profits when the projected margin is different per product. This invariably happens as the business grows.
Don't blindly accept calculations from spreadsheets. Think them through and don't be afraid to ask your customer what they're trying to achieve. All too often broken business processes make it as far as the IT department before being called into question. Don't be afraid of doing a good job and that sometimes means challenging customer requests when they don't make sense.
Good luck!
No, it is not possible
mysql> select 2+a as a;
ERROR 1054 (42S22): Unknown column 'a' in 'field list'
SQL expressions can only refer to columns that already exist, not to aliases defined in the same select list.
You cannot even write
mysql> select 2 as a, 2+a as b;
ERROR 1054 (42S22): Unknown column 'a' in 'field list'
The way to look at databases is as transactional engines that take data from one state into another state in one step (with combination of operators that operate not only on scalar values, but also on sets).
Whilst I agree with @Sir Wobin's answer, if you do want to write some recursive code, you may be able to do it by abusing Recursive Common Table Expressions:
with RecurseCalc as (
select CAST(1.5 as float) as Value,1 as Iter
union all
select 2 * Value,1+Iter from RecurseCalc where Iter < 100
), FinalResult as (
select top 1 Value from RecurseCalc order by Iter desc
)
select * from FinalResult option (maxrecursion 100)
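For the specific formula in the question, iteration may not be needed at all. The sketch below assumes that commission is the only cost that depends on the selling price and that margin plus commission rate is below 1; under those assumptions price = (base cost + rate * price) / (1 - margin) can be solved directly as price = base cost / (1 - margin - rate). The function names are made up for illustration, and the base cost is the question's 6.5224 minus its 1.6306 commission.

```python
def selling_price_iterative(base_cost, margin, commission_rate, iterations=100):
    """Repeat the spreadsheet's circular formula a fixed number of times."""
    price = 0.0
    for _ in range(iterations):
        commission = price * commission_rate
        price = (base_cost + commission) / (1.0 - margin)
    return price


def selling_price_closed_form(base_cost, margin, commission_rate):
    """Solve price = (base_cost + rate * price) / (1 - margin) for price."""
    return base_cost / (1.0 - margin - commission_rate)


# Costs from the question with the commission part taken out (assumption:
# everything else in the 6.5224 is independent of the selling price).
base_cost = 6.5224 - 1.6306

iterated = selling_price_iterative(base_cost, 0.60, 0.10)
closed = selling_price_closed_form(base_cost, 0.60, 0.10)
print(round(iterated, 4), round(closed, 4))
```

Both values come out at roughly the 16.3060 from the spreadsheet, and the closed form is a plain expression that translates directly into a single SQL column.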
I need your help with a task that I have undertaken and am having difficulties with.
So, I have to calculate the NET amount of sales for some products, which were sold in different cities in different years, and for this reason a different tax rate applies.
Specifically, I have a dimension table (Dim_Cities) which consists of the cities in which the products can be sold.
i.e
Dim_Cities:
CityID, CityName, Area, District.
Dim_Cities:
1, "Athens", "Attiki", "Central Greece".
Also, I have a file/table which consists of the following information:
i.e
[SalesArea]
,[EffectiveFrom_2019]
,[EffectiveTo_2019]
,[VAT_2019]
,[EffectiveFrom_2018]
,[EffectiveTo_2018]
,[VAT_2018]
,[EffectiveFrom_2017]
,[EffectiveTo_2017]
,[VAT_2017]
,[EffectiveFrom_2016_Semester1]
,[EffectiveTo_2016_Semester1]
,[VAT_2016_Semester1]
,[EffectiveFrom_2016_Semester2]
,[EffectiveTo_2016_Semester2]
,[VAT_2016_Semester2]
i.e
"Athens", "2019-01-01", "2019-12-31", 0.24,
"2018-01-01", "2018-12-31", 0.24,
"2017-01-01", "2017-12-31", 0.17,
"2016-01-01", "2016-05-31", 0.16,
"2016-06-01", "2016-12-31", 0.24
And of course there is a fact table that holds all the information,
i.e
FactSales_ID, CityID, SaleAmount (with VAT), SaleDate_ID.
The question is how to compute, for every city, the "TAX-Free SalesAmount" that corresponds to each particular sale date. In other words, I think that I have to create a function that computes the NET amount every time, subtracting in each case the corresponding tax rate, based on the date and city that it finds. Can anyone help me or guide me to achieve this please?
I'm not sure if you are asking how to query your data to produce this result or how to design your data warehouse to make this data available - but I'm hoping you are asking about how to design your data warehouse as this information should definitely be pre-calculated and held in your DW rather than being calculated every time anyone wants to report on the data.
One of the key points of building a DW is that all the complex business logic should be handled in the ETL (as much as possible) so that the actual reporting is simple; the only calculations in a reporting process are those that can't be pre-calculated.
If your CITY Dim is SCD2 (or could be made to be SCD2) then I would add the VAT rate as an attribute to that Dim - otherwise you could hold VAT Rate in a "worker" table.
When your ETL loads your Fact table you would use the VAT rate on the CITY Dim (or in the worker table) to calculate the Net and Gross amounts and hold both as measures in your fact table.
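As a rough sketch of the lookup such an ETL step would perform: the code below assumes the wide per-year columns have been unpivoted into one row per sales area and validity period (only the 2017 to 2019 sample rows are used), and that the net amount is gross / (1 + rate) rounded half-up to two decimals. The names `VAT_RATES`, `vat_rate_for` and `net_amount`, and the rounding rule, are assumptions for illustration.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

# (sales area, effective from, effective to, VAT rate), from the sample row
VAT_RATES = [
    ("Athens", date(2019, 1, 1), date(2019, 12, 31), Decimal("0.24")),
    ("Athens", date(2018, 1, 1), date(2018, 12, 31), Decimal("0.24")),
    ("Athens", date(2017, 1, 1), date(2017, 12, 31), Decimal("0.17")),
]


def vat_rate_for(area, sale_date):
    """Return the rate whose validity period contains the sale date."""
    for rate_area, start, end, rate in VAT_RATES:
        if rate_area == area and start <= sale_date <= end:
            return rate
    raise LookupError(f"no VAT rate for {area} on {sale_date}")


def net_amount(gross, area, sale_date):
    """Strip VAT from a tax-inclusive amount and round to two decimals."""
    rate = vat_rate_for(area, sale_date)
    net = gross / (Decimal("1") + rate)
    return net.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(net_amount(Decimal("117.00"), "Athens", date(2017, 3, 15)))
```

In a warehouse the same range condition would typically sit in the join between the fact load and the rate table, with the result stored as a Net measure next to the Gross one.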
I want to store trades as well as best ask/bid data, where the latter updates much more rapidly than the former, in InfluxDB.
I want to, if possible, use a schema that allows me to query: "for each trade on market X, find the best ask/bid on market Y whose timestamp is <= the timestamp of the trade".
(I'll use any version of Influx.)
For example, trades might look like this:
Time Price Volume Direction Market
00:01.000 100 5 1 foo-bar
00:03.000 99 50 0 bar-baz
00:03.050 99 25 0 foo-bar
00:04.000 101 15 1 bar-baz
And tick data might look more like this:
Time Ask Bid Market
00:00.763 100 99 bar-baz
00:01.010 101 99 foo-bar
00:01.012 101 98 bar-baz
00:01.012 101 99 foo-bar
00:01.238 100 99 bar-baz
...
00:03.021 101 98 bar-baz
I would want to be able to somehow join each trade for some market, e.g. foo-bar, with only the most recent ask/bid data point on some other market, e.g. bar-baz, and get a result like:
Time Trade Price Ask Bid
00:01.000 100 100 99
00:03.050 99 101 98
Such that I could compute the difference between the trade price on market foo-bar and the most recently quoted ask or bid on market bar-baz.
Right now, I store trades in one time series and ask/bid data points in another and merge them on the client side, with logic along the lines of:
function merge(trades, quotes, data_points)
    next_trade, more_trades = first(trades), rest(trades)
    quotes = drop-while (quote.timestamp < next_trade.timestamp) quotes
    data_point = join(next_trade, first(quotes))
    if more_trades
        return merge(more_trades, quotes, data_points + data_point)
    return data_points + data_point
The problem is that the client has to discard tons of ask/bid data points because they update so frequently, and only the most recent update before the trade is relevant.
There are tens of markets whose most recent ask/bid I might want to compare a trade with, otherwise I'd simply store the most recent ask/bid in the same series as the trades.
Is it possible to do what I want to do with Influx, or with another time series database? An alternative solution that produces lower quality results is to group the ask/bid data by some time interval, say 250ms, and take the last from each interval, to at least impose an upper bound on the amount of quotes the client has to drop before finding the one that's closest to the next trade.
NB. Just a clarification on InfluxDB terminology. You're probably storing trade and tick data in different measurements (analogous to tables). A series is a subdivision within a measurement based on tag values, e.g.
Time Ask Bid Market
00:00.763 100 99 bar-baz
is one series
Time Ask Bid Market
00:01.010 101 99 foo-bar
is another series(assuming you are storing Market name/id as a tag and not a field)
Answer
InfluxQL https://docs.influxdata.com/influxdb/v1.7/query_language/spec/ - I can't think of a way to achieve what you need with InfluxQL (Influx Query Language) as it does not support joins.
Perhaps what you could do on the client side, instead of requesting all tick data for a period and discarding most of it, is make a request per trade and market to get exactly the ask/bid data point that you need (the most recent one with respect to the trade). Something like:
function merge(trades, market)
    points = <empty list>
    for next_trade in trades
        quote = db.query("select last(ask), last(bid) from tick_data where time<=next_trade.timestamp and Market=market and time>next_trade.timestamp - 1m")
        // or to get a list per market with one query
        // quote_per_market = db.query("select last(ask), last(bid) from tick_data where time<=next_trade.timestamp group by Market")
        points = points + join(next_trade, quote)
    return points
Of course you'd have the overhead of querying the database more frequently but depending on the number of trades and your resource constraints it may be more efficient. NB. A potential pitfall here is that ask and bid retrieved this way are not retrieved as a pair but independently and while they are returned as a pair it could happen that they have different timestamps. If for some timestamp for some reason you only have an ask or a bid price you might run into this problem. However, as long as you write them in pairs and have no missing data it should be ok.
Flux https://www.influxdata.com/products/flux/ - Flux is a more sophisticated query language that is part of InfluxDB 1.7 and 2 that allows you to do joins and operations across different measurements. I can't give you any examples yet but it's worth having a look at.
Other (relational) time series DBs that you could look at, and that would also allow you to do joins, are CrateDB https://crate.io/ or Postgres + TimescaleDB https://www.timescale.com/products
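For reference, the "most recent quote at or before each trade" match that the question asks for is often called an as-of join. A minimal client-side sketch, using only the sample rows shown in the question (the elided quotes are left out) and timestamps converted to integer milliseconds, could look like this; `as_of_join` is a made-up name, and the quotes are assumed to be sorted by time.

```python
from bisect import bisect_right

# (time in ms, price) for trades on foo-bar, from the question's sample data
trades = [(1000, 100), (3050, 99)]
# (time in ms, ask, bid) for quotes on bar-baz, sorted by time
quotes = [(763, 100, 99), (1012, 101, 98), (1238, 100, 99), (3021, 101, 98)]


def as_of_join(trades, quotes):
    """Pair each trade with the latest quote whose time is <= the trade time."""
    quote_times = [t for t, _, _ in quotes]
    rows = []
    for trade_time, price in trades:
        # bisect_right gives the insertion point after any equal timestamps,
        # so the element just before it is the latest quote <= trade_time.
        idx = bisect_right(quote_times, trade_time) - 1
        if idx < 0:
            continue  # no quote at or before this trade
        _, ask, bid = quotes[idx]
        rows.append((trade_time, price, ask, bid))
    return rows


print(as_of_join(trades, quotes))
# [(1000, 100, 100, 99), (3050, 99, 101, 98)]
```

The binary search avoids scanning every quote per trade, but it does not reduce the amount of tick data transferred; that still needs a server-side join or the per-trade queries described above.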
I am creating some db model for rental invoice generation.
The invoice consists of N booking time ranges.
Each booking belongs to a price model. A price model is a set of rules which determine a final price (base price + season price + quantity discount + ...).
That means the final price for the N bookings within an invoice can be a complex calculation, and of course I want to keep track of every aspect of the final price calculation for later review of an invoice.
The problem is, that a price model can change in the future. So upon invoice generation, there are two possibilities:
(a) Never change a price model. Just make it immutable by versioning it and refer to a concrete version from an invoice.
(b) Put all the price information, discounts and extras into the invoice. That would mean a lot of data, as an invoice contains N bookings which may be partly in the range of a season price.
Basically, I would break down each booking into its days and for each day I would have N rows calculating the base price, discounts and extra fees.
Possible table model:
Invoice
    id: int
InvoiceBooking # Each booking. One invoice has N bookings
    id: int
    invoiceId: int
    (other data, e.g. guest information)
InvoiceBookingDay # Days of a booking. Each booking has N days
    id: int
    invoiceBookingId: int
    date: date
InvoiceBookingDayPriceItem # Concrete discounts, etc. One day has many items
    id: int
    invoiceBookingDayId: int
    price: decimal
    title: string
My question is, which way should I prefer and why.
My considerations:
With solution (a), the invoice would be re-calculated using the price model information each time the data is viewed. I don't like this, as algorithms can change. It does not feel natural for the "read-only" nature of an invoice.
Also the version handling of price models is not a trivial task and the user needs to know about the version concept, which adds application complexity.
With solution (b), I generate a bunch of nested data and it adds a lot of complexity to the schema.
Which way would you prefer? Am I missing something?
Thank you
There is a third option which I recommend. I call it temporal (time) versioning and the layout of the table is really quite simple. You don't describe your pricing data so I'll just show a simple example.
Table: DailyPricing
ID EffDate Price ...
A 01/01/2015 17.50 ...
B 01/01/2015 20.00 ...
C 01/01/2015 22.50 ...
B 01/01/2016 19.50 ...
C 07/01/2016 24.00 ...
This shows that all three price schedules (A, B and C just represent whatever method you use to distinguish between price levels) were given a price on Jan 1, 2015. On Jan 1, 2016, the price of plan B was reduced. In July, the price of plan C was increased.
To get the current price of a plan, the query is this:
select dp.Price
from DailyPricing dp
where dp.ID = 'A'
and dp.Effdate =(
select Max( dp2.EffDate )
from DailyPricing dp2
where dp2.ID = dp.ID
and dp2.EffDate <= :DateOfInterest);
The DateOfInterest variable would be loaded with the current date/time. This query returns the one price that is currently in effect. In this case, the price set Jan 1, 2015 as that has never changed since taking effect. If the search had been for plan B, the price set on Jan 1, 2016 would have been returned and for plan C, the price set on July 1, 2016. These are the latest prices set for each plan; that is, the current prices.
Such a query would more likely be in a join with probably the invoice table so you could perform the price calculation.
select ...
from Invoices i
join DailyPricing dp
on dp.ID = i.ID
and dp.Effdate =(
select Max( dp2.EffDate )
from DailyPricing dp2
where dp2.ID = dp.ID
and dp2.EffDate <= i.InvoiceDate )
where i.ID = 1234;
This is a little more complex than a simple query but you are asking for more complex data (or, rather, a more complex view of the data). However, this calculation is probably only executed once and the final price stored back in to the invoice data or elsewhere.
It would be calculated again only if the customer made some changes or you were going through an audit, rechecking the calculation for accuracy.
Notice something, however, that is subtle but very important. If the query above were being executed for an invoice that had just been created, the InvoiceDate would be the current date and the price returned would be the current price. If, however, the query was being run as a verification on an invoice that was two years old, the InvoiceDate would be two years ago and the price returned would be the price that was in effect two years ago.
In other words, the query to return current data and the query to return past data is the same query.
That is because current data and past data remain in the same table, differentiated only by the date the data takes effect. This, I think, is about the simplest solution to what you want to do.
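To make the effective-date lookup concrete, here is a small sketch that loads the sample DailyPricing rows into an in-memory SQLite database and asks for the price in effect on a given date. It picks the latest EffDate on or before the date of interest; storing dates as ISO strings and the helper name `price_as_of` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DailyPricing (ID TEXT, EffDate TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO DailyPricing VALUES (?, ?, ?)",
    [
        ("A", "2015-01-01", 17.50),
        ("B", "2015-01-01", 20.00),
        ("C", "2015-01-01", 22.50),
        ("B", "2016-01-01", 19.50),
        ("C", "2016-07-01", 24.00),
    ],
)


def price_as_of(plan_id, date_of_interest):
    """Return the price of a plan in effect on the given ISO date, or None."""
    row = conn.execute(
        """
        SELECT dp.Price
        FROM DailyPricing dp
        WHERE dp.ID = :id
          AND dp.EffDate = (
              SELECT MAX(dp2.EffDate)
              FROM DailyPricing dp2
              WHERE dp2.ID = dp.ID
                AND dp2.EffDate <= :d)
        """,
        {"id": plan_id, "d": date_of_interest},
    ).fetchone()
    return row[0] if row else None


# Plan B before and after its price change on Jan 1, 2016
print(price_as_of("B", "2015-06-15"), price_as_of("B", "2016-03-01"))
```

The same function answers both "what is the price now" and "what was the price two years ago"; only the date passed in changes.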
How about A and B?
It's not best practice to re-calculate any component of an invoice, especially if the component was printed. An invoice and invoice details should be immutable, and you should be able to reproduce it without re-calculating.
If you ever have a problem with figuring out how you got to a certain amount, or if there is a bug in your program, you'll be glad you have the details, especially if the calculations are complex.
Also, it's a good idea to keep a history of your pricing models so you can validate how you got to a certain price. You can make this simple for your users. They don't have to see the history, but you should record their changes in the history log.
I am looking to calculate something in the calc script so that I can allocate a row from a fact table to a dimension member.
The business scenario is the following. I have a fact table that records customer credits and debits (a customer can take out a lot of small loans) and a Customer dimension. I want to classify each customer based on his history of credits and debits over a given period. The classification of a customer changes over time.
Example
The rule is: if a customer's balance (for a given period) is over - 50 000, the classification is "large"; if he has more than one record and has made a payment in the last 3 months, he is "P&P"; if he doesn't owe any money and has made a payment in the last 3 months, he is "regular".
My question is more about direction than specific code: which way is the best to implement this kind of rule?
Best Regards
Vincent Diallo-Nort
I'd create a fact table holding a balance status that is updated automatically every day:
take yesterday's rolling balance plus today's records;
when the balance = 0, remove the record.
Also add a flow fact table with payments only.
Add measures:
LastChild aggregation for the first fact table.
Sum aggregation for the second fact table.
Once that's done, you can apply an MDX calculation:
case
    when [Measures].[Balance] > 50000
    then "Large"
    when [Measures].[Payments] + ([Date].[Calendar].CurrentMember.Lag(1),[Measures].[Payments]) + ([Date].[Calendar].CurrentMember.Lag(2),[Measures].[Payments]) > 0
    then "P&P"
    else "Regular"
end
To give you a more detailed answer, I'd need more information about your data structure.
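Stripped of the cube mechanics, the CASE expression above boils down to the following decision order. This is only a sketch of that logic with made-up names; it assumes the balance is the period-end balance and that the payments list holds the current period plus the two before it.

```python
def classify(balance, payments_last_3_periods):
    """Mirror the CASE expression: balance check first, then recent payments."""
    if balance > 50000:
        return "Large"
    if sum(payments_last_3_periods) > 0:
        return "P&P"
    return "Regular"


print(classify(60000, [0, 0, 0]))   # balance above the threshold
print(classify(1200, [0, 150, 0]))  # a payment in the last three periods
print(classify(0, [0, 0, 0]))       # neither condition holds
```

Because the first matching branch wins, a customer above the balance threshold is "Large" regardless of payment history, which is worth confirming against the business rule.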
Here in South Africa we have Value Added Tax (VAT) which is pretty much identical to Sales Tax and is currently fixed at 14%, but could change at any time.
I need to include VAT on invoices (which are immutable) consisting of several Invoice Lines. Each line references a Product with a Boolean property, IsTaxable, and almost all products are taxable.
I don't want to store pre-tax prices in the database, because that just makes it hard to read the real price that the customer is going to pay and everywhere I display those prices, I then have to remember to add tax. And when the VAT rate does change, for this particular business, it is undesirable for all prices to change automagically.
So I reckon a reverse tax calculation is the way to go and probably not uncommon. The invoice total is the sum of all invoice line totals, which includes any line discounts and should be tax-inclusive. Therefore the invoice total itself is tax-inclusive:
TaxTotal = InvoiceTotal * TaxRate / (1 + TaxRate),
where InvoiceTotal is tax-inclusive and TaxRate == 0.14
Since invoices cannot be changed once issued (they are immutable), should I:
1. Store a single Tax amount in my Invoices table that does not change? Or...
2. Store a tax amount for each invoice line and calculate the invoice tax total every time I display the invoice?
Option 2 seems safer from a DBA point-of-view since if an invoice is ever manually changed, then Tax will be calculated correctly, but if the invoice has already been issued, this still presents a problem of inconsistency. If I stick with option 1, then I cannot display tax for a single line item, but it makes managing the tax total and doing aggregate calculations easier, though it also presents inconsistency if ever changed.
I can't do both since that would be duplicating data.
Which is the right way to go? Or is a reverse tax calculation a really bad idea?
Store the pre-tax value in the database; you can also store the with-tax value and use that for most use cases.
The big problem I foresee is the rounding rules for VAT on invoices. These (at least in the UK) are really strict and there is no way for your reverse calculation to get this right.
Also, you need to store the tax item by item, as the VAT dragons will expect you to refund exactly the tax paid if an item is returned. You really need to get hold of the local sales tax rules before you start.
My experience is that you can get dragged over the coals if your calculations are out by as little as a penny, and if you are audited you need to be able to show how you arrived at the VAT figure, so not storing anything used in your calculations will catch you out.
I totally agree with James Anderson! In Germany the rules regarding VAT calculations are as strict as in the UK.
We have to accumulate the net value by VAT percentage (we have three types: 0, 7 and 19 percent), rounded to two digits. VAT has to be calculated on this rounded value.
VAT has to be rounded to two digits and has to be shown on the invoice.
But nonetheless you can store prices including tax. It depends on whether the net prices or the end prices stay unchanged when tax rises. In Germany B2B net prices usually stay unchanged, whereas B2C end prices usually stay unchanged.
You can calculate it this way:
with cPriceIncludingVAT as (
select InvoiceNo, VATPercentage,
PriceIncludingVAT = cast(sum(amount * price) as decimal(12,2))
from InvoiceLines inner join VAT on VAT.VATID=InvoiceLines.VATID
group by InvoiceNo, VATPercentage
),
cVATcalculated as (
select InvoiceNo, VATPercentage, PriceIncludingVAT,
VAT = cast(PriceIncludingVAT * VATPercentage /
(1+VATPercentage) as decimal(12,2))
from cPriceIncludingVAT
)
select InvoiceNo, VATPercentage, PriceIncludingVAT, VAT,
NetPrice = PriceIncludingVAT - VAT
from cVATcalculated;
If you save this as a view you should be able to reprint a dynamically calculated VAT value exactly. When there is an accounting system you can (and should) export exactly the same data you printed.
Usually you should save values like these as field values within the database - but I understand if you'd like to have a more dynamic approach ...
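The same per-rate calculation can be sketched outside the database, which is handy for checking the rounding. The example below assumes VAT rates are stored as fractions (0.19 rather than 19), that prices are tax-inclusive, and that rounding is half-up to two decimals; the function name `vat_summary` and the sample lines are invented for illustration, and the actual rounding rule should come from the local tax regulations.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def vat_summary(lines):
    """lines: iterable of (vat_rate, quantity, unit_price_incl_vat)."""
    gross_by_rate = defaultdict(Decimal)
    for rate, quantity, price in lines:
        gross_by_rate[rate] += quantity * price

    summary = {}
    for rate, gross in gross_by_rate.items():
        # Round the accumulated tax-inclusive total first, then derive VAT
        # from the rounded value and round that as well.
        gross = gross.quantize(CENT, rounding=ROUND_HALF_UP)
        vat = (gross * rate / (1 + rate)).quantize(CENT, rounding=ROUND_HALF_UP)
        summary[rate] = {"gross": gross, "vat": vat, "net": gross - vat}
    return summary


lines = [
    (Decimal("0.19"), 2, Decimal("59.50")),
    (Decimal("0.19"), 1, Decimal("119.00")),
    (Decimal("0.07"), 3, Decimal("10.70")),
]
for rate, totals in vat_summary(lines).items():
    print(rate, totals["gross"], totals["vat"], totals["net"])
```

Using Decimal rather than float keeps the two-digit rounding exact, which matters when an auditor expects the figures to match to the cent.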
The other answers are good, but as idevlop mentions, it's almost a certainty that at some time in the future, you'll start having different rates for different categories of products. Adding that capability up front will save you a ton of heartache later on. Been there, done that.