I have several variables listed below:
int cpu_time_b = 6;
float clock_cycles_a = 2 * pow(10, 10);
float cpi_a = 2.0;
int cycle_time_a = 250;
float cpi_b = 1.2;
int cycle_time_b = 500;
I am working out the clock rate of b with the following calculation:
(((1.2*clock_cycles_a)/cpu_time_b)/(1 * pow(10, 9)))
Clearly the answer should be 4; however, my program is outputting 6000000204800000000.0 as the answer.
I think that overflow is possibly happening here. Is this the case and if so, how could I fix the problem?
All calculations should be arranged so that comparable numbers are "reduced" together. In your example, it seems that only cpu_time_b is truly variable (undefined in the scope of your snippet); all other variables appear to be constants. All constants should be computed before compilation, especially if they are liable to cause overflow. clock_cycles_a cancels the denominator. pow is time consuming (may not be critical here) and not always that precise. You multiply by 2 explicitly when you declare clock_cycles_a and then use 1.2 below, etc. Reducing the whole thing and keeping only the actual variable, it becomes:
24.0/cpu_time_b
which makes me deduce that cpu_time_b should be 6?
Finally, while you show the equation, we have no idea what you do with the result. Do you store it in the wrong variable type? Call printf with the wrong format? Something else?
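As a rough illustration of the points above, here is a minimal sketch that does the whole calculation in double and prints it with a matching format. The function names clock_rate_b_ghz and print_clock_rate_b are made up for this example, and the literal 2e10 stands in for 2 * pow(10, 10); whether the original program's problem was the storage type or the printf format is not known from the question.

```c
#include <stdio.h>

/* Clock rate of machine B in GHz. The factor 1.2 and the division by 1e9
   come from the formula in the question; the function name is hypothetical. */
double clock_rate_b_ghz(double clock_cycles_a, double cpu_time_b)
{
    return 1.2 * clock_cycles_a / cpu_time_b / 1e9;
}

/* Prints the result with %f, the conversion that matches a double argument. */
void print_clock_rate_b(void)
{
    double clock_cycles_a = 2e10;  /* 2 * 10^10 as a literal, no pow() call */
    double cpu_time_b = 6.0;       /* value taken from the question */

    printf("%f GHz\n", clock_rate_b_ghz(clock_cycles_a, cpu_time_b));
}
```

Calling print_clock_rate_b() from main should print a value of about 4 GHz. Passing a double to %d, or an int to %f, is undefined behavior and is one common way to get a huge nonsense number like the one in the question.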
Hope you're having a nice day.
I'm encountering a weird issue on my side. I am working on embedded C code on an STM32 F103 C8T6 micro controller on a custom BMS PCB, but I am having some issue with the code that calculates the actual temperature from the thermistor ADC value.
Through excel, we have determined that the equation we need to use to calculate the temperature in Celsius from the ADC value is: y = -0.5022x^5 + 6.665x^4 - 35.123x^3 + 92.559x^2 - 144.22x + 166.76.
So, in my code I have the following lines, with temp[i] being the raw ADC value and realTemp[i] being the converted value:
realTemp[i] = (double)(temp[i] / 10000);
realTemp[i] = -0.5022 * realTemp[i]*realTemp[i]*realTemp[i]*realTemp[i]*realTemp[i] + 6.665 * realTemp[i]*realTemp[i]*realTemp[i]*realTemp[i] - 35.123 * realTemp[i]*realTemp[i]*realTemp[i] + 92.559 * realTemp[i]*realTemp[i] - 144.22 * realTemp[i] + 166.76;
I am not using the pow function from math.h as it has given us issues in the past.
The values we are getting in our temp[i] variable are the following: 35480, 35496, 35393, 35480. When using these values with our function in excel, we are getting the correct output, between 25.3 and 25.5 Celsius, however the C code listed above is outputting 36 in the realTemp array. I am not sure about the decimal values, but I don't care about them because the value is typecast to a uint16 a few lines later to be transmitted over a CAN bus.
Use floating point division, not integer division.
// Integer division ------v-------------v
// realTemp[i] = (double)(temp[i] / 10000);
realTemp[i] = temp[i] / 10000.0;
The answer by Chux is correct, I just wanted to explain more why this works.
temp[i] is uint16, therefore the formula temp[i] / 10000 is integer division, and result will be the floor of (temp[i] / 10000). Thus, the final conversion to double is performed on a value which is floored already.
By converting 10000 to 10000.0, it means that the division of an integer with a float/double will perform floating division. By this, the result will be similar to what you expected.
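To make the difference concrete, here is a small sketch of the conversion using floating-point division. The helper name adc_to_celsius and the coefficient table are assumptions for illustration; the coefficients are the ones from the question, and the polynomial is evaluated with Horner's rule instead of repeated multiplication, which is an alternative to the question's code and not what the asker wrote.

```c
#include <stdint.h>

/* Polynomial coefficients from the question, highest degree first. */
static const double THERM_COEFF[6] = {
    -0.5022, 6.665, -35.123, 92.559, -144.22, 166.76
};

/* Hypothetical helper: converts a raw ADC reading to degrees Celsius. */
double adc_to_celsius(uint16_t raw)
{
    double x = raw / 10000.0;   /* double divisor, so the fraction is kept */
    double y = THERM_COEFF[0];
    int k;

    /* Horner's rule: ((((c0*x + c1)*x + c2)*x + c3)*x + c4)*x + c5 */
    for (k = 1; k < 6; k++) {
        y = y * x + THERM_COEFF[k];
    }
    return y;
}
```

With a reading of 35480 this gives roughly 25.3, in line with the Excel result. An input of exactly 30000 (x = 3.0, which is what integer division produced for every reading in the question) gives about 36.6, matching the wrong output the asker saw.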
As others have said, you are doing integer division then casting the result to a double - you need to do the division itself as a double.
Your code will be big and very slow on the micro-controller in question. This might not be an issue, assuming that temperature values don't usually change very often, so slow code could be fine for you.
You also need to be careful with high-degree polynomials - they can easily be unstable, especially if you try to extrapolate them to very high or low temperatures. This is particularly risky if you decide to make the code faster by switching to a float.
A better method for this kind of thing is usually a lookup table (which can be big but is simpler to implement), or a linear spline (which has a smaller footprint but is a bit more complex to implement).
I was trying to write a program to calculate the value of x^n using a while loop:
#include <stdio.h>
#include <math.h>
int main()
{
    float x = 3, power = 1, copyx;
    int n = 22, copyn;
    copyx = x;
    copyn = n;
    while (n)
    {
        if ((n % 2) == 1)
        {
            power = power * x;
        }
        n = n / 2;
        x *= x;
    }
    printf("%g^%d = %f\n", copyx, copyn, power);
    printf("%g^%d = %f\n", copyx, copyn, pow(copyx, copyn));
    return 0;
}
Up until the value of 15 for n, the answer from my created function and the pow function (from math.h) gives the same value; but, when the value of n exceeds 15, then it starts giving different answers.
I cannot understand why there is a difference in the answer. Is it that I have written the function in the wrong way or it is something else?
You are mixing up two different types of floating-point data. The pow function uses the double type but your loop uses the float type (which has less precision).
You can make the results coincide by either using the double type for your x, power and copyx variables, or by calling the powf function (which uses the float type) instead of pow.
The latter adjustment (using powf) gives the following output (clang-cl compiler, Windows 10, 64-bit):
3^22 = 31381059584.000000
3^22 = 31381059584.000000
And, changing the first line of your main to double x = 3, power = 1, copyx; gives the following:
3^22 = 31381059609.000000
3^22 = 31381059609.000000
Note that, with larger and larger values of n, you are increasingly likely to get divergence between the results of your loop and the value calculated using the pow or powf library functions. On my platform, the double version gives the same results, right up to the point where the value overflows the range and becomes Infinity. However, the float version starts to diverge around n = 55:
3^55 = 174449198498104595772866560.000000
3^55 = 174449216944848669482418176.000000
When I run your code I get this:
3^22 = 31381059584.000000
3^22 = 31381059609.000000
This would be because pow returns a double but your code uses float. When I changed to powf I got identical results:
3^22 = 31381059584.000000
3^22 = 31381059584.000000
So simply use double everywhere if you need high resolution results.
Floating point math is imprecise (and float is worse than double, having even fewer bits to store the data in; using double might delay the imprecision longer). The pow function (usually) uses an exponentiation algorithm that minimizes precision loss, and/or delegates to a chip-level instruction that may do stuff more efficiently, more precisely, or both. There could be more than one implementation of pow too, depending on whether you tell the compiler to use strictly conformant floating point math, the fastest possible, the hardware instruction, etc.
Your code is fine (though using double would get more precise results), but matching the improved precision of math.h's pow is non-trivial; by the time you've done so, you'll have reinvented it. That's why you use the library function.
That said, for logically integer math as you're using here, precision loss from your algorithm likely doesn't matter, it's purely the float vs. double issue where you lose precision from the type itself. As a rule, default to using double, and only switch to float if you're 100% sure you don't need the precision and can't afford the extra memory/computation cost of double.
Precision
float x = 3, power = 1; ... power = power * x forms a float product.
pow(x, y) forms a double result and good implementations internally use even wider math.
OP's loop method incurs rounded results after the 15th iteration. These roundings slowly compound the inaccuracy of the final result.
3^16 is a 26 bit odd number.
float encodes all odd numbers exactly until typically 2^24. Larger values are all even and of only 24 significant binary digits.
double encodes all odd numbers exactly until typically 2^53.
To do a fair comparison, use:
double objects and pow() or
float objects and powf().
For large powers, the pow(f)() function is certain to provide better answers than a loop, as such functions often use internally extended precision and well-managed rounding, unlike the loop approach.
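The float versus double effect described in these answers can be reproduced with two copies of the same squaring loop. This is only a sketch: ipow_float and ipow_double are names invented here, and the loop is the asker's algorithm restated with an unsigned exponent.

```c
/* Exponentiation by squaring, carried out entirely in float. */
float ipow_float(float x, unsigned n)
{
    float result = 1.0f;

    while (n) {
        if (n & 1u) {
            result *= x;   /* each product is rounded to 24 significant bits */
        }
        n >>= 1;
        x *= x;
    }
    return result;
}

/* The same loop in double, which keeps 53 significant bits per product. */
double ipow_double(double x, unsigned n)
{
    double result = 1.0;

    while (n) {
        if (n & 1u) {
            result *= x;
        }
        n >>= 1;
        x *= x;
    }
    return result;
}
```

For a fair comparison, check ipow_float(3.0f, 22) against powf(3.0f, 22.0f) and ipow_double(3.0, 22) against pow(3.0, 22.0). The double loop reproduces 3^22 = 31381059609 exactly because every intermediate value stays below 2^53, while the float loop cannot represent that value at all.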
I'm trying to do the Steinhart-Hart temperature calculation on an Arduino. The equation is
1/T = A + B*ln(R) + C*(ln(R))^3
I solved a system of 3 equations to obtain the values of A, B and C, which are:
A = 0.0164872
B = -0.00158538
C = 3.3813e-6
When I plug these into WolframAlpha to solve for T I get a value in Kelvins that makes sense:
T=1/(0.0164872-0.00158538*log2(10000)+3.3813E-6*(log2(10000))^3) solve for T
T = 298.145 Kelvins = 77 Fahrenheit
However when I try to use this equation on my Arduino, I get a very wrong answer, I suspect because doubles do not have enough precision. Here's what I'm using:
double temp = (1 / (A + B*log(R_therm) + C*pow(log(R_therm),3)));
This returns 222 Kelvin instead, which is way off.
So, how can I do a calculation like this in Arduino?? Any advice is greatly appreciated, thanks.
Precision is not the main issue. Could even use float and powf(). A thermistor temperature calculation is not that accurate. After all the temperature is certainly not better than ±0.1°C accurate. Self heating of the thermistor is a larger factor.
OP's constants were derived using log base 2, but C's log() is the natural logarithm (base e), so the C code must convert to base 2. (Credit: @Martin R)
// double temp = (1 / (A + B*log(R_therm) + C*pow(log(R_therm),3)));
double temp = (1 / (A + B*log(R_therm)/log(2) + C*pow(log(R_therm)/log(2),3)));
Sample implementation, that avoids an unnecessary slow pow() call.
static const double inv_ln2 = 1.4426950408889634073599246810019;
double ln2_R = log(R_therm)*inv_ln2;
double temp = 1.0 / (A + ln2_R*(B + C*ln2_R*ln2_R));
Yes, floating point arithmetic has limited precision on most arduinos.
Have you considered using fixed precision? If used correctly, this might give you better results. The requirement for this is to have rather narrow parameters, however, and be careful about unit conversions.
An unsigned long on arduino is 4 bytes too, so it can contain numbers up to 2^32-1. If using fixed point, you might want to replace this 1/T by something like 100000/T, where the numerator constant and T have been scaled according to the desired precision.
You will also need to keep a (mental or paper) model of the number of decimals each variable contains, in order to optimize the operation order not to lose precision.
For the log2 function, I doubt it is available out of the box for integers. You could either cast the result or reimplement it. There are plenty of resources for this problem, even here on SO.
(can skip this part just an explanation of the code below. my problems are under the code block.)
hi. i'm trying to write an algorithm for throttling loop cycles based on how much bandwidth the linux computer is using. i'm reading /proc/net/dev once a second and keeping track of the bytes transmitted in 2 variables: one holds the previous reading, the other the most recent one. subtracting the previous reading from the most recent one gives how many bytes have been sent in 1 second.
from there i have the variables max_throttle, throttle, max_speed, and sleepp.
the idea is to increase or decrease sleepp depending on bandwidth being used. the less bandwidth the lower the delay and the higher the longer.
i am currently having two problems dealing with floats and ints. if i set all my variables to ints, max_throttle always becomes 0 no matter what i set the others to, even if i initialize them.
also, even though my if statement says "if sleepp is less than 0, return it to 0", it keeps going deeper and deeper into the negatives, then levels out at around -540 with 0 bandwidth being used.
and the if(ii & 0x40) is for speed and usage control. in my application there will be no 1 second sleep, so this code lets me limit sleepp to changing about once every 20-30 iterations. although i'm also having a problem with it: after the 2X iterations, when it does trigger, it continues to trigger on every iteration after that instead of being true only once and then true again after 20-30 more iterations.
edit:: simpler test case for my variable problem.
#include <stdio.h>
int main()
{
    int max_t, max_s, throttle;
    max_s = 400;
    throttle = 90;
    max_t = max_s * (throttle / 100);
    printf("max throttle:%d\n", max_t);
    return 0;
}
In C, operator / is an integer division when used with integers only. Therefore, 90/100 = 0. In order to do floating-point division with integers, first convert them to floats (or double or other fp types).
max_t = (int)(max_s * ((float)throttle / 100.0) + 0.5);
The +0.5 rounds to the nearest integer before converting to int; note that the cast must apply to the whole product, otherwise the fraction is rounded before it is multiplied by max_s. You might want to consider some standard flooring functions, I don't know your use case.
Also note that 100.0 is a floating-point (double) literal, whereas 100 is an integer literal. So, although they seem identical, they are not.
As kralyk pointed out, C’s integer division of 90/100 is 0. But rather than using floats you can work with ints… Just do the division after the multiplication (note the omission of parentheses):
max_t = max_s * throttle / 100;
This gives you the general idea. For example if you want the kind of rounding kralyk mentions, add 50 before doing the division:
max_t = (max_s * throttle + 50) / 100;
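Wrapped in a function, the integer-only approach could look like this sketch. The name scale_percent is invented here, and the widening to long long is an added precaution (not part of the answer above) in case max_s * throttle does not fit in an int; the rounding step assumes non-negative inputs.

```c
/* Hypothetical helper: value * percent / 100, rounded to the nearest integer.
   Assumes value and percent are non-negative. */
int scale_percent(int value, int percent)
{
    /* Multiply first, in a wider type, so no precision is lost to an
       early integer division and the product cannot overflow int. */
    long long scaled = (long long)value * percent;

    return (int)((scaled + 50) / 100);
}
```

With the values from the test case, scale_percent(400, 90) returns 360, whereas max_s * (throttle / 100) evaluates to 0.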
For one of my course project I started implementing "Naive Bayesian classifier" in C. My project is to implement a document classifier application (especially Spam) using huge training data.
Now I have problem implementing the algorithm because of the limitations in the C's datatype.
( Algorithm I am using is given here, http://en.wikipedia.org/wiki/Bayesian_spam_filtering )
PROBLEM STATEMENT:
The algorithm involves taking each word in a document and calculating the probability of it being a spam word. If p1, p2, p3 .... pn are the probabilities of word-1, 2, 3 ... n, the probability of the doc being spam or not is calculated using
p = (p1 * p2 * ... * pn) / (p1 * p2 * ... * pn + (1 - p1) * (1 - p2) * ... * (1 - pn))
Here, probability value can be very easily around 0.01. So even if I use datatype "double" my calculation will go for a toss. To confirm this I wrote a sample code given below.
#define PROBABILITY_OF_UNLIKELY_SPAM_WORD (0.01)
#define PROBABILITY_OF_MOSTLY_SPAM_WORD (0.99)
int main()
{
    int index;
    long double numerator = 1.0;
    long double denom1 = 1.0, denom2 = 1.0;
    long double doc_spam_prob;
    /* Simulating FEW unlikely spam words */
    for(index = 0; index < 162; index++)
    {
        numerator = numerator*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD;
        denom2 = denom2*(long double)PROBABILITY_OF_UNLIKELY_SPAM_WORD;
        denom1 = denom1*(long double)(1 - PROBABILITY_OF_UNLIKELY_SPAM_WORD);
    }
    /* Simulating lot of mostly definite spam words */
    for (index = 0; index < 1000; index++)
    {
        numerator = numerator*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD;
        denom2 = denom2*(long double)PROBABILITY_OF_MOSTLY_SPAM_WORD;
        denom1 = denom1*(long double)(1- PROBABILITY_OF_MOSTLY_SPAM_WORD);
    }
    doc_spam_prob= (numerator/(denom1+denom2));
    return 0;
}
I tried the float, double and even long double datatypes, but I still get the same problem.
Hence, say in a 100K words document I am analyzing, if just 162 words are having 1% spam probability and remaining 99838 are conspicuously spam words, then still my app will say it as Not Spam doc because of Precision error (as numerator easily goes to ZERO)!!!.
This is the first time I am hitting such issue. So how exactly should this problem be tackled?
This happens often in machine learning. AFAIK, there's nothing you can do about the loss in precision. So to bypass this, we use the log function and convert divisions and multiplications to subtractions and additions, resp.
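So I decided to do the math.
The original equation is:
p = (p_1 * ... * p_n) / (p_1 * ... * p_n + (1-p_1) * ... * (1-p_n))
I slightly modify it:
1/p - 1 = ((1-p_1) * ... * (1-p_n)) / (p_1 * ... * p_n)
Taking logs on both sides:
ln(1/p - 1) = (ln(1-p_1) - ln(p_1)) + ... + (ln(1-p_n) - ln(p_n))
Let,
eta = (ln(1-p_1) - ln(p_1)) + ... + (ln(1-p_n) - ln(p_n))
Substituting,
1/p - 1 = e^eta
Hence the alternate formula for computing the combined probability:
p = 1 / (1 + e^eta)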
If you need me to expand on this, please leave a comment.
Here's a trick:
for the sake of readability, let S := p_1 * ... * p_n and H := (1-p_1) * ... * (1-p_n),
then we have:
p = S / (S + H)
p = 1 / ((S + H) / S)
p = 1 / (1 + H / S)
let's expand again:
p = 1 / (1 + ((1-p_1) * ... * (1-p_n)) / (p_1 * ... * p_n))
p = 1 / (1 + (1-p_1)/p_1 * ... * (1-p_n)/p_n)
So basically, you will obtain a product of quite large numbers (between 0 and, for p_i = 0.01, 99). The idea is, not to multiply tons of small numbers with one another, to obtain, well, 0, but to make a quotient of two small numbers. For example, if n = 1000000 and p_i = 0.5 for all i, the above method will give you 0/(0+0) which is NaN, whereas the proposed method will give you 1/(1+1*...1), which is 0.5.
You can get even better results, when all p_i are sorted and you pair them up in opposed order (let's assume p_1 < ... < p_n), then the following formula will get even better precision:
p = 1 / (1 + (1-p_1)/p_n * ... * (1-p_n)/p_1)
that way you divide big numerators (small p_i) by big denominators (big p_(n+1-i)), and small numerators by small denominators.
edit: MSalter proposed a useful further optimization in his answer. Using it, the formula reads as follows:
p = 1 / (1 + (1-p_1)/p_n * (1-p_2)/p_(n-1) * ... * (1-p_(n-1))/p_2 * (1-p_n)/p_1)
Your problem is caused because you are collecting too many terms without regard for their size. One solution is to take logarithms. Another is to sort your individual terms. First, let's rewrite the equation as 1/p = 1 + ∏((1-p_i)/p_i). Now your problem is that some of the terms are small, while others are big. If you have too many small terms in a row, you'll underflow, and with too many big terms you'll overflow the intermediate result.
So, don't put too many of the same order in a row. Sort the terms (1-p_i)/p_i. As a result, the first will be the smallest term, the last the biggest. Now, if you'd multiply them straight away you would still have an underflow. But the order of calculation doesn't matter. Use two iterators into your temporary collection. One starts at the beginning (i.e. (1-p_0)/p_0), the other at the end (i.e. (1-p_n)/p_n), and your intermediate result starts at 1.0. Now, when your intermediate result is >= 1.0, you take a term from the front, and when your intermediate result is < 1.0 you take a term from the back.
The result is that as you take terms, the intermediate result will oscillate around 1.0. It will only go up or down as you run out of small or big terms. But that's OK. At that point, you've consumed the extremes on both ends, so the intermediate result will slowly approach the final result.
There's of course a real possibility of overflow. If the input is completely unlikely to be spam (p=1E-1000) then 1/p will overflow, because ∏((1-p_i)/p_i) overflows. But since the terms are sorted, we know that the intermediate result will overflow only if ∏((1-p_i)/p_i) overflows. So, if the intermediate result overflows, there's no subsequent loss of precision.
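The two-iterator scheme described above can be sketched as follows. This is one possible reading of the description, not the answerer's own code: the names combine_balanced and cmp_double are hypothetical, every p_i is assumed to lie strictly between 0 and 1, and the -1.0 return value on allocation failure is an arbitrary choice for the example.

```c
#include <stdlib.h>

/* qsort comparator: ascending order of doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: p = 1 / (1 + product of (1-p_i)/p_i), multiplying the
   sorted terms from both ends so the running product stays near 1.0.
   Returns -1.0 if the temporary array cannot be allocated. */
double combine_balanced(const double *p, size_t n)
{
    double *term;
    double prod = 1.0;
    size_t lo = 0, hi = n, i;

    if (n == 0) {
        return 0.5;                       /* empty product is 1 */
    }
    term = malloc(n * sizeof *term);
    if (term == NULL) {
        return -1.0;
    }
    for (i = 0; i < n; i++) {
        term[i] = (1.0 - p[i]) / p[i];
    }
    qsort(term, n, sizeof *term, cmp_double);

    while (lo < hi) {
        if (prod >= 1.0) {
            prod *= term[lo++];           /* smallest remaining term */
        } else {
            prod *= term[--hi];           /* largest remaining term */
        }
    }
    free(term);
    return 1.0 / (1.0 + prod);
}
```

The product can still underflow or overflow once only small or only large terms remain, but as the answer notes, at that point the final result is already at the corresponding extreme.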
Try computing the inverse 1/p. That gives you an equation of the form 1 + 1/(1-p1)*(1-p2)...
If you then count the occurrences of each probability--it looks like you have a small number of values that recur--you can use the pow() function--pow(1-p, occurrences_of_p)*pow(1-q, occurrences_of_q)--and avoid individual roundoff with each multiplication.
You can express the probability in percent or per mille:
doc_spam_prob= (numerator*100/(denom1+denom2));
or
doc_spam_prob= (numerator*1000/(denom1+denom2));
or use some other coefficient
I am not strong in math so I cannot comment on possible simplifications to the formula that might eliminate or reduce your problem. However, I am familiar with the precision limitations of long double types and am aware of several arbitrary and extended precision math libraries for C. Check out:
http://www.nongnu.org/hpalib/
and
http://www.tc.umn.edu/~ringx004/mapm-main.html