Floating-point arithmetic in C: epsilon comparison - c

I'm trying to compare double-precision values using an epsilon. However, I have a problem: initially I thought that the difference should be equal to the epsilon, but it isn't. Additionally, when I tried to check the binary representation using successive multiplication, something strange happened and I'm confused, so I would appreciate an explanation of the problem and comments on my way of thinking.
#include <stdio.h>
#define EPSILON 1e-10
void double_equal(double a, double b) {
    printf("a: %.12f, b: %.12f, a - b = %.12f\n", a, b, a - b);
    printf("a: %.12f, b: %.12f, b - a = %.12f\n", a, b, b - a);
    if (a - b < EPSILON) printf("a - b < EPSILON\n");
    if (a - b == EPSILON) printf("a - b == EPSILON\n");
    if (a - b <= EPSILON) printf("a - b <= EPSILON\n");
    if (b - a <= EPSILON) printf("b - a <= EPSILON\n");
}
int main(void) {
    double wit1 = 1.0000000001;
    double wit2 = 1.0;
    double_equal(wit1, wit2);
    return 0;
}
The output is:
a: 1.000000000100, b: 1.000000000000, a - b = 0.000000000100
a: 1.000000000100, b: 1.000000000000, b - a = -0.000000000100
b - a <= EPSILON
Numeric constants in C are treated as doubles unless we provide an "F"/"f" suffix right after the number (#define EPSILON 1e-10F), therefore I can't see the conversion problem here as in this question. Therefore, I have created a really simple program for THESE SPECIFIC examples (I know it should include handling of converting integral parts to binary).
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
char* convert(double a) {
    char* res = malloc(200);
    int count = 0;
    double integral;
    a = modf(a, &integral);
    if (integral == 1) {
        res[count++] = (int)integral + '0';
        res[count++] = '.';
    } else {
        res[count++] = '0';
        res[count++] = '.';
    }
    while (a != 0 && count < 199) {   /* leave room for the terminating '\0' */
        printf("%.100f\n", a);
        a *= 2;
        a = modf(a, &integral);
        if (integral == 1) res[count++] = (int)integral + '0';
        else res[count++] = '0';
    }
    res[count] = '\0';
    return res;
}
int main(void) {
    double wit1 = 1.0000000001;
    double diff = 0.0000000001;
    char* res = convert(wit1);
    char* di = convert(diff);
    printf("this: %s\n", res);
    printf("diff: %s\n", di);
    return 0;
}
Direct output:
this: 1.0000000000000000000000000000000001101101111100111
diff: 0.00000000000000000000000000000000011011011111001101111111011001110101111011110110111011
First question: why are there so many trailing ones and zeros in the difference? Why do the results after the binary point differ?
However, let's look at the fractional parts printed out during the calculation (I'm presenting only the first few lines):
1.0000000001:
0.0000000001000000082740370999090373516082763671875000000000000000000000000000000000000000000000000000
0.0000000002000000165480741998180747032165527343750000000000000000000000000000000000000000000000000000
0.0000000004000000330961483996361494064331054687500000000000000000000000000000000000000000000000000000
0.0000000001:
0.0000000001000000000000000036432197315497741579165547065599639608990401029586791992187500000000000000
0.0000000002000000000000000072864394630995483158331094131199279217980802059173583984375000000000000000
0.0000000004000000000000000145728789261990966316662188262398558435961604118347167968750000000000000000
Second question: why are there so many strange trailing digits? Is this a result of floating-point arithmetic being incapable of precisely representing decimal values?
Analyzing the subtraction, I can see why the result is bigger than the epsilon. I follow this procedure:
Prepare the complement (bit-flipped) sequence of the number to subtract
"Add" the sequences
Drop the one at the beginning and add it to the rightmost bit (end-around carry)
Therefore:
1.0000000000000000000000000000000001101101111100111
- 1.0000000000000000000000000000000000000000000000000
|
\/
1.0000000000000000000000000000000001101101111100111
"+"0.1111111111111111111111111111111111111111111111111
--------------------------------------------------------
10.0000000000000000000000000000000001101101111100110
|
\/
0.0000000000000000000000000000000001101101111100111
Comparing with the calculated value of epsilon:
0.000000000000000000000000000000000110110111110011 0 1111111011001110101111011110110111011
0.000000000000000000000000000000000110110111110011 1
Spaces indicate the difference.
Third question: do I have to worry if I can't compare a value exactly equal to the epsilon? I think this situation shows what the interval of tolerance with epsilon was made for. However, is there anything I should change?

Why do the results after the binary point differ?
Because that is the difference.
Expecting something else comes from thinking of 1.0000000001 and 0.0000000001 as double having exactly those two values. They do not. Their difference is not exactly 1.0. They have values near those two, each with about 53 binary digits of significance, and their difference differs from 1.0 by less than the unit in the last place of 1.0000000001.
why are there so many strange trailing digits? Is this a result of floating-point arithmetic being incapable of precisely representing decimal values?
Somewhat.
A double can encode about 2^64 different numbers. 1.0000000001 and 0.0000000001 are not in that set. Instead, nearby values are used, which is where the strange trailing digits come from.
do I have to worry if I can't compare a value exactly equal to the epsilon? I think this situation shows what the interval of tolerance with epsilon was made for. However, is there anything I should change?
Yes, change how you use epsilon. An epsilon is useful for a relative difference, not an absolute one. Very large consecutive double values are far more than epsilon apart. About 45% of all double values (the small ones) are less than epsilon in magnitude, so either if (a - b <= EPSILON) printf("a - b <= EPSILON\n"); or if (b - a <= EPSILON) printf("b - a <= EPSILON\n"); will be true for small a, b even though they are trillions of times different in magnitude.
Oversimplification:
if (fabs(a - b) < EPSILON * fabs(a + b)) {
    return values_a_b_are_near_each_other;
}
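For instance, a small demo of the difference (the helper names here are mine, not part of the snippet above): an absolute-epsilon test calls 1e-20 and 1e-32 "equal" even though they differ by a factor of a trillion, while the relative test does not:
#include <stdio.h>
#include <math.h>
#define EPSILON 1e-10
static int near_absolute(double a, double b) {
    return fabs(a - b) <= EPSILON;                 /* absolute tolerance */
}
static int near_relative(double a, double b) {
    return fabs(a - b) <= EPSILON * fabs(a + b);   /* relative tolerance, as above */
}
int main(void) {
    double a = 1e-20, b = 1e-32;                   /* a trillion times apart */
    printf("absolute: %d\n", near_absolute(a, b)); /* prints 1: claims "equal" */
    printf("relative: %d\n", near_relative(a, b)); /* prints 0: different */
    return 0;
}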

This answer assumes your C implementation uses IEEE-754 binary64, also known as the “double” format for its double type. This is common.
If the C implementation rounds correctly, then double wit1 = 1.0000000001; initializes wit1 to 1.0000000001000000082740370999090373516082763671875. This is because the two representable values nearest 1.0000000001 are 1.000000000099999786229432174877729266881942749023437500000000000000000000000000000000000000000000000 and 1.0000000001000000082740370999090373516082763671875. The latter is chosen since it is closer.
If correctly rounded, the 1e-10 used for EPSILON will produce 0.000000000100000000000000003643219731549774157916554706559963960899040102958679199218750000000000000.
Clearly wit1 - 1 exceeds EPSILON, so the test a - b < EPSILON in double_equal evaluates as false.
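A quick way to see those rounded values yourself (a sketch assuming binary64 and correct rounding of decimal constants; printing with plenty of digits makes them visible):
#include <stdio.h>
int main(void) {
    double wit1 = 1.0000000001;
    double eps  = 1e-10;
    printf("wit1     = %.55f\n", wit1);
    printf("EPSILON  = %.55f\n", eps);
    printf("wit1 - 1 = %.55f\n", wit1 - 1.0);   /* visibly larger than eps */
    return 0;
}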
First question: why are there so many trailing ones and zeros in the difference?
Count the number of bits from the first 1 to the last 1. In each number, there are 53. That is because there are 53 bits in the significand of a double. It is a bit of a coincidence your numbers happened to end in a 1 bit. About half the time, the trailing bit is 0, and a quarter of the time, the last two bits are zeros, and so on. However, since there are 53 bits in the significand of a double, there will be exactly 53 bits from the first 1 bit to the last bit that is part of the represented value.
Since your first number starts with a 1 in the integer position, it has at most 52 bits after that. At that point, the number must be rounded to the nearest representable value.
Since your second number is between 2^-34 and 2^-33, its first 1 bit is in the 2^-34 position, and it can go to the 2^-86 position before it has to be rounded.
Third question: do I have to worry if I can't compare a value exactly equal to the epsilon?
Why do you want to compare to the epsilon? There is no general solution for comparing floating-point numbers that contain errors from previous operations. Whether or not an “epsilon comparison” can or should be used is dependent on the application and the operations and numbers involved.

Related

How to compute the digits of an irrational number one by one?

I want to read, digit by digit, the decimals of the square root of 5 in C.
The square root of 5 is 2.23606797749979..., so this would be the expected output:
2
3
6
0
6
7
9
7
7
...
I've found the following code:
#include <stdio.h>
int main(void)
{
    int number;
    float temp, sqrt;
    printf("Provide the number: \n");
    scanf("%d", &number);
    // store the half of the given number, e.g. from 256 => 128
    sqrt = number / 2;
    temp = 0;
    // Iterate until sqrt stops differing from temp, which is updated in the loop
    while (sqrt != temp) {
        // initially 0, is updated with the initial value of 128
        // (on the second iteration = 65)
        // and so on
        temp = sqrt;
        // Then, replace values: (256 / 128 + 128) / 2 = 65
        // (on the second iteration 34.46923076923077)
        // and so on
        sqrt = (number / temp + temp) / 2;
    }
    printf("The square root of '%d' is '%f'", number, sqrt);
}
But this approach stores the result in a float variable, and I don't want to depend on the limits of the float types, as I would like to extract, say, 10,000 digits. I also tried to use the native sqrt() function and convert the result to a string using this method, but I faced the same issue.
What you've asked about is a very hard problem, and whether it's even possible to do "one by one" (i.e. without a working-space requirement that scales with how far out you want to go) depends on both the particular irrational number and the base you want it represented in. For example, in 1995 when a formula for pi was discovered that allows computing the nth binary digit in O(1) space, this was a really big deal. It was not something people expected to be possible.
If you're willing to accept O(n) space, then some cases like the one you mentioned are fairly easy. For example, if you have the first n digits of the square root of a number as a decimal string, you can simply try appending each digit 0 to 9, then squaring the string with long multiplication (same as you learned in grade school), and choosing the last one that doesn't overshoot. Of course this is very slow, but it's simple. The easy way to make it a lot faster (but still asymptotically just as bad) is using an arbitrary-precision math library in place of strings. Doing significantly better requires more advanced approaches and in general may not be possible.
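A minimal sketch of that idea in C, using a plain unsigned long long instead of a decimal string or bignum, so it can only produce about nine digits before the squaring would overflow (the names and the digit limit are my own):
#include <stdio.h>
int main(void) {
    const unsigned long long n = 5;   /* compute digits of sqrt(5) */
    unsigned long long root = 0;      /* digits produced so far, as one integer */
    unsigned long long scaled_n = n;  /* n shifted by two decimal places per digit */
    for (int i = 0; i < 9; i++) {     /* ~9 digits keeps everything below overflow */
        /* pick the largest digit d with (10*root + d)^2 <= scaled_n */
        unsigned long long d = 0;
        while (d < 9 && (10 * root + d + 1) * (10 * root + d + 1) <= scaled_n)
            d++;
        root = 10 * root + d;
        printf("%llu\n", d);          /* prints 2, 2, 3, 6, 0, 6, 7, 9, 7 ... */
        scaled_n *= 100;              /* the first printed digit is the integer part */
    }
    return 0;
}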
As already noted, you need to change the algorithm into a digit-by-digit one (there are some examples in the Wikipedia page about methods of computing square roots) and use an arbitrary-precision arithmetic library to perform the calculations (for instance, GMP).
In the following snippet I implemented the aforementioned algorithm using GMP (but not the square root function that the library provides). Instead of calculating one decimal digit at a time, this implementation uses a larger base, a large power of 10 that fits inside an unsigned long, so that it can produce 9 or 18 decimal digits at every iteration.
It also uses an adapted Newton method to find the actual "digit".
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <gmp.h>
unsigned long max_ul(unsigned long a, unsigned long b)
{
    return a < b ? b : a;
}
int main(int argc, char *argv[])
{
    // The GMP functions accept 'unsigned long int' values as parameters.
    // The algorithm implemented here can work with bases other than 10,
    // so that it can evaluate more than one decimal digit at a time.
    const unsigned long base = sizeof(unsigned long) > 4
                             ? 1000000000000000000
                             : 1000000000;
    const unsigned long decimals_per_digit = sizeof(unsigned long) > 4 ? 18 : 9;
    // Extract the number to be square rooted and the desired number of decimal
    // digits from the command line arguments. Fall back to 0 in case of errors.
    const unsigned long number = argc > 1 ? atoi(argv[1]) : 0;
    const unsigned long n_digits = argc > 2 ? atoi(argv[2]) : 0;
    // All the variables used by GMP need to be properly initialized before use.
    // 'c' is basically the remainder, initially set to the original number
    mpz_t c;
    mpz_init_set_ui(c, number);
    // At every iteration, the algorithm "moves" the remainder to the left
    // by two "digits", so it multiplies it by base^2.
    mpz_t base_squared;
    mpz_init_set_ui(base_squared, base);
    mpz_mul(base_squared, base_squared, base_squared);
    // 'p' stores the digits of the root found so far. The others are helper variables
    mpz_t p;
    mpz_init_set_ui(p, 0UL);
    mpz_t y;
    mpz_init(y);
    mpz_t yy;
    mpz_init(yy);
    mpz_t dy;
    mpz_init(dy);
    mpz_t dx;
    mpz_init(dx);
    mpz_t pp;
    mpz_init(pp);
    // Timing, for testing purposes
    clock_t start = clock(), diff;
    unsigned long x_max = number;
    // Each "digit" corresponds to some decimal digits
    for (unsigned long i = 0,
         last = (n_digits + decimals_per_digit) / decimals_per_digit + 1UL;
         i < last; ++i)
    {
        // Find the greatest x such that: x * (2 * base * p + x) <= c
        // where x is in [0, base), using a specialized Newton method
        // pp = 2 * base * p
        mpz_mul_ui(pp, p, 2UL * base);
        unsigned long x = x_max;
        for (;;)
        {
            // y = x * (pp + x)
            mpz_add_ui(yy, pp, x);
            mpz_mul_ui(y, yy, x);
            // dy = y - c
            mpz_sub(dy, y, c);
            // If y <= c we have found the correct x
            if ( mpz_sgn(dy) <= 0 )
                break;
            // Newton's step: dx = dy/y' where y' = 2 * x + pp
            mpz_add_ui(yy, yy, x);
            mpz_tdiv_q(dx, dy, yy);
            // Update x even if dx == 0 (last iteration)
            x -= max_ul(mpz_get_si(dx), 1);
        }
        x_max = base - 1;
        // The actual format of the printed "digits" is up to you
        if (i == 0)
            printf("%lu.\n", x);
        else
        {
            printf("%018lu", x);
            if (i % 4 == 0)
                putchar('\n');
        }
        // p = base * p + x
        mpz_mul_ui(p, p, base);
        mpz_add_ui(p, p, x);
        // c = (c - y) * base^2
        mpz_sub(c, c, y);
        mpz_mul(c, c, base_squared);
    }
    diff = clock() - start;
    long int msec = diff * 1000L / CLOCKS_PER_SEC;
    printf("\n\nTime taken: %ld.%03ld s\n", msec / 1000, msec % 1000);
    // Final cleanup
    mpz_clear(c);
    mpz_clear(base_squared);
    mpz_clear(p);
    mpz_clear(pp);
    mpz_clear(dx);
    mpz_clear(y);
    mpz_clear(dy);
    mpz_clear(yy);
}
You can see the outputted digits here.
Your title says:
How to compute the digits of an irrational number one by one?
Irrational numbers are not limited to square roots. They also include numbers of the form log(x), exp(z), sin(y), and so on (transcendental numbers). However, there are some important factors that determine whether, or how fast, you can compute a given irrational number's digits one by one (that is, from left to right).
Not all irrational numbers are computable; that is, for some of them no way is known (or even possible) to approximate them to any desired precision, whether by a closed-form expression, a series, or otherwise.
There are many ways numbers can be expressed, such as by their binary or decimal expansions, as continued fractions, as series, etc. And there are different algorithms to compute a given number's digits depending on the representation.
Some formulas compute a given number's digits in a particular base (such as base 2), not in an arbitrary base.
For example, besides the first formula to extract the digits of π without computing the previous digits, there are other formulas of this type (known as BBP-type formulas) that extract the digits of certain irrational numbers. However, these formulas only work for a particular base, not all BBP-type formulas have a formal proof, and most importantly, not all irrational numbers have a BBP-type formula (essentially, only certain log and arctan constants do, not numbers of the form exp(x) or sqrt(x)).
On the other hand, if you can express an irrational number as a continued fraction (which all real numbers have), you can extract its digits from left to right, and in any base desired, using a specific algorithm. What is more, this algorithm works for any real number constant, including square roots, exponentials (e and exp(x)), logarithms, etc., as long as you know how to express it as a continued fraction. For an implementation see "Digits of pi and Python generators". See also Code to Generate e one Digit at a Time.
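This is not the continued-fraction algorithm those references describe, but as a small self-contained taste of producing digits strictly left to right in C, here is the classic series-based spigot for e (the digit and term counts are arbitrary choices of mine):
#include <stdio.h>
#define DIGITS 30
#define TERMS  (DIGITS + 10)
int main(void) {
    /* a[i] is the digit of weight 1/(i+2)! in a mixed-radix representation of
       the fractional part of e; setting every digit to 1 encodes e - 2. */
    int a[TERMS];
    for (int i = 0; i < TERMS; i++)
        a[i] = 1;
    printf("2.");
    for (int d = 0; d < DIGITS; d++) {
        /* multiply the fractional part by 10; position i has base i + 2, and
           the carry that falls out of position 0 is the next decimal digit */
        int carry = 0;
        for (int i = TERMS - 1; i >= 0; i--) {
            int v = a[i] * 10 + carry;
            a[i] = v % (i + 2);
            carry = v / (i + 2);
        }
        printf("%d", carry);
    }
    printf("\n");   /* prints 2.718281828459045235360287471352 */
    return 0;
}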

How to check for floating point precision in Double

Take this simple function for example.
int checkIfTriangleIsValid(double a, double b, double c) {
    // fix the precision problem
    int c1, c2, c3;
    c1 = a + b > c ? 0 : 1;
    c2 = b + c > a ? 0 : 1;
    c3 = c + a > b ? 0 : 1;
    if (c1 == 0 && c2 == 0 && c3 == 0)
        return 0;
    else {
        printf("%d, %d, %d\n", c1, c2, c3);
        return 1;
    }
}
I pass a = 1.923, b = 59.240, c = 61.163.
Now for some reason, when I check the condition for c1, it should give me 1, but instead it gives me 0. I tried a printf with %.30f and found that the values are slightly different from what I typed.
How can I fix this problem?
EDIT: I checked the other questions that are similar to mine but they don't even have a double.
Likely your C implementation uses the IEEE-754 basic 64-bit binary floating-point format for double. When 1.923, 59.240, and 61.163 are properly converted to the nearest values representable in double, the results are exactly:
1.9230000000000000426325641456060111522674560546875,
59.24000000000000198951966012828052043914794921875, and
61.1629999999999967030817060731351375579833984375.
As you can see, the first two of these sum to more than the third. This means that, by the time you assign these values to double objects, they have already been altered in a way that changes their relationship. No subsequent calculations can repair this, because the original information is gone.
Since no solution after conversion to double can work, you need a solution that operates before or instead of conversion to double. If you want to compute exactly, or more precisely, with the values 1.923, 59.240, and 61.163, you may need to write your own decimal arithmetic code or find some other code that supports decimal arithmetic. If you only want to work with numbers with three decimal places, then a possible solution is to write some code that reads input such as “59.240” and returns it in an integer object scaled by 1000, so that 59240 is returned. The resulting values could then easily be tested for the triangle inequality.
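For instance, a sketch of that last suggestion, assuming the inputs are non-negative and have at most three decimal places (the read_milli helper is mine, purely for illustration):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Parse a non-negative decimal such as "59.240" into an integer number of
   thousandths (59240). Assumes at most three fractional digits; shorter
   fractions are right-padded with zeros. */
static long read_milli(const char *s) {
    long whole = strtol(s, NULL, 10);
    long frac = 0;
    const char *dot = strchr(s, '.');
    if (dot) {
        char buf[4] = "000";
        for (int i = 0; i < 3 && dot[1 + i] >= '0' && dot[1 + i] <= '9'; i++)
            buf[i] = dot[1 + i];
        frac = strtol(buf, NULL, 10);
    }
    return whole * 1000 + frac;
}
int main(void) {
    long a = read_milli("1.923");    /* 1923  */
    long b = read_milli("59.240");   /* 59240 */
    long c = read_milli("61.163");   /* 61163 */
    /* 1923 + 59240 == 61163 exactly, so the strict triangle inequality fails:
       the triangle is degenerate, which is what the asker expected. */
    printf("a + b = %ld, c = %ld, strictly greater: %d\n", a + b, c, a + b > c);
    return 0;
}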
when I check for the condition in c1 it should give me 1, but instead it gives me 0
How can I fix this problem?
Change your expectations.
A typical double can represent exactly about 2^64 different values. 1.923, 59.240, and 61.163 are typically not in that set, as double is usually encoded in a binary way, e.g. binary64.
When a, b, c are assigned 1.923, 59.240, 61.163, they get values more like the below, which are the closest doubles:
a 1.923000000000000042632564145606...
b 59.240000000000001989519660128281...
c 61.162999999999996703081706073135...
Here, a and b both received a slightly higher value than their decimal source code form, while c received a slightly lower one.
When adding a+b, the sum was rounded up, further away from c.
printf("a+b %35.30f\n", a+b);
a+b 61.163000000000003808509063674137
So a + b > c was true, as were the other comparisons, and checkIfTriangleIsValid(1.923, 59.240, 61.163) returns valid (0), as it is really more like checkIfTriangleIsValid(1.9230000000000000426..., 59.24000000000000198..., 61.16299999999999670...).
Adding a+b is further complicated in that the addition may occur using double or long double math. Research FLT_EVAL_METHOD for details. Rounding mode also can affect the final sum.
#include <float.h>
printf("FLT_EVAL_METHOD %d\n", FLT_EVAL_METHOD);
As an alternative triangle check, subtract the two largest values from each other and then compare against the smallest:
a > (c - b) can preserve significantly more precision than (a + b) > c.
// Assume a, b, c >= 0
int checkIfTriangleIsValid_2(double a, double b, double c) {
    // Sort so `c` is largest, then b, a.
    if (c < b) {
        double t = b; b = c; c = t;
    }
    if (c < a) {
        double t = a; a = c; c = t;
    }
    if (a > b) {
        double t = b; b = a; a = t;
    }
    // So far, no loss of precision is expected due to compares/swaps.
    // Only now need to check a + b >= c for valid triangle
    // To preserve precision, subtract from `c` the value closest to it (`b`).
    return a > (c - b);
}
I will review more later as time permits. This approach significantly helps toward a precise answer, yet more edge cases need to be assessed. It reports a valid triangle for checkIfTriangleIsValid_2(1.923, 59.240, 61.163).
FLT_EVAL_METHOD, the rounding mode, and the double encoding can result in different answers on other platforms.
Notes:
It appears checkIfTriangleIsValid() returning 0 means a valid triangle.
It also appears that when the triangle has 0 area, the expected result is 1, i.e. invalid.

How to compare two complex numbers?

In C, complex numbers are based on float or double and have the same problems as the real floating-point types:
#include <stdio.h>
#include <complex.h>
int main(void)
{
    double complex a = 0 + I * 0;
    double complex b = 1 + I * 1;
    for (int i = 0; i < 10; i++) {
        a += .1 + I * .1;
    }
    if (a == b) {
        puts("Ok");
    }
    else {
        printf("Fail: %f + i%f != %f + i%f\n", creal(a), cimag(a), creal(b), cimag(b));
    }
    return 0;
}
The result:
$ clang main.c
$ ./a.out
Fail: 1.000000 + i1.000000 != 1.000000 + i1.000000
I tried this syntax:
a - b < DBL_EPSILON + I * DBL_EPSILON
But the compiler hates it:
main.c:24:15: error: invalid operands to binary expression ('_Complex double' and '_Complex double')
if (a - b < DBL_EPSILON + I * DBL_EPSILON) {
~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This last one works fine, but it's a little tedious:
fabs(creal(a) - creal(b)) < DBL_EPSILON && fabs(cimag(a) - cimag(b)) < DBL_EPSILON
Comparing 2 complex floating point numbers is much like comparing 2 real floating point numbers.
Comparing for exact equivalences often is insufficient as the numbers involved contain small computational errors.
So rather than if (a == b), the code needs to be if (nearlyequal(a, b)).
The usual approach is double diff = cabs(a - b) and then comparing diff to some small constant value like DBL_EPSILON.
This fails when a, b are large numbers, as their difference may be many orders of magnitude larger than DBL_EPSILON even though a, b differ only in their least significant bit.
This fails for small numbers too, as the difference between a, b may be relatively great yet many orders of magnitude smaller than DBL_EPSILON, so the test returns true even though the values are relatively quite different.
Complex numbers literally add another dimensional problem to the issue as the real and imaginary components themselves may be greatly different. Thus the best answer for nearlyequal(a,b) is highly dependent on the code's goals.
For simplicity, let us use the magnitude of the difference as compared to the average magnitude of a and b. A control constant ULP_N approximates the number of least-significant binary digits by which a and b are allowed to differ.
#include <complex.h>
#include <float.h>
#include <stdbool.h>
#define ULP_N 4
bool nearlyequal(complex double a, complex double b) {
    double diff = cabs(a - b);
    double mag = (cabs(a) + cabs(b)) / 2;
    return diff <= (mag * DBL_EPSILON * (1ull << ULP_N));
}
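For example, feeding it the accumulation from the question (a self-contained sketch that simply repeats the function above; the output format is my choice):
#include <stdio.h>
#include <stdbool.h>
#include <complex.h>
#include <float.h>
#define ULP_N 4
static bool nearlyequal(double complex a, double complex b) {
    double diff = cabs(a - b);
    double mag = (cabs(a) + cabs(b)) / 2;
    return diff <= (mag * DBL_EPSILON * (1ull << ULP_N));
}
int main(void) {
    double complex a = 0, b = 1 + I * 1;
    for (int i = 0; i < 10; i++)
        a += .1 + I * .1;                            /* the question's loop */
    printf("exact ==    : %d\n", a == b);            /* 0: exact compare fails */
    printf("nearlyequal : %d\n", nearlyequal(a, b)); /* 1: within a few ULPs   */
    return 0;
}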
Instead of comparing the complex number components, you can compute the complex absolute value (also known as norm, modulus or magnitude) of their difference, which is the distance between the two on the complex plane:
if (cabs(a - b) < DBL_EPSILON) {
    // complex numbers are close
}
Small complex numbers will appear to be close to zero even if there is no precision issue, a separate issue that is also present for real numbers.
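For example (values chosen by me to make the point), two tiny complex numbers that are twelve orders of magnitude apart still pass an absolute DBL_EPSILON test:
#include <stdio.h>
#include <complex.h>
#include <float.h>
int main(void) {
    double complex a = 1e-30 + I * 1e-30;
    double complex b = 1e-18 + I * 1e-18;
    /* |a - b| is about 1.4e-18, far below DBL_EPSILON (about 2.2e-16), so the
       absolute test calls them "close" despite the huge relative difference */
    printf("%d\n", cabs(a - b) < DBL_EPSILON);   /* prints 1 */
    return 0;
}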
Since complex numbers are represented as floating point numbers, you have to deal with their inherent imprecision. Floating point numbers are "close enough" if they're within the machine epsilon.
The usual way is to subtract them, take the absolute value, and see if it's close enough.
#include <complex.h>
#include <stdbool.h>
#include <float.h>
static inline bool ceq(double complex a, double complex b) {
    return cabs(a - b) < DBL_EPSILON;
}

Comparing two numbers without comparison operators

As part of a program that I am writing for an assignment, I need to compare two numbers. Essentially, the program computes the eccentricity of an ellipse given its two axes and it has to compare the value of the calculated eccentricity to the (given) eccentricity of the Moon's orbit around the Earth, and Earth's orbit around the Sun. If the calculated eccentricity is greater than the given eccentricity, then this needs to be represented by a value of 1, otherwise, a value of 0. All of these values are floating-point, specifically, long double.
The constraints of the assignment do not allow me to use comparison operators (like >) or any sort of logic (!x or if-else). However, I am allowed to use the pow and sqrt functions from the math.h library. I am also allowed to use arithmetic operations as well as the modulo operation.
I know that I can take advantage of integer division to truncate the decimal if the denominator is greater than the numerator, i.e.:
int x = eccentricity / MOON_ORBIT_ECCENTRICITY;
... will be 0 if MOON_ORBIT_ECCENTRICITY is greater than eccentricity. However, if this relationship is inverted, then the value of x could be any non-zero integer. In such a case, the desired result is 1.
The first and most intuitive (and naïve) solution was:
int y = (x / x);
This will return 1 if x is non-zero. However, if x is 0, then my program crashes due to division by zero. In fact, I keep running into the problem of dividing by zero. This also happens in the case of:
int y = (x + 1) % x;
Does anyone have an idea of how to solve this? This seems so frustratingly easy.
@lurker's comment above is a good approach to handling eccentricity as restricted by the OP.
So as not to copy that, consider the following not-so-serious approach:
#include <stdio.h>
#include <string.h>
// Return e1 > e2
int Eccentricity_Compare(long double e1, long double e2) {
    char buf[20];
    // print a number beginning with
    //   if e2 >= e1: `+`
    //   else:        `-`
    sprintf(buf, "%+Le", e2 - e1); // reversed subtraction for the `=` case
    const char *pm = "+-";
    char *p = strchr(pm, buf[0]);
    return (int) (p - pm);
}
Wink, wink: OP said nothing about <stdio.h> functions.

Compare two floats

#include <stdbool.h>
#include <math.h>
bool Equality(double a, double b, double epsilon)
{
    if (fabs(a - b) < epsilon) return true;
    return false;
}
I tried this method to compare two doubles, but I always have problems since I don't know how to choose the epsilon; actually I want to compare small numbers (6 digits after the decimal point) like 0.000001. I tried with some numbers: sometimes I get 0.000001 != 0.000001 and sometimes 0.000001 == 0.000002.
Is there another method other than comparing with an epsilon?
My purpose is to compare two doubles (which represent times in my case). The variable t, which represents the time in milliseconds, is a double. It is incremented by another function: 0.000001, then 0.000002, etc. Each time t changes, I want to check whether it is equal to another double variable tt; in case tt == t, I have some instructions to execute.
Thanks for your help
Look here: http://floating-point-gui.de/errors/comparison/
Due to rounding errors, most floating-point numbers end up being slightly imprecise. As long as this imprecision stays small, it can usually be ignored. However, it also means that numbers expected to be equal (e.g. when calculating the same result through different correct methods) often differ slightly, and a simple equality test fails.
And, of course, What Every Computer Scientist Should Know About Floating-Point Arithmetic
First: there's no point in computing a boolean value (with the < operator) and then wrapping that in another boolean. Just write it like this:
bool Equality(float a, float b, float epsilon)
{
    return fabs(a - b) < epsilon;
}
Second, it's possible that your epsilon itself isn't well-represented as a float, and thus doesn't look like what you expect. Try with a negative power of 2, such as 1/1048576 for instance.
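For instance (a sketch; the choice of 2^-20 is just the suggestion above written as a C99 hexadecimal float constant):
#include <stdio.h>
#include <math.h>
#include <stdbool.h>
#define EPSILON 0x1p-20f   /* exactly 1.0f/1048576, a negative power of 2 */
static bool Equality(float a, float b, float epsilon)
{
    return fabsf(a - b) < epsilon;
}
int main(void) {
    printf("%d\n", Equality(0.000001f, 0.000001f, EPSILON)); /* 1: equal     */
    printf("%d\n", Equality(0.000001f, 0.000002f, EPSILON)); /* 0: not equal */
    return 0;
}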
Alternatively, you could compare two integers instead. Just multiply your two floats by the desired precision and cast them to integers.
Be sure to round up/down correctly. Here is what it looks like:
#include <stdbool.h>
bool floatcmp(float float1, float float2, unsigned int precision) {
    int int1, int2;
    if (float1 > 0)
        int1 = (int)(float1 * precision + .5);
    else
        int1 = (int)(float1 * precision - .5);
    if (float2 > 0)
        int2 = (int)(float2 * precision + .5);
    else
        int2 = (int)(float2 * precision - .5);
    return (int1 == int2);
}
Keep in mind that when float a = +2^127 * 1.___22 zeros___1 and float b = +2^127 * 1.___23 zeros___, then we might expect fabs(a - b) < epsilon, but instead a - b = +2^104 * 1.___23 zeros___ = 20282409603651670423947251286016, which is much bigger than epsilon...
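To see that concretely (a sketch; nextafterf is just a convenient way to get the float exactly one ULP above 2^127):
#include <stdio.h>
#include <math.h>
#include <float.h>
int main(void) {
    float b = 0x1.0p127f;               /* +2^127 * 1.000...0 */
    float a = nextafterf(b, INFINITY);  /* +2^127 * 1.000...01, one ULP above b */
    printf("a - b       = %g\n", (double)(a - b));   /* about 2.03e+31 (2^104) */
    printf("FLT_EPSILON = %g\n", (double)FLT_EPSILON);
    return 0;
}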
