How can I check if a float variable contains an integer value? So far, I've been using:
float f = 4.5886;
if (f-(int)f == 0)
printf("yes\n");
else printf("no\n");
But I wonder if there is a better solution, or if this one has any (or many) drawbacks.
Apart from the fine answers already given, you can also use ceilf(f) == f or floorf(f) == f. Both expressions return true if f is an integer. They also returnfalse for NaNs (NaNs always compare unequal) and true for ±infinity, and don't have the problem with overflowing the integer type used to hold the truncated result, because floorf()/ceilf() return floats.
Keep in mind that most of the techniques here are valid presuming that round-off error due to prior calculations is not a factor. E.g. you could use roundf, like this:
float z = 1.0f;
if (roundf(z) == z) {
printf("integer\n");
} else {
printf("fraction\n");
}
The problem with this and other similar techniques (such as ceilf, casting to long, etc.) is that, while they work great for whole number constants, they will fail if the number is a result of a calculation that was subject to floating-point round-off error. For example:
float z = powf(powf(3.0f, 0.05f), 20.0f);
if (roundf(z) == z) {
printf("integer\n");
} else {
printf("fraction\n");
}
Prints "fraction", even though (31/20)20 should equal 3, because the actual calculation result ended up being 2.9999992847442626953125.
Any similar method, be it fmodf or whatever, is subject to this. In applications that perform complex or rounding-prone calculations, usually what you want to do is define some "tolerance" value for what constitutes a "whole number" (this goes for floating-point equality comparisons in general). We often call this tolerance epsilon. For example, lets say that we'll forgive the computer for up to +/- 0.00001 rounding error. Then, if we are testing z, we can choose an epsilon of 0.00001 and do:
if (fabsf(roundf(z) - z) <= 0.00001f) {
printf("integer\n");
} else {
printf("fraction\n");
}
You don't really want to use ceilf here because e.g. ceilf(1.0000001) is 2 not 1, and ceilf(-1.99999999) is -1 not -2.
You could use rintf in place of roundf if you prefer.
Choose a tolerance value that is appropriate for your application (and yes, sometimes zero tolerance is appropriate). For more information, check out this article on comparing floating-point numbers.
if (fmod(f, 1) == 0.0) {
...
}
Don't forget math.h and libm.
stdlib float modf (float x, float *ipart) splits into two parts, check if return value (fractional part) == 0.
if (f <= LONG_MIN || f >= LONG_MAX || f == (long)f) /* it's an integer */
This deals with computational round-off. You set the epsilon as desired:
bool IsInteger(float value)
{
return fabs(ceilf(value) - value) < EPSILON;
}
I'm not 100% sure but when you cast f to an int, and subtract it from f, I believe it is getting cast back to a float. This probably won't matter in this case, but it could present problems down the line if you are expecting that to be an int for some reason.
I don't know if it's a better solution per se, but you could use modulus math instead, for example:
float f = 4.5886;
bool isInt;
isInt = (f % 1.0 != 0) ? false : true;
depending on your compiler you may or not need the .0 after the 1, again the whole implicit casts thing comes into play. In this code, the bool isInt should be true if the right of the decimal point is all zeroes, and false otherwise.
#define twop22 (0x1.0p+22)
#define ABS(x) (fabs(x))
#define isFloatInteger(x) ((ABS(x) >= twop22) || (((ABS(x) + twop22) - twop22) == ABS(x)))
Related
I´m looking for an alternative for the ceil() and floor() functions in C, due to I am not allowed to use these in a project.
What I have build so far is a tricky back and forth way by the use of the cast operator and with that the conversion from a floating-point value (in my case a double) into an int and later as I need the closest integers, above and below the given floating-point value, to be also double values, back to double:
#include <stdio.h>
int main(void) {
double original = 124.576;
double floorint;
double ceilint;
int f;
int c;
f = (int)original; //Truncation to closest floor integer value
c = f + 1;
floorint = (double)f;
ceilint = (double)c;
printf("Original Value: %lf, Floor Int: %lf , Ceil Int: %lf", original, floorint, ceilint);
}
Output:
Original Value: 124.576000, Floor Int: 124.000000 , Ceil Int: 125.000000
For this example normally I would not need the ceil and floor integer values of c and f to be converted back to double but I need them in double in my real program. Consider that as a requirement for the task.
Although the output is giving the desired values and seems right so far, I´m still in concern if this method is really that right and appropriate or, to say it more clearly, if this method does bring any bad behavior or issue into the program or gives me a performance-loss in comparison to other alternatives, if there are any other possible alternatives.
Do you know a better alternative? And if so, why this one should be better?
Thank you very much.
Do you know a better alternative? And if so, why this one should be better?
OP'code fails:
original is already a whole number.
original is a negative like -1.5. Truncation is not floor there.
original is just outside int range.
original is not-a-number.
Alternative construction
double my_ceil(double x)
Using the cast to some integer type trick is a problem when x is outsize the integer range. So check first if x is inside range of a wide enough integer (one whose precision exceeds double). x values outside that are already whole numbers. Recommend to go for the widest integer (u)intmax_t.
Remember that a cast to an integer is a round toward 0 and not a floor. Different handling needed if x is negative/positive when code is ceil() or floor(). OP's code missed this.
I'd avoid if (x >= INTMAX_MAX) { as that involves (double) INTMAX_MAX whose rounding and then precise value is "chosen in an implementation-defined manner". Instead, I'd compare against INTMAX_MAX_P1. some_integer_MAX is a Mersenne Number and with 2's complement, ...MIN is a negated "power of 2".
#include <inttypes.h>
#define INTMAX_MAX_P1 ((INTMAX_MAX/2 + 1)*2.0)
double my_ceil(double x) {
if (x >= INTMAX_MAX_P1) {
return x;
}
if (x < INTMAX_MIN) {
return x;
}
intmax_t i = (intmax_t) x; // this rounds towards 0
if (i < 0 || x == i) return i; // negative x is already rounded up.
return i + 1.0;
}
As x may be a not-a-number, it is more useful to reverse the compare as relational compare of a NaN is false.
double my_ceil(double x) {
if (x >= INTMAX_MIN && x < INTMAX_MAX_P1) {
intmax_t i = (intmax_t) x; // this rounds towards 0
if (i < 0 || x == i) return i; // negative x is already rounded up.
return i + 1.0;
}
return x;
}
double my_floor(double x) {
if (x >= INTMAX_MIN && x < INTMAX_MAX_P1) {
intmax_t i = (intmax_t) x; // this rounds towards 0
if (i > 0 || x == i) return i; // positive x is already rounded down.
return i - 1.0;
}
return x;
}
You're missing an important step: you need to check if the number is already integral, so for ceil assuming non-negative numbers (generalisation is trivial), use something like
double ceil(double f){
if (f >= LLONG_MAX){
// f will be integral unless you have a really funky platform
return f;
} else {
long long i = f;
return 0.0 + i + (f != i); // to obviate potential long long overflow
}
}
Another missing piece in the puzzle, which is covered off by my enclosing if, is to check if f is within the bounds of a long long. On common platforms if f was outside the bounds of a long long then it would be integral anyway.
Note that floor is trivial due to the fact that truncation to long long is always towards zero.
I have one double, and one int64_t. I want to know if they hold exactly the same value, and if converting one type into the other does not lose any information.
My current implementation is the following:
int int64EqualsDouble(int64_t i, double d) {
return (d >= INT64_MIN)
&& (d < INT64_MAX)
&& (round(d) == d)
&& (i == (int64_t)d);
}
My question is: is this implementation correct? And if not, what would be a correct answer? To be correct, it must leave no false positive, and no false negative.
Some sample inputs:
int64EqualsDouble(0, 0.0) should return 1
int64EqualsDouble(1, 1.0) should return 1
int64EqualsDouble(0x3FFFFFFFFFFFFFFF, (double)0x3FFFFFFFFFFFFFFF) should return 0, because 2^62 - 1 can be exactly represented with int64_t, but not with double.
int64EqualsDouble(0x4000000000000000, (double)0x4000000000000000) should return 1, because 2^62 can be exactly represented in both int64_t and double.
int64EqualsDouble(INT64_MAX, (double)INT64_MAX) should return 0, because INT64_MAX can not be exactly represented as a double
int64EqualsDouble(..., 1.0e100) should return 0, because 1.0e100 can not be exactly represented as an int64_t.
Yes, your solution works correctly because it was designed to do so, because int64_t is represented in two's complement by definition (C99 7.18.1.1:1), on platforms that use something resembling binary IEEE 754 double-precision for the double type. It is basically the same as this one.
Under these conditions:
d < INT64_MAX is correct because it is equivalent to d < (double) INT64_MAX and in the conversion to double, the number INT64_MAX, equal to 0x7fffffffffffffff, rounds up. Thus you want d to be strictly less than the resulting double to avoid triggering UB when executing (int64_t)d.
On the other hand, INT64_MIN, being -0x8000000000000000, is exactly representable, meaning that a double that is equal to (double)INT64_MIN can be equal to some int64_t and should not be excluded (and such a double can be converted to int64_t without triggering undefined behavior)
It goes without saying that since we have specifically used the assumptions about 2's complement for integers and binary floating-point, the correctness of the code is not guaranteed by this reasoning on platforms that differ. Take a platform with binary 64-bit floating-point and a 64-bit 1's complement integer type T. On that platform T_MIN is -0x7fffffffffffffff. The conversion to double of that number rounds down, resulting in -0x1.0p63. On that platform, using your program as it is written, using -0x1.0p63 for d makes the first three conditions true, resulting in undefined behavior in (T)d, because overflow in the conversion from integer to floating-point is undefined behavior.
If you have access to full IEEE 754 features, there is a shorter solution:
#include <fenv.h>
…
#pragma STDC FENV_ACCESS ON
feclearexcept(FE_INEXACT), f == i && !fetestexcept(FE_INEXACT)
This solution takes advantage of the conversion from integer to floating-point setting the INEXACT flag iff the conversion is inexact (that is, if i is not representable exactly as a double).
The INEXACT flag remains unset and f is equal to (double)i if and only if f and i represent the same mathematical value in their respective types.
This approach requires the compiler to have been warned that the code accesses the FPU's state, normally with #pragma STDC FENV_ACCESS on but that's typically not supported and you have to use a compilation flag instead.
OP's code has a dependency that can be avoided.
For a successful compare, d must be a whole number and round(d) == d takes care of that. Even d, as a NaN would fail that.
d must be mathematically in the range of [INT64_MIN ... INT64_MAX] and if the if conditions properly insure that, then the final i == (int64_t)d completes the test.
So the question comes down to comparing INT64 limits with the double d.
Let us assume FLT_RADIX == 2, but not necessarily IEEE 754 binary64.
d >= INT64_MIN is not a problem as -INT64_MIN is a power of 2 and exactly converts to a double of the same value, so the >= is exact.
Code would like to do the mathematical d <= INT64_MAX, but that may not work and so a problem. INT64_MAX is a "power of 2 - 1" and may not convert exactly - it depends on if the precision of the double exceeds 63 bits - rendering the compare unclear. A solution is to halve the comparison. d/2 suffers no precision loss and INT64_MAX/2 + 1 converts exactly to a double power-of-2
d/2 < (INT64_MAX/2 + 1)
[Edit]
// or simply
d < ((double)(INT64_MAX/2 + 1))*2
Thus if code does not want to rely on the double having less precision than uint64_t. (Something that likely applies with long double) a more portable solution would be
int int64EqualsDouble(int64_t i, double d) {
return (d >= INT64_MIN)
&& (d < ((double)(INT64_MAX/2 + 1))*2) // (d/2 < (INT64_MAX/2 + 1))
&& (round(d) == d)
&& (i == (int64_t)d);
}
Note: No rounding mode issues.
[Edit] Deeper limit explanation
Insuring mathematically, INT64_MIN <= d <= INT64_MAX, can be re-stated as INT64_MIN <= d < (INT64_MAX + 1) as we are dealing with whole numbers. Since the raw application of (double) (INT64_MAX + 1) in code is certainly 0, an alternative, is ((double)(INT64_MAX/2 + 1))*2. This can be extended for rare machines with double of higher powers-of-2 to ((double)(INT64_MAX/FLT_RADIX + 1))*FLT_RADIX. The comparison limits being exact powers-of-2, conversion to double suffers no precision loss and (lo_limit >= d) && (d < hi_limit) is exact, regardless of the precision of the floating point. Note: that a rare floating point with FLT_RADIX == 10 is still a problem.
In addition to Pascal Cuoq's elaborate answer, and given the extra context you give in comments, I would add a test for negative zeros. You should preserve negative zeros unless you have good reasons not to. You need a specific test to avoid converting them to (int64_t)0. With your current proposal, negative zeros will pass your test, get stored as int64_t and read back as positive zeros.
I am not sure what is the most efficient way to test them, maybe this:
int int64EqualsDouble(int64_t i, double d) {
return (d >= INT64_MIN)
&& (d < INT64_MAX)
&& (round(d) == d)
&& (i == (int64_t)d
&& (!signbit(d) || d != 0.0);
}
This question already has answers here:
Checking if float is an integer
(8 answers)
Closed 8 years ago.
I have a program in which i need to print FLOAT in case of a float number or print INTEGER in case of a regular number.
for Example pseudo code
float num = 1.5;
if (num mod sizeof(int)==0)
printf ("INTEGER");
else
printf("FLOAT");
For example:
1.6 would print "FLOAT"
1.0 would print "INTEGER"
Will something like this work?
All float types have the same size, so your method won't work. You can check if a float is an integer by using ceilf
float num = 1.5;
if (ceilf(num) == num)
printf ("INTEGER");
else
printf("FLOAT");
You can use modff():
const char * foo (float num) {
float x;
modff(num, &x);
return (num == x) ? "INTEGER" : "FLOAT";
}
modff() will take a float argument, and break it into its integer and fractional parts. It stores the integer part in the second argument, and the fractional part is returned.
The "easy" way, but with a catch:
You could use roundf, like this:
float z = 1.0f;
if (roundf(z) == z) {
printf("integer\n");
} else {
printf("fraction\n");
}
The problem with this and other similar techniques (such as ceilf) is that, while they work great for whole number constants, they will fail if the number is a result of a calculation that was subject to floating-point round-off error. For example:
float z = powf(powf(3.0f, 0.05f), 20.0f);
if (roundf(z) == z) {
printf("integer\n");
} else {
printf("fraction\n");
}
Prints "fraction", even though (31/20)20 should equal 3, because the actual calculation result ended up being 2.9999992847442626953125.
So how do we deal with this?
Any similar method, be it fmodf or whatever, is subject to this. In applications that perform complex or rounding-prone calculations, usually what you want to do is define some "tolerance" value for what constitutes a "whole number" (this goes for floating-point equality comparisons in general). We often call this tolerance epsilon. For example, lets say that we'll forgive the computer for up to +/- 0.00001 rounding error. Then, if we are testing z, we can choose an epsilon of 0.00001 and do:
if (fabsf(roundf(z) - z) <= 0.00001f) {
printf("integer\n");
} else {
printf("fraction\n");
}
You don't really want to use ceilf here because e.g. ceilf(1.0000001) is 2 not 1, and ceilf(-1.99999999) is -1 not -2.
Choose a tolerance value that is appropriate for your application. For more information, check out this article on comparing floating-point numbers.
Will something like this work?
No. For example on the x86_32 and ARM 32 bit architectures sizeof(int) == 4 and sizeof(float) == 4.
Also whatever you think mod is, it clearly shows you don't understand what the sizeof operator does.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
C compiler bug (floating point arithmetic)?
I've got two doubles which I can guarantee are exactly equal to 150 decimal places - ie. the following code:
printf("***current line time is %5.150lf\n", current_line->time);
printf("***time for comparison is %5.150lf\n", (last_stage_four_print_time + FIVE_MINUTES_IN_DAYS));
...returns:
***current line time is 39346.526736111096397507935762405395507812500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
***time for comparison is 39346.526736111096397507935762405395507812500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
FIVE_MINUTES_IN_DAYS is #defined, and current_line->time and last_stage_four_print_time are both doubles.
My problem is that the next line of my debugging code:
printf("if condition is %d\n", (current_line->time >= (last_stage_four_print_time + FIVE_MINUTES_IN_DAYS)));
returns the following:
if condition is 0
Can anyone tell me what's going on here? I am aware of the non-decimal/inexact nature of floats and doubles but these are not subject to any error at all (the original figures have all been read with sscanf or #defined and are all specified to 10 decimal places).
EDIT: My mistake was assuming that printf-ing the doubles accurately represented them in memory, which was wrong because one value is being calculated on-the-fly. Declaring (last_stage_four_print_time + FIVE_MINUTES_IN_DAYS) as threshold_time and using that instead fixed the problem. I will make sure to use an epsilon for my comparisons - I knew that was the way to go, I was just confused as to why these values which I (incorrectly) thought looked the same were apparently inequal.
Floats certainly are not accurate to 150 significant digits, so I 'm not sure what conclusion can be drawn from the "visual" comparison (if any).
On the other hand, the values are obviously not bit-identical (and how could they be, since one of them is calculated on the spot with addition?). So it's not really clear why the behavior you see is unexpected.
Don't ever compare floats like that, just do the standard comparison of difference vs epsilon.
Read about floating point representation (particularly http://en.wikipedia.org/wiki/IEEE_754-2008). Try printing the actual contents of the bytes containing the doubles as hexadecimal and they won't match bit for bit.
The proper comparison for floats is in Knuth (Seminumerical algorithms). Simply (replace bool with int and float with double, true with 1):
bool almostEqual(float x, float y, float epsilon)
{
if (x == 0.0 && y == 0.0) {
return true;
}
if (fabs(x) > fabs(y)) {
return fabs((x - y) / x) < epsilon;
} else {
return fabs((x - y) / y) < epsilon;
}
}
You should always use an EPSILON value for comparison of floats and doubles to check for equality. Even though it looks the same the internal representation is not guaranteed to match because of the way these types of numbers are represented in binary.
You can do something like
#define EPSILON 0.00001
...
if (fabs(a - b) <= EPSILON) return 1; // they are equal
return 0;
Jesus is right about how to solve this.
As for why... in one case you read in a constant value, in the other case you perform an addition operation. Even if the printed output is exactly the same, the binary representation can be slightly different.
Try inspecting the memory backing the two double's and see if any bits are different (there will be differences).
For a comprehensive treatment, I recommend
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
In general you shouldn't use == to compare floats or doubles. You should instead check that the difference is smaller than some small number.
double second_number = last_stage_four_print_time + FIVE_MINUTES_IN_DAYS;
if (fabs(current_line->time - second_number) < 0.001 || current_line->time > second_number){
// your comparison code
}
First, doubles have just 15-16 decimal places (log_2 of 52 bit matissa).
Second, if you want to compare, use the already mentioned epsilon.
Thirdly, for debugging, print the hex value.
#include <stdbool.h>
bool Equality(double a, double b, double epsilon)
{
if (fabs(a-b) < epsilon) return true;
return false;
}
I tried this method to compare two doubles, but I always get problems since I don't know how to chose the epsilon, actually I want to compare small numbers (6 6 digits after the decimal point) like 0.000001. I tried with some numbers, sometimes I get 0.000001 != 0.000001 and sometimes 0.000001 == 0.000002
Is there another method else than comparing with the epsilon?
My purpose is to compare two doubles (which represent the time in my case). The variable t which represents the time in milliseconds is a double. It is incremented by another function 0.000001 then 0.000002 etc. each time t changes, I want to check if it is equal to another variable of type double tt, in case tt == t, I have some instructions to execute..
Thanks for your help
Look here: http://floating-point-gui.de/errors/comparison/
Due to rounding errors, most floating-point numbers end up being
slightly imprecise. As long as this imprecision stays small, it can
usually be ignored. However, it also means that numbers expected to be
equal (e.g. when calculating the same result through different correct
methods) often differ slightly, and a simple equality test fails.
And, of course, What Every Computer Scientist Should Know About Floating-Point Arithmetic
First: there's no point in computing a boolean value (with the < operator) and then wrapping that in another boolean. Just write it like this:
bool Equality(float a, float b, float epsilon)
{
return fabs(a - b) < epsilon;
}
Second, it's possible that your epsilon itself isn't well-represented as a float, and thus doesn't look like what you expect. Try with a negative power of 2, such as 1/1048576 for instance.
Alternatively, you could compare two integers instead. Just multiply your two floats by the desired precision and cast them to integers.
Be sure to round up/down correctly. Here is what it looks like:
BOOL floatcmp(float float1, float float2, unsigned int precision){
int int1, int2;
if (float1 > 0)
int1 = (int)(float1 * precision + .5);
else
int1 = (int)(float1 * precision - .5);
if (float2 > 0)
int2 = (int)(float2 * precision + .5);
else
int2 = (int)(float2 * precision - .5);
return (int1 == int2);
}
Keep in mind that when float a = +2^(254-127) * 1.___22 zeros___1 and float b = +2^(254-127) * 1.___23 zeros___ then we expect abs(a-b) < epsilon but instead a - b = +2^(254-127-23) * 1.___23 zeros___ = 20282409603651670423947251286000 which is much bigger than epsilon...