I am writing a big program in C as part of my homework. My problem is that my program is outputting x = -0.00 instead of x = 0.00. I have tried a comparison like if(x==-0.00) x=fabs(x), but I've read that it won't work like that with doubles. So my question is: are there any other ways to check whether a double is equal to negative zero?
You can use the standard macro signbit(arg) from math.h. It will return nonzero value if arg is negative and 0 otherwise.
From the man page:
signbit() is a generic macro which can work on all real floating-
point types. It returns a nonzero value if the value of x has its
sign bit set.
This is not the same as x < 0.0, because IEEE 754 floating point
allows zero to be signed. The comparison -0.0 < 0.0 is false, but
signbit(-0.0) will return a nonzero value.
NaNs and infinities have a sign bit.
Also, from cppreference.com:
This macro detects the sign bit of zeroes, infinities, and NaNs. Along
with copysign, this macro is one of the only two portable ways to
examine the sign of a NaN.
Very few calculations actually give you a signed negative zero. What you're probably observing is a negative value close to zero that has been truncated by your formatting choice when outputting the value.
Note that -0.0 is defined to be equal to 0.0, so a simple comparison to 0.0 is enough to detect a zero of either sign; the comparison alone cannot tell the two zeros apart.
If you want to convert an exact signed zero -0.0 to 0.0 then add 0.0 to it.
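A minimal sketch of the add-zero trick, assuming IEEE 754 arithmetic in the default round-to-nearest mode and no compiler options such as -ffast-math or -fno-signed-zeros. The helper name drop_negative_zero is made up for illustration:

```c
/* Hypothetical helper: returns x unchanged, except that -0.0 becomes +0.0.
   Relies on the IEEE 754 rule that (-0.0) + (+0.0) is +0.0 when rounding
   to nearest; options like -ffast-math may defeat it. */
double drop_negative_zero(double x)
{
    return x + 0.0;
}
```

Note that drop_negative_zero(-0.001) is still -0.001, so a small negative value would still print as -0.00 with %.2f; the trick only helps with an exact negative zero.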
Most likely, your program has a small negative value, not zero, which printf formats as “-0.00”. To print such numbers as “0.00”, you can test how printf will format them and replace the undesired string with the desired string:
#include <stdio.h>
#include <string.h>
void PrintAdjusted(double x)
{
char buffer[6];
int result = snprintf(buffer, sizeof buffer, "%.2f", x);
/* If snprintf produces a result other than "-0.00", including
a result that does not fit in the buffer, use it.
Otherwise, print "0.00".
*/
if (sizeof buffer <= result || strcmp(buffer, "-0.00") != 0)
printf("%.2f", x);
else
printf("0.00");
}
This is portable. Alternatives such as comparing the number to -0.005 have portability issues, due to implementation-dependent details in floating-point formats and rounding methods in printf.
If you truly do want to test whether a number x is −0, you can use:
#include <math.h>
…
signbit(x) && x == 0
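Wrapped in a function, the test above might look like the sketch below. The name is_negative_zero is hypothetical; signbit comes from <math.h> (C99):

```c
#include <math.h>     /* signbit */
#include <stdbool.h>

/* Hypothetical helper: true only for negative zero.
   x == 0.0 accepts both zeros and rejects NaN and every nonzero value;
   signbit then picks out the negative zero. */
bool is_negative_zero(double x)
{
    return x == 0.0 && signbit(x);
}
```

A small negative value such as -0.001 is not reported as negative zero, even though it prints as -0.00 with %.2f.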
There are two functions you need here.
First, the signbit function can tell you if the sign bit is set on a floating point number. Second, the fpclassify function will tell you if a floating point number is some form of 0.
For example:
double x = 0.0;
double y = -0.0;
double a = 3;
double b = -2;
printf("x=%f, y=%f\n", x, y);
printf("x is zero: %d\n", (fpclassify(x) == FP_ZERO));
printf("y is zero: %d\n", (fpclassify(y) == FP_ZERO));
printf("a is zero: %d\n", (fpclassify(a) == FP_ZERO));
printf("b is zero: %d\n", (fpclassify(b) == FP_ZERO));
printf("x sign: %d\n", signbit(x));
printf("y sign: %d\n", signbit(y));
printf("a sign: %d\n", signbit(a));
printf("b sign: %d\n", signbit(b));
Output:
x=0.000000, y=-0.000000
x is zero: 1
y is zero: 1
a is zero: 0
b is zero: 0
x sign: 0
y sign: 1
a sign: 0
b sign: 1
So to check if a value is negative zero, do the following:
if (fpclassify(x) == FP_ZERO) {
if (signbit(x)) {
printf("x is negative zero\n");
} else {
printf("x is positive zero\n");
}
}
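To always get the non-negative version, you don't need the comparison at all.
You can take the absolute value every time: if the value is non-negative, fabs returns the original value. Keep in mind that fabs also flips the sign of every other negative value, so this only fits when x is known to be zero or positive.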
Suppose I have a floating-point value of type float or double (i.e. 32 or 64 bits on typical machines). I want to print this value as text (e.g. to the standard output stream), and then later, in some other process, scan it back in - with fscanf() if I'm using C, or perhaps with istream::operator>>() if I'm using C++. But - I need the scanned float to end up being exactly, identical to the original value (up to equivalent representations of the same value). Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
How should I do this? I'm assuming the standard library of (C and C++) won't be sufficient, but perhaps I'm wrong. I suppose that a sufficient number of decimal digits might be able to guarantee an error that's underneath the precision threshold - but that's not the same as guaranteeing the rounding/truncation will happen just the way I want it.
Notes:
The scanning does not have to be perfectly accurate w.r.t. the value it scans, only the original value.
If it makes it easier, you may assume the value is a number and is not infinity.
denormal support is desired but not required; still if we get a denormal, failure should be conspicuous.
First, you should use the %a format with fprintf and fscanf. This is what it was designed for, and the C standard requires it to work (reproduce the original number) if the implementation uses binary floating-point.
Failing that, you should print a float with at least FLT_DECIMAL_DIG significant digits and a double with at least DBL_DECIMAL_DIG significant digits. Those constants are defined in <float.h> and are defined:
… number of decimal digits, n, such that any floating-point number with p radix b digits can be rounded to a floating-point number with n decimal digits and back again without change to the value,… [b is the base used for the floating-point format, defined in FLT_RADIX, and p is the number of base-b digits in the format.]
For example:
printf("%.*g\n", FLT_DECIMAL_DIG, 1.f/3);
or:
#define QuoteHelper(x) #x
#define Quote(x) QuoteHelper(x)
…
printf("%." Quote(FLT_DECIMAL_DIG) "g\n", 1.f/3);
In C++, these constants are defined in <limits> as std::numeric_limits<Type>::max_digits10, where Type is float or double or another floating-point type.
Note that the C standard only recommends that such a round-trip through a decimal numeral work; it does not require it. For example, C 2018 5.2.4.2.2 15 says, under the heading “Recommended practice”:
Conversion from (at least) double to decimal with DECIMAL_DIG digits and back should be the identity function. [DECIMAL_DIG is the equivalent of FLT_DECIMAL_DIG or DBL_DECIMAL_DIG for the widest floating-point format supported in the implementation.]
In contrast, if you use %a, and FLT_RADIX is a power of two (meaning the implementation uses a floating-point base that is two, 16, or another power of two), then C standard requires that the result of scanning the numeral produced with %a equals the original number.
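Both approaches can be tried with a short sketch. It assumes a C11 library (for DBL_DECIMAL_DIG) and binary floating point; the helper names roundtrip_hex and roundtrip_decimal are hypothetical, and strtod is used for the parsing side instead of fscanf:

```c
#include <float.h>   /* DBL_DECIMAL_DIG (C11) */
#include <stdio.h>   /* snprintf */
#include <stdlib.h>  /* strtod */

/* Hypothetical helper: print x as hexadecimal floating point, parse it back. */
double roundtrip_hex(double x)
{
    char text[64];
    snprintf(text, sizeof text, "%a", x);
    return strtod(text, NULL);
}

/* Hypothetical helper: the same round trip through a decimal numeral with
   DBL_DECIMAL_DIG significant digits. The standard only recommends that
   this reproduces x; it does not require it. */
double roundtrip_decimal(double x)
{
    char text[64];
    snprintf(text, sizeof text, "%.*g", DBL_DECIMAL_DIG, x);
    return strtod(text, NULL);
}
```

On an implementation with correctly rounded conversions, both helpers return a value that compares equal to their argument.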
I need the scanned float to end up being exactly, identical to the original value.
As already pointed out in the other answers, that can be achieved with the %a format specifier.
Also, the printed value should be easily readable - to a human - as floating-point, i.e. I don't want to print 0x42355316 and reinterpret that as a 32-bit float.
That's more tricky and subjective. The first part of the string that %a produces is in fact a fraction composed by hexadecimal digits, so that an output like 0x1.4p+3 may take some time to be parsed as 10 by a human reader.
An option could be to print all the decimal digits needed to represent the floating-point value, but there may be a lot of them. Consider, for example the value 0.1, its closest representation as a 64-bit float may be
0x1.999999999999ap-4 == 0.1000000000000000055511151231257827021181583404541015625
While printf("%.*lf\n", DBL_DECIMAL_DIG, 0.1); (see e.g. Eric's answer) would print
0.10000000000000001 // If DBL_DECIMAL_DIG == 17
My proposal is somewhere in the middle. Similarly to what %a does, we can exactly represent any floating-point value with radix 2 as a fraction multiplied by 2 raised to some integer power. We can transform that fraction into a whole number (increasing the exponent accordingly) and print it as a decimal value.
0x1.999999999999ap-4 --> 1.999999999999a_16 * 2^-4 --> 1999999999999a_16 * 2^-56
--> 7205759403792794_10 * 2^-56
That whole number has a limited number of digits (it's < 2^53), but the result is still an exact representation of the original double value.
The following snippet is a proof of concept, without any check for corner cases. The format specifier %a separates the mantissa and the exponent with a p character (as in "... multiplied by two raised to the Power of..."), I'll use a q instead, for no particular reason other than using a different symbol.
The value of the mantissa will also be reduced (and the exponent raised accordingly), removing all the trailing zero bits. The idea being that 5q+1 (parsed as 5_10 * 2^1) should be more "easily" identified as 10 than 2814749767106560q-48.
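The arithmetic behind this notation can be checked with ldexp from <math.h>, which multiplies by a power of two exactly. This is only a sketch, and the helper name from_integer_and_exponent is hypothetical:

```c
#include <math.h>  /* ldexp */

/* Hypothetical helper: value of "<m>q<e>", i.e. m * 2^e.
   The conversion of m to double is exact as long as |m| < 2^53. */
double from_integer_and_exponent(long long m, int e)
{
    return ldexp((double)m, e);
}
```

With this helper, 7205759403792794 with exponent -56 compares equal to 0.1, and both 5 with exponent 1 and 2814749767106560 with exponent -48 give 10.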
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void to_my_format(double x, char *str)
{
int exponent;
double mantissa = frexp(x, &exponent);
long long m = 0;
if ( mantissa ) {
// frexp() yields a 53-bit fraction in [0.5, 1); scale by 2^53 so no bit is lost
exponent -= 53;
m = (long long)scalbn(mantissa, 53);
// A reduced mantissa should be more readable
while (m && m % 2 == 0) {
++exponent;
m /= 2;
}
}
sprintf(str, "%lldq%+d", m, exponent);
// ^
// Here 'q' is used to separate the mantissa from the exponent
}
double from_my_format(char const *str)
{
char *end;
long long mantissa = strtoll(str, &end, 10);
long exponent = strtol(str + (end - str + 1), &end, 10);
return scalbn(mantissa, exponent);
}
int main(void)
{
double tests[] = { 1, 0.5, 2, 10, -256, acos(-1), 1000000, 0.1, 0.125 };
size_t n = (sizeof tests) / (sizeof *tests);
char num[32];
for ( size_t i = 0; i < n; ++i ) {
to_my_format(tests[i], num);
double x = from_my_format(num);
printf("%22s%22a ", num, tests[i]);
if ( tests[i] != x )
printf(" *** %22a *** Round-trip failed\n", x);
else
printf("%58.55g\n", x);
}
return 0;
}
Testable here.
Generally, the improvement in readability is admittedly little to none, and surely a matter of opinion.
You can use the %a format specifier to print the value as hexadecimal floating point. Note that this is not the same as reinterpreting the float as an integer and printing the integer value.
For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
float x;
scanf("%f", &x);
printf("x=%.7f\n", x);
char str[20];
sprintf(str, "%a", x);
printf("str=%s\n", str);
float y;
sscanf(str, "%f", &y);
printf("y=%.7f\n", y);
printf("x==y: %d\n", (x == y));
return 0;
}
With an input of 4, this outputs:
x=4.0000000
str=0x1p+2
y=4.0000000
x==y: 1
With an input of 3.3, this outputs:
x=3.3000000
str=0x1.a66666p+1
y=3.3000000
x==y: 1
As you can see from the output, the %a format specifier prints in exponential format with the significand in hex and the exponent in decimal. This format can then be converted directly back to the exact same value as demonstrated by the equality check.
I'm looking for an alternative to the ceil() and floor() functions in C, because I am not allowed to use them in a project.
What I have built so far is a tricky back-and-forth approach using the cast operator: I convert the floating-point value (in my case a double) to an int, and then, because I need the closest integers above and below the given value to be double values as well, I convert back to double:
#include <stdio.h>
int main(void) {
double original = 124.576;
double floorint;
double ceilint;
int f;
int c;
f = (int)original; //Truncation to closest floor integer value
c = f + 1;
floorint = (double)f;
ceilint = (double)c;
printf("Original Value: %lf, Floor Int: %lf , Ceil Int: %lf", original, floorint, ceilint);
}
Output:
Original Value: 124.576000, Floor Int: 124.000000 , Ceil Int: 125.000000
For this example normally I would not need the ceil and floor integer values of c and f to be converted back to double but I need them in double in my real program. Consider that as a requirement for the task.
Although the output is giving the desired values and seems right so far, I´m still in concern if this method is really that right and appropriate or, to say it more clearly, if this method does bring any bad behavior or issue into the program or gives me a performance-loss in comparison to other alternatives, if there are any other possible alternatives.
Do you know a better alternative? And if so, why this one should be better?
Thank you very much.
Do you know a better alternative? And if so, why this one should be better?
OP's code fails when:
original is already a whole number.
original is a negative like -1.5. Truncation is not floor there.
original is just outside int range.
original is not-a-number.
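The negative case in the list above can be seen in a small sketch. It assumes the value is known to fit in int (the out-of-range and not-a-number cases from the list are deliberately not handled), and the name floor_in_int_range is hypothetical:

```c
/* Hypothetical helper: floor for values known to lie within the range of int.
   The cast truncates toward zero, so negative non-integers need a step down. */
double floor_in_int_range(double x)
{
    int truncated = (int)x;          /* -1.5 becomes -1, not -2 */
    if (x < truncated) {
        return truncated - 1.0;      /* truncation moved a negative value up */
    }
    return truncated;                /* positive values and whole numbers */
}
```

For -1.5 this returns -2, where a plain cast would give -1; for 124.576 it returns 124, and a whole number such as -3 is returned unchanged.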
Alternative construction
double my_ceil(double x)
Using the cast-to-some-integer-type trick is a problem when x is outside the integer range. So check first whether x is inside the range of a wide enough integer type (one whose precision exceeds that of double). x values outside that range are already whole numbers. I recommend going for the widest integer type, (u)intmax_t.
Remember that a cast to an integer is a round toward 0 and not a floor. Different handling needed if x is negative/positive when code is ceil() or floor(). OP's code missed this.
I'd avoid if (x >= INTMAX_MAX) { as that involves (double) INTMAX_MAX whose rounding and then precise value is "chosen in an implementation-defined manner". Instead, I'd compare against INTMAX_MAX_P1. some_integer_MAX is a Mersenne Number and with 2's complement, ...MIN is a negated "power of 2".
#include <inttypes.h>
#define INTMAX_MAX_P1 ((INTMAX_MAX/2 + 1)*2.0)
double my_ceil(double x) {
if (x >= INTMAX_MAX_P1) {
return x;
}
if (x < INTMAX_MIN) {
return x;
}
intmax_t i = (intmax_t) x; // this rounds towards 0
if (x < 0 || x == i) return i; // negative x is already rounded up (test x, not i: -0.5 truncates to 0).
return i + 1.0;
}
As x may be a not-a-number, it is more useful to reverse the compare as relational compare of a NaN is false.
double my_ceil(double x) {
if (x >= INTMAX_MIN && x < INTMAX_MAX_P1) {
intmax_t i = (intmax_t) x; // this rounds towards 0
if (x < 0 || x == i) return i; // negative x is already rounded up (test x, not i: -0.5 truncates to 0).
return i + 1.0;
}
return x;
}
double my_floor(double x) {
if (x >= INTMAX_MIN && x < INTMAX_MAX_P1) {
intmax_t i = (intmax_t) x; // this rounds towards 0
if (x > 0 || x == i) return i; // positive x is already rounded down (test x, not i: 0.5 truncates to 0).
return i - 1.0;
}
return x;
}
You're missing an important step: you need to check if the number is already integral, so for ceil assuming non-negative numbers (generalisation is trivial), use something like
double ceil(double f){
if (f >= LLONG_MAX){
// f will be integral unless you have a really funky platform
return f;
} else {
long long i = f;
return 0.0 + i + (f != i); // to obviate potential long long overflow
}
}
Another missing piece in the puzzle, which is covered off by my enclosing if, is to check if f is within the bounds of a long long. On common platforms if f was outside the bounds of a long long then it would be integral anyway.
Note that floor is trivial due to the fact that truncation to long long is always towards zero.
Given this code that my professor gave us in an exam which means we cannot modify the code nor use function from other libraries (except stdio.h):
float x;
(suppose x NOT having an integer part)
while (CONDITION){
x = x*10
}
I have to find the condition that makes sure that x has no valid number to the right of decimal point not giving attention to the problems of precision of a float number (After the decimal point we have to have only zeros). I tried this condition:
while (fmod(x*10, 10)) {
x = x*10;
}
printf(" %f ",x);
example:
INPUT x=0.456; --------> OUTPUT: 456.000
INPUT X=0.4567;--------> OUTPUT; 4567.000
It is important to be sure that after the decimal point we don't have any
significant number
But I had to include math.h library BUT my professor doesn't allow us to use it in this specific case (I'm not even allowed to use (long) since we never seen it in class).
So what is the condition that solve the problem properly without this library?
As pointed out here previously: due to the accuracy of floats this is not really possible, but I think your professor wants something like
while (x - (int)x != 0 )
or
while (x - (int)x >= 0.00000001 )
You can get rid of the zeroes by using the g modifier instead of f:
printf(" %g \n",x);
There is fuzziness ("not giving attention to the problems of precision of a float number") in the question, yet I think the sought answer is below: assign x to an integer type until x no longer has a fractional part.
Success of this method depends on INT_MIN <= x <= INT_MAX. This is expected when the number of bits in the significand of float does not exceed the number of value bits of int. Although this is common, it is not specified by C. As an alternative, code could use a wider integer type like long long, with far less chance of running into the range restriction.
Given the rounding introduced with *10, this method is not a good foundation of float to text conversion.
float Dipok(float x) {
int i;
while ((i=x) != x) {
x = x*10;
}
return x;
}
#include <assert.h>
#include <stdio.h>
#include <float.h>
void Dipok_test(float x) {
// suppose x NOT having an integer part
assert(x > -1.0 && x < 1.0);
float y = Dipok(x);
printf("x:%.*f y:%.f\n", FLT_DECIMAL_DIG, x, y);
}
int main(void) {
Dipok_test(0.456);
Dipok_test(0.4567);
return 0;
}
Output
x:0.456000000 y:456
x:0.456699997 y:4567
As already pointed out by 2501, this is just not possible.
Floats are not accurate. Depending on your platform, the float value for 0.001 is represented as something like 0.0010000001 in fact.
What would you expect the code to calculate: 10000001 or 1?
Any solution will work for some values only.
I will try to answer my own exam question; please correct me if I say something wrong!
It is not possible to find a proper condition that makes sure there are no significant digits after the decimal point. For example: we want the result of 0.4*20, which is 8.000, but due to imprecision the output will be different:
f=0.4;
for(i=1;i<20;i++)
f=f+0.4;
printf("The number f=0.4*20 is ");
if(f!=8.0) {printf(" not ");}
printf(" %f ",8.0);
printf("The real answer is f=0.4*20= %f",f);
Our OUTPUT will be:
The number f=0.4*20 is not 8.000000
The real answer is f=0.4*20= 8.000001
I just encountered a behaviour I don't understand in a C program that I'm using.
I guess it's due to floating numbers, maybe int to float cast, but still I would like someone to explain to me that this is a normal behaviour, and why.
Here is my C program :
#include <stdio.h>
#include <float.h>
int main ()
{
printf("FLT_MIN : %f\n", FLT_MIN);
printf("FLT_MAX : %f\n", FLT_MAX);
float valueFloat = 0.000000;
int valueInt = 0;
if (valueInt < FLT_MIN) {
printf("1- integer %d < FLT_MIN %f\n", valueInt, FLT_MIN);
}
if (valueFloat < FLT_MIN) {
printf("2- float %f < FLT_MIN %f\n", valueFloat, FLT_MIN);
}
if (0 < 0.000000) {
printf("3- 0 < 0.000000\n");
} else if (0 == 0.000000) {
printf("4- 0 == 0.000000\n");
} else {
printf("5- 0 > 0.000000\n");
}
if (valueInt < valueFloat) {
printf("6- %d < %f\n", valueInt, valueFloat);
} else if (valueInt == valueFloat) {
printf("7- %d == %f\n", valueInt, valueFloat);
} else {
printf("8- %d > %f\n", valueInt, valueFloat);
}
return 0;
}
And here is my command to compile and run it :
gcc float.c -o float ; ./float
Here is the output :
FLT_MIN : 0.000000
FLT_MAX : 340282346638528859811704183484516925440.000000
1- integer 0 < FLT_MIN 0.000000
2- float 0.000000 < FLT_MIN 0.000000
4- 0 == 0.000000
7- 0 == 0.000000
A C developer that I know considers it normal that line "1-" is displayed, because of a loss of precision in the comparison. Let's accept that.
But why doesn't line "3-" appear then, since it's the same comparison?
Why does line "2-" appear, since I'm comparing the same numbers? (or at least I hope so)
And why do lines "4-" and "7-" appear? It seems to be a different behaviour from line "1-".
Thanks for your help.
Your confusion is probably over the line:
printf("FLT_MIN : %f\n", FLT_MIN);
change it to:
printf("FLT_MIN : %g\n", FLT_MIN);
And you will see, that FLT_MIN is actually NOT zero, but a (tiny bit) larger than zero.
FLT_MIN is not 0; it is just above 0, and you need to show more places to see that. FLT_MIN is the smallest positive normalized float (subnormal values, where supported, are smaller still). printf and friends round to six decimal places for %f unless you ask for more precision:
printf("FLT_MIN : %.64f\n", FLT_MIN);
3 does not actually appear in your output because 0 is not less than 0
4 is comparing 0 with 0, the computer has no problem representing both of those (0 is a special case for floats) so they compare equal
7 is the same case as 4 just with intermediate assignments
This is correct behaviour. Under IEEE 754, zero is exactly representable as a float, so it compares equal to integer zero (although 'equivalent' would be a better term). FLT_MIN is the smallest positive normalized float; it is tiny but still distinct from zero. Even though a standard %f format specifier to printf() will show FLT_MIN as 0.000000, it is not zero. A literal 0.000000 is interpreted by the compiler as a floating-point zero (of type double), which is not equal to FLT_MIN, even though the default six-decimal-place %f format prints them the same.
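The point about printing can be checked with a short sketch; the helper name flt_min_prints_as_zero is hypothetical:

```c
#include <float.h>   /* FLT_MIN */
#include <stdio.h>   /* snprintf */
#include <string.h>  /* strcmp */

/* Hypothetical helper: returns 1 if FLT_MIN is greater than zero and yet
   formats as "0.000000" under the default six-decimal %f. */
int flt_min_prints_as_zero(void)
{
    char text[32];
    snprintf(text, sizeof text, "%f", FLT_MIN);
    return FLT_MIN > 0.0f && strcmp(text, "0.000000") == 0;
}
```

The function returns 1, which is exactly the situation in lines "1-" and "2-" of the question: the comparison sees a positive number, while %f shows only zeros.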
Hello I'm learning Objective C and I was doing the classic Calculator example.
Problem is that I'm getting a negative zero when I multiply zero by any negative number, and I put the result into a (double) type!
To see what was going on, I played with the debugger and this is what I got:
(gdb) print -2*0
$1 = 0
(gdb) print (double) -2 * 0
$2 = -0
In the second case when I cast it to a double type, it turns into negative zero! How can I fix that in my application? I need to work with doubles.
How can I fix the result so I get a zero when the result should be zero?
I did a simple test:
double d = (double) -2.0 * 0;
if (d < 0)
printf("d is less than zero\n");
if (d == 0)
printf("d is equal to zero\n");
if (d > 0)
printf("d is greater than zero\n");
printf("d is: %lf\n", d);
It outputs:
d is equal to zero
d is: -0.000000
So, to fix this, you can add a simple if-check to your application:
if (d == 0) d = 0;
There is a misunderstanding here about operator precedence:
(double) -2 * 0
is parsed as
((double)(-(2))) * 0
which is essentially the same as (-2.0) * 0.0.
The C Standard's informative Annex J lists as unspecified behavior: Whether certain operators can generate negative zeros and whether a negative zero becomes a normal zero when stored in an object (6.2.6.2).
Conversely, (double)(-2 * 0) should generate a positive zero 0.0 on most current platforms as the multiplication is performed using integer arithmetic. The C Standard does have support for architectures that distinguish positive and negative zero integers, but these are vanishingly rare nowadays.
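The two parses can be compared side by side in a small sketch, assuming IEEE 754 doubles; both helper names are hypothetical:

```c
/* Hypothetical helper: the cast binds tighter than *, so a is converted
   first and the multiplication is done in double: (-2.0) * 0.0 is -0.0. */
double cast_then_multiply(int a, int b)
{
    return (double)a * b;
}

/* Hypothetical helper: integer multiplication first, then conversion:
   -2 * 0 is the integer 0, which converts to +0.0. */
double multiply_then_cast(int a, int b)
{
    return (double)(a * b);
}
```

With arguments -2 and 0, signbit reports a set sign bit for the first result and a clear one for the second, although both compare equal to 0.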
If you want to force zeros to be positive, this simple fix should work:
if (d == 0) {
d = 0;
}
You could make the intent clearer with this:
if (d == -0.0) {
d = +0.0;
}
But the test will succeed also if d is a positive zero.
Chux has a simpler solution for IEC 60559 complying environments:
d = d + 0.0; // turn -0.0 to +0.0
http://en.wikipedia.org/wiki/Signed_zero
The number 0 is usually encoded as +0, but can be represented by either +0 or −0
It shouldn't impact on calculations or UI output.
How can I fix that in my application?
Code really is not broken, so nothing needs to be "fixed". @kennytm
How can I fix the result so I get a zero when the result should be zero?
To easily get rid of the - when the result is -0.0, add 0.0. Code following the standard (IEC 60559 floating-point) rules will drop the - sign.
double nzero = -0.0;
printf("%f\n", nzero);
printf("%f\n", nzero + 0.0);
printf("%f\n", fabs(nzero)); // This has a side effect of changing all negative values
// pedantic code using <math.h>
if (signbit(nzero)) nzero = 0.0; // This has a side effect of changing all negative values
printf("%f\n", nzero);
Usual output.
-0.000000
0.000000
0.000000
0.000000
Yet for a general double x that may have any value, it is hard to beat the following (@Richard J. Ross III, @chqrlie). The x + 0.0 approach has the advantage that it likely does not introduce a branch, yet the following is clear.
if (x == 0.0) x = 0.0;
Note: fmax(-0.0, 0.0) may produce -0.0.
In my code (on C MPI intel compiler) -0.0 and +0.0 are not the same.
As an example:
d = -0.0
if (d < 0.0)
do something...
and it is doing this "something".
also adding -0.0 + 0.0 = -0.0...
GCC was seemingly optimizing out the simple fix of negzero += 0.0 as noted above until I realized that -fno-signed-zeros was in place. Duh.
But in the process I did find that this will fix a signed zero, even when -fno-signed-zeros is set:
if (negzero > -DBL_MIN && negzero < DBL_MIN && signbit(negzero))
negzero = 0.0;
or as a macro:
#define NO_NEG_ZERO(a) ( (a) > -DBL_MIN && (a) < DBL_MIN && signbit(a) ? 0.0 : (a) )
negzero = NO_NEG_ZERO(negzero)
Note that the comparison operators are < and > (not <= or >=), so a really is zero! (OR it is a subnormal number...but never mind the guy behind the curtain.)
Maybe this answer is slightly less correct in the sense that a negative value between -DBL_MIN and 0 will be converted to 0.0, in which case this isn't the way to go if you need to support subnormal numbers.
If you do need subnormal numbers (!) then perhaps you're the kind of person who plays with -fno-signed-zeros, too.
The lesson here for me and subnormal-numbers-guy is this: if you play outside of spec then expect out-of-spec results ;)
(Sorry, that was not PC. It could be subnormal-numbers-person...but I digress.)