K&R's The C Programming Language has the following sentence:
A compiler's license to treat mathematically associative operators as computationally associative is revoked.
This is in Appendix C, which summarizes what changed with ANSI C. But I don't understand how computationally associative differs from mathematically associative. My guess is that mathematically associative means a * b * c can be grouped as (a * b) * c (left) or as a * (b * c) (right).
Consider this code:
#include <stdio.h>
int main(void)
{
double a = 0x1p64; // Two to the power of 64, 18,446,744,073,709,551,616.
double b = 1;
double c = -a;
printf("%g\n", a+b+c);
}
In the C grammar, a+b+c is equivalent to (a+b)+c, so a and b are added first, and then c is added. In the format commonly used for double, a+b yields 2^64, not 2^64+1, because the double format does not have enough precision to represent 2^64+1, so the result of the addition is the ideal mathematical result rounded to the nearest representable value, which is 2^64. Then adding c yields zero, so “0” is printed.
If instead we calculated a+c+b, adding a and c would give zero, and then adding b would give one, and “1” would be printed.
Thus, floating-point operations are not generally associative; a+b+c is not the same as a+c+b.
In ordinary mathematics with real numbers, a+b+c is the same as a+c+b; addition of real numbers is associative.
Prior to standardization, some C compilers would treat floating-point expressions as if operators were associative (for those operators whose counterparts in real-number-arithmetic were associative). The C standard does not permit that in implementations that conform to the standard. Conforming compilers must produce results as if the operations were performed in the order specified by the C grammar.
Some compilers may still treat floating-point operators as associative when operating in non-standard modes, which may be selected by flags or switches passed to the compiler. Also, because the C standard allows implementations to perform floating-point arithmetic with more precision than the nominal type (e.g., when computing a+b+c, it can calculate it as if the types were long double instead of double), that can produce results that are the same as if operations were rearranged, so you can still get results that look like operators have been reordered associatively, depending on the C implementation and the flags used.
I'm wondering if there are any circumstances where code like this will be incorrect due to floating point inaccuracies:
#include <math.h>
// other code ...
float f = /* random but not NAN or INF */;
int i = (int)floorf(f);
// OR
int i = (int)ceilf(f);
Are there any guarantees about these values? If I have a well-formed f (not NAN or INF), will i always be the integer that it rounds to, whichever way that is?
I can imagine a situation where (with a bad spec/implementation) the value you get is the value just below the true value rather than just above/equal, but is actually closer. Then when you truncate, it actually rounds down to the next lower value.
It doesn't seem possible to me, given that integers can be exact values in IEEE 754 floating point, but I don't know if float is guaranteed to follow that standard.
The C standard is sloppy in specifying floating-point behavior, so it is technically not completely specified that floorf(f) produces the correct floor of f or that ceilf(f) produces the correct ceiling of f.
Nonetheless, no C implementations I am aware of get this wrong.
If, instead of floorf(some variable), you have floorf(some expression), there are C implementations that may evaluate the expression in diverse ways that will not get the same result as if IEEE-754 arithmetic were used throughout.
If the C implementation defines __STDC_IEC_559__, it should evaluate the expressions using IEEE-754 arithmetic.
Nonetheless, int i = (int)floorf(f); is of course not guaranteed to set i to the floor of f if the floor of f is out of range of int.
I recently ran into an issue that could easily be solved using modulus division, but the input was a float:
Given a periodic function (e.g. sin) and a computer function that can only compute it within the period range (e.g. [-π, π]), make a function that can handle any input.
The "obvious" solution is something like:
#include <cmath>
float sin(float x){
return limited_sin((x + M_PI) % (2 *M_PI) - M_PI);
}
Why doesn't this work? I get this error:
error: invalid operands of types double and double to binary operator %
Interestingly, it does work in Python:
def sin(x):
return limited_sin((x + math.pi) % (2 * math.pi) - math.pi)
Because the normal mathematical notion of "remainder" applies only to integer division, i.e. division that is required to produce an integer quotient.
In order to extend the concept of "remainder" to real numbers you have to introduce a new kind of "hybrid" operation that produces an integer quotient for real operands. The core C language does not support such an operation, but the standard library provides it as the fmod function, and since C99 also as the remainder function. (Note that these functions are not the same and have some peculiarities. In particular, they do not follow the rounding rules of integer division.)
You're looking for fmod().
I guess to more specifically answer your question, in older languages the % operator was just defined as integer modular division and in newer languages they decided to expand the definition of the operator.
EDIT: If I were to wager a guess why, I would say it's because the idea of modular arithmetic originates in number theory and deals specifically with integers.
I can't really say for sure, but I'd guess it's mostly historical. Quite a few early C compilers didn't support floating point at all. It was added on later, and even then not as completely -- mostly the data type was added, and the most primitive operations supported in the language, but everything else left to the standard library.
The modulo operator % in C and C++ is defined only for integer operands; however, there is an fmod() function available for use with doubles.
The constraints are in the standards:
C11(ISO/IEC 9899:201x) §6.5.5 Multiplicative operators
Each of the operands shall have arithmetic type. The operands of the % operator shall
have integer type.
C++11(ISO/IEC 14882:2011) §5.6 Multiplicative operators
The operands of * and / shall have arithmetic or enumeration type; the operands of % shall have integral or enumeration
type. The usual arithmetic conversions are performed on the operands and determine the type of the result.
The solution is to use fmod, which is exactly why the operands of % are limited to integer type in the first place, according to C99 Rationale §6.5.5 Multiplicative operators:
The C89 Committee rejected extending the % operator to work on floating types as such usage would duplicate the facility provided by fmod
try fmod
"The mathematical notion of modulo arithmetic works for floating point
values as well, and this is one of the first issues that Donald Knuth
discusses in his classic The Art of Computer Programming (volume I).
I.e. it was once basic knowledge."
The floating point modulus operator is defined as follows:
m = num - iquot*den ; where iquot = int( num/den )
As indicated, the absence of the % operator for floating point numbers appears
to be standards related. The CRTL provides 'fmod', and usually 'remainder'
as well, to perform % on fp numbers. The difference between these two lies
in how they handle the intermediate 'iquot' rounding.
'remainder' rounds the quotient to nearest, and 'fmod' simply truncates it.
If you write your own C++ numerical classes, nothing prevents you
from filling that gap by providing an overloaded operator %.
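The % operator does not work in C++ when you are trying to find the remainder of two numbers that are both of type float or double.
Hence you could use the fmod function from math.h / cmath, or you could compute the remainder by hand with these lines of code (M_PI still comes from the math header):
float sin(float x) {
    const float two_pi = 2 * M_PI;
    float shifted = x + M_PI;
    // Whole number of periods in shifted, truncated toward zero.
    long periods = (long)(shifted / two_pi);
    // Truncation rounds up for negative values, so step down to get the floor.
    if (shifted < periods * two_pi) --periods;
    return limited_sin(shifted - periods * two_pi - M_PI);
}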
For C/C++, % is only defined for integer operands.
Python is a little broader: % also accepts floating-point operands and returns what is left after removing as many whole multiples of the divisor as fit:
>>> 4 % math.pi
0.85840734641020688
>>> 4 - math.pi
0.85840734641020688
>>>
There are numerous references on the subject (here or here). However, I still fail to understand why the following is not considered UB and reported by my favorite compiler (insert clang and/or gcc) with a neat warning:
// f1, f2 and epsilon are defined as double
if ( f1 / f2 <= epsilon )
As per C99:TC3, 5.2.4.2.2 §8: we have:
Except for assignment and cast (which remove all extra range and
precision), the values of operations with floating operands and values
subject to the usual arithmetic conversions and of floating constants
are evaluated to a format whose range and precision may be greater
than required by the type. [...]
With a typical compilation, f1 / f2 would be read directly from the FPU. I tried this with gcc 5.2 using gcc -m32. So f1 / f2 is (here) held in an 80-bit floating-point register (just a guess; I don't have the exact spec at hand). There is no type promotion here (per the standard).
I've also tested clang 3.5; this compiler seems to convert the result of f1 / f2 back to a normal 64-bit floating-point representation (this is implementation-defined behavior, but for my question I prefer the default gcc behavior).
As I understand it, the comparison will be done between a value whose size we don't know (i.e. a format whose range and precision may be greater) and epsilon, whose size is exactly 64 bits.
What I really find hard to understand is an equality comparison between a well-known C type (e.g. a 64-bit double) and something whose range and precision may be greater. I would have assumed that somewhere the standard would require some kind of promotion (e.g. mandate that epsilon be promoted to a wider floating-point type).
So the only legitimate syntaxes should instead be:
if ( (double)(f1 / f2) <= epsilon )
or
double res = f1 / f2;
if ( res <= epsilon )
As a side note, I would have expected the literature to document only the operator <, in my case:
if ( f1 / f2 < epsilon )
Since it is always possible to compare floating-point values of different sizes using operator <.
So in which cases would the first expression make sense? In other words, how can the standard define some kind of equality operator between two floating-point representations of different sizes?
EDIT: The whole confusion here was that I assumed it was possible to compare two floating-point values of different sizes, which cannot possibly happen (thanks #DevSolar!).
<= is well-defined for all possible floating point values.
There is one exception though: the case when at least one of the arguments is uninitialised. But that's more to do with reading an uninitialised variable being UB; not the <= itself
I think you're confusing implementation-defined with undefined behavior. The C language doesn't mandate IEEE 754, so all floating point operations are essentially implementation-defined. But this is different from undefined behavior.
After a bit of chat, it became clear where the miscommunication came from.
The quoted part of the standard explicitly allows an implementation to use wider formats for floating operands in calculations. This includes, but is not limited to, using the long double format for double operands.
The standard section in question also does not call this "type promotion". It merely refers to a format being used.
So, f1 / f2 may be done in some arbitrary internal format, but without making the result any other type than double.
So when the result is compared (by either <= or the problematic ==) to epsilon, there is no promotion of epsilon (because the result of the division never got a different type), but by the same rule that allowed f1 / f2 to happen in some wider format, epsilon is allowed to be evaluated in that format as well. It is up to the implementation to do the right thing here.
The value of FLT_EVAL_METHOD might tell exactly what an implementation is doing (if set to 0, 1, or 2), or it might have a negative value, which indicates "indeterminable" (-1) or "implementation-defined" (any other negative value), which means "look it up in your compiler manual".
This gives an implementation "wiggle room" to do any kind of funny things with floating operands, as long as at least the range / precision of the actual type is preserved. (Some older FPUs had "wobbly" precisions, depending on the kind of floating operation performed. The quoted part of the standard caters for exactly that.)
In no case may any of this lead to undefined behaviour. Implementation-defined, yes. Undefined, no.
The only case where you would get undefined behavior is when a large floating point variable gets demoted to a smaller one which cannot represent the contents. I don't quite see how that applies in this case.
The text you quote is concerned about whether or not floats may be evaluated as doubles etc, as indicated by the text you unfortunately didn't include in the quote:
The use of evaluation formats is characterized by the
implementation-defined value of FLT_EVAL_METHOD:
-1 indeterminable;
0 evaluate all operations and constants just to the range and precision of the type;
1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
2 evaluate all operations and constants to the range and precision of the long double type.
However, I don't believe this macro overrides the behavior of the usual arithmetic conversions. The usual arithmetic conversions guarantee that you can never compare two floating-point operands of different types. So I don't see how you could run into undefined behavior here. The only possible issue you would have is performance.
In theory, in case FLT_EVAL_METHOD == 2 then your operands could indeed get evaluated as type long double. But please note that if the compiler allows such implicit promotions to larger types, there will be a reason for it.
According to the text you cited, explicit casting will counter this compiler behavior.
In which case the code if ( (double)(f1 / f2) <= epsilon ) is nonsense. By the time you cast the result of f1 / f2 to double, the calculation is already done and has been carried out on long double. The calculation of the result <= epsilon will however be carried out on double since you forced this with the cast.
To avoid long double entirely, you would have to write the code as:
if ( (double)((double)f1 / (double)f2) <= epsilon )
or to increase readability, preferably:
double div = (double)f1 / (double)f2;
if( (double)div <= (double)epsilon )
But again, code like this only makes sense if you know that there will be implicit promotions, which you wish to avoid to increase performance. In practice, I doubt you'll ever run into that situation, as the compiler is most likely far more capable than the programmer of making such decisions.
#include <stdio.h>
int main()
{
char c;
c=10;
if(c%2==0)
printf("Yes");
return 0;
}
The above code prints "Yes". Can someone tell why the modulus operator works for char and int but not for double etc.?
You already got comments explaining why % is defined for char: it's defined for all integer types, and in C, char is an integer type. Some other languages do define a distinct char type that does not support arithmetic operations, but C is not one of them.
But to answer why it isn't defined for floating-point types: history. There is no technical reason why it wouldn't be possible to define the % operator for floating-point types. Here's what the C99 rationale says:
6.5.5 Multiplicative operators
[...]
The C89 Committee rejected extending the % operator to work on floating types as such usage would duplicate the facility provided by fmod (see §7.12.10.1).
And as mafso found later:
7.12.10.1 The fmod functions
[...]
The C89 Committee considered a proposal to use the remainder operator % for this function; but it was rejected because the operators in general correspond to hardware facilities, and fmod is not supported in hardware on most machines.
They seem somewhat contradictory. The % operator was not extended because fmod already filled that need, but fmod was picked to fill that need because the committee did not want to extend the % operator? They cannot very well both be true at the same time.
I suspect one of these reasons was the original reason, and the other was the reason for not later re-visiting that decision, but there's no telling which was first. Either way, it was simply decided that % wouldn't perform this operation.