How do you write a NaN floating-point literal in C?
In C99's <math.h>
7.12 Mathematics <math.h>
[#5] The macro
NAN
is defined if and only if the implementation supports quiet
NaNs for the float type. It expands to a constant
expression of type float representing a quiet NaN.
5.2.4.2.2/3:
floating types may be able to contain
other kinds of floating-point numbers,
such as ... infinities and NaNs. A NaN is an encoding
signifying Not-a-Number. A quiet NaN propagates through almost
every arithmetic operation without
raising a floating-point exception; a
signaling NaN generally raises a
floating-point exception when occurring
as an arithmetic operand.
7.12/5 (math.h):
The macro NAN is defined if and only if
the implementation supports quiet NaNs
for the float type. It expands to a
constant expression of type float
representing a quiet NaN.
So you can get a value if your implementation supports NaNs at all, and if some or all of those NaNs are quiet. Failing that you're into implementation-defined territory.
There's also a slight worry with some of these floating-point macros that the compiler front-end might not know whether it supports the feature or not, because it produces code that can run on multiple versions of an architecture where support varies. I don't know whether that applies to this macro, but it's another situation where you're into implementation-defined territory - the preprocessor might conservatively claim that it doesn't support the feature when actually the implementation as a whole, as you're using it, does.
Using NAN is better, but if you're on a system that has NaNs, 0.0/0.0 is an easy way to get one...
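A minimal sketch combining both, assuming the implementation defines NAN (i.e. it supports quiet NaNs) and uses IEEE-754 arithmetic:
#include <math.h>
#include <stdio.h>

int main(void)
{
    double a = NAN;            /* quiet NaN from the standard macro (if defined) */
    volatile double zero = 0.0;
    double b = zero / zero;    /* also yields a quiet NaN on IEEE-754 systems */
    printf("%d %d\n", isnan(a), isnan(b));   /* both nonzero where NaNs are supported */
    return 0;
}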
In C you can write a NaN floating point literal in the following way.
/* Assumes a 32-bit unsigned long, little-endian storage, and IEEE-754 binary64 doubles;
   the pointer cast also violates strict aliasing, so this is compiler-specific. */
const unsigned long dNAN[2] = {0x00000000, 0x7ff80000};
const double LITERAL_NAN = *(double *)dNAN;
Please note that this is not a standard way. On Microsoft C, it works fine.
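A somewhat less fragile variant of the same idea, still non-standard in that it assumes an IEEE-754 binary64 double sharing its byte order with uint64_t (the common case), copies the bit pattern with memcpy instead of an aliasing pointer cast (make_quiet_nan is just an illustrative name):
#include <stdint.h>
#include <string.h>

static double make_quiet_nan(void)
{
    uint64_t bits = UINT64_C(0x7FF8000000000000);  /* a quiet-NaN bit pattern in binary64 */
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}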
In C11 (and later), an integer constant expression shall only have operands that are, in particular:
floating constants that are the immediate operands of casts
The following code:
int a[ A > B ? 16 : 32 ];
when A and B are floating constants is invalid in C:
$ echo '#include "t576.h"' | clang -std=c11 -pedantic -Wall -Wextra -DA=1.0 -DB=2.0 -c -xc -
In file included from <stdin>:1:
./t576.h:1:5: warning: size of static array must be an integer constant expression [-Wpedantic]
but valid in C++:
$ echo '#include "t576.h"' | clang++ -std=c++11 -pedantic -Wall -Wextra -DA=1.0 -DB=2.0 -c -xc++ -
<nothing>
What is the origin / rationale of this requirement?
Extra question: In the future C standard revisions will it be useful to remove this requirement?
What is the origin / rationale of this requirement?
It means C compilers are not required to be able to execute floating-point arithmetic within the compiler. When compiling for a target platform different from the compiler host platform, replicating the exact behavior of the target floating-point operations can require a lot of work. (This was especially so prior to widespread adoption of the IEEE 754 standard for floating-point arithmetic.)
To implement floating-point semantics in a C program, the compiler only has to be able to convert the constants in source code to the target floating-point format and take the integer portion of them (in a cast). It does not have to be able to perform general arithmetic operations on them. Without this requirement, the compiler would have to reproduce the floating-point arithmetic operations of the target platform. So, if a program uses floating-point arithmetic, the compiler can implement that just by generating instructions to do the arithmetic; it does not have to do the arithmetic itself.
This is also true for arithmetic constant expressions, which can be used as initializers: the compiler is not strictly required by the C standard to compute the value of the initializer. It can generate instructions that compute the value when the program starts running (for initialization of static objects) or when needed (for initialization of automatic objects).
In contrast, integer constant expressions can be used in places where the compiler needs the value, such as the width of a bit-field. So the compiler must be able to compute the value itself. If it were required to be able to do floating-point arithmetic to get the value, this would add considerable burden to writing some compilers.
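For instance, a hypothetical struct just to show a context where the compiler itself needs the value at translation time:
struct s {
    unsigned a : (int)2.0;    /* OK: the floating constant is the immediate operand of a cast */
 /* unsigned b : 1.0 + 1.0;      not an integer constant expression, so a constraint violation */
};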
Extra question: In the future C standard revisions will it be useful to remove this requirement?
Removing it will provide some opportunity for C program writers to use additional constant expressions and will require C compiler writers to do more work. The value of these things is subjective.
The rationale is common sense: they don't want to allow users to declare an array of some 3.1415 items - the array size needs to be an integer, obviously.
For many operators in C, the usual arithmetic conversions would turn the end result into floating point whenever a floating-point operand is present. In the case of ?: specifically, that doesn't happen, since the result is the 2nd or 3rd operand. Also, the > operator always returns int, so it doesn't really apply there either.
If you don't immediately cast floating-point operands to an integer type, as stated in the definition of an integer constant expression that you quote, then the expression becomes an arithmetic constant expression instead, which is a broader term.
So you can do this:
int a[ (int)1.0 > (int)2.0 ? 16 : 32 ]; // compliant
But you can't do this:
int a[ 1.0 > 2.0 ? 16 : 32 ]; // not compliant
Consider int a[ (int)1.0 > (int)2.0 ? 16.0 : 32 ]; (not compliant either). Here the condition always evaluates as false, so we should get size 32. But because of the special implicit conversion rules of ?:, the 2nd and 3rd operands are balanced per the usual arithmetic conversions, so we end up with 32.0 of type double. If that value in turn could not be represented exactly as an integer, we would end up with a non-integer array size.
I'm looking for a way to detect whether a C compiler uses the IEEE-754 floating point representation at compile time, preferably in the preprocessor, but a constant expression is fine too.
Note that the __STDC_IEC_559__ macro does not fit this purpose, as an implementation may use the correct representation while not fully supporting Annex F.
Not an absolute 100% solution, but will get you practically close.
Check if the characteristics of floating type double match binary64:
#include <float.h>
#define BINARY64_LIKE ( \
    (FLT_RADIX == 2) && \
    (DBL_MANT_DIG == 53) && \
    (DBL_DECIMAL_DIG == 17) && \
    (DBL_DIG == 15) && \
    (DBL_MIN_EXP == -1021) && \
    (DBL_HAS_SUBNORM == 1) && \
    (DBL_MIN_10_EXP == -307) && \
    (DBL_MAX_EXP == +1024) && \
    (DBL_MAX_10_EXP == +308))
BINARY64_LIKE is usable at compile time. It needs additional work, though, for older compilers that do not define all of these macros: DBL_HAS_SUBNORM, for example, only exists since C11.
Likewise for float.
Since C11, code could use _Static_assert() to detect some attributes.
#include <limits.h>   /* CHAR_BIT */
_Static_assert(sizeof(double)*CHAR_BIT == 64, "double unexpected size");
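For example, assuming the BINARY64_LIKE macro above is in scope, the representation check itself can also be made a translation-time assertion:
_Static_assert(BINARY64_LIKE, "double does not look like IEEE-754 binary64");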
See also Are there any commonly used floating point formats besides IEEE754?.
Last non-IEEE754 FP format I used was CCSI 5 years ago.
Caution: it is unclear why OP wants this test. If code is doing some bit manipulation of a floating-point value, then even with __STDC_IEC_559__ defined there remains at least one hole: the endianness of floating point and integer may differ - uncommon, but out there.
Other potential holes: support of -0.0, NaN sign, encoding of infinity, signalling NaN, quiet NaN, NaN payload: the usual suspects.
As of July 2020, this would still be compiler specific... though C2x intends to change that with the __STDC_IEC_60559_BFP__ macro - see Annex F, section F.2.
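A small run-time probe of a few of those "usual suspects" might look like this (just a sketch; the expectations in the comments assume IEEE-754 semantics):
#include <math.h>
#include <stdio.h>

int main(void)
{
    volatile double zero = 0.0;
    double neg_zero = -zero;
    double qnan = zero / zero;                          /* quiet NaN on IEEE-754 systems    */
    printf("signbit(-0.0): %d\n", signbit(neg_zero));   /* nonzero if -0.0 keeps its sign   */
    printf("-0.0 == 0.0:   %d\n", neg_zero == 0.0);     /* 1: signed zeros compare equal    */
    printf("isnan(0/0):    %d\n", isnan(qnan));         /* nonzero if quiet NaNs supported  */
    printf("isinf(1/0):    %d\n", isinf(1.0 / zero));   /* nonzero if infinities supported  */
    return 0;
}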
It might be noted that:
The compiler usually doesn't choose the binary representation. The compiler usually follows the target system's architecture (the instruction set design of the CPU / GPU, etc.).
The use of non-conforming binary representations for floating-point is pretty much a thing of the past. If you're using a modern (or even a moderately modern) system from the past 10 years, you are almost certainly using a conforming binary representation.
In C you can test to see if a double is NaN using isnan(x). However many places online, including for example this SO answer say that you can simply use x!=x instead.
Is x!=x in any C specification as a method that is guaranteed to test if x is NaN? I can't find it myself and I would like my code to work with different compilers.
NaN as the only value x with the property x!=x is an IEEE 754 guarantee. Whether it is a faithful test to recognize NaN in C boils down to how closely the representation of variables and the operations are mapped to IEEE 754 formats and operations in the compiler(s) you intend to use.
You should in particular worry about “excess precision” and the way compilers deal with it. Excess precision is what happens when the FPU only conveniently supports computations in a wider format than the compiler would like to use for float and double types. In this case computations can be made at the wider precision, and rounded to the type's precision when the compiler feels like it in an unpredictable way.
The C99 standard defined a way to handle this excess precision that preserved the property that only NaN was different from itself, but for a long time after 1999 (and even nowadays when the compiler's authors do not care), in presence of excess precision, x != x could possibly be true for any variable x that contains the finite result of a computation, if the compiler chooses to round the excess-precision result of the computation in-between the evaluation of the first x and the second x.
This report describes the dark times of compilers that made no effort to implement C99 (either because it wasn't 1999 yet or because they didn't care enough).
This 2008 post describes how GCC started to implement the C99 standard for excess precision in 2008. Before that, GCC could provide one with all the surprises described in the aforementioned report.
Of course, if the target platform does not implement IEEE 754 at all, a NaN value may not even exist, or exist and have different properties than specified by IEEE 754. The common cases are a compiler that implements IEEE 754 quite faithfully with FLT_EVAL_METHOD set to 0, 1 or 2 (all of which guarantee that x != x iff x is NaN), or a compiler with a non-standard implementation of excess precision, where x != x is not a reliable test for NaN.
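As a minimal sketch of the two tests side by side (the helper names are just illustrative):
#include <math.h>
#include <stdio.h>

/* isnan() evaluates x only once; the x != x test reads x twice, and with a
   non-conforming treatment of excess precision the two reads of a finite value
   could in principle be rounded differently (see the discussion above). */
static int nan_by_isnan(double x)   { return isnan(x); }
static int nan_by_compare(double x) { return x != x; }

int main(void)
{
    volatile double zero = 0.0;
    double x = zero / zero;    /* quiet NaN on IEEE-754 implementations */
    printf("isnan(x): %d   x != x: %d\n", nan_by_isnan(x), nan_by_compare(x));
    return 0;
}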
Please refer to the normative section Annex F: IEC 60559 floating-point arithmetic of the C standard:
F.1 Introduction
An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex.
Implementations that do not define __STDC_IEC_559__ are not required to conform to these specifications.
F.9.3 Relational operators
The expression x ≠ x is true if x is a NaN.
The expression x = x is false if x is a NaN.
F.3 Operators and functions
The isnan macro in <math.h> provides the isnan function recommended in the Appendix to IEC 60559.
AFAIK, C supports just a few data types:
int, float, double, char, void, enum.
I need to store a number that could reach into the high 10 digits. Since I'm getting a low 10 digit # from INT_MAX, I suppose I need a double.
<limits.h> doesn't have a DOUBLE_MAX. I found a DBL_MAX on the internet that said this is LEGACY and also appears to be C++. Is double what I need? Why is there no DOUBLE_MAX?
DBL_MAX is defined in <float.h>. Its availability in <limits.h> on unix is what is marked as "(LEGACY)".
(linking to the unix standard even though you have no unix tag since that's probably where you found the "LEGACY" notation, but much of what is shown there for float.h is also in the C standard back to C89)
You get the integer limits in <limits.h> or <climits>. Floating point characteristics are defined in <float.h> for C. In C++, the preferred version is usually std::numeric_limits<double>::max() (for which you #include <limits>).
As to your original question, if you want a larger integer type than long, you should probably consider long long. This isn't officially included in C++98 or C++03, but is part of C99 and C++11, so all reasonably current compilers support it.
It's in the standard float.h include file. You want DBL_MAX.
Using double to store large integers is dubious; the largest integer that can be stored reliably in double is much smaller than DBL_MAX. You should use long long, and if that's not enough, you need your own arbitrary-precision code or an existing library.
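To make the "much smaller than DBL_MAX" point concrete, here is a sketch assuming a binary64 double, where integers above 2^53 are no longer all exactly representable:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    long long big = 9999999999LL;        /* a "high 10 digit" value fits comfortably         */
    double d = 9007199254740993.0;       /* 2^53 + 1: not exactly representable in binary64  */
    printf("LLONG_MAX = %lld\n", LLONG_MAX);
    printf("big       = %lld\n", big);
    printf("2^53 + 1 stored in a double prints as %.0f\n", d);   /* typically 9007199254740992 */
    return 0;
}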
You are looking for the float.h header.
INT_MAX is just a definition in limits.h. You don't make it clear whether you need to store an integer or a floating-point value. If integer, and using a 64-bit compiler, use a long (long long for 32-bit).
Is it legal to zero the memory of an array of doubles (using memset(…, 0, …)) or struct containing doubles?
The question implies two different things:
From the point of view of the C standard: Is this undefined behavior or not? (On any particular platform, I presume, this cannot be undefined behavior, as it just depends on the in-memory representation of floating-point numbers, and that's all.)
From a practical point of view: Is it OK on the Intel platform? (Regardless of what the standard says.)
The C99 standard Annex F says:
This annex specifies C language support for the IEC 60559 floating-point standard. The
IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for
microprocessor systems, second edition (IEC 60559:1989), previously designated
IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic
(ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point
Arithmetic (ANSI/IEEE 854−1987) generalizes the binary standard to remove
dependencies on radix and word length. IEC 60559 generally refers to the floating-point
standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that
defines __STDC_IEC_559__ shall conform to the specifications in this annex. Where
a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified
behavior is adopted by reference, unless stated otherwise.
And, immediately after:
The C floating types match the IEC 60559 formats as follows:
The float type matches the IEC 60559 single format.
The double type matches the IEC 60559 double format.
Thus, since IEC 60559 is basically IEEE 754-1985, and since this specifies that 8 zero bytes mean 0.0 (as @David Heffernan said), it means that if you find __STDC_IEC_559__ defined, you can safely do a 0.0 initialization with memset.
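A minimal sketch of that guarded approach (zero_doubles is just an illustrative name):
#include <string.h>

void zero_doubles(double *a, size_t n)
{
#if defined(__STDC_IEC_559__)
    memset(a, 0, n * sizeof *a);        /* all-bits-zero is +0.0 in the IEC 60559 formats   */
#else
    for (size_t i = 0; i < n; i++)      /* portable fallback: no assumption about encoding  */
        a[i] = 0.0;
#endif
}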
If you are talking about IEEE 754, then the standard defines +0.0 in double precision as 8 zero bytes. If you know that you are backed by IEEE 754 floating point, then this is well-defined.
As for Intel, I can't think of a compiler that doesn't use IEEE754 on Intel x86/x64.
David Heffernan has given a good answer for part (2) of your question. For part (1):
The C99 standard makes no guarantees about the representation of floating-point values in the general case. §6.2.6.1 says:
The representations of all types are unspecified except as stated in this subclause.
...and that subclause makes no further mention of floating point.
You said:
(on a fixed platform, how can this UB ... it just depends of floating representation that's all ...)
Indeed - there is a difference between "undefined behaviour", "unspecified behaviour" and "implementation-defined behaviour":
"undefined behaviour" means that anything could happen (including a runtime crash);
"unspecified behaviour" means that the compiler is free to implement something sensible in any way it likes, but there is no requirement for the implementation choice to be documented;
"implementation-defined behaviour" means that the compiler is free to implement something sensible in any way it likes, and is supposed to document that choice (for example, see here for the implementation choices documented by the most recent release of GCC);
and so, as the floating-point representation is unspecified, it can vary in an undocumented manner from platform to platform (where "platform" here means "the combination of hardware and compiler" rather than just "hardware").
(I'm not sure how useful the guarantee that a double is represented such that all-bits-zero is +0.0 if __STDC_IEC_559__ is defined, as described in Matteo Italia's answer, actually is in practice. For example, GCC never defines this, even though it uses IEEE 754 / IEC 60559 on many hardware platforms.)
Even though it is unlikely that you will encounter a machine where this causes problems, you can also avoid the issue relatively easily. If you are really talking about arrays, as the question title indicates, and if those arrays have a length known at compile time (that is, they are not VLAs), then just initializing them is probably even more convenient:
double A[133] = { 0 };
should always work. If you have to zero such an array again later, and your compiler conforms to modern C (C99), you can do this with a compound literal:
memcpy(A, (double const[133]){ 0 }, 133*sizeof(double));
On any modern compiler this should be as efficient as memset, but it has the advantage of not relying on a particular encoding of double.
As Matteo Italia says, that’s legal according to the standard, but I wouldn’t use it. Something like
double *p = V, *last = V + N; // N is count
while (p != last) *(p++) = 0;
is at least twice as fast.
It’s “legal” to use memset. The issue is whether it produces a bit pattern where array[x] == 0.0 is true. While the basic C standard doesn’t require that to be true, I’d be interested in hearing examples where it isn’t!
It appears that setting to zero via memset is equivalent to assigning 0.0 on IBM-AIX, HP-UX (PARISC), HP-UX (IA-64), Linux (IA-64, I think).
Here is a trivial test program:
#include <stdio.h>
#include <string.h>

int main(void)
{
    double dFloat1 = 0.0;
    double dFloat2 = 111111.1111111;
    memset(&dFloat2, 0, sizeof(dFloat2));
    if (dFloat1 == dFloat2) {
        fprintf(stdout, "memset appears to be equivalent to = 0.0\n");
    } else {
        fprintf(stdout, "memset is NOT equivalent to = 0.0\n");
    }
    return 0;
}
Well, I think the zeroing is "legal" (after all, it's zeroing a regular buffer), but I have no idea if the standard lets you assume anything about the resulting logical value. My guess would be that the C standard leaves it as undefined.