Format specifier %F missing from glibc's documentation

Format specifier %F missing from glibc's documentation - c

I stumbled upon the following deficiency in the GNU C Library's documentation:
‘%f’
Print a floating-point number in normal (fixed-point) notation. See Floating-Point Conversions, for details.
And it does not mention %F anywhere, although according to the C99 standard:
The F conversion specifier produces INF, INFINITY, or NAN instead of inf, infinity, or nan, respectively.
A simple example to test this:
double inf = 1.0 / 0.0;
printf( "Testing on %ld.%ld, %%f: %f and %%F: %F.\n", __GLIBC__, __GLIBC_MINOR__, inf, inf );
This outputs:
Testing on 2.24, %f: inf and %F: INF.
So it works as it should, not as the documentation says.
I was thinking why the capital F feature is not in the documentation. Does it make sense? Yes. Does it comply with the standard? Yes. Is it be implemented? Yes. So, I don't see the reason.
Is there a convention that issues regulated by the standard are automatically considered part of the library documentation? Is it safe to use %F in a printf format?

That various open source projects come with incomplete or incorrect documentation isn't exactly news. The documentation is simply missing this information indeed.
The 2.36 documentation is also missing this info. Conclusion: the documentation is bad.
And yes %F is perfectly safe and well-defined as per C99. For very old compiler versions (such as gcc < 5.0.0) you should compile with -std=c99 however.

Related

Why printf is not able to handle flags, field width and precisions properly?

I'm trying to discover all capabilities of printf and I have tried this :
printf("Test:%+*0d", 10, 20);
that prints
Test:%+100d
I have use first the flag +, then the width * and the re-use the flag 0.
Why it's make this output ? I purposely used printf() in a bad way but I wonder why it shows me the number 100?

This is because, you're supplying syntactical nonsense to the compiler, so it is free to do whatever it wants. Related reading, undefined behavior.
Compile your code with warnings enabled and it will tell you something like
warning: unknown conversion type character ‘0’ in format [-Wformat=]
printf("Test:%+*0d", 10, 20);
^
To be correct, the statement should be either of
printf("Test:%+*.0d", 10, 20); // note the '.'
where, the 0 is used as a precision
Related, quoting the C11, chapter §7.21.6.1, (emphasis mine)
An optional precision that gives the minimum number of digits to appear for the d, i,
o, u, x, and X conversions, the number of digits to appear after the decimal-point
character for a, A, e, E, f, and F conversions, the maximum number of significant
digits for the g and G conversions, or the maximum number of bytes to be written for s conversions. The precision takes the form of a period (.) followed either by an
asterisk * (described later) or by an optional decimal integer; if only the period is
specified, the precision is taken as zero. If a precision appears with any other
conversion specifier, the behavior is undefined.
printf("Test:%+0*d", 10, 20);
where, the 0 is used as a flag. As per the syntax, all the flags should appear together, before any other conversion specification entry, you cannot just put it anywhere in the conversion specification and expect the compiler to follow your intention.
Again, to quote, (and my emphasis)
Each conversion specification is introduced by the character %. After the %, the following
appear in sequence:
Zero or more flags (in any order) [...]
An optional minimum field width [...]
An optional precision [...]
An optional length modifier [...]
A conversion specifier [....]

Your printf format is incorrect: the flags must precede the width specifier.
After it handles * as the width specifier, printf expects either a . or a length modifier or a conversion specifier, 0 being none of these, the behavior is undefined.
Your library's implementation of printf does something bizarre, it seems to handle * by replacing it with the actual width argument... A side effect of the implementation. Others may do something else, including aborting the program. Such a format error would be especially risky if followed by a %s conversion.
Changing your code to printf("Test:%+0*d", 10, 20); should produce the expected output:
Test:+000000020

In complement of Sourav Ghosh's answer; an important notion is that of undefined behavior, which is tricky. Be sure to read Lattner's blog: What Every C Programmer Should Know About Undefined Behavior. See also this.
So, leaving on purpose (or perhaps depending upon) some undefined behavior in your code is intentional malpractice. Don't do that. In the very rare cases you want to do that (I cannot see any), please document it and justify yourself in some comment.
Be aware that if indeed printf is implemented by the C standard library, it can be (and often is) specially handled by the compiler (with GCC and GNU libc, that magic might happens using internally __builtin_printf)
The C99 & C11 standards are partially specifying the behavior of printf but does leave some undefined behavior cases to ease the implementation. You are unlikely to full understand or be able to mimic these cases. And the implementation itself could change (for example, on my Debian Linux, an upgrade of libc might change the undefined behavior of printf)
If you want to understand more printf study the source of some C standard library implementation (e.g. musl-libc, whose code is quite readable) and of the GCC implementation (assuming a Linux operating system).
But maintainers of GNU libc and of GCC (& even of the Linux kernel, thru syscalls) stay free to change the undefined behavior (of printf and anything else)
In practice, always compile with gcc -Wall (and probably also -g) if using GCC. Don't accept any warnings (so improve your own code till you get none).

How to detect if printf will support %a?

I need to losslessly represent a double precision float in a string, and so I am using
sprintf(buf, "%la", x);
This works fine on my system, but when built on MinGW under Windows, gives a warning:
unknown conversion type character 'a' in format
I coded up a workaround for this case, but have trouble detecting when I should use the workaround -- I tried #if __STDC_VERSION__ >= 199901L, but it seems Gcc/MinGW defines that even if it doesn't support %a. Is there another macro I could check?

This doesn't answer the question "How to detect if printf will support %a?" in the general case, but you can modify your compiler installation so that %a is supported.
First of all , use mingw-w64. This is an up-to-date fork of MinGW. The original version of MinGW is not well maintained and does not fix bugs such as you are experiencing (preferring to blame Microsoft or something).
Using mingw-w64 4.9.2 in Windows 10, the following code works for me:
#include <stdio.h>
int main()
{
double x = 3.14;
printf("%a\n", x);
}
producing 0x1.91eb85p+1 which is correct. This is still deferring to the Microsoft runtime.
Your question mentions %la, however %a and %la are both the same and can be used to print either a float or a double argument.
If you want to print a long double, then the Microsoft runtime does not support that; gcc and MS use different sizes of long double. You have to use mingw-w64's own printf implementation:
#define __USE_MINGW_ANSI_STDIO 1
#include <stdio.h>
int main()
{
long double x = 3.14;
printf("%La\n", x);
}
which outputs 0xc.8f5c28f5c28f8p-2. This is actually the same number as 0x1.91eb85p+1 with more precision and a different placement of the binary point, it is also correct.

As jxh already suspected, MingW uses the MSVCRT LibC from Windows. The C99 support is not complete, especially some of the options of printf(3) like a are missing.

Is `x!=x` a portable way to test for NaN?

In C you can test to see if a double is NaN using isnan(x). However many places online, including for example this SO answer say that you can simply use x!=x instead.
Is x!=x in any C specification as a method that is guaranteed to test if x is NaN? I can't find it myself and I would like my code to work with different compilers.

NaN as the only value x with the property x!=x is an IEEE 754 guarantee. Whether it is a faithful test to recognize NaN in C boils down to how closely the representation of variables and the operations are mapped to IEEE 754 formats and operations in the compiler(s) you intend to use.
You should in particular worry about “excess precision” and the way compilers deal with it. Excess precision is what happens when the FPU only conveniently supports computations in a wider format than the compiler would like to use for float and double types. In this case computations can be made at the wider precision, and rounded to the type's precision when the compiler feels like it in an unpredictable way.
The C99 standard defined a way to handle this excess precision that preserved the property that only NaN was different from itself, but for a long time after 1999 (and even nowadays when the compiler's authors do not care), in presence of excess precision, x != x could possibly be true for any variable x that contains the finite result of a computation, if the compiler chooses to round the excess-precision result of the computation in-between the evaluation of the first x and the second x.
This report describes the dark times of compilers that made no effort to implement C99 (either because it wasn't 1999 yet or because they didn't care enough).
This 2008 post describes how GCC started to implement the C99 standard for excess precision in 2008. Before that, GCC could provide one with all the surprises described in the aforementioned report.
Of course, if the target platform does not implement IEEE 754 at all, a NaN value may not even exist, or exist and have different properties than specified by IEEE 754. The common cases are a compiler that implements IEEE 754 quite faithfully with FLT_EVAL_METHOD set to 0, 1 or 2 (all of which guarantee that x != x iff x is NaN), or a compiler with a non-standard implementation of excess precision, where x != x is not a reliable test for NaN.

Please refer to the normative section Annex F: IEC 60559 floating-point arithmetic of the C standard:
F.1 Introduction
An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex.
Implementations that do not define __STDC_IEC_559__ are not required to conform to these specifications.
F.9.3 Relational operators
The expression x ≠ x is true if x is a NaN.
The expression x = x is false if X is a Nan.
F.3 Operators and functions
The isnan macro in <math.h> provides the isnan function recommended in the Appendix to IEC 60559.

Is it legal to use memset(…, 0, …) on an array of doubles?

Is it legal to zero the memory of an array of doubles (using memset(…, 0, …)) or struct containing doubles?
The question implies two different things:
From the point of view of C standard: Is this undefined behavior of not? (On any particular platform, I presume, this cannot be undefined behavior, as it just depends on the in-memory representation of floating-point numbers—that’s all.)
From practical point of view: Is it OK on Intel platform? (Regardless of what the standard is saying.)

The C99 standard Annex F says:
This annex specifies C language support for the IEC 60559 floating-point standard. The
IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for
microprocessor systems, second edition (IEC 60559:1989), previously designated
IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic
(ANSI/IEEE 754−1985). IEEE Standard for Radix-Independent Floating-Point
Arithmetic (ANSI/IEEE 854−1987) generalizes the binary standard to remove
dependencies on radix and word length. IEC 60559 generally refers to the floating-point
standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that
defines __STDC_IEC_559__ shall conform to the specifications in this annex. Where
a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified
behavior is adopted by reference, unless stated otherwise.
And, immediately after:
The C floating types match the IEC 60559 formats as follows:
The float type matches the IEC 60559 single format.
The double type matches the IEC 60559 double format.
Thus, since IEC 60559 is basically IEEE 754-1985, and since this specifies that 8 zero bytes mean 0.0 (as #David Heffernan said), it means that if you find __STDC_IEC_559__ defined, you can safely do a 0.0 initialization with memset.

If you are talking about IEEE754 then the standard defines +0.0 to double precision as 8 zero bytes. If you know that you are backed by IEEE754 floating point then this is well-defined.
As for Intel, I can't think of a compiler that doesn't use IEEE754 on Intel x86/x64.

David Heffernan has given a good answer for part (2) of your question. For part (1):
The C99 standard makes no guarantees about the representation of floating-point values in the general case. §6.2.6.1 says:
The representations of all types are unspecified except as stated in this subclause.
...and that subclause makes no further mention of floating point.
You said:
(on a fixed platform, how can this UB ... it just depends of floating representation that's all ...)
Indeed - there a difference between "undefined behaviour", "unspecified behaviour" and "implementation-defined behaviour":
"undefined behaviour" means that anything could happen (including a runtime crash);
"unspecified behaviour" means that the compiler is free to implement something sensible in any way it likes, but there is no requirement for the implementation choice to be documented;
"implementation-defined behaviour" means that the compiler is free to implement something sensible in any way it likes, and is supposed to document that choice (for example, see here for the implementation choices documented by the most recent release of GCC);
and so, as floating point representation is unspecified behaviour, it can vary in an undocumented manner from platform to platform (where "platform" here means "the combination of hardware and compiler" rather than just "hardware").
(I'm not sure how useful the guarantee that a double is represented such that all-bits-zero is +0.0 if __STDC_IEC_559__ is defined, as described in Matteo Italia's answer, actually is in practice. For example, GCC never defines this, even though is uses IEEE 754 / IEC 60559 on many hardware platforms.)

Even though it is unlikely that you encounter a machine where this has problems, you may also avoid this relatively easily if you are really talking of arrays as you indicate in the question title, and if these arrays are of known length at compile time (that is not VLA), then just initializing them is probably even more convenient:
double A[133] = { 0 };
should always work. If you'd have to zero such an array again, later, and your compiler is compliant to modern C (C99) you can do this with a compound literal
memcpy(A, (double const[133]){ 0 }, 133*sizeof(double));
on any modern compiler this should be as efficient as memset, but has the advantage of not relying on a particular encoding of double.

As Matteo Italia says, that’s legal according to the standard, but I wouldn’t use it. Something like
double *p = V, *last = V + N; // N is count
while (p != last) *(p++) = 0;
is at least twice faster.

It’s “legal” to use memset. The issue is whether it produces a bit pattern where array[x] == 0.0 is true. While the basic C standard doesn’t require that to be true, I’d be interested in hearing examples where it isn’t!
It appears that setting to zero via memset is equivalent to assigning 0.0 on IBM-AIX, HP-UX (PARISC), HP-UX (IA-64), Linux (IA-64, I think).
Here is a trivial test code:
double dFloat1 = 0.0;
double dFloat2 = 111111.1111111;
memset(&dFloat2, 0, sizeof(dFloat2));
if (dFloat1 == dFloat2) {
fprintf(stdout, "memset appears to be equivalent to = 0.0\n");
} else {
fprintf(stdout, "memset is NOT equivalent to = 0.0\n");
}

Well, I think the zeroing is "legal" (after all, it's zeroing a regular buffer), but I have no idea if the standard lets you assume anything about the resulting logical value. My guess would be that the C standard leaves it as undefined.

How to declare IEEE mathematical functions like 'ilogbf' in MSVC++6?

Could someone please help and tell me how to include IEEE mathematical functions in MSVC++6? I tried both and , but I still get these errors:
error C2065: 'ilogbf' : undeclared identifier
error C2065: 'scalbnf' : undeclared identifier

Edit 3: Hopefully this will be my final edit. I have come to realize that I haven't properly addressed this question at all. I am going to leave my answer in place as a cautionary tale, and because it may have some educational value. But I understand why I have zero upvotes, and in fact I am going to upvote Andy Ross' answer because I think his is much more relevant (although incomplete at least at the time of writing). It seems to me my mistake was to take the Man definitions I found for ilogbf() a little superficially. It's a function that takes the integer part of the log of a float, how hard can that be to implement ? It turns out what the function is really about is IEEE floating point representation, in particular the exponent (as opposed to the mantissa) part of that representation. I should definitely have realized that before attempting to answer the question! An interesting point to me is how a function can possibly find the exponent part of a float, as I thought a fundamental rule of C is that floats are promoted to doubles as part of a function call. But that's a whole separate discussion of course.
--- End of edit 3, start of cautionary tale ---
A little googling suggests these are defined in some flavors of Unix, but maybe are not in any Posix or ANSI standard and so not provided with the MSVC libraries. If the functions aren't in the library they won't be declared in math.h. Obviously if the compiler can't see declarations for these external symbols it won't be happy and you'll get errors like the ones you list.
The obvious work around is to create your own versions of these functions, using math functions that are provided. eg
#include <math.h>
int ilogbf( float f )
{
double d1 = (double)f;
double d2 = log(d1);
int ret = (int)d2;
return ret;
}
Edit: This isn't quite right. Apparently, this function should use log to the base 2, rather than natural logs, so that the returned value is actually a binary exponent. It should also take the absolute value of its parameter, so that it will work for negative numbers as well. I will work up an improved version, if you ask me in a comment, otherwise I'm tempted to leave that as an exercise for the reader :-)
The essence of my answer, i.e. that ANSI C doesn't require this function and that MSVC doesn't include it, is apparently correct.
Edit 2: Okay I've weakened and provided an improved version without being asked. Here it is;
#include <math.h>
int ilogbf( float f )
{
double d1 = (double)f;
if( d1 < 0 )
d1 = -d1;
double d2 = log(d1) / log(2); // log2(x) = ln(x)/ln(2)
int ret = (int)d2;
return ret;
}

These are C99 functions, not IEEE754-1985. Microsoft seems to have decided that their market doesn't care about C99 support, so they haven't bothered to provide them. This is a shame, but unless more of you (developers) complain, there's no reason to expect that the situation will change.
The brand new 754 standard, IEEE754-2008, requires these functions (Clause 5.3.3, "logBFormat operations"), but that version of the standard won't be widely adopted for several more years; even if it does reach wide adoption, Microsoft hasn't seen fit to provide these functions for the ten years they've been in C99 so why would they bother to provide them just because they're in the IEEE754 standard?
edit: note that scalb and logb are defined in the IEEE754-1985 Appendix "Recommended Functions and Predicates", but said appendix is explicitly "not a part of" said standard.

If you know you're on an IEEE system (and these days, you do), these functions aren't needed: just inspect the bits directly by unioning the double with a uint64_t. Presumably you're using these functions in the interest of efficiency in the first place (otherwise you'd be using more natural operations like log() or exp()), so spending a little effort on matching your code to the floating point representation is probably worthwhile.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight