What precisely does the %g printf specifier mean? - c

The %g specifier doesn't seem to behave in the way that most sources document it as behaving.
According to most sources I've found, across multiple languages that use printf specifiers, the %g specifier is supposed to be equivalent to either %f or %e - whichever would produce shorter output for the provided value. For instance, at the time of writing this question, cplusplus.com says that the g specifier means:
Use the shortest representation: %e or %f
And the PHP manual says it means:
g - shorter of %e and %f.
And here's a Stack Overflow answer that claims that
%g uses the shortest representation.
And a Quora answer that claims that:
%g prints the number in the shortest of these two representations
But this behaviour isn't what I see in reality. If I compile and run this program (as C or C++ - it's a valid program with the same behaviour in both):
#include <stdio.h>
int main(void) {
double x = 123456.0;
printf("%e\n", x);
printf("%f\n", x);
printf("%g\n", x);
printf("\n");
double y = 1234567.0;
printf("%e\n", y);
printf("%f\n", y);
printf("%g\n", y);
return 0;
}
... then I see this output:
1.234560e+05
123456.000000
123456
1.234567e+06
1234567.000000
1.23457e+06
Clearly, the %g output doesn't quite match either the %e or %f output for either x or y above. What's more, it doesn't look like %g is minimising the output length either; y could've been formatted more succinctly if, like x, it had not been printed in scientific notation.
Are all of the sources I've quoted above lying to me?
I see identical or similar behaviour in other languages that support these format specifiers, perhaps because under the hood they call out to the printf family of C functions. For instance, I see this output in Python:
>>> print('%g' % 123456.0)
123456
>>> print('%g' % 1234567.0)
1.23457e+06
In PHP:
php > printf('%g', 123456.0);
123456
php > printf('%g', 1234567.0);
1.23457e+6
In Ruby:
irb(main):024:0* printf("%g\n", 123456.0)
123456
=> nil
irb(main):025:0> printf("%g\n", 1234567.0)
1.23457e+06
=> nil
What's the logic that governs this output?

This is the full description of the g/G specifier in the C11 standard:
A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:
- if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
- otherwise, the conversion is with style e (or E) and precision P − 1.
Finally, unless the # flag is used, any trailing zeros are removed from the fractional portion of the result and the decimal-point character is removed if there is no fractional portion remaining.
A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.
This behaviour is somewhat similar to simply using the shortest representation out of %f and %e, but not equivalent. There are two important differences:
Trailing zeros (and, potentially, the decimal point) get stripped when using %g, which can cause the output of a %g specifier to not exactly match what either %f or %e would've produced.
The decision about whether to use %f-style or %e-style formatting is made based purely upon the size of the exponent that would be needed in %e-style notation, and does not directly depend on which representation would be shorter. There are several scenarios in which this rule results in %g selecting the longer representation, like the one shown in the question where %g uses scientific notation even though this makes the output 4 characters longer than it needs to be.
In case the C standard's wording is hard to parse, the Python documentation provides another description of the same behaviour:
General format. For a given precision p >= 1,
this rounds the number to p significant digits and
then formats the result in either fixed-point format
or in scientific notation, depending on its magnitude.
The precise rules are as follows: suppose that the
result formatted with presentation type 'e' and
precision p-1 would have exponent exp. Then
if -4 <= exp < p, the number is formatted
with presentation type 'f' and precision
p-1-exp. Otherwise, the number is formatted
with presentation type 'e' and precision p-1.
In both cases insignificant trailing zeros are removed
from the significand, and the decimal point is also
removed if there are no remaining digits following it.
Positive and negative infinity, positive and negative
zero, and nans, are formatted as inf, -inf,
0, -0 and nan respectively, regardless of
the precision.
A precision of 0 is treated as equivalent to a
precision of 1. The default precision is 6.
The many sources on the internet that claim that %g just picks the shortest out of %e and %f are simply wrong.

My favorite format for doubles is "%.15g". It has done the right thing in every case I've tried. As far as I know, 15 is also the number of decimal digits a double can reliably hold (DBL_DIG in <float.h> on typical implementations).

Related

Why does printf() with %f lose a digit after decimal point sometimes?

Why does the statement
printf("%f", sensorvalue)
print out a string like “11312.96” (with two digits after decimal points) most of the time, but sometimes print out a string like “11313.1” (with one digit after decimal point)? sensorvalue is read from a power meter continuously. The values at different times are supposed to have the same format.
It's C running on Linux.
Why does the statement printf("%f", sensorvalue) print out a string like 11312.96 (with two digits after the decimal point) most of the time, but sometimes print a string like 11313.1 (with one digit after the decimal point)?
The library is simply not C compliant even if "It's C running on Linux."
The output of
printf("%f\n", 11312.96f);
printf("%f\n", 11312.96);
printf("%f\n", 11313.1f);
printf("%f\n", 11313.1);
... is expected to be like the below, with 6 digits after the '.', perhaps with some variation in the least significant digits. Even with implementations of varying quality, the output should have 6 digits after the '.'.
11312.959961
11312.960000
11313.099609
11313.100000
Had the format been "%g", output like below could have occurred.
11313
11313
11313.1
11313.1
If you're using %f exactly as stated, this actually violates the standard (this would be unusual but certainly not unheard of), which states in C11 7.21.6.1 The fprintf function /8:
F, f: A double argument representing a floating-point number is converted to decimal notation in the style [−]ddd.ddd, where the number of digits after the decimal-point character is equal to the precision specification. If the precision is missing, it is taken as 6.
In other words, this program:
#include <stdio.h>
int main() {
double d1 = 11312.96, d2 = 11313.1;
printf("%f\n%f\n", d1, d2);
return 0;
}
should generate:
11312.960000
11313.100000
If you want it to have a different format (in both your seemingly incorrect case, and the case that complies with the standard), use the precision argument to force it, such as with:
printf("%.2f\n", d1); // gives "11312.96"
You may also want to specify the minimum field width to ensure your numbers are lined up on the right, such as with:
// posn:                          123456789
//                                ---------
printf("%9.2f\n", d1);  // gives " 11312.96"
printf("%9.2f\n", 3.1); // gives "     3.10"

How %g works in printf

The %g description says
Use the shortest representation: %e or %f
for example,
544666.678 is written as 544667 if %.6g is used which is fine.
But the same number is written as 5.4467E+5 when %.5g is used.
Why would it use exponential notation (%e) here, when 544670 (%f) is shorter than that?
Can anyone please help me to understand? Is this a bug?
Similarly, 44.35 is written as 44.4 when %.1f is used which is fine.
but 44.55 is written as 44.5 when %.1f is used. Why isn't it written as 44.6? Is this a bug?
The description "Use the shortest representation: %e or %f" is not precise enough.
C11 7.21.6.1 The fprintf function
A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X :
— if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
— otherwise, the conversion is with style e (or E) and precision P − 1.
Finally, unless the # flag is used, any trailing zeros are removed from the fractional portion of the result and the decimal-point character is removed if there is no fractional portion remaining.
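So in the example of %.5g with 544666.678, P is 5 and X is 5. According to the rule, style e is used because P > X ≥ −4 is false (5 > 5 does not hold). With %.6g, P is 6, so the condition holds and style f is used with precision 6 − (5 + 1) = 0.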

Get printf to print all float digits

I'm confused about the behavior of printf("%f", M_PI). It prints out 3.141593, but M_PI is 3.14159265358979323846264338327950288. Why does printf do this, and how can I get it to print out the whole float. I'm aware of the %1.2f format specifiers, but if I use them then I get a bunch of unused 0s and the output is ugly. I want the entire precision of the float, but not anything extra.
Why does printf do this, and how can I get it to print out the whole
float.
By default, the printf() function takes precision of 6 for %f and %F format specifiers. From C11 (N1570) §7.21.6.1/p8 The fprintf function (emphasis mine going forward):
If the precision is missing, it is taken as 6; if the precision is
zero and the # flag is not specified, no decimal-point character
appears. If a decimal-point character appears, at least one digit
appears before it. The value is rounded to the appropriate number
of digits.
Thus the call is equivalent to:
printf("%.6f", M_PI);
There is nothing like a "whole float", at least not directly as you think. The double objects are likely to be stored in binary IEEE-754 double precision representation. You can see the exact representation using the %a or %A format specifier, which prints it as a hexadecimal float. For instance:
printf("%a", M_PI);
outputs it as:
0x1.921fb54442d18p+1
which you can think of as the "whole float".
If all you need is the "longest decimal approximation" that makes sense, then use DBL_DIG from the <float.h> header. C11 5.2.4.2.2/p11 Characteristics of floating types:
number of decimal digits, q, such that any floating-point number with
q decimal digits can be rounded into a floating-point number with p
radix b digits and back again without change to the q decimal digits
For instance:
printf("%.*f", DBL_DIG-1, M_PI);
may print:
3.14159265358979
You can use sprintf to print a float to a string with an overkill display precision and then use a function to trim 0s before passing the string to printf using %s to display it. Proof of concept:
#include <math.h>
#include <string.h>
#include <stdio.h>
void trim_zeros(char *x){
int i;
i = strlen(x)-1;
while(i > 0 && x[i] == '0') x[i--] = '\0';
}
int main(void){
char s1[100];
char s2[100];
sprintf(s1,"%1.20f",23.01);
sprintf(s2,"%1.20f",M_PI);
trim_zeros(s1);
trim_zeros(s2);
printf("s1 = %s, s2 = %s\n",s1,s2);
//vs:
printf("s1 = %1.20f, s2 = %1.20f\n",23.01,M_PI);
return 0;
}
Output:
s1 = 23.010000000000002, s2 = 3.1415926535897931
s1 = 23.01000000000000200000, s2 = 3.14159265358979310000
This illustrates that this approach probably isn't quite what you want. Rather than simply trimming zeros, you might want to truncate if the number of consecutive zeros in the decimal part exceeds a certain length (which could be passed as a parameter to trim_zeros). Also, you might want to make sure that 23.0 displays as 23.0 rather than 23. (so maybe keep one zero after the decimal point). This is mostly a proof of concept: if you are unhappy with printf, use sprintf and then massage the result.
Once a piece of text is converted to a float or double, "all" the digits is no longer a meaningful concept. There's no way for the computer to know, for example, that it converted "3.14" or "3.14000000000000000275", and they both happened to produce the same float. You'll simply have to pick the number of digits appropriate to your task, based on what you know about the precision of the numbers involved.
If you want to print as many digits as are likely to be distinctly represented by the format, floats are about 7 digits and doubles are about 15, but that's an approximation.

What is the difference between %g and %f in C?

I was going through The C programming Language by K&R. Here in a statement to print a double variable it is written
printf("\t%g\n", sum += atof(line));
where sum is declared as double. Can anybody please help me out when to use %g in case of double or in case of float and whats the difference between %g and %f.
They are both examples of floating point input/output.
%g and %G are general-purpose variants that choose between the fixed notation of %f and the scientific notation of %e and %E.
%g takes a number, formats it in %f style or %e style depending on its exponent and the precision, and strips trailing zeros. The result is often, but not always, the shorter of the two.
The output of your print statement will depend on the value of sum.
See any reference manual, such as the man page:
f,F
The double argument is rounded and converted to decimal notation in the style [-]ddd.ddd, where the number of digits after the decimal-point character is equal to the precision specification. If the precision is missing, it is taken as 6; if the precision is explicitly zero, no decimal-point character appears. If a decimal point appears, at least one digit appears before it.
(The SUSv2 does not know about F and says that character string representations for infinity and NaN may be made available. The C99 standard specifies '[-]inf' or '[-]infinity' for infinity, and a string starting with 'nan' for NaN, in the case of f conversion, and '[-]INF' or '[-]INFINITY' or 'NAN*' in the case of F conversion.)
g,G
The double argument is converted in style f or e (or F or E for G conversions). The precision specifies the number of significant digits. If the precision is missing, 6 digits are given; if the precision is zero, it is treated as 1. Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision. Trailing zeros are removed from the fractional part of the result; a decimal point appears only if it is followed by at least one digit.
E = exponent expression: the value is written with a power of ten, power(10, n) or 10 ^ n
F = fraction expression, default 6 digits of precision
G = general expression, which tries to show the number in a concise way (but does it really?)
See the below example,
The code
#include <stdio.h>
int main(void)
{
double a = 4.5;
printf("=>>>> below is the example for printf 4.5\n");
printf("%%e %e\n",a);
printf("%%f %f\n",a);
printf("%%g %g\n",a);
printf("%%E %E\n",a);
printf("%%F %F\n",a);
printf("%%G %G\n",a);
double b = 1.79e308;
printf("=>>>> below is the example for printf 1.79*10^308\n");
printf("%%e %e\n",b);
printf("%%f %f\n",b);
printf("%%g %g\n",b);
printf("%%E %E\n",b);
printf("%%F %F\n",b);
printf("%%G %G\n",b);
double d = 2.25074e-308;
printf("=>>>> below is the example for printf 2.25074*10^-308\n");
printf("%%e %e\n",d);
printf("%%f %f\n",d);
printf("%%g %g\n",d);
printf("%%E %E\n",d);
printf("%%F %F\n",d);
printf("%%G %G\n",d);
return 0;
}
The output
=>>>> below is the example for printf 4.5
%e 4.500000e+00
%f 4.500000
%g 4.5
%E 4.500000E+00
%F 4.500000
%G 4.5
=>>>> below is the example for printf 1.79*10^308
%e 1.790000e+308
%f 178999999999999996376899522972626047077637637819240219954027593177370961667659291027329061638406108931437333529420935752785895444161234074984843178962619172326295244262722141766382622299223626438470088150218987997954747866198184686628013966119769261150988554952970462018533787926725176560021258785656871583744.000000
%g 1.79e+308
%E 1.790000E+308
%F 178999999999999996376899522972626047077637637819240219954027593177370961667659291027329061638406108931437333529420935752785895444161234074984843178962619172326295244262722141766382622299223626438470088150218987997954747866198184686628013966119769261150988554952970462018533787926725176560021258785656871583744.000000
%G 1.79E+308
=>>>> below is the example for printf 2.25074*10^-308
%e 2.250740e-308
%f 0.000000
%g 2.25074e-308
%E 2.250740E-308
%F 0.000000
%G 2.25074E-308
As Unwind points out f and g provide different default outputs.
Roughly speaking, if you care more about the details of what comes after the decimal point, I would go with f, and if you want to scale for large numbers, go with g. From some dusty memories, f is very nice with small values if you're printing tables of numbers, as everything stays lined up, but something like g is needed if you stand a chance of your numbers getting large and your layout matters. e is more useful when your numbers tend to be very small or very large but never near ten.
An alternative is to specify the output format so that you get the same number of characters representing your number every time.
Sorry for the woolly answer, but it is a subjective output thing that only gets hard answers if the number of characters generated or the precision of the represented value is important.
%g removes trailing zeros and, at the default precision of 6, switches between fixed and exponent notation based on magnitude.
Values that round to less than 10**6 print in fixed notation; larger values print in exponent notation, rounded to 6 significant digits:
123456 gives 123456
1234567 gives 1.23457e+06
Values with more than 6 significant digits are rounded to 6:
1.23456 gives 1.23456
1.234567 gives 1.23457
0.12345678 gives 0.123457
Values down to 10**-4 print in fixed notation; smaller values print in exponent notation:
0.0001 gives 0.0001
0.000001 gives 1e-06
%G does the same, but the exponent marker e becomes E.
%f and %g both print floating-point values. The difference is that %g usually produces shorter output: its precision counts significant digits rather than digits after the decimal point, and it removes trailing zeros.

Wrong output from printf of a number

int main()
{
double i=4;
printf("%d",i);
return 0;
}
Can anybody tell me why this program gives output of 0?
When you create a double initialised with the value 4, its 64 bits are filled according to the IEEE-754 standard for double-precision floating-point numbers. A floating-point number is divided into three parts: a sign, an exponent, and a fraction (also known as a significand, coefficient, or mantissa). The sign is one bit and denotes whether the number is positive or negative. The sizes of the other fields depend on the width of the type; for a 64-bit double they are 11 bits of exponent and 52 bits of fraction. To decode the number, the following formula is used:
1.Fraction × 2^(Exponent - 1023)
In your example, the sign bit is 0 because the number is positive, the fractional part is 0 because the number is initialised as an integer, and the exponent part contains the value 1025 (2 with an offset of 1023). The result is:
1.0 × 2^2
Or, as you would expect, 4. The binary representation of the number (divided into sections) looks like this:
0 10000000001 0000000000000000000000000000000000000000000000000000
Or, in hexadecimal, 0x4010000000000000. When passing a value to printf using the %d specifier, it attempts to read sizeof(int) bytes from the parameters you passed to it. In your case, sizeof(int) is 4, or 32 bits. Since the first (rightmost) 32 bits of the 64-bit floating-point number you supply are all 0, it stands to reason that printf produces 0 as its integer output. If you were to write:
printf("%d %d", i);
Then you might get 0 1074790400, where the second number is equivalent to 0x40100000. I hope you see why this happens. Other answers have already given the fix for this: use the %f format specifier and printf will correctly accept your double.
Jon Purdy gave you a wonderful explanation of why you were seeing this particular result. However, bear in mind that the behavior is explicitly undefined by the language standard:
7.19.6.1.9: If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
(emphasis mine) where "undefined behavior" means
3.4.3.1: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
IOW, the compiler is under no obligation to produce a meaningful or correct result. Most importantly, you cannot rely on the result being repeatable. There's no guarantee that this program would output 0 on other platforms, or even on the same platform with different compiler settings (it probably will, but you don't want to rely on it).
%d is for integers:
int main()
{
int i=4;
double f = 4;
printf("%d",i); // prints 4
printf("%0.f",f); // prints 4
return 0;
}
Because the language allows you to screw up and you happily do it.
More specifically, '%d' is the formatting for an int and therefore printf("%d") consumes as many bytes from the arguments as an int takes. But a double is much larger, so printf only gets a bunch of zeros. Use '%lf'.
Because "%d" specifies that you want to print an int, but i is a double. Try printf("%f\n", i); instead (the \n specifies a new-line character).
The simple answer to your question is, as others have said, that you're telling printf to print an integer number (for example a variable of the type int) whilst passing it a double-precision number (as your variable is of the type double), which is wrong.
Here's a snippet from the printf(3) linux programmer's manual explaining the %d and %f conversion specifiers:
d, i The int argument is converted to signed decimal notation. The
precision, if any, gives the minimum number of digits that must
appear; if the converted value requires fewer digits, it is
padded on the left with zeros. The default precision is 1.
When 0 is printed with an explicit precision 0, the output is
empty.
f, F The double argument is rounded and converted to decimal notation
in the style [-]ddd.ddd, where the number of digits after the
decimal-point character is equal to the precision specification.
If the precision is missing, it is taken as 6; if the precision
is explicitly zero, no decimal-point character appears. If a
decimal point appears, at least one digit appears before it.
To make your current code work, you can do two things. The first alternative has already been suggested - substitute %d with %f.
The other thing you can do is to cast your double to an int, like this:
printf("%d", (int) i);
The more complex answer (addressing why printf acts like it does) was given briefly by Jon Purdy. For a more in-depth explanation, have a look at the Wikipedia articles on floating-point arithmetic and double precision.
Because i is a double and you tell printf to use it as if it were an int (%d).
@jagan, regarding the sub-question:
"What is the leftmost third byte? Why is it 00000001? Can somebody explain?"
10000000001 is 1025 in binary.

Resources