Read scientific notation scanf - c

I am developing a program that should use only one scanf call and be able to accept input both in scientific notation and as plain real numbers.
Any help will be appreciated.

According to the scanf documentation:
%f matches a floating-point number. The format of the number is the same as expected by strtof().
Looking at the strtof documentation
(optional) e or E followed with optional minus or plus sign and nonempty sequence of decimal digits (defines exponent)
Thus, you can use the %f specifier to read numbers in e notation. That is,
1e-3 is 1 * 10 ^ -3.
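For instance, a minimal sketch (the variable name and messages are illustrative):

#include <stdio.h>

int main(void) {
    float value;
    /* A single scanf with %f accepts both plain decimals
       ("0.001") and scientific notation ("1e-3"). */
    if (scanf("%f", &value) == 1)
        printf("read: %g\n", value);
    else
        printf("invalid input\n");
    return 0;
}

Feeding it 1e-3 or 0.001 prints read: 0.001 either way.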

Related

What precisely does the %g printf specifier mean?

The %g specifier doesn't seem to behave in the way that most sources document it as behaving.
According to most sources I've found, across multiple languages that use printf specifiers, the %g specifier is supposed to be equivalent to either %f or %e - whichever would produce shorter output for the provided value. For instance, at the time of writing this question, cplusplus.com says that the g specifier means:
Use the shortest representation: %e or %f
And the PHP manual says it means:
g - shorter of %e and %f.
And here's a Stack Overflow answer that claims that
%g uses the shortest representation.
And a Quora answer that claims that:
%g prints the number in the shortest of these two representations
But this behaviour isn't what I see in reality. If I compile and run this program (as C or C++ - it's a valid program with the same behaviour in both):
#include <stdio.h>

int main(void) {
    double x = 123456.0;
    printf("%e\n", x);
    printf("%f\n", x);
    printf("%g\n", x);
    printf("\n");

    double y = 1234567.0;
    printf("%e\n", y);
    printf("%f\n", y);
    printf("%g\n", y);
    return 0;
}
... then I see this output:
1.234560e+05
123456.000000
123456
1.234567e+06
1234567.000000
1.23457e+06
Clearly, the %g output doesn't quite match either the %e or %f output for either x or y above. What's more, it doesn't look like %g is minimising the output length either; y could've been formatted more succinctly if, like x, it had not been printed in scientific notation.
Are all of the sources I've quoted above lying to me?
I see identical or similar behaviour in other languages that support these format specifiers, perhaps because under the hood they call out to the printf family of C functions. For instance, I see this output in Python:
>>> print('%g' % 123456.0)
123456
>>> print('%g' % 1234567.0)
1.23457e+06
In PHP:
php > printf('%g', 123456.0);
123456
php > printf('%g', 1234567.0);
1.23457e+6
In Ruby:
irb(main):024:0* printf("%g\n", 123456.0)
123456
=> nil
irb(main):025:0> printf("%g\n", 1234567.0)
1.23457e+06
=> nil
What's the logic that governs this output?
This is the full description of the g/G specifier in the C11 standard:
A double argument representing a floating-point number is converted in style f or e (or in style F or E in the case of a G conversion specifier), depending on the value converted and the precision. Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:
if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
otherwise, the conversion is with style e (or E) and precision P − 1.
Finally, unless the # flag is used, any trailing zeros are removed from the fractional portion of the result and the decimal-point character is removed if there is no fractional portion remaining.
A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.
This behaviour is somewhat similar to simply using the shortest representation out of %f and %e, but not equivalent. There are two important differences:
Trailing zeros (and, potentially, the decimal point) get stripped when using %g, which can cause the output of a %g specifier to not exactly match what either %f or %e would've produced.
The decision about whether to use %f-style or %e-style formatting is made based purely upon the size of the exponent that would be needed in %e-style notation, and does not directly depend on which representation would be shorter. There are several scenarios in which this rule results in %g selecting the longer representation, like the one shown in the question where %g uses scientific notation even though this makes the output 4 characters longer than it needs to be.
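As a quick illustration (a sketch; the output shown assumes the default precision of 6 and typical C library behaviour), this program exercises both differences:

#include <stdio.h>

int main(void) {
    /* Difference 1: trailing zeros are stripped, so %g matches
       neither %f (0.500000) nor %e (5.000000e-01) exactly. */
    printf("%g\n", 0.5);          /* 0.5 */
    /* Difference 2: the choice is based on the exponent alone.
       Here X = 6 and P = 6, so P > X fails and e-style is used,
       even though "1234567" would be shorter. */
    printf("%g\n", 1234567.0);    /* 1.23457e+06 */
    return 0;
}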
In case the C standard's wording is hard to parse, the Python documentation provides another description of the same behaviour:
General format. For a given precision p >= 1, this rounds the number to p significant digits and then formats the result in either fixed-point format or in scientific notation, depending on its magnitude.
The precise rules are as follows: suppose that the result formatted with presentation type 'e' and precision p-1 would have exponent exp. Then if -4 <= exp < p, the number is formatted with presentation type 'f' and precision p-1-exp. Otherwise, the number is formatted with presentation type 'e' and precision p-1. In both cases insignificant trailing zeros are removed from the significand, and the decimal point is also removed if there are no remaining digits following it.
Positive and negative infinity, positive and negative zero, and nans, are formatted as inf, -inf, 0, -0 and nan respectively, regardless of the precision.
A precision of 0 is treated as equivalent to a precision of 1. The default precision is 6.
The many sources on the internet that claim that %g just picks the shortest out of %e and %f are simply wrong.
My favorite format for doubles is "%.15g". It seems to do the right thing in every case, and 15 (DBL_DIG for IEEE 754 doubles) is the largest number of significant decimal digits a double can reproduce reliably.
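For example (a small sketch; the values are illustrative):

#include <stdio.h>

int main(void) {
    printf("%.15g\n", 0.1);           /* 0.1 */
    printf("%.15g\n", 1.0 / 3.0);     /* 0.333333333333333 */
    printf("%.15g\n", 123456789.25);  /* 123456789.25 */
    return 0;
}

Trailing zeros are stripped, so short values stay short while up to 15 significant digits are preserved.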

Standard form in C

After a few hours of searching I have not been able to find an answer, but if this is a duplicate please point me in the correct direction.
Will a C program accept standard-form input into something like scanf("%f", &float); from the keyboard? Standard form means writing a number like 2400 as 2.4E3, if that helps you understand what I am asking.
I must stress this MUST be from the keyboard.
Yes.
scanf works identically, regardless of whether stdin is connected to a terminal, a file, or some other input stream.
The %a, %e, %f and %g scanf format codes all do the same thing: they consume the longest string which would be acceptable to strtod (§7.21.6.2/12). (And so do %A, %E, %F and %G; see paragraph 14 of the same clause.) The redundancy exists because scanf accepts the same format codes as printf.
strtod accepts any of the following (§7.22.1.3/3):
a nonempty sequence of decimal digits optionally containing a decimal-point character, then an optional exponent part as defined in 6.4.4.2;
a 0x or 0X, then a nonempty sequence of hexadecimal digits optionally containing a decimal-point character, then an optional binary exponent part as defined in 6.4.4.2;
INF or INFINITY, ignoring case;
NAN or NAN(n-char-sequence_opt), ignoring case in the NAN part, …
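A sketch demonstrating a few of these forms (the input strings are illustrative); anything strtod accepts, scanf's %f accepts too:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const char *inputs[] = { "2.4E3", "0x1.8p1", "INF", "nan" };
    for (int i = 0; i < 4; i++) {
        /* strtod parses each string the same way scanf would. */
        double d = strtod(inputs[i], NULL);
        printf("%-8s -> %g\n", inputs[i], d);
    }
    return 0;
}

This prints 2400, 3 (the hexadecimal form means 1.5 × 2^1), inf and nan.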
Yes, it is accepted from my terminal and also by the IDEone interpreter (I added it to stdin, which is the same as "from the keyboard").
And probably a typo, but note that float is a keyword, so it cannot be used as a variable name in
scanf("%f", &float)

e format in printf() and precision modifiers

Could you explain me why
printf("%2.2e", 1201.0);
gives a result 1.20e+03 and not just 12.01e2?
My way of thinking: the number is 1201.0, and the specifier tells us there should be 2 digits before the decimal point and 2 digits after it.
What is wrong?
According to Wikipedia:
In normalized scientific notation, the exponent b is chosen so that the absolute value of a remains at least one but less than ten (1 ≤ |a| < 10). Thus 350 is written as 3.5×10^2. This form allows easy comparison of numbers, as the exponent b gives the number's order of magnitude. In normalized notation, the exponent b is negative for a number with absolute value between 0 and 1 (e.g. 0.5 is written as 5×10^−1). The 10 and exponent are often omitted when the exponent is 0.
Normalized scientific form is the typical form of expression of large numbers in many fields, unless an unnormalised form, such as engineering notation, is desired. Normalized scientific notation is often called exponential notation—although the latter term is more general and also applies when a is not restricted to the range 1 to 10 (as in engineering notation for instance) and to bases other than 10 (as in 3.15×2^20).
The first 2 in "%2.2e" is the minimum field width to print. 1.20e+03 is 8 characters, which is more than 2, so the width has no effect here.
e directs that the number is printed: (sign), 1 digit, '.', followed by some digits and an exponent.
The 2nd 2 in "%2.2e" is the number of digits after the decimal point to print. 6 is used if this 2nd value is not provided.
The %e format uses scientific notation, i.e. one digit before the decimal separator and an exponent for scaling. You can't set the digits before the decimal separator using this format.
This is just how scientific notation is defined. The result you expect is a nonstandard notation; I don't think you can get it with printf.
The number before the dot in the format specifier defines the minimum width of the resulting sub-string. Try %20.2e to see what that means.
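For instance (a quick sketch; the brackets are only there to make the padding visible):

#include <stdio.h>

int main(void) {
    printf("[%2.2e]\n", 1201.0);   /* [1.20e+03]: 8 chars already exceed width 2 */
    printf("[%20.2e]\n", 1201.0);  /* [            1.20e+03]: padded to 20 chars */
    return 0;
}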

Formatting floating point numbers in C [duplicate]

Possible Duplicate:
Avoid trailing zeroes in printf()
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE *file;
    double n;

    file = fopen("fp.source", "r");
    /* Loop on the fscanf result rather than feof(), which only
       reports end-of-file after a read has already failed. */
    while (fscanf(file, "%lf", &n) == 1) {
        printf("Next double:\"%lf\"\n", n);
    }
    fclose(file);
    return 0;
}
Hi, I am trying to scan for floating-point numbers, and I have gotten it to work, but I get trailing zeroes that I don't want. Is there a way to avoid this? For example, the current output I get is:
Next double:"11.540000"
When in reality I would like:
Next double:"11.54"
That's not a problem with scanning. That is a problem with printf formatting.
From the documentation (emphasis mine):
f, F
The double argument shall be converted to decimal notation in the style "[-]ddd.ddd", where the number of digits after the radix character is equal to the precision specification. If the precision is missing, it shall be taken as 6; if the precision is explicitly zero and no '#' flag is present, no radix character shall appear. If a radix character appears, at least one digit appears before it. The low-order digit shall be rounded in an implementation-defined manner.
You probably want %g (again, emphasis mine):
g, G
The double argument shall be converted in the style f or e (or in the style F or E in the case of a G conversion specifier), with the precision specifying the number of significant digits. If an explicit precision is zero, it shall be taken as 1. The style used depends on the value converted; style e (or E ) shall be used only if the exponent resulting from such a conversion is less than -4 or greater than or equal to the precision. Trailing zeros shall be removed from the fractional portion of the result; a radix character shall appear only if it is followed by a digit or a '#' flag is present.
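Applied to the loop above, a minimal change (keeping everything else as-is):

printf("Next double:\"%g\"\n", n);

For the input 11.54 this prints Next double:"11.54", with the trailing zeros removed automatically.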
Is there a way to avoid this?
Yes, just format the output correctly:
printf("Next double:\"%.2f\"\n", n);
The .2 in the printf format string indicates that you want exactly two digits after the decimal point, which prints 11.54 here. You can change the 2 if you want one, three or more digits after the decimal.
Try this (note that the precision belongs after the dot; %l.0f is not a valid conversion specification):
printf("Next double:\"%.0f\"\n", n);
Here's a good reference for string format specifiers.
http://www.cplusplus.com/reference/cstdio/printf/

What Is "\t%.10g\n"

I'm new to Bison, but not to C/C++, and in all my time with development and regular expressions I have never seen anything like this; I only know \n, which is used for a new line. I want to know the explanation of \t%.10g, which appears in the code like this:
line: '\n'
    | exp '\n'  { printf ("\t%.10g\n", $1); }
    ;
Best Regards.
It means "print a tab character (\t) followed by a floating point number with 10 decimal places, either in scientific or fixed point notation depending on the order of magnitude (%.10g), followed by a newline (\n)".
Have a look at the printf reference to decode the pattern:
g Use the shorter of %e or %f
e Scientific notation (mantissa/exponent) using e character
f Decimal floating point
Thus, %.10g prints a decimal number with ten significant digits.
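For instance (a small sketch):

#include <stdio.h>

int main(void) {
    printf("\t%.10g\n", 3.14159265358979);  /* 3.141592654 (fixed-point) */
    printf("\t%.10g\n", 12345678901234.0);  /* 1.23456789e+13 (exponent >= 10) */
    return 0;
}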
It's not a regex but a printf format specification: print a tab character, followed by a floating-point number with 10 significant digits, formatted either the %f (fixed-point) way or the %e (scientific notation) way depending on its magnitude, and end with a newline.
man printf
