I am trying to write a parser in C, and part of its job is to convert a series of characters into a double. Up to now I have been using strtod, but I find it to be quite dangerous, and it won't handle cases where the number is at the end of a buffer that is not null-terminated.
I thought I'd write my own. If I have a string representation of a number of the form a.b, would I be naive to think that I can just calculate (double)a + ((double)b / (double)10^n), where n is the number of digits in b?
For example, 23.4563:
a = 23
b = 4563
final answer: 23 + (4563/10000)
Or would that produce inaccurate results with regard to the IEEE format of floats?
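To make the idea concrete, here is a sketch of what I mean (naive_parse is a hypothetical helper, with a and b already split out as integers):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Naive scheme: value = a + b / 10^n, where n is the digit count of b. */
double naive_parse(long a, long b, int n) {
    return (double)a + (double)b / pow(10.0, n);
}

int main(void) {
    double naive = naive_parse(23, 4563, 4);
    double libc  = strtod("23.4563", NULL);
    /* The naive version rounds more than once (the division, the addition,
       and possibly pow for large n), while a correctly rounded strtod
       rounds exactly once, so the two can differ in the last bit. */
    printf("naive:  %.17g\nstrtod: %.17g\nequal: %d\n",
           naive, libc, naive == libc);
    return 0;
}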
It is hard to read floating-point numerals accurately, in the sense that there are various problems that must be carefully addressed, and many people fail to do so. However, it is a solved problem. To start, see How to read floating point numbers accurately, June 1990, by William D. Clinger.
I agree with Roddy, you are likely better off copying the data into a buffer and using existing library functions. (However, you should check that your C implementation provides correctly rounded conversion of floating-point numerals. The C standard does not require it, and some implementations do not provide it.)
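For the non-terminated-buffer problem, here is a minimal sketch of that copy-into-a-buffer approach (parse_double and the 64-byte limit are illustrative assumptions, not a hardened implementation):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse a double from a buffer that is NOT null-terminated.  Copy at
   most sizeof tmp - 1 bytes so strtod always sees a terminated string;
   *consumed reports how many input bytes were actually used. */
double parse_double(const char *p, size_t len, size_t *consumed) {
    char tmp[64];
    size_t n = len < sizeof tmp - 1 ? len : sizeof tmp - 1;
    memcpy(tmp, p, n);
    tmp[n] = '\0';
    char *end;
    double d = strtod(tmp, &end);
    *consumed = (size_t)(end - tmp);
    return d;
}

int main(void) {
    const char buf[] = { '2', '3', '.', '4', '5', '6', '3' };  /* no '\0' */
    size_t used;
    double d = parse_double(buf, sizeof buf, &used);
    printf("parsed %g using %zu bytes\n", d, used);
    return 0;
}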
You may be interested in this answer of mine to a somewhat related question.
The parser in that answer converts decimal floating point numbers (represented as strings) into IEEE-754 floats and doubles with proper rounding.
As far as I remember, about the only issue in the code is that it may not handle the case where the exponent part is too big (doesn't fit into an integer), which should amount to returning either an error or INF.
Otherwise, it should give you a good idea of what to do (if you have any idea at all of what you're doing:).
As already said, it's difficult, you need extra precision, etc...
But if you have restricted inputs, and want to know whether you can still correctly convert these restricted decimals to binary with a semi-naive algorithm and standard IEEE 754 operations, you might be interested in my answer to
How to manually parse a floating point number from a string
I have a string representing a rational number.
I want to convert the string to a float with strtof(nptr, &endptr)
The problem is that, e.g., the string "1.0000000000000000000001" will be converted to 1.0 without raising any flags (IIRC).
Therefore my question: How does one catch this precision loss?
How does one catch this precision loss?
One doesn't, at least not with anything in the standard library; none of the strto* conversion functions will tell you if the value cannot be represented exactly.
Edit
I know that's not terribly helpful, but it means you'll have to go outside anything in the standard library. You'll either have to write your own conversion routines that somehow keep track of precision loss (I have no idea how you would implement this), or you'll have to go with some arbitrary-precision library like GMP, or you'll have to implement your own version of binary-coded decimal and hand-hack your own API to assign, compare, and manipulate BCD values.
C just doesn't give you the tools needed to do that kind of analysis.
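One partial workaround, offered only as a heuristic sketch (lost_precision is a hypothetical helper): parse at two precisions and compare. Disagreement proves the float conversion lost information, but agreement proves nothing, since the double conversion can be inexact too.

#include <stdio.h>
#include <stdlib.h>

static int lost_precision(const char *s) {
    float  f = strtof(s, NULL);
    double d = strtod(s, NULL);
    return (double)f != d;   /* differ => the float certainly lost bits */
}

int main(void) {
    printf("%d\n", lost_precision("0.1"));                      /* 1 */
    /* For the string from the question, both conversions yield exactly
       1.0, so this particular loss goes undetected: */
    printf("%d\n", lost_precision("1.0000000000000000000001")); /* 0 */
    return 0;
}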
Reading Here be dragons: advances in problems you didn’t even know you had I've noticed that they compare the new algorithm with the one used in glibc's printf:
Grisu3 is about 5 times faster than the algorithm used by printf in GNU libc
But at the same time I've failed to find any format specifier for printf which would automatically find the best number of decimal places to print. All the specifiers I tried have some strange defaults, like 6 digits after the decimal point for %f, or 2 after the point for %g, or 6 after the point for %e.
How do I actually make use of that algorithm implementation in glibc, mentioned in the article? Is there really any such implementation in glibc and is it even discussed in any way by the Standard?
This is the actual article. The blog post is referring to the results in section 7 (in other words, “they” are not comparing anything in the blog post, “they” are regurgitating the information from the actual article, omitting crucial details):
Implementations of Dragon4 or Grisu3 can be found in implementations of modern programming languages that specify this “minimal number of decimal digits” fashion (I recommend you avoid calling it “perfect”). Java uses this type of conversion to decimal in some contexts, as does Ruby. C is not one of the languages that specify “minimal number of decimal digits” conversion to decimal, so there is no reason for a compiler or for a libc to provide an implementation for Dragon4 or Grisu3.
There is no such thing as "best number of decimal places" because floating point numbers are not stored as decimal numbers. So you need to define what you mean by "best". If you want to print the numbers without any possible loss of information, C11 gives you the format specifier %a (except for non-normalized floating point numbers, where the behavior is unspecified).
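For example, a short illustration of the lossless options in standard C (DBL_DECIMAL_DIG is C11; output shown for IEEE 754 doubles):

#include <stdio.h>
#include <float.h>

int main(void) {
    double d = 0.1;
    printf("%a\n", d);                    /* 0x1.999999999999ap-4: exact */
    printf("%.*g\n", DBL_DECIMAL_DIG, d); /* 17 digits: 0.10000000000000001,
                                             enough to round-trip any double */
    return 0;
}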
The defaults from the C11 standard are 6 digits for %f and %e, and for %g it is:

Let P equal the precision if nonzero, 6 if the precision is omitted, or 1 if the precision is zero. Then, if a conversion with style E would have an exponent of X:
— if P > X ≥ −4, the conversion is with style f (or F) and precision P − (X + 1).
— otherwise, the conversion is with style e (or E) and precision P − 1.
If you want to use that algorithm, implement your own function for it. Or hope that glibc have implemented it in the past 5 years. Or just rethink if the performance of printing floating point numbers is really a problem you have.
Scientific notation is the common way to express a number with an explicit order of magnitude. First a nonzero digit, then a radix point, then a fractional part, and the exponent. In binary, there is only one possible nonzero digit.
Floating-point math involves an implicit first digit equal to one, then the mantissa bits "follow the radix point."
So why does frexp() put the radix point to the left of the implicit bit, and return a number in [0.5, 1) instead of scientific-notation-like [1, 2)? Is there some overflow to beware of?
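To make the convention concrete, a minimal example:

#include <stdio.h>
#include <math.h>

int main(void) {
    int e;
    double m = frexp(6.0, &e);
    /* 6.0 == 0.75 * 2^3, so m is in [0.5, 1) rather than [1, 2). */
    printf("m = %g, e = %d\n", m, e);   /* prints: m = 0.75, e = 3 */
    return 0;
}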
Effectively it subtracts one more than the bias value specified by IEEE 754 / ISO/IEC 60559. In hardware, this potentially trades an addition for an XOR. Alone, that seems like a pretty weak argument, considering that in many cases getting back to normal will require another floating-point operation.
The rationale says:
4.5.4.2 The frexp function
The functions frexp, ldexp, and modf are primitives used by the remainder of the library. There was some sentiment for dropping them for the same reasons that ecvt, fcvt, and gcvt were dropped, but their adherents rescued them for general use. Their use is problematic: on nonbinary architectures ldexp may lose precision, and frexp may be inefficient.
One can speculate that the “remainder of the library” was more convenient to write with frexp's convention, or was already traditionally written against this interface although it did not provide any benefit.
I know that this does not fully answer the question, but it did not quite fit inside a comment.
I should also point out that some of the choices made in the design of the C language predate IEEE 754. Perhaps the format returned by frexp made sense with the PDP-11's floating-point format(s), or any other architecture on which a function frexp was first introduced. EDIT: See also page 155 of the manual for one PDP-11 model.
Possible Duplicate:
Dealing with accuracy problems in floating-point numbers
I was quite surprised when I multiplied a float in C (with GCC 3.2) and it did not do what I expected. As a sample:
#include <stdio.h>

int main(void) {
    float nb = 3.11f;
    nb *= 10;
    printf("%f\n", nb);
    return 0;
}
Displays: 31.099998
I am curious about the way floats are implemented and why they produce this unexpected behavior.
First off, you can multiply floats. The problem you have is not the multiplication itself, but the original number you've used. Multiplication can lose some precision, but here the original number you multiplied already carried lost precision.
This is actually expected behavior. Floats are implemented using a binary representation, which means they can't exactly represent most decimal values.
See MSDN for more information.
You can also see in the description of float that it has 6-7 significant decimal digits of accuracy. In your example, if you round 31.099998 to 7 significant digits you get 31.1, so it still works as expected here.
The double type would of course be more accurate, but it still has rounding error due to its binary representation, while the number you wrote is decimal.
If you want complete accuracy for decimal numbers, you should use a decimal type. This type exists in languages like C#. http://msdn.microsoft.com/en-us/library/system.decimal.aspx
You can also use a rational-number representation: two integers give you complete accuracy, as long as the number can be represented as a ratio of two integers; see the sketch below.
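A minimal sketch of that idea (the rational type and rat_mul are hypothetical, with no overflow handling or fraction reduction):

#include <stdio.h>

typedef struct { long num; long den; } rational;

/* Multiply two rationals; no overflow checks, no gcd reduction. */
rational rat_mul(rational a, rational b) {
    rational r = { a.num * b.num, a.den * b.den };
    return r;
}

int main(void) {
    rational nb  = { 311, 100 };   /* 3.11, exactly */
    rational ten = { 10, 1 };
    rational r   = rat_mul(nb, ten);
    printf("%ld/%ld\n", r.num, r.den);   /* 3110/100, i.e. 31.1 exactly */
    return 0;
}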
This is working as expected. Computers have finite precision: a float stores only a fixed number of binary digits, so values such as 3.11 can only be approximated. This leads to floating-point inaccuracies.
The Floating point wikipedia page goes into far more detail on the representation and resulting accuracy problems than I could here :)
Interesting real-world side-note: this is partly why a lot of money calculations are done using integers (cents) - don't let the computer lose money with lack of precision! I want my $0.00001!
The number 3.11 cannot be represented in binary. The closest you can get with 24 significant bits is 11.0001110000101000111101, which works out to 3.1099998950958251953125 in decimal.
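On an IEEE 754 system you can see this stored value directly with the %a (hexadecimal floating-point) conversion, available since C99:

#include <stdio.h>

int main(void) {
    /* The float nearest 3.11, promoted to double by printf.  On a
       typical IEEE 754 system this prints 0x1.8e147ap+1, which is
       1.10001110000101000111101 (binary) * 2^1. */
    printf("%a\n", 3.11f);
    printf("%.25f\n", 3.11f);   /* 3.1099998950958251953125000 */
    return 0;
}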
If your number 3.11 is supposed to represent a monetary amount, then you need to use a decimal representation.
In the Python communities we often see people surprised at this, so there are well-tested-and-debugged FAQs and tutorial sections on the issue (of course they're phrased in terms of Python, not C, but since Python delegates float arithmetic to the underlying C and hardware anyway, all the descriptions of float's mechanics still apply).
It's not the multiplication's fault, of course -- remove the statement where you multiply nb and you'll see similar issues anyway.
From the Wikipedia article:

The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
Floating-point numbers are not exact for decimal fractions because they use base 2 (binary: either 0 or 1) instead of base 10. Converting from base 10 to base 2, as many have stated above, causes rounding and precision issues.
Yesterday I asked a floating point question, and I have another one. I am doing some computations where I use the results of the math.h (C language) sine, cosine and tangent functions.
One of the developers muttered that you have to be careful of the return values of these functions and I should not make assumptions on the return values of the gcc math functions. I am not trying to start a discussion but I really want to know what I need to watch out for when doing computations with the standard math functions.
You should not assume that the values returned will be consistent to high degrees of precision between different compiler/stdlib versions.
That's about it.
You should not expect sin(PI/6) to be equal to cos(PI/3), for example. Nor should you expect asin(sin(x)) to be equal to x, even when x lies in [−π/2, π/2], where the identity holds mathematically. They will be close, but might not be equal.
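A small demonstration of the kind of discrepancy to expect (whether the two values actually differ depends on the platform's libm):

#include <stdio.h>
#include <math.h>

int main(void) {
    double pi = acos(-1.0);   /* closest double to pi */
    double a = sin(pi / 6.0);
    double b = cos(pi / 3.0);
    /* Mathematically both are 0.5, but the computed results can differ
       in the last bits, so == may be false. */
    printf("sin(pi/6) = %.17g\ncos(pi/3) = %.17g\nequal: %d\n",
           a, b, a == b);
    return 0;
}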
Floating point is straightforward. Just always remember that there is an uncertainty component to all floating point operations and functions. It is usually modelled as being random, even though it usually isn't, but if you treat it as random, you'll succeed in understanding your own code. For instance:
a=a/3*3;
This should be treated as if it was:
a=(a/3+error1)*3+error2;
If you want an estimate of the size of the errors, you need to dig into each operation/function to find out. Different compilers, parameter choices, etc. will yield different values. For instance, 0.09 − 0.089999 on a system with 5 digits of precision will yield an error somewhere between −0.000001 and 0.000001; this error is comparable in size to the actual result.
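For instance, the cancellation effect is easy to reproduce in single precision; a minimal sketch:

#include <stdio.h>

int main(void) {
    /* The true difference is 0.000001, but each operand carries a
       representation error of up to about 4e-9, which is large relative
       to that tiny difference, so only the first few digits of the
       result are meaningful. */
    float x = 0.09f;
    float y = 0.089999f;
    printf("%.9g\n", x - y);
    return 0;
}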
If you want to learn how to do floating point as precisely as possible, it's a study of its own.
The problem isn't with the standard math functions, so much as the nature of floating point arithmetic.
Very short version: don't compare two floating point numbers for equality, even with obvious, trivial identities like 10 == 10 / 3.0 * 3.0 or tan(x) == sin(x) / cos(x).
You should take care about precision: see Structure of a floating-point number.
Are you on a 32-bit or a 64-bit platform?
You should read the IEEE Standard for Binary Floating-Point Arithmetic.
There are some interesting libraries, such as GMP or MPFR.
You should learn about comparing floating-point numbers.
etc...
Agreed with all of the responses that say you should not compare for equality. What you can do, however, is check if the numbers are close enough, like so:
if (fabs(numberA - numberB) < CLOSE_ENOUGH)  /* fabs from <math.h>; abs() would truncate to int */
{
    // Equal for all intents and purposes
}
Where CLOSE_ENOUGH is some suitably small floating-point value.
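For example, a minimal sketch using the classic 0.1 + 0.2 case (the tolerance of 1e-9 is illustrative; the right value depends on the magnitude and provenance of your data):

#include <stdio.h>
#include <math.h>

#define CLOSE_ENOUGH 1e-9

int main(void) {
    double a = 0.1 + 0.2;   /* 0.30000000000000004... */
    double b = 0.3;
    printf("a == b: %d\n", a == b);                      /* 0 */
    printf("close:  %d\n", fabs(a - b) < CLOSE_ENOUGH);  /* 1 */
    return 0;
}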