IEEE 754 to decimal in C language

I'm looking for the best way to convert a float to its decimal representation in C. For example: the user enters a number in IEEE 754 binary form (1 1111111 10101...) and the program has to return the decimal representation (e.g. 25.6).
I've tried masks and bitwise operations, but I haven't got any sensible result.

I believe the following is performing the operation you describe:
I use the int as an intermediate representation because it has the same number of bits as the float (on my machine), and it allowed easy conversion from the binary string.
#include <stdio.h>
union {
int i;
float f;
} myunion;
int binstr2int(char *s)
{
int rc;
for (rc = 0; '\0' != *s; s++) {
if ('1' == *s) {
rc = (rc * 2) + 1;
} else if ('0' == *s) {
rc *= 2;
}
}
return rc;
}
int main(void) {
// the input binary string (4 bytes)
char * input = "11000000110110011001100110011010";
float *output;
// convert to int, sizeof(int) == sizeof(float) == 4
int converted = binstr2int(input);
// strat 1: point memory of float at the int
output = (float*)&converted; // cast to suppress warning
printf("%f\n", *output); // -6.8
// strat 2: use a union to share memory
myunion.i = converted;
printf("%f\n", myunion.f); // -6.8
return 0;
}
As @DanielKamilKozar points out, the correct type for that int is uint32_t. However, that would require including <stdint.h>.
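For comparison, the same idea with the fixed-width type might look like the sketch below. It assumes float is a 32-bit IEEE 754 type, and `bits_to_float` is a hypothetical helper name that does not appear in the answer above. It accumulates into a uint32_t, so a leading 1 bit cannot overflow a signed int, and it copies the bytes with memcpy instead of casting pointers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: parse a string of '0'/'1' characters (any other
   character, such as a space, is skipped) into a 32-bit pattern and
   reinterpret that pattern as a float.
   Assumes float is 32-bit IEEE 754. */
float bits_to_float(const char *s)
{
    uint32_t bits = 0;
    float f;

    for (; *s != '\0'; s++) {
        if (*s == '0' || *s == '1')
            bits = (bits << 1) | (uint32_t)(*s - '0');
    }
    memcpy(&f, &bits, sizeof f);   /* copy the bytes; no pointer cast */
    return f;
}
```

Calling `bits_to_float("11000000110110011001100110011010")` should give the same -6.8 as the program above.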

Related

How would I produce an integer from a float in the sense of removing the decimal point, despite floating-point precision errors?

In C, how can I produce, for example 314159 from 3.14159 or 11 from 1.1 floats? I may not use #include at all, and I am not allowed to use library functions. It must be completely cross platform, and fit in a single function.
I tried this:
while (Number-(int)Number) {
Number *= 10;
}
and this:
Number *= 10e6;
and floating-point precision errors get in my way. How can I do this? How can I accurately transform all digits in a float into an integer?
In response to a comment, they are a float argument to a function:
char *FloatToString(char *Dest, float Number, register unsigned char Base) {
if (Base < 2 || Base > 36 || !Dest) {
return (char *)0;
}
char *const RDest = Dest;
if (Number < 0) {
Number = -Number;
*Dest = '-';
Dest++;
}
register unsigned char WholeDigits = 1;
for (register unsigned int T = (int)Number/Base; T; T /= Base) {
WholeDigits++;
}
Dest[WholeDigits] = '.';
// I need to now effectively "delete" the decimal point to further process it. Don't answer how to convert a float to a string, answer the title.
return RDest;
}
The essential problem you have is that floating point numbers can't represent your example numbers, so your input is always going to be slightly different. So if you accurately produce output, it will be different from what you expect as the input numbers are different from what you think they are.
If you don't have to worry about very large numbers, you can do this most easily by converting to a long:
v = v - (long)v; // remove the integer part
int frac = (int)(v * 100000);
will give you the 5 digits after the decimal point. The problem with this is that it gives undefined behavior if the initial value is too large to be converted to a long. You might also want to round differently (converting to int truncates toward zero): if you want the closest value rather than the leading 5 digits of the fraction, you can use (int)(v * 100000 + (v > 0 ? 0.5 : -0.5))
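As a rough sketch of the rounding variant just described, the function below returns the first five fractional digits of a value, rounded to nearest. The name `frac5` is made up for illustration. It uses no library calls, and it assumes the magnitude of the value is small enough to convert to long.

```c
/* Hypothetical helper: the five digits after the decimal point,
   rounded to nearest, returned as a non-negative integer.
   Assumes |v| is small enough to be converted to long. */
long frac5(double v)
{
    if (v < 0)
        v = -v;                        /* work on the magnitude */
    v = v - (long)v;                   /* remove the integer part */
    return (long)(v * 100000 + 0.5);   /* round instead of truncating */
}
```

Note that a fraction very close to 1 (such as 0.999999) rounds up to 100000, which has six digits, so a caller that needs exactly five digits has to handle that carry.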
New version :
#include <stdio.h>
int main()
{
double x;
int i;
char s[10];
x = 9999.12504;
x = (x-(int)x);
sprintf(s,"%0.5g\n",x);
sscanf((s+2),"%d",&i);
printf("%d",i);
return 0;
}
Old version
#include <stdio.h>
int main()
{
float x;
int i;
x = -3.14159;
x = (x-(int)x);
if (x>=0)
i = 100000*x;
else
i = -100000*x;
printf("%d",i);
return 0;
}
#include <stdio.h>
#include <stdint.h>
#include <limits.h>
int main(void) {
double t = 0.12;
unsigned long x = 0;
t = (t<0)? -t : t; // To handle negative numbers.
for(t = t-(int)t; x < ULONG_MAX/10; t = 10*t-(int)(10*t))
{
x = 10*x+(int)(10*t);
}
printf("%lu\n", x);
return 0;
}
Output:
11999999999999999644
I feel like you should use modulo to get the decimal portion, convert it to a string, count the number of characters, and use that to multiply your remainder before casting it to an int.

Converting floating point to unsigned int while preserving order

I have found a lot of answers on SO focusing on converting float to int.
I am manipulating only positive floating point values.
One simple method I have been using is this:
unsigned int float2ui(float arg0) {
float f = arg0;
unsigned int r = *(unsigned int*)&f;
return r;
}
The above code works well yet it fails to preserve the numeric order.
By order I mean this:
float f1 ...;
float f2 ...;
assert( ( (f1 >= f2) && (float2ui(f1) >= float2ui(f2)) ) ||
( (f1 < f2) && (float2ui(f1) < float2ui(f2)) ));
I have tried to use unions with the same results.
Any idea?
I use Homebrew gcc 5.3.0.
The code you're using, as written, has undefined behavior. If you want to access the representation of floats semi-portably (implementation-defined, well-defined assuming IEEE 754 and that float and integer endianness match), you should do:
#include <stdint.h>
#include <string.h>
uint32_t float2ui(float f){
uint32_t r;
memcpy(&r, &f, sizeof r);
return r;
}
For non-negative values, this mapping between floating point values and representation is order-preserving. If you think you're seeing it fail to preserve order, we'll need to see exactly what values you think are a counterexample.
If f1 and f2 are floating points, and f1 <= f2, and (int)f1 and (int)f2 are valid conversions, then (int)f1 <= (int)f2.
In other words, a truncation to an integral type never swaps an order round.
You could replace float2ui with simply (int)arg0, having checked the float is in the bounds of an int.
Note that the behaviour of float to int and float to unsigned is undefined if the truncated float value is out of the range for the type.
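A range check along those lines could be sketched as follows. `float_to_int_checked` is a hypothetical name, and the sketch assumes int is a two's complement type, so that INT_MIN is a power of two and converts to float exactly.

```c
#include <limits.h>

/* Hypothetical helper: truncate f toward zero and store it in *out.
   Returns 1 on success, 0 if f is NaN or outside the range of int.
   Assumes int is two's complement, so (float)INT_MIN is exact. */
int float_to_int_checked(float f, int *out)
{
    const float lo = (float)INT_MIN;   /* smallest valid value, e.g. -2^31 */
    const float hi = -lo;              /* first value that is too large */

    if (!(f >= lo && f < hi))          /* the negated test also rejects NaN */
        return 0;
    *out = (int)f;
    return 1;
}
```

The upper bound is written as -(float)INT_MIN rather than (float)INT_MAX because INT_MAX is usually not exactly representable as a float and would round up to a value that is already out of range.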
Your current code, which reinterprets the float's memory as int memory, has undefined behaviour. Even type-punning through a union will give you implementation-defined results; note in particular that sizeof(int) isn't necessarily the same as sizeof(float).
If you are using an IEEE754 single-precision float, a 32 bit 2's complement int with no trap representation, a positive value for conversion, consistent endianness, and some allowances for the various patterns represented by NaN and +-Inf, then the transformation effected by a type pun is order preserving.
Extracting the bits from a float using a union should work. There is some discussion about whether the C standard actually supports this. But whatever the standard says, gcc seems to support it, and I would expect there is too much existing code that relies on it for compilers to remove support.
There are some things you must be aware of when putting a float in an int and keeping order.
- Funny values like NaN do not have any order to keep.
- Floats are stored as a sign bit and a magnitude, while ints are two's complement (assuming a sane architecture). So for negative values, you must flip all the bits except the sign bit.
- If float and int do not have the same endianness on your architecture, you must also convert the endianness.
Here is my implementation, tested with gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0 on x64
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
union ff_t
{
float f;
unsigned char a[4];
int i;
};
int same_endianess = 0;
void
swap_endianess(union ff_t *ff)
{
if (!same_endianess)
{
unsigned char tmp;
tmp = ff->a[0];
ff->a[0] = ff->a[3];
ff->a[3] = tmp;
tmp = ff->a[1];
ff->a[1] = ff->a[2];
ff->a[2] = tmp;
}
}
void
test_endianess()
{
union ff_t ff = { ff.f = 1 };
if (ff.i == 0x3f800000)
same_endianess = 1;
else if (ff.i == 0x803f)
same_endianess = 0;
else
{
fprintf(stderr, "Architecture has some weird endianess");
exit(1);
}
}
float
random_float()
{
float f = random();
f -= RAND_MAX/2;
return f;
}
int
f2i(float f)
{
union ff_t ff = { .f = f };
swap_endianess(&ff);
if (ff.i >= 0)
return ff.i;
return ff.i ^ 0x7fffffff; /* negative: flip all bits except the sign bit */
}
float
i2f(int i)
{
union ff_t ff;
if (i >= 0)
ff.i = i;
else
ff.i = i ^ 0x7fffffff; /* undo the flip */
swap_endianess(&ff);
return ff.f;
}
int
main()
{
/* Test if floats and ints uses the same endianess */
test_endianess();
for (int n = 0; n < 10000; n++)
{
float f1 = random_float();
int i1 = f2i(f1);
float f2 = random_float();
int i2 = f2i(f2);
printf("\n");
printf("0x%08x, %f\n", i1, f1);
printf("0x%08x, %f\n", i2, f2);
assert ( f1 == i2f(i1));
assert ( f2 == i2f(i2));
assert ( (f1 <= f2) == (i1 <= i2));
}
}
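The same sign-and-magnitude observation can also be sketched with unsigned keys, which is a different technique from the signed-int program above. `float_key` is a hypothetical name; the sketch assumes 32-bit IEEE 754 floats that share their byte order with uint32_t, and it makes no promise about NaN.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a float to a uint32_t key such that
   a < b implies float_key(a) < float_key(b) for all non-NaN values.
   Assumes 32-bit IEEE 754 floats with the same byte order as uint32_t.
   Note that -0.0f gets a smaller key than +0.0f, even though the two
   compare equal as floats. */
uint32_t float_key(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    if (u & 0x80000000u)
        return ~u;                /* negative: flip every bit */
    return u | 0x80000000u;       /* non-negative: set the top bit */
}
```

Flipping every bit of a negative value reverses the order of the magnitudes and clears the top bit, so all negative floats land below all non-negative ones.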

<math.h> pow() giving wrong result

This is from Google's Code Jam, practice problem "All your base".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long long pow_longlong(int digit, int raiseto)
{
if (raiseto == 0) return 1;
else return digit * pow_longlong(digit, raiseto - 1);
}
long long base10_with_map(int base, char* instr, char* digits)
{
if (base < 2) base = 2;
long long result = 0;
int len = strlen(instr);
int i = 0;
while (len--)
result += digits[instr[len]] * pow_longlong(base, i++);
return result;
}
long long test(char* in)
{
char appear[256];
int i;
int len = strlen(in);
int hold = 0;
for (i = 0; i < 256; i++) appear[i] = '\xFF';
for (i = 0; i < len; i++)
if (appear[in[i]] == '\xFF')
{
if (hold == 0) { appear[in[i]] = 1; hold++; }
else if (hold == 1) { appear[in[i]] = 0; hold++; }
else appear[in[i]] = hold++;
}
return base10_with_map(hold, in, appear);
}
int main(int argc, char* argv[])
{
if (argc < 2)
{
printf("Usage: %s <input-file> \n", argv[0]); return 1;
}
char buf[100];
int a, i;
FILE* f = fopen(argv[1], "r");
fscanf(f, "%d", &a);
long long result;
for (i = 1; i <= a; i++)
{
fscanf(f, "%s", buf);
result = test(buf);
printf("Case #%d: %lld\n", i, result);
}
return 0;
}
This works as intended and produces the correct result for the problem. But if I replace my own pow_longlong() with pow() from math.h, some calculations differ.
What is the reason for this? Just curious.
Edits:
- No overflow, plain long is enough to store the values, long long is just overkill
- Of course I include math.h
- In example: test("wontyouplaywithme") with pow_longlong returns 674293938766347782 (right) and with math.h 674293938766347904 (wrong)
Sorry that I won't go through your example and your intermediary function; the issue you're having occurs due to double being insufficient, not the long long. It is just that the number grows too large, causing it to require more and more precision towards the end, more than double can safely represent.
Here, try this really simple programme out, or just trust in the output I append to it to see what I mean:
#include <stdio.h>
int main( ){
double a;
long long b;
a = 674293938766347782.0;
b = a;
printf( "%f\n", a );
printf( "%lld", b );
getchar( );
return 0;
}
/*
Output:
674293938766347780.000000
674293938766347776
*/
You see, the double may have 8 bytes, just as much as the long long has, but it is designed so that it would also be able to hold non-integral values, which makes it less precise than long long can get in some cases like this one.
I don't know the exact specifics, but according to MSDN its range is from -1.7e308 to +1.7e308, with (probably just on average) 15 digits of precision.
So, if you are going to work with positive integers only, stick with your function. If you want to have an optimized version, check this one out: https://stackoverflow.com/a/101613/2736228
It makes use of the fact that, for example, while calculating x to the power 8, you can get away with 3 operations:
...
result = x * x; // x^2
result = result * result; // (x^2)^2 = x^4
result = result * result; // (x^4)^2 = x^8
...
Instead of dealing with 7 operations, multiplying them one by one.
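The squaring trick generalises to any exponent by looking at the exponent one bit at a time. The sketch below is one way to write it, not the code from the linked answer; `ipow` is a hypothetical name, and the caller is assumed to make sure the result fits in a long long.

```c
/* Hypothetical helper: base raised to exp using exponentiation by
   squaring. Pure integer arithmetic; the caller must make sure the
   result fits in a long long, because signed overflow is undefined. */
long long ipow(long long base, unsigned int exp)
{
    long long result = 1;

    while (exp > 0) {
        if (exp & 1u)
            result *= base;    /* this bit of the exponent is set */
        exp >>= 1;
        if (exp > 0)
            base *= base;      /* square only if another bit remains */
    }
    return result;
}
```

Because everything stays in integer arithmetic, results such as ipow(10, 18) are exact, which is not guaranteed for the double returned by pow().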
pow (see reference) is not defined for integers, but only for floating point numbers. If you call pow with int as an argument the result will be a double.
You can in general not assume that the result of pow will be exactly the same as if you would use pure integer math as in the function pow_longlong.
Citation from wikipedia about double precision floating point numbers:
Between 2^52=4,503,599,627,370,496 and 2^53=9,007,199,254,740,992 the
representable numbers are exactly the integers. For the next range,
from 2^53 to 2^54, everything is multiplied by 2, so the representable
numbers are the even ones, etc.
So you get inaccurate results with pow if the result would be bigger than 2^53.
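The 2^53 limit is easy to check with a round trip through double. This is only a sketch: `fits_double_exactly` is a hypothetical name, double is assumed to be IEEE 754 binary64, and the argument is assumed to stay well below LLONG_MAX so that converting back is defined.

```c
/* Hypothetical helper: returns 1 if n survives a round trip through
   double unchanged, 0 otherwise.
   Assumes double is IEEE 754 binary64 and that n is far enough from
   LLONG_MAX that converting the double back to long long is defined. */
int fits_double_exactly(long long n)
{
    volatile double d = (double)n;   /* volatile: force a real 64-bit store */
    return (long long)d == n;
}
```

With these assumptions, 9007199254740992 (2^53) and 9007199254740994 pass, while 9007199254740993 and the expected answer from the question, 674293938766347782, do not.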

Counting the number of digits before and after the decimal point in C

I need to write a C program that will compare the number of digits before decimal point and after the decimal point and make sure they are equal.
How can I count how many powers of ten we have before and after the decimal point?
Here is what I have so far:
void main()
{
is_equal(6757.658);
}
INT is_equal(double x)
{
int digits = 0;
while (x) {
x /= 10;
digits++;
}
printf("%d ",digits);
}
Is there a better way to do this?
You do not seem to know the binary representation of double/float variables, as @AProgrammer suggests.
The job is impossible if you use float/double. You can use a string for the job instead,
something like below. NOTE: it's just a hint and not good style.
EDIT: disable cout since this is C
bool checkFloat(string); //the function checks whether the string have a float number format
void twoPart(string num)
{
if (!checkFloat(num))
return;
int i = 0;
int a = 0;//integer part
int b = 0;//fractional part
for(;i<num.length() && num[i]!='.'; i ++);
a = i;
b = num.length() - a - 1;
if(i == num.length())
b = 0;
// print the result here
//cout << a << " " << b << endl;
}
The above piece of code accepts number like 123, 123.456, .123
That's a bit tricky. IEEE floats can't represent most decimal fractions exactly. The number 6757.658 is represented as a binary fraction: 0x1a65a872b020c5×2^-40, which is exactly 6757.6580000000003565219230949878692626953125 (I think). I.e., your number actually has 40 decimal places.
The simplest workaround is to format it using something like sprintf(buf, "%.10g", x);, then read the parts back using int a, b; sscanf(buf, "%d.%d", &a, &b);. Alternatively, you could start with long long b = 1e10*(x - floor(x)) (the value can have up to ten digits, too many for a 32-bit int) and keep dividing b by 10 until it isn't a multiple of 10 (while (b%10 == 0) b /= 10;).
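A variation on the formatting approach is to count characters in the formatted string directly instead of reading the parts back as integers, which keeps leading zeros in the fraction. The sketch below assumes x is finite, that "%.10g" does not switch to exponent notation for it, and that the default "C" locale is in effect; `count_digits` is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format x with up to 10 significant digits and
   count the digits on each side of the decimal point.
   Assumes x is finite, is printed without an exponent, and that the
   decimal separator is '.' (the default "C" locale). */
void count_digits(double x, int *before, int *after)
{
    char buf[32];
    const char *p, *dot;

    snprintf(buf, sizeof buf, "%.10g", x);
    p = (buf[0] == '-') ? buf + 1 : buf;   /* skip the sign */
    dot = strchr(p, '.');
    if (dot == NULL) {                     /* no fractional part printed */
        *before = (int)strlen(p);
        *after = 0;
    } else {
        *before = (int)(dot - p);
        *after = (int)strlen(dot + 1);
    }
}
```

For 6757.658 this reports 4 digits before and 3 after the point, so the two counts can then be compared directly.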
3rd try:
Count the number of "digits" before and after a "."
Null is considered not equal to anything
I did not test this code; it might contain typos.
#include <string.h>
int is_equal(char *buffer)
{
char *temp;
int leftLen,rightLen;
temp = strtok(buffer,".");
if (temp == NULL) return 0;
leftLen = strlen(temp);
temp = strtok(NULL,"."); /* NULL continues scanning the same string */
if (temp == NULL) return 0;
rightLen = strlen(temp);
return (leftLen == rightLen);
}
Old stuff...
There are going to be lots of problems here: a floating-point value (double) in C is not always 100% accurate, and if you multiply or divide it, the digits will change.
The best way to solve this problem is to render the double to a string and then parse that string.
You can use sprintf to write the formatted double to a buffer.
OR
You can skip using a double all together and use a string to start with.
Thus building on Marcelo's answer:
Read the string from the user into a buffer called buf
Then parse it with a statement like sscanf(buf, "%d.%d", &a, &b);
buf is a char * or a char [], a and b are int. You test by saying a == b
void main()
{
is_equal("6757.658");
}
int is_equal(char *x)
{
int left,right;
sscanf(x, "%d.%d", &left, &right);
printf("Left digits: %d\n\r",left);
printf("Right digits: %d\n\r",right);
return (left == right);
}
#include <stdio.h>
#include <assert.h>
int is_equal(double x); /* declared before use */
int main(void)
{
int siz;
assert (sizeof siz == sizeof (float));
siz = is_equal(6757.658);
printf( "Size=%d\n", siz);
return 0;
}
int is_equal(double x)
{
int digits;
for (digits=0; x >= 1.0; digits++) {
x /= 10;
}
return digits;
}

How to print out each bit of a floating point number?

I am trying to print out each bit of a floating point number in C.
I am able to do it for integers with this:
int bit_return(int a, int loc)
// Bit returned at location
{
int buf = a & 1<<loc;
if (buf == 0)
return 0;
else
return 1;
}
The compiler wouldn't compile if I replaced int a with float a.
Is there a solution for this?
Copy and reformat your comment below
OK, for people who are not clear, I post my whole code here:
#include <stdio.h>
#include <stdlib.h>
int bit_return(int a, int loc) // Bit returned at location
{
int buf = a & 1<<loc;
if (buf == 0)
return 0;
else
return 1;
}
int main()
{
int a = 289642; // Represent 'a' in binary
int i = 0;
for (i = 31; i>=0; i--)
{
printf("%d",bit_return(a,i));
}
return 0;
}
Thanks to Pascal Cuoq for his comment. I finally figured out how to solve my own problem. Yes, just assign the address of the float number to a pointer to integer, then dereference it.
Here is my code solution:
#include <stdio.h>
// bit returned at location
int bit_return(int a, int loc)
{
int buf = a & 1<<loc;
if (buf == 0) return 0;
else return 1;
}
int main()
{
//11000010111011010100000000000000
// 1 sign bit | 8 exponent bit | 23 fraction bits
float a = -118.625;
int *b;
b = (int *)&a; // explicit cast; assumes sizeof(int) == sizeof(float)
int i;
for (i = 31; i >= 0; i--)
{
printf("%d",bit_return(*b,i));
}
return 0;
}
Cast the address of your float to the address of an int of the same size, and pass that int to your existing function.
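A variant of this that copies the bytes instead of casting the pointer is sketched below. It assumes float is 32 bits wide and shares its byte order with uint32_t; `float_to_bitstring` is a hypothetical name, not something from the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the 32 bits of f, most significant first,
   into out as '0'/'1' characters followed by a terminating '\0'.
   Assumes float is 32 bits wide and shares its byte order with uint32_t. */
void float_to_bitstring(float f, char out[33])
{
    uint32_t u;
    int i;

    memcpy(&u, &f, sizeof u);    /* copy the representation */
    for (i = 0; i < 32; i++)
        out[i] = ((u >> (31 - i)) & 1u) ? '1' : '0';
    out[32] = '\0';
}
```

For -118.625f this should produce 11000010111011010100000000000000, the pattern quoted in the question's comment.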
static void printme(void *c, size_t n)
{
unsigned char *t = c;
if (c == NULL)
return;
while (n > 0) {
--n;
printf("%02x", t[n]);
}
printf("\n");
}
void fpp(float f, double d)
{
printme(&f, sizeof f);
printme(&d, sizeof d);
}
A note on float parameters
Be sure you have the prototype for fpp() in scope when you call it or you will invoke an obscure K&R C vs ANSI C issue.
Update: binary output...
while (n > 0) {
int q;
--n;
for(q = 0x80; q; q >>= 1)
printf("%x", !!(t[n] & q));
}
The following code assumes floats and pointers are the same size, which is true on many systems:
float myfloat = 254940.4394f;
printf("0x%p", *(void**)(&myfloat));
In C language, the term "bit" refers to an element of binary positional representation of a number. Integral numbers in C use binary positional representation, which is why they have "bits". These are the bits you "see" by means of bitwise operators (logical and shifts). Floating-point numbers do not use that representation. Moreover, representation of floating-point numbers is not defined by the language specification. In other words, floating-point numbers in C do not have "bits", which is why you won't be able to access any of their "bits" by any legal means of the language, and which is why you can't apply any bitwise operators to floating-point objects.
Having said that, I'd suspect that you might be interested in physical bits representing a floating-point object. You can reinterpret the memory occupied by the floating-point object (or any other object) as an array of unsigned char elements and print the bits of each of the unsigned char objects. That will give you the map of all physical bits representing the object.
However, this won't be exactly equivalent to what you have in your code above. Your code above prints the bits of value representation of an integral object (i.e it is the logical bits I described above), while the memory reinterpretation approach will give you the bits of the object representation (i.e. the physical bits). But then again, floating-point numbers in C don't have logical bits by definition.
Added later: I feel that understanding the difference between the concepts of physical and logical bits might not be an easy task for some readers. As another example that might help to promote the understanding, I'd like to note that there's absolutely nothing that would preclude a perfectly compliant C implementation on ternary hardware, i.e. hardware that does not have physical binary bits at all. In such implementation bitwise operations would still work perfectly fine, they would still access binary bits, i.e. elements of [now only imaginary] binary positional representation of each integral number. That would be the logical bits I'm talking about above.
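The memory reinterpretation described above can be sketched for any object as follows. `object_to_hex` is a hypothetical name, and the sketch assumes 8-bit bytes so that two hex digits per byte are enough.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write the bytes of an object, in memory order,
   as two hex digits each into out, which must have room for
   2 * n + 1 characters. Assumes 8-bit bytes. */
void object_to_hex(const void *obj, size_t n, char *out)
{
    const unsigned char *bytes = obj;   /* any object may be read this way */
    size_t i;

    for (i = 0; i < n; i++)
        sprintf(out + 2 * i, "%02x", bytes[i]);
    out[2 * n] = '\0';
}
```

Because the bytes come out in memory order, the same float prints differently on little-endian and big-endian machines; that is exactly the difference between the object representation and the value.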
While from the comments it seems that outputting the bits of the internal representation may be what was wanted, here is code to do what the question seemed to literally ask for, without the lossy conversion to int some have proposed:
Outputting a floating point number in binary:
#include <stdio.h>
#include <stdlib.h>
void output_binary_fp_number(double arg)
{
double pow2;
if ( arg < 0 ) { putchar('-'); arg = -arg; }
if ( arg - arg != 0 ) {
printf("Inf");
}
else {
/* compare and subtract descending powers of two, printing a binary digit for each */
/* first figure out where to start */
for ( pow2 = 1; pow2 * 2 <= arg; pow2 *= 2 ) ;
while ( arg != 0 || pow2 >= 1 ) {
if ( pow2 == .5 ) putchar('.');
if ( arg < pow2 ) putchar('0');
else {
putchar('1');
arg -= pow2;
}
pow2 *= .5;
}
}
putchar('\n');
return;
}
void usage(char *progname) {
fprintf(stderr, "Usage: %s real-number\n", progname);
exit(EXIT_FAILURE);
}
int main(int argc, char **argv) {
double arg;
char *endp;
if ( argc != 2 ) usage(argv[0]);
arg = strtod(argv[1], &endp);
if ( endp == argv[1] || *endp ) usage(argv[0]);
output_binary_fp_number(arg);
return EXIT_SUCCESS;
}
If you want to use your bit_return function on a float, you can just cheat:
float f = 42.69;
for ....
bit_return((int) f, loc)
The (int) cast will make the compiler believe you're working with an integer, so bit_return will work.
This is essentially what Pascal was suggesting.
EDIT:
I stand corrected by Pascal. I think this will conform with his latest comment:
bit_return (*((int *) &f), loc)
hope I got it right that time.
Another alternative (with fewer parentheses) would be to use a union to cheat on the data type.
I have included code which produces hexadecimal output that I think may help you understand floating-point numbers. Here is an example:
double: 00 00 A4 0F 0D 4B 72 42 (1257096936000.000000) (+0x1.24B0D0FA40000 x 2^40)
From my code example below, it should become obvious to you how to output the bits. Cast the double's address to unsigned char * and output the bits of sizeof(double) chars.
Since I want to output the exponent and significand (and sign bit) of a floating-point number, my example code digs into the bits of the IEEE-754 standard representation for 64-bit 'double precision' floating point in radix 2. Therefore I do not use sizeof(double) other than to verify that the compiler and I agree that double means a 64-bit float.
If you would like to output the bits for a floating-point number of any type, do use sizeof(double) rather than 8.
void hexdump_ieee754_double_x86(double dbl)
{
LONGLONG ll = 0;
char * pch = (char *)&ll;
int i;
int exponent = 0;
assert(8 == sizeof(dbl));
// Extract the 11-bit exponent, and apply the 'bias' of 0x3FF.
exponent = (((((char *)&(dbl))[7] & 0x7F) << 4) + ((((char *)&(dbl))[6] & 0xF0) >> 4) & 0x7FF) - 0x3FF;
// Copy the 52-bit significand to an integer we will later display
for (i = 0; i < 6; i ++)
*pch++ = ((char *)&(dbl))[i];
*pch++ = ((char *)&(dbl))[6] & 0xF;
printf("double: %02X %02X %02X %02X %02X %02X %02X %02X (%f)",
((unsigned char *)&(dbl))[0],
((unsigned char *)&(dbl))[1],
((unsigned char *)&(dbl))[2],
((unsigned char *)&(dbl))[3],
((unsigned char *)&(dbl))[4],
((unsigned char *)&(dbl))[5],
((unsigned char *)&(dbl))[6],
((unsigned char *)&(dbl))[7],
dbl);
printf( "\t(%c0x1.%05X%08X x 2^%d)\n",
(((char *)&(dbl))[7] & 0x80) ? '-' : '+', // the sign bit is the top bit of byte 7
(DWORD)((ll & 0xFFFFFFFF00000000LL) >> 32),
(DWORD)(ll & 0xFFFFFFFFLL),
exponent);
}
Nota Bene: The significand is displayed as a hexadecimal fraction ("0x1.24B0D0FA40000") and the exponent is displayed as decimal ("40"). For me, this was an intuitive way to display the floating-point bits.
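The same three fields can also be pulled out with standard fixed-width types instead of byte indexing. The sketch below is an alternative to the function above, not a rewrite of it: the struct and function names are hypothetical, and it assumes double is IEEE 754 binary64 and shares its byte order with uint64_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of an IEEE 754 binary64
   value. */
struct double_fields {
    int sign;            /* 0 or 1 */
    int exponent;        /* unbiased; meaningful for normal numbers only */
    uint64_t fraction;   /* low 52 bits of the significand */
};

/* Assumes double is binary64 and shares its byte order with uint64_t. */
struct double_fields split_double(double d)
{
    struct double_fields r;
    uint64_t u;

    memcpy(&u, &d, sizeof u);
    r.sign = (int)(u >> 63);
    r.exponent = (int)((u >> 52) & 0x7FF) - 0x3FF;   /* remove the bias */
    r.fraction = u & ((UINT64_C(1) << 52) - 1);      /* keep the low 52 bits */
    return r;
}
```

For the example value 1257096936000.0 this yields sign 0, exponent 40 and fraction 0x24B0D0FA40000, matching the hexadecimal output shown earlier.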
Print the integer part, then a '.', then the fractional part.
float f = ...
int int_part = (int)floor(f);
unsigned int fraction_part = (unsigned int)floor((f - int_part) * pow(2.0, 32)); // 32 fractional bits do not fit in a signed int
You can then use your bit_return to print int_part and fraction_part. Bonus points for not printing leading and/or trailing zeros.
I think the best way to address this question is to use a union:
unsigned f2u(float f)
{
union floatToUnsiged{
float a;
unsigned b;
}test;
test.a = f;
return (test.b);
}

Resources