I'm using a function (borrowing code from: http://www.exploringbinary.com/converting-floating-point-numbers-to-binary-strings-in-c/) to convert a float into binary, stored in a char array. I need to be able to perform bitwise operations on the result, though, so I've been trying to find a way to take the string and convert it to an integer so that I can shift the bits around as needed. I've tried atoi(), but that seems to return -1.
Thus far, I have:
char binStringRaw[FP2BIN_STRING_MAX];
float myfloat;
printf("Enter a floating point number: ");
scanf("%f", &myfloat);
int castedFloat = (*((int*)&myfloat));
fp2bin(castedFloat, binStringRaw);
Where the input is "12.125", the output of binStringRaw is "10000010100001000000000000000000". However, attempting to perform a bitwise operation on this gives an error: "Invalid operands to binary expression ('char[1077]' and 'int')".
P.S. - I apologize if this is a simple question or if there are some general problems with my code. I'm very new to C programming coming from Python.
"castedFloat already is the binary representation of the float, as the cast-operation tells it to interpret the bits of myfloat as bits of an integer instead of a float. "
EDIT: Thanks to Eric Postpischil, who noted in the comments:
"the above is not guaranteed by the C standard. Dereferencing a
converted pointer is not fully specified by the standard. A proper way
to do this is to use a union: int x = (union { float f; int i; }) {
myfloat } .i;. (And one must still ensure that int and float are the
same size in the C implementation being used.)"
Bitwise operations are only defined for integer-type values, such as char, int, long, ..., which is why they fail when applied to the string (a char array).
btw,
int atoi(char*)
returns the integer-value of a number written inside that string, eg.
atoi("12")
will return an integer with value 12
If you want to convert the binary representation stored in a string, you have to set the integer bit by bit according to the chars. A function to do this could look like this:
unsigned long intFromBinString(const char* str){
    unsigned long ret=0; //initialize return value with zero (unsigned, so the top bit can be set safely)
    int i=0; //stores the current position in string
    while(str[i] != 0){ //in C, strings are NUL-terminated, so end of string is 0
        ret <<= 1; //another bit in string, so shift the result value left by one
        if(str[i] == '1') //if the new bit was 1, set the lowest bit with bitwise OR
            ret |= 0x01;
        i++; //increment position in string
    }
    return ret; //return result
}
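As a possible alternative to a hand-written loop, the standard strtoul function accepts a base argument, and base 2 parses a string of '0' and '1' characters directly. The following is only a sketch: the helper name bits_from_bin_string is made up for illustration, and it assumes the string holds at most 32 binary digits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a string of binary digits into a 32-bit
   pattern. Returns 0 if nothing could be parsed or if unparsed
   characters remain. Assumes at most 32 digits. */
uint32_t bits_from_bin_string(const char *str)
{
    char *end;
    unsigned long value = strtoul(str, &end, 2); /* base 2 */

    if (end == str || *end != '\0')
        return 0; /* nothing parsed, or trailing junk */
    return (uint32_t)value;
}
```

The result is an ordinary unsigned integer, so an expression such as bits_from_bin_string("101") << 1 is valid and evaluates to 10.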
The function fp2bin expects a double as its parameter. If you call it with castedFloat, the value (now interpreted as an integer) is implicitly converted to double and then passed on.
I assume you want to get a binary representation of the float, play some bitwise ops on it, and then pass it on.
In order to do that you have to cast it back to float, the reverse way you did before, so
int castedFloat = (*((int*)&myfloat));
{/*** some bitwise magic ***/}
float backcastedFloat = (*(float*)&castedFloat);
fp2bin(backcastedFloat, binStringRaw);
EDIT:(Thanks again, Eric):
union bothType { float f; int i; } both;
both.f = myfloat;
{/*** some bitwise magic on both.i ***/}
fp2bin(both.f, binStringRaw);
should work
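A hedged sketch of the whole round trip, using memcpy instead of the union (memcpy is another well-defined way to copy the bits). It assumes float is a 32-bit IEEE 754 value; the helper names float_to_bits, bits_to_float and flip_sign are invented here for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 value on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float is not 32 bits");

/* Copy the bit pattern of a float into an unsigned integer. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Copy a bit pattern back into a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Example of "bitwise magic": flip the sign bit (bit 31). */
float flip_sign(float f)
{
    return bits_to_float(float_to_bits(f) ^ UINT32_C(0x80000000));
}
```

For the input 12.125, float_to_bits should give 0x41420000 and flip_sign should give -12.125; the resulting float can then be passed to fp2bin.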
I'm trying to interface a board with a raspberry.
I have to read/write values to the board via Modbus, but I can't get floating point values into the form the board uses.
I'm using C, and Eclipse debug perspective to see the variable's value directly.
The board sends me 0x46C35000, which should be 25,000 in decimal, but Eclipse shows me 1.18720512e+009...
When I try on this website http://www.binaryconvert.com/convert_float.html?hexadecimal=46C35000 I obtain 25,000.
What's the problem?
For testing purposes I'm using this:
int main(){
while(1){ // To view easily the value in the debug perspective
float test = 0x46C35000;
printf("%f\n",test);
}
return 0;
}
Thanks!
When you do this:
float test = 0x46C35000;
You're setting the value to 0x46C35000 (decimal 1187205120), not the representation.
You can do what you want as follows:
union {
uint32_t i;
float f;
} u = { 0x46C35000 };
printf("f=%f\n", u.f);
This safely allows an unsigned 32-bit value to be interpreted as a float.
You’re confusing logical value and internal representation. Your assignment sets the value, which is thereafter 0x46C35000, i.e. 1187205120.
To set the internal representation of the floating point number you need to make a few assumptions about how floating point numbers are represented in memory. The assumptions on the website you’re using (IEEE 754, 32 bit) are fair on a general purpose computer though.
To change the internal representation, use memcpy to copy the raw bytes into the float:
// Ensure our assumptions are correct:
#if !defined(__STDC_IEC_559__) && !defined(__GCC_IEC_559)
# error Floating points might not be in IEEE 754/IEC 559 format!
#endif
_Static_assert(sizeof(float) == sizeof(uint32_t), "Floats are not 32 bit numbers");
float f;
uint32_t rep = 0x46C35000;
memcpy(&f, &rep, sizeof f);
printf("%f\n", f);
Output: 25000.000000.
(This requires the header stdint.h for uint32_t, and string.h for memcpy.)
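Since the question mentions both reading and writing, here is a sketch of the same memcpy technique in both directions, with the 32-bit pattern split into two 16-bit words because Modbus registers are 16 bits wide. The function names and the high-word-first order are assumptions for illustration only; the board's documentation decides the actual word order.

```c
#include <stdint.h>
#include <string.h>

/* Assumptions: float is 32-bit IEEE 754, and regs[0] holds the high
   word (the real word order depends on the device). */

/* Rebuild a float from two 16-bit register values. */
float float_from_regs(const uint16_t regs[2])
{
    uint32_t rep = ((uint32_t)regs[0] << 16) | regs[1];
    float f;
    memcpy(&f, &rep, sizeof f);
    return f;
}

/* Split a float into two 16-bit register values. */
void float_to_regs(float f, uint16_t regs[2])
{
    uint32_t rep;
    memcpy(&rep, &f, sizeof rep);
    regs[0] = (uint16_t)(rep >> 16);
    regs[1] = (uint16_t)(rep & 0xFFFFu);
}
```

With the registers 0x46C3 and 0x5000, float_from_regs should return 25000.0, and float_to_regs should produce the same two words from 25000.0.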
The constant 0x46C35000 being assigned to a float will implicitly convert the int value 1187205120 into a float, rather than directly overlay the bits into the IEEE-754 floating point format.
I normally use a union for this sort of thing:
#include <stdio.h>
#include <stdint.h>
typedef union
{
float f;
uint32_t i;
} FU;
int main()
{
FU foo;
foo.f = 25000.0;
printf("%.8X\n", foo.i);
foo.i = 0x46C35000;
printf("%f\n", foo.f);
return 0;
}
Output:
46C35000
25000.000000
You can understand how data are represented in memory when you access them through their address:
#include <stdio.h>
int main()
{
float f25000; // totally unused, has exactly same size as `int'
int i = 0x46C35000; // put binary value of 0x46C35000 into `int' (4 bytes representation of integer)
float *faddr; // pointer (address) to float
faddr = (float*)&i; // put address of `i' into `faddr' so `faddr' points to `i' in memory
printf("f=%f\n", *faddr); // print value pointed to by `faddr'
return 0;
}
and the result:
$ gcc -of25000 f25000.c; ./f25000
f=25000.000000
What it does is:
put 0x46C35000 into int i
copy the address of i into faddr, so faddr points to the same bytes in memory but treats them as a float
print value pointed by faddr; treat it as float type
you get your 25000.0.
I have to round off a float to decimal. After rounding off, I should convert this number to hexadecimal. I think I got the round off part okay with round()
Is there a way to convert a decimal to hexadecimal in C, and store it into a part of an array?
I'm thinking of the concept on how printf() converts the decimal to hex.
What I have in mind is something like this:
float k = 10.123;
int a;
unsigned char var_store[1];
unsigned char array_t[3];
array_t[0] = 0x01;
array_t[1] = 0x04;
a = round(k);
var_store[0] = sprintf("%x",a);
array_t[2] = var_store[0];
but I'm having a
warning passing argument 2 of 'sprintf' makes pointer from integer
without a cast
I'm not sure if this is the way to do it. But I think this is relatively straight forward. Thanks
People tend to get very confused with the term "hexadecimal". It should mean "the number as a human-readable ASCII string with digits 0-F", but because raw binary data is typically presented in hex, people misuse it to mean the binary data itself.
Whilst of course you can write a function that converts a decimal number, expressed as a string, to a hexadecimal number, expressed as another string, it's a fiddly and, except as a learning exercise, pointless thing to do. sprintf converts C variables to human-readable strings for you. To get decimal, pass "%d"; to get hex, pass "%x". You also need to pass a destination buffer, like this:
char destination[256];
int a = 123;
sprintf(destination, "number is decimal %d hex %x", a, a);
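If the goal is to store the rounded number itself in the byte array (as in the question) rather than its text, no conversion is needed, because hexadecimal is only a way of writing the value. A minimal sketch, assuming the value lies between 0 and 255; store_rounded and byte_to_hex_text are made-up names:

```c
#include <stdio.h>

/* Round k to the nearest integer and store it as one raw byte at
   array_t[index]. Returns 0 on success, -1 if k is outside 0..255
   (or is not a number). */
int store_rounded(float k, unsigned char *array_t, size_t index)
{
    if (!(k >= 0.0f && k <= 255.0f))
        return -1;
    /* Round by adding 0.5 and truncating, which is enough for this
       range; round() from math.h would work as well. */
    array_t[index] = (unsigned char)(k + 0.5f);
    return 0;
}

/* Write the two-digit hex text of a byte into buf, e.g. "0a" for 10.
   Only needed when a human-readable string is wanted. */
void byte_to_hex_text(unsigned char byte, char *buf, size_t size)
{
    snprintf(buf, size, "%02x", byte);
}
```

For k = 10.123 the stored byte is 10, which a debugger or "%x" will display as 0x0A; the byte in memory is the same either way.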
I do not recall a library function for this.
But the traditional mathematical way is below. If you want, you can wrap it in a user-defined function.
#include <iostream>
using namespace std;
int main()
{
long int decimalNumber = 2567888;
char hexadecimalNumber[100];
int temp;
int i =1;
while(decimalNumber!=0)
{
temp = decimalNumber % 16;
//To convert integer into character
if( temp < 10)
temp =temp + 48;
else
temp = temp + 55;
hexadecimalNumber[i++]= temp;
decimalNumber = decimalNumber / 16;
}
for(int j = i -1 ;j> 0;j--)
cout<<hexadecimalNumber[j];
}
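Since the question asks about C, the same repeated-division idea can be sketched as a plain C function. The name to_hex_string is hypothetical; unlike the loop above, this version also produces "0" for an input of zero.

```c
#include <stddef.h>

/* Write the hexadecimal digits of value into out (most significant
   digit first) and return the number of digits. out must have room
   for 2 * sizeof(unsigned long) digits plus the terminating '\0'. */
size_t to_hex_string(unsigned long value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[2 * sizeof(unsigned long)];
    size_t n = 0;

    /* Collect digits, least significant first. */
    do {
        tmp[n++] = digits[value % 16];
        value /= 16;
    } while (value != 0);

    /* Reverse them into the output buffer. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

For the value 2567888 used above, this should fill the buffer with "272ED0".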
I stumbled on one issue while I was implementing in C the given algorithm:
int getNumberOfAllFactors(int number) {
int counter = 0;
double sqrt_num = sqrt(number);
for (int i = 1; i <= sqrt_num; i++) {
if ( number % i == 0) {
counter = counter + 2;
}
}
if (number == sqrt_num * sqrt_num)
counter--;
return counter;
}
The reason for the second condition is to correct for perfect squares (e.g. 36 = 6 * 6); however, it does not avoid false positives like this one:
sqrt(325) = 18.027756377319946
18.027756377319946 * 18.027756377319946 = 325.0
So my questions are: how to avoid it and what is the best way in C language to figure out whether a double number has any digits after decimal point? Should I cast square root values from double to integers?
In your case, you could test it like this:
if (sqrt_num == (int)sqrt_num)
You should probably use the modf() family of functions:
#include <math.h>
double modf(double value, double *iptr);
The modf functions break the argument value into integral and fractional parts, each of
which has the same type and sign as the argument. They store the integral part (in
floating-point format) in the object pointed to by iptr.
This is more reliable than trying to use direct conversions to int because an int is typically a 32-bit number and a double can usually store far larger integer values (up to 53 bits worth) so you can run into errors unnecessarily. If you decide you must use a conversion to int and are working with double values, at least use long long for the conversion rather than int.
(The other members of the family are modff() which handles float and modfl() which handles long double.)
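Another way to sidestep the floating-point comparison, offered here as an alternative to modf() rather than as the method from the answers above, is to keep the whole loop in integer arithmetic so that no square root is needed. The name count_factors is illustrative, and the sketch assumes number is positive.

```c
/* Count all divisors of a positive number using integers only.
   Each divisor i with i * i <= number pairs with number / i; when the
   two are equal (a perfect square) the divisor is counted once. */
int count_factors(int number)
{
    int counter = 0;
    /* long long avoids overflow of i * i near INT_MAX */
    for (long long i = 1; i * i <= number; i++) {
        if (number % i == 0)
            counter += (i * i == number) ? 1 : 2;
    }
    return counter;
}
```

For 36 this counts 9 divisors (1, 2, 3, 4, 6, 9, 12, 18, 36), and non-squares are never mistaken for perfect squares because no rounding is involved.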
I saw the following piece of code in an opensource AAC decoder,
static void flt_round(float32_t *pf)
{
int32_t flg;
uint32_t tmp, tmp1, tmp2;
tmp = *(uint32_t*)pf;
flg = tmp & (uint32_t)0x00008000;
tmp &= (uint32_t)0xffff0000;
tmp1 = tmp;
/* round 1/2 lsb toward infinity */
if (flg)
{
tmp &= (uint32_t)0xff800000; /* extract exponent and sign */
tmp |= (uint32_t)0x00010000; /* insert 1 lsb */
tmp2 = tmp; /* add 1 lsb and elided one */
tmp &= (uint32_t)0xff800000; /* extract exponent and sign */
*pf = *(float32_t*)&tmp1 + *(float32_t*)&tmp2 - *(float32_t*)&tmp;
} else {
*pf = *(float32_t*)&tmp;
}
}
In that the line,
*pf = *(float32_t*)&tmp;
is same as,
*pf = (float32_t)tmp;
Isn't it?
Or is there a difference? Maybe in performance?
Thank you.
No, they're completely different. Say the value of tmp is 1. Their code will give *pf the value of whatever floating point number has the same binary representation as the integer 1. Your code would give it the floating point value 1.0!
This code is editing the value of a float knowing it is formatted using the standard IEEE 754 floating representation.
*(float32_t*)&tmp;
means: reinterpret the address of tmp as a pointer to a 32-bit float, then read the value it points to.
(float32_t)tmp;
means: convert the integer's numerical value to a 32-bit float, so the integer 32 becomes 32.0f.
Very different.
The first causes the bit pattern of tmp to be reinterpreted as a float.
The second causes the numerical value of tmp to be converted to float (within the accuracy that it can be represented including rounding).
Try this:
int main(void) {
int32_t n=1078530011;
float32_t f;
f=*(float32_t*)(&n);
printf("reinterpret the bit pattern of %d as float - f==%f\n",n,f);
f=(float32_t)n;
printf("cast the numerical value of %d as float - f==%f\n",n,f);
return 0;
}
Example output:
reinterpret the bit pattern of 1078530011 as float - f==3.141593
cast the numerical value of 1078530011 as float - f==1078530048.000000
It's like thinking that
const char* str="3568";
int a=*(int*)str;
int b=atoi(str);
Will assign a and b the same values.
First to answer the question, my_float = (float)my_int safely converts the integer to a float according to the rules of the standard (6.3.1.4).
When a value of integer type is converted to a real floating type, if
the value being converted can be represented exactly in the new type,
it is unchanged. If the value being converted is in the range of
values that can be represented but cannot be represented exactly, the
result is either the nearest higher or nearest lower representable
value, chosen in an implementation-defined manner. If the value being
converted is outside the range of values that can be represented, the
behavior is undefined.
my_float = *(float*)&my_int on the other hand, is a dirty trick, telling the program that the binary contents of the integer should be treated as if they were a float variable, with no concerns at all.
However, the person who wrote the dirty trick was probably not aware of it leading to undefined behavior for another reason: it violates the strict aliasing rule.
To fix this bug, you can either tell your compiler to behave in a non-standard, non-portable manner (for example gcc -fno-strict-aliasing), which I don't recommend,
or, preferably, rewrite the code so that it doesn't rely on undefined behavior. The best way is to use unions, for which strict aliasing doesn't apply, in the following manner:
typedef union
{
uint32_t as_int;
float32_t as_float;
} converter_t;
uint32_t value1, value2, value3; // do something with these variables
*pf = (converter_t){value1}.as_float +
(converter_t){value2}.as_float -
(converter_t){value3}.as_float;
Also it is good practice to add the following sanity check:
static_assert(sizeof(converter_t) == sizeof(uint32_t),
"Unexpected padding or wrong type sizes!");
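For comparison, here is a sketch of the question's flt_round() with every pointer cast replaced by a memcpy-based helper, which is another way to avoid the aliasing problem (it is a swapped-in technique, not the union approach shown above). Plain float stands in for float32_t, the helper names are invented, and the sketch assumes float is 32-bit IEEE 754.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32-bit IEEE 754 (stands in for float32_t). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Same steps as flt_round() from the question, without pointer casts. */
void flt_round_memcpy(float *pf)
{
    uint32_t tmp  = float_to_bits(*pf);
    uint32_t flg  = tmp & UINT32_C(0x00008000);  /* first dropped bit */
    uint32_t tmp1 = tmp & UINT32_C(0xffff0000);  /* truncated value */

    if (flg) {
        uint32_t base = tmp1 & UINT32_C(0xff800000); /* sign and exponent */
        uint32_t tmp2 = base | UINT32_C(0x00010000); /* base plus 1 lsb */
        *pf = bits_to_float(tmp1) + bits_to_float(tmp2) - bits_to_float(base);
    } else {
        *pf = bits_to_float(tmp1);
    }
}
```

As a quick check, a float with the bit pattern 0x3F808000 should come out as 0x3F810000 (rounded up), while 0x3F807FFF should come out as 0x3F800000 (truncated).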
I'm trying to write a code that converts a real number to a 64 bit floating point binary. In order to do this, the user inputs a real number (for example, 547.4242) and the program must output a 64 bit floating point binary.
My ideas:
The sign part is easy.
The program converts the integer part (547 for the previous example) and stores the result in an int variable. Then, the program converts the fractional part (.4242 for the previous example) and stores the result into an array (each position of the array stores '1' or '0').
This is where I'm stuck. Summarizing, I have: "Integer part = 1000100011" (type int) and "Fractional part = 0110110010011000010111110000011011110110100101000100" (array).
How can I proceed?
The following code determines the internal representation of a floating point number according to IEEE 754. It was written in the Turbo C++ IDE, but you can easily adapt it for another compiler.
#include<conio.h>
#include<stdio.h>
void decimal_to_binary(unsigned char);
union u
{
float f;
char c;
};
int main()
{
int i;
char*ptr;
union u a;
clrscr();
printf("ENTER THE FLOATING POINT NUMBER : \n");
scanf("%f",&a.f);
ptr=&a.c+sizeof(float);
for(i=0;i<sizeof(float);i++)
{
ptr--;
decimal_to_binary(*ptr);
}
getch();
return 0;
}
void decimal_to_binary(unsigned char n)
{
int arr[8];
int i;
//printf("n = %u ",n);
for(i=7;i>=0;i--)
{
if(n%2==0)
arr[i]=0;
else
arr[i]=1;
n/=2;
}
for(i=0;i<8;i++)
printf("%d",arr[i]);
printf(" ");
}
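The code above prints a float and walks the bytes from the highest address down, which gives the most significant byte first only on a little-endian machine. Since the question asks for 64 bits, here is a hedged sketch that copies a double into a uint64_t and extracts the bits with shifts, so byte order does not matter. The name double_to_bin_string is made up, and the sketch assumes double is 64-bit IEEE 754.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 value on this platform. */
_Static_assert(sizeof(double) == sizeof(uint64_t), "double is not 64 bits");

/* Fill out[0..63] with '0'/'1' characters, most significant bit
   first, and terminate the string. out needs room for 65 chars. */
void double_to_bin_string(double d, char out[65])
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* copy the representation */

    for (int i = 0; i < 64; i++)
        out[i] = ((bits >> (63 - i)) & 1u) ? '1' : '0';
    out[64] = '\0';
}
```

In the resulting string, out[0] is the sign bit, out[1] to out[11] hold the exponent, and out[12] to out[63] hold the fraction.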
In order to correctly round all possible decimal representations to the nearest double, you need big integers. Using only the basic integer types from C will leave you to re-implement big integer arithmetic. Each of these two approaches is possible; more information about each follows:
For the first approach, you need a big integer library: GMP is a good one. Armed with such a big integer library, you tackle an input such as the example 123.456E78 as the integer 123456 * 10^75 and start wondering what values M in [2^53 … 2^54) and P in [-1022 … 1023] make (M / 2^53) * 2^P closest to this number. This question can be answered with big integer operations, following the steps described in this blog post (summary: first determine P. Then use a division to compute M). A complete implementation must take care of subnormal numbers and infinities (inf is the correct result to return for any decimal representation of a number that would have an exponent larger than +1023).
The second approach, if you do not want to include or implement a full general-purpose big integer library, still requires a few basic operations to be implemented on arrays of C integers representing large numbers. The function decfloat() in this implementation represents large numbers in base 10^9 because that simplifies the conversion from the initial decimal representation to the internal representation as an array x of uint32_t.
Following is a basic conversion. Enough to get OP started.
OP's "integer part of real number" --> int is far too limiting. Better to simply convert the entire string to a large integer like uintmax_t. Note the decimal point '.' and account for overflow while scanning.
This code does not handle exponents or negative numbers. It may be off in the last bit or so due to the limited integer ui or the final num = ui * pow(10.0, expo). It handles most overflow cases.
#include <ctype.h>    // isdigit
#include <inttypes.h> // uintmax_t, intmax_t, UINTMAX_MAX
#include <math.h>     // pow
#include <stddef.h>   // size_t
double my_atof(const char *src) {
    uintmax_t ui = 0;
    int dp = '.';
    size_t dpi = 0;
    size_t i = 0;
    size_t toobig = 0;
    int ch;
    for (i = 0; (ch = (unsigned char) src[i]) != '\0'; i++) {
        if (ch == dp) {
            dp = '\0'; // only get 1 dp
            dpi = i;
            continue;
        }
        if (!isdigit(ch)) {
            break; // illegal character
        }
        ch -= '0';
        // detect overflow
        if (toobig ||
            (ui >= UINTMAX_MAX / 10 &&
            (ui > UINTMAX_MAX / 10 || (uintmax_t) ch > UINTMAX_MAX % 10))) {
            toobig++;
            continue;
        }
        ui = ui * 10 + ch;
    }
    intmax_t expo = toobig;
    if (dp == '\0') {
        expo -= (intmax_t) (i - dpi - 1); // digits seen after the decimal point
    }
    double num;
    if (expo < 0) {
        // slightly more precise than: num = ui * pow(10.0, expo);
        num = ui / pow(10.0, -expo);
    } else {
        num = ui * pow(10.0, expo);
    }
    return num;
}
The trick is to treat the value as an integer, so read your 547.4242 as an unsigned long long (ie 64-bits or more), ie 5474242, counting the number of digits after the '.', in this case 4. Now you have a value which is 10^4 bigger than it should be. So you float the 5474242 (as a double, or long double) and divide by 10^4.
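A minimal sketch of that trick, under stated limits: no sign, no exponent, no overflow check, and the result is not guaranteed to be correctly rounded for every input. The function name parse_simple_decimal is invented for illustration.

```c
#include <ctype.h>

/* Parse an unsigned decimal such as "547.4242": read all digits into
   one integer, count the digits after the '.', then scale back down.
   No sign, no exponent, no overflow check. */
double parse_simple_decimal(const char *s)
{
    unsigned long long digits = 0;
    int after_point = 0;
    int seen_point = 0;

    for (; *s != '\0'; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else if (isdigit((unsigned char)*s)) {
            digits = digits * 10 + (unsigned)(*s - '0');
            if (seen_point)
                after_point++;
        } else {
            break;   /* stop at the first unexpected character */
        }
    }

    double scale = 1.0;
    for (int i = 0; i < after_point; i++)
        scale *= 10.0;   /* 10^after_point, exact up to 10^22 */

    return (double)digits / scale;
}
```

For "547.4242" this reads the integer 5474242, counts 4 digits after the point, and returns 5474242 / 10^4. The bits of the resulting double can then be inspected by copying it into a 64-bit unsigned integer.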
Decimal to binary conversion is deceptively simple. When you have more bits than the float will hold, then it will have to round. More fun occurs when you have more digits than a 64-bit integer will hold (noting that trailing zeros are special) and you have to decide whether to round or not (and what rounding occurs when you float). Then there's dealing with an E+/-99. Then when you do the eventual division (or multiplication) by 10^n, you have (a) another potential rounding, and (b) the issue that large 10^n are not exactly represented in your floating point, which is another source of error. (And for E+/-99 forms, you may need up to and a little beyond 10^300 for the final step.)
Enjoy !