I'm using a gps module through which I'm getting the string
"0x3f947ae147ae147b"
which I need to convert to double. The expected value is 0.02.
I used the following website as a reference:
https://gregstoll.com/~gregstoll/floattohex/
How can I convert this value in C?
3F947AE147AE147B (hexadecimal) is the encoding for an IEEE-754 binary64 (a.k.a. “double precision”) datum with value 0.0200000000000000004163336342344337026588618755340576171875. Supposing your C implementation uses that format for double and has 64-bit integers with the same endianness, you can decode it (not convert it) by copying its bytes into a double and printing the result:
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char *string = "0x3f947ae147ae147b";
// Set errno to zero before using strtoull.
errno = 0;
char *end;
unsigned long long t = strtoull(string, &end, 16);
// Test whether strtoull did not accept all characters.
if (*end)
{
fprintf(stderr,
"Error, string \"%s\", is not a proper hexadecimal numeral.\n",
string);
exit(EXIT_FAILURE);
}
// Move the value to a 64-bit unsigned integer.
uint64_t encoding = t;
/* Test whether the number is too large, either because strtoull reported
an error or because it does not fit in a uint64_t.
*/
if ((t == ULLONG_MAX && errno) || t != encoding)
{
fprintf(stderr, "Error, string \"%s\", is bigger than expected.\n",
string);
exit(EXIT_FAILURE);
}
// Copy the bytes into a double.
double x;
memcpy(&x, &encoding, sizeof x);
printf("%.9999g\n", x);
}
This should output “0.0200000000000000004163336342344337026588618755340576171875”.
If your C implementation does not support this format, you can decode it:
Separate the 64 bits into s, e, f, where s is the leading bit, e is the next 11 bits, and f is the remaining 52 bits.
If e is 2047 and f is zero, report the value is +∞ or −∞, according to whether s is 0 or 1, and stop.
If e is 2047 and f is not zero, report the value is a NaN (Not a Number) and stop.
If e is not zero, add 2^52 to f. If e is zero, change it to one.
The magnitude of the represented value is f•2^−52•2^(e−1023), and its sign is + or − according to whether s is 0 or 1.
The usual way to convert a string of digits like "0x3f947ae147ae147b" into an actual integer is with one of the "strto" functions. Since you have 64 bits, and you're not interested in treating them as a signed integer (since you're about to, instead, try to treat them as a double), the appropriate choice is strtoull:
#include <stdint.h>
#include <stdlib.h>
char *str = "0x3f947ae147ae147b";
uint64_t x = strtoull(str, NULL, 16);
Now you have your integer, as you can verify by doing
printf("%llx\n", (unsigned long long)x);
But now the question is, how do you treat those bits as an IEEE-754 double value, instead of an integer? There are at least three ways to do it, in increasing levels of portability.
(1) Use pointers. Take a pointer to your integer value x, convert it to a pointer to double, then dereference it, forcing the compiler to (try to) treat the bits of x as if they were a double:
double *dp = (double *)&x;
double d = *dp;
printf("%f\n", d);
This was once a decent and simple way to do it, but it is no longer legal as it runs afoul of the "strict aliasing rule". It might work for you, or it might not. Theoretically this sort of technique can also run into issues with alignment. For these reasons, this technique is not recommended.
(2) Use a union:
union u { uint64_t x; double d; } un;
un.x = strtoull(str, NULL, 16);
printf("%f\n", un.d);
Opinions differ on whether this technique is 100% strictly legal. I believe it's fine in C, but it may not be in C++. I'm not aware of machines where it won't work.
(3) Use memcpy:
#include <string.h>
uint64_t x = strtoull(str, NULL, 16);
double d;
memcpy(&d, &x, 8);
printf("%f\n", d);
This works by, literally, copying the individual bytes of the uint64_t value x into the bytes of the double variable d. This is 100% portable (as long as x and d are the same size). I used to think it was wasteful, due to the extra function call, but these days it's a generally recommended technique, and I'm told that modern compilers are smart enough to recognize what you're trying to do, and emit perfectly efficient code (that is, just as efficient as techniques (1) or (2)).
Now, one other portability concern is that this all assumes that type double on your machine is in fact implemented using the same IEEE-754 double-precision format as your incoming hex string representation. That's actually a very safe assumption these days, although it's not strictly guaranteed by the C standards. If you like to be particularly careful about type correctness, you might add the lines
#include <assert.h>
assert(sizeof(uint64_t) == sizeof(double));
and change the memcpy call (if that's what you end up using) to
memcpy(&d, &x, sizeof(double));
(But note that these last few changes only guard against unexpected system-specific discrepancies in the size of type double, not its representation.)
One further point. Note that one technique which will most definitely not work is the superficially obvious
d = (double)x;
That line would perform an actual conversion of the value 0x3f947ae147ae147b. It won't just reinterpret the bits. If you try it, you'll get an answer like 4581421828931458048.000000. Where did that come from? Well, 0x3f947ae147ae147b in decimal is 4581421828931458171, and the closest value that type double can represent is 4581421828931458048. (Why can't type double represent the integer 4581421828931458171 exactly? Because it's a 62-bit number, and type double has at most 53 bits of precision.)
I'm struggling to understand the behavior of gcc here. The size of a float is 4 bytes on my architecture. But I can still store an 8-byte real value in a float, and my compiler says nothing about it.
For example, I have:
#include <stdio.h>
int main(int argc, char** argv){
float someFloatNumb = 0xFFFFFFFFFFFF;
printf("%i\n", sizeof(someFloatNumb));
printf("%f\n", someFloatNumb);
printf("%i\n", sizeof(281474976710656));
return 0;
}
I expected the compiler to insult me, or to display a disclaimer of some sort, because I shouldn't be able to do something like that; at least I think it's kind of twisted wizardry.
The program simply runs:
4
281474976710656.000000
8
So, if I print the size of someFloatNumb, I get 4 bytes, which is expected. But the stored value isn't what I expected, as seen just below.
So I have a few questions:
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result?
Does/Can gcc grow the capacity of a type? (managing multiple variables behind the curtains to allow us that sort of things)
1)
Does sizeof(variable) simply get the variable type and return sizeof(type), which in this case would explain the result ?
Except for variable-length arrays, sizeof doesn't evaluate its operand. So yes, all it cares about is the type. So sizeof(someFloatNumb) is 4, which is equivalent to sizeof(float). This explains printf("%i\n", sizeof(someFloatNumb));.
2)
[..] But I can still store a 8 bytes real value in a float, and my compiler says nothing about it.
Does/Can gcc grow the capacity of a type ? (managing multiple variables behind the curtains to allow us that sort of things)
No. Capacity doesn't grow. You simply misunderstood how floats are represented/stored. sizeof(float) being 4 doesn't mean it can't store values larger than 2^32 (assuming 1 byte == 8 bits). See Floating point representation.
The maximum value a float can represent is defined by the constant FLT_MAX (see <float.h>). sizeof(someFloatNumb) simply yields how many bytes the object (someFloatNumb) takes up in memory, which doesn't directly tell you the range of values it can represent.
This explains why printf("%f\n", someFloatNumb); prints the value as expected (and there's no automatic "capacity growth").
3)
printf("%i\n", sizeof(281474976710656));
This is slightly more involved. As said before in (1), sizeof only cares about the type here. But the type of 281474976710656 is not necessarily int.
The C standard defines the type of integer constants according to the smallest type that can represent the value. See https://stackoverflow.com/a/42115024/1275169 for an explanation.
On my system 281474976710656 can't be represented in an int and it's stored in a long int, which is likely to be the case on your system as well. So what you see is essentially equivalent to sizeof(long).
There's no portable way to determine the type of integer constants. But since you are using gcc, you could use a little trick with typeof:
typeof(281474976710656) x;
printf("%s", x); /* deliberately using '%s' to generate warning from gcc. */
generates:
warning: format ‘%s’ expects argument of type ‘char *’, but argument 2
has type ‘long int’ [-Wformat=]
printf("%s", x);
P.S.: sizeof yields a size_t, for which the correct format specifier is %zu. So that's what you should be using in your 1st and 3rd printf statements.
This doesn't store "8 bytes" of data. The compiler treats that value as an integer constant, then converts it to a float for the assignment:
float someFloatNumb = 0xFFFFFFFFFFFF; // 6 bytes of data
Since float can represent large values, this isn't a big deal, but you will lose a lot of precision if you're only using 32-bit floats. Notice there's a slight but important difference here:
float value = 281474976710656.000000;
long long value = 281474976710655;
This is because float becomes an approximation when it runs out of precision.
Capacities don't "grow" for standard C types. You'll have to use a "bignum" library for that.
But I can still store a 8 bytes real value in a float, and my compiler
says nothing about it.
That's not what's happening.
float someFloatNumb = 0xFFFFFFFFFFFF;
0xFFFFFFFFFFFF is an integer constant. Its value, expressed in decimal, is 281474976710655, and its type is probably either long or long long. (Incidentally, that value can be stored in 48 bits, but most systems don't have a 48-bit integer type, so it will probably be stored in 64 bits, of which the high-order 16 bits will be zero.)
When you use an expression of one numeric type to initialize an object of a different numeric type, the value is converted. This conversion doesn't depend on the size of the source expression, only on its numeric value. For an integer-to-float conversion, the result is the closest representation to the integer value. There may be some loss of precision (and in this case, there is). Some compilers may have options to warn about loss of precision, but the conversion is perfectly valid so you probably won't get a warning by default.
Here's a small program to illustrate what's going on:
#include <stdio.h>
int main(void) {
long long ll = 0xFFFFFFFFFFFF;
float f = 0xFFFFFFFFFFFF;
printf("ll = %lld\n", ll);
printf("f = %f\n", f);
}
The output on my system is:
ll = 281474976710655
f = 281474976710656.000000
As you can see, the conversion has lost some precision. 281474976710656 is an exact power of two, and floating-point types generally can represent those exactly. There's a very small difference between the two values because you chose an integer value that's very close to one that can be represented exactly. If I change the value:
#include <stdio.h>
int main(void) {
long long ll = 0xEEEEEEEEEEEE;
float f = 0xEEEEEEEEEEEE;
printf("ll = %lld\n", ll);
printf("f = %f\n", f);
}
the apparent loss of precision is much larger:
ll = 262709978263278
f = 262709979381760.000000
0xFFFFFFFFFFFF == 281474976710655
If you init a float with that value, it will end up being
0xFFFFFFFFFFFF +1 == 0x1000000000000 == 281474976710656 == 1<<48
That fits easily in a 4-byte float: simple mantissa, small exponent.
It does however NOT store the correct value (one lower) because that IS hard to store in a float.
Note that the " +1" does not imply incrementation. It ends up one higher because the representation can only get as close as off-by-one to the attempted value. You may consider that "rounding up to the next power of 2 multiplied by whatever the mantissa can store". The mantissa, by the way, is usually interpreted as a fraction between 0 and 1.
Getting closer would indeed require the 48 bits of your initialisation in the mantissa, plus whatever number of bits would be used to store the exponent, and maybe a few more for other details.
Look at the value printed: 0xFFFF...FFFF is an odd value, but the value printed in your example is even. You are feeding the float variable an integer value that is converted to float. The conversion loses precision, as expected for this value, which doesn't fit in the 23 bits reserved for the target variable's mantissa. You finally get an approximation, which is the value 0x1000000....0000 (the next value, which is the closest representable value to the one you used, as @Yunnosch posted in his answer).
Sorry if this has already been asked. I've seen other ways of extracting the exponent of a floating point number, but this is what was given to me:
unsigned f2i(float f)
{
union {
unsigned i;
float f;
} x;
x.i = 0;
x.f = f;
return x.i;
}
I'm having trouble understanding this union datatype, because shouldn't the return x.i at the end always make f2i return a 0?
Also, what application could this data type even be useful for? For example, say I have a function:
int getexponent(float f){
}
This function is supposed to get the exponent value of the floating point number with bias of 127. I've found many ways to make this possible, however how could I manipulate the f2i function to serve this purpose?
I appreciate any pointers!
Update!!
Wow, years later and this just seems trivial.
For those who may be interested, here is the function!
int getexponent(float f) {
unsigned f2i(float f);
unsigned int ui = (f2i(f)>>23) & 0xff ;//shift right by 23 and mask with 0xff to get the exponent with the bias
int bias = 127;//initialized bias
if(ui == 0) return 1-bias; // special case 0
else if(ui == 255) return 11111111; //special case infinity
return ui - bias;
}
I'm having trouble understanding this union datatype
The union data type is a way for a programmer to indicate that some variable can be one of a number of different types. The wording of the C11 standard is something like "a union contains at most one of its members". It is used for things like parameters that may be logically one thing or another. For example, an IP address might be an IPv4 address or an IPv6 address so you might define an address type as follows:
struct IpAddress
{
bool isIPv6;
union
{
uint8_t v4[4];
uint8_t v6[16];
} bytes;
};
And you would use it like this:
struct IpAddress address = // Something
if (address.isIPv6)
{
doSomeV6ThingWith(address.bytes.v6);
}
else
{
doSomeV4ThingWith(address.bytes.v4);
}
Historically, unions have also been used to get the bits of one type into an object of another type. This is because, in a union, the members all start at the same memory address. If I just do this:
float f = 3.0;
int i = f;
The compiler will insert code to convert a float to an integer, so the exponent will be lost. However, in
union
{
unsigned int i;
float f;
} x;
x.f = 3.0;
int i = x.i;
i now contains the exact bits that represent 3.0 in a float. Or at least you hope it does. There's nothing in the C standard that says float and unsigned int have to be the same size. There's also nothing in the C standard that mandates a particular representation for float (well, annex F says floats conform to IEC 60559 , but I don't know if that counts as part of the standard). So the above code is, at best, non portable.
The portable way to get the exponent of a float is the frexpf() function declared in math.h.
how could I manipulate the f2i function to serve this purpose?
Let's make the assumption that a float is stored in IEC 60559 format in 32 bits, which Wikipedia thinks is the same as IEEE 754. Let's also assume that integers are stored in little endian format.
union
{
uint32_t i;
float f;
} x;
x.f = someFloat;
uint32_t bits = x.i;
bits now contains the bit pattern of the floating point number. A single precision floating point number looks like this
SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM
^        ^                     ^
bit 31   bit 22                bit 0
Where S is the sign bit, E is an exponent bit, M is a mantissa bit.
So having got your uint32_t you just need to do some shifting and masking:
uint32_t exponentWithBias = (bits >> 23) & 0xff;
Because it's a union, x.i and x.f have the same address, which lets you reinterpret one data type as another. In this scenario the union is first zeroed out by x.i = 0; and then filled with f. Then x.i is returned, which is the integer representation of the float f. If you then shift and mask that value, you get the exponent of the original f because of the way a float is laid out in memory.
I'm having trouble understanding this union datatype, because shouldn't the return x.i at the end always make f2i return a 0?
The line x.i = 0; is a bit paranoid and shouldn't be necessary. Given that unsigned int and float are both 32 bits, the union creates a single chunk of 32 bits in memory, which you can access either as a float or as the pure binary representation of that float, which is what the unsigned is for. (It would have been better to use uint32_t.)
This means that the lines x.i = 0; and x.f = f; write to the very same memory area twice.
What you end up with after the function is the pure binary notation of the float. Parsing out the exponent or any other part from there is very much implementation-defined, since it depends on floating point format and endianness. How to represent FLOAT number in memory in C might be helpful.
That union type is strongly discouraged, as it is strongly architecture dependent and compiler implementation dependent. Both things make it almost impossible to determine a correct way to obtain the information you request.
There are portable ways of doing that, and all of them have to deal with the calculation of logarithm to the base ten. If you get the integer part of the log10(x) you'll get the number you want,
int power10 = (int)log10(x);
double log10(double x)
{
return log(x)/log(10.0);
}
This gives you the power of ten by which the mantissa must be multiplied to obtain the number. If you divide the original number by ten raised to that power, you get the mantissa.
Be careful, as the floating point numbers are normally internally stored in a power of two's basis, which means the exponent you get stored is not a power of ten, but a power of two.
Can I round trip any 4-byte-aligned pointer through a double? And can I round trip any finite double through a string?
Specifically, on any platform that uses IEEE floating point, which conforms to C11, and on which neither static assertion fails, are the assertions in the following program guaranteed to pass?
#include <stdint.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <math.h>
int main(void) {
struct {
void *dummy;
} main_struct;
main_struct.dummy = 0;
static_assert(_Alignof(main_struct) >= 4,
"Dummy struct insufficiently aligned");
static_assert(sizeof(double) == sizeof(uint64_t) && sizeof(double) == 8,
"double and uint64_t must have size 8");
double x;
uint64_t ptr = (uint64_t)&main_struct;
assert((ptr & 3) == 0);
ptr >>= 2;
memcpy(&x, &ptr, 8);
assert(!isnan(x));
assert(isfinite(x));
assert(x > 0);
char buf[1000];
snprintf(buf, sizeof buf, "Double is %#.20g\n", x);
double q;
sscanf(buf, "Double is %lg\n", &q);
assert(q == x);
assert(memcmp(&q, &ptr, 8) == 0);
}
Specifically, on any platform that uses IEEE floating point, which conforms to C11, and on which neither static assertion fails, are the assertions in the following program guaranteed to pass?
With only those requirements, then no. Among reasons that preclude it are the following:
You haven't asserted that pointers are 64 bits or less in size.
Nothing says that pointers and doubles use the same kind of endianness in memory. If pointers are big-endian and doubles are little-endian (or middle-endian, or use some other weird in-memory format), then your shifting does not preclude negative, infinite or NaN values.
Pointers are not guaranteed to translate simply into an integral value with lower-order bits guaranteed to be zero just because they point to an aligned value.
These objections may be somewhat pathological on current, practical platforms, but could certainly be true in theory, and nothing in your list of requirements stands against them.
It is, for instance, perfectly possible to imagine an architecture with a separate floating-point coprocessor that uses a different memory format than the main integer CPU. In fact, the Wikipedia article actually states that there are real examples of architectures that do this. As for weird pointer formats, the C FAQ provides some interesting historical examples.
I'm a bit confused about the round() function in C.
First of all, man says:
SYNOPSIS
#include <math.h>
double round(double x);
RETURN VALUE
These functions return the rounded integer value.
If x is integral, +0, -0, NaN, or infinite, x itself is returned.
The return value is a double / float or an int?
Secondly, I've created a function that first rounds, then casts to int. Later in my code I use it as a means to compare doubles:
int tointn(double in,int n)
{
int i = 0;
i = (int)round(in*pow(10,n));
return i;
}
This function apparently isn't stable throughout my tests. Is there redundancy here? Well... I'm not looking only for an answer, but a better understanding on the subject.
The wording in the man-page is meant to be read literally, that is in its mathematical sense. The wording "x is integral" means that x is an element of Z, not that x has the data type int.
Casting a double to int can be dangerous because a double can hold every integer up to 2^53 exactly (assuming an IEEE 754 conforming binary64) and far larger values approximately, whereas the maximum value an int can hold might be much smaller (int is mostly 32 bits wide on 32-bit architectures and also on some 64-bit architectures).
If you need only powers of ten you can test it with this little program yourself:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(){
int i;
for(i = 0;i < 26;i++){
printf("%d:\t%.2f\t%d\n",i, pow(10,i), (int)pow(10,i));
}
exit(EXIT_SUCCESS);
}
Instead of casting you should use the functions that return a proper integral data type like e.g.: lround(3).
here is an excerpt from the man page.
#include <math.h>
double round(double x);
float roundf(float x);
long double roundl(long double x);
notice: the returned value is NEVER of an integer type. However, the fractional part of the returned value is set to 0.
notice: which function is called determines the type of the returned value.
Here is an excerpt from the man page about which way the rounding will be done:
These functions round x to the nearest integer, but round halfway cases
away from zero (regardless of the current rounding direction, see
fenv(3)), instead of to the nearest even integer like rint(3).
For example, round(0.5) is 1.0, and round(-0.5) is -1.0.
If you want a long integer to be returned then please use lround:
long int tolongint(double in)
{
return lround(in);
}
For details please see lround, which has been available in C since C99 (and in C++ since C++11).
Basically I have a uint64_t whose actual value I do not care about. I need to store it in a double so that I can easily store the bits in an object in R (if you don't know what that is, that's fine, it doesn't matter for the question).
So what I would like is a means of storing the 64 bits of the uint64_t inside of a double and then also convert from the double holding the bits back to the original uint64_t.
I've been banging my head against the wall on this for a quite a bit (is that a pun?) so any and all help is greatly appreciated!!!
As stated in the comments, you can use memcpy to copy the bits from your uint64_t to a double.
This works because the size of a double is the same as the one of an uint64_t (8 bytes).
If you try to display the values, you won't get the same result, though. Indeed a double stores both positive and negative values, with a floating point, whereas an uint64_t is... unsigned and whole.
The binary representation is the same, the interpreted value is different.
in C casting is your friend:
dbl = * (double*)(&ui64);
ui64 = * (uint64_t*)(&dbl);
example:
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>
double dbl_1 = 42.42;
uint64_t ui64_1;
double dbl_2;
uint64_t ui64_2 = 0x404535C28F5C28F6;
int main(void)
{
ui64_1 = *(uint64_t*)(&dbl_1);
printf("%" PRIx64 "\n", ui64_1);
dbl_2 = *(double*)(&ui64_2);
printf("%f", dbl_2);
}
Great resource for conversions:
IEEE-754 Floating-Point Conversion
Edit: This is undefined behavior per the C standards and although it may work, it is not best practice.