128-bit floating point binary representation error - C

Let's say we have some 128-bit floating point number, for example x = 2.6 (1.3 * 2^1 in IEEE-754).
I put it in a union like this:
union flt {
    long double flt;
    int64_t byte8[OCTALC];
} d;
d.flt = x;
Then I run this to get its hexadecimal representation in memory:
void print_bytes(void *ptr, int size)
{
    unsigned char *p = ptr;
    int i;
    for (i = 0; i < size; i++) {
        printf("%02hhX ", p[i]);
    }
    printf("\n");
}
// somewhere in the code
print_bytes(&d.byte8[0], 16);
And I get something like:
66 66 66 66 66 66 66 A6 00 40 00 00 00 00 00 00
By assumption I expect to see one of the leading bits (the left ones) set to 1 (because the exponent of 2.6 is 1), but in fact I see bits on the right set to 1 (as if the value were treated as big-endian). If I flip the sign, the output changes to:
66 66 66 66 66 66 66 A6 00 C0 00 00 00 00 00 00
So it seems like the sign bit is further right than I thought. And if you count the bytes, it seems like only 10 bytes are used; the remaining 6 look truncated or something.
I am trying to find out why this happens. Any help?

You have a number of misconceptions.
First of all, you don't have a 128-bit floating point number. On x86-64, long double is probably the x86 extended-precision format. This is an 80-bit (10-byte) value, which is padded out to 16 bytes. (I suspect this is for alignment purposes.)
And of course, it's going to be in little-endian byte order (since this is an x86/x86-64). This doesn't refer to the order of the bits in each byte; it refers to the order of the bytes in the value as a whole.
And finally, the exponent is biased. An exponent of 1 isn't stored as 1. It's stored as 1+0x3FFF. This allows for negative exponents.
So we get the following:
66 66 66 66 66 66 66 A6 00 40 00 00 00 00 00 00
Demo on Compiler Explorer
If we remove the padding and reverse the bytes to better match the image in the Wikipedia page, we get
4000A666666666666666
This translates to
+0x1.4CCCCCCCCCCCCCCC × 2^(0x4000-0x3FFF)
(0xA66...6 = 0b1010 0110 0110...0110 ⇒ 0b1.0100 1100 1100...110[0] = 0x1.4CC...C)
or
+1.29999999999999999995663191310057982263970188796520233154296875 × 2^1
Decimal conversion obtained using
perl -Mv5.10 -e'
    use Math::BigFloat;
    Math::BigFloat->div_scale( 1000 );
    say
        Math::BigFloat->from_hex( "4CCCCCCCCCCCCCCC" ) /
        Math::BigFloat->from_hex( "10000000000000000" )
'
or
perl -Mv5.10 -e'
    use Math::BigFloat;
    Math::BigFloat->div_scale( 1000 );
    say
        Math::BigFloat->from_hex( "A666666666666666" ) /
        Math::BigFloat->from_hex( "8000000000000000" )
'
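The same decoding can be done in C. Here is a minimal sketch, assuming a little-endian x86-64 host where long double is the 80-bit x87 format padded to 16 bytes:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    long double x = 2.6L;
    unsigned char b[sizeof x];
    memcpy(b, &x, sizeof b);

    /* Little-endian x87 layout: bytes 0-7 hold the 64-bit significand
       (with an explicit integer bit), bytes 8-9 hold the sign bit and
       the 15-bit biased exponent. Bytes 10-15 are padding. */
    uint64_t significand = 0;
    for (int i = 7; i >= 0; i--)
        significand = (significand << 8) | b[i];
    unsigned se = (unsigned)b[9] << 8 | b[8];
    int sign = se >> 15;
    int exponent = (int)(se & 0x7FFF) - 0x3FFF;  /* remove the bias */

    /* For x = 2.6 this prints: sign=0 exponent=1 significand=0xa666666666666666 */
    printf("sign=%d exponent=%d significand=%#llx\n",
           sign, exponent, (unsigned long long)significand);
    return 0;
}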

You've been bamboozled by some very strange aspects of the way extended-precision floating-point is typically implemented in C on Intel architectures. So don't feel too bad. :-)
What you're seeing is that although sizeof(long double) may be 16 (== 128 bits), deep down inside what you're really getting is the 80-bit Intel extended format. It's being padded out with 6 bytes, which in your case happen to be 0. So, yes, the sign bit is further right than you thought.
I see the same thing on my machine, and it's something I've always wondered about. It seems like a real waste, doesn't it? I used to think it was for some kind of compatibility with machines which actually do have 128-bit long doubles. But that can't be it, because this 0-padded 16-byte format is not binary-compatible with true IEEE 128-bit floating point, among other things because the padding is on the wrong end.

Related

Converting AMF Number from char to double and back

In the context of a protocol, I get messages in AMF format.
The AMF Object Type "Number" is defined as
number-type = number-marker DOUBLE
The data following a Number type marker is always an 8 byte IEEE-754 double [...] in network byte order.
The following Examples are captured using Wireshark:
Hex: 40 00 00 00 00 00 00 00   Number: 2
Hex: 40 08 00 00 00 00 00 00   Number: 3
Hex: 3f f0 00 00 00 00 00 00   Number: 1
I tried to treat these as double, long long and int64_t, but none of these types seems to use the correct order/format.
The implementation needs to be in C, so I can't use any libraries (there are none, as it seems).
What would be the correct approach?
Likely your platform supports 8-byte IEEE-754 doubles but requires them to be in little-endian format. Your examples are in big-endian format. If you store them in an aligned array of unsigned characters from last to first and cast the pointer to a double *, you should get the right value.
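Here is a hedged sketch of that idea in C, using memcpy rather than a pointer cast so there are no alignment or strict-aliasing problems. The function name amf_read_double is mine, and the code assumes the host uses 8-byte IEEE-754 little-endian doubles:

#include <stdio.h>
#include <string.h>

/* Convert 8 bytes in network (big-endian) order to a host double. */
double amf_read_double(const unsigned char *wire)
{
    unsigned char swapped[8];
    for (int i = 0; i < 8; i++)
        swapped[i] = wire[7 - i];   /* reverse the byte order */
    double d;
    memcpy(&d, swapped, sizeof d);  /* safe, unlike a double* cast */
    return d;
}

int main(void)
{
    const unsigned char three[8] = {0x40, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
    printf("%g\n", amf_read_double(three));  /* prints 3 */
    return 0;
}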

Understanding fwrite() behaviour

When I use fwrite to write to stdout, why do I get strange, unreadable results?
What is the use of the size_t count argument in the fwrite and fread functions?
#include <stdio.h>

struct test
{
    char str[20];
    int num;
    float fnum;
};

int main()
{
    struct test test1 = {"name", 12, 12.334};
    fwrite(&test1, sizeof(test1), 1, stdout);
    return 0;
}
output:
name XEA
Process returned 0 (0x0) execution time : 0.180 s
Press any key to continue.
And when I use fwrite on a file:
#include <stdio.h>
#include <stdlib.h>

struct test
{
    char str[20];
    int num;
    float fnum;
};

int main()
{
    struct test test1 = {"name", 12, 12.334};
    FILE *fp;
    fp = fopen("test.txt", "w");
    if (fp == NULL)
    {
        printf("FILE Error");
        exit(1);
    }
    fwrite(&test1, sizeof(test1), 1, fp);
    return 0;
}
The file also contains something like this:
name XEA
Why is the output like this?
And when I set the size_t count argument to 5, I also get strange results. What is the purpose of this argument?
In the way that you're using it, fwrite is writing "binary" output, which is a byte-for-byte representation of your struct test as in memory. This representation is not generally human-readable.
When you write
struct test test1 = {"name", 12, 12.334};
you get a structure in memory which might be represented like this (with all byte values in hexadecimal):
test1: str:  6e 61 6d 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       num:  0c 00 00 00
       fnum: 10 58 45 41
Specifically: 0x6e, 0x61, 0x6d, and 0x65 are the ASCII codes for the letters in "name". Your str array was size 20, so the string name is 0-terminated and then padded out with 15 more 0 characters. 0x0c is the hexadecimal representation of the number 12, and I'm assuming that type int is 32 bits or 4 bytes on your machine, so there are 3 more 0's there, also. (Also I'm assuming your machine is "little endian", with the least-significant byte of a 4-byte quantity like 0x0000000c coming first in memory.) Finally, in the IEEE-754 format which your machine uses, the number 12.334 is represented in 32-bit single-precision floating point as 0x41455810, which is again stored in the opposite order in memory.
So those 28 bytes (plus possibly some padding, which we won't worry about for now) are precisely the bytes that get written to the screen, or to your file. The string "name" will probably be human-readable, but all the rest will be, literally, "binary garbage". It just so happens that three of the bytes making up the float number 12.334, namely 0x58, 0x45, and 0x41, correspond in ASCII to the capital letters X, E, and A, so that's why you see those characters in the output.
Here's the result of passing the output of your program into a "hex dump" utility:
 0  6e 61 6d 65 00 00 00 00 00 00 00 00 00 00 00 00  name............
16  00 00 00 00 0c 00 00 00 10 58 45 41              .........XEA
You can see the letters name at the beginning, and the letters XEA at the end, and all the 0's and other binary characters in between.
If you're on a Unix or Linux (or Mac OS X) system, you can use tools like od or hexdump to get a hex dump similar to the one I've shown.
You asked about the "count" argument to fwrite. fwrite is literally designed for writing out binary structures, just like you're doing. Its function signature is
size_t fwrite(const void *ptr, size_t sz, size_t n, FILE *fp);
As you know, ptr is a pointer to the data structure(s) you're writing, and fp is the file pointer you're writing it/them to. And then you're saying you want to write n (or 'count') items, each of size sz.
You called
fwrite(&test1, sizeof(test1), 1, fp);
The expression sizeof(test1) gives the size of test1 in bytes. (It will probably be 28 or so, as I mentioned above.) And of course you're writing one struct, so passing sizeof(test1) and 1 as sz and n is perfectly reasonable and correct.
It would also not be unreasonable or incorrect to call
fwrite(&test1, 1, sizeof(test1), fp);
Now you're telling fwrite to write 28 bytes, each of size 1. (A byte is of course always size 1.)
One other thing about fwrite (as noted by @AnttiHaapala in a comment): when you're writing binary data to a file, you should specify the b flag when you open the file:
fp = fopen("test.txt", "wb");
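For the record, here is a minimal sketch of reading the structure back in with fread, assuming the same program (and therefore the same struct layout, padding, and endianness) wrote the file:

#include <stdio.h>

struct test
{
    char str[20];
    int num;
    float fnum;
};

int main(void)
{
    struct test t;
    FILE *fp = fopen("test.txt", "rb");    /* "b" again, for binary */
    if (fp == NULL)
        return 1;
    if (fread(&t, sizeof t, 1, fp) == 1)   /* fread returns the item count */
        printf("str: %s, num: %d, fnum: %f\n", t.str, t.num, t.fnum);
    fclose(fp);
    return 0;
}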
Finally, if this isn't what you want, and you want human-readable text output instead, then fwrite is not the right tool here. You could use something like
printf("str: %s\n", test1.str);
printf("num: %d\n", test1.num);
printf("fnum: %f\n", test1.fnum);
Or, to your file:
fprintf(fp, "str: %s\n", test1.str);
fprintf(fp, "num: %d\n", test1.num);
fprintf(fp, "fnum: %f\n", test1.fnum);

What is the difference between these two C functions in terms of handling memory?

typedef unsigned char *byte_pointer;

void show_bytes(byte_pointer start, size_t len) {
    size_t i;
    for (i = 0; i < len; i++)
        printf(" %.2x", start[i]);  //line:data:show_bytes_printf
    printf("\n");
}

void show_integer(int *p, size_t len) {
    size_t i;
    for (i = 0; i < len; i++) {
        printf(" %d", p[i]);
    }
    printf("\n");
}
Suppose I have two functions above, and I use main function to test my functions:
int main(int argc, char *argv[])
{
    int a[5] = {12345, 123, 23, 45, 1};
    show_bytes((byte_pointer)a, sizeof(a));
    show_integer(a, 5);
}
I got the following results in my terminal:
ubuntu#ubuntu:~/OS_project$ ./show_bytes
39 30 00 00 7b 00 00 00 17 00 00 00 2d 00 00 00 01 00 00 00
12345 123 23 45 1
Can someone tell me why I got this result? I understand the second function, but I have no idea why I got 39 30 00 00 7b 00 00 00 17 00 00 00 2d 00 00 00 01 00 00 00 from the first one. I do know this byte sequence is the hexadecimal representation of 12345, 123, 23, 45, 1. What I don't understand is why start[i] doesn't point to a whole number such as 12345 or 123; instead, start[0] seems to point to the least significant byte of the first number, 12345. Can someone help me explain why these two functions behave differently?
12345 is 0x3039 in hex. Because int is 32 bits on your machine, it is represented as 0x00003039. Then, because your machine is little-endian, it is stored in memory as the byte sequence 39 30 00 00. You can read more about big- and little-endian systems at https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Data/endian.html
The same applies to the other results.
On your platform, sizeof(int) is 4 and a little-endian byte order is used. The binary representation of 12345 as a 32-bit value is:
00000000 00000000 00110000 00111001
In a little-endian system, that is stored using the following byte sequence:
00111001 00110000 00000000 00000000
In hex, those bytes are:
39 30 00 00
That's what you are seeing as the output corresponding to the first number.
You can do similar processing of the other numbers in the array to understand the output corresponding to them.
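If you want to convince yourself without peeking at memory at all, here is a small sketch that extracts the bytes of 12345 arithmetically, least significant byte first; on a little-endian machine this is exactly the order show_bytes prints them in:

#include <stdio.h>

int main(void)
{
    unsigned int n = 12345;                      /* 0x00003039 */
    for (int i = 0; i < 4; i++)
        printf(" %.2x", (n >> (8 * i)) & 0xff);  /* byte i, LSB first */
    printf("\n");                                /* prints: 39 30 00 00 */
    return 0;
}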

store 300*1024*1024 in 64bit variable as low and high bit

I am trying to understand how the value 300*1024*1024 will be stored in a 64-bit variable on a big-endian machine, and how we can find its high and low bytes.
Build a union with a long integer and an array of 8 unsigned chars and see for yourself. You can view the unsigned chars in hex if you want.
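A minimal sketch of that suggestion, using uint64_t rather than a bare long so the variable is 64 bits on any platform:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    union {
        uint64_t value;
        unsigned char bytes[8];
    } u;
    u.value = (uint64_t)300 * 1024 * 1024;  /* 0x12C00000 */
    for (int i = 0; i < 8; i++)
        printf("%02X ", u.bytes[i]);        /* byte order reveals endianness */
    printf("\n");
    return 0;
}

On a little-endian machine this prints 00 00 C0 12 00 00 00 00; on a big-endian machine, 00 00 00 00 12 C0 00 00.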
Big-endian hardware stores the most significant byte first in memory. Little-endian hardware stores the least significant byte first. In hex 300*1024*1024 is 0x12C00000.
So, for your big-endian hardware it will be stored like so:
byte number:  1  2  3  4  5  6  7  8
value:       00 00 00 00 12 C0 00 00
On LE hardware the bytes will be stored in reverse order:
byte number:  1  2  3  4  5  6  7  8
value:       00 00 C0 12 00 00 00 00

How can I access members of a struct when it's not aligned properly?

I'm afraid that I'm not very good at low level C stuff, I'm more used to
using objects in Obj-c, so please excuse me if this is an obvious question, or if I've completely misunderstood something...
I am attempting to write an application in Cocoa/Obj-C which communicates with an external bit of hardware (a cash till). I have the format of the data the device sends and receives, and have successfully got some chunks of data from the device.
For example: the till exchanges PLU (price data) in chunks of data in the following format: (from the documentation)
Name             Bytes  Type
PLU code h       4      long
PLU code L       4      long
desc             19     char
Group            3      char
Status           5      char
PLU link code h  4      long
PLU link code l  4      long
M&M Link         1      char
Min. Stock.      2      int
Price 1          4      long
Price 2          4      long
Total            54 bytes
So I have a struct in the following form in which to hold the data from the till:
typedef struct MFPLUStructure {
    UInt32 pluCodeH;
    UInt32 pluCodeL;
    unsigned char description[19];
    unsigned char group[3];
    unsigned char status[5];
    UInt32 linkCodeH;
    UInt32 linkCodeL;
    unsigned char mixMatchLink;
    UInt16 minStock;
    UInt32 price[2];
} MFPLUStructure;
I have some known sample data from the till (below), which I have checked by hand and know to be valid:
00 00 00 00 4E 61 BC 00 54 65 73 74 20 50 4C 55 00 00 00 00 00 00 00 00 00 00 00 09 08 07 17 13 7C 14 04 00 00 00 00 09 03 00 00 05 BC 01 7B 00 00 00 00 00 00 00
i.e.
bytes 46 to 49 are <7B 00 00 00> == 123, as I would expect, since the price is set to '123' on the till.
byte 43 is <05> == 5, as I would expect, since the 'mix and match link' is set to 5 on the till.
bytes 39 to 42 are <09 03 00 00> == 777, as I would expect, since the 'link code' is set to '777' on the till.
bytes 27, 28 and 29 are <09 08 07>, which are the three groups (7, 8 & 9) that I would expect.
The problem comes when I try to get some of the data out of the structure programmatically: the early members work correctly, right up to and including the five 'status' bytes. However, members after that don't come out properly (see debugger screenshots below).
Image 1 - http://i.stack.imgur.com/nOdER.png
I assume that the reason for this is that the five status bytes push the later members out of alignment, i.e. they sit across machine word boundaries. Is this right?
Image 2 - i.imgur.com/ZhbXU.png
Am I right in making that assumption?
And if so, how can I get the members in and out correctly?
Thanks for any help.
Either access the data a byte at a time and assemble it into larger types, or memcpy it into an aligned variable. The latter is better if the data is known to be in a format specific to the host's endianness, etc. The former is better if the data follows an external specification that might not match the host.
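Here is a hedged sketch of the first approach (byte-at-a-time assembly), assuming the wire data is little-endian as the sample bytes suggest. The helper names read_le32 and unpack_plu are mine, and I'm assuming UInt32/UInt16 have the same sizes as uint32_t/uint16_t:

#include <stdint.h>
#include <string.h>

/* Assemble a little-endian 32-bit value from 4 unaligned bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Fill the structure from one raw 54-byte record. */
void unpack_plu(MFPLUStructure *out, const unsigned char *buf)
{
    out->pluCodeH = read_le32(buf + 0);
    out->pluCodeL = read_le32(buf + 4);
    memcpy(out->description, buf + 8, 19);
    memcpy(out->group, buf + 27, 3);
    memcpy(out->status, buf + 30, 5);
    out->linkCodeH = read_le32(buf + 35);
    out->linkCodeL = read_le32(buf + 39);
    out->mixMatchLink = buf[43];
    out->minStock = (uint16_t)(buf[44] | buf[45] << 8);
    out->price[0] = read_le32(buf + 46);   /* <7B 00 00 00> == 123 */
    out->price[1] = read_le32(buf + 50);
}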
If you're sure that endianness of host and wire agree, you could also use a packed structure to read the data in a single pass. However, this is compiler-specific and will most likely impact performance of member access.
Assuming gcc, you'd use the following declarations:
struct __attribute__ ((__packed__)) MFPLUStructure { ... };
typedef struct MFPLUStructure MFPLUStructure;
If you decide to use a packed structure, you should also verify that it is of correct size:
assert(sizeof (MFPLUStructure) == 54);
