I am using the HIDAPI to send some data to a USB device. This data can be sent only as byte array and I need to send some float numbers inside this data array. I know floats have 4 bytes. So I thought this might work:
float f = 0.6;
char data[4];
data[0] = (int) f >> 24;
data[1] = (int) f >> 16;
data[2] = (int) f >> 8;
data[3] = (int) f;
And later all I had to do is:
g = (float)((data[0] << 24) | (data[1] << 16) | (data[2] << 8) | (data[3]) );
But testing this shows me that the lines like data[0] = (int) f >> 24; returns always 0. What is wrong with my code and how may I do this correctly (i.e. break a float inner data in 4 char bytes and rebuild the same float later)?
EDIT:
I was able to accomplish this with the following codes:
float f = 0.1;
unsigned char *pc;
pc = (unsigned char*)&f;
// 0.6 in float
pc[0] = 0x9A;
pc[1] = 0x99;
pc[2] = 0x19;
pc[3] = 0x3F;
std::cout << f << std::endl; // will print 0.6
and
*(unsigned int*)&f = (0x3F << 24) | (0x19 << 16) | (0x99 << 8) | (0x9A << 0);
I know memcpy() is a "cleaner" way of doing it, but this way I think the performance is somewhat better.
You can do it like this:
char data[sizeof(float)];
float f = 0.6f;
memcpy(data, &f, sizeof f); // send data
float g;
memcpy(&g, data, sizeof g); // receive data
In order for this to work, both machines need to use the same floating point representations.
As was rightly pointed out in the comments, you don't necessarily need to do the extra memcpy; instead, you can treat f directly as an array of characters (of any signedness). You still have to do memcpy on the receiving side, though, since you may not treat an arbitrary array of characters as a float! Example:
unsigned char const * const p = (unsigned char const *)&f;
for (size_t i = 0; i != sizeof f; ++i)
{
printf("Byte %zu is %02X\n", i, p[i]);
send_over_network(p[i]);
}
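If both ends use IEEE 754 single precision but may differ in byte order, one option is to fix the byte order on the wire. The following is only a sketch under that assumption (32-bit float on both machines); the function names float_to_be_bytes and be_bytes_to_float are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumes float is a 32-bit IEEE 754 single on both ends. */

/* Store f into out[0..3], most significant byte first (big endian). */
void float_to_be_bytes(float f, unsigned char out[4])
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* copy the object representation */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

/* Rebuild the float from four big-endian bytes. */
float be_bytes_to_float(const unsigned char in[4])
{
    uint32_t bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
                  | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the shifts work on the integer value rather than on memory layout, the sender and the receiver produce and expect the same byte sequence regardless of their own endianness.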
In standard C it is guaranteed that any type can be accessed as an array of bytes.
A straight way to do this is, of course, by using unions:
#include <stdio.h>
int main(void)
{
float x = 0x1.0p-3; /* 2^(-3) in hexa */
union float_bytes {
float val;
unsigned char bytes[sizeof(float)];
} data;
data.val = x;
for (size_t i = 0; i < sizeof(float); i++)
printf("Byte %zu: %.2x\n", i, data.bytes[i]);
data.val *= 2; /* Doing something with the float value */
x = data.val; /* Retrieving the float value */
printf("%.4f\n", data.val);
getchar();
}
As you can see, it is not necessary at all to use memcpy or pointers...
The union approach is easy to understand, standard and fast.
EDIT.
I will explain why this approach is valid in C (C99).
[5.2.4.2.1(1)] A byte has CHAR_BIT bits (an integer constant >= 8; in almost all cases it is 8).
[6.2.6.1(3)] The unsigned char type uses all its bits to represent the value of the object, which is a nonnegative integer, in a pure binary representation. This means that there are no padding bits or bits used for any other strange purpose. (The same thing is not guaranteed for the signed char or char types.)
[6.2.6.1(2)] Every non-bitfield type is represented in memory as a contiguous sequence of bytes.
[6.2.6.1(4)] (Cited) "Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); [...]"
[6.7.2.1(14)] A pointer to a union object, suitably converted, points to each of its members. (Thus, there are no padding bytes at the beginning of a union.)
[6.5(7)] The content of an object can be accessed by a character type:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
— a character type
More information:
A discussion in google groups
Type-punning
EDIT 2
Another detail of the standard C99:
[6.5.2.3(3) footnote 82] Type-punning is allowed:
If the member used to access the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.
The C language guarantees that any value of any type¹ can be accessed as an array of bytes. The type of bytes is unsigned char. Here's a low-level way of copying a float to an array of bytes. sizeof(f) is the number of bytes used to store the value of the variable f; you can also use sizeof(float) (you can either pass sizeof a variable or more complex expression, or its type).
float f = 0.6;
unsigned char data[sizeof(float)];
size_t i;
for (i = 0; i < sizeof(float); i++) {
data[i] = ((unsigned char*)&f)[i];
}
The functions memcpy or memmove do exactly that (or an optimized version thereof).
float f = 0.6;
unsigned char data[sizeof(float)];
memcpy(data, &f, sizeof(f));
You don't even need to make this copy, though. You can directly pass a pointer to the float to your write-to-USB function, and tell it how many bytes to copy (sizeof(f)). You'll need an explicit cast if the function takes a pointer argument other than void*.
int write_to_usb(unsigned char *ptr, size_t size);
result = write_to_usb((unsigned char*)&f, sizeof(f));
Note that this will work only if the device uses the same representation of floating point numbers, which is common but not universal. Most machines use the IEEE floating point formats, but you may need to switch endianness.
As for what is wrong with your attempt: the >> operator operates on integers. In the expression (int) f >> 24, the cast binds more tightly than the shift, so f is first converted to an int and the result is then shifted (without the cast, f >> 24 would not compile at all, because the operands of >> must have integer type). Converting a floating point value to an integer discards the fractional part, i.e. it truncates towards 0. 0.6 truncated to an integer is 0, so all four elements of data are 0.
You need to act on the bytes of the float object, not on its value.
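The difference between the value and the bytes can be made concrete with a small sketch. It assumes sizeof(float) == sizeof(uint32_t) and an IEEE 754 single; the helper names are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Value conversion: the fractional part is discarded, so 0.6f becomes 0. */
int float_value_as_int(float f)
{
    return (int)f;
}

/* Representation: the same 32 bits, viewed as an unsigned integer.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform float_bits(0.6f) is 0x3F19999A, which is the same pattern as the bytes 9A 99 19 3F written in the question's edit. Shifting that integer, rather than (int) f, yields the individual bytes.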
¹ Excluding functions which can't really be manipulated in C, but including function pointers which functions decay to automatically.
Assuming that both devices have the same notion of how floats are represented then why not just do a memcpy. i.e
unsigned char payload[4];
memcpy(payload, &f, 4);
The safest way to do this, if you control both sides, is to send some sort of standardized representation such as text. This isn't the most efficient, but it isn't too bad for small numbers.
hostPort writes char * "34.56\0" byte by byte
client reads char * "34.56\0"
then converts to float with library function atof or atof_l.
of course that isn't the most optimized, but it sure will be easy to debug.
if you wanted to get more optimized and creative, first byte is length then the exponent, then each byte represents 2 decimal places... so
34.56 becomes char array[] = {4,-2,34,56}; something like that would be portable... I would just try not to pass binary float representations around... because it can get messy fast.
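The text-based exchange described above could look roughly like this. It is a sketch, not a protocol definition: the helper names are hypothetical, the default "C" locale is assumed, and strtof is used in place of atof so that parse failures can be detected.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write f as decimal text into buf; returns the length snprintf reports
   (excluding the terminating '\0'), or a negative value on error.
   "%.9g" prints enough digits to round-trip an IEEE 754 single. */
int float_to_text(float f, char *buf, size_t bufsize)
{
    return snprintf(buf, bufsize, "%.9g", f);
}

/* Parse the text back into a float; returns 0 on success, -1 if nothing
   could be parsed. */
int text_to_float(const char *text, float *out)
{
    char *end;
    float value = strtof(text, &end);
    if (end == text)
        return -1;
    *out = value;
    return 0;
}
```

The sender transmits the characters of buf including the terminating '\0', and the receiver passes the collected string to text_to_float.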
It might be safer to union the float and char array. Put in the float member, pull out the 4 (or whatever the length is) bytes.
Related
So I'm working with system calls in Linux. I'm using "lseek" to navigate through the file and "read" to read. I'm also using Midnight Commander to see the file in hexadecimal. The next 4 bytes I have to read are in little-endian , and look like this : "2A 00 00 00". But of course, the bytes can be something like "2A 5F B3 00". I have to convert those bytes to an integer. How do I approach this? My initial thought was to read them into a vector of 4 chars, and then to build my integer from there, but I don't know how. Any ideas?
Let me give you an example of what I've tried. I have the following bytes in file "44 00". I have to convert that into the value 68 (4 + 4*16):
char value[2];
read(fd, value, 2);
int i = (value[0] << 8) | value[1];
The variable i is 17480 instead of 68.
UPDATE: Never mind, I solved it. I mixed up the indexes when shifting. It should have been value[1] << 8 ... | value[0]
General considerations
There seem to be several pieces to the question -- at least how to read the data, what data type to use to hold the intermediate result, and how to perform the conversion. If indeed you are assuming that the on-file representation consists of the bytes of a 32-bit integer in little-endian order, with all bits significant, then I probably would not use a char[] as the intermediate, but rather a uint32_t or an int32_t. If you know or assume that the endianness of the data is the same as the machine's native endianness, then you don't need any further conversion.
Determining native endianness
If you need to compute the host machine's native endianness, then this will do it:
static const uint32_t test = 1;
_Bool host_is_little_endian = *(char *)&test;
It is worthwhile doing that, because it may well be the case that you don't need to do any conversion at all.
Reading the data
I would read the data into a uint32_t (or possibly an int32_t), not into a char array. Possibly I would read it into an array of uint8_t.
uint32_t data;
int num_read = fread(&data, 4, 1, my_file);
if (num_read != 1) { /* ... handle error ... */ }
Converting the data
It is worthwhile knowing whether the on-file representation matches the host's endianness, because if it does, you don't need to do any transformation (that is, you're done at this point in that case). If you do need to swap endianness, however, note that ntohl() and htonl() will not do it here: they convert between host order and big-endian network order, so on a big-endian host they leave the value unchanged. Reverse the bytes explicitly instead:
if (!host_is_little_endian) {
    data = (data >> 24) | ((data >> 8) & 0x0000FF00u)
         | ((data << 8) & 0x00FF0000u) | (data << 24);
}
(This assumes that little- and big-endian are the only host byte orders you need to be concerned with. Historically, there have been others, but you are extremely unlikely ever to see one of them.)
Signed integers
If you need a signed instead of unsigned integer, then you can do the same, but use a union:
union {
    uint32_t u;
    int32_t s;
} data;
In all of the preceding, use data.u in place of plain data, and at the end, read out the signed result from data.s. (The members cannot be named unsigned and signed, because those are keywords.)
Suppose you point into your buffer:
unsigned char *p = &buf[20];
and you want to see the next 4 bytes as an integer and assign them to your integer, then you can cast it:
int i;
i = *(int *)p;
You just said that p is now a pointer to an int, you de-referenced that pointer and assigned it to i.
However, this depends on the endianness of your platform. If your platform has a different endianness, you may first have to reverse-copy the bytes to a small buffer and then use this technique. For example:
unsigned char ibuf[4];
for (i=3; i>=0; i--) ibuf[i]= *p++;
i = *(int *)ibuf;
EDIT
The suggestions and comments of Andrew Henle and Bodo could give:
unsigned char *p = &buf[20];
int i, j;
unsigned char *pi = (unsigned char *)&i;
for (j=3; j>=0; j--) *pi++ = *p++;
// and the other endian:
int i, j;
unsigned char *pi = (unsigned char *)&i + 3;
for (j=3; j>=0; j--) *pi-- = *p++;
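An alternative that does not depend on the host's byte order at all is to assemble the value arithmetically from the bytes. This is a minimal sketch for the little-endian file data described in the question; read_le16 and read_le32 are names chosen here for illustration.

```c
#include <stdint.h>

/* Assemble a 16-bit value from two bytes stored least significant first. */
uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assemble a 32-bit value from four bytes stored least significant first.
   The casts keep the shifts in unsigned 32-bit arithmetic. */
uint32_t read_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Reading into an unsigned char buffer matters: with plain char, a byte such as 0xB3 may be negative and sign-extend when promoted. If a signed result is needed, convert the returned value to int32_t afterwards.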
I have a requirement where I need to read the 4 raw bytes of the single precision IEEE754 floating point representation as to send on the serial port as it is without any modification. I just wanted to ask what is the correct way of extracting the bytes among the following:
1.) creating a union such as:
typedef union {
float f;
uint8_t bytes[4];
struct {
uint32_t mantissa : 23;
uint32_t exponent : 8;
uint32_t sign : 1;
};
} FloatingPointIEEE754_t ;
and then just reading the bytes[] array after writing to the float variable f?
2.) Or, extracting bytes by a function in which a uint32_t type pointer is made to point to the float variable and then the bytes are extracted via masking
uint32_t extractBitsFloat(float numToExtFrom, uint8_t numOfBits, uint8_t bitPosStartLSB){
uint32_t *p = &numToExtFrom;
/* validate the inputs */
if ((numOfBits > 32) || (bitPosStartLSB > 31)) return NULL;
/* build the mask */
uint32_t mask = ((1 << numOfBits) - 1) << bitPosStartLSB;
return ((*p & mask) >> bitPosStartLSB);
}
where calling will be made like:
valF = -4.235;
byte0 = extractBitsFloat(valF, 8, 0);
byte1 = extractBitsFloat(valF, 8, 8);
byte2 = extractBitsFloat(valF, 8, 16);
byte3 = extractBitsFloat(valF, 8, 24);
Please suggest me the correct way if you think both the above-mentioned methods are wrong!
First of all, I assume you're coding specifically for a platform where float actually is represented as an IEEE754 single. You can't take this for granted in general, so your code won't be portable to all platforms.
Then, the union approach is the correct one. But don't add this bitfield member! There's no guarantee how the bits will be arranged, so you might access the wrong bits. Just do this:
typedef union {
float f;
uint8_t bytes[4];
} FloatingPointIEEE754;
Also, don't add a _t suffix to your own types. On POSIX systems, this is reserved to the implementation, so it's best to always avoid it.
Instead of using a union, accessing the bytes through a char pointer is fine as well:
unsigned char *rep = (unsigned char *)&f;
// access rep[0] to rep[3]
Note in both cases, you are accessing the representation in memory, this means you have to pay attention to the endianness of your machine.
Your second option isn't correct: it violates the strict aliasing rule. In short, you're not allowed to access an object through a pointer that doesn't have a compatible type; a char pointer is an explicit exception for accessing the representation. The exact rules are written in 6.5 p7 of N1570, the latest draft of the C11 standard.
You can do:
unsigned char *p = (unsigned char *)&the_float;
and then read 4 bytes from where p is pointing (e.g. p[0], p[1], etc.). The exact best code to "read 4 bytes" depends on what form the serial port function accepts data in.
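As an illustration of the byte-at-a-time case, here is a sketch that pushes the bytes of a float through a caller-supplied output routine. The put_byte_fn type and the byte_sink helper are hypothetical stand-ins for whatever the real serial driver provides.

```c
#include <stddef.h>

/* Hypothetical byte-at-a-time output routine, e.g. a UART "put char". */
typedef void (*put_byte_fn)(unsigned char byte, void *ctx);

/* Send the in-memory bytes of f, lowest address first (native byte order). */
void send_float_bytes(float f, put_byte_fn put, void *ctx)
{
    const unsigned char *p = (const unsigned char *)&f;
    for (size_t i = 0; i < sizeof f; ++i)
        put(p[i], ctx);
}

/* Example sink for trying this out: collects bytes in memory. */
struct byte_sink {
    unsigned char data[16];
    size_t len;
};

void byte_sink_put(unsigned char byte, void *ctx)
{
    struct byte_sink *sink = ctx;
    if (sink->len < sizeof sink->data)
        sink->data[sink->len++] = byte;
}
```

If the driver instead accepts a buffer and a length, the loop is unnecessary and the pointer p can be handed over together with sizeof f.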
If you do not care about endianness, just alias a character pointer to the address of a float. The standard explicitly allows using a character pointer to access the bytes of the representation of any type. If you need a specific endianness to send the bytes on the serial port, you can test for it before sending:
Simple way, just use native endianness:
float f;
...
char * bytes = (char *) &f; // bytes points to the beginning of a char array of size sizeof(f)
Automatically test for endianness and use big endian (AKA network order). The struct is just a trick to return an array and have thread safe code.
#include <string.h>
struct float_bytes {
    char bytes[sizeof(float)];
};
struct float_bytes float_to_bytes(float f) {
    float end = 1.f;
    struct float_bytes resul;
    const char *src = (const char *) &f;
    if (*(const char *) &end == 0) { // first byte of 1.0f is 0 on a little endian platform, else 0x3f
        size_t i = sizeof(f);        // little endian: reverse the bytes
        while (i > 0) {
            resul.bytes[--i] = *src++;
        }
    }
    else { // already in big endian order, just memcpy
        memcpy(resul.bytes, &f, sizeof(f));
    }
    return resul;
}
Beware: the test for endianness will only make sense if floating point is IEEE754 single.
Is this a safe way to convert array to number?
// 23 FD 15 94 -> 603788692
char number[4] = {0x94, 0x15, 0xFD, 0x23};
uint32_t* n = (uint32_t*)number;
printf("number is %lu", *n);
MORE INFO
I'm using that in a embedded device with LSB architecture, does not need to be portable.
I'm currently using shifting, but if this code is safe i prefer it.
No. You're only allowed to access something as an integer if it is an integer.
But here's how you can manipulate the binary representation of an object by simply turning the logic around:
uint32_t n;
unsigned char * p = (unsigned char *)&n;
assert(sizeof n == 4); // assumes CHAR_BIT == 8
p[0] = 0x94; p[1] = 0x15; p[2] = 0xFD; p[3] = 0x23;
The moral: You can treat every object as a sequence of bytes, but you can't treat an arbitrary sequence of bytes as any particular object.
Moreover, the binary representation of a type is very much platform dependent, so there's no telling what actual integer value you get out from this. If you just want to synthesize an integral value from its base-256 digits, use normal maths:
uint32_t n = 0x94 + (0x15 * 0x100) + (0xFD * 0x10000) + (0x23 * 0x1000000);
This is completely platform-independent and expresses what you want purely in terms of values, not representations. Leave it to your compiler to produce a machine representation of the code.
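If the goal is specifically to reinterpret four bytes in the host's own byte order, as the original cast attempted, memcpy is a well-defined way to do it. A minimal sketch, with load_u32_native being a name made up here:

```c
#include <stdint.h>
#include <string.h>

/* Copy four bytes into a uint32_t using the host's own byte order.
   Unlike *(uint32_t *)bytes, this has no aliasing or alignment problem. */
uint32_t load_u32_native(const unsigned char bytes[4])
{
    uint32_t n;
    memcpy(&n, bytes, sizeof n);
    return n;
}
```

On a little-endian target the bytes 94 15 FD 23 give 603788692 (0x23FD1594); on a big-endian one the same bytes give 0x9415FD23, which is why the arithmetic form is preferable whenever portability matters.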
No, it is not safe.
This violates the C aliasing rules, which say that an object can only be accessed through its own type, its signed / unsigned variant, or a character type. It can also invoke undefined behavior by breaking alignment.
A safe solution to get a uint32_t value from the array is to use bitwise operators (<< and |) on the char values to form a uint32_t.
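You're better off with something like this (more portable):
uint32_t n = ((uint32_t)c[3]<<24)|((uint32_t)c[2]<<16)|((uint32_t)c[1]<<8)|c[0];
where c is an unsigned char array. The casts keep c[3]<<24 from overflowing a signed int when the top byte is 0x80 or larger.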
As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32 bit int:
#include <stdint.h>
void
pack32(uint32_t val,uint8_t *dest)
{
dest[0] = (val & 0xff000000) >> 24;
dest[1] = (val & 0x00ff0000) >> 16;
dest[2] = (val & 0x0000ff00) >> 8;
dest[3] = (val & 0x000000ff) ;
}
uint32_t
unpack32(uint8_t *src)
{
uint32_t val;
val = (uint32_t)src[0] << 24;
val |= src[1] << 16;
val |= src[2] << 8;
val |= src[3] ;
return val;
}
An unsigned char generally has a size of 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4 byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte to the file.
The following example should write a number (of any data type) to the file. I am not sure if it works, since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
    // open file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if(f == NULL)
        return 1;
    // write size bytes, starting at foo, to the file
    fwrite(foo, sizeof(char), size, f);
    fclose(f);
    return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte only allows up to 8 distinct bits, so that is 2^8 distinct numbers, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(This isn't officially C because of the order of declaration but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
int data[] = { 1, 2, 3, 4 ,5 };
size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
unsigned char* out = (unsigned char*)data;
for(size_t loop =0; loop < (size * sizeof(int)); ++loop)
{
MyProfSuperWrite(out + loop); // Write 1 unsigned char
}
}
Now people have mentioned that 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space and not write out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the value and process the incoming data is not worth the savings you would get (maybe if the data was the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space but it will cost in processing time and maintenance costs.
The part of the assignment that says "integers whose values can be up to 4095 using this function (that only takes unsigned chars)" should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16 bit short, but that is somewhat wasteful of space -- you are only using 12 of 16 bits of the short. Since you are dealing with more than 1 byte in the conversion of characters, you may need to deal with the endianness of the result. Easiest.
You could also do a bit field or some packed binary structure if you are concerned about space. More work.
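The "two bytes per value" option can be sketched as follows. It assumes every value is in the range 0..4095 and fixes the byte order as high byte first so that the result does not depend on the host; the function names are invented for this example.

```c
#include <stddef.h>

/* Split each value (assumed to be in 0..4095) into two bytes, high byte
   first. out must have room for 2 * count bytes. Returns bytes written. */
size_t pack_u16_be(const int *values, size_t count, unsigned char *out)
{
    for (size_t i = 0; i < count; ++i) {
        unsigned v = (unsigned)values[i];
        out[2 * i]     = (unsigned char)((v >> 8) & 0xFFu);
        out[2 * i + 1] = (unsigned char)(v & 0xFFu);
    }
    return 2 * count;
}

/* Inverse: rebuild count values from 2 * count bytes. */
void unpack_u16_be(const unsigned char *in, size_t count, int *values)
{
    for (size_t i = 0; i < count; ++i)
        values[i] = (in[2 * i] << 8) | in[2 * i + 1];
}
```

The packed buffer can then be handed to the function that only accepts unsigned char pointers. Packing two 12-bit values into three bytes would save a further 25% at the cost of more bookkeeping.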
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
sprintf (num, "%d", array[i]);
// Call your function that expects a pointer to chars
printfunc (num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
I'm trying to get the numerical (double) value from a byte array of 16 elements, as follows:
unsigned char input[16];
double output;
...
double a = input[0];
distance = a;
for (i=1;i<16;i++){
a = input[i] << 8*i;
output += a;
}
but it does not work.
It seems that the temporary variable that contains the result of the left-shift can store only 32 bits, because after 4 shift operations of 8 bits it overflows.
I know that I can use something like
a = input[i] * pow(2,8*i);
but, for curiosity, I was wondering if there's any solution to this problem using the shift operator...
Edit: this won't work (see comment) without something like __int128.
a = input[i] << 8*i;
The expression input[i] is promoted to int (6.3.1.1) , which is 32bit on your machine. To overcome this issue, the lefthand operand has to be 64bit, like in
a = (1ULL * input[i]) << 8*i;
or
a = (long long unsigned) input[i] << 8*i;
and remember about endianness
The problem here is that indeed the 32 bit variables cannot be shifted more than 4*8 times, i.e. your code works for 4 char's only.
What you could do is find the first significant char, and use Horner's rule: $a_n x^n + a_{n-1} x^{n-1} + \dots + a_0 = ((\dots(a_n x + a_{n-1})x + a_{n-2})x + \dots)x + a_0$, as follows:
unsigned char coefficients[16] = { 0, 0, ..., 14, 15 };
double result = 0.;
for (int exponent = 15; exponent >= 0; --exponent) {
    result *= 256.; // instead of <<8.
    result += coefficients[ exponent ];
}
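Wrapped into a function, the same idea might look like the sketch below. It treats the array as an unsigned little-endian integer, which is what the question's shifts by 8*i imply; the function name is hypothetical.

```c
#include <stddef.h>

/* Interpret bytes[0..n-1] as an unsigned little-endian integer
   (bytes[0] is the least significant) and return its value as a double.
   Values above 2^53 are rounded, since a double has a 53-bit significand. */
double le_bytes_to_double(const unsigned char *bytes, size_t n)
{
    double result = 0.0;
    for (size_t i = n; i > 0; --i)
        result = result * 256.0 + bytes[i - 1];
    return result;
}
```

Because every step is a floating point multiply and add, no intermediate integer wider than a byte is needed, so the 32-bit shift limit never comes into play.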
In short, No, you can't convert a sequence of bytes directly into a double by bit-shifting as shown by your code sample.
byte, an integer type, and double, a floating point type (i.e. not an integer type), are not bitwise compatible (i.e. you can't just bit-shift the values of a bunch of bytes into a floating point type and expect an equivalent result).
1) Assuming the byte array is a memory buffer referencing an integer value, you should be able to convert your byte array into a 128-bit integer via bit-shifting and then convert that resulting integer into a double. Don't forget that endian-issues may come into play depending on the CPU architecture.
2) Assuming the byte array is a memory buffer that contains a 128-bit long double value, and assuming there are no endian issues, you should be able to memcpy the value from the byte array into the long double value
union doubleOrByte {
BYTE buffer[16];
long double val;
} dOrb;
dOrb.val = 3.14159267;
long double newval = 0.0;
memcpy((void*)&newval, (void*)dOrb.buffer, sizeof(dOrb.buffer));
Why not simply cast the array to a double pointer?
unsigned char input[16];
double* pd = (double*)input;
for (int i=0; i<sizeof(input)/sizeof(double); ++i)
cout << pd[i];
if you need to fix endian-ness, reverse the char array using the STL reverse() before casting to a double array.
Have you tried std::atof:
http://www.cplusplus.com/reference/clibrary/cstdlib/atof/
Are you trying to convert a string representation of a number to a real number? In that case, the C-standard atof is your best friend.
Well based off of operator precedence the right hand side of
a = input[i] << 8*i;
gets evaluated before it gets converted to a double, so you are shifting input[i] by 8*i bits, which stores its result in a 32 bit temporary variable and thus overflows. You can try the following:
a = (long long unsigned int)input[i] << 8*i;
Edit: Not sure what the size of a double is on your system, but on mine it is 8 bytes, if this is the case for you as well the second half of your input array will never be seen as the shift will overflow even the double type.