Changing endianness on a 3-byte integer - C

I am receiving a 3-byte integer, which I'm storing in an array. For now, assume the array is unsigned char myarray[3].
Normally, I would convert this into a standard int using:
int mynum = ((myarray[2] << 16) | (myarray[1] << 8) | (myarray[0]));
However, before I can do this, I need to convert the data from network to host byte ordering.
So, I change the above to (it comes in 0-1-2, but it's n to h, so 0-2-1 is what I want):
int mynum = ((myarray[1] << 16) | (myarray[2] << 8) | (myarray[0]));
However, this does not seem to work. For the life of me, I can't figure this out. I've looked at it so much that at this point I think I'm fried and just confusing myself. Is what I am doing correct? Is there a better way? Would the following work?
int mynum = ((myarray[2] << 16) | (myarray[1] << 8) | (myarray[0]));
int correctnum = ntohl(mynum);

Here's an alternative idea: make it structured, so that what you're doing is explicit. Some of the confusion you're having may be rooted in the "I'm storing in an array" premise. If, instead, you defined
typedef struct {
    u8 highByte;
    u8 midByte;
    u8 lowByte;
} ThreeByteInt;
To turn it into an int, you just do
u32 ThreeByteTo32(ThreeByteInt *bytes) {
    return (bytes->highByte << 16) + (bytes->midByte << 8) + (bytes->lowByte);
}

If you receive the value in network order (that is, big-endian), you have this situation:
myarray[0] = most significant byte
myarray[1] = middle byte
myarray[2] = least significant byte
so this should work:
int result = (((int) myarray[0]) << 16) | (((int) myarray[1]) << 8) | ((int) myarray[2]);

Besides using structures/unions with byte-sized members, you have two other ways:
Using ntohl/htonl and masking out the high byte of the 4-byte integer, before or after the conversion, with a bitwise AND.
Doing the bit-shift operations shown in the other answers.
In any case, don't rely on implicit conversions when shifting wider than the operand's type. A shift by 16 is wider than an unsigned char; the expression only works because each byte is promoted to int before the shift. Making the cast explicit documents that and keeps the code correct on any compiler/platform:
int result = (((int) myarray[0]) << 16) | (((int) myarray[1]) << 8) | ((int) myarray[2]);

Why not just receive into a 4-byte buffer? After that you can use ntohl(), which is just a byte-swap instruction on most architectures. At some optimization levels it will be faster than plain shifts and ORs.
union
{
    uint32_t val;
    unsigned char myarray[4];
} data;
memcpy(&data, buffer, 3);
data.myarray[3] = 0;
data.val = ntohl(data.val) >> 8; // the payload occupies the top 3 bytes of the big-endian value, so shift it down
Or, if you copy the payload into the bottom 3 bytes and zero the first one, ntohl() alone is enough:
memcpy(&data.myarray[1], buffer, 3);
data.myarray[0] = 0;
data.val = ntohl(data.val);

unsigned char myarray[3] = { 1, 2, 3 };
int mynum = (myarray[0] << 0) | (myarray[1] << 8) | (myarray[2] << 16);
printf("%x\n", mynum);
That prints 30201, which I think is what you want. The key is to realize that the shifts operate on values, not on memory, so this expression gives the same result on every platform; the only ordering that matters is which received byte is the most significant one. You can't easily use ntohl() here because with only 3 bytes you don't know where to put the extra zero byte.

Related

Arduino - Convert long to byte and back to long

I took this example from the following page. I am trying to convert a long into a 4-byte array. This is the original code from the page.
long n;
byte buf[4];
buf[0] = (byte) n;
buf[1] = (byte) n >> 8;
buf[2] = (byte) n >> 16;
buf[3] = (byte) n >> 24;
long value = (unsigned long)(buf[4] << 24) | (buf[3] << 16) | (buf[2] << 8) | buf[1];
I modified the code replacing
long value = (unsigned long)(buf[4] << 24) | (buf[3] << 16) | (buf[2] << 8) | buf[1];
for
long value = (unsigned long)(buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0];
I tried the original code where n is 15000, and value would return 0. After modifying the line in question (I think there was an error in the indexes in the original post?), value returns 152.
The objective is to have value return the same number as n. Also, n can be negative, so value should also return the same negative number.
Not sure what I am doing wrong. Thanks!
You were correct that the indices were wrong. A 4-byte array indexes from 0 to 3, not 1 to 4.
The rest of the issues were because you were using a signed long type. Doing bit manipulations on signed data types is not well defined, since it assumes something about how signed integers are stored (two's complement on most systems, although I don't think any standard requires it).
e.g. see here
You're then assigning between signed 'longs' and unsigned 'bytes'.
Someone else has posted an answer (possibly abusing casts) that I'm sure works. But without any explanation I feel it doesn't help much.

Can you reverse the byte order of a double and store the result back in a double in C?

I'm currently writing a binary application protocol in C that sends uint32_t and doubles across the network using sockets. When writing uint32_t's, I always use the htonl() functions to convert the host byte order to network byte order. However, such a function does not exist for doubles. So I ended up writing a new function which checks the host endianness and reverses the byte order of the double if necessary.
double netHostToNetFloat64(double input)
{
typedef union DoubleData
{
double d;
char c[8];
} DoubleData;
DoubleData input_data, output_data;
if(netHostIsBigEndian())
{
return input;
}
else
{
input_data.d = input;
output_data.c[0] = input_data.c[7];
output_data.c[1] = input_data.c[6];
output_data.c[2] = input_data.c[5];
output_data.c[3] = input_data.c[4];
output_data.c[4] = input_data.c[3];
output_data.c[5] = input_data.c[2];
output_data.c[6] = input_data.c[1];
output_data.c[7] = input_data.c[0];
return output_data.d;
}
}
However, I'm getting some weird results when the network peer reads the double. I'm confident that my function reverses the byte order; however, I'm curious whether storing an invalid double value (i.e. one that comes out as Inf or NaN) corrupts the output the next time it is read.
Also, I realize that reversing the byte order and placing the result back into a double is stupid, however, I wanted to keep it consistent with the htonl/htons/ntohl/ntohs functions.
I would use a different way of reversing the value.
uint64_t reversebytes(double d)
{
uint64_t u;
memcpy(&u, &d, sizeof(u));
u = (u >> 56) | (u << 56) |
((u & 0x00ff000000000000) >> 40) | ((u & 0x000000000000ff00) << 40) |
((u & 0x0000ff0000000000) >> 24) | ((u & 0x0000000000ff0000) << 24) |
((u & 0x000000ff00000000) >> 8) | ((u & 0x00000000ff000000) << 8) ;
return u;
}
This code is very well optimized by the compilers: https://godbolt.org/z/qn6ar3hvM
Your function is much more difficult for the compiler to optimize: https://godbolt.org/z/M4shPoEnE
A double is eight bytes, so you need to reverse all eight bytes of the value to send it on the network. Normally, when you send an IEEE-754 value, the protocol specification indicates which byte must be sent first (normally the most significant byte of the double, which holds the sign and the top bits of the exponent), but it is quite common (on Intel architectures) for that byte to be stored last (little-endian again), so you must reverse the byte order before transmitting. I'm not sure what other architectures do, but they probably do something similar. You could use this:
uint64_t llhton(void *p)
{
    uint64_t data;
    memcpy(&data, p, sizeof data); /* avoids the strict-aliasing problem of *(uint64_t *)p */
    data = (data & 0xffffffff00000000ULL) >> 32 | (data & 0x00000000ffffffffULL) << 32;
    data = (data & 0xffff0000ffff0000ULL) >> 16 | (data & 0x0000ffff0000ffffULL) << 16;
    data = (data & 0xff00ff00ff00ff00ULL) >> 8  | (data & 0x00ff00ff00ff00ffULL) << 8;
    return data;
}
if you have:
double value;
you can serialize it for output:
uint64_t data_in_network_order = llhton(&value);
While data_in_network_order is in network order, it has no meaning on this architecture, so I used a 64-bit unsigned integer for convenience only; it should be treated as opaque data.

How to combine 4 bytes and do math calculations in C on an AVR

I have an ATmega1281 microcontroller using C. I have a routine to pull 4 bytes off of a CAN bus to read an SPN. I am able to get the 4 bytes, but I cannot print the 4-byte number because it is truncating the first 2 bytes, making it a 16-bit number. I have tried using unsigned long as the declaration with no success. What is the trick when using 32-bit numbers with an AVR?
unsigned long engine_hours_raw;
float engine_hours_uint;
engine_hours_raw = (OneMessage.Msg.Data[3] << 24) | (OneMessage.Msg.Data[2] << 16) | (OneMessage.Msg.Data[1] << 8) | OneMessage.Msg.Data[0];
engine_hours_uint = engine_hours_raw * 0.05;
ftoa(engine_hours_uint, engHours, 1);
UART1_Printf("Engine Hours: %s ", engHours);
(OneMessage.Msg.Data[3] << 24) will be 0, because the expression is evaluated as an int (which is only 16 bits on the AVR) unless it is cast.
Instead, load the data into the long variable and then perform the shifts:
engine_hours_raw = OneMessage.Msg.Data[3];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[2];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[1];
engine_hours_raw <<= 8;
engine_hours_raw |= OneMessage.Msg.Data[0];
You could also cast the intermediate expressions to unsigned long or uint32_t, but that is just as messy and may take longer.
engine_hours_raw = ((uint32_t)(OneMessage.Msg.Data[3]) << 24) | ((uint32_t)(OneMessage.Msg.Data[2]) << 16) | ((uint32_t)(OneMessage.Msg.Data[1]) << 8) | (uint32_t)(OneMessage.Msg.Data[0]);
There is a much easier and more straightforward way :)
uint32_t *engine_hours_raw = (uint32_t *)OneMessage.Msg.Data;
float engine_hours_uint = *engine_hours_raw * 0.05;

Combining uint8_t, uint16_t and uint8_t

I have three values, a uint8_t, a uint16_t and a uint8_t, in that order. I am trying to combine them into one uint32_t without losing the order. I found this question here, but I got stuck with the uint16_t value in the middle.
For example:
uint8_t v1=0x01;
uint16_t v2=0x1001;
uint8_t v3=0x11;
uint32_t comb = 0x01100111;
I was thinking about splitting v2 into two separate uint8_t values, but realized there might be an easier way to solve it.
My try:
v2 = 0x1001;
a = v2 & 0xFF;
b = v1 >> 8;
first = ((uint16_t)v1 << 8) | a;
end = ((uint16_t)b << 8) | v3;
comb = ((uint32_t)first << 16) | end;
Here is the transformation you implied, written as a one-liner:
uint32_t comb = ((uint32_t)v1 << 24) | (((uint32_t)v2 << 8) | v3);
Basically, the 8 + 16 + 8 bits build up the 32-bit type. To put the first value at the head, you cast it to 32 bits and shift by 24 (32 - 8). Then you OR in the next values, casting and shifting each one so that it lands at the right offset, with zeros everywhere else.
You use OR for the obvious reason of not losing any information.

Casting 8-bit int to 32-bit

I think I confused myself with endianness and bit-shifting, please help.
I have 4 8-bit ints which I want to convert to a 32-bit int. This is what I am doing:
uint h;
t_uint8 ff[4] = {1,2,3,4};
if (BIG_ENDIAN) {
h = ((int)ff[0] << 24) | ((int)ff[1] << 16) | ((int)ff[2] << 8) | ((int)ff[3]);
}
else {
h = ((int)ff[0] >> 24) | ((int)ff[1] >> 16) | ((int)ff[2] >> 8) | ((int)ff[3]);
}
However, this seems to produce a wrong result. With a little experimentation I realised that it should be the other way round: in the case of big-endian I am supposed to shift bits to the right, and otherwise to the left. However, I don't understand WHY.
This is how I understand it. Big-endian means most significant byte first (first means leftmost, right? Perhaps this is where I am wrong). So, converting an 8-bit int to a 32-bit int would prepend 24 zeros to my existing 8 bits. So, to make it the 1st byte I need to shift the bits 24 to the left.
Please point out where I am wrong.
You always have to shift the 8-bit values left. But in the little-endian case, you have to change the order of the indices, so that the fourth byte goes into the most significant position and the first byte into the least significant:
if (BIG_ENDIAN) {
h = ((int)ff[0] << 24) | ((int)ff[1] << 16) | ((int)ff[2] << 8) | ((int)ff[3]);
}
else {
h = ((int)ff[3] << 24) | ((int)ff[2] << 16) | ((int)ff[1] << 8) | ((int)ff[0]);
}
