Bitmask for exactly one byte in C

My goal is to save a long in four bytes like this:
unsigned char bytes[4];
unsigned long n = 123;
bytes[0] = (n >> 24) & 0xFF;
bytes[1] = (n >> 16) & 0xFF;
bytes[2] = (n >> 8) & 0xFF;
bytes[3] = n & 0xFF;
But I want the code to be portable, so I use CHAR_BIT from <limits.h>:
unsigned char bytes[4];
unsigned long n = 123;
bytes[0] = (n >> (CHAR_BIT * 3)) & 0xFF;
bytes[1] = (n >> (CHAR_BIT * 2)) & 0xFF;
bytes[2] = (n >> CHAR_BIT) & 0xFF;
bytes[3] = n & 0xFF;
The problem is that the bitmask 0xFF only accounts for eight bits, which is not necessarily equivalent to one byte. Is there a way to make the above code completely portable across all platforms?

How about something like:
unsigned long mask = 1;
mask<<=CHAR_BIT;
mask-=1;
and then using this as the mask instead of 0xFF?
Test program:
#include <stdio.h>
int main() {
#define MY_CHAR_BIT_8  8
#define MY_CHAR_BIT_9  9
#define MY_CHAR_BIT_10 10
#define MY_CHAR_BIT_11 11
#define MY_CHAR_BIT_12 12
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_8;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_9;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_10;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_11;
        mask -= 1;
        printf("%lx\n", mask);
    }
    {
        unsigned long mask = 1;
        mask <<= MY_CHAR_BIT_12;
        mask -= 1;
        printf("%lx\n", mask);
    }
}
Output:
ff
1ff
3ff
7ff
fff
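Applied to the code from the question, the whole thing might look like this (a sketch; the function name split is just for illustration, and it still assumes unsigned long is at least 4 * CHAR_BIT bits wide):
#include <limits.h>
void split(unsigned char bytes[4], unsigned long n)
{
    unsigned long mask = 1;
    mask <<= CHAR_BIT;   /* requires unsigned long to be wider than CHAR_BIT */
    mask -= 1;           /* mask now has the low CHAR_BIT bits set */
    bytes[0] = (n >> (CHAR_BIT * 3)) & mask;
    bytes[1] = (n >> (CHAR_BIT * 2)) & mask;
    bytes[2] = (n >> CHAR_BIT) & mask;
    bytes[3] = n & mask;
}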

I work almost exclusively with embedded systems, where I rather often have to provide portable code for all manner of more or less exotic systems - like writing code which has to work both on some tiny 8 bit MCU and on x86_64.
But even for me, bothering with portability to exotic obsolete DSP systems and the like is a huge waste of time. These systems barely exist in the real world - why exactly do you need portability to them? Is there any other reason than "showing off" mostly useless language lawyer knowledge of C? In my experience, 99% of all such useless portability concerns boil down to programmers "showing off", rather than an actual requirement specification.
And even if you for some strange reason do need such portability, this task doesn't make any sense to begin with, since neither char nor long is portable! If char is not 8 bits, then what makes you think long is 4 bytes? It could be 2 bytes, it could be 8 bytes, or it could be something else.
If portability is an actual concern, then you must use stdint.h. Then, if you truly must support exotic systems, you have to decide which ones. The only real-world computers I know of that actually do use different byte sizes are various obsolete, exotic TI DSPs from the 1990s, which use 16 bit bytes/char. Let's assume this is your intended target which you have decided is important to support.
Let's also assume that a standard C compiler (ISO 9899) exists for that exotic target, which is highly unlikely. (More likely you'll get a poorly conforming, mostly broken legacy C90 thing... or even more likely those who use the target write everything in assembler.) Given a standard C compiler, it will not implement uint8_t, since that is not a mandatory type if the target doesn't support it. Only uint_least8_t and uint_fast8_t are mandatory.
Then you'd go about it like this:
#include <stdint.h>
#include <limits.h>
#if CHAR_BIT == 8
static void uint32_to_uint8 (uint8_t dst[4], uint32_t u32)
{
dst[0] = (u32 >> 24) & 0xFF;
dst[1] = (u32 >> 16) & 0xFF;
dst[2] = (u32 >> 8) & 0xFF;
dst[3] = (u32 >> 0) & 0xFF;
}
#endif
// whatever other conversion functions you need:
static void uint32_to_uint16 (uint16_t dst[2], uint32_t u32){ ... }
static void uint64_to_uint16 (uint16_t dst[4], uint64_t u64){ ... }
The exotic DSP will then use the uint32_to_uint16 function. You could use the same compiler #if CHAR_BIT checks to do #define byte_to_word uint32_to_uint16 etc.
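For illustration, a sketch of how the uint32_to_uint16 body could be filled in under the same kind of guard (this is my guess at the intent, not code from the answer):
#if CHAR_BIT == 16
/* On a 16-bit-char target a uint32_t splits into two 16-bit "bytes", most significant first. */
static void uint32_to_uint16 (uint16_t dst[2], uint32_t u32)
{
    dst[0] = (u32 >> 16) & 0xFFFFu;
    dst[1] = (u32 >>  0) & 0xFFFFu;
}
#endif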
And then you should also immediately notice that endianness will be the next major portability concern. I have no idea what endianness obsolete DSPs tend to use, but that's another question.

What about:
unsigned long mask = (unsigned char)-1;
This will work because the C standard says in 6.3.1.3:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
And that unsigned long can represent all values of unsigned char.
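For instance, a tiny check (a sketch; on a machine where CHAR_BIT is 8, mask prints as ff):
#include <limits.h>
#include <stdio.h>
int main(void)
{
    unsigned long mask = (unsigned char)-1;  /* -1 converted to unsigned char yields UCHAR_MAX */
    printf("mask = 0x%lx, UCHAR_MAX = 0x%x\n", mask, (unsigned)UCHAR_MAX);
    return 0;
}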

#include <stdio.h>
#include <limits.h>
#define CHARMASK ((1UL << CHAR_BIT) - 1)
int main(void)
{
    printf("0x%lx\n", CHARMASK);
}
And the mask will always have width of the char. Calculated compile time, no additional variables needed.
Or
#define CHARMASK ((unsigned char)(~0))
You can do it without the masks as well
#include <stdio.h>
#include <limits.h>
/* Extracts the four low-order chars of n, most significant first.
   Requires unsigned int to be at least 4 * CHAR_BIT bits wide. */
void foo(unsigned int n, unsigned char *bytes)
{
    bytes[0] = ((n << (CHAR_BIT * 0)) >> (CHAR_BIT * 3));
    bytes[1] = ((n << (CHAR_BIT * 1)) >> (CHAR_BIT * 3));
    bytes[2] = ((n << (CHAR_BIT * 2)) >> (CHAR_BIT * 3));
    bytes[3] = ((n << (CHAR_BIT * 3)) >> (CHAR_BIT * 3));
}
int main(void)
{
    unsigned int z = 0xaabbccdd;
    unsigned char bytes[4];
    foo(z, bytes);
    printf("0x%02x 0x%02x 0x%02x 0x%02x\n", bytes[0], bytes[1], bytes[2], bytes[3]);
}

Related

Using bitwise operators to change all bits of the most significant byte to 1

Let's suppose I have an unsigned int x = 0x87654321. How can I use bitwise operators to change the most significant byte (the leftmost 8 bits) of this number to 1?
So, instead of 0x87654321, I would have 0xFF654321?
As an unsigned int in C may be 32 bits, 16 bits or some other size, it is best to write code without assuming the width.
The value UINT_MAX has all value bits set.
A "byte" in C is CHAR_BIT wide - usually 8.
UINT_MAX ^ (UINT_MAX >> CHAR_BIT) or ~(UINT_MAX >> CHAR_BIT) is the desired mask.
#include <limits.h>
#include <stdio.h>
#define UPPER_BYTE_MASK (UINT_MAX ^ (UINT_MAX >> CHAR_BIT))
// or
#define UPPER_BYTE_MASK (~(UINT_MAX >> CHAR_BIT))
int main() {
unsigned value = 0x87654321;
printf("%X\n", value | UPPER_BYTE_MASK);
}
#include <stdio.h>
#include <limits.h>
#define MSB1(x) ((x) | (((1ULL << CHAR_BIT) - 1) << ((sizeof(x) - 1) * CHAR_BIT)))
int main(void)
{
    /* variables initialized so the output shows just the mask bits */
    char x = 0;
    short y = 0;
    int z = 0;
    long q = 0;
    long long l = 0;
    printf("0x%llx\n", (unsigned long long)MSB1(x));
    printf("0x%llx\n", (unsigned long long)MSB1(y));
    printf("0x%llx\n", (unsigned long long)MSB1(z));
    printf("0x%llx\n", (unsigned long long)MSB1(q));
    printf("0x%llx\n", (unsigned long long)MSB1(l));
    l = MSB1(l);
}
If you know the size of the integer, you can simply use something like
x |= 0xFF000000;
If not, you'll need to calculate the mask. One way:
x |= UINT_MAX - ( UINT_MAX >> 8 );
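For instance, a quick check (a sketch; with a 32-bit unsigned int this prints FF654321):
#include <limits.h>
#include <stdio.h>
int main(void)
{
    unsigned int x = 0x87654321;
    x |= UINT_MAX - (UINT_MAX >> 8);   /* set the top 8 bits */
    printf("%X\n", x);
    return 0;
}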

Portable and Tight Bit Packing

Suppose I have four unsigned ints, {a, b, c, d}, which I want to pack with non-standard lengths, {9,5,7,11} respectively. I wish to make a network packet (unsigned char pkt[4]) that I can pack these values into and unpack them reliably on another machine using the same header file, regardless of endianness.
Everything I have read about using packed structs suggests that the bit-ordering will not be predictable so that is out of the question. So that leaves me with bit-set and bit-clear operations, but I'm not confident in how to ensure that endianness will not cause me problems. Is the following sufficient, or shall I run into problems with the endianness of a and d separately?
void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)     // 9 bits
        | (((uint32_t)b & 0x1F) << 9)          // 5 bits
        | (((uint32_t)c & 0x7F) << 14)         // 7 bits
        | (((uint32_t)d & 0x7FF) << 21);       // 11 bits
    *pkt = htonl(pkt_h);
}
void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, uint8_t *pkt){
    uint32_t pkt_h = ntohl(*pkt);
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}
If so, what other measures can I take to ensure portability?
Structs with bitfields are indeed essentially useless for this purpose, as their field order and even padding rules are not consistent.
shall I run into problems with the endianness of a and d separately?
The endianness of a and d doesn't matter, their byte-order is never used. a and d are not reinterpreted as raw bytes, only their integer values are used or assigned to, and in those cases endianness does not enter the picture.
There is another problem though: uint8_t *pkt in combination with *pkt = htonl(pkt_h); means that only the least significant byte is saved (regardless of whether it is executed by a little endian or big endian machine, because this is not a reinterpretation, it's an implicit conversion). uint8_t *pkt is OK by itself, but then the resulting group of 4 bytes must be copied into the buffer it points to; it cannot be assigned all in one go. uint32_t *pkt would enable such a single assignment to work without losing data, but that makes the function less convenient to use.
Similarly in unpack_pkt, only one byte of data is currently used.
When those issues are fixed, it should be good:
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   // htonl / ntohl (POSIX)
void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *buffer){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)     // 9 bits
        | (((uint32_t)b & 0x1F) << 9)          // 5 bits
        | (((uint32_t)c & 0x7F) << 14)         // 7 bits
        | (((uint32_t)d & 0x7FF) << 21);       // 11 bits
    uint32_t pkt = htonl(pkt_h);
    memcpy(buffer, &pkt, sizeof(uint32_t));
}
void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, uint8_t *buffer){
    uint32_t pkt;
    memcpy(&pkt, buffer, sizeof(uint32_t));
    uint32_t pkt_h = ntohl(pkt);
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}
An alternative that works without worrying about endianness at any point is manually deconstructing the uint32_t (rather than conditionally byte-swapping it with htonl and then reinterpreting it as raw bytes), for example:
void pack_pkt(uint16_t a, uint8_t b, uint8_t c, uint16_t d, uint8_t *pkt){
    uint32_t pkt_h = ((uint32_t)a & 0x1FF)     // 9 bits
        | (((uint32_t)b & 0x1F) << 9)          // 5 bits
        | (((uint32_t)c & 0x7F) << 14)         // 7 bits
        | (((uint32_t)d & 0x7FF) << 21);       // 11 bits
    // example serializing the bytes in big endian order, regardless of host endianness
    pkt[0] = pkt_h >> 24;
    pkt[1] = pkt_h >> 16;
    pkt[2] = pkt_h >> 8;
    pkt[3] = pkt_h;
}
The original approach isn't bad, this is just an alternative, something to consider. Since nothing is ever reinterpreted, endianness does not matter at all, which may increase confidence in the correctness of the code. Of course as a downside, it requires more code to get the same thing done. By the way, even though manually deconstructing the uint32_t and storing 4 separate bytes looks like a lot of work, GCC can compile it efficiently into a bswap and 32bit store. On the other hand Clang misses this opportunity and other compilers may as well, so this is not without its drawbacks.
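For symmetry, a matching unpack that rebuilds the value from the big-endian bytes without ntohl could look like this (my sketch, not part of the answer above):
void unpack_pkt(uint16_t *a, uint8_t *b, uint8_t *c, uint16_t *d, const uint8_t *pkt){
    // reassemble the big-endian byte stream, regardless of host endianness
    uint32_t pkt_h = ((uint32_t)pkt[0] << 24)
                   | ((uint32_t)pkt[1] << 16)
                   | ((uint32_t)pkt[2] << 8)
                   |  (uint32_t)pkt[3];
    (*a) = pkt_h & 0x1FF;
    (*b) = (pkt_h >> 9) & 0x1F;
    (*c) = (pkt_h >> 14) & 0x7F;
    (*d) = (pkt_h >> 21) & 0x7FF;
}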
For packing and unpacking I suggest using a struct like this.
Remember that the size of a struct can differ between machines (for example an 8 bit system and a 32 bit system may compile the same struct to different sizes because of padding), so you can use packing to make sure the struct size is the same in the transmitter and the receiver (see the packed variant sketched after the struct).
typedef struct {
    uint8_t A;
    uint8_t B;
    uint8_t C;
    uint8_t D;
} MyPacket;
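If struct layout matters (for example with mixed-size members), the packing mentioned above is usually spelled with a compiler-specific pragma; a sketch for GCC/Clang/MSVC-style #pragma pack, with a hypothetical type name:
#pragma pack(push, 1)   /* no padding between members */
typedef struct {
    uint8_t A;
    uint8_t B;
    uint8_t C;
    uint8_t D;
} MyPacketPacked;       /* hypothetical packed variant of MyPacket */
#pragma pack(pop)
With four uint8_t members there is no padding to begin with, so this matters more once the struct mixes member sizes.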
Now you can stream this struct into a byte stream such as a serial port or UART, and in the receiver you can put the bytes back together.
See the following functions:
void transmitPacket(MyPacket* packet) {
    int len = sizeof(MyPacket);
    uint8_t* pData = (uint8_t*) packet;
    while (len-- > 0) {
        // send bytes 1 by 1
        transmitByte(*pData++);
    }
}
void receivePacket(MyPacket* packet) {
    int len = sizeof(MyPacket);
    uint8_t* pData = (uint8_t*) packet;
    while (len-- > 0) {
        // receive bytes 1 by 1
        *pData++ = receiveByte();
    }
}
Remember that the bit ordering within a byte is the same everywhere, but you must check your byte ordering to be sure the packet will not be misunderstood by the receiver.
For example, if the size of your packet is 4 bytes and you send the low byte first,
you have to receive the low byte first in the receiver.
In your code you receive the packet through a uint8_t* pointer, but the actual packet is a uint32_t and is 4 bytes.

Copy low-order bytes of an integer whilst preserving endianness

I need to write a function that copies the specified number of low-order bytes of a given integer into an address in memory, whilst preserving their order.
void lo_bytes(uint8_t *dest, uint8_t no_bytes, uint32_t val)
I expect the usage to look like this:
uint8_t dest[3];
lo_bytes(dest, 3, 0x44332211);
// Big-endian: dest = 33 22 11
// Little-endian: dest = 11 22 33
I've tried to implement the function using bit-shifts, memcpy, and iterating over each byte of val with a for-loop, but all of my attempts failed to work on either one or the other endianness.
Is it possible to do this in a platform-independent way, or do I need to use #ifdefs and have a separate piece of code for each endianness?
I've tried to implement the function using bit-shifts, memcpy, and
iterating over each byte of val with a for-loop, but all of my
attempts failed to work on either one or the other endianness.
All arithmetic, including bitwise arithmetic, is defined in terms of the values of the operands, not their representations. This cannot be sufficient for you because you want to obtain a result that differs depending on details of the representation style for type uint32_t.
You can operate on object representations via various approaches, but you still need to know which bytes to operate upon. That calls for some form of detection. If big-endian and little-endian are the only byte orders you're concerned with supporting, then I favor an approach similar to that given in #P__J__'s answer:
#include <string.h>
void lo_bytes(uint8_t *dest, uint8_t no_bytes, uint32_t val) {
    static const union { uint32_t i; uint8_t a[4]; } ubytes = { 1 };
    memcpy(dest, (const uint8_t *)&val + (1 - ubytes.a[0]) * (4 - no_bytes), no_bytes);
}
The expression (1 - ubytes.a[0]) evaluates to 1 if the representation of uint32_t is big endian, in which case the high-order bytes occur at the beginning of the representation of val. In that case, we want to skip the first 4 - no_bytes of the representation and copy the rest. If uint32_t has a little-endian representation, on the other hand, (1 - ubytes.a[0]) will evaluate to 0, with the result that the memcpy starts at the beginning of the representation. In every case, whichever bytes are copied from the representation of val, their order is maintained. That's what memcpy() does.
Is it possible to do this in a platform-independent way, or do I need to use #ifdefs and have a separate piece of code for each endianness?
No, that doesn't even make sense. Anything that cares about a specific characteristic of a platform (e.g. endianness) can't be platform independent.
Example 1 (platform independent):
// Copy the 3 least significant bytes to dest[]
dest[0] = value & 0xFF; dest[1] = (value >> 8) & 0xFF; dest[2] = (value >> 16) & 0xFF;
Example 2 (platform independent):
// Copy the 3 most significant bytes to dest[]
dest[0] = (value >> 8) & 0xFF; dest[1] = (value >> 16) & 0xFF; dest[2] = (value >> 24) & 0xFF;
Example 3 (platform dependent):
// I want the least significant bytes on some platforms and the most significant bytes on other platforms
#ifdef PLATFORM_TYPE_A
dest[0] = value & 0xFF; dest[1] = (value >> 8) & 0xFF; dest[2] = (value >> 16) & 0xFF;
#endif
#ifdef PLATFORM_TYPE_B
dest[0] = (value >> 8) & 0xFF; dest[1] = (value >> 16) & 0xFF; dest[2] = (value >> 24) & 0xFF;
#endif
Note that it makes no real difference what the cause of the platform dependence is (if it's endianness or something else), as soon as you have a platform dependence you can't have platform independence.
#include <stdint.h>
#include <string.h>
int detect_endianess(void) // 1 if little endian, 0 if big endian
{
    union
    {
        uint16_t u16;
        uint8_t u8[2];
    } val = {.u16 = 0x1122};
    return val.u8[0] == 0x22;
}
void lo_bytes(void *dest, uint8_t no_bytes, uint32_t val)
{
    if (detect_endianess())
    {
        memcpy(dest, &val, no_bytes);
    }
    else
    {
        memcpy(dest, (uint8_t *)(&val) + sizeof(val) - no_bytes, no_bytes);
    }
}
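A quick check against the usage from the question might look like this (a sketch):
#include <stdio.h>
int main(void)
{
    uint8_t dest[3];
    lo_bytes(dest, 3, 0x44332211);
    /* expected: 33 22 11 on a big-endian host, 11 22 33 on a little-endian one */
    printf("%02x %02x %02x\n", dest[0], dest[1], dest[2]);
    return 0;
}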

Converting Char array to Long in C error

I am trying to convert an unsigned long int to chars and back,
but I get a wrong result.
#include <stdio.h>
int main(void)
{
    unsigned char pdest[4];
    unsigned long l = 0xFFFFFFFF;
    pdest[0] = l & 0xFF;
    pdest[1] = (l >> 8) & 0xFF;
    pdest[2] = (l >> 16) & 0xFF;
    pdest[3] = (l >> 24) & 0xFF;
    unsigned long int l1 = 0;
    l1 |= (pdest[0]);
    l1 |= (pdest[1] << 8);
    l1 |= (pdest[2] << 16);
    l1 |= (pdest[3] << 24);
    printf("%lu", l1);
}
and the output is
18446744073709551615
not 4294967295?
How do I do it correctly?
Read this:
https://en.wikipedia.org/wiki/C_data_types
...Long unsigned integer type. Capable of containing at least the [0,
4,294,967,295] range;
You should write the last 4 lines as:
l1 |= ((unsigned long) pdest[0]);
l1 |= (((unsigned long) pdest[1]) << 8);
l1 |= (((unsigned long) pdest[2]) << 16);
l1 |= (((unsigned long) pdest[3]) << 24);
You should cast each byte to unsigned long before shifting.
pdest[3] << 24 is of type signed int.
Change your code to
unsigned long int l1=0;
l1 |= (pdest[0]);
l1 |= (pdest[1] << 8);
l1 |= (pdest[2] << 16);
l1 |= ((unsigned int)pdest[3] << 24);
The problem is the shifting of a char. You must force a conversion to unsigned long before shifting beyond the 8 bits of a char. Moreover, the signedness of char can further alter the result.
Try
unsigned long int l1=0;
l1 |= ((unsigned long)pdest[0]);
l1 |= ((unsigned long)pdest[1] << 8);
l1 |= ((unsigned long)pdest[2] << 16);
l1 |= ((unsigned long)pdest[3] << 24);
Note the use of a cast to force the compiler to convert the char to an unsigned long before the shift takes place.
Your unsigned long does not have to be 4 bytes long.
#include <stdio.h>
#include <stdint.h>
int main(void) {
    int index;
    unsigned char pdest[sizeof(unsigned long)];
    unsigned long l = 0xFFFFFFFFUL;
    for (index = 0; index < sizeof(unsigned long); index++)
    {
        pdest[index] = l & 0xff;
        l >>= 8;
    }
    unsigned long l1 = 0;
    for (index = 0; index < sizeof(unsigned long); index++)
    {
        l1 |= (unsigned long)pdest[index] << (8 * index);
    }
    printf("%lx\n", l1);
}
First of all, to name a type that is exactly 32 bits wide, use uint32_t, not unsigned long int. unsigned long int is generally 64 bits wide on 64-bit *nixes (so-called LP64), whereas it is 32 bits on Windows (LLP64).
Anyway, the problem is with integer promotions. An operand to an arithmetic operation with conversion rank less than int or unsigned int will be converted to int or unsigned int, whichever its range fits into. Since all unsigned chars are representable as signed ints, pdest[3] is converted to signed int and the result of pdest[3] << 24 is also of type signed int! Now, if that has the most significant bit set, the bit is shifted into the sign bit of the integer, and the behaviour is, according to the C standard, undefined.
However, GCC has defined behaviour for this case; there, the result is just a negative integer with 2's complement representation; therefore the result of (unsigned char)0xFF << 24 is (int)-16777216. Now, for the | operation this then needs to be converted to the type of the other operand, which is unsigned. The unsigned conversion happens as if by repeatedly adding or subtracting one more than the maximum value of the new type (i.e. repeatedly adding or subtracting 2⁶⁴) until the value is in range. Since unsigned long is 64 bits on your platform, the result of this conversion is 2⁶⁴ - 16777216, or 18446744073692774400, which is ORed with the bits from the previous steps.
How to fix it? Easy: just prior to the shifts, cast each shifted number to uint32_t, and print with the help of the PRIu32 macro:
#include <inttypes.h>
...
uint32_t l1=0;
l1 |= (uint32_t)pdest[0];
l1 |= (uint32_t)pdest[1] << 8;
l1 |= (uint32_t)pdest[2] << 16;
l1 |= (uint32_t)pdest[3] << 24;
printf ("%" PRIu32, l1);
The problem with your code is the implicit type conversion of unsigned char to int and then to unsigned long, with sign extension, for the bitwise OR operation. The corresponding values for each line are as commented below:
l1 |= (pdest[0]); //dec = 255 hex = 0xFF
l1 |= (pdest[1] << 8); //dec = 65535 hex = 0xFFFF
l1 |= (pdest[2] << 16); //dec = 16777215 hex =0xFFFFFF
l1 |= (pdest[3] << 24); //here is the problem
In the last line, pdest[3] << 24 = 0xFF000000, which is equivalent to -16777216 due to the implicit conversion to int. It is then converted to unsigned long for the bitwise OR operation, where sign extension happens, so l1 |= (pdest[3] << 24) is equivalent to 0x0000000000FFFFFF | 0xFFFFFFFFFF000000.
As many people have suggested, you can use an explicit type conversion, or you can use the code snippet below:
l1 = (l1 << 0) | pdest[3];
l1 = (l1 << 8) | pdest[2];
l1 = (l1 << 8) | pdest[1];
l1 = (l1 << 8) | pdest[0];
I hope this solves your problem and explains the reason for such a huge output.

endianness conversion, regardless of endianness

Many implementations of htonl() or ntohl() test for the endianness of the platform first and then resolve to either a no-op or a byte-swap.
I once read a page on the web about a few tricks to handle to/from big/little-endian conversions, without any preconceived knowledge of the hardware configuration. Just taking endianness for what it is : a representation of integers in memory. But I could not find it again, so I wrote this :
typedef union {
    uint8_t b[4];
    uint32_t i;
} swap32_T;
uint32_t to_big_endian(uint32_t x) {
    /* convert to big endian, whatever the endianness of the platform */
    swap32_T y;
    y.b[0] = (x & 0xFF000000) >> 24;
    y.b[1] = (x & 0x00FF0000) >> 16;
    y.b[2] = (x & 0x0000FF00) >> 8;
    y.b[3] = (x & 0x000000FF);
    return y.i;
}
My two questions are :
Do you know a cleaner way to write this to_big_endian() function ?
Did you ever bookmarked this mysterious page I can not find, which contained very precious (because unusual) advices on endianness ?
edit
not really a duplicate (even if very close), mainly because I do not want to detect endianness. The same code compiles on both architectures, with the same result:
little endian
for u = 0x12345678 (stored as 0x78 0x56 0x34 0x12)
to_big_endian(u) = 0x78563412 (stored as 0x12 0x34 0x56 0x78)
big endian
for u = 0x12345678 (stored as 0x12 0x34 0x56 0x78)
to_big_endian(u) = 0x12345678 (stored as 0x12 0x34 0x56 0x78)
same code, same result... in memory.
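One way to convince yourself is to dump the stored bytes (a sketch using the to_big_endian() above; on either host it should print 12 34 56 78):
#include <stdio.h>
#include <stdint.h>
#include <string.h>
int main(void)
{
    uint32_t u = 0x12345678;
    uint32_t be = to_big_endian(u);
    uint8_t raw[4];
    memcpy(raw, &be, sizeof raw);   /* look at the actual memory layout */
    printf("%02x %02x %02x %02x\n", raw[0], raw[1], raw[2], raw[3]);
    return 0;
}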
Here is my own version of the same (although the memory convention in this example is little endian instead of big endian):
/* unoptimized version; solves endianness & alignment issues */
static U32 readLE32 (const BYTE* srcPtr)
{
    U32 value32 = srcPtr[0];
    value32 += ((U32)srcPtr[1] << 8);
    value32 += ((U32)srcPtr[2] << 16);
    value32 += ((U32)srcPtr[3] << 24);
    return value32;
}
static void writeLE32 (BYTE* dstPtr, U32 value32)
{
    dstPtr[0] = (BYTE)value32;
    dstPtr[1] = (BYTE)(value32 >> 8);
    dstPtr[2] = (BYTE)(value32 >> 16);
    dstPtr[3] = (BYTE)(value32 >> 24);
}
Basically, what's missing in your function prototype to make the code a bit easier to read is a pointer to the source or destination memory.
Depending on your intentions, this may or may not be an answer to your question. However, if all you want to do is to be able to convert various types to various endiannesses (including 64-bit types and little endian conversions, which the htonl obviously won't do), you may want to consider the htobe32 and related functions:
uint16_t htobe16(uint16_t host_16bits);
uint16_t htole16(uint16_t host_16bits);
uint16_t be16toh(uint16_t big_endian_16bits);
uint16_t le16toh(uint16_t little_endian_16bits);
uint32_t htobe32(uint32_t host_32bits);
uint32_t htole32(uint32_t host_32bits);
uint32_t be32toh(uint32_t big_endian_32bits);
uint32_t le32toh(uint32_t little_endian_32bits);
uint64_t htobe64(uint64_t host_64bits);
uint64_t htole64(uint64_t host_64bits);
uint64_t be64toh(uint64_t big_endian_64bits);
uint64_t le64toh(uint64_t little_endian_64bits);
These functions are technically non-standard, but they appear to be present on most Unices.
It should also be said, however, as Paul R rightly points out in the comments, that there is no runtime test of endianness. The endianness is a fixed feature of a given ABI, so it is always a constant at compile-time.
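Typical usage might look like this (a sketch; on glibc the declarations live in <endian.h>, on the BSDs in <sys/endian.h>, and _DEFAULT_SOURCE may be needed in strict-standards mode):
#define _DEFAULT_SOURCE
#include <endian.h>   /* glibc; BSDs use <sys/endian.h> */
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    uint32_t host = 0x12345678;
    uint32_t be = htobe32(host);   /* object representation 12 34 56 78 */
    uint32_t le = htole32(host);   /* object representation 78 56 34 12 */
    printf("%08x %08x\n", be32toh(be), le32toh(le));   /* both print 12345678 */
    return 0;
}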
Well ... That's certainly a workable solution, but I don't understand why you'd use a union. If you want an array of bytes, why not just have an array of bytes as an output pointer argument?
void uint32_to_big_endian(uint8_t *out, uint32_t x)
{
    out[0] = (x >> 24) & 0xff;
    out[1] = (x >> 16) & 0xff;
    out[2] = (x >> 8) & 0xff;
    out[3] = x & 0xff;
}
Also, it's often better code-wise to shift first, and mask later. It calls for smaller mask literals, which is often better for the code generator.
Well, here's my solution for a general signed/unsigned integer, independent of machine endianness, and of any size capable of storing the data (you need a version for each type, but the algorithm is the same):
AnyLargeEnoughInt fromBE(BYTE *p, size_t n)
{
    AnyLargeEnoughInt res = 0;
    while (n--) {
        res <<= 8;
        res |= *p++;
    } /* while */
    return res;
} /* net2host */
void toBE(BYTE *p, size_t n, AnyLargeEnoughInt val)
{
    p += n;
    while (n--) {
        *--p = val & 0xff;
        val >>= 8;
    } /* while */
} /* host2net */
AnyLargeEnoughInt fromLE(BYTE *p, size_t n)
{
    p += n;
    AnyLargeEnoughInt res = 0;
    while (n--) {
        res <<= 8;
        res |= *--p;
    } /* while */
    return res;
} /* net2host */
void toLE(BYTE *p, size_t n, AnyLargeEnoughInt val)
{
    while (n--) {
        *p++ = val & 0xff;
        val >>= 8;
    } /* while */
} /* host2net */
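For example, instantiating the placeholders with unsigned char and unsigned long long, a round trip might look like this (a sketch that relies on the definitions above):
#include <stdio.h>
typedef unsigned char BYTE;
typedef unsigned long long AnyLargeEnoughInt;
/* ... fromBE / toBE / fromLE / toLE as defined above ... */
int main(void)
{
    BYTE buf[4];
    toBE(buf, 4, 0x12345678ULL);        /* buf now holds 12 34 56 78 */
    printf("%llx\n", fromBE(buf, 4));   /* prints 12345678 */
    return 0;
}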
