Operation may be undefined in macro [duplicate] - c

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 8 years ago.
I have two pieces of code which extract a 32-bit value from an array of 8-bit values.
1) The first is:
#include <stdio.h>
#include <string.h>
int main()
{
    unsigned char buffer[] = {0xaa, 0xbb, 0xcc, 0xdd, 0xee};
    unsigned char *buff = buffer;
    unsigned int result;
    result = *buff++;
    result += *buff++ << 8;
    result += *buff++ << 16;
    result += *buff++ << 24;
    printf("result = 0x%x, *buffer = 0x%x.", result, *buff);
    return 0;
}
It compiles without any warning, but it looks a little clumsy...
2) The second uses a macro instead of those four lines:
#include <stdio.h>
#include <string.h>
#define to32(buffer) ((unsigned int)*buffer++ | *buffer++ << 8 | *buffer++ << 16 | *buffer++ << 24)
int main()
{
    unsigned char buffer[] = {0xaa, 0xbb, 0xcc, 0xdd, 0xee};
    unsigned char *buff = buffer;
    unsigned int result = to32(buff);
    printf("result = 0x%x, *buffer = 0x%x.", result, *buff);
    return 0;
}
It produces the following warning:
main.cpp: In function 'int main()':
main.cpp:4:99: warning: operation on 'buff' may be undefined [-Wsequence-point]
#define to32(buffer) ((unsigned int)*buffer++ | *buffer++ << 8 | *buffer++ << 16 | *buffer++ << 24)
I'm a little confused about what exactly GCC found undefined.
Is it because all the shifts are in one line and I combine them?

EDIT
A solution is:
static inline unsigned to32(const unsigned char *buffer)
{
    return buffer[0] | buffer[1] << 8 | buffer[2] << 16 | buffer[3] << 24;
}
If you really don't like functions, you can do the same with a macro...
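For illustration, a minimal sketch of such a macro (the name TO32 is made up here; the argument is parenthesized so it survives arguments like p + 1, and it is only read, never incremented, so there is no sequence-point problem):
#define TO32(p) ((unsigned int)(p)[0]       \
               | (unsigned int)(p)[1] << 8  \
               | (unsigned int)(p)[2] << 16 \
               | (unsigned int)(p)[3] << 24)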
It is about those sequence points the compiler warns you about.
The order of evaluation is not defined here:
to32(buffer) ((unsigned int)*buffer++ | *buffer++ << 8 | *buffer++ << 16 | *buffer++ << 24)
// may be evaluated as
result = *buffer++;
result |= *buffer++ << 8;
...
// or as
result = *buffer++ << 8;
result |= *buffer++;
...
In the case of || and && there is a sequence point at the operator, and the operands are evaluated in order, left to right. But not at | and &.
There the order is not defined, so the compiler can evaluate the operands in any order; and since buffer is modified several times without an intervening sequence point, the behavior is undefined.

After some suggestions from Buella Gabor and auguar it came to me that I can do it like this, and it seems to be right:
#define to32(buffer) ((unsigned int)*buffer | *(buffer+1) << 8 | *(buffer+2) << 16 | *(buffer+3) << 24)
Though it's not perfect for my needs :(
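If the macro's pointer-advancing behavior is the missing piece, here is a hedged sketch of one way to keep it with well-defined sequencing (the name to32_adv is made up here, not from the thread):
static inline unsigned int to32_adv(unsigned char **pp)
{
    unsigned char *p = *pp;
    unsigned int v = (unsigned int)p[0]
                   | (unsigned int)p[1] << 8
                   | (unsigned int)p[2] << 16
                   | (unsigned int)p[3] << 24;
    *pp = p + 4; /* advance the caller's pointer past the four bytes read */
    return v;
}
/* usage: unsigned int result = to32_adv(&buff); */
The increment happens exactly once, inside the function, so no object is modified twice between sequence points.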

Related

Cast byte array to int

I have an array of four bytes and want to cast it to an int. The following code works just fine for that:
#include <stdio.h>
#include <stdint.h>
int main(void) {
    uint8_t array[4] = {0xDE, 0xAD, 0xC0, 0xDE};
    uint32_t myint;
    myint = (uint32_t)(array[0]) << 24;
    myint |= (uint32_t)(array[1]) << 16;
    myint |= (uint32_t)(array[2]) << 8;
    myint |= (uint32_t)(array[3]);
    printf("0x%x\n", myint);
    return 0;
}
The result is as expected:
$./test
0xdeadc0de
Now I want to do this in a one-liner like this:
#include <stdio.h>
#include <stdint.h>
int main(void) {
    uint8_t array[4] = {0xDE, 0xAD, 0xC0, 0xDE};
    uint32_t myint = (uint32_t)(array[0]) << 24 || (uint32_t)(array[1]) << 16 || (uint32_t)(array[2]) << 8 || (uint32_t)(array[3]);
    printf("0x%x\n", myint);
    return 0;
}
But this results in:
$./test
0x1
Why does my program behave like this?
You are mixing up the operators for the logical OR (||) and the bitwise OR (|). Do:
uint32_t myint = (uint32_t)(array[0]) << 24
               | (uint32_t)(array[1]) << 16
               | (uint32_t)(array[2]) << 8
               | (uint32_t)(array[3]);
Logical OR (||) is different from bitwise OR (|). So in your second snippet, where you wrote ||, use | instead.
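A quick sketch that makes the difference visible (the literals are arbitrary):
#include <stdio.h>
int main(void) {
    printf("%d\n", 0xDE000000 || 0xAD0000);  /* logical OR collapses any nonzero operands to 1 */
    printf("0x%x\n", 0xDE000000 | 0xAD0000); /* bitwise OR keeps the bits: 0xdead0000 */
    return 0;
}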

Byte array to Decimal

I have a byte array (a 64-bit unsigned integer):
byte array[8] = { 0x01,0xc9,0x98,0x57,0xd1,0x47,0xf3,0x60 };
I want to translate it into decimal.
When I use the Windows calculator the result is:
128801567297500000
I can't find a way to do it in WinAPI or C.
Any help is appreciated.
For a 4-byte array I use the working code below:
BYTE array[4] = { 0xC3,0x02,0x00,0x00 };
printf("Result : %d\n",(array[0]) | (array[1]) <<8 |(array[2]) <<16 | (array[3]) <<24 );
Result : 707
Cast the bytes to 64-bit before the shifting. Currently they are implicitly promoted to int, which is a 32-bit data type.
Assuming you use stdint:
uint64_t result = ((uint64_t)b[0]) | ((uint64_t)b[1] << 8) | ((uint64_t)b[2] << 16) | ((uint64_t)b[3] << 24) | ((uint64_t)b[4] << 32) | ((uint64_t)b[5] << 40) | ((uint64_t)b[6] << 48) | ((uint64_t)b[7] << 56);
or in reverse order (array is little endian; this will get the result you're seeing in windows calculator):
uint64_t result = ((uint64_t)b[7]) | ((uint64_t)b[6] << 8) | ((uint64_t)b[5] << 16) | ((uint64_t)b[4] << 24) | ((uint64_t)b[3] << 32) | ((uint64_t)b[2] << 40) | ((uint64_t)b[1] << 48) | ((uint64_t)b[0] << 56);
Well, you can use:
1. sprintf() to print the positional hex values to a string.
2. strtoull() with base 16 to convert that string to a number.
Sample code:
#include <stdio.h>
#include <stdlib.h>
#define SIZE 128
int main()
{
    char array[8] = { 0x01,0xc9,0x98,0x57,0xd1,0x47,0xf3,0x60 };
    char arr[SIZE] = {0};
    int i = 0;
    unsigned long long res = 0;
    for (i = 0; i < 8; i++)
        sprintf((arr + (i * 2)), "%02x", (array[i] & 0xff)); /* %02x (not %2x) keeps the zero padding so the string stays parseable */
    printf("arr is %s\n", arr);
    res = strtoull(arr, NULL, 16); /* unsigned conversion to match %llu */
    printf("res is %llu\n", res);
    return 0;
}
int i;
byte array[8] = { 0x01,0xc9,0x98,0x57,0xd1,0x47,0xf3,0x60 };
unsigned long long v;
// Change of endianness: reverse the 8 bytes in place
for (i = 0; i < 4; ++i) {
    byte temp = array[i];
    array[i] = array[7-i];
    array[7-i] = temp;
}
memcpy(&v, array, sizeof(v)); // equivalent to v = *(unsigned long long*)array, but without alignment concerns
printf("%llu ", v);

Little-endian convention, and saving to a binary file

I have a matrix (2-D int pointer int **mat) that I am trying to write to a file in Linux in Little-endian convention.
Here is my function that writes to the file:
#define BUFF_SIZE 4
void write_matrix(int **mat, int n, char *dest_file) {
    int i, j;
    char buff[BUFF_SIZE];
    int fd = open(dest_file, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR | S_IXUSR);
    if (fd < 0) {
        printf("Error: Could not open the file \"%s\".\n", dest_file);
    }
    buff[0] = (n & 0x000000ff);
    buff[1] = (n & 0x0000ff00) >> 8;
    buff[2] = (n & 0x00ff0000) >> 16;
    buff[3] = (n & 0xff000000) >> 24;
    write(fd, buff, BUFF_SIZE);
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            buff[0] = (mat[i][j] & 0x000000ff);
            buff[1] = (mat[i][j] & 0x0000ff00) >> 8;
            buff[2] = (mat[i][j] & 0x00ff0000) >> 16;
            buff[3] = (mat[i][j] & 0xff000000) >> 24;
            if (write(fd, buff, BUFF_SIZE) != BUFF_SIZE) {
                close(fd);
                printf("Error: could not write to file.\n");
                return;
            }
        }
    }
    close(fd);
}
The problem is that when I write out a matrix large enough of the form mat[i][i] = i (let's say 512 X 512), I think I get an overflow, since I get weird negative numbers.
To convert back I use:
void read_matrix(int fd, int **mat, int n, char buff[]) {
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            assert(read(fd, buff, BUFF_SIZE) == BUFF_SIZE);
            mat[i][j] = byteToInt(buff);
        }
    }
}
int byteToInt(char buff[]) {
    return (buff[3] << 24) | (buff[2] << 16) | (buff[1] << 8) | (buff[0]);
}
What am I doing wrong?
EDITED:
Added the read_matrix function.
It seems like I'm getting a short instead of an int, since 384 (binary 110000000) becomes -128 (binary 10000000).
Did a test, and found out that:
char c = 128;
int i = 0;
i |= c;
gives i = -128. Why????
The problem is in your input conversion:
int byteToInt(char buff[]) {
    return (buff[3] << 24) | (buff[2] << 16) | (buff[1] << 8) | (buff[0]);
}
You don't mention which platform you are on, but on most common platforms char is signed. And that will cause problems. Suppose, for example, that buff[1] is 0x80 (0b10000000). Since it is a signed value, that is the code for the value -128. And since shift operators start by doing integer promotions on both of their arguments, that will be converted to the integer -128 before the shift operation is performed; in other words, it will have the value 0xFFFFFF80, which will become 0xFFFF8000 after the shift.
The bitwise logical operators (such as |) perform the usual arithmetic conversions before doing the bitwise operations; in the case of (buff[1] << 8) | (buff[0]), the left-hand operator will already be a signed int (because the type of << is the type of its promoted left-hand argument); the right-hand argument, an implicitly signed char, will also be promoted to a signed int, so again if it were 0x80, it would end up being sign-extended to 0xFFFFFF80.
In either case, the bitwise-or operation will end up with unwanted high-order 1 bits.
Explicitly casting buff[x] to an unsigned int won't help, because it will first be sign-extended to an int before being reinterpreted as an unsigned int. Instead, it is necessary to cast it to an unsigned char:
int byteToInt(char buff[]) {
    return ((unsigned char)buff[3] << 24)
         | ((unsigned char)buff[2] << 16)
         | ((unsigned char)buff[1] << 8)
         | (unsigned char)buff[0];
}
Since int may be 16-bit, it would be better to use long, and indeed it would be better to use unsigned long to avoid other conversion issues. That means doing a double cast:
unsigned long byteToInt(char buff[]) {
    return ((unsigned long)(unsigned char)buff[3] << 24)
         | ((unsigned long)(unsigned char)buff[2] << 16)
         | ((unsigned long)(unsigned char)buff[1] << 8)
         | (unsigned long)(unsigned char)buff[0];
}
What you have is an undefined behaviour that is often overlooked: left-shifting negative signed values is undefined.
When you do this
int byteToInt(char buff[]) {
    return (buff[3] << 24) | (buff[2] << 16) | (buff[1] << 8) | (buff[0]);
}
even if one element of buff has a negative value (i.e. one of the binary data's values sets the MSB) then you hit undefined behaviour. Since your data is binary, reading it as unsigned makes the most sense. You could use a standard type which makes the signedness and length explicit, such as uint8_t from stdint.h.
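A minimal sketch of that suggestion, assuming the buffer holds four little-endian bytes (unlike the original, it returns uint32_t to keep the value unsigned):
#include <stdint.h>
uint32_t byteToInt(const uint8_t buff[4]) {
    /* uint8_t promotes to int without sign extension; the casts keep the shifts unsigned */
    return ((uint32_t)buff[3] << 24) | ((uint32_t)buff[2] << 16)
         | ((uint32_t)buff[1] << 8) | (uint32_t)buff[0];
}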

Convert Little Endian to Big Endian

I just want to ask if my method is correct to convert from little endian to big endian, just to make sure if I understand the difference.
I have a number which is stored in little-endian; here are the binary and hex representations of the number:
0001 0010 0011 0100 0101 0110 0111 1000
12345678
In big-endian format I believe the bytes should be swapped, like this:
1000 0111 0110 0101 0100 0011 0010 0001
87654321
Is this correct?
Also, the code below attempts to do this but fails. Is there anything obviously wrong or can I optimize something? If the code is bad for this conversion can you please explain why and show a better method of performing the same conversion?
uint32_t num = 0x12345678;
uint32_t b0,b1,b2,b3,b4,b5,b6,b7;
uint32_t res = 0;
b0 = (num & 0xf) << 28;
b1 = (num & 0xf0) << 24;
b2 = (num & 0xf00) << 20;
b3 = (num & 0xf000) << 16;
b4 = (num & 0xf0000) << 12;
b5 = (num & 0xf00000) << 8;
b6 = (num & 0xf000000) << 4;
b7 = (num & 0xf0000000) << 4;
res = b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7;
printf("%d\n", res);
OP's sample code is incorrect.
Endian conversion works at the bit and 8-bit byte level. Most endian issues deal with the byte level. OP's code is doing an endian change at the 4-bit nibble level. Recommend instead:
// Swap endian (big to little) or (little to big)
uint32_t num = 9;
uint32_t b0,b1,b2,b3;
uint32_t res;
b0 = (num & 0x000000ff) << 24u;
b1 = (num & 0x0000ff00) << 8u;
b2 = (num & 0x00ff0000) >> 8u;
b3 = (num & 0xff000000) >> 24u;
res = b0 | b1 | b2 | b3;
printf("%" PRIX32 "\n", res);
If performance is truly important, the particular processor would need to be known. Otherwise, leave it to the compiler.
[Edit] OP added a comment that changes things.
"32bit numerical value represented by the hexadecimal representation (st uv wx yz) shall be recorded in a four-byte field as (st uv wx yz)."
It appears in this case the endianness of the 32-bit number is unknown and the result needs to be stored in memory in little endian order.
uint32_t num = 9;
uint8_t b[4];
b[0] = (uint8_t) (num >> 0u);
b[1] = (uint8_t) (num >> 8u);
b[2] = (uint8_t) (num >> 16u);
b[3] = (uint8_t) (num >> 24u);
[2016 Edit] Simplification
"The type of the result is that of the promoted left operand." (Bitwise shift operators, C11 §6.5.7 ¶3)
Using a u after the shift constants (right operands) results in the same as without it.
b3 = (num & 0xff000000) >> 24u;
b[3] = (uint8_t) (num >> 24u);
// same as
b3 = (num & 0xff000000) >> 24;
b[3] = (uint8_t) (num >> 24);
Sorry, my answer is a bit too late, but it seems nobody mentioned built-in functions to reverse byte order, which is very important in terms of performance.
Most of the modern processors are little-endian, while all network protocols are big-endian. That is history, and more on that you can find on Wikipedia. But that means our processors convert between little- and big-endian millions of times while we browse the Internet.
That is why most architectures have dedicated processor instructions to facilitate this task. For x86 architectures there is the BSWAP instruction, and for ARM there is REV. This is the most efficient way to reverse byte order.
To avoid assembly in our C code, we can use built-ins instead. For GCC there is the __builtin_bswap32() function and for Visual C++ there is _byteswap_ulong(). Those functions will generate just one processor instruction on most architectures.
Here is an example:
#include <stdio.h>
#include <inttypes.h>
int main()
{
    uint32_t le = 0x12345678;
    uint32_t be = __builtin_bswap32(le);
    printf("Little-endian: 0x%" PRIx32 "\n", le);
    printf("Big-endian: 0x%" PRIx32 "\n", be);
    return 0;
}
Here is the output it produces:
Little-endian: 0x12345678
Big-endian: 0x78563412
And here is the disassembly (without optimization, i.e. -O0):
uint32_t be = __builtin_bswap32(le);
0x0000000000400535 <+15>: mov -0x8(%rbp),%eax
0x0000000000400538 <+18>: bswap %eax
0x000000000040053a <+20>: mov %eax,-0x4(%rbp)
There is just one BSWAP instruction indeed.
So, if we do care about the performance, we should use those built-in functions instead of any other method of byte reversing. Just my 2 cents.
I think you can use the function htonl(). Network byte order is big-endian.
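A small sketch of that approach, assuming a POSIX system where htonl() lives in <arpa/inet.h>:
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h> /* htonl: host order to network (big-endian) order */
int main(void)
{
    uint32_t host = 0x12345678;
    uint32_t net = htonl(host); /* a byte swap on little-endian hosts, a no-op on big-endian ones */
    printf("0x%x -> 0x%x\n", host, net);
    return 0;
}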
"I swap each bytes right?" -> yes, to convert between little and big endian, you just give the bytes the opposite order.
But at first realize few things:
size of uint32_t is 32bits, which is 4 bytes, which is 8 HEX digits
mask 0xf retrieves the 4 least significant bits, to retrieve 8 bits, you need 0xff
so in case you want to swap the order of 4 bytes with that kind of masks, you could:
uint32_t res = 0;
b0 = (num & 0xff) << 24;       // least significant to most significant
b1 = (num & 0xff00) << 8;      // 2nd least sig. to 2nd most sig.
b2 = (num & 0xff0000) >> 8;    // 2nd most sig. to 2nd least sig.
b3 = (num & 0xff000000) >> 24; // most sig. to least sig.
res = b0 | b1 | b2 | b3;
You could do this:
int x = 0x12345678;
x = ( x >> 24 ) | (( x << 8) & 0x00ff0000 )| ((x >> 8) & 0x0000ff00) | ( x << 24) ;
printf("value = %x", x); // x will be printed as 0x78563412
One slightly different way of tackling this that can sometimes be useful is to have a union of the sixteen or thirty-two bit value and an array of chars. I've just been doing this when getting serial messages that come in with big endian order, yet am working on a little endian micro.
union MessageLengthUnion
{
    uint16_t asInt;
    uint8_t asChars[2];
};
Then when I get the messages in I put the first received uint8 in .asChars[1], the second in .asChars[0] then I access it as the .asInt part of the union in the rest of my program.
If you have a thirty-two bit value to store you can have the array four long.
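For instance, a sketch of the thirty-two bit variant (this kind of type punning through a union is permitted in C):
#include <stdint.h>
union MessageValueUnion
{
    uint32_t asInt;
    uint8_t asChars[4];
};
/* fill .asChars[3] down to .asChars[0] with the bytes as they arrive, then read .asInt */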
I am assuming you are on Linux.
Include <byteswap.h> and use int32_t bswap_32(int32_t argument);
That is the logical view; for the actual implementation see /usr/include/byteswap.h.
One more suggestion:
unsigned int a = 0xABCDEF23;
a = ((a&(0x0000FFFF)) << 16) | ((a&(0xFFFF0000)) >> 16);
a = ((a&(0x00FF00FF)) << 8) | ((a&(0xFF00FF00)) >>8);
printf("%0x\n",a);
A simple C program to convert from little to big endian:
#include <stdio.h>
int main() {
    unsigned int little = 0x1234ABCD, big = 0;
    unsigned char tmp = 0, l;
    printf(" Little endian little=%x\n", little);
    for (l = 0; l < 4; l++)
    {
        tmp = 0;
        tmp = little | tmp;     /* grabs the lowest byte of little */
        big = tmp | (big << 8); /* appends it to big */
        little = little >> 8;
    }
    printf(" Big endian big=%x\n", big);
    return 0;
}
OP's code is incorrect for the following reasons:
The swaps are being performed on a nibble (4-bit) boundary, instead of a byte (8-bit) boundary.
The shift-left << operations of the final four swaps are incorrect, they should be shift-right >> operations and their shift values would also need to be corrected.
The use of intermediary storage is unnecessary, and the code can therefore be rewritten to be more concise/recognizable. In doing so, some compilers will be able to better-optimize the code by recognizing the oft-used pattern.
Consider the following code, which efficiently converts an unsigned value:
// Swap endian (big to little) or (little to big)
uint32_t num = 0x12345678;
uint32_t res =
((num & 0x000000FF) << 24) |
((num & 0x0000FF00) << 8) |
((num & 0x00FF0000) >> 8) |
((num & 0xFF000000) >> 24);
printf("%0x\n", res);
The result is represented here in both binary and hex; notice how the bytes have swapped:
0111 1000 0101 0110 0011 0100 0001 0010
78563412
Optimizing
In terms of performance, leave it to the compiler to optimize your code when possible. You should avoid unnecessary data structures like arrays for simple algorithms like this, doing so will usually cause different instruction behavior such as accessing RAM instead of using CPU registers.
#include <stdio.h>
#include <inttypes.h>
uint32_t le_to_be(uint32_t num) {
    uint8_t b[4] = {0};
    *(uint32_t*)b = num; /* reinterpret the word as four bytes */
    uint8_t tmp = 0;
    /* swap the outer byte pair, then the inner byte pair */
    tmp = b[0];
    b[0] = b[3];
    b[3] = tmp;
    tmp = b[1];
    b[1] = b[2];
    b[2] = tmp;
    return *(uint32_t*)b;
}
int main()
{
    printf("big endian value is %x\n", le_to_be(0xabcdef98));
    return 0;
}
You can use the lib functions. They boil down to assembly, but if you are open to alternate implementations in C, here they are (assuming int is 32 bits):
void byte_swap16(unsigned short int *pVal16) {
//#define method_one 1
//#define method_two 1
#define method_three 1
#ifdef method_one
    unsigned char *pByte;
    pByte = (unsigned char *) pVal16;
    *pVal16 = (pByte[0] << 8) | pByte[1];
#endif
#ifdef method_two
    unsigned char *pByte0;
    unsigned char *pByte1;
    pByte0 = (unsigned char *) pVal16;
    pByte1 = pByte0 + 1;
    *pByte0 = *pByte0 ^ *pByte1;
    *pByte1 = *pByte0 ^ *pByte1;
    *pByte0 = *pByte0 ^ *pByte1;
#endif
#ifdef method_three
    unsigned char *pByte;
    pByte = (unsigned char *) pVal16;
    pByte[0] = pByte[0] ^ pByte[1];
    pByte[1] = pByte[0] ^ pByte[1];
    pByte[0] = pByte[0] ^ pByte[1];
#endif
}
void byte_swap32(unsigned int *pVal32) {
#ifdef method_one
    unsigned char *pByte;
    // 0x1234 5678 --> 0x7856 3412
    pByte = (unsigned char *) pVal32;
    *pVal32 = (pByte[0] << 24) | (pByte[1] << 16) | (pByte[2] << 8) | (pByte[3]);
#endif
#if defined(method_two) || defined(method_three)
    unsigned char *pByte;
    pByte = (unsigned char *) pVal32;
    // swap the outer byte pair (lsb <-> msb)
    pByte[0] = pByte[0] ^ pByte[3];
    pByte[3] = pByte[0] ^ pByte[3];
    pByte[0] = pByte[0] ^ pByte[3];
    // swap the inner byte pair
    pByte[1] = pByte[1] ^ pByte[2];
    pByte[2] = pByte[1] ^ pByte[2];
    pByte[1] = pByte[1] ^ pByte[2];
#endif
}
And the usage is performed like so:
unsigned short int u16Val = 0x1234;
byte_swap16(&u16Val);
unsigned int u32Val = 0x12345678;
byte_swap32(&u32Val);
Below is another approach that was useful for me:
void convertLittleEndianByteArrayToBigEndianByteArray(byte littlendianByte[], byte bigEndianByte[], int ArraySize) {
    int i = 0;
    /* note: this reverses the bit order within each byte as well as the byte order */
    for (i = 0; i < ArraySize; i++) {
        bigEndianByte[i] = (littlendianByte[ArraySize-i-1] << 7 & 0x80) | (littlendianByte[ArraySize-i-1] << 5 & 0x40) |
                           (littlendianByte[ArraySize-i-1] << 3 & 0x20) | (littlendianByte[ArraySize-i-1] << 1 & 0x10) |
                           (littlendianByte[ArraySize-i-1] >> 1 & 0x08) | (littlendianByte[ArraySize-i-1] >> 3 & 0x04) |
                           (littlendianByte[ArraySize-i-1] >> 5 & 0x02) | (littlendianByte[ArraySize-i-1] >> 7 & 0x01);
    }
}
The program below produces the result as needed:
#include <stdio.h>
unsigned int Little_To_Big_Endian(unsigned int num);
int main()
{
    unsigned int num = 0x11223344;
    printf("\n Little_Endian = 0x%X\n", num);
    printf("\n Big_Endian = 0x%X\n", Little_To_Big_Endian(num));
    return 0;
}
unsigned int Little_To_Big_Endian(unsigned int num)
{
    return (((num >> 24) & 0x000000ff) | ((num >> 8) & 0x0000ff00) | ((num << 8) & 0x00ff0000) | ((num << 24) & 0xff000000));
}
And the function below can also be used:
unsigned int Little_To_Big_Endian(unsigned int num)
{
    return (((num & 0x000000ff) << 24) | ((num & 0x0000ff00) << 8) | ((num & 0x00ff0000) >> 8) | ((num & 0xff000000) >> 24));
}
#include <stdio.h>
int main() {
    int var = 0x12345678;
    var = ((0x000000FF & var) << 24) |
          ((0x0000FF00 & var) << 8)  |
          ((0x00FF0000 & var) >> 8)  |
          ((0xFF000000 & var) >> 24);
    printf("%x", var);
    return 0;
}
Here is a little function I wrote that works pretty well. It's probably not portable to every single machine or as fast as a single CPU instruction, but it should work for most. It can handle numbers up to 32 bytes (256 bits) and works for both big and little endian swaps. The nicest part about this function is you can point it at a byte array coming off or going on the wire and swap the bytes in place before converting.
#include <stdio.h>
#include <string.h>
void byteSwap(char**, int);
int main() {
    // 32 bit
    int test32 = 0x12345678;
    printf("\n BigEndian = 0x%X\n", test32);
    char* pTest32 = (char*) &test32;
    // convert to little endian
    byteSwap((char**)&pTest32, 4);
    printf("\n LittleEndian = 0x%X\n", test32);
    // 64 bit
    long long int test64 = 0x1234567891234567LL;
    printf("\n BigEndian = 0x%llx\n", test64);
    char* pTest64 = (char*) &test64;
    // convert to little endian
    byteSwap((char**)&pTest64, 8);
    printf("\n LittleEndian = 0x%llx\n", test64);
    // back to big endian
    byteSwap((char**)&pTest64, 8);
    printf("\n BigEndian = 0x%llx\n", test64);
    return 0;
}
void byteSwap(char** src, int size) {
    int x = 0;
    char b[32];
    while (size-- > 0) { b[x++] = (*src)[size]; } // must be > 0, not >= 0, or this reads one byte before the buffer
    memcpy(*src, b, x);
}
output:
$gcc -o main *.c -lm
$main
BigEndian = 0x12345678
LittleEndian = 0x78563412
BigEndian = 0x1234567891234567
LittleEndian = 0x6745239178563412
BigEndian = 0x1234567891234567

convert big endian to little endian in C [without using provided func] [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions must demonstrate a minimal understanding of the problem being solved. Tell us what you've tried to do, why it didn't work, and how it should work. See also: Stack Overflow question checklist
Closed 9 years ago.
I need to write a function to convert big endian to little endian in C. I cannot use any library function.
Assuming what you need is a simple byte swap, try something like
Unsigned 16 bit conversion:
swapped = (num>>8) | (num<<8);
Unsigned 32-bit conversion:
swapped = ((num>>24)&0xff)      | // move byte 3 to byte 0
          ((num<<8)&0xff0000)   | // move byte 1 to byte 2
          ((num>>8)&0xff00)     | // move byte 2 to byte 1
          ((num<<24)&0xff000000); // byte 0 to byte 3
This swaps the byte orders from positions 1234 to 4321. If your input was 0xdeadbeef, a 32-bit endian swap might have output of 0xefbeadde.
The code above should be cleaned up with macros or at least constants instead of magic numbers, but hopefully it helps as is
EDIT: as another answer pointed out, there are platform, OS, and instruction set specific alternatives which can be MUCH faster than the above. In the Linux kernel there are macros (cpu_to_be32 for example) which handle endianness pretty nicely. But these alternatives are specific to their environments. In practice endianness is best dealt with using a blend of available approaches
By including:
#include <byteswap.h>
you can get an optimized version of machine-dependent byte-swapping functions.
Then, you can easily use the following functions:
__bswap_32 (uint32_t input)
or
__bswap_16 (uint16_t input)
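A usage sketch, assuming glibc where <byteswap.h> provides these:
#include <stdio.h>
#include <stdint.h>
#include <byteswap.h>
int main(void)
{
    uint32_t v = 0x12345678;
    printf("0x%x -> 0x%x\n", v, __bswap_32(v)); /* 0x12345678 -> 0x78563412 */
    return 0;
}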
#include <stdint.h>
//! Byte swap unsigned short
uint16_t swap_uint16( uint16_t val )
{
    return (val << 8) | (val >> 8);
}
//! Byte swap short
int16_t swap_int16( int16_t val )
{
    return (val << 8) | ((val >> 8) & 0xFF);
}
//! Byte swap unsigned int
uint32_t swap_uint32( uint32_t val )
{
    val = ((val << 8) & 0xFF00FF00) | ((val >> 8) & 0xFF00FF);
    return (val << 16) | (val >> 16);
}
//! Byte swap int
int32_t swap_int32( int32_t val )
{
    val = ((val << 8) & 0xFF00FF00) | ((val >> 8) & 0xFF00FF);
    return (val << 16) | ((val >> 16) & 0xFFFF);
}
Update: Added 64-bit byte swapping
int64_t swap_int64( int64_t val )
{
    val = ((val << 8) & 0xFF00FF00FF00FF00ULL) | ((val >> 8) & 0x00FF00FF00FF00FFULL);
    val = ((val << 16) & 0xFFFF0000FFFF0000ULL) | ((val >> 16) & 0x0000FFFF0000FFFFULL);
    return (val << 32) | ((val >> 32) & 0xFFFFFFFFULL);
}
uint64_t swap_uint64( uint64_t val )
{
    val = ((val << 8) & 0xFF00FF00FF00FF00ULL) | ((val >> 8) & 0x00FF00FF00FF00FFULL);
    val = ((val << 16) & 0xFFFF0000FFFF0000ULL) | ((val >> 16) & 0x0000FFFF0000FFFFULL);
    return (val << 32) | (val >> 32);
}
Here's a fairly generic version; I haven't compiled it, so there are probably typos, but you should get the idea:
#include <assert.h>
#include <stddef.h>
void SwapBytes(void *pv, size_t n)
{
    assert(n > 0);
    char *p = pv;
    size_t lo, hi;
    for (lo = 0, hi = n - 1; hi > lo; lo++, hi--)
    {
        char tmp = p[lo];
        p[lo] = p[hi];
        p[hi] = tmp;
    }
}
#define SWAP(x) SwapBytes(&x, sizeof(x));
NB: This is not optimised for speed or space. It is intended to be clear (easy to debug) and portable.
Update 2018-04-04
Added the assert() to trap the invalid case of n == 0, as spotted by commenter @chux.
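A possible usage sketch of the SWAP macro above:
#include <stdio.h>
int main(void)
{
    unsigned int x = 0x12345678;
    SWAP(x); /* in-place byte reversal, whatever sizeof(x) is */
    printf("0x%x\n", x); /* prints 0x78563412 */
    return 0;
}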
If you need macros (e.g. embedded system):
#define SWAP_UINT16(x) (((x) >> 8) | ((x) << 8))
#define SWAP_UINT32(x) (((x) >> 24) | (((x) & 0x00FF0000) >> 8) | (((x) & 0x0000FF00) << 8) | ((x) << 24))
Edit: These are library functions. Following them is the manual way to do it.
I am absolutely stunned by the number of people unaware of _byteswap_ushort, _byteswap_ulong, and _byteswap_uint64. Sure they are Visual C++ specific, but they compile down to some delicious code on x86/IA-64 architectures. :)
Here's an explicit usage of the bswap instruction. Note that the intrinsic form above will always be faster than this; I only added it to give an answer without a library routine.
uint32 cq_ntohl(uint32 a) {
    __asm {
        mov eax, a;
        bswap eax;
    }
}
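And a short sketch of the intrinsic route, assuming Visual C++ where the _byteswap_* family is declared in <stdlib.h>:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    unsigned long v = 0x12345678UL;
    printf("0x%lx -> 0x%lx\n", v, _byteswap_ulong(v)); /* compiles down to a single BSWAP on x86 */
    return 0;
}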
As a joke:
#include <stdio.h>
int main (int argc, char *argv[])
{
    size_t sizeofInt = sizeof (int);
    int i;
    union
    {
        int x;
        char c[sizeof (int)];
    } original, swapped;
    original.x = 0x12345678;
    for (i = 0; i < sizeofInt; i++)
        swapped.c[sizeofInt - i - 1] = original.c[i];
    fprintf (stderr, "%x\n", swapped.x);
    return 0;
}
Here's a way using the SSSE3 instruction pshufb via its Intel intrinsic, assuming you have a multiple of 4 ints:
#include <tmmintrin.h> // SSSE3 intrinsics (_mm_shuffle_epi8)
unsigned int *bswap(unsigned int *destination, unsigned int *source, int length) {
    int i;
    // shuffle mask that reverses the four bytes of each 32-bit lane
    __m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3);
    for (i = 0; i < length; i += 4) {
        _mm_storeu_si128((__m128i *)&destination[i],
                         _mm_shuffle_epi8(_mm_loadu_si128((__m128i *)&source[i]), mask));
    }
    return destination;
}
Will this work / be faster?
uint32_t swapped, result;
((byte*)&swapped)[0] = ((byte*)&result)[3];
((byte*)&swapped)[1] = ((byte*)&result)[2];
((byte*)&swapped)[2] = ((byte*)&result)[1];
((byte*)&swapped)[3] = ((byte*)&result)[0];
This code snippet can convert a 32-bit little endian number to a big endian number:
#include <stdio.h>
int main() {
    unsigned int i = 0xfafbfcfd;
    unsigned int j;
    j = ((i & 0xff000000) >> 24) | ((i & 0xff0000) >> 8) | ((i & 0xff00) << 8) | ((i & 0xff) << 24);
    printf("unsigned int j = %x\n", j);
    return 0;
}
Here's a function I have been using - tested and works on any basic data type:
// SwapBytes.h
//
// Function to perform in-place endian conversion of basic types
//
// Usage:
//
// double d;
// SwapBytes(&d, sizeof(d));
//
inline void SwapBytes(void *source, int size)
{
    typedef unsigned char TwoBytes[2];
    typedef unsigned char FourBytes[4];
    typedef unsigned char EightBytes[8];
    unsigned char temp;
    if (size == 2)
    {
        TwoBytes *src = (TwoBytes *)source;
        temp = (*src)[0];
        (*src)[0] = (*src)[1];
        (*src)[1] = temp;
        return;
    }
    if (size == 4)
    {
        FourBytes *src = (FourBytes *)source;
        temp = (*src)[0];
        (*src)[0] = (*src)[3];
        (*src)[3] = temp;
        temp = (*src)[1];
        (*src)[1] = (*src)[2];
        (*src)[2] = temp;
        return;
    }
    if (size == 8)
    {
        EightBytes *src = (EightBytes *)source;
        temp = (*src)[0];
        (*src)[0] = (*src)[7];
        (*src)[7] = temp;
        temp = (*src)[1];
        (*src)[1] = (*src)[6];
        (*src)[6] = temp;
        temp = (*src)[2];
        (*src)[2] = (*src)[5];
        (*src)[5] = temp;
        temp = (*src)[3];
        (*src)[3] = (*src)[4];
        (*src)[4] = temp;
        return;
    }
}
EDIT: This function only swaps the endianness of aligned 16-bit words. A function often necessary for UTF-16/UCS-2 encodings.
EDIT END.
If you want to change the endianness of a memory block you can use my blazingly fast approach.
Your memory array should have a size that is a multiple of 8.
#include <stddef.h>
#include <limits.h>
#include <stdint.h>
void ChangeMemEndianness(uint64_t *mem, size_t size)
{
    uint64_t m1 = 0xFF00FF00FF00FF00ULL, m2 = m1 >> CHAR_BIT;
    size = (size + (sizeof(uint64_t) - 1)) / sizeof(uint64_t);
    for (; size; size--, mem++)
        *mem = ((*mem & m1) >> CHAR_BIT) | ((*mem & m2) << CHAR_BIT);
}
This kind of function is useful for changing the endianness of Unicode UCS-2/UTF-16 files.
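A small usage sketch (assuming, as stated above, a buffer padded to a multiple of 8 bytes):
#include <stdio.h>
#include <string.h>
#include <stdint.h>
int main(void)
{
    /* big-endian UCS-2 "ABCD" packed into one 64-bit word */
    unsigned char ucs2[8] = {0x00, 'A', 0x00, 'B', 0x00, 'C', 0x00, 'D'};
    uint64_t buf;
    memcpy(&buf, ucs2, sizeof buf);
    ChangeMemEndianness(&buf, sizeof buf); /* each 16-bit unit is now little-endian */
    memcpy(ucs2, &buf, sizeof buf);
    printf("%c %c\n", ucs2[0], ucs2[2]); /* prints: A B */
    return 0;
}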
If you are running on an x86 or x86_64 processor, little endian is native, so you need to swap any big-endian values you receive. So:
for 16 bit values
unsigned short wBigE = value;
unsigned short wLittleE = ((wBigE & 0xFF) << 8) | (wBigE >> 8);
for 32 bit values
unsigned int iBigE = value;
unsigned int iLittleE = ((iBigE & 0xFF) << 24)
| ((iBigE & 0xFF00) << 8)
| ((iBigE >> 8) & 0xFF00)
| (iBigE >> 24);
This isn't the most efficient solution unless the compiler recognises that this is byte level manipulation and generates byte swapping code. But it doesn't depend on any memory layout tricks and can be turned into a macro pretty easily.
