I'm learning how to use the Intel MMX and SSE instructions in a video application. I have an 8-byte word and I would like to add all 8 bytes and produce a single integer as result. The straightforward method is a series of 7 shifts and adds, but that is slow. What is the fastest way of doing this? Is there an MMX or SSE instruction for this?
This is the slow way of doing it:
unsigned long long PackedWord = whatever.... // needs to be a 64-bit type for the shifts below
int byte1 = 0xff & (PackedWord);
int byte2 = 0xff & (PackedWord >> 8);
int byte3 = 0xff & (PackedWord >> 16);
int byte4 = 0xff & (PackedWord >> 24);
int byte5 = 0xff & (PackedWord >> 32);
int byte6 = 0xff & (PackedWord >> 40);
int byte7 = 0xff & (PackedWord >> 48);
int byte8 = 0xff & (PackedWord >> 56);
int sum = byte1 + byte2 + byte3 + byte4 + byte5 + byte6 + byte7 + byte8;
Based on the suggestion of @harold, you'd want something like:
#include <stdint.h>
#include <emmintrin.h>
inline int bytesum(uint64_t pw)
{
    // note: MMX code should be followed by _mm_empty() (emms) before any x87 floating-point code
    __m64 result = _mm_sad_pu8(_mm_cvtsi64_m64((long long) pw), _mm_setzero_si64()); // aka psadbw
    return _mm_cvtsi64_si32(result);
}
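On an SSE2-capable machine you can do the same thing without touching MMX state at all; a minimal sketch (my addition, assuming a 64-bit build where _mm_cvtsi64_si128 is available):
#include <stdint.h>
#include <emmintrin.h>  // SSE2
static inline int bytesum_sse2(uint64_t pw)
{
    __m128i v   = _mm_cvtsi64_si128((long long) pw);     // load the 8 bytes, upper half zeroed
    __m128i sad = _mm_sad_epu8(v, _mm_setzero_si128());  // psadbw against zero = sum of the bytes
    return _mm_cvtsi128_si32(sad);                       // the sum sits in the low 16 bits
}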
You can do this with a horizontal sum-by-multiply after one pairwise reduction:
#include <stdint.h>
uint16_t bytesum(uint64_t x) {
    uint64_t pair_bits = 0x0001000100010001LLU;
    uint64_t mask = pair_bits * 0xFF;
    uint64_t pair_sum = (x & mask) + ((x >> 8) & mask);
    return (pair_sum * pair_bits) >> (64 - 16);
}
This produces much leaner code than doing three pairwise reductions.
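A tiny usage check (my addition, assuming the bytesum() above is in the same file): the bytes 0x01 through 0x08 sum to 36.
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    printf("%u\n", (unsigned) bytesum(0x0102030405060708LLU));  // prints 36
    return 0;
}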
I'm not an assembly guru but this code should be a little bit faster on platforms that don't have fancy SIMD instructions:
#include <stdint.h>
int bytesum(uint64_t pw) {
    uint64_t a, b, mask;

    mask = 0x00ff00ff00ff00ffLLU;
    a = (pw >> 8) & mask;
    b = pw & mask;
    pw = a + b;

    mask = 0x0000ffff0000ffffLLU;
    a = (pw >> 16) & mask;
    b = pw & mask;
    pw = a + b;

    return (pw >> 32) + (pw & 0xffffffffLLU);
}
The idea is that you first add adjacent bytes pairwise, then adjacent 16-bit words, and finally the two 32-bit doublewords.
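As a quick sanity check (my own addition, not part of the answer), the SWAR version above can be compared against the straightforward shift-and-add loop:
#include <assert.h>
#include <stdint.h>
// reference implementation: extract and add each byte
static int bytesum_naive(uint64_t pw)
{
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += (int) ((pw >> (8 * i)) & 0xff);
    return sum;
}
int main(void)
{
    assert(bytesum(0x0102030405060708LLU) == bytesum_naive(0x0102030405060708LLU)); // both 36
    assert(bytesum(0xffffffffffffffffLLU) == bytesum_naive(0xffffffffffffffffLLU)); // both 2040
    return 0;
}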
Related
Write a function that swaps the highest bits in each nibble of the byte pointed to by the pointer b. (i.e. 0bAxxxBxxx -> 0bBxxxAxxx).
I have a predefined function with this prototype: void swapBits(uint8_t* b);
The solution that I came up with is not working -
void swapBits(uint8_t *b)
{
    uint8_t bit1;
    uint8_t bit2;
    uint8_t x;
    bit1 = (*b >> 4) & 1;
    bit2 = (*b >> 8) & 1;
    x = bit1 ^ bit2;
    x = x << 4 | x << 8;
    *b = *b ^ x;
}
There are a couple of problems:
to get the 4th bit from the right you need to shift 3 times (not 4)
xor is probably not what you need to use
Here is a fixed version:
void swapBits(uint8_t* b)
{
    uint8_t bit1;
    uint8_t bit2;
    bit1 = ((*b >> 3) & 1) << 7; // get bit from one position and put it into another
    bit2 = ((*b >> 7) & 1) << 3;
    *b = (*b & 0x77) | bit1 | bit2; // clear already extracted bits and reset with new values
}
You are extracting the low bit of the high nibble for bit1 and shifting out the whole uint8_t for bit2. You need to extract the high bit in both cases.
Example:
void swapBits(uint8_t* b)
{
    // shift down the high nibble and get its MSb
    uint8_t bit1 = (*b >> 4) & 0b1000;
    // get the MSb in the low nibble and shift it up
    uint8_t bit2 = (*b & 0b1000) << 4;
    // remove whatever values the MSbs had and replace them with the new values
    *b = (*b & 0b01110111) | bit2 | bit1;
}
0b for binary literals is a gcc extension (but will become standard in C23). If you can't use it, use a plain 8 instead of 0b1000 and 0x77 instead of 0b01110111. I'm using the extension because it makes it easier to see the patterns.
A more generic version where you can supply the mask for the bits to swap between the nibbles could look like this:
uint8_t swapBits(uint8_t b, uint8_t m) {
    return (b & ~(m << 4 | m)) // remove the bits in the mask
         | ((b >> 4) & m)      // high nibble bits -> low nibble
         | ((b & m) << 4);     // low nibble bits -> high nibble
}
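A short usage sketch (mine, not part of the answer): the original task corresponds to passing the mask that selects each nibble's most significant bit:
#include <stdio.h>
#include <stdint.h>
// relies on the generic swapBits(b, m) defined above
int main(void)
{
    uint8_t b = 0x86;               // 1000 0110: high-nibble MSb set, low-nibble MSb clear
    uint8_t r = swapBits(b, 0x08);  // 0x08 selects the MSb of each nibble
    printf("%#x -> %#x\n", b, r);   // prints 0x86 -> 0xe
    return 0;
}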
Your code does not work because you shift the byte values by 4 and 8 instead of 3 and 7.
Here is a modified version of your code:
void swapBits(uint8_t *b) {
    uint8_t bit1 = (*b >> 3) & 1;
    uint8_t bit2 = (*b >> 7) & 1;
    uint8_t x = bit1 ^ bit2;
    *b ^= (x << 3) | (x << 7);
}
Here is a simplified version:
void swapBits(uint8_t *b) {
    uint8_t x = (*b ^ (*b >> 4)) & 0x08;  // x is 0x08 only if bits 3 and 7 differ, else 0
    *b ^= x * 0x11;                       // 0x11 copies x into bit 3 and bit 7, flipping both
}
Here is an alternative approach that can be used for multiple bits:
void swapBits(uint8_t *b) {
    uint8_t bit7 = (*b & 0x08) << 4;  // low-nibble MSb moved up to bit 7
    uint8_t bit3 = (*b & 0x80) >> 4;  // high-nibble MSb moved down to bit 3
    *b = (*b & 0x77) | bit7 | bit3;   // keep the other bits, insert the swapped pair
}
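To illustrate the "multiple bits" remark, a sketch of my own (the mask choice is arbitrary) that moves two bits per nibble with the same pattern:
#include <stdint.h>
// Swap bits 7..6 of the high nibble with bits 3..2 of the low nibble.
void swapTopTwoBits(uint8_t *b)
{
    uint8_t up   = (*b & 0x0C) << 4;  // bits 3..2 moved up to bits 7..6
    uint8_t down = (*b & 0xC0) >> 4;  // bits 7..6 moved down to bits 3..2
    *b = (*b & 0x33) | up | down;     // keep the untouched bits, insert the moved ones
}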
I have to memset 4 bytes in a char * with an integer.
For example I have an integer int i = 3276854 (0x00320036) and a char header[54] = {0}. I have to write i at header + 2, so that if I print memory from header[2] to header[6], I get:
0032 0036
I tried this:
memset(&header[2], i, 1);
But it seems to put only the last byte into header, I get:
0036 0000
I also tried:
memset(&header[2], i, 4);
But it fills each byte with the last byte of i, I get:
3636 3636
I also tried to use binary masks like this:
ft_memset(&header[2], (int)(54 + size) & 0xff000000, 1);
ft_memset(&header[3], (int)(54 + size) & 0x00ff0000, 1);
ft_memset(&header[4], (int)(54 + size) & 0x0000ff00, 1);
ft_memset(&header[5], (int)(54 + size) & 0x000000ff, 1);
I get:
3600 0000.
So I don't know how I can get my 0032 0036, or at least 3600 3200 (maybe there is something to do with little and big endian here, because I run it under macOS, which is big-endian).
memset fills memory with a constant byte value. The second parameter (of type int) is converted to an unsigned char value.
You could use memcpy like this:
memcpy(&header[2], &i, sizeof(i));
However, it depends what exactly you are trying to achieve. If the header needs the integer to be in a particular format, you may need to convert the value in some way. For example, if the value needs to be big-endian (which is also known as "network byte order" in several Internet protocols), you can convert it with the htonl function:
uint32_t bi = htonl(i);
memcpy(&header[2], &bi, sizeof(bi));
(The htonl function is defined by #include <arpa/inet.h>.)
Also check the newer byte order conversion functions htobe16, htole16, be16toh, le16toh, htobe32, htole32, be32toh, le32toh, htobe64, htole64, be64toh, and le64toh declared by:
#define _BSD_SOURCE
#include <endian.h>
These convert between host byte order and little-endian byte order, or between host byte order and big-endian byte order, and work on uint16_t, uint32_t or uint64_t values, depending on the function name.
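For the original problem (storing the 4-byte value at header + 2 in little-endian order), a minimal sketch of my own using those functions, assuming a system that actually provides <endian.h>; the helper name store_le32 is mine:
#define _BSD_SOURCE
#include <endian.h>
#include <stdint.h>
#include <string.h>
static void store_le32(unsigned char *dst, uint32_t value)
{
    uint32_t le = htole32(value);  // host order -> little-endian
    memcpy(dst, &le, sizeof le);   // copy the 4 bytes into place
}
// usage: store_le32((unsigned char *) &header[2], 0x00320036);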
If your system provides no equivalents to those byte-order conversion functions, the following non-optimized but portable (on implementations that support uint16_t, uint32_t and uint64_t) functions may be used:
myendian.h
#ifndef MYENDIAN_H__INCLUDED_
#define MYENDIAN_H__INCLUDED_
#include <stdint.h>
uint16_t my_htobe16(uint16_t h16);
uint16_t my_htole16(uint16_t h16);
uint16_t my_be16toh(uint16_t be16);
uint16_t my_le16toh(uint16_t le16);
uint32_t my_htobe32(uint32_t h32);
uint32_t my_htole32(uint32_t h32);
uint32_t my_be32toh(uint32_t be32);
uint32_t my_le32toh(uint32_t le32);
uint64_t my_htobe64(uint64_t h64);
uint64_t my_htole64(uint64_t h64);
uint64_t my_be64toh(uint64_t be64);
uint64_t my_le64toh(uint64_t le64);
#endif
myendian.c
#include "myendian.h"
union swab16
{
uint16_t v;
uint8_t b[2];
};
union swab32
{
uint32_t v;
uint8_t b[4];
};
union swab64
{
uint64_t v;
uint8_t b[8];
};
static uint16_t xbe16(uint16_t x)
{
union swab16 s;
s.b[0] = (x >> 8) & 0xffu;
s.b[1] = x & 0xffu;
return s.v;
}
static uint16_t xle16(uint16_t x)
{
union swab16 s;
s.b[0] = x & 0xffu;
s.b[1] = (x >> 8) & 0xffu;
return s.v;
}
static uint32_t xbe32(uint32_t x)
{
union swab32 s;
s.b[0] = (x >> 24) & 0xffu;
s.b[1] = (x >> 16) & 0xffu;
s.b[2] = (x >> 8) & 0xffu;
s.b[3] = x & 0xffu;
return s.v;
}
static uint32_t xle32(uint32_t x)
{
union swab32 s;
s.b[0] = x & 0xffu;
s.b[1] = (x >> 8) & 0xffu;
s.b[2] = (x >> 16) & 0xffu;
s.b[3] = (x >> 24) & 0xffu;
return s.v;
}
static uint64_t xbe64(uint64_t x)
{
union swab64 s;
s.b[0] = (x >> 56) & 0xffu;
s.b[1] = (x >> 48) & 0xffu;
s.b[2] = (x >> 40) & 0xffu;
s.b[3] = (x >> 32) & 0xffu;
s.b[4] = (x >> 24) & 0xffu;
s.b[5] = (x >> 16) & 0xffu;
s.b[6] = (x >> 8) & 0xffu;
s.b[7] = x & 0xffu;
return s.v;
}
static uint64_t xle64(uint64_t x)
{
union swab64 s;
s.b[0] = x & 0xffu;
s.b[1] = (x >> 8) & 0xffu;
s.b[2] = (x >> 16) & 0xffu;
s.b[3] = (x >> 24) & 0xffu;
s.b[4] = (x >> 32) & 0xffu;
s.b[5] = (x >> 40) & 0xffu;
s.b[6] = (x >> 48) & 0xffu;
s.b[7] = (x >> 56) & 0xffu;
return s.v;
}
uint16_t my_htobe16(uint16_t h16)
{
return xbe16(h16);
}
uint16_t my_htole16(uint16_t h16)
{
return xle16(h16);
}
uint16_t my_be16toh(uint16_t be16)
{
return xbe16(be16);
}
uint16_t my_le16toh(uint16_t le16)
{
return xle16(le16);
}
uint32_t my_htobe32(uint32_t h32)
{
return xbe32(h32);
}
uint32_t my_htole32(uint32_t h32)
{
return xle32(h32);
}
uint32_t my_be32toh(uint32_t be32)
{
return xbe32(be32);
}
uint32_t my_le32toh(uint32_t le32)
{
return xle32(le32);
}
uint64_t my_htobe64(uint64_t h64)
{
return xbe64(h64);
}
uint64_t my_htole64(uint64_t h64)
{
return xle64(h64);
}
uint64_t my_be64toh(uint64_t be64)
{
return xbe64(be64);
}
uint64_t my_le64toh(uint64_t le64)
{
return xle64(le64);
}
Test harness: myendiantest.c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include "myendian.h"
#define TEST(n, fn, v) \
printf("%s(%#" PRIx##n ") = %#" PRIx##n "\n", #fn, (v), (fn)(v))
int main(void)
{
const uint16_t t16 = UINT16_C(0x1234);
const uint32_t t32 = UINT32_C(0x12345678);
const uint64_t t64 = UINT64_C(0x123456789abcdef);
TEST(16, my_htobe16, t16);
TEST(16, my_htole16, t16);
TEST(16, my_be16toh, t16);
TEST(16, my_le16toh, t16);
TEST(32, my_htobe32, t32);
TEST(32, my_htole32, t32);
TEST(32, my_be32toh, t32);
TEST(32, my_le32toh, t32);
TEST(64, my_htobe64, t64);
TEST(64, my_htole64, t64);
TEST(64, my_be64toh, t64);
TEST(64, my_le64toh, t64);
return 0;
}
Output on a little endian system:
my_htobe16(0x1234) = 0x3412
my_htole16(0x1234) = 0x1234
my_be16toh(0x1234) = 0x3412
my_le16toh(0x1234) = 0x1234
my_htobe32(0x12345678) = 0x78563412
my_htole32(0x12345678) = 0x12345678
my_be32toh(0x12345678) = 0x78563412
my_le32toh(0x12345678) = 0x12345678
my_htobe64(0x123456789abcdef) = 0xefcdab8967452301
my_htole64(0x123456789abcdef) = 0x123456789abcdef
my_be64toh(0x123456789abcdef) = 0xefcdab8967452301
my_le64toh(0x123456789abcdef) = 0x123456789abcdef
Output on a big endian system (expected, but not tested by me):
my_htobe16(0x1234) = 0x1234
my_htole16(0x1234) = 0x3412
my_be16toh(0x1234) = 0x1234
my_le16toh(0x1234) = 0x3412
my_htobe32(0x12345678) = 0x12345678
my_htole32(0x12345678) = 0x78563412
my_be32toh(0x12345678) = 0x12345678
my_le32toh(0x12345678) = 0x78563412
my_htobe64(0x123456789abcdef) = 0x123456789abcdef
my_htole64(0x123456789abcdef) = 0xefcdab8967452301
my_be64toh(0x123456789abcdef) = 0x123456789abcdef
my_le64toh(0x123456789abcdef) = 0xefcdab8967452301
You could use memcpy but that would make your code dependent on the underlying endianness of the CPU. I don't think that's what you want.
My take is that you want to convert to the network endianness of whatever data communications protocol this is, regardless of CPU endianness.
The only way to achieve that is bit shifts. The byte order depends on the target network endianness:
void u32_to_big_endian (uint8_t* dst, const uint32_t src)
{
dst[0] = (uint8_t) ((src >> 24) & 0xFFu);
dst[1] = (uint8_t) ((src >> 16) & 0xFFu);
dst[2] = (uint8_t) ((src >> 8) & 0xFFu);
dst[3] = (uint8_t) ((src >> 0) & 0xFFu);
}
void u32_to_little_endian (uint8_t* dst, const uint32_t src)
{
dst[3] = (uint8_t) ((src >> 24) & 0xFFu);
dst[2] = (uint8_t) ((src >> 16) & 0xFFu);
dst[1] = (uint8_t) ((src >> 8) & 0xFFu);
dst[0] = (uint8_t) ((src >> 0) & 0xFFu);
}
In these functions, it doesn't matter what the CPU endianness is. You can forget about memcpy() and non-standard endianness conversion functions. Full example:
#include <stdio.h>
#include <stdint.h>
void u32_to_big_endian (uint8_t* dst, const uint32_t src)
{
dst[0] = (uint8_t) ((src >> 24) & 0xFFu);
dst[1] = (uint8_t) ((src >> 16) & 0xFFu);
dst[2] = (uint8_t) ((src >> 8) & 0xFFu);
dst[3] = (uint8_t) ((src >> 0) & 0xFFu);
}
void u32_to_little_endian (uint8_t* dst, const uint32_t src)
{
dst[3] = (uint8_t) ((src >> 24) & 0xFFu);
dst[2] = (uint8_t) ((src >> 16) & 0xFFu);
dst[1] = (uint8_t) ((src >> 8) & 0xFFu);
dst[0] = (uint8_t) ((src >> 0) & 0xFFu);
}
int main(void)
{
uint32_t i = 0x00320036u;
uint8_t header[54] = {0};
u32_to_little_endian(&header[2], i);
for(size_t i=0; i<6; i++)
{
printf("%.2x ", (unsigned int)header[i]);
}
}
Output, including the first 2 bytes as zeroes:
00 00 36 00 32 00
I just want to ask if my method of converting from little endian to big endian is correct, just to make sure I understand the difference.
I have a number which is stored in little-endian, here are the binary and hex representations of the number:
0001 0010 0011 0100 0101 0110 0111 1000
12345678
In big-endian format I believe the bytes should be swapped, like this:
1000 0111 0110 0101 0100 0011 0010 0001
87654321
Is this correct?
Also, the code below attempts to do this but fails. Is there anything obviously wrong or can I optimize something? If the code is bad for this conversion can you please explain why and show a better method of performing the same conversion?
uint32_t num = 0x12345678;
uint32_t b0,b1,b2,b3,b4,b5,b6,b7;
uint32_t res = 0;
b0 = (num & 0xf) << 28;
b1 = (num & 0xf0) << 24;
b2 = (num & 0xf00) << 20;
b3 = (num & 0xf000) << 16;
b4 = (num & 0xf0000) << 12;
b5 = (num & 0xf00000) << 8;
b6 = (num & 0xf000000) << 4;
b7 = (num & 0xf0000000) << 4;
res = b0 + b1 + b2 + b3 + b4 + b5 + b6 + b7;
printf("%d\n", res);
OP's sample code is incorrect.
Endian conversion works at the bit and 8-bit byte level. Most endian issues deal with the byte level. OP's code is doing an endian change at the 4-bit nibble level. Recommend instead:
// Swap endian (big to little) or (little to big)
uint32_t num = 9;
uint32_t b0,b1,b2,b3;
uint32_t res;
b0 = (num & 0x000000ff) << 24u;
b1 = (num & 0x0000ff00) << 8u;
b2 = (num & 0x00ff0000) >> 8u;
b3 = (num & 0xff000000) >> 24u;
res = b0 | b1 | b2 | b3;
printf("%" PRIX32 "\n", res);
If performance is truly important, the particular processor would need to be known. Otherwise, leave it to the compiler.
[Edit] OP added a comment that changes things.
"32bit numerical value represented by the hexadecimal representation (st uv wx yz) shall be recorded in a four-byte field as (st uv wx yz)."
It appears in this case, the endian of the 32-bit number is unknown and the result needs to be store in memory in little endian order.
uint32_t num = 9;
uint8_t b[4];
b[0] = (uint8_t) (num >> 0u);
b[1] = (uint8_t) (num >> 8u);
b[2] = (uint8_t) (num >> 16u);
b[3] = (uint8_t) (num >> 24u);
[2016 Edit] Simplification
"... The type of the result is that of the promoted left operand ..." (C11 §6.5.7 ¶3, Bitwise shift operators)
Using a u after the shift constants (right operands) results in the same as without it.
b3 = (num & 0xff000000) >> 24u;
b[3] = (uint8_t) (num >> 24u);
// same as
b3 = (num & 0xff000000) >> 24;
b[3] = (uint8_t) (num >> 24);
Sorry, my answer is a bit too late, but it seems nobody has mentioned built-in functions to reverse byte order, which is very important in terms of performance.
Most modern processors are little-endian, while all network protocols are big-endian. That is history, and you can find more about it on Wikipedia. But that means our processors convert between little- and big-endian millions of times while we browse the Internet.
That is why most architectures have a dedicated processor instruction to facilitate this task. For x86 architectures there is the BSWAP instruction, and for ARM there is REV. This is the most efficient way to reverse byte order.
To avoid assembly in our C code, we can use built-ins instead. For GCC there is the __builtin_bswap32() function and for Visual C++ there is _byteswap_ulong(). Those functions will generate just one processor instruction on most architectures.
Here is an example:
#include <stdio.h>
#include <inttypes.h>
int main()
{
uint32_t le = 0x12345678;
uint32_t be = __builtin_bswap32(le);
printf("Little-endian: 0x%" PRIx32 "\n", le);
printf("Big-endian: 0x%" PRIx32 "\n", be);
return 0;
}
Here is the output it produces:
Little-endian: 0x12345678
Big-endian: 0x78563412
And here is the disassembly (without optimization, i.e. -O0):
uint32_t be = __builtin_bswap32(le);
0x0000000000400535 <+15>: mov -0x8(%rbp),%eax
0x0000000000400538 <+18>: bswap %eax
0x000000000040053a <+20>: mov %eax,-0x4(%rbp)
There is just one BSWAP instruction indeed.
So, if we do care about the performance, we should use those built-in functions instead of any other method of byte reversing. Just my 2 cents.
I think you can use the function htonl(). Network byte order is big-endian.
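A minimal sketch of that suggestion (my own example): htonl() converts from host order to big-endian network order, which on a little-endian machine swaps the bytes:
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>  // htonl
int main(void)
{
    uint32_t host = 0x12345678;
    uint32_t big  = htonl(host);
    printf("0x%x -> 0x%x\n", (unsigned) host, (unsigned) big);  // 0x12345678 -> 0x78563412 on a little-endian host
    return 0;
}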
"I swap each bytes right?" -> yes, to convert between little and big endian, you just give the bytes the opposite order.
But first realize a few things:
the size of uint32_t is 32 bits, which is 4 bytes, which is 8 hex digits
the mask 0xf retrieves the 4 least significant bits; to retrieve 8 bits, you need 0xff
so in case you want to swap the order of 4 bytes with that kind of mask, you could:
uint32_t b0, b1, b2, b3;
uint32_t res = 0;
b0 = (num & 0xff) << 24;        // least significant to most significant
b1 = (num & 0xff00) << 8;       // 2nd least sig. to 2nd most sig.
b2 = (num & 0xff0000) >> 8;     // 2nd most sig. to 2nd least sig.
b3 = (num & 0xff000000) >> 24;  // most sig. to least sig.
res = b0 | b1 | b2 | b3;
You could do this:
int x = 0x12345678;
x = ( x >> 24 ) | (( x << 8) & 0x00ff0000 )| ((x >> 8) & 0x0000ff00) | ( x << 24) ;
printf("value = %x", x); // x will be printed as 0x78563412
One slightly different way of tackling this that can sometimes be useful is to have a union of the sixteen or thirty-two bit value and an array of chars. I've just been doing this when getting serial messages that come in with big endian order, yet am working on a little endian micro.
union MessageLengthUnion
{
uint16_t asInt;
uint8_t asChars[2];
};
Then when I get the messages in, I put the first received uint8 in .asChars[1] and the second in .asChars[0], then I access it as the .asInt part of the union in the rest of my program (see the sketch below).
If you have a thirty-two bit value to store you can have the array four long.
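A minimal sketch of the receive path described above (my own code, assuming the two bytes arrive most-significant first and the host is little-endian):
#include <stdint.h>
union MessageLengthUnion
{
    uint16_t asInt;
    uint8_t asChars[2];
};
static uint16_t decode_length(uint8_t first_received, uint8_t second_received)
{
    union MessageLengthUnion u;
    u.asChars[1] = first_received;   // most significant byte of the length
    u.asChars[0] = second_received;  // least significant byte
    return u.asInt;                  // read it back through the other union member
}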
I am assuming you are on Linux.
Include <byteswap.h> and use bswap_32(), which logically behaves like uint32_t bswap_32(uint32_t x).
That is the logical view; for the actual definition, see /usr/include/byteswap.h.
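A minimal sketch (assuming a glibc system where <byteswap.h> is available):
#include <stdio.h>
#include <stdint.h>
#include <byteswap.h>
int main(void)
{
    uint32_t x = 0x12345678;
    printf("0x%x\n", (unsigned) bswap_32(x));  // prints 0x78563412
    return 0;
}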
One more suggestion:
unsigned int a = 0xABCDEF23;
a = ((a&(0x0000FFFF)) << 16) | ((a&(0xFFFF0000)) >> 16);
a = ((a&(0x00FF00FF)) << 8) | ((a&(0xFF00FF00)) >>8);
printf("%0x\n",a);
A simple C program to convert from little to big endian:
#include <stdio.h>
int main() {
    unsigned int little = 0x1234ABCD, big = 0;
    unsigned char tmp = 0, l;
    printf(" Little endian little=%x\n", little);
    for (l = 0; l < 4; l++)
    {
        tmp = 0;
        tmp = little | tmp;
        big = tmp | (big << 8);
        little = little >> 8;
    }
    printf(" Big endian big=%x\n", big);
    return 0;
}
OP's code is incorrect for the following reasons:
The swaps are being performed on a nibble (4-bit) boundary, instead of a byte (8-bit) boundary.
The shift-left << operations of the final four swaps are incorrect, they should be shift-right >> operations and their shift values would also need to be corrected.
The use of intermediary storage is unnecessary, and the code can therefore be rewritten to be more concise/recognizable. In doing so, some compilers will be able to better-optimize the code by recognizing the oft-used pattern.
Consider the following code, which efficiently converts an unsigned value:
// Swap endian (big to little) or (little to big)
uint32_t num = 0x12345678;
uint32_t res =
((num & 0x000000FF) << 24) |
((num & 0x0000FF00) << 8) |
((num & 0x00FF0000) >> 8) |
((num & 0xFF000000) >> 24);
printf("%0x\n", res);
The result is represented here in both binary and hex; notice how the bytes have swapped:
0111 1000 0101 0110 0011 0100 0001 0010
78563412
Optimizing
In terms of performance, leave it to the compiler to optimize your code when possible. You should avoid unnecessary data structures like arrays for simple algorithms like this; using them will usually cause different instruction behavior, such as accessing RAM instead of using CPU registers.
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
uint32_t le_to_be(uint32_t num) {
    uint8_t b[4];
    memcpy(b, &num, sizeof(b));  // copy the bytes out (avoids aliasing/alignment issues)
    uint8_t tmp = b[0];
    b[0] = b[3];
    b[3] = tmp;
    tmp = b[1];
    b[1] = b[2];
    b[2] = tmp;
    memcpy(&num, b, sizeof(b));  // copy the swapped bytes back
    return num;
}
int main()
{
    printf("big endian value is %x\n", le_to_be(0xabcdef98));
    return 0;
}
You can use the lib functions. They boil down to assembly, but if you are open to alternate implementations in C, here they are (assuming int is 32-bits) :
void byte_swap16(unsigned short int *pVal16) {
//#define method_one 1
// #define method_two 1
#define method_three 1
#ifdef method_one
unsigned char *pByte;
pByte = (unsigned char *) pVal16;
*pVal16 = (pByte[0] << 8) | pByte[1];
#endif
#ifdef method_two
unsigned char *pByte0;
unsigned char *pByte1;
pByte0 = (unsigned char *) pVal16;
pByte1 = pByte0 + 1;
*pByte0 = *pByte0 ^ *pByte1;
*pByte1 = *pByte0 ^ *pByte1;
*pByte0 = *pByte0 ^ *pByte1;
#endif
#ifdef method_three
unsigned char *pByte;
pByte = (unsigned char *) pVal16;
pByte[0] = pByte[0] ^ pByte[1];
pByte[1] = pByte[0] ^ pByte[1];
pByte[0] = pByte[0] ^ pByte[1];
#endif
}
void byte_swap32(unsigned int *pVal32) {
#ifdef method_one
unsigned char *pByte;
// 0x1234 5678 --> 0x7856 3412
pByte = (unsigned char *) pVal32;
*pVal32 = ( pByte[0] << 24 ) | (pByte[1] << 16) | (pByte[2] << 8) | ( pByte[3] );
#endif
#if defined(method_two) || defined (method_three)
unsigned char *pByte;
pByte = (unsigned char *) pVal32;
// move lsb to msb
pByte[0] = pByte[0] ^ pByte[3];
pByte[3] = pByte[0] ^ pByte[3];
pByte[0] = pByte[0] ^ pByte[3];
// move lsb to msb
pByte[1] = pByte[1] ^ pByte[2];
pByte[2] = pByte[1] ^ pByte[2];
pByte[1] = pByte[1] ^ pByte[2];
#endif
}
And the usage is performed like so:
unsigned short int u16Val = 0x1234;
byte_swap16(&u16Val);
unsigned int u32Val = 0x12345678;
byte_swap32(&u32Val);
Below is another approach that was useful for me:
void convertLittleEndianByteArrayToBigEndianByteArray(const uint8_t littleEndianByte[], uint8_t bigEndianByte[], int arraySize)
{
    // Endianness is a byte-order property, so reversing the byte order is all that is needed;
    // the bits inside each byte keep their positions.
    for (int i = 0; i < arraySize; i++) {
        bigEndianByte[i] = littleEndianByte[arraySize - i - 1];
    }
}
The program below produces the result as needed:
#include <stdio.h>
unsigned int Little_To_Big_Endian(unsigned int num);
int main( )
{
int num = 0x11223344 ;
printf("\n Little_Endian = 0x%X\n",num);
printf("\n Big_Endian = 0x%X\n",Little_To_Big_Endian(num));
}
unsigned int Little_To_Big_Endian(unsigned int num)
{
return (((num >> 24) & 0x000000ff) | ((num >> 8) & 0x0000ff00) | ((num << 8) & 0x00ff0000) | ((num << 24) & 0xff000000));
}
Alternatively, the function below can be used:
unsigned int Little_To_Big_Endian(unsigned int num)
{
return (((num & 0x000000ff) << 24) | ((num & 0x0000ff00) << 8 ) | ((num & 0x00ff0000) >> 8) | ((num & 0xff000000) >> 24 ));
}
#include<stdio.h>
int main(){
int var = 0X12345678;
var = ((0X000000FF & var)<<24)|
((0X0000FF00 & var)<<8) |
((0X00FF0000 & var)>>8) |
((0XFF000000 & var)>>24);
printf("%x",var);
}
Here is a little function I wrote that works pretty well; it's probably not portable to every single machine or as fast as a single CPU instruction, but it should work for most. It can handle numbers up to 32 bytes (256 bits) and works for both big- and little-endian swaps. The nicest part about this function is that you can point it at a byte array coming off or going onto the wire and swap the bytes in place before converting.
#include <stdio.h>
#include <string.h>
void byteSwap(char**, int);
int main() {
    //32 bit
    int test32 = 0x12345678;
    printf("\n BigEndian = 0x%X\n", test32);
    char* pTest32 = (char*) &test32;
    //convert to little endian
    byteSwap(&pTest32, 4);
    printf("\n LittleEndian = 0x%X\n", test32);
    //64 bit
    long long int test64 = 0x1234567891234567LL;
    printf("\n BigEndian = 0x%llx\n", test64);
    char* pTest64 = (char*) &test64;
    //convert to little endian
    byteSwap(&pTest64, 8);
    printf("\n LittleEndian = 0x%llx\n", test64);
    //back to big endian
    byteSwap(&pTest64, 8);
    printf("\n BigEndian = 0x%llx\n", test64);
    return 0;
}
void byteSwap(char** src, int size) {
    int x = 0;
    char b[32];
    while (size-- > 0) { b[x++] = (*src)[size]; }  // copy the bytes in reverse order
    memcpy(*src, b, x);                            // write them back in place
}
output:
$gcc -o main *.c -lm
$main
BigEndian = 0x12345678
LittleEndian = 0x78563412
BigEndian = 0x1234567891234567
LittleEndian = 0x6745239178563412
BigEndian = 0x1234567891234567
How do I convert a 64-bit int to its binary representation (big endian)? For the reverse task I use these functions:
int readInt (struct str *buf) {
buf -> cur_len = buf -> cur_len + 4;
return
(((buf -> data[buf -> cur_len - 3 ] & 0xff) << 24) |
((buf -> data[buf -> cur_len - 2 ] & 0xff) << 16) |
((buf -> data[buf -> cur_len - 1 ] & 0xff) << 8) |
((buf -> data[buf -> cur_len ] & 0xff) << 0));
};
long unsigned int 32Bit(struct str *buf) { // 32
return ((long unsigned int)readInt(buf)) & 0xffffffffL;
};
long unsigned int 64Bit(struct str *buffer) { //64
long unsigned int result = 32Bit(buf);
result *= 4294967296.0;
return result;
}
Serialising a 64 bit unsigned number into an array of unsigned char, storing 8 bits in each in big-endian order, can be done like so:
void serialise_64bit(unsigned char dest[8], unsigned long long n)
{
    dest[0] = (n >> 56) & 0xff;
    dest[1] = (n >> 48) & 0xff;
    dest[2] = (n >> 40) & 0xff;
    dest[3] = (n >> 32) & 0xff;
    dest[4] = (n >> 24) & 0xff;
    dest[5] = (n >> 16) & 0xff;
    dest[6] = (n >> 8) & 0xff;
    dest[7] = (n >> 0) & 0xff;
}
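A matching sketch for the reverse direction (my addition; the name deserialise_64bit is not from the answer above), reading 8 big-endian bytes back into a 64-bit value:
unsigned long long deserialise_64bit(const unsigned char src[8])
{
    return ((unsigned long long) src[0] << 56) |
           ((unsigned long long) src[1] << 48) |
           ((unsigned long long) src[2] << 40) |
           ((unsigned long long) src[3] << 32) |
           ((unsigned long long) src[4] << 24) |
           ((unsigned long long) src[5] << 16) |
           ((unsigned long long) src[6] << 8)  |
           ((unsigned long long) src[7] << 0);
}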
You shouldn't use built-in types for serialization; instead, when you need to know the exact size of a type, you need fixed-width types:
#include <stdint.h>
unsigned char buf[8]; // 64-bit raw data
uint64_t little_endian_value =
(uint64_t)buf[0] + ((uint64_t)buf[1] << 8) + ((uint64_t)buf[2] << 16) + ... + ((uint64_t)buf[7] << 56);
uint64_t big_endian_value =
(uint64_t)buf[7] + ((uint64_t)buf[6] << 8) + ((uint64_t)buf[5] << 16) + ... + ((uint64_t)buf[0] << 56);
Similarly for 32-bit values, use uint32_t there. Make sure your source buffer uses unsigned chars.
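For example, a sketch of the 32-bit big-endian case described above (the helper name is my own):
#include <stdint.h>
uint32_t read_big_endian_32(const unsigned char buf[4])
{
    return ((uint32_t) buf[0] << 24) |
           ((uint32_t) buf[1] << 16) |
           ((uint32_t) buf[2] << 8)  |
           ((uint32_t) buf[3] << 0);
}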
How do I reverse the bits using bitwise operators in C?
E.g.:
i/p: 10010101
o/p: 10101001
If it's just 8 bits:
u_char in = 0x95;
u_char out = 0;
for (int i = 0; i < 8; ++i) {
    out <<= 1;
    out |= (in & 0x01);
    in >>= 1;
}
Or for bonus points:
u_char in = 0x95;
u_char out = in;
out = (out & 0xaa) >> 1 | (out & 0x55) << 1;
out = (out & 0xcc) >> 2 | (out & 0x33) << 2;
out = (out & 0xf0) >> 4 | (out & 0x0f) << 4;
figuring out how the last one works is an exercise for the reader ;-)
Knuth has a section on bit reversal in The Art of Computer Programming, Vol. 4A, Bitwise Tricks and Techniques.
To reverse the bits of a 32-bit number in a divide-and-conquer fashion he uses the magic constants
u0 = 0101010101010101, (from -1/(2+1))
u1 = 0011001100110011, (from -1/(4+1))
u2 = 0000111100001111, (from -1/(16+1))
u3 = 0000000011111111, (from -1/(256+1))
The method is credited to Henry S. Warren, Jr., Hacker's Delight.
// x is the 32-bit unsigned value to be reversed
unsigned int u0 = 0x55555555;
x = (((x >> 1) & u0) | ((x & u0) << 1));
unsigned int u1 = 0x33333333;
x = (((x >> 2) & u1) | ((x & u1) << 2));
unsigned int u2 = 0x0f0f0f0f;
x = (((x >> 4) & u2) | ((x & u2) << 4));
unsigned int u3 = 0x00ff00ff;
x = (((x >> 8) & u3) | ((x & u3) << 8));
x = (x >> 16) | (x << 16); // reversed (the "mod 2^32" is implicit for a 32-bit unsigned int)
The 16 and 8 bit cases are left as an exercise to the reader.
Well, this might not be the most elegant solution but it is a solution:
unsigned int reverseBits(unsigned int x) {
    unsigned int res = 0;
    int len = sizeof(x) * 8; // number of bits to reverse
    int i, shift;
    unsigned int mask;
    for (i = 0; i < len; i++) {
        mask = 1u << i;                                    // which bit we are at
        shift = len - 2 * i - 1;                           // distance to its mirrored position
        mask &= x;
        mask = (shift > 0) ? mask << shift : mask >> -shift;
        res |= mask;                                       // drop the bit into its reflected place
    }
    return res;
}
Tested it on a sheet of paper and it seemed to work :D
Edit: Yeah, this is indeed very complicated. I dunno why, but I wanted to find a solution without touching the input, so this came to my head.