How does this reverseBytes method work? - c

I was looking at this function online and am wondering how it works:
/*
* reverseBytes - reverse bytes
* Example: reverseBytes(0x12345678) = 0x78563412
* Legal ops: ! ~ & ^ | + << >>
*/
int reverseBytes(int x)
{
int newbyte0 = (x >> 24) & 0xff;
int newbyte1 = (x >> 8) & 0xff00;
int newbyte2 = (x << 8) & 0xff0000;
int newbyte3 = x << 24;
return newbyte0 | newbyte1 | newbyte2 | newbyte3;
}
Here's what I think I understand:
0xff, 0xff00, and 0xff0000 in binary are 1111 1111, 1111 1111 0000 0000, and 1111 1111 0000 0000 0000 0000 respectively
The method creates four new bytes with masks (0xff, etc), and then adds their values together using the | operator
I really don't get how this reverses the bytes though. I would appreciate a detailed explanation. Thanks!

The code assumes a 32 bit integer and 8 bit bytes. A 32 bit integer is made up of 4 bytes:
Let's say these 4 bytes are laid out in memory like so:
+---------------------------------+
|Byte 4 | Byte 3 | Byte 2 | Byte 1|
+---------------------------------+
This relates to the endianness of a given CPU type. When interpreting an integer that is made up of several bytes, some CPU families treat the byte with the lowest memory address as the most significant byte of the integer; such CPUs are called big endian. Other CPUs do the reverse and treat the byte with the highest memory address as the most significant byte; these are little endian CPUs. So your function converts an integer from one byte order to the other.
int newbyte0 = (x >> 24) & 0xff;
This takes the integer (the 4 bytes) depicted above, shifts it 24 bits to the right, and masks off everything but the lower 8 bits. newbyte0 now looks like this, where Byte 4 is the original Byte 4 of x and the other 3 bytes have all bits set to zero.
+---------------------------------+
| 0 | 0 | 0 | Byte 4 |
+---------------------------------+
Similarly,
int newbyte1 = (x >> 8) & 0xff00;
shifts the bits 8 bits to the right, and masks off everything but the 8 bits in the second byte from the right. The result looks like this, with only Byte 3 remaining of the original value of x:
+---------------------------------+
| 0 | 0 | Byte 3 | 0 |
+---------------------------------+
The 2 leftmost bytes are handled similarly, just x is shifted left to accomplish the same thing.
Finally you have
newbyte0 | newbyte1 | newbyte2 | newbyte3;
Which combines all the integers you created above, each with only 8 bits remaining from the original x. Do a bitwise or of them, and you end up with
+---------------------------------+
|Byte 1 | Byte 2 | Byte 3 | Byte 4|
+---------------------------------+
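The steps above can be sketched on an unsigned type, which avoids any question about how a signed value is shifted. This is a minimal sketch, not the original function: the name reverse_bytes_u32 and the use of uint32_t are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical unsigned variant of the byte reversal described above. */
uint32_t reverse_bytes_u32(uint32_t x)
{
    uint32_t b0 = (x >> 24) & 0x000000ffu; /* Byte 4 moves to the position of Byte 1 */
    uint32_t b1 = (x >> 8)  & 0x0000ff00u; /* Byte 3 moves to the position of Byte 2 */
    uint32_t b2 = (x << 8)  & 0x00ff0000u; /* Byte 2 moves to the position of Byte 3 */
    uint32_t b3 = (x << 24);               /* Byte 1 moves to the position of Byte 4 */
    return b0 | b1 | b2 | b3;              /* combine the four isolated bytes */
}
```

Calling reverse_bytes_u32(0x12345678u) yields 0x78563412, and applying the function twice gives back the original value.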

int newbyte0 = (x >> 24) & 0xff;
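Shifts the number 24 bits to the right, so that the left-most byte will now be the right-most byte. It then uses a mask (0xff) to zero out the rest of the bytes. Because x is a signed int, the mask is needed: right-shifting a negative value is implementation-defined, and most compilers copy the sign bit into the vacated upper bits instead of filling them with zeros. Only if x were unsigned would the shift alone zero them, so that the mask could be omitted.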
int newbyte1 = (x >> 8) & 0xff00;
Shifts the number 8 bits to the right, so that the second byte from the left is now the second byte from the right, and the rest of the bytes are zeroed out with a mask.
int newbyte2 = (x << 8) & 0xff0000;
Shifts the number 8 bits to the left this time - essentially the same thing as the last line, only now the second byte from the right becomes the second byte from the left.
int newbyte3 = x << 24;
The same as the first line, but no mask is needed this time because a left shift fills the vacated low bits with zeros - the right-most byte becomes the left-most byte.
return newbyte0 | newbyte1 | newbyte2 | newbyte3;
And finally you just OR all the bytes to finish the reversal.
You can actually follow this process step-by-step in code by using printf("%x", newbyte) to print each of the bytes - the %x format allows you to print in hexadecimal.
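The role of the 0xff mask in the first line can be checked with a small helper. This is only a sketch: top_byte is a hypothetical name, and int32_t stands in for the 32 bit int the question assumes.

```c
#include <stdint.h>

/* Hypothetical helper: isolate the most significant byte of a signed 32-bit value. */
int32_t top_byte(int32_t x)
{
    /* For negative x the shift result is implementation-defined (usually the
       sign bit is copied into the upper bits); the mask discards those bits. */
    return (x >> 24) & 0xff;
}
```

With the mask in place, top_byte(-1) is 0xff whether the compiler shifts in sign bits or zeros. Without the mask, a compiler that copies the sign bit would return -1 instead.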

Let's assume a 32 bit system and that you have passed 0x12345678 to the function.
int newbyte0 = (x >> 24) & 0xff; //will be 0x00000012
int newbyte1 = (x >> 8) & 0xff00; //will be 0x00003400
int newbyte2 = (x << 8) & 0xff0000; //will be 0x00560000
int newbyte3 = x << 24; //will be 0x78000000
return newbyte0 | newbyte1 | newbyte2 | newbyte3; //will be 0x78563412

This function just shifts each byte to the right position in the integer and then ORs all of them together.
For example, if x is 0xAABBCCDD (treated as unsigned here, so that right shifts fill with zeros):
For the first byte we shift all the bytes to the right, so we have 0x000000AA & 0xFF, which is 0xAA.
For the second byte we have 0x00AABBCC & 0xFF00, which is 0x0000BB00.
And so on.
We just shift the bits to the right position and erase all other bits.

Yes, you understand the code correctly, but of course it assumes that int is a 32 bit value.
int newbyte0 = (x >> 24) & 0xff; // Shift the bits 24~31 to 0~7
int newbyte1 = (x >> 8) & 0xff00; // Shift the bits 16~23 to 8~15
int newbyte2 = (x << 8) & 0xff0000; // Shift the bits 8~15 to 16~23
int newbyte3 = x << 24; // Shift bits 0~7 to 24~31
return newbyte0 | newbyte1 | newbyte2 | newbyte3; // Join all the bits

Related

Can someone explain this line of C code to me (pointer arithmetic, bit shift)?

Let *c be a 32 bit value in memory and xmc[] an array of 32 bit values in memory (abstractly: a network packet)
xmc[0] = (*c >> 4) & 0x7;
xmc[1] = (*c >> 1) & 0x7;
xmc[2] = (*c++ & 0x1) << 2;
xmc[2] |= (*c >> 6) & 0x3;
xmc[3] = (*c >> 3) & 0x7;
What do the xmc[2] lines of code do to the value (thinking in binary)?
I tried to look up the arithmetic, but I failed to understand the part beginning with *c++.
EDIT: Added more context for clarification
Dereferencing and increment:
First, you are taking the value stored at the address that the pointer c points to, and then incrementing the pointer so that it points to the next element.
Bitwise AND with a mask: A bitwise AND (&) is done with a mask of value 0x1 (decimal 1), which means that only the least significant bit of the value read through c is kept.
Think about it like this: you can have a 4-bit variable called a with a decimal value of 3 (binary 0011), and do a bitwise AND between a and a mask b of decimal value 2 (binary 0010):
a = 0011
b = 0010
A bitwise AND (a & b, or a & 0x2) computes an AND between each pair of corresponding bits from a and b. The least significant bit of a is 1 and the least significant bit of b is 0, so the least significant bit of the result is 1 & 0 = 0. Moving on to the second bit of each variable, the second least significant bit of the result is 1 & 1 = 1, and so on. The result here is 0010.
AND with such a mask is typically used to take a certain bit (or a group of bits) from a value stored in a variable. In your case, the code takes the least significant bit of the value that c points to.
Left shift: The left shift << moves the least significant bit two positions to the left (e.g. from 0001 to 0100), filling the two vacated positions on the right with 0 bits.
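The masking and shifting described above can be tried out with two small helpers. This is a sketch only: apply_mask and lsb_shifted_by_two are hypothetical names, and uint8_t is used just to keep the values small.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the bits of value that are also set in mask. */
uint8_t apply_mask(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value & mask);
}

/* Hypothetical helper: take the least significant bit and move it two
   positions to the left, as in (value & 0x1) << 2. */
uint8_t lsb_shifted_by_two(uint8_t value)
{
    return (uint8_t)((value & 0x1u) << 2);
}
```

With a = 3 (0011) and a mask of 2 (0010), apply_mask returns 2 (0010). lsb_shifted_by_two returns 4 (0100) for any odd input and 0 for any even input.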
Let's assume that we are operating on an unsigned 32 bit value. Then the code
xmc[2] = (*c++ & 0x1) << 2;
is equivalent to
uint32_t tmp1 = *c; // Read the value that c points to and
c = c + 1; // increment the pointer c
// These two lines are the *c++ part
uint32_t tmp2 = tmp1 & 0x1; // Make tmp2 equal to the least significant bit of tmp1
// i.e. tmp2 will be 1 if tmp1 is odd and
// tmp2 will be 0 if tmp1 is even
uint32_t tmp3 = tmp2 << 2; // Make tmp3 equal to tmp2 shifted 2 bits to the left
// This is the same as: tmp3 = tmp2 * 4
xmc[2] = tmp3; // Save the result in xmc[2]
In pseudo code this means:
If the value pointed to by c is odd, set xmc[2] to 4
If the value pointed to by c is even, set xmc[2] to 0
Increment the pointer c
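The expansion above can be wrapped in a function to observe both effects, the computed value and the advanced pointer. This is a sketch under the same unsigned 32 bit assumption; take_low_bit_and_advance is a hypothetical name, and the pointer is passed by address so the caller can see it move.

```c
#include <stdint.h>

/* Hypothetical helper: same effect as (*c++ & 0x1) << 2 on the caller's pointer. */
uint32_t take_low_bit_and_advance(const uint32_t **c)
{
    uint32_t value = **c;        /* read the value that c points to */
    *c = *c + 1;                 /* advance c to the next element */
    return (value & 0x1u) << 2;  /* 4 if the value was odd, 0 if it was even */
}
```

For an array {7, 8} and a pointer to its first element, the first call returns 4 and leaves the pointer at the second element; the second call returns 0.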
Today's date could be said to be 20230215.
If you have that as a number, you could extract the components as follows:
n = 20230215;
y = n / 10000 % 10000;
m = n / 100 % 100;
d = n / 1 % 100;
The code in question does the same thing. It's extracting four values (a, b, c and d) spread over two bytes.
c[0] c[1]
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| | a | a | a | b | b | b | c | | c | c | d | d | d | | | |
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
Since we want to extract bits instead of digits, we need to use powers of two instead of using powers of ten. When dealing with powers of two, >> can be used in lieu of division, and & can be used in lieu of %.
To extract a, b, c and d, we could use the following:
n = ( c[0] << 8 ) | c[1];
xmc[0] = ( n >> 12 ) & 0x7;
xmc[1] = ( n >> 9 ) & 0x7;
xmc[2] = ( n >> 6 ) & 0x7;
xmc[3] = ( n >> 3 ) & 0x7;
The posted code takes an approach that avoids calculating n, but it does the same thing.
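That the two approaches agree can be checked with a sketch like the following. It assumes, as the diagram does, that c points to 8-bit bytes; unpack_via_n and unpack_per_byte are hypothetical names.

```c
#include <stdint.h>

/* Variant 1: combine both bytes into one 16-bit number n, then shift and mask. */
void unpack_via_n(const uint8_t c[2], uint8_t xmc[4])
{
    uint16_t n = (uint16_t)((c[0] << 8) | c[1]);
    xmc[0] = (uint8_t)((n >> 12) & 0x7);
    xmc[1] = (uint8_t)((n >> 9) & 0x7);
    xmc[2] = (uint8_t)((n >> 6) & 0x7);
    xmc[3] = (uint8_t)((n >> 3) & 0x7);
}

/* Variant 2: the per-byte approach of the posted code, written with indices. */
void unpack_per_byte(const uint8_t c[2], uint8_t xmc[4])
{
    xmc[0] = (uint8_t)((c[0] >> 4) & 0x7);
    xmc[1] = (uint8_t)((c[0] >> 1) & 0x7);
    xmc[2] = (uint8_t)((c[0] & 0x1) << 2);   /* top bit of the straddling field */
    xmc[2] |= (uint8_t)((c[1] >> 6) & 0x3);  /* its two low bits from the next byte */
    xmc[3] = (uint8_t)((c[1] >> 3) & 0x7);
}
```

For the bytes 0x57 and 0xB0 (binary 0101 0111 and 1011 0000), both variants produce the fields 5, 3, 6 and 6.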
++ here is the post-increment operator in C: the expression c++ evaluates to the old value of c, and c is incremented as a side effect that is complete by the end of the statement.
xmc[2] = (*c++ & 0x1) << 2;
This statement can be thought of as the following steps:
de-reference : *c
bit-wise AND : *c & 0x1
left-shift : (*c & 0x1) << 2
post-increment : c++
assign to left of = : xmc[2] = the result of the 3rd step.
But the compiler is free to order these operations differently, for example by incrementing c before the bit-wise AND while keeping the old value in a register, as long as the result is the same. Differences might therefore be found in the final assembly code.
The posted code could be more easily read if it were, equivalently:
xmc[0] = (c[0] >> 4) & 0x7;
xmc[1] = (c[0] >> 1) & 0x7;
xmc[2] = (c[0] & 0x1) << 2;
xmc[2] |= (c[1] >> 6) & 0x3;
xmc[3] = (c[1] >> 3) & 0x7;
(And then, if later code depended on c being advanced, this could be followed by a standalone c++.)
It looks like xmc[0] is taken from three particular bits of c[0], and xmc[1] is taken from three other bits of c[0]. Then the next field straddles a word boundary, in that xmc[2] is composed by putting together one bit from c[0], and two bits from c[1]. Finally xmc[3] is taken from three other bits of c[1].
@ikegami's answer has more of the details.

How to make a word from Byte []?

I am new to the programming world and want to convert two bytes into a word.
Basically, I have a byte array where index 0 is Buffer[0]=08 and index 1 is Buffer[1]=06.
I want to create a word from these two bytes,
so that word ETHType is 0x0806.
You would use bitwise operators and bit shifting.
uint16_t result = ((uint16_t)Buffer[0] << 8) | Buffer[1];
This does the following:
The value of Buffer[0] is shifted 8 bits to the left. That gives you 0x0800
A bitwise OR is performed between the prior value and the value of Buffer[1]. This sets the low order 8 bits to Buffer[1], giving you 0x0806
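The two steps above can be wrapped in a small function. This is a sketch; make_word and its parameter names hi and lo are assumptions chosen to make the byte order explicit.

```c
#include <stdint.h>

/* Hypothetical helper: combine a high byte and a low byte into one 16-bit word. */
uint16_t make_word(uint8_t hi, uint8_t lo)
{
    /* The high byte is shifted into bits 8..15, the low byte fills bits 0..7. */
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

With Buffer[0] = 0x08 and Buffer[1] = 0x06, make_word(Buffer[0], Buffer[1]) gives 0x0806.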
word ETHType = 0;
ETHType = (Buffer[0] << 8) | Buffer[1];
edit: You should probably add a mask to each operation to make sure there is no bit overlap, and you can do it safely in one line.
word ETHType = (unsigned short)(((unsigned char)(Buffer[0] & 0xFF) << 8) | (unsigned char)(Buffer[1] & 0xFF)) & 0xFFFF;
Here's a macro I use for this very thing (a8 is the low byte, b8 is the high byte):
#include <stdint.h>
#define MAKE16(a8, b8) ((uint16_t)(((uint8_t)(((uint32_t)(a8)) & 0xff)) | ((uint16_t)((uint8_t)(((uint32_t)(b8)) & 0xff))) << 8))

Checking byte with bitwise operators

I could use this:
unsigned long alpha = 140 | 130 << 8 | 255 << 16;
to set 140 as the first byte of alpha, 130 as the second and 255 as the 3rd.
How do I do the opposite (i.e. check a specific byte of alpha)?
alpha & 255 // works for the first byte
alpha >> 16; // works for the 3rd byte
Shift the value x bits to the right and then use AND to restrict the number of bits you use, e.g. (n >> 8) & 0xff or (n >> 16) & 0xff.
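The shift-then-mask pattern generalizes to any byte position. A minimal sketch, where get_byte is a hypothetical name and index 0 means the least significant byte:

```c
#include <stdint.h>

/* Hypothetical helper: return byte number index of value (0 = least significant). */
uint8_t get_byte(unsigned long value, unsigned index)
{
    /* Shift the wanted byte down to bits 0..7, then mask off everything else. */
    return (uint8_t)((value >> (8u * index)) & 0xffu);
}
```

For alpha = 140 | 130 << 8 | 255 << 16, get_byte(alpha, 0) is 140, get_byte(alpha, 1) is 130 and get_byte(alpha, 2) is 255. The index must stay below the number of bytes in unsigned long.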

Casting 8-bit int to 32-bit

I think I confused myself with endianness and bit-shifting, please help.
I have 4 8-bit ints which I want to convert to a 32-bit int. This is what I am doing:
uint h;
t_uint8 ff[4] = {1,2,3,4};
if (BIG_ENDIAN) {
h = ((int)ff[0] << 24) | ((int)ff[1] << 16) | ((int)ff[2] << 8) | ((int)ff[3]);
}
else {
h = ((int)ff[0] >> 24) | ((int)ff[1] >> 16) | ((int)ff[2] >> 8) | ((int)ff[3]);
}
However, this seems to produce a wrong result. With a little experimentation I realised that it should be other way round: in the case of big endian I am supposed to shift bits to the right, and otherwise to the left. However, I don't understand WHY.
This is how I understand it. Big endian means most significant byte first (first means leftmost, right? perhaps this is where I am wrong). So, converting an 8-bit int to a 32-bit int would prepend 24 zeros to my existing 8 bits. So, to make it the 1st byte I need to shift the bits 24 to the left.
Please point out where I am wrong.
You always have to shift the 8-bit-values left. But in the little-endian case, you have to change the order of indices, so that the fourth byte goes into the most-significant position, and the first byte into the least-significant.
if (BIG_ENDIAN) {
h = ((int)ff[0] << 24) | ((int)ff[1] << 16) | ((int)ff[2] << 8) | ((int)ff[3]);
}
else {
h = ((int)ff[3] << 24) | ((int)ff[2] << 16) | ((int)ff[1] << 8) | ((int)ff[0]);
}
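The two branches can also be written as separate functions on unsigned types. This is a sketch: load_be32 and load_le32 are hypothetical names, and the casts to uint32_t (instead of int) are a deliberate change so that a byte value of 128 or more shifted by 24 does not overflow a signed int.

```c
#include <stdint.h>

/* Hypothetical helper: b[0] is the most significant byte (big-endian order). */
uint32_t load_be32(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical helper: b[0] is the least significant byte (little-endian order). */
uint32_t load_le32(const uint8_t b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

For ff = {1, 2, 3, 4}, load_be32 gives 0x01020304 and load_le32 gives 0x04030201. In both cases every byte is shifted left; only the order of the indices differs.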

Setting individual bits in a byte by group of bits

For example:
We have a byte A: XXXX XXXX
We have a byte B: 0000 0110
Now suppose we want to take 4 bits from a specific position in byte B and put them at a specific position in byte A, so that the result is:
We have a byte A: 0110 XXXX
I'm still searching for a magic function, without success.
I found something similar and am reworking it, but I still haven't got it to do what I need:
unsigned int i, j; // positions of bit sequences to swap
unsigned int n; // number of consecutive bits in each sequence
unsigned int b; // bits to swap reside in b
unsigned int r; // bit-swapped result goes here
unsigned int x = ((b >> i) ^ (b >> j)) & ((1U << n) - 1); // XOR temporary
r = b ^ ((x << i) | (x << j));
As an example of swapping ranges of bits, suppose we have b = 00101111 (expressed in binary) and we want to swap the n = 3 consecutive bits starting at i = 1 (the second bit from the right) with the 3 consecutive bits starting at j = 5; the result would be r = 11100011 (binary).
This method of swapping is similar to the general purpose XOR swap trick, but intended for operating on individual bits. The variable x stores the result of XORing the pairs of bit values we want to swap, and then the bits are set to the result of themselves XORed with x. Of course, the result is undefined if the sequences overlap.
It's hard to understand your requirements exactly, so correct me if I'm wrong:
You want to take the last 4 bits of byte B and add them to the first four bits of byte A? You use the term 'put inside', but it's unclear exactly what you mean by it (if not adding, do you mean replacing?).
So, assuming addition is what you want, you could do something like this:
A = A | (B <<4)
This shifts B by 4 bits to the left (ending up with 01100000) and then 'adds' it to A (using OR).
byte A: YYYY XXXX
byte B: 0000 0110
and you want 0110 XXXX
So AND A with 00001111, then copy in the last 4 bits of B (first shift, then OR):
a &= 0x0F; //now a is XXXX
a |= (b << 4); //shift B to 01100000 then OR to get your result
If you wanted 0110 YYYY, just shift a by 4 to the right instead of doing the AND:
a >>= 4
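The AND-then-OR variant can be wrapped in a function. A minimal sketch; replace_high_nibble is a hypothetical name, and masking b with 0x0F is an added safety step in case b has bits set above its low nibble.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low nibble of a, take the low nibble of b
   as the new high nibble. */
uint8_t replace_high_nibble(uint8_t a, uint8_t b)
{
    uint8_t low  = (uint8_t)(a & 0x0Fu);         /* a becomes 0000 XXXX */
    uint8_t high = (uint8_t)((b & 0x0Fu) << 4);  /* b becomes 0110 0000 for b = 0000 0110 */
    return (uint8_t)(high | low);
}
```

For a = 0xA5 (1010 0101) and b = 0x06 (0000 0110), the result is 0x65 (0110 0101).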
Found a solution:
x = ((b >> i) ^ (r >> j)) & ((1U << n) - 1);
r = r ^ (x << j);
where b is the source byte, r is the 2nd (destination) byte, and i, j are the bit indexes in order (from, to).
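The XOR-based copy can be sketched as a function that returns the updated byte instead of modifying r in place. copy_bits is a hypothetical name; i + n and j + n are assumed to stay within 8 bits.

```c
#include <stdint.h>

/* Hypothetical helper: copy n bits starting at bit i of b into r starting at bit j. */
uint8_t copy_bits(uint8_t b, uint8_t r, unsigned i, unsigned j, unsigned n)
{
    /* x has a 1 wherever the source field and the destination field differ. */
    unsigned x = (((unsigned)b >> i) ^ ((unsigned)r >> j)) & ((1u << n) - 1u);
    /* Flipping exactly those bits in r makes its field equal to the source field. */
    return (uint8_t)(r ^ (x << j));
}
```

For the example in the question, copy_bits(0x06, 0xA5, 0, 4, 4) copies the low four bits of B (0110) into the high four bits of A and returns 0x65 (0110 0101).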

Resources