Bit Twiddling - Confused With This Program's Output - c

So I was messing around with Bit-Twiddling in C, and I came across an interesting output:
int main()
{
int a = 0x00FF00FF;
int b = 0xFFFF0000;
int res = (~b & a);
printf("%.8X\n", (res << 8) | (b >> 24));
}
And the output from this statement is:
FFFFFFFF
I expected the output to be
0000FFFF
But why wasn't it? Am I missing something with bit-shifting here?

TLDR: Your integer b is negative so when you shift it right the value of the uppermost bit (i.e. 1) remains the same. Therefore when you shift b right by 24 places you end up with 0xFFFFFFFF.
Longer explanation:
Assuming that on your platform an int is 32 bits wide and signed integers use 2's complement, the 0xFFFF0000 assigned to a signed integer variable becomes a negative number. (The constant itself has type unsigned int because it does not fit in a 32-bit int; converting it to int is implementation-defined and in practice keeps the bit pattern.) If an int is wider than 32 bits, 0xFFFF0000 fits as a positive value and the problem does not arise.
Shifting a negative number right is implementation-defined by the standard (C99 / N1256, section 6.5.7, paragraph 5):
The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...] If E1
has a signed type and a negative value, the resulting value is
implementation defined.
That means a particular compiler can choose what happens in a particular situation, but it should be noted in the compiler manual what the effect is.
There tend to be two sets of shift instructions in many processors, a logical shift and an arithmetic shift. The logical shift right will shift bits and fill the exposed bits with zeros. Arithmetic shifts right (assuming 2's complement again) will fill the exposed bits with the same bit value of the most significant bit so that it ends up with a result that is consistent with using shifts as a divide by 2. (For example, -4 >> 1 == 0xFFFFFFFC >> 1 == 0xFFFFFFFE == -2.)
In your case it appears that the compiler implementor has chosen to use arithmetic shifts when applied to signed integers and so the result of shifting a negative value to the right remains a negative value. In terms of bit patterns 0xFFFF0000 >> 24 gives 0xFFFFFFFF.
Unless you are absolutely sure of what you are doing, it is best to perform bitwise operations only on unsigned types, as their internal representation can safely be treated as a collection of bits. You probably also want to make sure any numeric constants you use in that case are unsigned by appending the unsigned suffix (u) to them.

Right-shifting negative values (like b) can be defined in two different ways: logical shift, which pads the value with zeroes on the left (which yields a positive number when shifting a nonzero amount), and arithmetic shift, which pads the value with ones (always yielding a negative number). Which definition is used in C is implementation-defined, and your compiler apparently uses arithmetic shift, so b >> 24 is 0xFFFFFFFF.

b >> 24 gives 0xFFFFFFFF signed right pad of negative number
List = (res << 8) | (b >> 24)
a = 0x00FF00FF = 0000 0000 1111 1111 0000 0000 1111 1111
b = 0xFFFF0000 = 1111 1111 1111 1111 0000 0000 0000 0000
~b = 0x0000FFFF = 0000 0000 0000 0000 1111 1111 1111 1111
~b & a = 0x000000FF = 0000 0000 0000 0000 0000 0000 1111 1111, = res
res << 8 = 0x0000FF00 = 0000 0000 0000 0000 1111 1111 0000 0000
b >> 24 = 0xFFFFFFFF = 1111 1111 1111 1111 1111 1111 1111 1111
List = 0xFFFFFFFF = 1111 1111 1111 1111 1111 1111 1111 1111

The golden rule: Never ever mix signed numbers with bitwise operators.
Change all ints to unsigned ints. Just as a precaution, change all literals to unsigned too.
#include <stdint.h>
uint32_t a = 0x00FF00FFu;
uint32_t b = 0xFFFF0000u;
uint32_t res = (~b & a);
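As a minimal sketch of that advice, the question's computation can be wrapped in a function that works only on unsigned 32-bit values. The function name combine is hypothetical and not from the original post.

```c
#include <stdint.h>

/* Same computation as in the question, but on unsigned 32-bit values,
 * so the right shift always fills with zeros. */
uint32_t combine(uint32_t a, uint32_t b)
{
    uint32_t res = ~b & a;          /* 0x000000FF for the question's inputs */
    return (res << 8) | (b >> 24);  /* 0x0000FF00 | 0x000000FF */
}
```

With a = 0x00FF00FFu and b = 0xFFFF0000u this returns 0x0000FFFF, which is the value the question expected.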

Related

C rotate bitfield (explanation needed)

I want to rotate a byte 1 bit to the left. I was looking at several examples from this site and came across this code. Though it works, I'd appreciate a step-by-step explanation of how it works.
unsigned int _rotl(const unsigned int value, int shift)
{
if ((shift &= sizeof(value)*8 - 1) == 0) //What does this do?
return value;
return (value << shift) | (value >> (sizeof(value)*8 - shift));
}
First of all, what does the first part do? And for the last part, wouldn't shifting by that much pretty much erase some of the bits?
For example:
Say
value= 0x50 //01010000
shift = 4
I'll have
for
value << shift
01010000 << 4 => 00000000
And for
value >> (sizeof(value)*8 - shift)
01010000>> (4*8 - 4) => 00000000
So doing the OR operation for both would give me 0. My understanding is obviously wrong, but I'd appreciate anyone 'dumbing' it down for a beginner like me. Thanks.
Let's take it step by step:
unsigned int _rotl(const unsigned int value, int shift)
{
if ((shift &= sizeof(value)*8 - 1) == 0) //What does this do?
return value;
return (value << shift) | (value >> (sizeof(value)*8 - shift));
}
First line:
if ((shift &= sizeof(value)*8 - 1) == 0) //What does this do?
This statement is an optimization and a check rolled into one line. It returns TRUE if one of two conditions is met:
Shift is zero (i.e. don't rotate any bits)
The number of rotations specified by shift would have no effect
That statement returns FALSE otherwise, but also calculates the minimum number of rotations necessary to achieve the desired result, and stores that value in shift. In other words, it calculates shift = shift % size_of_integer_data_type.
For example, if you have a 32-bit integer, then rotating it by 32 bits does nothing. If you rotate it by 64, 96, or any other multiple of 32, that also accomplishes nothing. If the effect of our rotation does nothing then we save ourselves a lot of time and just quit early.
However, we might also specify a lot more work than is necessary. If you have a 32-bit integer, then rotating it by one bit has the same effect as rotating it by 33 bits, or by 65 bits, or 97 bits, etc. This code recognizes this fact, so if you specify shift as 97, it reassigns shift=1 and cuts out the extraneous rotations.
The statement sizeof(value)*8 - 1 returns one less than the number of bits in the representation of value. For example, if sizeof(value) evaluates to 4 (which it will on a system with 32-bit integers), then 4*8 - 1 = 31.
The &= operator is a bitwise-AND with assignment. This means we're doing a bitwise-AND between shift and sizeof(value)*8 - 1 and assigning that result to shift. As before, the right hand side of that expression is equal to the number of bits in value minus one. Thus, this has the effect of masking out all bits of shift that are greater than the size of the representation of value, which in turn has the effect of computing shift = shift % size_of_integer_data_type.
To be concrete, reconsider the 32-bit case. As before, sizeof(value)*8-1 evaluates to 31. Bitwise, this value is 0000 0000 0000 0000 0000 0000 0001 1111. That value is bitwise-ANDed with shift. Any bits in shift's 6th to 32nd positions are set to zero, while the bits in the 1st to 5th positions are unchanged. If you were to specify 97 rotations the result would be one.
0000 0000 0000 0000 0000 0000 0110 0001 (97)
& 0000 0000 0000 0000 0000 0000 0001 1111 (31)
=========================================
0000 0000 0000 0000 0000 0000 0000 0001 (1)
The last thing to do here is to recall that, in C, the value of an assignment expression is the value that was assigned. Thus, if the new value of shift is zero, then we return immediately, otherwise we continue.
Second line:
return (value << shift) | (value >> (sizeof(value)*8 - shift));
Since C doesn't have a rotation operator (it only has left and right shifts) we have to compute the low-order bits and the high-order bits separately, and then combine them with a bitwise-OR. This line is a simple matter of calculating each side separately.
The statement value << shift calculates the high order bits. It shifts the bit pattern to the left by shift places. The other statement calculates the low order bits by shifting the bit pattern to the right by size_of_integer_type - shift bits. This is easy to see in an example.
Suppose that value has the decimal value 65535 and that shift has the value 26. Then the starting value is:
0000 0000 0000 0000 1111 1111 1111 1111 (65535)
The left shift gives us:
1111 1100 0000 0000 0000 0000 0000 0000 (65535 << 26)
The right shift gives us:
0000 0000 0000 0000 0000 0011 1111 1111 (65535 >> 6)
Then the bitwise-OR combines these results:
1111 1100 0000 0000 0000 0000 0000 0000 (65535 << 26)
| 0000 0000 0000 0000 0000 0011 1111 1111 (65535 >> 6)
=========================================
1111 1100 0000 0000 0000 0011 1111 1111 (65535 rot 26)
You could re-write this code and achieve the same correct result:
unsigned int _rotl(const unsigned int value, int shift)
{
//Assume 8 bits in a byte
unsigned bits_in_integer_type = sizeof(value)*8;
shift = shift % bits_in_integer_type;
if( shift == 0 ) return value; //rotation does nothing
unsigned high_bits = value << shift;
unsigned low_bits = value >> (bits_in_integer_type - shift);
return high_bits | low_bits;
}
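The same steps can be sketched for a fixed 32-bit width, which makes the worked example above easy to check. This is only an illustration: the name rotl32 and the use of uint32_t with an unsigned shift count are assumptions, not part of the original code.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by shift bits (hypothetical helper). */
uint32_t rotl32(uint32_t value, unsigned int shift)
{
    shift &= 31u;                 /* same as shift % 32 */
    if (shift == 0)
        return value;             /* avoids the undefined shift by 32 below */
    return (value << shift) | (value >> (32u - shift));
}
```

For the example above, rotl32(65535u, 26) gives 0xFC0003FF, and a shift of 97 behaves like a shift of 1.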
unsigned int _rotl(const unsigned int value, int shift)
{
// If the masked shift count is zero, do nothing and return
int bitmaskOfAllOnes = sizeof(value)*8 - 1;
if ((shift &= bitmaskOfAllOnes) == 0) // reduce shift modulo the bit width
return value;
// Shift value to the left
return (value << shift)
// shifting right to handle the wrap around
| (value >> (sizeof(value)*8 - shift));
}
e.g. using 16-bit ints
value = 0x1001
shift = 4
sizeof(value) => 2
//sizeof(value)*8 - 1 => 0xffff
sizeof(value)*8 - 1 => 15 Decimal
shift &= bitmaskOfAllOnes => 4 (i.e. not zero)
value << shift => 0x0010
sizeof(value)*8 - shift => 12
value >> (sizeof(value)*8 - shift) => 0x0001
0x0010 | 0x0001 => 0x0011

Calculation of Bit wise NOT

How to calculate ~a manually? I am seeing these types of questions very often.
#include <stdio.h>
int main()
{
unsigned int a = 10;
a = ~a;
printf("%d\n", a);
}
The result of the ~ operator is the bitwise complement of its (promoted) operand
C11dr §6.5.3.3
When used with unsigned, it is sufficient to mimic ~ with exclusive-or with UINT_MAX which is the same type and value as (unsigned) -1.
unsigned int a = 10;
// a = ~a;
a ^= -1;
You could XOR it with a bitmask of all 1's.
unsigned int a = 10, mask = 0xFFFFFFFF;
a = a ^ mask;
This is assuming of course that an int is 32 bits. That's why it makes more sense to just use ~.
Just convert the number to binary form, and change '1' by '0' and '0' by '1'.
That is:
10 (decimal)
Converted to binary (32 bits as usual in an int) gives us:
0000 0000 0000 0000 0000 0000 0000 1010
Then apply the ~ operator:
1111 1111 1111 1111 1111 1111 1111 0101
Now you have a number that could be interpreted as an unsigned 32-bit number or as a signed one. As you are using %d in your printf, it is interpreted as signed (strictly speaking, a is an unsigned int, so %u would be the matching format specifier).
To find the decimal value of a signed (two's-complement) number, do this:
If the most significant bit (the leftmost) is 0, then just convert the binary number back to decimal as usual.
If the most significant bit is 1 (our case here), then change '1' to '0' and '0' to '1', add 1, and convert to decimal, prepending a minus sign to the result.
So it is:
1111 1111 1111 1111 1111 1111 1111 0101
^
|
Its most significant bit is 1, so first we change 0 and 1
0000 0000 0000 0000 0000 0000 0000 1010
And then, we add 1
0000 0000 0000 0000 0000 0000 0000 1010
1
---------------------------------------
0000 0000 0000 0000 0000 0000 0000 1011
Take this number and convert it back to decimal, prepending a minus sign to the result. The converted value is 11. With the minus sign, it is -11.
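The manual steps above can be sketched in code. The helper name twos_complement_value is hypothetical, and the sketch assumes a fixed 32-bit pattern; the result is returned as int64_t so that every 32-bit pattern has a representable value.

```c
#include <stdint.h>

/* Interpret a 32-bit pattern as a two's-complement number,
 * following the manual procedure: if the top bit is set,
 * invert the bits, add 1, and negate the result. */
int64_t twos_complement_value(uint32_t bits)
{
    if ((bits & 0x80000000u) == 0)
        return (int64_t)bits;          /* top bit clear: plain conversion */
    uint32_t magnitude = ~bits + 1u;   /* invert and add one */
    return -(int64_t)magnitude;
}
```

For the example, ~(uint32_t)10 is the pattern 0xFFFFFFF5, and twos_complement_value returns -11 for it.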
This function shows the binary representation of an int and swaps the 0's and 1's:
void not(unsigned int x)
{
int i;
for(i=(sizeof(int)*8)-1; i>=0; i--)
(x&(1u<<i))?putchar('0'):putchar('1');
printf("\n");
}
Source: https://en.wikipedia.org/wiki/Bitwise_operations_in_C#Right_shift_.3E.3E

bitwise operations in c explanation

I have the following code in c:
unsigned int a = 60; /* 60 = 0011 1100 */
int c = 0;
c = ~a; /*-61 = 1100 0011 */
printf("c = ~a = %d\n", c );
c = a << 2; /* 240 = 1111 0000 */
printf("c = a << 2 = %d\n", c );
The first output is -61 while the second one is 240. Why does the first printf compute the two's complement of 1100 0011 while the second one just converts 1111 0000 to its decimal equivalent?
You have assumed that an int is only 8 bits wide. This is probably not the case on your system, which is likely to use 16 or 32 bits for int.
In the first example, all the bits are inverted. This is actually a straight inversion, not two's complement:
1111 1111 1111 1111 1111 1111 1100 0011 (32-bit)
1111 1111 1100 0011 (16-bit)
In the second example, when you shift it left by 2, the highest-order bit is still zero. You have misled yourself by depicting the numbers as 8 bits in your comments.
0000 0000 0000 0000 0000 0000 1111 0000 (32-bit)
0000 0000 1111 0000 (16-bit)
Try to avoid doing bitwise operations with signed integers -- often it'll lead you into undefined behavior.
The situation here is that you're taking unsigned values and assigning them to a signed variable. For ~60 the value does not fit in an int, so the result of the conversion is implementation-defined. You see it as -61 because the bit pattern of ~60 is also the two's-complement representation of -61. On the other hand, 60 << 2 comes out as expected because 240 has the same representation both as a signed and as an unsigned integer.

Sign-extend a number in C

I am having trouble trying to sign-extend a number that I extract from part of a bit-string. It goes wrong when the number is negative: the value wraps around to the positive side.
Here is my code:
// printf("add1 \n");
unsigned short r1 = (instruction>>6)&7;
signed short amount = (instruction& 31); //right here! i am trying to get the last 5 bits and store it in a register but i can't figure out how to make it negative if it is negative
// printf("\namount is %d \n", amount);
unsigned short dest = (instruction>>9)&7;
state->regs[dest] = state->regs[r1]+amount;
setCC(state,state->regs[r1]+amount);
For bit patterns, it's often easier to use hex constants instead of decimal.
signed short amount = (instruction & 0x1F);
Then to sign-extend the number, check the sign bit (assuming the sign bit here is the left-most of the 5 extracted bits). If it's set, take the 2's complement of the 5-bit value (invert and add 1), then take the 2's complement of the full-width result (invert and add 1). Note the extra parentheses: + binds more tightly than ^.
if (amount & 0x10)
amount = ~((amount ^ 0x1F) + 1) + 1;
Eg.
5-bit "bitfield"
X XXXX
0000 0000 0001 1111
0000 0000 0000 0000 invert x ^ 0x1F (= 1 1111)
0000 0000 0000 0001 add 1
1111 1111 1111 1110 invert ~
1111 1111 1111 1111 add 1
0000 0000 0001 0000
0000 0000 0000 1111 invert x ^ 0x1F (= 1 1111)
0000 0000 0001 0000 add 1
1111 1111 1110 1111 invert ~
1111 1111 1111 0000 add 1
Ooops. Even simpler:
-((x ^ 0x1F) + 1) Assuming the machine operates with 2's-complement
0000 0000 0001 0110
0000 0000 0000 1001 invert
0000 0000 0000 1010 add 1 (yielding the full-width absolute value)
1111 1111 1111 0110 negate
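Use bitfields:
union {
int a;
struct {
signed int a:5; // plain int bit-fields may be unsigned on some compilers
signed int b:3;
unsigned int c:20;
} b;
} u = { (int)0xdeadbeef }; // a union initializer needs braces
int b = u.b.b; // should sign extend the 3-bit bitfield starting from bit 5 (bit-field layout is implementation-defined)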
Here is how you can sign extend a 5-bit two's complement value portably without tests:
int amount = (instruction & 31) - ((instruction & 16) << 1);
More generally, if the field width is n, nonzero and less than the number of bits in an int, you can write:
int amount = (instruction & ~(~0U << (n - 1) << 1)) -
((instruction & (1U << (n - 1))) << 1);
From Hacker's Delight 2-6. Assuming 5 bits of data that must be sign extended (sign bit has value 16).
Best case: If the upper bits are all zeros:
(i ^ 16) - 16
Next best case (as with OP's instruction): If the upper bits contain data that must be discarded:
(i & 15) - (i & 16)
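As a small sketch of the xor-and-subtract form, here is the 5-bit case wrapped in a function. The name sign_extend5 is hypothetical; the mask step discards any upper bits first, so the "upper bits are all zeros" precondition holds before the xor.

```c
/* Sign-extend the low 5 bits of instruction (hypothetical helper). */
int sign_extend5(unsigned int instruction)
{
    int field = (int)(instruction & 0x1Fu);  /* keep only the 5-bit field */
    return (field ^ 16) - 16;                /* flip the sign bit, then subtract its weight */
}
```

For instance, the field 1 0110 from the earlier example yields -10, and 1 1111 yields -1.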
You can check the sign-bit and fix-up the result accordingly:
int width_of_field = 5;
signed short amount = (instruction& 31);
if (amount & (1 << width_of_field >> 1)) // look at the sign bit
{
amount -= 1 << width_of_field; // fix the result
}
Alternatively, use a left-shift followed by a right shift:
width_of_field = 5;
signed short amount = (instruction& 31);
// It is possible to omit the "& 31", because of the left shift below
amount <<= 16 - width_of_field;
amount >>= 16 - width_of_field;
Note: must use two statements to avoid effects of promotion to int (which presumably has 32 bits).

Unexpected C/C++ bitwise shift operators outcome

I think I'm going insane with this.
I have a piece of code that needs to create an (unsigned) integer with N consecutive bits set to 1. To be exact, I have a bitmask, and in some situations I'd like to set it to a solid range.
I have the following function:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
mask |= ((1 << count) - 1) << first;
}
In simple words: 1 << count in binary representation is 100...000 (number of zeroes is count), subtracting 1 from such a number gives 011...111, and then we just left-shift it by first.
The above should yield correct result, when the following obvious limitation is met:
first + count <= sizeof(UINT)*8 = 32
Note that it should also work correctly for "extreme" cases.
if count = 0 we have (1 << count) = 1, and hence ((1 << count) - 1) = 0.
if count = 32 we have (1 << count) = 0, since the leading bit overflows, and according to C/C++ rules bitwise shift operators are not cyclic. Then ((1 << count) - 1) = -1 (all bits set).
However, as turned out, for count = 32 the formula doesn't work as expected. As discovered:
UINT n = 32;
UINT x = 1 << n;
// the value of x is 1
Moreover, I'm using the MSVC2005 IDE. When I evaluate the above expression in the debugger, the result is 0. However, when I step over the above line, x gets the value 1. Looking at the disassembly we see the following:
mov eax,1
mov ecx,dword ptr [ebp-0Ch] // ecx = n
shl eax,cl // eax <<= LOBYTE(ecx)
mov dword ptr [ebp-18h],eax // x = eax
There's no magic indeed, the compiler just used the shl instruction. Then it seems that shl doesn't do what I expected it to do. Either the CPU decides to ignore this instruction, or the shift is treated modulo 32, or something else.
My questions are:
What is the correct behavior of shl/shr instructions?
Is there a CPU flag controlling the bitshift instructions?
Is this according to C/C++ standard?
Thanks in advance
Edit:
Thanks for answers. I've realized that (1) shl/shr indeed treat operand modulo 32 (or & 0x1F) and (2) C/C++ standard treats shift by more than 31 bits as undefined behavior.
Then I have one more question. How can I rewrite my "masking" expression to cover this extreme case too. It should be without branching (if, ?). What'd be the simplest expression?
1U << 32 is undefined behavior in C and in C++ when type unsigned int is 32-bit wide.
(C11, 6.5.7p3) "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined"
(C++11, 5.8p1) "The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand."
Shifting by as many or more bits than in the integer type you're shifting is undefined in C and C++. On x86 and x86_64, the shift amount of the shift instructions is indeed treated modulo 32 (or whatever the operand size is). You however cannot rely on this modulo behaviour to be generated by your compiler from C or C++ >>/<< operations unless your compiler explicitly guarantees it in its documentation.
I think the expression 1 << 32 is the same as 1 << 0. IA-32 Instruction Set Reference says that the count operand of shift instructions is masked to 5 bits.
The instruction set reference of IA-32 architectures can be found here.
To fix the "extreme" case, I can only come up with the following code (maybe buggy) that may be a little awkward:
void MaskAddRange(UINT *mask, UINT first, UINT count) {
int count2 = ((count & 0x20) >> 5);
int count1 = count - count2;
*mask |= (((1U << count1) << count2) - 1) << first; // 1U: shifting a signed 1 into the sign bit is undefined
}
The basic idea is to split the shift operation so that each shift count does not exceed 31.
Apparently, the above code assumes that the count is in a range of 0..32, so it is not very robust.
If I have understood the requirements, you want an unsigned int, with the top N bits set?
There are several ways to get the result (I think) you want.
Edit:
I am worried that this isn't very robust, and will fail for n>32:
uint32_t set_top_n(uint32_t n)
{
static uint32_t value[33] = { ~0xFFFFFFFF, ~0x7FFFFFFF, ~0x3FFFFFFF, ~0x1FFFFFFF,
~0x0FFFFFFF, ~0x07FFFFFF, ~0x03FFFFFF, ~0x01FFFFFF,
~0x00FFFFFF, ~0x007FFFFF, ~0x003FFFFF, ~0x001FFFFF,
// you get the idea
0xFFFFFFFF
};
return value[n & 0x3f];
}
This should be quite fast as it is only 132 bytes of data.
To make it robust, I'd either extend for all values up to 63, or make it conditional, in which case it can be done with a version of your original bit-masking + the 32 case. I.e.
My 32 cents:
#include <limits.h>
#define INT_BIT (CHAR_BIT * sizeof(int))
unsigned int set_bit_range(unsigned int n, int frm, int cnt)
{
return n | ((~0u >> (INT_BIT - cnt)) << frm);
}
List 1.
A safe version with bogus / semi-circular result could be:
unsigned int set_bit_range(unsigned int n, int f, int c)
{
return n | (~0u >> (c > INT_BIT ? 0 : INT_BIT - c)) << (f % INT_BIT);
}
List 2.
Doing this without branching, or local variables, could be something like;
return n | (~0u >> ((INT_BIT - c) % INT_BIT)) << (f % INT_BIT);
List 3.
List 2 and List 3 would give the "correct" result as long as from is less than INT_BIT and >= 0. I.e.:
./bs 1761 26 810
Setting bits from 26 count 810 in 1761 -- of 32 bits
Trying to set bits out of range, set bits from 26 to 836 in 32 sized range
x = ~0u = 1111 1111 1111 1111 1111 1111 1111 1111
Unsafe version:
x = x >> -778 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v1 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, branching:
x = x >> 0 = 1111 1111 1111 1111 1111 1111 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v2 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, modulo:
x = x >> 22 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v3 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
You could avoid the undefined behavior by splitting the shift operation in two steps, the first one by (count - 1) bits and the second one by 1 more bit. Special care is needed in case count is zero, however:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
if (count == 0) return;
mask |= ((1u << (count - 1) << 1) - 1) << first; // 1u keeps the shift unsigned when count is 32
}
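For the branch-free variant asked about in the edit to the question, one possible sketch is to do the shifts in a 64-bit type, where a shift by 32 is well defined. This is a different technique from the answers above, and it assumes a 64-bit integer type is available; the name mask_add_range and the return-by-value signature are hypothetical.

```c
#include <stdint.h>

/* Set count consecutive bits starting at bit first.
 * Assumes first + count <= 32, as in the question. */
uint32_t mask_add_range(uint32_t mask, unsigned int first, unsigned int count)
{
    /* Shifting a 64-bit 1 by up to 32 is defined, so count == 32 works. */
    uint64_t run = ((uint64_t)1 << count) - 1;   /* count low bits set */
    return mask | (uint32_t)(run << first);      /* move the run into place */
}
```

With count = 32 and first = 0 this sets all 32 bits, and with count = 0 it leaves the mask unchanged, without any if or ?: in the source.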

Resources