Set bits from the m-th to the n-th bit [closed] - C

I have a 32-bit number and, without using a for loop, I want to set the bits from position m to position n.
For example:
m may be the 2nd, 5th, 9th, or 10th bit.
n may be the 22nd, 27th, or 11th bit.
I assume m < n.
Please help me. Thanks.

Suppose bits are numbered from LSB to MSB:

BIT NUMBER   31                                    0
             ▼                                     ▼
number bits  0000 0000 0000 0000 0000 0000 0001 0101
             ▲    ^           ^                    ▲
            MSB   |           |                   LSB
                  |           |
                 n=27        m=17

LSB - Least Significant Bit (numbered 0)
MSB - Most Significant Bit (numbered 31)

In the figure above I have shown how bits are numbered from LSB to MSB. Notice the relative positions of n and m, where n > m.
To set (all one) bits from m to n
To set all bits from position m to n (where n > m) to 1 in a 32-bit number, you need a 32-bit mask in which all bits from m to n are 1 and the remaining bits are 0.
For example, to set all bits from m=17 to n=27 we need a mask like:

BIT NUMBER   31   n=27        m=17                 0
             ▼    ▼           ▼                    ▼
mask =       0000 1111 1111 1110 0000 0000 0000 0000
And if we have any 32-bit number, a bitwise OR (|) with this mask sets all bits from m to n to 1; all other bits will be unchanged.
Remember OR works like:
x | 1 = 1 , and
x | 0 = x
where x value can be either 1 or 0.
So by doing:
num32bit = num32bit | mask;
we can set bits m to n to 1 while the remaining bits stay unchanged. For example, suppose num32bit = 0011 1001 1000 0000 0111 1001 0010 1101.
Then:
0011 1001 1000 0000 0111 1001 0010 1101   <--- num32bit
0000 1111 1111 1110 0000 0000 0000 0000   <--- mask
---------------------------------------        bitwise OR operation
0011 1111 1111 1110 0111 1001 0010 1101   <--- new number
---- ▲           ▲ --------------------
     |-----------|   all bits are 1 here;
                     the dashed bits come from `num32bit`
This is what I mean by:
num32bit = num32bit | mask;
How to make the mask?
To make a mask in which all bits are 1 from n to m and others are 0, we need three steps:
Create mask_n: all bits on the right side from n=27 are one.

BIT NUMBER   31   n=27                             0
             ▼    ▼                                ▼
mask_27 =    0000 1111 1111 1111 1111 1111 1111 1111

In programming this can be created by right-shift (>>) of ~0 by 4.
And why 4?
4 = 32 - n - 1 ==> 31 - 27 ==> 4
Also note: the complement (~) of 0 has all bits set to one, and we need an unsigned right shift in C. Understand the difference between signed and unsigned right shift (see the sketch after step 3 below).
Create mask_m: all bits on the left side from m=17 are one.

BIT NUMBER   31               m=17                 0
             ▼                ▼                    ▼
mask_17 =    1111 1111 1111 1110 0000 0000 0000 0000

In programming this can be created by left-shifting (<<) ~0 by m = 17.
Create mask: bitwise AND of the above two: mask = mask_n & mask_m:

mask =       0000 1111 1111 1110 0000 0000 0000 0000
                  ▲           ▲
BIT NUMBER        27          17
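As promised in step 1, here is a minimal sketch of mine (assuming 32-bit int) showing why the unsigned right shift matters: right-shifting a signed negative value usually drags the sign bit along, which would ruin mask_n, while the unsigned shift fills with zeros.

#include <stdio.h>

int main(void){
    unsigned int u = ~0u;   /* all bits set */
    int s = ~0;             /* all bits set, value -1 */

    /* Unsigned right shift always fills the vacated bits with zeros. */
    printf("%08X\n", u >> 4);               /* prints 0FFFFFFF */

    /* Right-shifting a negative signed value is implementation-defined;
       on most compilers it is arithmetic and copies the sign bit. */
    printf("%08X\n", (unsigned)(s >> 4));   /* typically prints FFFFFFFF */
    return 0;
}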
And, below is my getMask(n, m) function that returns an unsigned number that looks like the mask in step 3.
#define BYTE 8
typedef char byte;      // Bit_sizeof(char) == BYTE

unsigned getMask(unsigned n, unsigned m){
    byte noOfBits = sizeof(unsigned) * BYTE;
    unsigned mask_n = ((unsigned)~0u) >> (noOfBits - n - 1),  // ones from bit n down to bit 0
             mask_m = (~0u) << m,                             // ones from bit m up to bit 31
             mask   = mask_n & mask_m;                        // bitwise & of the two sub-masks
    return mask;
}
To test my getMask() I have also written a main() function and a binary() function, which prints a given number in binary format.
#include <stdio.h>
#include <stdlib.h>

void binary(unsigned);

int main(){
    unsigned num32bit = 964720941u;
    unsigned mask = 0u;
    unsigned rsult32bit;
    int i = 51;

    mask = getMask(27, 17);
    rsult32bit = num32bit | mask;          // set bits 17..27 to 1

    printf("\nSize of int is = %zu bits, and "
           "Size of unsigned = %zu e.g.\n",
           sizeof(int) * BYTE, sizeof(unsigned) * BYTE);
    printf("dec= %-4u, bin= ", 21);
    binary(21);
    printf("\n\n%s %u\n\t ", "num32bit =", num32bit);
    binary(num32bit);
    printf("mask\t ");
    binary(mask);
    while(i--) printf("-");
    printf("\n\t ");
    binary(rsult32bit);
    printf("\n");
    return EXIT_SUCCESS;
}
void binary(unsigned dec){
    int i = 0,
        left = sizeof(unsigned) * BYTE - 1;
    for(i = 0; left >= 0; left--, i++){
        printf("%d", !!(dec & (1u << left)));   // 1u: avoid shifting into the sign bit of int
        if(!((i + 1) % 4)) printf(" ");
    }
    printf("\n");
}
This test code runs as follows (the output matches the example explained above):
Output of code:
-----------------
$ gcc b.c
:~$ ./a.out
Size of int is = 32 bits, and Size of unsigned = 32 e.g.
dec= 21 , bin= 0000 0000 0000 0000 0000 0000 0001 0101
num32bit = 964720941
0011 1001 1000 0000 0111 1001 0010 1101
mask 0000 1111 1111 1110 0000 0000 0000 0000
---------------------------------------------------
0011 1111 1111 1110 0111 1001 0010 1101
:~$
Additionally, you can write the getMask() function in a shorter form, in two statements, as follows:
unsigned getMask(unsigned n, unsigned m){
    byte noOfBits = sizeof(unsigned) * BYTE;
    return ((unsigned)~0u >> (noOfBits - n - 1)) &
           (~0u << m);
}
Note: I removed redundant parentheses to clean up the code. You never strictly need to memorize operator precedence, since you can always force the order with (), but a good programmer consults the precedence table to write neat code.
A better approach may be to write a macro as below:

#include <limits.h>
#define _NO_OF_BITS  (sizeof(unsigned) * CHAR_BIT)
#define MASK(n, m)   (((unsigned)~0u >> (_NO_OF_BITS - (n) - 1)) & \
                      (~0u << (m)))
And call it like:
result32bit = num32bit | MASK(27, 17);
To reset (all zero) bits from m to n
To reset all bits from m to n to 0 and leave the rest unchanged, you just need the complement (~) of the mask.

mask    0000 1111 1111 1110 0000 0000 0000 0000
~mask   1111 0000 0000 0001 1111 1111 1111 1111   <-- complement

Also, instead of the | operator, the & operator is required to set bits to zero.
Remember how AND works:
x & 0 = 0, and
x & 1 = x
where x can be 1 or 0.
Because we already have the bitwise complement operator ~ and the AND operator &, we just need to do:
rsult32bit = num32bit & ~MASK(27, 17);
And it will work like:
num32bit = 964720941
0011 1001 1000 0000 0111 1001 0010 1101
~mask    1111 0000 0000 0001 1111 1111 1111 1111
---------------------------------------------------
0011 0000 0000 0000 0111 1001 0010 1101
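Putting the two operations together, here is a small sketch of mine built on the MASK macro above; the helper names setBitsRange and clearBitsRange are my own, not part of the original answer:

#include <limits.h>

#define _NO_OF_BITS  (sizeof(unsigned) * CHAR_BIT)
#define MASK(n, m)   (((unsigned)~0u >> (_NO_OF_BITS - (n) - 1)) & \
                      (~0u << (m)))

/* Set bits m..n (inclusive, n > m) to 1, leaving the other bits unchanged. */
unsigned setBitsRange(unsigned value, unsigned n, unsigned m){
    return value | MASK(n, m);
}

/* Clear bits m..n (inclusive, n > m), leaving the other bits unchanged. */
unsigned clearBitsRange(unsigned value, unsigned n, unsigned m){
    return value & ~MASK(n, m);
}

For example, setBitsRange(964720941u, 27, 17) gives the OR result shown above, and clearBitsRange(964720941u, 27, 17) gives the AND result.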

Related

what does bit_test() function do?

I'm reading Programming in C, 4th edn by Stephen Kochan.
Exercise: Write a function called bit_test() that takes two arguments: an unsigned int and a bit number n. Have the function return 1 if bit number n is on inside the word, and 0 if it is off. Assume that bit number 0 references the leftmost bit inside the integer. Also write a corresponding function called bit_set() that takes two arguments: an unsigned int and a bit number n. Have the function return the result of turning bit n on inside the integer.
This is one of the exercise's answers on their forum.
12-5
-----
/* test bit n in word to see if it is on;
   assumes words are 32 bits long */
int bit_test (unsigned int word, int n)
{
    if ( n < 0 || n > 31 )
        return 0;

    if ( (word >> (31 - n)) & 0x1 )
        return 1;
    else
        return 0;
}

unsigned int bit_set (unsigned int word, int n)
{
    if ( n < 0 || n > 31 )
        return 0;

    return word | (1 << (31 - n));
}
I tried to understand it, and as per my understanding it always returns 0. What does this function actually do?
It just checks whether a bit is set or not.
It assumes that an unsigned int is stored in 32 bits on that particular system.
Why the check?
The check is needed to make it safe (we never shift by a negative value or by a value greater than 31): the first case would be an error, and the second would be useless anyway, so the function just returns 0.
What does (word >> (31 - n)) & 0x1 really do?
x x y x x x x x x
0 1 2 3 4 5 6 7 8
    |-----------|
        8-2=6
(Here I considered 9-bit words instead of 32. In your case it will be 31-3=28.)
So right-shift it 6 bits:
0 0 0 0 0 0 x x y
Now how to check if it is set or not?
  0 0 0 0 0 0 x x y
& 0 0 0 0 0 0 0 0 1
-------------------
  0 0 0 0 0 0 0 0 y
If that bit is set, the result will be 1, else 0.
What does bit_set do?
It returns the word with the nth bit set.
So if you input
0001 0010 1
and the bit to set is 0 (you want to set the bit at position 0), then you will get
1001 0010 1
return word | (1 << (31 - n));
let the word be 0001 1001 1
You want to set bit 2 [0 indexing]
  0001 1001 1
| 0010 0000 0
-------------
  0011 1001 1
You have to apply a bitwise OR operation with that value.
How do you get that value? Here we just want this number:
0010 0000 0
  |-------|
  6 shifts needed (left shift)
1 << (8-2) ---> is how you get it.
Or in your case 1<<(31-n)
Now I see where your thinking went wrong.
Suppose the number is
0000 0000 0000 0000 0000 0000 0000 1101
The bit in the 3rd position (0-indexed from the left) is this one:
000[0] 0000 0000 0000 0000 0000 0000 1101
This bit is unset, i.e. 0.
Try the 29th position of the same number and you will get 1 as the answer.
The problem statement has us identify the leftmost, or highest order bit as n = 0, and the rightmost, or lowest order bit as n = 31.
The bit_test() function shifts the test bit to the lowest order position and does a bitwise AND to find if the test bit was set. For example, to test if the bit n = 0 is set for the bit pattern:
1111 1111 1111 1111 1111 1111 1111 1111
there is a shift to the right (word >> 31 - 0):
0000 0000 0000 0000 0000 0000 0000 0001
then the bitwise AND with 0x1 evaluates to 1, indicating that the n = 0 bit was set.
The bit_set() function shifts a bit-pattern with only the lowest order bit set to the left so that only the bit indicated by n is set, and then combines this bit pattern with the input number using a bitwise OR to set the n bit. If the input number is 0, and n = 3, then the lowest order bit of the bit pattern for 1 (or 0x1):
0000 0000 0000 0000 0000 0000 0000 0001
is shifted to the left (1 << 31 - 3):
0001 0000 0000 0000 0000 0000 0000 0000
and combined with the bit-pattern for 0 using a bitwise OR:
0001 0000 0000 0000 0000 0000 0000 0000
The result is that the n bit of the input number is set to 1.
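To make the left-to-right numbering concrete, here is a small test driver of mine that links against the bit_test() and bit_set() functions above (the example values are my own; 32-bit unsigned int is assumed):

#include <stdio.h>

int bit_test(unsigned int word, int n);
unsigned int bit_set(unsigned int word, int n);

int main(void){
    /* 13 is 0000 ... 1101: counting from the LEFT (bit 0 = MSB),
       bit 28 is 1, bit 29 is 1, bit 30 is 0, bit 31 is 1. */
    printf("%d\n", bit_test(13u, 28));   /* prints 1 */
    printf("%d\n", bit_test(13u, 30));   /* prints 0 */

    /* Setting bit 0 (the leftmost bit) of 0 gives 0x80000000. */
    printf("%#x\n", bit_set(0u, 0));     /* prints 0x80000000 */
    return 0;
}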

Decide if X is at most half as long as Y, in binary, for unsigned ints in C

I have two unsigned ints X and Y, and I want to efficiently decide if X is at most half as long as Y, where the length of X is k+1, where 2^k is the largest power of 2 that is no larger than X.
i.e., X=0000 0101 has length 3, Y=0111 0000 is more than twice as long as X.
Obviously we can check by looking at individual bits in X and Y, for example by shifting right and counting in a loop, but is there an efficient, bit-twiddling (and loop-less) solution?
The (toy) motivation comes from the fact that I want to divide the range RAND_MAX either into range buckets or RAND_MAX/range buckets, plus some remainder, and I prefer to use the larger number of buckets. If range is (approximately) at most the square root of RAND_MAX (i.e., at most half as long), then I prefer using RAND_MAX/range buckets, and otherwise I want to use range buckets.
It should be noted, therefore, that X and Y might be large, where possibly Y=1111 1111, in the 8-bit example above. We certainly don't want to square X.
Edit, post-answer: The answer below mentions the built-in count leading zeros function (__builtin_clz()), and that is probably the fastest way to compute the answer. If for some reason this is unavailable, the lengths of X and Y can be obtained through some well-known bit twiddling.
First, smear the bits of X to the right (filling X with 1s except its leading 0s), and then do a population count. Both of these operations involve O(log k) operations, where k is the number of bits that X occupies in memory (my examples are for uint32_t, 32 bit unsigned integers). There are various implementations, but I put the ones that are easiest to understand below:
//smear
x = x | x>>1;
x = x | x>>2;
x = x | x>>4;
x = x | x>>8;
x = x | x>>16;
//population count
x = ( x & 0x55555555 ) + ( (x >> 1 ) & 0x55555555 );
x = ( x & 0x33333333 ) + ( (x >> 2 ) & 0x33333333 );
x = ( x & 0x0F0F0F0F ) + ( (x >> 4 ) & 0x0F0F0F0F );
x = ( x & 0x00FF00FF ) + ( (x >> 8 ) & 0x00FF00FF );
x = ( x & 0x0000FFFF ) + ( (x >> 16) & 0x0000FFFF );
The idea behind the population count is to divide and conquer. For example with
01 11, I first count the 1-bits in 01: there is 1 1-bit on the right, and
there are 0 1-bits on the left, so I record that as 01 (in place). Similarly,
11 becomes 10, so the updated bit-string is 01 10, and now I will add the
numbers in buckets of size 2, and replace the pair of them with the result;
1+2=3, so the bit string becomes 0011, and we are done. The original
bit-string is replaced with the population count.
There are faster ways to do the pop count given in Hacker's Delight, but this
one is easier to explain, and seems to be the basis for most of the others. You
can get my code as a
Gist here.
X=0000 0000 0111 1111 1000 1010 0010 0100
Set every bit that is 1 place to the right of a 1
0000 0000 0111 1111 1100 1111 0011 0110
Set every bit that is 2 places to the right of a 1
0000 0000 0111 1111 1111 1111 1111 1111
Set every bit that is 4 places to the right of a 1
0000 0000 0111 1111 1111 1111 1111 1111
Set every bit that is 8 places to the right of a 1
0000 0000 0111 1111 1111 1111 1111 1111
Set every bit that is 16 places to the right of a 1
0000 0000 0111 1111 1111 1111 1111 1111
Accumulate pop counts of bit buckets size 2
0000 0000 0110 1010 1010 1010 1010 1010
Accumulate pop counts of bit buckets size 4
0000 0000 0011 0100 0100 0100 0100 0100
Accumulate pop counts of bit buckets size 8
0000 0000 0000 0111 0000 1000 0000 1000
Accumulate pop counts of bit buckets size 16
0000 0000 0000 0111 0000 0000 0001 0000
Accumulate pop counts of bit buckets size 32
0000 0000 0000 0000 0000 0000 0001 0111
The length of 8358436 is 23 bits
Y=0000 0000 0000 0000 0011 0000 1010 1111
Set every bit that is 1 place to the right of a 1
0000 0000 0000 0000 0011 1000 1111 1111
Set every bit that is 2 places to the right of a 1
0000 0000 0000 0000 0011 1110 1111 1111
Set every bit that is 4 places to the right of a 1
0000 0000 0000 0000 0011 1111 1111 1111
Set every bit that is 8 places to the right of a 1
0000 0000 0000 0000 0011 1111 1111 1111
Set every bit that is 16 places to the right of a 1
0000 0000 0000 0000 0011 1111 1111 1111
Accumulate pop counts of bit buckets size 2
0000 0000 0000 0000 0010 1010 1010 1010
Accumulate pop counts of bit buckets size 4
0000 0000 0000 0000 0010 0100 0100 0100
Accumulate pop counts of bit buckets size 8
0000 0000 0000 0000 0000 0110 0000 1000
Accumulate pop counts of bit buckets size 16
0000 0000 0000 0000 0000 0000 0000 1110
Accumulate pop counts of bit buckets size 32
0000 0000 0000 0000 0000 0000 0000 1110
The length of 12463 is 14 bits
So now I know that 12463 is significantly larger than the square root of
8358436, without taking square roots, or casting to floats, or dividing or
multiplying.
See also Stack Overflow and Hacker's Delight (it's a book, of course, but I linked to some snippets on their website).
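For completeness, here is a sketch of mine that wraps the smear + popcount steps above into a bit_length() helper and uses it for the "at most half as long" test; the function names are my own and a 32-bit uint32_t is assumed:

#include <stdint.h>
#include <stdio.h>

/* Length of x in bits: position of the highest set bit plus one (0 for x == 0). */
static uint32_t bit_length(uint32_t x)
{
    /* Smear the highest set bit all the way to the right... */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    /* ...then count the 1-bits (divide-and-conquer population count). */
    x = (x & 0x55555555u) + ((x >> 1)  & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2)  & 0x33333333u);
    x = (x & 0x0F0F0F0Fu) + ((x >> 4)  & 0x0F0F0F0Fu);
    x = (x & 0x00FF00FFu) + ((x >> 8)  & 0x00FF00FFu);
    x = (x & 0x0000FFFFu) + ((x >> 16) & 0x0000FFFFu);
    return x;
}

/* Is X at most half as long as Y (in bits)? */
static int at_most_half_as_long(uint32_t X, uint32_t Y)
{
    return 2 * bit_length(X) <= bit_length(Y);
}

int main(void)
{
    printf("%u %u\n", bit_length(8358436u), bit_length(12463u));  /* 23 14 */
    printf("%d\n", at_most_half_as_long(0x05u, 0x70u));           /* 1: length 3 vs length 7 */
    return 0;
}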
If you are dealing with unsigned int and sizeof(unsigned long long) >= 2 * sizeof(unsigned int), you can just use the square method after casting:
(unsigned long long)X * (unsigned long long)X <= (unsigned long long)Y
If not, you can still use the square method if X is less than the square root of UINT_MAX+1, which you may need to hard code in the function.
Otherwise, you could use floating point calculation:
sqrt((double)Y) >= (double)X
On modern CPUs, this would be quite fast anyway.
If you are OK with gcc extensions, you can use __builtin_clz() to compute the length of X and Y:
int length_of_X = X ? sizeof(X) * CHAR_BIT - __builtin_clz(X) : 0;
int length_of_Y = Y ? sizeof(Y) * CHAR_BIT - __builtin_clz(Y) : 0;
return length_of_X * 2 <= length_of_Y;
__builtin_clz() compiles to a single instruction on modern Intel CPUs.
Here is a discussion on more portable ways to count leading zeroes you could use to implement your length function: Counting leading zeros in a 32 bit unsigned integer with best algorithm in C programming or this one: Implementation of __builtin_clz

Type conversion: signed int to unsigned long in C

I'm currently up to chapter 2 in The C Programming Language (K&R) and reading about bitwise operations.
This is the example that sparked my curiosity:
x = x & ~077
Assuming a 16-bit word length and 32-bit long type, what I think would happen is 077 would first be converted to:
0000 0000 0011 1111 (16 bit signed int).
This would then be complemented to:
1111 1111 1100 0000.
My question is what would happen next for the different possible types of x? If x is a signed int the answer is trivial. But, if x is a signed long I'm assuming ~077 would become:
1111 1111 1111 1111 1111 1111 1100 0000
following 2s complement to preserve the sign. Is this correct?
Also, if x is an unsigned long will ~077 become:
0000 0000 0000 0000 1111 1111 1100 0000
Or, will ~077 be converted to a signed long first:
1111 1111 1111 1111 1111 1111 1100 0000
...after which it is converted to an unsigned long (no change to bits)?
Any help would help me clarify whether or not this operation will always set only the last 6 bits to zero.
Whatever data-type you choose, ~077 will set the rightmost 6 bits to 0 and all others to 1.
Assuming 16-bit ints and 32-bit longs, there are 4 cases:
Case 1
unsigned int x = 077;    // x = 0000 0000 0011 1111
x = ~x;                  // x = 1111 1111 1100 0000
unsigned long y = x;     // y = 0000 0000 0000 0000 1111 1111 1100 0000
Case 2
unsigned int x = 077;    // x = 0000 0000 0011 1111
x = ~x;                  // x = 1111 1111 1100 0000
long y = x;              // y = 0000 0000 0000 0000 1111 1111 1100 0000
Case 3
int x = 077;             // x = 0000 0000 0011 1111
x = ~x;                  // x = 1111 1111 1100 0000
unsigned long y = x;     // y = 1111 1111 1111 1111 1111 1111 1100 0000
Case 4
int x = 077;             // x = 0000 0000 0011 1111
x = ~x;                  // x = 1111 1111 1100 0000
long y = x;              // y = 1111 1111 1111 1111 1111 1111 1100 0000
This means that sign extension is done when the source type is signed. When the source is unsigned, the sign bit is not extended and the upper bits are set to 0.
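To see the conversion rule directly, here is a minimal sketch of mine (assuming 32-bit int and 64-bit long, unlike the 16/32-bit widths in the question) that prints the bit patterns after the widening conversion:

#include <stdio.h>

int main(void){
    unsigned int ux = ~077u;   /* 0xFFFFFFC0 */
    int sx = ~077;             /* same bit pattern, value -64 */

    unsigned long y1 = ux;     /* zero-extended: the source is unsigned */
    unsigned long y2 = sx;     /* sign-extended first: the source is signed */

    printf("%lx\n", y1);       /* prints ffffffc0 */
    printf("%lx\n", y2);       /* prints ffffffffffffffc0 */
    return 0;
}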
x = x & ~077   // ~077 = 1111 1111 1111 1111 1111 1111 1100 0000 (not in every case)
~077 is a constant expression evaluated at compile time, so its value is converted to the type of x, and the AND operation will always set the last 6 bits of x to 0 while the remaining bits keep whatever value they had before the AND operation. For example:
// let x = 256472 --> binary --> 0000 0000 0000 0011 1110 1001 1101 1000
x = x & ~077;
// now x =                      0000 0000 0000 0011 1110 1001 1100 0000 --> decimal --> 256448
So the last 6 bits are changed to 0 irrespective of the data type, and the remaining bits stay the same. And in K&R it is written: "The portable form involves no extra cost, since ~077 is a constant expression that can be evaluated at compile time."

Sign-extend a number in C

I am having trouble trying to sign-extend a number extracted from part of a bit-string. The trouble is with negative numbers: the value wraps around to the positive side.
Here is my code:
// printf("add1 \n");
unsigned short r1 = (instruction>>6)&7;
signed short amount = (instruction& 31); //right here! i am trying to get the last 5 bits and store it in a register but i can't figure out how to make it negative if it is negative
// printf("\namount is %d \n", amount);
unsigned short dest = (instruction>>9)&7;
state->regs[dest] = state->regs[r1]+amount;
setCC(state,state->regs[r1]+amount);
For bit patterns, it's often easier to use hex constants instead of decimal.
signed short amount = (instruction & 0x1F);
Then to sign-extend the number, check the sign bit (assuming the sign bit here is the left-most of the 5 extracted bits). If it is set, take the 2's complement of the 5-bit value (invert and add one), then take the 2's complement of the full-width result (invert and add one).
if (amount & 0x10)
    amount = ~((amount ^ 0x1F) + 1) + 1;
Eg.
5-bit "bitfield":      X XXXX

0000 0000 0001 1111
0000 0000 0000 0000    invert: x ^ 0x1F (= 1 1111)
0000 0000 0000 0001    add 1
1111 1111 1111 1110    invert: ~
1111 1111 1111 1111    add 1

0000 0000 0001 0000
0000 0000 0000 1111    invert: x ^ 0x1F (= 1 1111)
0000 0000 0001 0000    add 1
1111 1111 1110 1111    invert: ~
1111 1111 1111 0000    add 1
Ooops. Even simpler:
-((x ^ 0x1F) + 1)      assuming the machine operates with 2's complement

0000 0000 0001 0110
0000 0000 0000 1001    invert
0000 0000 0000 1010    add 1 (yielding the full-width absolute value)
1111 1111 1111 0110    negate
use bitfields:
union {
    int a;
    struct {
        int a : 5;
        int b : 3;
        unsigned int c : 20;
    } b;
} u = { 0xdeadbeef };
int b = u.b.b; // should sign extend the 3-bit bitfield starting from bit 5
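The union-of-bit-fields idea can be turned into a small self-contained demo; this is only a sketch of mine, since bit-field layout and whether a plain int bit-field is signed are implementation-defined, so it is not portable:

#include <stdio.h>

/* Reinterpret the low 5 bits of 'raw' as a signed value via a bit-field. */
union extractor {
    unsigned int raw;
    struct {
        signed int amount : 5;   /* low 5 bits, sign-extended when read */
        unsigned int rest : 27;
    } f;
};

int main(void){
    union extractor u;
    u.raw = 0x16;                 /* low 5 bits: 1 0110 = -10 in 5-bit two's complement */
    printf("%d\n", u.f.amount);   /* typically prints -10 */
    return 0;
}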
Here is how you can sign extend a 5-bit two's complement value portably without tests:
int amount = (instruction & 31) - ((instruction & 16) << 1);
More generally, if the field width is n, non-zero and less than the number of bits in an int, you can write:
int amount = (instruction & ~(~0U << (n - 1) << 1)) -
             ((instruction & (1U << (n - 1))) << 1);
From Hacker's Delight 2-6. Assuming 5 bits of data that must be sign extended (sign bit has value 16).
Best case: If the upper bits are all zeros:
(i ^ 16) - 16
Next best case (as with OP's instruction): If the upper bits contain data that must be discarded:
(i & 15) - (i & 16)
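A quick exhaustive check of mine that the two Hacker's Delight forms agree with the straightforward 5-bit two's complement interpretation for every possible field value:

#include <stdio.h>

int main(void){
    unsigned i;
    for (i = 0; i < 32; i++) {
        int expected = (i & 16) ? (int)i - 32 : (int)i;     /* 5-bit two's complement value */
        int a = (int)(i ^ 16) - 16;                         /* (i ^ 16) - 16 */
        int b = (int)(i & 15) - (int)(i & 16);              /* (i & 15) - (i & 16) */
        if (a != expected || b != expected)
            printf("mismatch at %u\n", i);
    }
    printf("done\n");   /* prints only "done" when both forms are correct */
    return 0;
}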
You can check the sign-bit and fix-up the result accordingly:
int width_of_field = 5;
signed short amount = (instruction & 31);
if (amount & (1 << width_of_field >> 1))   // look at the sign bit
{
    amount -= 1 << width_of_field;         // fix the result
}
Alternatively, use a left-shift followed by a right shift:
int width_of_field = 5;
signed short amount = (instruction & 31);
// It is possible to omit the "& 31", because of the left shift below
amount <<= 16 - width_of_field;
amount >>= 16 - width_of_field;
Note: must use two statements to avoid effects of promotion to int (which presumably has 32 bits).

Unexpected C/C++ bitwise shift operators outcome

I think I'm going insane with this.
I have a piece of code that needs to create an (unsigned) integer with N consecutive bits set to 1. To be exact, I have a bitmask, and in some situations I'd like to set a solid range of bits in it.
I have the following function:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
    mask |= ((1 << count) - 1) << first;
}
In simple words: 1 << count in binary representation is 100...000 (number of zeroes is count), subtracting 1 from such a number gives 011...111, and then we just left-shift it by first.
The above should yield correct result, when the following obvious limitation is met:
first + count <= sizeof(UINT)*8 = 32
Note that it should also work correctly for "extreme" cases.
if count = 0 we have (1 << count) = 1, and hence ((1 << count) - 1) = 0.
if count = 32 we have (1 << count) = 0, since the leading bit overflows, and according to C/C++ rules bitwise shift operators are not cyclic. Then ((1 << count) - 1) = -1 (all bits set).
However, as turned out, for count = 32 the formula doesn't work as expected. As discovered:
UINT n = 32;
UINT x = 1 << n;
// the value of x is 1
Moreover, I'm using the MSVC2005 IDE. When I evaluate the above expression in the debugger, the result is 0. However, when I step over the above line, x gets the value 1. Looking at the disassembly we see the following:
mov eax,1
mov ecx,dword ptr [ebp-0Ch]   // ecx = n
shl eax,cl                    // eax <<= LOBYTE(ecx)
mov dword ptr [ebp-18h],eax   // x = eax
There's no magic indeed, the compiler just used the shl instruction. It seems that shl doesn't do what I expected it to do: either the CPU decides to ignore this instruction, or the shift is treated modulo 32, or I don't know what.
My questions are:
What is the correct behavior of shl/shr instructions?
Is there a CPU flag controlling the bitshift instructions?
Is this according to C/C++ standard?
Thanks in advance
Edit:
Thanks for answers. I've realized that (1) shl/shr indeed treat operand modulo 32 (or & 0x1F) and (2) C/C++ standard treats shift by more than 31 bits as undefined behavior.
Then I have one more question. How can I rewrite my "masking" expression to cover this extreme case too? It should be without branching (if, ?:). What would be the simplest expression?
1U << 32 is undefined behavior in C and in C++ when type unsigned int is 32-bit wide.
(C11, 6.5.7p3) "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined"
(C++11, 5.8p1) "The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand."
Shifting by as many or more bits than in the integer type you're shifting is undefined in C and C++. On x86 and x86_64, the shift amount of the shift instructions is indeed treated modulo 32 (or whatever the operand size is). You however cannot rely on this modulo behaviour to be generated by your compiler from C or C++ >>/<< operations unless your compiler explicitly guarantees it in its documentation.
I think the expression 1 << 32 is the same as 1 << 0. IA-32 Instruction Set Reference says that the count operand of shift instructions is masked to 5 bits.
The instruction set reference of IA-32 architectures can be found here.
To fix the "extreme" case, I can only come up with the following code (maybe buggy) that may be a little awkward:
void MaskAddRange(UINT *mask, UINT first, UINT count) {
    int count2 = (count & 0x20) >> 5;
    int count1 = count - count2;
    *mask |= (((1 << count1) << count2) - 1) << first;
}
The basic idea is to split the shift operation so that each shift count does not exceed 31.
Apparently, the above code assumes that the count is in a range of 0..32, so it is not very robust.
If I have understood the requirements, you want an unsigned int, with the top N bits set?
There are several ways to get the result (I think) you want.
Edit:
I am worried that this isn't very robust, and will fail for n > 32:
uint32_t set_top_n(uint32_t n)
{
    static uint32_t value[33] = { ~0xFFFFFFFF, ~0x7FFFFFFF, ~0x3FFFFFFF, ~0x1FFFFFFF,
                                  ~0x0FFFFFFF, ~0x07FFFFFF, ~0x03FFFFFF, ~0x01FFFFFF,
                                  ~0x00FFFFFF, ~0x007FFFFF, ~0x003FFFFF, ~0x001FFFFF,
                                  // you get the idea
                                  0xFFFFFFFF
                                };
    return value[n & 0x3f];
}
This should be quite fast as it is only 132 bytes of data.
To make it robust, I'd either extend the table for all values up to 63, or make it conditional, in which case it can be done with a version of your original bit-masking plus the 32 case.
My 32 cents:
#include <limits.h>
#define INT_BIT (CHAR_BIT * sizeof(int))

unsigned int set_bit_range(unsigned int n, int frm, int cnt)
{
    return n | ((~0u >> (INT_BIT - cnt)) << frm);
}
List 1.
A safe version with bogus / semi-circular result could be:
unsigned int set_bit_range(unsigned int n, int f, int c)
{
    return n | (~0u >> (c > INT_BIT ? 0 : INT_BIT - c)) << (f % INT_BIT);
}
List 2.
Doing this without branching, or local variables, could be something like;
return n | (~0u >> ((INT_BIT - c) % INT_BIT)) << (f % INT_BIT);
List 3.
List 2 and List 3 would give a "correct" result as long as from is less than INT_BIT and >= 0. I.e.:
./bs 1761 26 810
Setting bits from 26 count 810 in 1761 -- of 32 bits
Trying to set bits out of range, set bits from 26 to 836 in 32 sized range
x = ~0u = 1111 1111 1111 1111 1111 1111 1111 1111
Unsafe version:
x = x >> -778 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v1 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, branching:
x = x >> 0 = 1111 1111 1111 1111 1111 1111 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v2 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
Safe version, modulo:
x = x >> 22 = 0000 0000 0000 0000 0000 0011 1111 1111
x = x << 26 = 1111 1100 0000 0000 0000 0000 0000 0000
x v3 Result = 1111 1100 0000 0000 0000 0110 1110 0001
Original: 0000 0000 0000 0000 0000 0110 1110 0001
You could avoid the undefined behavior by splitting the shift operation in two steps, the first one by (count - 1) bits and the second one by 1 more bit. Special care is needed in case count is zero, however:
void MaskAddRange(UINT& mask, UINT first, UINT count)
{
    if (count == 0) return;
    mask |= ((1u << (count - 1) << 1) - 1) << first;   // 1u keeps the shifts in unsigned arithmetic
}
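A quick sanity check of mine for the split-shift version (I use a pointer parameter instead of the C++ reference so it compiles as plain C, and assume a 32-bit unsigned int for UINT):

#include <stdio.h>

typedef unsigned int UINT;

/* Split the shift into (count - 1) and 1 so no single shift count reaches 32. */
static void MaskAddRange(UINT *mask, UINT first, UINT count)
{
    if (count == 0) return;
    *mask |= ((1u << (count - 1) << 1) - 1u) << first;
}

int main(void)
{
    UINT m1 = 0, m2 = 0, m3 = 0;
    MaskAddRange(&m1, 4, 8);     /* bits 4..11        */
    MaskAddRange(&m2, 0, 32);    /* the whole word    */
    MaskAddRange(&m3, 7, 0);     /* count == 0: no-op */
    printf("%08X %08X %08X\n", m1, m2, m3);   /* 00000FF0 FFFFFFFF 00000000 */
    return 0;
}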
