If x is an unsigned int type is there a difference in these statements:
return (x & 7);
and
return (-x & 7);
I understand negating an unsigned value gives a value of UINT_MAX - value. But is there a difference in the return value (i.e. true/false) between the two statements under any specific boundary conditions, or are they functionally the same?
Test code:
#include <stdio.h>
static unsigned neg7(unsigned x) { return -x & 7; }
static unsigned pos7(unsigned x) { return +x & 7; }
int main(void)
{
    for (unsigned i = 0; i < 8; i++)
        printf("%u: pos %u; neg %u\n", i, pos7(i), neg7(i));
    return 0;
}
Test results:
0: pos 0; neg 0
1: pos 1; neg 7
2: pos 2; neg 6
3: pos 3; neg 5
4: pos 4; neg 4
5: pos 5; neg 3
6: pos 6; neg 2
7: pos 7; neg 1
For the specific case of 4 (and also 0), there isn't a difference; for other values, there is a difference. You can extend the range of the input, but the outputs will produce the same pattern.
If you ask specifically for true/false (i.e. is zero / not zero) and two's complement then there is indeed no difference. (You do however return not just a simple truth value but allow different bit patterns for true. As long as the caller does not distinguish, that is fine.)
Consider how a two's complement negation is formed: invert the bits, then increment. Since the mask covers a range of bits starting at the least significant one, no carry enters that range during the increment. This is essential: the argument only works for a contiguous mask of least significant bits.
Let's look at the two cases:
First, if the three low bits are zero (for a false equivalent). Inverting gives all ones, incrementing turns them to zero again. The fourth and more significant bits might be different, but they don't influence the least significant bits and they don't influence the result since they are masked out. So this stays.
Second, if the three low bits are not all zero (for a true equivalent). The only way this can change into false is when the increment operation leaves them at zero, which can only happen if they were all ones before, which in turn could only happen if they were all zeros before the inversion. That can't be, since that is the first case. Again, the more significant bits don't influence the three low bits and they are masked out. So the result does not change.
But again, this only works when the caller considers only the truth value (all bits zero / not all bits zero) and when the mask allows a range of bits starting from the least significant without a gap.
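If you want to convince yourself, a quick exhaustive check over the low byte, written as a minimal sketch along the lines of the test program above:

#include <stdio.h>

int main(void)
{
    /* For every x, (x & 7) and (-x & 7) must be zero or nonzero together. */
    for (unsigned x = 0; x < 256; x++) {
        if (!!(x & 7) != !!(-x & 7)) {
            printf("truth values differ at %u\n", x);
            return 1;
        }
    }
    puts("truth values agree for every x tested");
    return 0;
}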
Firstly, negating an unsigned int value produces UINT_MAX - original_value + 1, with the arithmetic done modulo UINT_MAX + 1 (so, for example, 0 remains 0 under negation). An alternative way to describe negation is inversion of all bits followed by an increment.
It is not clear why you'd even ask this, since the very first example that comes to mind, an unsigned int value of 1, already produces different results in your expression: 1u & 7 is 1, while -1u & 7 is 7. Did you mean something else, by any chance?
I am attempting exercise 2.1 of K&R. The exercise reads:
Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
Printing the values of constants in the standard headers is easy, just like this (only the integer ones shown as an example):
printf("Integral Ranges (from constants)\n");
printf("int max: %d\n", INT_MAX);
printf("int min: %d\n", INT_MIN);
printf("unsigned int max: %u\n", UINT_MAX);
However, I want to determine the limits programmatically.
I tried this code which seems like it should work but it actually goes into an infinite loop and gets stuck there:
printf("Integral Ranges (determined programmatically)\n");
int i_max = 0;
while ((i_max + 1) > i_max) {
    ++i_max;
}
printf("int max: %d\n", i_max);
Why is this getting stuck in a loop? It would seem that when an integer overflows it jumps from 2147483647 to -2147483648. The incremented value is obviously smaller than the previous value so the loop should end, but it doesn't.
Ok, I was about to write a comment but it got too long...
Are you allowed to use sizeof?
If so, then there is an easy way to find the max value for any type:
For example, I'll find the maximum value for an integer:
Definition: INT_MAX = (1 << 31) - 1 for a 32-bit integer (2^31 - 1).
The previous definition overflows if we use integers to compute INT_MAX, so it has to be adapted properly:
INT_MAX = (1 << 31) - 1
        = ((1 << 30) * 2) - 1
        = (((1 << 30) - 1) * 2 + 2) - 1
        = (((1 << 30) - 1) * 2) + 1
And using sizeof:
INT_MAX = (((1 << (sizeof(int)*8 - 2)) - 1) * 2) + 1
You can do the same for any signed/unsigned type by just reading the rules for each type.
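As a concrete, hedged sketch of that last formula, assuming 8-bit bytes; none of the intermediate values exceeds INT_MAX, so there is no signed overflow along the way:

#include <stdio.h>

int main(void)
{
    /* (((1 << 30) - 1) * 2) + 1 generalized via sizeof; no step overflows. */
    int int_max = (((1 << (sizeof(int) * 8 - 2)) - 1) * 2) + 1;
    printf("int max: %d\n", int_max);   /* 2147483647 for a 32-bit int */
    return 0;
}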
So it actually wasn't getting stuck in an infinite loop. C code is usually so fast that I assume it's broken if it doesn't complete immediately.
It did eventually return the correct answer after I let it run for about 10 seconds. Turns out that 2,147,483,647 increments take quite a few cycles to complete.
I should also note that I compiled with cc -O0 to disable optimizations, so this wasn't the problem.
A faster solution might look something like this:
int i_max = 0;
int step_size = 256;
while ((i_max + step_size) > i_max) {
    i_max += step_size;
}
while ((i_max + 1) > i_max) {
    ++i_max;
}
printf("int max: %d\n", i_max);
However, since signed overflow is undefined behavior, it is probably a terrible idea to try to determine this programmatically in practice. Better to just use INT_MAX.
The simplest I could come up with is:
signed int max_signed_int = ~(1U << ((sizeof(int) * 8) - 1)); /* shift done on an unsigned value to avoid UB; result fits in int */
signed int min_signed_int = (1U << ((sizeof(int) * 8) - 1));  /* out-of-range conversion back to int is implementation-defined */
unsigned int max_unsigned_int = ~0U;
unsigned int min_unsigned_int = 0U;
In my system:
// max_signed_int = 2147483647
// min_signed_int = -2147483648
// max_unsigned_int = 4294967295
// min_unsigned_int = 0
Assuming a two's complement processor, use unsigned math:
unsigned ... smax, smin;
smax = ((unsigned ...)0 - (unsigned ...)1) / (unsigned ...) 2;
smin = ~smax;
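Spelled out for plain int as a sketch: only unsigned arithmetic is used, so nothing overflows; converting the smin bit pattern back to int is implementation-defined, but is a plain bit copy on the two's complement systems assumed here.

#include <stdio.h>

int main(void)
{
    unsigned int smax, smin;
    smax = ((unsigned int)0 - (unsigned int)1) / (unsigned int)2;  /* bit pattern of INT_MAX */
    smin = ~smax;                                                  /* bit pattern of INT_MIN */
    /* The conversion of smin to int below is implementation-defined,
       but is a plain bit copy on two's complement systems. */
    printf("INT_MAX: %d\nINT_MIN: %d\n", (int)smax, (int)smin);
    return 0;
}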
As has been pointed out in other answers here, trying to overflow an integer in C is undefined behaviour, but, at least in this case, I think you can get a valid answer even out of the UB:
The point is that if you increment a value and compare the new value with the old one, you always get a greater value, except on an overflow (in that case you get a value that is less than or equal, since there are no greater values left; that is what an overflow means). So you can at least try:
int i_old = 0, i = 0;
while (++i > i_old)
i_old = i;
printf("MAX_INT guess: %d\n", i_old);
After this loop, the overflow will have happened and i_old will hold the last valid number. Of course, to go in the other direction and find the minimum, you'd use this snippet of code:
int i_old = 0, i = 0;
while (--i < i_old)
i_old = i;
printf("MIN_INT guess: %d\n", i_old);
Of course, UB can even mean that the program stops running (in that case, you'd have to add trace output to capture at least the last value printed).
By the way, back in the days of K&R, integers were usually 16 bits wide, a limit easily reached by counting up (much easier than now; try overflowing a 64-bit integer by counting up from 0).
I would use the properties of two's complement to compute the values.
unsigned int uint_max = ~0U;
signed int int_max = uint_max >> 1;
signed int int_min1 = (-int_max - 1);
signed int int_min2 = ~int_max;
2^3 is 1000. 2^3 - 1 is 0111. 2^4 - 1 is 1111.
w is the length in bits of your data type.
uint_max is 2^w - 1, or 111...111. This effect is achieved by using ~0U.
int_max is 2^(w-1) - 1, or 0111...111. This effect can be achieved by bitshifting uint_max 1 bit to the right. Since uint_max is an unsigned value, the >> operator performs a logical shift, meaning it shifts in leading zeroes instead of extending the sign bit.
int_min is -2^(w-1), or 100...000. In two's complement, the most significant bit has a negative weight!
This is how to visualize the first expression for computing int_min1:
...
011...111    int_max            +2^(w-1) - 1
100...000    (-int_max - 1)     -2^(w-1)        == (-2^(w-1) + 1) - 1
100...001    -int_max           -2^(w-1) + 1    == -(+2^(w-1) - 1)
...
Adding 1 would be moving down, and subtracting 1 would be moving up. First we negate int_max in order to generate a valid int value, then we subtract 1 to get int_min. We can't just negate (int_max + 1) because that would exceed int_max itself, the biggest int value.
In C, the subexpression (int_max + 1) cannot be represented as an int at all, so evaluating it overflows, which is undefined behavior; casting or promoting it to a wider or unsigned type would sacrifice either the original bit width or the signedness. We need to construct int_min programmatically in this roundabout way so that every intermediate result stays a valid int value.
If that's a bit (or byte) too complicated for you, you can just do ~int_max, observing that int_max is 011...111 and int_min is 100...000.
These techniques can be used for any bit width w of an integer data type: they work for char, short, int, long, and long long. Keep in mind that an integer constant like 0U has type unsigned int (typically 32 bits), so you may have to cast it to the data type with the appropriate bit width before bitwise NOTing it. Other than that, these techniques are based on the fundamental mathematical principles of two's complement integer representation; they won't work if your computer uses a different way of representing integers, for example ones' complement or sign-and-magnitude.
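Here is a hedged sketch of that pattern applied to two different widths (again assuming two's complement):

#include <stdio.h>

int main(void)
{
    unsigned int  uint_max  = ~0U;            /* 111...111 */
    int           int_max   = uint_max >> 1;  /* 0111...111 */
    int           int_min   = ~int_max;       /* 1000...000 */

    unsigned char uchar_max = (unsigned char)~0U;   /* cast picks the narrower width */
    signed char   schar_max = uchar_max >> 1;
    signed char   schar_min = ~schar_max;

    printf("int:         [%d, %d]\n", int_min, int_max);
    printf("signed char: [%d, %d]\n", schar_min, schar_max);
    return 0;
}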
The assignment says that "printing appropriate values from standard headers" is allowed, and in the real world, that is what you would do. As your prof wrote, direct computation is harder, and why make things harder for its own sake when you're working on another interesting problem and you just want the result? Look up the constants in <limits.h>, for example, INT_MIN and INT_MAX.
Since this is homework and you want to solve it yourself, here are some hints.
The language standard technically allows any of three different representations for signed numbers: two's-complement, one's-complement and sign-and-magnitude. Sure, every computer made in the last fifty years has used two's-complement (with the partial exception of legacy code for certain Unisys mainframes), but if you really want to language-lawyer, you could compute the smallest number for each of the three possible representations and find the minimum by comparing them.
Attempting to find the answer by overflowing or underflowing a signed value does not work! This is undefined behavior! You may, in theory if not in practice, increment an unsigned value of the same width, convert it to the corresponding signed type, and compare it to the result of converting the previous or next unsigned value. For a 32-bit long, this might just be tolerable; it will not scale to a machine where long is 64 bits wide.
You want to use the bitwise operators, particularly ~ and <<, to calculate the largest and smallest value for every type. Note: CHAR_BIT * sizeof(x) gives you the number of bits in x, and left-shifting 0x01UL by one fewer than that, then casting to the desired type, sets the highest bit.
For floating-point values, the only portable way is to use the constants in <float.h> (and <math.h> for HUGE_VAL and INFINITY); floating-point values might or might not be able to represent positive and negative infinity, and are not constrained to use any particular format. That said, if your compiler supports the optional Annex F of the C11 standard, which specifies IEC 60559 floating-point arithmetic, then dividing a nonzero floating-point number by zero is defined as producing infinity, which does allow you to "compute" infinity and negative infinity. If so, the implementation will #define __STDC_IEC_559__ as 1.
If you detect that infinity is not supported on your implementation, for instance by checking whether INFINITY and -INFINITY are infinities, you would want to use HUGE_VAL and -HUGE_VAL instead.
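A minimal sketch of the floating-point side, for reference: the range macros live in <float.h>, while HUGE_VAL and isinf() come from <math.h>.

#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    printf("float  range: [%g, %g]\n", -FLT_MAX, FLT_MAX);
    printf("double range: [%g, %g]\n", -DBL_MAX, DBL_MAX);
    /* HUGE_VAL is positive infinity where infinities are supported,
       otherwise just a very large finite double. */
    printf("double overflow value: %g (infinite: %d)\n", HUGE_VAL, isinf(HUGE_VAL));
    return 0;
}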
#include <stdio.h>

int main() {
    int n = 1;
    while (n > 0) {
        n = n << 1;   /* relies on signed overflow, which is undefined behavior */
    }
    int int_min = n;
    int int_max = -(n + 1);
    printf("int_min is: %d\n", int_min);
    printf("int_max is: %d\n", int_max);
    return 0;
}
unsigned long LMAX=(unsigned long)-1L;
long SLMAX=LMAX/2;
long SLMIN=-SLMAX-1;
If you don't have the L suffix, just use a variable or cast to signed before casting to unsigned.
For long long:
unsigned long long LLMAX=(unsigned long long)-1LL;
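A quick usage sketch putting those lines into a complete program (the SLMAX/SLMIN formulas assume two's complement):

#include <stdio.h>

int main(void)
{
    unsigned long LMAX = (unsigned long)-1L;
    long SLMAX = LMAX / 2;
    long SLMIN = -SLMAX - 1;
    unsigned long long LLMAX = (unsigned long long)-1LL;

    printf("ULONG_MAX:  %lu\n", LMAX);
    printf("LONG_MAX:   %ld\n", SLMAX);
    printf("LONG_MIN:   %ld\n", SLMIN);
    printf("ULLONG_MAX: %llu\n", LLMAX);
    return 0;
}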
I'm trying to find the position of two 1's in a 64 bit number. In this case the ones are at the 0th and 63rd position. The code here returns 0 and 32, which is only half right. Why does this not work?
#include<stdio.h>
void main()
{
unsigned long long number=576460752303423489;
int i;
for (i=0; i<64; i++)
{
if ((number & (1 << i))==1)
{
printf("%d ",i);
}
}
}
There are two bugs on the line
if ((number & (1 << i))==1)
which should read
if (number & (1ull << i))
Changing 1 to 1ull means that the left shift is done on a value of type unsigned long long rather than int, and therefore the bitmask can actually reach positions 32 through 63. Removing the comparison to 1 is because the result of number & mask (where mask has only one bit set) is either mask or 0, and mask is only equal to 1 when i is 0.
However, when I make that change, the output for me is 0 59, which still isn't what you expected. The remaining problem is that 576460752303423489 (decimal) = 0800 0000 0000 0001 (hexadecimal). 0 59 is the correct output for that number. The number you wanted is 9223372036854775809 (decimal) = 8000 0000 0000 0001 (hex).
Incidentally, main is required to return int, not void, and needs an explicit return 0; as its last action (unless you are doing something more sophisticated with the return code). Yes, C99 lets you omit that. Do it anyway.
Because (1 << i) is a 32-bit int value on the platform you are compiling and running on. This then gets sign-extended to 64 bits for the & operation with the number value, resulting in bit 31 being duplicated into bits 32 through 63.
Also, you are comparing the result of the & to 1, which isn't correct. It will not be 0 if the bit is set, but it won't be 1.
Shifting a 32-bit int by 32 is undefined.
Also, your input number is incorrect. The bits set are at positions 0 and 59 (or 1 and 60 if you prefer to count starting at 1).
The fix is to use (1ull << i), or otherwise to right-shift the original value and & it with 1 (instead of left-shifting 1). And of course if you do left-shift 1 and & it with the original value, the result won't be 1 (except for bit 0), so you need to compare != 0 rather than == 1.
#include <stdio.h>

int main()
{
    unsigned long long number = 576460752303423489;
    int i;
    for (i = 0; i < 64; i++)
    {
        if ((number & (1ULL << i))) // here
        {
            printf("%d ", i);
        }
    }
}
The first change is to use 1ULL so the constant has type unsigned long long. The second is in the if statement: you don't want to compare with 1, since that would only be true for the rightmost bit.
Output: 0 59
The output is correct because 576460752303423489 is equal to 0x0800000000000001, which has only bits 0 and 59 set.
The problem could have been avoided in the first place by applying the >> operator to the variable rather than left-shifting a literal 1, and testing the low bit:
if ((variable >> other_variable) & 1)
...
I know the question is old and already has multiple correct answers, and my answer should really be a comment, but it is a bit too long for one. I'd advise you to encapsulate the bit-checking logic in a macro, and not to hard-code the number 64 but to calculate it instead. Take a look here for a quite comprehensive source of bit manipulation hacks.
#include <stdio.h>
#include <limits.h>

#define CHECK_BIT(var,pos) ((var) & (1ULL<<(pos)))

int main(void)
{
    unsigned long long number = 576460752303423489;
    int pos = sizeof(unsigned long long) * CHAR_BIT;
    while (pos-- > 0) {         /* visits bit indices 63 down to 0 */
        if (CHECK_BIT(number, pos))
            printf("%d ", pos);
    }
    return 0;
}
Rather than resorting to bit manipulation, one can use compiler facilities to perform bit analysis tasks in the most efficient manner (using only a single CPU instruction in many cases).
For example, gcc and clang provide those handy routines:
__builtin_popcountll() - number of bits set in the 64-bit value
__builtin_clzll() - number of leading zeroes in the 64-bit value (undefined if the value is 0)
__builtin_ctzll() - number of trailing zeroes in the 64-bit value (undefined if the value is 0)
__builtin_ffsll() - one plus the index of the least significant set bit in the 64-bit value, or 0 if the value is 0
Other compilers have similar mechanisms.
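For instance, the bit positions from the question above could be printed with a GCC/Clang-specific sketch like this (note that __builtin_ctzll() is undefined for a zero argument, so the loop tests for zero first):

#include <stdio.h>

int main(void)
{
    unsigned long long number = 0x8000000000000001ULL;   /* bits 0 and 63 set */
    while (number != 0) {
        int pos = __builtin_ctzll(number);   /* index of the lowest set bit */
        printf("%d ", pos);
        number &= number - 1;                /* clear that lowest set bit */
    }
    printf("\n");
    return 0;
}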
I'm implementing a relative branching function in my simple VM.
Basically, I'm given an 8-bit relative value. I then shift this left by 1 bit to make it a 9-bit value. So, for instance, if you were to say "branch +127", this would really mean 127 instructions, and thus would add 254 to the IP.
My current code looks like this:
uint8_t argument = 0xFF; //-1 or whatever
int16_t difference = argument << 1;
*ip += difference; //ip is a uint16_t
I don't believe difference will ever be detected as less than 0 with this, however. I'm rusty on how signed-to-unsigned conversion works. Beyond that, I'm not sure the difference would be correctly subtracted from the IP in the case where argument is, say, -1 or -2.
Basically, I'm wanting something that would satisfy these "tests"
//case 1
argument = -5
difference -> -10
ip = 20 -> 10 //ip starts at 20, but becomes 10 after applying difference
//case 2
argument = 127 (must fit in a byte)
difference -> 254
ip = 20 -> 274
Hopefully that makes it a bit more clear.
Anyway, how would I do this cheaply? I saw one "solution" to a similar problem, but it involved division. I'm working with slow embedded processors (assumed to be without efficient ways to multiply and divide), so that's a pretty big thing I'd like to avoid.
To clarify: you worry that left-shifting a negative 8-bit number will make it appear like a positive 9-bit number? Just pad the upper bits with the sign bit of the initial number before the left shift:
diff = 0xFF;
int16_t diff16 = (diff + (diff & 0x80)*0x01FE) << 1;
Now your diff16 is the signed value 2*diff.
As was pointed out by Richard J Ross III, you can avoid the multiplication (if that's expensive on your platform) with a conditional branch:
int16_t diff16 = (diff + ((diff & 0x80) ? 0xFF00 : 0)) << 1;
If you are worried about things staying in range and such ("undefined behavior"), you can do
int16_t diff16 = diff;
diff16 = (diff16 | ((diff16 & 0x80) ? 0x7F00 : 0)) << 1;
At no point does this produce numbers that are going out of range.
The cleanest solution, though, seems to be "cast and shift":
diff16 = (signed char)diff; // recognizes and preserves the sign of diff
diff16 = (short int)(((unsigned short)diff16) << 1); // left shift done on the unsigned value, then converted back
This produces the expected result, because the compiler automatically takes care of the sign bit (so no need for the mask) in the first line; in the second line, the shift is done after converting to an unsigned type, for which wrap-around is well defined per the standard, and the final cast back to short int ensures that the number is again interpreted as negative. I believe that in this form the construct is never "undefined": the out-of-range conversions it relies on are implementation-defined rather than undefined, and behave as expected on two's complement implementations.
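Putting the cast-and-shift idea into the VM update, a hedged sketch (the branch() helper name is mine): it assumes the uint8_t to int8_t conversion is a plain bit copy, which is implementation-defined but what two's complement implementations do, and it doubles with * 2 rather than a shift so that no negative value is ever shifted.

#include <stdint.h>
#include <stdio.h>

/* Sketch of the branch: interpret the 8-bit argument as signed,
   double it, and add it to the instruction pointer (wrapping on uint16_t). */
static void branch(uint16_t *ip, uint8_t argument)
{
    int16_t difference = (int16_t)((int8_t)argument * 2);
    *ip = (uint16_t)(*ip + difference);
}

int main(void)
{
    uint16_t ip = 20;
    branch(&ip, (uint8_t)-5);              /* case 1: expect 10 */
    printf("%u\n", (unsigned)ip);

    ip = 20;
    branch(&ip, 127);                      /* case 2: expect 274 */
    printf("%u\n", (unsigned)ip);
    return 0;
}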
All of my quotes come from the C standard, section 6.3.1.3. Unsigned to signed is well defined when the value is within range of the signed type:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
Signed to unsigned is well defined:
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Unsigned to signed, when the value lies out of range isn't too well defined:
3 Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Unfortunately, your question lies in the realm of point 3. C doesn't guarantee any implicit mechanism to convert out-of-range values, so you'll need to explicitly provide one. The first step is to decide which representation you intend to use: Ones' complement, two's complement or sign and magnitude
The representation you use will affect the translation algorithm you use. In the example below, I'll use two's complement: If the sign bit is 1 and the value bits are all 0, this corresponds to your lowest value. Your lowest value is another choice you must make: In the case of two's complement, it'd make sense to use either of INT16_MIN (-32768) or INT8_MIN (-128). In the case of the other two, it'd make sense to use INT16_MIN - 1 or INT8_MIN - 1 due to the presence of negative zeros, which should probably be translated to be indistinguishable from regular zeros. In this example, I'll use INT8_MIN, since it makes sense that (uint8_t) -1 should translate to -1 as an int16_t.
Separate the sign bit from the value bits. The value should be the absolute value, except in the case of a two's complement minimum value when sign will be 1 and the value will be 0. Of course, the sign bit can be where-ever you like it to be, though it's conventional for it to rest at the far left hand side. Hence, shifting right 7 places obtains the conventional "sign" bit:
uint8_t sign = input >> 7;
uint8_t value = input & (UINT8_MAX >> 1);
int16_t result;
If the sign bit is 1, we'll call this a negative number and add to INT8_MIN to construct the sign so we don't end up in the same conundrum we started with, or worse: undefined behaviour (which is the fate of one of the other answers).
if (sign == 1) {
    result = INT8_MIN + value;
}
else {
    result = value;
}
This can be shortened to:
int16_t result = (input >> 7) ? INT8_MIN + (input & (UINT8_MAX >> 1)) : input;
... or, better yet:
int16_t result = input <= INT8_MAX ? input
: INT8_MIN + (int8_t)(input % (uint8_t) INT8_MIN);
The sign test now involves checking if it's in the positive range. If it is, the value remains unchanged. Otherwise, we use addition and modulo to produce the correct negative value. This is fairly consistent with the C standard's language above. It works well for two's complement, because int16_t and int8_t are guaranteed to use a two's complement representation internally. However, types like int aren't required to use a two's complement representation internally. When converting unsigned int to int, for example, there needs to be another check, so that we're treating values less than or equal to INT_MAX as positive, and values greater than or equal to (unsigned int) INT_MIN as negative. Any other values need to be handled as errors; in this case I treat them as zeros.
/* Generate some random input */
srand(time(NULL));
unsigned int input = rand();
for (unsigned int x = UINT_MAX / ((unsigned int) RAND_MAX + 1); x > 1; x--) {
    input *= (unsigned int) RAND_MAX + 1;
    input += rand();
}
int result = /* Handle positives: */ input <= INT_MAX ? input
: /* Handle negatives: */ input >= (unsigned int) INT_MIN ? INT_MIN + (int)(input % (unsigned int) INT_MIN)
: /* Handle errors: */ 0;
If the offset is in the 2's complement representation, then
convert this
uint8_t argument = 0xFF; //-1
int16_t difference = argument << 1;
*ip += difference;
into this:
uint8_t argument = 0xFF; //-1
int8_t signed_argument;
signed_argument = argument; // this relies on implementation-defined
// conversion of unsigned to signed, usually it's
// just a bit-wise copy on 2's complement systems
// OR
// memcpy(&signed_argument, &argument, sizeof argument);
*ip += signed_argument + signed_argument;
My code is below, and it works for most inputs, but I've noticed that for very large numbers (2147483647 divided by 2, for a specific example), I get a segmentation fault and the program stops working. Note that the badd() and bsub() functions simply add or subtract integers respectively.
unsigned int bdiv(unsigned int dividend, unsigned int divisor){
int quotient = 1;
if (divisor == dividend)
{
return 1;
}
else if (dividend < divisor)
{ return -1; }// this represents dividing by zero
quotient = badd(quotient, bdiv(bsub(dividend, divisor), divisor));
return quotient;
}
I'm also having a bit of trouble with my bmult() function. It works for some values, but the program fails for values such as -8192 times 3. This function is also listed. Thanks in advance for any help. I really appreciate it!
int bmult(int x,int y){
int total=0;
/*for (i = 31; i >= 0; i--)
{
total = total << 1;
if(y&1 ==1)
total = badd(total,x);
}
return total;*/
while (x != 0)
{
if ((x&1) != 0)
{
total = badd(total, y);
}
y <<= 1;
x >>= 1;
}
return total;
}
The problem with your bdiv most likely results from recursion depth. In the example you gave, you will be putting about 1073741824 frames onto the stack, using up your allotted stack space.
In fact, there is no real reason this function needs to be recursive. It could quite easily be converted to an iterative solution, alleviating the stack issue.
In the multiplication, this line is going to overflow and truncate y, and so badd() will be getting wrong inputs:
y<<=1;
This line:
x>>=1;
is not going to work well for negative x. Most compilers will do a so-called arithmetic shift here, which is like a regular shift except that, instead of a 0 being shifted into the most significant bit, the most significant (sign) bit keeps its value. So, shifting any negative value right repeatedly will eventually give you -1, and -1 shifted right remains -1, resulting in an infinite loop in your multiplication.
You should not be using the algorithm for multiplication of unsigned integers to multiply signed integers. It's unlikely to work well (if at all) if it uses signed types in its core.
If you want to multiply signed integers, you can first implement multiplication for unsigned ones, using unsigned types. And then you can actually use it for signed multiplication. This will work on virtually all systems because they use 2's complement representation of signed integers.
Examples (assuming 16-bit 2's complement integers):
-1 * +1 -> 0xFFFF * 1 = 0xFFFF -> convert back to signed -> -1
-1 * -1 -> 0xFFFF * 0xFFFF = 0xFFFE0001 -> truncate to 16 bits & convert to signed -> 1
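A minimal sketch of that approach, with ordinary operators standing in for badd() and the helper names (umult, smult) being mine; it assumes two's complement and that the final unsigned-to-int conversion is a bit copy, which is implementation-defined but universal in practice:

#include <stdio.h>

/* Shift-and-add multiply done entirely in unsigned arithmetic, then
   reinterpreted as signed; unsigned wrap-around is well defined, so there
   is no UB even when the operands are negative. */
static unsigned umult(unsigned x, unsigned y)
{
    unsigned total = 0;
    while (x != 0) {
        if (x & 1u)
            total += y;   /* badd(total, y) in your code */
        y <<= 1;          /* wraps instead of overflowing */
        x >>= 1;          /* logical shift, always terminates */
    }
    return total;
}

static int smult(int x, int y)
{
    return (int)umult((unsigned)x, (unsigned)y);   /* implementation-defined conversion, bit copy in practice */
}

int main(void)
{
    printf("%d\n", smult(-8192, 3));   /* prints -24576 */
    return 0;
}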
In the division the following two lines
else if (dividend < divisor)
{ return -1; }// this represents dividing by zero
Are plain wrong. Think, how much is 1/2? It's 0, not -1 or (unsigned int)-1.
Further, how much is UINT_MAX/1? It's UINT_MAX. So, when your division function returns UINT_MAX or (unsigned int)-1 you won't be able to tell the difference, because the two values are the same. You really should use a different mechanism to report the error to the caller.
Oh, and of course, this line:
quotient = badd(quotient, bdiv(bsub(dividend, divisor), divisor));
is going to cause a stack overflow when the quotient is expected to be big. Don't do this recursively. At the very least, use a loop instead.
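A minimal iterative sketch (the bdiv_iterative name is mine), again with plain + and - standing in for badd() and bsub(); it assumes the caller guarantees divisor != 0, and it returns 0 rather than an error value for cases like 1/2:

/* Repeated-subtraction division, iterative, so the stack depth stays constant.
   Still O(quotient) steps like the recursive version, but no risk of blowing the stack. */
unsigned int bdiv_iterative(unsigned int dividend, unsigned int divisor)
{
    unsigned int quotient = 0;
    while (dividend >= divisor) {
        dividend -= divisor;   /* bsub(dividend, divisor) */
        quotient += 1;         /* badd(quotient, 1) */
    }
    return quotient;           /* e.g. bdiv_iterative(1, 2) correctly yields 0 */
}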
Can someone please explain this function to me?
A mask with the least significant n bits set to 1.
Ex:
n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F
Also, what is a mask?
The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
A mask is normally used with bitwise operations, especially and. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware that will often have a single hardware register containing bits representing a number of entirely separate, unrelated quantities and/or flags.
A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.
For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.
Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.
Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.
So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.
Positive powers of 2 are easily generated with the left shift << operator in C.
Hence, 1 << n yields 2^n. In binary it's 10...0 with n 0s.
(1 << n) - 1 will produce a mask with n lowest bits set to 1.
Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.
Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.
This method is correct, whereas the current top answers produce undefined behavior in edge cases.
Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions:
-mbmi -mbmi2
#include <inttypes.h>
#include <stdio.h>
uint64_t gen_mask(const uint_fast8_t msb) {
const uint64_t src = (uint64_t)1 << msb;
return (src - 1) ^ src;
}
int main() {
uint_fast8_t msb;
for (msb = 0; msb < 64; ++msb) {
printf("%016" PRIx64 "\n", gen_mask(msb));
}
return 0;
}
First, for those who only want the code to create the mask:
uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (i.e. 0x3F)
Thanks to @Benni, who asked about using bits = 64. If you need the code to support this value as well, you can use:
uint64_t bits = 6;
uint64_t mask = (bits < 64)
    ? ((uint64_t)1 << bits) - 1
    : (uint64_t)0 - 1;
For those who want to know what a mask is:
A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.
Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.
Longer masks are usually represented in hexadecimal, which is really easy to read once you get the hang of it.
You can read more about bitwise operations in C here.
I believe your first example should be 0x3f.
0x3f is hexadecimal notation for the number 63, which is 111111 in binary, so the last 6 bits (the least significant 6 bits) are set to 1.
The following little C program will calculate the correct mask:
#include <stdarg.h>
#include <stdio.h>
int mask_for_n_bits(int n)
{
    int mask = 0;
    for (int i = 0; i < n; ++i)
        mask |= 1 << i;
    return mask;
}

int main (int argc, char const *argv[])
{
    printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
    return 0;
}
0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.
Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.
For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.
In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
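To tie the three cases together, a small illustrative snippet (the value and masks are arbitrary examples):

#include <stdio.h>

int main(void)
{
    unsigned v = 0x43;                       /* arbitrary example value         */
    printf("extract: 0x%X\n", v & 0x3F);     /* & mask keeps the 6 low bits     */
    printf("set:     0x%X\n", v | 0x101);    /* | mask forces bits 0 and 8 to 1 */
    printf("flip:    0x%X\n", v ^ 0x101);    /* ^ mask toggles bits 0 and 8     */
    return 0;
}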