Most efficient way to set n consecutive bits to 1? - c

I want to get a function that will set the n last bits of a numerical type to 1. For example:
bitmask (5) = 0b11111 = 31
bitmask (0) = 0
I, first, had this implementation (mask_t is just a typedef around uint64_t):
mask_t bitmask (unsigned short n) {
return (((mask_t) 1) << n) - 1;
}
Everything is fine except when the function hits bitmask (64) (the size of mask_t): then I get bitmask (64) = 0 instead of 64 bits set to 1.
So, I have two questions:
Why do I have this behavior ? Pushing the 1 by 64 shifts on the left should clear the register and remain with 0, then applying the -1 should fill the register with 1s...
What is the proper way to achieve this function ?

Yes this is a well known problem. There are easy ways to implement this function over the range 0..63 and over the range 1..64 (one way has been mentioned in the comments), but 0..64 is more difficult.
Of course you can just take either the "left shifting" or "right shifting" mask generation and then special-case the "missing" n,
uint64_t bitmask (unsigned short n) {
if (n == 64) return -((uint64_t)1);
return (((uint64_t) 1) << n) - 1;
}
Or
uint64_t bitmask (unsigned short n) {
if (n == 0) return 0;
uint64_t full = ~(uint64_t)0;
return full >> (64 - n);
}
Either way tends to compile to a branch, though it technically doesn't have to.
You can do it without if (not tested)
uint64_t bitmask (unsigned int n) {
uint64_t x = (n ^ 64) >> 6;
return (x << (n & 63)) - 1;
}
The idea here is that we're going to either shift 1 left by some amount the same as in your original code, or 0 in the case that n = 64. Shifting 0 left by 0 is just going to be 0 again, subtracting 1 sets all 64 bits.
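Since the branch-free version above is marked as untested, here is a minimal sketch that compares it against a mask built one bit at a time for every n from 0 to 64. The names bitmask_branchless, bitmask_reference and bitmask_versions_agree are made up for this illustration.

```c
#include <stdint.h>

/* Branch-free mask of the n lowest bits, valid for 0 <= n <= 64. */
static uint64_t bitmask_branchless(unsigned int n) {
    uint64_t x = (n ^ 64u) >> 6;      /* 1 for n in 0..63, 0 for n == 64 */
    return (x << (n & 63u)) - 1;      /* n == 64: 0 - 1 wraps to all ones */
}

/* Reference built one bit at a time; never shifts by 64. */
static uint64_t bitmask_reference(unsigned int n) {
    uint64_t m = 0;
    for (unsigned int i = 0; i < n; i++)
        m |= (uint64_t)1 << i;
    return m;
}

/* Returns 1 if both versions agree for every n in 0..64 (hypothetical helper). */
static int bitmask_versions_agree(void) {
    for (unsigned int n = 0; n <= 64; n++)
        if (bitmask_branchless(n) != bitmask_reference(n))
            return 0;
    return 1;
}
```

The check only covers 0..64; for larger n the branch-free expression is not meant to be used.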
Alternatively if you're on a modern x64 platform and BZHI is available, a very fast (BZHI is fast on all CPUs that implement it) but limited-portability option is:
uint64_t bitmask (unsigned int n) {
return _bzhi_u64(~(uint64_t)0, n);
}
This is even well-defined for n > 64, the actual count of 1's will be min(n & 0xFF, 64) because BZHI saturates but it reads only the lowest byte of the index.
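To make the saturation rule concrete, here is a portable sketch that models the behavior described above (only the lowest byte of the index is read, and an index of 64 or more leaves the source untouched). The name bzhi_u64_model is hypothetical, and this is a model for reasoning, not a replacement for the intrinsic.

```c
#include <stdint.h>

/* Portable model of _bzhi_u64(src, n) as described above: only the low
   8 bits of n are read, and an index of 64 or more leaves src unchanged. */
static uint64_t bzhi_u64_model(uint64_t src, unsigned int n) {
    unsigned int idx = n & 0xFFu;
    if (idx >= 64)
        return src;
    return src & (((uint64_t)1 << idx) - 1);
}
```

With an all-ones source, bzhi_u64_model(~(uint64_t)0, 200) keeps all 64 bits, while an index of 256 behaves like 0 because only the low byte is looked at.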

You cannot left shift by a value larger than or equal to the bit width of the type in question. Doing so invokes undefined behavior.
From section 6.5.7 of the C standard:
2 The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If the value
of the right operand is negative or is greater than or equal to the
width of the promoted left operand, the behavior is undefined.
You'll need to add a check for this in your code:
mask_t bitmask (unsigned short n) {
if (n >= 64) {
return ~(mask_t)0;
} else {
return (((mask_t) 1) << n) - 1;
}
}

Finally, just for your information, I ended up by writing:
mask_t bitmask (unsigned short n) {
return (n < (sizeof (mask_t) * CHAR_BIT)) ? (((mask_t) 1) << n) - 1 : -1;
}
But, the answer of harold is so complete and well explained that I will select it as the answer.

Related

Iterate bits from left to right for any number

I am trying to implement Modular Exponentiation (square and multiply left to right) algorithm in c.
In order to iterate the bits from left to right, I can use masking which is explained in this link
In this example mask used is 0x80 which can work only for a number with max 8 bits.
In order to make it work for any number of bits, I need to assign mask dynamically but this makes it a bit complicated.
Is there any other way to do this?
Thanks in advance!
-------------EDIT-----------------------
long long base = 23;
long long exponent = 297;
long long mod = 327;
long long result = 1;
unsigned int mask;
for (mask = 0x80; mask != 0; mask >>= 1) {
result = (result * result) % mod; // Square
if (exponent & mask) {
result = (base * result) % mod; // Mul
}
}
As in this example, it will not work if I use mask 0x80, but with 0x100 it works fine.
Selecting the mask value at run time seems to be an overhead.
If you want to iterate over all bits, you first have to know how many bits there are in your type.
This is a surprisingly complicated matter:
sizeof gives you the number of bytes, but a byte can have more than 8 bits.
limits.h gives you CHAR_BIT to know the number of bits in a byte, but even if you multiply this by the sizeof your type, the result could still be wrong because unsigned types are allowed to contain padding bits that are not part of the number representation, while sizeof returns the storage size in bytes, which includes these padding bits.
Fortunately, this answer has an ingenious macro that can calculate the number of actual value bits based on the maximum value of the respective type:
#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
+ (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
The maximum value of an unsigned type is surprisingly easy to get: just cast -1 to your unsigned type.
So, all in all, your code could look like this, including the macro above:
#define UNSIGNED_BITS IMAX_BITS((unsigned)-1)
// [...]
unsigned int mask;
for (mask = 1u << (UNSIGNED_BITS-1); mask != 0; mask >>= 1) {
// [...]
}
Note that applying this complicated macro has no runtime drawback at all, it's a compile-time constant.
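As a quick sanity check of the macro, the following sketch evaluates it for a few known maxima and uses it to build the top-bit mask. The helpers top_bit_mask and count_mask_steps are hypothetical names for this illustration; the unsigned literal 1u is used so the shift into the highest bit stays in unsigned arithmetic.

```c
#include <limits.h>

#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
                  + (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))

/* Number of value bits in unsigned int, as a compile-time constant. */
#define UNSIGNED_BITS IMAX_BITS(UINT_MAX)

/* Highest-order bit of unsigned int; 1u keeps the shift in unsigned arithmetic. */
static unsigned int top_bit_mask(void) {
    return 1u << (UNSIGNED_BITS - 1);
}

/* Counts loop iterations when walking the mask from the top bit down to bit 0. */
static int count_mask_steps(void) {
    int steps = 0;
    for (unsigned int mask = top_bit_mask(); mask != 0; mask >>= 1)
        steps++;
    return steps;
}
```

The loop visits exactly UNSIGNED_BITS positions, which is what a left-to-right scan over an unsigned int needs.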
Your algorithm seems unnecessarily complicated: bits from the exponent can be tested from the least significant to the most significant in a way that does not depend on the integer type nor its maximum value. Here is a simple implementation that does not need any special case for any size integers:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
unsigned long long base = (argc > 1) ? strtoull(argv[1], NULL, 0) : 23;
unsigned long long exponent = (argc > 2) ? strtoull(argv[2], NULL, 0) : 297;
unsigned long long mod = (argc > 3) ? strtoull(argv[3], NULL, 0) : 327;
unsigned long long y = exponent;
unsigned long long x = base;
unsigned long long result = 1;
for (;;) {
if (y & 1) {
result = result * x % mod;
}
if ((y >>= 1) == 0)
break;
x = x * x % mod;
}
printf("expmod(%llu, %llu, %llu) = %llu\n", base, exponent, mod, result);
return 0;
}
Without any command line arguments, it produces: expmod(23, 297, 327) = 185. You can try other numbers by passing the base, exponent and modulo as command line arguments.
EDIT:
If you must scan the bits in exponent from most significant to least significant, mask should be defined as the same type as exponent and initialized this way if the type is unsigned:
unsigned long long exponent = 297;
unsigned long long mask = 0;
mask = ~mask - (~mask >> 1);
If the type is signed, for complete portability, you must use the definition for its maximum value from <limits.h>. Note however that it would be more efficient to use the unsigned type.
long long exponent = 297;
long long mask = LLONG_MAX - (LLONG_MAX >> 1);
The loop will waste time running through all the most significant 0 bits, so a simpler loop could be used first to skip these bits:
while (mask > exponent) {
mask >>= 1;
}
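Putting the pieces of this edit together, a left-to-right square-and-multiply could look like the sketch below. The function name expmod_ltr is made up here, and the code assumes mod is non-zero and small enough that the intermediate products fit in unsigned long long.

```c
/* Left-to-right square-and-multiply: computes base^exponent mod mod.
   Assumes mod != 0 and that (mod - 1) * (mod - 1) fits in
   unsigned long long. */
static unsigned long long expmod_ltr(unsigned long long base,
                                     unsigned long long exponent,
                                     unsigned long long mod) {
    unsigned long long mask = 0;
    mask = ~mask - (~mask >> 1);       /* only the most significant bit set */
    while (mask > exponent)            /* skip leading zero bits */
        mask >>= 1;

    unsigned long long result = 1 % mod;
    base %= mod;
    for (; mask != 0; mask >>= 1) {
        result = result * result % mod;        /* square */
        if (exponent & mask)
            result = result * base % mod;      /* multiply */
    }
    return result;
}
```

For the numbers used in the question, expmod_ltr(23, 297, 327) gives 185, matching the right-to-left program above. An exponent of 0 leaves the mask at 0, so the loop body never runs and the result is 1 % mod.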

Moving a "nibble" to the left using C

I've been working on this puzzle for a while. I'm trying to figure out how to rotate a number (x) to the left by n nibbles (4 bits each), with wrapping, where 0 <= n <= 31. The code will look like:
moveNib(int x, int n){
//... some code here
}
The trick is that I can only use these operators:
~ & ^ | + << >>
and at most 25 of them in total. I also cannot use if statements, loops, or function calls, and I may only use type int.
An example would be moveNib(0x87654321,1) = 0x76543218.
My attempt: I have figured out how to use a mask to store the bits and all, but I can't figure out how to move by an arbitrary number. Any help would be appreciated, thank you!
How about:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((8-n)<<2); }
It uses <<2 to convert from nibbles to bits, and then shifts the bits by that much. To handle wraparound, we OR with a copy of the number which has been shifted by the opposite amount in the opposite direction. For example, with x=0x87654321 and n=1, the left part is shifted 4 bits to the left and becomes 0x76543210, and the right part is shifted 28 bits to the right and becomes 0x00000008, and when ORed together, the result is 0x76543218, as requested.
Edit: If - really isn't allowed, then this will get the same result (assuming an architecture with two's complement integers) without using it:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((9+~n)<<2); }
Edit2: OK. Since you aren't allowed to use anything but int, how about this, then?
int moveNib(int x, int n) { return (x&0xffffffff)<<(n<<2) | (x&0xffffffff)>>((9+~n)<<2); }
The logic is the same as before, but we force the calculation to use unsigned integers by ANDing with 0xffffffff. All this assumes 32 bit integers, though. Is there anything else I have missed now?
Edit3: Here's one more version, which should be a bit more portable:
int moveNib(int x, int n) { return ((x|0u)<<((n&7)<<2) | (x|0u)>>((9+~(n&7))<<2))&0xffffffff; }
It caps n as suggested by chux, and uses |0u to convert to unsigned in order to avoid the sign bit duplication you get with signed integers. This works because (from the standard):
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Since int and 0u have the same rank, but 0u is unsigned, then the result is unsigned, even though ORing with 0 otherwise would be a null operation.
It then truncates the result to the range of a 32-bit int so that the function will still work if ints have more bits than this (though the rotation will still be performed on the lowest 32 bits in that case). A 64-bit version would replace 7 by 15, 9 by 17 and truncate using 0xffffffffffffffff.
This solution uses 12 operators (11 if you skip the truncation, 10 if you store n&7 in a variable).
To see what happens in detail here, let's go through it for the example you gave: x=0x87654321, n=1. x|0u results in the unsigned number 0x87654321u. (n&7)<<2=4, so we will shift 4 bits to the left, while (9+~(n&7))<<2=28, so we will shift 28 bits to the right. Putting this together, we compute 0x87654321u<<4 | 0x87654321u>>28. For 32-bit integers, this is 0x76543210|0x8=0x76543218. But for 64-bit integers it is 0x876543210|0x8=0x876543218, so in that case we need to truncate to 32 bits, which is what the final &0xffffffff does. If the integers are shorter than 32 bits, then this won't work, but your example in the question had 32 bits, so I assume the integer types are at least that long.
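For checking any of the puzzle versions against known-good values, a plain reference rotation can help. The sketch below deliberately ignores the puzzle's restrictions (it uses uint32_t and subtraction), and the name rotl_nibbles is made up for this illustration.

```c
#include <stdint.h>

/* Rotates a 32-bit value left by n nibbles (n is taken modulo 8). */
static uint32_t rotl_nibbles(uint32_t x, unsigned int n) {
    unsigned int left = (n & 7u) << 2;        /* nibbles to bits: 0, 4, ..., 28 */
    unsigned int right = (32u - left) & 31u;  /* 0 when left == 0, avoiding a shift by 32 */
    return (x << left) | (x >> right);
}
```

For example, rotl_nibbles(0x87654321, 1) yields 0x76543218, and a count of 0 or 8 returns the input unchanged because both shift amounts collapse to 0.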
As a small side-note: If you allow one operator which is not on the list, the sizeof operator, then we can make a version that works with all the bits of a longer int automatically. Inspired by Aki, we get (using 16 operators (remember, sizeof is an operator in C)):
int moveNib(int x, int n) {
int nbit = (n&((sizeof(int)<<1)+~0u))<<2;
return (x|0u)<<nbit | (x|0u)>>((sizeof(int)<<3)+1u+~nbit);
}
Without the additional restrictions, the typical rotate_left operation (by 0 < n < 8 nibbles) is trivial.
uint32_t X = (x << 4*n) | (x >> 4*(8-n));
Since we are talking about rotations, n < 0 is not a problem: rotating right by 1 is the same as rotating left by 7 units. I.e., nn = n & 7; and we are through.
int nn = (n & 7) << 2; // Remove the multiplication
uint32_t X = (x << nn) | (x >> (32-nn));
When nn == 0, x would be shifted by 32, which is undefined. This can be replaced simply with x >> 0, i.e. no rotation at all. (x << 0) | (x >> 0) == x.
Replacing the subtraction with addition: a - b = a + (~b+1) and simplifying:
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
uint32_t X = (x << nn) | (x >> mm); // when nn=0, also mm=0
Now the only problem is in shifting a signed int x right, which would duplicate the sign bit. That should be cured by a mask: (1 << nn) - 1
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
int result = (x << nn) | ((x >> mm) & ((1 << nn) + ~0));
At this point we have used just 12 of the allowed operations -- next we can start to dig into the problem of sizeof(int)...
int nn = (n & (sizeof(int)*2 - 1)) << 2; // sizeof(int)*2 nibbles per int; etc.

Return 1 if any bits in an integer equal 1 using bit operations in C

I've been thinking about this problem for hours. Here it is:
Write an expression that returns 1 if a given integer "x" has any bits equal to 1. return 0 otherwise.
I understand that I'm essentially just trying to figure out if x == 0 because that is the only int that has no 1 bits, but I can't figure out a solution. You may not use traditional control structures. You may use bitwise operators, addition, subtraction, and bit shifts. Suggestions?
Here's the best I could come up with:
y = (((-x) | x) >> (BITS - 1)) & 1;
where BITS = 32 for 32 bit ints, i.e. BITS = sizeof(int) * CHAR_BIT;
Here's a test program:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
int main(int argc, char *argv[])
{
const int BITS = sizeof(int) * CHAR_BIT;
if (argc == 2)
{
int x = atoi(argv[1]);
int y = (((-x) | x) >> (BITS - 1)) & 1;
printf("%d -> %d\n", x, y);
}
return 0;
}
Using !!x will give you the right answer. Since !0 = 1 and !(any nonzero number) = 0.
For a 32-bit unsigned value a, the following will work for all bit-patterns (with a signed a, an arithmetic right shift would give -1 rather than 1).
return (a | -a) >> 31;
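A minimal sketch of the same idea in unsigned arithmetic, which sidesteps signed overflow and sign-extending shifts. The name any_bit_set is hypothetical, and the shift count assumes unsigned int has no padding bits.

```c
#include <limits.h>

/* Returns 1 if any bit of x is set, 0 otherwise, using only |, an unsigned
   subtraction from zero and a shift. */
static unsigned int any_bit_set(unsigned int x) {
    /* For x != 0, either x or its two's-complement negation has the top bit set. */
    return (x | (0u - x)) >> (sizeof(unsigned int) * CHAR_BIT - 1);
}
```

Because the operand is unsigned, the right shift fills with zeros, so the result is exactly 0 or 1.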
Mask each of the bits individually, shift them all down to the lsb position, and or them together.
You could just cast your int to a bool. But I doubt that's the purpose of your homework ;-)
For 32 bit integers
int return_not_zero(int v)
{
unsigned int r = (unsigned int)v; /* unsigned, so right shifts do not copy the sign bit */
r=(r&0xFFFF) | (r>>16);
r=(r&0xFF) | (r>>8);
r=(r&0x0F) | (r>>4);
r=(r&0x03) | (r>>2);
r=(r&0x01) | (r>>1);
return r;
}
0 || number - this will return 0 only if the number is 0 and will return 1 if the number is any other number than 0. Since a number without any bit as 1 will be equal to 0, we need to check it with 0.
untested, that's the first thing that came to my mind:
unsigned e = 0; while (e < 16 && (n & (1u << e)) == 0) e++; // 16 or 32
if e == 16 after the loop n is 0.
int any_bits_to_one(unsigned int n) {
int result = 0, i;
for (i=0; !result && i < sizeof(unsigned int) * 8; i++)
result |= (n & (1u<<i)) ? 1 : 0;
return result;
}
Bitwise AND with 0 and any number must equal zero, but the only foolproof test would be with 0xFFFF, or every bit being set. To get all bits set, you should have a signed int, and assign it -1. You will then have an int with all bits set to 1, regardless of size.
So my answer would be to bitwise AND it with -1
How about !(x&~x)&&x ?
#include <stdio.h>
int main(void){
int x;
scanf("%d",&x);
printf("%d\n",(!(x&~x)&&x));
return 0;
}
It seems to work, but I'm not sure what happens when overflow occurs.
I believe this is the simplest way.
return !!(0|x);
The only time your x will not have a 1 in it is when all bits are 0, or x == 0. So 0|0 -> 0 else 0|x -> non zero.
In C language, any value other than ZERO (either positive or negative) is treated as TRUE. And there should be a condition to check either your question's solution returns a ZERO or ONE (or other than ZERO). Therefore this answer is perfectly as per your requirement. This uses only bit-wise operators.
return (x & 0xFFFF);
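This line returns ZERO when no bit in "x" is 1, and returns non-zero (TRUE in a sense) when any bit in "x" is 1. Note that 0xFFFF only covers the low 16 bits, so this holds as written only when x is no wider than 16 bits; a wider x needs a mask with every bit set, such as ~0.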

Algorithm to generate bit mask

I was facing this unique problem of generating a bit-mask based on the input parameter. For example,
if param = 2, then the mask will be 0x3 (11b)
if param = 5, then the mask will be 0x1F (1 1111b)
This I implemented using a for-loop in C, something like
int nMask = 0;
for (int i = 0; i < param; i ++) {
nMask |= (1 << i);
}
I would like to know if there is a better algorithm ~~~
One thing to notice about bitmasks like that is that they are always one less than a power of two.
The expression 1 << n is the easiest way to get the n-th power of two.
You don't want Zero to provide a bitmask of 00000001, you want it to provide zero. So you need to subtract one.
mask = (1 << param) - 1;
Edit:
If you want a special case for param > 32:
int sizeInBits = sizeof(mask) * BITS_PER_BYTE; // BITS_PER_BYTE = 8;
mask = (param >= sizeInBits ? -1 : (1 << param) - 1);
This method should work for 16, 32, or 64 bit integers, but you may have to explicitly type the '1'.
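To illustrate what explicitly typing the '1' means for a 64-bit mask, here is a small sketch. The name low_bits_mask64 is made up; UINT64_C comes from <stdint.h> and gives the constant a 64-bit type, so the shift is not performed on a plain int.

```c
#include <stdint.h>

/* Mask with the low `param` bits set in a 64-bit type. */
static uint64_t low_bits_mask64(unsigned int param) {
    if (param >= 64)
        return UINT64_MAX;                 /* avoid shifting by the full width */
    return (UINT64_C(1) << param) - 1;     /* typed 1, so the shift is 64-bit */
}
```

With an untyped 1, a call such as low_bits_mask64(40) would shift an int by more than its width on typical platforms, which is undefined.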
Efficient, Branch-Free, Portable and Generic (but Ugly) Implementation
C:
#include <limits.h> /* CHAR_BIT */
#define BIT_MASK(__TYPE__, __ONE_COUNT__) \
(((__TYPE__) (-((__ONE_COUNT__) != 0))) \
& (((__TYPE__) -1) >> ((sizeof(__TYPE__) * CHAR_BIT) - (__ONE_COUNT__))))
C++:
#include <climits>
template <typename R>
static constexpr R bitmask(unsigned int const onecount)
{
// return (onecount != 0)
// ? (static_cast<R>(-1) >> ((sizeof(R) * CHAR_BIT) - onecount))
// : 0;
return static_cast<R>(-(onecount != 0))
& (static_cast<R>(-1) >> ((sizeof(R) * CHAR_BIT) - onecount));
}
Usage (Producing Compile Time Constants)
BIT_MASK(unsigned int, 4) /* = 0x0000000f */
BIT_MASK(uint64_t, 26) /* = 0x0000000003ffffffULL */
Example
#include <stdio.h>
int main()
{
unsigned int param;
for (param = 0; param <= 32; ++param)
{
printf("%u => 0x%08x\n", param, BIT_MASK(unsigned int, param));
}
return 0;
}
Output
0 => 0x00000000
1 => 0x00000001
2 => 0x00000003
3 => 0x00000007
4 => 0x0000000f
5 => 0x0000001f
6 => 0x0000003f
7 => 0x0000007f
8 => 0x000000ff
9 => 0x000001ff
10 => 0x000003ff
11 => 0x000007ff
12 => 0x00000fff
13 => 0x00001fff
14 => 0x00003fff
15 => 0x00007fff
16 => 0x0000ffff
17 => 0x0001ffff
18 => 0x0003ffff
19 => 0x0007ffff
20 => 0x000fffff
21 => 0x001fffff
22 => 0x003fffff
23 => 0x007fffff
24 => 0x00ffffff
25 => 0x01ffffff
26 => 0x03ffffff
27 => 0x07ffffff
28 => 0x0fffffff
29 => 0x1fffffff
30 => 0x3fffffff
31 => 0x7fffffff
32 => 0xffffffff
Explanation
First of all, as already discussed in other answers, >> is used instead of << in order to prevent the problem when the shift count is equal to the number of bits of the storage type of the value. (Thanks to Julien's answer above for the idea.)
For the ease of discussion, let's "instantiate" the macro with unsigned int as __TYPE__ and see what happens (assuming 32-bit for the moment):
((unsigned int) (-((__ONE_COUNT__) != 0))) \
& (((unsigned int) -1) >> ((sizeof(unsigned int) * CHAR_BIT) - (__ONE_COUNT__)))
Let's focus on:
(sizeof(unsigned int) * CHAR_BIT)
first. sizeof(unsigned int) is known at compile time. It is equal to 4 according to our assumption. CHAR_BIT represents the number of bits per char, a.k.a. per byte. It is also known at compile time. It is equal to 8 on most machines on Earth. Since this expression is known at compile time, the compiler will most likely do the multiplication at compile time and treat it as a constant, which equals 32 in this case.
Let's move to:
((unsigned int) -1)
It is equal to 0xFFFFFFFF. Casting -1 to any unsigned type produces a value of "all-1s" in that type. This part is also a compile time constant.
Up to now, the expression:
(((unsigned int) -1) >> ((sizeof(unsigned int) * CHAR_BIT) - (__ONE_COUNT__)))
is in fact the same as:
0xffffffffUL >> (32 - param)
which is the same as Julien's answer above. One problem with his answer is that if param is equal to 0, the expression becomes 0xffffffffUL >> 32. For a 32-bit type, shifting by the full width is undefined behavior in C, and on common hardware the result is 0xffffffffUL instead of the expected 0! (That's why I name my parameter __ONE_COUNT__, to emphasize its intention.)
To solve this problem, we could simply add a special case for __ONE_COUNT__ equal to 0 using if-else or ?:, like this:
#define BIT_MASK(__TYPE__, __ONE_COUNT__) \
(((__ONE_COUNT__) != 0) \
? (((__TYPE__) -1) >> ((sizeof(__TYPE__) * CHAR_BIT) - (__ONE_COUNT__))) \
: 0)
But branch-free code is cooler, isn't it?! Let's move to the next part:
((unsigned int) (-((__ONE_COUNT__) != 0)))
Let's start from the innermost expression to the outermost. ((__ONE_COUNT__) != 0) produces 0 when the parameter is 0, or 1 otherwise. (-((__ONE_COUNT__) != 0)) produces 0 when the parameter is 0, or -1 otherwise. For ((unsigned int) (-((__ONE_COUNT__) != 0))), the type-cast trick ((unsigned int) -1) is already explained above. Do you notice the trick now? The expression:
((__TYPE__) (-((__ONE_COUNT__) != 0)))
equals "all-0s" if __ONE_COUNT__ is zero, and "all-1s" otherwise. It acts as a bit-mask for the value we calculated in the first step. So, if __ONE_COUNT__ is non-zero, the mask has no effect and the result is the same as Julien's answer. If __ONE_COUNT__ is 0, it masks away all bits of Julien's answer, producing a constant zero. To visualize, watch this:
__ONE_COUNT__ :                           0                Other
----------------------------------------  ---------------  ---------------
(__ONE_COUNT__)                           0 = 0x000...0    (itself)
((__ONE_COUNT__) != 0)                    0 = 0x000...0    1 = 0x000...1
((__TYPE__) (-((__ONE_COUNT__) != 0)))    0 = 0x000...0    -1 = 0xFFF...F
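One caveat: with a count of 0 the macro still evaluates a shift by the full width before masking the result away, and that shift is undefined in C. A possible variation, sketched here for a fixed 32-bit type, also masks the shift count so that it never reaches 32. The name bit_mask32 is hypothetical, and the trick relies on the width being a power of two.

```c
#include <stdint.h>

/* Branch-free mask of the low n bits of a 32-bit value, for 0 <= n <= 32. */
static uint32_t bit_mask32(unsigned int n) {
    uint32_t keep = (uint32_t)0 - (uint32_t)(n != 0);  /* all-0s if n == 0, else all-1s */
    unsigned int shift = (32u - n) & 31u;              /* n == 0 or 32 both give 0 */
    return keep & (UINT32_MAX >> shift);
}
```

For n == 0 the shifted value is all ones but keep is zero, so the result is 0; for n == 32 both factors are all ones.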
Alternatively, you can use a right shift to avoid the issue mentioned in the (1 << param) - 1 solution.
unsigned long const mask = 0xffffffffUL >> (32 - param);
assuming that param <= 32, of course.
For those interested, this is the lookup-table alternative discussed in comments to the other answer - the difference being that it works correctly for a param of 32. It's easy enough to extend to the 64 bit unsigned long long version, if you need that, and shouldn't be significantly different in speed (if it's called in a tight inner loop then the static table will stay in at least L2 cache, and if it's not called in a tight inner loop then the performance difference won't be important).
unsigned long mask2(unsigned param)
{
static const unsigned long masks[] = {
0x00000000UL, 0x00000001UL, 0x00000003UL, 0x00000007UL,
0x0000000fUL, 0x0000001fUL, 0x0000003fUL, 0x0000007fUL,
0x000000ffUL, 0x000001ffUL, 0x000003ffUL, 0x000007ffUL,
0x00000fffUL, 0x00001fffUL, 0x00003fffUL, 0x00007fffUL,
0x0000ffffUL, 0x0001ffffUL, 0x0003ffffUL, 0x0007ffffUL,
0x000fffffUL, 0x001fffffUL, 0x003fffffUL, 0x007fffffUL,
0x00ffffffUL, 0x01ffffffUL, 0x03ffffffUL, 0x07ffffffUL,
0x0fffffffUL, 0x1fffffffUL, 0x3fffffffUL, 0x7fffffffUL,
0xffffffffUL };
if (param < (sizeof masks / sizeof masks[0]))
return masks[param];
else
return 0xffffffffUL; /* Or whatever else you want to do in this error case */
}
It's worth pointing out that if you need the if() statement (because you are worried that someone might call it with param > 32), then this doesn't win you anything over the alternative from the other answer:
unsigned long mask(unsigned param)
{
if (param < 32)
return (1UL << param) - 1;
else
return -1;
}
The only difference is that the latter version has to special case param >= 32, whereas the former only has to special case param > 32.
How about this (in Java):
int mask = -1;
mask = mask << param;
mask = ~mask;
This way you can avoid lookup tables and hard coding the length of an integer.
Explanation: A signed integer with a value of -1 is represented in binary as all ones. Shift left the given number of times to add that many 0's to the right side. This will result in a 'reverse mask' of sorts. Then negate the shifted result to create your mask.
This could be shortened to:
int mask = ~(-1<<param);
An example:
int param = 5;
int mask = -1; // 11111111 (shortened for example)
mask = mask << param; // 11100000
mask = ~mask; // 00011111
From top of my head. Sorry, I'm on mobile. I assume a 64 bit type for clarity, but this can be easily generalized.
(((uint64_t) (bits < 64)) << (bits & 63)) - 1u
It's the typical (1 << bits) - 1 but branchless, with no undefined behavior, with the & 63 optimizable away on some platforms and with correct results for the whole range of values.
The left operand of the left shift becomes 0 for shift counts greater than or equal to the type width.
The right operand of the left shift is masked to avoid undefined behavior, so its value will never get bigger than 63. This is just to make compilers and language lawyers happy, as no platform will be adding ones when the left operand is already zero (for values bigger than 63). A good compiler should remove the & 63 masking on platforms where this is already the behavior of the underlying instruction (e.g. x86).
As we have seen, values bigger than 63 get a result of 0 from the shift, but there is a subtraction by one afterwards, leaving all bits set through an unsigned integer wraparound, which is not undefined behavior on unsigned types.
If you're worried about overflow in a C-like language with (1 << param) - 1 (when param is 32 or 64 at the max size type the mask becomes 0 since bitshift pushes past the bounds of type), one solution I just thought of:
const uint32_t mask = ( 1ul << ( maxBits - 1ul ) ) | ( ( 1ul << ( maxBits - 1ul ) ) - 1ul );
Or another example
const uint64_t mask = ( 1ull << ( maxBits - 1ull ) ) | ( ( 1ull << ( maxBits - 1ull ) ) - 1ull );
Here's a templatized version, keep in mind that you should use this with an unsigned type R:
#include <cassert> /* assert */
#include <limits.h> /* CHAR_BIT */
// bits cannot be 0
template <typename R>
static constexpr R bitmask1( const R bits )
{
const R one = 1;
assert( bits >= one );
assert( bits <= sizeof( R ) * CHAR_BIT );
const R bitShift = one << ( bits - one );
return bitShift | ( bitShift - one );
}
Let's say max bits is 8 with a byte. With the first overflowing function we'd have 1 << 8 == 256, which when cast to byte becomes 0. With my function we have 1 << 7 == 128, which a byte can contain, so the result becomes (1<<7) | ((1<<7) - 1).
I haven't compiled the function, so it may contain typos.
And for fun here's Julien Royer's fleshed out:
// bits must be at least 1: bits == 0 would shift by the full width, which is undefined behavior
template <typename R>
static constexpr R bitmask2( const R bits )
{
const R zero = 0;
const R mask = ~zero;
const R maxBits = sizeof( R ) * CHAR_BIT;
assert( bits <= maxBits );
return mask >> ( maxBits - bits );
}
For a 32-bit mask you can use this (use uint64_t for a 64-bit mask):
#include <assert.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int
main()
{
size_t n = 8;
assert(n <= 32);
uint32_t mask = ~(uint32_t)0 >> (32 - n);
printf("mask = %08" PRIX32 "\n", mask);
}
I know it's an answer to a very old post. But in case some human being actually reads this: I would welcome any feedback.
Just for reference (google), I used the following to get an all-1 mask for integral types.
In C++ one might simply use:
std::numeric_limits<uint16_t>::max() // 65535

Bit reversal of an integer, ignoring integer size and endianness

Given an integer typedef:
typedef unsigned int TYPE;
or
typedef unsigned long TYPE;
I have the following code to reverse the bits of an integer:
TYPE max_bit= (TYPE)-1;
void reverse_int_setup()
{
TYPE bits= (TYPE)max_bit;
while (bits <<= 1)
max_bit= bits;
}
TYPE reverse_int(TYPE arg)
{
TYPE bit_setter= 1, bit_tester= max_bit, result= 0;
for (result= 0; bit_tester; bit_tester>>= 1, bit_setter<<= 1)
if (arg & bit_tester)
result|= bit_setter;
return result;
}
One just needs first to run reverse_int_setup(), which stores an integer with the highest bit turned on, then any call to reverse_int(arg) returns arg with its bits reversed (to be used as a key to a binary tree, taken from an increasing counter, but that's more or less irrelevant).
Is there a platform-agnostic way to have at compile time the correct value that max_bit holds after the call to reverse_int_setup()? Otherwise, is there an algorithm you consider better/leaner than the one I have for reverse_int()?
Thanks.
#include<stdio.h>
#include<limits.h>
#define TYPE_BITS (sizeof(TYPE)*CHAR_BIT)
typedef unsigned long TYPE;
TYPE reverser(TYPE n)
{
TYPE nrev = 0, i, bit1, bit2;
int count;
for(i = 0; i < TYPE_BITS; i += 2)
{
/*In each iteration, we swap one bit on the 'right half'
of the number with another on the left half*/
count = TYPE_BITS - i - 1; /*this is used to find how many positions
to the left (and right) we gotta move
the bits in this iteration*/
bit1 = n & ((TYPE)1<<(i/2)); /*Extract 'right half' bit*/
bit1 <<= count; /*Shift it to where it belongs*/
bit2 = n & (TYPE)1<<((i/2) + count); /*Find the 'left half' bit*/
bit2 >>= count; /*Place that bit in bit1's original position*/
nrev |= bit1; /*Now add the bits to the reversal result*/
nrev |= bit2;
}
return nrev;
}
int main()
{
TYPE n = 6;
printf("%lu", reverser(n));
return 0;
}
This time I've used the 'number of bits' idea from TK, but made it somewhat more portable by not assuming a byte contains 8 bits and instead using the CHAR_BIT macro. The code is more efficient now (with the inner for loop removed). I hope the code is also slightly less cryptic this time. :)
The need for using count is that the number of positions by which we have to shift a bit varies in each iteration - we have to move the rightmost bit by 31 positions (assuming 32 bit number), the second rightmost bit by 29 positions and so on. Hence count must decrease with each iteration as i increases.
Hope that bit of info proves helpful in understanding the code...
The following program serves to demonstrate a leaner algorithm for reversing bits, which can be easily extended to handle 64bit numbers.
#include <stdio.h>
#include <stdint.h>
int main(int argc, char**argv)
{
unsigned int x; /* assumed to be 32 bits wide; matches the %x conversions below */
if ( argc != 2 )
{
printf("Usage: %s hexadecimal\n", argv[0]);
return 1;
}
sscanf(argv[1],"%x", &x);
/* swap every neighbouring bit */
x = (x&0xAAAAAAAA)>>1 | (x&0x55555555)<<1;
/* swap every 2 neighbouring bits */
x = (x&0xCCCCCCCC)>>2 | (x&0x33333333)<<2;
/* swap every 4 neighbouring bits */
x = (x&0xF0F0F0F0)>>4 | (x&0x0F0F0F0F)<<4;
/* swap every 8 neighbouring bits */
x = (x&0xFF00FF00)>>8 | (x&0x00FF00FF)<<8;
/* and so forth, for say, 32 bit int */
x = (x&0xFFFF0000)>>16 | (x&0x0000FFFF)<<16;
printf("0x%x\n",x);
return 0;
}
This code should not contain errors, and was tested using 0x12345678 to produce 0x1e6a2c48 which is the correct answer.
typedef unsigned long TYPE;
TYPE reverser(TYPE n)
{
TYPE k = 1, nrev = 0, i, nrevbit1, nrevbit2;
int count;
for(i = 0; !i || (1 << i && (1 << i) != 1); i+=2)
{
/*In each iteration, we swap one bit
on the 'right half' of the number with another
on the left half*/
k = 1<<i; /*this is used to find how many positions
to the left (or right, for the other bit)
we gotta move the bits in this iteration*/
count = 0;
while(k << 1 && k << 1 != 1)
{
k <<= 1;
count++;
}
nrevbit1 = n & (1<<(i/2));
nrevbit1 <<= count;
nrevbit2 = n & 1<<((i/2) + count);
nrevbit2 >>= count;
nrev |= nrevbit1;
nrev |= nrevbit2;
}
return nrev;
}
This works fine in gcc under Windows, but I'm not sure if it's completely platform independent. A few places of concern are:
the condition in the for loop - it assumes that when you left shift 1 beyond the leftmost bit, you get either a 0 with the 1 'falling out' (what I'd expect and what good old Turbo C gives iirc), or the 1 circles around and you get a 1 (what seems to be gcc's behaviour).
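the condition in the inner while loop: see above. But there's a strange thing happening here: in this case, gcc seems to let the 1 fall out and not circle around! (The difference is that 1 << i shifts by a count that reaches the width of int, which is undefined behavior in C, whereas k << 1 shifts an unsigned value by one bit, which is well defined and simply drops the top bit.)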
The code might prove cryptic: if you're interested and need an explanation please don't hesitate to ask - I'll put it up someplace.
#ΤΖΩΤΖΙΟΥ
In reply to ΤΖΩΤΖΙΟΥ's comments, I present a modified version of the above which depends on an upper limit for the bit width.
#include <stdio.h>
#include <stdint.h>
typedef uint32_t TYPE;
TYPE reverse(TYPE x, int bits)
{
TYPE m=~0;
switch(bits)
{
case 64:
x = (x&0xFFFFFFFF00000000&m)>>32 | (x&0x00000000FFFFFFFF&m)<<32;
case 32:
x = (x&0xFFFF0000FFFF0000&m)>>16 | (x&0x0000FFFF0000FFFF&m)<<16;
case 16:
x = (x&0xFF00FF00FF00FF00&m)>>8 | (x&0x00FF00FF00FF00FF&m)<<8;
case 8:
x = (x&0xF0F0F0F0F0F0F0F0&m)>>4 | (x&0x0F0F0F0F0F0F0F0F&m)<<4;
x = (x&0xCCCCCCCCCCCCCCCC&m)>>2 | (x&0x3333333333333333&m)<<2;
x = (x&0xAAAAAAAAAAAAAAAA&m)>>1 | (x&0x5555555555555555&m)<<1;
}
return x;
}
int main(int argc, char**argv)
{
TYPE x;
TYPE b = (TYPE)-1;
int bits;
if ( argc != 2 )
{
printf("Usage: %s hexadecimal\n", argv[0]);
return 1;
}
for(bits=1;b;b<<=1,bits++);
--bits;
printf("TYPE has %d bits\n", bits);
sscanf(argv[1],"%x", &x);
printf("0x%x\n",reverse(x, bits));
return 0;
}
Notes:
gcc will warn on the 64bit constants
the printfs will generate warnings too
If you need more than 64bit, the code should be simple enough to extend
I apologise in advance for the coding crimes I committed above - mercy good sir!
There's a nice collection of "Bit Twiddling Hacks", including a variety of simple and not-so simple bit reversing algorithms coded in C at http://graphics.stanford.edu/~seander/bithacks.html.
I personally like the "Obvious" algorithm (http://graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious) because, well, it's obvious. Some of the others may require fewer instructions to execute. If I really need to optimize the heck out of something I may choose the not-so-obvious but faster versions. Otherwise, for readability, maintainability, and portability I would choose the Obvious one.
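For reference, a sketch in the spirit of that "Obvious" approach, wrapped in a function. The name reverse_obvious is made up here, and the code assumes unsigned int has no padding bits; treat it as an illustration rather than a verbatim copy of the linked page:

```c
#include <limits.h>

/* Reverse all bits of v, one bit per loop iteration. */
unsigned int reverse_obvious(unsigned int v)
{
    unsigned int r = v;                       /* start with the LSB of v */
    int s = (int)(sizeof(v) * CHAR_BIT) - 1;  /* shifts still owed at the end */

    for (v >>= 1; v; v >>= 1) {
        r <<= 1;          /* make room for the next bit */
        r |= v & 1u;      /* append the current lowest bit of v */
        s--;
    }
    r <<= s;              /* account for v's leading zero bits */
    return r;
}
```

The loop stops as soon as v runs out of set bits, so small inputs finish early and the final shift makes up the difference.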
Here is a more generally useful variation. Its advantage is its ability to work in situations where the bit length of the value to be reversed -- the codeword -- is unknown but is guaranteed not to exceed a value we'll call maxLength. A good example of this case is Huffman code decompression.
The code below works on codewords from 1 to 24 bits in length. It has been optimized for fast execution on a Pentium D. Note that it accesses the lookup table as many as 3 times per use. I experimented with many variations that reduced that number to 2 at the expense of a larger table (4096 and 65,536 entries). This version, with the 256-byte table, was the clear winner, partly because it is so advantageous for table data to be in the caches, and perhaps also because the processor has an 8-bit table lookup/translation instruction.
const unsigned char table[] = {
0x00,0x80,0x40,0xC0,0x20,0xA0,0x60,0xE0,0x10,0x90,0x50,0xD0,0x30,0xB0,0x70,0xF0,
0x08,0x88,0x48,0xC8,0x28,0xA8,0x68,0xE8,0x18,0x98,0x58,0xD8,0x38,0xB8,0x78,0xF8,
0x04,0x84,0x44,0xC4,0x24,0xA4,0x64,0xE4,0x14,0x94,0x54,0xD4,0x34,0xB4,0x74,0xF4,
0x0C,0x8C,0x4C,0xCC,0x2C,0xAC,0x6C,0xEC,0x1C,0x9C,0x5C,0xDC,0x3C,0xBC,0x7C,0xFC,
0x02,0x82,0x42,0xC2,0x22,0xA2,0x62,0xE2,0x12,0x92,0x52,0xD2,0x32,0xB2,0x72,0xF2,
0x0A,0x8A,0x4A,0xCA,0x2A,0xAA,0x6A,0xEA,0x1A,0x9A,0x5A,0xDA,0x3A,0xBA,0x7A,0xFA,
0x06,0x86,0x46,0xC6,0x26,0xA6,0x66,0xE6,0x16,0x96,0x56,0xD6,0x36,0xB6,0x76,0xF6,
0x0E,0x8E,0x4E,0xCE,0x2E,0xAE,0x6E,0xEE,0x1E,0x9E,0x5E,0xDE,0x3E,0xBE,0x7E,0xFE,
0x01,0x81,0x41,0xC1,0x21,0xA1,0x61,0xE1,0x11,0x91,0x51,0xD1,0x31,0xB1,0x71,0xF1,
0x09,0x89,0x49,0xC9,0x29,0xA9,0x69,0xE9,0x19,0x99,0x59,0xD9,0x39,0xB9,0x79,0xF9,
0x05,0x85,0x45,0xC5,0x25,0xA5,0x65,0xE5,0x15,0x95,0x55,0xD5,0x35,0xB5,0x75,0xF5,
0x0D,0x8D,0x4D,0xCD,0x2D,0xAD,0x6D,0xED,0x1D,0x9D,0x5D,0xDD,0x3D,0xBD,0x7D,0xFD,
0x03,0x83,0x43,0xC3,0x23,0xA3,0x63,0xE3,0x13,0x93,0x53,0xD3,0x33,0xB3,0x73,0xF3,
0x0B,0x8B,0x4B,0xCB,0x2B,0xAB,0x6B,0xEB,0x1B,0x9B,0x5B,0xDB,0x3B,0xBB,0x7B,0xFB,
0x07,0x87,0x47,0xC7,0x27,0xA7,0x67,0xE7,0x17,0x97,0x57,0xD7,0x37,0xB7,0x77,0xF7,
0x0F,0x8F,0x4F,0xCF,0x2F,0xAF,0x6F,0xEF,0x1F,0x9F,0x5F,0xDF,0x3F,0xBF,0x7F,0xFF};
const unsigned short masks[17] =
{0,0,0,0,0,0,0,0,0,0X0100,0X0300,0X0700,0X0F00,0X1F00,0X3F00,0X7F00,0XFF00};
unsigned long codeword; // value to be reversed, occupying the low 1-24 bits
unsigned char maxLength; // bit length of longest possible codeword (<= 24)
unsigned char sc; // shift count in bits and index into masks array
if (maxLength <= 8)
{
codeword = table[codeword << (8 - maxLength)];
}
else
{
sc = maxLength - 8;
if (maxLength <= 16)
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc];
}
else if (maxLength & 1) // if maxLength is 17, 19, 21, or 23
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc] |
(table[(codeword & masks[sc]) >> (sc - 8)] << 8);
}
else // if maxlength is 18, 20, 22, or 24
{
codeword = (table[codeword & 0X00FF] << sc)
| table[codeword >> sc]
| (table[(codeword & masks[sc]) >> (sc >> 1)] << (sc >> 1));
}
}
How about:
long temp = 0;
int counter = 0;
int number_of_bits = sizeof(value) * 8; // get the number of bits that represent value (assuming that it is aligned to a byte boundary)
while(value > 0) // loop until value is empty
{
temp <<= 1; // shift whatever was in temp left to create room for the next bit
temp |= (value & 0x01); // get the lsb from value and set as lsb in temp
value >>= 1; // shift value right by one to look at next lsb
counter++;
}
value = temp;
if (counter < number_of_bits)
{
value <<= number_of_bits - counter; // pad with the zero bits the loop never visited
}
(I'm assuming that you know how many bits value holds and it is stored in number_of_bits)
Obviously temp needs to be the longest imaginable data type and when you copy temp back into value, all the extraneous bits in temp should magically vanish (I think!).
Or, the 'C' way would be to say:
while(value)
Your choice.
We can store the results of reversing all possible 1-byte sequences in an array (256 distinct entries), then use a combination of lookups into this table and some OR-ing logic to get the reverse of an integer.
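A minimal sketch of that table idea for a 32-bit value. The names rev8, init_rev8 and reverse32 are made up for illustration, and the table is filled at run time here instead of being written out as 256 literals:

```c
#include <stdint.h>

static uint8_t rev8[256];   /* rev8[b] = b with its 8 bits reversed */

/* Fill the table once before first use. */
void init_rev8(void)
{
    for (int b = 0; b < 256; b++) {
        uint8_t r = 0;
        for (int i = 0; i < 8; i++) {
            if (b & (1 << i))
                r |= (uint8_t)(1 << (7 - i));   /* bit i moves to bit 7-i */
        }
        rev8[b] = r;
    }
}

/* Reverse a 32-bit value: bit-reverse each byte through the table
   and swap the byte positions while OR-ing the pieces together. */
uint32_t reverse32(uint32_t x)
{
    return ((uint32_t)rev8[x & 0xFF] << 24)
         | ((uint32_t)rev8[(x >> 8) & 0xFF] << 16)
         | ((uint32_t)rev8[(x >> 16) & 0xFF] << 8)
         |  (uint32_t)rev8[(x >> 24) & 0xFF];
}
```

init_rev8() has to run once before the first call to reverse32(); after that each reversal costs four lookups plus a few shifts and ORs.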
Here is a variation and correction to TK's solution which might be clearer than the solutions by sundar. It takes single bits from t and pushes them into return_val:
typedef unsigned long TYPE;
#define TYPE_BITS (sizeof(TYPE)*8)
TYPE reverser(TYPE t)
{
unsigned int i;
TYPE return_val = 0;
for(i = 0; i < TYPE_BITS; i++)
{/*foreach bit in TYPE*/
/* shift the value of return_val to the left and add the rightmost bit from t */
return_val = (return_val << 1) + (t & 1);
/* shift off the rightmost bit of t */
t = t >> 1;
}
return(return_val);
}
The generic approach that would work for objects of any type and any size would be to reverse the order of the bytes of the object, and then reverse the order of bits in each byte. In this case the bit-level algorithm is tied to a concrete number of bits (a byte), while the "variable" logic (with regard to size) is lifted to the level of whole bytes.
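A sketch of how that could look, working on the raw bytes of any object. The names reverse_byte and reverse_object_bits are made up for illustration, and the approach assumes the object has no padding bits or padding bytes:

```c
#include <limits.h>
#include <stddef.h>

/* Reverse the CHAR_BIT bits inside a single byte. */
static unsigned char reverse_byte(unsigned char b)
{
    unsigned char r = 0;
    for (int i = 0; i < CHAR_BIT; i++) {
        r = (unsigned char)((r << 1) | (b & 1u));
        b >>= 1;
    }
    return r;
}

/* Reverse the whole object in place: swap bytes end to end
   and bit-reverse each byte as it is moved. */
void reverse_object_bits(void *obj, size_t size)
{
    unsigned char *p = obj;
    size_t lo = 0, hi = size;
    while (lo < hi) {
        hi--;
        unsigned char a = reverse_byte(p[lo]);
        unsigned char b = reverse_byte(p[hi]);
        p[lo] = b;     /* when lo == hi the middle byte is reversed once */
        p[hi] = a;
        lo++;
    }
}
```

Called as reverse_object_bits(&v, sizeof v) on an unsigned integer, this should give the bit-reversed value on both little-endian and big-endian machines, since the byte order is mirrored along with the bits.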
Here's my generalization of freespace's solution (in case we one day get 128-bit machines). It results in jump-free code when compiled with gcc -O3, and is obviously insensitive to the definition of foo_t on sane machines. Unfortunately it does depend on shift being a power of 2!
#include <limits.h>
#include <stdio.h>
typedef unsigned long foo_t;
foo_t reverse(foo_t x)
{
int shift = sizeof (x) * CHAR_BIT / 2;
foo_t mask = ((foo_t)1 << shift) - 1; /* cast first: a plain int 1 would overflow when shift >= width of int */
int i;
for (i = 0; shift; i++) {
x = ((x & mask) << shift) | ((x & ~mask) >> shift);
shift >>= 1;
mask ^= (mask << shift);
}
return x;
}
int main() {
printf("reverse = 0x%08lx\n", reverse(0x12345678L));
}
If bit reversal is time critical, mainly in conjunction with an FFT, the best approach is to store the whole bit-reversed array. In any case, this array will be smaller than the table of roots of unity that has to be precomputed for the Cooley-Tukey FFT algorithm. An easy way to compute the array is:
int BitReverse[Size]; // Size is power of 2
void Init()
{
BitReverse[0] = 0;
for(int i = 0; i < Size/2; i++)
{
BitReverse[2*i] = BitReverse[i]/2;
BitReverse[2*i+1] = (BitReverse[i] + Size)/2;
}
} // end it's all
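Such a table is typically applied as an in-place permutation of the input before the butterfly passes. A sketch of that, using the same recurrence but with the table and its size passed as parameters; the names build_bit_reverse and bit_reverse_permute are made up for illustration, and n is assumed to be a power of 2:

```c
#include <stddef.h>

/* Build rev[i] for i in [0, n) with the recurrence
   rev[2i] = rev[i]/2 and rev[2i+1] = (rev[i] + n)/2. */
void build_bit_reverse(size_t *rev, size_t n)
{
    if (n == 0)
        return;
    rev[0] = 0;
    for (size_t i = 0; i < n / 2; i++) {
        rev[2 * i]     = rev[i] / 2;
        rev[2 * i + 1] = (rev[i] + n) / 2;
    }
}

/* Reorder data[] into bit-reversed index order, in place.
   Each pair is swapped once, when i is the smaller index of the two. */
void bit_reverse_permute(double *data, const size_t *rev, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i < rev[i]) {
            double tmp = data[i];
            data[i] = data[rev[i]];
            data[rev[i]] = tmp;
        }
    }
}
```

For n = 8 the table comes out as 0, 4, 2, 6, 1, 5, 3, 7, so elements 1 and 4 trade places, as do 3 and 6.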

Resources