Can anyone explain this bitwise function to compute log(n) - c

int howManyBits(int x) {
int concatenate;
int bias;
int sign = x >> 31; //get the sign
x = (sign & (~x)) | (~sign & x);
concatenate = (!!(x >> 16)) << 4;
concatenate |= (!!(x >> (concatenate + 8))) << 3;
concatenate |= (!!(x >> (concatenate + 4))) << 2;
concatenate |= (!!(x >> (concatenate + 2))) << 1;
concatenate |= x >> (concatenate + 1);
bias = !(x ^ 0);
return concatenate + 2 + (~bias + 1);
}
This code is presented as a way to calculate the minimum number of bits required to represent an integer n in 2's complement, with the assumption that the int data type is represented with 32 bits. Right shifting is assumed to be arithmetic.
I understand that the basic method is to take the log base 2 of n, round it up, and then add 1 bit to account for the sign bit.
I also understand that left-shifting is equivalent to multiplying by 2 and that right-shifting is equivalent to dividing by 2.
That being said, without comments I can't decipher what this code is doing beyond the portion where it obtains the value of the sign bit. I worked through it on pencil and paper with a sample int of the value 5 - the code works, but I don't understand why.
Could someone provide some intuitive breakdown of what the code is doing?

This code could use some comments.
This leaves x as it is if it is non-negative, or takes the one's complement if it is negative. This allows the calculation to search for the most significant one regardless of the sign.
x = (sign & (~x)) | (~sign & x);
I think the following would have been more clear:
x = sign ? ~x : x;
Next is a search for the highest 1 bit done with a binary search. First the top half of the word is searched.
concatenate = (!!(x >> 16)) << 4;
If the top half has a 1, then the result is 16. The 16 is used later as part of the answer, and also to determine where to search next. Because it is added to the shift amounts that follow, the subsequent tests are done on either the top half of the word or the bottom half.
The following concatenate operations search progressively smaller pieces of the original number: is the most significant one in the upper 8 bits or the lower 8 bits of the 16 bits that were chosen, then in the upper 4 bits or the lower 4 bits of the 8 bits that were chosen, and so forth.
concatenate |= (!!(x >> (concatenate + 8))) << 3; // Check which 8 bits
concatenate |= (!!(x >> (concatenate + 4))) << 2; // Check which 4 bits
concatenate |= (!!(x >> (concatenate + 2))) << 1; // Check which 2 bits
concatenate |= x >> (concatenate + 1); // Check which 1 bit
The bias just checks whether the number is 0. It is 1 only if x is 0. I don't understand the need for the XOR operator: x ^ 0 equals x, so !x would give the same result.
Finally the pieces are added together.
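To make the walkthrough above concrete, here is a commented copy of the function next to a deliberately slow reference. It is only a sketch under the question's assumptions (32-bit int, arithmetic right shift of negative values); the name how_many_bits_reference is made up for this example.

```c
/* Commented copy of the function from the question.
   Assumes 32-bit int and arithmetic right shift of negative values. */
int howManyBits(int x) {
    int concatenate;
    int bias;
    int sign = x >> 31;                /* all ones if x < 0, otherwise 0 */
    x = (sign & (~x)) | (~sign & x);   /* same as: x < 0 ? ~x : x */
    concatenate = (!!(x >> 16)) << 4;                  /* 16 if the top half holds a 1 */
    concatenate |= (!!(x >> (concatenate + 8))) << 3;  /* upper or lower 8 bits */
    concatenate |= (!!(x >> (concatenate + 4))) << 2;  /* upper or lower 4 bits */
    concatenate |= (!!(x >> (concatenate + 2))) << 1;  /* upper or lower 2 bits */
    concatenate |= x >> (concatenate + 1);             /* upper or lower bit */
    bias = !(x ^ 0);                   /* 1 only when x is 0 */
    /* concatenate is now the index of the highest 1 bit (0 when x is 0).
       Index + 1 bits hold the magnitude, one more holds the sign, and
       ~bias + 1 is -bias, which removes one bit when x is 0. */
    return concatenate + 2 + (~bias + 1);
}

/* Hypothetical reference: the smallest width whose two's complement
   range [-2^(bits-1), 2^(bits-1) - 1] contains x. */
int how_many_bits_reference(int x) {
    int bits;
    for (bits = 1; bits < 32; bits++) {
        long long lo = -(1LL << (bits - 1));
        long long hi = (1LL << (bits - 1)) - 1;
        if (x >= lo && x <= hi)
            return bits;
    }
    return 32;
}
```

With these assumptions, howManyBits(5) gives 4 (binary 0101), while 0 and -1 both give 1, and the two functions can be compared over any range of inputs.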

Related

Making bitmasks in C

I'm new to bit manipulation and I'm struggling to translate it to code.
For the following function I need to make a bitmask
void make_bitmask(unsigned int width, unsigned int shift,unsigned int* mask)
The bitmask function is supposed to have a width and a shift. If w=5 and s=1, it should give 00111110.
How should I approach this? This is my latest code I've been trying:
*mask |= width << shift;
Changed to *mask = ((1 << width) - 1) << shift;
Unit test:
static void test_make_bitmask_correct_width(void)
{
    unsigned int mask = 0;
    make_bitmask(1, 0, &mask);
    TEST_ASSERT_EQUAL_HEX32(0x1, mask);
    make_bitmask(2, 0, &mask);
    TEST_ASSERT_EQUAL_HEX32(0x3, mask);
    make_bitmask(5, 0, &mask);
    TEST_ASSERT_EQUAL_HEX32(0x1f, mask);
    make_bitmask(32, 0, &mask);
    TEST_ASSERT_EQUAL_HEX32(0xffffffff, mask); // the one that is currently failing: mask is 0x00000000
}
void make_bitmask(unsigned int width, unsigned int shift,unsigned int *mask)
{
/* With unlimited arithmetic, (1 << width) - 1 would give us a mask of
width bits, starting at the low bit position. E.g., with width 3, 1 <<
3 would give binary 1000, and subtracting one would give 111, a mask of
3 bits. However, in C arithmetic, if width is the full width of an
unsigned int, then "1u << width" would overflow. To avoid this, if
width is positive, we shift 2u by width-1. If width is zero, we simply
use the hardcoded result for a mask of zero bits, zero.
*/
unsigned int t = width ? (2u << width-1) - 1u : 0u;
// Shift the mask to the desired position and store it.
*mask = t << shift;
}
The decimal value 5 is binary 00000101. Shifting it left by one bit gives you 00001010.
If you want 5 to turn into 00011111 (decimal 31), the easiest solution is to compute the value 00100000 (decimal 32) and then subtract one.
Remembering that we're dealing with powers of 2, if you raise 2 to the power of 5 we get 32, and can then subtract one to get the base mask. But instead of using the pow function or multiplying in a loop, we can just shift the value 1 five steps left (1 << 5).
Putting it all together, you should shift 1 left by width bits, subtract 1, and then shift the result shift bits to the left:
*mask = ((1 << width) - 1) << shift;
With width == 5 and shift == 1 then the above will give you binary 00111110 (decimal 62).
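One caveat with the one-line formula: when width equals the number of bits in an int, 1 << width is undefined in C, which fits the failing width-32 test in the question. A minimal sketch of a guarded variant follows; make_bitmask_checked and UINT_BITS are names invented for this example, and returning 0 for an out-of-range shift is an assumption about the desired behavior.

```c
#include <limits.h>

/* Number of bits in unsigned int on this platform. */
#define UINT_BITS ((unsigned int)(sizeof(unsigned int) * CHAR_BIT))

/* Hypothetical guarded variant of make_bitmask. */
void make_bitmask_checked(unsigned int width, unsigned int shift, unsigned int *mask)
{
    unsigned int ones;

    /* No bits requested, or every bit would be shifted out. */
    if (width == 0u || shift >= UINT_BITS) {
        *mask = 0u;
        return;
    }
    /* A full-width mask cannot be built with 1u << width, so use ~0u. */
    ones = (width >= UINT_BITS) ? ~0u : (1u << width) - 1u;
    *mask = ones << shift;
}
```

With width 5 and shift 1 this still produces 0x3e, and a full-width request with shift 0 produces all ones.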

Calculate the average bits of byte in C

I am trying to write a function that calculates the average number of 1 bits per byte.
float AvgOnesOnBinaryString (int x)
for example:
-252 is 11111111 11111111 11111111 00000100
so the function should return 6.25
because ( 8+8+8+1) / 4 = 6.25
I have to use the function that count bits in char:
int countOnesOnBinaryString (char x){
int bitCount = 0;
while(x > 0)
{
if ( x & 1 == 1 )
bitCount++;
x = x>>1;
}
return bitCount;
}
I tried:
float AvgOnesOnBinaryString (int x){
float total = 0;
total += countOnesOnBinaryString((x >> 24));
total += countOnesOnBinaryString((x >> 16));
total += countOnesOnBinaryString((x >> 8));
total += countOnesOnBinaryString(x);
return total/4;
}
but I get the answer 0.25 and not 6.25
what could be the problem?
UPDATE
I can't change the AvgOnesOnBinaryString function signature.
The C language allows compilers to define char as either a signed or unsigned type. I suspect it is signed on your platform, meaning that a byte like 0xff is likely interpreted as -1. This means that the x > 0 test in countOnesOnBinaryString yields false, so countOnesOnBinaryString(0xff) would return 0 instead of the correct value 8.
You should change countOnesOnBinaryString to take an argument of type unsigned char instead of char.
For somewhat related reasons, it would also be a good idea to change the argument of AvgOnesOnBinaryString to be unsigned int. Or even better, uint32_t from <stdint.h>, since your code assumes the input value is 32 bits, and (unsigned) int is allowed to be of some other size.
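Here is a minimal sketch of that fix which keeps the AvgOnesOnBinaryString signature unchanged, as the update to the question requires. It assumes 8-bit bytes; the loop over sizeof is my own choice so that the code does not depend on int being exactly 32 bits.

```c
/* Count the 1 bits in one byte. Taking unsigned char keeps a byte such
   as 0xff from being seen as a negative value. */
int countOnesOnBinaryString(unsigned char x) {
    int bitCount = 0;
    while (x != 0) {
        bitCount += x & 1;
        x >>= 1;
    }
    return bitCount;
}

/* Signature unchanged. The value is copied into an unsigned variable so
   the shifts do not replicate the sign bit. Assumes 8-bit bytes. */
float AvgOnesOnBinaryString(int x) {
    unsigned int u = (unsigned int)x;
    float total = 0;
    unsigned int i;
    for (i = 0; i < sizeof u; i++) {
        total += countOnesOnBinaryString((unsigned char)(u & 0xffu));
        u >>= 8;
    }
    return total / (float)sizeof u;
}
```

On a platform with a 32-bit int, AvgOnesOnBinaryString(-252) sums 8 + 8 + 8 + 1 and returns 6.25.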
There is one algorithm that gives you the count of the number of 1 bits in an unsigned variable far more quickly. Only 5 iterations are needed in a 32 bit integer. I'll show it to you in C for a full length 64 bit unsigned number, so probably you can guess the pattern and why it works (it is explained below):
uint64_t
no_of_1_bits(uint64_t the_value)
{
the_value = ((the_value & 0xaaaaaaaaaaaaaaaa) >> 1) + (the_value & 0x5555555555555555);
the_value = ((the_value & 0xcccccccccccccccc) >> 2) + (the_value & 0x3333333333333333);
the_value = ((the_value & 0xf0f0f0f0f0f0f0f0) >> 4) + (the_value & 0x0f0f0f0f0f0f0f0f);
the_value = ((the_value & 0xff00ff00ff00ff00) >> 8) + (the_value & 0x00ff00ff00ff00ff);
the_value = ((the_value & 0xffff0000ffff0000) >> 16) + (the_value & 0x0000ffff0000ffff);
the_value = ((the_value & 0xffffffff00000000) >> 32) + (the_value & 0x00000000ffffffff);
return the_value;
}
The number of 1 bits will be in the 64-bit value of the_value. If you divide the result by eight, you'll have the average number of 1 bits per byte for a 64-bit value (beware of shifting signed chars: the sign bit is replicated, so a loop that shifts until the value becomes zero will never stop for a negative number).
For 8 bit bytes, the algorithm reduces to:
uint8_t
no_of_1_bits(uint8_t the_value)
{
the_value = ((the_value & 0xaa) >> 1) + (the_value & 0x55);
the_value = ((the_value & 0xcc) >> 2) + (the_value & 0x33);
the_value = ((the_value & 0xf0) >> 4) + (the_value & 0x0f);
return the_value;
}
and again, the number of 1 bits is in the variable the_value.
The idea of this algorithm is to produce in the first step the sum of each pair of bits in a two bit accumulator (we shift the left bit of a pair to the right to align it with the right one, then we add them together, and in parallel for each pair of bits). As the accumulators are two bits, it is impossible to overflow (so there's never a carry from a pair of bits to the next, and we use the full integer as a series of two bit registers to add the sum)
Then we sum each pair of two-bit accumulators into an accumulator of four bits, and again that never overflows. We do the same thing with the nibbles we got and sum them into registers of 8 bits. If a 4-bit accumulator cannot overflow when adding two 2-bit sums, an 8-bit accumulator certainly cannot overflow when adding two 4-bit sums. Continue until you add the left half of the word to the right half. You finally end up with the sum of all bits in one register of the full word length.
Easy, isn't it? :)
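If you want to convince yourself that the pairwise sums really count bits, a small cross-check helps. The sketch below compares the 8-bit version with a plain loop over all 256 byte values; the names popcount8_parallel, popcount8_loop and count_mismatches are invented for this example.

```c
#include <stdint.h>

/* The 8-bit parallel sum from above, with explicit casts back to uint8_t. */
uint8_t popcount8_parallel(uint8_t v) {
    v = (uint8_t)(((v & 0xaa) >> 1) + (v & 0x55)); /* four 2-bit sums */
    v = (uint8_t)(((v & 0xcc) >> 2) + (v & 0x33)); /* two 4-bit sums */
    v = (uint8_t)(((v & 0xf0) >> 4) + (v & 0x0f)); /* one 8-bit sum */
    return v;
}

/* Straightforward bit-by-bit count used as the reference. */
uint8_t popcount8_loop(uint8_t v) {
    uint8_t n = 0;
    while (v != 0) {
        n = (uint8_t)(n + (v & 1));
        v >>= 1;
    }
    return n;
}

/* Number of byte values on which the two functions disagree. */
int count_mismatches(void) {
    int bad = 0;
    unsigned int i;
    for (i = 0; i < 256; i++) {
        if (popcount8_parallel((uint8_t)i) != popcount8_loop((uint8_t)i))
            bad++;
    }
    return bad;
}
```

As a hand trace, 0xb5 (10110101) becomes 0x65 after the first step (pair sums 1, 2, 1, 1), then 0x32 (nibble sums 3 and 2), then 5.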

Trouble in understanding the bitwise operators left and right shift function in programming [duplicate]

In ANSI C section 2.9, the bitwise operators, I am unable to understand this particular code.
I know how each bitwise operator works, but combinations need some help.
getbits(x, 4, 3)
unsigned getbits(unsigned x, int p, int n) {
return (x >> (p + 1 - n)) & ~(~0 << n);
}
~0 is an int made of binary ones (111...111111)
~0<<n introduces n zeros in the lower bits (111...111000).
~(~0<<n) flips the bits (000...000111)
x>>(p+1-n) shifts x towards the lower bits (00XXX...XXXXXX).
The & operation combines the previous two results: upper zeros are kept as zeros and the lower X bits (facing the ones) are kept as is (00000...000XXX).
Thus this function retrieves an n-bit pattern of x whose most significant bit is bit p, shifted (p+1-n) positions towards the lower bits (i.e., placed at the lowest position).
The function is supposed to extract a bitfield of width n at position p.
There are problems in this function:
p + 1 - n seems bogus but it is the number of bits to the right of the bitfield if p is the bit number of the most significant bit in the bitfield, numbered from 0 for the least significant bit.
the code is not well defined because 0 is a signed integer: ~0 is negative, and left-shifting a negative value has undefined behavior in C. 0U should be used instead.
the code does not work to extract a bitfield that has the full width of unsigned int, because shifting by a number of bits greater or equal to the width of the type has undefined behavior. The shift should be split in 2 parts, n - 1 bits and an additional 1 bit. n - 1 will be in the range [0..31] so the variable shift is fully defined.
Here is a more portable version:
// extract `n` bits whose most significant bit is at position `p` (bit 0 is the least significant). For a 32-bit unsigned: p in [0..31], n in [1..p+1]
unsigned getbits(unsigned x, int p, int n) {
return (x >> (p + 1 - n)) & ~(~0U << (n - 1) << 1);
}
Here are the steps:
0U is the unsigned int zero constant.
~0U has all its value bits set.
~0U << (n - 1) has all its value bits set except for the n - 1 low order bits, which are cleared.
~0U << (n - 1) << 1 has all its value bits set except for the n low order bits, which are cleared.
~(~0U << (n - 1) << 1) has the n low order bits set.
p + 1 - n is the number of bits with lower order than the bitfield.
x >> (p + 1 - n) shifts the value to the right, leaving the bitfield in the low order bit positions.
(x >> (p + 1 - n)) & ~(~0U << (n - 1) << 1) masks the higher order bits, leaving just the bitfield value.
Note that there are other ways to compute the mask:
~0U >> (sizeof(unsigned) * CHAR_BIT - n)
(1U << (n - 1) << 1) - 1
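As a worked example, here is the portable version applied to the call from the question, getbits(x, 4, 3), with an arbitrary input of my own choosing, 0xB5. This is only an illustration; the comments trace each intermediate value.

```c
#include <limits.h>

/* Portable version from above: n bits whose top bit is bit p. */
unsigned getbits(unsigned x, int p, int n) {
    return (x >> (p + 1 - n)) & ~(~0U << (n - 1) << 1);
}

/* Trace for getbits(0xB5, 4, 3):
     x                      = 1011 0101   (bits 4, 3, 2 are 1, 0, 1)
     p + 1 - n              = 2
     x >> 2                 = 0010 1101
     ~(~0U << 2 << 1)       = 0000 0111
     (x >> 2) & 0000 0111   = 0000 0101 = 5 */
unsigned getbits_example(void) {
    return getbits(0xB5u, 4, 3);
}
```

The split shift also covers the full-width case: with n equal to the number of bits in unsigned and p one less than that, the mask is all ones and x comes back unchanged.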

Reverse the order of bits in a bit array

I have a long sequence of bits stored in an array of unsigned long integers, like this
struct bit_array
{
int size; /* nr of bits */
unsigned long *array; /* the container that stores bits */
};
I am trying to design an algorithm to reverse the order of bits in *array. Problems:
size can be anything, i.e. not necessarily a multiple of 8 or 32 etc, so the first bit in the input array can end up at any position within the unsigned long in the output array;
the algorithm should be platform-independent, i.e. work for any sizeof(unsigned long).
Code, pseudocode, algo description etc. -- anything better than bruteforce ("bit by bit") approach is welcome.
My favorite solution is to fill a lookup-table that does bit-reversal on a single byte (hence 256 byte entries).
You apply the table to 1 to 4 bytes of the input operand, with a swap. If the size isn't a multiple of 8, you will need to adjust by a final right shift.
This scales well to larger integers.
Example:
11 10010011 00001010 -> 01010000 11001001 11000000 -> 01 01000011 00100111
To split the number into bytes portably, you need to use bitwise masking/shifts; mapping of a struct or array of bytes onto the integer can make it more efficient.
For brute performance, you can think of mapping up to 16 bits at a time, but this doesn't look quite reasonable.
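A minimal sketch of the lookup-table idea, restricted to a single 32-bit value so that it stays short. The names rev_table, init_rev_table, reverse32 and reverse_low_bits are invented for this example, and the table is filled at run time rather than hard-coded.

```c
#include <stdint.h>

/* rev_table[b] holds byte b with its 8 bits in reverse order. */
uint8_t rev_table[256];

/* Must be called once before the functions below are used. */
void init_rev_table(void) {
    unsigned int i, b;
    for (i = 0; i < 256; i++) {
        uint8_t r = 0;
        for (b = 0; b < 8; b++) {
            if (i & (1u << b))
                r |= (uint8_t)(0x80u >> b);
        }
        rev_table[i] = r;
    }
}

/* Reverse all 32 bits: reverse each byte and swap the byte order. */
uint32_t reverse32(uint32_t x) {
    return ((uint32_t)rev_table[x & 0xffu] << 24) |
           ((uint32_t)rev_table[(x >> 8) & 0xffu] << 16) |
           ((uint32_t)rev_table[(x >> 16) & 0xffu] << 8) |
           (uint32_t)rev_table[(x >> 24) & 0xffu];
}

/* Reverse only the low nbits bits (nbits in 1..32) by reversing the
   whole word and applying the final right shift mentioned above. */
uint32_t reverse_low_bits(uint32_t x, unsigned int nbits) {
    return reverse32(x) >> (32u - nbits);
}
```

For the 18-bit example above, the input is 0x3930A; after init_rev_table(), reverse_low_bits(0x3930A, 18) yields 0x14327, which is 01 01000011 00100111.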
I like the idea of lookup table. Still it's also a typical task for log(n) group bit tricks that may be very fast. Like:
unsigned long reverseOne(unsigned long x) {
x = ((x & 0xFFFFFFFF00000000) >> 32) | ((x & 0x00000000FFFFFFFF) << 32);
x = ((x & 0xFFFF0000FFFF0000) >> 16) | ((x & 0x0000FFFF0000FFFF) << 16);
x = ((x & 0xFF00FF00FF00FF00) >> 8) | ((x & 0x00FF00FF00FF00FF) << 8);
x = ((x & 0xF0F0F0F0F0F0F0F0) >> 4) | ((x & 0x0F0F0F0F0F0F0F0F) << 4);
x = ((x & 0xCCCCCCCCCCCCCCCC) >> 2) | ((x & 0x3333333333333333) << 2);
x = ((x & 0xAAAAAAAAAAAAAAAA) >> 1) | ((x & 0x5555555555555555) << 1);
return x;
}
The underlying idea is that when we aim to reverse the order of some sequence we may swap the head and tail halves of this sequence and then separately reverse each of halves (which is done here by applying the same procedure recursively to each half).
Here is a more portable version supporting unsigned long widths of 4,8,16 or 32 bytes.
#include <limits.h>
#define ones32 0xFFFFFFFFUL
#if (ULONG_MAX >> 128)
#define fill32(x) (x|(x<<32)|(x<<64)|(x<<96)|(x<<128)|(x<<160)|(x<<192)|(x<<224))
#define patt128 (ones32|(ones32<<32)|(ones32<<64) |(ones32<<96))
#define patt64 (ones32|(ones32<<32)|(ones32<<128)|(ones32<<160))
#define patt32 (ones32|(ones32<<64)|(ones32<<128)|(ones32<<192))
#else
#if (ULONG_MAX >> 64)
#define fill32(x) (x|(x<<32)|(x<<64)|(x<<96))
#define patt64 (ones32|(ones32<<32))
#define patt32 (ones32|(ones32<<64))
#else
#if (ULONG_MAX >> 32)
#define fill32(x) (x|(x<<32))
#define patt32 (ones32)
#else
#define fill32(x) (x)
#endif
#endif
#endif
unsigned long reverseOne(unsigned long x) {
#if (ULONG_MAX >> 32)
#if (ULONG_MAX >> 64)
#if (ULONG_MAX >> 128)
x = ((x & ~patt128) >> 128) | ((x & patt128) << 128);
#endif
x = ((x & ~patt64) >> 64) | ((x & patt64) << 64);
#endif
x = ((x & ~patt32) >> 32) | ((x & patt32) << 32);
#endif
x = ((x & fill32(0xffff0000UL)) >> 16) | ((x & fill32(0x0000ffffUL)) << 16);
x = ((x & fill32(0xff00ff00UL)) >> 8) | ((x & fill32(0x00ff00ffUL)) << 8);
x = ((x & fill32(0xf0f0f0f0UL)) >> 4) | ((x & fill32(0x0f0f0f0fUL)) << 4);
x = ((x & fill32(0xccccccccUL)) >> 2) | ((x & fill32(0x33333333UL)) << 2);
x = ((x & fill32(0xaaaaaaaaUL)) >> 1) | ((x & fill32(0x55555555UL)) << 1);
return x;
}
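As an alternative to the preprocessor ladder, the masks can be derived at run time. The sketch below is a loop-based variant of the same swap-halves idea; it assumes the width of unsigned long is a power of two with no padding bits, and the name reverse_bits_ul is made up for this example.

```c
#include <limits.h>

/* Reverse the bits of an unsigned long of any power-of-two width. */
unsigned long reverse_bits_ul(unsigned long x) {
    unsigned int width = (unsigned int)(sizeof x * CHAR_BIT);
    unsigned long mask = ~0UL;
    unsigned int s;

    for (s = width / 2; s > 0; s /= 2) {
        /* After this step, mask selects the low s bits of every
           2s-bit group, e.g. 0x0000FFFF, then 0x00FF00FF, ... */
        mask ^= mask << s;
        /* Swap the two s-bit halves of every 2s-bit group. */
        x = ((x >> s) & mask) | ((x & mask) << s);
    }
    return x;
}
```

The loop runs log2(width) times, so it performs the same number of swap steps as the unrolled versions, at the cost of computing each mask instead of hard-coding it.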
In a collection of related topics which can be found here, the bits of an individual array entry could be reversed as follows.
unsigned int v; // input bits to be reversed
unsigned int r = v; // r will be reversed bits of v; first get LSB of v
int s = sizeof(v) * CHAR_BIT - 1; // extra shift needed at end
for (v >>= 1; v; v >>= 1)
{
r <<= 1;
r |= v & 1;
s--;
}
r <<= s; // shift when v's highest bits are zero
The reversal of the entire array could be done afterwards by rearranging the individual positions.
You must define the order of bits within an unsigned long. You might assume that bit n corresponds to array[x] & (1UL << n), but this needs to be specified. If so, you need to handle the byte ordering (little or big endian) if you are going to access the array as bytes instead of unsigned long.
I would definitely implement brute force first and measure whether the speed is an issue. No need to waste time trying to optimize this if it is not used a lot on large arrays. An optimized version can be tricky to implement correctly. If you end up trying anyway, the brute force version can be used to verify correctness on test values and benchmark the speed of the optimized version.
The fact that the size is not multiple of sizeof(long) is the hardest part of the problem. This can result in a lot of bit shifting.
But, you don't have to do that if you can introduce new struct member:
struct bit_array
{
int size; /* nr of bits */
int offset; /* First bit position */
unsigned long *array; /* the container that stores bits */
};
Offset would tell you how many bits to ignore at the beginning of the array.
Then you only have to do the following steps:
Reverse array elements.
Swap bits of each element. There are many hacks for this in the other answers, but your compiler might also provide intrinsic functions to do it in fewer instructions (like the RBIT instruction on some ARM cores).
Calculate new starting offset. This is equal to unused bits the last element had.
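The three steps could be sketched as follows. The bit layout is an assumption on my part (logical bit i lives at position offset + i, counted from the least significant bit of array[0]), and word_count, get_bit, reverse_word and reverse_bit_array are names invented for this example.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

struct bit_array {
    int size;              /* nr of bits */
    int offset;            /* first bit position */
    unsigned long *array;  /* the container that stores bits */
};

/* Number of words needed to hold offset + size bits. */
size_t word_count(const struct bit_array *b) {
    return ((size_t)b->offset + (size_t)b->size + WORD_BITS - 1) / WORD_BITS;
}

/* Read logical bit i under the assumed layout. */
int get_bit(const struct bit_array *b, int i) {
    size_t pos = (size_t)b->offset + (size_t)i;
    return (int)((b->array[pos / WORD_BITS] >> (pos % WORD_BITS)) & 1UL);
}

/* Plain bit-by-bit word reversal; any faster method would do. */
unsigned long reverse_word(unsigned long x) {
    unsigned long r = 0;
    size_t i;
    for (i = 0; i < WORD_BITS; i++) {
        r = (r << 1) | (x & 1UL);
        x >>= 1;
    }
    return r;
}

void reverse_bit_array(struct bit_array *b) {
    size_t n = word_count(b);
    size_t lo = 0, hi = n;

    /* Steps 1 and 2: reverse the element order and the bits of each element. */
    while (lo < hi) {
        unsigned long first, last;
        hi--;
        first = reverse_word(b->array[lo]);
        last = reverse_word(b->array[hi]);
        b->array[lo] = last;
        b->array[hi] = first;
        lo++;
    }
    /* Step 3: the unused bits at the end become the new offset. */
    b->offset = (int)(n * WORD_BITS) - b->offset - b->size;
}
```

No bit is shifted across word boundaries; the only bookkeeping is the new offset, which is why every reader of the array has to honor offset, as get_bit does here.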
I would split the problem into two parts.
First, I would ignore the fact that the number of used bits is not a multiple of 32. I would use one of the given methods to swap around the whole array like that.
pseudocode:
for half the longs in the array:
    take the first longword;
    take the last longword;
    swap the bits in the first longword;
    swap the bits in the last longword;
    store the swapped first longword into the last location;
    store the swapped last longword into the first location;
and then fix up the fact that the first few bits (call that number n) are actually garbage bits from the end of the longs:
for all of the longs in the array:
    split the value in the leftmost n bits and the rest;
    store the leftmost n bits into the righthand part of the previous word;
    shift the rest bits to the left over n positions (making the rightmost n bits zero);
    store them back;
You could try to fold that into one pass over the whole array of course. Something like this:
for half the longs in the array:
    take the first longword;
    take the last longword;
    swap the bits in the first longword;
    swap the bits in the last longword;
    split both values in the leftmost n bits and the rest;
    for the new first longword:
        store the leftmost n bits into the righthand side of the previous word;
        store the remaining bits into the first longword, shifted left;
    for the new last longword:
        remember the leftmost n bits for the next iteration;
        store the remembered leftmost n bits, combined with the remaining bits, into the last longword;
    store the swapped first longword into the last location;
    store the swapped last longword into the first location;
I'm abstracting from the edge cases here (first and last longword), and you may need to reverse the shifting direction depending on how the bits are ordered inside each longword.

Bitwise OR of constants

While reading some documentation here, I came across this:
unsigned unitFlags = NSYearCalendarUnit | NSMonthCalendarUnit | NSDayCalendarUnit;
I have no idea how this works. I read up on the bitwise operators in C, but I do not understand how you can fit three (or more!) constants inside one int and later being able to somehow extract them back from the int? Digging further down the documentation, I also found this, which is probably related:
typedef enum {
kCFCalendarUnitEra = (1 << 1),
kCFCalendarUnitYear = (1 << 2),
kCFCalendarUnitMonth = (1 << 3),
kCFCalendarUnitDay = (1 << 4),
kCFCalendarUnitHour = (1 << 5),
kCFCalendarUnitMinute = (1 << 6),
kCFCalendarUnitSecond = (1 << 7),
kCFCalendarUnitWeek = (1 << 8),
kCFCalendarUnitWeekday = (1 << 9),
kCFCalendarUnitWeekdayOrdinal = (1 << 10),
} CFCalendarUnit;
How do the (1 << 3) statements / variables work? I'm sorry if this is trivial, but could someone please enlighten me by either explaining or maybe posting a link to a good explanation?
Basically, the constants are represented just by one bit, so if you have a 32 bit integer, you can fit 32 constants in it. Your constants have to be powers of two, so they take only one "set" bit to represent.
For example:
#define CONSTANT_1 0x01 // 0001 in binary
#define CONSTANT_2 0x02 // 0010 in binary
#define CONSTANT_3 0x04 // 0100 in binary
then you can do
int options = CONSTANT_1 | CONSTANT_3; // will have 0101 in binary.
As you can see, each bit represents that particular constant. So you can binary AND in your code and test for the presence of each constant, like:
if (options & CONSTANT_3)
{
// CONSTANT_3 is set
}
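The same constants also cover setting, clearing and testing several options at once. The helpers below are a small sketch; set_option, clear_option, has_option and has_all_options are names invented for this example.

```c
#define CONSTANT_1 0x01u  /* 0001 in binary */
#define CONSTANT_2 0x02u  /* 0010 in binary */
#define CONSTANT_3 0x04u  /* 0100 in binary */

/* Turn a flag on: OR sets the bit and leaves the others alone. */
unsigned set_option(unsigned options, unsigned flag) {
    return options | flag;
}

/* Turn a flag off: AND with the inverted flag clears only that bit. */
unsigned clear_option(unsigned options, unsigned flag) {
    return options & ~flag;
}

/* Non-zero if at least one bit of flag is set in options. */
int has_option(unsigned options, unsigned flag) {
    return (options & flag) != 0;
}

/* Non-zero only if every bit of mask is set in options. */
int has_all_options(unsigned options, unsigned mask) {
    return (options & mask) == mask;
}
```

Starting from 0, setting CONSTANT_1 and CONSTANT_3 gives 0101 in binary; clearing CONSTANT_3 afterwards leaves 0001.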
I recommend you to read about binary operations (they work like LOGICAL operators, but at the bit level), if you grok this stuff, it will make you a bit better of a programmer.
Cheers.
If you look at a number in binary, each digit is either on (1) or off (0).
You can use bitwise operators to set or interrogate the individual bits efficiently to see if they are set or not.
Take the 8 bit value 156. In binary this is 10011100.
The set bits correspond to bits 7,4,3, and 2 (values 128, 16, 8, 4). You can compute these values with 1 << (position) rather easily. So, 1 << 7 = 128.
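Going the other way, a value can be interrogated bit by bit to recover the positions that are set. The function below is a sketch of that; list_set_bits is a name invented for this example.

```c
/* Store the positions of the 1 bits of value, lowest first, into
   positions[] (at most max_positions entries). Returns how many
   positions were stored. */
int list_set_bits(unsigned value, int positions[], int max_positions) {
    int count = 0;
    int bit = 0;
    while (value != 0 && count < max_positions) {
        if (value & 1u)
            positions[count++] = bit;
        value >>= 1;
        bit++;
    }
    return count;
}
```

For 156 this stores the positions 2, 3, 4 and 7, matching (1 << 2) + (1 << 3) + (1 << 4) + (1 << 7) = 156.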
The number 1 is represented as 00000000000000000000000000000001
(1 << n) means shift the 1 in 1's representation n places to the left
So (1 << 3) would be 00000000000000000000000000001000
In one int you can have 32 options each of which can be turned on or off.
Option number n is on if the n'th bit is 1
1 << y is basically the same thing as 2 to the power of y
More generally, x << y is the same thing as x multiplied by 2 to the power of y.
In binary x << y means moving all the bits of x to the left by y places, adding zeroes in the place of the moved bits:
00010 << 2 = 01000
So:
1 << 1 = 2
1 << 2 = 4
1 << 3 = 8
...
<< is the shift left operator, it shifts the bits of the first operand left by the number of positions specified in the right operand (with zeros coming into the shifted positions from the right).
In your enum you end up with values that each have a different bit set to 1, so when you construct something like unitFlags, you can later find out which flags it contains by using the & operator, e.g. (unitFlags & NSMonthCalendarUnit) == NSMonthCalendarUnit will be true. The parentheses are needed because == binds more tightly than & in C.

Resources