Finding next bigger number with same number of set bits - c

I'm working on a problem where I'm given a number n, I have to find the next larger element with same number of set bits. While searching on Internet, I found an interesting piece of code which does this in few lines of code (BIT MAGIC) here:
unsigned nexthi_same_count_ones(unsigned a) {
/* works for any word length */
unsigned c = (a & -a);
unsigned r = a+c;
return (((r ^ a) >> 2) / c) | r);
}
But I want to understand the underlying logic about the algorithm that it will work always. All the boundary cases will be handled properly.
Can someone please explain the logic in simple steps.
Thanks

In the next higher number, the leftmost 1 of the rightmost run of 1s exchanges place with the 0 to its left, while the remaining 1s move to the far right.
The code isolates lowest 1,
adds it to a (making carries ripple through to the next higher 0, inverting all those bits)
The ex-or gets the least significant run of ones, extended one position to the left.
Shifting it right two positions takes its left boundary one position right of the original one
(leaving place for that one 0 from the high position),
dividing by the lowest 1 makes room for as many 0-bits more as there were on the right end of a.

Let say we have a bit pattern such as
111100111 - representing 487 in decimal
to generate the next highest integer whilst preserving the number of 0's and 1's as in the input we need to find the first 0 bit from the right of the input that is followed by 1, and we need to toggle this bit to a 1. We then need to reduce the number of 1's on the right of this flip point by 1 to compensate for the bit we switched from a 0 to 1.
our new bit pattern will become
111101011 - 491 in decimal (we have preserved the no of bits set and not set as per the input)
int getNextNumber(int input)
{
int flipPosition=0;
int trailingZeros=0;
int trailingOnes=0;
int copy = input;
//count trailing zeros
while(copy != 0 && (copy&1) == 0 )
{
++trailingZeros;
//test next bit
copy = copy >> 1;
}
//count trailing ones
while(copy != 0 && (copy&1) == 1 )
{
++trailingOnes;
//test next bit
copy = copy >> 1;
}
//if we have no 1's we cannot form another patter with the same number of 1's
//which will increment the input, or if we have leading consecutive
//zero's followed by consecutive 1's up to the maximum bit size of a int
//we cannot increase the input whilst preserving the no of 0's and
//1's in the original bit pattern
if(trailingZeros + trailingOnes == 0 || trailingZeros + trailingOnes == 31)
return -1;
//flip first 0 followed by a 1 found from the right of the bit pattern
flipPosition = trailingZeros + trailingOnes+1;
input |= 1<<(trailingZeros+trailingOnes);
//clear fields to the right of the flip position
int mask = ~0 << (trailingZeros+trailingOnes);
input &= mask;
//insert a bit pattern to the right of the flop position that will contain
//one less 1 to compensate for the bit we switched from 0 to 1
int insert = flipPosition-1;
input |= insert;
return input;
}

Related

left Rotate algorithm binary value in c

I'm trying to left rotate binary value:
int left_side=0;
for (int i = 0; i < n; i++)
{
left_side = left_side | ( ( number & ( 1<<BITS-i )) >> (BITS+i+1)-n );
}
BITS indicates the length of the binary value, n is the distance for rotation.
Example: 1000 , and n=1 which means the solution will be: 0001.
Some reason that I don't understand when I rotate it (from left to right side), lets take an example for number 253 which in binary sequence is 11111101 and n=3 (distance), the result from my code in binary sequence is 101 (which is 5).
Why the answer isn't 7? What I missed in my condition in this loop?
Thanks.
You want to rotate left your number n of a specific amount of bits.
Thus you have to shift your number to the left using n << amount and put the left bits to the right. This is done by putting the bits [0-amount[ to [NUMBER_BITS-amount,NUMBER_BITS[` using right shift.
For instance, if your number is an uint32_t you can use the following code (or easily adapt it to other types).
uint32_t RotateLeft(uint32_t n, int amount) {
return (n << amount)|(n >> (8*sizeof(uint32_t) - amount));
}
I think there are 2 approaches:
rotate by 1 bit n times
rotate by n % BITS bits only once
The rotation "wraps over" so we could just modulo every BITSth (8th) multiple of n to 0 in case someone would want to rotate 100 times. I think the second approach would be easier to understand, implement and read and why loop 100 times, if it can be done once?
Algorithm:
I will use 8 bits for demonstration, even though int is minimum 16 bits. The original MSB is marked by a dot.
.11110000 rl 2 == 110000.11
Well what has happened? 2 bits went right and the rest (BITS - 2) went left. That is just shift left and shift right "combined".
a = .11110000 << 2 == 110000.00
b = .11110000 >> (BITS - 2) == 000000.11
c = a | b
c == 110000.11
Easy, isn't it? Just remember to use n % BITS first and to use an unsigned type.
unsigned int rotateLeft(unsigned int number, int n) {
n %= BITS;
unsigned int left = number << n;
unsigned int right = number >> (BITS - n);
return left | right;
}

JPEG category encode bitwise operation

In the Cryx note about JPEG compression, categories are described as "the minimum size in bits in which we can keep that value". It goes on to say that values falling within certain ranges are put into categories. I have pasted a segment below but not the whole table.
Values Category Bits for the value
0 0 -
-1,1 1 0,1
-3,-2,2,3 2 00,01,10,11
-7,-6,-5,-4,4,5,6,7 3 000,001,010,011,100,101,110,111
The JPEG encoder/decoder found here performs the category encoding with a bitwise operation that I don't understand and I am hoping someone can clarify for me. RLE of 0's is done elsewhere, but this portion of the code breaks up the remaining pixel values into the categories as specified in the Cryx document.
In the code below, the variable code is the value of the YUV value of the pixel. In the while loop, if the conditions are met, i is decremented until the correct category is reached. For example, if the pixel value is 6.0, starting from category 15, i is decremented until 3 is reached. This is done with a bitwise operation that I do not understand. Can someone please clarify what condition is being tested for in the while loop? More specifically, !(absc & mask) is a Boolean, but I don't understand how this helps us to know the correct category.
The reason for the last if statement is also unclear to me. Thanks
unsigned absc = abs(*code);
unsigned mask = (1 << 15);
int i = 15;
if (absc == 0) { *size = 0; return; }
while (i && !(absc & mask)) { mask >>= 1; i--; }
*size = i + 1;
if (*code < 0) *code = (1 << *size) - absc - 1;
while here is used to find the most significant bit in code. Or in other words - length of code in bits.
The loop consequently applies mask to get next bit in code. First, mask is 1000000000000000 in binary form with 1 in 15th bit (zero based), the most valued bit in 2-byte (16 bit) number. Operator & (binary AND) zeros all bits in absc except one with 1 in mask. If result is zero than shift mask right (remove last binary digit) and repeat with next bit.
For value 6 = 110b (binary form) while will work till mask = 100b and i = 2. After it size will be set to 3.
If code was negative than the last line will convert it in one’s compliment representation with size length. Such encoding of negative numbers is described in your categories list.

How to get position of right most set bit in C

int a = 12;
for eg: binary of 12 is 1100 so answer should be 3 as 3rd bit from right is set.
I want the position of the last most set bit of a. Can anyone tell me how can I do so.
NOTE : I want position only, here I don't want to set or reset the bit. So it is not duplicate of any question on stackoverflow.
This answer Unset the rightmost set bit tells both how to get and unset rightmost set bit for an unsigned integer or signed integer represented as two's complement.
get rightmost set bit,
x & -x
// or
x & (~x + 1)
unset rightmost set bit,
x &= x - 1
// or
x -= x & -x // rhs is rightmost set bit
why it works
x: leading bits 1 all 0
~x: reversed leading bits 0 all 1
~x + 1 or -x: reversed leading bits 1 all 0
x & -x: all 0 1 all 0
eg, let x = 112, and choose 8-bit for simplicity, though the idea is same for all size of integer.
// example for get rightmost set bit
x: 01110000
~x: 10001111
-x or ~x + 1: 10010000
x & -x: 00010000
// example for unset rightmost set bit
x: 01110000
x-1: 01101111
x & (x-1): 01100000
Finding the (0-based) index of the least significant set bit is equivalent to counting how many trailing zeros a given integer has. Depending on your compiler there are builtin functions for this, for example gcc and clang support __builtin_ctz.
For MSVC you would need to implement your own version, this answer to a different question shows a solution making use of MSVC intrinsics.
Given that you are looking for the 1-based index, you simply need to add 1 to ctz's result in order to achieve what you want.
int a = 12;
int least_bit = __builtin_ctz(a) + 1; // least_bit = 3
Note that this operation is undefined if a == 0. Furthermore there exist __builtin_ctzl and __builtin_ctzll which you should use if you are working with long and long long instead of int.
One can use the property of 2s-complement here.
Fastest way to find 2s-complement of a number is to get the rightmost set bit and flip everything to the left of it.
For example: consider a 4 bit system
/* Number in binary */
4 = 0100
/* 2s complement of 4 */
complement = 1100
/* which nothing but */
complement == -4
/* Result */
4 & (-4) = 0100
Notice that there is only one set bit and its at rightmost set bit of 4.
Similarly we can generalise this for n.
n&(-n) will contain only one set bit which is actually at the rightmost set bit position of n.
Since there is only one set bit in n&(-n), it is a power of 2.
So finally we can get the bit position by:
log2(n&(-n))+1
The leftmost bit of n can be obtained using the formulae:
n & ~(n-1)
This works because when you calculate (n-1) .. you are actually making all the zeros till the rightmost bit to 1, and the rightmost bit to 0.
Then you take a NOT of it .. which leaves you with the following:
x= ~(bits from the original number) + (rightmost 1 bit) + trailing zeros
Now, if you do (n & x), you get what you need, as the only bit that is 1 in both n and x is the rightmost bit.
Phewwwww .. :sweat_smile:
http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/
helped me understand this.
There is a neat trick in Knuth 7.1.3 where you multiply by a "magic" number (found by a brute-force search) that maps the first few bits of the number to a unique value for each position of the rightmost bit, and then you can use a small lookup table. Here is an implementation of that trick for 32-bit values, adapted from the nlopt library (MIT/expat licensed).
/* Return position (0, 1, ...) of rightmost (least-significant) one bit in n.
*
* This code uses a 32-bit version of algorithm to find the rightmost
* one bit in Knuth, _The Art of Computer Programming_, volume 4A
* (draft fascicle), section 7.1.3, "Bitwise tricks and
* techniques."
*
* Assumes n has a 1 bit, i.e. n != 0
*
*/
static unsigned rightone32(uint32_t n)
{
const uint32_t a = 0x05f66a47; /* magic number, found by brute force */
static const unsigned decode[32] = { 0, 1, 2, 26, 23, 3, 15, 27, 24, 21, 19, 4, 12, 16, 28, 6, 31, 25, 22, 14, 20, 18, 11, 5, 30, 13, 17, 10, 29, 9, 8, 7 };
n = a * (n & (-n));
return decode[n >> 27];
}
Try this
int set_bit = n ^ (n&(n-1));
Explanation:
As noted in this answer, n&(n-1) unsets the last set bit.
So, if we unset the last set bit and xor it with the number; by the nature of the xor operation, the last set bit will become 1 and the rest of the bits will return 0
1- Subtract 1 form number: (a-1)
2- Take it's negation : ~(a-1)
3- Take 'AND' operation with original number:
int last_set_bit = a & ~(a-1)
The reason behind subtraction is, when you take negation it set its last bit 1, so when take 'AND' it gives last set bit.
Check if a & 1 is 0. If so, shift right by one until it's not zero. The number of times you shift is how many bits from the right is the rightmost bit that is set.
You can find the position of rightmost set bit by doing bitwise xor of n and (n&(n-1) )
int pos = n ^ (n&(n-1));
I inherited this one, with a note that it came from HAKMEM (try it out here). It works on both signed and unsigned integers, logical or arithmetic right shift. It's also pretty efficient.
#include <stdio.h>
int rightmost1(int n) {
int pos, temp;
for (pos = 0, temp = ~n & (n - 1); temp > 0; temp >>= 1, ++pos);
return pos;
}
int main()
{
int pos = rightmost1(16);
printf("%d", pos);
}
You must check all 32 bits starting at index 0 and working your way to the left. If you can bitwise-and your a with a one bit at that position and get a non-zero value back, it means the bit is set.
#include <limits.h>
int last_set_pos(int a) {
for (int i = 0; i < sizeof a * CHAR_BIT; ++i) {
if (a & (0x1 << i)) return i;
}
return -1; // a == 0
}
On typical systems int will be 32 bits, but doing sizeof a * CHAR_BIT will get you the right number of bits in a even if it's a different size
Accourding to dbush's solution, Try this:
int rightMostSet(int a){
if (!a) return -1; //means there isn't any 1-bit
int i=0;
while(a&1==0){
i++;
a>>1;
}
return i;
}
return log2(((num-1)^num)+1);
explanation with example: 12 - 1100
num-1 = 11 = 1011
num^ (num-1) = 12^11 = 7 (111)
num^ (num-1))+1 = 8 (1000)
log2(1000) = 3 (answer).
x & ~(x-1) isolates the lowest bit that is one.
int main(int argc, char **argv)
{
int setbit;
unsigned long d;
unsigned long n1;
unsigned long n = 0xFFF7;
double nlog2 = log(2);
while(n)
{
n1 = (unsigned long)n & (unsigned long)(n -1);
d = n - n1;
n = n1;
setbit = log(d) / nlog2;
printf("Set bit: %d\n", setbit);
}
return 0;
}
And the result is as below.
Set bit: 0
Set bit: 1
Set bit: 2
Set bit: 4
Set bit: 5
Set bit: 6
Set bit: 7
Set bit: 8
Set bit: 9
Set bit: 10
Set bit: 11
Set bit: 12
Set bit: 13
Set bit: 14
Set bit: 15
Let x be your integer input.
Bitwise AND by 1.
If it's even ie 0, 0&1 returns you 0.
If it's odd ie 1, 1&1 returns you 1.
if ( (x & 1) == 0) )
{
std::cout << "The rightmost bit is 0 ie even \n";
}
else
{
std::cout<< "The rightmost bit is 1 ie odd \n";
}```
Alright, so number systems is just working with logarithms and exponents. So I'll dive down into an approach that really makes sense to me.
I would prefer you read this because I write there about how I interpret logarithms as.
When you perform the x & -x operation, it gives you the value which has the right most bit as 1 (for example, it can be 0001000 or 0000010. Now according to how I interpret logarithms as, this value of the right most set bit, is the final value after I grow at the rate of 2. Now we are interested in finding the number of digits in this answer because whatever that is, if you subtract 1 from it, that is precisely the bit-count of set bit (bit count begins with 0 here and the digit count begins with 1, so yeah). But the number of digits is precisely the time you expanded for + 1 (in accordance with my logic) or just the formula I mentioned in the previous link. But now, as we don't really need the digits, but need the bit count, and we also don't have to worry about values of bits which potentially can be real (if the number is 65) because the number is always some multiple of 2 (except 1). So if you just take the logarithm of the value x & -x, we get the bit count! I did see an answer before that mentioned this, but diving down to why it really works was something I felt like writing down.
P.S: You could also count the number of digits and then subtract 1 from it to get the bit-count.

How to store two different things in one byte and then access them again?

I am trying to learn C for my class. One thing I need to know is given an array, I have to take information from two characters and store it in one bytes. For eg. if string is "A1B3C5" then I have to store A = 001 in higher 3bits and then store 1 in lower 5bits. I have to function that can get two chars from array at a time and print it here is that function,
void print2(char string[])
{
int i = 0;
int length = 0;
char char1, char2;
length = strlen(string);
for ( i = 0; i <length; i= i + 2)
{
char1 = string[i];
char2 = string[i+1];
printf("%c, %c\n", char1, char2);
}
}
but now i am not sure how to get it encoded and then decode again. Can anyone help me please?
Assuming an ASCII character set, subtract '#' from the letter and shift left five bits, then subtract '0' from the character representing the digit and add it to the first part.
So you've got a byte, and you want the following bit layout:
76543210
AAABBBBB
To store A, you would do:
unsigned char result;
int input_a = somevalue;
result &= 0x1F; // Clear the upper 3 bits.
// Store "A": make sure only the lower 3 bits of input_a are used,
// Then shift it by 5 positions. Finally, store it by OR'ing.
result |= (char)((input_a & 7) << 5);
To read it:
// Simply shift the byte by five positions.
int output_a = (result >> 5);
To store B, you would do:
int input_b = yetanothervalue;
result &= 0xE0; // Clear the lower 5 bits.
// Store "B": make sure only the lower 5 bits of input_b are used,
// then store them by OR'ing.
result |= (char)(input_b & 0x1F);
To read it:
// Simply get the lower 5 bits.
int output_b = (result & 0x1F);
You may want to read about the boolean operations AND and OR, bit shifting and finally bit masks.
First of all, one bit can only represent two states: 0 and 1, or TRUE and FALSE. What you mean is a Byte, which consists of 8 bits and can thus represent 2^8 states.
Two put two values in one byte, use logical OR (|) and bitwise shift (<< and >>).
I don't post the code here since you should learn this stuff - it's really important to know what bits and bytes are and how to work with them. But feel free to ask follow up question if something is not clear to you.

How do the bit manipulations in this bit-sorting code work?

Jon Bentley in Column 1 of his book programming pearls introduces a technique for sorting a sequence of non-zero positive integers using bit vectors.
I have taken the program bitsort.c from here and pasted it below:
/* Copyright (C) 1999 Lucent Technologies */
/* From 'Programming Pearls' by Jon Bentley */
/* bitsort.c -- bitmap sort from Column 1
* Sort distinct integers in the range [0..N-1]
*/
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i)
{
int sh = i>>SHIFT;
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/*Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
I understand what the functions clr, set and test are doing and explain them below: ( please correct me if I am wrong here ).
clr clears the ith bit
set sets the ith bit
test returns the value at the ith bit
Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.
The first 3 constants are inter-related. BITSPERWORD is 32. This you'd want to set based on your compiler+architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F which is 11111 in binary (ie: the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.
The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:
which int (of the array) is it in
which of that int's bits are we talking about
Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).
Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.
Basically is a bucket sort optimized:
reserve a bit array of length n
bits.
clear the bit array (first for in main).
read the items one by one (they must all be distinct).
set the i'th bit in the bit array if the read number is i.
iterate the bit array.
if the bit is set then print the position.
Or in other words (for N < 10 and to sort 3 numbers 4, 6, 2) 0
start with an empty 10 bit array (aka one integer usually)
0000000000
read 4 and set the bit in the array..
0000100000
read 6 and set the bit in the array
0000101000
read 2 and set the bit in the array
0010101000
iterate the array and print every position in which the bits are set to one.
2, 4, 6
sorted.
Starting with set():
A right shift of 5 is the same as dividing by 32. It does that to find which int the bit is in.
MASK is 0x1f or 31. ANDing with the address gives the bit index within the int. It's the same as the remainder of dividing the address by 32.
Shifting 1 left by the bit index ("1<<(i & MASK)") results in an integer which has just 1 bit in the given position set.
ORing sets the bit.
The line "int sh = i>>SHIFT;" is a wasted line, because they didn't use sh again beneath it, and instead just repeated "i>>SHIFT"
clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.
The bitsort will also remove duplicates from the list, because it will only count up to 1 per integer. A sort that uses integers instead of bits to count more than 1 of each is called a radix sort.
The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.
If you try understand this (note: I rather use bits-per-row than bits-per-word, since we're talking about a bit-matrix here):
// supposing an int of 1 bit would exist...
int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements
// set bit at x,y:
int linear_address = y*BITSPERWORD + x;
bits + linear_address = 1; // or 0
// 0 1 2 3 4 5 6 7 8 9 10 11 ... 31
// . . . . . . . . . . . . .
// . . . . X . . . . . . . . -> x = 4, y = 1 => i = (1*32 + 4)
The statement linear_address = y*BITSPERWORD + x also means that x = linear_address % BITSPERWORD and y = linear_address / BITSPERWORD.
When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using
int bitrow = 0;
bitrow |= 1 << (x);
Now when we iterate over the bits, we have the linear address, but need to find the corresponding word.
int column = linear_address % BITSPERROW;
int bit_mask = 1 << column; // meaning for the xth column,
// you take 1 and shift that bit x times
int row = linear_address / BITSPERROW;
So to set the i'th bit, you can do this:
bits[ i%BITSPERROW ] |= 1 << (linear_address / BITSPERROW );
An extra gotcha is, that the modulo operator can be replaced by a logical AND, and the / operator can be replaced by a shift, too, if the second operand is a power of two.
a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a & SHIFT
This ultimately boils down to the very dense, yet hard-to-understand-for-the-bitfucker-agnostic notation
a[ i >> SHIFT ] |= ( 1 << (i&MASK) );
But I don't see the algorithm working for e.g. 40 bits per word.
Quoting the excerpts from Bentleys' original article in DDJ, this is what the code does at a high level:
/* phase 1: initialize set to empty */
for (i = 0; i < n; i++)
bit[i] = 0
/* phase 2: insert present elements */
for each i in the input file
bit[i] = 1
/* phase 3: write sorted output */
for (i = 0; i < n; i++)
if bit[i] == 1
write i on the output file
A few doubts :
1. Why is it a need for a 32 bit ?
2. Can we do this in Java by creating a HashMap with Keys from 0000000 to 9999999
and values 0 or 1 based on the presence/absence of the bit ? What are the implications
for such a program ?

Resources