Understanding Bitwise Operations in This Function - c

For the sake of simplicity, let's just assume the integer I am passing to this function is 9 which is 1001 in binary.
It has been my goal for a while now to write my own integer to binary function in C. The way I used to figure out the binary value of a number by hand was as follows (using 9 as mentioned above):
9 / 2 = 4 (remainder 1) = 1
4 / 2 = 2 (no remainder) = 0
2 / 2 = 1 (no remainder) = 0
1 / 2 = 0 (remainder 1) = 1
So if you reverse the 1 0 0 1 you get the binary value of 9, which is still 1 0 0 1.
But then after looking over this site I found that the binary value of an integer can be found with some "simple" bitwise arithmetic. I found a function on another post on this site and adapted it into a function of my own:
#include <stdio.h>
#include <stdlib.h>

char *itob(int integer)
{
    char *bin = 0x00, *tmp;
    int bff = 0;
    while(integer)
    {
        /* bff + 2: room for the new digit plus the terminating null */
        if(!(tmp = realloc(bin, bff + 2)))
        {
            free(bin);
            printf("\nError! Memory allocation failed while building binary string.");
            return 0x00;
        }
        bin = tmp;
        if(integer & 1) bin[bff++] = '1';
        else bin[bff++] = '0';
        integer >>= 1;
    }
    if(bin) bin[bff] = 0x00; /* bin is still NULL when integer was 0 */
    return bin;
}
Here is how I understand what is going on as well as my questions (that appear as comments)
1001 & 1 = 1 so put a 1 into the buffer //what is & doing that makes it equate to 1? Is it because the first digit in that sequence is a 1?
shift the bits in 1001 to the right one time
0010 & 1 != 1 so move a 0 into the buffer //same question as before is & just looking at the 0 because it is the first digit in the sequence?
shift the bits in 0010 to the right one time
0100 & 1 != 1 so move a 0 into the buffer //same question as before
shift the bits in 0100 to the right one time
1000 & 1 = 1 so put a 1 into the buffer //same question as before (at this point I'm thinking my theory is correct but I'm still not entirely sure)
shift the bits in 1000 to the right one time
loop ends
So as mentioned in my comments this is what I believe is going on in my program but I'm not 100% sure. Also I'm not sure if this is the best way to be even converting decimal to binary. (I'm already aware that if integer were for whatever reason to be a 0 I would eventually be trying to dereference a NULL pointer when trying to free the memory allocated by itob() along with a few other hiccups) But besides the questions that I already asked earlier is there a better method or more appropriate way to do this conversion?

No, the sequence of tests and shifts is
1001 & 1 => 1 then shift right
100 & 1 => 0 "
10 & 1 => 0 "
1 & 1 => 1 "
The resulting integer 0 makes the loop terminate. So what this does is test each bit starting with the least significant bit and append a 0 or 1 in the buffer. That I'd say is backwards because when printed as a string the bit sequence is reversed from the one used most often, where the least significant bit is the rightmost one.

That seems like correct reasoning.
The only thing is that the function above gives the binary result back in reverse, which is possibly not wanted.
You will not spot this with the number 9 (1001), as its binary representation is the same both ways, but you will with the number 4 (0100).
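The reversal described above can be undone by collecting the bits least significant first and then reversing the buffer. The following is only a sketch under assumptions: to_binary is a hypothetical name that does not appear in the thread, and it takes an unsigned int so that the right shift is well defined.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): returns a malloc'd string with
 * the most significant bit first and no leading zeros. The caller frees it. */
char *to_binary(unsigned int value)
{
    size_t max_bits = CHAR_BIT * sizeof value;
    char *out = malloc(max_bits + 1);      /* worst case plus terminator */
    size_t len = 0;

    if (!out)
        return NULL;

    /* Collect bits least significant first, as itob() does. */
    do {
        out[len++] = (value & 1u) ? '1' : '0';
        value >>= 1;
    } while (value);
    out[len] = '\0';

    /* Reverse in place so the string reads most significant bit first. */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = out[i];
        out[i] = out[j];
        out[j] = tmp;
    }
    return out;
}
```

With this sketch, 4 comes back as "100" rather than "001", and 0 comes back as "0" instead of a NULL pointer.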

Modelled after the one in my link. Untested, but should be all right.
#include <limits.h>
#include <stdlib.h>

char *bit2str(unsigned int num)
{
    unsigned int bit, pos;
    char *dst;

    dst = malloc(1 + CHAR_BIT * sizeof bit);
    if (!dst) return NULL;
    /* start at the most significant bit and walk down to bit 0 */
    for (pos = 0, bit = 1u << (CHAR_BIT * sizeof bit - 1); bit; bit >>= 1) {
        dst[pos++] = num & bit ? '1' : '0';
    }
    dst[pos] = 0;
    return dst;
}

Related

How to extract bits from a number in C?

I need to extract a specific part (a number of bits) of a short data type in C.
For example, I have 45 in binary as 101101 and I just want 2 bits in the middle, such as (10).
I started with C 2 days ago, so please don't give me a lot of functions.
How do I extract them?
Please search for bit-wise operations for more general information, and bit masking for your specific question. I wouldn't recommend to jump to bits if you are new to programming though.
The solution will change slightly depending on whether your input has a fixed length. If it doesn't, you need to arrange your mask accordingly. Or you can use a different method, but this is probably the simplest way.
In order to get specific bits that you want, you can use bitmasking.
E.g you have 101101 and you want those middle two bits, if you & this with 001100, only bits that are 1 on the mask will remain unchanged in the source, all the other bits will be set to 0. Effectively, you will have those bits that you are interested in.
If you don't know what & (bitwise and) is, it takes two operands, and returns 1 only if first AND second operands are 1, returns 0 otherwise.
input : 1 0 1 1 0 1
mask : 0 0 1 1 0 0
result : 0 0 1 1 0 0
As C syntax, we can do this like:
unsigned int input = 45;
unsigned int mask = 0b001100; // binary literals are a GCC/Clang extension, standard only since C23
// or, equivalently and portably:
// unsigned int mask = 12;
unsigned int result = input & mask; // result contains ...001100
As you can see, we filtered the bits we wanted. The next step depends on what you want to do with those bits.
At this point, the result 001100 corresponds to 12. I assume this is not really useful. What you can do is move those bits around. In order to get rid of the 0s at the right, we can shift it 2 bits to the right. For this, we need to use the >> operator.
0 0 1 1 0 0 >> 2 ≡ 0 0 0 0 1 1
result = result >> 2; // result contains ...011
From there, you can set a bool variable to store each of them being 1 or 0.
unsigned char flag1 = result & 0b01; // or just 1
unsigned char flag2 = result & 0b10; // or just 2
You could do this without shifting at all but this way it's more clear.
You need to mask the bits that you want to extract. Suppose you want to create a mask with the lowest 4 bits set. Then you can do that by using:
(1 << 4) - 1
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
void print_bin(short n)
{
unsigned long i = CHAR_BIT * sizeof(n);
while(i--)
putchar('0' + ((n >> i) & 1));
printf("\n");
}
int main()
{
short num = 45; /* Binary 101101 */
short mask = 4; /* 4 bits */
short start = 0; /* Start from the rightmost (least significant) bit, position 0 */
print_bin((num >> start) & ((1 << mask) - 1)); /* Prints 1101 */
mask = 2; /* 2 bits */
start = 1; /* start from bit indexed at position 1 */
print_bin((num >> start) & ((1 << mask) - 1)); /* Prints 10 */
return 0;
}
Output:
0000000000001101
0000000000000010
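The shift-then-mask expression used above can be wrapped in a small function. This is a minimal sketch, assuming bit 0 is the least significant bit; extract_bits and its parameter names are hypothetical and not part of either answer.

```c
#include <limits.h>

/* Hypothetical helper: returns `count` bits of `value`, starting at bit
 * `start` (bit 0 is the least significant bit). */
unsigned int extract_bits(unsigned int value, unsigned int start, unsigned int count)
{
    unsigned int width = CHAR_BIT * sizeof value;

    if (count == 0 || start >= width)
        return 0;
    if (count >= width)            /* 1u << width would be undefined */
        return value >> start;
    return (value >> start) & ((1u << count) - 1u);
}
```

For 45 (101101), extract_bits(45, 1, 2) gives binary 10 and extract_bits(45, 2, 2) gives binary 11. Using unsigned operands avoids the implementation-defined behaviour of right-shifting negative values.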

How to get position of right most set bit in C

int a = 12;
For example, the binary of 12 is 1100, so the answer should be 3, as the 3rd bit from the right is set.
I want the position of the rightmost set bit of a. Can anyone tell me how I can do so?
NOTE: I want the position only; I don't want to set or reset the bit. So it is not a duplicate of any other question on Stack Overflow.
This answer Unset the rightmost set bit tells both how to get and unset rightmost set bit for an unsigned integer or signed integer represented as two's complement.
get rightmost set bit,
x & -x
// or
x & (~x + 1)
unset rightmost set bit,
x &= x - 1
// or
x -= x & -x // rhs is rightmost set bit
why it works
x:             leading bits            1   all 0
~x:            reversed leading bits   0   all 1
~x + 1 or -x:  reversed leading bits   1   all 0
x & -x:        all 0                   1   all 0
eg, let x = 112, and choose 8-bit for simplicity, though the idea is same for all size of integer.
// example for get rightmost set bit
x: 01110000
~x: 10001111
-x or ~x + 1: 10010000
x & -x: 00010000
// example for unset rightmost set bit
x: 01110000
x-1: 01101111
x & (x-1): 01100000
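The two identities above can be checked with a couple of one-line functions. This is a sketch only: the function names are hypothetical, and unsigned arithmetic is used so that the negation is well defined for every input.

```c
/* Isolate the rightmost set bit: the result has exactly that one bit set
 * (or is 0 when x is 0). */
unsigned int lowest_set_bit(unsigned int x)
{
    return x & (0u - x);   /* 0u - x is the unsigned form of -x */
}

/* Clear the rightmost set bit and leave all other bits unchanged. */
unsigned int clear_lowest_set_bit(unsigned int x)
{
    return x & (x - 1u);
}
```

For x = 112 (01110000) these return 16 (00010000) and 96 (01100000), matching the worked example.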
Finding the (0-based) index of the least significant set bit is equivalent to counting how many trailing zeros a given integer has. Depending on your compiler there are builtin functions for this, for example gcc and clang support __builtin_ctz.
For MSVC you would need to implement your own version, this answer to a different question shows a solution making use of MSVC intrinsics.
Given that you are looking for the 1-based index, you simply need to add 1 to ctz's result in order to achieve what you want.
int a = 12;
int least_bit = __builtin_ctz(a) + 1; // least_bit = 3
Note that this operation is undefined if a == 0. Furthermore there exist __builtin_ctzl and __builtin_ctzll which you should use if you are working with long and long long instead of int.
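Because the builtin is undefined for zero and is not available on every compiler, one option is a thin wrapper with a portable fallback. The sketch below makes assumptions: rightmost_set_position is a hypothetical name, and returning 0 for "no bit set" is a convention chosen here, not something the answer specifies.

```c
/* Hypothetical wrapper: 1-based position of the rightmost set bit, or 0 if
 * no bit is set. */
int rightmost_set_position(unsigned int x)
{
    if (x == 0)
        return 0;                      /* __builtin_ctz(0) is undefined */
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctz(x) + 1;
#else
    int pos = 1;
    while ((x & 1u) == 0) {            /* portable fallback: shift and count */
        x >>= 1;
        pos++;
    }
    return pos;
#endif
}
```

For a = 12 this returns 3, the same value as the snippet above.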
One can use the property of 2s-complement here.
Fastest way to find 2s-complement of a number is to get the rightmost set bit and flip everything to the left of it.
For example: consider a 4 bit system
/* Number in binary */
4 = 0100
/* 2s complement of 4 */
complement = 1100
/* which nothing but */
complement == -4
/* Result */
4 & (-4) = 0100
Notice that there is only one set bit and its at rightmost set bit of 4.
Similarly we can generalise this for n.
n&(-n) will contain only one set bit which is actually at the rightmost set bit position of n.
Since there is only one set bit in n&(-n), it is a power of 2.
So finally we can get the bit position by:
log2(n&(-n))+1
The rightmost set bit of n can be obtained using the formula:
n & ~(n-1)
This works because when you calculate (n-1), you turn all the zeros below the rightmost set bit into 1s, and the rightmost set bit into 0.
Then you take a NOT of it .. which leaves you with the following:
x= ~(bits from the original number) + (rightmost 1 bit) + trailing zeros
Now, if you do (n & x), you get what you need, as the only bit that is 1 in both n and x is the rightmost bit.
Phewwwww .. :sweat_smile:
http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/
helped me understand this.
There is a neat trick in Knuth 7.1.3 where you multiply by a "magic" number (found by a brute-force search) that maps the first few bits of the number to a unique value for each position of the rightmost bit, and then you can use a small lookup table. Here is an implementation of that trick for 32-bit values, adapted from the nlopt library (MIT/expat licensed).
/* Return position (0, 1, ...) of rightmost (least-significant) one bit in n.
*
* This code uses a 32-bit version of algorithm to find the rightmost
* one bit in Knuth, _The Art of Computer Programming_, volume 4A
* (draft fascicle), section 7.1.3, "Bitwise tricks and
* techniques."
*
* Assumes n has a 1 bit, i.e. n != 0
*
*/
static unsigned rightone32(uint32_t n)
{
const uint32_t a = 0x05f66a47; /* magic number, found by brute force */
static const unsigned decode[32] = { 0, 1, 2, 26, 23, 3, 15, 27, 24, 21, 19, 4, 12, 16, 28, 6, 31, 25, 22, 14, 20, 18, 11, 5, 30, 13, 17, 10, 29, 9, 8, 7 };
n = a * (n & (-n));
return decode[n >> 27];
}
Try this
int set_bit = n ^ (n&(n-1));
Explanation:
As noted in this answer, n&(n-1) unsets the last set bit.
So, if we unset the last set bit and xor it with the number; by the nature of the xor operation, the last set bit will become 1 and the rest of the bits will return 0
1- Subtract 1 from the number: (a-1)
2- Take its bitwise negation: ~(a-1)
3- AND the result with the original number:
int last_set_bit = a & ~(a-1)
The reason behind the subtraction is that a-1 clears the last set bit and sets every bit below it. Negating that leaves the last set bit as the only bit shared with a, so the AND yields just that bit.
Check if a & 1 is 0. If so, shift right by one until it's not zero. The number of times you shift is how many bits from the right is the rightmost bit that is set.
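You can isolate the rightmost set bit by doing a bitwise xor of n and (n&(n-1)). Note that this yields the bit's value (a power of two), not its index.
int pos = n ^ (n&(n-1)); // e.g. n = 12 gives 4, i.e. bit index 2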
I inherited this one, with a note that it came from HAKMEM (try it out here). It works on both signed and unsigned integers, logical or arithmetic right shift. It's also pretty efficient.
#include <stdio.h>
int rightmost1(int n) {
int pos, temp;
for (pos = 0, temp = ~n & (n - 1); temp > 0; temp >>= 1, ++pos);
return pos;
}
int main()
{
int pos = rightmost1(16);
printf("%d", pos);
}
You must check all 32 bits starting at index 0 and working your way to the left. If you can bitwise-and your a with a one bit at that position and get a non-zero value back, it means the bit is set.
#include <limits.h>
int last_set_pos(int a) {
    for (unsigned i = 0; i < sizeof a * CHAR_BIT; ++i) {
        // unsigned mask: shifting a signed 1 into the sign bit is undefined
        if ((unsigned)a & (1u << i)) return (int)i;
    }
    return -1; // a == 0
}
On typical systems int will be 32 bits, but doing sizeof a * CHAR_BIT will get you the right number of bits in a even if it's a different size
According to dbush's solution, try this:
int rightMostSet(int a){
    if (!a) return -1; //means there isn't any 1-bit
    int i=0;
    while((a&1)==0){ // parentheses needed: == binds tighter than &
        i++;
        a>>=1; // a>>1 on its own would discard the result
    }
    return i;
}
return log2(((num-1)^num)+1);
explanation with example: 12 - 1100
num-1 = 11 = 1011
num ^ (num-1) = 12 ^ 11 = 7 (111)
(num ^ (num-1)) + 1 = 8 (1000)
log2(1000) = 3 (answer).
x & ~(x-1) isolates the lowest bit that is one.
#include <math.h>
#include <stdio.h>
int main(int argc, char **argv)
{
int setbit;
unsigned long d;
unsigned long n1;
unsigned long n = 0xFFF7;
double nlog2 = log(2);
while(n)
{
n1 = (unsigned long)n & (unsigned long)(n -1);
d = n - n1;
n = n1;
setbit = log(d) / nlog2;
printf("Set bit: %d\n", setbit);
}
return 0;
}
And the result is as below.
Set bit: 0
Set bit: 1
Set bit: 2
Set bit: 4
Set bit: 5
Set bit: 6
Set bit: 7
Set bit: 8
Set bit: 9
Set bit: 10
Set bit: 11
Set bit: 12
Set bit: 13
Set bit: 14
Set bit: 15
Let x be your integer input.
Bitwise AND by 1.
If it's even ie 0, 0&1 returns you 0.
If it's odd ie 1, 1&1 returns you 1.
if ((x & 1) == 0)
{
std::cout << "The rightmost bit is 0 ie even \n";
}
else
{
std::cout << "The rightmost bit is 1 ie odd \n";
}
Alright, so number systems are just a matter of working with logarithms and exponents. So I'll dive down into an approach that really makes sense to me.
I would prefer you read this because I write there about how I interpret logarithms.
When you perform the x & -x operation, it gives you a value in which only the rightmost set bit of x is 1 (for example, 0001000 or 0000010). According to how I interpret logarithms, this value is the final value after growing at a rate of 2. We are interested in the number of binary digits in this value, because if you subtract 1 from it, you get precisely the bit index of the set bit (the bit index begins at 0 and the digit count begins at 1). The number of digits is precisely the time you grew for, plus 1 (in accordance with my logic), which is just the formula I mentioned in the previous link. Since we need the bit index rather than the digit count, and since the value is always an exact power of 2 (so the logarithm cannot be fractional, as it would be for a number like 65), taking the base-2 logarithm of x & -x gives the bit index directly. I did see an answer before that mentioned this, but diving down into why it really works was something I felt like writing down.
P.S: You could also count the number of digits and then subtract 1 from it to get the bit-count.

How to compare two bit values in C?

I've been dabbling around a bit with C and I find that being able to directly manipulate bits is fascinating and powerful (and dangerous, I suppose). I was curious as to what the best way to compare different bits in C would be. For instance, the number 15 is represented in binary as:
00001111
And the number 13 is represented as:
00001101
How would you compare what bits are different without counting them? It would be easy to use shifts to determine that 15 contains 4 1s and 13 contains 3 1s, but how would you output the difference between the two (ex that the 2^1 spot is different between the two)? I just can't think of an easy way to do this. Any pointers would be much appreciated!
EDIT: I should have clarified that I know XOR is the right way to go about this problem, but I had an issue with implementation. I guess my issue was comparing one bit at a time (and not generating the difference per se). The solution I came up with is:
#include <limits.h>
#include <stdio.h>
void compare(int vector1, int vector2) {
    int count = 0;
    int bit_length = CHAR_BIT * sizeof vector1; // number of bits in an int
    unsigned int xor = vector1 ^ vector2;
    while (count < bit_length) {
        if (xor % 2 == 1) { // would indicate a difference
            printf("%d ", count);
        }
        xor >>= 1;
        count++;
    }
}
Use bitwise operations:
c = a ^ b ;
00000010b = 00001111b ^ 00001101b;
What ^, or XOR, does is:
0 ^ 0 = 0
1 ^ 0 = 1
0 ^ 1 = 1
1 ^ 1 = 0
One way of thinking about it would be:
If the two operands (a and b) are different, the result is 1.
If they are equal, the result is 0.
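Since XOR leaves a 1 exactly where the operands differ, walking over the XOR result reports every differing position. The following is a minimal sketch; print_differing_bits is a hypothetical name, and returning the number of differing bits is an addition made here for convenience.

```c
#include <stdio.h>

/* Hypothetical helper: prints the index of every bit where a and b differ
 * and returns how many such bits there are. */
int print_differing_bits(unsigned int a, unsigned int b)
{
    unsigned int diff = a ^ b;   /* 1 wherever a and b disagree */
    int count = 0;

    for (int pos = 0; diff != 0; diff >>= 1, pos++) {
        if (diff & 1u) {
            printf("bit %d differs\n", pos);
            count++;
        }
    }
    return count;
}
```

For 15 and 13 the XOR is 00000010, so only bit 1 (the 2^1 place) is reported. The loop stops as soon as no differing bits remain, so it does not need to know the width of the type.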

How do I check if an integer is even or odd using bitwise operators

How do I check if an integer is even or odd using bitwise operators
Consider what being "even" and "odd" means in "bit" terms. Since binary integer data is stored with bits indicating multiples of 2, the lowest-order bit will correspond to 2^0, which is of course 1, while all of the other bits will correspond to multiples of 2 (2^1 = 2, 2^2 = 4, etc.). Gratuitous ASCII art:
NNNNNNNN
||||||||
|||||||+−− bit 0, value = 1 (2^0)
||||||+−−− bit 1, value = 2 (2^1)
|||||+−−−− bit 2, value = 4 (2^2)
||||+−−−−− bit 3, value = 8 (2^3)
|||+−−−−−− bit 4, value = 16 (2^4)
||+−−−−−−− bit 5, value = 32 (2^5)
|+−−−−−−−− bit 6, value = 64 (2^6)
+−−−−−−−−− bit 7 (highest order bit), value = 128 (2^7) for unsigned numbers,
           value = -128 (-2^7) for signed numbers (2's complement)
I've only shown 8 bits there, but you get the idea.
So you can tell whether an integer is even or odd by looking only at the lowest-order bit: If it's set, the number is odd. If not, it's even. You don't care about the other bits because they all denote multiples of 2, and so they can't make the value odd.
The way you look at that bit is by using the AND operator of your language. In C and many other languages syntactically derived from B (yes, B), that operator is &. In BASICs, it's usually And. You take your integer, AND it with 1 (which is a number with only the lowest-order bit set), and if the result is not equal to 0, the bit was set.
I'm intentionally not actually giving the code here, not only because I don't know what language you're using, but because you marked the question "homework." :-)
In C (and most C-like languages)
if (number & 1) {
// It's odd
}
if (number & 1)
number is odd
else // (number & 1) == 0
number is even
For example, let's take integer 25, which is odd.
In binary 25 is 00011001. Notice that the least significant bit b0 is 1.
00011001
00000001 (00000001 is 1 in binary)
&
--------
00000001
Just a footnote to Jim's answer.
In C#, unlike C, an int is not implicitly converted to bool, so the result of the bitwise AND has to be compared explicitly:
if ((number & 1) == 1) {
// It's odd
}
if(x & 1) // '&' is a bit-wise AND operator
printf("%d is ODD\n", x);
else
printf("%d is EVEN\n", x);
Examples:
For 9:
9 -> 1 0 0 1
1 -> & 0 0 0 1
-------------------
result-> 0 0 0 1
So 9 AND 1 gives us 1, as the right most bit of every odd number is 1.
For 14:
14 -> 1 1 1 0
1 -> & 0 0 0 1
------------------
result-> 0 0 0 0
So 14 AND 1 gives us 0, as the right most bit of every even number is 0.
Also, in Java you will have to use if((number&1)==1){//then odd}, because in Java and C#-like languages an int is not converted to boolean. You'll have to use the relational operators to produce a boolean value, i.e. true or false, unlike C and C++-like languages, which treat any non-zero value as true.
You can do it simply using bitwise AND & operator.
if(num & 1)
{
//I am odd number.
}
Read more over here - Checking even odd using bitwise operator in C
Check Number is Even or Odd using XOR Operator
Number = 11
1011 - 11 in Binary Format
^ 0001 - 1 in Binary Format
----
1010 - 10 in Binary Format
Number = 14
1110 - 14 in Binary Format
^ 0001 - 1 in Binary Format
----
1111 - 15 in Binary Format
As can be observed, XORing a number with 1 increments it by 1 if it is even and decrements it by 1 if it is odd.
Code:
if((n^1) == (n+1))
cout<<"even\n";
else
cout<<"odd\n";
#include <iostream>
#include <algorithm>
#include <vector>
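void BitConvert(int num, std::vector<int> &array){
    // != 0 instead of > 0 so that negative numbers are converted too
    while (num != 0){
        array.push_back(num % 2 != 0 ? 1 : 0);
        num = num / 2;
    }
}
void CheckEven(int num){
    std::vector<int> array;
    BitConvert(num, array);
    // array is empty when num is 0, so array[0] must not be read then
    if (array.empty() || array[0] == 0)
        std::cout << "Number is even";
    else
        std::cout << "Number is odd";
}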
int main(){
int num;
std::cout << "Enter a number:";
std::cin >> num;
CheckEven(num);
std::cout << std::endl;
return 0;
}
In Java,
if((num & 1)==0){
//its an even num
}
//otherwise its an odd num
This is an old question, however the other answers have left this out.
In addition to using num & 1, you can also use (num | 1) > num, which is true when num is even. The parentheses are required, because > binds more tightly than |.
This works because if a number is odd, the resulting value will be the same, since the original value num already has the ones bit set. If num is even, however, the ones bit is not set, so changing it to a 1 makes the new value greater by one.
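The AND test and the OR test can be put side by side. This is only a sketch, assuming two's complement integers; the function names are hypothetical, and the OR expression is parenthesized because > binds more tightly than |.

```c
#include <stdbool.h>

/* Odd when the lowest-order bit is set. */
bool is_odd(int n)
{
    return (n & 1) != 0;
}

/* Even when the lowest-order bit is clear. */
bool is_even_and(int n)
{
    return (n & 1) == 0;
}

/* Even when forcing the lowest-order bit to 1 makes the value larger. */
bool is_even_or(int n)
{
    return (n | 1) > n;
}
```

Both even tests agree for every int, including negative values, because setting the lowest bit of an even number always adds exactly one.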
Approach 1: Short and no need for explicit comparison with 1
if (number & 1) {
// number is odd
}
else {
// number is even
}
Approach 2: Needs an extra bracket and explicit comparison with 0
if((num & 1) == 0){ // Note: Bracket is MUST around num & 1
// number is even
}
else {
// number is odd
}
What would happen if I miss the bracket in the above code
if(num & 1 == 0) { } // wrong way of checking even or not!!
becomes
if(num & (1 == 0)) { } // == is higher precedence than &
https://en.cppreference.com/w/cpp/language/operator_precedence

How do the bit manipulations in this bit-sorting code work?

Jon Bentley, in Column 1 of his book Programming Pearls, introduces a technique for sorting a sequence of positive integers using bit vectors.
I have taken the program bitsort.c from here and pasted it below:
/* Copyright (C) 1999 Lucent Technologies */
/* From 'Programming Pearls' by Jon Bentley */
/* bitsort.c -- bitmap sort from Column 1
* Sort distinct integers in the range [0..N-1]
*/
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i)
{
int sh = i>>SHIFT;
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/*Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
I understand what the functions clr, set and test are doing and explain them below: ( please correct me if I am wrong here ).
clr clears the ith bit
set sets the ith bit
test returns the value at the ith bit
Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.
The first 3 constants are inter-related. BITSPERWORD is 32. This you'd want to set based on your compiler+architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F which is 11111 in binary (ie: the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.
The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:
which int (of the array) is it in
which of that int's bits are we talking about
Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).
Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.
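The split of i into an array index and a bit offset can be written out on its own. This sketch reuses the constants from bitsort.c; the two function names are hypothetical, and unsigned arguments are assumed so that shifting and masking match division and remainder exactly.

```c
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F

/* Which element of the int array holds bit i: same as i / BITSPERWORD. */
unsigned int word_index(unsigned int i)
{
    return i >> SHIFT;
}

/* Which bit inside that element: same as i % BITSPERWORD. */
unsigned int bit_offset(unsigned int i)
{
    return i & MASK;
}
```

For i = 70, word_index gives 2 and bit_offset gives 6, since 70 = 2 * 32 + 6.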
Basically, it is an optimized bucket sort:
reserve a bit array of length n bits.
clear the bit array (first for in main).
read the items one by one (they must all be distinct).
set the i'th bit in the bit array if the read number is i.
iterate the bit array.
if the bit is set then print the position.
Or in other words (for N < 10 and to sort 3 numbers 4, 6, 2):
start with an empty 10 bit array (aka one integer usually)
0000000000
read 4 and set the bit in the array..
0000100000
read 6 and set the bit in the array
0000101000
read 2 and set the bit in the array
0010101000
iterate the array and print every position in which the bits are set to one.
2, 4, 6
sorted.
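The walkthrough above fits in a single word when the values are small. The following is a minimal sketch under assumptions: bitmap_sort_small is a hypothetical name, values are limited to the range 0..31, and out-of-range input is silently ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sorts distinct values in the range [0, 31] by using
 * one 32-bit word as the bit array. Writes the values in ascending order to
 * out and returns how many were written. */
size_t bitmap_sort_small(const unsigned int *in, size_t n, unsigned int *out)
{
    uint32_t bits = 0;                        /* the empty bit array */
    size_t written = 0;

    for (size_t k = 0; k < n; k++)
        if (in[k] < 32)                       /* ignore out-of-range input */
            bits |= (uint32_t)1 << in[k];     /* set the bit for this value */

    for (unsigned int pos = 0; pos < 32; pos++)
        if (bits & ((uint32_t)1 << pos))      /* bit set: value was present */
            out[written++] = pos;

    return written;
}
```

Feeding it 4, 6, 2 produces 2, 4, 6. A duplicate value sets a bit that is already set, so it appears only once in the output.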
Starting with set():
A right shift of 5 is the same as dividing by 32. It does that to find which int the bit is in.
MASK is 0x1f or 31. ANDing with the address gives the bit index within the int. It's the same as the remainder of dividing the address by 32.
Shifting 1 left by the bit index ("1<<(i & MASK)") results in an integer which has just 1 bit in the given position set.
ORing sets the bit.
The line "int sh = i>>SHIFT;" is a wasted line, because they didn't use sh again beneath it, and instead just repeated "i>>SHIFT"
clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.
The bitsort will also remove duplicates from the list, because it will only count up to 1 per integer. A sort that uses integers instead of bits to count more than 1 of each is called a counting sort.
The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.
If you try understand this (note: I rather use bits-per-row than bits-per-word, since we're talking about a bit-matrix here):
// supposing an int of 1 bit would exist...
int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements
// set bit at x,y:
int linear_address = y*BITSPERROW + x;
bits[linear_address] = 1; // or 0
// 0 1 2 3 4 5 6 7 8 9 10 11 ... 31
// . . . . . . . . . . . . .
// . . . . X . . . . . . . . -> x = 4, y = 1 => i = (1*32 + 4)
The statement linear_address = y*BITSPERROW + x also means that x = linear_address % BITSPERROW and y = linear_address / BITSPERROW.
When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using
int bitrow = 0;
bitrow |= 1 << (x);
Now when we iterate over the bits, we have the linear address, but need to find the corresponding word.
int column = linear_address % BITSPERROW;
int bit_mask = 1 << column; // meaning for the xth column,
// you take 1 and shift that bit x times
int row = linear_address / BITSPERROW;
So to set the i'th bit, you can do this:
bits[ i/BITSPERROW ] |= 1 << ( i%BITSPERROW );
An extra gotcha is that the modulo operator can be replaced by a bitwise AND, and the / operator can be replaced by a shift, too, if the second operand is a power of two.
a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a >> SHIFT
This ultimately boils down to the very dense, yet hard-to-understand-for-the-bitfucker-agnostic notation
a[ i >> SHIFT ] |= ( 1 << (i&MASK) );
But I don't see the algorithm working for e.g. 40 bits per word.
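When the word size is not a power of two, or is simply not known in advance, the same addressing can be written with / and % instead of a shift and a mask. The sketch below is an assumption-laden illustration: WORD_BITS, set_bit, clear_bit and test_bit are hypothetical names, and the word type is taken to be unsigned int.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one word of the array, whatever the platform uses. */
#define WORD_BITS (CHAR_BIT * sizeof(unsigned int))

void set_bit(unsigned int *words, size_t i)
{
    words[i / WORD_BITS] |= 1u << (i % WORD_BITS);
}

void clear_bit(unsigned int *words, size_t i)
{
    words[i / WORD_BITS] &= ~(1u << (i % WORD_BITS));
}

int test_bit(const unsigned int *words, size_t i)
{
    return (int)((words[i / WORD_BITS] >> (i % WORD_BITS)) & 1u);
}
```

A typical optimizing compiler is likely to turn the division and remainder into a shift and a mask when WORD_BITS happens to be a power of two, so the readable form usually costs nothing.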
Quoting the excerpts from Bentley's original article in DDJ, this is what the code does at a high level:
/* phase 1: initialize set to empty */
for (i = 0; i < n; i++)
bit[i] = 0
/* phase 2: insert present elements */
for each i in the input file
bit[i] = 1
/* phase 3: write sorted output */
for (i = 0; i < n; i++)
if bit[i] == 1
write i on the output file
A few doubts:
1. Why is there a need for 32 bits?
2. Can we do this in Java by creating a HashMap with keys from 0000000 to 9999999
and values 0 or 1 based on the presence/absence of the number? What are the implications
of such a program?

Resources