I have a long sequence of bits stored in an array of unsigned long integers, like this:
struct bit_array
{
    int size;             /* nr of bits */
    unsigned long *array; /* the container that stores bits */
};
I am trying to design an algorithm to reverse the order of bits in *array. Problems:
size can be anything, i.e. not necessarily a multiple of 8 or 32 etc, so the first bit in the input array can end up at any position within the unsigned long in the output array;
the algorithm should be platform-independent, i.e. work for any sizeof(unsigned long).
Code, pseudocode, an algorithm description, etc. -- anything better than the brute-force ("bit by bit") approach is welcome.
My favorite solution is to fill a lookup-table that does bit-reversal on a single byte (hence 256 byte entries).
You apply the table to each byte of the input operand while swapping the byte order. If the size isn't a multiple of 8, you adjust with a final right shift.
This scales well to larger integers.
Example:
11 10010011 00001010 -> 01010000 11001001 11000000 -> 01 01000011 00100111
To split the number into bytes portably, you need to use bitwise masking/shifts; mapping a struct or an array of bytes onto the integer can make it more efficient (at the cost of an endianness dependence).
For raw performance, you could consider mapping 16 bits at a time, but the table then grows to 65536 entries, which rarely looks reasonable.
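A minimal sketch of the table approach (the helper names are mine, and it assumes 8-bit bytes):
#include <stddef.h>

static unsigned char rev8[256]; /* rev8[b] = bit-reversal of byte b */

static void init_rev8(void)
{
    for (int i = 1; i < 256; i++)
        rev8[i] = (unsigned char)((rev8[i >> 1] >> 1) | ((i & 1) << 7)); /* reuse reversal of i/2 */
}

/* reverse the bits of one unsigned long via the byte table */
static unsigned long rev_word(unsigned long x)
{
    unsigned long r = 0;
    for (size_t i = 0; i < sizeof x; i++, x >>= 8)
        r = (r << 8) | rev8[x & 0xFF]; /* lowest byte lands at the top */
    return r;
}
For a size that is not a multiple of 8, the result is then shifted right by the slack, as in the 18-bit example above.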
I like the lookup-table idea. Still, this is also a typical task for O(log n) group bit tricks, which can be very fast. For example:
unsigned long reverseOne(unsigned long x) { /* assumes unsigned long is 64 bits */
x = ((x & 0xFFFFFFFF00000000) >> 32) | ((x & 0x00000000FFFFFFFF) << 32);
x = ((x & 0xFFFF0000FFFF0000) >> 16) | ((x & 0x0000FFFF0000FFFF) << 16);
x = ((x & 0xFF00FF00FF00FF00) >> 8) | ((x & 0x00FF00FF00FF00FF) << 8);
x = ((x & 0xF0F0F0F0F0F0F0F0) >> 4) | ((x & 0x0F0F0F0F0F0F0F0F) << 4);
x = ((x & 0xCCCCCCCCCCCCCCCC) >> 2) | ((x & 0x3333333333333333) << 2);
x = ((x & 0xAAAAAAAAAAAAAAAA) >> 1) | ((x & 0x5555555555555555) << 1);
return x;
}
The underlying idea is that when we aim to reverse the order of some sequence we may swap the head and tail halves of this sequence and then separately reverse each of halves (which is done here by applying the same procedure recursively to each half).
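For example, with a 64-bit unsigned long, reverseOne(0x1UL) yields 0x8000000000000000UL, and reverseOne(reverseOne(x)) == x for any x.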
Here is a more portable version, supporting unsigned long widths of 4, 8, 16, or 32 bytes.
#include <limits.h>
#define ones32 0xFFFFFFFFUL
#if (ULONG_MAX >> 32 >> 32 >> 32 >> 32) /* chained shifts keep each shift below the preprocessor's arithmetic width */
#define fill32(x) (x|(x<<32)|(x<<64)|(x<<96)|(x<<128)|(x<<160)|(x<<192)|(x<<224))
#define patt128 (ones32|(ones32<<32)|(ones32<<64) |(ones32<<96))
#define patt64 (ones32|(ones32<<32)|(ones32<<128)|(ones32<<160))
#define patt32 (ones32|(ones32<<64)|(ones32<<128)|(ones32<<192))
#else
#if (ULONG_MAX >> 32 >> 32)
#define fill32(x) (x|(x<<32)|(x<<64)|(x<<96))
#define patt64 (ones32|(ones32<<32))
#define patt32 (ones32|(ones32<<64))
#else
#if (ULONG_MAX >> 32)
#define fill32(x) (x|(x<<32))
#define patt32 (ones32)
#else
#define fill32(x) (x)
#endif
#endif
#endif
unsigned long reverseOne(unsigned long x) {
#if (ULONG_MAX >> 32)
#if (ULONG_MAX >> 32 >> 32)
#if (ULONG_MAX >> 32 >> 32 >> 32 >> 32)
x = ((x & ~patt128) >> 128) | ((x & patt128) << 128);
#endif
x = ((x & ~patt64) >> 64) | ((x & patt64) << 64);
#endif
x = ((x & ~patt32) >> 32) | ((x & patt32) << 32);
#endif
x = ((x & fill32(0xffff0000UL)) >> 16) | ((x & fill32(0x0000ffffUL)) << 16);
x = ((x & fill32(0xff00ff00UL)) >> 8) | ((x & fill32(0x00ff00ffUL)) << 8);
x = ((x & fill32(0xf0f0f0f0UL)) >> 4) | ((x & fill32(0x0f0f0f0fUL)) << 4);
x = ((x & fill32(0xccccccccUL)) >> 2) | ((x & fill32(0x33333333UL)) << 2);
x = ((x & fill32(0xaaaaaaaaUL)) >> 1) | ((x & fill32(0x55555555UL)) << 1);
return x;
}
In a collection of related bit-twiddling topics, which can be found here, the bits of an individual array entry can be reversed as follows.
unsigned int v; // input bits to be reversed
unsigned int r = v; // r will be reversed bits of v; first get LSB of v
int s = sizeof(v) * CHAR_BIT - 1; // extra shift needed at end
for (v >>= 1; v; v >>= 1)
{
r <<= 1;
r |= v & 1;
s--;
}
r <<= s; // shift when v's highest bits are zero
The reversal of the entire array can then be completed by swapping the reversed entries end for end (and, if size is not a multiple of the word width, shifting out the slack bits).
You must define what the order of bits in an unsigned long is. You might assume that bit n corresponds to array[x] & (1UL << n), but this needs to be specified. If so, you also need to handle the byte ordering (little or big endian) if you are going to access the array as bytes instead of as unsigned longs.
I would definitely implement brute force first and measure whether the speed is an issue. No need to waste time trying to optimize this if it is not used a lot on large arrays. An optimized version can be tricky to implement correctly. If you end up trying anyway, the brute force version can be used to verify correctness on test values and benchmark the speed of the optimized version.
The fact that the size is not a multiple of the width of an unsigned long is the hardest part of the problem, and it can result in a lot of bit shifting.
But you don't have to do any of that if you can introduce a new struct member:
struct bit_array
{
    int size;             /* nr of bits */
    int offset;           /* first bit position */
    unsigned long *array; /* the container that stores bits */
};
Offset would tell you how many bits to ignore at the beginning of the array.
Then you only have to do the following steps:
Reverse array elements.
Swap bits of each element. There are many hacks for this in the other answers, but your compiler might also provide intrinsic functions to do it in fewer instructions (like the RBIT instruction on some ARM cores).
Calculate the new starting offset. This is equal to the number of unused bits the last element had. (A minimal sketch of these steps follows.)
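A sketch of those three steps, assuming the offset field above, at least one used word, and any of the single-word reversals from the other answers standing in as reverse_word():
#include <limits.h>
#include <stddef.h>

#define BITS (CHAR_BIT * sizeof(unsigned long))

extern unsigned long reverse_word(unsigned long x); /* any single-word reversal */

void reverse_bit_array(struct bit_array *b)
{
    size_t total = (size_t)b->offset + b->size;  /* bits spanned, incl. leading offset */
    size_t nwords = (total + BITS - 1) / BITS;   /* words actually in use */

    for (size_t i = 0, j = nwords - 1; i < j; i++, j--) { /* 1. reverse element order */
        unsigned long tmp = b->array[i];
        b->array[i] = b->array[j];
        b->array[j] = tmp;
    }
    for (size_t i = 0; i < nwords; i++)          /* 2. reverse bits inside each element */
        b->array[i] = reverse_word(b->array[i]);

    b->offset = (int)(nwords * BITS - total);    /* 3. old tail slack is the new offset */
}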
I would split the problem into two parts.
First, I would ignore the fact that the number of used bits is not a multiple of 32. I would use one of the given methods to reverse the whole array as if the size were a multiple of the word width.
pseudocode:
for half the longs in the array:
take the first longword;
take the last longword;
swap the bits in the first longword;
swap the bits in the last longword;
store the swapped first longword into the last location;
store the swapped last longword into the first location;
and then fix up the fact that the first few bits (call that number n) are actually garbage bits from the end of the longs (a C sketch of both passes follows the pseudocode):
for all of the longs in the array:
split the value in the leftmost n bits and the rest;
store the leftmost n bits into the righthand part of the previous word;
shift the rest bits to the left over n positions (making the rightmost n bits zero);
store them back;
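In C, and using the LSB-first convention (bit k of the sequence lives at array[k / W], position k % W, so the fix-up shifts right rather than left; see the note below about shifting direction), the two passes might look like this, with swap_bits() standing in for any single-word reversal above:
#include <limits.h>
#include <stddef.h>

#define WBITS (CHAR_BIT * sizeof(unsigned long))

extern unsigned long swap_bits(unsigned long x); /* any single-word reversal */

void reverse_bits(unsigned long *array, size_t nwords, size_t size) /* nwords >= 1 */
{
    size_t n = nwords * WBITS - size;   /* slack bits, 0 <= n < WBITS */

    /* pass 1: reverse the word order and the bits inside each word */
    for (size_t i = 0, j = nwords - 1; i <= j; i++, j--) {
        unsigned long a = swap_bits(array[i]);
        unsigned long b = swap_bits(array[j]);
        array[i] = b;
        array[j] = a;
    }
    /* pass 2: the n slack bits now sit at the bottom; shift everything right by n */
    if (n != 0) {
        for (size_t i = 0; i + 1 < nwords; i++)
            array[i] = (array[i] >> n) | (array[i + 1] << (WBITS - n));
        array[nwords - 1] >>= n;
    }
}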
You could try to fold that into one pass over the whole array of course. Something like this:
for half the longs in the array:
take the first longword;
take the last longword;
swap the bits in the first longword;
swap the bits in the last longword;
split both values into the leftmost n bits and the rest;
for the new first longword:
store the leftmost n bits into the righthand side of the previous word;
store the remaining bits into the first longword, shifted left;
for the new last longword:
remember the leftmost n bits for the next iteration;
store the remembered leftmost n bits, combined with the remaining bits, into the last longword;
store the swapped first longword into the last location;
store the swapped last longword into the first location;
I'm abstracting from the edge cases here (first and last longword), and you may need to reverse the shifting direction depending on how the bits are ordered inside each longword.
Related
I am trying to understand the masking concept and want to set bits 24, 25, and 26 of a uint32_t number in C.
For example, I have
uint32_t data =0;
I am taking an input from the user as a uint8_t, which can only be the value 3 or 4 (011, 100).
I want to set the value 011 or 100 in bits 24, 25, 26 of the data variable without disturbing the other bits.
Thanks.
To set bits 24, 25, and 26 of an integer without modifying the other bits, you can use this pattern:
data = (data & ~((uint32_t)7 << 24)) | ((uint32_t)(newBitValues & 7) << 24);
The first & operation clears those three bits. Then we use another & operation to ensure we have a number between 0 and 7. Then we shift it to the left by 24 bits and use | to put those bits into the final result.
I have some uint32_t casts just to ensure that this code works properly on systems where int has fewer than 32 bits, but you probably won't need those unless you are programming embedded systems.
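For the concrete case in the question, with the 3-bit user value (3 or 4) going into bits 24..26, that might look like this (get_user_input() is a hypothetical stand-in for wherever the value comes from):
uint8_t input = get_user_input();  /* hypothetical; yields 3 or 4 */
data = (data & ~((uint32_t)7 << 24)) | ((uint32_t)(input & 7) << 24);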
A more general approach: a macro and a function. Both are equally efficient, as optimizing compilers do a really good job. The macro sets n bits of d at position s to nd. The function takes its parameters in the same order.
#define MASK(n) ((1ULL << (n)) - 1)
#define SMASK(n,s) (~(MASK(n) << (s)))
#define NEWDATA(d,n,s) (((d) & MASK(n)) << (s))
#define SETBITS(d,nd,n,s) (((d) & SMASK(n,s)) | NEWDATA(nd,n,s))
uint32_t setBits(uint32_t data, uint32_t newBitValues, unsigned nbits, unsigned startbit)
{
uint32_t mask = (1UL << nbits) - 1;
uint32_t smask = ~(mask << startbit);
data = (data & smask) | ((newBitValues & mask) << startbit);
return data;
}
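Usage for the question's bits 24..26; both forms are equivalent:
uint32_t data = 0;
data = SETBITS(data, 3, 3, 24);  /* macro form: bits 24..26 become 011 */
data = setBits(data, 4, 3, 24);  /* function form: bits 24..26 become 100 */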
I am trying to write a function that calculates the average number of 1 bits per byte.
float AvgOnesOnBinaryString (int x)
for example:
-252 is 11111111 11111111 11111111 00000100
so the function returns 6.25
because ( 8+8+8+1) / 4 = 6.25
I have to use this function that counts the bits in a char:
int countOnesOnBinaryString (char x){
int bitCount = 0;
while(x > 0)
{
if ( x & 1 == 1 )
bitCount++;
x = x>>1;
}
return bitCount;
}
I tried:
float AvgOnesOnBinaryString (int x){
float total = 0;
total += countOnesOnBinaryString((x >> 24));
total += countOnesOnBinaryString((x >> 16));
total += countOnesOnBinaryString((x >> 8));
total += countOnesOnBinaryString(x);
return total/4;
}
but I get the answer 0.25 and not 6.25.
What could be the problem?
UPDATE
I can't change the AvgOnesOnBinaryString function signature.
The C language allows compilers to define char as either a signed or unsigned type. I suspect it is signed on your platform, meaning that a byte like 0xff is likely interpreted as -1. This means that the x > 0 test in countOnesOnBinaryString yields false, so countOnesOnBinaryString(0xff) would return 0 instead of the correct value 8.
You should change countOnesOnBinaryString to take an argument of type unsigned char instead of char.
For somewhat related reasons, it would also be a good idea to change the argument of AvgOnesOnBinaryString to be unsigned int. Or even better, uint32_t from <stdint.h>, since your code assumes the input value is 32 bits, and (unsigned) int is allowed to be of some other size.
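Putting both fixes together while keeping the required AvgOnesOnBinaryString signature (a sketch; like the question, it assumes a 32-bit int):
int countOnesOnBinaryString(unsigned char x) {
    int bitCount = 0;
    while (x > 0) {          /* terminates now: x can never be negative */
        if (x & 1)
            bitCount++;
        x = x >> 1;
    }
    return bitCount;
}

float AvgOnesOnBinaryString(int x) {
    unsigned int ux = (unsigned int)x;  /* shift an unsigned copy, not the int */
    float total = 0;
    total += countOnesOnBinaryString(ux >> 24);
    total += countOnesOnBinaryString(ux >> 16);
    total += countOnesOnBinaryString(ux >> 8);
    total += countOnesOnBinaryString(ux);
    return total / 4;
}
With x = -252 this returns 6.25 as expected.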
There is one algorithm that gives you the count of 1 bits in an unsigned variable far more quickly. Only 5 steps are needed for a 32-bit integer (the 64-bit version below uses 6). I'll show it to you in C for a full-length 64-bit unsigned number, so you can probably guess the pattern and why it works (it is explained below):
uint64_t
no_of_1_bits(uint64_t the_value)
{
the_value = ((the_value & 0xaaaaaaaaaaaaaaaa) >> 1) + (the_value & 0x5555555555555555);
the_value = ((the_value & 0xcccccccccccccccc) >> 2) + (the_value & 0x3333333333333333);
the_value = ((the_value & 0xf0f0f0f0f0f0f0f0) >> 4) + (the_value & 0x0f0f0f0f0f0f0f0f);
the_value = ((the_value & 0xff00ff00ff00ff00) >> 8) + (the_value & 0x00ff00ff00ff00ff);
the_value = ((the_value & 0xffff0000ffff0000) >> 16) + (the_value & 0x0000ffff0000ffff);
the_value = ((the_value & 0xffffffff00000000) >> 32) + (the_value & 0x00000000ffffffff);
return the_value;
}
The number of 1 bits will be left in the_value. If you divide the result by eight, you'll have the average number of 1 bits per byte for an unsigned long (beware of doing the shifts on signed chars: the sign bit is replicated, which is why your original algorithm misbehaves for negative numbers).
For 8 bit bytes, the algorithm reduces to:
uint8_t
no_of_1_bits(uint8_t the_value)
{
the_value = ((the_value & 0xaa) >> 1) + (the_value & 0x55);
the_value = ((the_value & 0xcc) >> 2) + (the_value & 0x33);
the_value = ((the_value & 0xf0) >> 4) + (the_value & 0x0f);
return the_value;
}
and again, the number of 1 bits is in the variable the_value.
The idea of this algorithm is to produce in the first step the sum of each pair of bits in a two bit accumulator (we shift the left bit of a pair to the right to align it with the right one, then we add them together, and in parallel for each pair of bits). As the accumulators are two bits, it is impossible to overflow (so there's never a carry from a pair of bits to the next, and we use the full integer as a series of two bit registers to add the sum)
Then we sum each pair of two-bit accumulators into accumulators of four bits, which again never overflow... then we do the same thing with the nibbles we got, summing them into registers of 8 bits... If two 2-bit values cannot overflow a 4-bit accumulator, two 4-bit values certainly cannot overflow an 8-bit one... and we continue until we add the left half of the word to the right half. We finally end up with the sum of all bits in one full-length register of the word length.
Easy, isn't it? :)
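As a worked example, trace the 8-bit version on 10110100 (four 1 bits):
step 1: (10100000 >> 1) + 00010100 = 01010000 + 00010100 = 01100100   (pair counts 01|10|01|00)
step 2: (01000100 >> 2) + 00100000 = 00010001 + 00100000 = 00110001   (nibble counts 0011|0001)
step 3: (00110000 >> 4) + 00000001 = 00000011 + 00000001 = 00000100   (total = 4)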
The code is from an open-source sha256 project:
uint64_t swapE64(uint64_t val) {
uint64_t x = val;
x = (x & 0xffffffff00000000) >> 32 | (x & 0x00000000ffffffff) << 32;
x = (x & 0xffff0000ffff0000) >> 16 | (x & 0x0000ffff0000ffff) << 16;
x = (x & 0xff00ff00ff00ff00) >> 8 | (x & 0x00ff00ff00ff00ff) << 8;
return x;
}
the function is not complex, but I don't know its mathematical meaning and usage.
My fault, I didn't ask the question very clearly. Between environments that use different endian representations it is clear: this function keeps the data's meaning the same. But under one and the same endian representation, what does it mean?
It absolutely will change the meaning of the data, so is there some other reason to swap it?
In the pseudocode for SHA256 on Wikipedia it says:
Pre-processing:
append the bit '1' to the message
append k bits '0', where k is the minimum number >= 0 such that the resulting message length (modulo 512 in bits) is 448
append length of message (without the '1' bit or padding), in bits, as 64-bit big-endian integer
(this will make the entire post-processed length a multiple of 512 bits)
x86/x86_64 Linux and Unix systems are little-endian.
It's converting the length of the message to big endian to add it to the end of the message, which it does in the source at L105 of sha256.c, and that section of the code is the only place where the swapE64 function is called:
https://github.com/noryb009/sha256/blob/77a185c837417ea3fc502289215738766a8f8046/sha256.c#L100
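On a little-endian machine the call therefore amounts to a plain byte swap of the 64-bit length before it is appended. For example:
uint64_t len = 0x0000000000000428ULL; /* message length in bits */
uint64_t be  = swapE64(len);          /* 0x2804000000000000: same bytes, reversed order */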
I am having a little trouble with this function of mine. We are supposed to use bitwise operators only (that means no logical operators and no loops or if statements), and we aren't allowed to use a constant bigger than 0xFF.
I got my function to work, but it uses a huge constant. When I try to implement it with smaller numbers and shifting, I can't get it to work and I'm not sure why.
The function is supposed to check all of the even bits in a given integer, and return 1 if they are all set to 1.
Working code
int allEvenBits(int x) {
/* implements a check for all even-numbered bits in the word set to 1 */
/* if yes, the program outputs 1 WORKING */
int all_even_bits = 0x55555555;
return (!((x & all_even_bits) ^ all_even_bits));
}
Trying to implement with a smaller constant and shifts
int allEvenBits(int x) {
/* implements a check for all even-numbered bits in the word set to 1 */
/* if yes, the program outputs 1 WORKING */
int a, b, c, d, e = 0;
int mask = 0x55;
/* first 8 bits */
a = (x & mask)&1;
/* second eight bits */
b = ((x>>8) & mask)&1;
/* third eight bits */
c = ((x>>16) & mask)&1;
/* fourth eight bits */
d = ((x>>24) & mask)&1;
e = a & b & c & d;
return e;
}
What am I doing wrong here?
When you do, for example, this:
d = ((x>>24) & mask)&1;
...you're actually checking whether the lowest bit (with value 1) is set, not whether any of the mask bits are set... since the &1 at the end bitwise-ANDs the result of the rest with 1. If you change the &1 to == mask, you'll instead get 1 when all of the bits set in mask are set in (x>>24), as intended. And of course, the same problem exists for the other similar lines as well.
If you can't use comparisons like == or != either, then you'll need to shift all the interesting bits into the same position, then AND them together and with a mask to eliminate the other bit positions. In two steps, this could be:
/* get bits that are set in every byte of x */
x = (x >> 24) & (x >> 16) & (x >> 8) & x;
/* 1 if all of bits 0, 2, 4 and 6 are set */
return (x >> 6) & (x >> 4) & (x >> 2) & x & 1;
I don't know why you are ANDing your values with 1. What is the purpose of that?
This code is untested, but I would do something along the lines of the following. Note the parentheses: == binds more tightly than &, so a bare x & 0x55 == 0x55 would evaluate as x & (0x55 == 0x55).
int allEvenBits(int x) {
    return ((x & 0x55) == 0x55) &&
           (((x >> 8) & 0x55) == 0x55) &&
           (((x >> 16) & 0x55) == 0x55) &&
           (((x >> 24) & 0x55) == 0x55);
}
Say you are checking the 4 least significant bits; the even ones (bits 0 and 2) form the mask 0101. Now you AND this with the first 4 bits of the number you're checking against: all the 1s should remain. So the test would be ((number & mask) == mask) (mask being 0101) for the 4 least significant bits; you do this in blocks of 4 bits (or 8, since you are allowed constants up to 0xFF).
If you aren't allowed to use constants larger than 0xff and your existing program works, how about replacing:
int all_even_bits = 0x55555555;
by:
int all_even_bits = 0x55;
all_even_bits |= all_even_bits << 8; /* it's now 0x5555 */
all_even_bits |= all_even_bits << 16; /* it's now 0x55555555 */
Some of the other answers here right-shift signed integers (i.e. int), which is implementation-defined for negative values.
An alternative route is:
int allevenbitsone(unsigned int a)
{
a &= a>>16; /* superimpose top 16 bits on bottom */
a &= a>>8; /* superimpose top 8 bits on bottom */
a &= a>>4; /* superimpose top 4 bits on bottom */
a &= a>>2; /* and down to last 2 bits */
return a&1; /* return & of even bits */
}
What this is doing is and-ing together the even 16 bits into bit 0, and the odd 16 bits into bit 1, then returning bit 0.
The main problem in your code is that you're doing &1: you take the first 8 bits of the number, mask them with 0x55, and then use only the 1st bit, which is wrong.
Consider this straightforward approach:
int evenBitsIn8BitNumber(int a) {
return (a & (a>>2) & (a>>4) & (a>>6)) & 1;
}
int allEvenBits(int a) {
return evenBitsIn8BitNumber(a) &
evenBitsIn8BitNumber(a>>8) &
evenBitsIn8BitNumber(a>>16) &
evenBitsIn8BitNumber(a>>24);
}
I am working with bit vectors in C. My bit vectors are unsigned long longs. For a large number of vectors I need to know whether the parity, i.e. the number of bits that are 1, is even or odd.
The exact value is not important, just the parity. I was wondering if there is anything faster than calculating the number of ones and checking. I tried to think of something, but couldn't find anything.
A short example of how I want this to work:
void checkIntersection(unsigned long long int setA, unsigned long long int setB){
if(isEven(setA & setB)){
//do something
}
}
With a divide-and-conquer technique:
uint64_t a = value;
a ^= (a >> 32); // Fold the 32 MSB over the 32 LSB
a ^= (a >> 16); // reducing the problem by 50%
a ^= (a >> 8); // <-- this can be a good break even point
..
return lookup_table[a & 0xff]; // 16 or 256 entries are typically good
..
The folding procedure can be applied all the way to the end:
a ^= (a >> 1);
return a & 1;
On x86 (IA) the parity flag can be retrieved directly after the reduction to 8 bits.
a ^= (a >> 4); makes another good point to stop dividing, since some processor architectures can provide parallel lookup tables (uint8_t LUT[16]) embedded in XMM (or NEON) registers. Alternatively, the potential cache misses of a 256-entry LUT can simply outweigh the computational cost of one extra round. It's naturally best to measure which LUT size is optimal on a given architecture.
That last table actually consists of only 16 bits, so it can be emulated with the sequence:
return ((TRUTH_TABLE_FOR_PARITY) >> (a & 15)) & 1;
where bit N of the magic constant above encodes the boolean value for Parity(N).
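Putting the folding and the 16-bit truth table together (the magic constant 0x6996 has bit N set exactly when N has odd parity):
#include <stdint.h>

int parity64(uint64_t a) /* returns 1 for odd parity, 0 for even */
{
    a ^= a >> 32;
    a ^= a >> 16;
    a ^= a >> 8;
    a ^= a >> 4;
    return (0x6996 >> (a & 15)) & 1;
}
The question's isEven(setA & setB) is then simply !parity64(setA & setB).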
You could precompute the parity of every possible byte value in an array (bool comes from <stdbool.h>):
bool pre[256] = { 0, 1, 1, 0, 1, .... };
When you need to find the parity of a larger value you just do:
bool parity (long long unsigned x)
{
bool parity = 0;
while(x)
{
parity ^= pre[x&0xff];
x>>=8;
}
return parity;
}
Disclaimer: I haven't tested the code, it's just an idea.
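The table does not have to be written out by hand; it can be generated at startup from the recurrence parity(i) = parity(i >> 1) ^ (i & 1):
void init_parity_table(void)
{
    pre[0] = 0;
    for (int i = 1; i < 256; i++)
        pre[i] = pre[i >> 1] ^ (i & 1);
}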
Pretty easy. Something like
unsigned population(unsigned long long x) {
x = ((x >> 1) & 0x5555555555555555) + (x & 0x5555555555555555);
x = ((x >> 2) & 0x3333333333333333) + (x & 0x3333333333333333);
x = ((x >> 4) & 0x0f0f0f0f0f0f0f0f) + (x & 0x0f0f0f0f0f0f0f0f);
x = (x >> 8) + x; // Don't need to mask, because 64 < 0xff
x = (x >> 16) + x;
x = (x >> 32) + x;
return x & 0xff;
}
should work. Also, many CPUs have population count instructions (x86 gained POPCNT with SSE4.2, and GCC/Clang expose it via __builtin_popcountll).
If you like this kind of thing, you should check out the book Hacker’s Delight by Henry S. Warren, Jr.