Let's say we have a 64-bit variable x, and we know how many of its bits are actually used, say 1 <= nx <= 64, so the last used bit is in position nx - 1. What is the fastest way to pad the remaining 64 - nx bits with copies of that last bit?
I would try something like (pseudocode/C):
uint64_t padd_input(uint64_t x, int nx) {
assert(0 < nx && nx <= 64);
uint64_t msb = (x >> (nx - 1)) & 1ULL; // value of the last used bit
if (nx < 64) x |= ((msb << (64 - nx)) - msb) << nx; // a shift by 64 would be undefined
return x;
}
Is all the shift/masking redundant? Or is there a smarter way to achieve the same thing?
Here is an example of what I want to achieve, assuming the unused part is already set to 0.
Say I have 0x7 and nx = 4: in this case there is nothing to do. With 0xF instead, the padding has to produce 0xFFFFFFFFFFFFFFFF.
I would do:
uint64_t padd_input(uint64_t x, int nx)
{
uint64_t t = x & (1ULL << (nx-1));
t = t - 1;
x = x | ~t;
return x;
}
or perhaps
uint64_t padd_input(uint64_t x, int nx)
{
uint64_t t = x & (1ULL << (nx-1));
if (t)
{
t = t - 1;
x = x | ~t;
}
return x;
}
as it seems clearer to me.
Note: I have not compared the performance of the OP's code and my code.
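For comparison, a branch-free variant that is often used for this kind of sign extension flips the top used bit and then subtracts it back. This is only a sketch: the name sign_extend64 is made up for illustration, and it assumes, like the example in the question, that the unused upper bits of x are already 0.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low nx bits of x to 64 bits.
 * Assumes 1 <= nx <= 64 and that bits nx..63 of x are already 0. */
uint64_t sign_extend64(uint64_t x, int nx)
{
    uint64_t m = 1ULL << (nx - 1);  /* mask of the top used bit */
    return (x ^ m) - m;             /* unsigned wraparound fills the upper bits */
}
```

Under these assumptions sign_extend64(0x7, 4) stays 0x7 and sign_extend64(0xF, 4) becomes 0xFFFFFFFFFFFFFFFF. Whether it beats the versions above would have to be measured.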
I need to find the mask value corresponding to a number provided by the user.
For example, if the user provides the input
22 (in binary 10110)
then I need to find the mask value by keeping the highest set bit of the input as 1 and setting the rest to 0.
So in this case it should be:
16 (in binary 10000)
Is there any built-in function in C to do this?
You could compute the position of the highest bit.
Once you have it, just shift left to get the proper mask value:
unsigned int x = 22;
int result = 0;
if (x != 0)
{
unsigned int y = x;
int bit_pos=-1;
while (y != 0)
{
y >>= 1;
bit_pos++;
}
result = 1<<bit_pos;
}
This sets result to 16.
(The value 0 is a special case: result stays 0.)
Basically, you need to round down to the nearest power of two. I am not sure there is a standard function for that, but try the following:
static inline uint32_t
floor_align32pow2(uint32_t x)
{
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
return (x >> 1) + (x & 1);
}
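Another way to reach the same result, sketched here for comparison, is to keep clearing the lowest set bit until only one bit is left. The helper name highest_bit_mask is hypothetical. The loop runs once per set bit, so it is likely slower than the shift-and-or version for inputs with many bits set.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the highest set bit of x (returns 0 for 0). */
uint32_t highest_bit_mask(uint32_t x)
{
    while (x & (x - 1))  /* more than one bit still set? */
        x &= x - 1;      /* clear the lowest set bit */
    return x;
}
```

For the input 22 (binary 10110) the loop produces 10100, then 10000, and returns 16.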
I have an array that represents an 8x8 "bit" block
unsigned char plane[8]
What I want to do is loop through this "bit" block horizontally
and count up the number of times a change occurs between
a 1 and 0.
When I extract a bit, it is getting stored in an
unsigned char, so basically I want to increase a count
when one char is nonzero and the other is zero.
What I have is the following:
int getComplexity(unsigned char *plane) {
int x, y, count = 0;
unsigned char bit;
for(x = 0; x < 8; x++) {
for(y = 0; y < 8; y++) {
if(y == 0) {
bit = plane[x] & 1 << y;
continue;
}
/*
MISSING CODE
*/
}
}
return count;
}
For the missing code, I could do:
if( (bit && !(plane[x] & 1 << y)) || (!bit && (plane[x] & 1 << y)) ) {
bit = plane[x] & 1 << y;
count++;
}
However, what I really want see is if there is some
clever bitwise operation to do this step instead
of having two separate tests.
This is really just a GCC solution, because the popcount builtin won't work on every other compiler.
unsigned char plane[8];
static const uint64_t tmask = 0x8080808080808080ULL;
uint64_t uplane;
memcpy(&uplane, plane, sizeof uplane); // pull the whole array into a single uint (needs <string.h>)
return __builtin_popcountll( ~tmask & (uplane ^ (uplane >> 1) ) );
On x86 the popcnt instruction wasn't actually available until SSE4.2-era processors (so rather recently).
Also, although this looks like it relies on endianness, it doesn't, because none of the individual bytes are allowed to interact thanks to the mask.
It does make some assumptions about the way memory works (8-bit bytes, for one) :\
As a side note, doing the same thing in the other direction (between adjacent bytes of the array) is just as easy:
return __builtin_popcountll(0xFFFFFFFFFFFFFFULL & ( uplane ^ (uplane >> 8) ) );
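For compilers without such a builtin, the same XOR idea can be applied one row at a time in plain C. The following is a sketch that assumes 8-bit rows, as in the question; it reuses the question's function name but is not the asker's code.

```c
/* Sketch: count 0<->1 changes between neighbouring bits inside each row. */
int getComplexity(const unsigned char *plane)
{
    int count = 0;
    for (int x = 0; x < 8; x++) {
        /* Bit y of diff is 1 when bits y and y+1 of the row differ;
         * bit 7 has no higher neighbour, so it is masked off. */
        unsigned diff = (plane[x] ^ (plane[x] >> 1)) & 0x7Fu;
        while (diff) {       /* count the set bits of diff */
            diff &= diff - 1;
            count++;
        }
    }
    return count;
}
```

The single XOR replaces the two separate tests: a bit of diff is set exactly when one neighbour is 1 and the other is 0.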
I have following function which counts the number of binary digits in an unsigned 32-bit integer.
uint32_t L(uint32_t in)
{
uint32_t rc = 0;
while (in)
{
rc++;
in >>= 1;
}
return(rc);
}
Could anyone please tell me which approach I should take for a signed 32-bit integer? Implementing two's complement is an option. If you have a better approach, please let me know.
What about:
uint32_t count_bits(int32_t in)
{
uint32_t unsigned_in = (uint32_t) in;
uint32_t rc = 0;
while (unsigned_in)
{
rc++;
unsigned_in >>= 1;
}
return(rc);
}
Just convert the signed int into an unsigned one and do the same thing as before.
BTW: I guess you know that, unless your processor has a special instruction for it and you have access to it, one of the fastest implementations of counting the set bits is:
int count_bits(unsigned x) {
x = x - ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x + (x >> 4)) & 0x0f0f0f0f;
x = x + (x >> 8);
x = x + (x >> 16);
return x & 0x0000003f;
}
It's not the fastest though...
Just reuse the function you defined as is:
int32_t bla = /* ... */;
uint32_t count;
count = L(bla);
You can cast bla to uint32_t (i.e., L((uint32_t) bla);) to make the conversion explicit, but it's not required by C.
If you are using gcc, it already provides fast implementations of functions to count bits and you can use them:
int __builtin_popcount (unsigned int x);
int __builtin_popcountl (unsigned long);
int __builtin_popcountll (unsigned long long);
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
Your negative number always shows 32 because the first digit of a negative signed integer is 1. A UInt4 of 1000 = 8 but an Int4 of 1000 = -8, an Int4 of 1001 = -7, an Int4 of 1010 = -6, etc.
Since the first digit in an Int32 is meaningful rather than just a bit of padding, you cannot really ignore it.
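If what is wanted for negative inputs is the number of digits of the magnitude rather than of the two's-complement pattern (an assumption, since the question leaves this open), the value can be negated in unsigned arithmetic before counting. The name bit_length_magnitude is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: number of binary digits in the magnitude of a signed value. */
uint32_t bit_length_magnitude(int32_t in)
{
    /* Negating as unsigned avoids overflow for INT32_MIN. */
    uint32_t mag = in < 0 ? 0u - (uint32_t)in : (uint32_t)in;
    uint32_t rc = 0;
    while (mag) {
        rc++;
        mag >>= 1;
    }
    return rc;
}
```

With this variant 5 and -5 both need 3 digits, while the most negative value still reports 32.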
I need to compute the greatest power of 2 that is less than an integer value, x.
Currently I am using:
#define log2(x) log(x)/log(2)
#define round(x) (int)(x+0.5)
x = round(pow(2,(ceil(log2(n))-1)));
This is in a performance-critical function.
Is there a more computationally efficient way of calculating x?
You are essentially looking for the highest non-zero bit in your number. Many processors have built-in instructions for this, which in turn are exposed by many compilers. For example, in GCC I would look at __builtin_clz, which
Returns the number of leading 0-bits in x, starting at the most significant bit position.
Together with sizeof(int) * CHAR_BIT and a shift, you can use this to figure out the corresponding pure-power-of-two integer. There's also a version for long integers.
(On ARM the CPU instruction is called CLZ (count leading zeros); on x86 the closest equivalents are BSR and LZCNT, in case you need to look this up for other compilers.)
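As a rough sketch of how the pieces described above fit together, the following assumes GCC or Clang (for __builtin_clz); the helper name pow2_below is made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: greatest power of 2 strictly less than n,
 * or 0 when there is none (n <= 1). Relies on the GCC/Clang builtin. */
unsigned pow2_below(unsigned n)
{
    if (n <= 1)
        return 0;
    /* __builtin_clz is undefined for 0, but n - 1 >= 1 here. */
    unsigned top = sizeof(unsigned) * CHAR_BIT - 1 - __builtin_clz(n - 1);
    return 1u << top;
}
```

Subtracting 1 first is what makes the result strictly less than n when n is itself a power of two, for example pow2_below(16) gives 8 while pow2_below(17) gives 16.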
I have an integer log2 function in my c-libutl library (hosted on googlecode if anyone is interested)
/*
** Integer log base 2 of a 32 bits integer values.
** llog2(0) == llog2(1) == 0
*/
unsigned short llog2(unsigned long x)
{
long l = 0;
x &= 0xFFFFFFFF; /* just in case 'long' is more than 32bit */
if (x==0) return 0;
#ifndef UTL_NOASM
#if defined(__POCC__) || defined(_MSC_VER) || defined (__WATCOMC__)
/* Pelles C MS Visual C++ OpenWatcom */
__asm { mov eax, [x]
bsr ecx, eax
mov l, ecx
}
#elif defined(__GNUC__)
l = (unsigned short) ((sizeof(long)*8 -1) - __builtin_clzl(x));
#else
#define UTL_NOASM
#endif
#endif
#ifdef UTL_NOASM /* Make a binary search.*/
if (x & 0xFFFF0000) {l += 16; x >>= 16;} /* 11111111111111110000000000000000 */
if (x & 0xFF00) {l += 8; x >>= 8 ;} /* 1111111100000000*/
if (x & 0xF0) {l += 4; x >>= 4 ;} /* 11110000*/
if (x & 0xC) {l += 2; x >>= 2 ;} /* 1100 */
if (x & 2) {l += 1; } /* 10 */
return l;
#endif
return (unsigned short)l;
}
Then you can simply compute
(1 << llog2(x))
to get the greatest power of two that is less than or equal to x; pass x - 1 instead if it must be strictly less than x. Beware of 0! You should handle it separately.
It uses assembler code but can also be forced to plain C code by defining the UTL_NOASM symbol.
The code was tested at the time, but I have not used it for quite a while and I can't say whether it behaves correctly in a 64-bit environment.
Based on Bit Twiddling Hacks: Find the log base 2 of an N-bit integer in O(lg(N)) operations by Sean Eron Anderson (code contributed by Eric Cole and Andrew Shapira):
unsigned int highest_bit (uint32_t v) {
unsigned int r = 0, s;
s = (v > 0xFFFF) << 4; v >>= s; r |= s;
s = (v > 0xFF ) << 3; v >>= s; r |= s;
s = (v > 0xF ) << 2; v >>= s; r |= s;
s = (v > 0x3 ) << 1; v >>= s; r |= s;
return r | (v >> 1);
}
This returns the index of the highest bit of the input; the greatest power of 2 no greater than the input is then 1 << highest_bit(x), and the greatest power of 2 strictly less than the input is thus simply 1 << highest_bit(x-1).
For 64-bit inputs, just change the input type to uint64_t and add the following extra line at the beginning of the function, after the variable declarations:
s = (v > 0xFFFFFFFF) << 5; v >>= s; r |= s;
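Put together, a 64-bit version might look like the sketch below. The name highest_bit64 is hypothetical; the first step shifts by 32 (that is, 1 << 5) when the upper half is non-zero, and the remaining steps are the 32-bit ones.

```c
#include <stdint.h>

/* Sketch: index of the highest set bit of a 64-bit value (0 for inputs 0 and 1). */
unsigned int highest_bit64(uint64_t v)
{
    unsigned int r = 0, s;
    s = (v > 0xFFFFFFFFu) << 5; v >>= s; r |= s;  /* upper 32 bits in use? */
    s = (v > 0xFFFF) << 4;      v >>= s; r |= s;
    s = (v > 0xFF) << 3;        v >>= s; r |= s;
    s = (v > 0xF) << 2;         v >>= s; r |= s;
    s = (v > 0x3) << 1;         v >>= s; r |= s;
    return r | (unsigned int)(v >> 1);
}
```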
Left and right shift operators do this the best
int MaxPowerOf2(int x)
{
int out = 1;
while(x > 1) { x >>= 1; out <<= 1; }
return out;
}
#include <math.h>
double greatestPower( double x )
{
return floor(log( x ) / log( 2 ));
}
This works because log is a monotonically increasing function. Note that the function returns the exponent; the power itself is 2 raised to that value.
Shifting bits around will most likely be much faster. Probably some bisection method on bits could make it even faster. Nice exercise for an improvement.
#include <stdio.h>
int closestPow2(int x)
{
int p;
if (x <= 1) return 0; /* No such power exists */
x--; /* Account for exact powers of 2, then one power less must be returned */
for (p = 0; x > 0; p++)
{
x >>= 1;
}
return 1<<(p-1);
}
int main(void)
{
printf("%x\n", closestPow2(0x7FFFFFFF));
return 0;
}
Given an array,
unsigned char q[32]="1100111...",
how can I generate a 4-byte bit set, unsigned char p[4], such that each bit of the bit set equals the corresponding value in the array, e.g., the first byte p[0] = "q[0] ... q[7]", the second byte p[1] = "q[8] ... q[15]", etc.?
And how can I do the opposite, i.e., given the bit set, generate the array?
My own attempt at the first part:
unsigned char p[4]={0};
for (int j=0; j<N; j++)
{
if (q[j] == '1')
{
p [j / 8] |= 1 << (7-(j % 8));
}
}
Is the above right? Are there any conditions to check? Is there a better way?
EDIT - 1
I wonder whether the above is efficient, as the array size could be up to 4096 or even more.
First, use strtoul to get a 32-bit value. Then convert the byte order to big-endian with htonl. Finally, store the result in your array:
#include <arpa/inet.h>
#include <stdlib.h>
/* ... */
unsigned char q[32] = "1100111...";
unsigned char result[4] = {0};
*(unsigned long*)result = htonl(strtoul((const char*)q, NULL, 2));
There are other ways as well.
But I lack <arpa/inet.h>!
Then you need to know which byte order your platform uses. If it's big-endian, then htonl does nothing and can be omitted. If it's little-endian, then htonl is just:
unsigned long htonl(unsigned long x)
{
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
If you're lucky, your optimizer might see what you're doing and make it into efficient code. If not, well, at least it's all implementable in registers and O(log N).
If you don't know what byte order your platform is, then you need to detect it:
typedef union {
char c[sizeof(int) / sizeof(char)];
int i;
} OrderTest;
unsigned long htonl(unsigned long x)
{
OrderTest test;
test.i = 1;
if(!test.c[0])
return x;
x = ((x & 0xFF00FF00) >> 8) | ((x & 0x00FF00FF) << 8);
x = ((x & 0xFFFF0000) >> 16) | ((x & 0x0000FFFF) << 16);
return x;
}
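A way to sidestep the byte-order question altogether is to write the bytes out with shifts, which gives the same result on any platform. A minimal sketch, with store_be32 as a hypothetical helper name and 8-bit bytes assumed:

```c
#include <stdint.h>

/* Hypothetical helper: write v into out[0..3], most significant byte first. */
void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}
```

Because the shifts operate on the value rather than on its memory layout, no endianness detection is needed.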
Maybe long is 8 bytes!
Well, the OP implied 4-byte inputs with their array size, but 8-byte long is doable:
#include <limits.h> /* ULONG_MAX; sizeof cannot be evaluated in #if */
#define kCharsPerLong (sizeof(long) / sizeof(char))
unsigned char q[8 * kCharsPerLong] = "1100111...";
unsigned char result[kCharsPerLong] = {0};
*(unsigned long*)result = htonl(strtoul((const char*)q, NULL, 2));
unsigned long htonl(unsigned long x)
{
#if ULONG_MAX == 0xFFFFFFFFUL
x = ((x & 0xFF00FF00UL) >> 8) | ((x & 0x00FF00FFUL) << 8);
x = ((x & 0xFFFF0000UL) >> 16) | ((x & 0x0000FFFFUL) << 16);
#elif ULONG_MAX == 0xFFFFFFFFFFFFFFFFUL
x = ((x & 0xFF00FF00FF00FF00UL) >> 8) | ((x & 0x00FF00FF00FF00FFUL) << 8);
x = ((x & 0xFFFF0000FFFF0000UL) >> 16) | ((x & 0x0000FFFF0000FFFFUL) << 16);
x = ((x & 0xFFFFFFFF00000000UL) >> 32) | ((x & 0x00000000FFFFFFFFUL) << 32);
#else
#error Unsupported word size.
#endif
return x;
}
For char that isn't 8 bits (DSPs like to do this), you're on your own. (This is why it was a Big Deal when the SHARC series of DSPs had 8-bit bytes; it made it a LOT easier to port existing code because, face it, C does a horrible job of portability support.)
What about arbitrary length buffers? No funny pointer typecasts, please.
The main thing that can be improved with the OP's version is to rethink the loop's internals. Instead of thinking of the output bytes as a fixed data register, think of it as a shift register, where each successive bit is shifted into the right (LSB) end. This will save you from all those divisions and mods (which, hopefully, are optimized away to bit shifts).
For sanity, I'm ditching unsigned char for uint8_t.
#include <stdint.h>
unsigned StringToBits(const char* inChars, uint8_t* outBytes, size_t numBytes,
size_t* bytesRead)
/* Converts the string of '1' and '0' characters in `inChars` to a buffer of
* bytes in `outBytes`. `numBytes` is the number of available bytes in the
* `outBytes` buffer. On exit, if `bytesRead` is not NULL, the value it points
* to is set to the number of bytes read (rounding up to the nearest full
* byte). If a multiple of 8 bits is not read, the last byte written will be
* padded with 0 bits to reach a multiple of 8 bits. This function returns the
* number of padding bits that were added. For example, an input of 11 bits
* will result `bytesRead` being set to 2 and the function will return 5. This
* means that if a nonzero value is returned, then a partial byte was read,
* which may be an error.
*/
{ size_t bytes = 0;
unsigned bits = 0;
uint8_t x = 0;
while(bytes < numBytes)
{ /* Parse a character. */
switch(*inChars++)
{ case '0': x <<= 1; ++bits; break;
case '1': x = (x << 1) | 1; ++bits; break;
default: numBytes = 0;
}
/* See if we filled a byte. */
if(bits == 8)
{ outBytes[bytes++] = x;
x = 0;
bits = 0;
}
}
/* Padding, if needed. */
if(bits)
{ bits = 8 - bits;
outBytes[bytes++] = x << bits;
}
/* Finish up. */
if(bytesRead)
*bytesRead = bytes;
return bits;
}
It's your responsibility to make sure inChars is null-terminated. The function will return on the first non-'0' or '1' character it sees or if it runs out of output buffer. Some example usage:
unsigned char q[32] = "1100111...";
uint8_t buf[4];
size_t bytesRead = 5;
if(StringToBits(q, buf, 4, &bytesRead) || bytesRead != 4)
{
/* Partial read; handle error here. */
}
This just reads 4 bytes, and traps the error if it can't.
unsigned char q[4096] = "1100111...";
uint8_t buf[512];
StringToBits(q, buf, 512, NULL);
This just converts what it can and sets the rest to 0 bits.
This function could be done better if C had the ability to break out of more than one level of loop or switch; as it stands, I'd have to add a flag value to get the same effect, which is clutter, or I'd have to add a goto, which I simply refuse.
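The question also asked for the opposite direction. A sketch of an inverse that uses the same shift-register idea follows; BitsToString is a hypothetical name, and the caller is assumed to provide room for numBytes * 8 characters plus the terminating NUL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inverse of StringToBits: writes numBytes * 8 characters
 * ('0' or '1', most significant bit of each byte first) plus a terminating
 * NUL into outChars, which must have room for numBytes * 8 + 1 chars. */
void BitsToString(const uint8_t* inBytes, size_t numBytes, char* outChars)
{
    for (size_t i = 0; i < numBytes; ++i) {
        uint8_t x = inBytes[i];
        for (int b = 0; b < 8; ++b) {
            *outChars++ = (x & 0x80) ? '1' : '0';  /* emit the top bit */
            x <<= 1;                               /* move the next bit up */
        }
    }
    *outChars = '\0';
}
```

For the two bytes 0xCE and 0x01 this writes the string 1100111000000001.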
I don't think that will quite work. You are comparing each "bit" to 1 when it should really be '1'. You can also make it a bit more efficient by getting rid of the if:
unsigned char p[4]={0};
for (int j=0; j<32; j++)
{
p [j / 8] |= (q[j] == '1') << (7-(j % 8));
}
Going in reverse is pretty simple too. Just mask for each "bit" that you set earlier.
unsigned char q[32]={0};
for (int j=0; j<32; j++) {
q[j] = !!(p[j / 8] & ( 1 << (7-(j % 8)) )) + '0';
}
You'll notice the creative use of (boolean) + '0' to convert between 1/0 and '1'/'0'.
According to your example it does not look like you are going for readability, and after a (late) refresh my solution looks very similar to Chriszuma's, except for the lack of parentheses (relying on operator precedence) and the addition of the !! to enforce a 0 or 1.
enum { N = 32 }; //N must be a multiple of 8 (an enum constant, so the arrays below can be initialized in C)
unsigned char q[N+1] = "11011101001001101001111110000111";
unsigned char p[N/8] = {0};
unsigned char r[N+1] = {0}; //reversed
for(size_t i = 0; i < N; ++i)
p[i / 8] |= (q[i] == '1') << 7 - i % 8;
for(size_t i = 0; i < N; ++i)
r[i] = '0' + !!(p[i / 8] & 1 << 7 - i % 8);
printf("%x %x %x %x\n", p[0], p[1], p[2], p[3]);
printf("%s\n%s\n", q,r);
If you are looking for extreme efficiency, try to use the following techniques:
Replace if by subtraction of '0' (seems like you can assume your input symbols can be only 0 or 1).
Also process the input from lower indices to higher ones.
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + q[c + b] - '0';
p[c / 8] = y;
}
Replace array indices by auto-incrementing pointers:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
int y = 0;
for (int b = 0; b < 8; ++b)
y = y * 2 + *qptr++ - '0';
*pptr++ = y;
}
Unroll the inner loop:
const char* qptr = q;
unsigned char* pptr = p;
for (int c = 0; c < N; c += 8)
{
*pptr++ =
qptr[0] - '0' << 7 |
qptr[1] - '0' << 6 |
qptr[2] - '0' << 5 |
qptr[3] - '0' << 4 |
qptr[4] - '0' << 3 |
qptr[5] - '0' << 2 |
qptr[6] - '0' << 1 |
qptr[7] - '0' << 0;
qptr += 8;
}
Process several input characters simultaneously (using bit twiddling hacks or MMX instructions) - this has great speedup potential!
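As one illustration of processing several characters at once, the sketch below packs 8 characters with a single mask and multiply. It is not a drop-in replacement: it assumes a little-endian machine, 8-bit bytes, input consisting only of '0' and '1', and a character set where the low bit of '0' is 0 (as in ASCII). The name pack8_le is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Sketch: pack 8 characters, each '0' or '1', into one byte, with s[0]
 * becoming the most significant bit. Little-endian only. */
uint8_t pack8_le(const char *s)
{
    uint64_t w;
    memcpy(&w, s, 8);              /* s[i] lands in byte i of w */
    w &= 0x0101010101010101ULL;    /* keep the low bit of each character */
    /* The multiply moves bit 8*i to bit 63 - i; no two partial products
     * land on the same bit, so nothing carries into the top byte. */
    return (uint8_t)((w * 0x8040201008040201ULL) >> 56);
}
```

The outer loop would then call this once per 8 input characters, for example p[c / 8] = pack8_le(&q[c]) after casting q to const char*. Whether it is actually faster than the unrolled version needs measuring on the target.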