Finding closest power of 2 for any float at compile time - c

I need to scale all floats to the [-1, 1] range by dividing by the closest higher power of 2. The code needs to be Q0.31 fixed-point, so no floats.
For example, 10.75 would be divided by 16, 20.91 by 32, 1000.17 by 1024, etc, all the way to 2^31.
I'd need the scaling to be done at compilation time.
For example:
#define PARAMETER1 10.0f // this could be changed in various builds
#define PARAMETER1_SCALE ( CALC_SCALE(PARAMETER1) )
#define float_to_fixed(x) ( (int)( (float)(x)*(float)0x80000000 ) )
int main()
{
int par1 = float_to_fixed( PARAMETER1/PARAMETER1_SCALE );
// use par1 here
// ...
// then descale using PARAMETER1_SCALE again
}
Is there a C macro CALC_SCALE which would calculate this?

How about this:
#include <math.h>
#include <iostream>
using namespace std;
#define collapse(value) ((value) < 0 ? (value) / pow(2, ceil(log2((value) * -1))) : (value) / pow(2, ceil(log2(value))))
int main() {
cout << collapse(10.75) << endl;
cout << collapse(20.91) << endl;
cout << collapse(-1) << endl;
cout << collapse(-2.5) << endl;
cout << collapse(5.7) << endl;
}
Output is:
0.671875
0.653438
-1
-0.625
0.7125

This does it:
#define CALC_SCALE(x) ( (x) > 16 ? 32 : (x) > 8 ? 16 : (x) > 4 ? 8 : (x) > 2 ? 4 : (x) > 1 ? 2 : 1 )
At compile time:
int x = CALC_SCALE( 13 );
is compiled to:
int x = 16;
This can easily be changed to support floats.
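For instance, a float adaptation of the same chain might look like the sketch below (positive values only; extend the rungs in the same pattern up to 2^31). With a constant argument the compiler folds the whole expression to a single constant, so it satisfies the "at compilation time" requirement:
#define CALC_SCALE(x) \
( (x) > 16.0f ? 32.0f : \
  (x) > 8.0f ? 16.0f : \
  (x) > 4.0f ? 8.0f : \
  (x) > 2.0f ? 4.0f : \
  (x) > 1.0f ? 2.0f : 1.0f )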

efficient loop of a map within a map c++

I am attempting to find an efficient loop over a map data structure.
The map structure maps the following integers:
1 2, 2 3, 3 1, 4 1, 4 5, 5 3, 5 6, 5 7, 5 8, 6 4, 7 6, 8 9, 9 10
The resulting map looks as follows:
1| 2
2| 3
3| 1
4| 1 5
5| 3 6 7 8
6| 4
7| 6
8| 9
9| 10
Start : 4
Result :
1(1) 2(2) 5(1) 3(2) 6(2) 7(2) 8(2)
Can anybody suggest how to loop efficiently (possibly with a recursive method) so that, given a start of say 4, the result would be
1(1), 2(2), 5(1), 3(2), 6(2), 7(2), 8(2), 9(3), 10(4)
So the idea is to use each inner key as an outer key, starting with a given outer key. With outer key 4, for example, the inner keys are 5 and 1. So use 5 and 1 as outer keys to obtain inner keys (3 6 7 8) and (2); the process should continue mapping the inner keys to outer keys. A running total should be kept per "jump". So it probably resolves to a recursive problem rather than a loop.
The process should stop if either you reach the starting point, 4 in the above scenario, or there are no more inner keys, for example, 10 has no mapping.
The loop marked "// efficient loop" in the code below only performs the above up to two levels, which is inadequate.
#include <iostream>
#include <map>
#include <sstream>
int digit_count(int number) {
int digits = 0;
if (number < 0) digits = 1; // remove this line if '-' counts as a digit
while (number) {
number /= 10;
digits++;
}
return digits;
}
int main() {
int v1, v2;
std::map< int, std::map< int, int> > m;
std::istringstream stm {"1 2 2 3 3 1 4 1 4 5 5 3 5 6 5 7 5 8 6 4 7 6 8 9 9 10"};
while (stm >> v1 >> v2) {
m[v1];
m[v1][v2] = 1;
}
std::cout << "Map layout " << "\n";
std::string ss = "";
int dc = digit_count(m.rbegin()->first); // equals max number
for (const auto & p : m) {
std::cout << p.first << ss.append(" ", (dc - digit_count(p.first))) << "| ";
for (const auto & val : p.second)
std::cout << val.first << " ";
ss = "";
std::cout << "\n";
}
int start {4};
std::cout << "\nStart : " << start << "\n";
std::cout << "Result : " << "\n";
// efficient loop
for (const auto & e : m[start]) {
std::cout << e.first << "(" << e.second << ") ";
for (const auto & x : m[e.first])
std::cout << x.first << "(" << (e.second + x.second) << ") ";
}
std::cout << "\n";
return 0;
}
Any help would be much appreciated.
Took me a while, but I've answered my own question, and thought I'd update the post. The only way I could think of to find a solution was to create a recursive function; I don't think it is possible to achieve with loops. I didn't select Dijkstra's algorithm, since there are no weights to consider, and the results of this recursive function serve as input to a red-black tree where each node of the red-black tree holds a hash table (unordered_map).
So the result of the query on the combined red-black tree/hash table is log n, to find the shortest path. The problem is that producing the input, as you can see below, is recursive and inefficient.
#include <iostream>
#include <map>
#include <sstream>
#include <set>
#include <vector>
#include <fstream>
int digit_count(int number) {
int digits = 0;
if (number < 0) digits = 1; // remove this line if '-' counts as a digit
while (number) {
number /= 10;
digits++;
}
return digits;
}
struct vertex {
int point;
mutable bool visited{false};
int id;
};
void clear_visited(std::map<int, vertex>& verteces) {
for (const auto & e : verteces) {
e.second.visited = false;
}
}
void traverse_graph(std::map<int, vertex>& verteces, const vertex & v, std::map< vertex, std::map< vertex, int> >& graph, int& counter) {
if (verteces[v.id].visited)
return;
++counter;
verteces[v.id].visited = true;
std::cout << v.point << "(" << counter << ") ";
for (const auto & e : graph[v]) {
traverse_graph(verteces, e.first, graph, counter);
}
}
void start_traverse_graph(std::map<int, vertex>& verteces, const vertex & v, std::map< vertex, std::map< vertex, int> >& graph) {
if (verteces[v.id].visited)
return;
verteces[v.id].visited = true;
for (const auto & e : graph[v]) {
int counter{0};
clear_visited(verteces);
verteces[v.id].visited = true;
traverse_graph(verteces, e.first, graph, counter);
}
}
bool operator<(vertex a, vertex b) { return a.point < b.point; }
int main (int argc, char *argv[]) {
vertex v1, v2;
std::map< vertex, std::map< vertex, int> > m;
std::istringstream stm {"1 2 2 3 3 1 4 1 4 5 5 3 5 6 5 7 5 8 6 4 7 6 8 9 9 10"};
std::set<vertex> vertecesSet;
std::map<int, vertex> verteces;
while (stm >> v1.point >> v2.point) {
v1.id = v1.point;
v2.id = v2.point;
m[v1];
m[v1][v2] = 1;
vertecesSet.insert({v1.point, false, v1.point});
vertecesSet.insert({v2.point, false, v2.point});
}
for(auto & el : vertecesSet)
verteces[el.id] = std::move(el); // dont need set objects anymore so move them
std::cout << "Map layout " << "\n";
std::string ss = "";
int dc = digit_count(m.rbegin()->first.point); // equals max number
for (const auto & p : m) {
std::cout << p.first.point << ss.append(" ", (dc - digit_count(p.first.point))) << "| ";
for (const auto & val : p.second)
std::cout << val.first.point << " ";
ss = "";
std::cout << "\n";
}
vertex start {5,false,5};
std::cout << "\nStart : " << start.point << "\n";
start_traverse_graph( verteces, start, m);
std::cout << "\n";
return 0;
}
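For reference, counting "jumps" over unit-weight edges is exactly what breadth-first search computes with a plain queue and no recursion. Below is a minimal C sketch over the same sample graph (the fixed-size arrays, names, and adjacency matrix are illustrative choices, not part of the original post):
#include <stdio.h>
#define N 11 /* vertices are numbered 1..10 */
int main(void)
{
    /* edge list from the question */
    static const int edges[][2] = {{1,2},{2,3},{3,1},{4,1},{4,5},{5,3},{5,6},{5,7},{5,8},{6,4},{7,6},{8,9},{9,10}};
    int adj[N][N] = {{0}};
    for (size_t i = 0; i < sizeof edges / sizeof edges[0]; i++)
        adj[edges[i][0]][edges[i][1]] = 1;
    int start = 4;
    int dist[N]; /* dist[v] = number of jumps from start, -1 = unreached */
    for (int v = 0; v < N; v++) dist[v] = -1;
    int queue[N], head = 0, tail = 0;
    dist[start] = 0;
    queue[tail++] = start;
    while (head < tail) {
        int u = queue[head++];
        for (int v = 1; v < N; v++) {
            if (adj[u][v] && dist[v] == -1) {
                dist[v] = dist[u] + 1; /* one more jump than u */
                queue[tail++] = v;
            }
        }
    }
    for (int v = 1; v < N; v++)
        if (v != start && dist[v] != -1)
            printf("%d(%d) ", v, dist[v]);
    printf("\n"); /* 1(1) 2(2) 3(2) 5(1) 6(2) 7(2) 8(2) 9(3) 10(4) */
    return 0;
}
Each vertex enters the queue at most once, so the traversal is O(V + E) instead of re-walking the graph per neighbor.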

Wrong results multiplying two 32 bit numbers in C

I am trying to multiply two matrices in C and I can't understand why I get these results...
I want to do : Btranspose * B
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <math.h>
#define LOW_WORD(x) (((x) << 16) >> 16)
#define HIGH_WORD(x) ((x) >> 16)
#define ABS(x) (((x) >= 0) ? (x) : -(x))
#define SIGN(x) (((x) >= 0) ? 1 : -1)
#define UNSIGNED_MULT(a, b) \
(((LOW_WORD(a) * LOW_WORD(b)) << 0) + \
(((int64_t)((LOW_WORD((a)) * HIGH_WORD((b))) + (HIGH_WORD((a)) * LOW_WORD((b))))) << 16) + \
((int64_t)(HIGH_WORD((a)) * HIGH_WORD((b))) << 32))
#define MULT(a, b) (UNSIGNED_MULT(ABS((a)), ABS((b))) * SIGN((a)) * SIGN((b)))
int main()
{
int c,d,k;
int64_t multmatrix[3][3];
int64_t sum64 = 0;
int32_t Btranspose[3][3] = {{15643, 24466, 58751},
{54056, 26823, -25563},
{-33591, 54561, -13777}};
int32_t B[3][3] = {{15643, 54056, -33591},
{24466, 26823, 54561},
{58751, -25563, -13777}};
for ( c = 0 ; c < 3 ; c++ ){
for ( d = 0 ; d < 3 ; d++ ){
for ( k = 0 ; k < 3 ; k++ ){
sum64 = sum64 + MULT(Btranspose[c][k], B[k][d]);
printf("\n the MULT for k = %d is: %ld \n", k, MULT(Btranspose[c][k], B[k][d]));
printf("\n the sum for k = %d is: %ld \n", k, sum64);
}
multmatrix[c][d] = sum64;
sum64 = 0;
}
}
printf("\n\n multmatrix \n");
for( c = 0 ; c < 3; c++ ){
printf("\n");
for( d = 0 ; d < 3 ; d++ ){
printf(" %ld ", multmatrix[c][d]);
}
}
return 0;
}
My output is below, but it is wrong, and I notice that the mistake appears when multiplying the 3rd element (58751 * 58751) for k=2.
I think it is not overflowing, because 58751^2 needs 32 bits.
the MULT for k = 0 is: 244703449
the sum for k = 0 is: 244703449
the MULT for k = 1 is: 598585156
the sum for k = 1 is: 843288605
the MULT for k = 2 is: 46036225 // this is WRONG!!!
the sum for k = 2 is: 889324830
.
.
.
.
the MULT for k = 2 is: 189805729
the sum for k = 2 is: 1330739379
multmatrix
889324830 650114833 324678230
650114833 1504730698 -308929574
324678230 -308929574 1330739379
Correct result should be
multmatrix - correct
4.2950e+09 -2.2870e+03 1.2886e+04
-2.2870e+03 4.2950e+09 -1.2394e+05
1.2886e+04 -1.2394e+05 4.2951e+09
Why is the multiplication of the matrices wrong?
What should I change in the above code so that the multiplication of the two matrices is overflow-proof?
(I am trying to write a program that multiplies two 32-bit numbers, to run on a system that has only 32-bit registers.)
So, according to the answer below, this actually works:
#define LOW_WORD(x) ((uint32_t)(x) & 0xffff)
#define HIGH_WORD(x) ((uint32_t)(x) >> 16)
#define ABS(x) (((x) >= 0) ? (x) : -(x))
#define SIGN(x) (((x) >= 0) ? 1 : -1)
#define UNSIGNED_MULT(a, b) \
(((LOW_WORD(a) * LOW_WORD(b)) << 0) + \
((int64_t)(LOW_WORD(a) * HIGH_WORD(b) + HIGH_WORD(a) * LOW_WORD(b)) << 16) + \
((int64_t)(HIGH_WORD((a)) * HIGH_WORD((b))) << 32))
#define MULT(a, b) (UNSIGNED_MULT(ABS((a)), ABS((b))) * SIGN((a)) * SIGN((b)))
Thank you for helping me understand some things! I'll try turning the whole thing into functions and posting it back.
This
(((x) << 16) >> 16)
doesn't produce an unsigned 16-bit number, as you might expect. The type of this expression is the same as the type of x, which is int32_t (a signed integer). Indeed, if using any sensible (two's complement) C implementation, for x=58751:
x = 00000000000000001110010101111111
(x) << 16 = 11100101011111110000000000000000 (negative number)
(((x) << 16) >> 16) = 11111111111111111110010101111111 (negative number)
To extract the low 16 bits properly, use unsigned arithmetic:
((uint32_t)(x) & 0xffff)
or (preserving your style)
((uint32_t)(x) << 16 >> 16)
To get the high word, you have to use unsigned arithmetic too:
((uint32_t)(x) >> 16)
Also, the compiler might need help determining the range of this expression (to do optimizations):
(uint16_t)((uint32_t)(x) & 0xffff)
Some (all?) compilers are smart enough to do that by themselves though.
Also, as noted by doynax, the product of low word and high word is a 32-bit number (or 31-bit, but it doesn't matter). To shift it left by 16 bits, you have to cast it to a 64-bit type, just like you do it with the high words:
((int64_t)(LOW_WORD(a) * HIGH_WORD(b) + HIGH_WORD(a) * LOW_WORD(b)) << 16)
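As a quick sanity check (a sketch, not part of the original answer), the corrected macros reproduce 58751 * 58751 exactly, which the original LOW_WORD mangled via sign extension:
#include <stdio.h>
#include <stdint.h>
#define LOW_WORD(x) ((uint32_t)(x) & 0xffff)
#define HIGH_WORD(x) ((uint32_t)(x) >> 16)
#define UNSIGNED_MULT(a, b) \
(((int64_t)(LOW_WORD(a) * LOW_WORD(b)) << 0) + \
((int64_t)(LOW_WORD(a) * HIGH_WORD(b) + HIGH_WORD(a) * LOW_WORD(b)) << 16) + \
((int64_t)(HIGH_WORD(a) * HIGH_WORD(b)) << 32))
int main(void)
{
    int32_t x = 58751;
    /* both lines should print 3451680001 */
    printf("%lld\n", (long long)UNSIGNED_MULT(x, x));
    printf("%lld\n", (long long)x * x);
    return 0;
}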

Set last n bits in C

I want to set the last n bits of any given number to 1. I have a number (which is variable in its length) and a variable n.
Example:
12 (dec) set last 2 bits
Output: 15
Now the basic operation should be something like:
return 0b11 | 12;
But how can I make 0b11 variable in length?
Thank you!
Try this:
int SetLastBits(int value,int numOfBits)
{
return value | ((1<<numOfBits)-1);
}
You can set the last n bits of a number to 1 in the following manner:
int num = 5; // number of bits to set to 1
int val = <some_value>;
val |= (1 << num) - 1;
You can do it like this:
uint32_t set_last_n_bits(uint32_t x, uint32_t bits)
{
return x | ((1U << bits) - 1U);
}
This is also a relatively rare case where a macro might be justifiable, on the grounds that it would work with different integer types.
As all the others have shown the same approach, I will show one more:
int value;
//...
value |= ~( ~0u << n );
Here is a demonstrative program
#include <stdio.h>
int set_bits( int x, size_t n )
{
return x | ~( ~0u << n );
}
int main(void)
{
int x = 12;
printf( "%d\t%d\n", x, set_bits( x, 2 ) );
return 0;
}
The output is
12 15
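One caveat that applies to all the variants above: if n can equal the width of the type (for example n == 32 with a 32-bit int), the shift in (1 << n) or (~0u << n) is undefined behavior in C, so a robust version needs a guard. A sketch:
#include <limits.h>
#include <stdio.h>
unsigned set_last_n_bits(unsigned x, unsigned n)
{
    if (n >= sizeof x * CHAR_BIT)
        return ~0u; /* all bits set; avoids shifting by the full width */
    return x | ((1u << n) - 1u);
}
int main(void)
{
    printf("%u\n", set_last_n_bits(12u, 2)); /* 15 */
    printf("%#x\n", set_last_n_bits(0u, 32)); /* 0xffffffff */
    return 0;
}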

ANSI-C: maximum number of characters printing a decimal int

I'd like to know if there is an easy way of determining the maximum number of characters needed to print a decimal int.
I know <limits.h> contains definitions like INT_MAX that say the maximum value an int can assume, but it is not what I want.
I'd like to be able to do something like:
int get_int( void )
{
char draft[ MAX_CHAR_OF_A_DECIMAL_INT ];
fgets( draft, sizeof( draft ), stdin );
return strtol( draft, NULL, 10 );
}
But how to find the value of MAX_CHAR_OF_A_DECIMAL_INT in a portable and low-overhead way?
Thanks!
If you assume CHAR_BIT is 8 (required on POSIX, so a safe assumption for any code targeting POSIX systems, as well as any other mainstream system like Windows), a cheap safe formula is 3*sizeof(int)+2. If not, you can make it 3*sizeof(int)*CHAR_BIT/8+2, or there's a slightly simpler version.
In case you're interested in the reason this works, sizeof(int) is essentially a logarithm of INT_MAX (roughly log base 2^CHAR_BIT), and conversion between logarithms of different bases (e.g. to base 10) is just multiplication. In particular, 3 is an integer approximation/upper bound on log base 10 of 256.
The +2 is to account for a possible sign and null termination.
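Spelled out, that gives a macro in the shape the question asks for (a sketch assuming CHAR_BIT is 8):
#include <stdio.h>
#define MAX_CHAR_OF_A_DECIMAL_INT (3 * sizeof(int) + 2)
int main(void)
{
    char draft[MAX_CHAR_OF_A_DECIMAL_INT];
    /* with a 4-byte int this is 14 bytes: room for 10 digits, a sign and the NUL */
    printf("%zu\n", sizeof draft);
    return 0;
}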
The simplest canonical and arguably most portable way is to ask snprintf() how much space would be required:
char sbuf[2];
int ndigits;
ndigits = snprintf(sbuf, (size_t) 1, "%lld", (long long) INT_MIN);
slightly less portable perhaps, using intmax_t and %jd:
ndigits = snprintf(sbuf, (size_t) 1, "%jd", (intmax_t) INT_MIN);
One could consider that to be too expensive to do at runtime though, but it can work for any value, not just the MIN/MAX values of any integer type.
You could of course also just directly calculate the number of digits that a given integer would require to be expressed in Base 10 notation with a simple recursive function:
unsigned int
numCharsB10(intmax_t n)
{
if (n < 0)
return numCharsB10((n == INTMAX_MIN) ? INTMAX_MAX : -n) + 1;
if (n < 10)
return 1;
return 1 + numCharsB10(n / 10);
}
but that of course also requires CPU at runtime, even when inlined, though perhaps a little less than snprintf() does.
#R.'s answer above, though, is more or less wrong, but it is on the right track. Here's the correct derivation of some very well and widely tested and highly portable macros that implement the calculation at compile time using sizeof(), using a slight correction of #R.'s initial wording to start out:
First we can easily see (or show) that sizeof(int) is the log base 2 of UINT_MAX divided by the number of bits represented by one unit of sizeof() (8, aka CHAR_BIT):
sizeof(int) == log2(UINT_MAX) / 8
because UINT_MAX is of course just 2^(sizeof(int) * 8), and log2(x) is the inverse of 2^x.
We can use the identity "logb(x) = log(x) / log(b)" (where log() is the natural logarithm) to find logarithms of other bases. For example, you could compute the "log base 2" of "x" using:
log2(x) = log(x) / log(2)
and also:
log10(x) = log(x) / log(10)
So, we can deduce that:
log10(v) = log2(v) / log2(10)
Now what we want in the end is the log base 10 of UINT_MAX, so since log2(10) is approximately 3, and since we know from above what log2() is in terms of sizeof(), we can say that log10(UINT_MAX) is approximately:
log10(2^(sizeof(int)*8)) ~= (sizeof(int) * 8) / 3
That's not perfect though, especially since what we really want is the ceiling value, but with some minor adjustment to account for the integer rounding of log2(10) to 3, we can get what we need by first adding one to the log2 term, then subtracting 1 from the result for any larger-sized integer, resulting in this "good-enough" expression:
#if 0
#define __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) \
((((sizeof(t) * CHAR_BIT) + 1) / 3) - ((sizeof(t) > 2) ? 1 : 0))
#endif
Even better we can multiply our first log2() term by 1/log2(10) (multiplying by the reciprocal of the divisor is the same as dividing by the divisor), and doing so makes it possible to find a better integer approximation. I most recently (re?)encountered this suggestion while reading Sean Anderson's bithacks: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
To do this with integer math to the best approximation possible, we need to find the ideal ratio representing our reciprocal. This can be found by searching for the smallest fractional part of multiplying our desired value of 1/log2(10) by successive powers of 2, within some reasonable range of powers of 2, such as with the following little AWK script:
awk 'BEGIN {
minf=1.0
}
END {
for (i = 1; i <= 31; i++) {
a = 1.0 / (log(10) / log(2)) * 2^i
if (a > (2^32 / 32))
break;
n = int(a)
f = a - (n * 1.0)
if (f < minf) {
minf = f
minn = n
bits = i
}
# printf("a=%f, n=%d, f=%f, i=%d\n", a, n, f, i)
}
printf("%d + %f / %d, bits=%d\n", minn, minf, 2^bits, bits)
}' < /dev/null
1233 + 0.018862 / 4096, bits=12
So we can get a good integer approximation of multiplying our log2(v) value by 1/log2(10) by multiplying it by 1233 followed by a right-shift of 12 (2^12 is 4096 of course):
log10(UINT_MAX) ~= ((sizeof(int) * 8) + 1) * 1233 >> 12
and, together with adding one to do the equivalent of finding the ceiling value, that gets rid of the need to fiddle with odd values:
#define __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) \
(((((sizeof(t) * CHAR_BIT)) * 1233) >> 12) + 1)
/*
* for signed types we need room for the sign, except for int64_t
*/
#define __MAX_B10STRLEN_FOR_SIGNED_TYPE(t) \
(__MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) + ((sizeof(t) == 8) ? 0 : 1))
/*
* NOTE: this gives a warning (for unsigned types of int and larger) saying
* "comparison of unsigned expression < 0 is always false", and of course it
* is, but that's what we want to know (if indeed type 't' is unsigned)!
*/
#define __MAX_B10STRLEN_FOR_INT_TYPE(t) \
(((t) -1 < 0) ? __MAX_B10STRLEN_FOR_SIGNED_TYPE(t) \
: __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t))
Normally the compiler will evaluate at compile time the expression that my __MAX_B10STRLEN_FOR_INT_TYPE() macro becomes. Of course my macro always calculates the maximum space required by a given type of integer, not the exact space required by a particular integer value.
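For illustration, a buffer sized with these macros might be used like this (a sketch; the +1 assumes, per the comments above, that the macro counts digits and sign but not the terminating NUL):
#include <limits.h>
#include <stdio.h>
/* macros repeated from above */
#define __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) \
    (((((sizeof(t) * CHAR_BIT)) * 1233) >> 12) + 1)
#define __MAX_B10STRLEN_FOR_SIGNED_TYPE(t) \
    (__MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t) + ((sizeof(t) == 8) ? 0 : 1))
#define __MAX_B10STRLEN_FOR_INT_TYPE(t) \
    (((t) -1 < 0) ? __MAX_B10STRLEN_FOR_SIGNED_TYPE(t) \
                  : __MAX_B10STRLEN_FOR_UNSIGNED_TYPE(t))
int main(void)
{
    char buf[__MAX_B10STRLEN_FOR_INT_TYPE(int) + 1]; /* +1 for the NUL */
    snprintf(buf, sizeof buf, "%d", INT_MIN); /* worst case: sign plus all digits */
    printf("%s\n", buf);
    return 0;
}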
I don't know of any trick to do what you want in plain ANSI C, but in C++ you can easily use template metaprogramming:
#include <iostream>
#include <limits>
#include <climits>
template< typename T, unsigned long N = INT_MAX >
class MaxLen
{
public:
enum
{
StringLen = MaxLen< T, N / 10 >::StringLen + 1
};
};
template< typename T >
class MaxLen< T, 0 >
{
public:
enum
{
StringLen = 1
};
};
And you can call it from your pure C code by creating an additional C++ function like this:
extern "C"
int int_str_max( )
{
return MaxLen< int >::StringLen;
}
This has a ZERO execution time overhead and calculates the exact space needed.
You can test the above templates with something like:
int main( )
{
std::cout << "Max: " << std::numeric_limits< short >::max( ) << std::endl;
std::cout << "Digits: " << std::numeric_limits< short >::digits10 << std::endl;
std::cout << "A \"short\" is " << sizeof( short ) << " bytes." << std::endl
<< "A string large enough to fit any \"short\" is "
<< MaxLen< short, SHRT_MAX >::StringLen << " bytes wide." << std::endl;
std::cout << "Max: " << std::numeric_limits< int >::max( ) << std::endl;
std::cout << "Digits: " << std::numeric_limits< int >::digits10 << std::endl;
std::cout << "An \"int\" is " << sizeof( int ) << " bytes." << std::endl
<< "A string large enough to fit any \"int\" is "
<< MaxLen< int >::StringLen << " bytes wide." << std::endl;
std::cout << "Max: " << std::numeric_limits< long >::max( ) << std::endl;
std::cout << "Digits: " << std::numeric_limits< long >::digits10 << std::endl;
std::cout << "A \"long\" is " << sizeof( long ) << " bytes." << std::endl
<< "A string large enough to fit any \"long\" is "
<< MaxLen< long, LONG_MAX >::StringLen << " bytes wide." << std::endl;
return 0;
}
The output is:
Max: 32767
Digits: 4
A "short" is 2 bytes.
A string large enough to fit any "short" is 6 bytes wide.
Max: 2147483647
Digits: 9
An "int" is 4 bytes.
A string large enough to fit any "int" is 11 bytes wide.
Max: 9223372036854775807
Digits: 18
A "long" is 8 bytes.
A string large enough to fit any "long" is 20 bytes wide.
Note the slightly different values from std::numeric_limits< T >::digits10 and MaxLen< T, N >::StringLen, as the former does not take a digit into account if it can't reach '9'.
Of course you can use it and simply add two if you don't mind wasting a single byte in some cases.
EDIT:
Some may have found it weird to include <climits>.
If you can count on C++11, you won't need it, and will gain additional simplicity:
#include <iostream>
#include <limits>
template< typename T, unsigned long N = std::numeric_limits< T >::max( ) >
class MaxLen
{
public:
enum
{
StringLen = MaxLen< T, N / 10 >::StringLen + 1
};
};
template< typename T >
class MaxLen< T, 0 >
{
public:
enum
{
StringLen = 1
};
};
Now you can use
MaxLen< short >::StringLen
instead of
MaxLen< short, SHRT_MAX >::StringLen
Good, isn't it?
The maximum number of decimal digits d of a signed or unsigned integer x of b bits matches the number of decimal digits of the number 2^b.
In the case of signed numbers, an extra character must be added for the sign.
The number of decimal digits of x can be calculated as log_10(x), rounded up.
Therefore, the maximum number of decimal digits of x will be log_10(2^b) = b * log_10(2) = b * 0.301029995663981, rounded up.
If s is the size in bytes (given by the sizeof operator) of a certain type of integer used to store x, its size b in bits will be b = s * 8. So, the maximum number of decimal digits d will be (s * 8) * 0.301029995663981, rounded up.
Rounding up will consist of truncating (converting to an integer), and adding 1.
Of course, 1 will have to be added to all these constants to count the final 0 byte (see IntegerString in the following example).
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#define COMMON_LOG_OF_2 0.301029995663981
#define MAX_DECIMAL_DIGITS_UCHAR ((unsigned) (sizeof (unsigned char ) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_USHORT ((unsigned) (sizeof (unsigned short ) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_UINT ((unsigned) (sizeof (unsigned int ) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_ULONG ((unsigned) (sizeof (unsigned long ) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_ULONGLONG ((unsigned) (sizeof (unsigned long long) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_UINT128 ((unsigned) (sizeof (unsigned __int128 ) * 8 * COMMON_LOG_OF_2) + 1)
#define MAX_DECIMAL_DIGITS_CHAR (1 + MAX_DECIMAL_DIGITS_UCHAR )
#define MAX_DECIMAL_DIGITS_SHORT (1 + MAX_DECIMAL_DIGITS_USHORT )
#define MAX_DECIMAL_DIGITS_INT (1 + MAX_DECIMAL_DIGITS_UINT )
#define MAX_DECIMAL_DIGITS_LONG (1 + MAX_DECIMAL_DIGITS_ULONG )
#define MAX_DECIMAL_DIGITS_LONGLONG (1 + MAX_DECIMAL_DIGITS_ULONGLONG)
#define MAX_DECIMAL_DIGITS_INT128 (1 + MAX_DECIMAL_DIGITS_UINT128 )
int main (void)
{
char IntegerString[MAX_DECIMAL_DIGITS_INT + 1];
printf ("MAX_DECIMAL_DIGITS_UCHAR = %2u\n",MAX_DECIMAL_DIGITS_UCHAR );
printf ("MAX_DECIMAL_DIGITS_USHORT = %2u\n",MAX_DECIMAL_DIGITS_USHORT );
printf ("MAX_DECIMAL_DIGITS_UINT = %2u\n",MAX_DECIMAL_DIGITS_UINT );
printf ("MAX_DECIMAL_DIGITS_ULONG = %2u\n",MAX_DECIMAL_DIGITS_ULONG );
printf ("MAX_DECIMAL_DIGITS_ULONGLONG = %2u\n",MAX_DECIMAL_DIGITS_ULONGLONG);
printf ("MAX_DECIMAL_DIGITS_UINT128 = %2u\n",MAX_DECIMAL_DIGITS_UINT128 );
printf ("MAX_DECIMAL_DIGITS_CHAR = %2u\n",MAX_DECIMAL_DIGITS_CHAR );
printf ("MAX_DECIMAL_DIGITS_SHORT = %2u\n",MAX_DECIMAL_DIGITS_SHORT );
printf ("MAX_DECIMAL_DIGITS_INT = %2u\n",MAX_DECIMAL_DIGITS_INT );
printf ("MAX_DECIMAL_DIGITS_LONG = %2u\n",MAX_DECIMAL_DIGITS_LONG );
printf ("MAX_DECIMAL_DIGITS_LONGLONG = %2u\n",MAX_DECIMAL_DIGITS_LONGLONG );
printf ("MAX_DECIMAL_DIGITS_INT128 = %2u\n",MAX_DECIMAL_DIGITS_INT128 );
sprintf (IntegerString,"%d",INT_MAX);
printf ("INT_MAX = %d\n",INT_MAX);
printf ("IntegerString = %s\n",IntegerString);
sprintf (IntegerString,"%d",INT_MIN);
printf ("INT_MIN = %d\n",INT_MIN);
printf ("IntegerString = %s\n",IntegerString);
return EXIT_SUCCESS;
}
EDIT:
Unfortunately, the use of floating point may cause problems when evaluating the expressions as constants. I have modified them by multiplying by 2^11 and dividing by 2^8, so that all calculations are performed with integers at compile time:
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#define LOG2_x_2_11 616 // log10(2) * 2^11
#define MAX_DECIMAL_DIGITS_UCHAR (((sizeof (unsigned char ) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_USHORT (((sizeof (unsigned short ) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_UINT (((sizeof (unsigned int ) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_ULONG (((sizeof (unsigned long ) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_ULONGLONG (((sizeof (unsigned long long) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_UINT128 (((sizeof (unsigned __int128 ) * LOG2_x_2_11) >> 8) + 1)
#define MAX_DECIMAL_DIGITS_CHAR (1 + MAX_DECIMAL_DIGITS_UCHAR )
#define MAX_DECIMAL_DIGITS_SHORT (1 + MAX_DECIMAL_DIGITS_USHORT )
#define MAX_DECIMAL_DIGITS_INT (1 + MAX_DECIMAL_DIGITS_UINT )
#define MAX_DECIMAL_DIGITS_LONG (1 + MAX_DECIMAL_DIGITS_ULONG )
#define MAX_DECIMAL_DIGITS_LONGLONG (1 + MAX_DECIMAL_DIGITS_ULONGLONG)
#define MAX_DECIMAL_DIGITS_INT128 (1 + MAX_DECIMAL_DIGITS_UINT128 )
int main (void)
{
char IntegerString[MAX_DECIMAL_DIGITS_INT + 1];
printf ("MAX_DECIMAL_DIGITS_UCHAR = %2zu\n",MAX_DECIMAL_DIGITS_UCHAR );
printf ("MAX_DECIMAL_DIGITS_USHORT = %2zu\n",MAX_DECIMAL_DIGITS_USHORT );
printf ("MAX_DECIMAL_DIGITS_UINT = %2zu\n",MAX_DECIMAL_DIGITS_UINT );
printf ("MAX_DECIMAL_DIGITS_ULONG = %2zu\n",MAX_DECIMAL_DIGITS_ULONG );
printf ("MAX_DECIMAL_DIGITS_ULONGLONG = %2zu\n",MAX_DECIMAL_DIGITS_ULONGLONG);
printf ("MAX_DECIMAL_DIGITS_UINT128 = %2zu\n",MAX_DECIMAL_DIGITS_UINT128 );
printf ("MAX_DECIMAL_DIGITS_CHAR = %2zu\n",MAX_DECIMAL_DIGITS_CHAR );
printf ("MAX_DECIMAL_DIGITS_SHORT = %2zu\n",MAX_DECIMAL_DIGITS_SHORT );
printf ("MAX_DECIMAL_DIGITS_INT = %2zu\n",MAX_DECIMAL_DIGITS_INT );
printf ("MAX_DECIMAL_DIGITS_LONG = %2zu\n",MAX_DECIMAL_DIGITS_LONG );
printf ("MAX_DECIMAL_DIGITS_LONGLONG = %2zu\n",MAX_DECIMAL_DIGITS_LONGLONG );
printf ("MAX_DECIMAL_DIGITS_INT128 = %2zu\n",MAX_DECIMAL_DIGITS_INT128 );
sprintf (IntegerString,"%d",INT_MAX);
printf ("INT_MAX = %d\n",INT_MAX);
printf ("IntegerString = %s\n",IntegerString);
sprintf (IntegerString,"%d",INT_MIN);
printf ("INT_MIN = %d\n",INT_MIN);
printf ("IntegerString = %s\n",IntegerString);
return EXIT_SUCCESS;
}
After accept answer (2+ yr)
The following fraction 10/33 exactly meets the needs for unpadded int8_t, int16_t, int32_t and int128_t. Only 1 char extra for int64_t. Exact or 1 over for all integer sizes up to int362_t. Beyond that, it may be more than 1 over.
#include <limits.h>
#define MAX_CHAR_LEN_DECIMAL_INTEGER(type) (10*sizeof(type)*CHAR_BIT/33 + 2)
#define MAX_CHAR_SIZE_DECIMAL_INTEGER(type) (10*sizeof(type)*CHAR_BIT/33 + 3)
int get_int( void ) {
// + 1 for the \n of fgets()
char draft[MAX_CHAR_SIZE_DECIMAL_INTEGER(long) + 1]; //**
fgets(draft, sizeof draft, stdin);
return strtol(draft, NULL, 10);
}
** fgets() typically works best with an additional char for the terminating '\n'.
Similar to #R.. but with a better fraction.
Recommend using generous, 2x, buffers when reading user input. Sometimes a user adds spaces, leading zeros, etc.
char draft[2*(MAX_CHAR_SIZE_DECIMAL_INTEGER(long) + 1)];
fgets(draft, sizeof draft, stdin);
In C++11 and later, you can do the following:
#include <algorithm>
#include <cstddef>
#include <limits>
#include <type_traits>
namespace details {
template<typename T>
constexpr size_t max_to_string_length_impl(T value) {
return (value >= 0 && value < 10) ? 1 // [0..9] -> 1
: (std::is_signed<T>::value && value < 0 && value > -10) ? 2 // [-9..-1] -> 2
: 1 + max_to_string_length_impl(value / 10); // ..-10] [10.. -> recursion
}
}
template<typename T>
constexpr size_t max_to_string_length() {
return std::max(
details::max_to_string_length_impl(std::numeric_limits<T>::max()),
details::max_to_string_length_impl(std::numeric_limits<T>::min()));
}
You can calculate the number of digits using log base 10. On my system, calculating the ceiling of log base 2 using the bit representation of the number didn't provide any significant gain in speed. The floor of log base 10, plus 1, gives the number of digits; I add 2 to account for the null character and sign.
#include <limits.h>
#include <stdio.h>
#include <math.h>
int main(void){
printf("%d %d\n", INT_MAX, (int)floor(log10(INT_MAX)) + 3);
return 0;
}
Also note that the number of bytes of an int can be 2 or 4 (it is 2 only on old systems), so you could calculate the upper bound and use it in your program.
Here's the C version:
#include <limits.h>
#define xstr(s) str(s)
#define str(s) #s
#define INT_STR_MAX sizeof(xstr(INT_MAX))
char buffer[INT_STR_MAX];
Then:
$ gcc -E -o str.cpp str.c
$ grep buffer str.cpp
char buffer[sizeof("2147483647")];
$ gcc -S -o str.S str.c
$ grep buffer str.S
.comm buffer,11,1
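One caveat worth adding: sizeof(xstr(INT_MAX)) counts the digits of INT_MAX plus the terminating NUL, but INT_MIN needs one more character for the '-' sign, and stringizing INT_MIN directly is unreliable because it is typically defined as an expression such as (-2147483647 - 1). A safe variant just adds one:
#include <limits.h>
#define xstr(s) str(s)
#define str(s) #s
#define INT_STR_MAX (sizeof(xstr(INT_MAX)) + 1) /* +1 for the '-' of INT_MIN */
char buffer[INT_STR_MAX];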

Rounding up to next power of 2

I want to write a function that returns the nearest next power of 2 number. For example if my input is 789, the output should be 1024. Is there any way of achieving this without using any loops but just using some bitwise operators?
Check the Bit Twiddling Hacks. You need to get the base 2 logarithm, then add 1 to that. Example for a 32-bit value:
Round up to the next highest power of 2
unsigned int v; // compute the next highest power of 2 of 32-bit v
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
The extension to other widths should be obvious.
next = pow(2, ceil(log(x)/log(2)));
This works by finding the number you'd have to raise 2 by to get x (take the log of the number, and divide by the log of the desired base; see wikipedia for more). Then round that up with ceil to get the nearest whole number power.
This is a more general purpose (i.e. slower!) method than the bitwise methods linked elsewhere, but good to know the maths, eh?
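Wrapped into a minimal runnable C sketch of the same formula (link with -lm; note that floating-point rounding can bite for very large inputs):
#include <math.h>
#include <stdio.h>
int main(void)
{
    double x = 789;
    /* log(x)/log(2) is log2(x); ceil rounds up to the next whole exponent */
    double next = pow(2, ceil(log(x) / log(2)));
    printf("%.0f\n", next); /* 1024 */
    return 0;
}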
I think this works, too:
int power = 1;
while(power < x)
power*=2;
And the answer is power.
unsigned long upper_power_of_two(unsigned long v)
{
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
return v;
}
If you're using GCC, you might want to have a look at Optimizing the next_pow2() function by Lockless Inc. This page describes a way to use the built-in function __builtin_clz() (count leading zeros) and later directly use the x86 (ia32) assembler instruction bsr (bit scan reverse), just like it's described in another answer's link to the gamedev site. This code might be faster than those described in previous answers.
By the way, if you're not going to use assembler instructions and 64-bit data types, you can use this
/**
* return the smallest power of two value
* greater than x
*
* Input range: [2..2147483648]
* Output range: [2..2147483648]
*
*/
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 1);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
return 1 << (32 - __builtin_clz (x - 1));
}
One more approach: although it uses a loop, it is much faster than the math-function variants.
power of two "floor" option:
int power = 1;
while (x >>= 1) power <<= 1;
power of two "ceil" option:
int power = 2;
x--; // <<-- UPDATED
while (x >>= 1) power <<= 1;
UPDATE
As mentioned in the comments, there was a mistake in the ceil version, where its result was wrong.
Here are full functions:
unsigned power_floor(unsigned x) {
int power = 1;
while (x >>= 1) power <<= 1;
return power;
}
unsigned power_ceil(unsigned x) {
if (x <= 1) return 1;
int power = 2;
x--;
while (x >>= 1) power <<= 1;
return power;
}
In standard c++20 this is included in <bit>.
The answer is simply
#include <bit>
unsigned long upper_power_of_two(unsigned long v)
{
return std::bit_ceil(v);
}
NOTE:
The solution I gave is for C++, not C. I would have given an answer to this question instead, but it was closed as a duplicate of this one!
For any unsigned type, building on the Bit Twiddling Hacks:
#include <climits>
#include <type_traits>
template <typename UnsignedType>
UnsignedType round_up_to_power_of_2(UnsignedType v) {
static_assert(std::is_unsigned<UnsignedType>::value, "Only works for unsigned types");
v--;
for (size_t i = 1; i < sizeof(v) * CHAR_BIT; i *= 2) //Prefer size_t "Warning comparison between signed and unsigned integer"
{
v |= v >> i;
}
return ++v;
}
There isn't really a loop there as the compiler knows at compile time the number of iterations.
Although the question is tagged c, here are my five cents. Luckily for us, C++20 includes this (see here): it was originally proposed as std::ceil2 and std::floor2, and standardized as std::bit_ceil and std::bit_floor. They are constexpr template functions; the current GCC implementation uses bit shifting and works with any unsigned integral type.
For IEEE floats you'd be able to do something like this.
int next_power_of_two(float a_F){
int f = *(int*)&a_F;
int b = f << 9 != 0; // If we're a power of two this is 0, otherwise this is 1
f >>= 23; // remove fractional part of floating point number
f -= 127; // subtract 127 (the bias) from the exponent
// adds one to the exponent if were not a power of two,
// then raises our new exponent to the power of two again.
return (1 << (f + b));
}
If you need an integer solution and you're able to use inline assembly, BSR will give you the log2 of an integer on x86. It returns the position of the highest set bit, which is exactly equal to the floor of the log2 of that number. Other processors (often) have similar instructions, such as CLZ, and depending on your compiler there might be an intrinsic available to do the work for you.
Here's my solution in C. Hope this helps!
int next_power_of_two(int n) {
int i = 0;
for (--n; n > 0; n >>= 1) {
i++;
}
return 1 << i;
}
In x86 you can use the sse4 bit manipulation instructions to make it fast.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax //cycle 2
sub ecx,eax
mov eax,1
cmp edx,1 //cycle 3
jle #done //cycle 4 - popcnt says its a power of 2, return input unchanged
shl eax,cl //cycle 5
#done: rep ret //cycle 5
In c you can use the matching intrinsics.
Or jumpless, which speeds up things by avoiding a misprediction due to a jump, but slows things down by lengthening the dependency chain. Time the code to see which works best for you.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax
sub ecx,eax
mov eax,1 //cycle 2
cmp edx,1
mov edx,0 //cycle 3
cmovle ecx,edx //cycle 4 - ensure eax does not change
shl eax,cl
#done: rep ret //cycle 5
/*
** http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog
*/
#define __LOG2A(s) ((s &0xffffffff00000000) ? (32 +__LOG2B(s >>32)): (__LOG2B(s)))
#define __LOG2B(s) ((s &0xffff0000) ? (16 +__LOG2C(s >>16)): (__LOG2C(s)))
#define __LOG2C(s) ((s &0xff00) ? (8 +__LOG2D(s >>8)) : (__LOG2D(s)))
#define __LOG2D(s) ((s &0xf0) ? (4 +__LOG2E(s >>4)) : (__LOG2E(s)))
#define __LOG2E(s) ((s &0xc) ? (2 +__LOG2F(s >>2)) : (__LOG2F(s)))
#define __LOG2F(s) ((s &0x2) ? (1) : (0))
#define LOG2_UINT64 __LOG2A
#define LOG2_UINT32 __LOG2B
#define LOG2_UINT16 __LOG2C
#define LOG2_UINT8 __LOG2D
static inline uint64_t
next_power_of_2(uint64_t i)
{
#if defined(__GNUC__)
return 1UL <<(1 +(63 -__builtin_clzl(i -1)));
#else
i =i -1;
i =LOG2_UINT64(i);
return 1UL <<(1 +i);
#endif
}
If you do not want to venture into the realm of undefined behaviour, the input value must be between 1 and 2^63. The macro is also useful to set constants at compile time.
For completeness here is a floating-point implementation in bog-standard C.
double next_power_of_two(double value) {
int exp;
if(frexp(value, &exp) == 0.5) {
// Omit this case to round precise powers of two up to the *next* power
return value;
}
return ldexp(1.0, exp);
}
An efficient Microsoft (e.g., Visual Studio 2017) specific solution in C / C++ for integer input. Handles the case of the input exactly matching a power of two value by decrementing before checking the location of the most significant 1 bit.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, Value - 1);
return (1U << (Index + 1));
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#if defined(WIN64) // The _BitScanReverse64 intrinsic is only available for 64 bit builds because it depends on x64
inline unsigned long long ExpandToPowerOf2(unsigned long long Value)
{
unsigned long Index;
_BitScanReverse64(&Index, Value - 1);
return (1ULL << (Index + 1));
}
#endif
This generates 5 or so inlined instructions for an Intel processor similar to the following:
dec eax
bsr rcx, rax
inc ecx
mov eax, 1
shl rax, cl
Apparently the Visual Studio C++ compiler isn't coded to optimize this for compile-time values, but it's not like there are a whole lot of instructions there.
Edit:
If you want an input value of 1 to yield 1 (2 to the zeroth power), a small modification to the above code still generates straight through instructions with no branch.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, --Value);
if (Value == 0)
Index = (unsigned long) -1;
return (1U << (Index + 1));
}
Generates just a few more instructions. The trick is that Index can be replaced by a test followed by a cmove instruction.
Trying to make an "ultimate" solution for this. The following code
is targeted for C language (not C++),
uses compiler built-ins to yield efficient code (CLZ or BSR instruction) if compiler supports any,
is portable (standard C and no assembly) with the exception of built-ins, and
addresses all undefined behaviors.
If you're writing in C++, you may adjust the code appropriately. Note that C++20 introduces std::bit_ceil which does the exact same thing except the behavior may be undefined on certain conditions.
#include <limits.h>
#ifdef _MSC_VER
# if _MSC_VER >= 1400
/* _BitScanReverse is introduced in Visual C++ 2005 and requires
<intrin.h> (also introduced in Visual C++ 2005). */
#include <intrin.h>
#pragma intrinsic(_BitScanReverse)
#pragma intrinsic(_BitScanReverse64)
# define HAVE_BITSCANREVERSE 1
# endif
#endif
/* Macro indicating that the compiler supports __builtin_clz().
The name HAVE_BUILTIN_CLZ seems to be the most common, but in some
projects HAVE__BUILTIN_CLZ is used instead. */
#ifdef __has_builtin
# if __has_builtin(__builtin_clz)
# define HAVE_BUILTIN_CLZ 1
# endif
#elif defined(__GNUC__)
# if (__GNUC__ > 3)
# define HAVE_BUILTIN_CLZ 1
# elif defined(__GNUC_MINOR__)
# if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
# define HAVE_BUILTIN_CLZ 1
# endif
# endif
#endif
/**
* Returns the smallest power of two that is not smaller than x.
*/
unsigned long int next_power_of_2_long(unsigned long int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (ULONG_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1UL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULONG_MAX >> 1)) {
return 0;
}
return (1UL << (sizeof(x) * CHAR_BIT - __builtin_clzl(x)));
#else
/* Solution from "Bit Twiddling Hacks"
<http://www.graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2>
but converted to a loop for smaller code size.
("gcc -O3" will unroll this.) */
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned int next_power_of_2(unsigned int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (UINT_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1U << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (UINT_MAX >> 1)) {
return 0;
}
return (1U << (sizeof(x) * CHAR_BIT - __builtin_clz(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned long long next_power_of_2_long_long(unsigned long long x)
{
if (x <= 1) {
return 1;
}
x--;
#if (defined(HAVE_BITSCANREVERSE) && \
ULLONG_MAX == 18446744073709551615ULL)
if (x > (ULLONG_MAX >> 1)) {
return 0;
} else {
/* assert(sizeof(__int64) == sizeof(long long)); */
unsigned long int index;
(void) _BitScanReverse64(&index, x);
return (1ULL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULLONG_MAX >> 1)) {
return 0;
}
return (1ULL << (sizeof(x) * CHAR_BIT - __builtin_clzll(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
Portable solution in C#:
int GetNextPowerOfTwo(int input) {
return 1 << (int)Math.Ceiling(Math.Log2(input));
}
Math.Ceiling(Math.Log2(value)) calculates the exponent of the next power of two, the 1 << calculates the real value through bitshifting.
Faster solution if you have .NET Core 3 or above:
uint GetNextPowerOfTwoFaster(uint input) {
return (uint)1 << (sizeof(uint) * 8 - System.Numerics.BitOperations.LeadingZeroCount(input - 1));
}
This uses System.Numerics.BitOperations.LeadingZeroCount() which uses a hardware instruction if available:
https://github.com/dotnet/corert/blob/master/src/System.Private.CoreLib/shared/System/Numerics/BitOperations.cs
Update:
RoundUpToPowerOf2() is coming in .NET 6! The internal implementation is mostly the same as the .NET Core 3 solution above.
Here's the community update.
You might find the following clarification to be helpful towards your purpose:
constexpr version of clp2 for C++14
#include <iostream>
#include <type_traits>
// Closest least power of 2 minus 1. Returns 0 if n = 0.
template <typename UInt, std::enable_if_t<std::is_unsigned<UInt>::value,int> = 0>
constexpr UInt clp2m1(UInt n, unsigned i = 1) noexcept
{ return i < sizeof(UInt) * 8 ? clp2m1(UInt(n | (n >> i)),i << 1) : n; }
/// Closest least power of 2 minus 1. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value && std::is_signed<Int>::value,int> = 0>
constexpr auto clp2m1(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n <= 0 ? 0 : n)); }
/// Closest least power of 2. Returns 2^N: 2^(N-1) < n <= 2^N. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto clp2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n-1)) + 1; }
/// Next power of 2. Returns 2^N: 2^(N-1) <= n < 2^N. Returns 1 if n = 0. Returns 0 if n < 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto np2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n)) + 1; }
template <typename T>
void test(T v) { std::cout << clp2(v) << std::endl; }
int main()
{
test(-5); // 0
test(0); // 0
test(8); // 8
test(31); // 32
test(33); // 64
test(789); // 1024
test(char(260)); // 4
test(unsigned(-1) - 1); // 0
test<long long>(unsigned(-1) - 1); // 4294967296
return 0;
}
Many processor architectures support log base 2 or a very similar operation, count leading zeros. Many compilers have intrinsics for it. See https://en.wikipedia.org/wiki/Find_first_set
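As a sketch of what that looks like with the GCC/Clang intrinsic (__builtin_clz is undefined for 0, hence the guard; the result is only meaningful for x up to 2^31, since beyond that the shift count reaches the type width):
#include <limits.h>
unsigned next_pow2(unsigned x)
{
    if (x <= 1)
        return 1;
    /* clz of x-1 gives the exponent of the next power of two */
    return 1u << (sizeof x * CHAR_BIT - __builtin_clz(x - 1));
}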
Assuming you have a good compiler that can do the bit twiddling beforehand (which is above me at this point), this works!!!
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidentally came up w/ this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
Test code below:
#include <cstdint>
#include <cstdio>
#include <iostream>
using namespace std;
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidentally guessed this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
#define SZ4 FLOG2(4)
#define SZ6 FLOG2(6)
#define SZ7 FLOG2(7)
#define SZ8 FLOG2(8)
#define SZ9 FLOG2(9)
#define SZ16 FLOG2(16)
#define SZ17 FLOG2(17)
#define SZ127 FLOG2(127)
#define SZ1023 FLOG2(1023)
#define SZ1024 FLOG2(1024)
#define SZ2_17 FLOG2((1ul << 17)) //
#define SZ_LOG2 FLOG2(SZ)
#define DBG_PRINT(x) do { std::printf("Line:%-4d" " %10s = %-10d\n", __LINE__, #x, x); } while(0);
uint32_t arrTble[FLOG2(63)];
int main(){
int8_t n;
DBG_PRINT(SZ4);
DBG_PRINT(SZ6);
DBG_PRINT(SZ7);
DBG_PRINT(SZ8);
DBG_PRINT(SZ9);
DBG_PRINT(SZ16);
DBG_PRINT(SZ17);
DBG_PRINT(SZ127);
DBG_PRINT(SZ1023);
DBG_PRINT(SZ1024);
DBG_PRINT(SZ2_17);
return(0);
}
Outputs:
Line:39 SZ4 = 2
Line:40 SZ6 = 3
Line:41 SZ7 = 3
Line:42 SZ8 = 3
Line:43 SZ9 = 4
Line:44 SZ16 = 4
Line:45 SZ17 = 5
Line:46 SZ127 = 7
Line:47 SZ1023 = 10
Line:48 SZ1024 = 10
Line:49 SZ2_17 = 17
I was trying to get the nearest lower power of 2 and made this function. May it help you. Just multiply the nearest lower power by 2 to get the nearest upper power of 2:
int nearest_upper_power(int number){
while((number & (number-1)) != 0){
number &= number-1; // clear the lowest set bit; when the loop ends only the top bit remains
}
//Here number is closest lower power
number*=2;
return number;
}
Adapting Paul Dixon's answer to Excel, this works perfectly.
=POWER(2,CEILING.MATH(LOG(A1)/LOG(2)))
A variant of #YannDroneaud's answer, valid for x==1, only for x86 platforms, and for the gcc or clang compilers:
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 0);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
int clz;
uint32_t xm1 = x-1;
asm(
"lzcnt %1,%0"
:"=r" (clz)
:"rm" (xm1)
:"cc"
);
return 1 << (32 - clz);
}
Here is what I'm using to have this be a constant expression, if the input is a constant expression.
#define uptopow2_0(v) ((v) - 1)
#define uptopow2_1(v) (uptopow2_0(v) | uptopow2_0(v) >> 1)
#define uptopow2_2(v) (uptopow2_1(v) | uptopow2_1(v) >> 2)
#define uptopow2_3(v) (uptopow2_2(v) | uptopow2_2(v) >> 4)
#define uptopow2_4(v) (uptopow2_3(v) | uptopow2_3(v) >> 8)
#define uptopow2_5(v) (uptopow2_4(v) | uptopow2_4(v) >> 16)
#define uptopow2(v) (uptopow2_5(v) + 1) /* this is the one programmer uses */
So for instance, an expression like:
uptopow2(sizeof (struct foo))
will nicely reduce to a constant.
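For instance, with a hypothetical struct foo it can size a static buffer at compile time:
struct foo { char bytes[37]; };
static unsigned char pool[uptopow2(sizeof (struct foo))]; /* 64 bytes */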
The g++ compiler provides a builtin function __builtin_clz that counts leading zeros:
So we could do:
int nextPowerOfTwo(unsigned int x) {
return 1 << (sizeof(x)*8 - __builtin_clz(x));
}
int main () {
std::cout << nextPowerOfTwo(7) << std::endl;
std::cout << nextPowerOfTwo(31) << std::endl;
std::cout << nextPowerOfTwo(33) << std::endl;
std::cout << nextPowerOfTwo(8) << std::endl;
std::cout << nextPowerOfTwo(91) << std::endl;
return 0;
}
Results:
8
32
64
16
128
But note that, for x == 0, the return value of __builtin_clz is undefined.
If you need it for OpenGL related stuff:
/* Compute the nearest power of 2 number that is
* less than or equal to the value passed in.
*/
static GLuint
nearestPower( GLuint value )
{
int i = 1;
if (value == 0) return -1; /* Error! */
for (;;) {
if (value == 1) return i;
else if (value == 3) return i*4;
value >>= 1; i *= 2;
}
}
Convert it to a float and then use .hex() which shows the normalized IEEE representation.
>>> float(789).hex()
'0x1.8a80000000000p+9'
Then just extract the exponent and add 1.
>>> int(float(789).hex().split('p+')[1]) + 1
10
And raise 2 to this power.
>>> 2 ** (int(float(789).hex().split('p+')[1]) + 1)
1024
from math import ceil, log2
pot_ceil = lambda N: 0x1 << ceil(log2(N))
Test:
for i in range(1, 11):
print(i, pot_ceil(i))
Output:
1 1
2 2
3 4
4 4
5 8
6 8
7 8
8 8
9 16
10 16
import sys
def is_power2(x):
return x > 0 and ((x & (x - 1)) == 0)
def find_nearest_power2(x):
if x <= 0:
raise ValueError("invalid input")
if is_power2(x):
return x
else:
bits = get_bits(x)
upper = 1 << (bits)
lower = 1 << (bits - 1)
mid = (upper + lower) // 2
if (x - mid) > 0:
return upper
else:
return lower
def get_bits(x):
"""return number of bits in binary representation"""
if x < 0:
raise ValueError("invalid input: input should be positive integer")
count = 0
while (x != 0):
try:
x = x >> 1
except TypeError as error:
print(error, "input should be of type integer")
sys.exit(1)
count += 1
return count
