Closest power of 2 - c

Given an unsigned integer a (less than or equal to 1024), I need to find a number p that satisfies the following conditions:
lowest p >= a
p is a power of 2
I'm sure there is a better solution, using bitwise operators.
Do you have a better solution?
unsigned int closest_pow2(unsigned int a)
{
if (a == 0 || a > 1024) return 0; //error, never happen
if (a == 1) return 1;
if (a == 2) return 2;
if (a <= 4) return 4;
if (a <= 8) return 8;
if (a <= 16) return 16;
if (a <= 32) return 32;
if (a <= 64) return 64;
if (a <= 128) return 128;
if (a <= 256) return 256;
if (a <= 512) return 512;
if (a <= 1024) return 1024;
}

The following does it without the relatively expensive conditional statements or loops:
unsigned next_power_of_two(unsigned int x) {
x = x - 1;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
return x + 1;
}
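The shifts above stop at 8, so they smear at most 16 bits, which is plenty for a <= 1024. As a sketch of the same idea for the full 32-bit range (the name next_pow2_u32 and the use of uint32_t are my assumptions for illustration, not part of the answer above), one more shift by 16 is needed:

```c
#include <stdint.h>

/* Smallest power of two >= x for 1 <= x <= 2^31.
 * Returns 0 when x == 0 or x > 2^31 (no 32-bit answer exists). */
static uint32_t next_pow2_u32(uint32_t x)
{
    x = x - 1;        /* so that exact powers of two map to themselves */
    x |= x >> 1;      /* copy the highest set bit into every lower position */
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;     /* extra step needed for values above 2^16 */
    return x + 1;     /* 0...011...1 + 1 is the next power of two */
}
```

The x - 1 at the start is what keeps an exact power of two from being doubled.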

If this is a trick question (since I see no requirement that you must find the lowest p >= a), then this is a solution:
return 1024;

Stylistically, I prefer not to use bitwise operators because they tend to make the code harder to read--they encapsulate the bit structure much less than other types of commands. Even without bitwise operators, the code could be made much more concise:
int pow = 1;
if (a == 0 || a > 1024) return 0;
while (pow < 2000) {
if (a <= pow) return pow;
pow *= 2;
}
If you don't want to hardcode a loop bound (2000) that is merely larger than the largest allowed value (probably a better coding practice anyway), you can write it as follows:
static const unsigned int MAX_POSSIBLE_BIT_VALUE = 1024; /* C has no 'final'; use const */
unsigned int closest_pow2(unsigned int a) {
if (a == 0 || a > MAX_POSSIBLE_BIT_VALUE) return 0;
unsigned int pow = 1;
while (pow <= MAX_POSSIBLE_BIT_VALUE) {
if (a <= pow) return pow;
pow *= 2;
}
return pow;
}

Related

C - convert number string to SQL_NUMERIC_STRUCT

I need help converting a number string to a SQL_NUMERIC_STRUCT value to use decimal and numeric database data types. The SQL_NUMERIC_STRUCT value is a 16-byte hexadecimal unsigned integer. For example, I have a string "12312312899012312890522341231231232198" that contains 38 digits (the maximum for SQL Server decimal or numeric data types). In other languages such as C# there is a built-in conversion function, but my Visual Studio 2019 does not allow me to directly use 128-bit integers in the C++ environment. Unfortunately, the Microsoft help page only offers an example with a small, 2-byte integer.
I have found a solution.
bool ConvertToNumericStruct (char* s, SQL_NUMERIC_STRUCT* v){
int sc = (int)strlen(s), scale = 0, i,y, z;
char c, p = 0, d; bool minus = false;
int _tmp, x, carryover;
memset(v->val, 0, 16);
for (i = 0; i < sc; i++) {
c = s[i];
if (i == 0 && c == '-')minus = true;
else if (c == '.') { if (scale == 0)scale = sc - i - 1; else return false; }
else if (c < '0' || c>'9') return false;
else
{
if (p > 38) return false;
d = c - 48;
_tmp = 0;
carryover = d;
y = 0; z = 0;
for (x = sc - 1; x > -1; x--)
{
if (y % 2 == 1)
{
_tmp = (v->val[z] >> 4) * 10 + carryover;
v->val[z] &= 0x0F;
v->val[z] |= ((_tmp % 16) << 4 & 0xF0);
z++;
if (z > 15) break;
}
else {
_tmp = (v->val[z] & 0x0F) * 10 + carryover;
v->val[z] &= 0Xf0;
v->val[z] |= ((_tmp % 16) & 0x0F);
}
y++;
carryover = _tmp / 16;
}
p++;
}
}
v->precision = p;
v->scale = scale;
if (minus) v->sign = 0; else v->sign = 1;
return true;}
If you want to insert data defined as decimal or numeric into a database such as MySQL via unixODBC with the function SQLBindParameter, you can just use SQL_C_CHAR for fCType and SQL_CHAR for fSqlType with a char-string buffer. There is no need to convert; that is done implicitly.

Fast multiplication modulo 2^16 + 1

The IDEA cipher uses multiplication modulo 2^16 + 1. Is there an algorithm to perform this operation without general modulo operator (only modulo 2^16 (truncation))? In the context of IDEA, zero is interpreted as 2^16 (it means zero isn't an argument of our multiplication and it cannot be the result, so we can save one bit and store value 2^16 as bit pattern 0000000000000000). I am wondering how to implement it efficiently (or whether it is possible at all) without using the standard modulo operator.
You can use the fact that N - 1 is congruent to -1 modulo N.
Thus, (65536 * a) % 65537 == -a % 65537.
Also, -a % 65537 == -a + 1 (mod 65536), when 0 is interpreted as 65536.
uint16_t fastmod65537(uint16_t a, uint16_t b)
{
uint32_t c;
uint16_t hi, lo;
if (a == 0)
return -b + 1;
if (b == 0)
return -a + 1;
c = (uint32_t)a * (uint32_t)b;
hi = c >> 16;
lo = c;
if (lo > hi)
return lo-hi;
return lo-hi+1;
}
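The only problem here is the case hi == lo, where the true result would be 0. Luckily, a test suite confirms that this cannot actually happen (hi == lo would make the product a multiple of 65537, which is prime, while both factors lie in 1..65536):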
#include <stdio.h>
#include <stdint.h>
int main()
{
    uint64_t a, b;
    /* Exhaustive check; 65536 is passed as the 16-bit pattern 0. */
    for (a = 1; a <= 65536; a++)
        for (b = 1; b <= 65536; b++)
        {
            uint64_t c = a*b;
            uint32_t d = (c % 65537) & 65535;
            uint32_t e = fastmod65537(a & 65535, b & 65535);
            if (d != e)
                printf("a * b %% 65537 != fastmod65537(%u, %u) real=%u fastmod65537()=%u\n",
                       (uint32_t)a, (uint32_t)b, d, e);
        }
}
Output: none
First, the case where either a or b is zero. In that case, it is interpreted as having the value 2^16, therefore elementary modulo arithmetic tells us that:
result = -a - b + 1;
This works because (in the context of IDEA) the multiplicative inverse of 2^16 is still 2^16, and its lowest 16 bits are all zeroes.
The general case is much easier than it seems, now that we took care of the "0" special case (2^16+1 is 0x10001):
/* This operation can overflow: */
unsigned result = (product & 0xFFFF) - (product >> 16);
/* ..so account for cases of overflow: */
result -= result >> 16;
Putting it together:
/* All types must be sufficiently wide unsigned, e.g. uint32_t: */
unsigned long long product = a * b;
if (product == 0) {
return -a - b + 1;
} else {
result = (product & 0xFFFF) - (product >> 16);
result -= result >> 16;
return result & 0xFFFF;
}
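For reference, the fragments above can be assembled into one self-contained function. This is only a sketch: the name idea_mul and the choice of uint16_t/uint32_t are my assumptions, picked so that the "sufficiently wide unsigned" requirement is explicit.

```c
#include <stdint.h>

/* Multiply modulo 65537, with IDEA's convention that the
 * 16-bit pattern 0 stands for 65536. */
static uint16_t idea_mul(uint16_t a, uint16_t b)
{
    /* Widen before multiplying so the product cannot overflow int. */
    uint32_t product = (uint32_t)a * (uint32_t)b;

    if (product == 0) {
        /* a or b stands for 65536, which is -1 (mod 65537),
         * so the result is 1 - a - b reduced to 16 bits. */
        return (uint16_t)(1u - a - b);
    }

    /* product = hi * 65536 + lo, and 65536 is -1 (mod 65537),
     * so product is congruent to lo - hi. The subtraction may wrap. */
    uint32_t result = (product & 0xFFFFu) - (product >> 16);

    /* If it wrapped, the top 16 bits are all ones; subtracting 0xFFFF
     * adds 1 modulo 2^16, i.e. it adds 65537 to the negative value. */
    result -= result >> 16;

    return (uint16_t)result;
}
```

The early cast matters: with plain uint16_t operands the multiplication would be carried out in int, which can overflow for large inputs.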

Bitwise operations equivalent of greater than operator

I am working on a function that will essentially see which of two ints is larger. The parameters that are passed are 2 32-bit ints. The trick is the only operators allowed are ! ~ | & << >> ^ (no casting, other data types besides signed int, *, /, -, etc..).
My idea so far is to ^ the two binaries together to see all the positions of the 1 values that they don't share. What I want to do is then take that value and isolate the 1 farthest to the left, then see which of them has that bit set. That value will then be the larger.
(Say we use 8-bit ints instead of 32-bit).
If the two values passed were 01011011 and 01101001
I used ^ on them to get 00110010.
I then want to make it 00100000 in other words 01xxxxxx -> 01000000
Then & it with the first number
!! the result and return it.
If it is 1, then the first # is larger.
Any thoughts on how to 01xxxxxx -> 01000000 or anything else to help?
Forgot to note: no ifs, whiles, fors etc...
Here's a loop-free version which compares unsigned integers in O(lg b) operations where b is the word size of the machine. Note the OP states no other data types than signed int, so it seems likely the top part of this answer does not meet the OP's specifications. (Spoiler version is at the bottom.)
Note that the behavior we want to capture is when the most significant bit mismatch is 1 for a and 0 for b. Another way of thinking about this is any bit in a being larger than the corresponding bit in b means a is greater than b, so long as there wasn't an earlier bit in a that was less than the corresponding bit in b.
To that end, we compute all the bits in a greater than the corresponding bits in b, and likewise compute all the bits in a less than the corresponding bits in b. We now want to mask out all the 'greater than' bits that are below any 'less than' bits, so we take all the 'less than' bits and smear them all to the right making a mask: the most significant bit set all the way down to the least significant bit are now 1.
Now all we have to do is remove the 'greater than' bits set by using simple bit masking logic.
The resulting value is 0 if a <= b and nonzero if a > b. If we want it to be 1 in the latter case we can do a similar smearing trick and just take a look at the least significant bit.
#include <stdio.h>
// Works for unsigned ints.
// Scroll down to the "actual algorithm" to see the interesting code.
// Utility function for displaying binary representation of an unsigned integer
void printBin(unsigned int x) {
for (int i = 31; i >= 0; i--) printf("%i", (x >> i) & 1);
printf("\n");
}
// Utility function to print out a separator
void printSep() {
for (int i = 31; i>= 0; i--) printf("-");
printf("\n");
}
int main()
{
while (1)
{
unsigned int a, b;
printf("Enter two unsigned integers separated by spaces: ");
scanf("%u %u", &a, &b);
getchar();
printBin(a);
printBin(b);
printSep();
/************ The actual algorithm starts here ************/
// These are all the bits in a that are less than their corresponding bits in b.
unsigned int ltb = ~a & b;
// These are all the bits in a that are greater than their corresponding bits in b.
unsigned int gtb = a & ~b;
ltb |= ltb >> 1;
ltb |= ltb >> 2;
ltb |= ltb >> 4;
ltb |= ltb >> 8;
ltb |= ltb >> 16;
// Nonzero if a > b
// Zero if a <= b
unsigned int isGt = gtb & ~ltb;
// If you want to make this exactly '1' when nonzero do this part:
isGt |= isGt >> 1;
isGt |= isGt >> 2;
isGt |= isGt >> 4;
isGt |= isGt >> 8;
isGt |= isGt >> 16;
isGt &= 1;
/************ The actual algorithm ends here ************/
// Print out the results.
printBin(ltb); // Debug info
printBin(gtb); // Debug info
printSep();
printBin(isGt); // The actual result
}
}
Note: This should work for signed integers as well if you flip the top bit on both of the inputs, e.g. a ^= 0x80000000.
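To make that note concrete, here is a sketch that wraps the unsigned algorithm above in a function and flips the top bit of both inputs. The names unsigned_gt and signed_gt are hypothetical, and the casts to uint32_t are my assumption; they go beyond what the original question allows, so treat this purely as an illustration of the bit-flip idea.

```c
#include <stdint.h>

/* The unsigned comparison from above: returns 1 if a > b, else 0. */
static uint32_t unsigned_gt(uint32_t a, uint32_t b)
{
    uint32_t ltb = ~a & b;   /* positions where a has 0 and b has 1 */
    uint32_t gtb = a & ~b;   /* positions where a has 1 and b has 0 */

    /* Smear each 'less than' bit down over all lower positions. */
    ltb |= ltb >> 1;
    ltb |= ltb >> 2;
    ltb |= ltb >> 4;
    ltb |= ltb >> 8;
    ltb |= ltb >> 16;

    /* Keep only 'greater than' bits not shadowed by a higher 'less than' bit. */
    uint32_t gt = gtb & ~ltb;

    /* Collapse "nonzero" into bit 0. */
    gt |= gt >> 1;
    gt |= gt >> 2;
    gt |= gt >> 4;
    gt |= gt >> 8;
    gt |= gt >> 16;
    return gt & 1;
}

/* Flipping the sign bit maps signed order onto unsigned order. */
static uint32_t signed_gt(int32_t a, int32_t b)
{
    return unsigned_gt((uint32_t)a ^ 0x80000000u, (uint32_t)b ^ 0x80000000u);
}
```

Flipping the top bit moves every negative value below every non-negative one in unsigned order, while keeping the order within each group.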
Spoiler
If you want an answer that meets all of the requirements (including 25 operators or less):
int isGt(int a, int b)
{
int diff = a ^ b;
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
diff &= ~(diff >> 1) | 0x80000000;
diff &= (a ^ 0x80000000) & (b ^ 0x7fffffff);
return !!diff;
}
I'll leave explaining why it works up to you.
To convert 001xxxxx to 00100000, you first execute:
x |= x >> 4;
x |= x >> 2;
x |= x >> 1;
(this is for 8 bits; to extend it to 32, add shifts by 8 and 16 at the start of the sequence).
This leaves us with 00111111 (this technique is sometimes called "bit-smearing"). We can then chop off all but the first 1 bit:
x ^= x >> 1;
leaving us with 00100000.
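The two steps above, extended to 32 bits as described, could be packaged like this (a sketch; highest_set_bit is a hypothetical name and uint32_t is assumed so that the right shifts are logical):

```c
#include <stdint.h>

/* Keep only the most significant set bit of x (returns 0 for x == 0). */
static uint32_t highest_set_bit(uint32_t x)
{
    /* Bit-smearing: copy the top set bit into every lower position. */
    x |= x >> 16;
    x |= x >> 8;
    x |= x >> 4;
    x |= x >> 2;
    x |= x >> 1;
    /* x is now 0...011...1; XOR with itself shifted right by one
     * cancels everything except the leading 1. */
    return x ^ (x >> 1);
}
```

With a signed int the shifts may be arithmetic, so a value with the sign bit set would smear to all ones and the final XOR would yield 0.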
An unsigned variant given that one can use logical (&&, ||) and comparison (!=, ==).
int u_isgt(unsigned int a, unsigned int b)
{
return a != b && ( /* If a == b then a !> b and a !< b. */
b == 0 || /* Else if b == 0 a has to be > b (as a != 0). */
(a / b) /* Else divide; integer division always truncate */
); /* towards zero. Giving 0 if a < b. */
}
!= and == can easily be eliminated, e.g.:
int u_isgt(unsigned int a, unsigned int b)
{
return a ^ b && (
!(b ^ 0) ||
(a / b)
);
}
For signed one could then expand to something like:
int isgt(int a, int b)
{
return
(a != b) &&
(
(!(0x80000000 & a) && 0x80000000 & b) || /* if a >= 0 && b < 0 */
(!(0x80000000 & a) && b == 0) ||
/* Two more lines, can add them if you like, but as it is homework
* I'll leave it up to you to decide.
* Hint: check on "both negative" and "both not negative". */
)
;
}
This can be made more compact and some operations (at least one) can be eliminated, but I wrote it like this for clarity.
Instead of 0x80000000 one could say, e.g.:
#include <limits.h>
static const int INT_NEG = (1 << ((sizeof(int) * CHAR_BIT) - 1));
Using this to test:
void test_isgt(int a, int b)
{
fprintf(stdout,
"%11d > %11d = %d : %d %s\n",
a, b,
isgt(a, b), (a > b),
isgt(a, b) != (a>b) ? "BAD!" : "OK!");
}
Result:
33 > 0 = 1 : 1 OK!
-33 > 0 = 0 : 0 OK!
0 > 33 = 0 : 0 OK!
0 > -33 = 1 : 1 OK!
0 > 0 = 0 : 0 OK!
33 > 33 = 0 : 0 OK!
-33 > -33 = 0 : 0 OK!
-5 > -33 = 1 : 1 OK!
-33 > -5 = 0 : 0 OK!
-2147483647 > 2147483647 = 0 : 0 OK!
2147483647 > -2147483647 = 1 : 1 OK!
2147483647 > 2147483647 = 0 : 0 OK!
2147483647 > 0 = 1 : 1 OK!
0 > 2147483647 = 0 : 0 OK!
A fully branchless version of Kaganar's smaller isGt function might look like so:
int isGt(int a, int b)
{
int diff = a ^ b;
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
//1+ on GT, 0 otherwise.
diff &= ~(diff >> 1) | 0x80000000;
diff &= (a ^ 0x80000000) & (b ^ 0x7fffffff);
//flatten back to range of 0 or 1.
diff |= diff >> 1;
diff |= diff >> 2;
diff |= diff >> 4;
diff |= diff >> 8;
diff |= diff >> 16;
diff &= 1;
return diff;
}
This clocks in at around 60 instructions for the actual computation (MSVC 2010 compiler, on an x86 arch), plus an extra 10 stack ops or so for the function's prolog/epilog.
EDIT:
Okay, there were some issues with the code, but I revised it and the following works.
This auxiliary function compares bit n of the two numbers (bit 0 is the least significant):
int compare ( int a, int b, int n )
{
    unsigned int digit = 1u << n; //mask for bit n; unsigned avoids overflow at n == 31
    unsigned int abit = (unsigned int)a & digit;
    unsigned int bbit = (unsigned int)b & digit;
    if ( abit && !bbit )
        return 1; //a is greater than b
    if ( !abit && bbit )
        return -1; //b is greater than a
    return 0; //the digit is the same (both set or both clear)
}
The following scans from the most significant bit down and returns the larger number. It compares raw bit patterns, so it is only correct when a and b have the same sign:
int larger ( int a, int b )
{
    for ( int i = 8*sizeof(a) - 1 ; i >= 0 ; i-- )
    {
        int k = compare ( a, b, i );
        if ( k )
        {
            return (k == 1) ? a : b;
        }
    }
    return 0; //equal
}
As much as I don't want to do someone else's homework, I couldn't resist this one. :) I am sure others can think of a more compact one, but here is mine. It works well, including negative numbers.
Edit: there are a couple of bugs, though. I will leave it to the OP to find and fix them.
#include<unistd.h>
#include<stdio.h>
int a, b, i, ma, mb, a_neg, b_neg, stop;
int flipnum(int *num, int *is_neg) {
*num = ~(*num) + 1;
*is_neg = 1;
return 0;
}
int print_num1() {
return ((a_neg && printf("bigger number %d\n", mb)) ||
printf("bigger number %d\n", ma));
}
int print_num2() {
return ((b_neg && printf("bigger number %d\n", ma)) ||
printf("bigger number %d\n", mb));
}
int check_num1(int j) {
return ((a & j) && print_num1());
}
int check_num2(int j) {
return ((b & j) && print_num2());
}
int recursive_check (int j) {
((a & j) ^ (b & j)) && (check_num1(j) || check_num2(j)) && (stop = 1, j = 0);
return(!stop && (j = j >> 1) && recursive_check(j));
}
int main() {
int j;
scanf("%d%d", &a, &b);
ma = a; mb = b;
i = (sizeof (int) * 8) - 1;
j = 1 << i;
((a & j) && flipnum(&a, &a_neg));
((b & j) && flipnum(&b, &b_neg));
j = 1 << (i - 1);
recursive_check(j);
(!stop && printf("numbers are same..\n"));
}
I think I have a solution with 3 operations:
Add one to the first number, then subtract it from the largest possible number you can represent (all 1's). Add that number to the second number. If it overflows, then the first number is less than the second.
I'm not 100% sure if this is correct. That is, you might not need to add 1, and I don't know if it's possible to check for overflow (if not, then just reserve the last bit and test if it's 1 at the end).
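To settle the doubt, here is a sketch under a few assumptions of mine: 32-bit unsigned inputs, a 64-bit intermediate so the overflow can be observed as a carry, and the hypothetical name less_than_by_carry. Subtracting a from all 1's is the same as ~a, and ~a + b = 2^32 + (b - a - 1), so the carry is set exactly when b > a. The "add one" step is therefore not needed; with it, the test would miss the case b == a + 1. Note that this uses +, which the original question does not allow.

```c
#include <stdint.h>

/* Returns 1 if a < b, else 0, by checking the carry out of ~a + b. */
static unsigned int less_than_by_carry(uint32_t a, uint32_t b)
{
    /* ~a == 0xFFFFFFFF - a, so the sum is 2^32 + (b - a - 1). */
    uint64_t sum = (uint64_t)(uint32_t)~a + b;

    /* Bit 32 is the carry out of the 32-bit addition. */
    return (unsigned int)(sum >> 32);
}
```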
EDIT: The constraints make the simple approach at the bottom invalid. I am adding the binary search function and the final comparison to detect the greater value:
unsigned long greater(unsigned long a, unsigned long b) {
unsigned long x = a;
unsigned long y = b;
unsigned long t = a ^ b;
if (t & 0xFFFF0000) {
x >>= 16;
y >>= 16;
t >>= 16;
}
if (t & 0xFF00) {
x >>= 8;
y >>= 8;
t >>= 8;
}
if (t & 0xf0) {
x >>= 4;
y >>= 4;
t >>= 4;
}
if ( t & 0xc) {
x >>= 2;
y >>= 2;
t >>= 2;
}
if ( t & 0x2) {
x >>= 1;
y >>= 1;
t >>= 1;
}
return (x & 1) ? a : b;
}
The idea is to start off with the most significant half of the word we are interested in and see if there are any set bits in there. If there are, then we don't need the least significant half, so we shift the unwanted bits away. If not, we do nothing (the half is zero anyway, so it won't get in the way). Since we cannot keep track of the shifted amount (it would require addition), we also shift the original values so that we can do the final AND to determine the larger number. We repeat this process with half the size of the previous mask until we collapse the interesting bits into bit position 0.
I didn't add the equal case in here on purpose.
Old answer:
The simplest method is probably the best for a homework. Once you've got the mismatching bit value, you start off with another mask at 0x80000000 (or whatever suitable max bit position for your word size), and keep right shifting this until you hit a bit that is set in your mismatch value. If your right shift ends up with 0, then the mismatch value is 0.
I assume you already know the final step required to determine the larger number.

Integer cube root

I'm looking for fast code for 64-bit (unsigned) cube roots. (I'm using C and compiling with gcc, but I imagine most of the work required will be language- and compiler-agnostic.) I will denote by ulong a 64-bit unsigned integer.
Given an input n I require the (integral) return value r to be such that
r * r * r <= n && n < (r + 1) * (r + 1) * (r + 1)
That is, I want the cube root of n, rounded down. Basic code like
return (ulong)pow(n, 1.0/3);
is incorrect because of rounding toward the end of the range. Unsophisticated code like
ulong
cuberoot(ulong n)
{
ulong ret = pow(n + 0.5, 1.0/3);
if (n < 100000000000001ULL)
return ret;
if (n >= 18446724184312856125ULL)
return 2642245ULL;
if (ret * ret * ret > n) {
ret--;
while (ret * ret * ret > n)
ret--;
return ret;
}
while ((ret + 1) * (ret + 1) * (ret + 1) <= n)
ret++;
return ret;
}
gives the correct result, but is slower than it needs to be.
This code is for a math library and it will be called many times from various functions. Speed is important, but you can't count on a warm cache (so suggestions like a 2,642,245-entry binary search are right out).
For comparison, here is code that correctly calculates the integer square root.
ulong squareroot(ulong a) {
ulong x = (ulong)sqrt((double)a);
if (x > 0xFFFFFFFF || x*x > a)
x--;
return x;
}
The book "Hacker's Delight" has algorithms for this and many other problems. The code is online here. EDIT: That code doesn't work properly with 64-bit ints, and the instructions in the book on how to fix it for 64-bit are somewhat confusing. A proper 64-bit implementation (including test case) is online here.
I doubt that your squareroot function works "correctly" - it should be ulong a for the argument, not n :) (but the same approach would work using cbrt instead of sqrt, although not all C math libraries have cube root functions).
I've adapted the algorithm presented in 1.5.2 (the kth root) in Modern Computer Arithmetic (Brent and Zimmermann). For the case of (k == 3), and given a 'relatively' accurate over-estimate of the initial guess - this algorithm seems to out-perform the 'Hacker's Delight' code above.
Not only that, but MCA as a text provides theoretical background as well as a proof of correctness and terminating criteria.
Provided that we can produce a 'relatively' good initial over-estimate, I haven't been able to find a case that exceeds (7) iterations. (Is this effectively related to 64-bit values having 2^6 bits?) Either way, it's an improvement over the (21) iterations in the HacDel code - with linear O(b) convergence, despite having a loop body that is evidently much faster.
The initial estimate I've used is based on a 'rounding up' of the number of significant bits in the value (x). Given (b) significant bits in (x), we can say: 2^(b - 1) <= x < 2^b. I state without proof (though it should be relatively easy to demonstrate) that: 2^ceil(b / 3) > x^(1/3)
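For what it's worth, the bound can be checked in one line using only x < 2^b (this derivation is my addition, not taken from the book):

```latex
x < 2^{b}
\;\Longrightarrow\;
x^{1/3} < 2^{b/3} \le 2^{\lceil b/3 \rceil}
```

For integer b >= 1, ceil(b / 3) equals the integer division (b + 2) / 3, which is the form used in the code.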
static inline uint32_t u64_cbrt (uint64_t x)
{
uint64_t r0 = 1, r1;
/* IEEE-754 cbrt *may* not be exact. */
if (x == 0) /* cbrt(0) : */
return (0);
int b = (64) - __builtin_clzll(x);
r0 <<= (b + 2) / 3; /* ceil(b / 3) */
do /* quadratic convergence: */
{
r1 = r0;
r0 = (2 * r1 + x / (r1 * r1)) / 3;
}
while (r0 < r1);
return ((uint32_t) r1); /* floor(cbrt(x)); */
}
A cbrt call probably isn't all that useful - unlike the sqrt call which can be efficiently implemented on modern hardware. That said, I've seen gains for sets of values under 2^53 (exactly represented in IEEE-754 doubles), which surprised me.
The only downside is the division by: (r * r) - this can be slow, as the latency of integer division continues to fall behind other advances in ALUs. The division by a constant: (3) is handled by reciprocal methods on any modern optimising compiler.
It's interesting that Intel's 'Icelake' microarchitecture will significantly improve integer division - an operation that seems to have been neglected for a long time. I simply won't trust the 'Hacker's Delight' answer until I can find a sound theoretical basis for it. And then I have to work out which variant is the 'correct' answer.
You could try a Newton's step to fix your rounding errors:
ulong r = (ulong)pow(n, 1.0/3);
if(r==0) return r; /* avoid divide by 0 later on */
ulong r3 = r*r*r;
ulong slope = 3*r*r;
ulong r1 = r+1;
ulong r13 = r1*r1*r1;
/* making sure to handle unsigned arithmetic correctly */
if(n >= r13) r+= (n - r3)/slope;
if(n < r3) r-= (r3 - n)/slope;
A single Newton step ought to be enough, but you may have off-by-one (or possibly more?) errors. You can check/fix those using a final check&increment step, as in your OQ:
while(r*r*r > n) --r;
while((r+1)*(r+1)*(r+1) <= n) ++r;
or some such.
(I admit I'm lazy; the right way to do it is to carefully check to determine which (if any) of the check&increment things is actually necessary...)
If pow is too expensive, you can use a count-leading-zeros instruction to get an approximation to the result, then use a lookup table, then some Newton steps to finish it.
int k = __builtin_clzll(n); // counts # of leading zeros of a 64-bit value (often a single assembly insn)
int b = 64 - k; // # of bits in n
int top8 = n >> (b - 8); // top 8 bits of n (top bit is always 1)
int approx = table[b][top8 & 0x7f];
Given b and top8, you can use a lookup table (in my code, 8K entries) to find a good approximation to cuberoot(n). Use some Newton steps (see comingstorm's answer) to finish it.
// On my pc: Math.Sqrt 35 ns, cbrt64 <70ns, cbrt32 <25 ns, (cbrt12 < 10ns)
// cbrt64(ulong x) is a C# version of:
// http://www.hackersdelight.org/hdcodetxt/acbrt.c.txt (acbrt1)
// cbrt32(uint x) is a C# version of:
// http://www.hackersdelight.org/hdcodetxt/icbrt.c.txt (icbrt1)
// Union in C#:
// http://www.hanselman.com/blog/UnionsOrAnEquivalentInCSairamasTipOfTheDay.aspx
using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
public struct fu_32 // float <==> uint
{
[FieldOffset(0)]
public float f;
[FieldOffset(0)]
public uint u;
}
private static uint cbrt64(ulong x)
{
if (x >= 18446724184312856125) return 2642245;
float fx = (float)x;
fu_32 fu32 = new fu_32();
fu32.f = fx;
uint uy = fu32.u / 4;
uy += uy / 4;
uy += uy / 16;
uy += uy / 256;
uy += 0x2a5137a0;
fu32.u = uy;
float fy = fu32.f;
fy = 0.33333333f * (fx / (fy * fy) + 2.0f * fy);
int y0 = (int)
(0.33333333f * (fx / (fy * fy) + 2.0f * fy));
uint y1 = (uint)y0;
ulong y2, y3;
if (y1 >= 2642245)
{
y1 = 2642245;
y2 = 6981458640025;
y3 = 18446724184312856125;
}
else
{
y2 = (ulong)y1 * y1;
y3 = y2 * y1;
}
if (y3 > x)
{
y1 -= 1;
y2 -= 2 * y1 + 1;
y3 -= 3 * y2 + 3 * y1 + 1;
while (y3 > x)
{
y1 -= 1;
y2 -= 2 * y1 + 1;
y3 -= 3 * y2 + 3 * y1 + 1;
}
return y1;
}
do
{
y3 += 3 * y2 + 3 * y1 + 1;
y2 += 2 * y1 + 1;
y1 += 1;
}
while (y3 <= x);
return y1 - 1;
}
private static uint cbrt32(uint x)
{
uint y = 0, z = 0, b = 0;
int s = x < 1u << 24 ? x < 1u << 12 ? x < 1u << 06 ? x < 1u << 03 ? 00 : 03 :
x < 1u << 09 ? 06 : 09 :
x < 1u << 18 ? x < 1u << 15 ? 12 : 15 :
x < 1u << 21 ? 18 : 21 :
x >= 1u << 30 ? 30 : x < 1u << 27 ? 24 : 27;
do
{
y *= 2;
z *= 4;
b = 3 * y + 3 * z + 1 << s;
if (x >= b)
{
x -= b;
z += 2 * y + 1;
y += 1;
}
s -= 3;
}
while (s >= 0);
return y;
}
private static uint cbrt12(uint x) // x < ~255
{
uint y = 0, a = 0, b = 1, c = 0;
while (a < x)
{
y++;
b += c;
a += b;
c += 6;
}
if (a != x) y--;
return y;
}
Starting from the code within the GitHub gist from the answer of Fabian Giesen, I have arrived at the following, faster implementation:
#include <stdint.h>
static inline uint64_t icbrt(uint64_t x) {
uint64_t b, y, bits = 3*21;
int s;
for (s = bits - 3; s >= 0; s -= 3) {
if ((x >> s) == 0)
continue;
x -= (uint64_t)1 << s; /* widen before shifting: s can reach 60 */
y = 1;
for (s = s - 3; s >= 0; s -= 3) {
y += y;
b = 1 + 3*y*(y + 1);
if ((x >> s) >= b) {
x -= b << s;
y += 1;
}
}
return y;
}
return 0;
}
While the above is still somewhat slower than methods relying on the GNU specific __builtin_clzll, the above does not make use of compiler specifics and is thus completely portable.
The bits constant
Lowering the constant bits leads to faster computation, but the highest number x for which the function gives correct results is (1 << bits) - 1. Also, bits must be a multiple of 3 and be at most 64, meaning that its maximum value is really 3*21 == 63. With bits = 3*21, icbrt() thus works for input x <= 9223372036854775807. If we know that a program is working with limited x, say x < 1000000, then we can speed up the cube root computation by setting bits = 3*7, since (1 << 3*7) - 1 = 2097151 >= 1000000.
64-bit vs. 32-bit integers
Though the above is written for 64-bit integers, the logic is the same for 32-bit:
#include <stdint.h>
static inline uint32_t icbrt(uint32_t x) {
uint32_t b, y, bits = 3*7; /* or whatever is appropriate */
int s;
for (s = bits - 3; s >= 0; s -= 3) {
if ((x >> s) == 0)
continue;
x -= 1 << s;
y = 1;
for (s = s - 3; s >= 0; s -= 3) {
y += y;
b = 1 + 3*y*(y + 1);
if ((x >> s) >= b) {
x -= b << s;
y += 1;
}
}
return y;
}
return 0;
}
I would research how to do it by hand, and then translate that into a computer algorithm, working in base 2 rather than base 10.
We end up with an algorithm something like (pseudocode):
Find the largest n such that (1 << 3n) <= input.
result = 1 << n.
For i in (n-1)..0:
    if ((result | 1 << i)**3) <= input:
        result |= 1 << i.
We can optimize the calculation of (result | 1 << i)**3 by observing that the bitwise-or is equivalent to addition here (bit i of result is still clear). Writing d = 1 << i, the cube expands to result**3 + 3 * d * result**2 + 3 * d**2 * result + d**3, so we can cache the values of result**3 and result**2 between iterations and use shifts instead of multiplication.
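The unoptimized loop from the pseudocode can be sketched in C as follows. The name icbrt_bitwise is hypothetical, and two details are my own assumptions rather than part of the answer: the loop simply starts at bit 21 instead of searching for n (the cube root of a 64-bit value always fits in 22 bits), and candidates above 2642245 are rejected up front so that the cube cannot overflow 64 bits. The cached-powers optimization described above is left out to keep the sketch short.

```c
#include <stdint.h>

/* Floor cube root, built one bit at a time from the top down. */
static uint32_t icbrt_bitwise(uint64_t n)
{
    /* Largest r whose cube still fits in 64 bits
     * (2642245^3 == 18446724184312856125). */
    const uint64_t max_root = 2642245u;
    uint64_t result = 0;

    for (int i = 21; i >= 0; i--) {
        /* Tentatively set bit i; OR equals addition because the bit is clear. */
        uint64_t cand = result | ((uint64_t)1 << i);

        /* Keep the bit only if the candidate's cube does not exceed n.
         * The max_root test comes first to rule out overflow in the cube. */
        if (cand <= max_root && cand * cand * cand <= n)
            result = cand;
    }
    return (uint32_t)result;
}
```

Each iteration costs two multiplications, which is what the shift-based refactoring is meant to remove.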
You can try and adapt this C algorithm :
#include <limits.h>
// return a number that, when multiplied by itself twice, makes N.
unsigned cube_root(unsigned n){
unsigned a = 0, b;
for (int c = sizeof(unsigned) * CHAR_BIT / 3 * 3 ; c >= 0; c -= 3) {
a <<= 1;
b = a + (a << 1), b = b * a + b + 1 ;
if (n >> c >= b)
n -= b << c, ++a;
}
return a;
}
Also there is :
// return the number that was multiplied by itself to reach N.
unsigned square_root(const unsigned num) {
unsigned a, b, c, d;
for (b = a = num, c = 1; a >>= 1; ++c);
for (c = 1 << (c & -2); c; c >>= 2) {
d = a + c;
a >>= 1;
if (b >= d)
b -= d, a += c;
}
return a;
}

bit shifting in C

int x = 2;
x = rotateInt('L', x, 1); // should return 4
x = rotateInt('R', x, 3); // should return 64
Here is the code, can someone check it and let me know what the error is?
Compilation is successful, but it says Segmentation Fault when I execute it.
int rotateInt(char direction, unsigned int x, int y)
{
int i;
for(i = 0; i < y; i++)
{
if(direction == 'R')
{
if((x & 1) == 1)
{
x = x >> 1;
x = (x ^ 128);
}
else
x = x >> 1;
}
else if(direction == 'L')
{
if((x & 128) == 1)
{
x = x << 1;
x = (x ^ 1);
}
else
x = x << 1;
}
}
return x;
}
Start honing your debugging skills now. If you are going to be any form of an engineer, you'll need to write programs of some variety, and will thus be debugging all of your life.
A simple way to start debugging is to put print statements in to see how far your code makes it before it dies. I recommend you start by isolating the error.
Not sure about the seg fault, but I think
if((x & 128) == 1)
should be
if((x & 128) == 128)
or just
if(x & 128)
I tried on my computer (MacBookPro / Core2Duo) and it worked.
By the way, what's your target architecture ? Some (many) processors perform rotation instead of shifts when you use the C operators ">>" and "<<".
When you use ^ don't you mean the or operator | ?
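Pulling the suggestions above together, a minimal sketch of the rotation might look like this. The 8-bit width is an assumption inferred from the 128 mask in the question, and rotate8 is a hypothetical name. It tests the high bit with (x & 128), uses | instead of ^ to reinsert the wrapped bit, and masks the result so bits shifted past bit 7 do not accumulate.

```c
/* Rotate the low 8 bits of x by y positions; direction is 'L' or 'R'. */
static unsigned int rotate8(char direction, unsigned int x, int y)
{
    x &= 0xFFu;                               /* work on the low 8 bits only */
    for (int i = 0; i < y; i++) {
        if (direction == 'R') {
            unsigned int carry = x & 1u;      /* bit falling off the right end */
            x = (x >> 1) | (carry << 7);      /* reinsert it at bit 7 */
        } else if (direction == 'L') {
            unsigned int carry = (x & 128u) >> 7;  /* bit falling off the left end */
            x = ((x << 1) & 0xFFu) | carry;        /* reinsert it at bit 0 */
        }
    }
    return x;
}
```

Any other direction character leaves x unchanged, as in the original code.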
