2d Morton code 64bits decode function - c

The first function encodes [x, y] as a 64-bit wide Morton code, where x and y are 32-bit wide integers, using the "Interleave bits by Binary Magic Numbers" technique.
What would the reverse function be?
void xy2d_morton_64bits(uint64_t x, uint64_t y, uint64_t *d)
{
x = (x | (x << 16)) & 0x0000FFFF0000FFFF;
x = (x | (x << 8)) & 0x00FF00FF00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x << 2)) & 0x3333333333333333;
x = (x | (x << 1)) & 0x5555555555555555;
y = (y | (y << 16)) & 0x0000FFFF0000FFFF;
y = (y | (y << 8)) & 0x00FF00FF00FF00FF;
y = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
y = (y | (y << 2)) & 0x3333333333333333;
y = (y | (y << 1)) & 0x5555555555555555;
*d = x | (y << 1);
}
void d2xy_morton_64bits(uint64_t d, uint64_t *x, uint64_t *y)
{
????
}

This question was answered here.
Split d into evens and odds then use a similar set of shifts and masks to compress the bits back together:
x = d&0x5555555555555555;
x = (x|x>>1)&0x3333333333333333; //converts 0a0b0c0d.. -> 00ab00cd...
x = (x|x>>2)&0x0f0f0f0f0f0f0f0f; //converts 00ab00cd.. -> 0000abcd...
//etc.
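Spelling out the "etc." above, a minimal sketch of the complete decode could look like the following. The helper name compact_even_bits is made up for illustration; the masks are the encoder's masks applied in reverse order, plus a final mask that keeps the low 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: gathers the even-numbered bits of x
   (bits 0, 2, 4, ...) into the low 32 bits of the result. */
static uint64_t compact_even_bits(uint64_t x)
{
    x &= 0x5555555555555555ULL;                   /* keep even bits only      */
    x = (x | (x >> 1))  & 0x3333333333333333ULL;  /* 0a0b0c0d -> 00ab00cd     */
    x = (x | (x >> 2))  & 0x0F0F0F0F0F0F0F0FULL;  /* 00ab00cd -> 0000abcd     */
    x = (x | (x >> 4))  & 0x00FF00FF00FF00FFULL;  /* pack nibbles into bytes  */
    x = (x | (x >> 8))  & 0x0000FFFF0000FFFFULL;  /* pack bytes into 16 bits  */
    x = (x | (x >> 16)) & 0x00000000FFFFFFFFULL;  /* pack halves into 32 bits */
    return x;
}

void d2xy_morton_64bits(uint64_t d, uint64_t *x, uint64_t *y)
{
    *x = compact_even_bits(d);       /* x occupies the even bits */
    *y = compact_even_bits(d >> 1);  /* y occupies the odd bits  */
}
```

For example, d = 39 (binary 100111) decodes to x = 3 and y = 5.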

Related

How to read reversed number from a binary file?

I would like to read a 32-bit number from a binary file in C. The problem is that the order of the bits is reversed. For example, the 3-digit number 110 would stand for 3, not 6: the least significant bit (2^0) comes first, then 2^1, and so on. Is there a simple way to do this in C, or do I have to write all the logic myself (read the first bit, multiply it by 2^0, add it to the sum, and repeat to the end)?
You have many possible ways:
Portable:
(not my algorithm)
uint32_t rev(uint32_t x)
{
x = (((x & 0xaaaaaaaa) >> 1) | ((x & 0x55555555) << 1));
x = (((x & 0xcccccccc) >> 2) | ((x & 0x33333333) << 2));
x = (((x & 0xf0f0f0f0) >> 4) | ((x & 0x0f0f0f0f) << 4));
x = (((x & 0xff00ff00) >> 8) | ((x & 0x00ff00ff) << 8));
return((x >> 16) | (x << 16));
}
or, if only the bits inside each byte need reversing (the order of the four bytes is kept):
uint32_t bit_reverse_4bytes(uint32_t x)
{
x = ((x & 0xF0F0F0F0) >> 4) | ((x & 0x0F0F0F0F) << 4);
x = ((x & 0xCCCCCCCC) >> 2) | ((x & 0x33333333) << 2);
return ((x & 0xAAAAAAAA) >> 1) | ((x & 0x55555555) << 1);
}
Naive
uint32_t naiverevese(uint32_t x)
{
uint32_t result = 0;
for(int i = 0; i < 32; i++)
{
result <<= 1; /* shift first, so the last bit read is not pushed out */
result |= x & 1;
x >>= 1;
}
return result;
}
or lookup table.
Not portable but the most efficient:
Many processors have special instructions for it, for example:
ARM - rbit and the intrinsic unsigned int __rbit(unsigned int val)
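The lookup-table option mentioned above could be sketched as follows. This is only an illustration: the names rev_table, init_rev_table and rev_lookup are made up here, and the table is filled at startup rather than written out as 256 literals.

```c
#include <stdint.h>

/* Hypothetical table: rev_table[b] holds byte b with its 8 bits reversed. */
static uint8_t rev_table[256];

/* Reverse the bits of one byte with the usual swap ladder. */
static uint8_t reverse_byte(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}

/* Must be called once before rev_lookup is used. */
void init_rev_table(void)
{
    for (unsigned i = 0; i < 256; i++)
        rev_table[i] = reverse_byte((uint8_t)i);
}

/* Reverse all 32 bits: reverse each byte and swap the byte order. */
uint32_t rev_lookup(uint32_t x)
{
    return ((uint32_t)rev_table[x & 0xFFu] << 24) |
           ((uint32_t)rev_table[(x >> 8) & 0xFFu] << 16) |
           ((uint32_t)rev_table[(x >> 16) & 0xFFu] << 8) |
            (uint32_t)rev_table[(x >> 24) & 0xFFu];
}
```

After init_rev_table(), rev_lookup(1) gives 0x80000000. Whether this beats the shift-and-mask version depends on the target and on cache behaviour, so it is worth measuring.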

Morton Reverse Encoding for a 3D grid

I have a 3D grid/array, say u[nx+2][ny+2][nz+2]. The trailing +2 corresponds to two layers of halo cells in each of the three dimensions x, y, z. I have another grid which allows for refinement (using a quadtree), hence I have the Morton index (or Z-order) of each of the cells.
Let's say that without refinement the two grids are alike in physical reality (except that the second code doesn't have halo cells). What I want to find is, for a cell q with Morton id mid, the corresponding indices i, j and k in the 3D grid. Basically, a decoding of mid (the Z-order) to get the corresponding i, j, k for the u matrix.
I am looking for a C solution, but general comments in any other programming language are also OK.
For forward encoding I am following the magic bits method as shown in
Morton Encoding using different methods
Morton encoding is just interleaving the bits of two or more components.
If we number binary digits in increasing order of significance, so that the least significant binary digit in an unsigned integer is 0 (and binary digit i has value 2^i), then binary digit i in component k of N corresponds to binary digit (i N + k) in the Morton code.
Here are two simple functions to encode and decode three-component Morton codes:
#include <stdlib.h>
#include <inttypes.h>
/* This source is in the public domain. */
/* Morton encoding in binary (components 21-bit: 0..2097151)
0zyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyxzyx */
#define BITMASK_0000000001000001000001000001000001000001000001000001000001000001 UINT64_C(18300341342965825)
#define BITMASK_0000001000001000001000001000001000001000001000001000001000001000 UINT64_C(146402730743726600)
#define BITMASK_0001000000000000000000000000000000000000000000000000000000000000 UINT64_C(1152921504606846976)
/* 0000000ccc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc */
#define BITMASK_0000000000000011000000000011000000000011000000000011000000000011 UINT64_C(844631138906115)
#define BITMASK_0000000111000000000011000000000011000000000011000000000011000000 UINT64_C(126113986927919296)
/* 00000000000ccccc00000000cccc00000000cccc00000000cccc00000000cccc */
#define BITMASK_0000000000000000000000000000000000001111000000000000000000001111 UINT64_C(251658255)
#define BITMASK_0000000000000000000000001111000000000000000000001111000000000000 UINT64_C(1030792212480)
#define BITMASK_0000000000011111000000000000000000000000000000000000000000000000 UINT64_C(8725724278030336)
/* 000000000000000000000000000ccccccccccccc0000000000000000cccccccc */
#define BITMASK_0000000000000000000000000000000000000000000000000000000011111111 UINT64_C(255)
#define BITMASK_0000000000000000000000000001111111111111000000000000000000000000 UINT64_C(137422176256)
/* ccccccccccccccccccccc */
#define BITMASK_21BITS UINT64_C(2097151)
static inline void morton_decode(uint64_t m, uint32_t *xto, uint32_t *yto, uint32_t *zto)
{
const uint64_t mask0 = BITMASK_0000000001000001000001000001000001000001000001000001000001000001,
mask1 = BITMASK_0000001000001000001000001000001000001000001000001000001000001000,
mask2 = BITMASK_0001000000000000000000000000000000000000000000000000000000000000,
mask3 = BITMASK_0000000000000011000000000011000000000011000000000011000000000011,
mask4 = BITMASK_0000000111000000000011000000000011000000000011000000000011000000,
mask5 = BITMASK_0000000000000000000000000000000000001111000000000000000000001111,
mask6 = BITMASK_0000000000000000000000001111000000000000000000001111000000000000,
mask7 = BITMASK_0000000000011111000000000000000000000000000000000000000000000000,
mask8 = BITMASK_0000000000000000000000000000000000000000000000000000000011111111,
mask9 = BITMASK_0000000000000000000000000001111111111111000000000000000000000000;
uint64_t x = m,
y = m >> 1,
z = m >> 2;
/* 000c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c */
x = (x & mask0) | ((x & mask1) >> 2) | ((x & mask2) >> 4);
y = (y & mask0) | ((y & mask1) >> 2) | ((y & mask2) >> 4);
z = (z & mask0) | ((z & mask1) >> 2) | ((z & mask2) >> 4);
/* 0000000ccc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc */
x = (x & mask3) | ((x & mask4) >> 4);
y = (y & mask3) | ((y & mask4) >> 4);
z = (z & mask3) | ((z & mask4) >> 4);
/* 00000000000ccccc00000000cccc00000000cccc00000000cccc00000000cccc */
x = (x & mask5) | ((x & mask6) >> 8) | ((x & mask7) >> 16);
y = (y & mask5) | ((y & mask6) >> 8) | ((y & mask7) >> 16);
z = (z & mask5) | ((z & mask6) >> 8) | ((z & mask7) >> 16);
/* 000000000000000000000000000ccccccccccccc0000000000000000cccccccc */
x = (x & mask8) | ((x & mask9) >> 16);
y = (y & mask8) | ((y & mask9) >> 16);
z = (z & mask8) | ((z & mask9) >> 16);
/* 0000000000000000000000000000000000000000000ccccccccccccccccccccc */
if (xto) *xto = x;
if (yto) *yto = y;
if (zto) *zto = z;
}
static inline uint64_t morton_encode(uint32_t xsrc, uint32_t ysrc, uint32_t zsrc)
{
const uint64_t mask0 = BITMASK_0000000001000001000001000001000001000001000001000001000001000001,
mask1 = BITMASK_0000001000001000001000001000001000001000001000001000001000001000,
mask2 = BITMASK_0001000000000000000000000000000000000000000000000000000000000000,
mask3 = BITMASK_0000000000000011000000000011000000000011000000000011000000000011,
mask4 = BITMASK_0000000111000000000011000000000011000000000011000000000011000000,
mask5 = BITMASK_0000000000000000000000000000000000001111000000000000000000001111,
mask6 = BITMASK_0000000000000000000000001111000000000000000000001111000000000000,
mask7 = BITMASK_0000000000011111000000000000000000000000000000000000000000000000,
mask8 = BITMASK_0000000000000000000000000000000000000000000000000000000011111111,
mask9 = BITMASK_0000000000000000000000000001111111111111000000000000000000000000;
uint64_t x = xsrc,
y = ysrc,
z = zsrc;
/* 0000000000000000000000000000000000000000000ccccccccccccccccccccc */
x = (x & mask8) | ((x << 16) & mask9);
y = (y & mask8) | ((y << 16) & mask9);
z = (z & mask8) | ((z << 16) & mask9);
/* 000000000000000000000000000ccccccccccccc0000000000000000cccccccc */
x = (x & mask5) | ((x << 8) & mask6) | ((x << 16) & mask7);
y = (y & mask5) | ((y << 8) & mask6) | ((y << 16) & mask7);
z = (z & mask5) | ((z << 8) & mask6) | ((z << 16) & mask7);
/* 00000000000ccccc00000000cccc00000000cccc00000000cccc00000000cccc */
x = (x & mask3) | ((x << 4) & mask4);
y = (y & mask3) | ((y << 4) & mask4);
z = (z & mask3) | ((z << 4) & mask4);
/* 0000000ccc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc0000cc */
x = (x & mask0) | ((x << 2) & mask1) | ((x << 4) & mask2);
y = (y & mask0) | ((y << 2) & mask1) | ((y << 4) & mask2);
z = (z & mask0) | ((z << 2) & mask1) | ((z << 4) & mask2);
/* 000c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c00c */
return x | (y << 1) | (z << 2);
}
The functions work symmetrically. To decode, binary digits and digit groups are shifted to larger consecutive units; to encode, binary digit groups are split and spread by shifting. Examine the masks (the BITMASK_ constants are named after their binary digit pattern), and the shift operations, to understand in detail how the encoding and decoding happens.
While the two functions are quite efficient, they are not optimized.
The above functions have been verified to work using a few billion round trips with random 21-bit unsigned integer components: decoding a Morton-encoded value yields the original three components.
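The digit mapping described earlier (digit i of component k goes to digit i N + k, here with N = 3) can also be written as a plain loop. The following is a slow reference sketch, useful mainly for cross-checking a mask-based implementation; the function names morton3_encode_ref and morton3_decode_ref are made up for illustration.

```c
#include <stdint.h>

/* Reference encoder: bit i of x, y, z goes to bit 3*i, 3*i+1, 3*i+2.
   Only the low 21 bits of each component are used. */
uint64_t morton3_encode_ref(uint32_t x, uint32_t y, uint32_t z)
{
    uint64_t m = 0;
    for (unsigned i = 0; i < 21; i++) {
        m |= (uint64_t)((x >> i) & 1u) << (3 * i);
        m |= (uint64_t)((y >> i) & 1u) << (3 * i + 1);
        m |= (uint64_t)((z >> i) & 1u) << (3 * i + 2);
    }
    return m;
}

/* Reference decoder: the exact inverse of the loop above. */
void morton3_decode_ref(uint64_t m, uint32_t *x, uint32_t *y, uint32_t *z)
{
    uint32_t xr = 0, yr = 0, zr = 0;
    for (unsigned i = 0; i < 21; i++) {
        xr |= (uint32_t)((m >> (3 * i)) & 1u) << i;
        yr |= (uint32_t)((m >> (3 * i + 1)) & 1u) << i;
        zr |= (uint32_t)((m >> (3 * i + 2)) & 1u) << i;
    }
    *x = xr;
    *y = yr;
    *z = zr;
}
```

For instance, (1, 0, 0) encodes to 1, (0, 1, 0) to 2, (0, 0, 1) to 4 and (2, 0, 0) to 8.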

Morton ordering in 3D using uint64_t as input

I am trying to use Morton code to produce a unique encoding for a given (x,y,z), where x, y, z are double-precision floating-point numbers. I presume that I can use a type cast to reinterpret the floats as integers and run Morton ordering on those integers. For example, consider the following C++ code. (I don't know how to do the same in C.)
double x=-1.123456789123456E205;
int64_t i = reinterpret_cast<int64_t &>(x);
cout<<i<<endl;
output >>> i = -1548698869907112442
And ditto for the remaining y, z. Once I have the "reinterpreted" values, I would like to pass them to a subroutine for Morton coding.
I checked the above type cast and it worked fine in reverse
double y = reinterpret_cast<double &>(i);
cout<<setprecision(16)<<y<<endl;
output>>-1.123456789123456e+205
I managed to find some code for Morton coding, some of it even on this forum, but none of it used int64_t in 3D. Hence I am going to need the help of the experts on the forum on how to encode and decode int64_t integers.
I managed to reverse engineer the following code. Unfortunately there is some bug: I am not getting the proper numbers when I run the decode part. I would appreciate any help to figure out what is wrong.
2D morton code encode/decode 64bits.
#include <iostream>
#include <stdint.h>
#include<iomanip>
using namespace std;
uint64_t code_2D_M(double xd,double yd){
uint64_t x = reinterpret_cast<uint64_t& >(xd);
uint64_t y = reinterpret_cast<uint64_t& >(yd);
x = (x | (x << 16)) & 0x0000FFFF0000FFFF;
x = (x | (x << 8)) & 0x00FF00FF00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x << 2)) & 0x3333333333333333;
x = (x | (x << 1)) & 0x5555555555555555;
y = (y | (y << 16)) & 0x0000FFFF0000FFFF;
y = (y | (y << 8)) & 0x00FF00FF00FF00FF;
y = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
y = (y | (y << 2)) & 0x3333333333333333;
y = (y | (y << 1)) & 0x5555555555555555;
return x | (y << 1);
}
uint64_t code_3D_M(double xd,double yd,double zd){
uint64_t x = reinterpret_cast<uint64_t& >(xd);
uint64_t y = reinterpret_cast<uint64_t& >(yd);
uint64_t z = reinterpret_cast<uint64_t& >(zd);
x = (x | (x << 16)) & 0x0000FFFF0000FFFF;
x = (x | (x << 8)) & 0x00FF00FF00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x << 2)) & 0x3333333333333333;
x = (x | (x << 1)) & 0x5555555555555555;
y = (y | (y << 16)) & 0x0000FFFF0000FFFF;
y = (y | (y << 8)) & 0x00FF00FF00FF00FF;
y = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
y = (y | (y << 2)) & 0x3333333333333333;
y = (y | (y << 1)) & 0x5555555555555555;
z = (y | (y << 16)) & 0x0000FFFF0000FFFF;
z = (y | (y << 8)) & 0x00FF00FF00FF00FF;
z = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
z = (y | (y << 2)) & 0x3333333333333333;
z = (y | (y << 1)) & 0x5555555555555555;
return x | (y << 1) | (z << 2);
}
double decode_M(uint64_t x)
{
x = x & 0x5555555555555555;
x = (x | (x >> 1)) & 0x3333333333333333;
x = (x | (x >> 2)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x >> 4)) & 0x00FF00FF00FF00FF;
x = (x | (x >> 8)) & 0x0000FFFF0000FFFF;
x = (x | (x >> 16)) & 0xFFFFFFFFFFFFFFFF;
return reinterpret_cast<double& >(x);
}
int main (void){
uint64_t mort;
double x,y,z;
// test input
x=2.123456789123459E205;
y=1.789789123456129E205;
z=9.999999912345779E205;
// echo the input
cout<<setprecision(17)<<x<<endl;
cout<<setprecision(17)<<y<<endl;
cout<<setprecision(17)<<z<<endl;
// encode 2D case
mort = code_2D_M(x,y);
//decode and print the results to see if all was fine
cout<<setprecision(17)<<decode_M(mort>>0)<<endl;
cout<<setprecision(17)<<decode_M(mort>>1)<<endl;
// encode 3D case
mort = code_3D_M(x,y,z);
//decode and print the results to see if all was fine
cout<<setprecision(17)<<decode_M(mort>>0)<<endl;
cout<<setprecision(17)<<decode_M(mort>>1)<<endl;
cout<<setprecision(17)<<decode_M(mort>>2)<<endl;
return 0;
}
I am doing this because I would like to avoid storing the coordinates as a 3D point (x,y,z) and instead store them as a single long integer, decoding them when needed. By doing so I will reduce the size of my coordinate storage array 3-fold.
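For the C side of the reinterpretation step mentioned in the question, one common approach is to copy the bytes with memcpy instead of casting. The sketch below assumes that double is a 64-bit IEEE-754 type; the helper names double_to_bits and bits_to_double are made up for illustration. It covers only the reinterpretation: interleaving two or three full 64-bit patterns would need 128 or 192 bits, so they cannot all fit into a single uint64_t code.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: double and uint64_t have the same size (true on common platforms). */
_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* Return the bit pattern of d as an unsigned 64-bit integer. */
uint64_t double_to_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);  /* copies the object representation */
    return u;
}

/* Inverse of double_to_bits. */
double bits_to_double(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}
```

A round trip through both helpers returns the original value unchanged, matching the reinterpret_cast check shown in the question.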

2D morton code encode/decode 64bits

How to encode/decode Morton codes (Z-order) given [x, y] as 32-bit unsigned integers producing a 64-bit Morton code, and vice versa?
I do have xy2d and d2xy, but only for coordinates that are 16 bits wide, producing a 32-bit Morton number. I searched a lot on the net but couldn't find anything. Please help.
If it is possible for you to use architecture-specific instructions, you'll likely be able to accelerate the operation beyond what is possible using bit-twiddling hacks:
For example if you write code for the Intel Haswell and later CPUs you can use the BMI2 instruction set which contains the pext and pdep instructions. These can (among other great things) be used to build your functions.
Here is a complete example (tested with GCC):
#include <immintrin.h>
#include <stdint.h>
// on GCC, compile with option -mbmi2, requires Haswell or better.
uint64_t xy_to_morton(uint32_t x, uint32_t y)
{
return _pdep_u64(x, 0x5555555555555555) | _pdep_u64(y, 0xaaaaaaaaaaaaaaaa);
}
void morton_to_xy(uint64_t m, uint32_t *x, uint32_t *y)
{
*x = _pext_u64(m, 0x5555555555555555);
*y = _pext_u64(m, 0xaaaaaaaaaaaaaaaa);
}
If you have to support earlier CPUs or the ARM platform, not all is lost. You may still get at least some help for the xy_to_morton function from instructions intended for cryptography.
A lot of CPUs have support for carry-less multiplication these days. On ARM that'll be vmul_p8 from the NEON instruction set. On X86 you'll find it as PCLMULQDQ from the CLMUL instruction set (available since 2010).
The trick here is that a carry-less multiplication of a number with itself returns a bit pattern that contains the original bits of the argument with zero bits interleaved. So it has the same effect as the pdep call with the 0x55... mask used for x above. E.g. it turns the following byte:
+----+----+----+----+----+----+----+----+
| b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+----+----+----+----+----+----+----+----+
Into:
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| 0 | b7 | 0 | b6 | 0 | b5 | 0 | b4 | 0 | b3 | 0 | b2 | 0 | b1 | 0 | b0 |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Now you can build the xy_to_morton function as (here shown for CLMUL instruction set):
#include <wmmintrin.h>
#include <stdint.h>
// on GCC, compile with option -mpclmul
uint64_t carryless_square (uint32_t x)
{
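// Move x into the low 64 bits of an SSE register (x86-64 only). This avoids casting
// a uint64_t array to __m128i *, which is not guaranteed to be 16-byte aligned.
__m128i a = _mm_cvtsi64_si128((long long)x);
a = _mm_clmulepi64_si128(a, a, 0);
return (uint64_t)_mm_cvtsi128_si64(a);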
}
uint64_t xy_to_morton (uint32_t x, uint32_t y)
{
return carryless_square(x)|(carryless_square(y) <<1);
}
_mm_clmulepi64_si128 generates a 128-bit result of which we only use the lower 64 bits. So you can even improve upon the version above and use a single _mm_clmulepi64_si128 to do the job.
That is as good as you can get on mainstream platforms (e.g. modern ARM with NEON and x86). Unfortunately I don't know of any trick to speed up the morton_to_xy function using the cryptography instructions, and I tried really hard for several months.
void xy2d_morton(uint64_t x, uint64_t y, uint64_t *d)
{
x = (x | (x << 16)) & 0x0000FFFF0000FFFF;
x = (x | (x << 8)) & 0x00FF00FF00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x << 2)) & 0x3333333333333333;
x = (x | (x << 1)) & 0x5555555555555555;
y = (y | (y << 16)) & 0x0000FFFF0000FFFF;
y = (y | (y << 8)) & 0x00FF00FF00FF00FF;
y = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
y = (y | (y << 2)) & 0x3333333333333333;
y = (y | (y << 1)) & 0x5555555555555555;
*d = x | (y << 1);
}
// morton_1 - extract even bits
uint32_t morton_1(uint64_t x)
{
x = x & 0x5555555555555555;
x = (x | (x >> 1)) & 0x3333333333333333;
x = (x | (x >> 2)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x >> 4)) & 0x00FF00FF00FF00FF;
x = (x | (x >> 8)) & 0x0000FFFF0000FFFF;
x = (x | (x >> 16)) & 0x00000000FFFFFFFF;
return (uint32_t)x;
}
void d2xy_morton(uint64_t d, uint64_t *x, uint64_t *y)
{
*x = morton_1(d);
*y = morton_1(d >> 1);
}
The naïve code would be the same regardless of the bit count. If you don't need a super fast bit-twiddling version, this will do:
uint32_t x;
uint32_t y;
uint64_t z = 0;
for (int i = 0; i < sizeof(x) * 8; i++)
{
z |= (x & (uint64_t)1 << i) << i | (y & (uint64_t)1 << i) << (i + 1);
}
If you need faster bit twiddling, then this one should work. Note that x and y have to be 64bit variables.
uint64_t x;
uint64_t y;
uint64_t z = 0;
x = (x | (x << 16)) & 0x0000FFFF0000FFFF;
x = (x | (x << 8)) & 0x00FF00FF00FF00FF;
x = (x | (x << 4)) & 0x0F0F0F0F0F0F0F0F;
x = (x | (x << 2)) & 0x3333333333333333;
x = (x | (x << 1)) & 0x5555555555555555;
y = (y | (y << 16)) & 0x0000FFFF0000FFFF;
y = (y | (y << 8)) & 0x00FF00FF00FF00FF;
y = (y | (y << 4)) & 0x0F0F0F0F0F0F0F0F;
y = (y | (y << 2)) & 0x3333333333333333;
y = (y | (y << 1)) & 0x5555555555555555;
z = x | (y << 1);
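The loop version above has a straightforward inverse, which the question also asks for. A minimal sketch, with the encode loop wrapped in a function so the pair can be round-tripped (the names morton2_encode_naive and morton2_decode_naive are made up for illustration):

```c
#include <stdint.h>

/* Bit i of x goes to bit 2*i of the code, bit i of y to bit 2*i + 1. */
uint64_t morton2_encode_naive(uint32_t x, uint32_t y)
{
    uint64_t z = 0;
    for (unsigned i = 0; i < 32; i++) {
        z |= ((uint64_t)((x >> i) & 1u) << (2 * i))
           | ((uint64_t)((y >> i) & 1u) << (2 * i + 1));
    }
    return z;
}

/* Inverse: collect the even bits into x and the odd bits into y. */
void morton2_decode_naive(uint64_t z, uint32_t *x, uint32_t *y)
{
    uint32_t xr = 0, yr = 0;
    for (unsigned i = 0; i < 32; i++) {
        xr |= (uint32_t)((z >> (2 * i)) & 1u) << i;
        yr |= (uint32_t)((z >> (2 * i + 1)) & 1u) << i;
    }
    *x = xr;
    *y = yr;
}
```

With these, (x, y) = (3, 5) encodes to 39, and decoding 39 gives back (3, 5).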

Computing the floor of log₂(x) using only bitwise operators in C

For homework, using C, I'm supposed to make a program that finds the log base 2 of a number greater than 0 using only the operators ! ~ & ^ | + << >>. I know that I'm supposed to shift right a number of times, but I don't know how to keep track of the number of times without having any loops or ifs. I've been stuck on this question for days, so any help is appreciated.
int ilog2(int x) {
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16);
}
This is what I have so far. I pass the most significant bit to the end.
Assumes a 32-bit unsigned int :
unsigned int ulog2 (unsigned int u)
{
unsigned int s, t;
t = (u > 0xffff) << 4; u >>= t;
s = (u > 0xff ) << 3; u >>= s, t |= s;
s = (u > 0xf ) << 2; u >>= s, t |= s;
s = (u > 0x3 ) << 1; u >>= s, t |= s;
return (t | (u >> 1));
}
Since I assumed >, I thought I'd find a way to get rid of it.
(u > 0xffff) is equivalent to: ((u >> 16) != 0). If subtract borrows:
((u >> 16) - 1) will set the msb, iff (u <= 0xffff). Replace -1 with +(~0) (allowed).
So the condition: (u > 0xffff) is replaced with: (~((u >> 16) + ~0U)) >> 31
unsigned int ulog2 (unsigned int u)
{
unsigned int r = 0, t;
t = ((~((u >> 16) + ~0U)) >> 27) & 0x10;
r |= t, u >>= t;
t = ((~((u >> 8) + ~0U)) >> 28) & 0x8;
r |= t, u >>= t;
t = ((~((u >> 4) + ~0U)) >> 29) & 0x4;
r |= t, u >>= t;
t = ((~((u >> 2) + ~0U)) >> 30) & 0x2;
r |= t, u >>= t;
return (r | (u >> 1));
}
This gets the floor of logbase2 of a number.
int ilog2(int x) {
int i, j, k, l, m;
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16);
// i = 0x55555555
i = 0x55 | (0x55 << 8);
i = i | (i << 16);
// j = 0x33333333
j = 0x33 | (0x33 << 8);
j = j | (j << 16);
// k = 0x0f0f0f0f
k = 0x0f | (0x0f << 8);
k = k | (k << 16);
// l = 0x00ff00ff
l = 0xff | (0xff << 16);
// m = 0x0000ffff
m = 0xff | (0xff << 8);
x = (x & i) + ((x >> 1) & i);
x = (x & j) + ((x >> 2) & j);
x = (x & k) + ((x >> 4) & k);
x = (x & l) + ((x >> 8) & l);
x = (x & m) + ((x >> 16) & m);
x = x + ~0;
return x;
}
Your result is simply the rank of the highest non-null bit.
int log2_floor (int x)
{
int res = -1;
while (x) { res++ ; x = x >> 1; }
return res;
}
One possible solution is to take this method:
It is based on the additivity of logarithms:
log2(2^n * x) = log2(x) + n
Let x0 be a number of 2n bits (for instance, n = 16 for 32 bits).
If x0 >= 2^n, we can define x1 so that
x0 = 2^n * x1
and we can say that
E(log2(x0)) = n + E(log2(x1))
We can compute x1 with a binary shift:
x1 = x0 >> n
Otherwise we can simply set x1 = x0.
We are now facing the same problem with the remaining upper or lower half of x0
By splitting x in half at each step, we can eventually compute E(log2(x)):
int log2_floor (unsigned x)
{
#define MSB_HIGHER_THAN(n) (x &(~((1<<n)-1)))
int res = 0;
if MSB_HIGHER_THAN(16) {res+= 16; x >>= 16;}
if MSB_HIGHER_THAN( 8) {res+= 8; x >>= 8;}
if MSB_HIGHER_THAN( 4) {res+= 4; x >>= 4;}
if MSB_HIGHER_THAN( 2) {res+= 2; x >>= 2;}
if MSB_HIGHER_THAN( 1) {res+= 1;}
return res;
}
Since your sadistic teacher said you can't use loops or ifs, we can hack our way around by computing a value that will be n in case of a positive test and 0 otherwise, thus having no effect on the addition or the shift:
#define N_IF_MSB_HIGHER_THAN_N_OR_ELSE_0(n) (((-(x>>n))>>n)&n)
If the - operator is also forbidden by your psychopathic teacher (which is stupid since processors are able to handle 2's complement just as well as bitwise operations), you can use -x = ~x+1 in the above formula:
#define N_IF_MSB_HIGHER_THAN_N_OR_ELSE_0_WITH_NO_MINUS(n) (((~(x>>n)+1)>>n)&n)
that we will shorten to NIMHTNOE0WNM for readability.
Also we will use | instead of + since we know there will be no carry.
Here the example is for 32 bits integers, but you could make it work on 64, 128, 256, 512 or 1024 bits integers if you managed to find a language that supports that big an integer value.
int log2_floor (unsigned x)
{
#define NIMHTNOE0WNM(n) (((~(x>>n)+1)>>n)&n)
int res, n;
n = NIMHTNOE0WNM(16); res = n; x >>= n;
n = NIMHTNOE0WNM( 8); res |= n; x >>= n;
n = NIMHTNOE0WNM( 4); res |= n; x >>= n;
n = NIMHTNOE0WNM( 2); res |= n; x >>= n;
n = NIMHTNOE0WNM( 1); res |= n;
return res;
}
Ah, but maybe you were forbidden to use #define too?
In that case, I cannot do much more for you, except advise you to flog your teacher to death with an old edition of the K&R.
This leads to useless, obfuscated code that gives off a strong smell of unwashed 70's hackers.
Most if not all processors implement specific "count leading zeroes" instructions (for instance, clz on ARM, bsr on x86 or cntlz on PowerPC) that can do the trick without all this fuss.
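Outside the homework constraints, these instructions are usually reached through a compiler builtin. A minimal sketch, assuming GCC or Clang (other compilers spell the builtin differently); the function name log2_floor_clz is made up for illustration:

```c
#include <limits.h>

/* Floor of log2(x) via the count-leading-zeros builtin.
   Assumption: GCC or Clang. x must be nonzero, because the
   result of __builtin_clz(0) is undefined. */
int log2_floor_clz(unsigned x)
{
    return (int)(sizeof(unsigned) * CHAR_BIT) - 1 - __builtin_clz(x);
}
```

For example, log2_floor_clz(1) is 0, log2_floor_clz(3) is 1 and log2_floor_clz(1024) is 10.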
If you're allowed to use & then can you use &&? With that you can do conditionals without the need of if
if (cond)
doSomething();
can be done with
cond && doSomething();
Otherwise if you want to assign value conditionally like value = cond ? a : b; then you may do it with &
mask = -(cond != 0); // assuming int is a 2's complement 32-bit type
// or mask = ((cond != 0) << 31) >> 31;
value = (mask & a) | (~mask & b);
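Wrapped in a function, the mask-based select above could be sketched as follows. The name select_branchless is made up for illustration, and the code relies on int being a 2's complement type, as the comment above assumes:

```c
/* Returns a if cond is nonzero, b otherwise, without if or ?:.
   Assumption: int uses 2's complement, so -(1) has all bits set. */
int select_branchless(int cond, int a, int b)
{
    int mask = -(cond != 0);          /* all ones if cond is true, else 0 */
    return (mask & a) | (~mask & b);  /* keep a under the mask, b outside */
}
```

For example, select_branchless(1, 10, 20) gives 10 and select_branchless(0, 10, 20) gives 20.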
There are many other ways in the bithacks page:
int v; // 32-bit integer to find the log base 2 of
int r; // result of log_2(v) goes here
union { unsigned int u[2]; double d; } t; // temp
t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] = 0x43300000;
t.u[__FLOAT_WORD_ORDER!=LITTLE_ENDIAN] = v;
t.d -= 4503599627370496.0;
r = (t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] >> 20) - 0x3FF;
or
unsigned int v; // 32-bit value to find the log2 of
register unsigned int r; // result of log2(v) will go here
register unsigned int shift;
r = (v > 0xFFFF) << 4; v >>= r;
shift = (v > 0xFF ) << 3; v >>= shift; r |= shift;
shift = (v > 0xF ) << 2; v >>= shift; r |= shift;
shift = (v > 0x3 ) << 1; v >>= shift; r |= shift;
r |= (v >> 1);
another way
uint32_t v; // find the log base 2 of 32-bit v
int r; // result goes here
static const int MultiplyDeBruijnBitPosition[32] =
{
0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};
v |= v >> 1; // first round down to one less than a power of 2
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];
The question is equivalent to "find the highest set bit of the binary number".
STEP 1: set all bits to the right of the highest 1 to 1,
e.g. 0x07000000 becomes 0x07ffffff
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16); // number of ops = 10
STEP 2: count the number of 1 bits in the word and subtract 1
Reference: Hamming weight
// use bitCount
int m1 = 0x55; // 01010101...
m1 = (m1 << 8) + 0x55;
m1 = (m1 << 8) + 0x55;
m1 = (m1 << 8) + 0x55;
int m2 = 0x33; // 00110011...
m2 = (m2 << 8) + 0x33;
m2 = (m2 << 8) + 0x33;
m2 = (m2 << 8) + 0x33;
int m3 = 0x0f; // 00001111...
m3 = (m3 << 8) + 0x0f;
m3 = (m3 << 8) + 0x0f;
m3 = (m3 << 8) + 0x0f;
x = x + (~((x>>1) & m1) + 1); // x - ((x>>1) & m1)
x = (x & m2) + ((x >> 2) & m2);
x = (x + (x >> 4)) & m3;
// x = (x & m3) + ((x >> 4) & m3);
x += x>>8;
x += x>>16;
int bitCount = x & 0x3f; // max 100,000(2) = 32(10)
// Number of ops: 35 + 10 = 45
return bitCount + ~0;
This is how I do it. Thank you~
I also was assigned this problem for homework and I spent a significant amount of time thinking about it, so I thought I'd share what I came up with. This works with integers on a 32-bit machine. !!x evaluates to 0 if x is zero and to 1 otherwise.
int ilog2(int x) {
int byte_count = 0;
int y = 0;
//Shift right 8
y = x>>0x8;
byte_count += ((!!y)<<3);
//Shift right 16
y = x>>0x10;
byte_count += ((!!y)<<3);
//Shift right 24 and mask to adjust for arithmetic shift
y = (x>>0x18)&0xff;
byte_count += ((!!y)<<3);
x = (x>>byte_count) & 0xff;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1;
byte_count += !!x;
x = x>>1; //8
byte_count += !!x;
return byte_count;
}

Resources