Can anyone explain this expression: ((~(preRtcInseconds << 1) + 1) >> 1)?
unsigned long preInseconds;
unsigned long curInSeconds;
unsigned long elapsedInSeconds;
if(curInSeconds>=preInseconds)
{
elapsedInSeconds = curInSeconds - preInseconds; // simple case: no rollover handling needed
}
else{ //rollover
preInseconds = ((~(preInseconds <<1)+1)>>1);
elapsedInSeconds = curInSeconds + preInseconds;
}
Let the width of unsigned long be w. Then the maximal value an unsigned long can hold is
ULONG_MAX = 2^w - 1
Let preInseconds = a + b, where
a = preInseconds & (1ul << (w-1))
b = preInseconds & ((1ul << (w-1)) - 1)
a is either 0 or 2^(w-1), depending on whether preInseconds >= 2^(w-1). Then the initial left shift annihilates a, so
preInseconds << 1
gives b << 1 or 2*b.
Then the bitwise complement is taken,
~(b << 1)
gives ULONG_MAX - (b << 1).
If b == 0, adding 1 to ULONG_MAX - (b << 1) results in 0, otherwise
~(b << 1) + 1
gives
2^w - (b << 1)
Then shifting one bit to the right produces 0 if b == 0 and
2^(w-1) - b
otherwise. Hence
preInseconds = ((~(preInseconds <<1)+1)>>1);
sets preInseconds to
2^(w-1) - b
if b != 0, and to 0 if b == 0.
Finally,
elapsedInSeconds = curInSeconds + preInseconds;
therefore sets elapsedInSeconds to the value of curInSeconds if preInseconds was 2^(w-1) [if the else branch is taken, preInseconds > curInSeconds, so it's not 0] and to
curInSeconds - (preInseconds & (ULONG_MAX >> 1)) + 2^(w-1)
otherwise.
I'm not sure what the purpose of that operation is. For preInseconds > curInSeconds, the computation
curInSeconds - preInseconds
would result in
(2^w - preInseconds) + curInSeconds
which is the same as (mathematically)
(2^w + curInSeconds) - preInseconds
which would be the elapsed time if the curInSeconds counter rolled over once since the last time preInseconds was taken, which presumably happened if the current counter value is smaller than the previous. That would make sense.
With the gymnastics done in the else branch,
If preInseconds <= 2^(w-1), elapsedInSeconds becomes curInSeconds - preInseconds + 2^(w-1), i.e. the difference between the rolled-over current counter and the previous counter, minus 2^(w-1).
If preInseconds > 2^(w-1), then since x - 2^(w-1) == x + 2^(w-1) in unsigned arithmetic, we obtain the same result as with curInSeconds - preInseconds.
So, assuming curInSeconds rolled over once, it calculates the elapsed seconds between the previous and current event, unless the roll-over took place 2^(w-1) or more seconds after the previous event, in which case 2^(w-1) seconds are subtracted from the actual elapsed time.
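To make the analysis above concrete, here is a minimal sketch that replays the else branch one step at a time, next to a plain unsigned subtraction. It assumes w = 32 (uint32_t stands in for unsigned long), and both function names are made up for this illustration.

```c
#include <stdint.h>

/* The else branch from the question, one step at a time (assumes w = 32). */
uint32_t elapsed_else_branch(uint32_t cur, uint32_t pre)
{
    uint32_t t = pre << 1;   /* drops the top bit of pre, leaving 2*b */
    t = ~t;                  /* ULONG_MAX - 2*b                       */
    t = t + 1u;              /* 2^w - 2*b, or 0 if b == 0             */
    t = t >> 1;              /* 2^(w-1) - b, or 0 if b == 0           */
    return cur + t;
}

/* Plain unsigned subtraction, which already wraps mod 2^w. */
uint32_t elapsed_modular(uint32_t cur, uint32_t pre)
{
    return cur - pre;
}
```

For cur = 5 and pre = 10 the first function returns 0x7FFFFFFB and the second 0xFFFFFFFB, so the results differ by 2^(w-1). For cur = 0 and pre = 0xFFFFFFFF both return 1.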
It looks like buggy rollover code written by someone who doesn't understand that C unsigned arithmetic is mod 2^n (or doesn't understand mod 2^n arithmetic). It is in fact equivalent (give or take the boundary values right around ULONG_MAX/2) to:
elapsedInSeconds = curInSeconds - preInseconds;
if (curInSeconds < ULONG_MAX/2 && preInseconds < ULONG_MAX/2)
    elapsedInSeconds &= ULONG_MAX/2;
That is, it does a mod 2^n subtraction, but in a few cases (that were probably never hit in any test case), it gets the top bit wrong (clear instead of set).
It is vaguely possible that this is intentional, since what is going on here is that it does a mod 2^(n-1) subtraction if both numbers are < 2^(n-1), and a mod 2^n subtraction if either number is >= 2^(n-1) (where n is the size of an unsigned long in bits). If the times are coming from some hardware device which might be always clearing the top bit or might not, this might make sense.
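One way to check which inputs are affected is to shrink the problem to 8 bits and try every pair. The sketch below does that; the 8-bit model and the function names are assumptions made for this illustration, not part of the original code.

```c
#include <stdint.h>

/* 8-bit model of the code in the question (w = 8), small enough to test exhaustively. */
uint8_t elapsed_question8(uint8_t cur, uint8_t pre)
{
    if (cur >= pre)
        return (uint8_t)(cur - pre);
    uint8_t t = (uint8_t)(pre << 1);
    t = (uint8_t)~t;
    t = (uint8_t)(t + 1);
    t = (uint8_t)(t >> 1);
    return (uint8_t)(cur + t);
}

/* Counts the (cur, pre) pairs where the model disagrees with plain mod-2^8
   subtraction. *top_bit_only is cleared if any disagreement is something
   other than a cleared top bit. */
unsigned count_mismatches8(int *top_bit_only)
{
    unsigned mismatches = 0;
    *top_bit_only = 1;
    for (unsigned cur = 0; cur < 256; cur++) {
        for (unsigned pre = 0; pre < 256; pre++) {
            uint8_t expected = (uint8_t)(cur - pre);
            uint8_t actual = elapsed_question8((uint8_t)cur, (uint8_t)pre);
            if (actual != expected) {
                mismatches++;
                if (actual != (uint8_t)(expected & 0x7F))
                    *top_bit_only = 0;
            }
        }
    }
    return mismatches;
}
```

In this model every disagreement falls in the range cur < pre <= 2^(w-1), and in each of them the only difference is a top bit that is clear instead of set.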
First of all, everything in my discussion below assumes that your unsigned long type is 32 bits wide. If it's not, that's OK; it doesn't really matter how wide it is, but all my examples assume it's 32 bits wide. And I simply use uint32 to denote that, knowing that uint32 isn't actually a standard type like uint32_t, please don't bother telling me that.
Daniel Fischer has done a fine job of explaining each low-level operation in detail, and I won't repeat that here. But I think what you're interested in isn't so much the meaning of each of the low-level operations as the necessity of all of those operations applied as a group. As I'll explain below, they aren't necessary anyway. In fact, they're slightly wrong, but only in the case that the "current" counter reading is more than halfway back around the circle to the "previous" reading.
Before I get to the "meaning" behind the mathematics of your implementation, let's first just look at a trivial example of how it computes elapsedInSeconds in the case of the smallest rollover possible:
If preInseconds = 0xffffffff and curInSeconds = 0x00000000, elapsedInSeconds should be 1.
preInseconds = preInseconds<<1; // 0xfffffffe
preInseconds = ~preInseconds; // 0x00000001
preInseconds = preInseconds+1; // 0x00000002
preInseconds = preInseconds>>1; // 0x00000001
elapsedInSeconds = curInSeconds + preInseconds;
... = 0x00000000 + 0x00000001 = 1
...which is exactly what we expect. Great.
However, the interesting thing is that none of the rollover handling logic is necessary. At all. Every time I've ever seen someone try to calculate the difference between a 'current' and 'previous' counter value, I always see them jumping through hoops to handle the rollover case, and more often than not they do it wrong. The shame of the situation is that handling the rollover with a special case is never necessary for power-of-2-sized counters. If the counter's full scale (point at which it rolls over) is smaller than the size of your data type, you'd need to mask the result back to the number of bits in the counter, but that's the only rollover handling you really ever need, and in that case you'd just do the masking every time without worrying whether it rolled over or not (which also avoids branch instructions so it's faster). Since your rollover point is the full-scale value of a uint32, you don't even need to mask the result.
Here's why:
Assume, as above, that preInseconds=0xffffffff and curInSeconds=0. Again, the result should be 1. If you weren't concerned about rollover, you'd just take curInSeconds-preInseconds as the result. But in the case of rollover, the subtract operation will produce an underflow. What does that mean? It means that if you had more bits that you were dealing with (i.e. another uint32 being used as the high word of a 64-bit compound counter), then you'd need to borrow 1 from the high word (just like grade-school subtraction with decimal numbers). But in your case, there's no higher word to borrow from. That's OK. Really. You don't care about those bits anyway. You still get the difference value you were looking for:
elapsedInSeconds = curInSeconds - preInseconds;
... = 0x00000000 - 0xffffffff = 1
...which gives the expected result without any special rollover handling logic at all.
And so you may be thinking, "Sure, that works for your trivial example, but what if the rollover is HUGE?" OK, well, let's explore that possibility. Assume then that preInseconds=0xffffffff and curInSeconds=0xfffffffe. In this example, we've ALMOST wrapped completely back around from the previous sample; in fact, we're only one count away from it. In this case, our result should be 0xffffffff (i.e. one less count than the number of values that can be represented by a uint32):
elapsedInSeconds = curInSeconds - preInseconds;
... = 0xfffffffe - 0xffffffff = 0xffffffff
Don't believe me? Try this:
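#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
typedef uint32_t uint32; /* exactly 32 bits, whatever the width of unsigned long */
int main(void)
{
    uint32 prev = 0xffffffff;
    uint32 cur = 0xfffffffe;
    uint32 result = cur - prev; /* wraps mod 2^32 */
    printf("0x%08" PRIx32 " - 0x%08" PRIx32 " = 0x%08" PRIx32 "\n", cur, prev, result);
    return 0;
}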
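The masking mentioned earlier, for a counter whose full scale is narrower than the data type, could look something like the following. This is only a sketch: the function name and the bits parameter are invented here, and it assumes a free-running counter between 1 and 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed counts for a free-running counter that is `bits` wide (1..32).
   The mask is applied on every call, rollover or not, so no branch on
   "did it wrap?" is needed. */
uint32_t counter_elapsed(uint32_t cur, uint32_t prev, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t mask = (bits == 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1u);
    return (cur - prev) & mask;
}
```

With a 24-bit counter, counter_elapsed(0x000000, 0xFFFFFF, 24) gives 1, and with the full 32 bits the mask changes nothing.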
NOW, let's get back to the math behind your implementation:
That computation "sort of" computes the two's complement of preInseconds and assigns the result back to preInseconds. And if you know anything about computer representations of numbers and two's complement addition and subtraction, you know that computing the difference A-B is the same as computing the sum of A and the two's complement of B, i.e. A+(-B). If you've never investigated it before, look up on Wikipedia or wherever about how two's complement makes a computer's ALU able to re-use their addition circuitry for subtraction.
Now on to what's actually "wrong" with the code you've shown:
To compute the two's complement of a number, you invert the number (change all of its 0 bits to 1, and all of its 1 bits to 0), and then add one. It's that simple. And that's "sort of" what your code is doing, but not quite.
preInseconds = preInseconds<<1; // oops, here we lose the top bit
preInseconds = ~preInseconds; // do the 2's complement inversion step*
preInseconds = preInseconds+1; // do the 2's complement addition step*
preInseconds = preInseconds>>1; // shift back to where it ought to be,
// but without that top bit we wish we kept
*NOTE: The +1 above only works here because the low bit is
guaranteed to be 1 after the ~ operation, which carries a 1
up into the 2nd bit, where it matters.
So we see here that essentially what the math is doing is manually negating the value of preInseconds by performing a "nearly" two's complement conversion of it. Unfortunately it's also losing the top bit in the process, which makes the rollover logic only work up to a maximum of elapsedInSeconds = 0x7fffffff, not what should really be its full scale limit of 0xffffffff.
You could convert it to the following, and eliminate the loss of the top bit:
preInseconds = ~preInseconds; // do the 2's complement inversion step
preInseconds = preInseconds+1; // do the 2's complement addition step
So now you've computed the two's complement directly, and you can compute the result:
elapsedInSeconds = curInSeconds + preInseconds; // (preInseconds is the 2's compl of its original value)
But the silly thing is that this is computationally equivalent to simply doing this...
elapsedInSeconds = curInSeconds - preInseconds; // (preInseconds is its unconverted original value)
And once you realize that, your code example becomes:
if(curInSeconds>=preInseconds)
{
elapsedInSeconds = curInSeconds - preInseconds;
}
else // rollover
{
elapsedInSeconds = curInSeconds - preInseconds;
}
...which should make it clear that there's no need to handle rollover as a special case in the first place.
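If you want to convince yourself of the equivalence, a small sketch like this puts the "invert and add one" route next to the plain subtraction. It assumes 32-bit values, and the function names are just for illustration.

```c
#include <stdint.h>

/* Two's complement negation done by hand: invert, then add one. */
uint32_t negate_by_hand(uint32_t x)
{
    return (uint32_t)(~x + 1u);
}

/* cur + (-pre), using the hand-made negation. */
uint32_t elapsed_by_addition(uint32_t cur, uint32_t pre)
{
    return (uint32_t)(cur + negate_by_hand(pre));
}

/* cur - pre, letting the unsigned subtraction wrap on its own. */
uint32_t elapsed_by_subtraction(uint32_t cur, uint32_t pre)
{
    return (uint32_t)(cur - pre);
}
```

Both routes return the same value for every pair of inputs, including the nearly-full-circle case cur = 0xfffffffe, pre = 0xffffffff, where each gives 0xffffffff.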
I'm having trouble understanding what happens when the 32-bit system tick on an STM32 MCU rolls over, using the ST-supplied HAL platform.
Suppose the MCU has been running until HAL_GetTick() returns its maximum of 2^32 - 1 = 0xFFFFFFFF. With a 1 ms tick, that is 4,294,967,295 / 1000 / 60 / 60 / 24 = approx. 49 days, the maximum duration that can be measured.
What happens if you have a timer that is running across the rollover point?
Example code creating a 100 ms delay across a rollover event:
uint32_t start = HAL_GetTick(); // start = 0xFFFF FFFF (in this example)
--> Interrupt increments systick which rolls it over to 0 at this point
while ((HAL_GetTick() - start) < 100);
So when the expression in the loop is first evaluated, HAL_GetTick() = 0x0000 0000 and start = 0xFFFF FFFF. Hence 0x0000 0000 - 0xFFFF FFFF = ? (This number doesn't exist, as it's negative and we are doing unsigned arithmetic.)
However, when I run the following code on my STM32, compiled with GCC for ARM:
uint32_t a = 0xFFFFFFFFUL;
uint32_t b = 0x00000000UL;
uint32_t c = b - a;
printf("a =%lu b=%lu c=%lu\r\n", a, b, c);
The output is:
a =4294967295 b=0 c=1
The fact that c=1 is good from the point of view of the code functioning properly across the overflow but I don't understand what is actually happening here at the low level. How does 0 - 4294967295 = 1 ?? How would I calculate this on paper to show what the arithmetic logic unit inside the MCU is doing when this situation is encountered?
This is a characteristic of modular arithmetic: modulo wrapping is what happens when an unsigned integer overflows.
When working with a fixed number of digits/bits, arithmetic operations can overflow the fixed number of digits. But the overflow portion cannot be represented in the fixed number of digits/bits and is basically masked away. The overflow portion can be considered a modulus and the portion within the fixed number of digits/bits is the remainder or modulo. Given the modulus, the modulo value remains correct/congruent after the operation that caused the overflow.
The best way to understand is to do a few operations with a pen on paper. Choose a base. Hexadecimal is great but it works for decimal, binary, and every base. Choose a fixed number of digits/bits. For uint32_t you have 8 hex digits or 32 bits. Choose two values that will overflow the fixed number of digits when you add them. Do the math on paper and include any overflow into an extra digit. Now perform the modulo operation by covering the overflow with your hand. Your CPU does this modulo operation automatically by virtue of having a fixed number of digits (i.e., uint32_t). Repeat this with different numbers and repeat with a subtraction/underflow. Eventually you'll start to trust that it works.
You do have to be careful when setting up this operation. Use unsigned types and subtract the start ticks value from the current ticks value, like is done in your example code. (Do not, for example, add the delay to start ticks and compare with the current ticks.) Raymond Chen's article, "Using modular arithmetic to avoid timing overflow problems", has more information.
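The difference between the two set-ups can be sketched as follows. The function names are hypothetical, and in real code the now argument would come from HAL_GetTick(); the tick values are passed in here so the behaviour can be checked without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Recommended form: compute the elapsed time first, then compare. */
bool delay_elapsed(uint32_t now, uint32_t start, uint32_t delay)
{
    return (uint32_t)(now - start) >= delay;
}

/* Form to avoid: the deadline start + delay may wrap past zero, and then
   the comparison against `now` gives the wrong answer. */
bool deadline_reached_buggy(uint32_t now, uint32_t start, uint32_t delay)
{
    uint32_t deadline = start + delay;
    return now >= deadline;
}
```

With start = 0xFFFFFFF0 and a delay of 100, the deadline wraps to 0x54. Five ticks later (now = 0xFFFFFFF5) the buggy form already reports the delay as over, while the subtraction form correctly waits until now = 0x54.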
How does 0 - 4294967295 = 1 ?? How would I calculate this on paper to
show what the arithmetic logic unit inside the MCU is doing when this
situation is encountered?
First write it in hex like this:
0000_0000
- FFFF_FFFF
_____________
Then realize that there can be a modulus value of 0x1_0000_0000 on the first value (minuend). (Because according to modular arithmetic, "0x0_0000_0000 and 0x1_0000_0000 are congruent modulo 0x1_0000_0000"). Then it should become obvious that the difference is 1.
1_0000_0000
- 0_FFFF_FFFF
_____________
0_0000_0001
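The same pen-and-paper steps can be written out in code. This is only a sketch of the idea (the function name is made up): lend the minuend an extra 0x1_0000_0000, subtract in 64 bits, then keep only the low 32 bits.

```c
#include <stdint.h>

/* Subtraction "on paper": add the modulus 0x1_0000_0000 to the minuend,
   subtract in 64 bits, then cover everything above the low 32 bits. */
uint32_t subtract_with_borrow(uint32_t minuend, uint32_t subtrahend)
{
    uint64_t wide = (UINT64_C(1) << 32) + minuend - subtrahend;
    return (uint32_t)(wide & UINT64_C(0xFFFFFFFF));
}
```

For any pair of inputs this returns the same value as the plain uint32_t expression minuend - subtrahend, which is the point: the hardware does the "covering with your hand" for free.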
Nothing bad will happen. It will work the same as before the wraparound.
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
int main(void)
{
uint32_t start = UINT32_MAX - 20;
uint32_t current = start;
for(uint32_t x = 0; x < 100; x++)
{
printf("start = 0x%08"PRIx32" current = 0x%08"PRIx32 " current - start = %"PRIu32"\n", start, current, current-start);
current++;
}
}
You can see it here:
https://godbolt.org/z/jx4T4fhsW
0x00000000 - 0xffffffff will be 1 as 1 needs to be added to 0xffffffff to get 0x00000000. Same with other numbers.
BTW it is much easier to understand if you use hex numbers instead of decimals which have very limited use in programming.
This question is regarding a code example in Section 7.4 of Beej's Guide to Network Programming.
Here is the code example.
uint32_t htonf(float f)
{
uint32_t p;
uint32_t sign;
if (f < 0) { sign = 1; f = -f; }
else { sign = 0; }
p = ((((uint32_t)f)&0x7fff)<<16) | (sign<<31); // whole part and sign
p |= (uint32_t)(((f - (int)f) * 65536.0f))&0xffff; // fraction
return p;
}
Why is the bitwise-AND with 0xffff required to store the fraction?
As far as I understand, f - (int) f is always going to be a number that satisfies the inequality, 0 <= f - (int) f < 1. Since this number is always going to be less than 1, this number multiplied by 65536 is always going to be less than 65536. In other words, this number would never exceed 16 bits in its binary representation.
If this number never exceeds 16 bits in length, then what is the point in trying to select the least significant 16 bits with & 0xffff. It seems like a redundant step to me.
Do you agree? Or do you see a scenario where the & 0xffff is necessary for this function to work correctly?
The & 0xffff is superfluous.
It is questionable to even use. Consider the following. If some prior or potential future code may create a some_float_expression < 0 or some_float_expression >= 0x10000, then truncating the expression with & 0xffff could result in an incorrect answer. The below is sufficient.
(uint32_t)(some_float_expression);
IMO, code has other issues:
No range error detection. #M.M
I'd expect a round-to-nearest conversion, rather than truncate toward 0.0.
Minor: if (f < 0) is the wrong test for detecting the sign of -0.0f. Use signbit().
Unclear why code is not something like
if (in range)
p = (uint32_t)(f*65536.0f) | (sign<<31);
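Putting some of those points together, a range-checked variant might look roughly like this. It is a sketch only: htonf_checked and its bool/out-parameter interface are invented for this example and are not from Beej's guide, and it still truncates toward zero rather than rounding.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Packs f as sign bit + 15-bit whole part + 16-bit fraction.
   Returns false (and leaves *out alone) if the magnitude does not fit. */
bool htonf_checked(float f, uint32_t *out)
{
    uint32_t sign = signbit(f) ? 1u : 0u;   /* also catches -0.0f */
    float mag = sign ? -f : f;

    /* Written as !(mag < limit) so that NaN is rejected as well. */
    if (!(mag < 32768.0f))
        return false;

    /* mag * 65536 is below 2^31 here, so the conversion is in range
       and no & 0xffff mask is needed. */
    *out = (uint32_t)(mag * 65536.0f) | (sign << 31);
    return true;
}
```

For example, 1.5f packs to 0x00018000 and -1.5f to 0x80018000, while 32768.0f is rejected.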
If x is an unsigned int type is there a difference in these statements:
return (x & 7);
and
return (-x & 7);
I understand negating an unsigned value gives a value of max_int - value. But is there a difference in the return value (i.e. true/false) among the above two statements under any specific boundary conditions OR are they both same functionally?
Test code:
#include <stdio.h>
static unsigned neg7(unsigned x) { return -x & 7; }
static unsigned pos7(unsigned x) { return +x & 7; }
int main(void)
{
for (unsigned i = 0; i < 8; i++)
printf("%u: pos %u; neg %u\n", i, pos7(i), neg7(i));
return 0;
}
Test results:
0: pos 0; neg 0
1: pos 1; neg 7
2: pos 2; neg 6
3: pos 3; neg 5
4: pos 4; neg 4
5: pos 5; neg 3
6: pos 6; neg 2
7: pos 7; neg 1
For the specific case of 4 (and also 0), there isn't a difference; for other values, there is a difference. You can extend the range of the input, but the outputs will produce the same pattern.
If you ask specifically for true/false (i.e. is zero / not zero) and two's complement then there is indeed no difference. (You do however return not just a simple truth value but allow different bit patterns for true. As long as the caller does not distinguish, that is fine.)
Consider how a two's complement negation is formed: invert the bits then increment. Since you take only the least significant bits, there will be no carry in for the increment. This is a necessity, so you can't do this with anything but a range of least significant bits.
Let's look at the two cases:
First, if the three low bits are zero (for a false equivalent). Inverting gives all ones, incrementing turns them to zero again. The fourth and more significant bits might be different, but they don't influence the least significant bits and they don't influence the result since they are masked out. So this stays.
Second, if the three low bits are not all zero (for a true equivalent). The only way this can change into false is when the increment operation leaves them at zero, which can only happen if they were all ones before, which in turn could only happen if they were all zeros before the inversion. That can't be, since that is the first case. Again, the more significant bits don't influence the three low bits and they are masked out. So the result does not change.
But again, this only works when the caller considers only the truth value (all bits zero / not all bits zero) and when the mask allows a range of bits starting from the least significant without a gap.
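The argument above can be checked mechanically. In this sketch (function names made up for illustration) the first function tests the truth-value claim and the second rebuilds the low three bits of -x from the low three bits of x alone, which is why the more significant bits never matter.

```c
#include <stdbool.h>

/* True when x & 7 and -x & 7 agree as truth values (zero / non-zero). */
bool same_truth_value(unsigned x)
{
    return ((x & 7u) != 0) == ((-x & 7u) != 0);
}

/* The low three bits of -x, computed from the low three bits of x only. */
unsigned neg_low3(unsigned x)
{
    return (8u - (x & 7u)) & 7u;
}
```

same_truth_value() holds for every x, and neg_low3() reproduces the "neg" column of the test results above (1 gives 7, 2 gives 6, 4 gives 4, and so on).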
Firstly, negating an unsigned int value produces UINT_MAX - original_value + 1. (For example, 0 remains 0 under negation). The alternative way to describe negation is full inversion of all bits followed by increment.
It is not clear why you'd even ask this question, since it is obvious that basically the very first example that comes to mind — an unsigned int value 1 — already produces different results in your expression. 1u & 7 is 1, while -1u & 7 is 7. Did you mean something else, by any chance?
I'm implementing a relative branching function in my simple VM.
Basically, I'm given an 8-bit relative value. I then shift this left by 1 bit to make it a 9-bit value. So, for instance, if you were to say "branch +127" this would really mean, 127 instructions, and thus would add 256 to the IP.
My current code looks like this:
uint8_t argument = 0xFF; //-1 or whatever
int16_t difference = argument << 1;
*ip += difference; //ip is a uint16_t
I don't believe difference will ever be detected as less than 0 with this, however. I'm rusty on how signed-to-unsigned conversion works. Beyond that, I'm not sure the difference would be correctly subtracted from IP when argument is, say, -1 or -2.
Basically, I'm wanting something that would satisfy these "tests"
//case 1
argument = -5
difference -> -10
ip = 20 -> 10 //ip starts at 20, but becomes 10 after applying difference
//case 2
argument = 127 (must fit in a byte)
difference -> 254
ip = 20 -> 274
Hopefully that makes it a bit more clear.
Anyway, how would I do this cheaply? I saw one "solution" to a similar problem, but it involved division. I'm working with slow embedded processors (assumed to be without efficient ways to multiply and divide), so that's a pretty big thing I'd like to avoid.
To clarify: you worry that left shifting a negative 8 bit number will make it appear like a positive nine bit number? Just pad the top 9 bits with the sign bit of the initial number before left shift:
diff = 0xFF;
int16 diff16=(diff + (diff & 0x80)*0x01FE) << 1;
Now your diff16 is signed 2*diff
As was pointed out by Richard J Ross III, you can avoid the multiplication (if that's expensive on your platform) with a conditional branch:
int16 diff16 = (diff + ((diff & 0x80)?0xFF00:0))<<1;
If you are worried about things staying in range and such ("undefined behavior"), you can do
int16 diff16 = diff;
diff16 = (diff16 | ((diff16 & 0x80)?0x7F00:0))<<1;
At no point does this produce numbers that are going out of range.
The cleanest solution, though, seems to be "cast and shift":
diff16 = (signed char)diff; // recognizes and preserves the sign of diff
diff16 = (short int)((unsigned short)diff16 << 1); // left shift on the unsigned value, then cast back to restore the sign
This produces the expected result, because the compiler automatically takes care of the sign bit (so no need for the mask) in the first line; and in the second line, it does a left shift on an unsigned int (for which overflow is well defined per the standard); the final cast back to short int ensures that the number is correctly interpreted as negative. I believe that in this form the construct is never "undefined".
All of my quotes come from the C standard, section 6.3.1.3. Unsigned to signed is well defined when the value is within range of the signed type:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
Signed to unsigned is well defined:
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Unsigned to signed, when the value lies out of range isn't too well defined:
3 Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Unfortunately, your question lies in the realm of point 3. C doesn't guarantee any implicit mechanism to convert out-of-range values, so you'll need to explicitly provide one. The first step is to decide which representation you intend to use: Ones' complement, two's complement or sign and magnitude
The representation you use will affect the translation algorithm you use. In the example below, I'll use two's complement: If the sign bit is 1 and the value bits are all 0, this corresponds to your lowest value. Your lowest value is another choice you must make: In the case of two's complement, it'd make sense to use either of INT16_MIN (-32768) or INT8_MIN (-128). In the case of the other two, it'd make sense to use INT16_MIN - 1 or INT8_MIN - 1 due to the presence of negative zeros, which should probably be translated to be indistinguishable from regular zeros. In this example, I'll use INT8_MIN, since it makes sense that (uint8_t) -1 should translate to -1 as an int16_t.
Separate the sign bit from the value bits. The value should be the absolute value, except in the case of a two's complement minimum value when sign will be 1 and the value will be 0. Of course, the sign bit can be wherever you like it to be, though it's conventional for it to rest at the far left hand side. Hence, shifting right 7 places obtains the conventional "sign" bit:
uint8_t sign = input >> 7;
uint8_t value = input & (UINT8_MAX >> 1);
int16_t result;
If the sign bit is 1, we'll call this a negative number and add to INT8_MIN to construct the sign so we don't end up in the same conundrum we started with, or worse: undefined behaviour (which is the fate of one of the other answers).
if (sign == 1) {
result = INT8_MIN + value;
}
else {
result = value;
}
This can be shortened to:
int16_t result = (input >> 7) ? INT8_MIN + (input & (UINT8_MAX >> 1)) : input;
... or, better yet:
int16_t result = input <= INT8_MAX ? input
: INT8_MIN + (int8_t)(input % (uint8_t) INT8_MIN);
The sign test now involves checking if it's in the positive range. If it is, the value remains unchanged. Otherwise, we use addition and modulo to produce the correct negative value. This is fairly consistent with the C standard's language above. It works well for two's complement, because int16_t and int8_t are guaranteed to use a two's complement representation internally. However, types like int aren't required to use a two's complement representation internally. When converting unsigned int to int for example, there needs to be another check, so that we're treating values less than or equal to INT_MAX as positive, and values greater than or equal to (unsigned int) INT_MIN as negative. Any other values need to be handled as errors; In this case I treat them as zeros.
/* Generate some random input */
srand(time(NULL));
unsigned int input = rand();
for (unsigned int x = UINT_MAX / ((unsigned int) RAND_MAX + 1); x > 1; x--) {
input *= (unsigned int) RAND_MAX + 1;
input += rand();
}
int result = /* Handle positives: */ input <= INT_MAX ? input
: /* Handle negatives: */ input >= (unsigned int) INT_MIN ? INT_MIN + (int)(input % (unsigned int) INT_MIN)
: /* Handle errors: */ 0;
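For the 8-bit case discussed earlier, the sign/value split can be wrapped up as a small function. This is a sketch under the same two's complement interpretation of the input byte; the function name is made up.

```c
#include <stdint.h>

/* Interprets an 8-bit two's complement pattern as a signed value without
   relying on implementation-defined unsigned-to-signed conversion. */
int16_t widen_twos_complement8(uint8_t input)
{
    if (input <= INT8_MAX)
        return (int16_t)input;                 /* sign bit clear: unchanged */

    /* Sign bit set: start from the lowest value and add the 7 value bits. */
    return (int16_t)(INT8_MIN + (input & (UINT8_MAX >> 1)));
}
```

So 0xFF maps to -1, 0x80 to -128, and 0x7F stays 127. Every intermediate value stays within the range of int, so nothing here depends on out-of-range conversions.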
If the offset is in the 2's complement representation, then
convert this
uint8_t argument = 0xFF; //-1
int16_t difference = argument << 1;
*ip += difference;
into this:
uint8_t argument = 0xFF; //-1
int8_t signed_argument;
signed_argument = argument; // this relies on implementation-defined
// conversion of unsigned to signed, usually it's
// just a bit-wise copy on 2's complement systems
// OR
// memcpy(&signed_argument, &argument, sizeof argument);
*ip += signed_argument + signed_argument;
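As a sketch of how the whole branch could be packaged and checked against the two test cases in the question (apply_branch is a hypothetical name, and the explicit sign test is one option among the several shown in these answers):

```c
#include <stdint.h>

/* Applies a relative branch: the 8-bit argument is a two's complement
   instruction count, and each instruction is 2 bytes. */
uint16_t apply_branch(uint16_t ip, uint8_t argument)
{
    /* Sign-extend by hand so no implementation-defined conversion is needed. */
    int16_t offset = (argument & 0x80) ? (int16_t)(argument - 256)
                                       : (int16_t)argument;

    /* Doubling by addition avoids a multiply and avoids shifting a
       negative value. The result wraps mod 2^16 like the uint16_t ip. */
    return (uint16_t)(ip + offset + offset);
}
```

With ip = 20, an argument of -5 (0xFB) gives 10 and an argument of 127 gives 274, matching case 1 and case 2.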
Suppose I have an increasing sequence of unsigned integers C[i]. As they increase, it's likely that they will occupy increasingly many bits. I'm looking for an efficient conditional, based purely on two consecutive elements of the sequence C[i] and C[i+1] (past and future ones are not observable), that will evaluate to true either exactly or approximately once for every time the number of bits required increases.
An obvious (but slow) choice of conditional is:
if (ceil(log2(C[i+1])) > ceil(log2(C[i]))) ...
and likewise anything that computes the number of leading zero bits using special cpu opcodes (much better but still not great).
I suspect there may be a nice solution involving an expression using just bitwise or and bitwise and on the values C[i+1] and C[i]. Any thoughts?
Suppose your two numbers are x and y. If they have the same high order bit, then x^y is less than both x and y. Otherwise, it is higher than one of the two.
So
v = x^y
if (v > x || v > y) { ...one more bit... }
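Written as a function, the XOR test might look like this. It is a sketch assuming 32-bit values, and the function name is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when x and y have their highest set bit in different positions.
   If the top bits coincide, x ^ y clears that bit and is smaller than both;
   otherwise x ^ y keeps the higher top bit and exceeds the smaller value. */
bool high_bit_differs(uint32_t x, uint32_t y)
{
    uint32_t v = x ^ y;
    return v > x || v > y;
}
```

For example, high_bit_differs(7, 8) is true (3 bits versus 4), while high_bit_differs(8, 15) is false.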
I think you just need clz(C[i+1]) < clz(C[i]) where clz is a function which returns the number of leading zeroes ("count leading zeroes"). Some CPU families have an instruction for this (which may be available as an intrinsic). If not then you have to roll your own (it typically only takes a few instructions) - see Hacker's Delight.
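If no intrinsic is available, a portable fallback could be as simple as the sketch below. It is a plain loop rather than the faster versions in Hacker's Delight, it assumes 32-bit values, and the names clz32 and needs_more_bits are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable count-leading-zeroes for 32-bit values; returns 32 for x == 0. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & UINT32_C(0x80000000)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* True when `next` needs more bits than `prev`. */
bool needs_more_bits(uint32_t prev, uint32_t next)
{
    return clz32(next) < clz32(prev);
}
```

needs_more_bits(7, 8) is true and needs_more_bits(8, 15) is false.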
Given (I believe this comes from Hacker's Delight):
int hibit(unsigned int n) {
n |= (n >> 1);
n |= (n >> 2);
n |= (n >> 4);
n |= (n >> 8);
n |= (n >> 16);
return n - (n >> 1);
}
Your conditional is simply hibit(C[i]) != hibit(C[i+1]).
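A variant of the same idea with an unsigned return type, so that inputs with the top bit set don't depend on an out-of-range conversion to int. This is a sketch assuming 32-bit values; hibit32 and bit_count_changed are names invented here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Isolates the highest set bit of n (returns 0 for n == 0). */
uint32_t hibit32(uint32_t n)
{
    n |= n >> 1;    /* smear the top bit downwards ... */
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;   /* ... until every lower bit is set */
    return n - (n >> 1);
}

/* True when prev and next have different highest set bits. */
bool bit_count_changed(uint32_t prev, uint32_t next)
{
    return hibit32(prev) != hibit32(next);
}
```

For instance, hibit32(5) is 4 and hibit32(255) is 128, so bit_count_changed(255, 256) is true.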
BSR - Bit Scan Reverse (386+)
Usage: BSR dest,src
Modifies flags: ZF
Scans source operand for first bit set. Clears ZF if a bit is found
set and loads the destination with an index to first set bit. Sets
ZF if no bits are found set (source is zero). BSF scans forward across
bit pattern (0-n) while BSR scans in reverse (n-0).
                         Clocks                  Size
Operands         808x   286    386      486      Bytes
reg,reg           -      -    10+3n    6-103       3
reg,mem           -      -    10+3n    7-104      3-7
reg32,reg32       -      -    10+3n    6-103      3-7
reg32,mem32       -      -    10+3n    7-104      3-7
You need two of these (on C[i] and C[i+1]) and a compare.
Keith Randall's solution is good, but you can save one xor instruction by using the following code which processes the entire sequence in O(w + n) instructions, where w is the number of bits in a word, and n is the number of elements in the sequence. If the sequence is long, most iterations will only involve one comparison, avoiding one xor instruction.
This is accomplished by tracking the highest power of two that has been reached as follows:
t = 1; // original setting
if (c[i + 1] >= t) {
do {
t <<= 1;
} while (c[i + 1] >= t); // watch for overflow
... // conditional code here
}
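One way to deal with the overflow noted in the comment is to keep the threshold in a wider type than the values. The sketch below assumes 32-bit sequence values and a 64-bit threshold; the struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t threshold;   /* smallest power of two not yet reached */
} bit_growth_tracker;

void tracker_init(bit_growth_tracker *t)
{
    t->threshold = 1;     /* the "original setting" t = 1 */
}

/* Returns true when `value` reaches a new power of two. The threshold is
   64 bits wide, so it can reach 2^32 without wrapping to zero. */
bool tracker_update(bit_growth_tracker *t, uint32_t value)
{
    if (value < t->threshold)
        return false;     /* common case: a single comparison */
    do {
        t->threshold <<= 1;
    } while (value >= t->threshold);
    return true;
}
```

Feeding it the sequence 1, 2, 3, 4, 7, 8, 100 yields true, true, false, true, false, true, true.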
The number of bits goes up when the value is about to overflow a power of two. A simple test is then, is the value equal to a power of two, minus 1? This can be accomplished by asking:
if ((C[i] & (C[i]+1))==0) ...
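As a function, that test might look like the sketch below (32-bit values assumed, name made up). Note that it fires on the last value before the bit count grows, so it is only reliable if the sequence actually passes through that value, for example when it increases by one each step.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when x has the form 2^k - 1 (all low bits set), including x == 0. */
bool is_pow2_minus_1(uint32_t x)
{
    return (x & (x + 1u)) == 0;
}
```

It is true for 0, 1, 3, 7, 15 and so on, and false for values such as 2, 5, 6 or 8.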
The number of bits goes up when the value is about to overflow a power of two.
A simple test is then:
while (C[i] >= (1u << number_of_bits)) number_of_bits++;
If you want it even faster:
int number_of_bits = 1;
int two_to_number_of_bits = 1<<number_of_bits ;
... your code ....
while ( C[i]>=two_to_number_of_bits )
{ number_of_bits++;
two_to_number_of_bits = 1<<number_of_bits ;
}