Convert continuous binary fraction to decimal fraction in C

I implemented a digit-by-digit calculation of the square root of two. Each round it outputs one bit of the fractional part, e.g.
1 0 1 1 0 1 0 1 etc.
I want to convert this output to decimal numbers:
4 1 4 2 1 3 6 etc.
The issue I'm facing is that the conversion would generally work like this:
1 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 etc.
I would like to avoid fractions altogether and work only with integers to convert from binary to decimal. I would also like to print each decimal digit as soon as it has been computed.
Converting to hex is trivial, as I only have to wait for 4 bits. Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and, ideally, removing digits from the equation once we are certain that they won't change anymore, i.e.
1 0
2 0.25
3 0.375
4 0.375
5 0.40625
6 0.40625
7 0.4140625
8 0.4140625
After processing the 8th bit, I'm pretty sure that 4 is the first decimal fraction digit. Therefore I would like to remove 0.4 completely from the equation to reduce the number of bits I need to take care of.

Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and ideally removing digits from the equation, once we are certain that it won't change anymore?
Yes, eventually, in practice; but in theory, no, in select cases.
This is akin to the Table-maker's dilemma.
Consider the handling below of a value near 0.05. As long as the binary sequence is .0001 1001 1001 1001 1001 ..., we cannot know whether the decimal equivalent is 0.04999999... or 0.05000000...(followed by something non-zero).
#include <math.h>
#include <stdio.h>

int main(void) {
    double a;
    a = nextafter(0.05, 0);
    printf("%20a %.20f\n", a, a);
    a = 0.05;
    printf("%20a %.20f\n", a, a);
    a = nextafter(0.05, 1);
    printf("%20a %.20f\n", a, a);
    return 0;
}
0x1.9999999999999p-5 0.04999999999999999584
0x1.999999999999ap-5 0.05000000000000000278
0x1.999999999999bp-5 0.05000000000000000971
Code can analyse the incoming sequence of binary fraction bits and then ask two questions after each bit: "if the remaining bits are all 0, what is it in decimal?" and "if the remaining bits are all 1, what is it in decimal?". In many cases, the answers will share common leading significant digits. Yet, as shown above, as long as 1001 keeps arriving, there are no common significant decimal digits.
A usual "out" is to have an upper bound on the number of decimal digits that will ever be shown. In that case the code is only presenting a rounded result, and that can be deduced in finite time even if the binary input sequence remains 1001 ad nauseam.
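Below is a minimal sketch of that two-bounds idea (my own illustration, not code from the answer). It uses the scaled-integer representation discussed in the next answer: after n bits the value so far is low/10^n, and if every remaining bit were 1 the value would approach (low + 5^n)/10^n, so any leading decimal digits shared by both bounds are safe to print. The bits are generated from the double result of sqrt(2) purely to drive the example, and unsigned long long limits the sketch to 19 bits.
#include <math.h>
#include <stdio.h>

int main(void) {
    double frac = sqrt(2.0) - 1.0;            /* source of example bits only */
    unsigned long long low = 0, pow5 = 1;
    char lo_s[32], hi_s[32];
    for (int n = 1; n <= 19; n++) {
        frac *= 2.0;                          /* next bit of the fraction */
        int bit = frac >= 1.0;
        if (bit) frac -= 1.0;
        pow5 *= 5;                            /* 5^n */
        low = low * 10 + (bit ? pow5 : 0);    /* value so far, scaled by 10^n */
        snprintf(lo_s, sizeof lo_s, "%0*llu", n, low);         /* remaining bits all 0 */
        snprintf(hi_s, sizeof hi_s, "%0*llu", n, low + pow5);  /* remaining bits all 1 */
        int common = 0;
        while (lo_s[common] && lo_s[common] == hi_s[common])
            common++;
        printf("bit %2d = %d, settled digits: %.*s\n", n, bit, common, lo_s);
    }
    return 0;
}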

The issue I'm facing is that this would generally work like this:
1 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 etc.
Well, 1/2 = 5/10 and 1/4 = 25/100 and so on, which means you will need powers of 5 and shift the values by powers of 10.
So, given the bits 0 1 1 0 1:
[1] 0 * 5 = 0
[2] 0 * 10 + 1 * 25 = 25
[3] 25 * 10 + 1 * 125 = 375
[4] 375 * 10 + 0 * 625 = 3750
[5] 3750 * 10 + 1 * 3125 = 40625
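Here is a short C sketch of the accumulation shown above (my own illustration): each incoming bit i (1-based) contributes bit * 5^i, and the running total is shifted one decimal place first. unsigned long long limits this to roughly 19 bits.
#include <stdio.h>

int main(void) {
    int bits[] = {0, 1, 1, 0, 1};        /* the example bits */
    unsigned long long sum = 0, pow5 = 1;
    for (int i = 0; i < 5; i++) {
        pow5 *= 5;                       /* 5^(i+1) */
        sum = sum * 10 + bits[i] * pow5; /* step [i+1] from the list above */
        printf("[%d] %llu\n", i + 1, sum);
    }
    return 0;
}
This prints 0, 25, 375, 3750, 40625, i.e. the running value 0.40625.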
Edit:
Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and ideally removing digits from the equation, once we are certain that it won't change anymore?
It might actually be possible to pop the most significant digits (MSD) in this case. This will be a bit long, but please bear with me.
Consider the values X and Y:
If X has the same number of digits as Y, then the MSD will change.
10000 + 10000 = 20000
If Y has 1 or more digits less than X, then the MSD can change.
19000 + 1000 = 20000
19900 + 100 = 20000
So the first point is self-explanatory, but the second point is what will allow us to pop the MSD. The first thing we need to know is that the value we are adding is continuously being halved every iteration. This means that if we only consider the MSD, the largest value in base 10 is 9, which will produce the sequence
9 > 4 > 2 > 1 > 0
If we sum up these values it will be equal to 16, but if we try to consider the values of the next digits (e.g. 9.9 or 9.999), the total actually approaches 20, though it doesn't exceed 20. What this means is that if X has n digits and Y has n-1 digits, the MSD of X can still change. But if X has n digits and Y has n-2 digits, then as long as the (n-1)th digit of X is less than 8, the MSD will not change (otherwise it could be 8 + 2 = 10 or 9 + 2 = 11, which means that the MSD could change). Here are some examples.
Assuming X is the running sum for sqrt(2) and Y is 5^n:
1. If X = 10000 and Y = 9000 then the MSD of X can change.
2. If X = 10000 and Y = 900 then the MSD of X will not change.
3. If X = 19000 and Y = 900 then the MSD of X can change.
4. If X = 18000 and Y = 999 then the MSD of X can change.
5. If X = 17999 and Y = 999 then the MSD of X will not change.
6. If X = 19990 and Y = 9 then the MSD of X can change.
In the examples above, in points #2 and #5 the 1 can already be popped. However, for point #6, it is possible to end up with 19990 + 9 + 4 = 20003, but this also means that both the 2 and the 0 can be popped after that has happened.
Here's a simulation for sqrt(2):
 i  Out   X                  Y                flag
--------------------------------------------------
 1        0                  5                0
 2        25                 25               1
 3        375                125              1
 4        3,750              625              0
 5        40,625             3,125            1
 6        406,250            15,625           0
 7  4     140,625            78,125           1
 8  4     1,406,250          390,625          0
 9  4     14,062,500         1,953,125        0
10  41    40,625,000         9,765,625        0
11  41    406,250,000        48,828,125       0
12  41    4,062,500,000      244,140,625      0
13  41    41,845,703,125     1,220,703,125    1
14  414   18,457,031,250     6,103,515,625    0
15  414   184,570,312,500    30,517,578,125   0
16  414   1,998,291,015,625  152,587,890,625  1
17  4142  0,745,849,609,375  762,939,453,125  1
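For what it's worth, here is a rough C sketch of the digit-count test described above (a hypothetical helper of my own, not code from the answer): given the running sum X and the next term Y = 5^n, it reports whether the MSD of X can still change under the stated rule.
#include <stdio.h>

static int digits(unsigned long long v) {
    int d = 1;
    while (v >= 10) { v /= 10; d++; }
    return d;
}

/* 1 when the most significant digit of X can no longer change */
static int msd_is_stable(unsigned long long x, unsigned long long y) {
    int dx = digits(x), dy = digits(y);
    if (dy > dx - 2)
        return 0;                /* Y is too close in size: the MSD can still change */
    unsigned long long t = x;
    while (t >= 100)
        t /= 10;                 /* t % 10 is now the second most significant digit of X */
    return (t % 10) < 8;         /* an 8 or 9 here could still carry into the MSD */
}

int main(void) {
    /* the six examples from the list above; prints 0 1 0 0 1 0 */
    printf("%d %d %d %d %d %d\n",
           msd_is_stable(10000, 9000), msd_is_stable(10000, 900),
           msd_is_stable(19000, 900),  msd_is_stable(18000, 999),
           msd_is_stable(17999, 999),  msd_is_stable(19990, 9));
    return 0;
}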

You can use a multiply-and-divide approach to reduce the floating-point arithmetic.
1 0 1 1
which is equivalent to 1*2^0 + 0*2^(-1) + 1*2^(-2) + 1*2^(-3) and can be simplified to (1*2^3 + 0*2^2 + 1*2^1 + 1*2^0)/(2^3). Only the division remains a floating-point operation; everything else is integer arithmetic. Multiplication by 2 can be implemented as a left shift.
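A small sketch of that idea (my own example): accumulate the bits as an integer numerator with left shifts, keep the denominator as a power of two, and perform a single floating-point division at the end. The pattern 1 0 1 1, read as 1.011 binary, gives 11/8 = 1.375.
#include <stdio.h>

int main(void) {
    int bits[] = {1, 0, 1, 1};      /* one integer bit, three fraction bits */
    unsigned numerator = 0;
    unsigned denominator = 1u << 3; /* 2^3, one factor of 2 per fraction bit */
    for (int i = 0; i < 4; i++)
        numerator = (numerator << 1) | bits[i];
    printf("%u / %u = %f\n", numerator, denominator, (double)numerator / denominator);
    return 0;
}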

float %.2f round but double %.2lf not rounding and how to bypass it? [duplicate]

I have a problem with the accuracy of float in C/C++. When I execute the program below:
#include <stdio.h>

int main (void) {
    float a = 101.1;
    double b = 101.1;
    printf ("a: %f\n", a);
    printf ("b: %lf\n", b);
    return 0;
}
Result:
a: 101.099998
b: 101.100000
I believe a float has 32 bits, which should be enough to store 101.1. Why is this happening?
You can only represent numbers exactly in IEEE 754 (at least for the single and double precision binary formats) if they can be constructed by adding together inverted powers of two (i.e., 2^-n, like 1, 1/2, 1/4, 1/65536 and so on), subject to the number of bits available for precision.
There is no combination of inverted powers of two that will get you exactly to 101.1, within the scaling provided by floats (23 bits of precision) or doubles (52 bits of precision).
If you want a quick tutorial on how this inverted-power-of-two stuff works, see this answer.
Applying the knowledge from that answer to your 101.1 number (as a single precision float):
s eeeeeeee mmmmmmmmmmmmmmmmmmmmmmm  1/n
0 10000101 10010100011001100110011
           |  | |   ||  ||  ||  |+- 8388608
           |  | |   ||  ||  ||  +-- 4194304
           |  | |   ||  ||  |+----- 524288
           |  | |   ||  ||  +------ 262144
           |  | |   ||  |+--------- 32768
           |  | |   ||  +---------- 16384
           |  | |   |+------------- 2048
           |  | |   +-------------- 1024
           |  | +------------------ 64
           |  +-------------------- 16
           +----------------------- 2
The mantissa part of that actually continues forever for 101.1:
mmmmmmmmm mmmm mmmm mmmm mm
100101000 1100 1100 1100 11|00 1100 (and so on).
hence it's not a matter of precision; no finite number of bits will represent that number exactly in IEEE 754 format.
Using the bits to calculate the actual number (closest approximation): the sign is positive. The exponent is 128 + 4 + 1 = 133; subtracting the 127 bias gives 6, so the multiplier is 2^6, or 64.
The mantissa consists of 1 (the implicit base) plus, for all those set bits (each being worth 1/(2^n), where n starts at 1 and increases to the right), {1/2, 1/16, 1/64, 1/1024, 1/2048, 1/16384, 1/32768, 1/262144, 1/524288, 1/4194304, 1/8388608}.
When you add all these up, you get 1.57968747615814208984375.
When you multiply that by the multiplier previously calculated, 64, you get 101.09999847412109375.
All numbers were calculated with bc using a scale of 100 decimal digits, resulting in a lot of trailing zeros, so the numbers should be very accurate. Doubly so, since I checked the result with:
#include <stdio.h>

int main (void) {
    float f = 101.1f;
    printf ("%.50f\n", f);
    return 0;
}
which also gave me 101.09999847412109375000....
You need to read more about how floating-point numbers work, especially the part on representable numbers.
You're not giving much of an explanation as to why you think that "32 bits should be enough for 101.1", so it's kind of hard to refute.
Binary floating-point numbers don't work well for all decimal numbers, since they basically store the number in, wait for it, base 2. As in binary.
This is a well-known fact, and it's the reason why e.g. money should never be handled in floating-point.
Your number 101.1 in base 10 is 1100101.0(0011) in base 2. The 0011 part is repeating. Thus, no matter how many digits you'll have, the number cannot be represented exactly in the computer.
Looking at the IEEE 754 standard for floating point, you can find out why the double version seemed to show it entirely.
PS: Derivation showing that 101.1 in base 10 is 1100101.0(0011) in base 2:
101 = 64 + 32 + 4 + 1
101 -> 1100101
.1 * 2 = .2 -> 0
.2 * 2 = .4 -> 0
.4 * 2 = .8 -> 0
.8 * 2 = 1.6 -> 1
.6 * 2 = 1.2 -> 1
.2 * 2 = .4 -> 0
.4 * 2 = .8 -> 0
.8 * 2 = 1.6 -> 1
.6 * 2 = 1.2 -> 1
.2 * 2 = .4 -> 0
.4 * 2 = .8 -> 0
.8 * 2 = 1.6 -> 1
.6 * 2 = 1.2 -> 1
.2 * 2 = .4 -> 0
.4 * 2 = .8 -> 0
.8 * 2 = 1.6 -> 1
.6 * 2 = 1.2 -> 1
.2 * 2 = .4 -> 0
.4 * 2 = .8 -> 0
.8 * 2 = 1.6 -> 1
.6 * 2 = 1.2 -> 1
.2 * 2....
PPS: It's the same if you'd wanted to store exactly the result of 1/3 in base 10.
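For completeness, here is a small C sketch of the doubling procedure used in the derivation above (my own illustration): repeatedly multiply the fractional part by 2 and emit the integer part as the next binary digit. Printing 24 digits of 0.1 shows the repeating 0011 pattern (a double is used only to drive the example, so the output is accurate only for the leading digits).
#include <stdio.h>

int main(void) {
    double frac = 0.1;
    printf("0.1 (base 10) = 0.");
    for (int i = 0; i < 24; i++) {
        frac *= 2.0;                 /* .1 * 2 = .2 -> digit 0, .8 * 2 = 1.6 -> digit 1, ... */
        int bit = frac >= 1.0;
        if (bit)
            frac -= 1.0;
        printf("%d", bit);
    }
    printf("... (base 2)\n");
    return 0;
}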
What you see here is the combination of two factors:
- IEEE 754 floating-point representation is not capable of accurately representing a whole class of rational numbers, nor any irrational number
- the effect of rounding (by default to 6 decimal places here) in printf; that is to say, the error when using a double occurs somewhere to the right of the 6th decimal place
If you printed the double with more digits, you'd see that even double cannot represent it exactly:
printf ("b: %.16f\n", b);
b: 101.0999999999999943
The thing is, float and double use a binary format, and not all floating-point numbers can be represented exactly in that format.
Unfortunately, most decimal floating point numbers cannot be accurately represented in (machine) floating point. This is just how things work.
For instance, the number 101.1 in binary will be represented like 1100101.0(0011) (the 0011 part is repeated forever), so no matter how many bytes you have to store it, it will never become accurate. Here is a little article about binary representation of floating point, and here you can find some examples of converting floating point numbers to binary.
If you want to learn more on this subject, I could recommend you this article, though it's long and not too easy to read.

binomial coefficient for very high numbers in c

So the task I have to solve is to calculate the binomial coefficient for 100 >= n > k >= 1 and then say how many solutions for n and k are over an under barrier of 123456789.
I have no problem with my formula for calculating the binomial coefficient, but for high numbers (n & k -> 100) the data types of C become too small to calculate this.
Do you have any suggestions how I can bypass this problem of overflowing the data types?
I thought about dividing by the under barrier straight away so the numbers don't get too big in the first place, and then I just have to check whether the result is >= 1, but I couldn't make it work.
Say your task is to determine how many binomial coefficients C(n, k) for 1 ≤ k < n ≤ 8 exceed a limit of m = 18. You can do this by using the recurrence C(n, k) = C(n − 1, k) + C(n − 1, k − 1) that can visualized in Pascal's triangle.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 (20) 15 6 1
1 7 (21 35 35 21) 7 1
1 8 (28 56 70 56 28) 8 1
Start at the top and work your way down. Up to n = 5, everything is below the limit of 18. On the next line, the 20 exceeds the limit. From now on, more and more coefficients are beyond 18.
The triangle is symmetric and strictly increasing in the first half of each row. You only need to find the first element that exceeds the limit on each line in order to know how many items to count.
You don't have to store the whole triangle. It is enough to keep the last and the current line. Alternatively, you can use the algorithm detailed in this article to work your way from left to right on each row. Since you just want to count the coefficients that exceed a limit and don't care about their values, the regular integer types should be sufficient.
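A sketch of the row-by-row counting described above (my own code, not the linked article's): build each row of Pascal's triangle with the recurrence C(n,k) = C(n-1,k) + C(n-1,k-1), clamp every entry at the limit plus one so nothing overflows, and count the entries with 1 <= k < n that exceed the limit.
#include <stdio.h>

#define MAX_N 100
#define LIMIT 123456789ULL

int main(void) {
    static unsigned long long row[MAX_N + 1];     /* current row, updated in place */
    unsigned long long count = 0;
    row[0] = 1;
    for (int n = 1; n <= MAX_N; n++) {
        row[n] = 1;
        for (int k = n - 1; k >= 1; k--) {        /* right to left keeps the previous row intact */
            row[k] += row[k - 1];
            if (row[k] > LIMIT)
                row[k] = LIMIT + 1;               /* clamp: the exact value no longer matters */
        }
        for (int k = 1; k < n; k++)
            if (row[k] > LIMIT)
                count++;
    }
    printf("C(n,k) with 1 <= k < n <= %d above %llu: %llu\n",
           MAX_N, (unsigned long long)LIMIT, count);
    return 0;
}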
First, you'll need a type that can handle the result. The largest number you need to handle is C(100,50) = 100,891,344,545,564,193,334,812,497,256. This number requires 97 bits of precision, so your normal data types won't do the trick. A quad-precision IEEE float would do the trick if your environment provides it. Otherwise, you'll need some form of high/arbitrary-precision library.
Then, to keep the numbers within this size, you'll want to cancel common terms in the numerator and the denominator. And you'll want to calculate the result as ( a / c ) * ( b / d ) * ... instead of ( a * b * ... ) / ( c * d * ... ).
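As a hedged sketch of that interleaved multiply/divide idea, assuming a compiler that provides the unsigned __int128 extension (gcc/clang): computing result = result * (n-k+i) / i for i = 1..k stays exact at every step, because after each step the running value is itself a binomial coefficient, and 128 bits comfortably hold the 97-bit C(100,50).
#include <stdio.h>

static unsigned __int128 binom(int n, int k) {
    if (k > n - k)
        k = n - k;                             /* C(n,k) = C(n,n-k); use the smaller k */
    unsigned __int128 result = 1;
    for (int i = 1; i <= k; i++)
        result = result * (n - k + i) / i;     /* division is exact at every step */
    return result;
}

static void print_u128(unsigned __int128 v) {  /* printf has no conversion for __int128 */
    char buf[41];
    int pos = sizeof buf;
    buf[--pos] = '\0';
    do { buf[--pos] = (char)('0' + (int)(v % 10)); v /= 10; } while (v);
    puts(&buf[pos]);
}

int main(void) {
    print_u128(binom(100, 50));                /* 100891344545564193334812497256 */
    return 0;
}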

How to sort Hexadecimal Numbers (like 10 1 A B) in C?

I want to implement a sorting algorithm in C for sorting hexadecimal numbers like:
10 1 A B
to this:
1 A B 10
The problem that I am facing here is that I don't understand how A & B are less than 10, as A = 10 and B = 11 in hexadecimal. I'm sorry if I am mistaken.
Thank you!
As mentioned in the previous comments, 10 is 0x10, so this sorting seems to be no problem: 0x1 < 0xA < 0xB < 0x10
In any base a number with two digits is always greater than a number with one digit.
In hexadecimal notation we have 6 more digits available than in decimal, but they still count as one "digit":
hexadecimal digit | value in decimal representation
A | 10
B | 11
C | 12
D | 13
E | 14
F | 15
When you get a number in hexadecimal notation, it might be that its digits happen to use none of the above extra digits, but just the well-known 0..9 digits. This can be confusing, as we still must treat them as hexadecimal. In particular, a digit in a multi-digit hexadecimal representation must be multiplied by a power of 16 (instead of 10) to be correctly interpreted. So when you get 10 as a hexadecimal number, it has a value of one (1) times sixteen plus zero (0), so the hexadecimal number 10 has a (decimal) value of 16.
The hexadecimal numbers you gave should therefore be ordered as 1 < A < B < 10.
As a more elaborate example, the hexadecimal representation 1D6A can be converted to decimal like this:
1D6A
│││└─> 10 x 16⁰ = 10
││└──> 6 x 16¹ = 96
│└───> 13 x 16² = 3328
└────> 1 x 16³ = 4096
──── +
7530
Likewise
10
│└─> 0 x 16⁰ = 0
└──> 1 x 16¹ = 16
── +
16
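If you just want working code, here is a small illustration (mine, not from the answers): parse the tokens as base 16 with strtoul and sort the values with qsort, which yields exactly the order 1 A B 10 discussed above.
#include <stdio.h>
#include <stdlib.h>

static int cmp_ulong(const void *a, const void *b) {
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

int main(void) {
    const char *tokens[] = {"10", "1", "A", "B"};
    unsigned long values[4];
    for (int i = 0; i < 4; i++)
        values[i] = strtoul(tokens[i], NULL, 16);   /* interpret each token as hexadecimal */
    qsort(values, 4, sizeof values[0], cmp_ulong);
    for (int i = 0; i < 4; i++)
        printf("%lX ", values[i]);                  /* prints: 1 A B 10 */
    printf("\n");
    return 0;
}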

Fastest way to compute sum of first set bit over consecutive integers?

Edit: I wish SO let me accept 2 answers because neither is complete without the other. I suggest reading both!
I am trying to come up with a fast implementation of a function that, given an unsigned 32-bit integer x, returns the sum of 2^trailing_zeros(i) for i=1..x-1, where trailing_zeros is the count-trailing-zeros operation, defined as the number of 0 bits below the least significant 1 bit. This seems like the kind of problem that should lend itself to a clever bit-manipulation implementation that takes the same number of instructions regardless of the input, but I haven't been able to derive it.
Mathematically, 2^trailing_zeros(i) is equivalent to the largest factor of 2 that exactly divides i. So we are summing those largest factors for 1..x-1.
i | 1 2 3 4 5 6 7 8 9 10
-----------------------------------------------------------------------
2^trailing_zeroes(i) | 1 2 1 4 1 2 1 8 1 2
-----------------------------------------------------------------------
Sum (desired value) | 0 1 3 4 8 9 11 12 20 21
It is a little easier to see the structure of 2^trailing_zeroes(i) if we 'plot' the values -- horizontal position increasing from left to right corresponding to i and vertical position increasing from top to bottom corresponding to trailing_zeroes(i).
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
16 16 16 16 16 16 16 16
32 32 32 32
64 64
Here it is easier to see the pattern that 2's are always 4 apart, 8's are always 16 apart, etc. However, each pattern starts at a different time -- 8's don't begin until i=8, 16 doesn't begin until i=16, etc. If you don't take into account that the patterns don't start right away you can come up with formulas that don't work -- for example you might think to determine the number of 8's going into the total you should just compute floor(x/16) but i=25 is far enough to the right to include both of the first two 8s.
The best solution I have come up with so far is:
Set n = floor(log2(x)). This can be computed quickly using the count leading zeros operation. This tells us the highest power of two that is going to be involved in the sum.
Set sum = 0
for i = 1..n
sum += floor((x - 2^i) / 2^(i+1))*2^i + 2^i
The way this works is, for each power, it calculates the horizontal distance on the plot between x and the first appearance of that power, e.g. the distance between x and the first 8 is (x-8), and then it divides by the distance between repeating instances of that power, e.g. floor((x-8)/16), which gives us how many times that power appeared after the first one; multiplying by the power gives that power's contribution, e.g. floor((x-8)/16)*8. Then we add one more instance of the given power, because that calculation excludes the very first time the power appears.
In practice this implementation should be pretty fast because the division/floor can be done with a right bit shift and the powers of two with a 1 shifted to the left. However it seems like it should still be possible to do better. This implementation will loop more for larger inputs, up to 32 times (it's O(log2(n)); ideally we want O(1) without a gigantic lookup table using up all the CPU cache). I've been eyeing the BMI/BMI2 intrinsics but I don't see an obvious way to apply them.
Although my goal is to implement this in a compiled language like C++ or Rust with real bit shifting and intrinsics, I've been prototyping in Python. Included below is my script that includes the implementation I described, z(x), and the code for generating the plot, tower(x).
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from math import pow, floor, log, ceil

def leading_zeros(x):
    return len(bin(x).split('b')[-1].split('1')[-1])

def f(x):
    s = 0
    for c, i in enumerate(range(1,x)):
        a = pow(2, len(bin(i).split('b')[-1].split('1')[-1]))
        s += a
    return s

def g(x): return sum([pow(2,i)*floor((x+pow(2,i)-1)/pow(2,i+1)) for i in range(0,32)])

def h(x):
    s = 0
    extra = 0
    extra_s = 0
    for i in range(0,32):
        num = (x+pow(2,i)-1)
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(num/den)
        power = pow(2,i)
        product = power*floored
        if product == 0:
            break
        s += product
        extra += (fraction - floored)
        extra_s += power*fraction
        #print(f"i={i} s={s} num={num} den={den} fraction={fraction} floored={floored} power={power} product={product} extra={extra} extra_s={extra_s}")
    return s

def z(x):
    upper_bound = floor(log(x,2)) if x > 0 else 0
    s = 0
    for i in range(upper_bound+1):
        num = (x - pow(2,i))
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(fraction)
        added = pow(2,i)
        s += floored * added
        s += added
        print(f"i={i} s={s} upper_bound={upper_bound} num={num} den={den} floored={floored} added={added}")
    return s
    # return sum([floor((x - pow(2,i))/pow(2,i+1) + pow(2,i)) for i in range(floor(log(x, 2)))])

def tower(x):
    table = [[" " for i in range(x)] for j in range(ceil(log(x,2)))]
    for i in range(1,x):
        p = leading_zeros(i)
        table[p][i] = 2**p
    for row in table:
        for col in row:
            print(col,end='')
        print()

# h(9000)
for i in range(1,16):
    tower(i)
    print((i, f(i), g(i), h(i), z(i-1)))
Based on the method of Eric Postpischil, here is a way to do it without a loop.
Note that every bit is being multiplied by its position, and the results are summed (sort of; there is also a factor of 0.5 in it, but let's put that aside for now). Let's call the values that are being added up "the partial products", just to call them something; it's not really accurate to call them that, but I can't come up with anything better. If we transpose that a little bit, then it's built up like this: the lowest bit of every partial product is the lowest bit of the position of every bit, multiplied by that bit. Single-bit products are a bitwise AND, and the values of the lowest bits of the positions are 0,1,0,1 etc., so it works out to x & 0xAAAAAAAA; the second bit of every partial product is x & 0xCCCCCCCC (and has a "weight" of 2, so this must be multiplied by 2), etc.
Then the whole thing needs to be shifted right by 1, to account for the factor of 0.5
So in total:
unsigned CountCumulativeTrailingZeros(unsigned x)
{
    --x;
    unsigned sum = x;
    sum += (x >> 1) & 0x55555555;
    sum += x & 0xCCCCCCCC;
    sum += (x & 0xF0F0F0F0) << 1;
    sum += (x & 0xFF00FF00) << 2;
    sum += (x & 0xFFFF0000) << 3;
    return sum;
}
For an additional explanation, here is a more visual example. Let's temporarily drop the factor of 0.5 again, it doesn't fundamentally change the algorithm but adds some complication.
First I write above every bit of v (some example value), the position of that bit in binary (p0 is the least significant bit of the position, p1 the second bit etc). Read the ps vertically, every column is a number:
p0: 10101010101010101010101010101010
p1: 11001100110011001100110011001100
p2: 11110000111100001111000011110000
p3: 11111111000000001111111100000000
p4: 11111111111111110000000000000000
v : 00000000100001000000001000000000
So for example bit 9 is set, and it has (reading from bottom to top) 01001 above it (9 in binary).
What we want to do (why this works has been explained by Eric's answer), is take the indexes of the bits that are set, shift them to their corresponding positions, and add them. In this case, they are already at their own positions (by construction, the numbers were written at their own positions), so there is no shift, but they still need to be filtered so only the numbers that correspond to set bits survive. This is what I meant by the "single bit products": take a bit of v and multiply it by the corresponding bits of p0, p1, etc.
You can look at that as multiplying the bit value by its index as well so 2^bit * bit as mentioned in the comments. That is not how it is done here, but that is effectively what is done.
Back to the example, applying bitwise-AND results in these partial products:
pp0: 00000000100000000000001000000000
pp1: 00000000100001000000000000000000
pp2: 00000000100000000000000000000000
pp3: 00000000000000000000001000000000
pp4: 00000000100001000000000000000000
v : 00000000100001000000001000000000
The only values that are left are 01001, 10010, 10111, and they are at their corresponding positions (so, already shifted to where they need to go).
Those values must be added, while keeping them at their positions. They don't need to be extracted from the strange form which they are in; addition is freely reorderable (associative and commutative), so it's OK to add all the least significant bits of the partial products to the sum first, then all the second bits, and so on. But they have to be added with the right "weight": after all, a set bit in pp0 corresponds to a 1 at that position, but a set bit in pp1 really corresponds to a 2 at that position (since it's the second bit of the number that it is part of). So pp0 is used directly, but pp1 is shifted left by 1, pp2 is shifted left by 2, etc.
The factor of 0.5 must still be accounted for, which I did mostly by shifting the bits of the partial products over by one less than what their weight would imply. pp0 was shifted left by zero, so it must be shifted right by 1 now. This could be done with less complication by just putting return sum >> 1; at the end, but that would reduce the range of values that the function can handle before running into integer wrapping modulo 2^32 (also it would cost an extra operation, and doing it the weird way does not).
Observe that if we count from 1 to x instead of to x−1, we have a pattern:
 x    sum    sum/x
 1      1    1
 2      3    1.5
 4      8    2
 8     20    2.5
16     48    3
So we can easily calculate the sum for any power of two p as p • (1 + ½b), where b is the power (equivalently, the number of the bit that is set, or the log2 of the power). We can see this by induction: if the sum from 1 to 2^b is 2^b • (1 + ½b) (which it is for b = 0), then the sum from 1 to 2^(b+1) reprises the individual term contributions twice, except that the last term adds 2^(b+1) instead of 2^b, so the sum is 2 • 2^b • (1 + ½b) − 2^b + 2^(b+1) = 2^(b+1) • (1 + ½b) + ½ • 2^(b+1) = 2^(b+1) • (1 + ½(b+1)).
Further, between any two powers of two, the lower bits reprise the previous partial sums. Thus, for any x, we can compute the cumulative number of trailing zeros by summing the sums for the set bits in it. Recalling that this provides the sum for numbers from 1 to x, we adjust it to get the desired sum from 1 to x−1 by subtracting one from x before the computation:
#include <limits.h>  /* for CHAR_BIT */

unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; bit < sizeof x * CHAR_BIT; ++bit)
        sum += (x & 1u << bit) * (1 + bit * .5);
    return sum;
}
We can terminate the loop when x is exhausted:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * (1 + bit * .5);
    return sum;
}
As harold points out, we can factor out the 1, as summing the value of each bit of x equals x:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * bit * .5;
    return sum;
}
Then eliminate the floating-point:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) / 2 * bit;
    return sum;
}
Note that when bit is zero, ((x & 1) << bit) / 2 will lose the fraction, but this is irrelevant as * bit makes the contribution zero anyway. For all other values of bit, (x & 1) << bit is even, so the division does not lose anything.
This will overflow unsigned at some point, so one might want to use a wider type for the calculations.
More Code Golf
Another way to add half the values of the bits of x repeatedly depending on their bit position is to shift x (to halve its bit values) and then add that repeatedly while removing successive bits from low to high:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x >>= 1; ++bit)
        sum += x << bit;
    return sum;
}
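As a quick sanity check (my own test harness, not part of either answer), the mask-based function from harold's answer, repeated here verbatim, can be compared against a brute-force sum of 2^trailing_zeros(i) for i = 1..x-1; any of the CountCumulative variants above can be dropped into the same comparison.
#include <stdio.h>

static unsigned CountCumulativeTrailingZeros(unsigned x)
{
    --x;
    unsigned sum = x;
    sum += (x >> 1) & 0x55555555;
    sum += x & 0xCCCCCCCC;
    sum += (x & 0xF0F0F0F0) << 1;
    sum += (x & 0xFF00FF00) << 2;
    sum += (x & 0xFFFF0000) << 3;
    return sum;
}

static unsigned BruteForce(unsigned x)
{
    unsigned sum = 0;
    for (unsigned i = 1; i < x; ++i)
        sum += i & -i;                     /* lowest set bit of i, i.e. 2^trailing_zeros(i) */
    return sum;
}

int main(void)
{
    for (unsigned x = 1; x <= 10000; ++x)
        if (BruteForce(x) != CountCumulativeTrailingZeros(x)) {
            printf("mismatch at x = %u\n", x);
            return 1;
        }
    printf("all values up to 10000 agree\n");
    return 0;
}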

C - Fast conversion between binary and hex representations

Reading or writing C code, I often have difficulty translating numbers from the binary to the hex representation and back. Different masks like 0xAAAA5555 are used very often in low-level programming, but it's difficult to recognize the particular pattern of bits they represent. Is there any easy-to-remember rule for how to do it quickly in your head?
Each hex digit maps exactly onto 4 bits. I usually keep in mind the 8-4-2-1 weights of each of these bits, so it is very easy to do the conversion even in your head, i.e.
A = 10 = 8+2 = 1010 ...
5 = 4+1 = 0101
just keep the 8-4-2-1 weights in mind.
   A          5
8+4+2+1    8+4+2+1
1 0 1 0    0 1 0 1
I always find it easy to map HEX to BINARY numbers. Since each hex digit can be directly mapped to a four-digit binary number, you can think of:
> 0xA4
As
> b 1010 0100
>   ---- ----   (4 binary digits for each part)
>    A    4
The conversion is calculated by dividing the base-10 representation by 2 and stringing the remainders together in reverse order. I do this in my head; it seems to work.
So, say you want to know what 0xAAAA5555 looks like.
I just work out what A looks like and what 5 looks like by doing:
A = 10
10 / 2 = 5 r 0
5 / 2 = 2 r 1
2 / 2 = 1 r 0
1 / 2 = 0 r 1
so I know the A's look like 1010 (Note that 4 fingers are a good way to remember the remainders!)
You can string blocks of 4 bits together, so A A is 1010 1010. To convert binary back to hex, I always go through base 10 again by summing up the powers of 2. You can do this by forming blocks of 4 bits (padding with 0s) and stringing the results together.
So 111011101 is 0001 1101 1101, which is (1) (1 + 4 + 8) (1 + 4 + 8) = 1 13 13, which is 1DD.
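A tiny C illustration of the digit-by-digit mapping described above (my own example): print a mask in binary by translating each hex digit (nibble) through a 16-entry lookup table, highest nibble first.
#include <stdio.h>

static const char *nibble_bits[16] = {
    "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};

int main(void) {
    unsigned mask = 0xAAAA5555u;
    for (int shift = 28; shift >= 0; shift -= 4)           /* 8 nibbles, most significant first */
        printf("%s ", nibble_bits[(mask >> shift) & 0xF]);
    printf("\n");                                          /* 1010 1010 1010 1010 0101 0101 0101 0101 */
    return 0;
}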
