C - Fast conversion between binary and hex representations

Reading or writing C code, I often have difficulty translating numbers between their binary and hex representations. Masks like 0xAAAA5555 are used very often in low-level programming, but it's difficult to recognize the special pattern of bits they represent. Is there an easy-to-remember rule for doing this quickly in your head?

Each hex digit maps exactly onto 4 bits. I usually keep the 8-4-2-1 weights of those bits in mind, so it is very easy to do the conversion even in my head, e.g.
A = 10 = 8+2 = 1010
5 = 4+1 = 0101
Just keep the 8-4-2-1 weights in mind:
   A          5
8 4 2 1    8 4 2 1
1 0 1 0    0 1 0 1
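
For what it's worth, here is a minimal C sketch of that nibble-by-nibble view (print_nibbles is just an illustrative name): it walks a 32-bit constant one hex digit at a time and prints the 8-4-2-1 bits of each digit.

#include <stdint.h>
#include <stdio.h>

/* Print a 32-bit value as groups of 4 bits, one group per hex digit. */
static void print_nibbles(uint32_t x)
{
    for (int shift = 28; shift >= 0; shift -= 4) {
        unsigned nibble = (x >> shift) & 0xF;          /* one hex digit */
        for (int weight = 8; weight >= 1; weight /= 2) /* 8-4-2-1 weights */
            putchar(nibble & weight ? '1' : '0');
        putchar(shift ? ' ' : '\n');
    }
}

int main(void)
{
    print_nibbles(0xAAAA5555); /* 1010 1010 1010 1010 0101 0101 0101 0101 */
    return 0;
}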

I always find it easy to map hex to binary numbers. Since each hex digit can be directly mapped to a four-digit binary number, you can think of:
> 0xA4
as
> b 1010 0100
>   ---- ----   (4 binary digits for each part)
>    A    4

The conversion is done by dividing the base-10 value by 2 and stringing the remainders together in reverse order. I do this in my head; it seems to work.
So say you ask what 0xAAAA5555 looks like. I just work out what A and 5 look like:
A = 10
10 / 2 = 5 r 0
5 / 2 = 2 r 1
2 / 2 = 1 r 0
1 / 2 = 0 r 1
Reading the remainders back in reverse, the A's look like 1010. (Note that 4 fingers are a good way to remember the remainders!)
You can string blocks of 4 bits together, so A A is 1010 1010. To convert binary back to hex, I always go through base 10 again by summing up the powers of 2: form blocks of 4 bits (padding with 0s on the left) and string the results together.
So 111011101 is 0001 1101 1101, which is (1) (8 + 4 + 1) (8 + 4 + 1) = 1 13 13, which is 1DD.
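
Here is a small C sketch of the group-of-4 direction (binary string to hex); bin_to_hex and the fixed-size buffer are just for illustration, and the input is assumed to contain only '0' and '1':

#include <stdio.h>
#include <string.h>

/* Pad the bit string to a multiple of 4 with leading zeros, then sum the
   8-4-2-1 weights of each group and print it as one hex digit. */
static void bin_to_hex(const char *bits)
{
    size_t len = strlen(bits);
    size_t pad = (4 - len % 4) % 4;
    char padded[128] = {0};

    memset(padded, '0', pad);
    strcpy(padded + pad, bits);

    for (size_t i = 0; padded[i] != '\0'; i += 4) {
        int value = 0;
        for (int j = 0; j < 4; j++)
            value = value * 2 + (padded[i + j] - '0');
        printf("%X", value);
    }
    putchar('\n');
}

int main(void)
{
    bin_to_hex("111011101");   /* prints 1DD */
    return 0;
}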

Related

How to sort Hexadecimal Numbers (like 10 1 A B) in C?

I want to implement a sorting algorithm in C for sorting hexadecimal numbers like:
10 1 A B
to this:
1 A B 10
The problem I am facing is that I don't understand how A and B are less than 10, since A = 10 and B = 11 in hexadecimal numbers. I'm sorry if I am mistaken.
Thank you!
As mentioned in the previous comments, 10 is 0x10, so this sorting seems to be no problem: 0x1 < 0xA < 0xB < 0x10
In any base a number with two digits is always greater than a number with one digit.
In hexadecimal notation we have 6 more digits available than in decimal, but they still count as one "digit":
hexadecimal digit | value in decimal representation
A | 10
B | 11
C | 12
D | 13
E | 14
F | 15
When you get a number in hexadecimal notation, it might be that its digits happen to use none of the above extra digits, but just the well-known 0..9 digits. This can be confusing, as we still must treat them as hexadecimal. In particular, a digit in a multi-digit hexadecimal representation must be multiplied by a power of 16 (instead of 10) to be correctly interpreted. So when you get 10 as a hexadecimal number, it has a value of one (1) times sixteen plus zero (0), so the hexadecimal number 10 has a (decimal) value of 16.
The hexadecimal numbers you gave should therefore be ordered as 1 < A < B < 10.
As a more elaborate example, the hexadecimal representation 1D6A can be converted to decimal like this:
1D6A
│││└─> 10 x 16⁰ = 10
││└──> 6 x 16¹ = 96
│└───> 13 x 16² = 3328
└────> 1 x 16³ = 4096
──── +
7530
Likewise
10
│└─> 0 x 16⁰ = 0
└──> 1 x 16¹ = 16
── +
16
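
As a sketch of both points in C (the 16^i place weighting and the sorting the question actually asks about); hex_value and cmp_hex are illustrative names, and error handling is left out:

#include <stdio.h>
#include <stdlib.h>

/* Value of a hex string: each digit is worth digit * 16^position. */
static long hex_value(const char *s)
{
    long value = 0;
    for (; *s != '\0'; s++) {
        int digit;
        if (*s >= '0' && *s <= '9')      digit = *s - '0';
        else if (*s >= 'A' && *s <= 'F') digit = *s - 'A' + 10;
        else if (*s >= 'a' && *s <= 'f') digit = *s - 'a' + 10;
        else                             continue;  /* ignore other chars */
        value = value * 16 + digit;                 /* shift one hex place */
    }
    return value;
}

/* qsort comparator: order the strings by their hexadecimal value. */
static int cmp_hex(const void *a, const void *b)
{
    long x = hex_value(*(const char *const *)a);
    long y = hex_value(*(const char *const *)b);
    return (x > y) - (x < y);
}

int main(void)
{
    const char *nums[] = { "10", "1", "A", "B" };
    size_t n = sizeof nums / sizeof nums[0];

    printf("%ld\n", hex_value("1D6A"));  /* 7530, as in the example above */

    qsort(nums, n, sizeof nums[0], cmp_hex);
    for (size_t i = 0; i < n; i++)
        printf("%s ", nums[i]);          /* prints: 1 A B 10 */
    putchar('\n');
    return 0;
}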

Convert continuous binary fraction to decimal fraction in C

I implemented a digit-by-digit calculation of the square root of two. Each round it outputs one bit of the fractional part, e.g.
1 0 1 1 0 1 0 1 etc.
I want to convert this output to decimal numbers:
4 1 4 2 1 3 6 etc.
The issue I'm facing is that this would generally work like this:
1 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 etc.
I would like to avoid fractions altogether and work with integers to convert from binary to decimal. I would also like to print each decimal digit as soon as it has been computed.
Converting to hex is trivial, as I only have to wait for 4 bits. Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and ideally removing digits from the equation once we are certain that they won't change anymore, i.e.
1   0
2   0.25
3   0.375
4   0.375
5   0.40625
6   0.40625
7   0.4140625
8   0.4140625
After processing the 8th bit, I'm pretty sure that 4 is the first decimal fraction digit. Therefore I would like to remove the 0.4 completely from the equation to reduce the number of bits I need to take care of.
Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and ideally removing digits from the equation, once we are certain that they won't change anymore?
Yes, eventually, in practice. But in theory, no, in select cases.
This is akin to the Table-maker's dilemma.
Consider the handling below of a value near 0.05. As long as the binary sequence is .0001 1001 1001 1001 1001 ..., we cannot know whether the decimal equivalent is 0.04999999... or 0.05000000... followed eventually by something non-zero.
#include <math.h>
#include <stdio.h>

int main(void) {
    double a;

    a = nextafter(0.05, 0);   /* largest double below 0.05 */
    printf("%20a %.20f\n", a, a);
    a = 0.05;
    printf("%20a %.20f\n", a, a);
    a = nextafter(0.05, 1);   /* smallest double above 0.05 */
    printf("%20a %.20f\n", a, a);
    return 0;
}
0x1.9999999999999p-5 0.04999999999999999584
0x1.999999999999ap-5 0.05000000000000000278
0x1.999999999999bp-5 0.05000000000000000971
Code can analyse the incoming sequence of binary fraction bits and, after each bit, ask two questions: "if the remaining bits are all 0, what is it in decimal?" and "if the remaining bits are all 1, what is it in decimal?". In many cases, the answers will share common leading significant digits. Yet as shown above, as long as 1001 keeps arriving, there are no common significant decimal digits.
A usual "out" is to have an upper bound on the number of decimal digits that will ever be shown. In that case the code is only presenting a rounded result, and that can be deduced in finite time even if the binary input sequence remains 1001 ad nauseam.
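
As a rough C sketch of that idea, assuming the input bits are the start of the fractional part of sqrt(2): the interval [lo/den, hi/den) brackets the value, a decimal digit is printed as soon as the "all remaining bits 0" and "all remaining bits 1" bounds agree on it, and the printed digit is then subtracted out so it drops from the equation. The denominator doubles with every bit, so 64-bit integers limit how much input this toy can consume.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* First fractional bits of sqrt(2) (0x.6A09E6...), for demonstration. */
    const int bits[] = { 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0,
                         1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0 };
    uint64_t lo = 0, hi = 1, den = 1;   /* value lies in [lo/den, hi/den) */

    printf("0.");
    for (size_t i = 0; i < sizeof bits / sizeof bits[0]; i++) {
        /* Consume one bit: keep the lower or upper half of the interval. */
        uint64_t mid = lo + hi;
        lo *= 2; hi *= 2; den *= 2;
        if (bits[i]) lo = mid; else hi = mid;

        /* Print every decimal digit on which both bounds already agree,
           then remove it from the interval. */
        while (lo * 10 / den == (hi * 10 - 1) / den) {
            uint64_t digit = lo * 10 / den;
            putchar('0' + (int)digit);
            lo = lo * 10 - digit * den;
            hi = hi * 10 - digit * den;
        }
    }
    putchar('\n');                      /* prints 0.4142135 for this input */
    return 0;
}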
The issue I'm facing is that this would generally work like this:
1 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 etc.
Well, 1/2 = 5/10 and 1/4 = 25/100 and so on, which means you will need powers of 5 and shift the values by powers of 10,
so given 0 1 1 0 1
[1] 0 * 5 = 0
[2] 0 * 10 + 1 * 25 = 25
[3] 25 * 10 + 1 * 125 = 375
[4] 375 * 10 + 0 * 625 = 3750
[5] 3750 * 10 + 1 * 3125 = 40625
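
A tiny C sketch of this accumulation, assuming the bits arrive first fractional bit first (names are illustrative); note that the running sum outgrows 64-bit integers after roughly 19 input bits:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const int bits[] = { 0, 1, 1, 0, 1 };   /* 0.01101 binary = 0.40625 */
    uint64_t sum = 0, pow5 = 1;

    for (size_t i = 0; i < sizeof bits / sizeof bits[0]; i++) {
        pow5 *= 5;                                   /* 5^(i+1) */
        sum = sum * 10 + (uint64_t)bits[i] * pow5;   /* shift, then add */
        printf("[%zu] %llu\n", i + 1, (unsigned long long)sum);
    }
    /* Final sum 40625, read as a fraction: 0.40625. */
    return 0;
}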
Edit:
Is there a smart approach to convert to base 10 which allows observing only a part of the whole output and ideally removing digits from the equation, once we are certain that they won't change anymore?
It might actually be possible to pop the most significant digits (MSD) in this case. This will be a bit long, but please bear with me.
Consider the values X and Y:
If X has the same number of digits as Y, then the MSD will change.
10000 + 10000 = 20000
If Y has 1 or more digits less than X, then the MSD can change.
19000 + 1000 = 20000
19900 + 100 = 20000
The first point is self-explanatory, but the second point is what will allow us to pop the MSD. The first thing we need to know is that the value we are adding is halved (relative to the running sum) every iteration. That means that if we only consider the MSD, the largest value in base 10 is 9, which will produce the sequence
9 > 4 > 2 > 1 > 0
If we sum up these values it equals 16, but if we consider the values of the next digits as well (e.g. 9.9 or 9.999), the sum actually approaches 20 without exceeding it. What this means is that if X has n digits and Y has n-1 digits, the MSD of X can still change. But if X has n digits and Y has n-2 digits, then as long as the (n-1)th digit of X is less than 8, the MSD will not change (otherwise it would be 8 + 2 = 10 or 9 + 2 = 11, which means the MSD would change). Here are some examples.
Assuming X is the running sum of sqrt(2) and Y is 5^n:
1. If X = 10000 and Y = 9000 then the MSD of X can change.
2. If X = 10000 and Y = 900 then the MSD of X will not change.
3. If X = 19000 and Y = 900 then the MSD of X can change.
4. If X = 18000 and Y = 999 then the MSD of X can change.
5. If X = 17999 and Y = 999 then the MSD of X will not change.
6. If X = 19990 and Y = 9 then the MSD of X can change.
In the examples above, for points #2 and #5, the 1 can already be popped. However, for point #6, it is possible to have 19990 + 9 + 4 = 20003, but this also means that both the 2 and the 0 can be popped after that happens.
Here's a simulation for sqrt(2)
i Out X Y flag
-------------------------------------------------------------------
1 0 5 0
2 25 25 1
3 375 125 1
4 3,750 625 0
5 40,625 3,125 1
6 406,250 15,625 0
7 4 140,625 78,125 1
8 4 1,406,250 390,625 0
9 4 14,062,500 1,953,125 0
10 41 40,625,000 9,765,625 0
11 41 406,250,000 48,828,125 0
12 41 4,062,500,000 244,140,625 0
13 41 41,845,703,125 1,220,703,125 1
14 414 18,457,031,250 6,103,515,625 0
15 414 184,570,312,500 30,517,578,125 0
16 414 1,998,291,015,625 152,587,890,625 1
17 4142 0,745,849,609,375 762,939,453,125 1
You can use a multiply-and-divide approach to reduce the floating-point arithmetic.
1 0 1 1
which is equivalent to 1*2^0 + 0*2^(-1) + 1*2^(-2) + 1*2^(-3), can be simplified to (1*2^3 + 0*2^2 + 1*2^1 + 1*2^0) / 2^3. Only the final division remains a floating-point operation; all the rest is integer arithmetic. Multiplication by 2 can be implemented with a left shift.
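
For instance, a minimal sketch of that idea, treating the bits 1 0 1 1 as the integer 1011 (binary) and doing a single floating-point division at the end:

#include <stdio.h>

int main(void)
{
    int bits_as_int = 0xB;             /* 1011 binary = 1*2^3 + 0*2^2 + 1*2^1 + 1*2^0 */
    double value = bits_as_int / 8.0;  /* the only floating-point step: divide by 2^3 */
    printf("%f\n", value);             /* 1.375, i.e. binary 1.011 */
    return 0;
}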

Integer compression method

How can I compress a row of integers into something shorter?
Like:
Input: '1 2 4 5 3 5 2 3 1 2 3 4' -> Algorithm -> Output: 'X Y Z'
and can get it back the other way around? ('X Y Z' -> '1 2 4 5 3 5 2 3 1 2 3 4')
Note: the input will only contain numbers between 1 and 5, and the total count of numbers will be between 10 and 16.
Is there any way I can compress it to 3-5 numbers?
Here is one way. First, subtract one from each of your little numbers. For your example input that results in
0 1 3 4 2 4 1 2 0 1 2 3
Now treat that as the base-5 representation of an integer. (You can choose either most significant digit first or last.) Calculate the number in binary that means the same thing. Now you have a single integer that "compressed" your string of little numbers. Since you have shown no code of your own, I'll just stop here. You should be able to implement this easily.
Since you will have at most 16 little numbers, the maximum resulting value from that algorithm will be 5^16 which is 152,587,890,625. This fits into 38 bits. If you need to store smaller numbers than that, convert your resulting value into another, larger number base, such as 2^16 or 2^32. The former would result in 3 numbers, the latter in 2.
#SergGr points out in a comment that this method does not show the number of integers encoded. If that is not stored separately, that can be a problem, since the method does not distinguish between leading zeros and coded zeros. There are several ways to handle that, if you need the number of integers included in the compression. You could require the most significant digit to be 1 (first or last, depending on where the most significant digit is). This increases the number of bits by one, so you may now need 39 bits.
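
A minimal C sketch of the base-5 packing described above (pack and unpack are illustrative names; the digit count is carried separately, as discussed):

#include <stdint.h>
#include <stdio.h>

/* Pack digits 1..5 into one integer by treating (digit - 1) as a base-5
   digit, least significant first. */
static uint64_t pack(const int *digits, size_t n)
{
    uint64_t value = 0;
    for (size_t i = n; i-- > 0; )              /* most significant first */
        value = value * 5 + (uint64_t)(digits[i] - 1);
    return value;
}

static void unpack(uint64_t value, int *digits, size_t n)
{
    for (size_t i = 0; i < n; i++) {           /* least significant first */
        digits[i] = (int)(value % 5) + 1;
        value /= 5;
    }
}

int main(void)
{
    int input[] = { 1, 2, 4, 5, 3, 5, 2, 3, 1, 2, 3, 4 };
    size_t n = sizeof input / sizeof input[0];
    int output[16];

    uint64_t packed = pack(input, n);
    printf("packed: %llu\n", (unsigned long long)packed);

    unpack(packed, output, n);
    for (size_t i = 0; i < n; i++)
        printf("%d ", output[i]);              /* 1 2 4 5 3 5 2 3 1 2 3 4 */
    putchar('\n');
    return 0;
}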
Here is a toy example of variable-length encoding. Assume we want to encode two strings: 1 2 3 and 1 2 3 0 0. How will the results be different? Let's consider two base-5 numbers, 321 and 00321. They represent the same value, but still let's convert them into base 2 while preserving the padding.
1 + 2*5 + 3*5^2 = 86 dec = 1010110 bin
1 + 2*5 + 3*5^2 + 0*5^3 + 0*5^4 = 000001010110 bin
Those additional 0s in the second line are there because the biggest 5-digit base-5 number, 44444, has the base-2 representation 110000110100, so the binary representation of the number is padded to the same size.
Note that there is no need to pad the first line, because the biggest 3-digit base-5 number, 444, has the base-2 representation 1111100, i.e. of the same length. For an initial string like 3 2 1 some padding would be required in this case as well, so padding might be needed even if the top digits are not 0.
Now let's add the most significant 1 to the binary representations, and those will be our encoded values:
1 2 3 => 11010110 binary = 214 dec
1 2 3 0 0 => 1000001010110 binary = 4182 dec
There are many ways to decode those values back. One of the simplest (but not the most efficient) is to first calculate the number of base-5 digits by calculating floor(log5(encoded)) and then remove the top bit and fill the digits one by one using mod 5 and divide by 5 operations.
Obviously such variable-length encoding always adds exactly 1 bit of overhead.
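
And a rough C sketch of this variable-length variant, taking the decoding route described above but with an integer loop over powers of 5 instead of a floating-point log; encode and decode are illustrative names, and overflow checks are omitted:

#include <stdint.h>
#include <stdio.h>

/* Encode base-5 digits (0..4, least significant first): pad the value to the
   bit length of the largest n-digit base-5 number, then prepend a 1 bit. */
static uint64_t encode(const int *digits, size_t n)
{
    uint64_t value = 0, pow5 = 1, marker = 1;
    for (size_t i = n; i-- > 0; )
        value = value * 5 + (uint64_t)digits[i];
    for (size_t i = 0; i < n; i++)
        pow5 *= 5;                    /* pow5 = 5^n */
    while (marker < pow5)             /* smallest power of 2 >= 5^n */
        marker <<= 1;
    return marker | value;
}

/* Decode: n = floor(log5(encoded)), strip the marker bit, read digits back. */
static size_t decode(uint64_t encoded, int *digits)
{
    uint64_t pow5 = 1, marker = 1, value;
    size_t n = 0;
    while (pow5 * 5 <= encoded) {     /* count the base-5 digits */
        pow5 *= 5;
        n++;
    }
    while (marker < pow5)
        marker <<= 1;
    value = encoded - marker;
    for (size_t i = 0; i < n; i++) {
        digits[i] = (int)(value % 5);
        value /= 5;
    }
    return n;
}

int main(void)
{
    int a[] = { 1, 2, 3 }, b[] = { 1, 2, 3, 0, 0 }, out[16];

    printf("%llu\n", (unsigned long long)encode(a, 3));   /* 214  */
    printf("%llu\n", (unsigned long long)encode(b, 5));   /* 4182 */

    size_t n = decode(4182, out);
    for (size_t i = 0; i < n; i++)
        printf("%d ", out[i]);                            /* 1 2 3 0 0 */
    putchar('\n');
    return 0;
}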
It's called polidatacompressor.js, but the license will cost you; you have to ask the author about prices.
https://github.com/polidatacompressor/polidatacompressor
Ncomp(65535) will output 255, 255, and when you store this in a database as bytes you get 2 chars.
Another way is to use hexadecimal (base 16) in JavaScript: (1231).toString(16) gives you '4cf', which in about 60% of situations shortens it by one character.
Or convert from base 10 to base 62 with https://github.com/base62/base62.js/:
4131 --> 14D
413131 --> 1Jtp

How to find out number of bits enabled : bits handling

This was asked in one of the interviews I gave. I couldn't answer it properly.
Given a count, I want to produce the number that has that many low bits enabled.
Suppose, if the number is 2, I should return 3:
8 4 2 1
    1 1
If the number is 3, I should return 7:
8 4 2 1
  1 1 1
Is there any easy way of doing it?
Yes, there is: subtract 1 from the corresponding power of 2, like this:
int allBitsSet = (1U << n) - 1;
The expression (1U << n) - 1 computes the value of 2 to the power of n, which always has this form in binary:
1000...00
i.e. one followed by n zeros. When you subtract 1 from a number of that form, you "borrow" from the bit that is set to 1 making it zero, and flip the remaining bits to 1.
You can visualize this by solving an analogous problem in decimal system: "make a number that has n nines". The solution is the same, except now you need to use 10 instead of 2.
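
For completeness, a small sketch for the two values from the question (low_bits_set is an illustrative name):

#include <stdio.h>

/* A mask with the n lowest bits set: (1U << n) - 1.
   Note: undefined if n is as large as the bit width of unsigned int. */
static unsigned low_bits_set(unsigned n)
{
    return (1U << n) - 1;
}

int main(void)
{
    printf("%u\n", low_bits_set(2));  /* 3 (binary 11)  */
    printf("%u\n", low_bits_set(3));  /* 7 (binary 111) */
    return 0;
}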

Can anyone explain why '>>2' shift means 'divided by 4' in C codes?

I know and understand the result.
For example:
7 (decimal) = 00000111 (binary)
and 7 >> 2 = 00000001 (binary)
00000001 (binary) is the same as 7 / 4 = 1
So 7 >> 2 = 7 / 4
But I'd like to know how this logic was created.
Can anyone elaborate on this logic?
(Maybe it just popped up in a genius' head?)
And are there any other similar logics like this ?
It didn't "pop up" in a genius' head. Right-shifting a binary number divides it by 2, and left-shifting multiplies it by 2. This is because 10 is 2 in binary. Multiplying a number by 10 (be it binary, decimal or hexadecimal) appends a 0 to the number (which is effectively a left shift). Similarly, dividing by 10 (or 2 in binary) removes a digit from the number (effectively a right shift). This is how the logic really works.
There is plenty of such bit-twiddlery (a word I invented a minute ago) in the computer world.
http://graphics.stanford.edu/~seander/bithacks.html Here is for the starters.
This is my favorite book: http://www.amazon.com/Hackers-Delight-Edition-Henry-Warren/dp/0321842685/ref=dp_ob_image_bk on bit-twiddlery.
It is actually defined that way in the C standard.
From section 6.5.7:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...]
the value of the result is the integral part of the quotient of E1 / 2^E2.
On most architectures, x >> 2 is only equal to x / 4 for non-negative numbers. For negative numbers, it usually rounds in the opposite direction.
Compilers have always been able to optimize x / 4 into x >> 2. This technique is called "strength reduction", and even the oldest compilers can do this. So there is no benefit to writing x / 4 as x >> 2.
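
A quick sketch to see the rounding difference for negative values; what the shift prints is technically implementation-defined in C, but on the usual two's-complement, arithmetic-shift platforms it comes out as shown:

#include <stdio.h>

int main(void)
{
    int x = -7;
    printf("%d / 4  = %d\n", x, x / 4);    /* -1: division rounds toward zero */
    printf("%d >> 2 = %d\n", x, x >> 2);   /* typically -2: shift rounds toward -infinity */
    return 0;
}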
Elaborating on Aniket Inge's answer:
Number: 307 (decimal) = 100110011 (binary)
How multiplying by 10 works in the decimal system:
10 * 307
= 10 * (3*10^2 + 7*10^0)
= 3*10^(2+1) + 7*10^(0+1)
= 3*10^3 + 7*10^1
= 3070 (decimal)
= 307 shifted left by one decimal digit
Similarly, multiplying by 2 in binary:
2 * 100110011
= 2 * (1*2^8 + 1*2^5 + 1*2^4 + 1*2^1 + 1*2^0)
= 1*2^(8+1) + 1*2^(5+1) + 1*2^(4+1) + 1*2^(1+1) + 1*2^(0+1)
= 1*2^9 + 1*2^6 + 1*2^5 + 1*2^2 + 1*2^1
= 1001100110 (binary)
= 100110011 << 1
I think you are confused by the "2" in:
7 >> 2
and are thinking it should divide by 2.
The "2" here means shift the number ("7" in this case) "2" bit positions to the right.
Shifting a number 1 bit position to the right has the effect of dividing it by 2:
8 >> 1 = 4 // In binary: (00001000) >> 1 = (00000100)
and shifting a number 2 bit positions to the right has the effect of dividing it by 4:
8 >> 2 = 2 // In binary: (00001000) >> 2 = (00000010)
It's inherent in the binary number system used in computers.
A similar logic: left shifting n times means multiplying by 2^n.
An easy way to see why it works is to look at the familiar decimal, base-ten number system: 050 is fifty; shift it to the right and it becomes 005, five, equivalent to dividing it by 10. The same goes for shifting left: 050 becomes 500, five hundred, equivalent to multiplying it by 10.
All the other numeral systems work the same way.
They do that because shifting is more efficient than actual division. You're just moving all the digits to the right or left, logically multiplying or dividing by 2 per shift.
If you're wondering why 7 / 4 = 1, that's because the rest of the result (3/4) is truncated off so that the result is an integer.
Just my two cents: I did not see any mention of the fact that shifting right does not always produce the same result as dividing by 2. Since right shifting rounds toward negative infinity and integer division rounds toward zero, some values (like -1 in two's complement) will just not work as expected when divided.
It's because the >> and << operators shift the binary data.
Binary value 1000 is the double of binary value 0100
Binary value 0010 is the quarter of binary value 1000
You can call it an idea of a genius mind or just the need of the computer language.
As I understand it, a computer as a device never truly divides or multiplies numbers; rather, it only has logic for adding bits and for shifting them from one place to another. You can write an algorithm that tells your computer to multiply or subtract, but when the logic reaches the actual processing, the results come out of shifting or adding bits.
You can simply think of it this way: to get the result of a number divided by 4, the computer right-shifts the bits by two places and gives the result:
7 in 8-bit binary = 00000111
Shift Right 2 places = 00000001 // (Which is for sure equal to Decimal 1)
Further examples:
//-- We can divide 9 by four by Right Shifting 2 places
9 in 8-bit binary = 00001001
Shift right 2 places: 00000010 // (Which is equal to 9/4 or Decimal 2)
A person with deep knowledge of assembly language programming can explain it with more examples. If you want to know the actual sense behind all this, I guess you need to study bit-level arithmetic and the assembly language of a computer.
