Hey, I'm trying to write a program to convert from a BASE64 string to a BASE16(HEX) string.
Here's an example:
BASE64: Ba7+Kj3N
HEXADECIMAL: 05 ae fe 2a 3d cd
BINARY: 00000101 10101110 11111110 00101010 00111101 11001101
DECIMAL: 5 174 254 42 61 205
What's the logic to convert from BASE64 to HEXADECIMAL?
Why is the decimal representation split up?
How come the binary representation is split into 6 sections?
I just want the math; the code I can handle, it's this process that is confusing me. Thanks :)
Here's a function listing that will convert between any two bases: https://sites.google.com/site/computersciencesourcecode/conversion-algorithms/base-to-base
Edit (Hopefully to make this completely clear...)
You can find more information on this at the Wikipedia entry for Base 64.
The customary character set used for base 64, which is different from the character set you'll find in the link I provided prior to the edit, is:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
The character 'A' is the value 0, 'B' is the value 1, 'C' is the value 2, ...'8' is the value 60, '9' is the value 61, '+' is the value 62, and '/' is the value 63. This character set is very different from what we're used to using for binary, octal, base 10, and hexadecimal, where the first character is '0', which represents the value 0, etc.
Soju noted in the comments to this answer that each base 64 digit requires 6 bits to represent it in binary. Using the base 64 number provided in the original question and converting from base 64 to binary we get:
B a 7 + K j 3 N
000001 011010 111011 111110 001010 100011 110111 001101
Now we can push all the bits together (the spaces are only there to help humans read the number):
000001011010111011111110001010100011110111001101
Next, we can introduce new white-space delimiters every four bits starting with the Least Significant Bit:
0000 0101 1010 1110 1111 1110 0010 1010 0011 1101 1100 1101
It should now be very easy to see how this number is converted to base 16:
0000 0101 1010 1110 1111 1110 0010 1010 0011 1101 1100 1101
0 5 A E F E 2 A 3 D C D
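To make the walkthrough concrete, here is a minimal C sketch of the same process: decode each base-64 digit to its 6-bit value, pack four of them into 24 bits, and print the three bytes as hex. It assumes valid input whose length is a multiple of 4 with no '=' padding; b64_value is just an illustrative helper name.

#include <stdio.h>
#include <string.h>

/* Value of one base-64 digit in the customary alphabet; -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

int main(void)
{
    const char *b64 = "Ba7+Kj3N";
    size_t n = strlen(b64);

    /* Every 4 base-64 digits carry 24 bits = 3 bytes = 6 hex digits. */
    for (size_t i = 0; i + 3 < n; i += 4) {
        unsigned bits = 0;
        for (size_t j = 0; j < 4; j++)
            bits = (bits << 6) | (unsigned)b64_value(b64[i + j]);
        printf("%02x %02x %02x ",
               (bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF);
    }
    printf("\n");   /* prints: 05 ae fe 2a 3d cd */
    return 0;
}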
Think of base 64 as base 2^6.
So, in order to get alignment with a hex nibble (base 2^4), you need at least 2 base-64 digits...
With 2 base-64 digits you have a base-2^12 number, which can be represented by three base-2^4 digits...
(00)(01)(02)(03)(04)(05)---(06)(07)(08)(09)(10)(11) (base-64) maps directly to:
(00)(01)(02)(03)---(04)(05)(06)(07)---(08)(09)(10)(11) (base 16)
So you can either convert to contiguous binary... or use 4 operations. The operations could deal with binary values, or they could use a set of lookup tables (which could work on the char-encoded digits):
first base-64 digit to first hex digit
first base-64 digit to first half of second hex digit
second base-64 digit to second half of second hex digit
second base-64 digit to third hex digit.
The advantage of this is that you can work on encoded bases without binary conversion.
It is pretty easy to do in a stream of chars... I am not aware of this actually being implemented anywhere, though.
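Here is a rough sketch in C of that idea for one pair of digits, working on the digit values rather than the encoded chars; pair_to_hex is a hypothetical helper, and the shifts and masks stand in for the four lookup tables described above.

#include <stdio.h>

/* Map a pair of base-64 digit VALUES (0-63) to three hex digits
   using only shifts and masks, mirroring the four operations above. */
static void pair_to_hex(unsigned d0, unsigned d1, char out[3])
{
    static const char hex[] = "0123456789ABCDEF";
    out[0] = hex[d0 >> 2];                        /* top 4 bits of d0            */
    out[1] = hex[((d0 & 0x3) << 2) | (d1 >> 4)];  /* low 2 of d0 + top 2 of d1   */
    out[2] = hex[d1 & 0xF];                       /* low 4 bits of d1            */
}

int main(void)
{
    char out[3];
    pair_to_hex(1, 26, out);        /* 'B','a' in the customary alphabet */
    printf("%.3s\n", out);          /* prints 05A */
    return 0;
}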
Related
I need a DFA for a set of all strings beginning with a 1 that, interpreted as the binary representation of an integer, have a remainder of 1 when divided by 3.
For example, the binary number 1010₂ is decimal 10. When you divide 10 by 3 you get a remainder of 1, so 1010 is in the language. However, the binary number 1111₂ is decimal 15. When you divide 15 by 3 you get a remainder of 0, so 1111 is not in the language.
I've attached my DFA below. Could you please check it?
It looks correct to me.
You could make two simplifications:
q4 represents (mod 0), so you could make it the starting state and get rid of q0 and q5. (Unless you are required to reject strings beginning with a 0? Your question doesn't specify.)
q1 and q3 can be merged. They both represent (mod 1) and have the same transitions.
These two changes would leave you with exactly 3 states, one for each remainder.
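For illustration, here's a small C sketch of that simplified 3-state machine: the state is just the value of the bits read so far, modulo 3, and the leading-1 requirement is checked separately (an assumption, since the question doesn't specify how strings starting with 0 are handled). It assumes the input is a nonempty string of '0'/'1' characters.

#include <stdio.h>

/* Reading bit b takes state s to (2*s + b) mod 3; accept when the
   final state is 1 (i.e. the number has remainder 1 mod 3). */
static int accepts(const char *s)
{
    int state = 0;                       /* empty prefix = 0 (mod 3) */
    if (s[0] != '1') return 0;           /* must begin with a 1 */
    for (; *s; s++)
        state = (2 * state + (*s - '0')) % 3;
    return state == 1;
}

int main(void)
{
    printf("%d\n", accepts("1010"));     /* 10 mod 3 == 1 -> prints 1 */
    printf("%d\n", accepts("1111"));     /* 15 mod 3 == 0 -> prints 0 */
    return 0;
}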
I'm trying to understand how binary addition and a logical OR table differ.
Do both carry the 1 forward? If not, which one performs the carry operation and which does not?
The exclusive-or (XOR) operation is like binary addition, except that
there is no carry from one bit position to the next. Thus, each bit
position can be evaluated independently of the rest.
I'll attempt to clarify a few points with a few illustrations.
First, addition. Basically like adding numbers in grade school. But if you have a 1-bit aligned with a 1-bit, you get a 0 with a 1 carry (i.e. 10, essentially analogous to 5 plus 5 in base-10). Otherwise, add them like 'regular' (base-10) numbers. For instance:
₁₁₁
1001
+ 1111
______
11000
Note that in the left-most column two 1's are added to give 10, which with another 1 gives 11 (similar to 5 + 5 + 5).
Now, assuming that by "logical OR" you mean bitwise OR (an operation which performs the logical inclusive OR on each pair of corresponding bits), you have this:
1001
| 1111
______
1111
The only case where you get a 0 bit is when both input bits are 0.
Finally, since you tagged this question xor, here is the exclusive OR, which I assume is bitwise as well:
1001
^ 1111
______
0110 = 110₂
In this case, two 1-bits give a 0, and of course two 0-bits give 0.
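If it helps, here's a tiny C program exercising all three operations on the same operands used above:

#include <stdio.h>

int main(void)
{
    unsigned a = 0x9;   /* 1001 */
    unsigned b = 0xF;   /* 1111 */

    printf("a + b = %x\n", a + b);   /* 18 hex = 11000 binary (carries)  */
    printf("a | b = %x\n", a | b);   /*  f     = 1111 (no carries)       */
    printf("a ^ b = %x\n", a ^ b);   /*  6     = 0110 (no carries)       */
    return 0;
}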
With a logical OR you get a logical result (Boolean). IOW true OR true is true (anything other than false OR false is true). In some languages (like C) any numeric value other than 0 means true. And some languages use an explicit datatype for true, false (bool, Boolean).
In case of binary OR, you are ORing the bits of two binary values. I.e.: 1 (which is binary 01) bitwise OR 2 (which is binary 10) is binary 11:
  01
| 10
----
  11
which is 3. Thus binary OR also acts as addition when the values have no bits in common (like flag values).
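A small C illustration of that last point, using made-up flag values that each own a distinct bit:

#include <stdio.h>

/* Hypothetical flag values: each flag owns its own bit, so combining
   them with | gives the same result as +. */
enum { FLAG_READ = 1, FLAG_WRITE = 2, FLAG_EXEC = 4 };

int main(void)
{
    int perms = FLAG_READ | FLAG_EXEC;        /* 001 | 100 = 101 */
    printf("%d\n", perms);                    /* 5 */
    printf("%d\n", FLAG_READ + FLAG_EXEC);    /* 5 as well */
    return 0;
}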
Given the following C code:
int x = atoi(argv[1]);
int y = (x & -x);
if (x==y)
printf("Wow... they are the same!\n");
What values of x will result in "Wow... they are the same!" getting printed? Why?
So. It generally depends, but I can assume that your architecture represents signed numbers in U2 (two's complement) format (everything below is false if it's not U2). Let's have an example.
We take 3, whose representation is:
0011
and -3, which is obtained by flipping the bits and adding 1:
~ 0011
+ 1
-------
1101
and we AND them:
1101
& 0011
------
0001
so:
0011 & 1101 = 0001, and 0001 != 0011
That's what is happening under the hood. You have to find the numbers that fit this pattern. I don't know upfront which numbers fit it, but based on this you can work them out.
The question is asking about the binary & operator and 2's complement arithmetic.
I would look at how numbers are represented in 2's complement, and at what the binary & symbol does.
Assuming a 2's complement representation for negative numbers, the only values for which this is true are positive numbers of the form 2^n where n >= 0, and 0.
When you take the 2's complement of a number, you flip all bits and then add one. So the least significant bit will always match. The next bit won't match unless the prior one carried over, and the same goes for each bit after that.
An int is typically 32 bits, however I'll use 5 bits in the following examples for simplicity.
For example, 5 is 00101. Flipping all bits gives us 11010, then adding 1 gives us 11011. Then 00101 & 11011 = 00001. The only bit that matches a set bit is the last one, so 5 doesn't work.
Next we'll try 12, which is 01100. Flipping the bits gives us 10011, then adding 1 gives us 10100. Then 01100 & 10100 = 00100. Because of the carry-over the third bit is set, however the second bit is not, so 12 doesn't work either.
So the most significant bit which is set won't match unless all lower bits carry over when 1 is added. This is true only for numbers with one bit set, i.e. powers of 2.
If we now try 8, which is 01000, flipping the bits gives us 10111 and adding 1 gives us 11000. And 01000 & 11000 = 01000. In this case, the second bit is set, which is the only bit set in the original number. So the condition holds.
Negative numbers cannot satisfy this condition because positive numbers have the most significant bit set to 0, while negative numbers have the most significant bit set to 1. So a bitwise AND of a number and its negative will always have the most significant bit set to 0, meaning this number cannot be negative.
0 is a special case since it is its own negative. 0 & 0 = 0, so it also satisfies this condition.
Another special case is the smallest number you can represent. In the case of a 5-bit number this is -16, which is represented by 10000. Flipping all the bits gives you 01111 and adding 1 gives you 10000, which is the same number. On the surface it seems this number also satisfies the condition; however, this is an overflow condition and implementations may not handle this case correctly.
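As a quick sanity check, here's a small C loop that enumerates the values satisfying x == (x & -x); on a two's complement machine it prints 0 and the powers of 2:

#include <stdio.h>

int main(void)
{
    /* x & -x isolates the lowest set bit, so x == (x & -x) holds
       exactly when x has at most one bit set. */
    for (int x = 0; x <= 20; x++)
        if (x == (x & -x))
            printf("%d ", x);        /* prints: 0 1 2 4 8 16 */
    printf("\n");
    return 0;
}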
I am trying to refresh on floats. I am reading an exercise that asks to convert from format A, having k=3 exponent bits, a 4-bit fraction, and bias 3, to format B, having k=4 exponent bits, a 3-bit fraction, and bias 7.
We should round when necessary.
Example between formats:
011 0000 (Value = 1) =====> 0111 000 (Value = 1)
010 1001 (Value = 25/32) =====> 0110 100 (Value = 3/4 Rounded down)
110 1111 (Value = 31/2) =====> 1011 000 (Value = 16 Rounded up)
Problem: I cannot figure out how the conversion works. I managed to do it correctly in some cases, but my approach was to convert the bit pattern of format A to the decimal value and then to express that value in the bit pattern of format B.
But is there a way to somehow go from one bit pattern to the other without doing this conversion, just knowing that we extend the e by 1 bit and reduce the fraction by 1?
But is there a way to somehow go from one bit pattern to the other without doing this conversion, just knowing that we extend the e by 1 bit and reduce the fraction by 1?
Yes, and this is much simpler than going through the decimal value (which is only correct if you convert to the exact decimal value and not an approximation).
011 0000 (Value = 1)
represents 1.0000 * 2^(3-3)
is really 1.0 * 2^0 in “natural” binary
represents 1.000 * 2^(7-7) to pre-format for the destination format
=====> 0111 000 (Value = 1)
Second example:
010 1001 (Value = 25/32)
represents 1.1001 * 2^(2-3)
is really 1.1001 * 2^(-1)
rounds to 1.100 * 2^(-1) when we suppress one digit, because of “ties-to-even”
is 1.100 * 2^(6-7) pre-formatted
=====> 0110 100 (Value = 3/4 Rounded down)
Third example:
110 1111 (Value = 31/2)
represents 1.1111 * 2^(6-3)
is really 1.1111 * 2^3
rounds to 10.000 * 2^3 when we suppress one digit, because “ties-to-even” means “up” here and the carry propagates a long way
renormalizes into 1.000 * 2^4
is 1.000 * 2^(11-7) pre-formatted
=====> 1011 000 (Value = 16 Rounded up)
Examples 2 and 3 are “halfway cases”. Well, rounding from 4-bit fractions to 3-bit fractions, 50% of examples will be halfway cases anyway.
In example 2, 1.1001 is as close to 1.100 as it is to 1.101. So how is the result chosen? The one that is chosen is the one that ends with 0. Here, 1.100.
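If you want to check your work mechanically, here is a minimal C sketch of the pure bit-pattern conversion under the assumptions of the exercise: normalized values only (no denormals, infinities, or NaNs), no sign bit (matching the 7-bit patterns above), and no handling of exponent overflow. The function name a_to_b is just illustrative.

#include <stdio.h>

/* Input:  3-bit exponent (bias 3) + 4-bit fraction, in the low 7 bits.
   Output: 4-bit exponent (bias 7) + 3-bit fraction. */
static unsigned a_to_b(unsigned a)
{
    int e = (int)((a >> 4) & 0x7) - 3;  /* unbias format A's exponent */
    unsigned frac = a & 0xF;            /* 4-bit fraction             */
    unsigned keep = frac >> 1;          /* the 3 bits we keep         */

    /* Round to nearest, ties to even: the single dropped bit is exactly
       halfway, so round up only when it is 1 and the kept LSB is 1. */
    if ((frac & 1) && (keep & 1)) {
        keep += 1;
        if (keep == 8) {                /* carry out of the fraction...    */
            keep = 0;                   /* ...renormalize: 10.000 -> 1.000 */
            e += 1;
        }
    }
    return (unsigned)(e + 7) << 3 | keep;  /* rebias for format B */
}

int main(void)
{
    printf("%02x\n", a_to_b(0x30));  /* 011 0000 -> 0111 000 (0x38) */
    printf("%02x\n", a_to_b(0x29));  /* 010 1001 -> 0110 100 (0x34) */
    printf("%02x\n", a_to_b(0x6F));  /* 110 1111 -> 1011 000 (0x58) */
    return 0;
}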
I can't get the two's complement calculation to work.
I know C compiles ~b so that it inverts all bits, giving -6 if b=5. But why?
With int b = 101 (binary), inverting all bits gives 010; then for two's complement notation I just add 1, but that gives 011, i.e. 3, which is the wrong answer.
How should I calculate with the bit-inversion operator ~?
Actually, here's how 5 is usually represented in memory (shown as a 16-bit integer):
0000 0000 0000 0101
When you invert 5, you flip all the bits to get:
1111 1111 1111 1010
That is actually -6 in decimal form. I think in your question, you were simply flipping the last three bits only, when in fact you have to consider all the bits that comprise the integer.
The problem with b = 101 (5) is that you have chosen one too few binary digits.
binary | decimal
~101 = 010 | ~5 = 2
~101 + 1 = 011 | ~5 + 1 = 3
If you choose 4 bits, you'll get the expected result:
binary | decimal
~0101 = 1010 | ~5 = -6
~0101 + 1 = 1011 | ~5 + 1 = -5
With only 3 bits you can encode integers from -4 to +3 in 2's complement representation.
With 4 bits you can encode integers from -8 to +7 in 2's complement representation.
-6 was getting truncated to 2 and -5 was getting truncated to 3 in 3 bits. You needed at least 4 bits.
And as others have already pointed out, ~ simply inverts all bits in a value, so, ~~17 = 17.
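You can see the truncation directly in C by masking the result down to 3 or 4 bits:

#include <stdio.h>

int main(void)
{
    int b = 5;

    /* Keeping only 3 bits truncates the result, as described above. */
    printf("%d\n", ~b & 0x7);         /* 2 (-6 truncated to 3 bits)  */
    printf("%d\n", (~b + 1) & 0x7);   /* 3 (-5 truncated to 3 bits)  */

    /* With 4 bits, the patterns 1010 and 1011 appear, which are the
       two's complement encodings of -6 and -5. */
    printf("%x\n", ~b & 0xF);         /* prints a (binary 1010) */
    printf("%x\n", (~b + 1) & 0xF);   /* prints b (binary 1011) */
    return 0;
}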
~b is not a 2-complement operation. It is a bitwise NOT operation. It just inverts every bit in a number, therefore ~b is unequal to -b.
Examples:
b = 5
binary representation of b: 0000 0000 0000 0101
binary representation of ~b: 1111 1111 1111 1010
~b = -6
b = 17
binary representation of b: 0000 0000 0001 0001
binary representation of ~b: 1111 1111 1110 1110
~b = -18
binary representation of ~(~b): 0000 0000 0001 0001
~(~b) = 17
~ simply inverts all the bits of a number:
~(~a)=17 if a=17
~0...010001 = 1...101110 ( = -18 )
~1...101110 = 0...010001 ( = 17 )
You need to add 1 only in case you want to negate a number (to get its two's complement), i.e. to get -17 out of 17.
~b + 1 = -b
So:
~(~b) equals ~(-b - 1) equals -(-b - 1) -1 equals b
In fact, ~ reverses all bits, and if you apply ~ again, they are reversed back.
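A quick C check of those identities:

#include <stdio.h>

int main(void)
{
    int b = 17;
    printf("%d\n", ~b);        /* -18: flip every bit             */
    printf("%d\n", ~b + 1);    /* -17: flipping plus one negates  */
    printf("%d\n", ~(~b));     /*  17: flipping twice round-trips */
    return 0;
}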
I can't get the two's complement calculation to work.
I know C compiles ~b so that it inverts all bits, giving -6 if b=5. But why?
Because you are using two's complement. Do you know what two's complement is?
Let's say that we have a byte variable (signed char). Such a variable can hold values from -128 to 127; in terms of bit patterns they run from 0 up to 127, then wrap to -128 and climb back up to -1.
Binary, it works like this:
0000 0000 // 0
...
0111 1111 // 127
1000 0000 // -128
1000 0001 // -127
...
1111 1111 // -1
Signed numbers are often depicted on a circle, because the bit patterns wrap around at the ends of the range.
If you understand the above, then you understand why ~1 equals -2 and so on.
Had you used one's complement, then ~1 would have been -1, because one's complement uses a signed zero. For a byte, described with one's complement, values would go from 0 to 127 to -127 to -0 back to 0.
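A short C demonstration of the two's complement behaviour described above (note that converting an out-of-range value to a signed char is implementation-defined; the wrap shown is the typical behaviour):

#include <stdio.h>

int main(void)
{
    signed char c = 1;
    printf("%d\n", ~c);                       /* -2 in two's complement */

    /* Walking the circle: 127 + 1 typically wraps to -128. */
    signed char top = 127;
    printf("%d\n", (signed char)(top + 1));   /* typically -128 */
    return 0;
}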
You declared b as an int. That means the value of b is stored in a full machine word (typically 32 bits), and the complement (~) operates on all of those bits, not just the last 3 bits as you were doing (only the low 16 bits are shown below):
int b=5 // b in binary: 0000 0000 0000 0101
~b // ~b in binary: 1111 1111 1111 1010 = -6 in decimal
The most significant bit stores the sign of the integer (1: negative, 0: positive), so 1111 1111 1111 1010 is -6 in decimal.
Similarly:
b=17 // 17 in binary 0000 0000 0001 0001
~b // = 1111 1111 1110 1110 = -18