I am having a difficult time understanding radix sort. I have no problem implementing code to work with bases of 2 or 10. However, I have an assignment that requires a command-line argument to specify the radix. The radix can be anywhere from 2 to 100,000. I have spent around 10 hours trying to understand this problem. I am not asking for a direct answer, because this is homework, but if anyone can shed some light on this, please do.
A few things I don't understand. What is the point of having base 100,000? How would that even work? I understand having a base for every letter of the alphabet, or every digit 0-9. I just can't seem to wrap my head around this concept.
I'm sorry if I haven't been specific enough.
A number N in any base B is just a series of digits in the range [0, B-1]. Since we don't have enough symbols to represent all the digits of an arbitrary base in a "normal" human writing system, don't think about how it's written in characters. You just need to know that the digits are stored/written separately.
For example, 255 in base 177 is a 2-digit number in which the first digit has value 1 and the second digit has value 78, since 255 (base 10) = 1×177^1 + 78×177^0. If some culture used this base, they would have 177 distinct symbols for the digits and would write it in only 2 digits. Since we only have 10 symbols, we need to define some symbol to delimit the digits, which is often :. As you can see from Wolfram Alpha, 255 (base 10) = 1:78 (base 177).
Note that not all people count in base 10. There exist cultures that count in base 4, 5, 6, 8, 12, 15, 16, 20, 24, 27, 32, 36, 60... so they have more or fewer symbols than most of us. However, among the non-decimal bases, only bases 12, 20 and 60 are commonly used nowadays.
In base 100000 it's the same. 1234567890987654321 will be a 4-digit number written as symbols with the values 1234, 56789, 9876 and 54321, in order.
I was about to explain it in a comment, but basically you're talking about what we sometimes call "modular arithmetic." Each digit is in {0...n-1} and represents that value times n^k, where k is the position. 255 in decimal is 5×10^0 + 5×10^1 + 2×10^2.
So, your 255 base 177 is hard to represent, but there's a 1 in the 177s place (177^1) and 78 in the 1s place (177^0).
As a general pseudocode algorithm, you want something like...
n = input value
digits = []
while n > 0
    remainder = n mod base
    prepend remainder to digits
    n = n / base (as an integer)
Each remainder is the next digit, from least significant to most significant, which is why it is prepended.
Of course, how you represent those digits is another story. MIME contains a semi-standard way of handling digits up through Base64, for example.
If it were me, I'd just delimit the digits and make it clear that's the representation; but there's all of Unicode, if you want to mess around with hexadecimal-like extensions...
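As a sketch of the pseudocode above in Python (the function name to_digits is my own), extracting digits with remainder and integer division and joining them with a : delimiter:

```python
def to_digits(n, base):
    """Return the digits of n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % base)   # the remainder is the current lowest digit
        n //= base                # integer-divide to drop that digit
    return digits[::-1]           # collected least-significant first, so reverse

# 255 in base 177 is 1:78, as in the answer above
print(":".join(str(d) for d in to_digits(255, 177)))
# 1234567890987654321 in base 100000 has 4 digits
print(":".join(str(d) for d in to_digits(1234567890987654321, 100000)))
```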
Related
I understand 2^9 = 512, but how did it convert to 0x200?
Can anyone explain it to me from a different perspective.
Reference problem from "Computer Systems: A Programmer's Perspective", pg. 35.
Here is an explanation:
The pattern for using any number base is as follows:
Take the sum of each digit multiplied by the base raised to the power of the offset of the digit from the right.
Decimal Numbers:
Deci means "ten"
In school, we were taught that there are ten unique digits. These are:
0, 1, 2, 3, 4, 5, 6, 7, 8 and 9
We use the Arabic numeral system, which says that when we write 512, what we are saying is this:
( 500 ) + ( 10 ) + ( 2 )
(5 * 10^2) + (1 * 10^1) + (2 * 10^0)
... or ...
five hundreds, one ten, and two ones.
These places are termed things such as, "the ones places," "the tens place," and "the hundreds place."
Hexadecimal
Hexa means six and deci means ten, so we have six and ten here, or sixteen.
This means that we have 16 unique digits; however, the same rules apply as above. We start with the rightmost digit and move to the left, and for each digit we increase the power the base is raised to; that is the value of that "place." Example:
200 in hexadecimal means:
( 512 ) + ( 0 ) + ( 0 )
(2 * 16^2) + (0 * 16^1) + (0 * 16^0)
In hexadecimal we have the terms "ones place," "sixteens place," "two-hundred-fifty-sixes place," and "four-thousand-ninety-sixes place" (a mouthful).
Arbitrary Bases
As long as there is an agreed upon set of characters for each digit, anyone who knows how to read decimal also knows how to read any other base. You just follow the same pattern: take the sum of each digit multiplied by the base raised to the power of the offset of the digit from the right.
Note: Arabic is written from right to left which might explain why the digits increase from right to left and not left to right, i.e., they seem backwards, if we take the time to really think about it.
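The reading rule above -- take the sum of each digit multiplied by the base raised to the power of its offset from the right -- can be sketched in Python (the helper name from_digits is my own):

```python
def from_digits(digits, base):
    """Evaluate a most-significant-first digit list in the given base."""
    value = 0
    for d in digits:
        value = value * base + d  # equivalent to summing d * base**offset
    return value

print(from_digits([2, 0, 0], 16))   # 0x200 = 512
print(from_digits([5, 1, 2], 10))   # 512
```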
I have, for example, a variable
int number = 300;
And I need to modify number one digit at a time. I wonder if I need to separate it into 3 variables for the hundreds, tens and units, or if there is a method that lets me change a single digit while keeping everything in the one variable number.
Example: the user wants to change only the middle digit "2" of 321 to "5", so that 321 becomes 351. In other words, the digits 3 and 1 are not modified; only the 2 changes, turning 3-2-1 into 3-5-1.
This has nothing to do with Arduino, it is C.
You can for example convert this to an array with itoa() (see https://playground.arduino.cc/Code/PrintingNumbers/)
And then convert it back to int with atoi() (see http://www.cplusplus.com/reference/cstdlib/atoi/)
Yes, you can do that, e.g. by using a function to change the digit.
Use the following steps:
What you do is first shift the target digit right until it is the rightmost digit, remembering the part removed.
Clear the last digit.
Add the new digit
Shift left
Add the remembered part
Example (same steps as above) to change 321 to 351:
Shift right gives 32. Remember 1
Use the modulo operator and remove it: 32 - (32 % 10) = 32 - 2 = 30
30 + 5 = 35
35 -> after shift gives 350
350 + 1 = 351
I will leave the implementation up to you.
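The five steps above can be sketched in Python (the function name is my own; the same integer arithmetic works in C on an Arduino). Here pos counts digit positions from the right, starting at 0:

```python
def change_digit(number, pos, new_digit):
    """Replace the digit at position pos (0 = units) with new_digit."""
    power = 10 ** pos
    left = number // power             # "shift right": drop digits to the right
    remembered = number % power        # remember the removed right part
    left = left - (left % 10)          # clear the last (target) digit
    left = left + new_digit            # add the new digit
    return left * power + remembered   # shift left and restore the right part

print(change_digit(321, 1, 5))   # 351
```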
I see this in Wikipedia: log10(2^24) = 7.22.
I have no idea why we should calculate 2^24 or why we should take log10... I really need your help.
Why is a floating-point number's count of significant digits 7 or 6?
Consider some thoughts employing the Pigeonhole principle:
A binary32 float can encode about 2^32 different numbers exactly. The numbers one can write in text like 42.0, 1.0, 3.1415623... are infinite, even if we restrict ourselves to a range like -10^38 ... +10^38. Any time code has a textual value like 0.1f, it is encoded to a nearby float, which may not be exactly the value of the text. The question is: how many digits can we code and still maintain a distinct float?
For each power-of-2 range, 2^23 (8,388,608) values are normally linearly encoded.
Example: In the range [1.0 ... 2.0), 2^23 (8,388,608) values are linearly encoded.
In the range [2^33 or 8,589,934,592 ... 2^34 or 17,179,869,184), again, 2^23 (8,388,608) values are linearly encoded, 1024.0 apart from each other. In the sub-range [9,000,000,000 ... 10,000,000,000), there are about 976,562 different values.
Put this together ...
As text, in the range [1.000_000 ... 2.000_000), using 1 lead digit and 6 trailing ones, there are 1,000,000 different values. Per the [1.0 ... 2.0) example above, in the same range 8,388,608 different floats exist, allowing each textual value to map to a different float. In this range we can use 7 digits.
As text, in the range [9,000,000 × 10^3 ... 10,000,000 × 10^3), using 1 lead digit and 6 trailing ones, there are 1,000,000 different values. Per the [2^33 ... 2^34) example above, in the same range there are fewer than 1,000,000 different float values. Thus some decimal textual values will convert to the same float. In this range we can use 6, not 7, digits for distinct conversions.
The worst case for a typical float is 6 significant digits. To find the limit for your float:
#include <float.h>
printf("FLT_DIG = %d\n", FLT_DIG); // this commonly prints 6
... no idea why we should calculate 2^24 and why we should take log10
2^24 is a generalization: with common float and its 24 bits of binary precision, that corresponds to a fanciful decimal system with 7.22... digits. We take log10 to compare the binary float to decimal text.
2^24 == 10^7.22...
Yet we should not take 2^24. Let us look at how FLT_DIG is defined, from C11dr §5.2.4.2.2 11:
number of decimal digits, q, such that any floating-point number with q decimal digits can be rounded into a floating-point number with p radix b digits and back again without change to the q decimal digits,
p × log10(b) .............. if b is a power of 10
⌊(p − 1) × log10(b)⌋ ... otherwise
Notice "log10(2^24)" is the same as "24 × log10(2)".
As a float, the values are distributed linearly between powers of 2, as shown in the examples above.
As text, values are distributed linearly between powers of 10, like 7-significant-digit values of [1.000000 ... 9.999999] × 10^some_exponent.
The transitions of these 2 groups happen at different values: 1, 2, 4, 8, 16, 32, ... versus 1, 10, 100, ... In determining the worst case, we subtract 1 from the 24 bits to account for the misalignment.
⌊(p − 1) × log10(b)⌋ → floor((24 − 1) × log10(2)) → floor(6.923...) → 6.
Had our float used base 10, 100, or 1000 rather than the very common 2, the transitions of these 2 groups would happen at the same values and we would not subtract one.
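The arithmetic above is easy to check in Python:

```python
import math

p = 24  # binary32 precision: 23 stored bits plus 1 implicit bit
print(math.floor((p - 1) * math.log10(2)))  # floor(6.923...) = 6, i.e. FLT_DIG
print(p * math.log10(2))                    # about 7.22
```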
An IEEE 754 single-precision float has a 24-bit mantissa. This means it has 24 binary bits' worth of precision.
But we might be interested in knowing how many decimal digits worth of precision it has.
One way of computing this is to consider how many 24-bit binary numbers there are. The answer, of course, is 2^24. So these binary numbers go from 0 to 16777215.
How many decimal digits is that? Well, log10 gives you the number of decimal digits. log10(2^24) is 7.2, or a little more than 7 decimal digits.
And look at that: 16777215 has 8 digits, but the leading digit is just 1, so in fact it's only a little more than 7 digits.
(Of course this doesn't mean we can represent only numbers from 0 to 16777215! It means we can represent numbers from 0 to 16777215 exactly. But we've also got the exponent to play with. We can represent numbers from 0 to 1677721.5 more or less exactly to one place past the decimal, numbers from 0 to 167772.15 more or less exactly to two decimal points, etc. And we can represent numbers from 0 to 167772150, or 0 to 1677721500, but progressively less exactly -- always with ~7 digits' worth of precision, meaning that we start losing precision in the low-order digits to the left of the decimal point.)
The other way of doing this is to note that log10(2) is about 0.3. This means that 1 bit corresponds to about 0.3 decimal digits. So 24 bits corresponds to 24 × 0.3 = 7.2.
(Actually, IEEE 754 single-precision floating point explicitly stores only 23 bits, not 24. But there's an implicit leading 1 bit in there, so we do get the effect of 24 bits.)
Let's start a little smaller. With 10 bits (or 10 base-2 digits), you can represent the numbers 0 up to 1023. So you can represent up to 4 digits for some values, but 3 digits for most others (the ones below 1000).
To find out how many base-10 (decimal) digits can be represented by a bunch of base-2 digits (bits), you can use the log10() of the maximum representable value, i.e. log10(2^10) = log10(2) * 10 = 3.01....
The above means you can represent all 3 digit — or smaller — values and a few 4 digits ones. Well, that is easily verified: 0-999 have at most 3 digits, and 1000-1023 have 4.
Now take 24 bits. In 24 bits you can store log10(2^24) = 24 × log10(2) base-10 digits. But because the top bit is always the same, you can in fact only count on log10(2^23) = log10(8388608) = 6.92 digits. This means you can represent most 7-digit numbers, but not all. Some of the values you can represent faithfully have only 6 digits.
The truth is a bit more complicated, though, because the exponent plays a role too, and some of the many possible larger values can be represented as well, so 6.92 may not be the exact value. But it gets close and serves nicely as a rule of thumb, and that is why they say that single precision can represent 6 to 7 digits.
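To see the boundary concretely, here is a small sketch that uses Python's struct module to round a value to the nearest binary32 (the helper name is my own):

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest binary32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

# 2**24 + 1 = 16777217 is the first integer binary32 cannot hold exactly
print(to_float32(16777216.0) == 16777216.0)   # True
print(to_float32(16777217.0) == 16777217.0)   # False: it rounds to 16777216.0
```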
I have many Fibonacci numbers. If I want to determine whether two Fibonacci numbers are adjacent or not, one basic approach is as follows:
Get the index of the first fibonacci number, say i1
Get the index of the second fibonacci number, say i2
Get the absolute value of i1-i2, that is |i1-i2|
If the value is 1, then return true.
else return false.
In the first and second steps, it may take many comparisons to find the correct index by searching an array.
The third step needs one subtraction and one absolute-value operation.
I want to know whether there exists another approach to quickly determine whether two Fibonacci numbers are adjacent.
I don't care whether this question could be solved mathematically or by any hacking techniques.
If anyone have some idea, please let me know. Thanks a lot!
No need to find the index of either number.
Given that the two numbers belong to the Fibonacci series, if their difference is greater than the smaller of the two, then they are not adjacent. Otherwise they are.
Because Fibonacci series follows following rule:
F(n) = F(n-1) + F(n-2) where F(n)>F(n-1)>F(n-2).
So F(n) - F(n-1) = F(n-2),
=> Diff(n, n-1) = F(n-2) < F(n-1), the smaller of the two numbers.
The difference between two adjacent Fibonacci numbers will always be less than the smaller of the two.
NOTE: This only holds if the numbers belong to the Fibonacci series.
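The difference test described above runs in constant time; a sketch in Python (function name is my own), assuming both inputs really are Fibonacci numbers:

```python
def adjacent_fibs(a, b):
    """True if two Fibonacci numbers are adjacent in the sequence.

    Assumes both inputs are Fibonacci numbers; the smallest values
    (1, 1, 2) are edge cases because 1 appears twice in the sequence.
    """
    lo, hi = min(a, b), max(a, b)
    return hi - lo < lo   # F(n) - F(n-1) = F(n-2) < F(n-1)

print(adjacent_fibs(21, 34))   # True
print(adjacent_fibs(21, 55))   # False: 55 - 21 = 34 > 21
```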
Simply calculate the difference between them. If it is smaller than the smaller of the 2 numbers, they are adjacent; if it is bigger, they are not.
Each triplet in the Fibonacci sequence a, b, c conforms to the rule
c = a + b
So for every pair of adjacent Fibonaccis (x, y), the difference between them (y-x) is equal to the value of the previous Fibonacci, which of course must be less than x.
If 2 Fibonaccis, say (x, z) are not adjacent, then their difference must be greater than the smaller of the two. At minimum, (if they are one Fibonacci apart) the difference would be equal to the Fibonacci between them, (which is of course greater than the smaller of the two numbers).
Since for (a, b, c, d):
c = a + b
and d = b + c,
then d - b = (b + c) - b = c
By Binet's formula, the nth Fibonacci number is approximately phi**n/sqrt(5), where phi is the golden ratio. You can use base-phi logarithms to recover the index easily:
from math import log, sqrt

def fibs(n):
    nums = [1, 1]
    for i in range(n - 2):
        nums.append(sum(nums[-2:]))
    return nums

phi = (1 + sqrt(5)) / 2

def fibIndex(f):
    return round(log(sqrt(5) * f, phi))
To test this:
for f in fibs(20): print(fibIndex(f),f)
Output:
2 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
Of course,
def adjacentFibs(f, g):
    return abs(fibIndex(f) - fibIndex(g)) == 1
This fails with 1, 1 -- but there is little point in adding special-case logic for such an edge case. Add it in if you want.
At some stage, floating-point round-off error will become an issue. For that, you would need to replace math.log with an integer log algorithm (e.g. one that uses binary search).
On Edit:
I concentrated on the question of how to recover the index (and I will keep the answer since that is an interesting problem in its own right), but as @LeandroCaniglia points out in their excellent comment, this is overkill if all you want is to check whether two Fibonacci numbers are adjacent, since another consequence of Binet's formula is that sufficiently large adjacent Fibonacci numbers have a ratio that differs from phi by a negligible amount. You could do something like:
def adjFibs(f, g):
    f, g = min(f, g), max(f, g)
    if g <= 34:
        return adjacentFibs(f, g)
    else:
        return abs(g/f - phi) < 0.01
This assumes that they are indeed Fibonacci numbers. The index-based approach can be used to verify that they are (calculate the index and then use the full-fledged Binet's formula with that index).
I was going over my textbook to review permutations and combinatorics, which I have great difficulty comprehending despite their seeming simple, and came across this problem.
How many ways are there to write a length-15 binary string if there must be exactly 3 1's and 12 0's?
The answer to the problem was C(15, 3) or C(15, 12). Now, I understand why there are two possible solutions to the problem, but I'm puzzled as to why the answer is C(15, 12) || C(15, 3)
From my understanding, we're choosing three (or twelve) of the digits to be 1 (or 0), which is all well and good, but how does that ensure that the remaining digits are the remaining 0's or 1's?
tl;dr: C(15, 3) counts the ways three digits can be 1, but how does that guarantee the remaining 12 will be 0s?
Go back to first principles:
Start with all 15 bits set to 0 [1 way to do this]
Choose 1 bit and flip it [15 ways to do this]
Choose a different bit and flip it [14 ways to do this]
Choose yet another bit and flip it [13 ways to do this]
It should be clear that exactly 3 bits are 1's and the remaining 12 are 0's
Total number of ordered ways to do this: 1 x 15 x 14 x 13 = 2730. But the same set of 3 bits can be chosen in any of 3! = 6 orders, so divide: 2730 / 6 = 455 = C(15, 3)
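The count above can be checked directly in Python; 15 × 14 × 13 counts ordered choices, and dividing by the 3! orderings of the same three bits gives the combination count:

```python
from math import comb, factorial

ordered = 15 * 14 * 13              # ordered choices of 3 distinct bit positions
print(ordered // factorial(3))      # 455: unordered choices
print(comb(15, 3))                  # 455
print(comb(15, 12))                 # 455: picking the 12 zeros is the same choice
```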