Algorithm for printing decimal value of a huge(over 128bits) binary number? - c

TLDR, at the bottom :)
Brief:
I am in the process of creating a basic arithmetic library (addition, subtraction, ...) for handling huge numbers. One of the problems I am facing is printing these huge binary numbers in decimal.
I have a huge binary number stored in an array of uint64_t, e.g.
uint64_t a[64] = {0};
Now, the goal is to print the 64*64-bit binary number to the console/file as its decimal value.
Initial Work:
To elaborate the problem I want to describe how I printed hex value.
int i;
int s = 1;
a[1] = (uint64_t)0xFF;
for (i = s; i >= 0; i--)
{
    /* each 64-bit element needs 16 hex digits, zero padded */
    printf("0x%016llX, ", (unsigned long long)a[i]);
}
Output:
0x00000000000000FF, 0x0000000000000000,
Similarly, to print the octal value I can take the 3 least significant bits of a, print the decimal equivalent of those bits, shift the whole of a right by 3 bits, and keep repeating until all of a has been printed (printing in reverse order so that the first octal digit ends up on the right).
I can print the hex and octal value of a binary number of unlimited size just by repeating this unit algorithm, but I could not find or develop one for decimal that I can repeat over and over to print a (or something bigger).
What I have thought of:
My initial idea was to keep subtracting
uint64_t max_64 = 10000000000000000000ULL; /* i.e. 10^19 */
the largest power of 10 that fits in a uint64_t, from a until the value in a is smaller than max_64 (which is basically equivalent to rem_64 = a % max_64), and print rem_64 using
printf("%019llu", (unsigned long long)rem_64);
which gives the last (rightmost) 19 decimal digits of the number a.
Then do an arithmetic operation similar to (not the code):
a = a/max_64; /* Integer division(no fractional part) to remove right most 19 dec digits from 'a' */
and keep repeating and printing 19 decimal digits. (print in such a way that first found 19 digits are on the right, then next 19 digits on its left and so on...).
The problem is that this process is too long, and I don't want to use all of these operations just to print the decimal value. I am looking for a process that avoids these huge, time-consuming arithmetic operations.
What I believe is that there must be a way to print a number of huge size just by repeating an algorithm (similar to how hex and octal can be printed), and I hope someone can point me in the right direction.
What my library can do(so far):
Add (Using Full-Adder)
Sub (Using Full-subtractor)
Compare (by comparing array size and comparing array elements)
Div (Integer division, no fractional part)
Modulus (%)
Multiplication (basically repeated addition :( )
I will write code for other operations if needed, but I would like to implement the printing function independent of the library if possible.
Consider the problem like this:
You have been given a binary number X of n bits (1<=n<=64*64) and you have to print out X in decimal. You can use an existing library if absolutely needed, but it is better if none is used.
TLDR:
Any code, reference or unit algorithm which I can repeat to print the decimal value of a binary number of very large and/or unknown size would be very helpful. The emphasis is on the algorithm, i.e. I don't need code; if someone could describe a process, I will be able to implement it. Thanks in advance.

When faced with such doubts, and given that there are many bigint libraries out there, it is interesting to look into their code. I had a look at Java's BigInteger, which has a toString method, and they do two things:
For small numbers, they bite the bullet and do something similar to what you proposed: straightforward link-by-link base conversion, outputting decimal numbers in each step.
For large numbers, they use the recursive Schönhage algorithm, which they quote in the comments as being referred to in, among other places,
Knuth, Donald, The Art of Computer Programming, Vol. 2, Answers to Exercises (4.4) Question 14.
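As a rough illustration of the straightforward conversion, here is a minimal sketch of the "repeatable unit" the question asks for: short division of the whole array by 10^9, where each remainder is the next group of 9 decimal digits. It is not taken from BigInteger. The names div_by_1e9, to_decimal and MAX_LIMBS are made up for this example, and it assumes the array is stored least significant element first, as in the hex loop of the question.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_LIMBS 64   /* assumption: matches uint64_t a[64] in the question */

/* Divide a[0..n-1] (least significant limb first) in place by 10^9 and
   return the remainder. Each 64-bit limb is handled as two 32-bit halves,
   so the running value (rem * 2^32 + half) always fits in a uint64_t. */
uint32_t div_by_1e9(uint64_t *a, size_t n)
{
    const uint64_t base = 1000000000u;
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0; ) {
        uint64_t hi = (rem << 32) | (a[i] >> 32);
        uint64_t qhi = hi / base;
        rem = hi % base;
        uint64_t lo = (rem << 32) | (a[i] & 0xFFFFFFFFu);
        uint64_t qlo = lo / base;
        rem = lo % base;
        a[i] = (qhi << 32) | qlo;
    }
    return (uint32_t)rem;
}

/* Write the decimal form of a[0..n-1] into out. Assumes n <= MAX_LIMBS and
   that out has room for at least 20 * n + 1 characters. */
void to_decimal(const uint64_t *a, size_t n, char *out)
{
    uint64_t work[MAX_LIMBS];
    uint32_t chunks[3 * MAX_LIMBS];   /* 9 decimal digits per chunk */
    size_t count = 0;

    memcpy(work, a, n * sizeof work[0]);
    while (n > 0 && work[n - 1] == 0) n--;        /* skip leading zero limbs */
    while (n > 0) {
        chunks[count++] = div_by_1e9(work, n);    /* lowest 9 digits first */
        while (n > 0 && work[n - 1] == 0) n--;
    }
    if (count == 0) {
        strcpy(out, "0");
        return;
    }
    /* The top chunk is printed unpadded, the others zero padded to 9 digits. */
    out += sprintf(out, "%lu", (unsigned long)chunks[count - 1]);
    for (size_t i = count - 1; i-- > 0; )
        out += sprintf(out, "%09lu", (unsigned long)chunks[i]);
}
```

The cost is quadratic in the number of limbs, which is fine for 64 limbs. For much larger inputs, divide-and-conquer conversions such as the recursive one mentioned above are the usual choice.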

Related

Multiplication of 2 numbers with a maximum of 2000 digits [duplicate]

Implement a program to multiply two numbers, with the mention that the first can have a maximum of 2048 digits, and the second number is less than 100. HINT: multiplication can be done using repeated additions.
Up to a certain point, the program works using long double, but when working with larger numbers, only INF is displayed. Any ideas?
Implement a program to multiply two numbers, with the mention that the first can have a maximum of 2048 digits, and the second number is less than 100.
OK. The nature of multiplication is that if a number with N digits is multiplied by a number with M digits, then the result will have up to N+M digits. A number less than 100 has at most 2 digits, so you need to handle a result of up to 2050 decimal digits, which is up to 6810 bits.
A long double could be anything (it's implementation dependent). Most likely (Windows, or not 80x86) it's a synonym for double, but sometimes it might be larger (e.g. the 80-bit extended precision format). The best you can realistically hope for is a dodgy estimate with lots of precision loss, not a correct result.
The worst case (the most likely case) is that the exponent isn't big enough either. E.g. for double the (unbiased) exponent has to be in the range −1022 to +1023, so attempting to shove a 2048-digit number (roughly 2^6803) in there will cause an overflow (an infinity).
What you're actually being asked to do is implement a program that uses "big integers". The idea would be to store the numbers as arrays of integers, like uint32_t result[213]; (213 × 32 = 6816 bits), so that you actually do have enough bits to get a correct result without precision loss or overflow problems.
With this in mind, you want a multiplication algorithm that can work with big integers. Note: I'd recommend something from that Wikipedia page's "Algorithms for multiplying by hand" section. There are faster, more advanced algorithms, but they are way too complicated for (what I assume is) a university assignment.
Also, treat the "HINT: multiplication can be done using repeated additions" with care. Because the second number is below 100, a loop like while(source2 != 0) { result += source1; source2--; } needs at most 99 big-integer additions here, which is workable, but the approach does not scale: with a large second operand it would take far too long.
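To make the array-of-integers idea concrete, here is a minimal sketch of multiplying such a big integer by a small number in a single pass with a carry. The function name mul_small and the least-significant-limb-first layout are assumptions made for this example, not part of the assignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply the big integer limbs[0..n-1] (least significant limb first)
   in place by the small factor m. Returns the carry out of the top limb,
   which is nonzero only if the array was too short for the result. */
uint32_t mul_small(uint32_t *limbs, size_t n, uint32_t m)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 32 x 32 -> 64 bit product plus carry cannot overflow uint64_t */
        uint64_t t = (uint64_t)limbs[i] * m + carry;
        limbs[i] = (uint32_t)t;     /* low 32 bits stay in this limb */
        carry = t >> 32;            /* high bits move to the next limb */
    }
    return (uint32_t)carry;
}
```

Reading the decimal input into such an array can reuse the same loop: multiply by 10, then add the next digit. Printing the result needs a conversion back to decimal.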
Here's a few hints.
Multiplying a 2048-digit string by a number below 100 (at most 2 digits) might yield a string with as many as 2050 digits. That's too large for any primitive C type. So you'll have to do all the math the hard way against "strings". So stay in the string space, since your input will most likely be read in as such.
Let's say you are trying to multiply "123456" x "789".
That's equivalent to 123456 * (700 + 80 + 9)
Which is equivalent to 123456 * 700 + 123456 * 80 + 123456 * 9
Which is equivalent to doing these steps:
result1 = Multiply 123456 by 7 and add two zeros at the end
result2 = Multiply 123456 by 8 and add one zero at the end
result3 = Multiply 123456 by 9
final result = result1+result2+result3
So all you need is a handful of primitives that can take a digit string of arbitrary length and do some math operations on it.
You just need these three functions (the bodies are left for you to implement):
// Returns a new string that is identical to s but with a specific number of
// zeros added to the end.
// e.g. MultiplyByPowerOfTen("123", 3) returns "123000"
char* MultiplyByPowerOfTen(const char* s, size_t zerosToAdd);
// Performs multiplication on the big integer represented by s
// by the specified digit
// e.g. Multiply("12345", 2) returns "24690"
char* Multiply(const char* s, int digit); // where digit is between 0 and 9
// Performs addition on the big integers represented by s1 and s2
// e.g. Add("12345", "678") returns "13023"
char* Add(const char* s1, const char* s2);
Final hint. Any character at position i in your string can be converted to its integer equivalent like this:
int digit = s[i] - '0';
And any digit can be converted back to a printable char:
char c = '0' + digit;
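One possible way to fill in those three primitives and combine them is sketched below. It assumes the inputs are non-empty strings of ASCII digits without a sign, and that every returned string is allocated with malloc and freed by the caller. StripLeadingZeros and MultiplyStrings are hypothetical helper names added for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Removes leading zeros in place but always keeps at least one digit. */
static void StripLeadingZeros(char *s)
{
    size_t skip = 0;
    while (s[skip] == '0' && s[skip + 1] != '\0') skip++;
    memmove(s, s + skip, strlen(s + skip) + 1);
}

char *MultiplyByPowerOfTen(const char *s, size_t zerosToAdd)
{
    size_t len = strlen(s);
    char *r = malloc(len + zerosToAdd + 1);
    if (!r) return NULL;
    memcpy(r, s, len);
    memset(r + len, '0', zerosToAdd);
    r[len + zerosToAdd] = '\0';
    return r;
}

char *Multiply(const char *s, int digit)   /* digit is between 0 and 9 */
{
    size_t len = strlen(s);
    char *r = malloc(len + 2);              /* one extra digit for the carry */
    int carry = 0;
    if (!r) return NULL;
    r[len + 1] = '\0';
    for (size_t i = len; i-- > 0; ) {       /* right to left, like on paper */
        int v = (s[i] - '0') * digit + carry;
        r[i + 1] = (char)('0' + v % 10);
        carry = v / 10;
    }
    r[0] = (char)('0' + carry);
    StripLeadingZeros(r);
    return r;
}

char *Add(const char *s1, const char *s2)
{
    size_t n1 = strlen(s1), n2 = strlen(s2);
    size_t n = (n1 > n2 ? n1 : n2) + 1;     /* room for a final carry */
    char *r = malloc(n + 1);
    int carry = 0;
    if (!r) return NULL;
    r[n] = '\0';
    for (size_t k = 0; k < n; k++) {        /* k counts digits from the right */
        int d1 = k < n1 ? s1[n1 - 1 - k] - '0' : 0;
        int d2 = k < n2 ? s2[n2 - 1 - k] - '0' : 0;
        int v = d1 + d2 + carry;
        r[n - 1 - k] = (char)('0' + v % 10);
        carry = v / 10;
    }
    StripLeadingZeros(r);
    return r;
}

/* Hypothetical driver: sums the shifted partial products, one per digit of b. */
char *MultiplyStrings(const char *a, const char *b)
{
    size_t nb = strlen(b);
    char *sum = MultiplyByPowerOfTen("0", 0);
    for (size_t i = 0; i < nb && sum; i++) {
        char *part = Multiply(a, b[i] - '0');
        char *shifted = part ? MultiplyByPowerOfTen(part, nb - 1 - i) : NULL;
        char *next = shifted ? Add(sum, shifted) : NULL;
        free(part);
        free(shifted);
        free(sum);
        sum = next;
    }
    return sum;   /* NULL if any allocation failed */
}
```

With these definitions, MultiplyStrings("123456", "789") yields "97406784", the sum of the three partial results from the worked example.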

Printing doubles without zeroes at the end?

Is there a way to print doubles in C using fprintf so that the precision of the print is the least possible (so that, for example, an integer is always printed without decimals)?
I know that you can do something like printf("%.0f",number); But I am wondering if there is a way to use the minimum precision that makes the print exact (whenever the number can be expressed finitely in base 10 of course).
Every finite double, whether the format is base 2 (the usual), base 10 or base 16, can be printed exactly with a finite number of decimal digits. DBL_MIN may take hundreds of digits to do so, but the expansion is not infinite. printf() is not required to perform to that level, so exact output ends up needing custom code, and that code can of course print doubles without trailing zeros.
Recommend sprintf(buffer, "%.*e", DBL_DECIMAL_DIG - 1, some_double) and post-process the buffer to remove trailing zeros as needed, for a "close enough" answer to the code's goal.
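A minimal sketch of that recommendation follows. The helper name format_trimmed is made up for this example, and DBL_DECIMAL_DIG requires a C11 (or later) <float.h>.

```c
#include <float.h>
#include <stdio.h>
#include <string.h>

/* Formats x with DBL_DECIMAL_DIG significant digits in exponential form,
   then removes the trailing zeros of the mantissa (and a bare '.'). */
void format_trimmed(double x, char *buf, size_t size)
{
    char *e, *end;

    snprintf(buf, size, "%.*e", DBL_DECIMAL_DIG - 1, x);
    e = strchr(buf, 'e');
    if (e == NULL)
        return;                      /* "inf" or "nan": nothing to trim */
    end = e;
    while (end > buf && end[-1] == '0')
        end--;                       /* skip trailing zeros before the 'e' */
    if (end > buf && end[-1] == '.')
        end--;                       /* the mantissa was an integer */
    memmove(end, e, strlen(e) + 1);  /* pull the exponent part forward */
}
```

Values such as 0.1, which are not exactly representable in binary, keep all their digits with this approach, which is why the answer calls the result "close enough" rather than minimal.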

GMP most significant digits

I'm performing some calculations on arbitrary precision integers using the GNU Multiple Precision (GMP) library. Then I need the decimal digits of the result. But not all of them: just, let's say, the hundred most significant digits (that is, the digits the number starts with) or a selected range of digits from the middle of the number (e.g. digits 100..200 from a 1000-digit number).
Is there any way to do it in GMP?
I couldn't find any functions in the documentation to extract a range of decimal digits as a string. The conversion functions which convert mpz_t to character strings always convert the entire number. One can only specify the radix, but not the starting/ending digit.
Is there any better way to do it other than converting the entire number into a humongous string only to take a small piece of it and throw out the rest?
Edit: What I need is not to control the precision of my numbers or limit it to a particular fixed amount of digits, but selecting a subset of digits from the digit string of the number of arbitrary precision.
Here's an example of what I need:
7^1316831 = 19821203202357042996...2076482743
The actual number has 1112852 digits, which I contracted into the ....
Now, I need only an arbitrarily chosen substring of this humongous string of digits. For example, the ten most significant digits (1982120320 in this case). Or the digits from 1112841th to 1112849th (21203202 in this case). Or just a single digit at the 1112841th position (2 in this case).
If I were to first convert my GMP number to a string of decimal digits with mpz_get_str, I would have to allocate a tremendous amount of memory for these digits only to use a tiny fraction of them and throw out the rest. (Not to mention that the original mpz_t number in binary representation already eats up quite a lot.)
If you know the number of decimal digits of x = 7^1316831 in advance, e.g., 1112852, then you get the lowest, say, 10 digits with:
x % (10^10), and the upper 20 digits with:
x / (10^(1112852 - 20)).
Note, I get 19821203202357042995 for the latter; the final digit is 5, not 6.
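The same arithmetic can be tried on a small scale. The sketch below uses plain uint64_t values instead of mpz_t, and the names pow10_u64 and digit_range are made up for this example. With GMP the corresponding steps would presumably go through mpz_ui_pow_ui, mpz_tdiv_q and mpz_tdiv_r, which still compute a large quotient but avoid building the full decimal string.

```c
#include <stdint.h>

/* 10^k as a uint64_t; assumes k <= 19 so the result fits. */
uint64_t pow10_u64(unsigned k)
{
    uint64_t p = 1;
    while (k-- > 0)
        p *= 10;
    return p;
}

/* Returns `count` digits of x starting at the 1-based position `from`,
   counted from the most significant digit. n is the total number of
   decimal digits of x, assumed known in advance as in the answer. */
uint64_t digit_range(uint64_t x, unsigned n, unsigned from, unsigned count)
{
    /* Divide away everything below the range, then keep `count` digits. */
    uint64_t top = x / pow10_u64(n - (from - 1) - count);
    return top % pow10_u64(count);
}
```

For x = 823543 (6 digits), digit_range(x, 6, 1, 2) gives 82 and digit_range(x, 6, 4, 3) gives 543.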
I don't think you can do that in GMP. However, you can use the Boost Multiprecision Library:
Depending upon the number type, precision may be arbitrarily large (limited only by available memory), fixed at compile time (for example 50 or 100 decimal digits), or a variable controlled at run-time by member functions. The types are expression-template-enabled for better performance than naive user-defined types.
Emphasis mine
Another alternative is ttmath, with the type ttmath::Big<e,m> that lets you control the needed precision. Any fixed-precision type will work, provided that you only need the most significant digits, as they all drop the low-order digits the way float and double do. Those digits don't affect the high digits of the result, hence they can be omitted safely. For instance, if you need the high 20 digits then use a type that can store 20 digits and a little more, in order to provide enough data for correct rounding later.
For demonstration let's take a simple example of 7^7 = 823543 where you only need the top 2 digits. Using a 4-digit type for the calculation you'll get this:
7^5 = 16807 => round to 1681×10^1 and store
7^5×7 = 1681×10^1×7 = 11767×10^1 ≈ 1177×10^2
7^5×7×7 = 1177×10^2×7 = 8239×10^2
As you can see the top digits are the same even without needing to get the full exact result. Calculating the full precision using GMP not only wastes a lot of time but also memory. Think about the amount of memory you need to store the result of another operation on 2 bigints to get the digits you want. By fixing the precision instead of leaving it at infinite you'll decrease the CPU and memory usage significantly.
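The worked example can be reproduced with a few lines of integer code. This is only a sketch of the rounding idea, not how ttmath or Boost work internally: a 4-digit mantissa and a decimal exponent stand in for the fixed-precision type, and the name pow_top_digits is made up.

```c
#include <stdint.h>

/* Computes base^exp while keeping only a 4-digit mantissa, rounding after
   every multiplication. The result is approximately mant * 10^e10.
   Assumes base is small enough that 9999 * base fits in a uint64_t. */
void pow_top_digits(uint64_t base, unsigned exp, uint64_t *mant, unsigned *e10)
{
    uint64_t m = 1;
    unsigned e = 0;
    for (unsigned i = 0; i < exp; i++) {
        m *= base;
        while (m >= 10000) {        /* more than 4 digits: drop one digit */
            m = (m + 5) / 10;       /* round half up */
            e++;
        }
    }
    *mant = m;
    *e10 = e;
}
```

For 7^7 this ends with mant = 8239 and e10 = 2, i.e. 8239×10^2, whose leading digits 82 match those of the exact value 823543.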
If you need the 100th to 200th high-order digits then use a type that has enough room for 201 digits or more, and extract those 101 digits after the calculation. But this will be more wasteful, so you may need to change to an arbitrary-precision (or fixed-precision) type that uses a base that's a power of 10 for its limbs (I'm using GMP notation here). For example, if the type uses base 10^9 then each limb represents 9 digits in the decimal output and you can get an arbitrary decimal digit directly without any conversion from binary to decimal. That means zero waste for the string. I'm not sure which libraries use base 10^n, but you can look at Mini-Pi's implementation, which uses base 10^9, or write it yourself. This approach also works for efficiently getting the high digits.
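To show why a power-of-ten base makes digit access direct, here is a small sketch. It assumes a hypothetical layout of uint32_t limbs, least significant first, each holding a value below 10^9. The function name decimal_digit_at is made up and does not come from Mini-Pi or any other library.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the decimal digit at position pos, counted from 0 at the least
   significant digit. Positions beyond the stored limbs read as 0. */
unsigned decimal_digit_at(const uint32_t *limbs, size_t nlimbs, size_t pos)
{
    size_t limb = pos / 9;          /* each limb holds 9 decimal digits */
    uint32_t v;

    if (limb >= nlimbs)
        return 0;
    v = limbs[limb];
    for (size_t i = pos % 9; i > 0; i--)
        v /= 10;                    /* shift the wanted digit to the end */
    return v % 10;
}
```

No other limb is touched, so reading a digit range costs time proportional to the size of the range only.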
See
How are extremely large floating-point numbers represented in memory?
What is the simplest way of implementing bigint in C?

Using printf to align doubles by decimal not working

I want to print doubles so that the decimals line up. For example:
1.2345
12.3456
should result in
 1.2345
12.3456
I have looked everywhere, and the top recommendation is to use the following method (the 5s can be anything):
printf("%5.5f\n", x);
I have tried this with the following (very) simple program:
#include <stdio.h>
int main() {
    printf("%10.10f\n", 0.523431);
    printf("%10.10f\n", 10.43454);
    return 0;
}
My output is:
0.5234310000
10.4345400000
Why doesn't this work?
The number before the . is minimum characters total, not just before the radix point.
printf("%21.10f\n", 0.523431);
When you use "%10.10f" you are telling printf(): "use 10 character positions to print the number (optional minus sign, integer part, decimal point and decimal part). Of these 10 positions, reserve 10 for the decimal part. If this is not possible, ignore the first number and use whatever positions are needed to print the number so that the number of decimal positions is kept."
So that's what printf() is doing.
So you need to indicate how many positions you are going to use in total, for example 15, and how many of those positions are going to be decimals, for example 9. That will leave you with 5 positions for the minus sign and integer part and one position for the decimal point.
That is, try "%15.9f" in your printf calls.
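A small sketch of the suggested format, using snprintf so the padded result can be inspected as a string. The helper name format_aligned is made up for this example.

```c
#include <stdio.h>

/* Formats value right-aligned in a field of 15 characters, 9 of which
   are decimals, so the decimal points of successive lines line up. */
int format_aligned(char *buf, size_t size, double value)
{
    return snprintf(buf, size, "%15.9f", value);
}
```

Both 0.523431 and 10.43454 come out 15 characters wide this way, with the decimal point in the same column.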

How do you print out an IEEE754 number (without printf)?

For the purposes of this question, I do not have the ability to use printf facilities (I can't tell you why, unfortunately, but let's just assume for now that I know what I'm doing).
For an IEEE754 single precision number, you have the following bits:
SEEE EEEE EFFF FFFF FFFF FFFF FFFF FFFF
where S is the sign, E is the exponent and F is the fraction.
Printing the sign is relatively easy for all cases, as is catching all the special cases like NaN (E == 0xff, F != 0), Inf (E == 0xff, F == 0) and 0 (E == 0, F == 0, considered special just because the exponent bias isn't used in that case).
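The field extraction and the special cases just listed can be sketched as follows. The type and function names (float_class, float_parts, decode_float) are made up for this example, and it assumes float is a 32-bit IEEE754 single precision value.

```c
#include <stdint.h>
#include <string.h>

typedef enum { F_ZERO, F_SUBNORMAL, F_NORMAL, F_INF, F_NAN } float_class;

typedef struct {
    unsigned sign;       /* S: 1 bit */
    unsigned exponent;   /* E: 8 bits, still biased */
    uint32_t fraction;   /* F: 23 bits */
    float_class cls;
} float_parts;

float_parts decode_float(float f)
{
    uint32_t bits;
    float_parts p;

    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read the bits */
    p.sign = bits >> 31;
    p.exponent = (bits >> 23) & 0xFF;
    p.fraction = bits & 0x7FFFFF;

    if (p.exponent == 0xFF)
        p.cls = p.fraction ? F_NAN : F_INF;
    else if (p.exponent == 0)
        p.cls = p.fraction ? F_SUBNORMAL : F_ZERO;
    else
        p.cls = F_NORMAL;
    return p;
}
```

For example, 1.0f decodes to sign 0, exponent 127 and fraction 0.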
I have two questions.
The first is how best to turn denormalised numbers (where E == 0, F != 0) into normalised numbers (where 1 <= E <= 0xfe)? I suspect this will be necessary to simplify the answer to the next question (but I could be wrong so feel free to educate me).
The second question is how to print out the normalised numbers. I want to be able to print them out in two ways, exponential like -3.74195E3 and non-exponential like 3741.95. Although, just looking at those two side-by-side, it should be fairly easy to turn the former into the latter by just moving the decimal point around. So let's just concentrate on the exponential form.
I have a vague recollection of an algorithm I used long ago for printing out PI where you used one of the ever-reducing formulae and kept an upper and lower limit on the possibilities, outputting a digit when both limits agreed, and shifting the calculation by a factor of 10 (so when the upper and lower limits were 3.2364 and 3.1234, you could output the 3 and adjust for that in the calculation).
But it's been a long time since I did that so I don't even know if that's a suitable approach to take here. It seems so since the value of each bit is half that of the previous bit when moving through the fractional part (1/2, 1/4, 1/8 and so on).
I would really prefer not to have to go trudging through printf source code unless absolutely necessary so, if anyone can help out with this, I'll be eternally grateful.
If you want to get exact results for every conversion, you'll have to use arbitrary-precision arithmetic, as done in printf() implementations. If you want to get results that are "close," perhaps differing only in their least significant digit(s), then a very simple double-precision based algorithm will suffice: for the integer part, repeatedly divide by ten and append the remainders to form the decimal string (in reverse); for the fractional part, repeatedly multiply by ten and subtract off the integer parts to form the decimal string.
I recently wrote an article about this method: http://www.exploringbinary.com/quick-and-dirty-floating-point-to-decimal-conversion/ . It does not print scientific notation, but that should be trivial to add. The algorithm prints subnormal numbers (the ones I printed came out accurately, but you'd have to do more thorough testing).
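A rough sketch of that "close" method, written without printf, is shown below. It is not the article's code. The name quick_ftoa is made up, and to avoid the math library it assumes x is finite with magnitude below 2^64, so the integer part fits in a uint64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes a decimal rendering of x into out and returns its length.
   out must have room for the sign, up to 20 integer digits, the point,
   max_frac_digits fractional digits and the terminator. */
size_t quick_ftoa(double x, char *out, int max_frac_digits)
{
    char rev[20];                   /* integer digits, generated in reverse */
    size_t n = 0, len = 0;

    if (x < 0) {
        out[len++] = '-';
        x = -x;
    }
    uint64_t ip = (uint64_t)x;      /* integer part (truncation) */
    double frac = x - (double)ip;   /* fractional part */

    /* Integer part: repeatedly divide by ten and collect the remainders. */
    do {
        rev[n++] = (char)('0' + ip % 10);
        ip /= 10;
    } while (ip != 0);
    while (n > 0)
        out[len++] = rev[--n];

    out[len++] = '.';

    /* Fractional part: repeatedly multiply by ten, peel off the integer. */
    do {
        frac *= 10.0;
        int d = (int)frac;
        out[len++] = (char)('0' + d);
        frac -= d;
    } while (frac != 0.0 && --max_frac_digits > 0);

    out[len] = '\0';
    return len;
}
```

For the float 3741.95f converted to double, this prints 3741.949951171875, the exact value of the nearest single precision number. In general the last digits may be off, as the answer notes.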
Denormalized numbers cannot be turned into normalized numbers of the same floating point type. The equivalent normalized number's exponent would be too small to be represented by the exponent field.
To print normalized numbers, one silly way I can think of is to repeatedly multiply by 10 (well, for the fractional part).
The first thing you need to do is convert the exponent to decimal (since presumably that's what you want the output in) using logarithms. You take the fraction of that result and multiply the mantissa by the exp10 of that fraction, and then convert that to decimal characters. From there you just need to insert the decimal point in the appropriate location, shifted by the now-decimal exponent.
There is a paper by G. Steele describing in more detail an algorithm which seems to be based on the same principle as the one you outline. If memory serves, there are times when you are forced to use unbounded precision arithmetic. (I think it is "How to Print Floating-Point Numbers Accurately", but citeseer is currently down from here so I can't confirm, and Google results are polluted by a retrospective paper by the same author from 20 years later.)
