big int compiler implementations? - c

I am building a compiler for a language similar to C, but I want it to parse integers bigger than 2^32. How is this possible? How have big integers been implemented in languages like Python and Ruby?

There are libraries to do this sort of thing.
Check out gmplib.

There are lots of big number libraries, see this wikipedia article for a complete list.
GMP (GNU Multiple Precision Arithmetic Library) is sufficient for everything I have encountered. NTL is more of the same but is object-oriented.
Generally these libraries represent a number as an array of digits in a large base, typically one machine word per digit (GMP calls these limbs), rather than one decimal digit per character. You can roll your own along the same lines, but it is a lot of work.

If you want to write it yourself, follow my trip through memory lane ;-).
In the old days, when computers used 8 bits, we often needed to calculate with big numbers (that is, anything > 255), and we all had to write the routines ourselves. Take addition, for example.
To add two 2-byte numbers, we used the following algorithm:
Add the least significant bytes.
If the result exceeded 8 bits, the carry bit was set.
Add the most significant bytes and the carry flag (if set).
If the result exceeded 8 bits, you produced an overflow error (but you don't need to do this if you want more than 2 bytes; just carry into the next byte).
You can extend this to more bytes/words/dwords/qwords and to other operators.
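The byte-by-byte addition described above can be sketched in C roughly as follows. This is only an illustration: the function name add_bytes and the least-significant-byte-first layout are assumptions made for the example, and the carry is computed by hand because portable C has no access to the CPU's carry flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds two n-byte unsigned numbers stored least significant byte first.
 * Writes the n-byte sum to `sum` and returns the final carry (0 or 1).
 * add_bytes is a hypothetical name used only for this sketch. */
unsigned add_bytes(const uint8_t *a, const uint8_t *b, uint8_t *sum, size_t n)
{
    unsigned carry = 0;
    assert(a != NULL && b != NULL && sum != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned t = (unsigned)a[i] + b[i] + carry; /* at most 255 + 255 + 1 */
        sum[i] = (uint8_t)(t & 0xFFu);              /* keep the low 8 bits */
        carry = t >> 8;                             /* bit 8 is the carry */
    }
    return carry; /* nonzero means the result did not fit in n bytes */
}
```

With n = 2 this is the two-byte case from the text; a nonzero return value corresponds to the overflow error. The same loop works with wider digits (16-bit or 32-bit words) as long as the temporary is wide enough to hold the carry.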

I believe you'll need some sort of bigint library, which are available on the net, just do a bit of searching and you may find one that's suitable for your project.
Simply parsing the integers will, I believe, not be enough. Your users will want not only to store such numbers but probably also to perform operations on them.

There is a slide deck by Felix von Leitner that covers some bignum basics. Personally I think it is quite informative and technical.

C++ Big Integer Library from Matt McCutchen
https://mattmccutchen.net/bigint/
C++ source code only. Very simple to use.

You would have to use some sort of struct in C to achieve this. You will find this is difficult if you are on an x86 platform rather than x64. If you're on x86, prepare to get very familiar with assembly and the carry flag.
Good luck!

Related

What type I should use for fastest calculation speed?

I am making a 2D shooter game, so I have to store lots of bullets in an array, including their positions and where they are going.
So I have two issues. One is memory use, especially writing arrays that don't leave things misaligned or add lots of padding, which makes calculation speed suck.
The second is speed of calculation.
First, this means choosing between integers and floats... For now I am going with integers (if someone thinks floating point is better, please say so).
Then, this also means choosing a variant of that type (8 bits? 16 bits? C's confusing default? The CPU word size? Single precision? Double precision?)
Thus the question is: What type in C is fastest on modern processors (i.e. common x86, ARM and other popular processors; don't worry about Z80 or 36-bit processors), and what type is more reasonable when taking speed AND memory use into account?
Also, do signed and unsigned types differ in speed?
EDIT because of close votes: Yes, it might be premature optimization, but I am asking not only about CPU use but also about memory use (which might vary significantly). Also, I am doing the project to exercise my C skills. It has been some years since I coded in C, and I thought I would have some fun, find limits and stretch them, and also learn the new standards (last time I used C it was still C89).
Finally, the major motivation of asking this question was just hacker curiosity when I found out some new interesting types (like int_fast*_t) existed in newer standards.
But if you still think this is not worth asking, then I can delete the question and go peruse the standards and some books, learn by myself. Then if others one day have the same curiosity, it is not my problem.
I would say an int should be the most comfortable for your CPU. But the C standard does have:
The typedef name int_fastN_t designates the fastest signed integer
type with a width of at least N. The typedef name uint_fastN_t
designates the fastest unsigned integer type with a width of at least
N.
So in theory you could say things like: "I need it to be at least 16 bits so I shall use int_fast16_t". In practice that might translate to a plain int.
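To see what the "fast" types translate to on a given platform, a small check like the following can help. It is only a sketch: the helper names are made up for this example, and the sizes it reports depend on the compiler and C library, which is why no expected output is shown.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Storage size in bits of the "fast" and "least" 16-bit types.
 * Both helper names are hypothetical, chosen for this sketch. */
size_t fast16_bits(void)  { return sizeof(int_fast16_t)  * CHAR_BIT; }
size_t least16_bits(void) { return sizeof(int_least16_t) * CHAR_BIT; }

/* Prints the sizes so they can be compared with plain int. */
void print_fast_sizes(void)
{
    printf("int:           %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("int_least16_t: %zu bits\n", least16_bits());
    printf("int_fast16_t:  %zu bits\n", fast16_bits());
}
```

A reasonable split, under these assumptions, is to use the int_leastN_t or intN_t types for data stored in large arrays (where memory matters) and the int_fastN_t types for local variables used in calculations.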
I suspect it is premature to think about these before you actually hit a performance issue that you can try to work around. I think it is better to solve problems when they occur than to try to think of an elusive super-solution that could solve all future possible issues.
Single-precision floating-point add and multiply are as fast as 32-bit integer arithmetic on all modern processors (x86, ARM, MIPS), i.e. one result per clock cycle. Calculating positions and velocities in space is a lot easier with floating-point arithmetic, so use floats. Single-precision floats are 32 bits, the same size as the most efficient integer type on 32-bit CPUs.

Looking for Ansi C89 arbitrary precision math library

I wrote an ANSI C compiler for a friend's custom 16-bit stack-based CPU several years ago, but I never got around to implementing all the data types. Now I would like to finish the job, so I'm wondering if there are any math libraries out there that I can use to fill the gaps. I can handle 16-bit integer data types since they are native to the CPU, and therefore I have all the math routines (i.e. +, -, *, /, %) done for them. However, since his CPU does not handle floating point, I have to implement floats/doubles myself. I also have to implement the 8-bit and 32-bit data types (both integers and floats/doubles). I'm pretty sure this has been done and redone many times, and since I'm not particularly looking forward to reinventing the wheel, I would appreciate it if someone would point me at a library that can help me out.
Now, I was looking at GMP, but it seems to be overkill (the library must be absolutely huge; I'm not sure my custom compiler would be able to handle it), and it takes numbers in the form of strings, which would be wasteful for obvious reasons. For example:
mpz_set_str(x, "7612058254738945", 10);
mpz_set_str(y, "9263591128439081", 10);
mpz_mul(result, x, y);
This seems simple enough, I like the api... but I would rather pass in an array rather than a string. For example, if I wanted to multiply two 32-bit longs together I would like to be able to pass it two arrays of size two where each array contains two 16-bit values that actually represent a 32-bit long and have the library place the output into an output array. If I needed floating point then I should be able to specify the precision as well.
This may seem like asking for too much but I'm asking in the hopes that someone has seen something like this.
Many thanks in advance!
Let's divide the answer.
8-bit arithmetic
This one is very easy. In fact, C already talks about this under the term "integer promotion". This means that if you have 8-bit data and you want to do an operation on it, you simply pad the upper bits with zeros (or with ones if the value is signed and negative) to make it 16-bit. Then you proceed with the normal 16-bit operation.
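The padding step can be sketched in C as follows. The two function names are made up for this illustration, and the sign extension assumes two's-complement representation; in a real compiler back end this would be emitted as a couple of instructions for the target CPU rather than as a C function.

```c
#include <stdint.h>

/* Widens an 8-bit pattern to 16 bits, treating it as unsigned:
 * the upper byte is filled with zeros. (Hypothetical helper name.) */
uint16_t zero_extend8(uint8_t b)
{
    return (uint16_t)b;
}

/* Widens an 8-bit two's-complement pattern to 16 bits by copying
 * the sign bit (bit 7) into the upper byte. (Hypothetical helper name.) */
uint16_t sign_extend8(uint8_t b)
{
    return (b & 0x80u) ? (uint16_t)(0xFF00u | b) : (uint16_t)b;
}
```

For instance, the byte 0xFB (-5 as a signed 8-bit value) becomes 0xFFFB, which is -5 as a signed 16-bit value, while the same byte treated as unsigned becomes 0x00FB.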
32-bit arithmetic
Note: as far as the standard is concerned, you don't really need to have 32-bit integers.
This could be a bit tricky, but it is still not worth using a library for. For each operation, take a look at how you learned to do it in elementary school in base 10, and then do the same in base 2^16 for 2-digit numbers (each digit being one 16-bit integer). Once you understand the analogy with simple base-10 math (and hence the algorithms), you would need to implement them in the assembly of your CPU.
This basically means loading the most significant 16 bits into one register and the least significant 16 bits into another. Then follow the algorithm for each operation and perform it. You would most likely need help from the overflow and other flags.
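As a rough sketch of the base-2^16 idea, here are addition and multiplication of 32-bit values held as two 16-bit digits, which is the array layout the question asks about. The names add32 and mul32 and the low-digit-first order are assumptions for this example. The uint32_t temporaries stand in for the carry flag and for a 16x16 to 32-bit multiply, which the target CPU would have to provide as an instruction or a helper routine.

```c
#include <stdint.h>

/* A 32-bit value is two base-2^16 digits: v[0] is the low digit,
 * v[1] the high digit. add32 and mul32 are hypothetical names. */

/* out = a + b (mod 2^32); returns the carry out of the high digit. */
unsigned add32(const uint16_t a[2], const uint16_t b[2], uint16_t out[2])
{
    uint32_t lo = (uint32_t)a[0] + b[0];              /* low digits */
    uint32_t hi = (uint32_t)a[1] + b[1] + (lo >> 16); /* high digits + carry */
    out[0] = (uint16_t)lo;
    out[1] = (uint16_t)hi;
    return (unsigned)(hi >> 16);
}

/* out = a * b (mod 2^32), schoolbook multiplication in base 2^16.
 * The a[1]*b[1] term is a multiple of 2^32 and is dropped. */
void mul32(const uint16_t a[2], const uint16_t b[2], uint16_t out[2])
{
    uint32_t low = (uint32_t)a[0] * b[0];             /* full 32-bit product */
    /* Cross terms land in the high digit; only their low 16 bits matter,
     * so unsigned wraparound here is harmless. */
    uint32_t cross = (uint32_t)a[0] * b[1] + (uint32_t)a[1] * b[0];
    out[0] = (uint16_t)low;
    out[1] = (uint16_t)((low >> 16) + cross);
}
```

Subtraction follows the same pattern with a borrow instead of a carry; division is the one operation where the schoolbook method takes noticeably more care.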
Floating point arithmetic
Note: as far as the standard is concerned, you don't really need to conform to IEEE 754.
There are various libraries already written for software emulated floating points. You may find this gcc wiki page interesting:
GNU libc has a third implementation, soft-fp. (Variants of this are also used for Linux kernel math emulation on some targets.) soft-fp is used in glibc on PowerPC --without-fp to provide the same soft-float functions as in libgcc. It is also used on Alpha, SPARC and PowerPC to provide some ABI-specified floating-point functions (which in turn may get used by GCC); on PowerPC these are IEEE quad functions, not IBM long double ones.
Performance measurements with EEMBC indicate that soft-fp (as speeded up somewhat using ideas from ieeelib) is about 10-15% faster than fp-bit and ieeelib about 1% faster than soft-fp, testing on IBM PowerPC 405 and 440. These are geometric mean measurements across EEMBC; some tests are several times faster with soft-fp than with fp-bit if they make heavy use of floating point, while others don't make significant use of floating point. Depending on the particular test, either soft-fp or ieeelib may be faster; for example, soft-fp is somewhat faster on Whetstone.
One answer could be to take a look at the source code for glibc and see if you could salvage what you need.

Storing and printing integer values greater than 2^64

I am trying to write a program for finding Mersenne prime numbers. Using the unsigned long long type I was able to determine the value of the 9th Mersenne prime, which is (2^61)-1. For larger values I would need a data type that could store integer values greater than 2^64.
I should be able to use operators like *, *=, > ,< and % with this data type.
You cannot do what you want with C's native types; however, there are libraries that let you handle arbitrarily large numbers, like the GNU Multiple Precision Arithmetic Library.
To store large numbers, there are many choices, which are given below in order of decreasing preferences:
1) Use third-party libraries for C developed by others, available on github, codeflex etc.
2) Switch to other languages: Python, which has built-in large number support; Java, which provides BigInteger; or C++.
3) Develop your own data structure, perhaps based on strings (where a 100-char string could hold 100 decimal digits), with custom operations like addition, subtraction, multiplication etc., much as the complex number library in C++ was developed. This choice suits research and educational purposes.
What all these people are basically saying is that a 64-bit CPU cannot add such huge numbers with a single instruction; instead you need an algorithm that adds them, treating the two numbers in pieces.
The libraries they listed will do that for you. A good exercise would be to develop one yourself (just the algorithm/function, to learn how it's done).
There is no standard data type wider than 64 bits. Check the documentation of your system; some define 128-bit integers. However, to really have flexible-size integers, you should use another representation, an array for instance. Then it's up to you to define the operators =, <, >, etc.
Fortunately, libraries such as GMP let you use arbitrary-length integers.
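For the learning exercise, here is a minimal sketch of the array representation, limited to what is needed to store and print 2^p - 1. Everything in it is an assumption made for illustration: the type big_t, the function names, the fixed size of 8 limbs, and the choice of base 10^9 (picked because it makes decimal printing trivial; real libraries use a power-of-two base). It is not how GMP works internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LIMBS 8            /* 8 limbs of 9 decimal digits: up to 72 digits */
#define BASE  1000000000u  /* each limb holds a value in [0, 10^9) */

/* limb[0] is the least significant limb. big_t is a hypothetical type. */
typedef struct { uint32_t limb[LIMBS]; } big_t;

/* Sets x to 2^p - 1 by doubling the value 1 p times, then subtracting 1. */
void big_mersenne(big_t *x, unsigned p)
{
    memset(x, 0, sizeof *x);
    x->limb[0] = 1;
    for (unsigned k = 0; k < p; k++) {
        uint32_t carry = 0;
        for (size_t i = 0; i < LIMBS; i++) {
            uint32_t t = x->limb[i] * 2u + carry; /* < 2 * 10^9, fits in 32 bits */
            x->limb[i] = t % BASE;
            carry = t / BASE;
        }
        assert(carry == 0); /* the value must fit in LIMBS limbs */
    }
    /* 2^p is never a multiple of 10^9, so limb[0] >= 1 and no borrow occurs. */
    x->limb[0] -= 1;
}

/* Writes x in decimal into buf (at least LIMBS * 9 + 1 bytes); returns buf. */
char *big_to_string(const big_t *x, char *buf, size_t size)
{
    size_t top = LIMBS - 1;
    int n;
    assert(size >= LIMBS * 9 + 1);
    while (top > 0 && x->limb[top] == 0)
        top--;                                   /* skip leading zero limbs */
    n = snprintf(buf, size, "%lu", (unsigned long)x->limb[top]);
    for (size_t i = top; i-- > 0; )              /* lower limbs, zero padded */
        n += snprintf(buf + n, size - (size_t)n, "%09lu",
                      (unsigned long)x->limb[i]);
    return buf;
}
```

Calling big_mersenne(&m, 89) followed by big_to_string gives the 27-digit decimal form of 2^89 - 1. The operators the question asks for (*, >, <, %) would each need their own function over the limb array, which is where an existing library starts to pay off.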
Take a look at the GNU MP Bignum Library.
Use double :)
it will solve your problem!

Manipulating 80 bits datatype in C

I'm implementing some cryptographic algorithm in C which involves an 80 bits key.
A particular operation involves rotating the key by x bits.
I've tried the long double type, which if I'm not wrong is 80 bits, but that doesn't work with the bit-shift operators.
The only alternative I can come up with is to use a 10 element char array with some complicated looping and if-else.
My question is whether there's some simple and efficient way of carrying this out.
Thanks.
There is something a bit messed up here. If I understand you correctly, you are using a "soft" cpu on the FPGA.
Traditionally, people use the FPGA to make their own shift registers through VHDL/Verilog. These kinds of algorithms are fairly painless to implement and very fast. Back at university I did this for a cryptography project.
Moreover, the paper you mentioned talks about a 128-bit key. Wouldn't that be significantly easier to implement?
Sadly, you need a bignum library. While C has native support for 80-bit floats, they don't actually do what you want.
It is possible to link something like GMP, or to use a less desirable approach such as a 10-character array or a pair of numbers, a long and a short (64-bit and 16-bit integers).
Neither is particularly pretty, but they do work. If you're planning on using this for anything but a class, GMP is the way to go. Otherwise you could end up with a whole mess of timing attacks, which you could code around, but it could get really nasty, real quick.
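For reference, the 10-byte array approach needs less looping and branching than the question fears. The following is a sketch only: the name rotl80 and the most-significant-byte-first layout are assumptions for this example, and no claim is made about its timing behaviour.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_BYTES 10  /* 80 bits */

/* Rotates an 80-bit key left by r bits (0 <= r < 80), in place.
 * key[0] is the most significant byte. rotl80 is a hypothetical name. */
void rotl80(uint8_t key[KEY_BYTES], unsigned r)
{
    uint8_t out[KEY_BYTES];
    unsigned byte_shift = r / 8; /* whole bytes to rotate by */
    unsigned bit_shift = r % 8;  /* remaining bits, 0..7 */
    assert(r < 80);
    for (unsigned i = 0; i < KEY_BYTES; i++) {
        /* Each output byte is built from two neighbouring source bytes. */
        unsigned hi = key[(i + byte_shift) % KEY_BYTES];
        unsigned lo = key[(i + byte_shift + 1) % KEY_BYTES];
        /* When bit_shift is 0, lo >> 8 is 0, so no special case is needed. */
        out[i] = (uint8_t)((hi << bit_shift) | (lo >> (8 - bit_shift)));
    }
    memcpy(key, out, KEY_BYTES);
}
```

A rotate right by r bits is the same as a rotate left by 80 - r, so one function covers both directions.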

Converting an arbitrary large number to base 256

I have a very long number, maybe up to 50 digits, which I am taking as string input. However, I need to perform operations on it, so I need to convert it to a proper base, let's say 256.
What will be the best algorithm to do so?
Multiple-precision arithmetic (a.k.a. bignums) is a difficult subject, and the good algorithms are non-intuitive (there are books about it).
Several libraries handle bignums, e.g. the GMP library (and there are others). Most of them take advantage of hardware instructions (e.g. add with carry) through carefully tuned small chunks of assembler code, so they perform better than what you would be able to code in a couple of months.
I strongly recommend using an existing bignum library. Writing your own would take you years of work if you want it to be competitive.
See also answers to this question.
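If only the conversion itself is needed, and speed is not a concern at 50 digits, the simple schoolbook method is to multiply the running value by 10 and add each decimal digit in turn. The sketch below is a plain illustration under stated assumptions: the function name, the fixed-width little-endian output, and the error convention are all made up for this example, and it is not the tuned algorithm a library such as GMP would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts a string of decimal digits to an n-byte base-256 number.
 * out[0] is the least significant byte. Returns 0 on success, or -1 if
 * the input contains a non-digit or the value does not fit in n bytes.
 * decimal_to_base256 is a hypothetical name used for this sketch. */
int decimal_to_base256(const char *dec, uint8_t *out, size_t n)
{
    assert(dec != NULL && out != NULL);
    memset(out, 0, n);
    for (; *dec != '\0'; dec++) {
        unsigned carry;
        if (*dec < '0' || *dec > '9')
            return -1;
        carry = (unsigned)(*dec - '0');        /* digit to add after the multiply */
        for (size_t i = 0; i < n; i++) {
            unsigned t = out[i] * 10u + carry; /* at most 255 * 10 + 9 */
            out[i] = (uint8_t)(t & 0xFFu);
            carry = t >> 8;
        }
        if (carry != 0)
            return -1;                         /* value needs more than n bytes */
    }
    return 0;
}
```

A 50-digit decimal number is below 2^167, so a 21-byte output buffer is enough for the inputs described in the question.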

Resources