Large integer multiplication and addition on GPU [closed] - C

I'm developing an encryption algorithm on the GPU. The algorithm requires the addition and multiplication of very large integers, with bit lengths of roughly 150,000 bits or more, and the lengths vary from number to number. What algorithms can be used to perform addition and multiplication of such numbers?

Large-integer addition is relatively simple: JackOLantern already provided the link to the relevant post. Basically, it amounts to carry propagation via a parallel prefix sum; the generate/propagate idea behind that scan is sketched below.
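As a hedged illustration, here is a plain-C (sequential) sketch of the generate/propagate flags that the parallel prefix scan operates on; the limb count and test values are made up for the demo:

#include <stdint.h>
#include <stdio.h>

#define NLIMBS 4   /* hypothetical size; real GPU numbers use thousands */

/* Each limb sum either generates a carry (it overflowed), propagates an
   incoming one (sum == all ones), or kills it. On a GPU, the first and
   last loops are per-thread work and the middle loop becomes a parallel
   prefix scan over the (g, p) flags. */
static void add_prefix_carry(uint32_t r[], const uint32_t a[],
                             const uint32_t b[])
{
    uint32_t sum[NLIMBS];
    int g[NLIMBS], p[NLIMBS], carry[NLIMBS + 1];

    for (int i = 0; i < NLIMBS; ++i) {   /* parallel on a GPU */
        sum[i] = a[i] + b[i];
        g[i] = sum[i] < a[i];            /* wrapped: generates a carry */
        p[i] = sum[i] == 0xFFFFFFFFu;    /* would propagate a carry-in */
    }
    carry[0] = 0;
    for (int i = 0; i < NLIMBS; ++i)     /* the prefix-scan step */
        carry[i + 1] = g[i] | (p[i] & carry[i]);
    for (int i = 0; i < NLIMBS; ++i)     /* parallel on a GPU */
        r[i] = sum[i] + carry[i];
}

int main(void)
{
    uint32_t a[NLIMBS] = { 0xFFFFFFFFu, 0xFFFFFFFFu, 0, 0 };  /* 2^64-1 */
    uint32_t b[NLIMBS] = { 1, 0, 0, 0 };
    uint32_t r[NLIMBS];
    add_prefix_carry(r, a, b);
    printf("%08x %08x %08x %08x\n", r[3], r[2], r[1], r[0]);  /* 2^64 */
    return 0;
}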
For large-integer multiplication on CUDA, two approaches come to mind:
convert the integers to an RNS (Residue Number System): multiplication and addition can then be done in parallel, one residue channel at a time (as long as the RNS base is large enough). Whenever you need to compare numbers, you can convert them to a mixed-radix system (see, e.g., How to Convert from a Residual Number System to a Mixed Radix System?). Finally, you can use the CRT (Chinese Remainder Theorem) to convert the numbers back to a positional number system. A toy sketch of this appears after the list.
implement large-integer multiplication directly using the FFT, since multiplication can be viewed as an acyclic convolution of digit sequences (150 Kbits is not that much for the FFT, but it can already give you some speedup; note that GNU MP only switches to its FFT multiplication routines at around 1 Mbit or more). For multiplication via FFT there are again two options (a sketch of the first follows further below):
use a floating-point double-precision FFT and encode the large-integer digits into the mantissa (easier to implement)
use the so-called Number-Theoretic Transform (an FFT over a finite field)
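As a toy sketch of the RNS idea (not a real GPU layout): three small, hypothetical coprime moduli stand in for a real RNS base, multiplication happens independently per channel, and a brute-force modular inverse keeps the CRT reconstruction short:

#include <stdint.h>
#include <stdio.h>

#define K 3
static const int64_t M[K] = { 101, 103, 107 };   /* pairwise coprime */

static void to_rns(int64_t x, int64_t r[K])
{
    for (int i = 0; i < K; ++i)
        r[i] = x % M[i];
}

/* Channel-wise operation: no carries cross between residues, which is
   what makes RNS add/mul embarrassingly parallel on a GPU. */
static void rns_mul(const int64_t a[K], const int64_t b[K], int64_t r[K])
{
    for (int i = 0; i < K; ++i)
        r[i] = (a[i] * b[i]) % M[i];
}

static int64_t inv_mod(int64_t a, int64_t m)   /* brute force: m is tiny */
{
    a %= m;
    for (int64_t t = 1; t < m; ++t)
        if (a * t % m == 1)
            return t;
    return 0;   /* unreachable when gcd(a, m) == 1 */
}

/* CRT: map residues back to a positional integer mod M[0]*M[1]*M[2]. */
static int64_t from_rns(const int64_t r[K])
{
    int64_t P = 1, x = 0;
    for (int i = 0; i < K; ++i)
        P *= M[i];
    for (int i = 0; i < K; ++i) {
        int64_t q = P / M[i];
        x = (x + r[i] * q % P * inv_mod(q, M[i])) % P;
    }
    return x;
}

int main(void)
{
    int64_t a[K], b[K], c[K];
    to_rns(1234, a);
    to_rns(567, b);
    rns_mul(a, b, c);
    printf("%lld\n", (long long)from_rns(c));   /* 1234 * 567 = 699678 */
    return 0;
}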
Anyway, there is a bunch of theory behind these things. You can also check my paper on FFT multiplication in CUDA, and there are many research papers on this subject, especially in the cryptography field.
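To make the double-precision FFT option above concrete, here is a minimal, hedged sketch in plain C (not CUDA): base-10 digits stand in for the mantissa-packed chunks a real implementation would use, and a simple recursive radix-2 FFT performs the acyclic convolution (compile with -lm):

#include <complex.h>
#include <math.h>
#include <stdio.h>

/* Recursive radix-2 FFT; n must be a power of two. invert = 1 gives the
   inverse transform (each level halves, so the total scaling is 1/n). */
static void fft(complex double *a, int n, int invert)
{
    if (n == 1) return;
    complex double even[n / 2], odd[n / 2];
    for (int i = 0; i < n / 2; ++i) { even[i] = a[2*i]; odd[i] = a[2*i + 1]; }
    fft(even, n / 2, invert);
    fft(odd, n / 2, invert);
    double ang = 2.0 * acos(-1.0) / n * (invert ? -1 : 1);
    for (int i = 0; i < n / 2; ++i) {
        complex double w = cexp(I * ang * i) * odd[i];
        a[i]         = even[i] + w;
        a[i + n / 2] = even[i] - w;
        if (invert) { a[i] /= 2; a[i + n / 2] /= 2; }
    }
}

/* Multiply two little-endian base-10 digit arrays of length n; res has
   room for 2n digits. Multiplication == acyclic convolution + carries. */
static void fft_mul(const int *x, const int *y, int *res, int n)
{
    int sz = 1;
    while (sz < 2 * n) sz <<= 1;
    complex double fa[sz], fb[sz];
    for (int i = 0; i < sz; ++i) {
        fa[i] = i < n ? x[i] : 0;
        fb[i] = i < n ? y[i] : 0;
    }
    fft(fa, sz, 0);
    fft(fb, sz, 0);
    for (int i = 0; i < sz; ++i) fa[i] *= fb[i];   /* pointwise product */
    fft(fa, sz, 1);
    long long carry = 0;
    for (int i = 0; i < 2 * n; ++i) {              /* round and carry */
        long long v = llround(creal(fa[i])) + carry;
        res[i] = (int)(v % 10);
        carry = v / 10;
    }
}

int main(void)
{
    int x[4] = { 4, 3, 2, 1 };   /* 1234, least significant digit first */
    int y[4] = { 8, 7, 6, 5 };   /* 5678 */
    int res[8];
    fft_mul(x, y, res, 4);
    for (int i = 7; i >= 0; --i)
        printf("%d", res[i]);    /* prints 07006652 (leading zero) */
    printf("\n");
    return 0;
}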

Related

How to increase performance of sin and cos using NEON instructions? [closed]

How can the arm_neon.h header file be used to increase the performance of code that uses the sin and cos functions?
The board is a Xilinx T1 accelerator card with an ARM Armv8-A architecture (Cortex-A53).
The language is C.
arm_neon.h contains SIMD intrinsics, which offer a C API for accessing and invoking individual low-level instructions.
Thus, if you intend to speed up sin/cos with arm_neon.h, the method is to rewrite those trigonometric functions using vector arithmetic, calculating 4 values at the same time.
Things you need to consider are:
the code needs to be branchless
you need to define how accurate the result must be
you need to define the input range (is there any need to handle multiples of 2*pi?)
you need to define the input unit (radians vs. degrees vs. fractions of 2^n)
All of this will determine what kind of approximation to use (polynomial, piecewise linear, rational polynomial) and which steps or corner cases can be omitted; a sketch of the vectorized approach follows.
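As a hedged sketch (not a production routine): a branchless NEON evaluation of sin for four floats at once, assuming inputs are already reduced to [-pi, pi] and given in radians, using plain degree-7 Taylor coefficients (a real implementation would use minimax coefficients and handle range reduction):

#include <arm_neon.h>
#include <stdio.h>

/* Approximate sin(x) for four floats at once. Assumes x in [-pi, pi].
   Branchless Horner evaluation of an odd degree-7 polynomial:
   sin(x) ~ x + x*(c3*x^2 + c5*x^4 + c7*x^6). */
static inline float32x4_t vsinq_f32(float32x4_t x)
{
    const float32x4_t c3 = vdupq_n_f32(-1.0f / 6.0f);
    const float32x4_t c5 = vdupq_n_f32( 1.0f / 120.0f);
    const float32x4_t c7 = vdupq_n_f32(-1.0f / 5040.0f);

    float32x4_t x2 = vmulq_f32(x, x);
    float32x4_t p  = vmlaq_f32(c5, c7, x2);      /* c5 + c7*x^2     */
    p = vmlaq_f32(c3, p, x2);                    /* c3 + (...)*x^2  */
    p = vmulq_f32(vmulq_f32(p, x2), x);          /* (...) * x^2 * x */
    return vaddq_f32(x, p);                      /* x + correction  */
}

int main(void)
{
    float in[4] = { 0.0f, 0.5f, 1.0f, 1.5f };    /* sample inputs */
    float out[4];
    vst1q_f32(out, vsinq_f32(vld1q_f32(in)));
    for (int i = 0; i < 4; ++i)
        printf("sin(%.1f) ~ %f\n", in[i], out[i]);
    return 0;
}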

Calculate using Rational Index Binomial Theorem in C [closed]

I have tried many ways to calculate using this binomial theorem, but I still couldn't get one working.
The values of x and n are given, for example x = 0.5 and n = 8.
I know we have to use a loop for the factorial, but the numerator part is a little bit tricky.
Obviously I know how to code (1+x)^n directly, but the question asks for code that uses the binomial theorem.
For example, if 0 < x < 1 and n is any positive integer, what will the value of (1+x)^n be, computed via the binomial theorem?
I understand that you know how to calculate the left side of the equation in a program.
I understand that you also know how to program the right side, apart from the problem that it is an infinite series; you want the loop to end at some point and yield a result.
In pure math, ending early means a wrong result.
But in programming you will have problems with the restricted precision of floating-point math anyway, so you can take shortcuts to solve your problem.
In the comments you will find recommendations on how to calculate each step efficiently. I will only focus on the end condition.
Write a loop calculating more and more precise partial sums.
End the loop when a freshly calculated (intermediate) result is the same as the previous one. With floating-point representation having restricted precision, that will sooner or later happen, and the result will be within only one minimal rounding step of the correct result.
Note:
In order to avoid the restricted precision getting in the way in the wrong place, I recommend calculating the parts (as described in the recommendations in the comments) in double and storing the intermediate results (those you compare for the loop condition) in a float variable. A sketch of this loop follows.
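A minimal sketch of that end condition, assuming the generalized binomial series for (1+x)^n with each term derived from the previous one (the function name and the x = 0.5, n = 8 test values are just for illustration):

#include <stdio.h>

/* Evaluate (1+x)^n as the binomial series sum of C(n,k) * x^k, stopping
   when the float-rounded partial sum stops changing. Terms are computed
   incrementally in double: term_k = term_{k-1} * (n-k+1)/k * x. */
static double binom_pow(double x, double n)
{
    double term = 1.0;    /* C(n,0) * x^0 */
    double sum  = 1.0;
    float  prev = 0.0f;   /* lower-precision copy for the end condition */

    for (int k = 1; (float)sum != prev; ++k) {
        prev  = (float)sum;
        term *= (n - k + 1) / k * x;   /* next binomial term */
        sum  += term;
    }
    return sum;
}

int main(void)
{
    /* 1.5^8 = 25.62890625; the series terminates exactly for integer n */
    printf("(1+0.5)^8 ~ %.8f\n", binom_pow(0.5, 8.0));
    return 0;
}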

How to predict the rand() function in C? [closed]

I am trying to build an oracle that predicts the next random number in a sequence. I have an array of randomly generated numbers.
The rand() function is a pseudo-random number generator (PRNG). It is not a cryptographically secure source of entropy. If you know the seed, you can completely predict the sequence, since it is deterministic, typically based on a Linear Congruential Generator (LCG). Such generators have a finite period length, after which they repeat.
If you know the given sequence starts from the beginning, it is trivial to brute-force the seed to find the matching initial sequence, as the sketch below shows. Otherwise, there are statistical methods you could use to narrow down the potential seeds.
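A minimal sketch of that brute-force idea, assuming the observed values are the very first outputs after seeding (the secret seed and search bound are made up for the demo; the same binary must produce both sides, since rand() sequences differ between C libraries):

#include <stdio.h>
#include <stdlib.h>

#define NOBS 4

int main(void)
{
    /* Pretend these values leaked from a program whose seed we don't know. */
    int observed[NOBS];
    srand(123456u);                       /* the "secret" seed */
    for (int i = 0; i < NOBS; ++i)
        observed[i] = rand();

    /* Replay every candidate seed until the first NOBS outputs match. */
    for (unsigned s = 0; s <= 1000000u; ++s) {
        srand(s);
        int i;
        for (i = 0; i < NOBS && rand() == observed[i]; ++i)
            ;
        if (i == NOBS) {
            printf("recovered seed: %u\n", s);
            srand(s);
            for (i = 0; i < NOBS; ++i)
                rand();                   /* skip the known outputs */
            printf("predicted next value: %d\n", rand());
            return 0;
        }
    }
    printf("no seed found in range\n");
    return 1;
}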
If you have actual random numbers, there's no way to predict them.
Software can be programmed to use actual random numbers acquired by monitoring random events like background radiation, atomic decay, and electrical noise from various components. This is usually reserved for critical applications like creating cryptographic keys, since the operation blocks until enough random bits have been collected.
Most software uses an algorithm that creates random-looking numbers based on a seed and past events like previous calls to the PRNG, elapsed time, etc. These are possible to predict with 100% accuracy if you know the algorithm used and all the events it uses for inputs, or have the ability to reset the seed to a known value.

Big Integer arithmetic [closed]

I want to know the different techniques used for performing arithmetic operations on very large integers in C. One that I know of is using a string to hold a number and defining operations (add, subtract, etc.) on it. I am not interested in using libraries; this question is purely for knowledge. Please suggest any other such methods/techniques.
You can go as low-level as representing your integers as an array of bytes and doing all the operations (addition, subtraction, multiplication, division, comparison) just like a CPU does them, at word level.
The simplest algorithms are for addition and subtraction, where you simply add or subtract the digits in sequence, carrying as necessary (see the word-level sketch below).
Negative numbers can be represented in 2's complement.
For comparison, you just compare the high order digits until a difference is found.
For multiplication, the most straightforward algorithm you can implement (and the slowest) is repeated addition.
For division, things are a little more complicated than multiplication, see: http://en.wikipedia.org/wiki/Division_algorithm
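A minimal word-level sketch of the addition part (the fixed limb count and test values are arbitrary for the demo; a real implementation would size the array dynamically):

#include <stdint.h>
#include <stdio.h>

#define NLIMBS 4   /* 4 x 32-bit limbs = 128-bit integers, little-endian */

/* Schoolbook addition at word level: add limb by limb, propagating the
   carry, exactly as a CPU's add-with-carry instruction would. */
static void big_add(uint32_t r[NLIMBS],
                    const uint32_t a[NLIMBS], const uint32_t b[NLIMBS])
{
    uint64_t carry = 0;
    for (int i = 0; i < NLIMBS; ++i) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        r[i]  = (uint32_t)t;   /* low 32 bits of the partial sum */
        carry = t >> 32;       /* high bit is the next limb's carry-in */
    }
}

int main(void)
{
    uint32_t a[NLIMBS] = { 0xFFFFFFFFu, 0xFFFFFFFFu, 0, 0 };  /* 2^64 - 1 */
    uint32_t b[NLIMBS] = { 1, 0, 0, 0 };
    uint32_t r[NLIMBS];
    big_add(r, a, b);
    printf("%08x %08x %08x %08x\n", r[3], r[2], r[1], r[0]);  /* 2^64 */
    return 0;
}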
A common application for this is public-key cryptography, whose algorithms commonly employ arithmetic with integers having hundreds of digits.
Check the OpenSSL BIGNUM documentation for this: https://www.openssl.org/docs/crypto/bn.html
You could use 3 linked lists: one for number A, one for number B, and one for the result.
You would then read each digit as a character of user input, convert it to an integer, and save it in a new node of the list corresponding to the number being read.
Finally, you would write the operations for adding, subtracting, etc. as functions.
In each, you would follow the respective algorithm you learned at school, starting from the LSB node and going up to the MSB node, always keeping in mind the base power each node stands for (node 1 * 10^0, node 2 * 10^1, node 3 * 10^2, ..., node n * 10^(n-1)). A sketch of this representation follows.
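A hedged sketch of that representation (the names are made up, and malloc error handling is omitted for brevity):

#include <stdio.h>
#include <stdlib.h>

/* One decimal digit per node, least significant digit first, so that
   addition can walk both lists in lockstep just like on paper. */
typedef struct node {
    int digit;           /* 0..9 */
    struct node *next;   /* node for the next higher power of ten */
} node;

static node *prepend(node *head, int d)
{
    node *n = malloc(sizeof *n);
    n->digit = d;
    n->next = head;
    return n;
}

/* Build an LSB-first list from a digit string (e.g. raw user input). */
static node *from_string(const char *s)
{
    node *head = NULL;
    for (; *s; ++s)
        head = prepend(head, *s - '0');   /* MSB read first, ends up last */
    return head;
}

/* Schoolbook addition: LSB to MSB, carrying as necessary. */
static node *big_add(const node *a, const node *b)
{
    node *res = NULL, **tail = &res;
    int carry = 0;
    while (a || b || carry) {
        int d = carry;
        if (a) { d += a->digit; a = a->next; }
        if (b) { d += b->digit; b = b->next; }
        carry = d / 10;
        *tail = prepend(NULL, d % 10);
        tail = &(*tail)->next;
    }
    return res;
}

static void print_msb_first(const node *n)   /* recurse to reverse order */
{
    if (!n) return;
    print_msb_first(n->next);
    printf("%d", n->digit);
}

int main(void)
{
    node *sum = big_add(from_string("999"), from_string("1"));
    print_msb_first(sum);   /* prints 1000 */
    printf("\n");
    return 0;
}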

Looking for a tool that would tell me which integer widths I need for a calculation in C to not overflow [closed]

I have a lengthy calculation (a polynomial of 4th degree with fixed decimals) that I have to carry out on a microcontroller (a TI/LuminaryMicro LM3S9L97 [Cortex-M3], if somebody is interested).
When I use 32-bit integers, some calculations overflow. When I use 64-bit integers, the compiler emits an ungodly amount of code to simulate 64-bit multiplication on the 32-bit processor.
I am looking for a program into which I could input (just for example):
int a, b, c;
c = a * b; // Do the multiplication
c >>= 10; // Correct for fixed decimal point
c *= a*b;
where I could specify that a and b will be in the ranges [15000..30000] and [40000..100000] respectively, and it would tell me what sizes the integers need to be to not overflow (and/or underflow; I would possibly get a false positive there for the >> 10) in the specified domain, so that I could use 32-bit integers where possible.
Does something like this exists already or do I have to roll my own?
Thanks!
I think you have to roll your own. Implementing an extended sequence of muls and divs in fixed-point can be tricky. If fixed-point is applied without careful thought, overflow can happen quite easily. When implementing such a formula, I use a spreadsheet to experiment with the following:
Ordering of operations: muls require twice the number of bits on the left-hand side, i.e. multiplying two 22.10 numbers yields a full 64-bit (44.20) result. Div operations reduce the number of bits needed on the LHS. Strategically re-ordering the equation's evaluation, or even rewriting it (expanding, factoring, etc.), can provide opportunities to improve precision.
Pre-computed scalars: along the same lines, pre-computing values may help. These scalars need not be constant, since look-up tables can be used to store a collection of pre-computed values.
Loss of precision: is 10-bits of precision really needed at steps in the evaluation of the equation? Perhaps some steps need lower precision, leaving more bits on the LHS to avoid overflow.
Given these concerns (all of which are application-specific), optimal use of fixed-point math remains very much a manual exercise. There are good resources on the web; I've found this one useful on occasion. A minimal sketch of the roll-your-own range tracking follows.
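As a hedged sketch of rolling your own: interval arithmetic in plain C that pushes the ranges from the question through the example expression and reports the signed width each intermediate needs (the helper names are made up, and the shift helper assumes arithmetic right shift on a two's-complement target):

#include <stdint.h>
#include <stdio.h>

typedef struct { int64_t lo, hi; } range;   /* closed interval [lo, hi] */

/* Product range: extremes occur at the four corner combinations. */
static range r_mul(range a, range b)
{
    int64_t p[4] = { a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi };
    range r = { p[0], p[0] };
    for (int i = 1; i < 4; ++i) {
        if (p[i] < r.lo) r.lo = p[i];
        if (p[i] > r.hi) r.hi = p[i];
    }
    return r;
}

static range r_shr(range a, int k)          /* range of (x >> k) */
{
    range r = { a.lo >> k, a.hi >> k };
    return r;
}

/* Smallest signed width holding every value in [lo, hi]. */
static int bits_needed(range r)
{
    for (int b = 2; b < 64; ++b) {
        int64_t lo = -((int64_t)1 << (b - 1));
        int64_t hi =  ((int64_t)1 << (b - 1)) - 1;
        if (r.lo >= lo && r.hi <= hi)
            return b;
    }
    return 64;
}

int main(void)
{
    range a = { 15000, 30000 }, b = { 40000, 100000 };

    range c = r_mul(a, b);                  /* c = a * b */
    printf("a * b   needs %d bits\n", bits_needed(c));
    c = r_shr(c, 10);                       /* c >>= 10  */
    c = r_mul(c, r_mul(a, b));              /* c *= a*b  */
    printf("final c needs %d bits\n", bits_needed(c));
    return 0;
}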
Ada might be able to do that using range types.
