Specific modular multiplication algorithm [duplicate] - c

This question already has answers here:
Calculate a*a mod n without overflow
(5 answers)
Closed 10 years ago.
I have 3 large 64 bit numbers: A, B and C. I want to compute:
(A x B) mod C
considering my registers are 64 bits, i.e. writing a * b actually yields (A x B) mod 2⁶⁴.
What is the best way to do it? I am coding in C, but don't think the language is relevant in this case.
After getting upvotes on the comment pointing to this solution:
(a * b) % c == ((a % c) * (b % c)) % c
let me be specific: this isn't a solution, because ((a % c) * (b % c)) may still be bigger than 2⁶⁴, and the register would still overflow and give me the wrong answer. I would have:
(((A mod C) x (B mod C)) mod 2⁶⁴) mod C

As I pointed out in a comment, Karatsuba's algorithm might help. But there's still a problem, which requires a separate solution.
Assume
A = (A1 << 32) + A2
B = (B1 << 32) + B2.
When we multiply those we get:
A * B = ((A1 * B1) << 64) + ((A1 * B2 + A2 * B1) << 32) + A2 * B2.
So we have three numbers to sum, one of which is definitely larger than 2⁶⁴, and another may be.
But it can be solved!
Instead of shifting by 64 bits at once, we can split the shift into smaller shifts and reduce modulo C after each one. The result will be the same.
This is still a problem if C itself is larger than 2⁶³, but I think it can be solved even in that case.
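The split-and-reduce idea can be pushed to its extreme: shift one bit at a time and reduce mod C after every shift (shift-and-add, a.k.a. Russian peasant multiplication). This is a minimal sketch, and it assumes C < 2⁶³ so that neither the intermediate sum nor the one-bit shift can overflow 64 bits:

```c
#include <stdint.h>

/* computes (a * b) % m without overflow, assuming m < 2^63 */
uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t res = 0;
    a %= m;
    while (b > 0) {
        if (b & 1)
            res = (res + a) % m;   /* res, a < m < 2^63, so the sum fits in 64 bits */
        a = (a << 1) % m;          /* same bound: a << 1 < 2^64 */
        b >>= 1;
    }
    return res;
}
```

Each iteration handles one bit of b, so the cost is 64 additions instead of one multiplication; for m ≥ 2⁶³ a different trick (or a 128-bit type) is needed.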

Related

How to efficiently verify whether pow(a, b) % b == a in C (without overflow)

I'd like to verify whether
pow(a, b) % b == a
is true in C, with 2 ≤ b ≤ 32768 (2¹⁵) and 2 ≤ a ≤ b, with a and b being integers.
However, directly computing pow(a, b) % b with b being a large number will quickly overflow. What would be a trick/efficient way of verifying whether this condition holds?
This question is based on finding a witness for Fermat's little theorem, which states that if this condition is false, b is not prime.
I am also limited in the time it may take; it can't be too slow (near or over 2 seconds). The biggest Carmichael number below 32768, a number b that's not prime but still satisfies pow(a, b) % b == a for all 2 ≤ a ≤ b, is 29341. Thus the method for checking pow(a, b) % b == a with 2 ≤ a ≤ 29341 shouldn't be too slow.
You can use the Exponentiation by squaring method.
The idea is the following:
Decompose b in binary form and decompose the product
Notice that we always reduce % b, which is below 32768, so every intermediate product is below 32768² = 2³⁰ and fits in a 32-bit int.
So the C code is:
/*
 * this function computes (num ** pow) % mod
 */
int pow_mod(int num, int pow, int mod)
{
    int res = 1;
    while (pow > 0)
    {
        if (pow & 1)
        {
            res = (res * num) % mod;   /* res, num <= 32768, so the product fits in a 32-bit int */
        }
        pow /= 2;
        num = (num * num) % mod;
    }
    return res;
}
You are doing modular arithmetic in Z/bZ.
Note that, in a quotient ring, the n-th power of the class of an element is the class of the n-th power of the element, so we have the following result:
(a^b) mod b = ((((a mod b) * a) mod b) * a) mod b [...] (b times)
So, you do not need a big integer library.
You can simply write a C program using the following algorithm (pseudo-code):
declare your variables a and b as integers.
use a temporary variable temp that is initialized with a.
do a loop with b steps, and compute (temp * a) mod b at each step, to get the new temp value.
compare the result with a.
With this formula, you can see that temp is always below b ≤ 32768, and each product temp * a is below 32768² = 2³⁰, so a plain 32-bit int is enough to store temp.
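A direct C translation of the pseudo-code above, as a minimal sketch (the O(b) loop is slower than exponentiation by squaring, but the products stay within 32-bit range as just argued):

```c
/* returns 1 if pow(a, b) % b == a, computed with a b-step loop;
 * assumes 2 <= a <= b <= 32768, so temp * a < 32768^2 < 2^31 */
int fermat_check(int a, int b)
{
    int temp = a % b;
    for (int i = 1; i < b; i++)
        temp = (temp * a) % b;   /* temp now holds a^(i+1) mod b */
    return temp == a % b;
}
```

For a Carmichael number such as 561, this check passes for every base even though 561 is composite, which is exactly why the question singles them out.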

Program outputting 2^n instead of intended

I was recently trying to make a small program in C to find the nth Fibonacci number. For some reason, when I run it, it instead computes 2^n and returns that. I have asked around a bit, but no one seems to have been able to determine why. I was hoping someone may be able to help me figure it out.
float wat(int n){
    int a = 0x3fcf1bbd, b = 0x3f1e377a, c = 0x807fffff, d = 0x400f1bbd;
    int e = (((a >> 23) + n) << 23) | (a & c);
    int f = (((b >> 23) + n) << 23) | (b & c);
    return ((*(float*)&e) + (*(float*)&f)) / (*(float*)&d);
}
Your code is a standards-violating, unportable hack that might have been meaningful 20 years ago, in very special situations, when floating-point hardware was magnitudes slower than the rest of the CPU. It's completely meaningless today, and asking someone to debug it for you is like asking for help insulating your house with asbestos. We don't do things this way anymore, for good reasons.
It can all be written in correct, portable floating point operations like this:
#include <math.h>

float
wat(int n)
{
    float a = 0x1.9e377ap+0;
    float b = -0x1.3c6ef4p-1;
    float d = 0x1.1e377ap+1;
    return (ldexpf(a, n) - ldexpf(b, n)) / d;
}
This does exactly the same thing, but without disgusting hacks. Of course it won't do anything useful, because adding n to the exponent of X doesn't compute X^n, it computes X*2^n. So your calculation ends up being:
s = sqrt(5)
(2^n * (1 + s)/2 - 2^n * (1 - s)/2)/s =
(2^n/2 * ((1 + s) - (1 - s)))/s =
(2^n/2 * 2s)/s =
2^n

How to use Modulo efficiently?

I'm doing a (for myself) very complex task, where I have to calculate the largest possible number of sequences when given a number n of segments.
I found out that the Catalan number represents these sequences, and I got it to work for n <= 32. The results I get should be calculated mod 1.000.000.007. The problem I have is that q and p get too big for a long long int, and I can't just take mod 1.000.000.007 before dividing q by p because I would get a different result.
My question is: is there a really efficient way to solve my problem, or do I have to think about storing the values differently?
My limitations are the following:
- stdio.h/iostream only
- only Integers
- n<=20.000.000
- n>=2
#include <stdio.h>

long long cat(long long q, long long p, long long n);

int main(){
    long long n = 0;
    long long val;
    scanf("%lld", &n);
    val = cat(1, 1, n / 2);
    printf("%lld", val);
    return 0;
}

long long cat(long long q, long long p, long long n){
    if (n == 0) {
        return (q / p) % 1000000007;
    }
    else {
        q *= 4 * n - 2;
    }
    p *= (n + 1);
    return cat(q, p, n - 1);
}
To solve this efficiently, you'll want to use modular arithmetic, with modular inverses substituting for division.
It's simple to prove that, in the absence of overflow, (a * b) % c == ((a % c) * b) % c. If we were just multiplying, we could take results mod 1000000007 at every step and always stay within the bounds of a 64-bit integer. The problem is division. (a / b) % c does not necessarily equal ((a % c) / b) % c.
To solve the problem with division, we use modular inverses. For integers a and c with c prime and a % c != 0, we can always find an integer b such that a * b % c == 1. This means we can use multiplication as division. For any integer d divisible by a, (d * b) % c == (d / a) % c. This means that ((d % c) * b) % c == (d / a) % c, so we can reduce intermediate results mod c without screwing up our ability to divide.
The number we want to calculate is of the form (x1 * x2 * x3 * ...) / (y1 * y2 * y3 * ...) % 1000000007. We can instead compute x = x1 % 1000000007 * x2 % 1000000007 * x3 % 1000000007 ... and y = y1 % 1000000007 * y2 % 1000000007 * y3 % 1000000007 ..., then compute the modular inverse z of y using the extended Euclidean algorithm and return (x * z) % 1000000007.
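Putting this together, here is a minimal sketch (not the asker's exact code): it accumulates the numerator and denominator products of the Catalan recurrence reduced mod 1000000007, then divides by multiplying with a modular inverse. It obtains the inverse via Fermat's little theorem (a^(p-2) mod p) instead of the extended Euclidean algorithm, which is valid because 1000000007 is prime:

```c
#include <stdint.h>

#define MOD 1000000007ULL

/* (base ** exp) % MOD by binary exponentiation */
static uint64_t pow_mod_u64(uint64_t base, uint64_t exp)
{
    uint64_t res = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1)
            res = res * base % MOD;   /* products stay below MOD^2 < 2^63 */
        base = base * base % MOD;
        exp >>= 1;
    }
    return res;
}

/* modular inverse via Fermat's little theorem: a^(MOD-2) mod MOD */
static uint64_t inv_mod(uint64_t a)
{
    return pow_mod_u64(a, MOD - 2);
}

/* n-th Catalan number mod MOD, using C(n) = prod(4k-2) / prod(k+1), k = 1..n */
uint64_t catalan_mod(uint64_t n)
{
    uint64_t num = 1, den = 1;
    for (uint64_t k = 1; k <= n; k++) {
        num = num * ((4 * k - 2) % MOD) % MOD;
        den = den * ((k + 1) % MOD) % MOD;
    }
    return num * inv_mod(den) % MOD;
}
```

Every intermediate value stays below MOD² ≈ 10¹⁸ < 2⁶³, so plain 64-bit arithmetic suffices even for n up to 20.000.000.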
If you're using gcc or clang and a 64-bit target, there exists a __int128 type. This gives you extra bits to work with, but obviously only to a point.
Most likely the easiest way to deal with this kind of issue is to use a "bignum" library, i.e. a library that deals with representing and doing arithmetic on arbitrarily large numbers. The arguably most popular open source example is libgmp - you should be able to get your algorithm going quite easily with that. It's also tuned to high performance standards.
Obviously you can reimplement this yourself, by representing your numbers as e.g. arrays of integers of a certain size. You'll have to implement algorithms for doing basic arithmetic such as +, -, *, /, % yourself. If you want to do this as a learning experience that's fine, but there's no shame in using libgmp if you just want to focus on the algorithm you're trying to implement.

explicit MOD in C? [duplicate]

This question already has answers here:
How to code a modulo (%) operator in C/C++/Obj-C that handles negative numbers
(16 answers)
Closed 9 years ago.
OK, so I know and understand the difference between MOD and REM. I am also aware that C's % operator is a REM operation. I wanted to know, and could not find online, whether there is some C library or function for an explicit MOD.
Specifically, I'd like (-1) % 4 == 3 to be true. In C, (-1) % 4 == -1, since it is a remainder. Preferably I'd like to avoid using absolute values, and even better would be some built-in function that I can't seem to find.
Any advice will be much appreciated!
The best option I can think of is to compute:
((-1 % 4) + 4 ) % 4
Here you may replace -1 with any value and you will get MOD not REM.
The most common way to do what you expect is:
((a % b) + b ) % b
It works because (a % b) is a number in ]-b; b[, so (a % b) + b is positive (in ]0; 2 * b[), and adding b does not change the value mod b.
Just do:
int mod(int a, int b)
{
    int res = a % b;                  /* C's remainder: may be negative */
    return (res < 0) ? res + b : res; /* e.g. mod(-1, 4) == 3, while -1 % 4 == -1 */
}
Whenever res is negative after the % operation, b is added to bring the result into [0, b).

An implementation of sizeof guaranteeing alignment

The macro definition is:
#define _INTSIZEOF(n) ( (sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1) )
I have been told the purpose is alignment.
I wonder how it works. Thanks in advance.
The above macro simply aligns the size of n to the nearest greater-or-equal sizeof(int) boundary.
The basic algorithm for aligning value a to the nearest greater-or-equal arbitrary boundary b is to
Divide a by b rounding up, and then
Multiply the quotient by b again.
In the domain of unsigned (or just positive) values the first step is achieved by the following popular trick
q = (a + b - 1) / b
// where `/` is ordinary C-style integer division (rounding down)
// Now `q` is `a` divided by `b` rounded up
Combining this with the second step we get the following
aligned_a = (a + b - 1) / b * b
In aligned_a you get the desired aligned value.
Applying this algorithm to the problem at hand one would arrive at the following implementation of _INTSIZEOF macro
#define _INTSIZEOF(n)\
( (sizeof(n) + sizeof(int) - 1) / sizeof(int) * sizeof(int) )
This is already good enough.
However, if you know in advance that the alignment boundary is a power of 2, you can "optimize" the calculations by replacing the divide+multiply sequence with a simple bitwise operation
aligned_a = (a + b - 1) & ~(b - 1)
That is exactly what's done in the above original implementation of _INTSIZEOF macro.
This "optimization" might probably make sense with some compilers (although I would expect a modern compiler to be able to figure it out by itself). However, considering that the above _INTSIZEOF(n) macro is apparently intended to serve as a compile-time expression (it does not depend on any run-time values, barring VLA objects/types passed as n), there's not much point in optimizing it that way.
Here's a hint:
A common method to do ceil(a/b) is:
(a + (b-1)) / b
b * ( (a + b - 1) / b ) = (a + b - 1) & ~(b - 1)
To see why the above holds, consider this:
Part I (why q = (a + b - 1) / b produces the number we are looking for):
- Note that we want q to be the number of b's that are in a, but rounded up (i.e., if after integer division there is a remainder, that remainder should be rounded up to b, and hence q incremented by 1).
- There exist Q and R such that a = Qb + R, and hence a + b - 1 = Qb + b - 1 + R. If we perform integer division of a + b - 1 by b, we get Q + (b - 1 + R)/b. The second part of this is zero if R is zero and 1 if R is not zero (note R is guaranteed to be less than b).
Part II (the macro):
- Now if b is a power of two, then integer division of a + b - 1 by b is simply a right shift by the exponent of b (i.e., if b = 2^n, shift right by n places).
- In addition, multiplication by b is a left shift (shift left by n places).
- Hence, combined, all we are doing is clearing the rightmost n bits to zero, and this is accomplished by masking: ~(b - 1) gives us 1111...111000...0, where the number of trailing 0s is equal to n (b = 2^n).
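As a sanity check on this equivalence, here is a tiny helper (hypothetical, purely for illustration) comparing the divide-multiply form with the mask form when b is a power of two:

```c
/* returns nonzero iff the divide-multiply rounding and the bit mask
 * agree for the given a and power-of-two b */
int align_formulas_agree(unsigned a, unsigned b)
{
    unsigned via_div  = (a + b - 1) / b * b;
    unsigned via_mask = (a + b - 1) & ~(b - 1);
    return via_div == via_mask;
}
```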
