I am hoping for insight into what looks like a partial multiplication.
#define LOW(x) ((x) & 0xffffffff)
#define HIGH(x) ((x) >> 32)

unsigned long long NotMultiply(unsigned long long x, unsigned long long y)
{
    return HIGH(x) * HIGH(y) + LOW(x) * LOW(y);
}
This function is iterated multiple times, as follows:
unsigned long long DoBusyWork(unsigned long long x, unsigned long long y, int n)
{
    while (n--)
        x = NotMultiply(x, y);
    return x;
}
Are there any shortcuts for calculating this result?
What about the case where x == y?
Any links to more information would help.
Looks like a strange hash calculation. It takes the lower 32 bits of each number and multiplies them together, takes the upper 32 bits of each (shifted down into the low positions) and multiplies those together, and returns the sum of the two products.
I don't think you can make it simpler, but you can probably make it faster: if an iteration returns the same value it was given, every further iteration is a no-op, so you can break out of the loop.
unsigned long long DoBusyWork(unsigned long long x, unsigned long long y, int n)
{
    unsigned long long previousX = x;
    while (n--)
    {
        x = NotMultiply(x, y);
        if (x == previousX) break;
        previousX = x;
    }
    return x;
}
I don't know how likely it is that the loop finishes early, though.
DoBusyWork (as @RBerteig has suggested, the name is a red flag) may be a way to force the compiler not to optimize away a busy-loop.
These darned compilers are getting so smart that sometimes they decide, "Oh! You don't need a loop there! I see what you're trying to calculate!" despite your real interest as a programmer.
This is an early form of random number generator, often used on minicomputers and small mainframes. The call might look something like this:
unsigned long long seedold = 0xA5A5A5A5A5A5A5A5;
unsigned long long seednew = 0x5A5A5A5A5A5A5A5A;
unsigned long long lltmp;
int finetune;
Randomize finetune by timing the keyboard or some similar truly random but slow method, then call once like this:
lltmp = DoBusyWork( seedold, seednew, finetune );
seedold = seednew;
seednew = lltmp;
Subsequently use it as a PRNG by calling it like this:
lltmp = DoBusyWork( seedold, seednew, 1 );
seedold = seednew;
seednew = lltmp;
Use seednew as the PRN.
Von Neumann once advocated this kind of calculation for "Monte-Carlo" testing applications but later changed his mind when he learned more about analyzing the output of PRNGs.
-Al.
LOW takes only the bottom 32 bits. HIGH shifts the top 32 bits down by 32. The whole NotMultiply routine multiplies the bottom 32 bits of x and y together, multiplies the top 32 bits of x and y together, and adds the two products. DoBusyWork does this n times.
If x == y, then you get HIGH(x)² + LOW(x)².
I have no idea why they'd want to do that, though. It's sort of mashing together the top and bottom halves of x and y.
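To make the halves-don't-interact behavior concrete, here is a self-contained sketch using the question's macros, with worked example values (the inputs are made up for illustration):

```c
#define LOW(x) ((x) & 0xffffffff)
#define HIGH(x) ((x) >> 32)

unsigned long long NotMultiply(unsigned long long x, unsigned long long y)
{
    return HIGH(x) * HIGH(y) + LOW(x) * LOW(y);
}

/* With x = 0x100000002 (HIGH 1, LOW 2) and y = 0x300000004 (HIGH 3, LOW 4):
 *   NotMultiply(x, y) == 1*3 + 2*4 == 11 -- the halves never mix.
 * With x == y the result reduces to HIGH(x)^2 + LOW(x)^2:
 *   NotMultiply(x, x) == 1*1 + 2*2 == 5.
 */
```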
Related
My code is taking too much time, and possibly giving a wrong answer past a certain limit. What I have observed is that the code slows down when I give N > 50000.
int solve(int N)
{
    long long int temp = 0;
    long long int sum = 0;
    const long long int mod = 1000000007;
    for (int i = 1; i <= N; i++)
    {
        for (int j = i; j <= N; j++)
        {
            temp = 0;
            if ((i & j) == 0)
            {
                temp = i * j;
                sum = sum + temp;
            }
        }
    }
    return sum % mod;
}
I am trying to get results for 1 <= N <= 200000.
Can it be that the variable sum is exceeding the range of long long int after a certain limit?
temp=i*j;
Since i,j are both of type int, this multiplication is done as int (likely signed 32-bit), and if it overflows the largest int (2**31-1) then you have undefined behavior and will certainly not get the right answer. A little quick math shows this will happen as soon as N exceeds 46340 which is around the 50000 you mention. The fact that the result is assigned to a variable of type long long int doesn't change the way the sub-expression i*j is evaluated, so it doesn't solve your overflow problem.
To get the multiplication to be done as long long int (at least 64 bits), you have to cast one or both of the operands:
temp = ((long long int)i) * j;
(Cast has higher precedence than multiplication so you can actually write temp = (long long int)i * j; but I tend to find this is less clear to read.)
This would explain the "possibly wrong answer" but has nothing to do with the slowness, which as others mentioned comes inherently from your quadratic algorithm. However even with N=200000 it finishes in about 8 seconds on my computer. Make sure you are using a good-quality optimizing compiler and that you have optimizations turned on.
Your question (2) about overflowing sum is not an issue, though. Each summand i*j is at most N**2, and the condition (i & j) == 0 requires i and j to have disjoint bit patterns, which limits the number of contributing pairs to roughly N**log2(3), i.e. about N**1.585. You can verify that the total therefore stays comfortably below 2**63 for N = 200000.
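Putting the cast in place, a corrected solve might look like this (a sketch keeping the structure from the question):

```c
// Same quadratic algorithm as in the question, but the product is
// computed as long long, so i * j can no longer overflow int.
int solve(int N)
{
    long long sum = 0;
    const long long mod = 1000000007;
    for (int i = 1; i <= N; i++) {
        for (int j = i; j <= N; j++) {
            if ((i & j) == 0) {
                sum += (long long)i * j;  // widen BEFORE multiplying
            }
        }
    }
    return sum % mod;  // the reduced result fits in int
}
```

For example, solve(4) sums the pairs (1,2), (1,4), (2,4) and (3,4), giving 2 + 4 + 8 + 12 = 26.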
int power(int first, int second) {
    int counter1 = 0;
    long ret = 1;
    while (counter1 != second) {
        ret *= first;
        counter1 += 1;
    }
    return ret;
}

int main(int argc, char **argv) {
    long one = atol(argv[1]);
    long two = atol(argv[2]);
    char word[30];
    long finally;
    printf("What is the operation? 'power','factorial' or 'recfactorial'\n");
    scanf("%20s", word);
    if (strcmp("power", word) == 0) {
        finally = power(one, two);
        printf("%ld\n", finally);
        return 0;
    }
}
This function is intended to do the "power of" operation like on a calculator, so if I write ./a.out 5 3 it will give me 5 to the power of 3 and print out 125.
The problem is with larger numbers: for ./a.out 20 10 (20 to the power of 10) I expect the result 1.024 x 10^13, but it instead outputs 797966336.
What is the cause of the current output I am getting?
Note: I assume that this has something to do with the atol() and long data types. Are these not big enough to store the information? If not, any idea how to make it run for bigger numbers?
Sure, your inputs are long, but your power function takes and returns int! Apparently, that's 32-bit on your system … so, on your system, 1.024 x 10^13 is more than int can handle.
Make sure that you pick a type that's big enough for your data, and use it consistently. Even long may not be enough — check your system!
First and foremost, you need to change the return type and the parameter types of power() from int to long. Otherwise, on a system where long and int have different sizes,
the input arguments may get truncated to int while you're passing long, and
the returned value will be converted to int before returning, which can truncate the actual value.
After that, 1.024 x 10^13 (10240000000000) cannot be held by an int or a long (if 32 bits) either. You need to use a data type with more width, like long long.
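A sketch of the function with the types widened as described (the loop structure is kept from the question; in main, finally would likewise become long long and be printed with %lld):

```c
// power() with long long parameters and return type: the arguments
// are no longer truncated on the way in, and results such as
// 20^10 = 10240000000000 fit in the 64-bit return value.
long long power(long long first, long long second) {
    long long ret = 1;
    for (long long counter1 = 0; counter1 != second; counter1++) {
        ret *= first;
    }
    return ret;
}
```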
one and two are long.
long one = atol(argv[1]);
long two = atol(argv[2]);
You call this function with them
int power(int first, int second);
But your function takes int, so there is an implicit conversion here, and it returns int as well. Your long values become int, and the int arithmetic then overflows, which causes undefined behaviour (see comments).
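The truncation can be shown directly: keeping only the low 32 bits of the true result 10240000000000 yields exactly the 797966336 reported in the question. A sketch (unsigned is used here because an out-of-range conversion to a signed int is implementation-defined, though two's-complement wraparound is what is typically observed):

```c
// Keep only the low 32 bits of a 64-bit value -- effectively what
// returning through a 32-bit int does on a typical system.
unsigned low32(long long v) {
    return (unsigned)(v & 0xffffffff);
}

/* low32(10240000000000LL) == 797966336, the question's mystery output. */
```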
Quick answer:
The values passed to your power function get implicitly converted.
Change the function parameters to a type wider than int that can hold larger values; long long is a safe choice.
The input values get converted and truncated to match the parameters of your function.
The result of the computation in the body of the function is then converted again to match the return type, in your case int, which cannot hold values of this size.
Note 1: as noted by the more experienced members, there is a machine-specific issue here: on your system int (and evidently long too) is only 32 bits, which is too small for these values.
To make the answer complete:
The code is mixing int and long and hoping for an answer that exceeds the long range.
The answer is simply the result of trying to put 10 pounds of potatoes in a 5-pound sack.
... idea how to make it run for bigger numbers.
Use the widest integer available. Examples: uintmax_t, unsigned long long.
With C99 onward, normally the greatest representable integer will be UINTMAX_MAX.
#include <stdint.h>

uintmax_t power_a(long first, long second) {
    long counter1 = 0;
    uintmax_t ret = 1;
    while (counter1 != second) {  // number of iterations could be in the billions
        ret *= first;
        counter1 += 1;
    }
    return ret;
}
But let us avoid problematic behavior with negative numbers and improve the efficiency of the calculation from linear to logarithmic (exponentiation by squaring).
// return x raised to the y power
uintmax_t pow_jululu(unsigned long x, unsigned long y) {
    uintmax_t z = 1;
    uintmax_t base = x;
    while (y) {  // max number of iterations is the bit width, e.g. 64
        if (y & 1) {
            z *= base;
        }
        y >>= 1;
        base *= base;
    }
    return z;
}
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    assert(argc >= 3);
    unsigned long one = strtoul(argv[1], 0, 10);
    unsigned long two = strtoul(argv[2], 0, 10);
    uintmax_t finally = pow_jululu(one, two);
    printf("%ju\n", finally);
    return 0;
}
This approach has limits too. 1) z *= base can mathematically overflow for calls like pow_jululu(2, 1000). 2) base*base may mathematically overflow in the uncommon situation where unsigned long is more than half the width of uintmax_t. 3) some other nuances too.
Resort to other types e.g.: long double, Arbitrary-precision arithmetic. This is likely beyond the scope of this simple task.
You could use a long long, which is 8 bytes in length, instead of the 4-byte long and int on your system.
long long can hold values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, which should just about cover every value you may encounter here.
I wrote some code to determine the nth Fibonacci number using the nice blog post given in the accepted answer to this question: Finding out nth fibonacci number for very large 'n'. I am doing this as a way of practising a more difficult recursion problem given on projecteuler but that is not really relevant. The method relies on changing the problem to a small linear algebra problem of the form
Fn = T^n F1
where F1 = (1 1)^t and Fn contains the nth and (n-1)th Fibonacci numbers. The term T^n can then be determined in O(log n) time. I implemented this successfully and it seems to work fine. When I perform the matrix exponentiation I reduce modulo 10000, so I get only the last 4 digits, which seems to work (I checked against some large Fibonacci numbers). However, when I try to get more digits by increasing the number 10000, it no longer works: I no longer get the correct answer. Here is my code
#include<stdio.h>
#include<math.h>
#include<stdlib.h>

const unsigned long M = 10000;

unsigned long int * matProd(unsigned long int * A, unsigned long int * B){
    unsigned long int * C;
    C = malloc(4*sizeof(unsigned long int));
    C[0] = ((A[0]*B[0]%M) + (A[1]*B[2]%M)) % M;
    C[1] = ((A[0]*B[1]%M) + (A[1]*B[3]%M)) % M;
    C[2] = ((A[2]*B[0]%M) + (A[3]*B[2]%M)) % M;
    C[3] = ((A[2]*B[1]%M) + (A[3]*B[3]%M)) % M;
    return C;
}

unsigned long int * matExp(unsigned long int *A, unsigned long int n){
    if (n==1){
        return A;
    }
    if (n%2==0){
        return matExp(matProd(A,A),n/2);
    }
    return matProd(A,matExp(A,n-1));
}

unsigned long int findFib(unsigned long int n){
    unsigned long int A[4] = {0, 1, 1, 1};
    unsigned long int * C;
    C = malloc(4*sizeof(unsigned long int));
    C = matExp(A,n-2);
    return (C[2]+C[3]);
}

main(){
    unsigned long int n = 300;
    printf("%ld\n",findFib(n));
}
There are probably several things in there that violate proper coding conventions or could be improved. I thought changing to long int might solve the problem, but it does not do the trick. So basically the problem is that increasing M to, for instance, 1000000 does not give me more digits but instead gives me nonsense. What mistake am I making?
P.S. sorry for the poor math formatting, I am used to math.stackexchange.
The issue is probably that you are running on a system where long is 32 bits in size, as I believe is the case on Windows. You can check this by compiling and running printf("%zu\n", sizeof(long)) (note %zu, since sizeof yields a size_t), which should output 4.
Since with M=1000000=10^6, the product of two numbers smaller than M can go up to 10^12, you get overflow issues when you are computing your matrix entries since unsigned long can hold up to at most 2^32-1 or roughly 4 * 10^9.
To fix this, simply use unsigned long long as your type instead of unsigned long. Or better yet, uint64_t, which is guaranteed to be 64 bits on all platforms (and which requires #include <stdint.h>). This should make your code work for M up to about sqrt(2^64) ≈ 4 * 10^9 (a little less in practice, since the code sums two products before the final reduction). If you need bigger than that, you'll need a big-integer library.
If the program works for M == 10000 but fails for M == 1000000 (or even for M == 100000) then that probably means that your C implementation's unsigned long int type is 32 bits wide.
If your matrix elements are drawn exclusively from Z_10000, then they require at most 14 significant binary digits. The products you compute in your matrix multiplication function, before reducing modulo M, may therefore require up to 28 binary digits. If you increase M even to 100000, however, then the matrix elements require up to 17 binary digits, and the intermediate products require up to 34. The reduction modulo M is too late to prevent that overflowing a 32-bit integer and therefore giving you garbage results.
You could consider declaring the element type as uint64_t instead. If it's an overflow problem then that should give you enough extra digits to handle M == 1000000.
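With 64-bit elements the multiplication routine might look like this (a sketch: only the types change relative to the question, and the inner % M on each product becomes unnecessary because a sum of two products below M^2 = 10^12 fits comfortably in 64 bits):

```c
#include <stdint.h>
#include <stdlib.h>

// Now safe for M up to about 10^9: products stay below M^2.
static const uint64_t M = 1000000;

// 2x2 matrix product modulo M with 64-bit elements, so that
// A[i]*B[j] (< M^2 = 10^12) cannot overflow.
uint64_t *matProd(const uint64_t *A, const uint64_t *B) {
    uint64_t *C = malloc(4 * sizeof *C);
    C[0] = (A[0]*B[0] + A[1]*B[2]) % M;
    C[1] = (A[0]*B[1] + A[1]*B[3]) % M;
    C[2] = (A[2]*B[0] + A[3]*B[2]) % M;
    C[3] = (A[2]*B[1] + A[3]*B[3]) % M;
    return C;
}
```

Squaring the question's starting matrix {0, 1, 1, 1} gives {1, 1, 1, 2}, as expected for the Fibonacci matrix.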
My algorithm performs the arithmetic operations given below. For small values it works perfectly, but for large numbers such as 218194447 it returns a seemingly random value. I have tried long long int and double, but nothing works, because the modulus operator I used can only be applied to integer types. Can anyone explain how to solve this, or provide links that might be useful?
#include<stdio.h>
#include<math.h>

int main()
{
    long long i, j;
    int t, n;
    scanf("%d\n", &t);
    while (t--)
    {
        scanf("%d", &n);
        long long k;
        i = (n * n);
        k = (1000000007);
        j = (i % k);
        printf("%d\n", j);
    }
    return 0;
}
You could declare your variables as int64_t or long long; then the modulus would be computed in that range (e.g. 64 bits for int64_t). It works correctly only if all intermediate values fit in that range.
However, you probably want or need bignums. I suggest you learn and use GMPlib for that.
BTW, don't use pow, since it computes in floating point. Try i = n * n; instead of i = pow(n,2);
P.S. This is not for a beginner in C programming; using GMPlib requires some fluency with C (and with programming in general).
The problem in your code is that intermediate values of your computation exceed the range of values that can be stored in an int: n^2 cannot be represented as an int once n exceeds 46340 (46341^2 already exceeds 2^31 - 1).
Follow the link given above by R.T. for a way of doing modulo on big numbers. That won't be enough on its own, though, since you also need a library that can handle big integer values. With only the standard C libraries in place, that is otherwise a tough task to do on your own. (OK, for squaring values up to about 2^31 a 64-bit integer would do, but if you're going even larger, you're out of luck again.)
After the accepted answer:
To find the modulo of a number n raised to some power p (2 in OP's case), there is no need to first calculate power(n,p). Instead, calculate intermediate modulo values as n is raised to intermediate powers.
The following code works with p==2 as needed by OP, but also works quickly if p=1000000000.
The only wider integers needed are integers that are twice as wide as n.
Performing all this with unsigned integers simplifies the needed code.
The resultant code is quite small.
#include <stdint.h>

uint32_t powmod(uint32_t base, uint32_t expo, uint32_t mod) {
    // `y = 1u % mod` needed only for the cases expo==0, mod<=1;
    // otherwise `y = 1u` would do.
    uint32_t y = 1u % mod;
    while (expo) {
        if (expo & 1u) {
            y = ((uint64_t) base * y) % mod;
        }
        expo >>= 1u;
        base = ((uint64_t) base * base) % mod;
    }
    return y;
}
#include<stdio.h>

int main(void) {
    unsigned long j;
    unsigned t, n;
    scanf("%u\n", &t);
    while (t--) {
        scanf("%u", &n);
        unsigned long k;
        k = 1000000007u;
        j = powmod(n, 2, k);
        printf("%lu\n", j);
    }
    return 0;
}
#include <stdio.h>

int main()
{
    int i, j, k, t;
    long int n;
    int count;
    int a, b;
    float c;
    scanf("%d", &t);
    for (k = 0; k < t; k++)
    {
        count = 0;
        scanf("%d", &n);
        for (i = 1; i < n; i++)
        {
            a = pow(i, 2);
            for (j = i; j < n; j++)
            {
                b = pow(j, 2);
                c = sqrt(a + b);
                if ((c - floor(c) == 0) && c <= n)
                    ++count;
            }
        }
        printf("%d\n", count);
    }
    return 0;
}
The above is a C program that counts the number of Pythagorean triplets within the range 1..n.
How do I optimize it? It times out for large input.
1<=T<=100
1<=N<=10^6
Your inner two loops are O(n*n) so there's not too much that can be done without changing algorithms. Just looking at the inner loop the best I could come up with in a short time was the following:
unsigned long long int i, j, k, t;
unsigned long long int n = 30000;  // Example for testing
unsigned long long int count = 0;
unsigned long long int a, b;
unsigned long long int c;
unsigned long long int n2 = n * n;

for (i = 1; i < n; i++)
{
    a = i * i;
    for (j = i; j < n; j++)
    {
        b = j * j;
        unsigned long long int sum = a + b;
        if (sum > n2) break;

        // Quick filter: these low-bit patterns never occur in a
        // perfect square, so skip the sqrt call for such sums
        if ((sum & 2) || ((sum & 7) == 5) || ((sum & 11) == 8)) continue;

        c = sqrt((double)sum);
        if (c * c == sum) ++count;
    }
}
A few comments:
For the case of n=30000 this is roughly twice as fast as your original.
If you don't mind n being limited to 65535 you can switch to unsigned int to get a x2 speed increase (or roughly x4 faster than your original).
The filter applied before the sqrt call increases the speed by a factor of two. You may be able to improve on it by looking at the answers to this question.
Your original code has integer overflows when i > 65535 which is the reason I switched to 64-bit integers for everything.
I think your method of checking for a perfect square doesn't always work, due to the inherent imprecision of floating-point numbers. The method in my example should get around that, and is slightly faster anyway.
You are still bound to the O(n*n) algorithm. On my machine the code for n=30000 runs in about 6 seconds which means the n=1000000 case will take close to 2 hours. Looking at Wikipedia shows a host of other algorithms you could explore.
It really depends on the benchmark you are expected to meet.
For now, though, the pow function could be a bottleneck. You could do either of two things:
a) precalculate all the squared values, save them to a file, and load them into a lookup table. Depending on the input size, that might strain your memory.
b) memoize previously calculated squares so that when one is requested again you can reuse it, saving CPU time. This, too, will eventually use a lot of memory.
You can define your indexes as (unsigned) long or even (unsigned) long long, but you may have to use a bignum library to solve your problem for huge numbers. Using unsigned raises your maximum but forces you to work with non-negative numbers. I doubt you'll need anything bigger than long long, though.
It seems your question is about optimising your code to make it faster. If you read up on Pythagorean triplets you will see there is a way to generate them directly from integer parameters: if (3, 4, 5) is a triplet, then so are (2*3, 2*4, 2*5) and, in general, (k*3, k*4, k*5). Your algorithm checks every one of those candidates individually. There are better algorithms, for example Euclid's formula, but I'm afraid you will have to search and study the topic of Pythagorean triplets.
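As a sketch of that parameterised approach, assuming the task is to count triples with legs a <= b and hypotenuse c <= N (Euclid's formula generates every primitive triple exactly once from coprime m > k of opposite parity, and each primitive triple with hypotenuse c then contributes N / c multiples):

```c
// Greatest common divisor, used to select primitive triples only.
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b) {
    while (b) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Count Pythagorean triples (a <= b, hypotenuse c <= N) via Euclid's
// formula: every primitive triple is (m*m - k*k, 2*m*k, m*m + k*k)
// with m > k >= 1, gcd(m, k) == 1 and m - k odd; each primitive
// triple with hypotenuse c has N / c multiples d*(a, b, c) with d*c <= N.
unsigned long long countTriples(unsigned long long N) {
    unsigned long long count = 0;
    for (unsigned long long m = 2; m * m + 1 <= N; m++) {
        for (unsigned long long k = 1; k < m; k++) {
            unsigned long long c = m * m + k * k;
            if (c > N) break;
            if ((m - k) % 2 == 1 && gcd_ull(m, k) == 1)
                count += N / c;
        }
    }
    return count;
}
```

This does roughly O(N) total work per test case instead of O(n*n); countTriples(20), for instance, finds the 6 triples (3,4,5), (6,8,10), (9,12,15), (12,16,20), (5,12,13) and (8,15,17).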