#include <stdio.h>
#include <time.h>

int main()
{
    clock_t start;
    double d;
    long int n, i, j;

    scanf("%ld", &n);
    n = 100000;               /* the input is overridden with a fixed n */
    j = 2;
    start = clock();
    printf("\n%ld", j);       /* 2 is the only even prime */
    for (j = 3; j <= n; j += 2)
    {
        for (i = 3; i * i <= j; i += 2)
            if (j % i == 0)
                break;
        if (i * i > j)        /* no divisor found up to sqrt(j): j is prime */
            printf("\n%ld", j);
    }
    d = (clock() - start) / (double)CLOCKS_PER_SEC;
    printf("\n%f", d);
}
I got a running time of 0.015 s for n = 100000 with the above program.
I also implemented the Sieve of Eratosthenes in C and got a running time of 0.046 s for the same n = 100000.
How can my algorithm above be faster than the sieve I implemented?
What is the time complexity of my program?
My sieve implementation:
#define LISTSIZE 100000 //Number of integers to sieve
#include <stdio.h>
#include <math.h>
#include <time.h>

int main()
{
    clock_t start;
    double d;
    long int list[LISTSIZE], i, j;
    int listMax = (int)sqrt(LISTSIZE), primeEstimate = (int)(LISTSIZE/log(LISTSIZE));

    for(int i = 0; i < LISTSIZE; i++)
        list[i] = i + 2;

    start = clock();
    for(i = 0; i < listMax; i++)
    {
        //If the entry has been set to 0 ('removed'), skip it
        if(list[i] > 0)
        {
            //Remove all multiples of this prime
            //Starting from the next entry in the list
            //And going up in steps of size i
            for(j = i+1; j < LISTSIZE; j++)
            {
                if((list[j] % list[i]) == 0)
                    list[j] = 0;
            }
        }
    }
    d = (clock() - start) / (double)CLOCKS_PER_SEC;

    //Output the primes
    int primesFound = 0;
    for(int i = 0; i < LISTSIZE; i++)
    {
        if(list[i] > 0)
        {
            primesFound++;
            printf("%ld\n", list[i]);
        }
    }
    printf("\n%f", d);
    return 0;
}
There are a number of things that might influence your result. To be sure, we would need to see the code for your sieve implementation. Also, what is the resolution of the clock function on your computer? If the implementation does not allow for a high degree of accuracy at the millisecond level, then your results could be within the margin of error for your measurement.
I suspect the problem lies here:
//Remove all multiples of this prime
//Starting from the next entry in the list
//And going up in steps of size i
for(j = i+1; j < LISTSIZE; j++)
{
    if((list[j] % list[i]) == 0)
        list[j] = 0;
}
This is a poor way to remove all of the multiples of the prime: it tests every remaining entry with a modulo. Why not step directly through the multiples instead? This version should be much faster:
//Remove all multiples of this prime
//Starting from the prime itself
//And going up in steps of size list[i]
for(j = list[i]; j < LISTSIZE; j += list[i])
{
    list[j] = 0;
}
What is the time complexity of my above program?
To empirically measure the time complexity of your program, you need more than one data point. Run your program for multiple values of N, then make a graph of N vs. time. You can do this using a spreadsheet, GNUplot, or graph paper and pencil. You can also use software and/or plain old mathematics to find a polynomial curve that fits your data.
Non-empirically: much has been written (and lectured in computer science classes) about analyzing computational complexity. The Wikipedia article on computational complexity theory might provide some starting points for further reading.
Your sieve implementation is incorrect; that's the reason why it is so slow:

- you shouldn't make it an array of numbers but an array of flags (you may still use int as the data type, but char would do as well)
- you shouldn't be using index shifts for the array; list[i] should determine whether i is a prime, not whether i+2 is
- you should start the elimination with i=2
- with these modifications, you should follow 1800 INFORMATION's advice and cancel all multiples of i with a loop that goes in steps of i, not steps of 1 (see the sketch below)
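Putting those fixes together, a minimal corrected sieve might look like this sketch (it counts the primes rather than printing them; names are illustrative):

#include <stdio.h>

#define LISTSIZE 100000

int main(void)
{
    /* flags[i] != 0 means "i is still a prime candidate" */
    static char flags[LISTSIZE];
    long i, j;
    for (i = 2; i < LISTSIZE; i++)
        flags[i] = 1;

    for (i = 2; i * i < LISTSIZE; i++)
        if (flags[i])
            for (j = i * i; j < LISTSIZE; j += i)  /* strike out multiples of i */
                flags[j] = 0;

    long primesFound = 0;
    for (i = 2; i < LISTSIZE; i++)
        if (flags[i])
            primesFound++;
    printf("%ld primes below %d\n", primesFound, LISTSIZE);
    return 0;
}

Starting the inner loop at i*i rather than 2*i is a small extra refinement: every smaller multiple of i has a smaller prime factor and has already been struck out.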
Just for your time complexity:
You have an outer loop of about sqrt(n) iterations (listMax) and an inner loop of at most LISTSIZE iterations, so your complexity is
O(sqrt(n) * n)
where n = LISTSIZE. It is actually a bit lower, since the inner loop shrinks as i grows and is only run for entries not yet removed, but that is difficult to calculate exactly. Since the O-notation gives an upper bound, O(sqrt(n) * n) is fine.
The behaviour is difficult to predict, but you should take into account that accessing memory is not cheap; for small primes it is probably faster to just compute them again than to read them back from memory.
Those run times are too small to be meaningful. The system clock resolution is not accurate to that kind of level.
What you should do to get accurate timing information is run your algorithm in a loop: repeat it a few thousand times to get the total run time up to at least a second, then divide that time by the number of iterations.
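A minimal sketch of that measurement loop, assuming the code under test is wrapped in a function (run_algorithm is a hypothetical stand-in name):

#include <stdio.h>
#include <time.h>

static void run_algorithm(void)
{
    /* ... the code being measured goes here ... */
}

int main(void)
{
    const int reps = 1000; /* chosen so the total comfortably exceeds a second */
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        run_algorithm();
    double total = (clock() - start) / (double)CLOCKS_PER_SEC;
    printf("total: %f s, average per run: %f s\n", total, total / reps);
    return 0;
}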
Related
I need to optimize this C code so that it runs as fast as possible. I am quite new to code optimization in general. What should I begin with?
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int n, i, flag;
    int sumOfPrimeNumbers; //sum of prime numbers
    sumOfPrimeNumbers = 0;
    do {
        flag = 0;
        scanf("%d", &n);
        for (i = 2; i < n; i++)
        {
            if (n % i == 0) {
                flag = 1; // flag all non-prime numbers
                break;
            }
        }
        if (flag == 0) {
            sumOfPrimeNumbers = sumOfPrimeNumbers + n; // sum prime numbers
        }
    } while (n != 0);
    printf("%d\n", sumOfPrimeNumbers);
    return 0;
}
For small values of n (maybe values less than 65536?) you can use a table of precomputed answers, like "printf("%d\n", table[n]);".
For larger values you can split n into "zone" and "offset in zone", like "zone = n / zone_size; offset = n % zone_size;", and then use "zone" as an index into a precomputed table to determine an initial starting point (and skip a huge amount of work, like "sumOfPrimeNumbers = zoneStartTable[n / zone_size];"). The "offset in zone" part can be handled with a Sieve of Eratosthenes; it's nicer for "zone_size" to be the product of the smallest primes (e.g. "zone_size = 2 * 3 * 5 * 7 * 11 * 13 * 17;") because that makes it a little easier to create a Sieve of Eratosthenes from a non-zero starting point.
For this approach to work you will actually need 2 sieves - one to find primes from 1 to "sqrt(n)", so that you can mark multiples of those primes as "not prime" in the second sieve (which will contain values from "zone * zone_size" to n). This process can be accelerated by recognizing that the sieve for the smallest primes (those you used to determine "zone_size") creates a pattern that repeats every "zone_size" numbers; that pattern can be predetermined and copied into both sieves to initialize them, allowing you to skip marking the smallest primes in both sieves.
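As a minimal sketch of the precomputed-table part of this idea (the 65536 cutoff and all names are illustrative), you can sieve once up front so that every later primality query below the cutoff is a single array read:

#include <stdbool.h>
#include <string.h>

#define TABLE_MAX 65536 /* assumed cutoff for the precomputed table */

static bool is_prime_table[TABLE_MAX];

/* Build the table once with a Sieve of Eratosthenes; afterwards,
   "is n prime?" for 0 <= n < TABLE_MAX is just is_prime_table[n]. */
static void build_prime_table(void)
{
    memset(is_prime_table, true, sizeof is_prime_table);
    is_prime_table[0] = is_prime_table[1] = false;
    for (long i = 2; i * i < TABLE_MAX; i++)
        if (is_prime_table[i])
            for (long j = i * i; j < TABLE_MAX; j += i)
                is_prime_table[j] = false;
}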
Improve the algorithm; avoid premature optimizations.
Rather than test divisors all the way up to n, search only up to the square root of n:
// for(i=2;i < n;i++)
for (i=2; i <= n/i; i++)
Sieve of Eratosthenes
Form a list of found primes {2,3,5} and only test against those. As a new prime is found, append it to the list.
Many other optimizations are possible.
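A minimal sketch of the square-root test as a reusable function (is_prime is an illustrative name; the i <= n/i comparison is equivalent to i*i <= n but cannot overflow):

#include <stdbool.h>

static bool is_prime(int n)
{
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;      /* handle even numbers once */
    for (int i = 3; i <= n / i; i += 2) /* i <= n/i, i.e. i*i <= n, overflow-free */
        if (n % i == 0) return false;
    return true;
}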
I'm having a bit of trouble figuring out the Big O run time for the two set of code samples where the iterations depend on outside loops. I have a basic understanding of the Big O run times and I can figure out the run times for simpler code samples. I'm not too sure how some lines are affecting the run time.
I would consider this first one O(n^2). However, I'm not certain.
for (i = 1; i < n; i++) {
    for (j = 1000/i; j > 0; j--) {  // <-- Not sure if this is still O(n)
        arr[j]++;                   /* THIS LINE */
    }
}
I'm a bit more lost with this one. O(n^3) possibly O(n^2)?
for (i = 0; i < n; i++) {
    for (j = i; j < n; j++) {
        while (j < n) {
            arr[i] += arr[j];       /* THIS LINE */
            j++;
        }
    }
}
I found this post and applied it to the first code sample, but I'm still unsure about the second: What is the Big-O of a nested loop, where number of iterations in the inner loop is determined by the current iteration of the outer loop?
Regarding the first one: it is not O(n^2)! For the sake of simplicity and readability, let's rewrite it in the form of pseudocode:

for i in [1, 2, ... n]:          # outer loop
    for j in [1, 2, ... 1000/i]: # inner loop
        do something with time complexity O(1)  # constant-time operation

Now, the number of constant-time operations within the inner loop (which depends on the parameter i of the outer loop) can be expressed as

$$N_{\text{inner}}(i) = \left\lfloor \frac{1000}{i} \right\rfloor.$$

Now, we can calculate the number of constant-time operations overall:

$$N(n) = \sum_{i=1}^{n} \left\lfloor \frac{1000}{i} \right\rfloor \le 1000 \sum_{i=1}^{n} \frac{1}{i} = 1000\, H_n.$$

Here $H_n$ is a harmonic number (see Wikipedia), and there is a very interesting property of these numbers:

$$H_n = \ln n + C + o(1),$$

where C is the Euler–Mascheroni constant. Therefore, the complexity of the first algorithm is $O(\log n)$.
Regarding the second one: it seems like either the code contains a mistake, or it is a trick test question. The code resolves to

for (i = 0; i < n; i++)
    for (j = i; j < n; j++) {
        arr[i] += arr[j];
        j++;
    }

The inner loop takes

$$\left\lceil \frac{n-i}{2} \right\rceil$$

operations, so we can calculate the overall complexity:

$$\sum_{i=0}^{n-1} \frac{n-i}{2} = \frac{n(n+1)}{4} = O(n^2).$$
For the second loop (which it appears that you still need an answer for), you have sort of a misleading bit of code, where you have 3 nested loops, so at first glance, it makes sense that the runtime is O(n^3).
However, this is incorrect. This is because the innermost while loop modifies j, the same variable that the for loop modifies. This code is actually equivalent to this bit of code below:
for (i = 0; i < n; i++) {
    for (j = i; j < n; j++) {
        arr[i] += arr[j];           /* THIS LINE */
        j++;
    }
}
This is because the while loop on the inside will run, incrementing j until j == n, then it breaks out. At that point, the inner for loop will increment j again and compare it to n, where it will find that j >= n, and exit. You should be familiar with this case already, and recognize it as O(n^2).
Just a note: the second bit of code is not safe, technically, as j may overflow when it is incremented one extra time after the while loop finishes. That would cause the for loop to run forever. However, this can only occur when n == INT_MAX.
I need to write code that sorts in 'n' run time, and I don't know how to calculate it. I simply need to rearrange an array so that the left side is odd and the right side is even. This is what I wrote, and I wonder how to find the run time.
for (i = 0; i < size-1; i++)
{
    if (ptr[i]%2 == 0 || ptr[i] == 0)
    {
        for (j = i; j < size; j++)
        {
            if (ptr[j]%2 != 0)
            {
                temp = ptr[i];
                ptr[i] = ptr[j];
                ptr[j] = temp;
                break;
            }
        }
    }
}
Thanks in advance.
Your runtime for this code is O(N^2).
You can use counting sort to sort an array in linear time.
For reference: Counting Sort
As #VenuKant Sahu answered, OP's code is O(n*n)
That is due to its double nested for loops
for (i=0;i<size-1;i++)
...
for (j=i;j<size;j++)
...
I need to write code that sorts in 'n' run time
O(n) algorithm (did not want to just give the code)
The number of swap-loop iterations below cannot exceed n/2; the increment of even_side happens at most n times, and the decrement of odd_side happens at most n times.

// set up indexes
int even_side = left-most valid index
int odd_side  = right-most valid index
loop {
    while (the_even_index_is_not_at_the_right_end && is_even(a[even_side])) increment even_side;
    while (the_odd_index_is_not_at_the_left_end && !is_even(a[odd_side])) decrement odd_side;
    compare the indexes
    if (done) exit the loop;
    a[even_side] <==> a[odd_side]  // swap
}
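For comparison with that pseudocode, a minimal C sketch of the same two-index scheme might look like this (partition_odd_even is an illustrative name; this variant puts odd values on the left, as the question asks):

/* Rearrange a[0..n-1] so odd values come first, then even values. O(n). */
void partition_odd_even(int *a, int n)
{
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        while (lo < hi && a[lo] % 2 != 0) lo++; /* a[lo] already odd: keep left  */
        while (lo < hi && a[hi] % 2 == 0) hi--; /* a[hi] already even: keep right */
        if (lo < hi) { /* a[lo] is even and a[hi] is odd: swap them */
            int t = a[lo]; a[lo] = a[hi]; a[hi] = t;
        }
    }
}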
Some helper code to set up a random array.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10

int main(void) {
    srand(time(NULL));
    int a[N];
    for (int i = 0; i < N; i++) {
        a[i] = rand() % 100;
        printf(" %d", a[i]);
    }
    puts("");
    return 0;
}
I am solving the system of linear algebraic equations Ax = b with the Jacobi method, but taking inputs manually. I want to analyze the performance of the solver for large systems. Is there any method to generate a matrix A that is guaranteed non-singular?
I am attaching my code here.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define TOL = 0.0001

void main()
{
    int size, i, j, k = 0;
    printf("\n enter the number of equations: ");
    scanf("%d", &size);
    double reci = 0.0;
    double *x = (double *)malloc(size*sizeof(double));
    double *x_old = (double *)malloc(size*sizeof(double));
    double *b = (double *)malloc(size*sizeof(double));
    double *coeffMat = (double *)malloc(size*size*sizeof(double));

    printf("\n Enter the coefficient matrix: \n");
    for(i = 0; i < size; i++)
    {
        for(j = 0; j < size; j++)
        {
            printf(" coeffMat[%d][%d] = ", i, j);
            scanf("%lf", &coeffMat[i*size+j]);
            printf("\n");
            //coeffMat[i*size+j] = 1.0;
        }
    }

    printf("\n Enter the b vector: \n");
    for(i = 0; i < size; i++)
    {
        x[i] = 0.0;
        printf(" b[%d] = ", i);
        scanf("%lf", &b[i]);
    }

    double sum = 0.0;
    while(k < size)
    {
        for(i = 0; i < size; i++)
        {
            x_old[i] = x[i];
        }
        for(i = 0; i < size; i++)
        {
            sum = 0.0;
            for(j = 0; j < size; j++)
            {
                if(i != j)
                {
                    sum += (coeffMat[i * size + j] * x_old[j]);
                }
            }
            x[i] = (b[i] - sum) / coeffMat[i * size + i];
        }
        k = k + 1;
    }

    printf("\n Solution is: ");
    for(i = 0; i < size; i++)
    {
        printf(" x[%d] = %lf \n ", i, x[i]);
    }
}
This is all a bit Heath Robinson, but here's what I've used. I have no idea how 'random' such matrices are; in particular, I don't know what distribution they follow.
The idea is to generate the SVD of the matrix (called A below, and assumed n x n).
- Initialise A to all 0s.
- Generate n positive numbers and put them, with random signs, on the diagonal of A. I've found it useful to be able to control the ratio of the largest of these positive numbers to the smallest: this ratio will be the condition number of the matrix.
- Then repeat n times: generate a random n-vector f and multiply A on the left by the Householder reflector I - 2*f*f' / (f'*f). Note that this can be done more efficiently than by forming the reflector matrix and doing a normal multiplication; indeed it's easy to write a routine that, given f and A, updates A in place.
- Repeat the above, but multiplying on the right.
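A minimal C sketch of this recipe, assuming a row-major n-by-n array (apply_reflector and make_test_matrix are illustrative names):

#include <stdlib.h>

/* Apply the Householder reflector H = I - 2*f*f'/(f'*f) to the n x n
   row-major matrix A, in place. side 'L' forms H*A, side 'R' forms A*H. */
static void apply_reflector(double *A, int n, const double *f, char side)
{
    double ff = 0.0;
    for (int k = 0; k < n; k++)
        ff += f[k] * f[k];
    for (int idx = 0; idx < n; idx++) {
        double dot = 0.0; /* f' times column idx (L) or row idx (R) of A */
        for (int k = 0; k < n; k++)
            dot += f[k] * (side == 'L' ? A[k*n + idx] : A[idx*n + k]);
        double s = 2.0 * dot / ff;
        for (int k = 0; k < n; k++) {
            if (side == 'L') A[k*n + idx] -= s * f[k];
            else             A[idx*n + k] -= s * f[k];
        }
    }
}

/* Fill A with a random non-singular matrix whose condition number is
   roughly cond (singular values spread from 1 to cond). */
static void make_test_matrix(double *A, int n, double cond)
{
    double *f = malloc(n * sizeof *f);
    for (int i = 0; i < n*n; i++)
        A[i] = 0.0;
    for (int i = 0; i < n; i++) {
        double sv = 1.0 + (cond - 1.0) * i / (n > 1 ? n - 1 : 1);
        A[i*n + i] = (rand() % 2) ? sv : -sv; /* random sign on the diagonal */
    }
    for (int r = 0; r < n; r++) {
        for (int k = 0; k < n; k++)
            f[k] = 2.0 * rand() / RAND_MAX - 1.0;
        apply_reflector(A, n, f, 'L');
        for (int k = 0; k < n; k++)
            f[k] = 2.0 * rand() / RAND_MAX - 1.0;
        apply_reflector(A, n, f, 'R');
    }
    free(f);
}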
As for generating test data, a simple way is to pick an x0 and then generate b = A * x0. Don't expect to get exactly x0 back from your solver; even if it is remarkably well behaved, you'll find that the errors get bigger as the condition number gets bigger.
Talonmies' comment mentions http://www.eecs.berkeley.edu/Pubs/TechRpts/1991/CSD-91-658.pdf which is probably the right approach (at least in principle, and in full generality).
However, you are probably not handling "very large" matrices (e.g. because your program uses naive algorithms, and because you don't run it on a large supercomputer with a lot of RAM). So the naive approach of generating a matrix with random coefficients and testing afterwards that it is non-singular is probably enough.
Very large matrices would have many billions of coefficients, and you would need a powerful supercomputer with e.g. terabytes of RAM. You probably don't have that, and if you did, your program would probably run far too long (it has no parallelism) and might give very wrong results (read http://floating-point-gui.de/ for more), so this case is out of reach anyway.
A matrix of a million coefficients (e.g. 1024*1024) is considered small by current hardware standards (and is more than enough to test your code on current laptops or desktops, and even to test some parallel implementations). Generating some of them randomly (and computing their determinant to check that they are not singular) is enough and easily doable. You might even generate them and/or check their non-singularity with some external tool, e.g. Scilab, R, Octave, etc. Once your program has computed a solution x0, you can use some tool (or write another program) to compute Ax0 - b and check that it is very close to the 0 vector (there are cases where you would be disappointed or surprised, since round-off errors matter).
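A minimal sketch of that residual check, assuming the same row-major layout as the question's code (residual_inf is an illustrative name):

#include <math.h>

/* Max-norm of A*x - b for an n x n row-major matrix A. */
static double residual_inf(const double *A, const double *x,
                           const double *b, int n)
{
    double worst = 0.0;
    for (int i = 0; i < n; i++) {
        double r = -b[i];
        for (int j = 0; j < n; j++)
            r += A[i*n + j] * x[j];
        if (fabs(r) > worst)
            worst = fabs(r);
    }
    return worst;
}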
You'll need a good enough pseudo-random number generator, perhaps as simple as drand48(3), which is considered nearly obsolete (you should find and use something better); you could seed it with some random source (e.g. /dev/urandom on Linux).
BTW, compile your code with all warnings and debug info (e.g. gcc -Wall -Wextra -g). Your #define TOL = 0.0001 is probably wrong (it should be #define TOL 0.0001 or const double tol = 0.0001;). Use the debugger (gdb) and valgrind. Add optimizations (-O2 -mcpu=native) when benchmarking. Read the documentation of every function you use, notably those from <stdio.h>. Check the result count from scanf. In C99 you should not cast the result of malloc, but you forgot to test for its failure, so code:

double *b = malloc(size * sizeof(double));
if (!b) { perror("malloc b"); exit(EXIT_FAILURE); }

You should end, rather than start, your printf control strings with \n, because stdout is often (but not always!) line buffered. See also fflush.
You should probably also read some basic linear algebra textbook...
Notice that actually writing robust and efficient programs to invert matrices or to solve linear systems is a difficult art (which I don't know at all): it has programming issues, algorithmic issues, and mathematical issues; read some numerical analysis book. You can still get a PhD and spend your whole life working on that. Please understand that you need ten years to learn programming (or many other things).
I have been tasked with optimizing a particular for loop in C. Here is the loop:
#define ARRAY_SIZE 10000
#define N_TIMES 600000

for (i = 0; i < N_TIMES; i++)
{
    int j;
    for (j = 0; j < ARRAY_SIZE; j++)
    {
        sum += array[j];
    }
}
I'm supposed to use loop unrolling, loop splitting, and pointers in order to speed it up, but every time I try to implement something, the program doesn't return. Here's what I've tried so far:
for (i = 0; i < N_TIMES; i++)
{
    int j, k;
    for (j = 0; j < ARRAY_SIZE; j++)
    {
        for (k = 0; k < 100; k += 2)
        {
            sum += array[k];
            sum += array[k + 1];
        }
    }
}
I don't understand why the program doesn't even return now. Any help would be appreciated.
That second piece of code is both inefficient and wrong, since it performs far more additions than the original code (and only ever adds the first 100 elements).
The loop unrolling (or lessening, in this case, since you probably don't want to fully unroll a ten-thousand-iteration loop) would be:

// Ensure ARRAY_SIZE is a multiple of two before trying this.
for (int i = 0; i < N_TIMES; i++)
    for (int j = 0; j < ARRAY_SIZE; j += 2)
        sum += array[j] + array[j+1];

But, to be honest, the days of dumb compilers have long since gone. You should generally leave this level of micro-optimisation to your compiler, while you concentrate on the more high-level stuff like data structures, algorithms, and human analysis.
That last one is rather important. Since you're adding the same array to an accumulated sum a constant number of times, you only really need the sum of the array once, then you can add that partial sum as many times as you want:
int temp = 0;
for (int i = 0; i < ARRAY_SIZE; i++)
    temp += array[i];
sum += temp * N_TIMES;
It's still O(n) but with a much lower multiplier on the n (one rather than six hundred thousand). It may be that gcc's insane optimisation level of -O3 could work that out but I doubt it. The human brain can still outdo computers in a lot of areas.
For now, anyway :-)
There is nothing wrong with your program... it will return. It is just going to take 50 times longer than the first one...
On the first you had 2 fors: 600,000 * 10,000 = 6,000,000,000 iterations.
On the second you have 3 fors: 600,000 * 10,000 * 50 = 300,000,000,000 iterations...
Loop unrolling doesn't speed loops up; it slows them down. In olden times it gave you a speed bump by reducing the number of conditional evaluations. In modern times it slows you down by killing the cache.
There's no obvious use case for loop splitting here. To split a loop you look for two or more obvious groupings of the iterations. At a stretch, you could multiply array[j] by the outer iteration count rather than doing the outer loop, claim you've split the inner from the outer, and then discard the outer as useless.
C array-indexing syntax is just defined as (a peculiar syntax for) pointer arithmetic. But I guess you'd want something like:
sum += *arrayPointer++;
In place of your use of j, with things initialised suitably. But I doubt you'll gain anything from it.
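For example, a sketch of the pointer form of the inner loop (assuming the same array, sum, and constants as in the question):

for (int i = 0; i < N_TIMES; i++) {
    const int *p = array;
    const int *end = array + ARRAY_SIZE;
    while (p < end)
        sum += *p++;   /* pointer form of: sum += array[j]; j++; */
}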
As per the comments, if this were real life then you'd just let the compiler figure this stuff out.