I am writing a program in which a certain for-loop gets iterated many, many times.
A single iteration doesn't take too long, but since the program iterates the loop so often, it takes quite some time to compute.
In an effort to get more information on the progress of the program without slowing it down too much, I would like to print the progress every xth step.
Is there a different way to do this than a conditional with a modulo, like so:
for(int i = 0; i < some_large_number; i++){
if(i % x == 0)
printf("%f%%\r", percent);
//some other code
.
.
.
}
?
Thanks in advance
This code:
for(int i = 0; i < some_large_number; i++){
if(i % x == 0)
printf("%f%%\r", percent);
//some other code
.
.
.
}
can be restructured as:
/* Partition the execution into blocks of x iterations, possibly including a
final fragmentary block. The expression (some_large_number+(x-1))/x
calculates some_large_number/x with any fraction rounded up.
*/
for (int block = 0, i = 0; block < (some_large_number+(x-1))/x; ++block)
{
printf("%f%%\r", percent);
// Set limit to the lesser of the end of the current block or some_large_number.
int limit = (block+1) * x;
if (some_large_number < limit) limit = some_large_number;
// Iterate the original code.
for (; i < limit; ++i)
{
//some other code
}
}
With the following caveats and properties:
The inner loop has no more work than the original loop (it has no extra variable to count or test) and has the i % x == 0 test completely removed. This is optimal for the inner loop in the sense that it reduces the nominal amount of work as much as possible, although real-world hardware sometimes has finicky behaviors that can result in more compute time for less actual work.
New identifiers block and limit are introduced but can be changed to avoid any conflicts with uses in the original code.
Other than the above, the inner loop operates identically to the original code: It sees the same values of i in the same order as the original code, so no changes are needed in that code.
some_large_number+(x-1) could overflow int. (A division-only variant that avoids this is sketched below.)
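If that overflow is a concern, the block count can be computed with division and remainder only; a minimal sketch:
int blocks = some_large_number / x + (some_large_number % x != 0);
/* Overflow-safe ceiling of some_large_number / x; use `blocks` as the
   outer loop bound in place of (some_large_number+(x-1))/x. */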
I would do it like this:
int j = x;
for (int i = 0; i < some_large_number; i++){
if(--j == 0) {
printf("%f%%\r", percent);
j = x;
}
//some other code
.
.
.
}
Divide some_large_number by x. Then loop x times, nest an inner loop over that quotient, and print the percent after each inner loop. I mean this:
int temp = some_large_number/x;
for (int i = 0; i < x; i++){
for (int j = 0; j < temp; j++){
//some code
}
printf("%f%%\r", percent);
}
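One caveat: integer division drops any remainder, so if some_large_number is not an exact multiple of x, the last some_large_number % x iterations are skipped. A short cleanup loop (a sketch reusing the same body) covers them:
for (int k = 0; k < some_large_number % x; k++){
//some code
}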
Regarding your performance concern, the fastest approach would be to use a nested loop:
unsigned int x = 6;
unsigned int segments = some_large_number / x;
unsigned int y;
for ( unsigned int i = 0; i < segments; i++ ) {
printf("%f%%\r", percent);
for ( unsigned int j = 0; j < x; j++ ) {
/* some code here */
}
}
// If some_large_number can't be divided evenly by `x`:
if (( y = (some_large_number % x)) != 0 )
{
for ( unsigned int i = 0; i < y; i++ ) {
/* same code as inside of the former inner loop. */
}
}
Another example is to use a separate counting variable for the check that triggers the print, comparing it to x - 1 and resetting it to -1 on a match:
unsigned int x = 6;
unsigned int some_large_number = 100000000;
int j = 0;
for ( unsigned int i = 0; i < some_large_number; i++, j++ ) {
if(j == (x - 1))
{
printf("%f%%\r", percent);
j = -1;
}
/* some code here */
}
I'm currently rewriting a minesweeper program in C using the CSFML library.
I'm having some issues managing the initialization that happens only after the first click, more precisely in the part where I'm supposed to make the tiles around the click empty.
I can't find a way to make these tiles empty without risking the removal of some bombs.
Here's my init code block for now:
int current = 0;
temp.bombs = BOMB_EASY;
temp.difficulty = EASY;
temp.mapEasy = malloc(sizeof(sTILE *) * (Y_EASY + 1));
for (int i = 0; i < Y_EASY + 1 ; i++)
{
temp.mapEasy[i] = malloc(sizeof(sTILE) * (X_EASY + 1));
}
for (int i = 0; i < X_EASY + 1; i++)
{
temp.mapEasy[Y_EASY][i].type = 0;
}
while (current < BOMB_EASY)
{
for (int i = 0; i < Y_EASY; i++)
{
for (int j = 0; j < X_EASY; j++)
{
int isBomb = rand() % 10;
if (isBomb == 0 && current < BOMB_EASY && temp.mapEasy[i][j].type != 9)
{
temp.mapEasy[i][j].type = 9;
current++;
}
else if (temp.mapEasy[i][j].type != 9)
{
temp.mapEasy[i][j].type = 0;
}
}
}
}
for (int i = 0; i < Y_EASY; i++)
{
for (int j = 0; j < X_EASY; j++)
{
if (temp.mapEasy[i][j].type == 0)
{
temp.mapEasy[i][j].type = HowManyBombs(temp.mapEasy, i, j, Y_EASY, X_EASY);
}
temp.mapEasy[i][j].isRevealed = sfFalse;
temp.mapEasy[i][j].isFlagged = sfFalse;
}
}
}
I know my question might seem stupid, and someone has probably already answered it, but I couldn't find the answer, so thanks to anyone who answers me.
Create an empty matrix
Fill it with n mines at random locations. Upon generating (x, y) coordinates, check if they are already taken.
If the coordinates are already taken, you should place the mine at the next available position.
For example by increasing x by 1, check if free, if not, increase x again. Upon reaching the end of the row, increase y instead and start over with x=0.
If you simply generate a new random number, your algorithm could in theory get stuck forever. In practice it will probably work, but I would expect such an algorithm to generate the grid more slowly 1) than one that just picks the next free spot.
1) rand() call overhead is the most likely bottleneck in this algorithm. But also if you pick the next spot rather than calling rand() again, the CPU might be able to speculatively load (parts of) the array in prefetch data cache. This wouldn't be possible when the memory location is literally random each time you pick it.
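A minimal sketch of that placement scheme in C, assuming the question's convention that type == 9 marks a bomb (the function name and parameters are illustrative):
#include <stdlib.h>

/* Assumes bombs < rows * cols, so a free cell always exists. */
void placeBombs(sTILE **map, int rows, int cols, int bombs)
{
    int placed = 0;
    while (placed < bombs)
    {
        int x = rand() % cols;
        int y = rand() % rows;
        /* If the spot is taken, walk to the next free cell instead of
           calling rand() again. */
        while (map[y][x].type == 9)
        {
            if (++x == cols)
            {
                x = 0;
                y = (y + 1) % rows;
            }
        }
        map[y][x].type = 9;
        placed++;
    }
}
For the first-click requirement, one option is to pre-mark the tiles around the click as occupied before calling this and clear them back to empty afterwards, so no bomb can land there.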
I'm trying to implement an IIR filter. I've already made an FIR filter, but IIR feels more difficult than FIR.
I think IIR is similar to FIR, but it still confuses me.
I think the filters look like this:
FIR: y[n] = b0*x[n] + ... + b(M-1)*x[n-M+1]
IIR: y[n] = (b0*x[n] + ... + b(M-1)*x[n-M+1]) - (a1*y[n-1] + ... + aN*y[n-N])
In this case, what about a0? Is it just 1?
The y[n-1], ..., y[n-N] part is the problem. I'm confused about how to compute it.
Here is my code.
for (n = 0; n < length; n++) {
coeffa = coeffs_A;
coeffb = coeffs_B;
inputp = &insamp[filterLength - 1 + n];
acc = 0;
bcc = 0;
for (k = 0; k < filterLength; k++) {
bcc += (*coeffb++) * (*inputp--);
}
for (k = 0; k < filterLength; k++) {
acc += (*coeffa++) * (////////);
}
output[n] = bcc-acc;
}
In this case, filterLength is 7 and length is 80.
The //////// part is what I want to know.
Am I thinking about this wrong?
Typically IIR filters are implemented using direct form I or direct form II topologies. Each form requires memory states. These memory states keep the histories of the outputs and inputs in them. This makes it a lot simpler to implement the IIR filter. gtkiostream implements a direct form II approach and may be a helpful reference.
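For illustration, here is a minimal direct form II second-order section (biquad) sketch; the names and coefficient layout are assumptions for this example, not the gtkiostream API, and a0 is assumed normalized to 1:
/* Direct form II: two memory states w[0], w[1] shared by the feedback
   (a) and feedforward (b) paths. Processes one sample per call. */
double biquad_df2(double x, const double b[3], const double a[3], double w[2])
{
    double w0 = x - a[1] * w[0] - a[2] * w[1];          /* feedback path, a[0] == 1 */
    double y  = b[0] * w0 + b[1] * w[0] + b[2] * w[1];  /* feedforward path */
    w[1] = w[0];   /* shift the memory states for the next sample */
    w[0] = w0;
    return y;
}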
In addressing your question, I like the approach of directly estimating your filters from inputs and outputs, using the input and output buffers as memory states. As the a coefficients always operate on outputs, use the outputs as the missing variable, something like this:
acc += (*coeffa++) * output[n-k];
I took your program and rewrote it a bit to make it an MCVE and easier to read:
#include <stdio.h>
#define flen 3
#define slen 10
int main(void)
{
int n,k;
double a[flen] = {1,-0.5,0}, b[flen]={1,0,0};
double input[slen]={0,0,1}; // 0s for flen-1 to avoid out of bounds
double output[slen]={0};
double acc, bcc;
for (n = flen-1; n < slen; n++) {
acc = 0;
bcc = 0;
for (k = 0; k < flen; k++) {
bcc += b[k] * (input[n-k]);
}
for (k = 0; k < flen; k++) {
acc += a[k] * (output[n-k]);
}
output[n] = bcc-acc;
printf("%f\t%f\t%f\t%f\n",input[n],acc,bcc,output[n]);
}
return 0;
}
I am not entirely sure this is correct because I do not have the means to test it with different filter settings against a filter design tool.
The impulse response for the given coefficients seems to be okay. But I haven't done anything in this area for a few years now, so I am not certain. The general idea should be okay, I think.
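As a quick sanity check that can be done by hand: with a = {1, -0.5, 0} and b = {1, 0, 0}, the difference equation reduces to y[n] = x[n] + 0.5*y[n-1], so the impulse at input[2] should produce the geometric sequence 1, 0.5, 0.25, 0.125, ... in the output column.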
I have a giant nested for-loop, designed to set a large array to its default value. I'm trying to use OpenMP for the first time to parallelize it, and I have no idea where to begin. I have been reading tutorials, and am afraid the process will be performed independently on N cores, instead of N cores dividing the work amongst themselves toward a common output. The code is in C, compiled in Visual Studio v14. Any help for this newbie is appreciated -- thanks!
(Attached below is the monster nested for-loop...)
for (j = 0;j < box1; j++)
{
for (k = 0; k < box2; k++)
{
for (l = 0; l < box3; l++)
{
for (m = 0; m < box4; m++)
{
for (x = 0;x < box5; x++)
{
for (y = 0; y < box6; y++)
{
for (xa = 0;xa < box7; xa++)
{
for (xb = 0; xb < box8; xb++)
{
for (nb = 0; nb < memvara; nb++)
{
for (na = 0; na < memvarb; na++)
{
for (nx = 0; nx < memvarc; nx++)
{
for (nx1 = 0; nx1 < memvard; nx1++)
{
for (naa = 0; naa < adirect; naa++)
{
for (nbb = 0; nbb < tdirect; nbb++)
{
for (ncc = 0; ncc < fs; ncc++)
{
for (ndd = 0; ndd < bs; ndd++)
{
for (o = 0; o < outputnum; o++)
{
lookup->n[j][k][l][m][x][y][xa][xb][nb][na][nx][nx1][naa][nbb][ncc][ndd][o] = -3; //set to default value
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
}
If n is actually a multidimensional array, you can do this:
size_t i;
size_t count = sizeof(lookup->n) / sizeof(int);
int *p = (int*)lookup->n;
for( i = 0; i < count; i++ )
{
p[i] = -3;
}
Now, that's much easier to parallelize.
Read more on why this works here (applies to C as well): How do I use arrays in C++?
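For instance, a minimal OpenMP sketch of that flattened fill (one caveat: Visual Studio's OpenMP 2.0 implementation requires a signed loop counter, so int is used here on the assumption that the element count fits):
int count = (int)(sizeof(lookup->n) / sizeof(int)); /* assumes it fits in int */
int *p = (int*)lookup->n;
#pragma omp parallel for
for( int i = 0; i < count; i++ )
{
    p[i] = -3;
}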
This is more of an extended comment than an answer.
Find the iteration limit (i.e. the variable among box1, box2, etc.) with the largest value. Revise your loop nest so that the outermost loop runs over that. Simply parallelise the outermost loop. Choosing the largest value means that you'll get, in the limit, an equal number of inner-loop iterations to run for each thread.
Collapsing loops, whether you can use OpenMP's collapse clause or have to do it by hand, is only useful when you have reason to believe that parallelising over only the outermost loop will result in significant load imbalance. That seems very unlikely in this case, so distributing the work (approximately) evenly across the available threads at the outermost level would probably provide reasonably good load balancing.
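Concretely, that can be as little as one pragma above the outermost loop (a sketch assuming the variables are as in the question; since the inner indices are declared outside the loops, they must be made private):
#pragma omp parallel for private(k, l, m, x, y, xa, xb, nb, na, nx, nx1, naa, nbb, ncc, ndd, o)
for (j = 0; j < box1; j++)
{
    /* ...the sixteen inner loops exactly as before... */
}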
I believe, based on tertiary research, that the solution might be found in adding #pragma omp parallel for collapse(N) directly above the nested loops. However, this seems to only work in OpenMP v3.0, and the whole project is based on Visual Studio (and therefore, OpenMP v2.0) for now...
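Even under OpenMP 2.0 the collapse can be done by hand, as the previous answer notes, by fusing outer loops into a single index (a sketch; the inner indices still need the private clause from above):
#pragma omp parallel for private(l, m, x, y, xa, xb, nb, na, nx, nx1, naa, nbb, ncc, ndd, o)
for (int jk = 0; jk < box1 * box2; jk++)
{
    int j = jk / box2;
    int k = jk % box2;
    /* ...the remaining inner loops exactly as before, using j and k... */
}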
Given a snippet of code, how do you determine its complexity in general? I find myself getting very confused with Big O questions. For example, a very simple question:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
}
}
The TA explained this with something like combinations. Like this is n choose 2 = (n(n-1))/2 = 0.5n^2 - 0.5n, then remove the constant factors and lower-order terms so it becomes n^2. I can put in test values and try, but how does this combination thing come in?
What if there's an if statement? How is the complexity determined?
for (int i = 0; i < n; i++) {
if (i % 2 ==0) {
for (int j = i; j < n; j++) { ... }
} else {
for (int j = 0; j < i; j++) { ... }
}
}
Then what about recursion ...
int fib(int a, int b, int n) {
if (n == 3) {
return a + b;
} else {
return fib(b, a+b, n-1);
}
}
In general, there is no way to determine the complexity of a given function
Warning! Wall of text incoming!
1. There are very simple algorithms that no one knows whether they even halt or not.
There is no algorithm that can decide whether a given program halts or not, if given a certain input. Calculating the computational complexity is an even harder problem since not only do we need to prove that the algorithm halts but we need to prove how fast it does so.
//The Collatz conjecture states that the sequence generated by the following
// algorithm always reaches 1, for any initial positive integer. It has been
// an open problem for 70+ years now.
function col(n){
if (n == 1){
return 0;
}else if (n % 2 == 0){ //even
return 1 + col(n/2);
}else{ //odd
return 1 + col(3*n + 1);
}
}
2. Some algorithms have weird and off-beat complexities
A general "complexity determining scheme" would easily get too complicated because of these guys
//The Ackermann function. One of the first examples of a non-primitive-recursive algorithm.
function ack(m, n){
if(m == 0){
return n + 1;
}else if( n == 0 ){
return ack(m-1, 1);
}else{
return ack(m-1, ack(m, n-1));
}
}
function f(n){ return ack(n, n); }
//f(1) = 3
//f(2) = 7
//f(3) = 61
//f(4) takes longer than your wildest dreams to terminate.
3. Some functions are very simple but will confuse lots of kinds of static analysis attempts
//McCarthy's 91 function. Try guessing what it does without
// running it or reading the Wikipedia page ;)
function f91(n){
if(n > 100){
return n - 10;
}else{
return f91(f91(n + 11));
}
}
That said, we still need a way to find the complexity of stuff, right? For loops are a simple and common pattern. Take this variation of your initial example, with the inner loop bounded by i:
for(i=0; i<N; i++){
for(j=0; j<i; j++){
print something
}
}
Since each print something is O(1), the time complexity of the algorithm will be determined by how many times we run that line. Well, as your TA mentioned, we do this by looking at the combinations in this case. The inner loop will run (0 + 1 + ... + (N-1)) times, for a total of N*(N-1)/2.
Since we disregard constant factors and lower-order terms, we get O(N^2).
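If you want to convince yourself empirically, a quick counting program (a hypothetical demo) reproduces the formula:
#include <stdio.h>

int main(void)
{
    int N = 10, count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < i; j++)
            count++;                                   /* one unit of work */
    printf("%d == %d\n", count, N * (N - 1) / 2);      /* prints 45 == 45 */
    return 0;
}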
Now for the trickier cases we can get more mathematical. Try to create a function whose value represents how long the algorithm takes to run, given the size N of the input. Often we can construct a recursive version of this function directly from the algorithm itself, and so calculating the complexity becomes the problem of putting bounds on that function. We call this function a recurrence.
For example:
function fib_like(n){
if(n <= 1){
return 17;
}else{
return 42 + fib_like(n-1) + fib_like(n-2);
}
}
it is easy to see that the running time, in terms of n, will be given by
T(n) = 1 if (n <= 1)
T(n) = T(n-1) + T(n-2) otherwise
Well, T(n) is just the good old Fibonacci function. We can use induction to put some bounds on it.
For example, let's prove by induction that T(n) <= 2^n for all n (i.e., T(n) is O(2^n)):
base case: n = 0 or n = 1
T(0) = 1 <= 1 = 2^0
T(1) = 1 <= 2 = 2^1
inductive case (n > 1):
T(n) = T(n-1) + T(n-2)
applying the inductive hypothesis to T(n-1) and T(n-2)...
T(n) <= 2^(n-1) + 2^(n-2)
so...
T(n) <= 2^(n-1) + 2^(n-1)
     <= 2^n
(we can try doing something similar to prove the lower bound too)
In most cases, having a good guess on the final runtime of the function will allow you to easily solve recurrence problems with an induction proof. Of course, this requires you to be able to guess first - only lots of practice can help you here.
And as a final note, I would like to point out the Master theorem, the only commonly used rule for more difficult recurrence problems that comes to mind. Use it when you have to deal with a tricky divide-and-conquer algorithm.
Also, in your "if case" example, I would solve it by cheating and splitting it into two separate loops that don't have an if inside.
for (int i = 0; i < n; i++) {
if (i % 2 ==0) {
for (int j = i; j < n; j++) { ... }
} else {
for (int j = 0; j < i; j++) { ... }
}
}
Has the same runtime as
for (int i = 0; i < n; i += 2) {
for (int j = i; j < n; j++) { ... }
}
for (int i = 1; i < n; i+=2) {
for (int j = 0; j < i; j++) { ... }
}
And each of the two parts can easily be seen to be O(N^2): the first runs roughly n/2 outer iterations with an inner loop averaging n/2 steps, and likewise the second, so each contributes about n^2/4, for a total that is also O(N^2).
Note that I used a nice trick to get rid of the "if" here. There is no general rule for doing so, as shown by the Collatz algorithm example.
In general, deciding algorithm complexity is theoretically impossible.
However, one cool and code-centric method for doing it is to actually just think in terms of programs directly. Take your example:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
}
}
Now we want to analyze its complexity, so let's add a simple counter that counts the number of executions of the inner line:
int counter = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
System.out.println("*");
counter++;
}
}
Because the System.out.println line doesn't really matter, let's remove it:
int counter = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
counter++;
}
}
Now that we have only the counter left, we can obviously simplify the inner loop out:
int counter = 0;
for (int i = 0; i < n; i++) {
counter += n;
}
... because we know that the increment is run exactly n times. And now we see that counter is incremented by n exactly n times, so we simplify this to:
int counter = 0;
counter += n * n;
And we emerged with the (correct) O(n^2) complexity :) It's there in the code :)
Let's look how this works for a recursive Fibonacci calculator:
int fib(int n) {
if (n < 2) return 1;
return fib(n - 1) + fib(n - 2);
}
Change the routine so that it returns the number of iterations spent inside it instead of the actual Fibonacci numbers:
int fib_count(int n) {
if (n < 2) return 1;
return fib_count(n - 1) + fib_count(n - 2);
}
It's still Fibonacci! :) So we know now that the recursive Fibonacci calculator is of complexity O(F(n)) where F is the Fibonacci number itself.
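(For reference, F(n) grows like φ^n, where φ = (1 + √5)/2 ≈ 1.618 is the golden ratio, so this counting argument shows the naive recursive calculator takes exponential time.)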
Ok, let's look at something more interesting, say simple (and inefficient) mergesort:
void mergesort(Array a, int from, int to) {
if (from >= to - 1) return;
int m = (from + to) / 2;
/* Recursively sort halves */
mergesort(a, from, m);
mergesort(a, m, to);
/* Then merge */
Array b = new Array(to - from);
int i = from;
int j = m;
int ptr = 0;
while (i < m || j < to) {
if (i == m || (j < to && a[j] < a[i])) { /* guard j before reading a[j] */
b[ptr] = a[j++];
} else {
b[ptr] = a[i++];
}
ptr++;
}
for (i = from; i < to; i++)
a[i] = b[i - from];
}
Because we are not interested in the actual result but the complexity, we change the routine so that it actually returns the number of units of work carried out:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
/* Recursively sort halves */
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
/* Then merge */
Array b = new Array(to - from);
int i = from;
int j = m;
int ptr = 0;
while (i < m || j < to) {
if (i == m || (j < to && a[j] < a[i])) { /* guard j before reading a[j] */
b[ptr] = a[j++];
} else {
b[ptr] = a[i++];
}
ptr++;
count++;
}
for (i = from; i < to; i++) {
count++;
a[i] = b[i - from];
}
return count;
}
Then we remove those lines that do not actually impact the counts and simplify:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
/* Recursively sort halves */
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
/* Then merge */
count += to - from;
/* Copy the array */
count += to - from;
return count;
}
Still simplifying a bit:
int mergesort(Array a, int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
int count = 0;
count += mergesort(a, from, m);
count += mergesort(a, m, to);
count += (to - from) * 2;
return count;
}
We can now actually dispense with the array:
int mergesort(int from, int to) {
if (from >= to - 1) return 1;
int m = (from + to) / 2;
int count = 0;
count += mergesort(from, m);
count += mergesort(m, to);
count += (to - from) * 2;
return count;
}
We can now see that actually the absolute values of from and to do not matter any more, but only their distance, so we modify this to:
int mergesort(int d) {
if (d <= 1) return 1;
int count = 0;
count += mergesort(d / 2);
count += mergesort(d / 2);
count += d * 2;
return count;
}
And then we get to:
int mergesort(int d) {
if (d <= 1) return 1;
return 2 * mergesort(d / 2) + d * 2;
}
Here obviously d on the first call is the size of the array to be sorted, so you have the recurrence for the complexity M(x) (this is in plain sight on the second line :)
M(x) = 2(M(x/2) + x)
and this you need to solve in order to get a closed-form solution. The easiest way is to guess the solution M(x) = x log x, and verify it for the right side:
2 (x/2 log(x/2) + x)
= x log(x/2) + 2x
= x (log x - log 2 + 2)
= x (log x + C), with C = 2 - log 2
and verify it is asymptotically equivalent to the left side:
x log x + Cx
------------ = 1 + [Cx / (x log x)] = 1 + [C / log x] --> 1 + 0 = 1.
x log x
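If you prefer a numeric check over induction, the recurrence itself is runnable; a tiny hypothetical demo shows M(d)/(d log2 d) settling toward a constant, which is exactly what Θ(d log d) means:
#include <stdio.h>
#include <math.h>

long long M(long long d)
{
    if (d <= 1) return 1;
    return 2 * M(d / 2) + 2 * d;   /* the recurrence from above */
}

int main(void)
{
    /* The printed ratio flattens toward a constant (2 here) as d grows. */
    for (long long d = 16; d <= (1 << 24); d <<= 4)
        printf("%10lld  %.3f\n", d, (double)M(d) / ((double)d * log2((double)d)));
    return 0;
}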
Even though this is an overgeneralization, I like to think of Big-O in terms of lists, where the length of the list is N items.
Thus, if you have a for-loop that iterates over everything in the list, it is O(N). In your code, you have one line that (in isolation, all by itself) is O(N):
for (int i = 0; i < n; i++) {
If you have a for loop nested inside another for loop, and you perform an operation on each item in the list that requires you to look at every item in the list, then you are doing an operation N times for each of N items, thus O(N^2). In your example above, you do in fact have another for loop nested inside your for loop. So you can think of it as if each for loop is O(N), and then, because they are nested, multiply them together for a total value of O(N^2).
Conversely, if you are just doing a quick operation on a single item, then that is O(1). There is no 'list of length n' to go over, just a single one-time operation. To put this in context, in your example above, the operation:
if (i % 2 == 0)
is O(1). What is important isn't the 'if', but the fact that checking whether a single item equals another item is a quick operation on a single item. Like before, the if statement is nested inside your external for loop. However, because it is O(1), you are multiplying everything by '1', and so there is no 'noticeable' effect in your final calculation for the run time of the entire function.
For logs, and dealing with more complex situations (like this business of counting up to j or i, and not just n again), I would point you towards a more elegant explanation here.
I like to use two things for Big-O notation: standard Big-O, which is the worst-case scenario, and average Big-O, which is what normally ends up happening. It also helps me to remember that Big-O notation is trying to approximate run-time as a function of N, the number of inputs.
The TA explained this with something like combinations. Like this is n choose 2 = (n(n-1))/2 = 0.5n^2 - 0.5n, then remove the constant factors and lower-order terms so it becomes n^2. I can put in test values and try, but how does this combination thing come in?
As I said, normal big-O is the worst-case scenario. You can try to count the number of times each line gets executed, but it is simpler to just look at the first example and say that there are two loops over the length of n, one embedded in the other, so it is n * n. If they were one after another, it'd be n + n, equaling 2n. Since it's an approximation, you just say n, or linear.
What if there's an if statement? How is the complexity determined?
This is where, for me, having average case and best case helps a lot in organizing my thoughts. In the worst case, you ignore the if and say n^2. In the average case, for your example, you have a loop over n, with another loop over part of n that happens half of the time. This gives you n * (n/x) / 2 (where x is whatever fraction of n gets looped over in your embedded loops), i.e. n^2/(2x), so you'd get n^2 just the same. This is because it's an approximation.
I know this isn't a complete answer to your question, but hopefully it sheds some kind of light on approximating complexities in code.
As has been said in the answers above mine, it is clearly not possible to determine this for all snippets of code; I just wanted to add the idea of using average case Big-O to the discussion.
For the first snippet, it's just n^2 because you perform n operations n times. If j was initialized to i, or went up to i, the explanation you posted would be more appropriate but as it stands it is not.
For the second snippet, you can easily see that half of the time the first one will be executed, and the second will be executed the other half of the time. Depending on what's in there (hopefully it's dependent on n), you can rewrite the equation as a recursive one.
The recursive equations (including the third snippet) can be written as recurrences; the third one would appear as
T(n) = T(n-1) + 1
Which we can easily see is O(n).
Big-O is just an approximation, it doesn't say how long an algorithm takes to execute, it just says something about how much longer it takes when the size of its input grows.
So if the input is size N and the algorithm evaluates an expression of constant complexity: O(1) N times, the complexity of the algorithm is linear: O(N). If the expression has linear complexity, the algorithm has quadratic complexity: O(N*N).
Some expressions have exponential complexity: O(2^N), or logarithmic complexity: O(log N). For an algorithm with loops and recursion, multiply the complexities of each level of loop and/or recursion; in terms of complexity, looping and recursion are equivalent. For an algorithm that has different complexities at different stages, choose the highest complexity and ignore the rest. And finally, all constant complexities are considered equivalent: O(5) is the same as O(1), and O(5*N) is the same as O(N).
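As a concrete instance of the multiplication rule, an outer O(N) loop around an inner O(log N) loop gives O(N log N); do_work() below stands in for a hypothetical O(1) operation:
for (int i = 0; i < n; i++)           /* O(N) iterations */
    for (int j = 1; j < n; j *= 2)    /* O(log N) iterations */
        do_work();                    /* O(1) each time */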