I have a question regarding this metric.
For example, if I have the following code, how many paths does the function main have?
void main()
{
    int i = 0, j = 0, k = 0;
    for (i=0; i<10; i++)
    {
        for (j=0; j<10; j++)
        {
            for (k=0; k<10; k++)
            {
                if (i < 2 )
                    printf("value is more than 2\n");
                if (i > 5)
                    printf("value is less than 5\n");
            }
        }
    }
}
The three for loops should contribute 3 paths, plus one additional path that bypasses all control flow statements. But what about the nested if statements? Is it 1x1 = 1 path?
For anyone without a MATLAB account, the following description is on the website:
A control flow statement introduces branches and adds to the original one path.
if-else if-else: Each if keyword introduces a new branch. The contribution from an if-else if-else block is the number of branches plus one (the original path). If a catch-all else is present, all paths go through the block; otherwise, one path bypasses the block.
For instance, a function with an if(..) {} else if(..) {} else {} statement has three paths. A function with one if() {} only has two paths, one that goes through the if block and one that bypasses the block.
for and while: Each loop statement introduces a new branch. The contribution from a loop is two - a path that goes through the loop and a path that bypasses the loop.
If more than one control flow statement are present in a sequence without any nesting, the number of paths is the product of the contributions from each control flow statement.
For instance, if a function has three for loops and two if-else blocks, one after another, the number of paths is 2 × 2 × 2 × 2 × 2 = 32.
void func()
{
    int i = 0, j = 0, k = 0;
    for (i=0; i<10; i++)
    {
        for (j=0; j<10; j++)
        {
            for (k=0; k<10; k++)
            {
                if (i < 2 )
                    ;
                else
                {
                    if (i > 5)
                        ;
                    else
                        ;
                }
            }
        }
    }
}
In this example, func has six paths: three from the for statements, two from the if statements plus the original path that bypasses all control flow statements.
According to the description:
If more than one control flow statement are present in a sequence without any nesting, the number of paths is the product of the contributions from each control flow statement.
and
The contribution from an if-else block is the number of branches plus one (the original path)
your two if statements contribute 2 paths each, so that's 2 x 2 = 4 paths, and then each of your loops contributes one additional path, so the total would be 7 paths.
However, this description does not make much sense to me, since this code has only one possible path:
if (true) {
// this part is always executed
}
This may seem obvious, but more complicated situations can easily reduce the number of possible paths, just like in your example, where there are only three possibilities: either i < 2, or i > 5, or i >= 2 && i <= 5.
I suppose the metric's description would make sense as an upper bound on the number of paths.
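To make the "upper bound" point concrete, here is a minimal sketch (the function name count_feasible_outcomes is made up for illustration, it is not part of any tool). It enumerates which combinations of the two conditions i < 2 and i > 5 actually occur. Treating the two if statements as independent gives 2 x 2 = 4 combinations, but only three are reachable, because both conditions can never hold at once.

```c
#include <stdbool.h>

/* Hypothetical helper: counts how many of the four (i < 2, i > 5)
 * truth-value combinations actually occur for i in [0, limit). */
int count_feasible_outcomes(int limit)
{
    bool seen[2][2] = {{false, false}, {false, false}};
    int count = 0;

    for (int i = 0; i < limit; i++) {
        int first = (i < 2);    /* outcome of the first if  */
        int second = (i > 5);   /* outcome of the second if */
        if (!seen[first][second]) {
            seen[first][second] = true;
            count++;
        }
    }
    return count;
}
```

Calling count_feasible_outcomes(10) covers the same range of i as the loop in the question.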
Edit: this code will be run with optimizations off.
Full transparency: this is a homework assignment.
I’m having some trouble figuring out how to optimize this code...
My instructor went over unrolling and splitting but neither seems to greatly reduce the time needed to execute the code. Any help would be appreciated!
for (i = 0; i < N_TIMES; i++) {
// You can change anything between this comment ...
int j;
for (j = 0; j < ARRAY_SIZE; j++) {
sum += array[j];
}
// ... and this one. But your inner loop must do the *same
// number of additions as this one does.
}
Assuming you mean same number of additions to sum at runtime (rather than same number of additions in the source code), unrolling could give you something like:
for (j = 0; j + 5 < ARRAY_SIZE; j += 5) {
sum += array[j] + array[j+1] + array[j+2] + array[j+3] + array[j+4];
}
for (; j < ARRAY_SIZE; j++) {
sum += array[j];
}
Alternatively, since you're adding the same values each time through the outer loop, you don't need to process it N_TIMES times, just do this:
for (i = 0; i < N_TIMES; i++) {
// You can change anything between this comment ...
int j;
for (j = 0; j < ARRAY_SIZE; j++) {
sum += array[j];
}
sum *= N_TIMES;
break;
// ... and this one. But your inner loop must do the *same
// number of additions as this one does.
}
This requires that the initial value of sum is zero, which is likely but there's actually nothing in your question that mandates this, so I include it as a pre-condition for this method.
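If the zero precondition is a concern, one possible variation is to accumulate a single pass into a separate local and then add the scaled result. This is only a sketch: the function name add_repeated and the use of long for the total are my assumptions, and like the version above it deliberately performs fewer additions than the original loop, so it may not satisfy the assignment's rules.

```c
#include <stddef.h>

/* Sketch: adds the array total to *sum `times` times while traversing
 * the array only once. Works for any starting value of *sum.
 * Assumes the products and sums involved do not overflow a long. */
void add_repeated(long *sum, const int *array, size_t size, int times)
{
    long once = 0;
    for (size_t j = 0; j < size; j++) {
        once += array[j];
    }
    *sum += once * times;
}
```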
Except by cheating*, this inner loop is essentially non-optimizable, because you must fetch all the array elements and perform all the additions anyway.
The body of the loop performs:
1. a conditional branch on j;
2. a fetch of array[j];
3. the accumulation to a scalar variable;
4. the incrementation of j.
As said, 2. to 4. are inescapable. Then all you can do is reduce the number of conditional branches by loop unrolling (this turns the conditional branch into an unconditional one, at the expense of the number of iterations becoming fixed).
It is no surprise that you don't see a big difference. Modern processors are "loop aware", meaning that branch prediction is well tuned to such loops so that the cost of the branches is pretty low.
Cheating:
As others said, you can completely bypass the outer loop. This is just exploiting a flaw in the exercise statement.
As optimizations must be turned off, using inline assembly, pragmas, vector instructions or intrinsics should be banned as well (not to mention automatic parallelization).
There is a possibility to pack two ints in a long long. If the sum doesn't overflow, you will perform two additions at a time. But is this legal?
One might think of an access pattern that favors cache utilization. But here there is no hope as the array is fully traversed on every loop and there is no possibility of reuse of the values fetched.
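For illustration only, the "two ints in a long long" idea could look roughly like the sketch below. The function name sum_packed and the switch to unsigned 32-bit elements are my assumptions, made to keep the bit manipulation well defined. Building each pair costs a shift and an OR, so this shows the principle rather than a guaranteed speed-up.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch: adds the values two at a time by packing a pair into one
 * 64-bit word, so one 64-bit addition updates two 32-bit lanes.
 * Assumption: the sum of the even-indexed elements and the sum of the
 * odd-indexed elements each fit in 32 bits; otherwise the low lane
 * carries into the high lane and the result is wrong. */
uint64_t sum_packed(const uint32_t *array, size_t n)
{
    uint64_t lanes = 0;
    size_t j = 0;

    for (; j + 1 < n; j += 2) {
        uint64_t pair = ((uint64_t)array[j + 1] << 32) | array[j];
        lanes += pair;                  /* two additions in one */
    }

    /* Combine the two lanes, then add a leftover element if n is odd. */
    uint64_t total = (lanes & 0xFFFFFFFFu) + (lanes >> 32);
    if (j < n) {
        total += array[j];
    }
    return total;
}
```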
First of all, unless you are explicitly compiling with -O0, your compiler has likely already optimized this loop much further than you could possibly expect, including unrolling, and on top of unrolling also vectorization and more. Trying to optimize this by hand is something you should never, absolutely never do. At most you will successfully make the code harder to read and understand, while most likely not even being able to match the compiler in terms of performance.
As to why there is no measurable gain? Possibly because you already hit a bottleneck, even with the "non optimized" version. For ARRAY_SIZE greater than your processor's cache, even the compiler-optimized version is already limited by memory bandwidth.
But for completeness, let's just assume you have not hit that bottleneck, and that you actually had turned optimizations almost off (so no more than -O1), and optimize for that.
for (i = 0; i < N_TIMES; i++) {
    // You can change anything between this comment ...
    int j;
    int tmpSum[4] = {0,0,0,0};
    // Four independent accumulators; only run while a full group of
    // four elements remains, so we never read past the end of array.
    for (j = 0; j + 3 < ARRAY_SIZE; j+=4) {
        tmpSum[0] += array[j+0];
        tmpSum[1] += array[j+1];
        tmpSum[2] += array[j+2];
        tmpSum[3] += array[j+3];
    }
    sum += tmpSum[0] + tmpSum[1] + tmpSum[2] + tmpSum[3];
    // Add the leftover elements when ARRAY_SIZE is not a multiple of 4.
    for (; j < ARRAY_SIZE; j++) {
        sum += array[j];
    }
    // ... and this one. But your inner loop must do the *same
    // number of additions as this one does.
}
There is pretty much only one factor left which still could have reduced the performance, for a smaller array.
It is not the overhead of the loop, so plain unrolling would have been pointless with a modern processor. Don't even bother, you won't beat the branch prediction.
But the latency between two instructions, until a value written by one instruction may be read again by the next instruction, still applies. In this case, sum is constantly written and read all over again, and even if sum is cached in a register, this delay still applies and the processor's pipeline has to wait.
The way around that is to have multiple independent additions going on simultaneously, and finally just combine the results. This is, by the way, also an optimization which most modern compilers know how to perform.
On top of that, you could now also express the first loop with vector instructions - once again also something the compiler would have done. At this point you are running into instruction latency again, so you will likely have to introduce one more set of temporaries, so that you now have two independent addition streams each using vector instructions.
Why the requirement of at least -O1? Because otherwise the compiler won't even place tmpSum in a register, or will try to express e.g. array[j+0] as a sequence of instructions for performing the addition first, rather than just using a single instruction for that. Hardly possible to optimize in that case, without using inline assembly directly.
Or if you just feel like (legit) cheating:
const int N_TIMES = 1000;
const int ARRAY_SIZE = 1024;
const int array[1024] = {1};
int sum = 0;
__attribute__((optimize("O3")))
__attribute__((optimize("unroll-loops")))
int fastSum(const int array[]) {
    int j;
    int tmpSum = 0; // must start at zero; an uninitialized local holds garbage
    for (j = 0; j < ARRAY_SIZE; j++) {
        tmpSum += array[j];
    }
    return tmpSum;
}
int main() {
    int i;
    for (i = 0; i < N_TIMES; i++) {
        // You can change anything between this comment ...
        sum += fastSum(array);
        // ... and this one. But your inner loop must do the *same
        // number of additions as this one does.
    }
    return sum;
}
The compiler will then apply pretty much all the optimizations described above.
I am unable to understand how this code works. Can somebody explain it to me?
#include <stdio.h>
int main () {
/* local variable definition */
int i, j;
for(i = 2; i<100; i++) {
for(j = 2; j <= (i/j); j++) {
if(!(i%j)) break; // if factor found, not prime
}
if(j > (i/j)) printf("%d is prime", i);
}
return 0;
}
1. #include <stdio.h> is a header that defines three variable types, several macros, and various functions for performing input and output. In other words, it's basically a part of the C standard library being included to add some externally defined declarations, besides the code below, like the size_t type, which is the type of the result of the sizeof operator. That's just one example of what the stdio.h header does, but you can see more info here: https://www.tutorialspoint.com/c_standard_library/stdio_h.htm
2. int main() is a function that returns an integer (int). The empty parentheses are an old declaration style, meaning the function takes an unspecified number of arguments; in modern C, int main(void) is the preferred way to write a main function that takes no arguments. main is the function where program execution starts.
Next, the curly braces are what contain all the logic inside of the int main() function. Then inside of it, on the line int i, j; , two local variables are declared (i and j) to be later used as placeholders for some integers that will be plugged into the function.
Below that, for(i = 2; i<100; i++) indicates there is a loop that sets the i variable to 2, then after the semi-colon i<100 means that the loop will continue to execute again and again as long as the variable i is less than 100. After yet another semi-colon, i++ means that each time that the loop runs, the variable i will increment by 1. So it starts at 2, then 3, then 4, etc, until i reaches 100 and the loop stops executing.
Next, for(j = 2; j <= (i/j); j++) is another loop inside of the first loop, but this time the loop uses the variable j as its counter instead of i (the variable used by the surrounding loop). This loop also sets j to 2 (the same way the surrounding loop set i to 2); it continues to execute as long as j is less than or equal to i divided by j; and j increases by one each time the loop runs, the same way that i does in the surrounding loop.
if(!(i%j)) break; // if factor found, not prime this line means that the inner loop will stop executing (break) if the remainder of i divided by j equals zero, that is, if j is a factor of i.
if(j > (i/j)) printf("%d is prime", i); This line means that if j is greater than i divided by j (in other words, the inner loop ran to completion without finding a factor), the program writes the text to stdout (stdout is the standard output stream, a pointer to a FILE that represents the default output device for the application).
Lastly, the return 0; line indicates a return from the function, and the final curly brace closes the function's body. The main function should return 0 (also EXIT_SUCCESS) to indicate that the program has executed successfully, and a nonzero value such as EXIT_FAILURE otherwise.
Additional Note - Loops in every programming language I've seen personally tend to have a few things in common:
i. An init counter, a value where the loop will initialize (start counting), inside the loop's parentheses and before the first semi-colon.
ii. A test counter, which will be evaluated each time that the loop continues, and if it evaluates to TRUE the loop will continue usually but if it evaluates to false then the loop will end. This is the part of the loop after the first semi-colon but before the second semi-colon.
iii. An increment/decrement counter, which increases or decreases the loop by some value each time that the loop is run. This is the part of the loop inside the parentheses, after the second semi-colon. If there is no increment counter or test counter that causes the loop to exit/break at some point, then this is known as an infinite loop. This is a very bad thing in programming because it will cause just about any computer program to crash since it will execute and consume computing resources indefinitely. Not good :)
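As a small illustration of the three parts listed above, here is a sketch of the same loop written once as a for loop and once as a while loop, with the init, test and increment steps marked. The function names are made up for this example.

```c
/* The three parts of a for loop sit inside its parentheses. */
int sum_with_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {   /* init; test; increment */
        total += i;
    }
    return total;
}

/* The same loop as a while loop, with the three parts spread out. */
int sum_with_while(int n)
{
    int total = 0;
    int i = 0;              /* init */
    while (i < n) {         /* test */
        total += i;
        i++;                /* increment */
    }
    return total;
}
```

Both functions add up the integers from 0 to n - 1; removing the i++ from the while version would turn it into an infinite loop for any n greater than 0.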
Disclaimer: I don't actually code in C but the language has so many similarities with programming languages I do use, that I'm guessing this answer is very close if not 100% correct. Curious to hear some input from an expert C programmer though!
Your code is looping over all integers from 2 to 99, holding the actual number in i.
for(i = 2; i<100; i++)
Then, for every number i, the code is looping again over all integers from 2 to (i/j).
for(j = 2; j <= (i/j); j++)
Your loop's finishing condition is mathematically equivalent to j being at most the square root of i. Checking larger values is unnecessary, since any factor of i larger than its square root is paired with a cofactor smaller than the square root, which would already have been found.
(To check this, take a piece of paper and rewrite the inequality so that i is the sole part of the right-hand side of your condition.)
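If you would rather let the computer confirm the rewrite, a brute-force check is one option. The sketch below (bounds_agree is a name I made up) compares the condition from the loop, j <= i / j with integer division, against j * j <= i for every pair in a range.

```c
/* Returns 1 if `j <= i / j` and `j * j <= i` agree for every
 * 2 <= i < limit and 1 <= j <= i, otherwise 0.
 * Assumes limit is small enough that j * j does not overflow an int. */
int bounds_agree(int limit)
{
    for (int i = 2; i < limit; i++) {
        for (int j = 1; j <= i; j++) {
            int by_division = (j <= i / j);
            int by_square = (j * j <= i);
            if (by_division != by_square) {
                return 0;
            }
        }
    }
    return 1;
}
```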
Then it checks whether or not j divides i.
(i%j)
If j is a factor of i, then i modulo j is zero and hence
if (!(i%j))
evaluates to true (since 0 is evaluated as false and ! negates this) and you can break out of the loop since i has a divisor other than 1 or i, hence i is not a prime.
On the other hand, if the inner loop finishes without a break, you have found a prime, since it has only 1 and i as divisors.
Needless to say, this brute-force approach is very slow for large integers (you won't crack RSA with that), but does this clarify your questions?
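The same logic can be packaged as a small function, which may make the two exit conditions easier to see. This is a sketch with a name of my choosing (is_prime), not a change to your program.

```c
/* Sketch of the same trial-division test as a reusable function. */
int is_prime(int n)
{
    if (n < 2) {
        return 0;                   /* 0, 1 and negatives are not prime */
    }
    for (int j = 2; j <= n / j; j++) {
        if (n % j == 0) {
            return 0;               /* factor found, not prime */
        }
    }
    return 1;                       /* no factor up to the square root */
}
```

Returning from inside the loop plays the role of the break, and reaching the final return corresponds to the j > (i/j) test in the original code.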
#include <stdio.h>
int main () {
/* local variable definition */
int i, j;
// Loop from 2 to 99; i will hold the number we are checking to
// see if it is prime
for(i = 2; i<100; i++) {
// now loop through the numbers checking each one to see if
// it is a factor of i (if it is, then i isn't prime). This
// loop stops when j^2 is greater than the number we are
// checking
for(j = 2; j <= (i/j); j++) {
// i % j (i modulus j) is 0 iff j is a factor of i. This
// if test relies on the fact that 0 is false in C (and
// all nonzero values are true)
if(!(i%j)) break; // if factor found, not prime
}
// this tests to see if we exited the above loop by failing
// the test in the for() statement, or whether we exited the
// loop via the break statement. If we made it through all
// iterations of the loop, then we found no factors, and the
// number is prime.
//
// note that a \n is missing at the end of the printf format
// string. The output will be "2 is prime3 is prime5..."
if(j > (i/j)) printf("%d is prime", i);
}
// returns from main() with a value of 0, which will result in
// the program exiting with an exit code of 0. An explicit
// exit(0) is better form here, but this is not incorrect.
return 0;
}
I have a program that defines a matrix Pixel, a function iBlend (which I can't see or change) that operates on two cells in Pixel returning a float, and actually just contains the following:
float RowResult[columns];
#pragma omp parallel for
for (int i = 0; i < rows; i++) {
float sum = 0;
RowResult[0] = 0;
for (int j = 1; j < columns; j++) {
RowResult[j] = iBlend(Pixel[i],Pixel[j],RowResult[j-1]);
sum += RowResult[j];
}
printf("Row %d scored a total of: %f\n", i, sum);
if (sum > 100) {
for (int j = 0; j < columns; j++) printf("%f,",RowResult[j]);
printf("\n");
}
}
This code goes through a matrix, a row at a time, doing some fancy-pants math with the indices i and j.
Naturally, this leads to gibberish when I try to run this with more than 1 thread because threads are calling printf on a first-come, first-serve basis.
I tried adding a simple #pragma omp single to the second part of the function (creating a block from the first printf until the second-to-last line), but I get the classic error about not being able to nest omp blocks:
> error: work-sharing region may not be closely nested inside of
> work-sharing, ‘critical’, ‘ordered’, ‘master’, explicit ‘task’ or
> ‘taskloop’ region
> #pragma omp single
> ^~~
A few solutions came to mind. First was "just define a big enough matrix to hold all the answers and then loop over them at the end with one thread." But this way I actually run out of memory -- it turns out only a few rows really get printed, so doing it on the fly is totally essential. I also can't change the format of the output, although which row is printed in which order doesn't matter, as long as it's printed row-wise and the pixels are in order per row.
Another solution I thought about was "just flip the two loops and have the omp loop go through the pixels in the other direction, and then when that parallel loop is done I'm in single-threaded land again!" -- but turns out the rows need to have their pixels calculated in order because there's a carried dependency (iBlend needs the result of a previous computation).
At first I thought this was a simple problem, so maybe I'm just missing something obvious. I just want to exit "multithread" mode when I'm printing. Maybe something like "add this result to the stack of answers (treat adding to stack as 'atomic' with OpenMP?)" and then "if I'm the master thread, print everything on the stack so far in order"? But that'd be slow and a little tricky to implement. I'm sure there's an easier way. ...Right?
Thanks for any help.
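One direction that might be worth trying, sketched below under several assumptions: give every iteration its own row buffer by declaring it inside the loop, and wrap only the printing in a critical section, which (unlike single) is allowed inside a parallel for. The function blend_stub is a hypothetical stand-in for iBlend, whose real definition isn't available, and score_rows, pixel and threshold are names invented for this sketch.

```c
#include <stdio.h>

/* Hypothetical stand-in for iBlend; the real function is not shown. */
static float blend_stub(float a, float b, float previous)
{
    return a + b + 0.5f * previous;
}

/* Returns the number of rows whose total exceeded `threshold`.
 * Assumes columns >= 1 and that `pixel` has at least
 * max(rows, columns) elements. */
int score_rows(const float *pixel, int rows, int columns, float threshold)
{
    int printed = 0;

    #pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        /* Declared inside the loop body, so each iteration (and
         * therefore each thread) works on its own copy. */
        float row_result[columns];
        float sum = 0;
        row_result[0] = 0;
        for (int j = 1; j < columns; j++) {
            row_result[j] = blend_stub(pixel[i], pixel[j], row_result[j - 1]);
            sum += row_result[j];
        }

        /* Only one thread at a time runs this block, so the lines of
         * one row are never interleaved with those of another row. */
        #pragma omp critical
        {
            printf("Row %d scored a total of: %f\n", i, sum);
            if (sum > threshold) {
                for (int j = 0; j < columns; j++) printf("%f,", row_result[j]);
                printf("\n");
                printed++;
            }
        }
    }
    return printed;
}
```

Rows may still come out in any order, which the question says is acceptable. For very wide rows the per-thread buffer might need to live on the heap instead of the stack.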
Apropos the question "Why does using the same count variable name in nested FOR loops work?" already posted in this forum, a count variable "i" defined in each nested loop should be considered a new variable whose scope is limited to that loop only. And we should expect that variable's value to be erased and overridden by the value of "i" which was in the outer loop (before control passed to the inner loop). But in my following code, when control comes out of the inner loop, instead of the variable "i" having the value 0 (which was its value in the first iteration of the outer loop, before control passed to the inner loop), it continues to have the value 10 (which it got in the last iteration of the inner loop). Then this 10 is incremented to 11, and hence the condition of the outer loop is not satisfied and the outer loop exits.
I had expected my program to print the numbers 0 to 9 horizontally 10 times, on 10 different lines. But it prints just one line and exits.
And here's another thing: if I use any number greater than 10 in the outer loop condition (for example i<=11), then it creates an infinite loop. According to my reasoning, this happens because i gets a value of 11 after the first iteration of the outer loop, and hence if the condition is <=11 or more, the outer loop goes into another iteration. Whereupon i is again initialized to 0 in the inner loop and the cycle continues.
Sorry if I couldn't put my question very clearly. All I want to ask is, isn't the inner i supposed to be a different variable, if we are to assume the linked question on this forum is correct? Why then does the value of i persist after we exit the inner loop, instead of reverting to the value of i that was there when we entered the inner loop?
#include <stdio.h>
int main()
{
int i;
//for (i = 0; i <= 11; i++) Creates infinite loop if this condition is used instead
for (i = 0; i <= 9; i++)
{
for (i = 0; i <= 9; i++)
{
printf("%d ", i);
}
printf("\n");
}
}
OUTPUT : 0 1 2 3 4 5 6 7 8 9
PS: As a secondary question, is it impossible to print the numbers 0 to 9 horizontally, on 10 different lines, using nested for loops if we use the same count variable in each loop, as I have done here? (Ignore this secondary question if it's not relevant to the main question.)
The answer you linked to is using different variables with the same name; you're simply using the same variable.
Compare:
for(int i = 0; ...
to:
for(i = 0; ...
The former declares a new variable called i, which is how you nest loops like the linked-to answer. Not that I would ever (ever!) recommend doing so.
As you've noticed, support for the former syntax wasn't added to C until C99.
If i were defined in each loop then the behaviour would be as your linked question. In your example you only define i once, outside any loop, then reuse it
int i;
for(i=0; i<=9; i++)
{
for(i=0; i<=9; i++)
{
is not the same as
for(int i=0; i<=9; i++)
{
for(int i=0; i<=9; i++)
{
If you want each for loop to have its own i, you need to create i individually for each. As-is, you have exactly one i that's defined outside both loops, so the modifications done by one loop affect the value seen by the other.
int i;
for (i=0; i<10; i++) {
int i; /* define another i for the inner loop */
for (i=0; i<10; i++)
printf("%d\t", i);
printf("\n");
}
Note that I'd generally recommend against this -- while the compiler has no problem at all with having two variables with the same name at different scopes, code like this where it's not immediately obvious what i is being referred to when may well confuse people reading the code.
> All I want to ask is, isn't the inner i supposed to be a different variable
No, there is only one declaration, so there is only one variable.
Sorry - late response, but couldn't help but notice:
The reason it "works" is that the inner loop resets i to zero, prints & increments i, and returns to the outer loop - at which point i>9 so the outer loop exits.
There is only one iteration of the outer loop and the values printed are entirely determined by the inner loop. It's not a question of scope, it's the fact you reassign new values to i in the inner loop.
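A quick way to see this is to count the outer iterations. The sketch below uses two made-up functions: the first shares one i between both loops, as in the question, and the second declares a separate i for the inner loop (which needs C99 or later).

```c
/* Counts outer-loop iterations when both loops share one variable. */
int outer_iterations_shared(void)
{
    int i;
    int outer = 0;
    for (i = 0; i <= 9; i++) {
        outer++;
        for (i = 0; i <= 9; i++) {
            /* body omitted */
        }
        /* i is 10 here; the outer i++ makes it 11, so i <= 9 fails */
    }
    return outer;
}

/* Same loops, but the inner loop declares its own i. */
int outer_iterations_separate(void)
{
    int outer = 0;
    for (int i = 0; i <= 9; i++) {
        outer++;
        for (int i = 0; i <= 9; i++) {
            /* this i shadows the outer one and vanishes after the loop */
        }
    }
    return outer;
}
```

The shared version runs the outer body once; the version with separate variables runs it ten times.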
PS: As a secondary question, is it impossible to print the number 0 to 9 horizontally, in 10 different lines, using nested for loop if we use the same count variable in each loop, as I have done here? (Ignore this secondary question if it's not relevant to main question)
Of course you can, but you need a more complex format string for your printf.
printf("%d ",i);
The above statement works by printing i, immediately followed by a space, and leaving the cursor where the output stops.
The effect that I think you have in mind is something like the following.
0
 1
  2
   3
    4
     5
      6
       7
        8
         9
To make that happen, you need a couple of changes to your printf statement, as illustrated in the following complete program.
// MarchingDigits.cpp : Defines the entry point for the console application.
//
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char* argv[])
{
for ( int i = 0 ; i < 10 ; i++ )
{
printf ( "%*d\n", ( i > 0 ? i + 1 : i ) , i ) ;
} // for ( int i = 0 ; i < 10 ; i++ )
return 0;
} // int main
The output generated by this program is as follows.
0
 1
  2
   3
    4
     5
      6
       7
        8
         9
There are three fundamental differences between your printf statement and the one that generated this output.
1. Between the opening % and the closing d, I inserted an asterisk where the width goes.
2. I replaced the trailing space with a newline.
3. Between the format string and your integer argument i, I inserted another argument, in the form of a ternary expression, i > 0 ? i + 1 : i, which says, in effect, if i is greater than zero, set the width to i + 1, otherwise set the width to i. Although the else branch sets the width to i, which happens to be zero, this works because printf never truncates a number to fit the field width.
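To experiment with the width argument in isolation, a sketch like the following may help. The helper name format_step is my own, and it uses snprintf so the result can be inspected as a string. It also passes i + 1 as the width for every i, which for the digits 0 to 9 should give the same text as the ternary expression, since a width of 1 already fits the single digit 0.

```c
#include <stdio.h>

/* Formats i right-aligned in a field of width i + 1, so the digit
 * is preceded by i spaces. Returns what snprintf returns: the number
 * of characters the full result needs, not counting the terminator. */
int format_step(char *buffer, size_t size, int i)
{
    return snprintf(buffer, size, "%*d", i + 1, i);
}
```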
This is a homework problem. I'm currently in the process of writing a program that calculates one's bowling score. My logic is to use a multidimensional array with 9 frames as the rows and 2 throws as the columns. I will account for the 10th frame at the very end due to its unique nature of possibly having up to three rolls. The addition of strikes and spares also complicates the question, so I felt like I needed to use arrays to keep track of past and future rolls. I know it's not required, but I couldn't think of another way.
Here is my code so far:
for (i=0; i<9; i++)
{
for (j=0; j<2; j++)
{
scanf("%d", &score[i][j]);
tempTotal += score[i][j];
}
}
printf("temporary: %d\n", tempTotal);
//strike
for (i=0; i<9; i++)
{
for (j=0; j<2; j++)
{
if(score[i][0] == 10 && score[i][1] == 0)
{
total = tempTotal + score[i+1][0] + score[i+1][1];
}
//spare
else if ((score[i][0] + score[i][1]) == 10)
{
total = tempTotal + score[i+1][0];
}
else
{
total = tempTotal;
}
}
}
printf("result: %d\n", total);
Here is an update of my progress. As per the comments, I used scanf to fill up my multidimensional array first. And then I looked through the array to account for the cases of strikes and spares. The code works until there are consecutive strikes and/or spares. How do I account for the accumulation of consecutive spares and/or strikes? Am I supposed to add additional conditional statements within my strike and spare conditions to account for if score[i+1][0] is another strike in which case I have to account for that additional 10 points plus the next three throws as well? How can I make such a conditional statement that comprehensively searches through the entire array like that? Any ideas would be appreciated. Thank you.
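One idea I am considering for the consecutive-strike case, in case it helps frame the discussion: store the rolls in a flat one-dimensional list (a strike takes one slot instead of a 10 followed by a 0) and walk through it frame by frame, so the bonus is always just the next one or two rolls, whatever they happen to be. The sketch below is only that, a sketch; the function name bowling_score and the flat representation are assumptions that differ from my 2-D array above, and it assumes a complete, valid game.

```c
/* Sketch: scores a complete, valid ten-pin game from a flat list of
 * rolls. A strike occupies a single entry, so bonus rolls are simply
 * the next entries in the list, even when they are strikes too. */
int bowling_score(const int *rolls)
{
    int total = 0;
    int r = 0;      /* index of the first roll of the current frame */

    for (int frame = 0; frame < 10; frame++) {
        if (rolls[r] == 10) {                          /* strike */
            total += 10 + rolls[r + 1] + rolls[r + 2];
            r += 1;
        } else if (rolls[r] + rolls[r + 1] == 10) {    /* spare */
            total += 10 + rolls[r + 2];
            r += 2;
        } else {                                       /* open frame */
            total += rolls[r] + rolls[r + 1];
            r += 2;
        }
    }
    return total;
}
```

With this layout the tenth frame needs no special case: its extra rolls are just the last entries of the list and are only ever read as bonuses.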