Function optimized to infinite loop at 'gcc -O2' - c

Context
I was asked the following puzzle by one of my friends:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
/* write something before this comment */
}
int main()
{
int i = 5;
fn();
printf("%d\n", i);
return 0;
}
I know there can be multiple solutions, some involving macros and some assuming something about the implementation and violating the C standard.
One particular solution I was interested in is to make certain assumptions about the stack and write the following code (I understand it is undefined behavior, but it may work as expected on many implementations):
void fn(void)
{
/* write something after this comment so that the program output is 10 */
int a[1] = {0};
int j = 0;
while(a[j] != 5) ++j; /* Search stack until you find 5 */
a[j] = 10; /* Overwrite it with 10 */
/* write something before this comment */
}
Problem
This program worked fine in MSVC and gcc without optimization. But when I compiled it with gcc -O2 flag or tried on ideone, it loops infinitely in function fn.
My Observation
When I compiled the file with gcc -S and with gcc -S -O2 and compared the results, the optimized output clearly showed that gcc had kept only an infinite loop in function fn.
Question
I understand that because the code invokes undefined behavior, one cannot call it a bug. But why and how does the compiler analyze the behavior and leave an infinite loop at -O2?
Several commenters asked what happens if some of the variables are made volatile. The results, as expected, are:
If i or j is changed to volatile, the program behavior remains the same.
If array a is made volatile, the program does not suffer from the infinite loop.
Moreover if I apply the following patch
- int a[1] = {0};
+ int aa[1] = {0};
+ int *a = aa;
the program behavior remains the same (infinite loop).
If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:
;; Function fn (fn) (executed once)
Removing basic block 3
fn ()
{
<bb 2>:
<bb 3>:
goto <bb 3>;
}
;; Function main (main) (executed once)
main ()
{
<bb 2>:
fn ();
}
Invalid sum of incoming frequencies 0, should be 10000
This verifies the assertions made after the answers below.

This is undefined behavior, so the compiler can really do anything at all. We can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:
.L2:
jmp .L2
The article says (emphasis mine):
Of course this is an infinite loop. Since SATD() unconditionally
executes undefined behavior (it’s a type 3 function), any
translation (or none at all) is perfectly acceptable behavior for a
correct C compiler. The undefined behavior is accessing d[16] just
before exiting the loop. In C99 it is legal to create a pointer to
an element one position past the end of the array, but that pointer
must not be dereferenced. Similarly, the array cell one element past
the end of the array must not be accessed.
If we examine your program with godbolt, we see the same result:
fn:
.L2:
jmp .L2
The logic being used by the optimizer probably goes something like this:
All the elements of a are initialized to zero
a is never modified before or within the loop
So a[j] != 5 is always true -> infinite loop
Because of the infinite loop, the a[j] = 10; is unreachable and so can be optimized away; so can a and j, since they are no longer needed to determine the loop condition.
This is similar to the case in the article, which, given:
int d[16];
analyzes the following loop:
for (dd=d[k=0]; k<16; dd=d[++k])
like this:
upon seeing d[++k], is permitted to assume that the incremented value
of k is within the array bounds, since otherwise undefined behavior
occurs. For the code here, GCC can infer that k is in the range 0..15.
A bit later, when GCC sees k<16, it says to itself: “Aha– that
expression is always true, so we have an infinite loop.”
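To make the quoted analysis concrete: in the original loop, dd=d[++k] runs in the increment step, before k<16 is tested, so the final pass reads d[16]. The sketch below shows one well-defined way to visit the same 16 elements. The function name sum16 and the summing body are assumptions made for illustration; they are not taken from the benchmark.

```c
#include <stddef.h>

/* Sums the 16 elements of d without ever touching d[16].
   (sum16 is a hypothetical helper, not code from the article.) */
int sum16(const int d[16])
{
    int total = 0;
    for (size_t k = 0; k < 16; k++) {
        int dd = d[k];   /* the read happens only after k < 16 was checked */
        total += dd;
    }
    return total;
}
```

Because the element is read inside the body rather than in the increment expression, no index outside 0..15 is ever formed, and the compiler has nothing undefined to reason from.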
Perhaps an interesting secondary point is whether an infinite loop is considered observable behavior (w.r.t. the as-if rule) or not, which affects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:
Many knowledgeable people (including me) read this as saying that the
termination behavior of a program must not be changed. Obviously some
compiler writers disagree, or else don’t believe that it matters. The
fact that reasonable people disagree on the interpretation would seem
to indicate that the C standard is flawed.
C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.

In the optimized version, the compiler has decided a few things:
The array a doesn't change before that test.
The array a doesn't contain a 5.
Therefore, we can rewrite the code as:
void fn(void) {
int a[1] = {0};
int j = 0;
while(true) ++j;
a[j] = 10;
}
Now, we can make further decisions:
All the code after the while loop is dead code (unreachable).
j is written but never read. So we can get rid of it.
a is never read.
At this point, your code has been reduced to:
void fn(void) {
int a[1] = {0};
while(true);
}
And we can make the note that a is now never read, so let's get rid of it as well:
void fn(void) {
while(true);
}
Now, the unoptimized code:
In unoptimized generated code, the array will remain in memory, and you'll literally walk it at runtime. It's possible that there will be a 5 that's readable after it once you walk past the end of the array.
Which is why the unoptimized version sometimes doesn't crash and burn.
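For contrast, a well-defined version of the search has to stop at the end of the array. The sketch below uses a hypothetical helper, find_value, that is not part of the question; with a[1] = {0} it can never report a 5, which is the same fact the optimizer relies on when it drops the loop exit.

```c
#include <stddef.h>

/* Returns the index of the first element equal to value, or -1 if absent.
   (find_value is an illustrative name, not from the original code.) */
long find_value(const int *a, size_t len, int value)
{
    for (size_t j = 0; j < len; j++) {   /* never reads past a[len - 1] */
        if (a[j] == value)
            return (long)j;
    }
    return -1;
}
```

Calling find_value(a, 1, 5) on the question's array returns -1, so a conforming program has no way to reach the a[j] = 10 assignment.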

If the loop does get optimized into an infinite loop, it could be due to static code analysis seeing that your array is
not volatile
contains only 0
never gets written to
and thus it is not possible for it to contain the number 5. Which means an infinite loop.
Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.
As a side note, I bet what your friend actually expected was this:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
while(1); /* Endless loop, function won't return, i won't be output */
/* write something before this comment */
}
or this (if stdlib.h is included):
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
exit(0); /* Exit gracefully */
/* write something before this comment */
}

Related

Why does it take a little longer to display 'Hello' after a for loop that precedes the print statement?

#include <stdio.h>
int main(void) {
// single-line for-loop
for (int i = 0; i < 5; i--);
// delays to execute this syntax
printf("Hello\n");
return 0;
}
Why does it take around 10 seconds to print Hello in console after running it?
Note that the semicolon at the end of the for loop is intentional.
I've figured out how and why this happened.
It's because the range of the int data type is from -2,147,483,648 to 2,147,483,647. In this for loop, i counts down from 0 to -2,147,483,648 (because of i--). The next decrement overflows, and on my machine the value of i wraps around to 2,147,483,647, which makes the condition i < 5 false, so the loop stops. After that, the next statement, printf("Hello\n"), prints "Hello".
The loop iterates about 2 billion times, which takes about 10 seconds on my machine, and only then does the program reach the printf call.
int i;
for(i=0;i<5;i--);
We don't know anything about your C implementation.
For more about the C language, see this reference and later the C11 standard n1570. Read also Modern C
My guess would be that you use a recent GCC compiler on an x86-64 computer. I recommend reading the documentation of your optimizing compiler; for GCC it is here. You may also need to read the documentation of your linker, so (on my Linux computer) of binutils.
If indeed that is the case, I recommend enabling all warnings and debug info, so compile your code with gcc -Wall -Wextra -g. You are likely to get some warnings.
You are decrementing i. Assume that int is 32 bits wide. Then on the first iteration i is 0, on the second iteration i is -1, and so on. Your computer will probably loop about 2^31 times, so about two billion iterations.
Computers are fast, but not infinitely fast.
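The iteration count can be worked out without running the loop. The sketch below does the arithmetic in long long so that nothing overflows; the function name is hypothetical. Note that decrementing an int past INT_MIN is signed overflow, which is undefined behavior in C, so the wrap to a large positive value is what commonly happens in unoptimized builds rather than something the language guarantees.

```c
#include <limits.h>

/* Number of times the body of `for (i = 0; i < 5; i--)` runs before i would
   have to drop below INT_MIN. The name is illustrative only. */
long long iterations_before_wrap(void)
{
    /* i takes the values 0, -1, ..., INT_MIN: that is -INT_MIN + 1 values. */
    return 0LL - (long long)INT_MIN + 1;
}
```

With a 32-bit int this gives 2,147,483,649 iterations, which matches the "about two billion" estimate above.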
My recommendation: learn to use a debugger, such as GDB.
Finally, <stdio.h> gives buffered input and output. So learn to use fflush(3) and of course read the documentation of printf(3).
for(i=0;i<5;i--);
Well I don't know how you got Hello even after 10 seconds! Take a closer look at the for loop. It actually never ends because the i value is constantly decreasing and hence it will always be less than 5. So I suppose the process was terminated.
Now don't think that just because you added a semicolon after the for loop the loop won't run. It will run. Looping will take place; it's just that there is no code to execute. The semicolon is treated as a null statement by the compiler. Once the condition evaluates to false, execution moves on to the next statement.
Please check whether the code is correct or not. I highly doubt that it is ++i and not --i
The decrement (i.e. i--) would run forever, since i < 5 never becomes false on its own. But as soon as i reaches the minimum value an int can hold (i.e. -2,147,483,648), the next decrement overflows, and in practice the loop quits right there, after roughly 2,147,483,648 iterations.
If you write something like this:
int i;
long long j = 0; // records the last value of i for which the body ran;
// wider than int, so it can hold any value i takes
for(i = 0; i < 5; i--)
j = i;
printf("%lld\n", j); // %lld matches long long
Then you'll see in practice at which iteration the loop quits, and why it takes a while to print Hello after that loop.

How can I fix the 'end of non-void function' for my minimum coins owed in change?

The problem is to find the minimum number of coins owed given an amount of dollars in change, assuming that available coins to give back are 25c, 10c, 5c and 1c.
I implemented a solution with recursion in C, but somehow it kept throwing "error: control may reach end of non-void function". I'm fairly new to C, so I couldn't quite figure out what's going on.
Any help is greatly appreciated! Here's my code:
#include <cs50.h>
#include <stdio.h>
#include <math.h>
int processChange(float change){
int centsChange = round(change*100);
int arr[4] = {25,10,5,1};
for(int i=0;i<4;i++){
int numCoins =0;
int remainder = centsChange%arr[i];
if(remainder==0){
numCoins = (centsChange - remainder)/arr[i];
return numCoins;
}
if(centsChange ==1){return 1;}//base case
if(centsChange>=arr[i]){
numCoins = (centsChange - remainder)/arr[i]+ processChange(remainder/100);
return numCoins;
}
}
}
int main(){
float change;
do
{
change = get_float("Enter the changed owed\n");
}while (change<0);
printf("Minimum number of coins returned is %d\n", processChange(change));
}
The code in the for loop in processChange does this:
If remainder is zero, do a calculation and return.
If centsChange is one, return.
If centsChange is at least arr[i], do a calculation and return.
Otherwise, reach the end of the for loop and continue iterating.
As far as the compiler can tell, the value of i will reach four, and control will leave the for loop. At that point, control would flow to the end of the function, where there is no return statement. Thus, the compiler is warning you that control would reach the end of a non-void function. (A “non-void function” is one with a return type that is not void. The return type of processChange is int.)
One way to fix this is to insert a return statement at the end of the function.
Another is to disable compiler warnings for this situation, which you can do with GCC and Clang using the -Wno-return-type command-line switch.
We can see that control cannot actually leave the for statement because, when i is three, arr[i] is one, so centsChange % arr[i] necessarily produces zero, which is assigned to remainder, causing code to flow into the first case above. With GCC and Clang, you can inform the compiler of this by inserting __builtin_unreachable(); as the last statement in the function. That tells the compiler that that point in the code logically cannot be reached by any combination of circumstances within the program. (Using this compiler feature when it is not true that control cannot reach the location will break your program.)
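As a minimal sketch of that technique, the function below has a loop that always returns by the time the coin value is 1, followed by the unreachable marker. The function largest_dividing_coin and the UNREACHABLE macro are assumptions made for illustration; only __builtin_unreachable itself is the GCC/Clang feature described above, and the abort() fallback is just a placeholder for other compilers.

```c
#include <stdlib.h>

#if defined(__GNUC__) || defined(__clang__)
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() abort()   /* fallback for other compilers */
#endif

/* Returns the largest of 25, 10, 5, 1 that divides cents evenly.
   (Hypothetical helper, not the asker's processChange.) */
int largest_dividing_coin(int cents)
{
    static const int coins[4] = {25, 10, 5, 1};
    for (int i = 0; i < 4; i++) {
        if (cents % coins[i] == 0)
            return coins[i];   /* always taken once coins[i] is 1 */
    }
    UNREACHABLE();             /* control never gets here */
}
```

The marker is only safe because every int is divisible by 1; if the array did not end in 1, reaching it would be undefined behavior.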
Note that the fact that control cannot leave the for loop for the above reason implies the centsChange == 1 base case is unnecessary. The fact that remainder == 0 must be satisfied at some point means it serves as a base case.
Although this analysis discusses the code as it is, experienced programmers would restructure the code so that none of the above solutions are necessary. There are times when various complications motivate us to use code where the compiler cannot deduce that a certain point is never reached in execution, but we know it is, and the above workarounds may be used in such cases. However, this is not one of them. This code is fairly simple and can be restructured so that control flow is simpler and more apparent to the compiler.
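One possible restructuring, sketched under the assumption that the amount has already been converted to whole cents by the caller (for example with round(change*100) as in the question): take as many of each coin as fit, largest first, and return once after the loop. The name min_coins is hypothetical. Taking the largest coin first happens to give the minimum for this particular coin set; that is not true of arbitrary coin sets.

```c
/* Minimum number of 25c, 10c, 5c and 1c coins for a non-negative amount
   given in cents. (min_coins is an illustrative name.) */
int min_coins(int cents)
{
    static const int coins[4] = {25, 10, 5, 1};
    int count = 0;
    for (int i = 0; i < 4; i++) {
        count += cents / coins[i];   /* how many of this coin fit */
        cents %= coins[i];           /* what is still owed */
    }
    return count;                    /* single exit, visible to the compiler */
}
```

Since the only return statement sits after the loop, the compiler can see that every path returns a value, and neither -Wno-return-type nor __builtin_unreachable() is needed.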

Why does this for loop exit on some platforms and not on others?

I have recently started to learn C and I am taking a class with C as the subject. I'm currently playing around with loops and I'm running into some odd behaviour which I don't know how to explain.
#include <stdio.h>
int main()
{
int array[10],i;
for (i = 0; i <=10 ; i++)
{
array[i]=0; /*code should never terminate*/
printf("test \n");
}
printf("%d \n", sizeof(array)/sizeof(int));
return 0;
}
On my laptop running Ubuntu 14.04, this code does not break. It runs to completion. On my school's computer running CentOS 6.6, it also runs fine. On Windows 8.1, the loop never terminates.
What's even more strange is that when I edit the condition of the for loop to: i <= 11, the code only terminates on my laptop running Ubuntu. It never terminates in CentOS and Windows.
Can anyone explain what's happening in the memory and why the different OSes running the same code give different outcomes?
EDIT: I know the for loop goes out of bounds. I'm doing it intentionally. I just can't figure out how the behaviour can be different across different OSes and computers.
You've just discovered memory stomping. You can read more about it here: What is a “memory stomp”?
When you allocate int array[10],i;, those variables go into memory (specifically, they're allocated on the stack, which is a block of memory associated with the function). array[] and i are probably adjacent to each other in memory. It seems that on Windows 8.1, i is located at array[10]. On CentOS, i is located at array[11]. And on Ubuntu, it's in neither spot (maybe it's at array[-1]?).
Try adding these debugging statements to your code. You should notice that on iteration 10 or 11, array[i] points at i.
#include <stdio.h>
int main()
{
int array[10],i;
printf ("array: %p, &i: %p\n", (void *)array, (void *)&i); /* %p expects void * */
printf ("i is offset %td from array\n", &i - array); /* %td matches ptrdiff_t */
for (i = 0; i <=11 ; i++)
{
printf ("%d: Writing 0 to address %p\n", i, (void *)&array[i]);
array[i]=0; /*code should never terminate*/
}
return 0;
}
The bug lies between these pieces of code:
int array[10],i;
for (i = 0; i <=10 ; i++)
array[i]=0;
Since array only has 10 elements, in the last iteration array[10] = 0; is a buffer overflow. Buffer overflows are UNDEFINED BEHAVIOR, which means they might format your hard drive or cause demons to fly out of your nose.
It is fairly common for all stack variables to be laid out adjacent to each other. If i is located where array[10] writes to, then the UB will reset i to 0, thus leading to the unterminated loop.
To fix, change the loop condition to i < 10.
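A minimal sketch of the corrected loop, written so that the bound is derived from the array itself instead of being repeated as a literal. The ARRAY_LEN macro and the function name are assumptions for illustration, and the macro only works on true arrays, not on pointers.

```c
#include <stddef.h>

/* Element count of a real array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Zeroes a 10-element array and returns how many elements were written. */
size_t zero_array(void)
{
    int array[10];
    size_t i;
    for (i = 0; i < ARRAY_LEN(array); i++)   /* i runs 0..9, never 10 */
        array[i] = 0;
    return i;
}
```

If the array size changes later, the loop bound follows automatically.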
In what should be the last run of the loop, you write to array[10], but there are only 10 elements in the array, numbered 0 through 9. The C language specification says that this is “undefined behavior”. What this means in practice is that your program will attempt to write to the int-sized piece of memory that lies immediately after array in memory. What happens then depends on what does, in fact, lie there, and this depends not only on the operating system but more so on the compiler, on the compiler options (such as optimization settings), on the processor architecture, on the surrounding code, etc. It could even vary from execution to execution, e.g. due to address space randomization (probably not on this toy example, but it does happen in real life). Some possibilities include:
The location wasn't used. The loop terminates normally.
The location was used for something which happened to have the value 0. The loop terminates normally.
The location contained the function's return address. The loop terminates normally, but then the program crashes because it tries to jump to the address 0.
The location contains the variable i. The loop never terminates because i restarts at 0.
The location contains some other variable. The loop terminates normally, but then “interesting” things happen.
The location is an invalid memory address, e.g. because array is right at the end of a virtual memory page and the next page isn't mapped.
Demons fly out of your nose. Fortunately most computers lack the requisite hardware.
What you observed on Windows was that the compiler decided to place the variable i immediately after the array in memory, so array[10] = 0 ended up assigning to i. On Ubuntu and CentOS, the compiler didn't place i there. Almost all C implementations do group local variables in memory, on a memory stack, with one major exception: some local variables can be placed entirely in registers. Even if the variable is on the stack, the order of variables is determined by the compiler, and it may depend not only on the order in the source file but also on their types (to avoid wasting memory to alignment constraints that would leave holes), on their names, on some hash value used in a compiler's internal data structure, etc.
If you want to find out what your compiler decided to do, you can tell it to show you the assembler code. Oh, and learn to decipher assembler (it's easier than writing it). With GCC (and some other compilers, especially in the Unix world), pass the option -S to produce assembler code instead of a binary. For example, here's the assembler snippet for the loop from compiling with GCC on amd64 with the optimization option -O0 (no optimization), with comments added manually:
.L3:
movl -52(%rbp), %eax ; load i to register eax
cltq
movl $0, -48(%rbp,%rax,4) ; set array[i] to 0
movl $.LC0, %edi
call puts ; printf of a constant string was optimized to puts
addl $1, -52(%rbp) ; add 1 to i
.L2:
cmpl $10, -52(%rbp) ; compare i to 10
jle .L3
Here the variable i is 52 bytes below the top of the stack, while the array starts 48 bytes below the top of the stack. So this compiler happens to have placed i just before the array; you'd overwrite i if you happened to write to array[-1]. If you change array[i]=0 to array[9-i]=0, you'll get an infinite loop on this particular platform with these particular compiler options.
Now let's compile your program with gcc -O1.
movl $11, %ebx
.L3:
movl $.LC0, %edi
call puts
subl $1, %ebx
jne .L3
That's shorter! The compiler has not only declined to allocate a stack location for i — it's only ever stored in the register ebx — but it hasn't bothered to allocate any memory for array, or to generate code to set its elements, because it noticed that none of the elements are ever used.
To make this example more telling, let's ensure that the array assignments are performed by providing the compiler with something it isn't able to optimize away. An easy way to do that is to use the array from another file — because of separate compilation, the compiler doesn't know what happens in another file (unless it optimizes at link time, which gcc -O0 or gcc -O1 doesn't). Create a source file use_array.c containing
void use_array(int *array) {}
and change your source code to
#include <stdio.h>
void use_array(int *array);
int main()
{
int array[10],i;
for (i = 0; i <=10 ; i++)
{
array[i]=0; /*code should never terminate*/
printf("test \n");
}
printf("%zu \n", sizeof(array)/sizeof(int));
use_array(array);
return 0;
}
Compile with
gcc -c use_array.c
gcc -O1 -S -o with_use_array1.s with_use_array.c
This time the assembler code looks like this:
movq %rsp, %rbx
leaq 44(%rsp), %rbp
.L3:
movl $0, (%rbx)
movl $.LC0, %edi
call puts
addq $4, %rbx
cmpq %rbp, %rbx
jne .L3
Now the array is on the stack, starting at the address in rsp. What about i? It doesn't appear anywhere! Instead, the loop counter is kept in the register rbx. It's not exactly i, but the address of array[i]. The compiler has decided that since the value of i was never used directly, there was no point in performing arithmetic to calculate where to store 0 during each run of the loop. Instead that address is the loop variable, and the arithmetic to determine the boundaries was performed partly at compile time (multiply 11 iterations by 4 bytes per array element to get 44) and partly at run time but once and for all before the loop starts (the leaq instruction adds 44 to the array's address to get the end value).
Even on this very simple example, we've seen how changing compiler options (turn on optimization) or changing something minor (array[i] to array[9-i]) or even changing something apparently unrelated (adding the call to use_array) can make a significant difference to what the executable program generated by the compiler does. Compiler optimizations can do a lot of things that may appear unintuitive on programs that invoke undefined behavior. That's why undefined behavior is left completely undefined. When you deviate ever so slightly from the tracks, in real-world programs, it can be very hard to understand the relationship between what the code does and what it should have done, even for experienced programmers.
Unlike Java, C doesn't do array bounds checking, i.e. there's no ArrayIndexOutOfBoundsException; the job of making sure the array index is valid is left to the programmer. Accessing an array out of bounds, on purpose or not, leads to undefined behavior: anything could happen.
For an array:
int array[10]
indexes are only valid in the range 0 to 9. However, you are trying to:
for (i = 0; i <=10 ; i++)
access array[10] here. Change the condition to i < 10.
You have a bounds violation, and on the non-terminating platforms, I believe you are inadvertently setting i to zero at the end of the loop, so that it starts over again.
array[10] is invalid; it contains 10 elements, array[0] through array[9], and array[10] is the 11th. Your loop should be written to stop before 10, as follows:
for (i = 0; i < 10; i++)
Where array[10] lands is not specified by the language (the access is undefined behavior) and depends on the implementation. Amusingly, on two of your platforms, it lands on i, which those platforms apparently lay out directly after array. i is set to zero and the loop continues forever. For your other platforms, i may be located before array, or array may have some padding after it.
Declaring int array[10] means array has indexes 0 to 9 (it can hold 10 integer elements in total). But the following loop,
for (i = 0; i <=10 ; i++)
will run from 0 to 10, which means 11 times. Hence when i = 10 it will overflow the buffer and cause undefined behavior.
So try this:
for (i = 0; i < 10 ; i++)
or,
for (i = 0; i <= 9 ; i++)
It is undefined at array[10], and gives undefined behavior as described before. Think about it like this:
I have 10 items in my grocery cart. They are:
0: A box of cereal
1: Bread
2: Milk
3: Pie
4: Eggs
5: Cake
6: A 2 liter of soda
7: Salad
8: Burgers
9: Ice cream
cart[10] is undefined, and may give an out of bounds exception in some compilers. But, a lot apparently don't. The apparent 11th item is an item not actually in the cart. The 11th item is pointing to, what I'm going to call, a "poltergeist item." It never existed, but it was there.
Whether a compiler places i at array[10], array[11], or even array[-1] comes down to how it handles your initialization/declaration statement. Some compilers interpret it as:
"Allocate 10 blocks of ints for array[10] and another int block. To make it easier, put them right next to each other."
Same as before, but move it a space or two away, so that array[10] doesn't point to i.
Do the same as before, but allocate i at array[-1] (because an index of an array can't, or shouldn't, be negative), or allocate it at a completely different spot because the OS can handle it, and it's safer.
Some compilers want things to go quicker, and some compilers prefer safety. It's all about the context. If I was developing an app for the ancient BREW OS (the OS of a basic phone), for example, it wouldn't care about safety. If I was developing for an iPhone 6, then it could run fast no matter what, so I would need an emphasis on safety. (Seriously, have you read Apple's App Store Guidelines, or read up on the development of Swift and Swift 2.0?)
Since you created an array of size 10, for loop condition should be as follows:
int array[10],i;
for (i = 0; i <10 ; i++)
{
Currently you are trying to access a location outside the array using array[10], and that causes undefined behavior. Undefined behavior means your program may behave in an unpredictable fashion, so it can give different outputs in each execution.
Well, C compiler traditionally does not check for bounds. You can get a segmentation fault in case you refer to a location that does not "belong" to your process. However, the local variables are allocated on stack and depending on the way the memory is allocated, the area just beyond the array (array[10]) may belong to the process' memory segment. Thus, no segmentation fault trap is thrown and that is what you seem to experience. As others have pointed out, this is undefined behavior in C and your code may be considered erratic. Since you are learning C, you are better off getting into the habit of checking for bounds in your code.
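As a small sketch of what checking for bounds in your own code can look like, the helper below refuses to write outside the array and reports whether the store happened. The name checked_store and its signature are assumptions for illustration; C itself provides no such function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stores value at a[index] only if index is within [0, len).
   Returns true on success, false if the index is out of range. */
bool checked_store(int *a, size_t len, size_t index, int value)
{
    if (index >= len)
        return false;      /* refuse the out-of-bounds write */
    a[index] = value;
    return true;
}
```

With int array[10], checked_store(array, 10, 10, 0) returns false instead of stomping on whatever lies after the array.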
Beyond the possibility that memory might be laid out so that an attempt to write to a[10] actually overwrites i, it would also be possible that an optimizing compiler might determine that the loop test cannot be reached with a value of i greater than ten without code having first accessed the non-existent array element a[10].
Since an attempt to access that element would be undefined behavior, the compiler would have no obligations with regard to what the program might do after that point. More specifically, since the compiler would have no obligation to generate code to check the loop index in any case where it might be greater than ten, it would have no obligation to generate code to check it at all; it could instead assume that the <=10 test will always yield true. Note that this would be true even if the code would read a[10] rather than writing it.
When you iterate past i==9 you assign zero to the 'array items' which are actually located past the array, so you're overwriting some other data. Most probably you overwrite the i variable, which is located after a[]. That way you simply reset the i variable to zero and thus restart the loop.
You could discover that yourself if you printed i in the loop:
printf("test i=%d\n", i);
instead of just
printf("test \n");
Of course that result strongly depends on the memory allocation for your variables, which in turn depends on a compiler and its settings, so it is generally Undefined Behavior — that's why results on different machines or different operating systems or on different compilers may differ.
The error is at array[10], which is also the address of i (given int array[10],i;).
When array[10] is set to 0, i becomes 0, which restarts the loop and
causes the infinite loop.
There will be an infinite loop whenever the value written to array[10] keeps the condition i <= 10 true (0, for example). The correct loop is for (i = 0; i <10 ; i++) {...}
int array[10],i;
for (i = 0; i <=10 ; i++)
array[i]=0;
I will suggest something that I didn't find above:
Try assigning array[i] = 20;
I guess this should make the code terminate everywhere (given you keep i <= 10 or 11).
If it does, you can be fairly confident that the answers already given here are correct (the one about memory stomping, for example).
The int i sits right where array[10] would be on the stack. Because the loop is allowed to execute array[10] = 0, the loop index i is reset to 0 and will never exceed 10. Make it for(i=0; i<10; i+=1).
As for style, i++ and i+=1 do the same thing for an int: both add exactly 1. Scaling by the element size happens only in pointer arithmetic, so either form is fine and portable here.

Does C99 Standard allow the compiler to transform code such that the same expression is no longer evaluated once some deduced condition is met?

I don't quite get the following part of 5.1.2.3/3:
An actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).
Suppose I have the following code:
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
/* no break in here */
}
}
/* i no longer used, do something with "found" here */
Note that found starts with 0 and can either remain unchanged or turn into 1. It cannot turn into 1 and then into something else. So the following code would yield the same result (except for i value which is not used after the loop anyway):
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
break;
}
}
/* i no longer used, do something with "found" here */
Now what does the Standard say about "need not evaluate part of an expression" with regard to found = 1 and the loop control expressions that follow the first iteration in which control gets inside the if?
Clearly, if found is used somewhere after this code, the compiler must emit the code that traverses the array and conditionally evaluates the found = 1 expression.
Is the implementation required to evaluate found = 1 once for every zero found in the array, or can it instead evaluate it no more than once and so effectively emit the code for the second snippet when compiling the first snippet?
Yes, a compiler has the right to perform that optimization. It seems like a pretty aggressive optimization but it would be legal.
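The equivalence the question relies on can be checked directly. The sketch below wraps the two snippets in functions (the function names and the explicit size parameter are assumptions for illustration) so that their results can be compared on the same input.

```c
#include <stddef.h>

/* First snippet: keeps scanning after the first zero. */
int found_full_scan(const char *data, size_t size)
{
    int found = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0)
            found = 1;      /* may be assigned several times */
    }
    return found;
}

/* Second snippet: stops at the first zero. */
int found_early_exit(const char *data, size_t size)
{
    int found = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0) {
            found = 1;
            break;          /* nothing later can change the result */
        }
    }
    return found;
}
```

Both functions return the same value for every input, and neither performs I/O or touches volatile objects, which is why a compiler is free to emit the second form for the first.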
It might be interesting to look at an example that more closely matches the spirit of the text:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Suppose we have:
int x = pureFunction(y) * otherPureFunction(z);
Suppose the compiler knows that both functions are int-returning "pure" functions; that is, they have no side effects and their result depends solely on the arguments. Suppose the compiler also believes that otherPureFunction is an extremely expensive operation. A compiler could choose to implement the code as though you had written:
int temp = pureFunction(y);
int x = temp == 0 ? 0 : temp * otherPureFunction(z);
That is, determine that under some conditions it is unnecessary to compute otherPureFunction() because the result of the multiplication is already known once the left operand is known to be zero. No needed side effects will be elided because there are no side effects.
Yes, it may perform this optimization, since there are no I/O operations, reads from volatile locations or externally visible writes to memory omitted by the optimized code, so the behavior is preserved.
As an example of this kind of optimization, GCC will compile
#include <string.h> /* strlen, size_t */
void noop(const char *s)
{
for (size_t i = 0; i < strlen(s); i++) {
}
}
to a completely empty function:
noop:
.LFB33:
.cfi_startproc
rep ret
.cfi_endproc
It is allowed to do so because the Standard guarantees the behavior of strlen, the compiler knows that it has no externally visible effect on s or any other piece of memory, and it can deduce that the whole function has no behavior. (Amazingly, this simple optimization brings the complexity down from quadratic to constant.)
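When the loop body does real work, it is safer not to count on the compiler proving that strlen(s) is loop-invariant; whether it can do so depends on the body and the compiler. A minimal sketch of hoisting the call by hand, with count_spaces as a hypothetical example function:

```c
#include <stddef.h>
#include <string.h>

/* Counts the space characters in s. (Illustrative helper only.) */
size_t count_spaces(const char *s)
{
    size_t n = strlen(s);   /* computed once, not on every iteration */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] == ' ')
            count++;
    }
    return count;
}
```

Written this way the loop is linear in the length of s regardless of what the optimizer manages to deduce.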

jumping inside loop

The C language allows jumping into the middle of a loop. What would be the use of doing so?
if(n > 3) {
i = 2;
goto inner;
}
/* a lot of code */
for(i = 0; i < limit ;i ++) {
inner:
/* ... */
}
If you've ever coded in Assembler (ASM), then you'll know that GOTOs are pretty standard, and required, actually. The C Language was designed to be very close to ASM without actually being ASM. As such, I imagine that "GOTO" was kept for this reason.
Though, I'll admit that GOTOs are generally a "bad idea, mmmkay?" in terms of program flow control in C and any other higher level language.
It's certainly a questionable construct. A design that depends on this behavior is probably a poor design.
You've tagged this as C++, but C++ (intelligently, IMO) doesn't allow you to jump inside a loop where a variable was declared in the first part of the for statement:
int main()
{
int q = 5;
goto inner;
for (int i = 0; i < 4; i++)
{
q *= 2;
inner:
q++;
std::cout << q << std::endl;
}
}
g++ output:
l.cpp: In function ‘int main()’:
l.cpp:12: error: jump to label ‘inner’
l.cpp:7: error: from here
l.cpp:9: error: crosses initialization of ‘int i’
Initializing i before the loop allows the program to compile fine (as would be expected).
Oddly, compiling this with gcc -std=c99 (and using printf instead) doesn't give an error, and on my computer, the output is:
6
13
27
55
as would be expected if i were initialized outside the loop. This might lead one to believe that int i = 0 is simply "pulled out" of the loop initializer during compilation, but i is still out of scope if you try to use it outside of the loop.
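One caveat about the C99 experiment: the goto skips the int i = 0 initializer, so i has an indeterminate value when the loop body is entered, and the four-line output is not something the language guarantees. The sketch below is a well-defined variant that initializes i before the jump and records the values instead of printing them; the function name and the out parameter are assumptions for illustration.

```c
/* Jumps into the middle of the loop body with i already initialized.
   Fills out[0..3] and returns the number of values written. */
int jump_into_loop(int out[4])
{
    int q = 5;
    int n = 0;
    int i = 0;          /* initialized before the jump, unlike the C99 test */
    goto inner;
    for (i = 0; i < 4; i++) {
        q *= 2;
inner:
        q++;
        out[n++] = q;
    }
    return n;
}
```

This version writes 6, 13, 27 and 55, the same values as the experiment above, because the first pass skips q *= 2 and the remaining three passes run the whole body.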
From http://en.wikipedia.org/wiki/Duff%27s_device
In computer science, Duff's device is an optimized implementation of a serial copy that uses a technique widely applied in assembly language for loop unwinding.
...
Reason it works
The ability to legally jump into the middle of a loop in C.
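A sketch of the technique, adapted to a memory-to-memory copy (the original device wrote to a single output register, so this is a variation, and duff_copy is a hypothetical name). The switch jumps into the middle of the do-while body to handle the leftover count % 8 bytes, and the loop then copies eight bytes per pass.

```c
#include <stddef.h>

/* Copies count bytes from `from` to `to` using Duff's device. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;
    size_t n = (count + 7) / 8;   /* number of passes through the do-while */
    switch (count % 8) {          /* jump into the loop body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The case labels inside the do-while are what the quoted text refers to: they are legal only because C lets control enter a loop body from outside. On current compilers a plain loop or memcpy is usually the better choice; the device is shown here only to illustrate the jump.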

Resources