void main() {
int i, j=6;
for(; i=j ; j-=2 )
printf("%d",j);
}
Following the regular pattern, there should be a condition after the first semicolon, but here it is an initialization, so this should have given an error.
How is this even a valid format?
But the output is 642
First, let me correct the terminology, the i=j is an assignment, not an initialization.
That said, let's analyze the for loop syntax first.
for ( clause-1 ; expression-2 ; expression-3 ) statement
So, the expression-2 should be an "expression".
Now, coming to the syntax for an expression that uses the assignment operator:
assignment-expression:
    conditional-expression
    unary-expression assignment-operator assignment-expression
So, as the spec C11 mentions in chapter §6.5.16, the assignment operation is also an expression which fits perfectly for the expression-2 part in for loop syntax.
Regarding the result,
An assignment expression has the value of the left operand after the assignment,
so, i=j will basically assign the value of j to i and then, the value of i will be used for condition checking, (i.e., non-zero or zero as TRUE or FALSE).
TL;DR Syntactically, there's no issue with the code, so no error is generated by your compiler.
Also, for a hosted environment, void main() should be int main(void) to be conforming to the standard.
i=j is also an expression, the value of which is the value of i after the assignment. So it can serve as a condition.
You'd normally see this type of cleverness used like this:
if ((ptr = some_complex_function()) != NULL)
{
/* Use ptr */
}
Where some programmers like to fold the assignment and check into one line of code. How good or bad this is for readability is a matter of opinion.
Your code does not contain a syntax error, hence the compiler accepts it and generates code to produce 642.
The condition i=j is interpreted as (i = j) != 0.
To prevent this and many similar error patterns, enable more compiler warnings and make them fatal with:
gcc -Wall -W -Werror
If you use clang, use clang -Weverything -Werror
This is a very good question.
To really understand this, it helps to know how C code is executed on a computer:
First, the compiler translates the C code into assembly code; the assembly code is then translated into machine code, which can be executed directly from main memory.
As for your code:
void main() {
int i, j=6;
for(; i=j ; j-=2 )
printf("%d",j);
}
To figure out why the result is 642, we want to look at its assembly code.
Using VS debugging mode, look especially at this:
010217D0 mov eax,dword ptr [j]
010217D3 mov dword ptr [i],eax
010217D6 cmp dword ptr [i],0
010217DA je main+4Fh (010217EFh)
These four lines of assembly correspond to the C code "i=j". First, the value of j is moved into register eax; then the value of eax is moved into i (the processor cannot move j to i directly, so eax is used as a bridge); then the value of i is compared with 0. If they are equal, execution jumps to 010217EFh and the loop ends; if not, the loop continues.
So it is actually first an assignment, then a comparison that decides whether the loop is over; as j decreases from 6 to 0, the loop finally stops. I hope this helps you understand why the result is 642 :D
How can we interpret the following program and its success? (It's obvious that there must not be any error message.) I mean, how does the compiler interpret lines 2 and 3 inside main?
#include <stdio.h>
int main()
{
int a,b;
a; //(2)
b; //(3)
return 0;
}
Your
a;
is just an expression statement. As always in C, the full expression in expression statement is evaluated and its result is immediately discarded.
For example, this
a = 2 + 3;
is an expression statement containing full expression a = 2 + 3. That expression evaluates to 5 and also has a side-effect of writing 5 into a. The result is evaluated and discarded.
Expression statement
a;
is treated in the same way, except that it has no side effects. Since you forgot to initialize your variables, evaluation of the above expression can formally lead to undefined behavior.
Obviously, practical compilers will simply skip such expression statements entirely, since they have no observable behavior.
That's why you should use some compilation warning flags!
-Wall would trigger a "statement with no effect" warning.
If you want to see what the compilation produces, compile using -S.
Try it with your code, with/without -O (optimization) flag...
This is just like you try something like this:
#include <stdio.h>
int main(void){
1;
2;
return 0;
}
As we can see we have here two expressions followed by semicolon (1; and 2;). It is a well formed statement according to the rules of the language.
There is nothing wrong with it, it is just useless.
But if you try to use those statements (a or b), the behavior will be undefined.
Of course, the compiler will interpret each of them as a statement with no effect.
Later edit:
If you run this:
#include <stdio.h>
int main(void){
int a;
int b;
printf("A = %d\n",a);
printf("B = %d\n",b);
if (a < b){
printf("TRUE");
}else{
printf("FALSE");
}
return 0;
}
You might get:
A = 0
B = 0
FALSE
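That output is only what this particular run happened to produce: a and b are uninitialized, so their values are indeterminate, reading them is undefined behavior, and nothing guarantees that they are 0.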
Statements in C which are not control structures (if, switch, for, while, do while) or control statements (break, continue, goto, return) are expression statements.
Every expression has a resulting value.
An expression is evaluated for its side effects (change the value of an object, write a file, read volatile objects, and functions doing some of these things).
The final result of such an expression is always discarded.
For example, the function printf() returns an int value, that in general is not used. However this value is produced, and then discarded.
However the function printf() produces side effects, so it has to be processed.
If a statement has no side effects, the compiler is free to discard it entirely.
I think it is not hard for a compiler to check whether a statement has no side effects. So what you can expect in this case is that the compiler will choose to do nothing.
Moreover, this does not affect the observable behaviour of the program, so the result of executing the program is the same. Of course, the program will run faster if the compiler skips the unneeded computation.
Also, note that in some cases the floating point environment can set flags, which are considered side-effects.
The Standard C (C11) says, as part of paragraph 5.1.2.3p.4:
An actual implementation need not evaluate part of an expression if it
can deduce that its value is not used and that no needed side effects
are produced [...]
CONCLUSION: You have to read the documentation of the particular compiler you are using.
Context
I was asked the following puzzle by one of my friends:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
/* write something before this comment */
}
int main()
{
int i = 5;
fn();
printf("%d\n", i);
return 0;
}
I know there can be multiple solutions, some involving macro and some assuming something about the implementation and violating C.
One particular solution I was interested in is to make certain assumptions about stack and write following code: (I understand it is undefined behavior, but may work as expected on many implementations)
void fn(void)
{
/* write something after this comment so that the program output is 10 */
int a[1] = {0};
int j = 0;
while(a[j] != 5) ++j; /* Search stack until you find 5 */
a[j] = 10; /* Overwrite it with 10 */
/* write something before this comment */
}
Problem
This program worked fine in MSVC and gcc without optimization. But when I compiled it with gcc -O2 flag or tried on ideone, it loops infinitely in function fn.
My Observation
When I compiled the file with gcc -S vs gcc -S -O2 and compared, it clearly shows gcc kept an infinite loop in function fn.
Question
I understand because the code invokes undefined behavior, one can not call it a bug. But why and how does compiler analyze the behavior and leave an infinite loop at O2?
Many people asked in the comments how the behavior changes if some of the variables are made volatile. The results, as expected, are:
If i or j is changed to volatile, program behavior remains same.
If array a is made volatile, program does not suffer infinite loop.
Moreover if I apply the following patch
- int a[1] = {0};
+ int aa[1] = {0};
+ int *a = aa;
The program behavior remains same (infinite loop)
If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:
;; Function fn (fn) (executed once)
Removing basic block 3
fn ()
{
<bb 2>:
<bb 3>:
goto <bb 3>;
}
;; Function main (main) (executed once)
main ()
{
<bb 2>:
fn ();
}
Invalid sum of incoming frequencies 0, should be 10000
This verifies the assertions made after the answers below.
This is undefined behavior, so the compiler can really do anything at all. We can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:
L2:
jmp .L2
The article says (emphasis mine):
Of course this is an infinite loop. Since SATD() unconditionally
executes undefined behavior (it’s a type 3 function), any
translation (or none at all) is perfectly acceptable behavior for a
correct C compiler. The undefined behavior is accessing d[16] just
before exiting the loop. In C99 it is legal to create a pointer to
an element one position past the end of the array, but that pointer
must not be dereferenced. Similarly, the array cell one element past
the end of the array must not be accessed.
which if we examine your program with godbolt we see:
fn:
.L2:
jmp .L2
The logic being used by the optimizer probably goes something like this:
All the elements of a are initialized to zero
a is never modified before or within the loop
So a[j] != 5 is always true -> infinite loop
Because of the infinite loop, the a[j] = 10; is unreachable and so it can be optimized away; so can a and j, since they are no longer needed to determine the loop condition.
which is similar to the case in the article, which, given:
int d[16];
analyzes the following loop:
for (dd=d[k=0]; k<16; dd=d[++k])
like this:
upon seeing d[++k], is permitted to assume that the incremented value
of k is within the array bounds, since otherwise undefined behavior
occurs. For the code here, GCC can infer that k is in the range 0..15.
A bit later, when GCC sees k<16, it says to itself: “Aha– that
expression is always true, so we have an infinite loop.”
Perhaps an interesting secondary point is whether an infinite loop is considered observable behavior (with respect to the as-if rule) or not, which affects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:
Many knowledgeable people (including me) read this as saying that the
termination behavior of a program must not be changed. Obviously some
compiler writers disagree, or else don’t believe that it matters. The
fact that reasonable people disagree on the interpretation would seem
to indicate that the C standard is flawed.
C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.
In the optimized version, the compiler has decided a few things:
The array a doesn't change before that test.
The array a doesn't contain a 5.
Therefore, we can rewrite the code as:
void fn(void) {
int a[1] = {0};
int j = 0;
while(true) ++j;
a[j] = 10;
}
Now, we can make further decisions:
All the code after the while loop is dead code (unreachable).
j is written but never read. So we can get rid of it.
a is never read.
At this point, your code has been reduced to:
void fn(void) {
int a[1] = {0};
while(true);
}
And we can make the note that a is now never read, so let's get rid of it as well:
void fn(void) {
while(true);
}
Now, the unoptimized code:
In unoptimized generated code, the array will remain in memory, and you'll literally walk it at runtime. It's possible that there will be a readable 5 somewhere after it once you walk past the end of the array.
Which is why the unoptimized version sometimes doesn't crash and burn.
If the loop does get optimized into an infinite loop, it could be due to static code analysis seeing that your array is
not volatile
contains only 0
never gets written to
and thus it is not possible for it to contain the number 5. Which means an infinite loop.
Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.
As a side note, I bet what your friend actually expected was this:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
while(1); /* Endless loop, function won't return, i won't be output */
/* write something before this comment */
}
or this (if stdlib.h is included):
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
exit(0); /* Exit gracefully */
/* write something before this comment */
}
I don't quite get the following part of 5.1.2.3/3:
An actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).
Suppose I have the following code:
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
/* no break in here */
}
}
/* i no longer used, do something with "found" here */
Note that found starts with 0 and can either remain unchanged or turn into 1. It cannot turn into 1 and then into something else. So the following code would yield the same result (except for i value which is not used after the loop anyway):
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
break;
}
}
/* i no longer used, do something with "found" here */
Now what does the Standard say about need not evaluate part of an expression with regard to found = 1 and the loop control expressions which follow the first iteration in which control gets inside if?
Clearly if found is used somewhere after this code the compiler must emit the code that traverses the array and conditionally evaluates found = 1 expression.
Is the implementation required to evaluate found = 1 once for every zero found in the array or can it instead evaluate it no more than once and so effectively emit the code for the second snippet when compiling the first snippet?
can it instead evaluate it no more than once and so effectively emit the code for the second snippet when compiling the first snippet?
Yes, a compiler has the right to perform that optimization. It seems like a pretty aggressive optimization but it would be legal.
It might be interesting to look at an example that more closely matches the spirit of the text:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Suppose we have:
int x = pureFunction(y) * otherPureFunction(z);
Suppose the compiler knows that both functions are int-returning "pure" functions; that is, they have no side effects and their result depends solely on the arguments. Suppose the compiler also believes that otherPureFunction is an extremely expensive operation. A compiler could choose to implement the code as though you had written:
int temp = pureFunction(y);
int x = temp == 0 ? 0 : temp * otherPureFunction(z);
That is, determine that under some conditions it is unnecessary to compute otherPureFunction() because the result of the multiplication is already known once the left operand is known to be zero. No needed side effects will be elided because there are no side effects.
Yes, it may perform this optimization, since there are no I/O operations, reads from volatile locations or externally visible writes to memory omitted by the optimized code, so the behavior is preserved.
As an example of this kind of optimization, GCC will compile
#include <string.h>
void noop(const char *s)
{
for (size_t i = 0; i < strlen(s); i++) {
}
}
to a completely empty function:
noop:
.LFB33:
.cfi_startproc
rep ret
.cfi_endproc
It is allowed to do so because the Standard guarantees the behavior of strlen, the compiler knows that it has no externally visible effect on s or any other piece of memory, and it can deduce that the whole function has no behavior. (Amazingly, this simple optimization brings the complexity down from quadratic to constant.)
I know this is a stupid question to ask, but I am just asking it out of curiosity.
I just read this code somewhere:
#include<stdio.h>
int main() {
for ( ; 0 ; )
printf("This code will be executed one time.");
return 0;
}
Output:
This code will be executed one time.
This loop executes once with the Turbo C compiler but not with gcc. How is it possible for this loop to execute even once?
Can you please explain the unusual behavior of this code in the Turbo C compiler, if there is any?
It's a bug in the compiler. The C99 standard describes for loops like this:
The statement
for ( clause-1 ; expression-2 ; expression-3 ) statement
behaves as follows: The expression expression-2 is the controlling expression
that is evaluated before each execution of the loop body.
The expression expression-3 is evaluated as a void expression after each
execution of the loop body. [...]
Given that expression-2 evaluates to false, the code should print no output.
Turbo C does not follow the C99 standard. This could explain the unusual behaviour. Rest assured, gcc will give you the correct output.
I have stumbled upon a piece of code that generates some interesting results while debugging someone else's program.
I have created a small program to illustrate this behavior:
#include <stdio.h>
int main()
{
char* word = "foobar"; int i, iterator = 0;
for (i = 0; i < 6; i++ && iterator++)
printf("%c", word[iterator]);
return 0;
}
I know that this is not the right way to print a string. This is for demonstration purpose only.
Here I expected the output to be "foobar", obviously, but instead it is "ffooba". Basically it reads the first character twice, as if the first time iterator++ is executed nothing happens.
Can anyone explain why this happens?
The thing is iterator++ actually isn't executed the first time. The ++ operator returns the current value of a variable and then increments it, so the first time through, i++ will be equal to 0. && short-circuits, so iterator++ is not executed the first time.
To fix this, you could use the comma operator which unconditionally evaluates both, rather than the short-circuiting &&.
The result of i++ is the current value of i, which is zero on the first iteration. This means iterator++ is not executed on the first iteration due to short-circuiting (the right-hand side of && is only evaluated if the left-hand side is "true").
To fix it, you could use the comma operator (as already suggested) or use ++i, which returns the value of i after the increment (though the comma operator makes it more obvious that both must always be evaluated).
You really should learn to use a debugger such as gdb, and to compile with warnings and debugging info, e.g. gcc -Wall -g (assuming a Linux system). A recent gcc with -Wall gives you a "value computed is not used" warning for the && operation.
The increment part of your for loop is strange. It is i++ && iterator++ (but it should be i++, iterator++ instead).
When i is 0 (on the first iteration), i++ gives 0 as result, so it is false, so the iterator++ is not executed.
I am reading what K&R says about logical operators; let me quote the original words of the book, which can explain your question.
"Expressions connected by && or || are evaluated left to right, and
evaluation stops as soon as the truth or falsehood of the result is known."
With a good understanding of this, the output won't be puzzling.