The C language allows jumping into the middle of a loop. What would be the use of doing so?
if(n > 3) {
i = 2;
goto inner;
}
/* a lot of code */
for(i = 0; i < limit ;i ++) {
inner:
/* ... */
}
If you've ever coded in Assembler (ASM), then you'll know that GOTOs are pretty standard, and required, actually. The C Language was designed to be very close to ASM without actually being ASM. As such, I imagine that "GOTO" was kept for this reason.
Though, I'll admit that GOTOs are generally a "bad idea, mmmkay?" in terms of program flow control in C and any other higher level language.
It's certainly a questionable construct. A design that depends on this behavior is probably a poor design.
You've tagged this as C++, but C++ (intelligently, IMO) doesn't allow you to jump inside a loop where a variable was declared in the first part of the for statement:
int main()
{
int q = 5;
goto inner;
for (int i = 0; i < 4; i++)
{
q *= 2;
inner:
q++;
std::cout << q << std::endl;
}
}
g++ output:
l.cpp: In function ‘int main()’:
l.cpp:12: error: jump to label ‘inner’
l.cpp:7: error: from here
l.cpp:9: error: crosses initialization of ‘int i’
Initializing i before the loop allows the program to compile fine (as would be expected).
Oddly, compiling this with gcc -std=c99 (and using printf instead) doesn't give an error, and on my computer, the output is:
6
13
27
55
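as would be expected if i were initialized outside the loop. This might lead one to believe that int i = 0 is simply "pulled out" of the loop initializer during compilation, but i is still out of scope if you try to use it outside the loop. Note that in C the jump skips the initializer, so the value of i on entry is indeterminate and this output is not guaranteed.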
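For comparison, here is a minimal C sketch of the variant described above, with i declared and set before the goto so that the jump skips no initialization. The function name jump_into_loop and the out/cap parameters are hypothetical; they only record the values the loop would otherwise print.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores each value of q that the loop body would
   print into out[], and returns how many values were stored. */
size_t jump_into_loop(int q, int *out, size_t cap)
{
    size_t n = 0;
    int i = 0;      /* declared and set before the goto */

    assert(out != NULL);
    goto inner;     /* enter the body at the label, skipping q *= 2 once */
    for (i = 0; i < 4; i++) {
        q *= 2;
inner:
        q++;
        if (n < cap)
            out[n++] = q;
    }
    return n;
}
```

Called with q = 5, this records 6, 13, 27 and 55, the same sequence as the output quoted above, but without relying on an indeterminate i.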
From http://en.wikipedia.org/wiki/Duff%27s_device
In computer science, Duff's device is an optimized implementation of a serial copy that uses a technique widely applied in assembly language for loop unwinding.
...
Reason it works
The ability to legally jump into the middle of a loop in C.
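A sketch of the technique, for illustration only. This is a memory-to-memory variant (the original device wrote to a single output register), and the name duff_copy is an assumption, not something from the quoted article. The switch jumps into the middle of the do-while body to handle the leftover count % 8 bytes, then the loop continues in full passes of eight.

```c
#include <assert.h>
#include <stddef.h>

/* Copies count bytes from 'from' to 'to' with an 8-way unrolled loop.
   Requires count > 0, because the do-while body always runs once. */
void duff_copy(char *to, const char *from, size_t count)
{
    size_t passes;

    assert(count > 0);
    passes = (count + 7) / 8;   /* passes through the unrolled body */
    switch (count % 8) {        /* jump into the middle of the loop */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--passes > 0);
    }
}
```

On current compilers a plain loop or memcpy is usually at least as fast; the sketch is only meant to show the jump into the loop.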
So I am trying to learn C, and I am trying to write code that will sort an array's elements from lowest to highest. It's obviously not complete, but I just wanted to see the random numbers printed.
Anyway, I am getting errors E0028 and C2131 (Visual Studio) that say "expression must have a constant value" and "expression did not evaluate to a constant." The line int goals[howMany]; is where VS tells me I have an error.
int main()
{
int i, temp, swapped;
int howMany = 10;
int goals[howMany];
for (i = 0; i < howMany; i++) {
goals[i] = (rand() % 25) + 1;
}
printf("Original List\n");
for (i = 0; i < howMany; i++) {
printf("%d \n", goals[i]);
}
return 0;
}
This is exactly how the code is written out in the tutorial I am watching and they are using Code:Blocks. I know sometimes those two compilers can be different but I was hoping someone can let me know what's going on or how to fix this.
Visual Studio doesn't support variable length arrays. C is a little tricky, and the compiler and flags you use matter. For example, if you were to compile with gcc, using the -std=c99 flag would allow your code to compile with no errors (since C99 supports variable length arrays).
I'm not sure exactly how compilation in Visual Studio works, but that's your problem. I usually don't like C programming in VS for this reason. It's much easier for me to use something like Vim and compile at the command line so that I can specify compiler settings.
In order to use rand(), you need to include the appropriate header (stdlib.h) and then seed the generator with srand(): take a look at C Rand Function
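const int howMany = 10;
This most likely is just the compiler you are using. I suppose it's a safeguard against out-of-bounds access: if howMany were changed after being used to size the array, loops bounded by the new value could index past the end of the array, which may cause a segmentation fault (a seg fault occurs when you access memory you are not allowed to access). If you try to change a const variable, the compiler will not let you, so making howMany const prevents such errors. Note that this fix works when the file is compiled as C++; in C, a const int is still not a constant expression, so a #define or an enum constant is the portable choice.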
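Two ways to size the array without a variable length array, as a rough sketch. The names HOW_MANY, fill_goals and make_goals are hypothetical. The enum constant is a true constant expression in C, and the malloc version covers sizes known only at run time.

```c
#include <assert.h>
#include <stdlib.h>

enum { HOW_MANY = 10 };   /* enum constants are constant expressions in C */

/* Fills goals[0..n-1] with pseudo-random values in the range 1..25. */
void fill_goals(int *goals, size_t n)
{
    assert(goals != NULL);
    for (size_t i = 0; i < n; i++)
        goals[i] = (rand() % 25) + 1;
}

/* Heap alternative for a size known only at run time.
   Returns NULL on allocation failure; the caller must free the result. */
int *make_goals(size_t n)
{
    /* The cast only matters if the file is built as C++. */
    int *goals = (int *)malloc(n * sizeof *goals);
    if (goals != NULL)
        fill_goals(goals, n);
    return goals;
}
```

With this, int goals[HOW_MANY]; followed by fill_goals(goals, HOW_MANY); replaces the declaration and first loop in the question.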
My experience with C is relatively modest, and I lack good understanding of its compiled output on modern CPUs. The context: I'm working on image processing for an Android app. I have read that branch-free machine code is preferred for inner loops, so I'd like to know whether there could be a significant performance difference between something like this:
if (p) { double for loop, computing f() }
else if (q) { double for loop, computing g() }
else { double for loop, computing h() }
Versus the less verbose version which does the condition checking within the loop:
for (int i = 0; i < xRes; i++)
{
for (int j = 0; j < yRes; j++)
{
image[i][j] = p ? f() : (q ? g() : h());
}
}
In this code, p and q are expressions like mode == 3, where mode is passed into the function and never changed within it. I have three simple questions:
(1) Would the first, more verbose version compile to more efficient code than the second version?
(2) For the second version, would performance improve if I evaluate and store the results of p and q above the loop, so I can replace the boolean expressions in the loop with variables?
(3) Should I even be worried about this, or will branch prediction (or some other optimization) ensure the boolean expressions in the loop(s) are almost never evaluated anyway?
Finally, I'd be delighted if someone can say whether the answers to these 3 questions depend on the architecture. I'm interested in the main Android NDK platforms: ARM, MIPS, x86 etc. My thanks in advance!
It looks like the question was already well-answered here: the compiler probably performs loop unswitching, removing the conditional from the loop and automatically generating 3 copies of the loop, just like stark suggested. Moreover, from comments given there and above, it seems branch prediction works very well for loops like these.
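To make the transformation concrete, here is a hand-written sketch of what loop unswitching amounts to. The functions f, g and h are stand-ins returning fixed values, and the mode values 3 and 4 are assumptions for illustration; whether a given compiler actually performs this rewrite depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the per-pixel computations in the question. */
static int f(void) { return 1; }
static int g(void) { return 2; }
static int h(void) { return 3; }

/* Condition checked inside the loop on every iteration. */
void fill_checked(int *image, size_t n, int mode)
{
    assert(image != NULL);
    for (size_t k = 0; k < n; k++)
        image[k] = (mode == 3) ? f() : ((mode == 4) ? g() : h());
}

/* Hand-unswitched: the loop-invariant condition is tested once,
   and each branch gets its own copy of the loop. */
void fill_unswitched(int *image, size_t n, int mode)
{
    assert(image != NULL);
    if (mode == 3) {
        for (size_t k = 0; k < n; k++) image[k] = f();
    } else if (mode == 4) {
        for (size_t k = 0; k < n; k++) image[k] = g();
    } else {
        for (size_t k = 0; k < n; k++) image[k] = h();
    }
}
```

Both functions produce the same array for any mode, which is why the compiler is free to turn the first into the second.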
Context
I was asked the following puzzle by one of my friends:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
/* write something before this comment */
}
int main()
{
int i = 5;
fn();
printf("%d\n", i);
return 0;
}
I know there can be multiple solutions, some involving macros and some assuming something about the implementation and violating the C standard.
One particular solution I was interested in is to make certain assumptions about stack and write following code: (I understand it is undefined behavior, but may work as expected on many implementations)
void fn(void)
{
/* write something after this comment so that the program output is 10 */
int a[1] = {0};
int j = 0;
while(a[j] != 5) ++j; /* Search stack until you find 5 */
a[j] = 10; /* Overwrite it with 10 */
/* write something before this comment */
}
Problem
This program worked fine in MSVC and gcc without optimization. But when I compiled it with gcc -O2 flag or tried on ideone, it loops infinitely in function fn.
My Observation
When I compiled the file with gcc -S vs gcc -S -O2 and compared, it clearly shows gcc kept an infinite loop in function fn.
Question
I understand because the code invokes undefined behavior, one can not call it a bug. But why and how does compiler analyze the behavior and leave an infinite loop at O2?
Several commenters asked how the behavior changes if some of the variables are made volatile. The results, as expected, are:
If i or j is made volatile, the program behavior remains the same.
If the array a is made volatile, the program does not loop infinitely.
Moreover if I apply the following patch
- int a[1] = {0};
+ int aa[1] = {0};
+ int *a = aa;
the program behavior remains the same (infinite loop).
If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:
;; Function fn (fn) (executed once)
Removing basic block 3
fn ()
{
<bb 2>:
<bb 3>:
goto <bb 3>;
}
;; Function main (main) (executed once)
main ()
{
<bb 2>:
fn ();
}
Invalid sum of incoming frequencies 0, should be 10000
This verifies the assertions made after the answers below.
This is undefined behavior, so the compiler can really do anything at all. We can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:
.L2:
jmp .L2
The article says (emphasis mine):
Of course this is an infinite loop. Since SATD() unconditionally
executes undefined behavior (it’s a type 3 function), any
translation (or none at all) is perfectly acceptable behavior for a
correct C compiler. The undefined behavior is accessing d[16] just
before exiting the loop. In C99 it is legal to create a pointer to
an element one position past the end of the array, but that pointer
must not be dereferenced. Similarly, the array cell one element past
the end of the array must not be accessed.
If we examine your program with godbolt, we see the same kind of result:
fn:
.L2:
jmp .L2
The logic being used by the optimizer probably goes something like this:
All the elements of a are initialized to zero.
a is never modified before or within the loop.
So a[j] != 5 is always true -> infinite loop.
Because the loop is infinite, a[j] = 10; is unreachable and can be optimized away, and so can a and j, since they are no longer needed to determine the loop condition.
This is similar to the case in the article, which, given:
int d[16];
analyzes the following loop:
for (dd=d[k=0]; k<16; dd=d[++k])
like this:
upon seeing d[++k], is permitted to assume that the incremented value
of k is within the array bounds, since otherwise undefined behavior
occurs. For the code here, GCC can infer that k is in the range 0..15.
A bit later, when GCC sees k<16, it says to itself: “Aha– that
expression is always true, so we have an infinite loop.”
Perhaps an interesting secondary point is whether an infinite loop is considered observable behavior (with respect to the as-if rule) or not, which affects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:
Many knowledgeable people (including me) read this as saying that the
termination behavior of a program must not be changed. Obviously some
compiler writers disagree, or else don’t believe that it matters. The
fact that reasonable people disagree on the interpretation would seem
to indicate that the C standard is flawed.
C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.
In the optimized version, the compiler has decided a few things:
The array a doesn't change before that test.
The array a doesn't contain a 5.
Therefore, we can rewrite the code as:
void fn(void) {
int a[1] = {0};
int j = 0;
while(true) ++j;
a[j] = 10;
}
Now, we can make further decisions:
All the code after the while loop is dead code (unreachable).
j is written but never read. So we can get rid of it.
a is never read.
At this point, your code has been reduced to:
void fn(void) {
int a[1] = {0};
while(true);
}
And we can make the note that a is now never read, so let's get rid of it as well:
void fn(void) {
while(true);
}
Now, the unoptimized code:
In unoptimized generated code, the array will remain in memory, and you'll literally walk it at runtime. It's possible that there will be a readable 5 somewhere after it once you walk past the end of the array.
Which is why the unoptimized version sometimes doesn't crash and burn.
If the loop does get optimized into an infinite loop, it could be due to static code analysis seeing that your array is
not volatile
contains only 0
never gets written to
and thus it is not possible for it to contain the number 5. Which means an infinite loop.
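For contrast, a search that stays inside the array has defined behavior, so the optimizer has no out-of-bounds access to reason from. A minimal sketch, with find_value as a hypothetical helper (it does not solve the puzzle; it only shows what the bounded form of the loop looks like):

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element equal to target,
   or -1 if no element in a[0..n-1] matches. */
long find_value(const int *a, size_t n, int target)
{
    assert(a != NULL);
    for (size_t j = 0; j < n; j++) {   /* j never leaves the array */
        if (a[j] == target)
            return (long)j;
    }
    return -1;
}
```

For the one-element array {0} from the question, this simply reports that 5 is not present instead of walking off the end.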
Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.
As a side note, I bet what your friend actually expected was this:
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
while(1); /* Endless loop, function won't return, i won't be output */
/* write something before this comment */
}
or this (if stdlib.h is included):
void fn(void)
{
/* write something after this comment so that the program output is 10 */
printf("10\n"); /* Output 10 */
exit(0); /* Exit gracefully */
/* write something before this comment */
}
I don't quite get the following part of 5.1.2.3/3:
An actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).
Suppose I have the following code:
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
/* no break in here */
}
}
/* i no longer used, do something with "found" here */
Note that found starts with 0 and can either remain unchanged or turn into 1. It cannot turn into 1 and then into something else. So the following code would yield the same result (except for i value which is not used after the loop anyway):
char data[size];
int i;
int found;
/* initialize data to some values in here */
found = 0;
for( i = 0; i < size; i++ ) {
if( data[i] == 0 ) {
found = 1;
break;
}
}
/* i no longer used, do something with "found" here */
Now what does the Standard's "need not evaluate part of an expression" wording mean with regard to found = 1 and the loop control expressions that follow the first iteration in which control gets inside the if?
Clearly, if found is used somewhere after this code, the compiler must emit code that traverses the array and conditionally evaluates the found = 1 expression.
Is the implementation required to evaluate found = 1 once for every zero found in the array, or can it instead evaluate it no more than once and so effectively emit the code for the second snippet when compiling the first snippet?
can it instead evaluate it no more than once and so effectively emit the code for the second snippet when compiling the first snippet?
Yes, a compiler has the right to perform that optimization. It seems like a pretty aggressive optimization but it would be legal.
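The equivalence that makes this legal can be checked directly. Below is a small sketch with the two snippets from the question wrapped in hypothetical functions (has_zero_full_scan and has_zero_early_exit are names made up for this illustration); for any input, the observable result is the same.

```c
#include <assert.h>
#include <stddef.h>

/* First snippet: keeps scanning after a zero has been found. */
int has_zero_full_scan(const char *data, size_t size)
{
    int found = 0;

    assert(data != NULL);
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0)
            found = 1;      /* no break here */
    }
    return found;
}

/* Second snippet: stops at the first zero. */
int has_zero_early_exit(const char *data, size_t size)
{
    int found = 0;

    assert(data != NULL);
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0) {
            found = 1;
            break;
        }
    }
    return found;
}
```

Since data is not volatile and nothing else observes the extra reads, a compiler may treat the first function as if it were the second.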
It might be interesting to look at an example that more closely matches the spirit of the text:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
Suppose we have:
int x = pureFunction(y) * otherPureFunction(z);
Suppose the compiler knows that both functions are int-returning "pure" functions; that is, they have no side effects and their result depends solely on the arguments. Suppose the compiler also believes that otherPureFunction is an extremely expensive operation. A compiler could choose to implement the code as though you had written:
int temp = pureFunction(y);
int x = temp == 0 ? 0 : temp * otherPureFunction(z);
That is, determine that under some conditions it is unnecessary to compute otherPureFunction() because the result of the multiplication is already known once the left operand is known to be zero. No needed side effects will be elided because there are no side effects.
Yes, it may perform this optimization, since there are no I/O operations, reads from volatile locations or externally visible writes to memory omitted by the optimized code, so the behavior is preserved.
As an example of this kind of optimization, GCC will compile
void noop(const char *s)
{
for (size_t i = 0; i < strlen(s); i++) {
}
}
to a completely empty function:
noop:
.LFB33:
.cfi_startproc
rep ret
.cfi_endproc
It is allowed to do so because the Standard guarantees the behavior of strlen, the compiler knows that it has no externally visible effect on s or any other piece of memory, and it can deduce that the whole function has no behavior. (Amazingly, this simple optimization brings the complexity down from quadratic to constant.)
I am following the book Let us C and the following code has been shown as being perfectly correct:
for ( i < 4 ; j = 5 ; j = 0 )
printf ( "%d", i ) ;
But in the Turbo C it gives 3 warnings:
Code has no effect. Possibly incorrect assignment. 'j' is assigned a
value that is never used.
If the book is making the point that this code is allowed by the C standard, then it is correct. This code does not violate any rule of the C standard, provided that i and j have previously been declared correctly (and printf too, by including <stdio.h>).
However, nobody would actually write code like this, because it is not useful. That is why the compiler is issuing a warning, because the code is technically allowed but is probably not what a programmer would intend.
If the book is claiming that this code is useful in some way, then it is probably a typographical error. It is certainly wrong. If the book has more than a few errors like this, you should discard it.
I don't know what your book wants to teach you with this example, but AFAIK a for loop should always be in the form
for ( init; check; next ) {
/* do something */
}
where init initializes what you're going to use, check tests whether the loop should continue or stop, and next performs some kind of action after each iteration. It is the same as
init;
while ( check ) {
/* do something */
next;
}
Therefore you are getting the warning because:
Code has no effect refers to i < 4. As you can see in the while form, the result of this comparison isn't used in any way, so it has no effect.
Possibly incorrect assignment refers to j = 5, because you're using an assignment as the check, and it will always evaluate to the value assigned (in this case 5, which is nonzero and therefore true).
'j' is assigned a value that is never used means what it says: j is never read, since the example only prints i.
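A small sketch of why the book's loop never ends on its own. The function run_book_loop and its cap parameter are hypothetical; the break is not in the book and is added only so the demonstration terminates. The (void) cast and the explicit != 0 are there only to keep compilers quiet and do not change the behavior.

```c
#include <assert.h>

/* Runs the book's loop header, stopping after cap iterations.
   Returns how many times the body ran. */
int run_book_loop(int i, int cap)
{
    int j = 0;
    int runs = 0;

    assert(cap > 0);
    /* init:  i < 4 is evaluated once and its result is discarded
       check: j = 5 yields 5, which is nonzero, so it is always true
       next:  j = 0 is overwritten by the next check */
    for ((void)(i < 4); (j = 5) != 0; j = 0) {
        runs++;
        if (runs >= cap)
            break;          /* not in the book: ends the sketch */
    }
    return runs;
}
```

Whatever value i has, the loop only stops because of the added cap.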
What the book probably meant is for ( i = 0; i < 5; i++ ).
And what you probably need to do is use a better book.
It is valid C code, but it's pretty much meaningless. It does not initialize the loop properly, and it results in an infinite loop. Loops look something like
for (i = 0; i < 10; i++)
The first expression is the initializer, the second is the condition for continuing, and the last is the increment. I would get rid of that book.
Check this out.
int i=0;
for(i=0;i<5;i++)
{
printf("%d",i);
}
This loop is correct and stops after five iterations. The loop from the book is also valid C, but it is infinite.
The general way to write a for loop is
int i ;
for(i = 0; i< [variable or number];i++){
printf("%d",i);
}
The code you wrote is meaningless, and you can't do anything useful with it. It actually prints the value of i an infinite number of times, because i never changes.
We know nothing about the value of i, because the result of i < 4 is discarded. The output is always the same number.