Is an infinite loop like for (;;); undefined behavior in C? (It is for C++, but I don't know about C.)
No, the behavior of a for (;;) statement is well defined in C.
N1570, which is essentially identical to the official 2011 ISO C standard, says, in section 6.8.5 paragraph 6:
An iteration statement whose controlling expression is not a constant
expression, that performs no input/output operations, does not access
volatile objects, and performs no synchronization or atomic operations
in its body, controlling expression, or (in the case of a for
statement) its expression-3, may be assumed by the implementation to
terminate.
with two footnotes:
An omitted controlling expression is replaced by a nonzero constant,
which is a constant expression.
This is intended to allow compiler transformations such as removal of
empty loops even when termination cannot be proven.
The first footnote makes it clear that for (;;) is treated as if it had a constant controlling expression.
The point of the rule is to permit optimizations when the compiler can't prove that the loop terminates. But if the controlling expression is constant, the compiler can trivially prove that the loop does or does not terminate, so the additional permission isn't needed.
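To make the contrast concrete, here is a sketch; the Collatz loop is my own illustrative example, not from the question:

```c
/* Well defined in C: the controlling expression is omitted, so per the
   first footnote it is treated as the nonzero constant 1, and 6.8.5p6
   does not apply. The compiler may not assume this loop terminates. */
void spin(void)
{
    for (;;)
        ;
}

/* Illustrative: the controlling expression is NOT constant, and the body
   performs no I/O, volatile access, or atomic/synchronization operations,
   so an implementation may assume this loop terminates even though nobody
   has proved the Collatz conjecture. */
unsigned collatz_reaches_one(unsigned n)
{
    while (n != 1)
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    return 1;
}
```

A compiler exercising 6.8.5p6 could compile collatz_reaches_one down to `return 1;`, but it may not remove the loop in spin.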
The rationale that makes this an issue for C++ isn't relevant to C. Section 5.1.2.3p6 states the limits on optimisation, and one of them is:
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
Now the question becomes "What data would execution according to the abstract semantics have produced?". Assuming a signal interrupts the loop, the program may very well terminate. The abstract semantics would have produced no output prior to that signal being raised, however. If anything, the compiler may optimise the puts("Hello"); away.
Related
Assume the following code
struct a {
    unsigned cntr;
};

void boo(struct a *v) {
    v->cntr++;
    while (v->cntr > 1);
}
I wonder if the compiler is allowed to omit the while loop inside boo() due to the following statement in the C11 standard:
An iteration statement whose controlling expression is not a constant expression,156) that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.157)
157)This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.
Can v->cntr, in the controlling expression, be considered as a synchronization since v may be a pointer to a global structure which can be modified externally (for example by another thread)?
Additional question.
Is the compiler allowed not to re-read v->cntr on each iteration if v is not defined as volatile?
Can v->cntr, in the controlling expression, be considered as a synchronization
No.
From https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p5 :
The library defines a number of atomic operations (7.17) and operations on mutexes (7.26.4) that are specially identified as synchronization operations.
So basically, functions from stdatomic.h and the mtx_* functions from threads.h are synchronization operations.
since v may be a pointer to a global structure which can be modified externally (for example by another thread)?
Does not matter. Assumptions like that sound to me like they would disallow many sane optimizations; I wouldn't want my compiler to assume that.
If v->cntr were modified in another thread without synchronization, the accesses would be unsequenced; that is a data race and simply results in undefined behavior: https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p25 .
Is the compiler allowed not to re-read v->cntr on each iteration if v is not defined as volatile?
Yes.
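If the counter really is meant to be modified by another thread, a sketch of a conforming alternative (the names here are illustrative) uses C11 atomics, which the quoted paragraph explicitly excludes:

```c
#include <stdatomic.h>

/* Hypothetical atomic variant of the struct from the question. */
struct a_atomic {
    atomic_uint cntr;
};

/* Each iteration now performs an atomic operation, which 6.8.5p6
   explicitly excludes, so the compiler may neither assume the loop
   terminates nor hoist the read of cntr out of the loop. */
void boo_atomic(struct a_atomic *v)
{
    atomic_fetch_add(&v->cntr, 1);
    while (atomic_load(&v->cntr) > 1)
        ; /* spin until another thread brings cntr back down */
}
```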
From ISO/IEC 9899:201x section 5.1.2.3 Program execution paragraph 4:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
What exactly is the allowed optimization here regarding the volatile object? Can someone give an example of a volatile access that CAN be optimized away?
Since accesses to volatile objects are observable behavior (described in paragraph 6), it seems that no optimization can take place regarding volatiles, so I'm curious to know what optimization is allowed by paragraph 4.
Reformatting a little:
An actual implementation need not evaluate part of an expression if:
a) it can deduce that its value is not used; and
b) it can deduce that no needed side effects are produced (including any
caused by calling a function or accessing a volatile object).
Reversing the logic without changing the meaning:
An actual implementation must evaluate part of an expression if:
a) it can't deduce that its value is not used; or
b) it can't deduce that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).
Simplifying to focus on the volatile part:
An actual implementation must evaluate part of an expression if needed
side effects are produced (including accessing a volatile object).
Accesses to volatile objects must be evaluated. The phrase “including any…” modifies “side effects.” It does not modify “if it can deduce…” It has the same meaning as:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects (including any caused by calling a function or accessing a volatile object) are produced.
This means “side effects” includes side effects that are caused by accessing a volatile object. In order to decide it cannot evaluate part of an expression, an implementation must deduce that no needed side effects, including any caused by calling a function or accessing a volatile object, are produced.
It does not mean that an implementation can discard evaluation of part of an expression even if that expression includes accesses to a volatile object.
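A small sketch of the distinction (the register-like variable is hypothetical):

```c
/* Hypothetical volatile object standing in for, say, a memory-mapped
   status register. */
static volatile int reg = 7;

int demo(void)
{
    int x = reg;  /* volatile access: a needed side effect, so the load
                     itself must be performed by a conforming compiler */
    return x * 0; /* ordinary arithmetic on the already-loaded value:
                     the multiply may be folded to the constant 0 */
}
```

The evaluation of the multiplication can be discarded; the volatile access cannot.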
can someone give an example of a volatile access that CAN be optimized
away?
I think that you misinterpreted the text. IMO this paragraph means that in
volatile unsigned int bla = whatever();
if (bla < 0) // always false for an unsigned value: the branch body need not be evaluated, even though a volatile is involved
Adding another example along the same lines, in my understanding:
volatile int vol_a;
....
int b = vol_a * 0; // the multiplication may be folded to 0, but the volatile read of vol_a must still be performed
In cases where an access to a volatile object would affect system behavior in a way that would be necessary to make a program achieve its purpose, such an access must not be omitted. If the access would have no effect whatsoever on system behavior, then the operation could be "performed" on the abstract machine without having to execute any instructions. It would be rare, however, for a compiler writer to know with certainty that the effect of executing instructions to perform the accesses would be the same as the effect of pretending to do those instructions on the abstract machine while skipping them on the real one.
In the much more common scenario where a compiler writer would have no particular knowledge of any effect that a volatile access might have, but also have no particular reason to believe that such accesses couldn't have effects the compiler writer doesn't know about (e.g. because of hardware which is triggered by operations involving certain addresses), a compiler writer would have to allow for the possibility that such accesses might have "interesting" effects by performing them in the specified sequence, without regard for whether the compiler writer knows of any particular reason that the sequence of operations should matter.
Consider the following C code:
#include <signal.h>

static sig_atomic_t x;
static sig_atomic_t y;

void foo(void)
{
    x = 1;
    y = 2;
}
First question: can the C compiler decide to "optimize" the code for foo to y = 2; x = 1 (in the sense that the memory location for y is changed before the memory location for x)? This would be equivalent, except when multiple threads or signals are involved.
If the answer to the first question is "yes": what should I do if I really want the guarantee that x is stored before y?
Yes, the compiler may change the order of the two assignments, because the reordering is not "observable" as defined by the C standard: the assignments produce no observable side effects (again, as defined by the C standard, which does not consider the existence of an outside observer).
In practice you need some kind of barrier/fence to guarantee the order, e.g., use the services provided by your multithreading environment, or possibly C11 stdatomic.h if available.
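As a sketch of the stdatomic.h route (function names are illustrative):

```c
#include <stdatomic.h>

static atomic_int x;
static atomic_int y;

/* The release store to y, paired with an acquire load of y in the
   reader, guarantees that a reader observing y == 2 also observes
   x == 1. */
void writer(void)
{
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_store_explicit(&y, 2, memory_order_release);
}

int reader(void)
{
    if (atomic_load_explicit(&y, memory_order_acquire) == 2)
        return atomic_load_explicit(&x, memory_order_relaxed);
    return -1; /* writer has not finished yet */
}
```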
The C standard specifies a term called observable behavior. This means that at a minimum, the compiler/system has a few restrictions: it is not allowed to re-order expressions containing volatile-qualified operands, nor is it allowed to re-order input/output.
Apart from those special cases, anything is fair game. It may execute y before x, it may execute them in parallel. It might optimize the whole code away as there are no observable side-effects in the code. And so on.
Please note that thread-safety and order of execution are different things. Threads are created explicitly by the programmer/libraries. A context switch may interrupt any variable access which is not atomic. That's another issue, and the solution is to use a mutex, the _Atomic qualifier, or similar protection mechanisms.
If the order matters, you should volatile-qualify the variables. In that case, the following guarantees are made by the language:
C17 5.1.2.3 § 6 (the definition of observable behavior):
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
C17 5.1.2.3 § 4:
In the abstract machine, all expressions are evaluated as specified by the semantics.
Where "semantics" is pretty much the whole standard, for example the part that specifies that a ; constitutes a sequence point. (In this case, C17 6.7.6: "The end of a full declarator is a sequence point." The term "sequenced before" is specified in C17 5.1.2.3 §3.)
So given this:
volatile int x = 1;
volatile int y = 1;
then the order of initialization is guaranteed to be x before y, as the ; of the first line guarantees the sequencing order, and volatile guarantees that the program strictly follows the evaluation order specified in the standard.
Now as it happens in the real world, volatile does not guarantee memory barriers on many compiler implementations for multi-core systems. Those implementations are not conforming.
Opportunist compilers might claim that the programmer must use system-specific memory barriers to guarantee order of execution. But in case of volatile, that is not true, as proven above. They just want to dodge their responsibility and hand it over to the programmers. The C standard doesn't care if the CPU has 57 cores, branch prediction and instruction pipelining.
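A minimal sketch of the guarantee being described (the variable names are illustrative):

```c
/* Accesses to volatile objects are observable behavior, evaluated
   strictly per the abstract machine, so the compiler must emit the
   store to vx before the store to vy. Whether another core observes
   that order is the separate, hardware-level question discussed above. */
volatile int vx;
volatile int vy;

void ordered_writes(void)
{
    vx = 1; /* must complete at the sequence point before the next access */
    vy = 2;
}
```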
We know that logical-AND operator (&&) guarantees left-to-right evaluation.
But I am wondering if the compiler optimizer can ever reorder the memory access instructions for *a and b->foo in the following code, i.e. the optimizer writes instructions that try to access *b before accessing *a.
(Consider both a and b to be pointers to memory regions in the heap.)
if (*a && b->foo) {
/* do something */
}
One might think that && causes a sequence point, so the compiler must emit instructions that access *a before accessing *b. But after reading the accepted answer at https://stackoverflow.com/a/14983432/1175080, I am not so sure. In that answer there are semicolons between statements, and semicolons also establish sequence points, so they too should prevent reordering; yet the answer seems to indicate that a compiler-level memory barrier is needed despite the semicolons.
I mean, if you claim that && establishes a sequence point, the same is true of the semicolons in the code at https://stackoverflow.com/a/14983432/1175080. Why, then, is a compiler-level memory barrier required in that code?
The system can begin evaluating b->foo speculatively, up until it hits something that exceeds its ability to execute speculatively. Most modern systems can handle a speculative fault and ignore the fault if it turns out that the results of the operation are never used.
So it's purely up to the capabilities of the compiler, CPU, and other system components. So long as it can ensure there are no visible consequences to conforming code, it can execute (almost) anything it wants (almost) any time it wants.
But I am wondering if the compiler optimizer can ever reorder the memory access instructions for *a and b->foo in the following code, i.e. the optimizer writes instructions that try to access *b before accessing *a.
if (*a && b->foo) {
/* do something */
}
The C semantics for the expression require that *a be evaluated first, and that b->foo be evaluated only if *a evaluated to nonzero. @Jack's answer provides the basis for that in the standard. But your question is about optimizations that the compiler performs, and the standard specifies that
The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.
(C2013, 5.1.2.3/1)
An optimizing compiler can produce code that does not conform to the abstract semantics if it produces the same external behavior.
In particular, in your example code, if the compiler can prove (or is willing to assume) that the evaluations of *a and b->foo have no externally visible behavior and are independent -- neither has a side effect that impacts the evaluation or side effects of the other -- then it may emit code that evaluates b->foo unconditionally, either before or after evaluating *a. Note that if b is NULL or contains an invalid pointer value then evaluating b->foo has undefined behavior. In that case, evaluation of b->foo is not independent of any other evaluation in the program.
As @DavidSchwartz observes, however, even if b's value may be null or invalid, the compiler may still be able to emit code that speculatively proceeds as if it were valid, and backtracks if that turns out not to be the case. The key point here is that the externally-visible behavior is unaffected by valid optimizations.
According to the C11 ISO standard, Annex C states that
The following are the sequence points described in
... Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).
And, as stated in §5.1.2.3:
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.
So it's guaranteed that the first operand is evaluated before the second. No optimization that observably reorders these evaluations is permitted.
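This sequencing guarantee is what makes the familiar null-check idiom safe; a small illustrative sketch (the struct is hypothetical, matching the b->foo access in the question):

```c
#include <stddef.h>

struct s { int foo; };

/* The sequence point at each && guarantees left-to-right evaluation
   with short-circuiting, so each dereference is reached only after
   the null check before it has succeeded. */
int check(const int *a, const struct s *b)
{
    return a != NULL && *a && b != NULL && b->foo;
}
```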
First of all I take it as granted that && stands for the built-in version of the logical AND operator.
I think that the compiler may legitimately perform evaluations from the right-hand-side sub-expression of the && operator before it completes evaluation of the left-hand side, but in a manner that wouldn't change the semantics of the full expression.
For your example, a C++ compiler is allowed to introduce reordering under the following conditions:
1. a is a primitive pointer (i.e. its type is not a class that overloads operator*).
2. b is a primitive pointer (i.e. its type is not a class that overloads operator->).
3. b is known to be dereferenceable regardless of the value of *a.
If 1. doesn't hold, then the user-defined operator* may have a side effect of changing the value of b->foo.
If 2. doesn't hold, then the user-defined operator-> may change the value of *a, or throw, or produce another observable side-effect (e.g. print something) that shouldn't have shown up had *a evaluated to false.
If 3. cannot be proved through static analysis, then reordering would introduce undefined behaviour that is not in the original program.
A C compiler, naturally, only needs to perform the third check.
In fact, even if *a and b->foo involve operator overloading, C++ compiler can still reorder some instructions when those operators can be inlined and the compiler doesn't detect anything dangerous.
In the C++03 Standard, 1.9/6, there's this definition of observable behavior:
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
and the Standard goes to some length explaining that the compiler must preserve observable behavior while doing optimizations.
However, there's no such (or similar) definition in the C99 draft I'm looking at. The only place observable behavior is mentioned is 6.7.3/7:
The intended use of the restrict qualifier (like the register storage class) is to promote
optimization, and deleting all instances of the qualifier from a conforming program does
not change its meaning (i.e., observable behavior)
Is there a definition of what exactly the compiler must preserve when optimizing a C99 program?
In my draft, §3.4 defines behavior as "external appearance or action". "Observable behavior" seems to be a pleonasm that occurs exactly once.
§5.1.2.3, Program execution, further defines the behavior of C programs:
The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.
It then goes on to define side-effects as "changes in the state of the execution environment" caused by "[a]ccessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations". Side-effects are sequenced at sequence points.
This seems to be stricter than C++ in that "modifying an object", i.e. writing to memory, is (observable) behavior in C.
As for allowed optimization:
In the abstract machine, all expressions are evaluated as specified by the semantics. An
actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).
"Needed side-effects" are then listed in the following point:
- At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.
- At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
- The input and output dynamics of interactive devices shall take place as specified in 7.19.3.
The paragraph concludes with a list of examples; §7.19.3 describes files in the context of stdio.
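As an illustrative sketch of these least requirements (the function names are mine):

```c
#include <stdio.h>

/* The loop body produces no needed side effects, so an optimizer may
   legally fold the whole loop into the constant result. */
int sum_below(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Writing to a stream falls under the least requirements: the data that
   reaches the file must match what the abstract machine would produce. */
void report(void)
{
    printf("%d\n", sum_below(1000)); /* must print 499500 */
}
```

The optimizer is free to replace the loop with `sum = 499500;`, but it is not free to change or omit the output.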