C11 - omitting a potentially infinite loop by the compiler

Assume the following code:
struct a {
    unsigned cntr;
};

void boo(struct a *v) {
    v->cntr++;
    while (v->cntr > 1);
}
I wonder if the compiler is allowed to omit the while loop inside boo() due to the following statement in the C11 standard:
An iteration statement whose controlling expression is not a constant expression,156) that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.157)
157)This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.
Can v->cntr, in the controlling expression, be considered as a synchronization since v may be a pointer to a global structure which can be modified externally (for example by another thread)?
Additional question.
Is the compiler allowed not to re-read v->cntr on each iteration if v is not defined as volatile?

Can v->cntr, in the controlling expression, be considered as a synchronization
No.
From https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p5 :
The library defines a number of atomic operations (7.17) and operations on mutexes (7.26.4) that are specially identified as synchronization operations.
So basically, the atomic operations from stdatomic.h and the mtx_* functions from threads.h are synchronization operations.
since v may be a pointer to a global structure which can be modified externally (for example by another thread)?
It does not matter. An assumption like that sounds to me like it would disallow many sane optimizations; I wouldn't want my compiler to assume that.
If v->cntr were modified in another thread without synchronization, the conflicting accesses would constitute a data race, which results in undefined behavior: https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p25 .
Is the compiler allowed not to re-read v->cntr on each iteration if v is not defined as volatile?
Yes.
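To make both answers concrete, here is a sketch (boo_as_if, boo_atomic and struct a_atomic are made-up names; struct a is the one from the question). The first function shows a transformation the implementation is allowed to perform on boo() as written; the second shows how C11 atomics force the counter to be re-read and take the loop out of the scope of the termination assumption:

#include <stdatomic.h>   /* only needed for the atomic variant below */

/* What the implementation is allowed to do with boo() as written: the
   controlling expression is not constant, performs no I/O and involves
   no volatile, atomic or synchronization operations, so the loop may be
   assumed to terminate and dropped (C11 6.8.5p6). The load of v->cntr
   may also be done once rather than on every iteration. */
void boo_as_if(struct a *v) {
    v->cntr++;
    /* while (v->cntr > 1);  -- removed */
}

/* With C11 atomics, every iteration performs an atomic operation, so
   the termination assumption no longer applies and the counter is
   re-read each time around the loop. */
struct a_atomic {
    atomic_uint cntr;
};

void boo_atomic(struct a_atomic *v) {
    atomic_fetch_add(&v->cntr, 1);
    while (atomic_load(&v->cntr) > 1)
        ;
}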

Related

Optimization allowed on volatile objects

From ISO/IEC 9899:201x section 5.1.2.3 Program execution paragraph 4:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
What exactly is the allowed optimization here regarding the volatile object? Can someone give an example of a volatile access that CAN be optimized away?
Since volatile accesses are observable behaviour (described in paragraph 6), it seems that no optimization can take place regarding volatiles, so I'm curious to know what optimization is allowed by paragraph 4.
Reformatting a little:
An actual implementation need not evaluate part of an expression if:
a) it can deduce that its value is not used; and
b) it can deduce that no needed side effects are produced (including any
caused by calling a function or accessing a volatile object).
Reversing the logic without changing the meaning:
An actual implementation must evaluate part of an expression if:
a) it can't deduce that its value is not used; or
b) it can't deduce that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).
Simplifying to focus on the volatile part:
An actual implementation must evaluate part of an expression if needed
side effects are produced (including accessing a volatile object).
Accesses to volatile objects must be evaluated. The phrase “including any…” modifies “side effects.” It does not modify “if it can deduce…” It has the same meaning as:
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects (including any caused by calling a function or accessing a volatile object) are produced.
This means “side effects” includes side effects that are caused by accessing a volatile object. In order to decide it cannot evaluate part of an expression, an implementation must deduce that no needed side effects, including any caused by calling a function or accessing a volatile object, are produced.
It does not mean that an implementation can discard evaluation of part of an expression even if that expression includes accesses to a volatile object.
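A small sketch of the distinction (f, reg, unused1 and unused2 are made-up names; reg stands in for something like a memory-mapped status register):

extern volatile int reg;

void f(int a, int b) {
    int unused1 = a + b;   /* value never used and no side effects produced:
                              this computation may be dropped entirely */
    int unused2 = reg;     /* value never used, but the read of a volatile
                              object is a needed side effect: the access
                              must still be performed */
}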
can someone give an example of a volatile access that CAN be optimized
away?
I think that you misinterpreted the text; IMO this paragraph means that:
volatile unsigned int bla = whatever();
if (bla < 0) { }   // the branch body is never evaluated, even though a volatile
                   // is involved: bla is unsigned, so bla < 0 is always false
Adding another example that fits into this in my understanding:
volatile int vol_a;
....
int b = vol_a * 0; // vol_a is not evaluated
In cases where an access to a volatile object would affect system behavior in a way that would be necessary to make a program achieve its purpose, such an access must not be omitted. If the access would have no effect whatsoever on system behavior, then the operation could be "performed" on the abstract machine without having to execute any instructions. It would be rare, however, for a compiler writer to know with certainty that the effect of executing instructions to perform the accesses would be the same as the effect of pretending to do those instructions on the abstract machine while skipping them on the real one.
In the much more common scenario where a compiler writer would have no particular knowledge of any effect that a volatile access might have, but also have no particular reason to believe that such accesses couldn't have effects the compiler writer doesn't know about (e.g. because of hardware which is triggered by operations involving certain addresses), a compiler writer would have to allow for the possibility that such accesses might have "interesting" effects by performing them in the specified sequence, without regard for whether the compiler writer knows of any particular reason that the sequence of operations should matter.
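As a concrete, hypothetical illustration of that common case: a memory-mapped device register is declared volatile precisely because the compiler cannot know whether the accesses trigger hardware activity, so it must perform every one of them in order (the register name and address below are made up):

/* Hypothetical memory-mapped UART transmit register at a made-up address. */
#define UART_TX (*(volatile unsigned char *)0x4000C000u)

void send_abc(void) {
    /* Each volatile store may have a hardware effect the compiler knows
       nothing about, so none of the three stores may be merged, reordered
       or omitted. */
    UART_TX = 'a';
    UART_TX = 'b';
    UART_TX = 'c';
}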

Does the C compiler know when a statement operates on a file and thus has "observable behaviour"?

The C99 standard 5.1.2.3 §2 says
Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects, 12) which are changes in the state of
the execution environment. Evaluation of an expression in general includes both value
computations and initiation of side effects. Value computation for an lvalue expression
includes determining the identity of the designated object.
I guess that in a lot of cases the compiler can't inline and possibly eliminate the functions doing I/O since they live in a different translation unit. And the parameters to functions doing I/O are often pointers, further hindering the optimizer.
But link-time-optimization gives the compiler "more to chew on".
And even though the paragraph I quoted says that "modifying an object" (that's standard-speak for memory) is a side effect, stores to memory are not automatically treated as side effects when the optimizer kicks in. Here's an example from John Regehr's Nine Ways to Break Your Systems Software Using Volatile where the message store is reordered relative to the volatile ready variable.
volatile int ready;
int message[100];

void foo(int i) {
    message[i/10] = 42;   /* plain store: the compiler may move it past the volatile store */
    ready = 1;            /* volatile store to the ready flag */
}
How does a C compiler determine if a statement operates on a file? In a free-standing embedded environment I declare registers as volatile, thus hindering the compiler from optimizing calls away and swapping the order of I/O calls.
Is that the only way to tell the compiler that we're doing I/O? Or does the C standard dictate that these N calls in the standard library do I/O and thus must receive special treatment? But then, what if someone created their own system call wrapper for, say, read?
As C has no statement dedicated to IO, only function calls can modify files. So if the compiler sees no function call in a sequence of statements, it knows that this sequence has not modified any file.
If only functions from the standard library are called, and if the environment is hosted, the compiler could know what they do and use that to guess what will happen.
But what is really important is that the compiler only needs to respect side effects. When it does not know, it is perfectly allowed to assume that a function call could involve side effects and act accordingly. It is not a violation of the standard if no side effects are actually involved; the compiler just possibly misses an optimization.
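A small sketch of the practical consequence (write_log and tick are made-up names; write_log is assumed to be defined in another translation unit):

/* The compiler cannot see the body of write_log(), so it must assume the
   call may modify a file or have other side effects, and therefore may
   not remove it or reorder it relative to other side effects. */
void write_log(const char *msg);

static int counter;

void tick(void) {
    counter++;            /* visible, local state: may be optimized freely */
    write_log("tick");    /* opaque call: assumed to have side effects */
}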

C: thread safety and order of operations

Consider the following C code:
#include <signal.h>   /* for sig_atomic_t */

static sig_atomic_t x;
static sig_atomic_t y;

void foo(void)
{
    x = 1;
    y = 2;
}
First question: can the C compiler decide to "optimize" the code for foo to y = 2; x = 1 (in the sense that the memory location for y is changed before the memory location for x)? This would be equivalent, except when multiple threads or signals are involved.
If the answer to the first question is "yes": what should I do if I really want the guarantee that x is stored before y?
Yes, the compiler may change the order of the two assignments, because the reordering is not "observable" as defined by the C standard: the assignments produce no observable behavior (again, as defined by the C standard, which does not consider the existence of an outside observer).
In practice you need some kind of barrier/fence to guarantee the order, e.g., use the services provided by your multithreading environment, or possibly C11 stdatomic.h if available.
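A minimal sketch of that last option, assuming an implementation that provides C11 <stdatomic.h> (data, flag, writer and reader are made-up names): a release store to an atomic flag, paired with an acquire load in the reading thread, guarantees that the store to data is visible before the flag is seen as set.

#include <stdatomic.h>
#include <stdbool.h>

static int data;
static atomic_bool flag;

void writer(void)
{
    data = 42;
    /* release store: everything sequenced before it becomes visible to a
       thread that reads flag with acquire semantics and sees true */
    atomic_store_explicit(&flag, true, memory_order_release);
}

int reader(void)
{
    if (atomic_load_explicit(&flag, memory_order_acquire))
        return data;   /* guaranteed to observe 42 */
    return -1;         /* flag not set yet */
}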
The C standard specifies a term called observable behavior. This means that at a minimum, the compiler/system has a few restrictions: it is not allowed to re-order expressions containing volatile-qualified operands, nor is it allowed to re-order input/output.
Apart from those special cases, anything is fair game. It may execute y before x, it may execute them in parallel. It might optimize the whole code away as there are no observable side-effects in the code. And so on.
Please note that thread-safety and order of execution are different things. Threads are created explicitly by the programmer/libraries. A context switch may interrupt any variable access which is not atomic. That's another issue, and the solution is to use a mutex, the _Atomic qualifier or similar protection mechanisms.
If the order matters, you should volatile-qualify the variables. In that case, the following guarantees are made by the language:
C17 5.1.2.3 § 6 (the definition of observable behavior):
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
C17 5.1.2.3 § 4:
In the abstract machine, all expressions are evaluated as specified by the semantics.
Where "semantics" is pretty much the whole standard, for example the part that specifies that a ; consists of a sequence point. (In this case, C17 6.7.6 "The end of a full
declarator is a sequence point." The term "sequenced before" is specified in C17 5.1.2.3 §3).
So given this:
volatile int x = 1;
volatile int y = 1;
then the order of initialization is guaranteed to be x before y, as the ; of the first line guarantees the sequencing order, and volatile guarantees that the program strictly follows the evaluation order specified in the standard.
Now as it happens in the real world, volatile does not guarantee memory barriers on many compiler implementations for multi-core systems. Those implementations are not conforming.
Opportunistic compilers might claim that the programmer must use system-specific memory barriers to guarantee order of execution. But in the case of volatile, that is not true, as proven above. They just want to dodge their responsibility and hand it over to the programmers. The C standard doesn't care if the CPU has 57 cores, branch prediction and instruction pipelining.

Restrictions on non-volatile variables in C

I would like to understand what restrictions, if any, the compiler has with regard to non-volatile variables in C.
I'm not sure if it's true or not, but I've been told that if you have the following code:
int x;
...
void update_x(void) {
    lock();
    x = x * 5 + 3;
    unlock();
}
You must acquire the lock to read x, because even though the compiler is unlikely to do so, it is technically legal for it to store an intermediate calculation such as x*5 into x, and so a read might observe an intermediate value. So my first question is: is that indeed the case? If not, why not?
If it is, I have a follow-up question: is there anything that prevents the compiler from using x as temporary storage before or after taking the lock? (Assuming the compiler can prove that a single thread executing the program will not notice it.)
If not, does that mean that any program that has non-volatile shared variables is technically undefined, even if all the accesses are protected by locks?
Prior to C11, the answer is No, as the spec doesn't define anything about what multiple threads do, so any program that uses multiple threads where one thread writes an object and another thread reads it is undefined behavior.
With C11, there's actually a memory model that talks about multiple threads and data races, so the answer is Yes, as long as the lock/unlock routines do certain synchronization operations (involving either library functions that do the synchronization or operations on special _Atomic objects).
Since the C11 spec is attempting to codify behavior of existing implementations (for the most part), it is likely that any code that does what it requires (i.e., using an implementation-provided library for locking, or implementation-provided extensions for atomic operations) will work correctly even on pre-C11 implementations.
Section 5.1.2.4 of the C11 spec covers this.
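A minimal sketch of the C11 version, assuming an implementation that provides <threads.h> (x and update_x come from the question; x_lock, read_x and the initialization comment are made up):

#include <threads.h>

static int x;
static mtx_t x_lock;   /* assume mtx_init(&x_lock, mtx_plain) succeeded at startup */

void update_x(void)
{
    mtx_lock(&x_lock);     /* acquire: synchronizes with the previous unlock */
    x = x * 5 + 3;         /* no data race, because every access to x is guarded */
    mtx_unlock(&x_lock);   /* release */
}

int read_x(void)
{
    mtx_lock(&x_lock);
    int v = x;
    mtx_unlock(&x_lock);
    return v;
}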

Is an (empty) infinite loop undefined behavior in C?

Is an infinite loop like for (;;); undefined behavior in C? (It is for C++, but I don't know about C.)
No, the behavior of a for (;;) statement is well defined in C.
N1570, which is essentially identical to the official 2011 ISO C standard, says, in section 6.8.5 paragraph 6:
An iteration statement whose controlling expression is not a constant
expression, that performs no input/output operations, does not access
volatile objects, and performs no synchronization or atomic operations
in its body, controlling expression, or (in the case of a for
statement) its expression-3, may be assumed by the implementation to
terminate.
with two footnotes:
An omitted controlling expression is replaced by a nonzero constant,
which is a constant expression.
This is intended to allow compiler transformations such as removal of
empty loops even when termination cannot be proven.
The first footnote makes it clear that for (;;) is treated as if it had a constant controlling expression.
The point of the rule is to permit optimizations when the compiler can't prove that the loop terminates. But if the controlling expression is constant, the compiler can trivially prove that the loop does or does not terminate, so the additional permission isn't needed.
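To make the distinction concrete, here is a sketch (spin_forever and maybe_removed are made-up names): the first loop has a constant controlling expression, so 6.8.5p6 does not apply and it must actually loop forever; the second has a non-constant controlling expression with no I/O, volatile, atomic or synchronization operations, so the implementation may assume it terminates even though termination is hard to prove.

/* Constant controlling expression: must not be removed; the program hangs here. */
void spin_forever(void) {
    for (;;)
        ;
}

/* Non-constant controlling expression, no I/O, no volatile or atomic access:
   the implementation may assume this terminates and may drop the loop. */
void maybe_removed(unsigned n) {
    while (n != 1)
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;   /* Collatz-style update */
}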
The rationale behind the C++ version of this question isn't relevant to C. Section 5.1.2.3p6 states the limits to optimisation, and one of them is:
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
Now the question becomes "What data would execution according to the abstract semantics have produced?". Assuming a signal interrupts the loop, the program may very well terminate. The abstract semantics would have produced no output prior to that signal being raised, however. If anything, the compiler may optimise the puts("Hello"); away.

Resources