When can a volatile variable be optimized away completely?

Consider this code example:
int main(void)
{
    volatile int a;
    static volatile int b;
    volatile int c;
    c = 20;
    static volatile int d;
    d = 30;
    volatile int e = 40;
    static volatile int f = 50;
    return 0;
}
Without volatile a compiler could optimize away all the variables since they are never read from.
I think a and b can be optimized away since they are completely unused, see unused volatile variable.
I think c and d can not be removed since they are written to, and writes to volatile variables must actually happen. e should be equivalent to c.
GCC does not optimize away f, but it also does not emit any instruction to write to it. 50 is set in a data section. LLVM (clang) removes f completely.
Are these statements true?
A volatile variable can be optimized away if it is never accessed.
Initialization of a static or global variable does not count as access.

Writes to volatile variables (even automatic ones) count as observable behaviour.
C11 (N1570) 5.1.2.3/6:
The least requirements on a conforming implementation are:
— Accesses to volatile objects are evaluated strictly according to the rules of the abstract
machine.
— At program termination, all data written into files shall be identical to the result that
execution of the program according to the abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take place as specified in
7.21.3. The intent of these requirements is that unbuffered or line-buffered output
appear as soon as possible, to ensure that prompting messages actually appear prior to
a program waiting for input.
This is the observable behavior of the program.
The question is: does initialization (e, f) count as an "access"? As pointed out by Sander de Dycker, 6.7.3 says:
What constitutes an access to an object that has volatile-qualified type is implementation-defined.
which means it's up to the compiler whether or not e and f can be optimized away - but this must be documented!
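For instance, if you want a write to be honoured regardless of how an implementation defines "access", you can sidestep the initialization question entirely by assigning in a separate statement (a minimal sketch; the variable name is taken from the question):

volatile int e;   /* no initializer, so the 6.7.3 question does not arise */
e = 40;           /* an explicit write to a volatile object: observable
                     behaviour per 5.1.2.3                                 */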

Strictly speaking, any volatile variable that is accessed (read or written) cannot be optimized out according to the C Standard. The Standard says that an access to a volatile object may have unknown side effects and that accesses to volatile objects have to follow the rules of the C abstract machine (where all expressions are evaluated as specified by their semantics).
From the mighty Standard (emphasis mine):
(C11, 6.7.3p7) "An object that has volatile-qualified type may be modified in ways unknown to the
implementation or have other unknown side effects. Therefore any expression referring
to such an object shall be evaluated strictly according to the rules of the abstract machine,
as described in 5.1.2.3."
Following that, even a simple initialization of a variable should be considered an access. Remember that the static specifier also causes the object to be initialized (to 0) and thus accessed.
Now, compilers are known to behave differently with the volatile qualifier, and I would guess a lot of them will just optimize out most of the volatile objects in your example program except the ones with an explicit assignment (=).
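Conversely, an explicit read is unambiguously an access; a minimal sketch (read_it and sink are invented names) of forcing one:

static volatile int f = 50;

int read_it(void)
{
    int sink = f;   /* an explicit read of the volatile object: the load must
                       be emitted, so f can no longer be removed             */
    return sink;
}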

Related

Undefined behavior when working with partially initialized struct in C90

Let's consider the following code:
struct M {
    unsigned char a;
    unsigned char b;
};

void pass_by_value(struct M);

int main() {
    struct M m;
    m.a = 0;
    pass_by_value(m);
    return 0;
}
In the function pass_by_value, m.b is initialized before it is used.
However, since m is passed by value, the compiler already copies the whole struct, including the uninitialized m.b, to the stack at the call site.
No variable has storage class register here. a and b are of type unsigned char.
Does that have to be considered UB in C90? (Please note: I am specifically asking for C90)
This question is very similar to Returning a local partially initialized struct from a function and undefined behavior, but actually the other way around.
The C 1990 standard (and the C 1999 standard) does not contain the sentence, first appearing in C 2011, that makes the behavior of using some uninitialized values undefined.
C 2011 6.3.2.1 2 includes:
… If the lvalue has an incomplete type and does not have array type, the behavior is undefined. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
The whole of the corresponding paragraph in C 1990, clause 6.2.2.1, second paragraph, is:
Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined.
Therefore, the behavior of the code in the question would seem to be defined, inasmuch that it passes the value stored in the structure.
In the absence of explicit statements in the standard, common practice helps guide interpretation. It is perfectly normal not to initialize all members of a structure yet to expect the structure to represent useful data, and therefore the behavior of using the structure as a value must be defined if at least one of its members is initialized. The equivalent question for C 2011 contains mention (from a C defect report) of the standard struct tm in one of its answers. The struct tm may be used to represent a specific date by filling in all of the date fields (year, month, day of month) and possibly the time fields (hour, minute, second, even a Daylight Saving Time indication) but leaving the day-of-week and day-of-year fields uninitialized.
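A sketch of that struct tm idiom (mktime() reads only the calendar fields and itself recomputes tm_wday and tm_yday, so leaving those uninitialized is routine; demo is an invented name):

#include <time.h>

void demo(void)
{
    struct tm t;                      /* automatic: members indeterminate */
    t.tm_year = 90;                   /* years since 1900                 */
    t.tm_mon  = 5;                    /* June (months are 0-based)        */
    t.tm_mday = 14;
    t.tm_hour = t.tm_min = t.tm_sec = 0;
    t.tm_isdst = -1;                  /* let mktime decide DST            */
    /* tm_wday and tm_yday are deliberately left uninitialized */
    mktime(&t);
}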
In defining undefined behavior in 3.16, the 1990 standard does say it is “Behavior, upon use … of indeterminately valued objects, for which this International Standard imposes no requirements.” And 6.5.7 says “… If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate…” However, a structure with automatic storage duration in which one member, but not another, has been initialized is neither fully initialized nor not initialized. Given the intended uses of structures, I would say we should not consider use of the value of a partially initialized structure to be subject to being made undefined by 3.16.
Under C90, if an object held Indeterminate Value, each and every bit could independently be zero or one, regardless of whether or not they would in combination represent a valid bit pattern for the object's type. If an implementation specified the behavior of attempting to read each and every one of the 2ⁿ individual possible bit patterns an object could hold, the behavior of reading an object with Indeterminate Value would be equivalent to reading the value of an arbitrarily chosen bit pattern. If there were any bit patterns for which an implementation did not specify the effect of an attempted read, then the effects of trying to read an object that might hold such bit patterns would be likewise unspecified.
Code generation efficiency could be improved in some cases by specifying the behavior of uninitialized objects more loosely, in a way which would not otherwise be consistent with sequential program execution as specified but would nonetheless meet program requirements. For example, given something like:
struct foo { short dat[16]; } x,y,z;

void test1(int a, int b, int c, int d)
{
    struct foo temp;
    temp.dat[a] = 1;
    temp.dat[b] = 2;
    temp.dat[c] = 3;
    temp.dat[d] = 4;
    x = temp;
    y = temp;
}

void test2(int a, int b, int c, int d)
{
    test1(a,b,c,d);
    z = x;
}
If client code only cares about the values of x and y that correspond to values of temp that were written, efficiency might be improved, while still meeting requirements, if the code were rewritten as:
void test1(int a, int b, int c, int d)
{
    x.dat[a] = 1;
    y.dat[a] = 1;
    x.dat[b] = 2;
    y.dat[b] = 2;
    x.dat[c] = 3;
    y.dat[c] = 3;
    x.dat[d] = 4;
    y.dat[d] = 4;
}
The fact that the original function test1 doesn't do anything to initialize temp suggests that it won't care about what is yielded by any individual attempt to read it. On the other hand, nothing within the code for test2 would imply that client code wouldn't care about whether all members of x held the same values as corresponding values of y. Thus, such an inference would more likely be dangerous there.
The C Standard makes no attempt to define behavior in situations where an optimization might yield program behavior which, although useful, would be inconsistent with sequential processing of non-optimized code. Instead, the principle that optimizations must never affect any defined behavior is taken to imply that the Standard must characterize as Undefined all actions whose behavior would be visibly affected by optimization, leaving to implementor discretion the question of what aspects of behavior should or should not be defined in what circumstances. Ironically, the only time the Standard's laxity here would allow more efficient code generation outside contrived scenarios is when implementations treat the behavior as at least loosely defined and programmers are able to exploit that. If a programmer had to explicitly initialize all elements of temp to keep the compiler from behaving in completely nonsensical fashion, that would eliminate any possibility of optimizing out the unnecessary writes to unused elements of x and y.

__restrict vis-a-vis a function optimization behavior of popular compilers

Consider the following function:
int bar(const int* __restrict x, void g())
{
    int result = *x;
    g();
    result += *x;
    return result;
}
Do we need to read twice from x because of the call to g()? Or is the __restriction enough to guarantee the invocation of g() does not access/does not alter the value at address x?
At this link we can see what the most popular compilers have to say about this (GodBolt; language standard C99, platform AMD64):
clang 7.0: Restriction respected.
GCC 8.3: No restriction.
MSVC 19.16: No restriction.
Is clang rightly optimizing the second read away, or isn't it? I'm asking both for C and C++ here, as the behavior is the same (thanks @PSkocik).
Related information and some notes:
Readers unfamiliar with __restrict (or __restrict__) may want to have a look at: What does the restrict keyword mean in C++?
GCC's documentation page on restricted pointers in C++.
I've opened a bug against GCC on this point.
The fact that x is marked const is not significant here - we get the same behavior if we drop the const and the question stands as is.
I think this is effectively a C question, since C is effectively the language that has restrict, with a formal spec attached to it.
The part of the C standard that governs the use of restrict is 6.7.3.1:
1 Let D be a declaration of an ordinary identifier that provides a
means of designating an object P as a restrict-qualified pointer to
type T.
2 If D appears inside a block and does not have storage class extern,
let B denote the block. If D appears in the list of parameter
declarations of a function definition, let B denote the associated
block. Otherwise, let B denote the block of main (or the block of
whatever function is called at program startup in a freestanding
environment).
3 In what follows, a pointer expression E is said to be based on
object P if (at some sequence point in the execution of B prior to the
evaluation of E) modifying P to point to a copy of the array object
into which it formerly pointed would change the value of E.137) Note
that ''based'' is defined only for expressions with pointer types.
4 During each execution of B, let L be any lvalue that has &L based on
P. If L is used to access the value of the object X that it
designates, and X is also modified (by any means), then the following
requirements apply: T shall not be const-qualified. Every other lvalue
used to access the value of X shall also have its address based on P.
Every access that modifies X shall be considered also to modify P, for
the purposes of this subclause. If P is assigned the value of a
pointer expression E that is based on another restricted pointer
object P2, associated with block B2, then either the execution of B2
shall begin before the execution of B, or the execution of B2 shall
end prior to the assignment. If these requirements are not met, then
the behavior is undefined.
5 Here an execution of B means that portion of the execution of the
program that would correspond to the lifetime of an object with scalar
type and automatic storage duration associated with B.
The way I read it, the execution of g() falls under the execution of the bar's block, so g() is disallowed from modifying *x and clang is right to optimize out the second load (IOW, if *x refers to a non-const global, g() must not modify that global).
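To make that concrete, here is a hedged sketch (global, g, and demo are invented for illustration) of a caller that would run afoul of this reading:

int global;

void g(void) { global = 1; }  /* modifies the object bar's x will point to */

int bar(const int* __restrict x, void g());

int demo(void)
{
    global = 0;
    /* During bar's execution, *x is read via the restricted pointer while
       the same object is modified through g: undefined behaviour per
       6.7.3.1p4, so clang's single load is a legitimate outcome.          */
    return bar(&global, g);
}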

behaviour of const static qualifier [duplicate]

I wrote some thing similar to this in my code
const int x=1;
int *ptr;
ptr = &x;
*ptr = 2;
Does this work on all compilers? Why doesn't the GCC compiler notice that we are changing a constant variable?
const actually doesn't mean "constant". Something that's "constant" in C has a value that's determined at compile time; a literal 42 is an example. The const keyword really means read-only. Consider, for example:
const int r = rand();
The value of r is not determined until program execution time, but the const keyword means that you're not permitted to modify r after it's been initialized.
In your code:
const int x=1;
int *ptr;
ptr = &x;
*ptr = 2;
the assignment ptr = &x; is a constraint violation, meaning that a conforming compiler is required to complain about it; you can't legally assign a const int* (pointer to const int) value to a non-const int* object. If the compiler generates an executable (which it needn't do; it could just reject it), then the behavior is not defined by the C standard.
For example, the generated code might actually store the value 2 in x -- but then a later reference to x might yield the value 1, because the compiler knows that x can't have been modified after its initialization. And it knows that because you told it so, by defining x as const. If you lie to the compiler, the consequences can be arbitrarily bad.
Actually, the worst thing that can happen is that the program behaves as you expect it to; that means you have a bug that's very difficult to detect. (But the diagnostic you should have gotten will have been a large clue.)
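A short hedged sketch of that failure mode (the cast is what you would need to get past the constraint violation the compiler diagnoses):

#include <stdio.h>

int main(void)
{
    const int x = 1;
    int *ptr = (int *)&x;  /* the cast silences the diagnostic, nothing more */
    *ptr = 2;              /* UB: modifies an object defined const           */
    printf("%d\n", x);     /* may well print 1: the compiler may fold x      */
    return 0;
}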
Online C 2011 draft:
6.7.3 Type qualifiers
...
6 If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are
never actually defined as objects in the program (such as an object at a memory-mapped input/output
address).
Emphasis added.
Since the behavior is left undefined, the compiler is not required to issue a diagnostic, nor is it required to halt translation. This would be difficult to catch in the general case; suppose you had a function like
void foo( int *p ) { *p = ...; }
defined in its own separate translation unit. During translation, the compiler has no way of knowing whether p could be pointing to a const-qualified object or not. If your call is something like
const int x;
foo( &x );
you may get a warning like parameter 1 of 'foo' discards qualifiers or something similarly illuminating.
Also note that the const qualifier doesn't necessarily mean that the associated variable will be stored in read-only memory, so it's possible the above code would "work" (update the value in x) in that you'd successfully update x by doing an end-run around the const semantics. But then you might as well just not declare x to be const.
There is a good discussion of this here: Does the evil cast get trumped by the evil compiler?
I would expect gcc to compile this because:
ptr is allowed to point to x, otherwise reading it would be impossible, although as the comment below says it's not exactly brilliant code and the compiler should complain. Warning options will (I guess) affect whether or not it's actually warned about.
when you write to x, the compiler no longer "knows" that it is writing to a const, all this is in the coders hands in C. However, it does really know, so it may well warn you, depending on the warning options you've selected.
whether it works or not, however, will depend on how the code is compiled, the way const is implemented for the compile options selected and the target CPU and architecture. It may work. Or it may crash. Or you may write to a "random" bit of memory and cause (or not) some freaky effect. It's not a good coding strategy, but that wasn't your question :-)
Bad programmer. No Moon Pie!
If you need to modify a const, copy it to a non-const variable and then work with it. It's const for a reason. Trying to "sneak" around a const can cause serious runtime issues. i.e. the optimizer has likely used the value inline, etc.
const int x=1;
int non_const_x = x;
non_const_x = 2;

Violating strict-aliasing in C, even without any casting?

How can *i and u.i print different numbers in this code, even though i is defined as int *i = &u.i;? I can only assuming that I'm triggering UB here, but I can't see how exactly.
(ideone demo replicates if I select 'C' as the language. But as @2501 pointed out, not if 'C99 strict' is the language. But then again, I get the problem with gcc-5.3.0 -std=c99!)
// gcc -fstrict-aliasing -std=c99 -O2
#include <stdio.h>

union
{
    int i;
    short s;
} u;

int * i = &u.i;
short * s = &u.s;

int main()
{
    *i = 2;
    *s = 100;
    printf(" *i = %d\n", *i);  // prints 2
    printf("u.i = %d\n", u.i); // prints 100
    return 0;
}
(gcc 5.3.0, with -fstrict-aliasing -std=c99 -O2, also with -std=c11)
My theory is that 100 is the 'correct' answer, because the write to the union member through the short-lvalue *s is defined as such (for this platform/endianness/whatever). But I think that the optimizer doesn't realize that the write to *s can alias u.i, and therefore it thinks that *i=2; is the only line that can affect *i. Is this a reasonable theory?
If *s can alias u.i, and u.i can alias *i, then surely the compiler should think that *s can alias *i? Shouldn't aliasing be 'transitive'?
Finally, I always had this assumption that strict-aliasing problems were caused by bad casting. But there is no casting in this!
(My background is C++, I'm hoping I'm asking a reasonable question about C here. My (limited) understanding is that, in C99, it is acceptable to write through one union member and then read through another member of a different type.)
The discrepancy is caused by the -fstrict-aliasing optimization option. Its behavior and possible traps are described in the GCC documentation:
Pay special attention to code like this:
union a_union {
    int i;
    double d;
};

int f() {
    union a_union t;
    t.d = 3.0;
    return t.i;
}
The practice of reading from a different union member than the one
most recently written to (called “type-punning”) is common. Even with
-fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
See Structures unions enumerations and bit-fields implementation. However, this code might
not:
int f() {
    union a_union t;
    int* ip;
    t.d = 3.0;
    ip = &t.i;
    return *ip;
}
Note that a conforming implementation is perfectly allowed to take advantage of this optimization, as the second code example exhibits undefined behaviour. See Olaf's and others' answers for reference.
C standard (i.e. C11, n1570), 6.5p7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
...
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
The lvalue expressions of your pointers are not union types, thus this exception does not apply. The compiler is correct in exploiting this undefined behaviour.
Make the pointers' types pointers to the union type and dereference with the respective member. That should work:
union {
...
} u, *i, *p;
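Spelled out for the question's types, that suggestion might look like the following sketch (every access now goes through the union type, which is the condition the GCC documentation above attaches to type-punning):

#include <stdio.h>

union un {
    int i;
    short s;
} u, *p = &u;

int main()
{
    p->i = 2;
    p->s = 100;
    printf("u.i = %d\n", u.i);  /* type-punning through the union type */
    return 0;
}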
Strict aliasing is underspecified in the C Standard, but the usual interpretation is that union aliasing (which supersedes strict aliasing) is only permitted when the union members are directly accessed by name.
For rationale behind this consider:
void f(int *a, short *b) {
The intent of the rule is that the compiler can assume a and b don't alias, and generate efficient code in f. But if the compiler had to allow for the fact that a and b might be overlapping union members, it actually couldn't make those assumptions.
Whether or not the two pointers are function parameters or not is immaterial, the strict aliasing rule doesn't differentiate based on that.
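A minimal sketch of the payoff being described (f2 is an invented name):

int f2(int *a, short *b)
{
    *a = 1;
    *b = 2;
    return *a;  /* under strict aliasing the compiler may simply return 1:
                   a store through a short* cannot legally modify an int   */
}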
This code indeed invokes UB, because you do not respect the strict aliasing rule. The n1256 draft of C99 states in 6.5 Expressions §7:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the
object,
— a type that is the signed or unsigned type corresponding to a qualified version of the
effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its
members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
Between the *i = 2; and the printf(" *i = %d\n", *i); only a short object is modified. With the help of the strict aliasing rule, the compiler is free to assume that the int object pointed to by i has not been changed, and it can directly use a cached value without reloading it from main memory.
It is blatantly not what a normal human being would expect, but the strict aliasing rule was precisely written to allow optimizing compilers to use cached values.
For the second print, unions are referenced in same standard in 6.2.6.1 Representations of types / General §7:
When a value is stored in a member of an object of union type, the bytes of the object
representation that do not correspond to that member but do correspond to other members
take unspecified values.
So once u.s has been stored, u.i has taken a value unspecified by the standard.
But we can read later in 6.5.2.3 Structure and union members §3 note 82:
If the member used to access the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.
Although notes are not normative, they do allow better understanding of the standard. When u.s was stored through the *s pointer, the bytes corresponding to a short were changed to the value 100. Assuming a little-endian system, since 100 fits in the low-order bytes of a short, the representation as an int should now be 100, the high-order bytes being 0.
TL/DR: even if not normative, note 82 should require that on a little-endian system of the x86 or x64 families, printf("u.i = %d\n", u.i); prints 100. But per the strict aliasing rule, the compiler is still allowed to assume that the value pointed to by i has not changed and may print 2.
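For concreteness, the byte-level picture that reasoning relies on (assuming a 4-byte int, a 2-byte short, and little-endian layout):

/* after *i = 2:    u holds bytes 02 00 00 00
   after *s = 100:  u holds bytes 64 00 00 00   (0x64 == 100; the short
                    overwrites only the two low-order bytes)
   reinterpreted as an int, the stored value is therefore 100            */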
You are probing a somewhat controversial area of the C standard.
This is the strict aliasing rule:
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to
the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a character type.
(C2011, 6.5/7)
The lvalue expression *i has type int. The lvalue expression *s has type short. These types are not compatible with each other, nor both compatible with any other particular type, nor does the strict aliasing rule afford any other alternative that allows both accesses to conform if the pointers are aliased.
If at least one of the accesses is non-conforming then the behavior is undefined, so the result you report -- or indeed any other result at all -- is entirely acceptable. In practice, the compiler may produce code that reorders the assignments with the printf() calls, or that uses a previously loaded value of *i from a register instead of re-reading it from memory, or some similar thing.
The aforementioned controversy arises because people will sometimes point to footnote 95:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
Footnotes are informational, however, not normative, so there's really no question which text wins if they conflict. Personally, I take the footnote simply as implementation guidance, clarifying the meaning of the fact that the storage for union members overlaps.
Looks like this is a result of the optimizer doing its magic.
With -O0, both lines print 100 as expected (assuming little-endian). With -O2, there is some reordering going on.
gdb gives the following output:
(gdb) start
Temporary breakpoint 1 at 0x4004a0: file /tmp/x1.c, line 14.
Starting program: /tmp/x1
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
Temporary breakpoint 1, main () at /tmp/x1.c:14
14 {
(gdb) step
15 *i = 2;
(gdb)
18 printf(" *i = %d\n", *i); // prints 2
(gdb)
15 *i = 2;
(gdb)
16 *s = 100;
(gdb)
18 printf(" *i = %d\n", *i); // prints 2
(gdb)
*i = 2
19 printf("u.i = %d\n", u.i); // prints 100
(gdb)
u.i = 100
22 }
(gdb)
0x0000003fa441d9f4 in __libc_start_main () from /lib64/libc.so.6
(gdb)
The reason this happens, as others have stated, is because it is undefined behavior to access a variable of one type through a pointer to another type even if the variable in question is part of a union. So the optimizer is free to do as it wishes in this case.
The variable of the other type can only be read directly via the union, which guarantees well-defined behavior.
What's curious is that even with -Wstrict-aliasing=2, gcc (as of 4.8.4) doesn't complain about this code.
Whether by accident or by design, C89 includes language which has been interpreted in two different ways (along with various interpretations in-between). At issue is the question of when a compiler should be required to recognize that storage used for one type might be accessed via pointers of another. In the example given in the C89 rationale, aliasing is considered between a global variable which is clearly not part of any union and a pointer to a different type, and nothing in the code would suggest that aliasing could occur.
One interpretation horribly cripples the language, while the other would restrict the use of certain optimizations to "non-conforming" modes. If those who didn't want their preferred optimizations given second-class status had written C89 to unambiguously match their interpretation, those parts of the Standard would have been widely denounced, and there would have been some sort of clear recognition of a non-broken dialect of C that would honor the non-crippling interpretation of the given rules.
Unfortunately, what has happened instead is that, since the rules clearly don't require compiler writers to apply a crippling interpretation, most compiler writers have for years simply interpreted the rules in a fashion which retains the semantics that made C useful for systems programming; programmers didn't have any reason to complain that the Standard didn't mandate that compilers behave sensibly, because from their perspective it seemed obvious to everyone that they should do so despite the sloppiness of the Standard. Meanwhile, however, some people insist that since the Standard has always allowed compilers to process a semantically-weakened subset of Ritchie's systems-programming language, there's no reason why a standard-conforming compiler should be expected to process anything else.
The sensible resolution for this issue would be to recognize that C is used for sufficiently varied purposes that there should be multiple compilation modes--one required mode would treat all accesses of everything whose address was taken as though they read and write the underlying storage directly, and would be compatible with code which expects any level of pointer-based type punning support. Another mode could be more restrictive than C11 except when code explicitly uses directives to indicate when and where storage that has been used as one type would need to be reinterpreted or recycled for use as another. Other modes would allow some optimizations but support some code that would break under stricter dialects; compilers without specific support for a particular dialect could substitute one with more defined aliasing behaviors.

Does `const T *restrict` guarantee the object pointed-to isn’t modified?

Consider the following code:
void doesnt_modify(const int *);

int foo(int *n) {
    *n = 42;
    doesnt_modify(n);
    return *n;
}
where the definition of doesnt_modify isn't visible to the compiler. Thus, it must assume that doesnt_modify changes the object n points to and must read *n before the return (the last line cannot be replaced by return 42;).
Assume, doesnt_modify doesn’t modify *n. I thought about the following to allow the optimization:
int foo_r(int *n) {
    *n = 42;
    { /* New scope is important, I think. */
        const int *restrict n_restr = n;
        doesnt_modify(n_restr);
        return *n_restr;
    }
}
This has the drawback that the caller of doesnt_modify has to tell the compiler that *n isn't modified, rather than the function itself telling the compiler via its prototype. Simply restrict-qualifying the parameter to doesnt_modify in the declaration doesn't suffice, cf. "Is top-level volatile or restrict significant [...]?".
When compiling with gcc -std=c99 -O3 -S (or Clang with the same options), all functions are compiled to equivalent assembly, all re-reading the 42 from *n.
Would a compiler be allowed to do this optimization (replace the last line by return 42;) for foo_r? If not, is there a (portable, if possible) way to tell the compiler doesnt_modify doesn’t modify what its argument points to? Is there a way compilers do understand and make use of?
Does any function have UB (provided doesnt_modify doesn’t modify its argument’s pointee)?
Why I think, restrict could help here (From C11 (n1570) 6.7.3.1 “Formal definition of restrict”, p4 [emph. mine]):
[In this case, B is the inner block of foo_r, P is n_restr, T is const int, and X is the object denoted by *n, I think.]
During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. […]
$ clang --version
Ubuntu clang version 3.5.0-4ubuntu2 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: x86_64-pc-linux-gnu
Gcc version is 4.9.2, on an x86 32bit target.
Version 1 seems clearly specified by the formal definition of restrict (C11 6.7.3.1). For the following code:
const int *restrict P = n;
doesnt_modify(P);
return *P;
the symbols used in 6.7.3.1 are:
B - that block of code
P - the variable P
T - the type of *P which is const int
X - the (non-const) int being pointed to by P
L - the lvalue *P is what we're interested in
6.7.3.1/4 (partial):
During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified
[...]
If these requirements are not met, then the behavior is undefined.
Note that T is const-qualified. Therefore, if X is modified in any way during this block (which includes during the call to a function in that block), the behaviour is undefined.
Therefore the compiler can optimize as if doesnt_modify did not modify X.
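In other words, a compiler could legitimately transform foo_r as sketched below (the question observes that gcc and clang do not currently perform this; they re-read *n):

int foo_r(int *n) {
    *n = 42;
    {
        const int *restrict n_restr = n;
        doesnt_modify(n_restr);
        return 42;  /* folding the load is permitted: any modification of
                       the pointee during this block would already be UB  */
    }
}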
Version 2 is a bit more difficult for the compiler. 6.7.6.3/15 says that top-level qualifiers are not considered in prototype compatibility -- although they aren't ignored completely.
So although the prototype says:
void doesnt_modify2(const int *restrict p);
it could still be that the body of the function is declared as void doesnt_modify2(const int *p) and therefore might modify *p.
My conclusion is that if and only if the compiler can see the definition for doesnt_modify2 and confirm that p is declared restrict in the definition's parameter list then it would be able to perform the optimization.
Generally, restrict means that the pointer is not aliased (i.e. only it or a pointer derived from it can be used to access the pointed-to object).
With const, this means that the pointed-to object cannot be modified by well-formed code.
There is, however, nothing to stop the programmer breaking the rules using an explicit type conversion to remove the constness. Then the compiler (having been beaten into submission by the programmer) will permit an attempt to modify the pointed-to object without any complaint. This, strictly speaking, results in undefined behaviour so any result imaginable is then permitted including - possibly - modifying the pointed-to object.
If not, is there a (portable, if possible) way to tell the compiler doesnt_modify doesn’t modify what its argument points to?
No such way.
Compiler optimizers have difficulty optimizing when pointer and reference function parameters are involved. Because the implementation of that function can cast away constness, compilers assume that T const* is as bad as T*.
Hence, in your example, after the call doesnt_modify(n) it must reload *n from memory.
See 2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++. It applies to C as well.
Adding restrict keyword here does not change the above.
Simultaneous use of a restrict qualifier on a pointer-type parameter and a const qualifier on its target type would invite a compiler to assume that no region of storage which is accessed during the lifetime of the pointer object via the pointer contained therein or any pointer derived from it, will be modified via any means during that pointer's lifetime. It generally says nothing whatsoever about regions of storage which are not accessed using the pointer in question.
The only situations where const restrict would have implications for an entire object would be those where pointer is declared using array syntax with a static bound. In that situation, behavior would only be defined in cases where the entire array object could be read (without invoking UB). Since reading any part of the array object which changes during function execution would invoke UB, code would be allowed to assume that no portion of the array can be changed in any fashion whatsoever.
Unfortunately, while a compiler that knew that a function's actual definition starts with:
void foo(int const thing[restrict static 1]);
would be entitled to assume that no part of *thing would be changed during the function's execution, even if the object might be one the function could otherwise access via pointer not derived from thing, the fact that a function's prototype includes such qualifiers would not compel its definition to do likewise.
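A sketch of a definition carrying those qualifiers (per the above, it is the definition, not merely the prototype, that licenses the assumption):

void foo(const int thing[restrict static 1])
{
    /* here the compiler may assume that no part of the array object
       (at least one int) is modified by any means for the duration of
       the call                                                        */
}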
