GNU C compiler sabotages undefined behaviour - c

I have an embedded project that requires at some point that I write to address 0. So naturally I try:
*(int*)0 = 0 ;
But at optimisation level 2 or higher, the gcc compiler rubs its hands and says, in effect, "That is undefined behaviour! I can do what I like! Bwahaha!" and emits an invalid instruction to the code stream!
Here is my source file:
void f (void)
{
    *(int*)0 = 0 ;
}
and here is the output listing:
.file "bug.c"
.text
.p2align 4,,15
.globl _f
.def _f; .scl 2; .type 32; .endef
_f:
LFB0:
.cfi_startproc
movl $0, 0
ud2 <-- Invalid instruction!
.cfi_endproc
LFE0:
.ident "GCC: (i686-posix-dwarf-rev0, Built by MinGW-W64 project) 7.3.0"
My question is: Why would anybody do this? What possible benefit could accrue from sabotaging code like this? Surely the obvious course of action is to issue a warning and carry on compiling?
I know the compiler is allowed to do this, I just wonder about the motivation of the compiler writer. It cost me two days and four engineering samples to track this down, so I'm a little peeved.
Edited to add: I have worked around this by using assembly language. So I'm not looking for solutions. I'm just curious why anybody would think this compiler behaviour was a good idea.

(Disclaimer: I'm not an expert on GCC internals, and this is more of a "post hoc" attempt to explain its behavior. But maybe it will be helpful.)
the gcc compiler rubs its hands and says, in effect, "That is undefined behaviour! I can do what I like! Bwahaha!" and emits an invalid instruction to the code stream!
I won't deny that there are cases where GCC does more or less that, but here there's a little more going on, and there is some method to its madness.
As I understand it, GCC isn't treating the null dereference as totally undefined here; it is making some assumptions about what it does. Its handling of null dereferences is controlled by a flag called -fdelete-null-pointer-checks, which is probably enabled by default when you turn on optimizations. From the manual:
-fdelete-null-pointer-checks
Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. This option enables simple constant folding optimizations at all optimization levels. In addition, other optimization passes in GCC use this flag to control global dataflow analyses that eliminate useless checks for null pointers; these assume that a memory access to address zero always results in a trap, so that if a pointer is checked after it has already been dereferenced, it cannot be null.
Note however that in some environments this assumption is not true. Use -fno-delete-null-pointer-checks to disable this optimization for programs that depend on that behavior.
This option is enabled by default on most targets. On Nios II ELF, it defaults to off. On AVR, CR16, and MSP430, this option is completely disabled.
Passes that use the dataflow information are enabled independently at different optimization levels.
So, if you are intending to actually access address 0, or if for some other reason your code will go on executing after the dereference, then you want to disable this with -fno-delete-null-pointer-checks. That will achieve the "carry on compiling" part of what you want. It will not give you warnings, however, presumably under the assumption that such dereferences are intentional.
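For concreteness, a sketch of my own (assuming the same bug.c and an x86 target like the one in the question; I have not verified the exact output here):

/* Build with: gcc -O2 -fno-delete-null-pointer-checks -S bug.c */
void f (void)
{
    *(int*)0 = 0 ;   /* expectation: movl $0, 0 followed by a plain ret, no ud2 */
}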
But under default options, why are you seeing the generated code that you do, with the undefined instruction, and why isn't there a warning? I would guess that GCC's logic is running as follows:
Because -fdelete-null-pointer-checks is in effect, the compiler assumes that execution will not continue past the null dereference, but instead will trap. How the trap will be handled, it doesn't know: maybe program termination, maybe a signal or exception handler, maybe a longjmp up the stack. The null dereference itself is emitted as requested, perhaps under the assumption that you are intentionally exercising your trap handler. But either way, whatever code comes after the null dereference is now unreachable.
So now it does what any reasonable optimizing compiler does with unreachable code: it doesn't emit it. In your case, that's nothing but a ret, but whatever it is, as far as GCC is concerned it would just be wasted bytes of memory, and should be omitted.
You might think you should get a warning here, but GCC has a longstanding design decision not to warn about unreachable code, on the grounds that such warnings tended to be inconsistent and the false positives would do more harm than good. See for instance https://gcc.gnu.org/legacy-ml/gcc-help/2011-05/msg00360.html.
However, as a safety feature, GCC emits an undefined instruction (ud2 on x86) in place of the omitted unreachable code. The idea, I believe, is that just in case execution somehow does continue past the null dereference, it is better for the program to die, than to go off into the weeds and try to execute whatever memory contents happen to come next. (And indeed this can happen even on systems that do unmap the zero page; for instance, if you do struct huge *p = NULL; p->x = 0;, GCC understands this as a null dereference, even though p->x may not be on the zero page at all, and could conceivably be located at an accessible address.)
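To make that last point concrete, here is a small sketch of my own (the struct and its size are invented for illustration):

struct huge { char pad[1 << 20]; int x; };   /* x lives at offset 0x100000 */

void g (void)
{
    struct huge *p = 0;
    p->x = 0;   /* a store roughly 1 MiB above address zero, yet GCC still
                   reasons about it as a dereference of a null pointer and
                   may plant ud2 right after it */
}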
There is a warning flag, -Wnull-dereference, that will trigger a warning on your blatant null dereference. However, it only works if -fdelete-null-pointer-checks is enabled.
When would GCC's behavior be useful? Here's an example, maybe contrived, but it might get the idea across. Imagine your program has some allocation function that might fail:
struct foo *p = get_foo();
// do other stuff for a while
if (!p) {
    // 5000 lines of elaborate backup plan in case we can't get a foo
}
frob(p->bar);
Now imagine that you redesign get_foo() so that it can't fail. You forget to take out your "backup plan" code, but you go ahead and use the returned object right away:
struct foo *p = get_foo();
frob(p->bar);
// do other stuff for a while
if (!p) {
    // 5000 lines of elaborate backup plan in case we can't get a foo
}
The compiler doesn't know, a priori, that get_foo() will always return a valid pointer. But it can see that you've dereferenced it, and thus can assume that execution will only continue past that point if the pointer was not null. Therefore, it can tell that the elaborate backup plan is unreachable and should be omitted, which will save you a lot of bloat in your binary.
Incidentally, a note on the situation with clang: although, as Eric Postpischil points out, you do get a warning, what you don't get is an actual load from address 0; clang omits it and just emits ud2. This is what "doing whatever it likes" would really look like, and if you were hoping to exercise your page-zero trap handler, you are out of luck.

In describing Undefined Behavior, the Standard refers to it as resulting "upon use of a nonportable or erroneous program construct or of erroneous data", and the authors of the Standard clarify their intentions in the published Rationale: "Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior." The question of when to extend the language in such fashion (treating various forms of UB as non-portable but correct) was left as a Quality of Implementation issue outside the Standard's jurisdiction.
The maintainers of clang and gcc take the view that the phrase "non-portable or erroneous" should be interpreted as synonymous with "erroneous", since the Standard does not forbid such an interpretation. For a compiler that will only ever be used to process portable programs that are never fed erroneous data, such an interpretation will sometimes allow it to process strictly conforming programs, fed exclusively valid data, more quickly than would otherwise be possible, at the expense of making it less suitable for other purposes. I personally would view the range of programs that a compiler can usefully process reasonably efficiently as a much better metric of quality than the efficiency with which it can process strictly conforming programs, but people who use compilers for different purposes may have different views about what makes a compiler more or less useful for those purposes.

Related

Are there any examples of semantics non-preserving optimizations (except FP optimizations)?

Optimizations are generally considered to have the semantics-preservation property. However, floating-point (FP) optimizations may not preserve the semantics. Usually these FP optimizations are the result of selecting a non-strict FP model (examples: ICC, MSVC, GCC, Clang/LLVM, KEIL, etc.).
Out of curiosity, are there any examples of other semantics non-preserving optimizations?
There are, but you have to look hard to find them.
Try replacing a standard library function. If it doesn't do what the standard library function does, you may find that your code doesn't do what you expect, because the compiler assumes standard library functions do what the documentation says they do.
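As a deliberately contrived sketch of this (my own code, relying on the well-known printf-to-puts rewrite that GCC and clang perform on constant format strings):

#include <stdio.h>

/* A "replacement" printf, imagined to also copy output to a log file.
   It does not do what the standard printf is documented to do. */
int printf(const char *fmt, ...)
{
    (void)fmt;
    /* ... logging and formatting would go here ... */
    return 0;
}

int main(void)
{
    /* The compiler knows what the standard printf means, so it may turn this
       into puts("hello"), which calls straight into the C library and never
       reaches the replacement above. What the program does now depends on
       the optimizer, not on the code as written. */
    printf("hello\n");
    return 0;
}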
Also, mmap() a region at address zero. The compiler may omit code that accesses it, on the assumption that such code is unreachable because it dereferences a NULL pointer, which is undefined behavior. However, if that mmap() call succeeds, the behavior of dereferencing address zero (NULL is zero on most platforms) just became defined. gcc has a compiler option to tell it to stop doing that. Clang eventually caved to pressure to add it, because it would otherwise miscompile the kernel. https://reviews.llvm.org/D47894#change-z5AkMbcq7h1h
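A sketch of that mmap() scenario (my own; it assumes a POSIX-ish system where MAP_ANONYMOUS is available and where mapping page zero is actually permitted, which on Linux additionally requires vm.mmap_min_addr to be 0):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Try to map one page at address 0; MAP_FIXED forces the address. */
    void *p = mmap((void *)0, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* Defined at runtime on this system, but still a null dereference as far
       as the compiler is concerned: without -fno-delete-null-pointer-checks
       it may assume the store traps and discard everything after it. */
    *(int *)0 = 42;
    printf("read back: %d\n", *(volatile int *)0);
    return 0;
}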
Back in the 90s when the aliasing rules were just starting to become things, there were more examples, as the aliasing rules changed the definition of the language. But this is well-settled now.

C code with undefined results, compiler generates invalid code (with -O3)

I know that when you do certain things in a C program, the results are undefined. However, the compiler should not be generating invalid (machine) code, right? It would be reasonable if the code did the wrong thing, or if the code generated a segfault or something...
Is this supposed to happen according to the compiler spec, or is it a bug in the compiler?
Here's the (simple) program I'm using:
int main() {
    char *ptr = 0;
    *(ptr) = 0;
}
I'm compiling with -O3. That shouldn't generate invalid hardware instructions though, right? With -O0, I get a segfault when I run the code. That seems a lot more sane.
Edit: It's generating a ud2 instruction...
The ud2 instruction is a "valid instruction": it stands for Undefined Instruction and is guaranteed to generate an invalid opcode exception. Both clang and, apparently, gcc can generate this code when a program invokes undefined behavior.
From the clang link above the rationale is explained as follows:
Stores to null and calls through null pointers are turned into a
__builtin_trap() call (which turns into a trapping instruction like "ud2" on x86). These happen all of the time in optimized code (as the
result of other transformations like inlining and constant
propagation) and we used to just delete the blocks that contained them
because they were "obviously unreachable".
While (from a pedantic language lawyer standpoint) this is strictly
true, we quickly learned that people do occasionally dereference null
pointers, and having the code execution just fall into the top of the
next function makes it very difficult to understand the problem. From
the performance angle, the most important aspect of exposing these is
to squash downstream code. Because of this, clang turns these into a
runtime trap: if one of these is actually dynamically reached, the
program stops immediately and can be debugged. The drawback of doing
this is that we slightly bloat code by having these operations and
having the conditions that control their predicates.
At the end of the day, once you are invoking undefined behavior, the behavior of your program is unpredictable. The philosophy here is that it is probably better to crash hard and give the developer an indication that something is seriously wrong, allowing them to debug from the right point, than to produce a program that seems to work but is actually broken.
As Ruslan notes, it is "valid" in the sense that it is guaranteed to raise an invalid opcode exception, as opposed to other unused instruction sequences which may become valid in the future.

C dummy operations

I can't imagine what the compiler does when, for instance, there is no lvalue, like this:
number>>1;
My intuition tells me that the compiler will discard this line due to optimizations. What happens if optimization is disabled?
Does it use a register to do the manipulation? Or does it behave as if it were a function call, so the parameters are pushed onto the stack and then the memory used is marked as freed? Or does it transform it into a NOP operation?
Can I see what is happening using the Visual Studio debugger?
Thank you for your help.
In the example you give, it discards the operation. It knows the operation has no side effects and therefore doesn't need to emit the code to execute the statement in order to produce a correct program. If you disable optimizations, the compiler may still emit code. If you enable optimizations, the compiler may still emit code, too -- it's not perfect.
You can see the code the compiler emits using the /FAsc command line option of the Microsoft compiler. That option creates a listing file which has the object code output of the compiler interspersed with the related source code.
You can also use "view disassembly" in the debugger to see the code generated by the compiler.
Using either "view disassembly" or /FAsc on optimized code, I'd expect to see no emitted code from the compiler.
Assuming that number is a regular variable of integer type (not volatile) then any competent optimizing compiler (Microsoft, Intel, GNU, IBM, etc) will generate exactly NOTHING. Not a nop, no registers are used, etc.
If optimization is disabled (in a "debug build"), then the compiler may well "do what you asked for", because it doesn't realize the statement has no side effects. In this case, the value will be loaded into a register and shifted right once; the result of this is not stored anywhere. The compiler performs "useless code elimination" as one of its optimization steps - I'm not sure which one, but for something this simple I'd expect it to be caught at fairly basic optimization settings. In some cases, where loops are concerned, etc., the compiler may not optimize away the code until more advanced optimization settings are enabled.
As mentioned in the comments, if the variable is volatile, then the read of the memory represented by number will have to be made, as the compiler MUST read volatile memory.
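A minimal sketch of the difference (the second variable name is mine): with a plain int the whole statement can be discarded, while the volatile object forces a real load even though the shifted result goes nowhere.

int number;            /* ordinary variable: the statement below can vanish */
volatile int hw_reg;   /* e.g. a memory-mapped status register */

void touch(void)
{
    number >> 1;   /* no side effects: an optimizing compiler emits nothing */
    hw_reg >> 1;   /* the volatile read must be performed; result discarded */
}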
In Visual studio, if you "view disassembly", it should show you the code that the compiler generated.
Finally, if this were C++, there is also the possibility that the variable is not a regular integer type and that an overloaded operator>> function is called when the compiler sees this code - that function may have side effects besides returning a result, so it may well have to be executed. But this can't be the case in C, since there is no operator overloading.

Handling null pointers on AIX with GCC C

We have code written in C that sometimes doesn't handle zero pointers very well.
The code was originally written on Solaris and such pointers cause a segmentation fault. Not ideal but better than ploughing on.
Our experience is that if you read from a null pointer on AIX you get 0. If you use the xlc compiler you can add an option -qcheck=all to trap these pointers. But we use gcc (and want to continue using that compiler). Does gcc provide such an option?
Does gcc provide such an option?
I'm sheepishly volunteering the answer: no, it doesn't - although I can't cite anything to prove a negative, I've found no sign of such a runtime NULL-check option in gcc.
The problem you're tackling is that you're trying to make undefined behavior a little more defined in a program that's poorly-written.
I recommend that you bite the bullet and either switch to xlc or manually add NULL checks to the code until the bad behavior has been found and removed.
Consider:
Making a macro to null-check a pointer (a sketch follows below)
Adding that macro after pointer assignments
Adding that macro to the entry point of functions that accept pointers
As bugs are removed, you can begin to remove these checks.
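A sketch of such a macro (the name and the abort-on-failure policy are my own choices, not the only reasonable ones):

#include <stdio.h>
#include <stdlib.h>

#define CHECK_NOT_NULL(p)                                        \
    do {                                                         \
        if ((p) == NULL) {                                       \
            fprintf(stderr, "NULL pointer %s at %s:%d\n",        \
                    #p, __FILE__, __LINE__);                     \
            abort();                                             \
        }                                                        \
    } while (0)

/* Used at the entry point of a function that accepts a pointer: */
void process(int *data)
{
    CHECK_NOT_NULL(data);
    *data += 1;
}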
Please do us all a favor and add proper NULL checks to your code. Not only will you have a slight gain in performance by checking for NULL only when needed, rather than having the compiler perform the check everywhere, but your code will be more portable to other platforms.
And let's not mention the fact that you will be more likely to print a proper error message rather than have the compiler drop some incomprehensible stack dump/source code location/error code that will not help your users at all.
AIX uses the concept of a NULL page. Essentially, NULL (i.e. virtual address 0x0) is mapped to a location that contains a whole bunch of zeros. This allows string manipulation code etc. to continue despite encountering a NULL pointer.
This is contrary to most other Unix-like systems, but it is not in violation of the C standard, which considers dereferencing NULL an undefined operation. In my opinion, though, this is woefully broken: it takes an application that would crash violently and turns it into one that ignores programming errors silently, potentially producing totally incorrect results.
As far as I know, GCC has no options to work around fundamentally broken code. Even historically supported patterns, such as writable string literals, have been slowly phased out in newer GCC versions.
There might be some support when using memory debugging options such as -fmudflap, but I don't really know - in any case you should not use debugging code in production systems, especially for forcing broken code to work.
Bottom line: I don't think that you can avoid adding explicit NULL checks.
Unfortunately we now come to the basic question: where should the NULL checks be added? I suppose having the compiler add such checks indiscriminately would help, provided that you add an explicit check yourself whenever you discover an issue.
Unfortunately, there is no Valgrind support for AIX. If you have the cash, you might want to have a look at IBM Rational Purify Plus for AIX - it might catch such errors.
It might also be possible to use xlc on a testing system and gcc for everything else, but unfortunately they are not fully compatible.

C optimization breaks algorithm

I am programming an algorithm that contains 4 nested for loops. The problem is that at each level a pointer is updated. The innermost loop only uses 1 of the pointers. The algorithm does a complicated count. When I include a debugging statement that logs the combination of the indexes and the results of the count, I get the correct answer. When the debugging statement is omitted, the count is incorrect. The program is compiled with the -O3 option on gcc. Why would this happen?
Always put your code through something like valgrind, Purify, etc, before blaming the optimizer. Especially when blaming things related to pointers.
It's not to say the optimizer isn't broken, but more than likely, it's you. I've worked on various C++ compilers and seen my share of seg faults that only happen with optimized code. Quite often, people do things like forget to count the \0 when allocating space for a string, etc. And it's just luck at that point on which pages you're allocated when the program runs with different -O settings.
Also, important questions: are you dealing with restricted pointers at all?
Print out the assembly code generated by the compiler, with optimizations. Compare to an assembly language listing of the code without optimizations.
The compiler may have figured out that some of the variables can be eliminated because they were not used in the computation. You can try to match wits with the compiler and factor out variables that are not used.
The compiler may have substituted a for loop with an equation. In some cases (after removing unused variables), the loop can be replaced by a simple equation. For example, a loop that adds 1 to a variable can be replaced by a multiplication statement.
You can tell the compiler to let a variable be by declaring it as volatile. The volatile keyword tells the compiler that the variable's value may be altered by means outside of the program and the compiler should not cache nor eliminate the variable. This is a popular technique in embedded systems programming.
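A small sketch of both points (my own example, not the asker's code): the first loop can be collapsed into a constant, while the volatile counter keeps the second loop intact.

enum { N = 1000 };

int folded(void)
{
    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += 1;        /* the optimizer may replace the whole loop with sum = N */
    return sum;
}

volatile int ticks;      /* volatile: every access must really happen */

void kept(void)
{
    for (int i = 0; i < N; i++)
        ticks += 1;      /* the loop and its stores are preserved */
}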
Most likely your program somehow exploits undefined behaviour which works in your favour without optimisation, but with -O3 optimisation it turns against you.
I had a similar experience with one of my projects - it worked fine with -O2 but broke with -O3. I used setjmp()/longjmp() heavily in my code and I had to make half of the variables volatile to get it working, so I decided that -O2 was good enough.
Sounds like something is accessing memory that it shouldn't. Debugging symbols are famous for postponing bad news.
Is it pure C, or is there anything crazy like inline assembly?
In any case, run it under valgrind to check whether this might be happening. Also, did you try compiling with different optimization levels? And without debugging and optimizations?
Without code this is difficult, but here's some things that I've seen before.
Debugging print statements often end up being the only user of a value that the compiler knows about. Without the print statement the compiler thinks that it can do away with any operations and memory requirements that would otherwise be required to compute or store that value.
A similar thing happens when you have side effects included within the argument list of your print statement.
printf("%i %i\n", x, y = x - z);
Another type of error can be:
for (i = 0; i < END; i++) {
    int *a = &i;      /* a points at the live loop counter */
    foo(a);
}
if (bar) {
    int *a;           /* a different, uninitialized a */
    baz(a);           /* using it is an error, but may "work" unoptimized */
}
This code would likely have the intended result without optimization, because the compiler would probably choose to store both of the a variables in the same location, so the second a would have the last value that the first one had.
Inline functions can have some strange behavior, or you may somehow be relying on them not being inlined (or sometimes the other way round), which is often the case for unoptimized code.
You should definitely try compiling with warnings turned up to the maximum (-Wall for gcc).
That will often tell you about the risky code.
(edit)
Just thought of another.
If you have more than one way to reference a variable then you can have issues that work right without optimization, but break when optimization is turned up. There are two main ways this can happen.
The first is if a value can be changed by a signal handler or another thread. You need to tell the compiler about that, so it knows that the value needs to be reloaded and/or stored on every access. This is done by using the volatile keyword.
The second is aliasing. This is when you create two different ways to access the same memory. Compilers are usually quick to assume that you are aliasing with pointers, but not always. Also, there are optimization flags for some that tell them to be less quick to make those assumptions, as well as ways that you could fool the compiler (crazy stuff like while (foo != bar) { foo++; } *foo = x; not obviously being a write to the location bar points at).
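A sketch of the signal-handler case (my own example): without the volatile qualifier, an optimized build may hoist the load of done out of the loop and spin forever.

#include <signal.h>

static volatile sig_atomic_t done = 0;

static void on_sigint(int sig)
{
    (void)sig;
    done = 1;                 /* written "outside" the normal control flow */
}

int main(void)
{
    signal(SIGINT, on_sigint);
    while (!done)             /* volatile forces a fresh load each iteration */
        ;                     /* wait for Ctrl-C */
    return 0;
}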
