I'm cleaning up some code in a driver (Netgear A6210) written in C, and I've run into a helper function, VIRTUAL_IF_DOWN(), which is forcibly inlined (i.e. declared __inline instead of inline) and contains a seemingly arbitrary return statement at the end.
__inline void VIRTUAL_IF_DOWN(void *pAd)
{
/* Some code here */
return ;
}
However, this helper function is called in the body of two other functions, before control is given back to the rest of the program. So my question is: does this return statement get inlined with the rest of the function, thus breaking out of the larger function, or does it just do nothing? My general rule of thumb for inlined functions is that I should always treat them as separate functions and not assume that they'll be inlined as-is. Anyway, I've given one of the encapsulating functions as an example:
static void rtusb_disconnect(struct usb_interface *intf)
{
/* Some code here and then an ugly looking preprocessor branch */
#ifdef IFUP_IN_PROBE
VIRTUAL_IF_DOWN(pAd); // Function is used here
#endif
/* Other code here */
}
I apologize for the messy boilerplate code, but even if the return statement just gets inlined it seems to be obfuscating the code. It seems like bad practice to hide a statement that can affect the flow behind an inlined function. What would be a better solution?
Another part of my question would be, is inlining determined at the preprocessor stage of compilation or later, such as in the assembler or linker stage?
Inlining is more than just copy-pasting code, in contrast to macro preprocessing.
When a compiler encounters an inlining directive of any sort, it evaluates what the inlined function returns or does.
You can use sites such as godbolt.org to see what assembly is generated for C code. For example, the following functions evaluate to the same assembly code:
#include <stdio.h>
inline void test1(int number){
printf("%d", number);
return;
}
inline int test2(){
return 1+1;
}
void doSomething() {
test1(test2());
}
void doSomethingElse() {
printf("%d", 2);
}
and the assembly:
.LC0:
.string "%d"
_Z11doSomethingv:
sub rsp, 8
mov esi, 2
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
add rsp, 8
ret
_Z15doSomethingElsev:
sub rsp, 8
mov esi, 2
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
add rsp, 8
ret
You can see this example at https://godbolt.org/z/rBULdo
You should also note that inlining is a compiler optimization. Different compilation flags could result in different results from inlining.
It also depends on how much information is available at compile time versus run time. If the compiler knows more at compile time, it can optimize better than it could if everything were only known at run time.
See link for GCC's behaviour when encountering the inline attribute
return; at the end of a void function is redundant.
Inlining preserves semantics: If return returns from the non-inlined call, then return in an inlined function returns from the inlined "call", i.e. it jumps to the end of the inlined body.
Inlining happens during compilation, i.e. after preprocessing and parsing, but before code generation and assembling.
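To make the semantics point concrete, here is a minimal sketch. The names helper, caller and steps_run are hypothetical and exist only to observe control flow. The early return leaves helper() only, whether or not the compiler inlines it, so the statement after the call in caller() always runs.

```c
/* Hypothetical counter, used only to observe which statements ran. */
int steps_run = 0;

static inline void helper(int skip)
{
    if (skip)
        return;          /* leaves helper() only, even when inlined */
    steps_run += 1;
}

int caller(int skip)
{
    steps_run = 0;
    helper(skip);
    steps_run += 10;     /* reached regardless of the return in helper() */
    return steps_run;
}
```

Calling caller(1) takes the early return inside helper() and still executes the addition of 10 afterwards.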
Recently I read the code of one public C library and found below function definition:
void* block_alloc(void** block, size_t* len, size_t type_size)
{
return malloc(type_size);
(void)block;
(void)len;
}
I wonder whether it will arrive at the statements after return. If not, what's the purpose of these 2 statements that convert some data to void ?
As Basil notes, the (void) statements are likely intended to silence compiler warnings about the unused parameters. But - you can move the (void) statements before the return to make them less confusing, and with the same effect.
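A minimal sketch of that rearrangement, keeping the signature from the question and only moving the casts ahead of the return:

```c
#include <stdlib.h>

void *block_alloc(void **block, size_t *len, size_t type_size)
{
    /* Mark the parameters as deliberately unused, before the return. */
    (void)block;
    (void)len;
    return malloc(type_size);
}
```

The casts evaluate their operands and discard the result, so for ordinary (non-volatile) parameters they should not produce any machine code.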
In fact, there's yet another way to achieve the same effect, without resorting to any extra statements. It's supported by many compilers already today, although it's not officially in the C standard before C2X:
void* block_alloc(void**, size_t*, size_t type_size)
{
return malloc(type_size);
}
If you don't name the parameters, typical compilers don't expect you to be using them.
First, these statements appearing in the block after a return will never be executed.
Check by reading some C standard like n1570.
Second, on some compilers (perhaps GCC 10 invoked as gcc -Wall -Wextra) the useless statements might avoid some warnings.
In my opinion, coding these statements before the return won't change the machine code emitted by an optimizing compiler (use gcc -Wall -Wextra -O2 -fverbose-asm -S to check and emit the assembler code) and makes the C source code more understandable.
GCC provides, as an extension, the variable __attribute__ named unused.
Perhaps in your software your block_alloc is assigned to some important function pointer (whose signature is requested)
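As a sketch of the attribute route, assuming GCC or Clang (the attribute is an extension, not standard C), the same function could be written as follows. The name block_alloc_attr is hypothetical, chosen only to distinguish it from the original.

```c
#include <stdlib.h>

/* GNU extension: tell the compiler these parameters are intentionally unused. */
void *block_alloc_attr(void **block __attribute__((unused)),
                       size_t *len __attribute__((unused)),
                       size_t type_size)
{
    return malloc(type_size);
}
```

The signature stays compatible with any function pointer type that expects all three parameters.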
It is used to silence the warnings. Some programming standards require all the parameters to be used in the function body, and their static analyzers will not pass the code without it.
It is added after the return to prevent code from being generated for it in some circumstances:
int foo(volatile unsigned x)
{
(void)x;
return 0;
}
int foo1(volatile unsigned x)
{
return 0;
(void)x;
}
foo:
mov DWORD PTR [rsp-4], edi
mov eax, DWORD PTR [rsp-4]
xor eax, eax
ret
foo1:
mov DWORD PTR [rsp-4], edi
xor eax, eax
ret
I read this question about noreturn attribute, which is used for functions that don't return to the caller.
Then I have made a program in C.
#include <stdio.h>
#include <stdnoreturn.h>
noreturn void func()
{
printf("noreturn func\n");
}
int main()
{
func();
}
And generated assembly of the code using this:
.LC0:
.string "func"
func:
pushq %rbp
movq %rsp, %rbp
movl $.LC0, %edi
call puts
nop
popq %rbp
ret // ==> Here function return value.
main:
pushq %rbp
movq %rsp, %rbp
movl $0, %eax
call func
Why does function func() return after providing noreturn attribute?
The function specifiers in C are a hint to the compiler, the degree of acceptance is implementation defined.
First of all, _Noreturn function specifier (or, noreturn, using <stdnoreturn.h>) is a hint to the compiler about a theoretical promise made by the programmer that this function will never return. Based on this promise, compiler can make certain decisions, perform some optimizations for the code generation.
IIRC, if a function specified with noreturn function specifier eventually returns to its caller, either
by using an explicit return statement
by reaching end of function body
the behaviour is undefined. You MUST NOT return from the function.
To make it clear, using the noreturn function specifier does not stop a function from returning to its caller. It is a promise made by the programmer to the compiler, allowing it more freedom to generate optimized code.
Now, in case, you made a promise earlier and later, choose to violate this, the result is UB. Compilers are encouraged, but not required, to produce warnings when a _Noreturn function appears to be capable of returning to its caller.
According to chapter §6.7.4, C11, Paragraph 8
A function declared with a _Noreturn function specifier shall not return to its caller.
and, the paragraph 12, (Note the comments!!)
EXAMPLE 2
_Noreturn void f () {
abort(); // ok
}
_Noreturn void g (int i) { // causes undefined behavior if i <= 0
if (i > 0) abort();
}
For C++, the behaviour is quite similar. Quoting from chapter §7.6.4, C++14, paragraph 2 (emphasis mine)
If a function f is called where f was previously declared with the noreturn attribute and f eventually
returns, the behavior is undefined. [ Note: The function may terminate by throwing an exception. —end
note ]
[ Note: Implementations are encouraged to issue a warning if a function marked [[noreturn]] might
return. —end note ]
3 [ Example:
[[ noreturn ]] void f() {
throw "error"; // OK
}
[[ noreturn ]] void q(int i) { // behavior is undefined if called with an argument <= 0
if (i > 0)
throw "positive";
}
—end example ]
Why function func() return after providing noreturn attribute?
Because you wrote code that told it to.
If you don't want your function to return, call exit() or abort() or similar so it doesn't return.
What else would your function do other than return after it had called printf()?
The C Standard in 6.7.4 Function specifiers, paragraph 12 specifically includes an example of a noreturn function that can actually return - and labels the behavior as undefined:
EXAMPLE 2
_Noreturn void f () {
abort(); // ok
}
_Noreturn void g (int i) { // causes undefined behavior if i<=0
if (i > 0) abort();
}
In short, noreturn is a restriction that you place on your code - it tells the compiler "MY code won't ever return". If you violate that restriction, that's all on you.
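A minimal sketch of a noreturn function that keeps its promise, assuming a C11 compiler. The names die and checked_div are hypothetical. Every path through die() ends in exit(), which never returns, so control cannot fall off the end.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

/* Hypothetical helper: report an error and terminate. */
noreturn void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);   /* exit() does not return, so die() does not either */
}

/* Hypothetical caller: the compiler may assume nothing runs after die(). */
int checked_div(int a, int b)
{
    if (b == 0)
        die("division by zero");
    return a / b;
}
```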
noreturn is a promise. You're telling the compiler, "It may or may not be obvious, but I know, based on the way I wrote the code, that this function will never return." That way, the compiler can avoid setting up the mechanisms that would allow the function to return properly. Leaving out those mechanisms might allow the compiler to generate more efficient code.
How can a function not return? One example would be if it called exit() instead.
But if you promise the compiler that your function won't return, and the compiler doesn't arrange for it to be possible for the function to return properly, and then you go and write a function that does return, what's the compiler supposed to do? It basically has three possibilities:
Be "nice" to you and figure out a way to have the function return properly anyway.
Emit code that, when the function improperly returns, it crashes or behaves in arbitrarily unpredictable ways.
Give you a warning or error message pointing out that you broke your promise.
The compiler might do 1, 2, 3, or some combination.
If this sounds like undefined behavior, that's because it is.
The bottom line, in programming as in real life, is: Don't make promises you can't keep. Someone else might have made decisions based on your promise, and bad things can happen if you then break your promise.
The noreturn attribute is a promise that you make to the compiler about your function.
If you do return from such a function, behavior is undefined, but this doesn't mean a sane compiler will allow you to mess the state of the application completely by removing the ret statement, especially since the compiler will often even be able to deduce that a return is indeed possible.
However, if you write this:
noreturn void func(void)
{
printf("func\n");
}
int main(void)
{
func();
some_other_func();
}
then it's perfectly reasonable for the compiler to remove the call to some_other_func completely, if it feels like it.
As others have mentioned, this is classic undefined behavior. You promised func wouldn't return, but you made it return anyway. You get to pick up the pieces when that breaks.
Although the compiler compiles func in the usual manner (despite your noreturn), the noreturn affects calling functions.
You can see this in the assembly listing: the compiler has assumed, in main, that func won't return. Therefore, it literally deleted all of the code after the call func (see for yourself at https://godbolt.org/g/8hW6ZR). The assembly listing isn't truncated, it literally just ends after the call func because the compiler assumes any code after that would be unreachable. So, when func actually does return, main is going to start executing whatever crap follows the main function - be it padding, immediate constants, or a sea of 00 bytes. Again - very much undefined behavior.
This is transitive - a function that calls a noreturn function in all possible code paths can, itself, be assumed to be noreturn.
According to this
If the function declared _Noreturn returns, the behavior is undefined. A compiler diagnostic is recommended if this can be detected.
It is the programmer's responsibility to make sure that this function never returns, e.g. exit(1) at the end of the function.
ret simply means that the function returns control back to the caller. So, main does call func, the CPU executes the function, and then, with ret, the CPU continues execution of main.
Edit
So, it turns out, noreturn does not make the function not return at all; it's just a specifier that tells the compiler that the code of this function is written in such a way that the function won't return. So, what you should do here is make sure that this function actually doesn't return control back to the caller. For example, you could call exit inside it.
Also, given what I've read about this specifier it seems that in order to make sure the function won't return to its point of invocation, one should call another noreturn function inside it and make sure that the latter is always run (in order to avoid undefined behavior) and doesn't cause UB itself.
A noreturn function does not have to save the registers on entry, as it is not necessary. That makes optimisations easier, which is great for a scheduler routine, for example.
See the example here:
https://godbolt.org/g/2N3THC and spot the difference
TL:DR: It's a missed-optimization by gcc.
noreturn is a promise to the compiler that the function won't return. This allows optimizations, and is useful especially in cases where it's hard for the compiler to prove that a loop won't ever exit, or otherwise prove there's no path through a function that returns.
GCC already optimizes main to fall off the end of the function if func() returns, even with the default -O0 (minimum optimization level) that it looks like you used.
The output for func() itself could be considered a missed optimization; it could just omit everything after the function call (since having the call not return is the only way the function itself can be noreturn). It's not a great example since printf is a standard C function that is known to return normally (unless you setvbuf to give stdout a buffer that will segfault?)
Let's use a different function that the compiler doesn't know about.
void ext(void);
//static
int foo;
_Noreturn void func(int *p, int a) {
ext();
*p = a; // using function args after a function call
foo = 1; // requires save/restore of registers
}
void bar() {
func(&foo, 3);
}
(Code + x86-64 asm on the Godbolt compiler explorer.)
gcc7.2 output for bar() is interesting. It inlines func(), and eliminates the foo=3 dead store, leaving just:
bar:
sub rsp, 8 ## align the stack
call ext
mov DWORD PTR foo[rip], 1
## fall off the end
Gcc still assumes that ext() is going to return, otherwise it could have just tail-called ext() with jmp ext. But gcc doesn't tailcall noreturn functions, because that loses backtrace info for things like abort(). Apparently inlining them is ok, though.
Gcc could have optimized by omitting the mov store after the call as well. If ext returns, the program is hosed, so there's no point generating any of that code. Clang does make that optimization in bar() / main().
func itself is more interesting, and a bigger missed optimization.
gcc and clang both emit nearly the same thing:
func:
push rbp # save some call-preserved regs
push rbx
mov ebp, esi # save function args for after ext()
mov rbx, rdi
sub rsp, 8 # align the stack before a call
call ext
mov DWORD PTR [rbx], ebp # *p = a;
mov DWORD PTR foo[rip], 1 # foo = 1
add rsp, 8
pop rbx # restore call-preserved regs
pop rbp
ret
This function could assume that it doesn't return, and use rbx and rbp without saving/restoring them.
Gcc for ARM32 actually does that, but still emits instructions to return otherwise cleanly. So a noreturn function that does actually return on ARM32 will break the ABI and cause hard-to-debug problems in the caller or later. (Undefined behaviour allows this, but it's at least a quality-of-implementation problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82158.)
This is a useful optimization in cases where gcc can't prove whether a function does or doesn't return. (It's obviously harmful when the function does simply return, though. Gcc warns when it's sure a noreturn function does return.) Other gcc target architectures don't do this; that's also a missed optimization.
But gcc doesn't go far enough: optimizing away the return instruction as well (or replacing it with an illegal instruction) would save code size and guarantee noisy failure instead of silent corruption.
And if you're going to optimize away the ret, optimizing away everything that's only needed if the function will return makes sense.
Thus, func() could be compiled to:
sub rsp, 8
call ext
# *p = a; and so on assumed to never happen
ud2 # optional: illegal insn instead of fall-through
Every other instruction present is a missed optimization. If ext is declared noreturn, that's exactly what we get.
Any basic block that ends with a return could be assumed to never be reached.
Simple question. The function asm in C is used to do inline assembly in your code. But what does it return? Is it the conventional eax, and if not, what does it return?
__asm__ itself does not return a value. C standard does not define how __asm__ should handle the return value, so the behavior might be different between compilers. You stated that Visual Studio example is valid, but Visual Studio uses __asm. __asm__ is used at least by GCC.
Visual Studio
To get the result in a C program, you can place return value to eax in the assembly code, and return from the function. The caller will receive contents of eax as the return value. This is supported even with optimization enabled, even if the compiler decides to inline the function containing the __asm{} block.
It avoids a store/reload you'd otherwise get from moving the value to a C variable in the asm and returning that C variable, because MSVC inline asm syntax doesn't support inputs/outputs in registers (except for this return-value case).
Visual Studio 2015 documentation:
int power2( int num, int power )
{
__asm
{
mov eax, num ; Get first argument
mov ecx, power ; Get second argument
shl eax, cl ; EAX = EAX * ( 2 to the power of CL )
}
// Return with result in EAX
// by falling off the end of a non-void function
}
clang -fasm-blocks supports the same inline-asm syntax but does not support falling off the end of a non-void function as returning the value that an asm{} block left in EAX/RAX. Beware of that if porting MSVC inline asm to clang. It will break horribly when compiled with optimization enabled (function inlining).
GCC
GCC inline assembly HOWTO does not contain a similar example. You can't use an implicit return as in Visual Studio, but fortunately you don't need to because GNU C inline asm syntax allows specifying outputs in registers. No hack is needed to avoid a store/reload of an output value.
The HOWTO shows that you can store the result to C variable inside the assembly block, and return value of that variable after the assembly block has ended. You can even use "=r"(var) to let the compiler pick its choice of register, in case EAX isn't the most convenient after inlining.
An example of an (inefficient) string copy function, returning value of dest:
static inline char * strcpy(char * dest,const char *src)
{
int d0, d1, d2;
__asm__ __volatile__( "1:\tlodsb\n\t"
"stosb\n\t"
"testb %%al,%%al\n\t"
"jne 1b"
: "=&S" (d0), "=&D" (d1), "=&a" (d2)
: "0" (src),"1" (dest)
: "memory");
return dest;
}
(Note that dest isn't actually an output from the inline asm statement. The matching constraint for the dummy output operands tells the compiler the inline asm destroyed that copy of the variable so it needs to preserve it across the asm statement on its own somehow.)
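A smaller sketch of the same output-operand mechanism, assuming GCC or Clang with the default AT&T syntax. The function add_asm is hypothetical, and the asm path is guarded so that other architectures fall back to plain C. The "=r" constraint lets the compiler choose the output register, and "0" ties the first input to that same register.

```c
/* Hypothetical example: add two ints, using inline asm on x86 only. */
int add_asm(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    int sum;
    __asm__("addl %2, %0"        /* sum += b; sum starts out holding a */
            : "=r"(sum)          /* output: any general-purpose register */
            : "0"(a), "r"(b)     /* a shares the output register, b in another */
            : "cc");             /* the add modifies the flags */
    return sum;                  /* ordinary C return of a C variable */
#else
    return a + b;                /* portable fallback for other targets */
#endif
}
```

The value reaches the caller through a normal return statement; nothing depends on which register the asm happened to use.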
If you omit a return statement in a non-void function with optimization enabled, you get a warning like warning: no return statement in function returning non-void [-Wreturn-type] and recent GCC/clang won't even emit a ret; it assumes this path of execution is never taken (because that would be UB). It doesn't matter whether or not the function contained an asm statement or not.
It's unlikely; per the C99 spec, Annex J.5.10 (listed under common extensions):
The asm keyword may be used to insert assembly language directly into
the translator output (6.8). The most common implementation is via a statement of the form:
asm ( character-string-literal );
So it's unlikely that an implementor is going to come up with an approach that both inserts the assembly language into the translator output and also generates some additional intermediary linking code to wire a particular register as a return result.
It's a keyword, not a function.
E.g. GCC uses "=r"-type constraint semantics to allow you in your assembly to have write access to a variable. But you ensure the result ends up in the right place.
I am using C99 under GCC.
I have a function declared static inline in a header that I cannot modify.
The function never returns but is not marked __attribute__((noreturn)).
How can I call the function in a way that tells the compiler it will not return?
I am calling it from my own noreturn function, and partly want to suppress the "noreturn function returns" warning but also want to help the optimizer etc.
I have tried including a declaration with the attribute but get a warning about the repeated declaration.
I have tried creating a function pointer and applying the attribute to that, but it says the function attribute cannot apply to a pointed function.
From the function you defined, and which calls the external function, add a call to __builtin_unreachable which is built into at least GCC and Clang compilers and is marked noreturn. In fact, this function does nothing else and should not be called. It's only here so that the compiler can infer that program execution will stop at this point.
static inline void external_function(void) // lacks the noreturn attribute
{ /* does not return */ }
__attribute__((noreturn)) void your_function(void) {
external_function(); // the compiler thinks execution may continue ...
__builtin_unreachable(); // ... and now it knows it won't go beyond here
}
Edit: Just to clarify a few points raised in the comments, and generally give a bit of context:
A function has only two ways of not returning: loop forever, or short-circuit the usual control-flow (e.g. throw an exception, jump out of the function, terminate the process, etc.)
In some cases, the compiler may be able to infer and prove through static analysis that a function will not return. Even theoretically, this is not always possible, and since we want compilers to be fast only obvious/easy cases are detected.
__attribute__((noreturn)) is an annotation (like const) which is a way for the programmer to inform the compiler that he's absolutely sure a function will not return. Following the trust but verify principle, the compiler tries to prove that the function does indeed not return. It may then issue an error if it proves the function may return, or a warning if it was not able to prove whether the function returns or not.
__builtin_unreachable has undefined behaviour because it is not meant to be called. It's only meant to help the compiler's static analysis. Indeed the compiler knows that this function does not return, so any following code is provably unreachable (except through a jump).
Once the compiler has established (either by itself, or with the programmer's help) that some code is unreachable, it may use this information to do optimizations like these:
Remove the boilerplate code used to return from a function to its caller, if the function never returns
Propagate the unreachability information, i.e. if the only execution path to a code point is through unreachable code, then this point is also unreachable. Examples:
if a function does not return, any code following its call and not reachable through jumps is also unreachable. Example: code following __builtin_unreachable() is unreachable.
in particular, if the only path to a function's return is through unreachable code, the function can be marked noreturn. That's what happens for your_function.
any memory location / variable only used in unreachable code is not needed, therefore setting/computing the content of such data is not needed.
any computation which is provably (1) unnecessary (previous bullet) and (2) has no side effects (such as pure functions) may be removed.
Illustration:
The call to external_function cannot be removed because it might have side-effects. In fact, it probably has at least the side effect of terminating the process!
The return boiler plate of your_function may be removed
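A compilable variant of the wrapper, as a sketch assuming GCC or Clang. Here legacy_exit stands in for the header function that cannot be modified, and fail_with and first_or_fail are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the header function: it never returns,
   but its declaration lacks the noreturn attribute. */
static inline void legacy_exit(int code)
{
    exit(code);
}

/* Wrapper that carries the attribute on behalf of legacy_exit(). */
__attribute__((noreturn)) void fail_with(int code)
{
    legacy_exit(code);
    __builtin_unreachable();   /* tells the compiler this point is never reached */
}

/* Caller: the compiler may treat everything after fail_with() as dead. */
int first_or_fail(const int *p)
{
    if (!p)
        fail_with(1);
    return *p;
}
```

If legacy_exit() ever did return, reaching __builtin_unreachable() would be undefined behaviour, so this is only safe when the promise really holds.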
Here's another example showing how code before the unreachable point may be removed
int compute(int) __attribute__((pure)); /* expensive computation, no side effects */
if(condition) {
int x = compute(input); // (1) no side effect => keep if x is used
// (8) x is not used => remove
printf("hello "); // (2) reachable + side effect => keep
your_function(); // (3) reachable + side effect => keep
// (4) unreachable beyond this point
printf("world!\n"); // (5) unreachable => remove
printf("%d\n", x); // (6) unreachable => remove
// (7) mark 'x' as unused
} else {
// follows unreachable code, but can jump here
// from reachable code, so this is reachable
do_stuff(); // keep
}
Several solutions:
redeclaring your function with the __attribute__
You should try to modify that function in its header by adding __attribute__((noreturn)) to it.
You can redeclare some functions with new attribute, as this stupid test demonstrates (adding an attribute to fopen) :
#include <stdio.h>
extern FILE *fopen (const char *__restrict __filename,
const char *__restrict __modes)
__attribute__ ((warning ("fopen is used")));
void
show_map_without_care (void)
{
FILE *f = fopen ("/proc/self/maps", "r");
do
{
char lin[64];
fgets (lin, sizeof (lin), f);
fputs (lin, stdout);
}
while (!feof (f));
fclose (f);
}
overriding with a macro
Finally, you could define a macro like
#define func(A) {func(A); __builtin_unreachable();}
(this uses the fact that inside a macro, the macro name is not macro-expanded).
If your never-returning func is declared as returning e.g. int, you'll use a statement expression like
#define func(A) ({func(A); __builtin_unreachable(); (int)0; })
Macro-based solutions like above won't always work, e.g. if func is passed as a function pointer, or simply if some guy codes (func)(1) which is legal but ugly.
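A sketch of the statement-expression variant in use, assuming GCC or Clang (statement expressions are a GNU extension). legacy_fail is a hypothetical stand-in for the never-returning function declared as returning int, and parse_digit is a hypothetical caller.

```c
#include <stdlib.h>

/* Hypothetical stand-in: never returns, but is declared as returning int
   and lacks the noreturn attribute. */
static inline int legacy_fail(int code)
{
    exit(code);
}

/* Defined after the function so that only call sites are rewritten.
   Inside its own expansion the macro name is not expanded again. */
#define legacy_fail(code) \
    ({ legacy_fail(code); __builtin_unreachable(); (int)0; })

int parse_digit(char c)
{
    if (c < '0' || c > '9')
        return legacy_fail(2);   /* expands to the statement expression */
    return c - '0';
}
```

The trailing (int)0 only gives the statement expression a type; it is never evaluated when the promise holds.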
redeclaring a static inline with the noreturn attribute
And the following example:
// file ex.c
// declare exit without any standard header
void exit (int);
// define myexit as a static inline
static inline void
myexit (int c)
{
exit (c);
}
// redeclare it as noreturn
static inline void myexit (int c) __attribute__ ((noreturn));
int
foo (int *p)
{
if (!p)
myexit (1);
if (p)
return *p + 2;
return 0;
}
when compiled with GCC 4.9 (from Debian/Sid/x86-64) as gcc -S -fverbose-asm -O2 ex.c, gives an assembly file containing the expected optimization:
.type foo, #function
foo:
.LFB1:
.cfi_startproc
testq %rdi, %rdi # p
je .L5 #,
movl (%rdi), %eax # *p_2(D), *p_2(D)
addl $2, %eax #, D.1768
ret
.L5:
pushq %rax #
.cfi_def_cfa_offset 16
movb $1, %dil #,
call exit #
.cfi_endproc
.LFE1:
.size foo, .-foo
You could play with #pragma GCC diagnostic to selectively disable a warning.
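The general shape of that pragma, as a sketch: the option name below (-Wunused-parameter) is only an illustration, not the warning from the question, and a warning can be silenced this way only if the compiler reports a -W option for it. The function scale is hypothetical.

```c
/* Save the current diagnostic state, silence one warning, then restore. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"

/* Hypothetical function that would otherwise warn under -Wextra. */
int scale(int value, int reserved)
{
    return value * 2;
}

#pragma GCC diagnostic pop
```

The push/pop pair keeps the suppression local to the code between the two pragmas.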
Customizing GCC with MELT
Finally, you could customize your recent gcc using the MELT plugin and coding your simple extension (in the MELT domain specific language) to add the attribute noreturn when encountering the desired function. It is probably a dozen MELT lines, using register_finish_decl_first and a match on the function name.
Since I am the main author of MELT (free software GPLv3+) I could perhaps even code that for you if you ask, e.g. here or preferably on gcc-melt#googlegroups.com; give the concrete name of your never-returning function.
Probably the MELT code is looking like:
;;file your_melt_mode.melt
(module_is_gpl_compatible "GPLv3+")
(defun my_finish_decl (decl)
(let ( (tdecl (unbox :tree decl))
)
(match tdecl
(?(tree_function_decl_named
?(tree_identifier ?(cstring_same "your_function_name")))
;;; code to add the noreturn attribute
;;; ....
))))
(register_finish_decl_first my_finish_decl)
The real MELT code is slightly more complex. You want to define your_adding_attr_mode there. Ask me for more.
Once you coded your MELT extension your_melt_mode.melt for your needs (and compiled that MELT extension into your_melt_mode.quicklybuilt.so as documented in the MELT tutorials) you'll compile your code with
gcc -fplugin=melt \
-fplugin-arg-melt-extra=your_melt_mode.quicklybuilt \
-fplugin-arg-melt-mode=your_adding_attr_mode \
-O2 -I/your/include -c yourfile.c
In other words, you just add a few -fplugin-* flags to your CFLAGS in your Makefile !
BTW, I'm just coding in the MELT monitor (on github: https://github.com/bstarynk/melt-monitor ..., file meltmom-process.melt) something quite similar.
With a MELT extension, you won't get any additional warning, since the MELT extension would alter the internal GCC AST (a GCC Tree) of the declared function on the fly!
Customizing GCC with MELT is probably the most bullet-proof solution, since it is modifying the GCC internal AST. Of course, it is probably the most costly solution (and it is GCC specific and might need -small- changes when GCC is evolving, e.g. when using the next version of GCC), but as I am trying to show it is quite easy in your case.
PS. In 2019, GCC MELT is an abandoned project. If you want to customize GCC (for any recent version of GCC, e.g. GCC 7, 8, or 9), you need to write your own GCC plugin in C++.
Let's say I have pseudocode like this:
main() {
BOOL b = get_bool_from_environment(); //get it from a file, network, registry, whatever
while(true) {
do_stuff(b);
}
}
do_stuff(BOOL b) {
if(b)
path_a();
else
path_b();
}
Now, since we know that the external environment can influence get_bool_from_environment() to potentially produce either a true or false result, then we know that the code for both the true and false branches of if(b) must be included in the binary. We can't simply omit path_a(); or path_b(); from the code.
BUT -- we only set BOOL b the one time, and we always reuse the same value after program initialization.
If I were to make this valid C code and then compile it using gcc -O0, the if(b) would be repeatedly evaluated on the processor each time that do_stuff(b) is invoked, which inserts what are, in my opinion, needless instructions into the pipeline for a branch that is basically static after initialization.
If I were to assume that I actually had a compiler that was as stupid as gcc -O0, I would re-write this code to include a function pointer, and two separate functions, do_stuff_a() and do_stuff_b(), which don't perform the if(b) test, but simply go ahead and perform one of the two paths. Then, in main(), I would assign the function pointer based on the value of b, and call that function in the loop. This eliminates the branch, though it admittedly adds a memory access for the function pointer dereference (due to architecture implementation I don't think I really need to worry about that).
Is it possible, even in principle, for a compiler to take code of the same style as the original pseudocode sample, and to realize that the test is unnecessary once the value of b is assigned once in main()? If so, what is the theoretical name for this compiler optimization, and can you please give an example of an actual compiler implementation (open source or otherwise) which does this?
I realize that compilers can't generate dynamic code at runtime, and the only types of systems that could do that in principle would be bytecode virtual machines or interpreters (e.g. Java, .NET, Ruby, etc.) -- so the question remains whether or not it is possible to do this statically and generate code that contains both the path_a(); branch and the path_b() branch, but avoid evaluating the conditional test if(b) for every call of do_stuff(b);.
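The manual function-pointer rewrite described above could look like the following sketch. The counters, select_path and run_loop are hypothetical, and the loop is bounded (unlike the while(true) in the pseudocode) so that it can be exercised.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the real work of each path. */
int calls_a = 0;
int calls_b = 0;

void path_a(void) { calls_a++; }
void path_b(void) { calls_b++; }

typedef void (*stuff_fn)(void);

/* The test on b happens exactly once, here. */
stuff_fn select_path(bool b)
{
    return b ? path_a : path_b;
}

void run_loop(bool b, int iterations)
{
    stuff_fn do_stuff = select_path(b);
    for (int i = 0; i < iterations; i++)
        do_stuff();              /* indirect call, no per-iteration branch on b */
}
```

This trades the conditional branch for an indirect call, which also tends to prevent the compiler from inlining the two paths.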
If you tell your compiler to optimise, you have a good chance that the if(b) is evaluated only once.
Slightly modifying the given example, using the standard _Bool instead of BOOL, and adding the missing return types and declarations,
_Bool get_bool_from_environment(void);
void path_a(void);
void path_b(void);
void do_stuff(_Bool b) {
if(b)
path_a();
else
path_b();
}
int main(void) {
_Bool b = get_bool_from_environment(); //get it from a file, network, registry, whatever
while(1) {
do_stuff(b);
}
}
the (relevant part of the) produced assembly by clang -O3 [clang-3.0] is
callq get_bool_from_environment
cmpb $1, %al
jne .LBB1_2
.align 16, 0x90
.LBB1_1: # %do_stuff.exit.backedge.us
# =>This Inner Loop Header: Depth=1
callq path_a
jmp .LBB1_1
.align 16, 0x90
.LBB1_2: # %do_stuff.exit.backedge
# =>This Inner Loop Header: Depth=1
callq path_b
jmp .LBB1_2
b is tested only once, and main jumps into an infinite loop of either path_a or path_b depending on the value of b. If path_a and path_b are small enough, they would be inlined (I strongly expect). With -O and -O2, the code produced by clang would evaluate b in each iteration of the loop.
gcc (4.6.2) behaves similarly with -O3:
call get_bool_from_environment
testb %al, %al
jne .L8
.p2align 4,,10
.p2align 3
.L9:
call path_b
.p2align 4,,6
jmp .L9
.L8:
.p2align 4,,8
call path_a
.p2align 4,,8
call path_a
.p2align 4,,5
jmp .L8
Oddly, it unrolled the loop for path_a, but not for path_b. With -O2 or -O, however, it would call do_stuff in the infinite loop.
Hence to
Is it possible, even in principle, for a compiler to take code of the same style as the original pseudocode sample, and to realize that the test is unnecessary once the value of b is assigned once in main()?
the answer is a definitive Yes, it is possible for compilers to recognize this and take advantage of that fact. Good compilers do when asked to optimise hard.
If so, what is the theoretical name for this compiler optimization, and can you please give an example of an actual compiler implementation (open source or otherwise) which does this?
I don't know the name of the optimisation, but two implementations doing that are gcc and clang (at least, recent enough releases).
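This transformation is commonly called loop unswitching (GCC exposes it as -funswitch-loops, which is enabled at -O3). Written out by hand, the effect is roughly the following sketch. The counters and run_unswitched are hypothetical, and the loop is bounded so that it terminates.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the real work of each path. */
int unswitched_a_calls = 0;
int unswitched_b_calls = 0;

void path_a(void) { unswitched_a_calls++; }
void path_b(void) { unswitched_b_calls++; }

/* The loop-invariant test on b is hoisted out of the loop, and the loop
   body is duplicated once per outcome. */
void run_unswitched(bool b, int iterations)
{
    if (b) {
        for (int i = 0; i < iterations; i++)
            path_a();
    } else {
        for (int i = 0; i < iterations; i++)
            path_b();
    }
}
```

Both loop bodies remain in the binary, which matches the two separate loops visible in the clang and gcc listings above.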