Why doesn't gcc remove this check of a non-volatile variable?

This question is mostly academic. I ask out of curiosity, not because this poses an actual problem for me.
Consider the following incorrect C program.
#include <signal.h>
#include <stdio.h>

static int running = 1;

void handler(int u) {
    running = 0;
}

int main() {
    signal(SIGTERM, handler);
    while (running)
        ;
    printf("Bye!\n");
    return 0;
}
This program is incorrect because the handler interrupts the program flow, so running can be modified at any time and should therefore be declared volatile. But let's say the programmer forgot that.
gcc 4.3.3, with the -O3 flag, compiles the loop body (after one initial check of the running flag) down to the infinite loop
.L7:
    jmp .L7
which was to be expected.
Now we put something trivial inside the while loop, like:
while (running)
    putchar('.');
And suddenly, gcc does not optimize the loop condition anymore! The loop body's assembly now looks like this (again at -O3):
.L7:
    movq stdout(%rip), %rsi
    movl $46, %edi
    call _IO_putc
    movl running(%rip), %eax
    testl %eax, %eax
    jne .L7
We see that running is re-loaded from memory each time through the loop; it is not even cached in a register. Apparently gcc now thinks that the value of running could have changed.
So why does gcc suddenly decide that it needs to re-check the value of running in this case?

In the general case it's difficult for a compiler to know exactly which objects a function might have access to and therefore could potentially modify. At the point where putchar() is called, GCC doesn't know if there might be a putchar() implementation that might be able to modify running so it has to be somewhat pessimistic and assume that running might in fact have been changed.
For example, there might be a putchar() implementation later in the translation unit:
int putchar(int c)
{
    running = c;
    return c;
}
Even if there's not a putchar() implementation in the translation unit, there could be something that might, for example, pass the address of the running object such that putchar might be able to modify it:
void foo(void)
{
    set_putchar_status_location(&running);
}
Note that your handler() function is globally accessible, so putchar() might call handler() itself (directly or otherwise), which is an instance of the above situation.
On the other hand, since running is visible only to the translation unit (being static), by the time the compiler gets to the end of the file it should be able to determine that there is no opportunity for putchar() to access it (assuming that's the case), and the compiler could go back and 'fix up' the pessimization in the while loop.
Since running is static, the compiler might be able to determine that it's not accessible from outside the translation unit and make the optimization you're talking about. However, since it's accessible through handler() and handler() is accessible externally, the compiler can't optimize the access away. Even if you make handler() static, it's accessible externally since you pass the address of it to another function.
Note that in your first example, even though the point in the previous paragraph still applies, the compiler can optimize away the access to running. That's because the 'abstract machine model' the C language is based on doesn't take into account asynchronous activity except in very limited circumstances (one of which is the volatile keyword, and another is signal handling, though the requirements for signal handling aren't strong enough to prevent the compiler from optimizing away the access to running in your first example).
In fact, here's what the C99 standard says about abstract machine behavior in pretty much these exact circumstances:
5.1.2.3/8 "Program execution"
EXAMPLE 1:
An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.
Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation defined restrictions.
Finally, you should note that the C99 standard also says:
7.14.1.1/5 "The signal function`
If the signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t...
So strictly speaking the running variable may need to be declared as:
volatile sig_atomic_t running = 1;
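For reference, a corrected version of the original program would then look like this (a sketch; identical to the original apart from the standard-blessed declaration):

#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t running = 1;

void handler(int u) {
    running = 0;   /* writing a volatile sig_atomic_t from a handler is defined */
}

int main(void) {
    signal(SIGTERM, handler);
    while (running)   /* running is now re-read from memory on every iteration */
        ;
    printf("Bye!\n");
    return 0;
}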

Because the call to putchar() could change the value of running (GCC only knows that putchar() is an external function and does not know what it does - for all GCC knows putchar() could call handler()).

GCC probably assumes that the call to putchar can modify any global variable, including running.
Take a look at the pure function attribute, which states that the function does not have side-effects on the global state. I suspect if you replace putchar() with a call to a "pure" function, GCC will reintroduce the loop optimization.
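A minimal sketch of that idea, assuming a made-up function dot() defined in some other file and declared with GCC's pure attribute:

static int running = 1;

/* dot() is hypothetical; pure tells GCC the call cannot write global state */
__attribute__((pure)) int dot(int c);

void spin(void)
{
    while (running)
        dot('.');   /* GCC may now hoist the load of running out of the loop;
                       since the result is unused, it may even drop the call */
}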

Thank you all for your answers and comments. They have been very helpful, but none of them provide the full story. [Edit: Michael Burr's answer now does, making this somewhat redundant.] I'll sum up here.
Even though running is static, handler is not static; therefore it might be called from putchar and change running in that way. Since the implementation of putchar is not known at this point, it could conceivably call handler from the body of the while loop.
Suppose handler were static. Can we optimize away the running check then? The answer is no, because the signal implementation is also outside this compilation unit. For all gcc knows, signal might store the address of handler somewhere (which, in fact, it does), and putchar might then call handler through this pointer even though it has no direct access to that function.
So in what cases can the running check be optimized away? It seems that this is only possible if the loop body does not call any functions from outside this translation unit, so that it is known at compilation time what does and does not happen inside the loop body.
This explains why forgetting a volatile is not such a big deal in practice as it might seem at first.

putchar can change running.
Only link-time analysis could, in theory, determine that it doesn't.

Related

Is there any practical use for a function that does nothing?

Would there be any use for a function that does nothing when run, i.e:
void Nothing() {}
Note, I am not talking about a function that waits for a certain amount of time, like sleep(), just something that takes as much time as the compiler / interpreter gives it.
Such a function could be necessary as a callback function.
Supposed you had a function that looked like this:
void do_something(int param1, char *param2, void (*callback)(void))
{
    // do something with param1 and param2
    callback();
}
This function receives a pointer to a function which it subsequently calls. If you don't particularly need to use this callback for anything, you would pass a function that does nothing:
do_something(3, "test", Nothing);
When I've created tables that contain function pointers, I do use empty functions.
For example:
typedef int (*EventHandler_Proc_t)(int a, int b); // A function pointer to be called to handle an event

struct
{
    Event_t event_id;
    EventHandler_Proc_t proc;
} EventTable[] = { // An array of events, and the functions to be called when each event occurs
    { EventInitialize, InitializeFunction },
    { EventIncrement,  IncrementFunction },
    { EventNOP,        NothingFunction }, // Empty function is used here.
};
In this example table, I could put NULL in place of the NothingFunction, and check if the .proc is NULL before calling it. But I think it keeps the code simpler to put a do-nothing function in the table.
Yes. Quite a lot of things want to be given a function to notify about a certain thing happening (callbacks). A function that does nothing is a good way to say "I don't care about this."
I am not aware of any examples in the standard library, but many libraries built on top have function pointers for events.
For an example, glib defines a callback "GLib.LogFunc(log_domain, log_level, message, *user_data)" for providing the logger. An empty function would be the callback you provide when logging is disabled.
One use case would be as a possibly temporary stub function midway through a program's development.
If I'm doing some amount of top-down development, it's common for me to design some function prototypes, write the main function, and at that point, want to run the compiler to see if I have any syntax errors so far. To make that compile happen I need to implement the functions in question, which I'll do by initially just creating empty "stubs" which do nothing. Once I pass that compile test, I can go on and flesh out the functions one at a time.
The Gaddis textbook Starting out with C++: From Control Structures Through Objects, which I teach out of, describes them this way (Sec. 6.16):
A stub is a dummy function that is called instead of the actual function it represents. It usually displays a test message acknowledging that it was called, and nothing more.
A function that takes arguments and does nothing with them can be used as a pair with a function that does something useful, such that the arguments are still evaluated even when the no-op function is used. This can be useful in logging scenarios, where the arguments must still be evaluated to verify the expressions are legal and to ensure any important side-effects occur, but the logging itself isn't necessary. The no-op function might be selected by the preprocessor when the compile-time logging level was set at a level that doesn't want output for that particular log statement.
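A sketch of that pattern (LOG_LEVEL, log_real, and log_nothing are illustrative names): the preprocessor routes the call to a no-op function, but the argument expressions are still compiled and evaluated:

#include <stdio.h>

static void log_real(const char *msg, int value) { printf("%s: %d\n", msg, value); }
static void log_nothing(const char *msg, int value) { (void)msg; (void)value; }

#if LOG_LEVEL >= 2
#define LOG_DEBUG log_real
#else
#define LOG_DEBUG log_nothing
#endif

int main(void)
{
    int counter = 0;
    LOG_DEBUG("counter", ++counter);   /* ++counter is evaluated either way */
    return 0;
}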
As I recall, there were two empty functions in Lions' Commentary on UNIX 6th Edition, with Source Code, and the introduction to the re-issue early this century called Ritchie, Kernighan and Thompson out on it.
The function that gobbles its argument and returns nothing is actually ubiquitous in C, but not written out explicitly because it is implicitly called on nearly every line. The most common use of this empty function, in traditional C, was the invisible discard of the value of any statement. But, since C89, this can be explicitly spelled as (void). The lint tool used to complain whenever a function return value was ignored without explicitly passing it to this built-in function that returns nothing. The motivation behind this was to try to prevent programmers from silently ignoring error conditions, and you will still run into some old programs that use the coding style, (void)printf("hello, world!\n");.
Such a function might be used for:
Callbacks (which the other answers have mentioned)
An argument to higher-order functions
Benchmarking a framework, with no overhead for the no-op being performed
Having a unique value of the correct type to compare other function pointers to. (Particularly in a language like C, where all function pointers are convertible and comparable with each other, but conversion between function pointers and other kinds of pointers is not portable.)
The sole element of a singleton value type, in a functional language
If passed an argument that it strictly evaluates, this could be a way to discard a return value but execute side-effects and test for exceptions
A dummy placeholder
Proving certain theorems in the typed Lambda Calculus
Another temporary use for a do-nothing function could be to have a line exist to put a breakpoint on, for example when you need to check the run-time values being passed into a newly created function so that you can make better decisions about what the code you're going to put in there will need to access. Personally, I like to use self-assignments, i.e. i = i when I need this kind of breakpoint, but a no-op function would presumably work just as well.
void MyBrandNewSpiffyFunction(TypeImNotFamiliarWith whoKnowsWhatThisVariableHas)
{
    DoNothing(); // Yay! Now I can put in a breakpoint so I can see what data I'm receiving!
    int i = 0;
    i = i; // Another way to do nothing so I can set a breakpoint
}
From a language lawyer perspective, an opaque function call inserts a barrier for optimizations.
For example:
int a = 0;

extern void e(void);

int b(void)
{
    ++a;
    ++a;
    return a;
}

int c(void)
{
    ++a;
    e();
    ++a;
    return a;
}

int d(void)
{
    ++a;
    asm(" ");
    ++a;
    return a;
}
The ++a expressions in the b function can be merged to a += 2, while in the c function, a needs to be updated before the function call and reloaded from memory after, as the compiler cannot prove that e does not access a, similar to the (non-standard) asm(" ") in the d function.
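For instance, gcc at -O2 on x86-64 typically emits something like the following for b and c (illustrative, not authoritative; exact output varies by compiler version):

b:
    movl a(%rip), %eax
    addl $2, %eax
    movl %eax, a(%rip)
    ret
c:
    addl $1, a(%rip)
    call e
    movl a(%rip), %eax
    addl $1, %eax
    movl %eax, a(%rip)
    ret

In c, the first increment has to reach memory before the call to e, and a has to be reloaded afterwards, because e may have read or modified it.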
In the embedded firmware world, it could be used to add a tiny delay required for some hardware reason. Of course, it could also be called many times in a row, making the delay expandable by the programmer.
Empty functions are not uncommon in platform-specific abstraction layers. There are often functions that are only needed on certain platforms. For example, a function void native_to_big_endian(struct data* d) would contain byte-swapping code on a little-endian CPU but could be completely empty on a big-endian CPU. This helps keep the business logic platform-agnostic and readable. I've also seen this sort of thing done for tasks like converting native file paths to Unix/Windows style, hardware initialization functions (when some platforms can run with defaults and others must be actively reconfigured), etc.
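A sketch of what such a layer might look like (BIG_ENDIAN_TARGET and the struct are illustrative; on a big-endian build the function body is simply empty):

#include <stdint.h>

struct data { uint32_t value; };

#if defined(BIG_ENDIAN_TARGET)
void native_to_big_endian(struct data *d) { (void)d; /* already big-endian */ }
#else
void native_to_big_endian(struct data *d)
{
    uint32_t v = d->value;   /* byte-swap 0xAABBCCDD into 0xDDCCBBAA */
    d->value = (v >> 24) | ((v >> 8) & 0xFF00u)
             | ((v << 8) & 0xFF0000u) | (v << 24);
}
#endif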
At the risk of being considered off-topic, I'm going to argue from a Thomistic perspective that a function that does nothing, and the concept of NULL in computing, really has no place anywhere in computing.
Software is constituted in substance by state, behavior, and control flow which belongs to behavior. To have the absence of state is impossible; and to have the absence of behavior is impossible.
Absence of state is impossible because a value is always present in memory, regardless of initialization state for the memory that is available. Absence of behavior is impossible because non-behavior cannot be executed (even "nop" instructions do something).
Instead, we might better state that there is negative and positive existence defined subjectively by the context with an objective definition being that negative existence of state or behavior means no explicit value or implementation respectively, while the positive refers to explicit value or implementation respectively.
This changes the perspective concerning the design of an API.
Instead of:
void foo(void (*bar)()) {
    if (bar) { bar(); }
}

we instead have:

void foo();

void foo_with_bar(void (*bar)()) {
    if (!bar) { fatal(__func__, "bar is NULL; callback required\n"); }
    bar();
}
or:
void foo(bool use_bar, void (*bar)());
or if you want even more information about the existence of bar:
void foo(bool use_bar, bool bar_exists, void (*bar)());
each of which is a better design that makes your code and intent well-expressed. The simple fact of the matter is that the existence of a thing or not concerns the operation of an algorithm, or the manner in which state is interpreted. Not only do you lose a whole value by reserving NULL with 0 (or any arbitrary value there), but you make your model of the algorithm less perfect and even error-prone in rare cases. What's more, on a system in which this reserved value is not actually reserved, the implementation might not work as expected.
If you need to detect for the existence of an input, let that be explicit in your API: have a parameter or two for that if it's that important. It will be more maintainable and portable as well since you're decoupling logic metadata from inputs.
In my opinion, therefore, a function that does nothing is not practical to use, but a design flaw if part of the API, and an implementation defect if part of the implementation. NULL obviously won't disappear that easily, and we just use it because that's what currently is used by necessity, but in the future, it doesn't have to be that way.
Besides all the reasons already given here, note that an "empty" function is never truly empty, so you can learn a lot about how function calls work on your architecture of choice by looking at the assembly output. Let's look at a few examples. Let's say I have the following C file, nothing.c:
void Nothing(void) {}
Compile this on an x86_64 machine with clang -c -S nothing.c -o nothing.s and you'll get something that looks like this (stripped of metadata and other stuff irrelevant to this discussion):
nothing.s:
_Nothing: ## #Nothing
    pushq %rbp
    movq %rsp, %rbp
    popq %rbp
    retq
Hmm, that doesn't really look like nothing. Note the pushing and popping of %rbp (the frame pointer) onto the stack. Now let's change the compiler flags and add -fomit-frame-pointer, or more explicitly: clang -c -S nothing.c -o nothing.s -fomit-frame-pointer
nothing.s:
_Nothing: ## #Nothing
    retq
That looks a lot more like "nothing", but you still have at least one x86_64 instruction being executed, namely retq.
Let's try one more. Clang supports the gcc gprof profiler option -pg so what if we try that: clang -c -S nothing.c -o nothing.s -pg
nothing.s:
_Nothing: ## #Nothing
    pushq %rbp
    movq %rsp, %rbp
    callq mcount
    popq %rbp
    retq
Here we've added a mysterious additional call to a function mcount() that the compiler has inserted for us. This one looks like the least amount of nothing-ness.
And so you get the idea. Compiler options and architecture can have a profound impact on the meaning of "nothing" in a function. Armed with this knowledge you can make much more informed decisions about both how you write code, and how you compile it. Moreover, a function like this called millions of times and measured can give you a very accurate measure of what you might call "function call overhead", or the bare minimum amount of time required to make a call given your architecture and compiler options. In practice given modern superscalar instruction scheduling, this measurement isn't going to mean a whole lot or be particularly useful, but on certain older or "simpler" architectures, it might.
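If you did want to attempt such a measurement, a rough sketch might look like this (the volatile function pointer is there to stop the compiler from inlining or deleting the calls; the numbers are only indicative):

#include <stdio.h>
#include <time.h>

void Nothing(void) {}

int main(void)
{
    void (*volatile fp)(void) = Nothing;   /* reloaded each iteration */
    const long N = 100000000L;
    clock_t start = clock();
    for (long i = 0; i < N; i++)
        fp();   /* indirect call: cannot be optimized away */
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("approx %.2f ns per call\n", secs / N * 1e9);
    return 0;
}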
These functions have a great place in test-driven development.
class Doer {
public:
    int PerformComplexTask(int input) { return 0; } // just to make it compile
};
Everything compiles, and the test cases fail until the function is properly implemented.

Global Variable Access Relative to Function Calls and Returns

I have been researching this topic and I cannot find a specific, authoritative answer. I am hoping that someone very familiar with the C spec can answer - i.e. confirm or refute my assertion, preferably with a citation to the spec.
Assertion:
If a program consists of more than one compilation unit (separately compiled source file), the compiler must ensure that global variables (if modified) are written to memory before any call to a function in another unit, and before the return from any function. Also, in any function, the global must be read before its first use, and after a call to any function not in the same unit, the global must be re-read before use. And these things must be true whether the variable is qualified as "volatile" or not, because a function in another compilation unit (source file) could access the variable without the compiler's knowledge. Otherwise, "volatile" would always be required for global variables - i.e. non-volatile globals would have no purpose.
Could the compiler treat functions in the same compilation unit differently than ones that aren't? All of the discussions I have found for the "volatile" qualifier on globals show all functions in the same compilation unit.
Edit: The compiler cannot know whether functions in other units use the global or not. Therefore I am assuming the above conditions.
I found these two other questions with information related to this topic but they don't address it head on or they give information that I find suspect:
Are global variables refreshed between function calls?
When do I need to use volatile in ISRs?
[..] in any function, the global must be read before its first use.
Definitely not:
static int variable;

void foo(void) {
    variable = 42;
}
Why should the compiler bother generating code to read the variable?
The compiler must assure that global variables are written to memory before any function call or before the return from a function.
No, why should it?
void bar(void) {
    return;
}

void baz(void) {
    variable = 42;
    bar();
}
bar is a pure function (should be determinable for a decent compiler), so there's no chance of getting any different behaviour when writing to memory after the function call.
The case of "before returning from a function" is tricky, though. But I think the general statement ("must") is false if we count inlined (static) functions, too.
Could the compiler treat functions in the same compilation unit differently than ones that aren't?
Yes, I think so: for a static function (whose address is never taken) the compiler knows exactly how it is used, and this information could be used to apply some more radical optimisations.
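A sketch of the kind of situation where this can pay off (illustrative code): because helper() is static and its address is never taken, the compiler sees every call site, can inline it, and may keep the global in a register across the calls:

static int counter;

static void helper(void) {
    counter++;
}

int run(void) {
    counter = 0;
    helper();          /* likely inlined; counter can live in a register */
    helper();
    return counter;    /* may well be folded to the constant 2 */
}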
I'm basing all of the above on the C version of the As-If rule, specified in §5.1.2.3/6 (N1570):
The least requirements on a conforming implementation are:
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behavior of the program.
In particular, you might want to read the following "EXAMPLE 1".

Is volatile modifier really needed if global variables are modified by an interrupt?

There was a lot said and written about volatile variables, and their use. In those articles, two slightly different ideas can be found:
1 - Volatile should be used when a variable is changed outside of the compiled program.
2 - Volatile should be used when a variable is changed outside the normal flow of the function.
The first statement limits volatile usage to memory-mapped registers and multi-threaded code, but the second one actually adds interrupts into the scope.
This article (http://www.barrgroup.com/Embedded-Systems/How-To/C-Volatile-Keyword) for example explicitly states that volatile modifier should be used for globals changed during the interrupt, and provides this example:
int etx_rcvd = FALSE;

void main()
{
    ...
    while (!etx_rcvd)
    {
        // Wait
    }
    ...
}

interrupt void rx_isr(void)
{
    ...
    if (ETX == rx_char)
    {
        etx_rcvd = TRUE;
    }
    ...
}
Note how setting up a rx_isr() as a callback is conveniently omitted here.
Hence I wrote my own example:
#include <stdio.h>
#include <time.h>
#include <signal.h>

void f(int);

int n = 0;

int main()
{
    signal(SIGINT, f);
    time_t tLastCalled = 0;
    printf("Entering the loop\n");
    while (n == 0)
    {
        if (time(NULL) - tLastCalled > 1)
        {
            printf("Still here...\n");
            tLastCalled = time(NULL);
        }
    }
    printf("Done\n");
    return 0;
}

void f(int sig)
{
    n = 1;
}
Compiled with gcc on Linux at various optimization levels, the loop exited every time and I saw "Done" when I pressed Ctrl+C, which means that gcc really is smart enough not to optimize away the check of n here.
That said, my question is:
If the compiler can really optimize global variables modified by an interrupt service routine, then:
1. Why does it have the right to optimize a global variable in the first place when it can possibly be accessed from another file?
2. Why do the example article and many others on the internet state that the compiler will not "notice" the interrupt callback function?
3. How do I modify my code to accomplish this?
Because you have a function call to an external function, the while loop does check n every time. However, if you remove those function calls the optimizer may registerize or do away with any checks of n.
Ex (gcc x86_64 -O3):
volatile int n;

int main() {
    while (n == 0) {}
    return 0;
}
becomes:
.L3:
    movl n(%rip), %eax
    testl %eax, %eax
    je .L3
    xorl %eax, %eax
    ret
But
int n;

int main() {
    while (n == 0) {}
    return 0;
}
becomes:
movl n(%rip), %eax
testl %eax, %eax
jne .L2
.L3:
jmp .L3
In this case, n is never looked at in the infinite loop.
If there is a signal handler that modifies a global, you really should label that global volatile. You might not get in trouble by skipping this, but you are either getting lucky or you are counting on the optimizer not being able to verify whether or not a global is being touched.
There is some movement in cross-module optimization at link time (LLVM), so someday an optimizer may be able to tell that calls to time or printf aren't touching globals in your file. When that happens, missing the volatile keyword may cause problems even if you have external function calls.
If the compiler can really optimize global variables modified by an interrupt service routine, then:
Why does it have the right to optimize a global variable in the first place when it can possibly be accessed from another file?
The key here is that in a "normal", single-threaded program with no interrupts, the global variable cannot be modified at any time. All accesses to the variable are sequenced in a predictable manner, no matter which file makes the access.
And the optimizations may be subtle. It is not as simple as "ah ok this global doesn't seem to be used, let's remove it entirely". Rather, for some code like
while (global)
{
    do_stuff(global);
}
the optimizer might create something behaving like:
register tmp = global;
loop:
    do_stuff(tmp);
    goto loop;
Which changes the meaning of the program completely. How such bugs caused by the lack of volatile manifest themselves is always different from case-to-case. They are very hard to find.
Why do the example article and many others on the internet state that the compiler will not "notice" the interrupt callback function?
Because embedded compilers are traditionally stupid about this aspect. When a compiler spots your non-standard interrupt keyword, it will typically just do two things:
1. Generate the specific return code from that function, since interrupts usually have different calling conventions compared to regular function calls.
2. Ensure that the function gets linked even though it is never called from the program, possibly allocating it in a separate memory segment. (This last part is actually done by the linker and not the compiler.)
There might nowadays be smarter compilers. PC/desktop compilers face the very same issue when dealing with callback functions/threads, but they are usually smart enough to realize that they shouldn't assume things about global variables shared with a callback.
Embedded compilers are traditionally far dumber than PC/desktop compilers when it comes to optimizations. They are generally of lower quality and worse at standard compliance. If you are one of just a few compiler vendors supporting a specific target, or perhaps the only vendor, then the lack of competition means that you don't have to worry much about quality. You can sell crap and charge a lot for it.
But even good compilers can struggle with such scenarios, especially multi-platform ones that don't know anything about how interrupts etc work specifically in "target x".
So you have the case where the good, multi-platform compiler is too generic to handle this bug. While at the same time, the bad, narrow compiler for "target x" is too poorly written to handle it, even though it supposedly knows all about how interrupts work on "target x".
How do I modify my code to accomplish this?
Make such globals volatile.

Compiler optimization call-ret vs jmp

I am building one of my projects and I am looking at the generated list file. (Target: x86-64.) My code looks like:
int func_1(var1, var2){
    asm_inline_(
    )
    func_2(var1, var2);
    return_1;
}

void func_2(var_1, var_2){
    asm __inline__(
    )
    func_3();
}

/**** Jump to kernel ---> System call stub in assembly. This func is in a .S file ***/
void func_3(){
}
When I look at the assembly code, I see that a "jmp" instruction is used instead of a "call-ret" pair when calling func_2 and func_3. I am sure it is one of the compiler optimizations, and I have not explored how to disable it. (GCC)
The moment I add some volatile variables to func_2 and func_3 and increment them, the "jmp" gets replaced by a "call-ret" pair.
I am bemused by this behavior, because those variables are useless and don't serve any purpose.
Can someone please explain the behavior?
Thanks
If code jumps to the start of another function rather than calling it, when the jumped-to function returns, it will return back to the point where the outer function was called from, ignoring any more of the first function after that point. Assuming the behaviour is correct (the first function contributed nothing else to the execution after that point anyway), this is an optimisation because it reduces the number of instructions and stack manipulations by one level.
In the given example, the behaviour is correct; there's no local stack to pop and no value to return, so there is no code that needs to run after the call. (return_1, assuming it's not a macro for something, is a pure expression and therefore does nothing no matter its value.) So there's no reason to keep the stack frame around for the future when it has nothing more to contribute to events.
If you add volatile variables to the function bodies, you aren't just adding variables whose flow the compiler can analyse - you're adding slots that you've explicitly told the compiler could be accessed outside the normal control flow it can predict. The volatile qualifier warns the compiler that even though there's no obvious way for the variables to escape, something outside has a way to get their address and write to it at any time. So it can't reduce their lifetime, because it's been told that code outside the function might still try to write to that stack space; and obviously that means the stack frame needs to continue to exist for its entire declared lifespan.
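A minimal illustration of the first point (assuming gcc -O2 on x86-64; exact output varies by version): because nothing in caller() runs after callee() returns, gcc typically emits a single jmp callee instead of a call/ret pair:

void callee(void);

void caller(void)
{
    callee();   /* tail position: usually compiled as jmp callee, not call+ret */
}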

C function call with too few arguments

I am working on some legacy C code. The original code was written in the mid-90s, targeting Solaris and Sun's C compiler of that era. The current version compiles under GCC 4 (albeit with many warnings), and it seems to work, but I'm trying to tidy it up -- I want to squeeze out as many latent bugs as possible as I determine what may be necessary to adapt it to 64-bit platforms, and to compilers other than the one it was built for.
One of my main activities in this regard has been to ensure that all functions have full prototypes (which many did not have), and in that context I discovered some code that calls a function (previously un-prototyped) with fewer arguments than the function definition declares. The function implementation does use the value of the missing argument.
Example:
impl.c:
int foo(int one, int two) {
    if (two) {
        return one;
    } else {
        return one + 1;
    }
}
client1.c:
extern foo();

int bar() {
    /* only one argument(!): */
    return foo(42);
}
client2.c:
extern int foo();

int (*foop)() = foo;

int baz() {
    /* calls the same function as does bar(), but with two arguments: */
    return (*foop)(17, 23);
}
Questions: is the result of a function call with missing arguments defined? If so, what value will the function receive for the unspecified argument? Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior that I can emulate by adding a particular argument value to the affected calls?
EDIT: I found another Stack Overflow thread, C function with no parameters behavior, which gives a very succinct, specific, and accurate answer. PMG's comment at the end of that answer talks about UB. Below are my original thoughts, which I think are along the same lines and explain why the behaviour is UB.
Questions: is the result of a function call with missing arguments defined?
I would say no... The reason is that I think the function will operate as if it had the second parameter, but as explained below, that second parameter could just be junk.
If so, what value will the function receive for the unspecified argument?
I think the values received are undefined. This is why you could have UB.
There are two general ways of parameter passing that I'm aware of... (Wikipedia has a good page on calling conventions.)
Pass by register. I.e., the ABI (Application Binary Interface) for the platform will say that registers x & y, for example, are used for passing in parameters, and any beyond that get passed via the stack...
Everything gets passed via the stack...
Thus when you give one module a declaration of the function with an "...unspecified (but not variable) number of parameters..." (the extern declaration), the caller will place only the parameters you give it (in this case one) in the registers or stack locations where the real function will look for its parameter values. The slot for the second, missing parameter therefore contains random junk.
EDIT: Based on the other Stack Overflow thread I found, I would amend the above to say that the extern declares a function with an "unspecified (but not variable) number of parameters", not a function with no parameters.
When the program jumps to the function, that function assumes the parameter-passing mechanism has been correctly obeyed, so it looks in the registers or on the stack and uses whatever values it finds... assuming them to be correct.
Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior?
You'd have to check your compiler documentation. I doubt it... the extern declaration would be trusted completely, so I doubt the registers or stack (depending on the parameter-passing mechanism) would get correctly initialised...
If the number or the types of arguments (after default argument promotions) do not match the ones used in the actual function definition, the behavior is undefined.
What will happen in practice depends on the implementation. The values of missing parameters will not be meaningfully defined (assuming the attempt to access missing arguments will not segfault), i.e. they will hold unpredictable and possibly unstable values.
Whether the program will survive such incorrect calls will also depend on the calling convention. A "classic" C calling convention, in which the caller is responsible for placing the parameters into the stack and removing them from there, will be less crash-prone in presence of such errors. The same can be said about calls that use CPU registers to pass arguments. Meanwhile, a calling convention in which the function itself is responsible for cleaning the stack will crash almost immediately.
It is very unlikely that the bar function ever gave consistent results in the past. The only thing I can imagine is that it was always called on fresh stack space and the stack space was cleared upon startup of the process, in which case the second parameter would be 0. Or the difference between returning one and one+1 didn't make a big difference in the bigger scope of the application.
If it really is like you depict in your example, then you are looking at a big fat bug. In the distant past there was a coding style where vararg functions were implemented by specifying more parameters than passed, but just as with modern varargs you should not access any parameters not actually passed.
I assume that this code was compiled and run on the Sun SPARC architecture. According to this ancient SPARC web page: "registers %o0-%o5 are used for the first six parameters passed to a procedure."
In your example with a function expecting two parameters, with the second parameter not specified at the call site, it is likely that register %o1 always happened to have a sensible value when the call was made.
If you have access to the original executable and can disassemble the code around the incorrect call site, you might be able to deduce what value %o1 had when the call was made. Or you might try running the original executable on a SPARC emulator, like QEMU. In any case this won't be a trivial task!
