Would there be any use for a function that does nothing when run, i.e.:
void Nothing() {}
Note, I am not talking about a function that waits for a certain amount of time, like sleep(), just something that takes as much time as the compiler / interpreter gives it.
Such a function could be necessary as a callback function.
Supposed you had a function that looked like this:
void do_something(int param1, char *param2, void (*callback)(void))
{
    // do something with param1 and param2
    callback();
}
This function receives a pointer to a function which it subsequently calls. If you don't particularly need to use this callback for anything, you would pass a function that does nothing:
do_something(3, "test", Nothing);
When I've created tables that contain function pointers, I do use empty functions.
For example:
typedef int (*EventHandler_Proc_t)(int a, int b); // A function pointer to be called to handle an event

struct
{
    Event_t event_id;
    EventHandler_Proc_t proc;
} EventTable[] = {                      // An array of events, and functions to be called when each event occurs
    { EventInitialize, InitializeFunction },
    { EventIncrement,  IncrementFunction  },
    { EventNOP,        NothingFunction    },   // Empty function is used here.
};
In this example table, I could put NULL in place of the NothingFunction, and check if the .proc is NULL before calling it. But I think it keeps the code simpler to put a do-nothing function in the table.
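For illustration, a dispatch loop over a table like the one above might look like this. This is only a sketch: it assumes the EventTable declaration above, DispatchEvent is a made-up name, and NothingFunction is just a trivial do-nothing handler.

int NothingFunction(int a, int b)
{
    (void)a;
    (void)b;
    return 0;   // do nothing, successfully
}

int DispatchEvent(Event_t event_id, int a, int b)
{
    for (size_t i = 0; i < sizeof EventTable / sizeof EventTable[0]; i++) {
        if (EventTable[i].event_id == event_id) {
            return EventTable[i].proc(a, b);   // no NULL check needed
        }
    }
    return -1;   // unknown event
}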
Yes. Quite a lot of things want to be given a function to notify about a certain thing happening (callbacks). A function that does nothing is a good way to say "I don't care about this."
I am not aware of any examples in the standard library, but many libraries built on top have function pointers for events.
For example, GLib defines a callback type, GLib.LogFunc(log_domain, log_level, message, *user_data), for providing the logger. An empty function would be the callback you provide when logging is disabled.
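A minimal sketch of that idea in GLib-flavoured C; check your GLib version's documentation for the exact GLogFunc prototype and handler-installation call, as the details here are from memory:

#include <glib.h>

/* A do-nothing log handler: installing it discards log output. */
static void silent_logger(const gchar *log_domain, GLogLevelFlags log_level,
                          const gchar *message, gpointer user_data)
{
    (void)log_domain; (void)log_level; (void)message; (void)user_data;
    /* intentionally empty: logging disabled */
}

int main(void)
{
    g_log_set_default_handler(silent_logger, NULL);
    g_message("this goes nowhere");
    return 0;
}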
One use case would be as a possibly temporary stub function midway through a program's development.
If I'm doing some amount of top-down development, it's common for me to design some function prototypes, write the main function, and at that point, want to run the compiler to see if I have any syntax errors so far. To make that compile happen I need to implement the functions in question, which I'll do by initially just creating empty "stubs" which do nothing. Once I pass that compile test, I can go on and flesh out the functions one at a time.
The Gaddis textbook Starting out with C++: From Control Structures Through Objects, which I teach out of, describes them this way (Sec. 6.16):
A stub is a dummy function that is called instead of the actual
function it represents. It usually displays a test message
acknowledging that it was called, and nothing more.
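A minimal sketch of that workflow (the function names here are made up for illustration): the prototypes and main are written first, and the stubs exist only so the program compiles and links.

#include <stdio.h>

/* Prototypes designed up front */
void load_config(void);
void run_simulation(void);

int main(void)
{
    load_config();
    run_simulation();
    return 0;
}

/* Stubs so the compile test passes; flesh these out one at a time. */
void load_config(void)    { printf("load_config called\n"); }
void run_simulation(void) { printf("run_simulation called\n"); }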
A function that takes arguments and does nothing with them can be used as a pair with a function that does something useful, such that the arguments are still evaluated even when the no-op function is used. This can be useful in logging scenarios, where the arguments must still be evaluated to verify the expressions are legal and to ensure any important side-effects occur, but the logging itself isn't necessary. The no-op function might be selected by the preprocessor when the compile-time logging level was set at a level that doesn't want output for that particular log statement.
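A sketch of that pattern (LOG_LEVEL and the other names are made up for illustration): the point is that get_next_id() runs and its side effect happens whether or not the output is produced.

#include <stdio.h>

static void log_real(const char *msg, int value) { fprintf(stderr, "%s: %d\n", msg, value); }
static void log_noop(const char *msg, int value) { (void)msg; (void)value; }

#ifndef LOG_LEVEL
#define LOG_LEVEL 0
#endif

#if LOG_LEVEL >= 1
#define LOG_DEBUG log_real
#else
#define LOG_DEBUG log_noop          /* arguments are still evaluated */
#endif

static int get_next_id(void)        /* side effect we must not lose */
{
    static int id = 0;
    return ++id;
}

int main(void)
{
    LOG_DEBUG("allocated id", get_next_id());  /* id advances even when logging is off */
    printf("next id is %d\n", get_next_id());
    return 0;
}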
As I recall, there were two empty functions in Lions' Commentary on UNIX 6th Edition, with Source Code, and the introduction to the re-issue early this century called Ritchie, Kernighan and Thompson out on it.
The function that gobbles its argument and returns nothing is actually ubiquitous in C, but it is not written out explicitly because it is implicitly invoked on nearly every line. The most common use of this empty "function", in traditional C, is the invisible discard of the value of any expression statement. Since C89, this discard can be spelled explicitly as a cast to (void). The lint tool used to complain whenever a function's return value was ignored without explicitly passing it to this built-in "function" that returns nothing. The motivation was to keep programmers from silently ignoring error conditions, and you will still run into old programs that use the coding style (void)printf("hello, world!\n");.
Such a function might be used for:
Callbacks (which the other answers have mentioned)
An argument to higher-order functions
Benchmarking a framework, with no overhead for the no-op being performed
Having a unique value of the correct type to compare other function pointers to; a sketch appears after this list. (Particularly in a language like C, where function pointers are convertible to and comparable with each other, but conversion between function pointers and other kinds of pointers is not portable.)
The sole element of a singleton value type, in a functional language
If passed an argument that it strictly evaluates, this could be a way to discard a return value but execute side-effects and test for exceptions
A dummy placeholder
Proving certain theorems in the typed Lambda Calculus
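As a sketch of the "unique comparable value" point above (the names are made up for illustration): a do-nothing function can serve as a sentinel meaning "no handler installed", so the pointer is always safe to call and still gives you something well-typed to compare against.

#include <stdio.h>

typedef void (*handler_t)(void);

void Nothing(void) {}                          /* sentinel and safe default */
void RealHandler(void) { puts("handling"); }

static handler_t current = Nothing;

void install(handler_t h)
{
    if (current != Nothing)                    /* the sentinel gives us a value to compare against */
        fprintf(stderr, "replacing an existing handler\n");
    current = h;
}

void fire(void)
{
    current();                                 /* always safe to call: never NULL */
}

int main(void)
{
    fire();                                    /* does nothing */
    install(RealHandler);
    fire();                                    /* prints "handling" */
    return 0;
}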
Another temporary use for a do-nothing function could be to have a line exist to put a breakpoint on, for example when you need to check the run-time values being passed into a newly created function so that you can make better decisions about what the code you're going to put in there will need to access. Personally, I like to use self-assignments, i.e. i = i when I need this kind of breakpoint, but a no-op function would presumably work just as well.
void MyBrandNewSpiffyFunction(TypeImNotFamiliarWith whoKnowsWhatThisVariableHas)
{
    DoNothing(); // Yay! Now I can put in a breakpoint so I can see what data I'm receiving!
    int i = 0;
    i = i;       // Another way to do nothing so I can set a breakpoint
}
From a language lawyer perspective, an opaque function call inserts a barrier for optimizations.
For example:
int a = 0;
extern void e(void);

int b(void)
{
    ++a;
    ++a;
    return a;
}

int c(void)
{
    ++a;
    e();
    ++a;
    return a;
}

int d(void)
{
    ++a;
    asm(" ");
    ++a;
    return a;
}
The ++a expressions in the b function can be merged to a += 2, while in the c function, a needs to be updated before the function call and reloaded from memory after, as the compiler cannot prove that e does not access a, similar to the (non-standard) asm(" ") in the d function.
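Relatedly, a common GCC/Clang idiom for a pure compiler barrier (no function-call overhead) is an empty asm statement with a memory clobber; a minimal sketch:

/* GCC/Clang extension, not standard C: the empty asm with a "memory" clobber
   tells the compiler that memory may have been read or written here, so cached
   globals must be stored before this point and reloaded after it. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}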
In the embedded firmware world, it could be used to add a tiny delay, required for some hardware reason. Of course, it could also be called several times in a row, making the delay expandable by the programmer.
Empty functions are not uncommon in platform-specific abstraction layers. There are often functions that are only needed on certain platforms. For example, a function void native_to_big_endian(struct data* d) would contain byte-swapping code on a little-endian CPU but could be completely empty on a big-endian CPU. This helps keep the business logic platform-agnostic and readable. I've also seen this sort of thing done for tasks like converting native file paths to Unix/Windows style, hardware initialization functions (when some platforms can run with defaults and others must be actively reconfigured), etc.
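A sketch of how such a function might be compiled out on big-endian targets; BUILD_BIG_ENDIAN is a hypothetical build flag, not from any particular header:

#include <stdint.h>
#include <stdio.h>

struct data { uint32_t id; uint32_t length; };

#if defined(BUILD_BIG_ENDIAN)   /* hypothetical flag set when building for big-endian targets */
void native_to_big_endian(struct data *d) { (void)d; /* already big-endian: nothing to do */ }
#else
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

void native_to_big_endian(struct data *d)
{
    d->id = swap32(d->id);
    d->length = swap32(d->length);
}
#endif

int main(void)
{
    struct data d = { 1, 16 };
    native_to_big_endian(&d);   /* no-op or byte swap, depending on the target */
    printf("%08x %08x\n", (unsigned)d.id, (unsigned)d.length);
    return 0;
}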
At the risk of being considered off-topic, I'm going to argue from a Thomistic perspective that a function that does nothing, and the concept of NULL, really have no place anywhere in computing.
Software is constituted in substance by state, behavior, and control flow, the last of which belongs to behavior. To have the absence of state is impossible, and to have the absence of behavior is impossible.
Absence of state is impossible because a value is always present in memory, regardless of initialization state for the memory that is available. Absence of behavior is impossible because non-behavior cannot be executed (even "nop" instructions do something).
Instead, we might better say that there is negative and positive existence, defined subjectively by context, with the objective definition being that negative existence of state or behavior means no explicit value or implementation, respectively, while positive existence means an explicit value or implementation.
This changes the perspective concerning the design of an API.
Instead of:
void foo(void (*bar)()) {
    if (bar) { bar(); }
}
we instead have:
void foo();
void foo_with_bar(void (*bar)()) {
    if (!bar) { fatal(__func__, "bar is NULL; callback required\n"); }
    bar();
}
or:
void foo(bool use_bar, void (*bar)());
or if you want even more information about the existence of bar:
void foo(bool use_bar, bool bar_exists, void (*bar)());
Each of these is a better design that makes your code and intent well expressed. The simple fact of the matter is that the existence or non-existence of a thing concerns the operation of an algorithm, or the manner in which state is interpreted. Not only do you lose a whole value by reserving NULL as 0 (or any other arbitrary value), but you make your model of the algorithm less perfect and even error-prone in rare cases. What's more, on a system in which this reserved value is not reserved, the implementation might not work as expected.
If you need to detect for the existence of an input, let that be explicit in your API: have a parameter or two for that if it's that important. It will be more maintainable and portable as well since you're decoupling logic metadata from inputs.
In my opinion, therefore, a function that does nothing is not practical to use, but a design flaw if part of the API, and an implementation defect if part of the implementation. NULL obviously won't disappear that easily, and we just use it because that's what currently is used by necessity, but in the future, it doesn't have to be that way.
Besides all the reasons already given here, note that an "empty" function is never truly empty, so you can learn a lot about how function calls work on your architecture of choice by looking at the assembly output. Let's look at a few examples. Let's say I have the following C file, nothing.c:
void Nothing(void) {}
Compile this on an x86_64 machine with clang -c -S nothing.c -o nothing.s and you'll get something that looks like this (stripped of metadata and other stuff irrelevant to this discussion):
nothing.s:
_Nothing:                       ## #Nothing
    pushq   %rbp
    movq    %rsp, %rbp
    popq    %rbp
    retq
Hmm, that doesn't really look like nothing. Note the pushing and popping of %rbp (the frame pointer) onto the stack. Now let's change the compiler flags and add -fomit-frame-pointer, or more explicitly: clang -c -S nothing.c -o nothing.s -fomit-frame-pointer
nothing.s:
_Nothing:                       ## #Nothing
    retq
That looks a lot more like "nothing", but you still have at least one x86_64 instruction being executed, namely retq.
Let's try one more. Clang supports the gcc gprof profiler option -pg so what if we try that: clang -c -S nothing.c -o nothing.s -pg
nothing.s:
_Nothing:                       ## #Nothing
    pushq   %rbp
    movq    %rsp, %rbp
    callq   mcount
    popq    %rbp
    retq
Here we've added a mysterious additional call to a function mcount() that the compiler has inserted for us. This one looks like the least amount of nothing-ness.
And so you get the idea. Compiler options and architecture can have a profound impact on the meaning of "nothing" in a function. Armed with this knowledge you can make much more informed decisions about both how you write code, and how you compile it. Moreover, a function like this called millions of times and measured can give you a very accurate measure of what you might call "function call overhead", or the bare minimum amount of time required to make a call given your architecture and compiler options. In practice given modern superscalar instruction scheduling, this measurement isn't going to mean a whole lot or be particularly useful, but on certain older or "simpler" architectures, it might.
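A rough sketch of that kind of measurement; timer resolution and modern CPUs make it approximate at best, and the names here are illustrative:

#include <stdio.h>
#include <time.h>

void Nothing(void) {}

int main(void)
{
    /* Call through a volatile function pointer so the call isn't inlined away. */
    void (*volatile fp)(void) = Nothing;
    const long iterations = 100000000L;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        fp();
    clock_t end = clock();

    printf("~%.2f ns per call\n",
           1e9 * (double)(end - start) / CLOCKS_PER_SEC / (double)iterations);
    return 0;
}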
These functions have a great place in test driven development.
class Doer {
public:
    int PerformComplexTask(int input) { return 0; } // just to make it compile
};
Everything compiles, and the test cases say Fail until the function is properly implemented.
I have been researching this topic and I cannot find a specific, authoritative answer. I am hoping that someone very familiar with the C spec can answer - i.e. confirm or refute my assertion, preferably with a citation to the spec.
Assertion:
If a program consists of more than one compilation unit (separately compiled source file), the compiler must assure that global variables (if modified) are written to memory before any call to a function in another unit or before the return from any function. Also, in any function, the global must be read before its first use. Also after a call of any function, not in the same unit, the global must be read before use. And these things must be true whether the variable is qualified as "volatile" or not because a function in another compilation unit (source file) could access the variable without the compiler's knowledge. Otherwise, "volatile" would always be required for global variables - i.e. non-volatile globals would have no purpose.
Could the compiler treat functions in the same compilation unit differently than ones that aren't? All of the discussions I have found for the "volatile" qualifier on globals show all functions in the same compilation unit.
Edit: The compiler cannot know whether functions in other units use the global or not. Therefore I am assuming the above conditions.
I found these two other questions with information related to this topic but they don't address it head on or they give information that I find suspect:
Are global variables refreshed between function calls?
When do I need to use volatile in ISRs?
[..] in any function, the global must be read before its first use.
Definitely not:
static int variable;

void foo(void) {
    variable = 42;
}
Why should the compiler bother generating code to read the variable?
The compiler must assure that global variables are written to memory before any function call or before the return from a function.
No, why should it?
void bar(void) {
    return;
}

void baz(void) {
    variable = 42;
    bar();
}
bar is a pure function (should be determinable for a decent compiler), so there's no chance of getting any different behaviour when writing to memory after the function call.
The case of "before returning from a function" is tricky, though. But I think the general statement ("must") is false if we count inlined (static) functions, too.
Could the compiler treat functions in the same compilation unit differently than ones that aren't?
Yes, I think so: for a static function (whose address is never taken) the compiler knows exactly how it is used, and this information could be used to apply some more radical optimisations.
I'm basing all of the above on the C version of the As-If rule, specified in §5.1.2.3/6 (N1570):
The least requirements on a conforming implementation are:
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behavior of the program.
In particular, you might want to read the following "EXAMPLE 1".
A lot has been said and written about volatile variables and their use. In those articles, two slightly different ideas can be found:
1 - Volatile should be used when variable is getting changed outside of the compiled program.
2 - Volatile should be used when variable is getting changed outside the normal flow of the function.
The first statement limits volatile usage to memory-mapped registers and the like, plus multi-threaded code, while the second one also brings interrupts into scope.
This article (http://www.barrgroup.com/Embedded-Systems/How-To/C-Volatile-Keyword), for example, explicitly states that the volatile modifier should be used for globals changed in an interrupt, and provides this example:
int etx_rcvd = FALSE;

void main()
{
    ...
    while (!etx_rcvd)
    {
        // Wait
    }
    ...
}

interrupt void rx_isr(void)
{
    ...
    if (ETX == rx_char)
    {
        etx_rcvd = TRUE;
    }
    ...
}
Note how setting up a rx_isr() as a callback is conveniently omitted here.
Hence I wrote my own example:
#include <stdio.h>
#include <time.h>
#include <signal.h>

void f(int sig);

int n = 0;

int main(void)
{
    signal(SIGINT, f);
    time_t tLastCalled = 0;
    printf("Entering the loop\n");
    while (n == 0)
    {
        if (time(NULL) - tLastCalled > 1)
        {
            printf("Still here...\n");
            tLastCalled = time(NULL);
        }
    }
    printf("Done\n");
    return 0;
}

void f(int sig)
{
    (void)sig;
    n = 1;
}
Compiled with gcc on Linux at various optimization levels, the loop exited and I saw "Done" every time I pressed Ctrl+C, which suggests that gcc really is smart enough not to optimize away the variable n here.
That said, my question is:
If the compiler can really optimize global variables modified by an interrupt service routine, then:
1. Why does it have the right to optimize a global variable in the first place, when the variable could be accessed from another file?
2. Why do the example article and many others on the internet state that the compiler will not "notice" the interrupt callback function?
3. How do I modify my code to accomplish this?
Because the loop body calls external functions, the while loop does check n every time. However, if you remove those function calls, the optimizer may keep n in a register or do away with any checks of n entirely.
Example (gcc, x86_64, -O3):
volatile int n;

int main() {
    while (n == 0) {}
    return 0;
}
becomes:
.L3:
    movl    n(%rip), %eax
    testl   %eax, %eax
    je      .L3
    xorl    %eax, %eax
    ret
But
int n;

int main() {
    while (n == 0) {}
    return 0;
}
becomes:
    movl    n(%rip), %eax
    testl   %eax, %eax
    jne     .L2
.L3:
    jmp     .L3
In this case, n is never looked at in the infinite loop.
If there is a signal handler that modifies a global, you really should label that global volatile. You might not get in trouble by skipping this, but you are either getting lucky or you are counting on the optimizer not being able to verify whether or not a global is being touched.
There is some movement toward cross-module optimization at link time (e.g. LLVM's link-time optimization), so someday an optimizer may be able to tell that calls to time or printf don't touch globals in your file. When that happens, missing the volatile keyword may cause problems even if you have external function calls.
If the compiler can really optimize global variables modified by an interrupt service routine, then:
Why does it have the right to optimize a global variable in the first place, when the variable could be accessed from another file?
The key here is that in a "normal", single-threaded program with no interrupts, the global variable cannot be modified at any time. All accesses to the variable are sequenced in a predictable manner, no matter which file makes the access.
And the optimizations may be subtle. It is not as simple as "ah ok this global doesn't seem to be used, let's remove it entirely". Rather, for some code like
while (global)
{
    do_stuff(global);
}
the optimizer might create something behaving like:
register tmp = global;
loop:
    do_stuff(tmp);
    goto loop;
This changes the meaning of the program completely. How such bugs caused by the lack of volatile manifest themselves differs from case to case, and they are very hard to find.
Why do the example article and many others on the internet state that the compiler will not "notice" the interrupt callback function?
Because embedded compilers are traditionally stupid when it comes to this aspect. When a compiler spots your non-standard interrupt keyword, it will traditionally do just two things:
Generate the specific return code from that function, since interrupts usually have different calling conventions compared to regular function calls.
Ensure that the function gets linked even though it is never called from the program. Possibly allocated in a separate memory segment. This is actually done by the linker and not the compiler.
There might nowadays be smarter compilers. PC/desktop compilers face the very same issue when dealing with callback functions/threads, but they are usually smart enough to realize that they shouldn't assume things about global variables shared with a callback.
Embedded compilers are traditionally far dumber than PC/desktop compilers when it comes to optimizations. They are generally of lower quality and worse at standard compliance. If you are one of just a few compiler vendors supporting a specific target, or perhaps the only vendor, then the lack of competition means that you don't have to worry much about quality. You can sell crap and charge a lot for it.
But even good compilers can struggle with such scenarios, especially multi-platform ones that don't know anything about how interrupts etc work specifically in "target x".
So you have the case where the good, multi-platform compiler is too generic to handle this bug. While at the same time, the bad, narrow compiler for "target x" is too poorly written to handle it, even though it supposedly knows all about how interrupts work on "target x".
How do I modify my code to accomplish this?
Make such globals volatile.
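Applied to the question's example, that means declaring the flag volatile; sig_atomic_t is the conventional type for a flag shared with a signal handler. A sketch of the relevant part:

#include <signal.h>

static volatile sig_atomic_t n = 0;   /* volatile: may change outside the normal control flow */

static void f(int sig)
{
    (void)sig;
    n = 1;                            /* sig_atomic_t writes are atomic with respect to signals */
}

/* in main(): signal(SIGINT, f); while (n == 0) { ... } */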
I am building one of my projects and looking at the generated list file (target: x86-64). My code looks like:
int func_1(var1, var2) {
    asm_inline_(
    )
    func_2(var1, var2);
    return_1;
}

void func_2(var_1, var_2) {
    asm __inline__(
    )
    func_3();
}

/**** Jump to kernel ---> System call stub in assembly. This func is in a .S file ****/
void func_3() {
}
When I look at the assembly code, I find a "jmp" instruction is used instead of a "call/return" pair when calling func_2 and func_3. I am sure it is a compiler optimization, and I have not explored how to disable it. (GCC)
The moment I add some volatile variables to func_2 and func_3 and increment them, the "jmp" gets replaced by a "call/ret" pair.
I am bemused to see the behavior because those variables are useless and they don't serve any purpose.
Can someone please explain the behavior?
Thanks
If code jumps to the start of another function rather than calling it, then when the jumped-to function returns, it returns directly to the point from which the outer function was called, skipping whatever remains of the first function after that point. Assuming the behaviour is correct (the first function contributed nothing else to the execution after that point anyway), this is an optimisation because it reduces the number of instructions and stack manipulations by one level.
In the given example, the behaviour is correct; there's no local stack to pop and no value to return, so there is no code that needs to run after the call. (return_1, assuming it's not a macro for something, is a pure expression and therefore does nothing no matter its value.) So there's no reason to keep the stack frame around for the future when it has nothing more to contribute to events.
If you add volatile variables to the function bodies, you aren't just adding variables whose flow the compiler can analyse - you're adding slots that you've explicitly told the compiler could be accessed outside the normal control flow it can predict. The volatile qualifier warns the compiler that even though there's no obvious way for the variables to escape, something outside has a way to get their address and write to it at any time. So it can't reduce their lifetime, because it's been told that code outside the function might still try to write to that stack space; and obviously that means the stack frame needs to continue to exist for its entire declared lifespan.
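A self-contained illustration of the same effect: with GCC or Clang at -O2, and inlining prevented (e.g. gcc -O2 -fno-inline -S) so the call survives, caller typically ends in a jmp to callee rather than a call/ret pair. Exact output varies by compiler and flags; the function names are made up.

#include <stdio.h>

void callee(int x)
{
    printf("%d\n", x);
}

void caller(int x)
{
    /* Nothing happens after this call, so the compiler may replace
       call+ret with a single jmp to callee (a sibling/tail call). */
    callee(x + 1);
}

int main(void)
{
    caller(41);
    return 0;
}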
I am working on some legacy C code. The original code was written in the mid-90s, targeting Solaris and Sun's C compiler of that era. The current version compiles under GCC 4 (albeit with many warnings), and it seems to work, but I'm trying to tidy it up -- I want to squeeze out as many latent bugs as possible as I determine what may be necessary to adapt it to 64-bit platforms, and to compilers other than the one it was built for.
One of my main activities in this regard has been to ensure that all functions have full prototypes (which many did not have), and in that context I discovered some code that calls a function (previously un-prototyped) with fewer arguments than the function definition declares. The function implementation does use the value of the missing argument.
Example:
impl.c:
int foo(int one, int two) {
    if (two) {
        return one;
    } else {
        return one + 1;
    }
}
client1.c:
extern foo();

int bar() {
    /* only one argument(!): */
    return foo(42);
}
client2.c:
extern int foo();

int (*foop)() = foo;

int baz() {
    /* calls the same function as does bar(), but with two arguments: */
    return (*foop)(17, 23);
}
Questions: is the result of a function call with missing arguments defined? If so, what value will the function receive for the unspecified argument? Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior that I can emulate by adding a particular argument value to the affected calls?
EDIT: I found another Stack Overflow thread, C function with no parameters behavior, which gives a very succinct, specific, and accurate answer. PMG's comment at the end of that answer talks about UB. Below are my original thoughts, which I think are along the same lines and explain why the behaviour is UB.
Questions: is the result of a function call with missing arguments defined?
I would say no... The reason is that I think the function will operate as if it had the second parameter, but as explained below, that second parameter could just be junk.
If so, what value will the function receive for the unspecified argument?
I think the values received are undefined. This is why you could have UB.
There are two general ways of parameter passing that I'm aware of... (Wikipedia has a good page on calling conventions)
Pass by register. I.e., the ABI (Application Binary Interface) for the platform will say that registers x & y, for example, are used for passing parameters, and any beyond that get passed via the stack...
Everything gets passed via stack...
Thus when you give one module a declaration of the function with an "...unspecified (but not variable) number of parameters..." (the extern declaration), the caller will only place as many parameters as you pass (in this case one) in the registers or stack locations that the real function will look in for its parameter values. The slot for the second parameter, which is missing, therefore essentially contains random junk.
EDIT: Based on the other Stack Overflow thread I found, I would amend the above: the extern does not declare a function with no parameters, but rather a function with an "unspecified (but not variable) number of parameters".
When the program jumps to the function, that function assumes the parameter-passing mechanism has been correctly obeyed, so it looks in the registers or on the stack and uses whatever values it finds there, assuming them to be correct.
Otherwise, would the Sun C compiler of ca. 1996 (for Solaris, not VMS) have exhibited a predictable implementation-specific behavior
You'd have to check your compiler documentation. I doubt it... the extern declaration would be trusted completely, so I doubt the registers or stack (depending on the parameter-passing mechanism) would be correctly initialised...
If the number or the types of arguments (after default argument promotions) do not match the ones used in the actual function definition, the behavior is undefined.
What will happen in practice depends on the implementation. The values of missing parameters will not be meaningfully defined (assuming the attempt to access missing arguments will not segfault), i.e. they will hold unpredictable and possibly unstable values.
Whether the program will survive such incorrect calls will also depend on the calling convention. A "classic" C calling convention, in which the caller is responsible for placing the parameters on the stack and removing them from there, will be less crash-prone in the presence of such errors. The same can be said about calls that use CPU registers to pass arguments. Meanwhile, a calling convention in which the function itself is responsible for cleaning up the stack will crash almost immediately.
It is very unlikely that the bar function ever gave consistent results in the past. The only thing I can imagine is that it was always called on fresh stack space, and the stack space was cleared at process startup, in which case the second parameter would be 0. Or perhaps the difference between returning one and one + 1 didn't matter much in the bigger scope of the application.
If it really is like you depict in your example, then you are looking at a big fat bug. In the distant past there was a coding style where vararg functions were implemented by specifying more parameters than passed, but just as with modern varargs you should not access any parameters not actually passed.
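The practical fix is the one the question is already heading toward: give foo a real prototype that every caller sees, so the compiler rejects the short call, and then pass an explicit second argument. A sketch (foo.h is a hypothetical new header, and 0 is just a placeholder value you would have to choose deliberately):

/* foo.h */
int foo(int one, int two);

/* client1.c */
#include "foo.h"

int bar(void)
{
    return foo(42, 0);   /* the missing argument must now be made explicit */
}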
I assume that this code was compiled and run on the Sun SPARC architecture. According to this ancient SPARC web page: "registers %o0-%o5 are used for the first six parameters passed to a procedure."
In your example with a function expecting two parameters, with the second parameter not specified at the call site, it is likely that register %o1 always happened to have a sensible value when the call was made.
If you have access to the original executable and can disassemble the code around the incorrect call site, you might be able to deduce what value %o1 had when the call was made. Or you might try running the original executable on a SPARC emulator, like QEMU. In any case this won't be a trivial task!