Does Klocwork detect never called functions? - c

My code is a mix of bits and pieces from older code.
I would like to remove all functions that are never used, to keep the code simple.
Is Klocwork the right tool for this? How do I do it?
Thanks,
Moshe.

You could use the -p or -pg options to gcc so that code is added to the prologue and epilogue of every function and a profile database is written when the program executes. The prof tool analyzes the output from -p, and gprof analyzes the output from -pg. Their reports show which functions were called, how many times, and how much time was spent in each. Functions that were never called will be missing from the profile database. Keep in mind that this only tells you what was not called during that particular run.
You could also use gcov to get a report of which lines of code were actually executed. Lines in functions that were never called will show an execution count of 0.
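As a rough illustration of the coverage approach, the sketch below has one function that gets exercised and one that does not. The file name coverage_demo.c and both function names are made up for this example. After building with --coverage and running the program once, gcov should mark the body of never_called() as not executed.

```c
/* coverage_demo.c (hypothetical file name)
 *
 * Build and inspect, assuming gcc and gcov are installed:
 *   gcc --coverage coverage_demo.c -o coverage_demo
 *   ./coverage_demo
 *   gcov coverage_demo.c
 *
 * The commands assume a main() that calls only called_once(), e.g.
 *   int main(void) { return called_once(2) == 4 ? 0 : 1; }
 */

/* Exercised by the program, so its lines get a nonzero count. */
int called_once(int x)
{
    return x * 2;
}

/* Never reached from main(), so its lines are reported as unexecuted. */
int never_called(int x)
{
    return x + 100;
}
```

A zero count only means the function was not reached in that run; it may still be reachable with other inputs.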

Klocwork will find unused function/methods. There is a special checker pack you can download on my.klocwork.com (if you have an account) that will give you these special checkers.

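I am not familiar with Klocwork, but gcc has the warning option -Wunused-function, which is part of -Wall. Note that it only reports static functions that are defined but never used. An uncalled function with external linkage is not reported, because another translation unit might call it.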
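A minimal sketch of what -Wunused-function does and does not report, assuming gcc. The file and function names are invented for the example. Only the static function triggers the warning; the uncalled function with external linkage compiles silently because some other file could still call it.

```c
/* unused_demo.c (hypothetical file name)
 * Compile only, with warnings enabled:
 *   gcc -Wall -c unused_demo.c
 */

/* Internal linkage and no caller in this file: gcc warns that it is
 * defined but not used. */
static int helper_never_called(int x)
{
    return x + 1;
}

/* External linkage and no caller in this file: no warning, since the
 * compiler cannot see the other translation units. */
int exported_never_called(int x)
{
    return x * 2;
}

/* An ordinary function that other code is expected to call. */
int used(int x)
{
    return x - 1;
}
```

To have the compiler find more dead code this way, one option is to mark file-local functions static wherever possible.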

Klocwork doesn't detect uncalled functions. It's used for static analysis only.
You can check it like this:
#include <stdlib.h>
void foo(void)
{
char *a;
a = malloc(100); /* never freed: a leak */
}
void bar(void)
{
char a[100];
}
int main(void)
{
bar();
return 0;
}
This would probably report a leak in function foo, even though foo is never called. However, as schot suggested, you can look into compiler options.

Related

Resolve undefined reference by stripping unused code

Assume we have the following C code:
void undefined_reference(void);
void bad(void) {
undefined_reference();
}
int main(void) {}
Because of the call in function bad, linking fails with undefined reference to 'undefined_reference', as expected. This function is not actually used anywhere in the code, though, so for the execution of the program the undefined reference doesn't matter.
Is it possible to compile this code successfully, such that bad simply gets removed as it is never called (similar to tree-shaking in JavaScript)?
This function is not actually used anywhere in the code!
You know that, I know that, but the compiler doesn't. It deals with one translation unit at a time. It cannot divine out that there are no other translation units.
But main doesn't call anything, so there cannot be other translation units!
There can be code that runs before and after main (in an implementation-defined manner).
OK what about the linker? It sees the whole program!
Not really. Code can be loaded dynamically at run time (also by code that the linker cannot see).
So neither the compiler nor linker even try to find unused function by default.
On some systems it is possible to instruct the compiler and the linker to try and garbage-collect unused code (and assume a whole-program view when doing so), but this is not usually the default mode of operation.
With gcc and gnu ld, you can use these options:
gcc -ffunction-sections -Wl,--gc-sections main.c -o main
Other systems may have different ways of doing this.
Many compilers (for example gcc) will compile and link it correctly if you enable optimizations and make function bad static. Without static, bad has external linkage and has to be emitted.
https://godbolt.org/z/KrvfrYYdn
Another way is to add a stub version of this function (with a pragma that prints a warning).
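The stub idea could look roughly like this. It is only a sketch: the message text and the stub_calls counter are made up, and #pragma message is assumed to be available (gcc and clang both support it).

```c
/* Printed at compile time as a reminder that a stub is in place. */
#pragma message("undefined_reference() is a stub; link the real one before release")

/* Hypothetical counter, only here so the stub has an observable effect. */
static int stub_calls;

/* Stub definition: satisfies the linker until the real function exists. */
void undefined_reference(void)
{
    stub_calls++;
}

void bad(void)
{
    undefined_reference();
}
```

With the stub present, the program links without any special compiler or linker options.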

How can I plant assembly instructions in the prologue and epilogue of function via gcc

I am trying to build a profiler for a C project.
I want gcc to insert some assembly instructions at every function entry and exit point at compile time.
I searched for guides on the web, but without success.
Where can I learn how to do that?
Thanks in advance.
Apparently you can use the -finstrument-functions flag to get gcc to generate instrumentation calls
void __cyg_profile_func_enter(void *func, void *callsite);
void __cyg_profile_func_exit(void *func, void *callsite);
at function entry and exit. I've never used this, but a quick search brings up plenty of information and examples.
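A minimal sketch of the two hooks, assuming the rest of the program is compiled with -finstrument-functions. The two counters are hypothetical and exist only to make the effect visible. The hooks carry the no_instrument_function attribute so that they are not instrumented themselves.

```c
#include <stdio.h>

/* Hypothetical counters, used only to make the effect observable. */
static unsigned long enter_count;
static unsigned long exit_count;

/* Called by instrumented code just after a function is entered. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *func, void *callsite)
{
    enter_count++;
    fprintf(stderr, "enter %p (called from %p)\n", func, callsite);
}

/* Called by instrumented code just before a function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *func, void *callsite)
{
    exit_count++;
    fprintf(stderr, "exit  %p (called from %p)\n", func, callsite);
}
```

A real profiler would typically record timestamps here instead of printing, since the hooks run on every call.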
Unless you want to modify gcc (which is non-trivial!), I would think that there are two fairly obvious approaches.
Pre-process the C code itself. It's not easy, but not terribly hard either. Find the beginning and end of each function, add your code there, then let the compiler proper do the job of generating the code. There are quite a few tools on the market that do this in one way or another, for a variety of purposes (code flow analysis, profiling, etc.).
Take the assembler output of gcc and process it to add code to the functions there. This is in some ways easier, and in some ways harder. Identifying functions is probably no more difficult, but "not breaking" the assembler code may be harder unless your inserted assembler code is completely "safe".
Obviously, the option of modifying gcc is also a possibility, but the compiler code is fairly complex, and unless you basically take all the existing hooks for gprof, I don't think it's a school project - unless you are on your way to a PhD or some such.

Avoiding gcc function prologue overhead?

I've lately encountered a lot of functions where gcc generates really bad code on x86. They all fit a pattern of:
if (some_condition) {
/* do something really simple and return */
} else {
/* something complex that needs lots of registers */
}
Think of simple case as something so small that half or more of the work is spent pushing and popping registers that won't be modified at all. If I were writing the asm by hand, I would save and restore the saved-across-calls registers inside the complex case, and avoid touching the stack pointer at all in the simple case.
Is there any way to get gcc to be a little bit smarter and do this itself? Preferably with command line options rather than ugly hacks in the source...
Edit: To make it concrete, here's something very close to some of the functions I'm dealing with:
if (buf->pos < buf->end) {
return *buf->pos++;
} else {
/* fill buffer */
}
and another one:
if (!initialized) {
/* complex initialization procedure */
}
return &initialized_object;
and another:
if (mutex->type == SIMPLE) {
return atomic_swap(&mutex->lock, 1);
} else {
/* deal with ownership, etc. */
}
Edit 2: I should have mentioned to begin with: these functions cannot be inlined. They have external linkage and they're library code. Allowing them to be inlined in the application would result in all kinds of problems.
Update
To explicitly suppress inlining for a single function in gcc, put the attribute before the declarator (gcc rejects it after the parameter list of a definition):
__attribute__ ((noinline)) void foo(void)
{
...
}
See also How can I tell gcc not to inline a function?
Functions like this will regularly be inlined automatically unless compiled -O0 (disable optimization).
In C++ you can hint the compiler using the inline keyword
If the compiler won't take your hint, you are probably using too many registers or branches inside the function. The situation is almost certainly resolved by extracting the 'complicated' block into its own function.
Update: I noticed you added the fact that they are extern symbols. (Please update the question with that crucial info.) Well, in a sense, with external functions, all bets are off. I cannot really believe that gcc will by definition inline all of a complex function into a tiny caller simply because it is only called from there. Perhaps you can give some sample code that demonstrates the behaviour, and we can find the proper optimization flags to remedy that?
Also, is this C or C++? In C++ I know it is common place to include the trivial decision functions inline (mostly as members defined in the class declaration). This won't give a linkage conflict like with simple (extern) C functions.
Also you can have template functions defined that will inline perfectly in all compilation modules without resulting in link conflicts.
I hope you are using C++ because it will give you a ton of options here.
I would do it like this:
static void complex_function() {}
void foo()
{
if(simple_case) {
// do whatever
return;
} else {
complex_function();
}
}
The compiler may insist on inlining complex_function(), in which case you can use the noinline attribute on it.
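Applied to the buffer example from the question, the split might look like the sketch below. The struct layout and the buf_refill helper are assumptions made for illustration; the refill body is a placeholder that just reports end of input.

```c
/* Hypothetical buffer type, modelled on the question's buf->pos / buf->end. */
struct buf {
    const unsigned char *pos;
    const unsigned char *end;
};

/* Slow path, kept out of line so its register pressure does not leak
 * into the prologue of the fast path. Placeholder: a real version would
 * refill the buffer and return the next byte. */
__attribute__((noinline))
static int buf_refill(struct buf *b)
{
    (void)b;
    return -1;
}

/* Fast path: externally visible, touches few registers. */
int buf_getc(struct buf *b)
{
    if (b->pos < b->end)
        return *b->pos++;
    return buf_refill(b);
}
```

Whether this actually removes the push/pop overhead depends on the compiler version and flags, so it is worth checking the generated assembly.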
Perhaps upgrade your version of gcc? 4.6 has just been released. As far as I understand, it has the possibility of "partial inline". That is, an easily integratable outer part of a function is inlined and the expensive part is transformed into a call. But I have to admit that I didn't try it myself, yet.
Edit: The statement I was referring to from the ChangeLog:
Partial inlining is now supported and enabled by default at -O2 and greater. The feature can be controlled via -fpartial-inlining.
Partial inlining splits functions with short hot path to return. This allows more aggressive inlining of the hot path leading to better performance and often to code size reductions (because cold parts of functions are not duplicated).
...
Inlining when optimizing for size (either in cold regions of a program or when compiling with -Os) was improved to better handle C++ programs with larger abstraction penalty, leading to smaller and faster code.
I would probably refactor the code to encourage inlining of the simple case. That said, you can use -finline-limit to make gcc consider inlining larger functions, or -fomit-frame-pointer -fno-exceptions to minimize the stack frame. (Note that the latter may break debugging and cause C++ exceptions to misbehave badly.)
Probably you won't be able to get much from tweaking compiler options, though, and will have to refactor.
Seeing as these are external calls, it is possible that gcc is treating them as unsafe and preserving registers for the function call (hard to know without seeing the registers that it preserves, including the ones you say 'aren't used'). Out of curiosity, does this excessive register spilling still occur with all optimizations disabled?

Macros giving problems with dladdr()

I have implemented tracing behavior using the -finstrument-functions option of gcc and this (simplified) code:
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
Dl_info di;
if(dladdr(this_fn, &di))
printf("entered %s\n", (di.dli_sname ? di.dli_sname : "<unknown>"));
}
This works great, except for one thing: macros are processed as well, but the function prints the information of the function which contains the macro.
So functions containing macros have their information printed multiple times (which is of course undesired).
Is there any way to detect that a macro is being processed? Or is it possible to turn off instrumenting macros altogether?
PS Same problems occur with sizeof()
Edit: To clarify: I am looking for a solution to prevent macros messing with the instrumented functions (which they should not be doing). Not for methods to trace macros, functions and/or other things.
Macros are expanded inline by the preprocessor, therefore there is no way to distinguish between a function called directly from the code and called from a macro.
The only possible way around this would be to have your macros set a global flag, which your tracing function will check.
This is of course less than foolproof, since any calls done by a function called from a macro will also appear the same way.
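The flag idea could be sketched as follows. Everything here other than the two hook names is hypothetical: the flag, the counter that stands in for the printf, and the UNTRACED wrapper that a macro body would use.

```c
/* Nonzero while code wrapped by UNTRACED is running (hypothetical flag). */
static int trace_suppressed;

/* Stands in for the printf in the original hook (hypothetical counter). */
static unsigned long traced_calls;

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    if (trace_suppressed)
        return;             /* call came from inside a wrapped macro */
    traced_calls++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}

/* Wrap the body of a macro so that calls made by it are not traced. */
#define UNTRACED(stmt) \
    do { trace_suppressed++; stmt; trace_suppressed--; } while (0)
```

A counter is used instead of a plain boolean so that nested wrappers restore the previous state correctly. The flag is not thread-safe as written.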
If you really want to dig into it you can see my response to breakdown c++ code size. C++ templates are really just more formal macros, so this may work for you.
It also may not, since __LINE__ and __FILE__ within a macro correspond to the caller.
edit
from my comment on this:
$ gcc -E foo.c | gcc -x cpp-output -c -finstrument-functions - -o foo.o
This pipes the preprocessor output into a second gcc that expects already preprocessed C on standard input.

Adding a pass to gcc?

Has anybody added a pass to gcc ? or not really a pass but adding an option to do some nasty things... :-) ...
I still have the same problem about calling a function just before returning from another...so I would like to investigate it by implementing something in gcc...
Cheers.
EDIT: Adding a pass to a compiler means revisiting the tree to perform some optimizations or some analysis. I would like to emulate the behavior of __cyg_profile_func_exit but only for some functions and be able to access the original return value.
So I'm going to try to improve my question. I would like to emulate really basic AOSD-like behavior. AOSD, or aspect-oriented programming, makes it possible to add cross-cutting concerns (debugging is a cross-cutting concern).
#include <stdio.h>
int foo(int arg_num); /* declared before use in main */
int main(int argc, char ** argv) {
return foo(argc);
}
int foo(int arg_num) {
int result = arg_num > 3 ? arg_num : 42;
return result;
}
int dbg(int returned) {
printf("Return %d\n", returned);
return returned;
}
I would like to be able to say, I'd like to trigger the dbg function after function foo has been executed. The problem is how to tell the compiler to modify the control flow and execute dbg. dbg should be executed between return and foo(argc) ...
That's really like __cyg_profile_func_exit but only in some cases (and the problem with __cyg_profile_func_exit is that you cannot easily see and modify the returned value).
If you still are interested in adding a GCC pass, you can start reading up GCC Wiki material just about that:
http://gcc.gnu.org/wiki/WritingANewPass and "Implementing Passes" from http://www.airs.com/dnovillo/200711-GCC-Internals/ on how to, well, add a pass.
The intermediate representation you are interested in is called GIMPLE. Some introduction is at http://www.airs.com/dnovillo/200711-GCC-Internals/200711-GCC-Internals-3-IR.pdf
Other information at http://gcc.gnu.org/wiki/GettingStarted
Just for future reference: plugin support was added to gcc after the 4.4 series (it first shipped in GCC 4.5), specifically for use cases such as adding optimization passes to the compiler without having to bootstrap the whole compiler.
May 6, 2009:GCC can now be extended using a generic plugin framework on host platforms that support dynamically loadable objects.
(see gcc.gnu.org)
To answer your question: gcc is a pretty popular compiler platform to do compiler research on, so yes, I'm sure someone has done it.
However, hooking into gcc's code generation is not something you'd do over a weekend. (I'm not sure what your scope is and how much time you're willing to invest.) If you really do want to hack gcc to do what you want, you most certainly want to start by discussing it on one of the gcc mailing lists.
Tips: don't assume that people have read your other questions. If you want to refer to a question, please add a link to it if you want people to find it.
Do you need to use GCC? LLVM looks like it would work. It is written in C++, and it is very easy to write a pass.
It's an interesting question. I'm going to address concepts around the question rather than answer the question directly because, well, I don't know that much about gcc internals.
You've probably already explored some higher-level manipulation of the source code to achieve what you want to accomplish; some kind of
int main(int argc, char ** argv) {
return dbg(foo(argc));
}
inserted with a macro on the function "foo", perhaps. If you're looking for a compiler hack, though, then you probably don't want to modify source.
There are some gcc extensions that sound a bit like what you're going for. If gcc has anything that does what you want, it'll probably be documented in the C-language extensions area of the documentation. I couldn't find anything that sounded exactly like what you've described, but perhaps since you understand best what you're looking for, you'll know better how to find it.
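One way the macro wrapping could be done, as a sketch: define a function-like macro named foo after the real foo has been defined. Inside its own expansion the name foo is not expanded again, so it still refers to the function. The bodies of foo and dbg follow the question, with dbg changed to pass the value through.

```c
#include <stdio.h>

int foo(int arg_num)
{
    return arg_num > 3 ? arg_num : 42;
}

/* Logs the value and passes it through unchanged. */
int dbg(int returned)
{
    printf("Return %d\n", returned);
    return returned;
}

/* From here on, every call written as foo(x) becomes dbg(foo(x)).
 * The inner foo is not re-expanded, so it names the real function. */
#define foo(x) dbg(foo(x))
```

Since dbg returns its argument, it could also return a modified value, which is the part the profiling hooks cannot do. The macro only affects code that appears after it in the same translation unit.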
A gdb script would do a pretty good job of outputting debug, but it sounds like you've got bigger plans than simply doing printf's. Inserting significant logic into the code seems to be what you're after.
Which reminds me of some dynamic linker tricks I've come across recently. Library interposing could insert code around function calls without affecting the original source. The example I've encountered was on Solaris, but there is probably an analog on other platforms.
Just came across the -finstrument-functions option in the gcc documentation:
-finstrument-functions
Generate instrumentation calls for entry and exit to functions. Just after function
entry and just before function exit, the following profiling functions will be called
with the address of the current function and its call site. (On some platforms,
__builtin_return_address does not work beyond the current function, so the call site
information may not be available to the profiling functions otherwise.)
void __cyg_profile_func_enter (void *this_fn,
void *call_site);
void __cyg_profile_func_exit (void *this_fn,
void *call_site);
But I guess this doesn't work because you are not able to modify the return value from the profiling functions.
GCC, the GNU Compiler Collection, is a large suite, and I don't think hacking up its source code is your answer for finding problems in a single application.
It sounds like you are looking more for debugging or profiling tools, such as gdb and its various front ends (xgdb, ddd), and gprof. Memory and bounds checking tools like Electric Fence, glibc's mcheck, valgrind, and mudflap might help if this is a memory or pointer issue. Enabling compiler flags for warnings and newer C standards might also be useful: -std=c99 -Wall -pedantic.
I cannot understand what you mean by
I still have the same problem about calling a function just before returning from another.
So I am not certain what you are looking for. Can you give a trivial or pseudo-code example?
I.e.
#include <stdio.h>
void b(void); /* declared before use in a() */
void a(void) {
b();
}
void b(void) {
printf("Hello World\n");
}
int main(int ac, char *av[]) {
a();
return 0;
}

Resources