Unused functions detection utility for C

I am trying to measure code coverage on a C project that consists of several libraries and a main program.
Is there a utility that can help me find which functions I don't use, in both the libraries and the main program?
I want to build a list of public functions that are not used by my main program, so that I can ignore them in my code coverage report.

If you are using gcc, compile your code with the option:
-Wunused-function
Warn whenever a static function is declared but not defined or a non-inline static function is unused. This warning is enabled by -Wall.

cflow can create a call graph for the program, but in some cases it does not handle pointers to functions well.
For example:
#include <stdio.h>
static int f1(){
return 1;
}
int (*p_f1)() = f1;
int main() {
p_f1();
return 0;
}

There are coverage tools available for free, for example "gcov", which comes with the gcc tool suite. However, code coverage only tells you which functions get hit by your testing (or whatever you do to exercise the code), so for example
ptr = malloc(...);
if (!ptr)
{
allocation_failed(__FILE__, __LINE__);
}
would only show that allocation_failed is called if you are also using some tool that makes your allocations fail from time to time.
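One way to exercise such a failure path is to route allocations through a small wrapper that can be told to fail on demand. The following is only a minimal sketch: xmalloc(), inject_alloc_failure_after(), allocation_failures() and make_buffer() are hypothetical names introduced for illustration, and allocation_failed() is a stand-in for the handler in the snippet above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical test hook: a negative value means "never fail". */
static long fail_countdown = -1;
static int failures_seen = 0;

/* Arm the injector: after n more successful calls, xmalloc() fails once. */
void inject_alloc_failure_after(long n)
{
    fail_countdown = n;
}

/* Stand-in for the error handler; it only counts how often it was hit. */
void allocation_failed(const char *file, int line)
{
    (void)file;
    (void)line;
    failures_seen++;
}

int allocation_failures(void)
{
    return failures_seen;
}

/* Wrapper used instead of malloc() in the code under test. */
void *xmalloc(size_t size)
{
    if (fail_countdown == 0) {
        fail_countdown = -1;    /* fail once, then disarm */
        return NULL;
    }
    if (fail_countdown > 0)
        fail_countdown--;
    return malloc(size);
}

/* Code under test: same shape as the snippet above. */
int make_buffer(size_t size)
{
    void *ptr = xmalloc(size);
    if (!ptr) {
        allocation_failed(__FILE__, __LINE__);
        return -1;
    }
    free(ptr);
    return 0;
}
```

A test would call inject_alloc_failure_after() before running the code, so that the coverage report shows the error branch as executed.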
I'm not aware of a tool that will show you which functions are not used across larger systems (with multiple libraries, etc). I expect you could make something by using the output of "nm" and a bit of "pulling things in". It won't report foo and bar as unused in this case:
unit1.c:
extern int foo(void);
int bar()
{
return foo();
}
unit2.c:
int foo(void)
{
return 42;
}
int baz(void)
{
return bar();
}
and then baz isn't used anywhere. But if you remove baz, it will show that bar is not called, and then you can remove foo after that...
Edit: Crazy idea time. How about taking every C file in the project, concatenating them all into a single .c file, adding static at the beginning of every function, and compiling with -Wunused-function? I'm sure there will be some "interesting" effects from this if your code isn't extremely well written, but it may be worth a try. It would be fairly easy to do this on a Linux system, with something like find . -name "*.c" -print | xargs cat > giantsource.c (with the output file placed where find will not pick it up). You then need a little bit of sed or something to label all functions static, which I'm not quite sure how you'd go about doing; it depends very much on the formatting of your code.
You may want to have a look at this:
http://www.gedanken.demon.co.uk/cxref/
I haven't used it, but any decent cross-referencing tool should be able to identify anything that is "not used" as not having any references. Of course, you'll probably still have to run over the code several times to weed out the functions that are only used by functions that aren't being called, etc.

cflow has an option to build a cross-reference table: --xref
The format of the output is described in the GNU cflow manual, section "Cross-Reference":
GNU cflow is also able to produce cross-reference listings. This mode is enabled by --xref (-x) command line option. Cross-reference output lists each symbol occurrence on a separate line. Each line shows the identifier and the source location where it appears. If this location is where the symbol is defined, it is additionally marked with an asterisk and followed by the definition. For example, here is a fragment of a cross-reference output for d.c program:
printdir * d.c:42 void printdir (int level,char *name)
printdir d.c:74
printdir d.c:102
It shows that the function printdir is defined in line 42 and referenced twice, in lines 74 and 102.
To detect unused functions, look for a line with an asterisk that is not followed by a line with the same symbol name. The following GNU Awk code prints the unused functions:
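# Input: "cflow -x" output, grouped by symbol; definitions have "*" in field 2.
$1 != sym {
    # A new symbol starts: report the previous one if it was defined but never referenced.
    if (defined && !used) {
        print sym
    }
    sym = $1
    defined = 0
    used = 0
}
$2 == "*" {
    defined = 1
    next
}
{
    used = 1
}
END {
    # The last symbol has no following line to trigger the check above.
    if (defined && !used) {
        print sym
    }
}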
The command may be:
cflow -x src/*.c src-gen/*.c | awk -f find-unused-functions.awk

Related

How do I find out where main() is defined in a big project?

Let's say I have the following program (a.c):
#include <stdio.h>
void f()
{
printf("Hello, world!");
}
int main(void)
{
f();
return 0;
}
$ gcc -g a.c
Having a.out, how do I find out where main() is defined? I mean, in a big project it's not always clear where main() comes from.
You can use gdb. There is probably a better command, but the one I know of is:
$ gdb -batch -ex "info function main" a.out
All functions matching regular expression "main":
File a.c:
8: int main(void);
The executable is not normally the place to look for a function definition, because it keeps no reference to the module each function came from. Instead, compile all the source files and run nm(1) on the resulting object files to see which of them defines main.
The source files themselves can be hard to follow: some may contain conditionally compiled code (for example, a fallback main in case you don't provide one), which makes it uncertain which definition is actually built. The compiled module, on the other hand, carries a symbol for main that tells the linker it can resolve all references to main from this file.
The linker splits each compiled input into segments and places them at the appropriate addresses in the memory map, so the final executable has pieces of every module mixed together. That makes it harder to tell where a definition of main (or of any other function) came from.
For the object file that contains main, the output should include something like:
0000000000000c10 T main
In contrast, symbols that a file needs the linker to provide are marked as undefined (U), for example:
U memset

Line number of the caller of a preloaded library function

Let's say I have a program (program.c) that uses rand function in standard C library.
1 #include <stdlib.h>
2 int main(){
3 int rand_number = rand();
4 }
I also have a shared library (intercept.c) that I created to change the behaviour of rand function (simply adds +1 to the result) in the standard library.
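#define _GNU_SOURCE /* needed for RTLD_NEXT */
#include <dlfcn.h>
int rand(void){
    /* Look up the next definition of rand, i.e. the one in the C library. */
    int (*rand_func)(void) = (int (*)(void))dlsym(RTLD_NEXT, "rand");
    int result = (*rand_func)();
    return result + 1;
}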
And I run the program with
LD_PRELOAD=./intercept.so ./program
Is there any way to get the line number (Line 3) and name of the caller function (main) without modifying the program.c's source code?
It is not immediate, but you can use backtrace() to obtain each frame in the call stack.
Then invoke the external command eu-addr2line -f -C -s --pretty-print -p your_pid the_previous_frames... (with popen() or pipe()/fork()/dup2()/exec()...) and parse its output; this provides the information you need (if the program was compiled with -g).
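The first step could look roughly like the sketch below, which assumes glibc's <execinfo.h>. The helper name report_callers() is hypothetical; it would be called from inside the interposed rand() and only collects the addresses, leaving the eu-addr2line invocation to the caller.

```c
#include <execinfo.h>   /* backtrace(): glibc-specific */
#include <stdio.h>
#include <unistd.h>     /* getpid() */

#define MAX_FRAMES 32

/*
 * Hypothetical helper, meant to be called from the interposed rand().
 * Writes the process id and the return addresses of the callers to `out`
 * and returns the number of frames captured.
 */
int report_callers(FILE *out)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    fprintf(out, "pid %ld\n", (long)getpid());
    /* frames[0] lies inside report_callers() itself, so start at 1. */
    for (int i = 1; i < n; i++)
        fprintf(out, "%p\n", frames[i]);
    return n;
}
```

The printed pid and addresses are what the eu-addr2line command above expects as your_pid and the_previous_frames.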
Regarding:
Is there any way to get the line number (Line 3) and name of the caller function (main) without modifying the program.c's source code?
Compile the program with the -ggdb3 option, then set a breakpoint where you want the program to stop, and use the backtrace command bt. This will show the function names, the line numbers, etc.
Another (Linux-specific) approach is to compile everything with -g (perhaps also -O) using GCC and to use Ian Taylor's excellent libbacktrace.
That library parses the DWARF debug information and knows line numbers.
You'll need several hours to understand libbacktrace (read the header file carefully). I am using it in RefPerSys.

Running automated tests on several C files

I have several files in the structure
/algorithms
/a1
a1.c
/a2
a2.c
/a3
a3.c
Each of these files contains a function named after the file. All of these functions have the same signature apart from the name. Essentially, each algorithm is a different implementation of the same thing: different means to the same end. There may, however, be small helper functions.
The content of the files (comments, functions, layout, etc) cannot change.
I want to create some method that will test each algorithm. This method is not confined to being implemented in C.
I have a C file that essentially contains three functions:
// Runs the algorithm, which modifies the given integer array.
void run(void (*algorithm) (int*, size_t));
// Checks that the algorithm successfully completed and
// the array is correct.
int check(int*, size_t);
// Should call run() with appropriate algorithm and a random data set
// and then call check() to make sure it worked.
int main(int, char**);
I need an automated way of including the appropriate files and then calling the function within them. Currently, I have a bash file that gets all the algorithms, copies the tester file, and appends an #include statement and a generated injectedMain() function that gets called by the actual main() function. It runs the copied tester and then deletes it.
function testC() {
local tempout=temptest.c
local filename="$(basename -- $1)"
local functionname="${filename%.*}"
local main="void injectedMain() {test(&$functionname);}"
local include="#include\"$1\"\n$main"
touch $tempout
chmod +x $tempout
printf "$filename: "
cp $TESTER_C $tempout
printf "$include" >> $tempout
gcc $tempout -o tempout -Wall -Wextra -pedantic
./tempout
rm $tempout tempout
}
Where the function is run in a loop for every algorithm C file.
However, this method is prone to error, not extendable, and just downright ugly. Is there a better way to do this?
Combine your existing code and @Herve's answer.
You could let the bash script just collect all algorithms and build a C source like the one proposed by @Herve. This way there will be no error-prone manual step.
To run all tests compile this automatically generated source and link it to your test runner. Let the latter loop through all.
Can't you just use a header that includes your algorithm implementations and loop through all of them?
Something like
#include "a1/a1.h"
#include "a2/a2.h"
#include "a3/a3.h"
typedef void (*AlgorithmImplementation)(void); // Your algorithm function signature goes here
// An array of function pointers, one entry per implementation.
AlgorithmImplementation all[] = {
a1,
a2,
a3
};
Then include this header in your main.c and loop through all.
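The loop itself might look like the sketch below. Everything in it is an assumption made for illustration: a1 and a2 are stand-in implementations, the algorithms are assumed to sort the array in ascending order, and Algorithm, AlgorithmEntry, check() and run_all() are hypothetical names, with the signature taken from the run() prototype in the question.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Signature taken from the run() prototype in the question. */
typedef void (*Algorithm)(int *, size_t);

/* Stand-in for a1.c: bubble sort (assumed behaviour, not the real file). */
void a1(int *data, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        for (size_t j = 0; j + 1 < n - i; j++)
            if (data[j] > data[j + 1]) {
                int tmp = data[j];
                data[j] = data[j + 1];
                data[j + 1] = tmp;
            }
}

/* Stand-in for a2.c: insertion sort (assumed behaviour, not the real file). */
void a2(int *data, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = data[i];
        size_t j = i;
        while (j > 0 && data[j - 1] > key) {
            data[j] = data[j - 1];
            j--;
        }
        data[j] = key;
    }
}

/* Keeping the name next to the pointer lets the runner report which one failed. */
typedef struct {
    const char *name;
    Algorithm fn;
} AlgorithmEntry;

static const AlgorithmEntry all[] = {
    { "a1", a1 },
    { "a2", a2 },
};

/* Returns 1 if the array is in ascending order, 0 otherwise. */
int check(const int *data, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (data[i - 1] > data[i])
            return 0;
    return 1;
}

/* Runs every registered algorithm on a fresh copy of the input; returns the failure count. */
int run_all(void)
{
    const int input[] = { 5, 2, 9, 1, 5, 6 };
    const size_t n = sizeof input / sizeof input[0];
    int failures = 0;

    for (size_t i = 0; i < sizeof all / sizeof all[0]; i++) {
        int data[sizeof input / sizeof input[0]];
        memcpy(data, input, sizeof input);
        all[i].fn(data, n);
        int ok = check(data, n);
        printf("%s: %s\n", all[i].name, ok ? "ok" : "FAILED");
        if (!ok)
            failures++;
    }
    return failures;
}
```

With a table like this, adding an algorithm means adding one include and one table entry, which a script can generate.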

List all functions declared in header but missing in the source file?

Question
Are there any linters or static analyzers that warn (or error) on functions that are declared in the header file but not implemented in the corresponding source file?
Let's say we have the following header (guard omitted):
/* example.h */
int doSomething(int i);
double doSomethingElse(double d);
And the following source:
/* example.c */
#include "example.h"
int doSomething(int i) {
return i + 1;
}
So is there some tool that can tell me that doSomethingElse() is missing in example.c?
Why am I asking?
In an exercise we got some header files with a fully fledged interface, and partially prepared source files, with some functions being fully provided, some functions being partially provided, and some missing.
To actually compile and run this program it was enough to complete the partially provided functions, but there is still some discrepancy between the interface defined in the header and the functions now provided in the source file.
I could go through all header/source pairs by hand and implement the missing functions, but it would be nice to have some autogenerated to-do list.
I'd just do it with grep etc.:
grep ');' foo.h | tr -d ';' | while read -r decl
do
# -F: match the declaration as a fixed string, not as a regular expression
if ! grep -qF -- "$decl" foo.c
then
echo "not found: $decl"
fi
done
No, this isn't perfect, but it might work if your use case is as simple as you've outlined.

Why are some debug symbols missing and how to track them?

I am currently debugging a kernel module, and for this purpose I built the whole kernel with debug information (produces kallsyms, etc.).
When I run nm my_module.ko, I get the list of symbols included by my module. All is fine, except that some symbols are missing: they do not appear in the symbol list. My feeling is that the related functions are being automatically inlined.
Anyway, when running the kernel with qemu-kgdb/gdb, I am able to see that the "missing" function is called. This means the compiler did not remove it as unused in every code path (hence my "feeling").
Since the symbol does not appear, I can't set a breakpoint on it, and gdb won't unroll it so that I can see the running code path; that is, I don't know how to tell gdb to unroll it.
Unfortunately, I want to see this part of the code path. How can I do so?
EDIT: As suggested in Tom's answer, I tried using the file:line syntax as below.
My code file looks like this:
int foo(int arg) // The function that I suspect to be inlined - here is line 1
{
/* Blabla */
return 42;
}
void foo2(void)
{
foo(0); // Line 9
}
I tried b file.c:1, and the breakpoint was hit but the foo() function is not unrolled.
Of course, I am producing debug symbols, since I also set a breakpoint to foo2 to check what happened (which worked well).
You don't say what version of gdb you are using.
Very old versions of gdb don't have any support for inline functions. This was true for 6.8 and maybe even 7.0 -- I don't recall. You can look at the NEWS file for your gdb to see.
Then there were some versions of gdb that supported breakpoints on inline functions, but only using the "file:line" syntax. So what you would do is look up the function in your editor, and find its line number and enter, e.g.:
(gdb) break myfile.c:777
Even more recent versions of gdb, starting with 7.4 or 7.5 (I forget) will handle "break function" just fine if "function" was inlined.
All of this only works if you have debuginfo available. So if you tried this, and it failed, either you have an older gdb, or you forgot to use -g.
There's no good way inside gdb to see what objects in a compilation were missing -g. You can see it pretty easily from the shell, though, by running "readelf -WS" on the .o files, and looking for files that don't have a .debug_info section.
Setting a breakpoint to the signature line of the function did not work. But setting one to the line of an instruction of the inlined function solved the problem for me. For instance, considering the following function inline_foo, found in myfile.c:
inline int inline_foo(int arg) // l.1
{
    int a_var = 0;
    do_smth(&a_var);
    do_some_other_thing(); // l.5
    if (a_var) {
        a_var = blob();
    } else {
        a_var = blub();
    } // l.10
    return a_var;
}
I was trying b myfile.c:1, which did not appear to work. But if I tried b myfile.c:3 instead, the breakpoint was well handled by GDB.
Since the technique is the same as the one described previously by Tom, I'll accept his answer.

Resources