How to automatically call all functions in C source code

Have you ever heard of automatic C code generators?
I have to do a somewhat unusual piece of API research that requires at least one attempted execution of every function. This may lead to crashes or segmentation faults; that doesn't matter. I just need to register every function call.
So I have a long list (several hundred) of functions extracted from the sources using
ctags -x --c-kinds=f *.c
Can I use any tool to generate code that calls each of them? Thanks a lot.
UPD: Thanks for all your answers.
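For illustration, the kind of harness to generate from that ctags list could look like this. A minimal sketch, assuming POSIX and that each function can be attempted with no arguments; foo and bar stand in for the real names, and forking per call lets the run survive segmentation faults:

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void foo(void);                         /* declarations generated from the ctags list */
void bar(void);

static void try_call(const char *name, void (*fn)(void)) {
    pid_t pid = fork();
    if (pid == 0) {                     /* child: a crash here kills only the child */
        fprintf(stderr, "calling %s\n", name);
        fn();
        _exit(0);
    }
    int status;
    waitpid(pid, &status, 0);           /* parent registers the attempt and moves on */
}

int main(void) {
    try_call("foo", foo);               /* one line per function, emitted by a script */
    try_call("bar", bar);
    return 0;
}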

You could also consider customizing the GCC compiler, e.g. with a MELT extension, which could generate the testing code during a customized compilation. You might even define your own #pragma or __attribute__ to parameterize these functions (enabling their auto-testing, giving default arguments for testing, etc.).
However, I'm not sure it is the right approach for unit testing. There are many unit testing frameworks (but I am not very familiar with them).

Maybe something like autoconf could help you with that: as described here. In particular, check AC_CHECK_FUNCS. Autoconf creates small programs to test for the existence of registered functions.
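For each name you pass to AC_CHECK_FUNCS, autoconf essentially tries to compile and link a tiny program of roughly this shape (simplified; some_function stands for the function being probed):

/* conftest.c -- the link succeeds only if some_function exists in the libraries */
char some_function();                /* deliberately declared with a dummy prototype */

int main(void) {
    return some_function();
}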

Related

Enable mocking for unit testing a library in C

In our environment we're encountering a problem regarding mocking functions for our library unit tests.
The thing is that instead of mocking whole modules (.c files) we would like to mock single functions.
The library is compiled to an archive file and linked statically to the unit test. Without mocking there isn't any issue.
Now, when trying to mock single functions of the library, we would obviously get multiple-definition errors.
My approach now is to use the weak function attribute when compiling/linking the library so that the linker takes the mocked (non-weak) function when linking against the unit test. I already tested it and it seems to work as expected.
The downside of this is that we need many attribute declarations in the code.
My final approach would be to pass some compile or link arguments so that every function is automatically declared as a weak symbol.
The question now is: Is there anything to do this in a nice way?
btw: We use clang 8 as a compiler.
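For reference, the weak-attribute approach described above looks roughly like this (read_sensor is a hypothetical library function):

/* library.c -- compiled into the library archive; weak, so it can be overridden */
__attribute__((weak)) int read_sensor(void) {
    return 42;                       /* real implementation */
}

/* test.c -- this strong definition wins when the unit test links the archive */
int read_sensor(void) {
    return -1;                       /* test double */
}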
James Grenning describes several options to solve this problem (http://blog.wingman-sw.com/linker-substitution-in-c-limitations-and-workarounds). The option "function pointer substitution" gives a high degree of freedom. It works as follows: Replace functions by pointers to functions. The function pointers are initialized to point to the original function, but each pointer can be redirected individually to a test double.
This approach allows you to have a single test executable where you can still decide, for each test case individually, which functions use a test double and which use the original function.
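A minimal sketch of the pattern (read_sensor is again a hypothetical function):

/* production code: callers go through the pointer, not the function directly */
int read_sensor_impl(void) { return 42; }
int (*read_sensor)(void) = read_sensor_impl;

/* in a test case */
static int fake_sensor(void) { return -1; }

void test_low_battery(void) {
    read_sensor = fake_sensor;       /* redirect just this one function */
    /* ... exercise the code under test ... */
    read_sensor = read_sensor_impl;  /* restore for the next test case */
}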
It certainly also comes at a price:
One indirection for each call. But if you use link-time optimization, the optimizer will most likely eliminate that indirection again, so this may not be an issue.
It also becomes possible to redirect function calls in production code. That would certainly be a misuse of the concept, however.
I would suggest using VectorCAST
https://www.vector.com/us/en/products/products-a-z/software/vectorcast/
I've used Unity/CMock and others for unit testing C in the past, but after a while it's very tedious to create these by hand for a language that isn't really built around that concept and very much takes a "here's a hammer and a chisel, the world is yours" approach.
VectorCAST abstracts away the majority of the manual work that tools like Unity/CMock require; we can get results across a project/module sooner than we did in the past with the other tools.
Is VectorCAST expensive and very much an enterprise-level tool? Yes... but it's definitely worth its weight in gold. And that's coming from someone with a very old-school, manual approach to software development... just text editors, terminals, and command-line debuggers.
VectorCAST handles function pointers and pointers extremely well, and stubbing a function is as easy as two clicks. It saved our team a lot of time... allowing us to focus on results and shortening the development feedback loop.

Compile-time test if function is optimized out

I'm writing a small operating system for microcontrollers in C (not C++, so I can't use templates). It makes heavy use of some gcc features, one of the most important being the removal of unused code. The OS doesn't load anything at runtime; the user's program and the OS source are compiled together to form a single binary.
This design allows gcc to include only the OS functions that the program actually uses. So if the program never uses i2c or USB, support for those won't be included in the binary.
The problem is when I want to include optional support for those features without introducing a dependency. For example, a debug console should provide functions to debug i2c if it's being used, but including the debug console shouldn't also pull in i2c if the program isn't using it.
The methods that come to mind to achieve this aren't ideal:
Have the user explicitly enable the modules they need (using #define), and use #if to only include support for them in the debug console if enabled. I don't like this method, because currently the user doesn't have to do this, and I'd prefer to keep it that way.
Have the modules register function pointers with the debug module at startup. This isn't ideal, because it adds some runtime overhead and means the debug code is split up over several files.
Do the same as above, but using weak symbols instead of pointers. But I'm still not sure how to actually accomplish this (see the sketch after this list).
Do a compile-time test in the debug code, like:
if (i2cInit is used) {
    debugShowi2cStatus();
}
The last method seems ideal, but is it possible?
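On the weak-symbol variant (method 3 above): a minimal sketch, assuming GCC and accepting a run-time rather than compile-time check, is to take a weak reference to the module's entry point; if nothing else pulled i2c in from its archive, the reference is null. Whether this actually avoids the dependency depends on how the modules are archived and linked, and any i2c symbol used inside the guarded code needs the same weak treatment:

/* debug console -- a weak reference does not by itself pull in i2c */
void debugShowi2cStatus(void);       /* must itself avoid strong references to i2c */
extern void i2cInit(void) __attribute__((weak));

void debugShowStatus(void) {
    if (i2cInit) {                   /* null unless something else linked i2c in */
        debugShowi2cStatus();
    }
}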
This seems like an interesting problem. Here's an idea, although it's not perfect:
Two-pass compile.
What you can do is first, compile the program with a flag like FINDING_DEPENDENCIES=1. Surround all the dependency checks with #ifs for this (I'm assuming you're not as concerned about adding extra ifs there.)
Then, when the compile is done (without any optional features), use nm or similar to detect the usage of functions/features in the program (such as i2cInit), and format this information into a .h file.
#ifndef FINDING_DEPENDENCIES
#include "dependency_info.h"
#endif
Now the optional dependencies are known.
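The generated header can then drive ordinary conditional compilation in the debug code (the macro name is hypothetical):

/* dependency_info.h -- generated from the nm output of the first pass */
#define HAVE_I2C_INIT 1              /* nm found i2cInit in the first-pass binary */

/* in the debug console (an undefined macro evaluates to 0 in the first pass) */
#if HAVE_I2C_INIT
    debugShowi2cStatus();
#endif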
This still doesn't seem like a perfect solution, but ultimately it's mostly a chicken-and-egg problem: when compiling, the compiler doesn't know which symbols are going to be gc'd out. You basically need to get this information from the linker stage and feed it back to the compilation stage.
Theoretically, this might not increase build times much, especially if you used a temp file for the generated h, and then only replaced it if it was different. You'd need to use different object dirs, though.
Also this might help (pre-strip, of course):
How can I view function names and parameters contained in an ELF file?

Can I run GCC as a daemon (or use it as a library)?

I would like to use GCC kind of as a JIT compiler, where I just compile short snippets of code every now and then. While I could of course fork a GCC process for each function I want to compile, I find that GCC's startup overhead is too large for that (it seems to be about 50 ms on my computer, which would make it take 50 seconds to compile 1000 functions). Therefore, I'm wondering if it's possible to run GCC as a daemon or use it as a library or something similar, so that I can just submit a function for compilation without the startup overhead.
In case you're wondering, the reason I'm not considering using an actual JIT library is because I haven't found one that supports all the features I want, which include at least good knowledge of the ABI so that it can handle struct arguments (lacking in GNU Lightning), nested functions with closure (lacking in libjit) and having a C-only interface (lacking in LLVM; I also think LLVM lacks nested functions).
And no, I don't think I can batch functions together for compilation; half the point is that I'd like to compile them only once they're actually called for the first time.
I've noticed libgccjit, but from what I can tell, it seems very experimental.
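For a sense of what that API involves: compiling a single function with libgccjit looks roughly like this, following the shape of its tutorial (a sketch, not a measurement of its overhead):

/* build: gcc jit_square.c -lgccjit */
#include <libgccjit.h>
#include <stdio.h>

int main(void) {
    gcc_jit_context *ctxt = gcc_jit_context_acquire();
    gcc_jit_type *int_type =
        gcc_jit_context_get_type(ctxt, GCC_JIT_TYPE_INT);

    /* int square(int i) { return i * i; } */
    gcc_jit_param *i =
        gcc_jit_context_new_param(ctxt, NULL, int_type, "i");
    gcc_jit_function *fn = gcc_jit_context_new_function(
        ctxt, NULL, GCC_JIT_FUNCTION_EXPORTED, int_type, "square", 1, &i, 0);
    gcc_jit_block *body = gcc_jit_function_new_block(fn, NULL);
    gcc_jit_block_end_with_return(body, NULL,
        gcc_jit_context_new_binary_op(ctxt, NULL, GCC_JIT_BINARY_OP_MULT,
            int_type, gcc_jit_param_as_rvalue(i),
            gcc_jit_param_as_rvalue(i)));

    gcc_jit_result *result = gcc_jit_context_compile(ctxt);
    int (*square)(int) =
        (int (*)(int))gcc_jit_result_get_code(result, "square");
    printf("%d\n", square(5));       /* prints 25 */

    gcc_jit_result_release(result);
    gcc_jit_context_release(ctxt);
    return 0;
}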
My answer is "No (you can't run GCC as a daemon process, or use it as a library)", assuming you are trying to use the standard GCC compiler code. I see at least two problems:
The C compiler deals in complete translation units, and once it has finished reading the source, compiles it and exits. You'd have to rejig the code (the compiler driver program) to stick around after reading each file. Since it runs multiple sub-processes, I'm not sure that you'll save all that much time with it, anyway.
You won't be able to call the functions you create as if they were normal statically compiled and linked functions. At the least you will have to load them (using dlopen() and its kin, or writing code to do the mapping yourself) and then call them via the function pointer.
The first objection deals with the direct question; the second addresses a question raised in the comments.
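For the loading step in the second point, a sketch using dlopen()/dlsym() (the file name libsnippet.so and the symbol square are placeholders):

/* assumes the compiled snippet was written to libsnippet.so; link with -ldl on older glibc */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *handle = dlopen("./libsnippet.so", RTLD_NOW);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    int (*square)(int) = (int (*)(int))dlsym(handle, "square");
    printf("%d\n", square(5));       /* call the freshly compiled function */
    dlclose(handle);
    return 0;
}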
I'm late to the party, but others may find this useful.
There exists a REPL (read–eval–print loop) for C++ called Cling, which is based on the Clang compiler. A big part of what it does is JIT compilation for C and C++. As such, you may be able to use Cling to get what you want done.
The even better news is that Cling is undergoing an attempt to upstream a lot of the Cling infrastructure into Clang and LLVM.
@acorn pointed out that you'd ruled out LLVM and co. for lack of a C API, but Clang itself does have one, and it is the only API they guarantee stability for: https://clang.llvm.org/doxygen/group__CINDEX.html

Tool to list callers of a function in C?

Background:
In a particular project there are a couple of thousand functions spread across more than a hundred files. The functions are divided between two banks of code memory, fast_mem and slow_mem. But now, since the fast_mem area is limited, it's running out of space to accommodate any new code changes.
As part of a code review, it's been found that some functions in fast_mem have no callers. But the list of functions is too long to check one by one manually.
Question:
So, coming to the question, is there a tool that can list the callers of all the functions in the project? With this, I can go ahead and remove functions in fast_mem that don't have any callers.
I use cscope for code browsing, along with ctags. But this requires one to input the function name manually. Can this be automated somehow to get the complete list?
I also tried Doxygen with its caller graph feature, but the result is not very comfortable to use.
I use Scientific Toolworks Understand
If your compiler is a recent GCC (or if you can switch to GCC 4.6, possibly as a cross-compiler) you might develop a GCC plugin or a MELT extension to find out.
Of course, if you are e.g. doing tricks with function pointers (e.g. unportable pointer arithmetic on function pointers) the original question is undecidable.
Actually, if you are using function pointers, often the only reasonable thing to say is that they can reach only functions of the same signature.
And perhaps the project is important enough that customizing the compiler to make a better (automatic or semi-automatic) trade-off between fast_mem and slow_mem is worthwhile. This is typically an excellent case for GCC plugins or MELT extensions (but that takes some work, days or weeks rather than hours, because you need to understand the internal GCC representations to customize GCC), and you are probably the only one who could do it (because your question is very specific to some unusual systems).
Let's assume there aren't any odd function pointer games going on. Then you can break out the under-used cflow:
http://www.gnu.org/software/cflow/
Generate a "reverse index" with the -r flag. you'll get a list of every function, followed by where it's called. You can feed it multiple files.
You can use a static code analysis tool like cppcheck.
If you call it with the --enable=unusedFunction parameter, it will warn about unused functions.
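For example, from the project root:
cppcheck --enable=unusedFunction .
Note that this check is only meaningful when cppcheck sees the whole program at once, so run it over all the sources together.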

Any good reason to #include source (*.c *.cpp) files?

I've been working for some time with an open-source library ("fast artificial neural network"). I'm using its source in my static library. When I compile it, however, I get hundreds of linker warnings, which are probably caused by the fact that the library includes its *.c files in other *.c files (I'm only including some headers I need, and I did not touch the code of the library itself).
My question: Is there a good reason why the developers of the library used this approach, which is strongly discouraged? (Or at least I've been told all my life that this is bad, and from my own experience I believe it IS bad.) Or is it just bad design with no gain in this approach?
I'm aware of this related question but it does not answer my question. I'm looking for reasons that might justify this.
A bonus question: Is there a way to fix this without touching the library code too much? I have a lot of work of my own and don't want to create more ;)
As far as I see (grep '#include .*\.c'), they only do this in doublefann.c, fixedfann.c, and floatfann.c, and each time include the reason:
/* Easy way to allow for build of multiple binaries */
This exact use of the preprocessor for simple copy-pasting is indeed the only valid use of including implementation (*.c) files, and it is relatively rare. (If you want to include some code for another reason, just give it a different name, like *.h or *.inc.) An alternative is to specify configuration in macros given to the compiler (e.g. -DFANN_DOUBLE, -DFANN_FIXED, or -DFANN_FLOAT), but they didn't use this method. (Each approach has drawbacks, so I'm not saying they're necessarily wrong; I'd have to look at that project in depth to determine that.)
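Schematically, each wrapper file looks something like this (the macro name is illustrative, not the library's actual code):

/* doublefann.c -- thin wrapper: each wrapper compiles the shared sources
   into a separate object, one per binary */
#define FANN_DATATYPE double         /* hypothetical configuration macro */
#include "fann.c"                    /* the shared implementation */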
They provide makefiles and MSVS projects which should already avoid linking doublefann.o (from doublefann.c) with fann.o (from fann.c) or fixedfann.o (from fixedfann.c), and so on; either their build files are broken or something similar has gone wrong on your end.
Did you try to create a project from scratch (or use your existing project) and add all the files to it? If you did, what is happening is each implementation file is being compiled independently and the resulting object files contain conflicting definitions. This is the standard way to deal with implementation files and many tools assume it. The only possible solution is to fix the project settings to not link these together. (Okay, you could drastically change their source too, but that's not really a solution.)
While you're at it, if you continue without using their project settings, you can likely skip compiling fann.c et al.; possibly just removing those files from the project is enough, so they won't be compiled and linked. You'll want to choose exactly one of double-/fixed-/floatfann to use, otherwise you'll get the same link errors. (I haven't looked at their instructions, but I would not be surprised to see this explained in more depth there.)
Including C/C++ code leads to all the code being stuck together in one translation unit. With a good compiler, this can lead to a massive speed boost (as stuff can be inlined and function calls optimized away).
If actual code is going to be included like this, though, it should have static in most of its declarations, or it will cause the warnings you're seeing.
If you ever declare a single global variable or function in that .c file, it cannot be included in two places which both compile to the same binary, or the two definitions will collide. If it is included in even one place, it cannot also be compiled on its own while still being linked into the same binary as its user.
If the file is only included in one place, why not just make it a discrete compilation unit (and use its globals via extern declarations)? Why bother having it included at all?
If your C files declare no global variables or functions, they are header files and should be named as such.
Therefore, by exhaustive search, I can say that the only time you would ever potentially want to include C files is if the same C code is used in building multiple different binaries. And even there, you're increasing your compile time for no real gain.
This is assuming that functions which should be inlined are marked inline and that you have a decent compiler and linker.
I don't know of a quick way to fix this.
I don't know that library, but as you describe it, it is either bad practice or your understanding of how to use it is not good enough.
A C project that wants to be included by others should always provide well-structured .h files and a compiled library for linking. If it wants to include function definitions in header files, it should mark them either as static (old-fashioned) or as inline (possible since C99).
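For instance, a helper that must be defined in a header stays link-safe like this (illustrative example):

/* mylib.h -- safe to include from any number of translation units */
#ifndef MYLIB_H
#define MYLIB_H

static inline int mylib_clamp(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);  /* each TU gets its own copy */
}

#endif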
I haven't looked at the code, but it's possible that the .c or .cpp files being included actually contain code that works in a header. For example, a template or an inline function. If that is the case, then the warnings would be spurious.
I'm doing this at the moment at home because I'm a relative newcomer to C++ on Linux and don't want to get bogged down in difficulties with the linker. But I wouldn't recommend it for proper work.
(I also once had to include a header.dat into a C++ program, because Rational Rose didn't allow headers to be part of the issued software and we needed that particular source file on the running system (for arcane reasons).)
