I have a static C library that I can build with different compile time options (e.g. _BUILD_SMALL, _BUILD_FAST). It has a function
void Foo(void);
I would like to use a single instance of a benchmarking tool to benchmark the "small" and the "fast" versions of the library. I don't want to use .dlls.
How can I link to both the "small" and the "fast" libraries and alias the function names so that I can call the small version and the fast version? Ideally it would look something like:
void benchmark(void)
{
FAST_Foo();
SMALL_Foo();
}
More information:
The library can be built with different optimization options (-Os versus -O3). The algorithms also vary slightly (e.g. cached values versus always looking values up). I want to compare the size versus speed tradeoffs of the different versions, and I'd like the unit tests and benchmarks to be run on both versions of the library in the easiest way possible.
This is just a variation of the method given by @Michał Górny (I ran out of comment space there)...
You could create an include file of the following form:
/* Automatically created file - do not edit or ugly dinosaur will eat you */
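#ifndef PREFIX
# define RENAME(f) f
#else
/* two-level paste so that PREFIX is expanded before ## is applied */
# define RENAME_PASTE(p, f) p ## f
# define RENAME_EXPAND(p, f) RENAME_PASTE(p, f)
# define RENAME(f) RENAME_EXPAND(PREFIX, f)
#endif
/* list all the functions and variables you want to rename here in one place */
#define func_foo RENAME(func_foo)
#define func_bar RENAME(func_bar)
/* ... many more ... */
/* do not #undef RENAME here: the macros above expand it at each point of use */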
At least gcc lets you force the inclusion of a header file from the command line with the option -include rename.h (assuming the file is called rename.h). Because you use gcc-style options (-O3 and -Os), I assume gcc in the rest of this answer. Otherwise, if your C compiler is reasonable, you should be able to do this in a similar way.
You can easily create two or even three versions of your library that can be linked in at the same time by passing different options to your C compiler (here through the CFLAGS setting, one line per build):
CFLAGS += -include rename.h -DPREFIX=fast_ -D_BUILD_FAST -O3 -DBENCHMARKING
CFLAGS += -include rename.h -DPREFIX=small_ -D_BUILD_SMALL -Os -DBENCHMARKING
CFLAGS += -D_BUILD_FAST -O2
If your library header files look very regular and if you declare the library private functions static, then it is easy to extract the functions from those header files by some dummy script using very simple regular expressions to automatically generate the rename.h file for you. This is a natural build target if you are using make or something similar. All the global variables also need to be renamed using the same method to allow simultaneous use.
There are three main points with this solution:
The ugly renaming business can be hidden in one file, you do not need to edit the actual source files - especially you do not need to clutter the source files but can keep them clean and easy to read.
The renaming can be easily automated, if you follow some simple principles (coding conventions followed for the header files and the header files will declare all the global variables and functions).
There is no reason to make benchmarking more cumbersome by needing to run your test program multiple times (this is relevant if you are as lazy as I am and dislike repetitive tasks as violently as I do - I know many people do not care, it is somewhat a matter of preference).
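The prefixing idea can be tried out in a single file. The following is only a minimal sketch, not the build setup described above: it fakes two builds of a hypothetical func_foo by redefining PREFIX between two definitions. The helper names PASTE and EXPAND are chosen for illustration; the two levels are needed so that PREFIX is expanded before the ## operator pastes the tokens.

```c
/* Two-level paste: EXPAND lets PREFIX be macro-expanded first,
 * then PASTE glues the resulting prefix onto the name. */
#define PASTE(p, f)  p ## f
#define EXPAND(p, f) PASTE(p, f)

/* Simulated "fast" build of a hypothetical library function. */
#define PREFIX fast_
#define func_foo EXPAND(PREFIX, func_foo)
int func_foo(void) { return 3; }   /* actually defines fast_func_foo */
#undef func_foo
#undef PREFIX

/* Simulated "small" build of the same source. */
#define PREFIX small_
#define func_foo EXPAND(PREFIX, func_foo)
int func_foo(void) { return 1; }   /* actually defines small_func_foo */
#undef func_foo
#undef PREFIX

/* Both variants now coexist in one program under distinct names. */
int sum_of_both(void)
{
    return fast_func_foo() + small_func_foo();
}
```

In the real setup the two definitions would live in separately compiled libraries, and the return values here are placeholders for the actual implementations.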
One way would be: keep the same name for both and call appropriately depending on the compile time option set.
#ifdef SMALL_FOO
void foo(void) {
/* Small foo code */
}
#endif
#ifdef BIG_FOO
void foo(void) {
/* Big foo code */
}
#endif
Set SMALL_FOO or BIG_FOO during compilation with -D (for example -DSMALL_FOO).
As a quick solution, you can use a macro to mangle the function name like:
#ifdef FAST
# define FUNC(x) FAST_##x
#else
# define FUNC(x) SLOW_##x
#endif
void FUNC(Foo)();
And now with -DFAST the library with FAST_Foo will be built; and without it, one with SLOW_Foo. Just note that you need to use the FUNC() macro in the implementation part as well (and whenever you are referring to that function from inside the library), and #ifdef FAST to switch between fast/slow code.
Just please don't use that in production code.
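Once both builds expose distinct names, the benchmark from the question can call each variant through a function pointer. This is a rough sketch under stated assumptions: the bodies of FAST_Foo and SMALL_Foo are stand-ins invented for illustration (in a real setup they would come from the two prefixed library builds), and time_calls is a hypothetical helper that measures CPU time with clock().

```c
#include <time.h>

/* Stand-ins for the two library builds; the real functions would come
 * from the separately compiled, prefixed libraries. */
static volatile unsigned long sink;
void FAST_Foo(void)  { sink += 1; }
void SMALL_Foo(void) { for (int i = 0; i < 4; i++) sink += 1; }

/* Run fn() `iterations` times and return the elapsed CPU time in seconds. */
double time_calls(void (*fn)(void), long iterations)
{
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A benchmark() function could then compare time_calls(FAST_Foo, n) with time_calls(SMALL_Foo, n) for the same n within one run of the tool.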
If you attempt to link both static libraries into the same executable, the second library listed on your link line will not have any effect, because all the symbols it provides were already satisfied by the first library. If you provided simple unique wrapper functions to call Foo, it would still fail, now because of multiple definitions. Here is an example:
/* x.c */
extern void Y_Bar ();
extern void Z_Bar ();
int main ()
{
Y_Bar();
Z_Bar();
}
This main calls unique wrapper functions, which are provided in liby.a and libz.a.
/* y.c in liby.a */
#include <stdio.h>
void Y_Bar () {
extern void Foo ();
Foo();
}
void Foo () {
printf("%s\n", "that Foo");
}
/* z.c in libz.a */
#include <stdio.h>
void Z_Bar () {
extern void Foo ();
Foo();
}
void Foo () {
puts("this foo");
}
Attempting to link the executable with -ly -lz will fail.
The easiest workaround for you is to build two separate executables. Your benchmark driver could then execute both executables to compare their relative performance.
You say that you can build the library with different compile time options, so why not edit the code to change the names of the functions in each build? (You'd be making two different versions of your library.)
Maybe you can use the -D option when calling gcc, such as -D_FAST_ or -D_SMALL_. Alternatively, pass a parameter to make, such as make CFG=FAST or make CFG=SMALL, and have the makefile link against the FAST library when it receives CFG=FAST.
In my case I am writing a simple plugin system in C using dlfcn.h (linux). The plugins are compiled separately from the main program and result in a bunch of .so files.
There are certain functions that must be defined in the plugin in order for the plugin to be called properly by the main program. Ideally I would like each plugin to include a .h file or something similar that states what functions a valid plugin must have; if these functions are not defined in the plugin, I would like the plugin to fail compilation.
I don't think you can enforce that a function be defined at compile time. However, if you use gcc toolchain, you can use the --undefined flag when linking to enforce that a symbol be defined.
ld --undefined foo
will treat foo as though it is an undefined symbol that must be defined for the linker to succeed.
You cannot do that.
It is common practice to define only two exported functions in a library opened by dlopen(): one to import functions into your plugin and one to export functions from your plugin.
A few lines of code are better than any explanation:
struct plugin_import {
void (*draw)(float);
void (*update)(float);
};
struct plugin_export {
int (*get_version)(void);
void (*set_version)(int);
};
extern void import(struct plugin_import *);
extern void export(struct plugin_export *);
int setup(void)
{
struct plugin_export out = {0};
struct plugin_import in;
/* give the plugin our function pointers */
in.draw = &draw, in.update = &update;
import(&in);
/* get our functions out of the plugin */
export(&out);
/* verify that all functions are defined */
if (out.get_version == NULL || out.set_version == NULL)
return 1;
return 0;
}
This is very similar to the system Quake 2 used. You can look at the source here.
The only difference is that Quake 2 exported a single function, which both imports and exports the functions of the dynamic library at once.
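A single-entry-point variant of the code above might look like the following. This is a sketch under stated assumptions, not Quake 2's actual interface: the name get_plugin_api, the host_* and plugin_* functions, and the helper verify_exports are all hypothetical, and both sides are shown in one file only so that the example is self-contained.

```c
#include <stddef.h>

struct plugin_import {            /* functions the host offers to the plugin */
    void (*draw)(float);
    void (*update)(float);
};

struct plugin_export {            /* functions the plugin offers to the host */
    int  (*get_version)(void);
    void (*set_version)(int);
};

/* --- plugin side (hypothetical) --- */
static struct plugin_import host;
static int version = 1;
static int  plugin_get_version(void)  { return version; }
static void plugin_set_version(int v)
{
    version = v;
    if (host.update != NULL)
        host.update(0.0f);        /* call back into the host */
}

/* The single exported entry point: takes the host's table, returns ours. */
struct plugin_export get_plugin_api(const struct plugin_import *in)
{
    struct plugin_export out = { plugin_get_version, plugin_set_version };
    host = *in;
    return out;
}

/* --- host side --- */
static void host_draw(float dt)   { (void)dt; }
static void host_update(float dt) { (void)dt; }

/* Returns 0 if every required function pointer is set, 1 otherwise. */
int verify_exports(const struct plugin_export *out)
{
    return (out->get_version == NULL || out->set_version == NULL) ? 1 : 0;
}

int setup(void)
{
    struct plugin_import in = { host_draw, host_update };
    struct plugin_export out = get_plugin_api(&in);
    if (verify_exports(&out) != 0)
        return 1;
    out.set_version(2);
    return out.get_version() == 2 ? 0 : 1;
}
```

With a real plugin, the host would obtain get_plugin_api through dlsym() instead of linking it directly; the check for missing functions still happens at load time, not at compile time.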
Well after doing some research and asking a few people that I know of on IRC I have found the following solution:
Since I am using gcc I am able to use a linker script.
linker.script:
ASSERT(DEFINED(funcA), "must define funcA" ) ;
ASSERT(DEFINED(funcB), "must define funcB" ) ;
If either of those functions is not defined, then a custom error message will be output when the program tries to link.
(more info on linker script syntax can be found here: http://www.math.utah.edu/docs/info/ld_3.html)
When compiling simply add the linker script file after the source file:
gcc -o test main.c linker.script
Another possibility:
Something that I didn't think of (it seems a bit obvious now) and that was brought to my attention: you can create a small program that loads your plugin and checks that it gets valid function pointers for all of the functions you want your plugin to have. Then incorporate this into your build system, be it a makefile, a script, or whatever. This has the benefit that you are no longer limited to a particular compiler to make this work, and you can also do more sophisticated checks for other things. The only downside is that it takes a little more work to set up.
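Such a checker could be sketched roughly as follows, assuming a POSIX system with dlfcn.h. The function name count_missing_symbols is hypothetical. Note that dlsym() returning NULL is treated as "missing" here, which is a simplification, since a symbol's value may legitimately be NULL.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns the number of names in `required` that the shared object at `path`
 * does not define, or -1 if the object cannot be loaded at all. */
int count_missing_symbols(const char *path,
                          const char *const required[], size_t n)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    int missing = 0;

    if (handle == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (dlsym(handle, required[i]) == NULL)
            missing++;
    }
    dlclose(handle);
    return missing;
}
```

A build step could call this with the names funcA and funcB for each freshly built .so and fail the build on any non-zero result. On some systems the program needs to be linked with -ldl.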
I recently had to face a fairly complex issue regarding lib management, but I would be very surprised to be the first one.
Let's imagine you are creating a library (static or dynamic) called lib1 in C. Inside lib1 are a few functions that are exposed through an API, and a few other ones which remain private.
Ideally, the private functions would be static. Unfortunately, let's assume one of the source files, called extmod.c, comes from another project, and it would be beneficial to keep it unmodified. Therefore, it becomes impractical to make the functions it defines static.
As a consequence, all the functions defined in extmod are present in lib1's ABI, but not in its API, since the relevant *.h is not shipped. So no one notices.
Unfortunately, at later stage, someone wants to link both lib1 and another lib2 which also includes extmod. It results in a linking error, due to duplicate definitions.
In C++, the answer to this problem would be a simple namespace. In C, we are less lucky.
There are a few solutions to this problem, but I would like to find out whether anyone has found an efficient and non-invasive way. By "non-invasive", I mean a method which avoids, if possible, modifying extmod.c.
Among the possible workarounds, one is to change all definitions in extmod.c to use a different prefix, effectively emulating a namespace. Another is to put the content of extmod.c into extmod.h and make everything static. Both methods modify extmod extensively, though...
Note that I've looked at this previous answer, but it doesn't address this specific concern.
You could implement your 'different prefix' solution by excluding extmod.c from your build and instead treating it as a header file, in a way. Use the C pre-processor to effectively modify the file without actually modifying it. For example, if extmod.c contains:
void print_hello()
{
printf("hello!");
}
Exclude this file from your build and add one called ns_extmod.c. The content of this file should look like this:
#define print_hello ns_print_hello
#include "extmod.c"
On compilation, print_hello will be renamed to ns_print_hello by the C pre-processor but the original file will remain intact.
Alternatively, IF AND ONLY IF the functions are not called internally by extmod.c, it might work to use the preprocessor to make them static in the same way:
#define print_hello static print_hello
#include "extmod.c"
This should work for you assuming you have control over the build process.
One way you can do prefixing without actually editing extmod.c is as follows:
Create a new header file extmod_prefix.h as:
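#ifndef EXTMOD_PREFIX_H
#define EXTMOD_PREFIX_H
#ifdef LIB1
#define PREFIX lib1_
#else
#ifdef LIB2
#define PREFIX lib2_
#endif
#endif
/* two-level paste so that PREFIX is expanded before ## is applied */
#define EXTMOD_PASTE(p, f) p ## f
#define EXTMOD_EXPAND(p, f) EXTMOD_PASTE(p, f)
#define function_in_extmod EXTMOD_EXPAND(PREFIX, function_in_extmod)
/* Do this for all the functions in extmod.c */
#endif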
Include this file in extmod.h and define LIB1 in lib1's build process and LIB2 in lib2.
This way, all the functions in extmod.c will be prefixed by lib1_ in lib1 and lib2_ in lib2.
Here's the answer (in the form of a question). The relevant portion:
objcopy --prefix-symbols allows me to prefix all symbols exported by
an object file / static library.
First of all, I've been searching for an answer here and I haven't been able to find one. If this question is really a duplicate, please redirect me to the right answer and I'll delete it right away. My problem is that I'm making a C library that declares a few unimplemented functions in its .h file, which need to be implemented in the main.c that uses the library. However, there is an implemented function in the library that calls them. My makefile for this library gives me "undefined reference to" for every function that's not implemented, so when I try to link the .o files with the main.c file that does have those implementations I can't, because the library itself could not be built due to these errors.
My question is, are there any flags that I could put in the makefile so that it will ignore the unimplemented headers or look for them once the library is linked?
This is a very old-fashioned way of writing a library (but I've worked on code written like that). It does not work well with shared libraries, as you are now discovering.
If you can change the library design
Your best bet is to rearrange the code so that the 'missing functions' are specified as callbacks in some initialization function. For example, you might currently have a header a bit like:
#ifndef HEADER_H_INCLUDED
#define HEADER_H_INCLUDED
extern int implemented_function(int);
extern int missing_function(int);
#endif
I'm assuming that your library contains implemented_function() but one of the functions in the library makes a call to missing_function(), which the user's application should provide.
You should consider restructuring your library along the lines of:
#ifndef HEADER_H_INCLUDED
#define HEADER_H_INCLUDED
typedef int (*IntegerFunction)(int);
extern int implemented_function(int);
extern IntegerFunction set_callback(IntegerFunction);
#endif
Your library code would have:
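#include <stdio.h>
#include <stdlib.h>
#include "header.h"
static IntegerFunction callback = 0;
/* Install a new callback and return the previous one. */
IntegerFunction set_callback(IntegerFunction new_callback)
{
IntegerFunction old_callback = callback;
callback = new_callback;
return old_callback;
}
static int internal_function(int x)
{
if (callback == 0)
{
/* major error: callback not set yet */
fprintf(stderr, "internal_function: callback not set\n");
abort();
}
return (*callback)(x);
}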
(or you can use return callback(x); instead; I use the old school notation for clarity.) Your application would then contain:
#include "header.h"
static int missing_function(int x);
int some_function(int y)
{
set_callback(missing_function);
return implemented_function(y);
}
An alternative to using a function like set_callback() is to pass the missing_function as a pointer to any function that ends up calling it. Whether that's reasonable depends on how widely used the missing function is.
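The pointer-passing alternative could be sketched like this. The name implemented_function_with, the -1 error convention, and the arithmetic are all placeholders chosen for illustration; only the IntegerFunction type and the missing_function and some_function names come from the code above.

```c
#include <stddef.h>

typedef int (*IntegerFunction)(int);

/* Library side: the caller supplies the "missing" function on each call. */
int implemented_function_with(int x, IntegerFunction missing)
{
    if (missing == NULL)
        return -1;              /* hypothetical error convention */
    return missing(x) + 1;      /* placeholder for the library's real work */
}

/* Application side: the function the library could not provide itself. */
int missing_function(int x)
{
    return 2 * x;
}

int some_function(int y)
{
    return implemented_function_with(y, missing_function);
}
```

This avoids the hidden global state of set_callback(), at the cost of an extra parameter on every function that ends up calling the missing function.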
If you can't change the library design
If that is simply not feasible, then you are going to have to find the platform-specific options to the code that builds shared libraries so that the missing references do not cause build errors. The details vary widely between platforms; what works on Linux won't work on AIX and vice versa. So you will need to clarify your question to specify where you need the solution to work.
(I found this question which is similar but not a duplicate:
How to check validity of header file in C programming language )
I have a function implementation, and a non-matching prototype (same name, different types) which is in a header file. The header file is included by a C file that uses the function, but is not included in the file that defines the function.
Here is a minimal test case :
header.h:
void foo(int bar);
File1.c:
#include "header.h"
int main (int argc, char * argv[])
{
int x = 1;
foo(x);
return 0;
}
File2.c:
#include <stdio.h>
typedef struct {
int x;
int y;
} t_struct;
void foo (t_struct *p_bar)
{
printf("%x %x\n", p_bar->x, p_bar->y);
}
I can compile this with VS 2010 with no errors or warnings, but unsurprisingly it segfaults when I run it.
The compiler is fine with it (this I understand)
The linker did not catch it (this I was slightly surprised by)
The static analysis tool (Coverity) did not catch it (this I was very surprised by).
How can I catch these kinds of errors?
[Edit: I realise if I #include "header.h" in file2.c as well, the compiler will complain. But I have an enormous code base and it is not always possible or appropriate to guarantee that all headers where a function is prototyped are included in the implementation files.]
Have the same header file included in both file1.c and file2.c. This will pretty much prevent a conflicting prototype.
Otherwise, such a mistake cannot be detected by the compiler because the source code of the function is not visible to the compiler when it compiles file1.c. Rather, it can only trust the signature that has been given.
At least theoretically, the linker could be able to detect such a mismatch if additional metadata is stored in the object files, but I am not aware if this is practically possible.
Use -Werror-implicit-function-declaration, -Wmissing-prototypes, or the equivalent on one of your supported compilers. The compiler will then either error out or warn if a declaration does not precede the definition of a global function.
Compiling the programs in some form of strict C99 mode should also generate these messages. GCC, ICC, and Clang all support this feature (not sure about MS's C compiler and its current status, as VS 2005 or 2008 was the latest I've used for C).
You may use the Frama-C static analysis platform available at http://frama-c.com.
On your examples you would get:
$ frama-c 1.c 2.c
[kernel] preprocessing with "gcc -C -E -I. 1.c"
[kernel] preprocessing with "gcc -C -E -I. 2.c"
[kernel] user error: Incompatible declaration for foo:
different type constructors: int vs. t_struct *
First declaration was at header.h:1
Current declaration is at 2.c:8
[kernel] Frama-C aborted: invalid user input.
Hope this helps!
It looks like this is not possible with a C compiler because of the way function names are mapped to symbol names in object files (directly, without considering the actual signature).
But it is possible with C++, because C++ uses name mangling that depends on the function signature. So in C++, void foo(int) and void foo(t_struct*) will have different names at the link stage and the linker will raise an error about it.
Of course, it will not be easy to switch a huge C codebase to C++. But you can use a relatively simple workaround: add a single .cpp file to your project and include all the C files in it (ideally generating it with a script).
Taking your example and VS2010 I added TestCpp.cpp to project:
#include "stdafx.h"
namespace xxx
{
#include "File1.c"
#include "File2.c"
}
Result is linker error LNK2019:
TestCpp.obj : error LNK2019: unresolved external symbol "void __cdecl xxx::foo(int)" (?foo@xxx@@YAXH@Z) referenced in function "int __cdecl xxx::main(int,char * * const)" (?main@xxx@@YAHHQAPAD@Z)
W:\TestProjects\GenericTest\Debug\GenericTest.exe : fatal error LNK1120: 1 unresolved externals
Of course, this will not be so easy for a huge codebase; there can be other problems leading to compilation errors that cannot be fixed without changing the codebase. You can partially mitigate this by guarding the .cpp file contents with a conditional #ifdef and using it only for periodic checks rather than for regular builds.
Every (non-static) function defined in every foo.c file should have a prototype in the corresponding foo.h file, and foo.c should have #include "foo.h". (main is the only exception.) foo.h should not contain prototypes for any functions not defined in foo.c.
Every function should be prototyped exactly once.
You can have .h files with no corresponding .c files if they don't contain any prototypes. The only .c file without a corresponding .h file should be the one containing main.
You already know this, and your problem is that you have a huge code base where this rule has not been followed.
So how do you get from here to there? Here's how I'd probably do it.
Step 1 (requires a single pass over your code base):
For each file foo.c, create a file foo.h if it doesn't already exist. Add #include "foo.h" near the top of foo.c. If you have a convention for where .h and .c files should live (either in the same directory or in parallel include and src directories), follow it; if not, try to introduce such a convention.
For each function in foo.c, copy its prototype to foo.h if it's not already there. Use copy-and-paste to ensure that everything stays consistent. (Parameter names are optional in prototypes and mandatory in definitions; I suggest keeping the names in both places.)
Do a full build and fix any problems that show up.
This won't catch all your problems. You could still have multiple prototypes for some functions. But you'll have caught any cases where two headers have inconsistent prototypes for the same function and both headers are included in the same translation unit.
Once everything builds cleanly, you should have a system that's at least as correct as what you started with.
Step 2:
For each file foo.h, delete any prototypes for functions that aren't defined in foo.c.
Do a full build and fix any problems that show up. If bar.c calls a function that's defined in foo.c, then bar.c needs a #include "foo.h".
For both of these steps, the "fix any problems that show up" phase is likely to be long and tedious.
If you can't afford to do all this at once, you can probably do a lot of it incrementally. Start with one or a few .c files, clean up their .h files, and remove any extra prototypes declared elsewhere.
Any time you find a case where a call uses an incorrect prototype, try to figure out the circumstances in which that call is executed, and how it causes your application to misbehave. Create a bug report and add a test to your regression test suite (you have one, right?). You can demonstrate to management that the test now passes because of all the work you've done; you really weren't just messing around.
Automated tools that can parse C are likely to be useful. Ira Baxter has some suggestions. ctags may also be useful. Depending on how your code is formatted, you can probably throw together some tools that don't require a full C parser. For example, you might use grep, sed, or perl to extract a list of function definitions from a foo.c file, then manually edit the list to remove false positives.
It's obvious ("I have a huge code base") you cannot do this by hand.
What you need is an automated tool that can read your source files as the compiler sees them, collect all function prototypes and definitions, and verify that all definitions/prototypes match. I doubt you'll find such a tool lying around.
Of course, this match must check the signature, and this requires something like the compiler's front end to compare the signatures.
Consider
typedef int T;
void foo(T x);
in one compilation unit, and
typedef float T;
void foo(T x);
in another. You can't just compare the signature "lines" for equality; you need something that can resolve the types when checking.
GCCXML may be able to help, if you are using a GCC dialect of C; it extracts top-level declarations from source files as XML chunks. I don't know if it will resolve typedefs, though. You obviously have to build (considerable) support to collect the definitions in a central place (a database) and compare them. Comparing XML documents for equivalents is at least reasonably straightforward, and pretty easy if they are formatted in a regular way. This is likely your easiest bet.
If that doesn't work, you need something that has a full C front end that you can customize. GCC is famously available, and famously hard to customize. Clang is available, and might be pressed into service for this, but AFAIK only works with GCC dialects.
Our DMS Software Reengineering Toolkit has C front ends (with full preprocessing capability) for many dialects of C (GCC, MS, GreenHills, ...) and builds symbol tables with complete type information. Using DMS you might be able (depending on the real scale of your application) to simply process all the compilation units, and build just the symbol tables for each compilation unit. Checking that symbol table entries "match" (are compatible according to compiler rules including using equivalent typedefs) is built-into the C front ends; all one needs to do is orchestrate the reading, and calling the match logic for all symbol table entries at global scope across the various compilation units.
Whether you do this with GCC, Clang, or DMS, it is a fair amount of work to cobble together a custom tool. So you have to decide how critical your need for fewer surprises is, compared to the energy needed to build such a custom tool.
I'm going through the source code of the "less" unix tool by Mark Nudelman, and the beginning of main.c has many of the following:
public int logfile = -1;
public int force_logfile = FALSE;
public char * namelogfile = NULL;
etc. in the global scope, before the definition of main(),
What does public mean in this context? And more important, where can I find this information by myself? I searched using countless query combinations, and could not find this information, or any thorough C reference.
In the file less.h is your answer:
#define public /* PUBLIC FUNCTION */
It seems like public is only used as a marker for public/global functions and variables.
When compiled, it is expanded to nothing.
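A minimal sketch of the effect, using two of the variables from the question: once public is defined as an empty macro, the declarations below are ordinary global definitions. The accessor get_logfile is a hypothetical addition for illustration and is not part of less.

```c
#include <stddef.h>

#define public   /* marker only: expands to nothing */

/* After preprocessing these read "int logfile = -1;" and so on. */
public int logfile = -1;
public char *namelogfile = NULL;

public int get_logfile(void)
{
    return logfile;
}
```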
How to find this information?
Search the .c file from top to the location of the identifier you want more information about
If you do not find any declaration, look for #include directives
Open any included file and look for the declaration of what you are looking for
Repeat from step two for every included file
In this case, that was pretty simple.
This has nothing to do with C as such. If you look in the include file less.h you will see that the author has defined a number of preprocessor macros. Some of them, like 'public', are most likely there for readability. E.g.:
/*
* Language details.
*/
#if HAVE_VOID
#define VOID_POINTER void *
#else
#define VOID_POINTER char *
#define void int
#endif
#if HAVE_CONST
#define constant const
#else
#define constant
#endif
#define public /* PUBLIC FUNCTION */
See how public is defined. It's translated to nothing and, as you have already figured out, it's in the global scope. However, it's more readable and more obvious that it's in the global scope. Also, one could argue that if the source is written consistently like this and a new version of C emerges that does have a public keyword, it's a matter of redefining the macro in the header file and recompiling to actually use it.
Preprocessing tricks like this can even be used in clever ways to have one source compile in different languages (like C++ and Java). This is not something you should be doing, but it's possible to do it.
The options like HAVE_VOID you see in the example from less.h above are usually specified as compiler (actually preprocessor) options at compile time. So if you have a compiler and a version of C that supports the void keyword you would compile your source with:
g++ -g -DHAVE_VOID -Wall myprog.C -o myprog
Everywhere the author uses VOID_POINTER in the source would then actually be considered by the compiler as:
void *
If you didn't specify HAVE_VOID the compiler would instead use
char *
which is a reasonable substitute.
TIP: Check your compiler's options to see if you have an option to just preprocess your sources. That way you can look at the actual source that gets sent to the compiler.
C doesn't have a keyword "public", so it's probably a macro defined in the less source code somewhere.
The definition of public as an empty pre-processor macro has been addressed in other answers. To find the definition, you probably want to use a tool like ctags/etags or cscope. (There are many tools to scan a source tree to generate this information.) For example, you can find the definition of public at line 55 of less.h by invoking:
$ ctags -dtw *.c *.h
$ vi -t public
Or, simply run ctags before you start editing anything. When you see a definition you don't understand, put the cursor on it and type ^] (that's control-right square bracket, and will work in vi-like editors.)