Keep functions private inside a lib in C

I recently had to face a fairly complex issue regarding library management, though I would be very surprised if I were the first to run into it.
Let's imagine you are creating a library (static or dynamic) called lib1 in C. Inside lib1, a few functions are exposed through an API, and a few others remain private.
Ideally, the private functions would be static. Unfortunately, let's assume one of the source files, called extmod.c, comes from another project, and it would be beneficial to keep it unmodified. It therefore becomes impractical to make the functions it defines static.
As a consequence, all the functions defined in extmod are part of lib1's ABI but not its API, since the relevant *.h is not shipped. So no one notices.
Unfortunately, at a later stage, someone wants to link both lib1 and another library, lib2, which also includes extmod. This results in a link error due to duplicate definitions.
In C++, the answer to this problem would be a simple namespace. In C, we are less lucky.
There are a few solutions to this problem, but I would like to find out whether anyone has found an efficient and non-invasive way. By "non-invasive", I mean a method that avoids modifying extmod.c if possible.
Among the possible workarounds, one is to rename all definitions in extmod.c with a different prefix, effectively emulating a namespace. Another is to move the content of extmod.c into extmod.h and make everything static. Both methods modify extmod extensively, though...
Note that I've looked at this previous answer, but it doesn't address this specific concern.

You could implement your 'different prefix' solution by excluding extmod.c from your build and instead treating it, in a way, as a header file. Use the C preprocessor to effectively modify the file without actually modifying it. For example, if extmod.c contains:
void print_hello()
{
printf("hello!");
}
Exclude this file from your build and add one called ns_extmod.c. The content of this file should look like this:
#define print_hello ns_print_hello
#include "extmod.c"
On compilation, print_hello will be renamed to ns_print_hello by the C pre-processor but the original file will remain intact.
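Alternatively, if and only if the functions are not called internally by extmod.c, it might work to use the preprocessor to make them static in the same way (their callers must then live in the same translation unit, since static functions are invisible outside it):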
#define print_hello static print_hello
#include "extmod.c"
This should work for you assuming you have control over the build process.
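A minimal sketch of the renaming approach, collapsed into one listing so it can be read on its own. The block between the marker comments stands in for the text that #include "extmod.c" would bring in; print_twice and the int return types are assumptions added for illustration (the print_hello above returns void). It shows that calls inside extmod.c are renamed together with the definitions, which is why the prefix variant, unlike the static variant, tolerates internal calls.

```c
#include <stdio.h>

/* ns_extmod.c (hypothetical wrapper translation unit) */
#define print_hello ns_print_hello
#define print_twice ns_print_twice

/* ---- begin: text that `#include "extmod.c"` would bring in ---- */
int print_hello(void)
{
    return printf("hello!\n");
}

int print_twice(void)
{
    /* Internal call: renamed consistently with the definition above */
    return print_hello() + print_hello();
}
/* ---- end of included text ---- */
```

The rest of lib1 then calls ns_print_hello() and ns_print_twice(), and the object file no longer defines a symbol named print_hello.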

One way you can do prefixing without actually editing extmod.c is as follows:
Create a new header file extmod_prefix.h as:
#ifndef EXTMOD_PREFIX_H
#define EXTMOD_PREFIX_H
#ifdef LIB1
#define PREFIX lib1_
#else
#ifdef LIB2
#define PREFIX lib2_
#endif
#endif
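/* ## does not expand its operands, so paste through a helper to get PREFIX expanded first */
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)
#define function_in_extmod PASTE(PREFIX, function_in_extmod)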
/* Do this for all the functions in extmod.c */
#endif
Include this file in extmod.h and define LIB1 in lib1's build process and LIB2 in lib2.
This way, all the functions in extmod.c will be prefixed by lib1_ in lib1 and lib2_ in lib2.
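One detail worth checking when writing such a header: the ## operator does not macro-expand its operands, so pasting a prefix macro directly yields a name that literally starts with the macro's name rather than with lib1_ or lib2_. Below is a sketch of the usual two-level helper, end to end. EXTMOD_PREFIX, EXTMOD_PASTE, EXTMOD_NAME, extmod_add and lib1_sum3 are hypothetical names, LIB1 is defined inline only to keep the listing self-contained, and the marked function stands in for the contents of extmod.c.

```c
/* extmod_prefix.h (sketch): LIB1 or LIB2 would normally come from -DLIB1 / -DLIB2 */
#define LIB1 1   /* assumption: defined here only so the listing stands alone */

#ifdef LIB1
#define EXTMOD_PREFIX lib1_
#elif defined(LIB2)
#define EXTMOD_PREFIX lib2_
#else
#error "define LIB1 or LIB2"
#endif

/* Extra level of indirection so EXTMOD_PREFIX is expanded before pasting */
#define EXTMOD_PASTE_(a, b) a##b
#define EXTMOD_PASTE(a, b) EXTMOD_PASTE_(a, b)
#define EXTMOD_NAME(name) EXTMOD_PASTE(EXTMOD_PREFIX, name)

/* One line like this per function defined in extmod.c */
#define extmod_add EXTMOD_NAME(extmod_add)

/* ---- stand-in for the unmodified extmod.c ---- */
int extmod_add(int a, int b)
{
    return a + b;
}
/* ---- end of stand-in ---- */

/* lib1 code keeps using the short name; the symbol emitted is lib1_extmod_add */
int lib1_sum3(int a, int b, int c)
{
    return extmod_add(extmod_add(a, b), c);
}
```

Building the same sources with LIB2 defined instead would emit lib2_extmod_add, so the two libraries can be linked together.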

Here's the answer (in the form of a question). The relevant portion:
objcopy --prefix-symbols allows me to prefix all symbols exported by
an object file / static library.

Related

is it possible to have only header file in C without source file

I would like to write a C library with fast access by including just header files, without using a compiled library. To that end, I have put my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple definitions of the function test(). If I have a proper source file alongside my header, it works without error.
Is it possible to use only a header file, with the code inside it, in a C application?

Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and each of them includes your coded.h, then at link time there will be two test functions with global scope (one from file1.c and the other from file2.c).
You can use the keyword "static" to restrict the function so that it is visible only in the .c file that includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no solution apart from correcting your code, redistributing it to every project, and recompiling every project, missing the very point of a dynamic library: the ability to "just" fix the library without touching every program using it.

If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #included as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions: the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
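As a sketch of this first option, the header from the question could look as follows. Two details are assumptions rather than part of the original: the include guard is moved to the top so that it also covers the #include line, and test returns printf's result so that a caller can check it. Adding inline (C99) is optional; it is only a hint to the compiler, and GCC and Clang typically do not warn about unused static inline functions in files that include the header without calling test.

```c
/* test.h (header-only sketch) */
#ifndef INC_TEST_H_
#define INC_TEST_H_

#include <stdio.h>

/* static: each translation unit gets its own private copy, so no link clash */
static inline int test(void)
{
    return printf("hello\n");
}

#endif /* INC_TEST_H_ */
```

Any number of .c files can include this header and be linked together, at the cost of one copy of test per file that uses it.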
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated before substitution; they are just expanded in place, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
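For contrast with SQUARE, a static inline function evaluates its argument exactly once. This is a sketch limited to int (the macro works for any arithmetic type); square and z_after_square are hypothetical names, the latter existing only to make the single increment observable.

```c
/* Unlike the macro, the argument expression is evaluated once, before the call */
static inline int square(int x)
{
    return x * x;
}

/* Passes z++ to square() and returns the value z holds afterwards */
static inline int z_after_square(int z)
{
    (void)square(z++);   /* well defined: z is incremented exactly once */
    return z;
}
```

Here square(z++) with z equal to 3 yields 9 and leaves z at 4, whereas the macro version has no defined result.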
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.

My program does not compile because I have multiple references to the test() function
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will have scope only for the current compilation unit, so:
static void test()
{
printf("hello\n");
}

Hiding non-API symbols in library

Suppose I have a library foo which consists of the modules foo and util and has the following source tree:
foo/
foo.c
foo.h
util.c
util.h
The public API of the library is defined in foo.h and all global identifiers are properly prefixed with foo_ or util_. The module util is only used by foo. To prevent name clashes with other modules named util I want to create a (static) library in which only identifiers from module foo are visible. How can I do this?
Edit: I have searched the internet quite extensively but surprisingly this seems to be one of those unsolved problems in computer science.

There are probably other possible approaches, but here's one:
You might consider including the file util.c within foo.c and making all the util functions / globals static. i.e.:
#include "util.c"
// ...
This works the same way as with *.h files: it simply pulls the whole source into foo.c, nesting util.c and making all the static functions and data available.
When I do this, I rename the file to .inc (i.e. util.c => util.inc)...
#include "util.inc"
// ...
...it's an older convention I picked up somewhere, though it might conflict with assembler files, so you'll have to use your own discretion.
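A sketch of what foo.c amounts to once the preprocessor has pulled util in. util_clamp and foo_percent are hypothetical names, and the marked block stands in for the contents of util.inc. Note the limitation: the approach only works as described when every user of util lives in this one translation unit.

```c
/* foo.c (sketch) */

/* ---- stand-in for: #include "util.inc" ---- */
static int util_clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
/* ---- end of stand-in ---- */

/* Public API: the only external symbol this translation unit defines */
int foo_percent(int value)
{
    return util_clamp(value, 0, 100);
}
```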
EDIT
Another approach might require linker-specific directives. For example, this SO answer details how to use GNU ld to achieve this goal. There are other approaches as well, listed in that same thread.

The following is GCC-specific.
You can mark each utility function with
__attribute__((visibility ("hidden")))
which will prevent it from being linked to from another shared object.
You can apply this to a series of declarations by surrounding them with
#pragma GCC visibility push(hidden)
/* ... */
#pragma GCC visibility pop
or use -fvisibility=hidden when compiling the object, which applies to declarations without an explicit visibility (e.g. neither __attribute__((visibility)) nor #pragma GCC visibility).
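A sketch of the pragma form, with the contents of several files shown in one listing. util_double and foo_quadruple are hypothetical names. The assumption here is that a definition picks up the visibility of its earlier declaration, which is how GCC headers wrapped in push/pop are normally used; hidden symbols remain callable from anywhere inside the same shared object.

```c
/* util.h (sketch): everything declared between push and pop is hidden */
#pragma GCC visibility push(hidden)
int util_double(int x);
#pragma GCC visibility pop

/* foo.h (sketch): the public API keeps default visibility */
int foo_quadruple(int x);

/* util.c */
int util_double(int x)
{
    return 2 * x;
}

/* foo.c: free to call the hidden helper, since it is in the same shared object */
int foo_quadruple(int x)
{
    return util_double(util_double(x));
}
```

After building the shared object, listing its dynamic symbols (for example with nm -D on Linux) should show foo_quadruple but not util_double.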

Before each variable and function declaration in util.h, define a macro constant which renames the declared identifier by adding the library prefix foo_, for instance
#define util_x foo_util_x
extern int util_x;
#define util_f foo_util_f
void util_f(void);
...
With these definitions in place, no other parts of the code need to be changed, and all global symbols in the object file util.o will be prefixed with foo_. This means that name collisions are less likely to occur.

Removing functions included from a header from scope of the next files

In my project we make heavy use of a C header that provides an API to communicate with an external piece of software. Long story short, bugs in our project show up most often around calls to the functions declared in those headers (it is old and ugly legacy code).
I would like to add a level of indirection to calls to those functions, so that I can include some profiling before calling the actual implementation.
Because I'm not the only person working on this project, I would like to write those wrappers in such a way that using the original implementations directly causes a compile error.
If those headers were C++ sources, I could simply make a namespace, wrap the included files in it, and implement my functions using it (the other developers could still use the original implementation through the :: operator, but just not being able to call it directly is enough encapsulation for me). However, the headers are C sources (which I have to include with the extern "C" directive), so namespaces won't help me AFAIK.
I tried to play around with defines, but with no luck, like this:
#define my_func api_func
#define api_func NULL
What I wanted with the above code was for my_func to be translated to api_func during preprocessing, while a direct call to api_func would give a compile error. That doesn't work, because my_func ends up being translated to NULL too.
So, basically, I would like to make a wrapper and make sure the only way to access the API is through this wrapper (unless the other developers find some workaround, but that is inevitable).
Please note that I need to wrap hundreds of functions, which are called in many places throughout the code.
My wrapper will necessarily have to include those C headers, but I would like them to go out of scope outside my wrapper's file, so that they are unavailable to every other file that includes my wrapper. I guess this is not possible in C/C++, though.

You have several options, none of them wonderful.
If you have the sources of the legacy software, so that you can recompile it, you can just change the names of the API functions to make room for the wrapper functions. If you additionally make the original functions static and put the wrappers in the same source files, then you can ensure that the originals are called only via the wrappers. Example:
static int api_func_real(int arg);
int api_func(int arg) {
// ... instrumentation ...
int result = api_func_real(arg);
// ... instrumentation ...
return result;
}
static int api_func_real(int arg) {
// ...
}
The preprocessor can help you with that, but I hesitate to recommend specifics without any details to work with.
If you do not have sources for the legacy software, or if you are otherwise unwilling to modify it, then you need to make all the callers call your wrappers instead of the original functions. In this case you can modify the headers, or include an additional header ahead of them that uses #define to rename each of the original functions. That header must not be included in the source files containing the API function implementations, nor in those providing the wrapper function implementations. Each define would be of the form:
#define api_func api_func_wrapper
You would then implement the various api_func_wrapper() functions.
Among the ways those cases differ is that if you change the legacy function names, then internal calls among those functions will go through the wrappers bearing the original names (unless you change the calls, too), but if you implement wrappers with new names then they will be used only when called explicitly, which will not happen for internal calls within the legacy code (unless, again, you modify those calls).
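A sketch of the second option with one instrumented function, again with several files shown in one listing. The body of api_func, the call counter, api_func_call_count and twice_api are all assumptions made for illustration. The #define is placed after the legacy code and the wrapper, mirroring the rule that the redirect header must not be seen by either of them.

```c
/* ---- legacy code (stand-in; normally a separate, unmodified file) ---- */
int api_func(int arg)
{
    return arg + 1;
}

/* ---- wrapper.c: compiled WITHOUT the redirect, so api_func is the original ---- */
static unsigned long api_func_calls;

int api_func_wrapper(int arg)
{
    api_func_calls++;            /* instrumentation goes here */
    return api_func(arg);
}

unsigned long api_func_call_count(void)
{
    return api_func_calls;
}

/* ---- redirect header: seen only by ordinary callers ---- */
#define api_func api_func_wrapper

/* ---- caller.c: written against the original name ---- */
int twice_api(int x)
{
    return 2 * api_func(x);      /* actually calls api_func_wrapper */
}
```

Existing call sites need no edits, which matters when hundreds of functions are involved; the redirect header itself could be generated from the list of API names.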

You can do something like
[your wrapper's include file]
int origFunc1 (int x);
int origFunc2 (int x, int y);
/* Declare the wrappers unconditionally so that redirected callers see a prototype */
int wrappedFunc1(int x);
int wrappedFunc2(int x, int y);
#ifndef WRAPPER_IMPL
#define origFunc1 wrappedFunc1
#define origFunc2 wrappedFunc2
#endif
[your wrapper implementation]
#define WRAPPER_IMPL
#include <stdio.h>
#include "wrapper.h"
int wrappedFunc1 (int x) {
printf("Wrapper1 called\n");
return origFunc1(x);
}
Your wrapper's C file obviously needs to #define WRAPPER_IMPL before including the header.
That is neither nice nor clean (and if someone wants to cheat, they could simply define WRAPPER_IMPL), but it is at least a start.

There are two ways to wrap or override C functions in Linux:
Using LD_PRELOAD:
There is a shell environment variable in Linux called LD_PRELOAD,
which can be set to a path of a shared library,
and that library will be loaded before any other library (including glibc).
Using 'ld --wrap=symbol':
This makes the linker use a wrapper function for symbol.
Any undefined reference to symbol will be resolved to __wrap_symbol, and any undefined reference to __real_symbol will be resolved to symbol.
A complete write-up can be found at:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/

Hide dependent DLL functions

I am struggling to understand if I can prevent exposing certain function calls from within a DLL that I am building. The function calls I want to hide are calls that are exposed by sqlite3.dll which I am building into another DLL of my own making. sqlite3.dll exposes 5 functions, one of which looks like this in the header:
SQLITE_API int SQLITE_STDCALL sqlite3_close(sqlite3*);
The macros at play here are defined earlier as:
/*
** Provide the ability to override linkage features of the interface.
*/
#ifndef SQLITE_EXTERN
# define SQLITE_EXTERN extern
#endif
#ifndef SQLITE_API
# define SQLITE_API
#endif
#ifndef SQLITE_CDECL
# define SQLITE_CDECL
#endif
#ifndef SQLITE_STDCALL
# define SQLITE_STDCALL
#endif
Now, I am building sqlite3.dll into my application by linking against sqlite3.lib and including sqlite3.h (the source of the prior code snippets).
I realize I may be able to play with those macros to achieve what I want.
I expose the functions in my own dll with:
/* module entry point */
int __declspec(dllexport) __stdcall load_properties(CAObjHandle context);
When I look at the functions available in the output of my build, I get my functions plus 5 functions from the sqlite library. All of the exposed sqlite functions have a declaration structure similar to the one I showed for sqlite3_close() above.
Is there a way I can hide the sqlite functions? Is it the .lib file that is causing the issue? That file was auto-generated so I am not sure what is in there.

I discovered the answer. The sqlite3.dll was incorrectly specified as a source of exports to the compiler. Removing the directive to export functions from sqlite3.dll corrected the issue.

How can I share an array between the C files of a library, without the array being visible to the outside? [duplicate]

I am writing a C (shared) library. It started out as a single translation unit, in which I could define a couple of static global variables, to be hidden from external modules.
Now that the library has grown, I want to break the module into a couple of smaller source files. The problem is that now I have two options for the mentioned globals:
Have private copies at each source file and somehow sync their values via function calls - this will get very ugly very fast.
Remove the static definition, so the variables are shared across all translation units using extern - but now application code that is linked against the library can access these globals, if the required declaration is made there.
So, is there a neat way to make private global variables shared across multiple, specific translation units?

You want the visibility attribute extension of GCC.
Practically, something like:
#define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
(You probably want to #ifdef the above macros, using some configuration tricks à la autoconf and other autotools; on other systems you would just have empty definitions like #define PUBLIC_VISIBILITY /*empty*/ etc...)
Then, declare a variable:
int module_var MODULE_VISIBILITY;
or a function
void module_function (int) MODULE_VISIBILITY;
Then you can use module_var or call module_function inside your shared library, but not outside.
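A sketch of how this plays out for a variable shared between two source files of the library. module_counter and lib_next_id are hypothetical names, and the #if test is only one possible way to provide the empty fallback definitions mentioned above.

```c
/* visibility.h (sketch): fall back to empty macros where the attribute is unavailable */
#if defined(__GNUC__) && !defined(_WIN32)
#define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
#else
#define MODULE_VISIBILITY /* empty */
#define PUBLIC_VISIBILITY /* empty */
#endif

/* internal.h: shared by the library's own source files, never shipped */
extern int module_counter MODULE_VISIBILITY;

/* a.c: the single definition; it inherits the hidden visibility declared above */
int module_counter = 0;

/* b.c: another file of the same library, using the shared variable */
PUBLIC_VISIBILITY int lib_next_id(void)
{
    return ++module_counter;
}
```

An application linked against the resulting shared object can call lib_next_id, but an extern int module_counter; declaration in application code should fail to resolve at link time.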
See also the -fvisibility code generation option of GCC.
BTW, you could also compile your whole library with -Dsomeglobal=alongname3419a6 and use someglobal as usual; to really find it your user would need to pass the same preprocessor definition to the compiler, and you can make the name alongname3419a6 random and improbable enough to make the collision improbable.
PS. This visibility mechanism is specific to GCC (and probably to ELF shared libraries such as those on Linux). It won't work without GCC or outside of shared libraries, so it is quite Linux-specific (even if a few other systems, perhaps Solaris with GCC, have it). Some other compilers (such as Clang from LLVM) probably also support it on Linux for shared libraries (not static ones). Actually, the real hiding (of symbols shared among the several compilation units of a single shared library) is done mostly by the linker (because ELF shared libraries permit that).

The easiest ("old-school") solution is to simply not declare the variable in the intended public header.
Split your library's header into "header.h" and "header-internal.h", and declare internal stuff in the latter one.
Of course, you should also take care to protect your library-global variable's name so that it doesn't collide with user code; presumably you already have a prefix that you use for the functions for this purpose.
You can also wrap the variable(s) in a struct, to make it cleaner since then only one actual symbol is globally visible.
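A sketch of the struct variant, with the contents of several files shown in one listing. The mylib_ prefix, the struct members and the three API functions are hypothetical. Only mylib_state_ is shared between the library's source files, and it is declared solely in the internal header.

```c
/* header-internal.h (sketch, not shipped to users) */
struct mylib_state {
    int verbose;
    unsigned long calls;
};
extern struct mylib_state mylib_state_;   /* the single shared symbol */

/* state.c: the one definition */
struct mylib_state mylib_state_ = { 0, 0UL };

/* api.c: public functions, declared in the shipped header.h */
void mylib_set_verbose(int on)
{
    mylib_state_.verbose = (on != 0);
}

int mylib_is_verbose(void)
{
    return mylib_state_.verbose;
}

unsigned long mylib_ping(void)
{
    return ++mylib_state_.calls;
}
```

The symbol is still technically reachable by a user who declares it by hand; the prefix and the unshipped header only make that unlikely to happen by accident.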

You can obfuscate things with disguised structs, if you really want to hide the information as best as possible. e.g. in a header file,
struct data_s {
void *v;
};
And somewhere in your source:
struct data_s data;
struct gbs {
// declare all your globals here
} gbss;
and then:
data.v = &gbss;
You can then access all the globals via: ((struct gbs *)data.v)->

I know this is not literally what you intended, but you can leave the global variables static and divide them among multiple source files.
Put the functions that write to a given static variable in the same source file as that variable, and declare them static as well.
Declare functions that read the static variable so that the other source files of the same module can read its value.
In a way, this makes it less global. If possible, the best basis for breaking big files into smaller ones is the data.
If it is not possible to do it this way, then you can move all the global variables into one source file as static and access them from the other source files of the module through functions, making access official, so that if someone is manipulating your global variables, at least you know how.
But then it is probably better to use @unwind's method.
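A sketch of one such source file, organized around a single piece of data. The counter variable and all function names are hypothetical. The writer stays static, so only the operations the file chooses to expose can change the value.

```c
/* counter.c (sketch): the variable plus every function that writes to it */
static int counter;                    /* invisible outside this file */

static void counter_store(int value)   /* the writer stays private too */
{
    counter = value;
}

/* Operations offered to the module's other source files */
void counter_reset(void)
{
    counter_store(0);
}

void counter_bump(void)
{
    counter_store(counter + 1);
}

/* Read accessor */
int counter_read(void)
{
    return counter;
}
```

These accessor functions are themselves external symbols, so they benefit from the same prefixing or visibility treatment as any other internal function.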
