Thread-safe init of read-only global data - c

Let's imagine that I'm writing a library that has a reasonably large amount of read-only global data that needs to be initialized before the library can be used. For example, the global data might be lookup tables for various parts of the application logic that won't change during the lifetime of the program.
Now I have a few ways to initialize this data:
1. I may require that the user call some kind of init() function before the library is used.
2. I may lazily construct the data the first time a function is called on my library.
3. I may include the data in an initializer statement in the source, such that variables are statically initialized to their final value.
Now if my data is read-only and should be the same for every environment the library runs in, then (3) is fairly appealing. Even in that case it has some downsides: if the data is very large (but easy to generate procedurally) the size of the binary will bloat up a lot (e.g., a library with 50K of code but 8MB of lookup tables would end up around 8050K). Similarly, the source itself may be very large, or the build system needs to handle the generation of the source at compile time.
The main reason you might not be able to use (3) is that the tables might be fixed (read-only), but require generation at runtime because they embed some information about the environment (e.g., the value of an environment variable, a configuration setting read from a file, information about the machine architecture, whatever). This data can't be embedded in the source since it depends on the runtime environment.
So we have methods (1) and (2) at least - but I can't see how to make these thread-safe in a simple way. The rest of the library can be thread-safe simply by not mutating any global state - just like the vast majority of C functions can be written in a thread-safe way w/o any explicit use of threading primitives.
I can't figure out a similar alternative for this global init, however:
(1) Is undesirable because we prefer not to require the user to call this method, and in any case it simply moves the problem up to the calling code: the calling code then needs to organize to call this init() method exactly once across all threads using the library, and before any thread uses the library.
(2) Fails since concurrent calls to the library might do a double init.
In C++ you can just initialize globals with a method call, like int data[] = loadData(). Is there any equivalent in C? Or am I stuck using threading primitives (which vary by platform, e.g., pthread_once, call_once and whatever Windows has) just to get my thread-safe init?

I don't know of any platform-independent way of initializing a library in a thread-safe manner. That's not surprising since there's no platform-independent threading model in C.
So your solution is going to be platform-specific.
@ThingyWotsit mentions in the comments using C++ to initialize your library, and that will be thread-safe. But it may very well lock you into a specific C++ run-time, so it may not be a useful solution for your C shared object/library. You may not be willing or able to add a dependency on C++ and you may especially not be willing or able to be locked into a specific C++ run-time.
For GCC, you can use __attribute__((constructor)) to have your initialization function called when the shared object is loaded:
constructor
destructor
constructor (priority)
destructor (priority)
The constructor attribute causes the function to be called automatically before execution enters main ().
Similarly, the destructor attribute causes the function to be called
automatically after main () has completed or exit () has been called.
Functions with these attributes are useful for initializing data that
will be used implicitly during the execution of the program.
You may provide an optional integer priority to control the order in
which constructor and destructor functions are run. A constructor with
a smaller priority number runs before a constructor with a larger
priority number; the opposite relationship holds for destructors. So,
if you have a constructor that allocates a resource and a destructor
that deallocates the same resource, both functions typically have the
same priority. The priorities for constructor and destructor functions
are the same as those specified for namespace-scope C++ objects (see
C++ Attributes).
For example:
static __attribute__((constructor)) void my_lib_init_func( void )
{
...
}
Your code will run before main() is called.
If your library is dynamically loaded (an explicit call to dlopen(), for example), your init function will be called when your library is loaded, and your library won't be considered loaded until it returns.
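As a rough sketch of how the constructor approach might look for the lookup-table case from the question, assuming GCC or Clang: the table, its contents, and the names square_table, table_ready and mylib_square are made up for illustration; only my_lib_init_func comes from the snippet above.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 256

/* Hypothetical read-only lookup table, filled in once at load time. */
static unsigned square_table[TABLE_SIZE];
static int table_ready;

/* On GCC and Clang this runs before main(), or before dlopen() returns. */
static __attribute__((constructor)) void my_lib_init_func(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        square_table[i] = (unsigned)(i * i);
    table_ready = 1;
}

/* Hypothetical entry point: it only reads the table, so it needs no locking. */
unsigned mylib_square(unsigned char x)
{
    /* Catches calls made from another constructor that ran earlier. */
    assert(table_ready);
    return square_table[x];
}
```

Because the constructor finishes before any user thread can call into the library, the readers never need a lock. The assert is only there to flag the one ordering hazard that remains, which is another constructor calling in too early.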
Other compilers provide the functionally-identical #pragma init():
#pragma init(my_lib_init_func)
static void my_lib_init_func( void )
{
...
}
See #pragma init and #pragma fini using gcc compiler on linux
For Windows? The Windows C++ run-time is pretty stable and ubiquitous. I'd just use a C++ solution on Windows, especially if you're compiling with MSVC. (But see the comments...)

Option 3 is always preferable when possible. Your reasoning about the cons is wrong. If you have an 8MB constant table in the executable file, it's directly mapped and shared by all instances of the program or users of the shared library on any remotely modern operating system. If you generate it at runtime, each process will have its own copy of the table.
When option 3 is not available you must use pthread_once or equivalent or implement your own version of the same (much less efficiently) using a lock. There is little reason to use weird OS-specific replacements for it; all major platforms either support POSIX threads API natively or have existing libraries which provide it on top of the platform's low-level primitives.
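A minimal sketch of the pthread_once route, assuming a POSIX threads implementation is available. The table contents and the names mylib_lookup, build_table, init_calls and demo_concurrent_init are hypothetical and exist only to illustrate the pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TABLE_SIZE 256

static unsigned table[TABLE_SIZE];              /* hypothetical lookup table */
static pthread_once_t table_once = PTHREAD_ONCE_INIT;
static int init_calls;                          /* counts initializer runs, for illustration */

static void build_table(void)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = (unsigned)(i * 2 + 1);
    init_calls++;
}

/* Hypothetical public function: every entry point goes through pthread_once. */
unsigned mylib_lookup(unsigned char idx)
{
    int rc = pthread_once(&table_once, build_table);
    assert(rc == 0);
    (void)rc;
    return table[idx];
}

static void *worker(void *arg)
{
    (void)arg;
    (void)mylib_lookup(3);
    return NULL;
}

/* Calls the library from several threads and reports how often the
   initializer actually ran. */
int demo_concurrent_init(void)
{
    pthread_t t[4];
    int started = 0;
    for (int i = 0; i < 4; i++)
        if (pthread_create(&t[started], NULL, worker, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return init_calls;
}
```

The caller never sees an init() function: the first call from any thread builds the table, and concurrent callers block inside pthread_once until it is ready.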

Related

Providing external routines from a C library in a threadsafe manner

I have a c-library wrapped around a fortran library that I want to use in OCaml. The obvious solution is to map the c-interface into ocaml routines using some handwritten code to deal with GC.
However, it turns out that the algorithm implemented by the fortran library gets its inputs as EXTERNAL routines, i.e.:
EXTERNAL RHS
This means that the input is essentially passed by the linker. The C-wrapper has a nice interface collecting all required input in a struct, but essentially provides one global instance of that struct and then defines all the missing external routines in terms of that global instance.
As a functional programmer, this smells like an antipattern to me. Since I do not want to rewrite the fortran code, my question is:
Is there a safe, idiomatic way to link the Fortran library and avoid global state clashes? Can the C library provide the global state of the Fortran library without rewriting the Fortran code?
If no such way exists, what is a good C11 (i.e. OS independent) idiom to protect the global state? I'd need a kind of global lock that only allows access through a key that is issued exactly once.
I just read about thread local declarations in C11, would that be an option?

When do I need a function to run before or after main()?

GCC supports constructor/destructor functions, which allow running a function before or after main():
The constructor attribute causes the function to be called automatically before execution enters main(). Similarly, the destructor attribute causes the function to be called automatically after main() completes or exit() is called. Functions with these attributes are useful for initializing data that is used implicitly during the execution of the program.
Here is an example from GeeksforGeeks.
When is the proper scenario of using this feature? Especially a function to be called before main(), what is the difference if we just place it in start of main()?
Such constructor and destructor functions are mainly useful when writing libraries.
If you are writing a library which needs to be initialised, then you would have to provide an initialisation function. But how would you ensure that it is run before any other of your library's functions? The user of the library would have to remember to call it, which they could easily forget.
One way to get the initialisation done automatically is to mark the function as a constructor.
See also: How to initialize a shared library on Linux
For the majority of scenarios there will be no difference. Everything that you want to do with global variables, singletons, memory, etc, you could theoretically do in main() and with plain static initializers.
The main scenario where this is marginally applicable is cross platform projects, where you would like to keep most of your common code in main, however on some platforms, mainly embedded ones, you would like to duplicate what the other OSes are doing before main - setting up environment variables, wiring standard file descriptors (stdin/stdout/stderr) to custom descriptors on your system, allocate your own custom memory manager - e.g., allocate your own stack for running main(), and so on.
From my point of view, module constructors are most meaningful when building shared modules.
Shared modules don't have a specific initialization routine (there is DllMain on Windows, but it has its limitations).
For example, Asterisk PBX makes heavy use of constructors because it is strongly based on modules; it injects a constructor into each module at compilation time.
This constructor gets called on dlopen() and tells the Asterisk core whether the module has been loaded properly or not, allowing it to call specific functions on the module.
Suppose you have a global structure and you want to allocate memory for the structure before starting your program: you can do that inside the constructor, since it is called before main().
Similarly, if you want to free any allocated memory before the end of the program, you can do so in the destructor.
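The allocate-in-constructor, free-in-destructor idea might look roughly like this, assuming GCC or Clang. The struct config type and the names g_config, config_setup, config_teardown and config_max_items are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical global structure that must exist before main() starts. */
struct config {
    int verbose;
    int max_items;
};

static struct config *g_config;

/* Runs before main(): allocate and fill in the structure. */
static __attribute__((constructor)) void config_setup(void)
{
    g_config = calloc(1, sizeof *g_config);
    if (g_config != NULL)
        g_config->max_items = 64;
}

/* Runs after main() returns or exit() is called: release the structure. */
static __attribute__((destructor)) void config_teardown(void)
{
    free(g_config);
    g_config = NULL;
}

/* Hypothetical accessor used by the rest of the program. */
int config_max_items(void)
{
    assert(g_config != NULL);   /* the constructor must have succeeded */
    return g_config->max_items;
}
```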

__attribute__((constructor)) && __attribute__((destructor)) in multithreaded app

I have an app that I am currently writing in C, where I have several TLS static global variables declared inside a library which is part of the project.
The TLS variables are declared using gcc's __thread directive.
I would like to know if I can use .ctor && .dtor sections to initialise TLS data on a per thread basis inside a shared or static C library, and how thread safe using this method is.
Will the .ctor && .dtor sections be executed per thread or they exist only in the parent process?
On a final note, the library compiles either statically or dynamically into the application code. Does this mean the .ctor && .dtor sections declared in the shared/static library will be part of the final executable?
I am really confused about the threading part mostly ... anyone who has an idea?
Try it and see what happens, but it's best not to rely on behaviour you can't find defined in the manual - it's liable to change without notice.
As far as I know, __attribute__((constructor)) applies only to global data running at load time. Trying to mix that with TLS might be undefined, or might only initialise the data for the master thread.
pthread_key_create creates tls entry with destructor callback;
example usage is here: http://linux.die.net/man/3/pthread_key_create
However you have to set the tls variable in a thread, otherwise the destructor callback is not called.
No, there is no call back that you could activate on the launching of threads. This would be a performance killer, I think: any naive programmer could add such a callback by accident and all of the sudden every thread of the program, even those that don't access that TLS would slow down.
For gcc's __thread, as well as for the corresponding C11 feature _Thread_local, only static initialization is foreseen. That is, every thread's copy starts with the same value, and that value must be determined at compile time.
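Since there is no per-thread constructor, one common workaround is to create the per-thread data lazily on first use and let a pthread_key_create destructor clean it up at thread exit. The following is only a sketch under that assumption; thread_counter, free_counter, destroyed and demo_thread_exit_cleanup are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t counter_key;
static pthread_once_t counter_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t destroyed_lock = PTHREAD_MUTEX_INITIALIZER;
static int destroyed;                 /* counts destructor calls, for illustration */

/* Called at thread exit, but only for threads that set a non-NULL value. */
static void free_counter(void *p)
{
    free(p);
    pthread_mutex_lock(&destroyed_lock);
    destroyed++;
    pthread_mutex_unlock(&destroyed_lock);
}

static void make_key(void)
{
    int rc = pthread_key_create(&counter_key, free_counter);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical accessor: lazily creates the calling thread's counter. */
int *thread_counter(void)
{
    pthread_once(&counter_key_once, make_key);
    int *p = pthread_getspecific(counter_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p != NULL && pthread_setspecific(counter_key, p) != 0) {
            free(p);
            p = NULL;
        }
    }
    return p;
}

static void *worker(void *arg)
{
    (void)arg;
    int *c = thread_counter();
    if (c != NULL)
        ++*c;                         /* touches only this thread's copy */
    return NULL;
}

/* Starts a few threads and returns how many destructor calls were made. */
int demo_thread_exit_cleanup(void)
{
    pthread_t t[3];
    int started = 0;
    for (int i = 0; i < 3; i++)
        if (pthread_create(&t[started], NULL, worker, NULL) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return destroyed;
}
```

Threads that never call thread_counter() pay nothing, which matches the point above that the destructor only fires for threads that actually set the value.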

C Shared library: static variable initialization + global variable visibility among processes

I want to modify an existing shared library so that it uses different memory management routines depending on the application using the shared library.
(For now) there will be two families of memory management routines:
The standard malloc, calloc etc functions
specialized versions of malloc, calloc etc
I have come up with a potential way of solving this problem (with the help of some people here on SO). There are still a few grey areas and I would like some feedback on my proposal so far.
This is how I intend to implement the modification:
Replace existing calls to malloc/calloc etc with my_malloc/my_calloc etc. These new functions will invoke correctly assigned function pointers instead of calling hard coded function names.
Provide a mechanism for the shared library to initialize the function pointers used by my_malloc etc to point to the standard C memory mgmt routines - this allows me to provide backward compatibility to applications which depend on this shared library - so they don't have to be modified as well. In C++, I could have done this by using static variable initialization (for example) - I'm not sure if the same 'pattern' can be used in C.
Introduce a new idempotent function, initAPI(type), which is called (at startup) by applications that need to use different mem mgmt routines in the shared library. The initAPI() function assigns the memory mgmt func ptrs to the appropriate functions.
Clearly, it would be preferable if I could restrict who could call initAPI() or when it was called - for example, the function should NOT be called after API calls have been made to the library - as this will change the memory mgmt routines. So I would like to restrict where it is called and by whom. This is an access problem which can be solved by making the method private in C++, I am not sure how to do this in C.
The problems in 2 and 3 above can be trivially resolved in C++, however I am constrained to using C, so I would like to solve these issues in C.
Finally, assuming that the function pointers can be correctly set during initialisation as described above - I have a second question, regarding the visibility of global variables in a shared library, across different processes using the shared library. The function pointers will be implemented as global variables (I'm not too concerned about thread safety FOR NOW - although I envisage wrapping access with mutex locking at some point)* and each application using the shared library should not interfere with the memory management routines used for another application using the shared library.
I suspect that it is code (not data) that is shared between processes using a shlib - however, I would like that confirmed - preferably, with a link that backs up that assertion.
*Note: if I am naively downplaying threading issues that may occur in the future as a result of the 'architecture' I described above, someone please alert me!..
BTW, I am building the library on Linux (Ubuntu)
Since I'm not entirely sure what the question being asked is, I will try to provide information that may be of use.
Since you've indicated C and Linux, it is probably safe to assume you are also using the GNU toolchain.
GCC provides a constructor function attribute that causes a function to be called automatically before execution enters main(). You could use this to better control when your library initialization routine, initAPI() is called.
void __attribute__ ((constructor)) initAPI(void);
In the case of library initialization, constructor routines are executed before dlopen() returns if the library is loaded at runtime or before main() is started if the library is loaded at load time.
The GNU linker has a --wrap <symbol> option which allows you to provide wrappers for system functions.
If you link with --wrap malloc, references to malloc() will redirect to __wrap_malloc() (which you implement), and references to __real_malloc() will redirect to the original malloc() (so you can call it from within your wrapper implementation).
Instead of using the --wrap malloc option to provide a reference to the original malloc() you could also dynamically load a pointer to the original malloc() using dlsym(). You cannot directly call the original malloc() from the wrapper because it will be interpreted as a recursive call to the wrapper itself.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <dlfcn.h>
void * malloc(size_t size) {
    static void * (*func)(size_t) = NULL;
    void * ret;
    if (!func) {
        /* get reference to original (libc provided) malloc */
        func = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");
    }
    /* code to execute before calling malloc */
    ...
    /* call original malloc */
    ret = func(size);
    /* code to execute after calling malloc */
    ...
    return ret;
}
I suggest reading Jay Conrod's blog post entitled Tutorial: Function Interposition in Linux for additional information on replacing calls to functions in dynamic libraries with calls to your own wrapper functions.
-1 for the lack of concrete questions. The text is long, could have been written more succinctly, and it does not contain a single question mark.
Now to address your problems:
Static data (what you call "global variables") of a shared library is per-process. Your global variables in one process will not interfere with global variables in another process. No need for mutexes.
In C, you cannot restrict[1] who can call a function. It can be called by anybody who knows its name or has a pointer to it. You can code initAPI() such that it visibly aborts the program (crashes it) if it is not the first library function called. You are the library writer, you set the rules of the game, and you have NO obligation towards coders who do not respect the rules.
[1] You can declare the function with static, meaning it can be called by name only by the code within the same translation unit; it can still be called through a pointer by anybody who manages to obtain a pointer to it. Such functions are not "exported" from libraries, so this is not applicable to your scenario.
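One way the "initAPI() must come first" rule could be enforced is sketched below. It is an illustration under several assumptions: the question writes initAPI(type), whereas this sketch takes function pointers; it returns -1 rather than aborting so the behaviour is easy to observe (a stricter library could call abort() at that point); and, like the design in the question, it is not thread-safe. The names malloc_fn, free_fn, api_used, counted and counting_malloc are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*malloc_fn)(size_t);
typedef void (*free_fn)(void *);

/* Default to the standard routines so existing applications keep working. */
static malloc_fn cur_malloc = malloc;
static free_fn cur_free = free;
static int api_used;    /* set once any allocation has gone through the library */

void *my_malloc(size_t n)
{
    api_used = 1;
    return cur_malloc(n);
}

void my_free(void *p)
{
    api_used = 1;
    cur_free(p);
}

/* Returns 0 on success, or -1 if the library has already been used.
   A stricter library could call abort() instead of returning -1. */
int initAPI(malloc_fn m, free_fn f)
{
    assert(m != NULL && f != NULL);
    if (api_used)
        return -1;
    cur_malloc = m;
    cur_free = f;
    return 0;
}

/* Example replacement allocator that simply counts its calls. */
size_t counted;

void *counting_malloc(size_t n)
{
    counted++;
    return malloc(n);
}
```

An application that wants the specialized routines would call initAPI(counting_malloc, free) at startup; one that never calls initAPI() silently gets malloc and free, which covers the backward-compatibility requirement.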
Achieving this:
(For now) there will be two families of memory management routines:
The standard malloc, calloc etc functions
specialized versions of malloc, calloc etc
with dynamic libraries on Linux is trivial, and does not require the complicated scheme you have concocted (nor the LD_PRELOAD or dlopen suggested by @ugoren).
When you want to provide specialized versions of malloc and friends, simply link these routines into your main executable. Voila: your existing shared library will pick them up from there, no modifications required.
You could also build specialized malloc into e.g. libmymalloc.so, and put that library on the link line before libc, to achieve the same result.
The dynamic loader will use the first malloc it can see, and searches the list starting from the a.out, and proceeding to search other libraries in the same order they were listed on link command line.
UPDATE:
On further reflection, I don't think what you propose will work.
Yes, it will work (I use that functionality every day, by linking tcmalloc into my main executable).
When your shared library (the one providing an API) calls malloc "behind the scenes", which (of possibly several) malloc implementations does it get? The first one that is visible to the dynamic linker. If you link a malloc implementation into a.out, that will be the one.
It's easy enough for you to require that your initialization function is:
called from the main thread
that the client may call it exactly once
and that the client may provide the optional function pointers by parameter
If different applications run in separate processes, it's quite simple to do using dynamic libraries.
The library can simply call malloc() and free(), and applications that want to override them could load another library with alternative implementations of these functions.
This can be done with the LD_PRELOAD environment variable.
Or, if your library is loaded with dlopen(), just load the malloc library first.
This is basically what tools such as valgrind, which replace malloc, do.

Using __thread in c99

I would like to define a few variables as thread-specific using the __thread storage class. But three questions make me hesitate:
Is it really standard in c99? Or more to the point, how good is the compiler support?
Will the variables be initialised in every thread?
Do non-multi threaded programs treat them as plain-old-globals?
To answer your specific questions:
No, it is not part of C99. You will not find it mentioned anywhere in the n1256.pdf (C99+TC1/2/3) or the original C99 standard.
Yes, __thread variables start out with their initialized value in every new thread.
From a standpoint of program behavior, thread-local storage class variables behave pretty much the same as plain globals in non-multi-threaded programs. However, they do incur a bit more runtime cost (memory and startup time), and there can be issues with limits on the size and number of thread-local variables. All this is rather complicated and varies depending on whether your program is static- or dynamic-linked and whether the variables reside in the main program or a shared library...
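The second answer (every new thread starts from the initialized value) can be checked with a small experiment. This is a sketch assuming GCC or Clang with POSIX threads; counter, worker and demo_thread_local are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Each thread gets its own copy, and every copy starts at 42. */
static __thread int counter = 42;

static void *worker(void *arg)
{
    int *seen = arg;
    *seen = counter;    /* the value this thread starts with */
    counter += 100;     /* modifies only this thread's copy */
    return NULL;
}

/* Returns 1 if both new threads saw the initial value and the calling
   thread's copy was left untouched by them. */
int demo_thread_local(void)
{
    int seen[2] = {0, 0};
    pthread_t t[2];

    counter = 7;        /* change the calling thread's copy first */
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, worker, &seen[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    return seen[0] == 42 && seen[1] == 42 && counter == 7;
}
```

New threads do not inherit the creating thread's current value (7 here); they start again from the static initializer.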
Outside of implementing C/POSIX (e.g. errno, etc.), thread-local storage class is actually not very useful, in my opinion. It's pretty much a crutch for avoiding cleanly passing around the necessary state in the form of a context pointer or similar. You might think it could be useful for getting around broken interfaces like qsort that don't take a context pointer, but unfortunately there is no guarantee that qsort will call the comparison function in the same thread that called qsort. It might break the job down and run it in multiple threads. Same goes for most other interfaces where this sort of workaround would be possible.
You probably want to read this:
http://www.akkadia.org/drepper/tls.pdf
1) MSVC doesn't support C99. GCC does and other compilers attempt GCC compatibility.
edit A breakdown of compiler support for __thread is available here:
http://chtekk.longitekk.com/index.php?/archives/2011/02/C8.html
2) Only C++ supports an initializer and it must be constant.
3) Non-multi-threaded applications are single-threaded applications.
