how to reset handlers registered by pthread_atfork - c

Some libraries might register some handlers with pthread_atfork(). I don't need them as I only use fork() together with exec(). Also, they can cause trouble in some cases. So, is there a way to reset the registered handler list?
Related: calling fork() without the atfork handlers, fork() async signal safety.

POSIX does not document any mechanism for fork handlers installed by pthread_atfork() to be removed, short of termination of the process or replacing the process image. If you don't want them, then don't install them. If they are installed by a third-party library, as you describe, then your options are to find a way to avoid that behavior of the library (possibly by avoiding the library altogether) or to live with it.
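To make the point concrete, here is a minimal sketch (an illustration only, not a workaround): once pthread_atfork() has accepted a set of handlers, they run on every later fork(), and there is no matching call to take them back. The names on_prepare, on_parent, install_demo_handlers and fork_once are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counters bumped by the demo handlers (hypothetical names). */
static int prepare_calls;
static int parent_calls;

static void on_prepare(void) { prepare_calls++; } /* runs in the parent before fork() */
static void on_parent(void)  { parent_calls++; }  /* runs in the parent after fork() */

/* Register the handlers; returns 0 on success, an error number otherwise. */
int install_demo_handlers(void)
{
    return pthread_atfork(on_prepare, on_parent, NULL);
}

/* Fork a child that exits at once; returns 0 on success, -1 on error. */
int fork_once(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}

int demo_prepare_calls(void) { return prepare_calls; }
int demo_parent_calls(void)  { return parent_calls; }
```

After one install_demo_handlers() call, every fork_once() bumps both counters, and nothing in the pthread API lets the caller stop that.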

Related

Signals and libraries

Are there any conventions or design patterns for using signals and signal handlers in library code? Because signals are directed to the whole process and not to a specific thread or library, I feel there may be some issues.
Let's say I'm writing a shared library which will be used by other applications, and I want to use the alarm and setitimer functions and trap the SIGALRM signal to do some processing at a specific time.
I see some problems with it:
1) If application code (which I have no control of) also uses SIGALRM and I install my own signal handler for it, this may overwrite the application's signal handler and thus disable its functionality. Of course I can make sure to call the previous signal handler (retrieved by the signal() function) from my own signal handler, but then there is still the reverse problem: application code can overwrite my signal handler and thus disable the functionality in my library.
2) Even worse than that, the application developer may link with another shared library from another vendor which also uses SIGALRM, and then neither I nor the application developer would have any control over it.
3) Calling alarm() or setitimer() will overwrite the previous timer used by the process, so the application could overwrite the timer I set in the library or vice versa.
I'm kind of a novice at this, so I'm wondering if there is already some convention for handling this? (For example, if all code is super nice, it would call the previous signal handler from its own signal handler and also structure the alarm code to honor previous timers before overwriting them with its own timer.)
Or should I avoid using signal handlers and alarm()s in a library altogether?
Yes. For the reasons you've identified, you can't depend on signal disposition for anything, unless you control all code in the application.
You could document that your library requires that the application not use SIGALRM, nor call alarm, but the application developer may not have any control over that anyway, and so it's in your best interest to avoid placing such restrictions in the first place.
If your library can work without SIGALRM (perhaps with reduced functionality), you can also make this feature optional, perhaps controlled by some environment variable. Then, if it is discovered that there is some code that interferes with your signal handling, you can tell the end-user to set your environment variable to disable that part of your library (which beats having to rebuild and supply a new version of it).
P.S. Your question and this answer apply to any library, whether shared or archive.
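The environment-variable switch suggested above could look roughly like this. This is only a sketch: the variable name MYLIB_NO_SIGALRM and the function mylib_sigalrm_enabled() are invented for the example, and the "unset, empty or 0 means enabled" convention is an assumption, not something the answer prescribes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name; pick one that is unique to your library. */
#define MYLIB_NO_SIGALRM_ENV "MYLIB_NO_SIGALRM"

/* Returns 1 if the library may install its SIGALRM handler and timer,
 * 0 if the end user has switched that feature off. */
int mylib_sigalrm_enabled(void)
{
    const char *v = getenv(MYLIB_NO_SIGALRM_ENV);

    /* Assumed convention: unset, empty, or "0" leaves the feature on. */
    if (v == NULL || *v == '\0' || strcmp(v, "0") == 0)
        return 1;
    return 0;
}
```

The library would call this once during initialization and skip both the handler installation and the alarm()/setitimer() call when it returns 0.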

Getting popen and SIGCHLD handler to work in parallel

In our code base we have a piece of software that runs an arbitrary number of external programs and monitors their exit codes by using fork() and installing a SIGCHLD handler. In the unit test cases this piece of software works fine.
However, the process that is running this fork "server" is also running a bunch of software modules in several threads. Unfortunately some parts of this (older) software use popen(), which seems to need its own SIGCHLD handling. The result we see is that the program fails on the call to pclose() with errno set to ECHILD.
Is there any way to use a SIGCHLD handler and a call to popen/pclose in parallel?
After fork(), signal handlers are inherited. So perhaps you should reset them to their defaults between fork() and exec(), by using signal() with SIG_DFL in the first child process, before you invoke the legacy software.
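A rough sketch of that suggestion follows; the helper names spawn_with_default_sigchld and wait_exit_code are made up here. One caveat worth knowing: exec already resets caught signals to their default action, while an ignored disposition (SIG_IGN) survives exec, so the explicit reset matters mostly when the parent ignores SIGCHLD.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, restore the default SIGCHLD disposition in the child, then exec.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_with_default_sigchld(const char *file, char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        signal(SIGCHLD, SIG_DFL);   /* undo whatever the parent installed */
        execvp(file, argv);
        _exit(127);                 /* only reached if exec failed */
    }
    return pid;
}

/* Wait for one specific child; returns its exit code, or -1 on error
 * or if the child did not exit normally. */
int wait_exit_code(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that wait_exit_code() only works if no SIGCHLD handler in the parent has already reaped the child, which is exactly the conflict the question describes.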

atexit considered harmful?

Are there inherent dangers in using atexit in large projects such as libraries?
If so, what is it about the technical nature behind atexit that may lead to problems in larger projects?
The main reason I would avoid using atexit in libraries is that any use of it involves global state. A good library should avoid having global state.
However, there are also other technical reasons:
Implementations are only required to support a small number of atexit handlers (the C standard guarantees at least 32). After that, it's possible that all calls to atexit fail, or that they succeed or fail depending on resource availability. Thus, you have to deal with what to do if you can't register your atexit handler, and there might not be any good way to proceed.
Interaction of atexit with dlopen or other methods of loading libraries dynamically is not defined. A library which has registered atexit handlers cannot safely be unloaded, and the ways different implementations deal with this situation can vary.
Poorly written atexit handlers could have interactions with one another, or just bad behaviors, that prevent the program from properly exiting. For instance, an atexit handler might attempt to obtain a lock that's held by another thread and that can't be released due to the state at the time exit was called.
The CERT C Secure Coding Standard has an entry on incorrect use of atexit:
ENV32-C. All atexit handlers must return normally
https://www.securecoding.cert.org/confluence/display/seccode/ENV32-C.+All+atexit+handlers+must+return+normally
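If a library does end up using atexit, a defensive pattern might look like the sketch below. It is only an illustration: mylib_cleanup and mylib_register_cleanup are hypothetical names. The registration is checked, is done at most once (so the library consumes a single slot), and the handler returns normally as ENV32-C requires.

```c
#include <stdlib.h>

static int cleanup_registered;

/* Hypothetical cleanup hook. It must return normally: no exit(), no
 * longjmp(), and no waiting on locks that other threads may hold. */
static void mylib_cleanup(void)
{
    /* release library resources here */
}

/* Register the hook at most once. Returns 0 on success and -1 if atexit()
 * refused the registration, so the caller can fall back to an explicit
 * shutdown call instead. */
int mylib_register_cleanup(void)
{
    if (cleanup_registered)
        return 0;
    if (atexit(mylib_cleanup) != 0)
        return -1;
    cleanup_registered = 1;
    return 0;
}
```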

How to avoid "exit" in user defined library programs for my container server in C?

I'm writing a container server for libraries in C.
The system library dl (the dynamic linking loader) is used to implement the programming interface, that is, the dlopen/dlsym functions.
To return control to the container server, a user's program could use either return or exit. Using return is OK.
But calling exit() in a user's program makes the container server exit too.
How can I support exit in users' programs?
I'm thinking of overriding the exit function when invoking the dynamic linking loader.
Since you're writing just a library, it cannot run on its own without a process invoking it.
As soon as the application exits, the state of your library would also be lost unfortunately.
In other words, if you want to maintain the state even after the application exits, you would probably need to write an Initialization Daemon which is always the first process to initialize this library and keeps running in the background as a means to maintain the state of your container.
You would also need to use semaphores or some form of IPC to ensure the state is propagated between the daemon and other client processes using this library.
Maybe you should create a new child process for running the library function which may call exit().
When the library function calls exit() or returns, then the child process will exit, and the parent process (the container server) will get information about the termination of child.
In this case the easiest way to start a new child process is to call fork(). Using system() is not needed here.
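A minimal sketch of the child-process approach, under some assumptions: run_isolated is a made-up helper, and the two plugin_* functions merely stand in for entry points that a real server would obtain through dlsym().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run entry() in a child process so that an exit() inside it cannot take
 * the container server down. Returns the child's exit code (0..255), or -1
 * on error or if the child was killed by a signal. */
int run_isolated(int (*entry)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(entry() & 0xff);  /* entry returned normally: report its value */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Stand-ins for user code loaded via dlsym() (hypothetical). */
int plugin_calls_exit(void) { exit(7); }
int plugin_returns(void)    { return 3; }
```

Both run_isolated(plugin_calls_exit) and run_isolated(plugin_returns) come back to the server, with the child's exit code as the result. Keep in mind that exit() in the child still runs any atexit handlers and flushes stdio buffers inherited from the parent.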

Shared POSIX objects cleanup on process end / death

Is there any way to clean up shared POSIX synchronization objects, especially when a process crashes? Unblocking locked POSIX semaphores is what I want most, but automatically 'collected' queues / shared memory regions would be nice too. Another thing to keep in mind is that we can't rely on signal handlers in general, because SIGKILL cannot be caught.
I see only one alternative: an external daemon that accepts subscriptions and 'keep-alive' requests and works as a watchdog, so that when it stops receiving notifications for some object it can close / unlock that object according to a registered policy.
Does anyone have a better alternative or proposal? I have never worked seriously with POSIX shared objects before (sockets were enough for all my needs and are much more useful in my opinion) and I did not find any applicable article. I'd gladly use sockets here but can't, for historical reasons.
Rather than using semaphores you could use file locking to coordinate your processes. The big advantage of file locks is that they are released if the process terminates. You can map each semaphore onto a lock for a byte in a shared file and know that the locks will get released on exit; in most versions of Unix the bytes you lock don't even have to exist. There is code for this in Marc Rochkind's book Advanced Unix Programming, 1st edition; I don't know if it's in the 2nd edition though.
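This is not the code from the book, just a rough sketch of the idea using fcntl() record locks, where byte number index of a shared file plays the role of one binary semaphore. All function names and the temporary file path are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Apply one lock operation to the single byte at offset index. */
static int byte_lock_op(int fd, off_t index, short type, int cmd)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = index;
    fl.l_len = 1;
    return fcntl(fd, cmd, &fl);
}

/* Blocking acquire (may fail with EINTR), non-blocking acquire, release. */
int slot_acquire(int fd, off_t index)     { return byte_lock_op(fd, index, F_WRLCK, F_SETLKW); }
int slot_try_acquire(int fd, off_t index) { return byte_lock_op(fd, index, F_WRLCK, F_SETLK); }
int slot_release(int fd, off_t index)     { return byte_lock_op(fd, index, F_UNLCK, F_SETLK); }

/* Demo: the parent locks byte 3 of an empty temp file, then a child tries
 * the same byte. Returns 1 if the child was refused, 0 if it got the lock,
 * -1 on error. */
int demo_child_is_excluded(void)
{
    char path[] = "/tmp/slotlock_XXXXXX";
    int fd = mkstemp(path);
    int status, result = -1;
    pid_t pid;

    if (fd < 0)
        return -1;
    if (slot_try_acquire(fd, 3) == 0 && (pid = fork()) >= 0) {
        if (pid == 0)
            _exit(slot_try_acquire(fd, 3) == 0 ? 0 : 1);
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
        slot_release(fd, 3);
    }
    unlink(path);
    close(fd);
    return result;
}
```

These locks belong to the process, not to a thread, so they do not exclude other threads of the same process, and closing any descriptor for the file drops all of the process's locks on it.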
I know this question is old, but another great solution is POSIX robust mutexes. They automatically unlock and enter an "inconsistent flag" state when the owner dies, and the next thread to attempt locking the mutex gets an EOWNERDEAD error but succeeds in becoming the new owner of the mutex. It's then able to clean up whatever state the mutex was protecting (which could be in a very bad inconsistent state due to asynchronous termination of the previous owner!) and mark the mutex as consistent again before unlocking it.
See the documentation on robust mutexes here:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html
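A sketch of the locking pattern described above, with made-up helper names. To keep it short, the "dead owner" is simulated by a thread that exits while holding the lock; for real cross-process use the mutex would additionally need the PTHREAD_PROCESS_SHARED attribute and would have to live in shared memory, which is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Initialize m as a robust mutex; returns 0 or an error number. */
int robust_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock m. Returns 0 if locked normally, 1 if locked after the previous
 * owner died (repair, if given, is called on state first), or a negative
 * error number on failure. */
int robust_mutex_lock(pthread_mutex_t *m, void (*repair)(void *), void *state)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        if (repair != NULL)
            repair(state);              /* fix up the protected data */
        rc = pthread_mutex_consistent(m);
        return rc == 0 ? 1 : -rc;
    }
    return rc == 0 ? 0 : -rc;
}

/* Simulated crash: take the lock and terminate without unlocking. */
static void *die_holding_lock(void *arg)
{
    pthread_mutex_lock((pthread_mutex_t *)arg);
    return NULL;
}

/* Returns what robust_mutex_lock() reports after the owner has died. */
int demo_owner_death(void)
{
    pthread_mutex_t m;
    pthread_t t;
    int rc;

    if (robust_mutex_init(&m) != 0)
        return -1;
    if (pthread_create(&t, NULL, die_holding_lock, &m) != 0)
        return -1;
    pthread_join(t, NULL);
    rc = robust_mutex_lock(&m, NULL, NULL);
    if (rc >= 0)
        pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

On older toolchains this may need to be built with -pthread.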
The usual way is to work with signal handlers. Just catch the signals and call the cleanup functions.
But your watchdog daemon has some merits, too. It would surely make the system simpler to understand and manage. To make it simpler to administer, your application should start the daemon when it's not running, and the daemon should be able to clean up any residue from the last crash.
