how to block a signal before a shared library creates a thread? - c

I am dealing with an application that requires a certain signal to be blocked in every thread. The app also dynamically links a library (libcpprest.so) which creates a thread pool during initialization. Naturally, because the main executable has had no chance to execute any code yet, these threads have that signal unblocked -- which leads to mysterious crashes.
Is it possible to block a signal before dynamically linked library has a chance to create a thread?
Unacceptable solutions (that I am aware of):
link the library statically and use init_priority to ensure the signal is blocked as early as possible
use a "starter" utility that blocks the signal and then starts the executable (which inherits the signal mask)

It doesn't seem to be possible -- a shared library that really needs to create threads this early should block all signals in the threads it creates.
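For completeness, the library-side pattern being referred to looks something like this: a minimal sketch (spawn_worker and worker_main are hypothetical names, not libcpprest API) in which the creating thread blocks all signals before pthread_create, so the new thread inherits a fully blocked mask, and then restores its own mask.

    #include <pthread.h>
    #include <signal.h>

    static void *worker_main(void *arg)
    {
        /* All signals are blocked here, inherited from the creator's mask. */
        return arg;
    }

    int spawn_worker(pthread_t *tid)
    {
        sigset_t all, old;
        int rc;

        sigfillset(&all);
        /* The new thread inherits the creating thread's mask atomically,
           so blocking everything first closes the race window. */
        pthread_sigmask(SIG_SETMASK, &all, &old);
        rc = pthread_create(tid, NULL, worker_main, NULL);
        pthread_sigmask(SIG_SETMASK, &old, NULL);  /* restore caller's mask */
        return rc;
    }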

Related

Use of pthread_join()

I am wondering: what can happen if we do a pthread_create without a pthread_join?
Who will "clean up" all the memory of the non-joined thread?
When the process terminates, all resources associated with the process cease to exist. (This of course does not include shared resources the process created, like files in the filesystem, shared memory segments, etc.) Until then, unjoined threads will continue to consume resources, potentially causing future calls to pthread_create or even malloc to fail.
Well, assuming it's an app-lifetime thread that does not need to, or try to, terminate explicitly, the OS will clean it up when its process terminates (on any non-trivial OS).
If the thread is created without a matching pthread_join, then when the main thread finishes execution all other threads created in main will be stopped, and hence will not finish executing all of their statements.
Look at the documentation of pthread_join.
It makes the main thread suspend until the spawned thread completes execution.
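To illustrate, a minimal sketch of the two standard options for reclaiming a thread's resources (join it, or detach it so the system reclaims them as soon as it exits):

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        (void)arg;
        printf("worker running\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t t;

        pthread_create(&t, NULL, worker, NULL);

        /* Option 1: join. Blocks until the thread finishes, after which
           its stack and bookkeeping are released by the implementation. */
        pthread_join(t, NULL);

        /* Option 2 (instead of joining): detach it, so its resources are
           reclaimed automatically the moment it exits:
               pthread_detach(t);                                        */
        return 0;
    }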

NetServerEnum blocks when thread is terminated externally

(Working with the Win32 API, in a C environment with VS2010)
I have a two-thread app. The first thread spawns the second, waits for a given interval ('TIMEOUT'), and then calls TerminateThread() on it.
Meanwhile, second thread calls NetServerEnum().
It appears that when the timeout is reached, whether NetServerEnum returned successfully or not, the first thread gets deadlocked.
I've already noticed that NetServerEnum creates worker threads of its own.
I ultimately end up with one of those threads deadlocked, typically on ntdll.dll!RtlInitializeExceptionChain, unable to exit my process gracefully.
As this is too long for a comment, allow me to use the answer form.
Verbatim from MSDN (emphasis mine):
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
From reading this it is easy to understand why it is a bad idea to cancel (terminate) a thread stuck in a system call.
A possible alternative to the OP's design would be to spawn a thread that calls NetServerEnum() and simply let it run until the system call returns, as sketched below.
Meanwhile, the main thread can do other things, for example informing the user that scanning the net is taking longer than expected.
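A sketch of that design (error handling trimmed, the enum_thread name is mine; link with Netapi32.lib):

    #include <windows.h>
    #include <lm.h>        /* NetServerEnum, NetApiBufferFree */
    #include <stdio.h>

    static DWORD WINAPI enum_thread(LPVOID arg)
    {
        SERVER_INFO_101 *buf = NULL;
        DWORD entries = 0, total = 0;
        NET_API_STATUS rc;

        (void)arg;
        rc = NetServerEnum(NULL, 101, (LPBYTE *)&buf, MAX_PREFERRED_LENGTH,
                           &entries, &total, SV_TYPE_ALL, NULL, NULL);
        if (rc == NERR_Success && buf != NULL) {
            /* ... consume the results ... */
            NetApiBufferFree(buf);
        }
        return rc;
    }

    int main(void)
    {
        HANDLE h = CreateThread(NULL, 0, enum_thread, NULL, 0, NULL);
        if (h == NULL)
            return 1;

        /* Never TerminateThread: poll with a timeout and keep the user
           informed until the worker finishes on its own. */
        while (WaitForSingleObject(h, 5000) == WAIT_TIMEOUT)
            printf("Still scanning the network, please wait...\n");

        CloseHandle(h);
        return 0;
    }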

freeing memory inside a signal handler

I am writing an API that uses sockets. In the API, I allocate memory for various items. I want to make sure I close the sockets and free the memory in case there is a signal such as Ctrl-C. In researching this, it appears free() is not on the async-signal-safe function list (man 7 signal), so I can't free the memory inside a signal handler. I can close the socket just fine, though. Does anyone have any thoughts on how I can free the memory? Thank you in advance for your time.
Alternatively, don't catch the signal and just let the OS handle the cleanup as it's going to do during process cleanup anyway. You're not releasing any resources that aren't tied directly to the process, so there's no particular need to manually release them.
One technique (others exist too):
Have your program run a main processing loop.
Have your main processing loop check a flag to see if it should "keep running".
Have your signal handler simply set the "keep running" flag to false, but not otherwise terminate the program.
Have your main processing loop do the memory cleanup prior to exiting.
This has the benefit of placing both the allocation and the de-allocation in blocks of code which are called in a known sequence. That can be a godsend when dealing with webs of interrelated objects, and there is no race condition between two processing flows trying to mess with the same object.
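A minimal sketch of this technique, with SIGINT standing in for the signal to be handled:

    #include <signal.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t keep_running = 1;

    /* The handler does nothing but flip the flag (async-signal-safe). */
    static void on_sigint(int sig)
    {
        (void)sig;
        keep_running = 0;
    }

    int main(void)
    {
        struct sigaction sa;
        char *buf;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigint;
        sigaction(SIGINT, &sa, NULL);

        buf = malloc(1024);        /* stands in for the API's allocations */

        while (keep_running) {
            /* ... normal processing: accept connections, read sockets ... */
            sleep(1);
        }

        /* Cleanup runs here, in the main context, not in the handler. */
        free(buf);
        return 0;
    }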
Don't free in the handler. Instead, indicate to your program that something needs to be freed. Then detect that in your program, so you can free from the main context instead of the signal context.
Are you writing a library or an application? If you're writing a library, you have no business installing signal handlers, which would conflict with the calling application. It's the application's business to handle such signals, if it wants to, and then make the appropriate cleanup calls to your library (from outside a signal-handler context).
Of course even if you're writing an application, there's no reason to handle SIGINT to close sockets and free memory. The only reasons to handle the signal are if you don't want to terminate, or if you have unsaved data or shared state (like stuff in shared memory or the filesystem) that needs to be cleaned up before terminating. Freeing memory or closing file descriptors that are used purely by your own process are not tasks you need to perform when exiting.

Restarting threads from saved contexts

I am trying to implement a checkpointing scheme for multithreaded applications by using fork. I will take the checkpoint at a safe location such as a barrier. One thread will call fork to replicate the address space, and signals will be sent to all other threads so that they can save their contexts and write them to a file.
The forked process will not run initially. Only when a restart from the checkpoint is required will a signal be sent to it so it can start running. At that point, the threads that were not forked but whose contexts were saved will be recreated from the saved contexts.
My first question is whether it is enough to recreate threads from saved contexts and run them from there, assuming no locks were held, no signals were pending during the checkpoint, and so on. Lastly, how can a thread be created so that it runs from a known context?
What you want is not possible without major integration with the pthreads implementation. Internal thread structures will likely contain their own kernel-space thread ids, which will be different in the restored contexts.
It sounds to me like what you really want is forkall, which is non-trivial to implement. I don't think barriers are useful at all for what you're trying to accomplish. Asynchronous interruption and checkpointing is just as good as synchronized.
If you want to try hacking forkall into glibc, you should start out by looking at the setxid code NPTL uses for synchronizing setuid() calls between threads using signals. The same principle is what's needed to implement forkall, but you'd basically call setjmp instead of setuid in the signal handlers, and then longjmp back into them after making new threads in the child. After that you'd have to patch up the thread structures to have the right pid/tid values, free the excess new stacks that were created, etc.
Edit: Since the setxid code in glibc/NPTL is rather dense reading for someone not familiar with the codebase, you might instead look at the corresponding code I have in musl, called __synccall:
http://git.etalabs.net/cgi-bin/gitweb.cgi?p=musl;a=blob;f=src/thread/synccall.c;h=91ac5eb77322da7393f778da29d35fb3c2def15d;hb=HEAD
It uses a signal to synchronize all threads, then runs a callback sequentially in each thread one-by-one. To implement forkall, you'd want to do something like this prior to the fork, but instead of a callback, simply save jump buffers for each thread except the calling thread (you can't use a callback for this because the return would invalidate the jump buffer you just saved), then perform the fork from the calling thread. After that, you would make N new threads, and have them jump back to the old threads' saved jump buffers, and destroy their new (unneeded) stacks. You'd also need to make the right syscall to update their thread register (e.g. %gs on x86) and tid address.
Then you need to take these ideas and integrate them with glibc's thread allocation and thread stack cache framework. :-)
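To make the shape of this more concrete, here is a heavily simplified, hypothetical sketch of the save phase only. It assumes a fixed thread table, omits handler installation, thread registration, thread-structure patching and stack reclamation, and assumes (rather than implements) the one-thread-at-a-time signal delivery that __synccall arranges:

    #include <pthread.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <unistd.h>

    #define NTHREADS 4                 /* hypothetical fixed thread count */

    static pthread_t tids[NTHREADS];   /* registered at thread creation */
    static sigjmp_buf ctx[NTHREADS];   /* one jump buffer per thread */
    static volatile sig_atomic_t nsaved, resume;

    /* Runs in each target thread. Saves a jump buffer, parks until the
       fork has happened, then resumes. In a real forkall the child would
       create replacement threads that siglongjmp() into these buffers
       (the old stacks still exist in the forked address space) and would
       then patch tids and the thread register and free the unneeded
       replacement stacks -- none of that is shown here. */
    static void save_handler(int sig)
    {
        int slot = nsaved;             /* safe only because delivery is
                                          assumed to be one thread at a
                                          time, __synccall-style */
        (void)sig;
        if (sigsetjmp(ctx[slot], 1) == 0) {
            nsaved = slot + 1;
            while (!resume)            /* crude spin; real code syncs */
                ;
            return;                    /* parent: carry on as before */
        }
        /* Reached only via siglongjmp in the restored child. */
    }

    /* Called from the checkpointing thread. Returns 0 in the dormant
       child snapshot, the child's pid in the parent. */
    static pid_t checkpoint(void)
    {
        pid_t pid;
        int i;

        for (i = 0; i < NTHREADS; i++)
            if (!pthread_equal(tids[i], pthread_self()))
                pthread_kill(tids[i], SIGUSR1);
        while (nsaved < NTHREADS - 1)
            ;                          /* wait for all contexts */
        pid = fork();
        resume = 1;                    /* let the parent's threads go */
        return pid;
    }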

Is setitimer's ITIMER_PROF signal (SIGPROF) sent to every thread in a multithreaded process under NPTL on Linux (2.6.21.7)?

The manual says that setitimer is shared by the whole PROCESS and that SIGPROF is sent to the PROCESS, not to a thread.
But when I create the timer in my multithreaded PROCESS, unless I create independent stacks for every thread in the PROCESS to handle the signal, I get some very serious errors in the signal handler. Through some debugging, I have confirmed that the stack (in the sole-stack case) must have been reentered.
So now I suspect that SIGPROF may be sent to multiple threads at the same time? Thanks!
I don't follow the details of your question but the general case is:
A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread (e.g., certain signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are thread directed, as are signals targeted at a specific thread using pthread_kill(3)). A process-directed signal may be delivered to any one of the threads that does not currently have the signal blocked. If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal.
man (7) signal
You can block the signal in specific threads with pthread_sigmask and, by elimination, direct it to the thread you want to handle it.
According to POSIX, the alternate signal stack established with sigaltstack is per-thread, and is not inherited by new threads. However, I believe some versions of Linux and/or userspace pthread library code (at least old kernels with LinuxThreads and maybe some versions with NPTL too?) have a bug where the alternate stack is inherited, and of course that will lead to crashing whenever you use the alternate stack. Is there a reason you need alternate stacks? Normally the only purpose is to handle stack overflows semi-gracefully (allowing yourself some stack place to catch SIGSEGV and save any unsaved data before exiting). I would just disable it.
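Disabling it per thread is short. A minimal sketch (the disable_altstack name is mine), to be called early in each thread's start routine:

    #include <signal.h>
    #include <string.h>

    /* Make sure no inherited alternate signal stack is in effect
       for the calling thread. */
    static void disable_altstack(void)
    {
        stack_t ss;

        memset(&ss, 0, sizeof ss);
        ss.ss_flags = SS_DISABLE;      /* ss_sp and ss_size are ignored */
        sigaltstack(&ss, NULL);
    }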
Alternatively, use pthread_sigmask to block SIGPROF in all threads but the main one. Note that, to avoid a nasty race condition here, you need to block it in the main thread before calling pthread_create so that the new thread starts with it blocked, and unblock it after pthread_create returns.
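A sketch of that race-free ordering:

    #include <pthread.h>
    #include <signal.h>

    static void *worker(void *arg)
    {
        /* SIGPROF stays blocked here for the thread's whole lifetime. */
        return arg;
    }

    int main(void)
    {
        sigset_t set, old;
        pthread_t t;

        sigemptyset(&set);
        sigaddset(&set, SIGPROF);

        /* Block SIGPROF before pthread_create so the new thread
           inherits a mask with it blocked... */
        pthread_sigmask(SIG_BLOCK, &set, &old);
        pthread_create(&t, NULL, worker, NULL);
        /* ...then restore the old mask so only the main thread gets it. */
        pthread_sigmask(SIG_SETMASK, &old, NULL);

        /* ... install the SIGPROF handler and call setitimer() here ... */
        pthread_join(t, NULL);
        return 0;
    }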
