Related
Background, from POSIX:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
The difficulty is that we generally don't know if we're a multi-threaded process, since threads may have been created by library code. And "async-signal-safe" is a quite-severe restriction.
It is nonsensical to ask "how many threads are there", since if other threads are still running, they may be exiting or creating new threads while we ask. We can, however, get answers (or partial answers) to simpler questions:
Is it even possible for other threads to exist?
Am I the only thread that ever existed?
Am I the only thread that exists right now?
...
For simplicity's sake let's assume:
we're not in a signal handler
nobody is mad enough to invoke UB by calling pthread_create or C11's thrd_create from a signal handler
nobody is doing threads outside of pthreads, C11, and C++11
C++11 threads appear to always be implemented in terms of pthreads (on platforms that support fork, at least)
C11 threads are very similar to pthreads, although we sometimes have to handle the functions separately.
Answers that involve arcane implementation details are encouraged, as long as they are (fairly) stable.
Some partial answers (more still needed):
Question 1 is addressed by libstdc++'s __gthread_active_p() for several libc implementations. The header is compatible with C, but it is a static function in a C++-only part of the include path, and it also relies on the macro __GXX_WEAK__, which is only predefined for C++. (libc++ unconditionally pulls in pthreads.)
Unfortunately, this is dangerously unreliable for the dlopen case (race conditions in correct user code); see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78017
Question 2 can be addressed by installing interceptors for pthread_create and thrd_create. But this can potentially be finicky (see comments in gthr.h about interceptors).
If calls to clock_gettime with CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID return different values, this may be evidence that another thread has existed, but beware of races, resolution, and clock settability (setting these clocks is not possible on Linux, but POSIX potentially allows it).
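A rough sketch of that clock comparison (my own illustration; the tolerance is a guess, and a positive result is only a hint that another thread has existed, never proof):

#include <time.h>

/* Hint, not proof: if the whole process has burned noticeably more CPU
   time than this thread, some other thread must have consumed the rest. */
int maybe_other_threads_existed(void)
{
    struct timespec p, t;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &p) != 0 ||
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t) != 0)
        return -1;                          /* unknown */
    double proc_s   = p.tv_sec + p.tv_nsec / 1e9;
    double thread_s = t.tv_sec + t.tv_nsec / 1e9;
    return proc_s - thread_s > 0.01;        /* tolerance is a guess */
}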
Question 3 is the interesting one, anyway:
GDB is likely to know the answer, but spawning a whole other process seems unnecessary (e.g. answers involving ps should be rewritten to use /proc/ directly), and it may not have permission anyway
libthread_db.so exists but appears undocumented except in the original Solaris version. It looks like it might be possible, however, to implement the proc_service.h callbacks for the current process, if we ignore the "stop" part ...
On Linux, if gettid() != getpid(), you're not the main thread, thus there probably are at least two threads. (it's possible for the main thread to call pthread_exit, but this is weird)
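A minimal sketch of that check (Linux-only; glibc only gained a gettid() wrapper in 2.30, so this goes through syscall() to be safe):

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* On Linux the main thread's TID equals the PID. Subject to the
   pthread_exit caveat above: if the main thread has exited, no
   remaining thread will pass this test. */
static int am_main_thread(void)
{
    return (pid_t)syscall(SYS_gettid) == getpid();
}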
A (somewhat) more portable version of the preceding: use __attribute__((constructor)) (or politely ask your caller) to stash the value of pthread_self() for the main thread. Unfortunately, there is a disturbing comment in libstdc++'s <thread> header (grep for __GLIBC__) about returning 0 (but I cannot reproduce this).
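A sketch of that stashing approach (assumes GCC/Clang's __attribute__((constructor)) and that nothing that spawns threads runs before the constructors do):

#include <pthread.h>

static pthread_t main_thread;

__attribute__((constructor))
static void stash_main_thread(void)
{
    main_thread = pthread_self();   /* runs before main() */
}

static int on_main_thread(void)
{
    return pthread_equal(pthread_self(), main_thread);
}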
On Linux, if /proc is mounted and accessible, you can enumerate /proc/self/task/. The code to do this is portable; it will just fail on OSes that don't provide this (are there other OSes that provide this much?). Are /proc/self/status or /proc/self/stat any more portable? They have less information (and stat is hard to parse securely), but we probably don't need any more. (These need testing for the "main thread exited" case.)
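A sketch of the enumeration (note that opendir/readdir may allocate memory, so this is something to do before fork(), not from the async-signal-safe child):

#include <dirent.h>

/* Count entries in /proc/self/task/: one per live thread.
   Returns -1 if /proc isn't available; inherently racy if threads
   are being created or are exiting while we read the directory. */
static int count_threads(void)
{
    DIR *dir = opendir("/proc/self/task");
    if (!dir)
        return -1;
    int n = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL)
        if (ent->d_name[0] != '.')   /* skip "." and ".." */
            ++n;
    closedir(dir);
    return n;
}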
On GLIBC, we could possibly read the debug symbols to find the multiple_threads flag (sometimes global, sometimes part of struct pthread - ugh). But this is probably similar to libthread_db.so
Similarly MUSL has a count (minus one) and a linked list ... though it prefers to take an internal lock first. If we're only reading, is it safe to skip that?
If we block a signal and then kill the current process (not thread) with it, and our thread isn't the one that receives it, we know that other threads must exist to handle it. But there's no way to know how long to wait, and signals are dangerous global state anyway.
On Linux, unshare(2) ignores CLONE_THREAD for single-threaded processes and errors for multithreaded processes! (There's also some harder cases with user namespaces but I don't think they're needed)
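If that reading of unshare(2) holds, the test is pleasantly small:

#define _GNU_SOURCE
#include <sched.h>
#include <errno.h>

/* 1 = single-threaded, 0 = multithreaded, -1 = unexpected error.
   Relies on unshare(CLONE_THREAD) being a no-op for single-threaded
   processes and failing with EINVAL otherwise (Linux-specific). */
static int is_single_threaded(void)
{
    if (unshare(CLONE_THREAD) == 0)
        return 1;
    return errno == EINVAL ? 0 : -1;
}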
On Linux, SELinux's setcon(3) is guaranteed to fail for multithreaded processes under certain conditions. This requires investigation; it takes some steps to correlate the kernel implementation to userland headers (there is a userland library involved).
From grepping the kernel sources, those are the only two syscalls that perform such specific checks, but there's nothing stopping other functions from being implemented on top of the same data structures.
I have read a few books on parallel programming over the past few months and decided to close it off by learning about POSIX threads.
I am reading "PThreads Programming: A POSIX Standard for Better Multiprocessing" (a Nutshell Handbook). In chapter 5 (Pthreads and Unix), the author talks about handling signals in multi-threaded programs. In the "Threadsafe Library Functions and System Calls" section, the author makes a statement that I have not seen in most books I have read on parallel programming:
Race conditions can also occur in traditional, single-threaded programs that use signal handlers or that call routines recursively. A single-threaded program of this kind may have the same routine in progress in various call frames on its process stack.
I find this statement a little hard to decipher. Does the race condition in a recursive function occur when the function keeps internal state using static storage?
I would also love to know how signal handlers can cause race conditions in single-threaded programs.
Note: I am not a computer science student, so I would really appreciate simplified terms.
I don't think one can call it a race condition in the classical meaning. Race conditions have somewhat stochastic behavior, depending on the scheduler policy and timings.
The author is probably talking about bugs that can arise when the same object/resource is accessed from multiple recursive calls. But this behavior is completely deterministic and manageable.
Signals, on the other hand, are a different story, as they occur asynchronously and can interrupt some data processing in the middle, triggering some other processing on the same data and corrupting it by the time control returns to the interrupted task.
A signal handler can be called at any time without warning, and it potentially can access any global state in the program.
So, suppose your program has some global flag that the signal handler sets in response to, say, SIGINT. And your program checks the flag before each call to f(x):
if (! flag) {
f(x);
}
That's a data race. There is no guarantee that f(x) will not be called after the signal happens because the signal could sneak in at any time, including right after the "main" program tests the flag.
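Here is a complete toy version of that scenario (f() and the loop body are placeholders); the comment marks the window where the signal can sneak in:

#include <signal.h>
#include <stdio.h>

volatile sig_atomic_t flag = 0;          /* set asynchronously on SIGINT */

void on_sigint(int signum) { flag = 1; }

void f(int x) { printf("f(%d)\n", x); }  /* stands in for the real work */

int main(void)
{
    signal(SIGINT, on_sigint);
    int x = 0;
    while (!flag) {
        /* SIGINT can arrive right here, after the test above:
           f() then runs one more time even though flag is now set */
        f(x++);
    }
    return 0;
}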
First it is important to understand what a race condition is. The definition given by Wikipedia is:
Race conditions arise in software when an application depends on the sequence or timing of processes or threads for it to operate properly.
The important thing to note is that a program can behave both properly and improperly based on timing or ordering of execution.
We can fairly easily create "dummy" race conditions in single threaded programs under this definition.
#include <stdbool.h>
#include <time.h>

bool isnow(time_t then) {
    time_t now = time(0);   /* current calendar time, 1-second resolution */
    return now == then;
}
The above function is a very dumb example: mostly it will not work, but sometimes it will happen to give the correct answer. The correct vs. incorrect behavior depends entirely on timing and so represents a race condition on a single thread.
Taking it a step further we can write another dummy program.
#include <stdio.h>
#include <unistd.h>

void printHello(void) {
    sleep(10);              /* a SIGINT during these 10 seconds kills the program */
    printf("Hello\n");
}
The expected behavior of the above program is to print "Hello" after waiting 10 seconds.
If we send a SIGINT signal 11 seconds after calling our function, everything behaves as expected. If we send a SIGINT signal 3 seconds after calling our function, the program behaves improperly and does not print "Hello".
The only difference between the correct and incorrect behavior was the timing of the SIGINT signal. Thus, a race condition was introduced by signal handling.
I'm going to give a more general answer than you asked for. And this is my own, personal, pragmatic answer, not necessarily one that hews to any official, formal definition of the term "race condition".
Me, I hate race conditions. They lead to huge classes of nasty bugs that are hard to think about, hard to find, and sometimes hard to fix. So I don't like doing programming that's susceptible to race conditions. So I don't do much classically multithreaded programming.
But even though I don't do much multithreaded programming, I'm still confronted by certain classes of what feel to me like race conditions from time to time. Here are the three I try to keep in mind:
The one you mentioned: signal handlers. Receipt of a signal, and calling of a signal handler, is a truly asynchronous event. If you have a data structure of some kind, and you're in the middle of modifying it when a signal occurs, and if your signal handler also tries to modify that same data structure, you've got a race condition. If the code that was interrupted was in the middle of doing something that left the data structure in an inconsistent state, the code in the signal handler might be confused. Note, too, that it's not necessarily code right in the signal handler, but any function called by the signal handler, or called by a function that's called by the signal handler, etc.
Shared OS resources, typically in the filesystem: If your program accesses (or modifies) a file or directory in the filesystem that's also being accessed or modified by another process, you've got a big potential for race conditions. (This is not surprising, because in a computer science sense, multiple processes are multiple threads. They may have separate address spaces meaning they can't interfere with each other that way, but obviously the filesystem is a shared resource where they still can interfere with each other.)
Non-reentrant functions like strtok. If a function maintains internal, static state, you can't have a second call to that function if another instance is active. This is not a "race condition" in the formal sense at all, but it has many of the same symptoms, and also some of the same fixes: don't use static data; do try to write your functions so that they're reentrant.
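To make the strtok point concrete, here is a small illustration (my own example) of how its hidden static state lets one tokenization hijack another:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char outer[] = "a,b,c";
    char inner[] = "x y z";

    printf("%s\n", strtok(outer, ","));  /* "a" */
    printf("%s\n", strtok(inner, " "));  /* "x" - resets strtok's hidden state */
    printf("%s\n", strtok(NULL, ","));   /* "y z": continues in inner, not outer! */
    return 0;
}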
The author of the book in which you found that statement seems to be defining the term "race condition" in an unusual manner, or maybe he has just used the wrong term.
By the usual definition, no, recursion does not create race conditions in single-threaded programs because the term is defined with respect to the respective actions of multiple threads of execution. It is possible, however, for a recursion to produce exposure to non-reentrancy of some of the functions involved. It's also possible for a single thread to deadlock against itself. These do not reflect race conditions, but perhaps one or both of them is what the author meant.
Alternatively, maybe what you read is the result of a bad editing job. The text you quoted groups functions that employ signal handling together with recursive functions, and signal handlers indeed can produce data races, just as multiple threads can, because execution of a signal handler has the relevant characteristics of execution of a separate thread.
Race conditions absolutely happen in single-threaded programs once you have signal handlers. Look at the Unix manual page for pselect().
One way it happens is like this: You have a signal handler that sets some global flag. You check your global flag and because it is clear you make a system call that suspends, confident that when the signal arrives the system call will exit early. But the signal arrives just after you check the global flag and just before the system call takes place. So now you're hung in a system call waiting for a signal that has already arrived. In this case, the race is between your single-threaded code and an external signal.
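The cure pselect() offers is to block the signal, test the flag, and then unblock it only while waiting, atomically. A sketch (SIGUSR1 is just a placeholder, and installing the handler that sets got_signal is omitted):

#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

volatile sig_atomic_t got_signal = 0;      /* set by the signal handler */

void wait_for_signal(void)
{
    sigset_t block, orig;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &orig); /* the signal can't sneak in now */

    while (!got_signal) {
        /* pselect unblocks the signal only for the duration of the wait,
           atomically, closing the window described above */
        pselect(0, NULL, NULL, NULL, NULL, &orig);
    }

    sigprocmask(SIG_SETMASK, &orig, NULL);
}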
Well, consider the following code:
#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int num = 2;

void lock_and_call_again() {
    /* the second, recursive invocation blocks here forever,
       because this (default, non-recursive) mutex is already held */
    pthread_mutex_lock(&mutex);
    if(num > 0) {
        --num;
        lock_and_call_again();   /* recurse while still holding the mutex */
    }
}

int main(int argc, char** argv) {
    lock_and_call_again();
}
(Compile with gcc -pthread thread-test.c if you save the code as thread-test.c.)
This is clearly single-threaded, isn't it?
Nevertheless, it will deadlock, because it tries to lock an already-locked mutex.
That's basically what is meant within the paragraph you cited, IMHO:
It does not matter whether it is done in several threads or in one single thread: if you try to lock an already-locked (non-recursive) mutex, your program will deadlock.
If a function calls itself, like lock_and_call_again above, that is what is called a recursive call.
Just as james large explains, a signal can occur at any time, and if a handler is registered for that signal, it will be called at unpredictable times; if no measures are taken, it may even be called while the same handler is already being executed - yielding a kind of implicit recursive execution of the signal handler.
If this handler acquires some kind of lock, you can end up in a deadlock, even without any function explicitly calling itself.
Consider the following function:
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void my_handler(int s) {
    pthread_mutex_lock(&mutex);   /* deadlocks if the handler interrupts itself */
    sleep(10);
    pthread_mutex_unlock(&mutex);
}
Now if you register this function for a particular signal, it will be called whenever the signal is caught by your program. If the handler has been called and is sleeping, it might get interrupted, the handler called again, and this second invocation will try to lock the mutex that is already locked.
Regarding the wording of the citation:
"A single-threaded program of this kind may have the same routine in progress in various call frames on its process stack."
When a function gets called, some information is stored on the process's stack - e.g. the return address. This information is called a call frame. If you call a function recursively, like in the example above, this information gets stored on the stack several times - several call frames are stored.
It's stated a little clumsily, I admit...
I know similar questions have been asked, but I think my situation is a little bit different. I need to check if a child thread is alive, and if it's not, print an error message. The child thread is supposed to run all the time. So basically I just need a non-blocking pthread_join, and in my case there are no race conditions. The child thread can be killed, so I can't have it set some kind of shared variable when it completes, because that variable would never be set in this case.
Killing the child thread can be done like this:
kill -9 child_pid
EDIT: alright, this example is wrong, but I'm still sure there exists some way to kill a specific thread.
EDIT: my motivation for this is to implement another layer of security in my application, which requires this check. Even though this check can be bypassed, that is another story.
EDIT: let's say my application is intended as a demo for reverse-engineering students, and their task is to hack my application. But I placed some anti-hacking/anti-debugging obstacles in the child thread, and I wanted to be sure that this child thread is kept alive. As mentioned in some comments, it's probably not that easy to kill the child without messing up the parent, so maybe this check is not necessary. Security checks are present in the main thread too, but this time I needed to add them in another thread to keep the main thread responsive.
Killed by what, and why can't that thing indicate that the thread is dead? But even then, this sounds fishy.
it's almost universally a design error if you need to check if a thread/process is alive - the logic in the code should implicitly handle this.
In your edit it seems you want to do something about a possibility of a thread getting killed by something completely external.
Well, good news. There is no way to do that without bringing the whole process down. All ways of non-voluntary death of a thread kill all threads in the process, apart from cancellation, but that can only be triggered by something else in the same process.
The kill(1) command does not send signals to a particular thread, but to an entire process. Read carefully signal(7) and pthreads(7).
Signals and threads don't mix well together. As a rule of thumb, you don't want to use both.
BTW, using kill -KILL or kill -9 is a mistake. The receiving process doesn't get the opportunity to handle the SIGKILL signal. You should use SIGTERM instead...
If you want to handle SIGTERM in a multi-threaded application, read signal-safety(7) and consider setting some pipe(7) to self (and use poll(2) in some event loop) which the signal handler would write(2). That well-known trick is well explained in Qt documentation. You could also consider the signalfd(2) Linux specific syscall.
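A bare-bones sketch of that self-pipe trick (error handling, and the O_NONBLOCK/O_CLOEXEC flags you would want in real code, are omitted):

#include <poll.h>
#include <signal.h>
#include <unistd.h>

static int self_pipe[2];                /* filled in by setup() */

static void term_handler(int signum)
{
    char c = (char)signum;
    write(self_pipe[1], &c, 1);         /* write(2) is async-signal-safe */
}

static void setup(void)
{
    pipe(self_pipe);
    struct sigaction sa;
    sa.sa_handler = term_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, NULL);
}

static void event_loop(void)
{
    struct pollfd pfd = { .fd = self_pipe[0], .events = POLLIN };
    for (;;) {
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            char c;
            read(self_pipe[0], &c, 1);
            break;                      /* SIGTERM seen: shut down cleanly */
        }
    }
}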
If you think of using pthread_kill(3), you probably should not in your case (however, using it with a 0 signal is a valid but crude way to check that the thread exists). Read some Pthread tutorial. Don't forget to pthread_join(3) or pthread_detach(3).
Child thread is supposed to run all the time.
This is the wrong approach. You should know when and how a child thread terminates because you are coding the function passed to pthread_create(3) and you should handle all error cases there and add relevant cleanup code (and perhaps synchronization). So the child thread should run as long as you want it to run and should do appropriate cleanup actions when ending.
Consider also some other inter-process communication mechanism (like socket(7), fifo(7) ...); they are generally more suitable than signals, notably for multi-threaded applications. For example you might design your application as some specialized web or HTTP server (using libonion or some other HTTP server library). You'll then use your web browser, or some HTTP client command (like curl) or HTTP client library like libcurl to drive your multi-threaded application. Or add some RPC ability into your application, perhaps using JSONRPC.
(your putative usage of signals smells very bad and is likely to be some XY problem; consider strongly using something better)
my motivation for this is to implement another layer of security in my application
I don't understand that at all. How can signals and threads add security? I'm guessing you are decreasing the security of your software.
I wanted to be sure that this child thread is kept alive.
You can't be sure, other than by coding well and avoiding bugs (but be aware of Rice's theorem and the Halting Problem: there cannot be any reliable and sound static source code program analysis to check that). If something else (e.g. some other thread, or even bad code in your own one) is e.g. arbitrarily modifying the call stack of your thread, you've got undefined behavior and you can just be very scared.
In practice tools like the gdb debugger, address and thread sanitizers, other compiler instrumentation options, valgrind, can help to find most such bugs, but there is No Silver Bullet.
Maybe you want to take advantage of process isolation, but then you should give up your multi-threading approach, and consider some multi-processing approach. By definition, threads share a lot of resources (notably their virtual address space) with other threads of the same process. So the security checks mentioned in your question don't make much sense. I guess that they are adding more code, but just decrease security (since you'll have more bugs).
Reading a textbook like Operating Systems: Three Easy Pieces should be worthwhile.
You can use pthread_kill() to check if a thread exists.
SYNOPSIS
#include <signal.h>
int pthread_kill(pthread_t thread, int sig);
DESCRIPTION
The pthread_kill() function shall request that a signal be delivered
to the specified thread.
As in kill(), if sig is zero, error checking shall be performed
but no signal shall actually be sent.
Something like
int rc = pthread_kill(thread_id, 0);   /* signal 0: error checking only, nothing is sent */
if (rc != 0)
{
    // ESRCH: thread no longer exists...
}
It's not very useful, though, as stated by others elsewhere, and it's really weak as any type of security measure. Anything with permissions to kill a thread will be able to stop it from running without killing it, or make it run arbitrary code so that it doesn't do what you want.
This really is two questions, but I suppose it's better they be combined.
We're working on a client that uses an asynchronous TCP connection. The idea is that the program will block until a certain message is received from the server, which will invoke a SIGPOLL handler. We are using a busy-waiting loop, basically:
int var = 1;
while (var) usleep(100);

// ...and somewhere else
void sigpoll_handler(int signum) {
    // ......
    var = 0;
    // ......
}
We would like to use something more reliable instead, like a semaphore. The thing is, when a thread is blocked on a semaphore, will the signal still get through? Especially considering that signals get delivered when the thread switches back to user level; if the process is off the run queue, how will that happen?
Side question (just out of curiosity):
Without the "usleep(100)" the program never progresses past the while loop, although I can verify the variable was set in the handler. Why is that? Printing changes its behaviour too.
Cheers!
[too long for a comment]
Accessing var from inside the signal handler invokes undefined behaviour (at least for a POSIX conforming system).
From the related POSIX specification:
[...] if the process is single-threaded and a signal handler is executed [...] the behavior is undefined if the signal handler refers to any object [...] with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t [...]
So var shall be defined:
volatile sig_atomic_t var;
The busy-waiting while loop can be replaced by a single call to the blocking pause(), as it will return upon reception of the signal.
From the related POSIX specification:
The pause() function shall suspend the calling thread until delivery of a signal whose action is either to execute a signal-catching function or to terminate the process.
Using pause(), by the way, makes a global flag like var largely redundant, not to say needless.
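A sketch of the pause()-based version; keeping the flag and the loop anyway guards against some other caught signal waking pause() early (and note there is still a small window between testing var and calling pause() that sigsuspend() would close):

#include <signal.h>
#include <unistd.h>

volatile sig_atomic_t var = 1;

void sigpoll_handler(int signum) { var = 0; }

int main(void)
{
    signal(SIGPOLL, sigpoll_handler);
    while (var)
        pause();      /* returns after a caught signal's handler runs */
    /* ... proceed now that the server's message has arrived ... */
    return 0;
}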
Short answer: yes, the signal will get through fine with a good implementation.
If you're going to be using a semaphore to control the flow of the program, you'll want the listening on one thread and the actual data processing on another. This puts the concurrency fairness in the hands of the OS, which will make sure your signal-listening thread gets a chance to check for a signal with some regularity. It shouldn't ever really be "off the runqueue," but rather cycling through positions on the runqueue.
If it helps to think about it this way, what you have right now is basically a very rough implementation of a semaphore on its own - a shared variable whose value will stop one block of code from executing until another code block clears it. There isn't anything inherently paralyzing about a semaphore at the system level.
I kind of wonder why whatever function you're using to listen for the SIGPOLL isn't doing its own blocking, though. Most of those utilities that I've seen will stop their calling thread until they return a value. Basically they handle the concurrency for you and you can code as if you were dealing with a normal synchronous program.
With regards to the usleep loop: I'd have to look at what the optimizer is doing, but I think there are basically two possibilities. I think it's unlikely, but it could be that the empty-body loop is compiling into something that isn't actually checking for a value change and is instead just looping. More likely, the lack of any body is messing up the underlying concurrency handling, and the loop is executing so quickly that nothing else gets a chance to run - the queue is flooded by loop iterations and your signal processing can't get a word in edgewise. You could try just watching it for a few hours to see if anything changes; theoretically, if it's just a concurrency problem, the random factor involved could clear the block on its own given a few billion chances.
I've got some system level code that fires timers every once in a while, and has a signal handler that manages these signals when they arrive. This works fine and seems completely reasonable. There are also two separate threads running alongside the main program, but they do not share any variables, but use glib's async queues to pass messages in one direction only.
The same code uses glib's GHashTable to store, well, key/value pairs. When the signal code is commented out of the system, the hash table appears to operate fine. When it is enabled, however, there is a strange race condition where the call to g_hash_table_lookup actually returns NULL (meaning that there is no entry with the key used to look it up), when indeed the entry is actually there (yes I made sure by printing the whole list of key/value pairs with g_hash_table_foreach). Why would this occur most of the time? Is GLib's hash table implementation buggy? Sometimes the lookup call is successful.
It's a very particular situation, and I can clarify further if it didn't make sense, but I'm hoping I am doing something wrong so that this can actually be fixed.
More info: the code segments that are outside the signal handler's scope but access the g_hash_table variable are surrounded by signal-blocking calls, so that the signal handler cannot access these variables while the rest of the process is accessing them.
Generally, signal handlers can only set flags and make system calls
As it happens, there are severe restrictions in ISO C regarding what signal handlers can do, and most library entry points and most APIs are not even remotely 100% multi-thread-safe, while approximately 0.0% of them are signal-handler-safe. That is, there is an absolute prohibition against calling almost anything from a signal handler.
In particular, for GHashTable, g_hash_table_ref() and g_hash_table_unref() are the only API elements that are even thread-safe, and neither of them is signal-handler-safe. ISO C only allows signal handlers to modify objects declared as volatile sig_atomic_t, and only a couple of library routines may be called.
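The conventional pattern, then, is for the handler to do nothing but set a flag, with all the GHashTable work happening in normal context. A sketch (handle_timer_event() and do_other_work() are hypothetical stand-ins for your application's code):

#include <signal.h>

volatile sig_atomic_t timer_fired = 0;

void timer_handler(int signum)
{
    timer_fired = 1;                /* the only thing the handler does */
}

/* hypothetical stand-ins for the application's real work */
void handle_timer_event(void);
void do_other_work(void);

void main_loop(void)
{
    for (;;) {
        if (timer_fired) {
            timer_fired = 0;
            /* all g_hash_table_* calls happen here, never in the handler */
            handle_timer_event();
        }
        do_other_work();
    }
}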
Some of us consider threaded systems to be intrinsically dangerous, practically radioactive sources of subtle bugs. A good place to start worrying is The Problem with Threads. (And note that signal handlers themselves are much worse. No one thinks an API is safe there...)