Difference between the main thread and other threads in C

Is there a difference between the first thread and other threads created during runtime? I ask because I have a program where longjmp is used to abort, and a thread should be able to terminate the program (exit or abort don't work in my case). Could I safely use pthread_kill_other_threads_np and then longjmp?

I'm not sure what platform you're talking about, but pthread_kill_other_threads_np is not a standard function and is not a remotely reasonable operation, any more than free_all_malloced_memory would be. Process termination inherently involves the termination of all threads atomically with respect to each other (they don't see each other terminate).
As for longjmp, there is nothing wrong with longjmp itself, but you cannot use it to jump to a context in a different thread.
It sounds like you have an XY problem here; you've asked about whether you can use (or how to use) particular tools that are not the right tool for whatever it is you want, without actually explaining what your constraints are.

Related

Ways to detect if any other threads exist (e.g. prior to fork)

Background, from POSIX:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
The difficulty is that we generally don't know if we're a multi-threaded process, since threads may have been created by library code. And "async-signal-safe" is a quite-severe restriction.
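To make that restriction concrete, here is a minimal sketch of the pattern the quoted text permits in a possibly multi-threaded parent: fork, then nothing but async-signal-safe calls in the child until exec. "/bin/true" and the helper name spawn_true are just placeholders.

#include <unistd.h>

/* Sketch only: whatever libraries may have created threads, the child
   touches nothing but async-signal-safe interfaces (execve, _exit)
   before the exec. */
int spawn_true(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        char *const argv[] = { "true", NULL };
        char *const envp[] = { NULL };
        execve("/bin/true", argv, envp);
        _exit(127);                     /* exec failed; _exit is AS-safe */
    }
    return 0;                           /* parent: child pid could be waited on */
}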
It is nonsensical to ask "how many threads are there", since if other threads are still running, they may be exiting or creating new threads while we ask. We can, however, get answers (or partial answers) to simpler questions:
Is it even possible for other threads to exist?
Am I the only thread that ever existed?
Am I the only thread that exists right now?
...
For simplicity's sake let's assume:
we're not in a signal handler
nobody is mad enough to invoke UB by calling pthread_create or C11's thrd_create from a signal handler
nobody is doing threads outside of pthreads, C11, and C++11
C++11 threads appear to always be implemented in terms of pthreads (on platforms that support fork, at least)
C11 threads are very similar to pthreads, although we sometimes have to handle the functions separately.
Answers that involve arcane implementation details are encouraged, as long as they are (fairly) stable.
Some partial answers (more still needed):
Question 1 is addressed by libstdc++'s __gthread_active_p() for several libc implementations. The header is compatible with C, but it is a static function in a C++-only part of the include path, and it also relies on the existence of the macro __GXX_WEAK__, which is only predefined for C++. (libc++ unconditionally pulls in pthreads.)
Unfortunately, this is dangerously unreliable for the dlopen case (race conditions in correct user code), see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78017
Question 2 can be addressed by installing interceptors for pthread_create and thrd_create. But this can potentially be finicky (see the comments in gthr.h about interceptors).
If the results of calling clock_gettime with CLOCK_PROCESS_CPUTIME_ID and with CLOCK_THREAD_CPUTIME_ID differ, this may be proof that another thread has existed, but beware of races, resolution, and clock settability (setting these clocks is not possible on Linux, but POSIX potentially allows it); see the sketch below.
Question 3 is the interesting one, anyway:
GDB is likely to know the answer, but spawning a whole other process seems unnecessary (e.g. answers involving ps should be rewritten to use /proc/ directly), and it may not have permission anyway.
libthread_db.so exists but appears undocumented except in the original Solaris version. It looks like it might be possible to implement the proc_service.h callbacks for the current process however, if we ignore the "stop" part ...
On Linux, if gettid() != getpid(), you're not the main thread, thus there probably are at least two threads. (it's possible for the main thread to call pthread_exit, but this is weird)
A (somewhat) more portable version of the preceding: use __attribute__((constructor)) (or politely ask your caller) to stash the value of pthread_self() for the main thread. Unfortunately, there is a disturbing comment in libstdc++'s <thread> header (grep for __GLIBC__) about returning 0 (but I cannot reproduce this).
On Linux, if /proc is mounted and accessible, you can enumerate /proc/self/task/ (see the sketch below). The code to do this is portable; it will just fail on OSes that don't provide this (are there others that provide this much?). Is /proc/self/status or /proc/self/stat any more portable? They have less information (and stat is hard to parse securely), but we probably don't need any more. (Need to test these for the "main thread exited" case.)
On GLIBC, we could possibly read the debug symbols to find the multiple_threads flag (sometimes global, sometimes part of struct pthread - ugh). But this is probably similar to libthread_db.so
Similarly MUSL has a count (minus one) and a linked list ... though it prefers to take an internal lock first. If we're only reading, is it safe to skip that?
If we block a signal and then kill the current process (not thread) with it, and our thread isn't the one that receives it, we know that other threads must exist to handle it. But there's no way to know how long to wait, and signals are dangerous global state anyway.
On Linux, unshare(2) ignores CLONE_THREAD for single-threaded processes and errors for multithreaded processes! (There's also some harder cases with user namespaces but I don't think they're needed)
On Linux, SELinux's setcon(3) is guaranteed to fail for multithreaded processes under certain conditions. This requires investigation; it takes some steps to correlate the kernel implementation to userland headers (there is a userland library involved).
From grepping kernel sources those are the only 2 that use specific functions, but there's nothing stopping other functions from being implemented on the same data structures.
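Minimal sketches of three of the probes mentioned above: the gettid/getpid comparison, the /proc/self/task enumeration, and the CPU-clock comparison. The first two are Linux-only, the clock check is only a heuristic, and every answer can be stale by the time the caller looks at it; the function names and the 10 ms slack are arbitrary choices.

#define _GNU_SOURCE
#include <dirent.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Question 3 (partially): on Linux the main thread's TID equals the PID. */
static int i_am_main_thread(void) {
    return (pid_t)syscall(SYS_gettid) == getpid();
}

/* Question 3: count the entries in /proc/self/task; returns -1 if /proc
   is not mounted or not accessible. The count is inherently racy. */
static int thread_count(void) {
    DIR *d = opendir("/proc/self/task");
    if (!d)
        return -1;
    int n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL)
        if (strcmp(e->d_name, ".") != 0 && strcmp(e->d_name, "..") != 0)
            n++;
    closedir(d);
    return n;
}

/* Question 2 heuristic: if the process has burned noticeably more CPU time
   than the calling thread, something else must have consumed the
   difference. The 10 ms slack is an arbitrary allowance for clock
   resolution and the cost of the two calls themselves. */
static int another_thread_probably_existed(void) {
    struct timespec p, t;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &p) != 0 ||
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t) != 0)
        return -1;                      /* clocks unavailable */
    double diff = (p.tv_sec - t.tv_sec) + (p.tv_nsec - t.tv_nsec) / 1e9;
    return diff > 0.01;
}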

Better replacement for exit(), atexit() in C

I am new to C programming. I used to think using exit() was the cleanest way of process termination (as it is capable of removing temporary files, closing open files, normal process termination...), but when I tried man exit command on the terminal (Ubuntu 16.04.5, gcc 5.4.0) I saw the following line:
The exit() function uses a global variable that is not protected, so
it is not thread-safe.
After that I tried to do some research on a better replacement for exit() (to change my programming habits from the beginning). While doing that I came across this question, in which the side effects of exit() are mentioned and it is suggested that using atexit() properly solves the problem (at least partially).
There were some cases in which using abort() was preferred over exit(). On top of that, this question suggests that atexit() might also be harmful.
So here are my questions:
Is there any general and better way of terminating a process (one that is guaranteed to clean up like exit() and is not harmful to the system in any case)?
If the answer to the first question is no, what are the best possible ways of terminating a process (including the cases in which each is most useful)?
what is the best possible way of process terminating
1. If the program is single-threaded, just use exit(); there is no thread-safety issue to worry about.
2. Otherwise, make sure that all threads but one have ended before termination; the last remaining thread can then safely call exit() because of 1. above (a minimal sketch follows).
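A minimal sketch of point 2, assuming four hypothetical worker threads: join everything else, and only then let the last remaining thread call exit().

#include <pthread.h>
#include <stdlib.h>

static void *worker(void *arg) {
    (void)arg;
    /* ... real work ... */
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);

    /* Make sure every other thread has ended before the survivor exits. */
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);

    exit(EXIT_SUCCESS);                 /* now equivalent to the single-threaded case */
}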
Given that power/hardware failures can happen at any time, that reliably terminating threads with user code is extremely difficult, and that the use of memory pools etc. in many non-trivial multithreaded apps is chaotic, it is better to design apps and systems that can clean up temp files etc. on start-up, rather than trying to micro-manage shutdown.
'Clean up all the resources you allocate before you exit' sounds like good advice in a classroom or lecture, but quickly becomes a whole chain of albatross round your neck when faced with a dozen threads, queues and pools in a continually changing dynamic system.
If you can, i.e. if you are running under a non-trivial OS, let it do its job and clean up for you. It's much better at it than your user code will ever be.

Race conditions can also occur in traditional, single-threaded programs - Clarity

I have read a few books on parallel programming over the past few months and I decided to close it off by learning about POSIX threads.
I am reading "PThreads Programming - A POSIX Standard for Better Multiprocessing" (a Nutshell handbook). In chapter 5 (Pthreads and UNIX) the author talks about handling signals in multi-threaded programs. In the "Threadsafe Library Functions and System Calls" section, the author made a statement that I have not seen in most books that I have read on parallel programming. The statement was:
Race conditions can also occur in traditional, single-threaded programs that use signal handlers or that call routines recursively. A single-threaded program of this kind may have the same routine in progress in various call frames on its process stack.
I find it a little bit tedious to decipher this statement. Does the race condition in the recursive function occur when the recursive function keeps an internal structure by using the static storage type?
I would also love to know how signal handlers can cause race conditions in single-threaded programs.
Note: I am not a computer science student, so I would really appreciate simplified terms.
I don't think one can call it a race condition in the classical sense. Race conditions have somewhat stochastic behavior, depending on scheduler policy and timing.
The author is probably talking about bugs that can arise when the same object/resource is accessed from multiple recursive calls. But this behavior is completely deterministic and manageable.
Signals, on the other hand, are a different story: they occur asynchronously and can interrupt some data processing in the middle, trigger some other processing on that data, and corrupt it by the time control returns to the interrupted task.
A signal handler can be called at any time without warning, and it potentially can access any global state in the program.
So, suppose your program has some global flag that the signal handler sets in response to, say, SIGINT. And your program checks the flag before each call to f(x).
if (! flag) {
    f(x);
}
That's a data race. There is no guarantee that f(x) will not be called after the signal happens because the signal could sneak in at any time, including right after the "main" program tests the flag.
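Spelled out a little more fully; flag, f() and on_sigint are hypothetical names for the pieces described in this answer, and the handler would be registered with sigaction() somewhere at startup.

#include <signal.h>

void f(int x);                          /* the hypothetical worker from above */

static volatile sig_atomic_t flag = 0;  /* a type a handler may safely write to */

static void on_sigint(int sig) {
    (void)sig;
    flag = 1;                           /* "please stop calling f()" */
}

/* The test and the call are two separate steps; SIGINT can be delivered
   between them, so f(x) may still run once after the flag was set. */
void maybe_call_f(int x) {
    if (!flag)
        f(x);
}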
First it is important to understand what a race condition is. The definition given by Wikipedia is:
Race conditions arise in software when an application depends on the sequence or timing of processes or threads for it to operate properly.
The important thing to note is that a program can behave both properly and improperly based on timing or ordering of execution.
We can fairly easily create "dummy" race conditions in single threaded programs under this definition.
#include <stdbool.h>
#include <time.h>

/* Correct only if it happens to be called during the same second as 'then'. */
bool isnow(time_t then) {
    time_t now = time(0);
    return now == then;
}
The above function is a very dumb example; most of the time it will not give the right answer, but sometimes it will. The correct vs. incorrect behavior depends entirely on timing, and so it represents a race condition on a single thread.
Taking it a step further we can write another dummy program.
#include <stdio.h>
#include <unistd.h>

void printHello(void) {
    sleep(10);
    printf("Hello\n");
}
The expected behavior of the above program is to print "Hello" after waiting 10 seconds.
If we send a SIGINT signal 11 seconds after calling our function, everything behaves as expected. If we send a SIGINT signal 3 seconds after calling our function, the program behaves improperly and does not print "Hello".
The only difference between the correct and incorrect behavior was the timing of the SIGINT signal. Thus, a race condition was introduced by signal handling.
I'm going to give a more general answer than you asked for. And this is my own, personal, pragmatic answer, not necessarily one that hews to any official, formal definition of the term "race condition".
Me, I hate race conditions. They lead to huge classes of nasty bugs that are hard to think about, hard to find, and sometimes hard to fix. So I don't like doing programming that's susceptible to race conditions. So I don't do much classically multithreaded programming.
But even though I don't do much multithreaded programming, I'm still confronted by certain classes of what feel to me like race conditions from time to time. Here are the three I try to keep in mind:
The one you mentioned: signal handlers. Receipt of a signal, and calling of a signal handler, is a truly asynchronous event. If you have a data structure of some kind, and you're in the middle of modifying it when a signal occurs, and if your signal handler also tries to modify that same data structure, you've got a race condition. If the code that was interrupted was in the middle of doing something that left the data structure in an inconsistent state, the code in the signal handler might be confused. Note, too, that it's not necessarily code right in the signal handler, but any function called by the signal handler, or called by a function that's called by the signal handler, etc.
Shared OS resources, typically in the filesystem: If your program accesses (or modifies) a file or directory in the filesystem that's also being accessed or modified by another process, you've got a big potential for race conditions. (This is not surprising, because in a computer science sense, multiple processes are multiple threads. They may have separate address spaces meaning they can't interfere with each other that way, but obviously the filesystem is a shared resource where they still can interfere with each other.)
Non-reentrant functions like strtok. If a function maintains internal, static state, you can't have a second call to that function if another instance is active (a short sketch follows this list). This is not a "race condition" in the formal sense at all, but it has many of the same symptoms, and also some of the same fixes: don't use static data; do try to write your functions so that they're reentrant.
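As a small illustration of the strtok point, the nested loops below silently interfere with each other through strtok's hidden static state. This is only a sketch; strtok_r (POSIX) or strtok_s (C11 Annex K) avoid the problem by making that state explicit.

#include <stdio.h>
#include <string.h>

int main(void) {
    char csv[]   = "a,b,c";
    char words[] = "1 2 3";

    /* strtok keeps its position in hidden static state, so the inner loop
       clobbers the outer loop's progress: this prints "a 1", "a 2", "a 3"
       and then stops, instead of pairing every letter with every number. */
    for (char *o = strtok(csv, ","); o != NULL; o = strtok(NULL, ","))
        for (char *w = strtok(words, " "); w != NULL; w = strtok(NULL, " "))
            printf("%s %s\n", o, w);

    return 0;
}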
The author of the book in which you found that statement seems to be defining the term "race condition" in an unusual manner, or maybe he's just used the wrong term.
By the usual definition, no, recursion does not create race conditions in single-threaded programs because the term is defined with respect to the respective actions of multiple threads of execution. It is possible, however, for a recursion to produce exposure to non-reentrancy of some of the functions involved. It's also possible for a single thread to deadlock against itself. These do not reflect race conditions, but perhaps one or both of them is what the author meant.
Alternatively, maybe what you read is the result of a bad editing job. The text you quoted groups functions that employ signal handling together with recursive functions, and signal handlers indeed can produce data races, just as multiple threads can, because execution of a signal handler has the relevant characteristics of execution of a separate thread.
Race conditions absolutely happen in single-threaded programs once you have signal handlers. Look at the Unix manual page for pselect().
One way it happens is like this: You have a signal handler that sets some global flag. You check your global flag and because it is clear you make a system call that suspends, confident that when the signal arrives the system call will exit early. But the signal arrives just after you check the global flag and just before the system call takes place. So now you're hung in a system call waiting for a signal that has already arrived. In this case, the race is between your single-threaded code and an external signal.
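A sketch of that lost-wakeup race and of the pselect() idiom that closes it; SIGINT, the flag name and the helper names are just examples, and on_signal would be registered with sigaction().

#include <signal.h>
#include <stddef.h>
#include <sys/select.h>

static volatile sig_atomic_t got_signal = 0;

static void on_signal(int sig) { (void)sig; got_signal = 1; }

/* Racy: the signal can be delivered between the flag test and select(),
   and then select() blocks waiting for a wakeup that already happened. */
static void racy_wait(void) {
    if (!got_signal)
        select(0, NULL, NULL, NULL, NULL);          /* may hang forever */
}

/* pselect() closes the window: the signal stays blocked while the flag is
   tested and is unblocked atomically only for the duration of the wait. */
static void safe_wait(void) {
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);                      /* SIGINT is just an example */
    sigprocmask(SIG_BLOCK, &block, &old);
    if (!got_signal)
        pselect(0, NULL, NULL, NULL, NULL, &old);   /* waits with the old mask */
    sigprocmask(SIG_SETMASK, &old, NULL);
}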
Well, consider the following code:
#include <pthread.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int num = 2;

void lock_and_call_again() {
    pthread_mutex_lock(&mutex);    /* the second call blocks here forever */
    if (num > 0) {
        --num;
        lock_and_call_again();     /* recurse while still holding the lock */
    }
}

int main(int argc, char** argv) {
    lock_and_call_again();
}
(Compile with gcc -pthread thread-test.c if you save the code as thread-test.c)
This is clearly single-threaded, isn't it?
Nevertheless, it will enter a deadlock, because you try to lock an already-locked mutex.
That's basically what is meant within the paragraph you cited, IMHO:
It does not matter whether it is done in several threads or in one single thread: if you try to lock an already-locked mutex, your program ends up in a deadlock.
If a function calls itself, like lock_and_call_again above, that is what is called a recursive call.
Just as james large explains, a signal can occur at any time, and if a signal handler is registered for that signal, it will be called at unpredictable times if no measures are taken, even while the same handler is already being executed - yielding a kind of implicit recursive execution of the signal handler.
If this handler acquires some kind of lock, you end up in a deadlock, even without a function calling itself explicitly.
Consider the following function:
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void my_handler(int s) {
    pthread_mutex_lock(&mutex);
    sleep(10);
    pthread_mutex_unlock(&mutex);
}
Now if you register this function as the handler for a particular signal, it will be called whenever the signal is caught by your program. If the handler has been called and is sleeping, the signal might arrive again, the handler might be called again, and that second invocation will try to lock the mutex that is already locked.
Regarding the wording of the citation:
"A single-threaded program of this kind may have the same routine in progress in various call frames on its process stack."
When a function gets called, some information is stored on the process's stack - e.g. the return address. This information is called a call frame. If you call a function recursively, like in the example above, this information gets stored on the stack several times - several call frames are stored.
It's stated a little bit clumsily, I admit...

How to properly terminate a thread in a signal handler?

I want to set up a signal handler for SIGSEGV, SIGILL and possibly a few other signals that, rather than terminating the whole process, just terminates the offending thread and perhaps sets a flag somewhere so that a monitoring thread can complain and start another thread. I'm not sure there is a safe way to do this. Pthreads seems to provide functions for exiting the current thread, as well as canceling another thread, but these potentially call a bunch of at-exit handlers. Even if they don't, it seems as though there are many situations in which they are not async-signal-safe, although it is possible that those situations are avoidable. Is there a lower-level function I can call that just destroys the thread? Assuming I modify my own data structures in an async-signal-safe way, and acquire no mutexes, are there pthread/other global data structures that could be left in an inconsistent state simply by a thread terminating at a SIGSEGV? malloc comes to mind, but malloc itself shouldn't SIGSEGV/SIGILL unless the libc is buggy. I realize that POSIX is very conservative here, and makes no guarantees. As long as there's a way to do this in practice I'm happy. Forking is not an option, btw.
If the SIGSEGV/SIGILL/etc. happens in your own code, the signal handler will not run in an async-signal context (it's fundamentally a synchronous signal, but would still be an AS context if it happened inside a standard library function), so you can legally call pthread_exit from the signal handler. However, there are still issues that make this practice dubious:
SIGSEGV/SIGILL/etc. never occur in a program whose behavior is defined unless you generate them via raise, kill, pthread_kill, sigqueue, etc. (and in some of these special cases, they would be asynchronous signals). Otherwise, they're indicative of a program having undefined behavior. If the program has invoked undefined behavior, all bets are off. UB is not isolated to a particular thread or a particular sequence in time. If the program has UB, its entire output/behavior is meaningless.
If the program's state is corrupted (e.g. due to access-after-free, use of invalid pointers, buffer overflows, ...) it's very possible that the first faulting access will happen inside part of the standard library (e.g. inside malloc) rather than in your code. In this case, the signal handler runs in an AS-safe context and cannot call pthread_exit. Of course the program already has UB anyway (see the above point), but even if you wanted to pretend that's not an issue, you'd still be in trouble.
If your program is experiencing these kinds of crashes, you need to find the cause and fix it, not try to patch around it with signal handlers. Valgrind is your friend. If that's not possible, your best bet is to isolate the crashing code into separate processes where you can reason about what happens if they crash asynchronously, rather than having the crashing code in the same process (where any further reasoning about the code's behavior is invalid once you know it crashes).

pausing main thread execution other than sleep() in C

I need to pause the execution of the main thread without using a sleep statement.
Is there any function or status value that shows the alive status of other threads, like isAlive() in Java?
pause() often works well; it suspends execution until a signal is received.
Standard C provides no way to pause the main thread, because standard C has no concept of threads. (That's changing in C201X, but that new version of the standard isn't quite finished, and there are no implementations of it.)
Even sleep() (which is a function, not a language-defined statement) is implementation-specific.
So it's not really possible to answer your question without knowing what environment you're using. Do you have multiple threads? If so, what threading library are you using? Pthreads? Win32 threads?
Why does sleep() not satisfy your requirements? (Probably because it pauses all threads, not just the current one.)
(Hint: Whenever you ask "How do I do X without using Y?", tell us why you can't use Y.)
Consult the documentation for whatever thread library you're using. It should provide a function that does what you need.
An extremely simple approach would be using something as simple as getchar().
Another approach could be waiting for a signal with pthread_cond_wait (or any similar function in a different threading API).
Yet another approach could be sitting in a tight loop and using a semaphore (or something simpler, like a global variable) to wait for the other threads to finish.
Anyway, there are several options. You don't say enough about your problem to tell what's your best choice here.
select() is often a good choice.
On Linux, epoll() is often a good alternative to select().
And every program, "threaded" or not, always has a "main thread". If you're actually using threads, however, look at pthread_cond_wait().
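A minimal sketch along those lines, with hypothetical names: a counter of live worker threads protected by a mutex plays the role of Java's isAlive(), and the main thread blocks in pthread_cond_wait() instead of sleeping.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  all_done = PTHREAD_COND_INITIALIZER;
static int live_threads = 0;            /* "how many workers are still alive?" */

static void *worker(void *arg) {
    (void)arg;
    /* ... real work ... */
    pthread_mutex_lock(&lock);
    if (--live_threads == 0)
        pthread_cond_signal(&all_done); /* wake main when the last one ends */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t t[3];
    live_threads = 3;
    for (int i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, worker, NULL);

    pthread_mutex_lock(&lock);
    while (live_threads > 0)            /* is anyone still alive? */
        pthread_cond_wait(&all_done, &lock);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < 3; i++)         /* reclaim thread resources */
        pthread_join(t[i], NULL);
    puts("all workers finished");
    return 0;
}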
