The function putenv is not thread-safe. If I call pthread_mutex_lock before calling putenv, can I make putenv "thread safe" that way?
I tried it, but when I run the program I get a segmentation fault.
Here is the code:
#include "apue.h"
#include <pthread.h>
pthread_mutex_t envlock = PTHREAD_MUTEX_INITIALIZER;
void *thread_func(void *arg){
    pthread_mutex_lock(&envlock);
    char env[100];
    sprintf(env,"hhh=%s",(char*)arg);
    putenv(env);
    pthread_mutex_unlock(&envlock);
    return NULL;
}
int main(){
    pthread_t thread0, thread1, thread2;
    void *retval;
    int err;
    char name0[]="thread0";
    err=pthread_create(&thread0,NULL,thread_func,(void*)name0);
    if(err!=0)
        exit(-1);
    char name1[]="thread1";
    err=pthread_create(&thread1,NULL,thread_func,(void*)name1);
    if(err!=0)
        exit(-1);
    char name2[]="thread2";
    err=pthread_create(&thread2,NULL,thread_func,(void*)name2);
    if(err!=0)
        exit(-1);
    pthread_join(thread0,&retval);
    pthread_join(thread1,&retval);
    pthread_join(thread2,&retval);
    char *hhh=getenv("hhh");
    printf("hhh is =%s\n",hhh);
    return 0;
}
putenv is reentrant in newer versions of glibc. The problem is that putenv does not copy the string it is given, so you cannot pass it a buffer that lives on your stack. Keep your char env[100] somewhere that will not be destroyed when the function returns. From the man page:
The putenv() function is not required to be reentrant, and the one in
glibc 2.0 is not, but the glibc 2.1 version is.
...
Since version 2.1.2, the glibc implementation conforms to SUSv2: the
pointer string given to putenv() is used. In particular, this string
becomes part of the environment; changing it later will change the
environment. (Thus, it is an error to call putenv() with an
automatic variable as the argument, then return from the calling
function while string is still part of the environment.)
In general, protecting a function with a lock does not automatically make it reentrant. Reentrancy means precisely that a function can be entered again, in the middle of a call to itself, without any risk to the internal data it manages in the active call. For that to happen, the function must operate only on its own stack frame, or be given pointers as parameters (on the stack, or in saved registers) to any external data objects it must act upon. This is what makes reentrancy possible.
Now I'll explain one scenario where using a non-reentrant function with the locking mechanism you propose does not work:
Suppose a function f() is executing when a signal (or an interrupt) arrives, and the signal handler itself calls f(). Because entry to f() is locked, the handler blocks on entry to f() and never returns, so the main program can never resume f() and release the lock. In the majority of cases the handler runs on the same stack as the interrupted function (I know FreeBSD uses a different context for interrupt handlers, but I don't know whether this applies to user-mode processes), so the lock cannot be released until the handler has returned, and the handler cannot return because it is waiting for the lock. Your locking scheme does not handle this kind of reentrancy.
How can this problem be avoided? Prevent the interrupts that invoke this handler from being delivered while you are in the middle of the shared region (then why lock at all?), so that they are handled after the function call completes.
Of course, if you need to call it from several threads (each with its own stack), then you need the lock after all.
Conclusion:
The lock just avoids the reentering of f(), but doesn't make it reentrant.
If I look at the implementation of pthread_equal it looks as follows:
int
__pthread_equal (pthread_t thread1, pthread_t thread2)
{
return thread1 == thread2;
}
weak_alias (__pthread_equal, pthread_equal)
with
typedef unsigned long int pthread_t;
The POSIX documentation says pthread_equal is thread-safe. But the call to pthread_equal copies, and therefore accesses, the thread1 and thread2 variables. If those variables are global and are changed at that moment by another thread, this would lead to undefined behavior.
So should pthread_t not be an atomic? Or does it behave atomically which is ensured in some other way?
This implementation of pthread_equal (which is specific to the POSIX implementation it ships with) does not access any variables except local ones. In C, arguments are always passed by value. thread1 and thread2 are local variables of the function that take their values from the expressions the caller uses when making the call.
With that said, even if they weren't, there would be no problem of thread-unsafety here. Rather, if the caller accessed pthread_t objects that could be changed concurrently by other threads without using appropriate synchronization mechanisms to preclude that, it's a data race bug in the caller, not a thread-safety matter in the callee.
When we say an operation is thread safe, we make certain assumptions on the code that invokes that operation. Those assumptions include making sure the inputs to the operation are stable and that their values remain valid and stable until the operation completes. Otherwise, nothing would be thread safe.
I have C code like this
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
void f(int);
int i=0;
int j=0;
int main() {
signal(SIGINT,f);
while(1) {
/* do something in variable `i` */
}
}
void f(int signum) {
/* do something else on variable `i` */
}
Can it produce a data race? That is, is f executed in parallel with main (even on a multi-core machine)? Or is main stopped until f finishes its execution?
First of all, according to the man page of signal(), you should use sigaction() rather than signal():
The behavior of signal() varies across UNIX versions, and has also varied historically across different versions of Linux. Avoid its use: use sigaction(2) instead. See Portability below.
But one might hope that signal() behaves sanely. However, there might be a data race because main might be interrupted before a store e.g. in a situation like this
if ( i > 10 ) {
i += j;
}
void f(int signum) {
i = 0;
}
If main is already past the comparison (or if the relevant registers are not updated when main is interrupted during the comparison), main would still do i += j, which is a data race.
So where does this leave us? Never modify, in a signal handler, globals that are also modified elsewhere, unless you can guarantee that the handler cannot interrupt that operation (e.g. by blocking the signal for the duration of certain operations).
Unless you use the raise() from Standard C or kill() with the value from getpid() as the PID argument, signal events are asynchronous.
In single-threaded code on a multi-core machine, it means that you cannot tell what is happening in the 'do something to variable i' code. For example, that code might have just fetched the value from i and have incremented it, but not yet saved the incremented value. If the signal handler function f() reads i, modifies it in a different way, saves the result and returns, the original code may now write the incremented value of i instead of using the value modified by f().
This is what leads to the many constraints on what you can do in a signal handler. For example, it is not safe to call printf() in a signal handler because it might need to do memory allocation (malloc()) and yet the signal might have arrived while malloc() was modifying its linked lists of available memory. The second call to malloc() might get thoroughly confused.
So, even in a single-threaded program, you have to be aware and very careful about how you modify global variables.
However, in a single-threaded program, there will be no activity from the main loop while the signal is being handled. Indeed, even in a multi-threaded program, the thread that receives (handles) the signal is suspended while the signal handler is running, but other threads are not suspended, so there could be concurrent activity from other threads. If it matters, make sure that access to the variables is properly serialized.
See also:
What is the difference between sigaction() and signal()?
Signal concepts.
Is it possible to read the registers or thread local variables of another thread directly, that is, without requiring to go to the kernel? What is the best way to do so?
You can't read the registers, which wouldn't be useful anyway. But reading thread local variables from another thread is easily possible.
Depending on the architecture (e.g. strong memory ordering as on x86_64), you can safely do it even without synchronization, provided that the value read does not in any way affect the thread it belongs to. One scenario would be displaying a thread-local counter or similar.
Specifically on Linux on x86_64, as you tagged, you could do it like this:
// A thread-local variable. __thread is a GCC extension; C++11 spells it thread_local.
__thread int some_tl_var;
// The pointer to the thread-local. In itself NOT thread-local, as it will be
// read from the outside world.
struct thread_data {
    int *psome_tl_var;
    ...
};
// The function started by pthread_create. The pointer needs to be initialized
// here, and NOT when the storage for the objects used by the thread is allocated
// (otherwise it would point to the thread-local of the controlling thread).
void *thread_run(void *arg) {
    struct thread_data *pdata = (struct thread_data *)arg;
    pdata->psome_tl_var = &some_tl_var;
    // Now do some work...
    // ...
    return NULL;
}
void start_threads() {
    ...
    struct thread_data other_thread_data[NTHREADS];
    pthread_t tids[NTHREADS];
    for (int i=0; i<NTHREADS; ++i) {
        pthread_create(&tids[i], NULL, thread_run, &other_thread_data[i]);
    }
    // Now you can access each some_tl_var (once its thread has stored the pointer) as
    int value = *(other_thread_data[i].psome_tl_var);
    ...
}
I used something similar for displaying statistics about worker threads. It is even easier in C++: if you create objects around your threads, just make the pointer to the thread-local a field in your thread class and access it with a member function.
Disclaimer: This is non portable, but it works on x86_64, linux, gcc and may work on other platforms too.
There's no way to do it without involving the kernel, and in fact I don't think it could be meaningful to read them anyway without some sort of synchronization. If you don't want to use ptrace (which is ugly and non-portable) you could instead choose one of the realtime signals to use for a "send me your registers/TLS" message. The rough idea is:
1. Lock a global mutex for the request.
2. Store the information on what data you want (e.g. a pthread_key_t or a special value meaning registers) from the thread in global variables.
3. Signal the target thread with pthread_kill.
4. In the signal handler (which should have been installed with sigaction and SA_SIGINFO) use the third void * argument to the signal handler (which really points to a ucontext_t) to copy that ucontext_t to the global variable used to communicate back to the requesting thread. This will give it all the register values, and a lot more. Note that TLS is a bit more tricky since pthread_getspecific is not async-signal-safe and technically not legal to run in this context...but it probably works in practice.
5. The signal handler posts a semaphore (this is the ONLY async-signal-safe synchronization function offered by POSIX) indicating to the requesting thread that it's done, and returns.
6. The requesting thread finishes by waiting on the semaphore, then reads the data and unlocks the request mutex.
Note that this will involve at least 1 transition to kernelspace (pthread_kill) in the requesting thread (and maybe another in sem_wait), and 1-3 in the target thread (1 for returning from the signal handler, one for entering the signal handler if it was not already sleeping in kernelspace, and possibly one for sem_post). Still it's probably faster than mucking around with ptrace which is not designed for high-performance usage...
From the question:
Is it good programming practice to use setjmp and longjmp in C?
Two of the comments left said:
"You can't throw an exception in a signal handler, but you can do a
longjmp safely -- as long as you know what you are doing. – Dietrich
Epp Aug 31 at 19:57
@Dietrich: +1 to your comment. This is a little-known and
completely-under-appreciated fact. There are a number of problems that
cannot be solved (nasty race conditions) without using longjmp out of
signal handlers. Asynchronous interruption of blocking syscalls is the
classic example."
I was under the impression that signal handlers were called by the kernel when it encountered an exceptional condition (e.g. divide by 0). Also, that they're only called if you specifically register them.
This would seem to imply (to me) that they aren't called through your normal code.
Moving on with that thought... setjmp and longjmp, as I understand them, are for collapsing the stack back to a previous point and state. I don't understand how you can collapse a stack when a signal handler is called, since it's called from the kernel as a one-off circumstance rather than from your own code. What's the next thing up the stack from a signal handler!?
The way the kernel "calls" a signal handler is by interrupting the thread, saving the signal mask and processor state in a ucontext_t structure on the stack just beyond (below, on grows-down implementations) the interrupted code's stack pointer, and restarting execution at the address of the signal handler. The kernel does not need to keep track of any "this process is in a signal handler" state; that's entirely a consequence of the new call frame that was created.
If the interrupted thread was in the middle of a system call, the kernel will back out of the kernelspace code and adjust the return address to repeat the system call (if SA_RESTART is set for the signal and the system call is a restartable one) or put EINTR in the return code (if not restartable).
It should be noted that longjmp is async-signal-unsafe. This means it invokes undefined behavior if you call it from a signal handler if the signal interrupted another async-signal-unsafe function. But as long as the interrupted code is not using library functions, or only using library functions that are marked async-signal-safe, it's legal to call longjmp from a signal handler.
Finally, my answer is based on POSIX since the question is tagged unix. If the question were just about pure C, I suspect the answer is somewhat different, but signals are rather useless without POSIX anyway...
longjmp does not perform normal stack unwinding. Instead, the stack pointer is simply restored from the context saved by setjmp.
Here is an illustration on how this can bite you with non-async-safe critical parts in your code. It is advisable to e.g. mask the offending signal during critical code.
It is worth reading http://man7.org/linux/man-pages/man2/sigreturn.2.html to see how Linux handles signal handler invocation and, in this case, how it manages signal handler exit. My reading of it suggests that executing a longjmp() from a signal handler (resulting in no call of sigreturn()) might be at best "undefined". You also have to take into account on which thread (and thus user stack) setjmp() was called, and on which thread (and thus user stack) longjmp() is subsequently called.
This doesn't answer the question of whether or not it is "good" to do this, but this is how to do it. In my application, I have a complicated interaction between custom hardware, huge pages, shared memory, NUMA-locked memory, etc., and it is possible to have memory that seems to be properly allocated but, when you touch it (write to it, in this case), raises a bus error or segmentation fault in the middle of the application. I wanted a way of testing memory addresses to make sure that the shared memory wasn't locked to a node that didn't have enough memory, so that the program would fail early with graceful error messages. So these signal handlers are ONLY used for this one piece of code (a small memcpy of 5 bytes) and not used to rescue the app while it is in use. I think it is safe here.
Apologies if this is not "correct". Please comment and I'll fix it up. I cobbled it together based on hints and some sample code that didn't work.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
sigjmp_buf JumpBuffer;
void handler(int);
volatile sig_atomic_t count = 0; // modified inside the signal handler
int main(void)
{
    struct sigaction sa;
    sa.sa_handler = handler;
    sa.sa_flags = 0; // no special flags; don't leave this field uninitialized
    sigemptyset(&(sa.sa_mask));
    sigaddset(&(sa.sa_mask), SIGSEGV);
    sigaction(SIGSEGV, &sa, NULL);
    while (1) {
        int r = sigsetjmp(JumpBuffer,1);
        if (r == 0) {
            printf("Ready for memcpy, count=%d\n",(int)count);
            usleep(1000000);
            char buffer[10];
#if 1
            char* dst = buffer; // this won't do bad
#else
            char* dst = NULL; // this will cause a segfault
#endif
            memcpy(dst,"12345",5); // trigger seg fault here
            siglongjmp(JumpBuffer,2); // a sigjmp_buf must be paired with siglongjmp
        }
        else if (r == 1)
        {
            printf("SEGV. count %d\n",(int)count);
        }
        else if (r == 2)
        {
            printf("No segv. count %d\n",(int)count);
        }
    }
    return 0;
}
void handler(int sig)
{
    (void)sig;
    count++;
    siglongjmp(JumpBuffer, 1);
}
References
https://linux.die.net/man/3/sigsetjmp
https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html
http://www.csl.mtu.edu/cs4411.ck/www/NOTES/non-local-goto/sig-1.html
https://www.gnu.org/software/libc/manual/html_node/Longjmp-in-Handler.html
In most systems a signal handler has its own stack, separate from the main stack. That's why you could longjmp out of a handler. I think it's not a wise thing to do though.
You can't use longjmp to get out of a signal handler.
The reason for this is that setjmp only saves the resources (processor registers, etc.) that the calling convention requires to be preserved across a plain function call.
When an interrupt occurs, the function being interrupted may have a much larger state, and it will not be restored correctly by longjmp.
We have a C++ shared library that uses ZeroC's Ice library for RPC and unless we shut down Ice's runtime, we've observed child processes hanging on random mutexes. The Ice runtime starts threads, has many internal mutexes and keeps open file descriptors to servers.
Additionally, we have a few mutexes of our own to protect our internal state.
Our shared library is used by hundreds of internal applications so we don't have control over when the process calls fork(), so we need a way to safely shutdown Ice and lock our mutexes while the process forks.
Reading the POSIX standard on pthread_atfork() on handling mutexes and internal state:
Alternatively, some libraries might have been able to supply just a child routine that reinitializes the mutexes in the library and all associated states to some known value (for example, what it was when the image was originally executed). This approach is not possible, though, because implementations are allowed to fail *_init() and *_destroy() calls for mutexes and locks if the mutex or lock is still locked. In this case, the child routine is not able to reinitialize the mutexes and locks.
On Linux, this test C program returns EPERM from pthread_mutex_unlock() in the child pthread_atfork() handler. Linux requires adding _NP to the PTHREAD_MUTEX_ERRORCHECK macro for it to compile.
This program is linked from this good thread.
Given that it's technically not safe or legal to unlock or destroy a mutex in the child, I'm thinking it's better to have pointers to mutexes and then have the child make new pthread_mutex_t on the heap and leave the parent's mutexes alone, thereby having a small memory leak.
The only issue is how to reinitialize the state of the library, and I'm thinking of resetting a pthread_once_t. Since POSIX provides an initializer for pthread_once_t, maybe it can be reset to its initial state.
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static pthread_mutex_t *mutex_ptr = 0;
static void
setup_new_mutex()
{
mutex_ptr = malloc(sizeof(*mutex_ptr));
pthread_mutex_init(mutex_ptr, 0);
}
static void
prepare()
{
pthread_mutex_lock(mutex_ptr);
}
static void
parent()
{
pthread_mutex_unlock(mutex_ptr);
}
static void
child()
{
// Reset the once control.
pthread_once_t once = PTHREAD_ONCE_INIT;
memcpy(&once_control, &once, sizeof(once_control));
}
static void
init()
{
setup_new_mutex();
pthread_atfork(&prepare, &parent, &child);
}
int
my_library_call(int arg)
{
pthread_once(&once_control, &init);
pthread_mutex_lock(mutex_ptr);
// Do something here that requires the lock.
int result = 2*arg;
pthread_mutex_unlock(mutex_ptr);
return result;
}
In the above sample in the child() I only reset the pthread_once_t by making a copy of a fresh pthread_once_t initialized with PTHREAD_ONCE_INIT. A new pthread_mutex_t is only created when the library function is invoked in the child process.
This is hacky, but it may be the best way of dealing with this while skirting the standards. If the pthread_once_t contains a mutex, then the system must have a way of initializing it from its PTHREAD_ONCE_INIT state. If it contains a pointer to a mutex allocated on the heap, then it'll be forced to allocate a new one and set the address in the pthread_once_t. I'm hoping it doesn't use the address of the pthread_once_t for anything special, which would defeat this.
Searching the comp.programming.threads group for pthread_atfork() shows a lot of good discussion and how little the POSIX standard really provides to solve this problem.
There's also the issue that one should only call async-signal-safe functions from pthread_atfork() handlers, and it appears the most important one is the child handler, where only a memcpy() is done.
Does this work? Is there a better way of dealing with the requirements of our shared library?
Congratulations, you found a defect in the standard. pthread_atfork is fundamentally unable to solve the problem it was created to solve with mutexes, because the handler in the child is not permitted to perform any operations on them:
It cannot unlock them, because the caller would be the new main thread in the newly created child process, and that's not the same thread as the thread (in the parent) that obtained the lock.
It cannot destroy them, because they are locked.
It cannot re-initialize them, because they have not been destroyed.
One potential workaround is to use POSIX semaphores in place of mutexes here. A semaphore does not have an owner, so if the parent process locks it (sem_wait), both the parent and child processes can unlock (sem_post) their respective copies without invoking any undefined behavior.
As a nice aside, sem_post is async-signal-safe and thus definitely legal for the child to use.
I consider this a bug in the programs calling fork(). In a multi-threaded process, the child process should call only async-signal-safe functions. If a program wants to fork without exec, it should do so before creating threads.
There isn't really a good solution for threaded fork()/pthread_atfork(). Some chunks of it appear to work, but this is not portable and liable to break across OS versions.