I have a program that has a static pthread_key_t key variable and a function that calls pthread_key_create(&key, &cleanup_function) when the first thread is started. I don't want to call pthread_key_delete in the cleanup routine because it will run whenever a thread exits instead of when all threads have exited. This will cause problems if another thread calls get_specific or set_specific later on.
My question is: Can I just completely leave out pthread_key_delete? Will this (having called pthread_key_create without calling pthread_key_delete afterward) cause any memory leak when the program eventually comes to a halt? Is it mandatory to call pthread_key_delete after creating a pthread_key_t? Or does the key just go into the garbage collector or get destructed once the entire problem ends?
static pthread_key_t key = NULL;
...
static void cleanup(void *value) {
...
if (thread_exit_callback) {
thread_exit_callback(value);
}
free(value);
}
static void *start(void *value) {
...
if (key == NULL){
pthread_key_create(&key, &cleanup);
}
pthread_setspecific(key, value);
...
}
The program looks something like this.
I have a program that has a static pthread_key_t key variable and a function that calls pthread_key_create(&key, &cleanup_function) when the first thread is started.
I presume you mean the second thread, or the first started via pthread_create() Every process has one thread at startup. All code that is executed is executed in a thread.
Also, "when" is a bit fiddly. It is cleanest for the initial thread to create the key, and that before launching any other threads. Otherwise, you need to engage additional synchronization to avoid a data race.
I don't want to call pthread_key_delete in the cleanup routine because it will run whenever a thread exits instead of when all threads have exited.
The documentation refers to your "cleanup function" as a destructor, which term I find to be more indicative of the intended purpose: not generic cleanup, but appropriate teardown for thread-specific values associated with the key. You have nevertheless come to the right conclusion, however. The destructor function, if given, should tear down only a value, not the key.
Can I just completely leave out pthread_key_delete? Will this (having called pthread_key_create without calling pthread_key_delete afterward) cause any memory leak when the program eventually comes to a halt?
That is unspecified, so it is safest to ensure that pthread_key_delete() is called. If all threads using the key are joinable then you can do that after joining them. You could also consider registering an exit handler to do that, but explicitly destroying the key is better style where that can reasonably be performed.
With that said, you will not leak any ordinary memory if you fail to destroy the key, as the system will reclaim all resources belonging to a process when that process exits. The main risk is that some of the resources associated with the key will be more broadly scoped, and that they would leak. Named semaphores and memory-mapped files are examples of such resources, though I have no knowledge of those specific types of resources being associated with TLD keys.
does the key just go into the garbage collector or get destructed once the entire problem ends?
C implementations do not typically implement garbage collection, though there are add-in garbage collector frameworks for C. But as described above, the resources belonging to a process do get released by the system when the process terminates.
Related
The first argument of pthread_create() is a thread object which is used to identify the newly-created thread. However, I'm not sure I completely understand the implacations of this.
For instance, I am writing a simple chat server and I plan on using threads. Threads will be coming and going at all times, so keeping track of thread objects could be complicated. However, I don't think I should need to identify individual threads. Could I simply use the same thread object for the first argument of pthread_create() over and over again, or are there other ramifications for this?
If you throw away the thread identifiers by overwriting the same variable with the ID of each thread you create, you'll not be able to use pthread_join() to collect the exit status of threads. So, you may as well make the threads detached (non-joinable) when you call pthread_create().
If you don't make the threads detached, exiting threads will continue to use some resource, so continually creating attached (non-detached) threads that exit will use up system resources — a memory leak.
Read the manual at http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_create.html
According to it:
"Upon successful completion, pthread_create() shall store the ID of the created thread in the location referenced by thread."
I think pthread_create just overwrites the value in the first argument. It does not read it, doesn't care what is inside it. So you can get a new thread from pthread_create, but you can't make it reuse an existing thread. If you would like to reuse your threads, that is more complicated.
i am using posix threads my question is as to whether or not a thread can cancel itself by passing its own thread id in pthread_cancel function?
if yes then what are its implications
also if a main program creates two threads and one of the thread cancels the other thread then what happens to the return value and the resources of the cancelled thread
and how to know from main program as to which thread was cancelled ..since main program is not cancelling any of the threads
i am using asynchronous cancellation
kindly help
Q1: Yes, a thread can cancel itself. However, doing so has all of the negative consequences of cancellation in general; you probably want to use pthread_exit instead, which is somewhat more predictable.
Q2: When a thread has been cancelled, it doesn't get to generate a return value; instead, pthread_join will put the special value PTHREAD_CANCELED in the location pointed to by its retval argument. Unfortunately, you have to know by some other means that a specific thread has definitely terminated (in some fashion) before you call pthread_join, or the calling thread will block forever. There is no portable equivalent of waitpid(..., WNOHANG) nor of waitpid(-1, ...). (The manpage says "If you believe you need this functionality, you probably need to rethink your application design" which makes me want to punch someone in the face.)
Q2a: It depends what you mean by "resources of the thread". The thread control block and stack will be deallocated. All destructors registered with pthread_cleanup_push or pthread_key_create will be executed (on the thread, before it terminates); some runtimes also execute C++ class destructors for objects on the stack. It is the application programmer's responsibility to make sure that all resources owned by the thread are covered by one of these mechanisms. Note that some of these mechanisms have inherent race conditions; for instance, it is impossible to open a file and push a cleanup that closes it as an atomic action, so there is a window where cancellation can leak the open file. (Do not think this can be worked around by pushing the cleanup before opening the file, because a common implementation of deferred cancels is to check for them whenever a system call returns, i.e. exactly timed to hit the tiny gap between the OS writing the file descriptor number to the return-value register, and the calling function copying that register to the memory location where the cleanup expects it to be.)
Qi: you didn't ask this, but you should be aware that a thread with asynchronous cancellation enabled is officially not allowed to do anything other than pure computation. The behavior is undefined if it calls any library function other than pthread_cancel, pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED), or pthread_setcancelstate(PTHREAD_CANCEL_DISABLE).
Q1. Yes,thread can cancel itself.
Q2. If one thread cancel another thread , its resources are hang around until main thread
join that thread with pthread_join() function(if the thread is joinable). And if the canceled
thread is not join in main thread resources are free with program ends/terminate.
Q3. I am not sure, but main program don't know which thread was canceled.
thread can cancel any other thread (within the same process) including itself
threads do not have return values (in general way, they can have return status only), resources of the thread will be freed upon cancellation
main program can store thread's handler and test whether it valid or not
Is it possible to read the registers or thread local variables of another thread directly, that is, without requiring to go to the kernel? What is the best way to do so?
You can't read the registers, which wouldn't be useful anyway. But reading thread local variables from another thread is easily possible.
Depending on the architecture (e. g. strong memory ordering like on x86_64) you can safely do it even without synchronization, provided that the read value doesn't affect in any way the thread is belongs to. A scenario would be displaying a thread local counter or similar.
Specifically in linux on x86_64 as you tagged, you could to it like that:
// A thread local variable. GCC extension, but since C++11 actually part of C++
__thread int some_tl_var;
// The pointer to thread local. In itself NOT thread local, as it will be
// read from the outside world.
struct thread_data {
int *psome_tl_var;
...
};
// the function started by pthread_create. THe pointer needs to be initialized
// here, and NOT when the storage for the objects used by the thread is allocated
// (otherwise it would point to the thread local of the controlling thread)
void thread_run(void* pdata) {
pdata->psome_tl_var = &some_tl_var;
// Now do some work...
// ...
}
void start_threads() {
...
thread_data other_thread_data[NTHREADS];
for (int i=0; i<NTHREADS; ++i) {
pthread_create(pthreadid, NULL, thread_run, &other_thread_data[i]);
}
// Now you can access each some_tl_var as
int value = *(other_thread_data[i].psome_tl_var);
...
}
I used similar for displaying some statistics about worker threads. It is even easier in C++, if you create objects around your threads, just make the pointer to the thread local a field in your thread class and access is with a member function.
Disclaimer: This is non portable, but it works on x86_64, linux, gcc and may work on other platforms too.
There's no way to do it without involving the kernel, and in fact I don't think it could be meaningful to read them anyway without some sort of synchronization. If you don't want to use ptrace (which is ugly and non-portable) you could instead choose one of the realtime signals to use for a "send me your registers/TLS" message. The rough idea is:
Lock a global mutex for the request.
Store the information on what data you want (e.g. a pthread_key_t or a special value meaning registers) from the thread in global variables.
Signal the target thread with pthread_kill.
In the signal handler (which should have been installed with sigaction and SA_SIGINFO) use the third void * argument to the signal handler (which really points to a ucontext_t) to copy that ucontext_t to the global variable used to communicate back to the requesting thread. This will give it all the register values, and a lot more. Note that TLS is a bit more tricky since pthread_getspecific is not async-signal-safe and technically not legal to run in this context...but it probably works in practice.
The signal handler posts a semaphore (this is the ONLY async-signal-safe synchronization function offered by POSIX) indicating to the requesting thread that it's done, and returns.
The requesting thread finishes by waiting on the semaphore, then reads the data and unlocks the request mutex.
Note that this will involve at least 1 transition to kernelspace (pthread_kill) in the requesting thread (and maybe another in sem_wait), and 1-3 in the target thread (1 for returning from the signal handler, one for entering the signal handler if it was not already sleeping in kernelspace, and possibly one for sem_post). Still it's probably faster than mucking around with ptrace which is not designed for high-performance usage...
I have a method which is is meant to write information into a struct. I want to make it run as a thread.
If I call it by itself, as childWriter((void*) &sa) it works.
If I call pthread_create(&writerChild, NULL, childWriter, (void*) &sa), it no longer works. It doesn't write to the shared object.
This is driving me mad. Why isn't it working, and how do I make it work?
What makes you so sure that the code doesn't execute? Note that if you do something like:
int main(int argc, char* argv[])
{
pthread_create(....);
return 0;
}
In the above, the program will quit right away, because the program exits as soon as the main thread has terminated. You need to "join" the thread (using pthread_join), in order to wait for the thread to have terminated. Note that spawning a thread and then joining it is actually worse than simply running the content that the thread would run (since spawning and then joining a thread is equivalent to running the content serially, plus it adds the overhead of the spawn/join). If you intend to multithread things, typically one spawns multiple threads and then either detaches them or joins them later.
Also, you should be cautious about sharing data between threads that can be modified; any object that is read from multiple threads and is modified in even one thread requires explicit locking around access to that object. Otherwise, you can end up with all sorts of garbage and corruption.
Short version:
Use pthread_join() before you read the data from the struct:
pthread_create(&writerChild, NULL, childWriter, (void*) &sa);
//if necessary, do some other tasks, not related to the struct, here
pthread_join(writerChild,NULL);
//now, you may read from the struct
If you are creating the thread in one function, & reading the struct in another function, simply shift the pthread_join statement to the latter, just before you read from the struct. (Also make sure that the pthread_t variable writerChild is visible in the reading function's scope)
Long version:
Threads are generally used in programs where tasks can be parallelized. I suppose your intention here is to read the data written to the struct after the childWriter function writes to it.
When you call your function in a single-threaded process, via:
childWriter((void*) &sa);
the thread / process shifts to execute the instructions that are part of your function. Only when the function returns, does the execution control return to the point from which you called the function. Hence, you can be sure that childWriter has completed its execution, before you begin to read from your struct in the calling function.
When you create a new thread, via:
pthread_create(&writerChild, NULL, childWriter, (void*) &sa);
the thread runs in parallel with your "main" thread. There is no guarantee that the newly created thread will get a chance to execute before the next instructions in the main thread get exectued, leave alone the possibilty that the childWriter thread function completes its execution prior to reading the struct.
Hence, you need to wait for the thread to complete its execution. This can be accomplished using pthread_join().
We have a C++ shared library that uses ZeroC's Ice library for RPC and unless we shut down Ice's runtime, we've observed child processes hanging on random mutexes. The Ice runtime starts threads, has many internal mutexes and keeps open file descriptors to servers.
Additionally, we have a few of mutexes of our own to protect our internal state.
Our shared library is used by hundreds of internal applications so we don't have control over when the process calls fork(), so we need a way to safely shutdown Ice and lock our mutexes while the process forks.
Reading the POSIX standard on pthread_atfork() on handling mutexes and internal state:
Alternatively, some libraries might have been able to supply just a child routine that reinitializes the mutexes in the library and all associated states to some known value (for example, what it was when the image was originally executed). This approach is not possible, though, because implementations are allowed to fail *_init() and *_destroy() calls for mutexes and locks if the mutex or lock is still locked. In this case, the child routine is not able to reinitialize the mutexes and locks.
On Linux, the this test C program returns EPERM from pthread_mutex_unlock() in the child pthread_atfork() handler. Linux requires adding _NP to the PTHREAD_MUTEX_ERRORCHECK macro for it to compile.
This program is linked from this good thread.
Given that it's technically not safe or legal to unlock or destroy a mutex in the child, I'm thinking it's better to have pointers to mutexes and then have the child make new pthread_mutex_t on the heap and leave the parent's mutexes alone, thereby having a small memory leak.
The only issue is how to reinitialize the state of the library and I'm thinking of reseting a pthread_once_t. Maybe because POSIX has an initializer for pthread_once_t that it can be reset to its initial state.
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static pthread_mutex_t *mutex_ptr = 0;
static void
setup_new_mutex()
{
mutex_ptr = malloc(sizeof(*mutex_ptr));
pthread_mutex_init(mutex_ptr, 0);
}
static void
prepare()
{
pthread_mutex_lock(mutex_ptr);
}
static void
parent()
{
pthread_mutex_unlock(mutex_ptr);
}
static void
child()
{
// Reset the once control.
pthread_once_t once = PTHREAD_ONCE_INIT;
memcpy(&once_control, &once, sizeof(once_control));
}
static void
init()
{
setup_new_mutex();
pthread_atfork(&prepare, &parent, &child);
}
int
my_library_call(int arg)
{
pthread_once(&once_control, &init);
pthread_mutex_lock(mutex_ptr);
// Do something here that requires the lock.
int result = 2*arg;
pthread_mutex_unlock(mutex_ptr);
return result;
}
In the above sample in the child() I only reset the pthread_once_t by making a copy of a fresh pthread_once_t initialized with PTHREAD_ONCE_INIT. A new pthread_mutex_t is only created when the library function is invoked in the child process.
This is hacky but maybe the best way of dealing with this skirting the standards. If the pthread_once_t contains a mutex then the system must have a way of initializing it from its PTHREAD_ONCE_INIT state. If it contains a pointer to a mutex allocated on the heap than it'll be forced to allocate a new one and set the address in the pthread_once_t. I'm hoping it doesn't use the address of the pthread_once_t for anything special which would defeat this.
Searching
comp.programming.threads group for pthread_atfork() shows a lot of good discussion and how little the POSIX standards really provides to solve this problem.
There's also the issue that one should only call async-signal-safe functions from pthread_atfork() handlers, and it appears the most important one is the child handler, where only a memcpy() is done.
Does this work? Is there a better way of dealing with the requirements of our shared library?
Congratulations, you found a defect in the standard. pthread_atfork is fundamentally unable to solve the problem it was created to solve with mutexes, because the handler in the child is not permitted to perform any operations on them:
It cannot unlock them, because the caller would be the new main thread in the newly created child process, and that's not the same thread as the thread (in the parent) that obtained the lock.
It cannot destroy them, because they are locked.
It cannot re-initialize them, because they have not been destroyed.
One potential workaround is to use POSIX semaphores in place of mutexes here. A semaphore does not have an owner, so if the parent process locks it (sem_wait), both the parent and child processes can unlock (sem_post) their respective copies without invoking any undefined behavior.
As a nice aside, sem_post is async-signal-safe and thus definitely legal for the child to use.
I consider this a bug in the programs calling fork(). In a multi-threaded process, the child process should call only async-signal-safe functions. If a program wants to fork without exec, it should do so before creating threads.
There isn't really a good solution for threaded fork()/pthread_atfork(). Some chunks of it appear to work, but this is not portable and liable to break across OS versions.