The first argument of pthread_create() is a thread object which is used to identify the newly-created thread. However, I'm not sure I completely understand the implications of this.
For instance, I am writing a simple chat server and I plan on using threads. Threads will be coming and going at all times, so keeping track of thread objects could be complicated. However, I don't think I should need to identify individual threads. Could I simply use the same thread object for the first argument of pthread_create() over and over again, or are there other ramifications for this?
If you throw away the thread identifiers by overwriting the same variable with the ID of each thread you create, you'll not be able to use pthread_join() to collect the exit status of threads. So, you may as well make the threads detached (non-joinable) when you call pthread_create().
If you don't make the threads detached, each thread that exits continues to hold some resources until it is joined. Continually creating joinable (non-detached) threads that exit without ever being joined will therefore use up system resources, which amounts to a memory leak.
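A minimal sketch of the detached approach described above, assuming a hypothetical handle_client routine and a hypothetical finished_clients counter (neither comes from the question). The same pthread_t variable is overwritten on every call, which is harmless because the threads are created detached and their IDs are never used again.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical counter: detached threads cannot be joined, so this is
   one way for the caller to learn that they have finished. */
static atomic_int finished_clients = 0;

/* Hypothetical per-client routine; a real server would service a socket here. */
static void *handle_client(void *arg)
{
    (void)arg;
    atomic_fetch_add(&finished_clients, 1);
    return NULL;
}

/* Start n detached threads, overwriting the same pthread_t each time.
   Returns 0 on success or the first pthread error code. */
int spawn_detached_clients(int n)
{
    pthread_attr_t attr;
    pthread_t tid;                  /* reused: the ID is never needed again */
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    for (int i = 0; rc == 0 && i < n; i++)
        rc = pthread_create(&tid, &attr, handle_client, NULL);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: spin until n clients have reported completion. */
void wait_for_clients(int n)
{
    while (atomic_load(&finished_clients) < n)
        sched_yield();
}
```

Calling spawn_detached_clients(8) followed by wait_for_clients(8) from main would run eight short-lived threads without keeping any of their IDs. Note that returning from main ends the process even if detached threads are still running.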
Read the manual at http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_create.html
According to it:
"Upon successful completion, pthread_create() shall store the ID of the created thread in the location referenced by thread."
I think pthread_create just overwrites the value in the first argument. It does not read it, doesn't care what is inside it. So you can get a new thread from pthread_create, but you can't make it reuse an existing thread. If you would like to reuse your threads, that is more complicated.
My question is similar to this one. But after viewing all its answers, I still don't know what kind of safety guarantee one can get with pthread_cancel(). So I would like to ask a more specific question:
Say that pthread_cancel() is called on a pthread_t variable, named my_thread, is it possible that by the time pthread_cancel(my_thread) is executed, the actual thread corresponding to my_thread has already been terminated somehow, and the kernel recycled the value of my_thread for another newly created thread, such that by executing pthread_cancel(my_thread), another unintended thread gets killed?
The value can't be "recycled" until the thread is detached or joined. As long as you didn't do either of those things, it's safe to call pthread_cancel, even if the thread already terminated.
The question is about race conditions involving pthread_cancel(). POSIX requires that function to be thread safe in the specific, limited sense in which it uses that term, but that doesn't really speak to the question at hand. The key details are specified in XSH 2.9.2, as @R.. observed earlier in a comment. In particular:
The lifetime of a thread ID ends after the thread terminates if it was
created with the detachstate attribute set to PTHREAD_CREATE_DETACHED
or if pthread_detach() or pthread_join() has been called for that
thread. A conforming implementation is free to reuse a thread ID after
its lifetime has ended. If an application attempts to use a thread ID
whose lifetime has ended, the behavior is undefined.
So an implementation is permitted to reuse thread IDs whose lifetime has ended, but that's really a side issue because if you attempt to use a stale one then the behavior is undefined, whether the ID has been reused or not. And of course, one of the innumerable possible manifestations of UB that could ensue in the case described is indeed that a different thread is cancelled than the one you meant to cancel, regardless of whether the thread ID has been reused.
The lifetime of a thread ID ends when the thread it identifies terminates if that thread was created detached, or when it is passed to either pthread_detach or pthread_join if the thread was created joinable. It is entirely possible to have a race between that and the execution of pthread_cancel. If the thread was created joinable then you need at least three threads total for that, but if it was created detached then you don't need any other than the one calling pthread_cancel and a separate one being cancelled. Either way pthread_cancel is risky.
The accepted answer to the question you linked is misleading, at best, but @DavidSchwartz's comment on it is much more useful, even if I don't think it accurately reflects the specification in every detail. Here is how I would put it:
It is safe to cancel a thread with pthread_cancel if one of these cases holds:
the thread was created joinable, and it is certain that it cannot have been detached or joined before the pthread_cancel call completes, or
the thread was created detached, and it is certain that it cannot have terminated, nor have been passed to pthread_join or pthread_detach (regardless of the success of these calls) before the pthread_cancel call completes.
It is not safe (i.e. it risks UB) to attempt to cancel
a thread that was created joinable, via the thread ID provided by pthread_create, if it is possible for that thread to be detached or joined before the pthread_cancel call completes, or
a thread that was created detached, if it is possible for that thread to terminate or have pthread_join or pthread_detach called on it before the pthread_cancel call completes.
It is unclear whether it is safe to cancel a thread that was created joinable and later detached, via a thread ID obtained from pthread_self() after the detachment, if it is certain that neither pthread_join nor pthread_detach can have been called on that thread ID before the pthread_cancel call completes.*
*One could interpret the specifications to imply that under those circumstances, pthread_self returns a thread ID whose lifetime has already ended, so that the cancellation would definitely produce UB. But there are at least a couple different contrary interpretations, and under any of those, there is no defined condition under which the lifetime of the thread ID from pthread_self ends before the end of the program, leaving it safe to cancel the thread via that ID at any time.
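The first safe case listed above (a joinable thread that has not yet been joined or detached) can be sketched as follows. The worker routine and the cancel_and_join wrapper are hypothetical names used only for illustration; the sketch relies on sleep() being a cancellation point.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker: loops forever; sleep() is a cancellation point,
   so a pending cancellation request is acted on there. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL;                    /* not reached */
}

/* Create, cancel, and join a joinable thread.
   Returns 1 if the thread was cancelled, 0 if it ended some other way,
   and -1 on error. */
int cancel_and_join(void)
{
    pthread_t tid;
    void *result;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    /* tid is still within its lifetime here: the thread is joinable and
       has been neither joined nor detached, so the ID cannot be reused. */
    if (pthread_cancel(tid) != 0)
        return -1;

    /* pthread_join ends the lifetime of tid; it must not be used afterwards. */
    if (pthread_join(tid, &result) != 0)
        return -1;

    return result == PTHREAD_CANCELED;
}
```

Because only this function ever joins the thread, there is no window in which the ID could become stale before pthread_cancel returns.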
I'm writing code to save text to a binary file, which includes a function to auto-save text to the binary file, as well as a function to print from the binary file, and I need to incorporate pthread locks and join. We were given
pthread_mutex_t mutex;
pthread_t autosavethread;
as global variables, although the instructor didn't talk about what pthread or mutex actually do, so I'm confused about that.
Also, I understand that I need to use locks whenever shared variables are changed or read (in my case it would be the binary file). But at the end of the file I am supposed to use pthread_join, and I don't know what it does or what arguments are supposed to be used in it. I'm guessing mutex and autosavethread are supposed to be closed, or something along the lines of that, but I don't know how to write it. Can anyone help better my understanding?
There are two types of pthread: joinable threads and detached threads.
If you want a thread to just take a task and go away once the task is done, you need a detached thread;
If you want to communicate with the created thread once it has finished its assigned job, you have to use a joinable thread. Basically, it's needed when the parent and the thread it created need to communicate after the thread is done.
It's very easy to google exactly how to call the pthread APIs and what can be communicated.
But one thing I want to mention here is that, for a joinable thread, you have to explicitly call pthread_join on the created thread. Otherwise, there will be serious memory leaks. When the joinable thread completes its task, the thread seems to exit (on Linux, you can check the /proc/PID/task/ folder, and once the thread completes, its entry there goes away), but the resources allocated for this joinable thread, i.e. its stack, are still there in the process memory space. As more and more joinable threads are created and complete their tasks, the stacks for each thread are just left in the process space unless you explicitly call pthread_join. Hope that helps, even a bit.
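A minimal sketch of how the two globals from the question might be used together. The text and saved buffers stand in for the real text and binary file, and autosave and save_once are hypothetical names; the actual assignment will differ.

```c
#include <pthread.h>
#include <string.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_t autosavethread;

/* Hypothetical shared state standing in for the text and the binary file. */
static char text[64] = "";
static char saved[64] = "";

/* Thread routine: copy the current text to the "file" while holding the lock. */
static void *autosave(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    strcpy(saved, text);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Update the text, run one autosave pass, and wait for it to finish.
   Returns 0 on success or a pthread error code. */
int save_once(const char *new_text)
{
    pthread_mutex_lock(&mutex);
    strncpy(text, new_text, sizeof text - 1);
    text[sizeof text - 1] = '\0';
    pthread_mutex_unlock(&mutex);

    int rc = pthread_create(&autosavethread, NULL, autosave, NULL);
    if (rc != 0)
        return rc;

    /* pthread_join takes the thread ID and an optional place to store the
       thread's return value. It blocks until the thread has exited and
       then releases the thread's resources. */
    return pthread_join(autosavethread, NULL);
}
```

The mutex is locked around every access to the shared data, and pthread_join is called once, on the thread ID, after which the thread's stack and bookkeeping are reclaimed. The mutex itself is not passed to pthread_join.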
Does the function "pthread_create" start the thread (i.e. start executing its function), or does it just create it and make it wait for the right moment to start?
pthread_create creates the thread (on Linux, by using the clone syscall internally) and reports the new thread's ID (analogous to a pid for a process). So, by the time pthread_create returns, the new thread has at least been created. But there are no guarantees about when it will start running.
From the man page:
http://man7.org/linux/man-pages/man3/pthread_create.3.html
Unless real-time scheduling policies
are being employed, after a call to pthread_create(), it is
indeterminate which thread—the caller or the new thread—will next
execute.
POSIX has a similar comment in the informative description of pthread_create http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_create.html
There is no requirement on the implementation that the ID of the created thread be available before the newly created thread starts executing.
There is also a long "Rationale" explaining why pthread_create is a single-step process, without separate thread creation and start of execution (as it was in the good old Java epoch):
A suggested alternative to pthread_create() would be to define two separate operations: create and start. Some applications would find such behavior more natural. Ada, in particular, separates the "creation" of a task from its "activation".
Splitting the operation was rejected by the standard developers for many reasons:
The number of calls required to start a thread would increase from one to two and thus place an additional burden on applications that do not require the additional synchronization. The second call, however, could be avoided by the additional complication of a start-up state attribute.
An extra state would be introduced: "created but not started". This would require the standard to specify the behavior of the thread operations when the target has not yet started executing.
For those applications that require such behavior, it is possible to simulate the two separate steps with the facilities that are currently provided. The start_routine() can synchronize by waiting on a condition variable that is signaled by the start operation.
You may use RT scheduling, or just add some synchronization in the created thread to get exact information about its execution. In some cases it can also be useful to manually bind the thread to a specific CPU core using pthread_setaffinity_np.
It creates the thread, which then enters the ready queue. When it gets its slice from the scheduler, it starts to run.
How early it gets to run will depend on the thread's priority and the number of threads it is competing against, among other factors.
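The condition-variable technique mentioned in the quoted rationale can be sketched roughly as below. All names here (gate_lock, gate_cond, thread_create_suspended, thread_start, work_is_done) are hypothetical and chosen only for illustration; the single shared gate means this sketch supports one group of threads started together.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical start gate shared by the creator and the new thread. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static bool started = false;
static int work_done = 0;

/* The thread exists as soon as pthread_create returns, but it does no
   work until the gate is opened. */
static void *gated_routine(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    while (!started)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&gate_cond, &gate_lock);
    work_done = 1;                  /* real work would go here */
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* "Create" step: the thread is created but parks at the gate. */
int thread_create_suspended(pthread_t *tid)
{
    return pthread_create(tid, NULL, gated_routine, NULL);
}

/* "Start" step: open the gate and wake any waiting thread. */
void thread_start(void)
{
    pthread_mutex_lock(&gate_lock);
    started = true;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);
}

/* Read the result flag under the lock. */
int work_is_done(void)
{
    pthread_mutex_lock(&gate_lock);
    int done = work_done;
    pthread_mutex_unlock(&gate_lock);
    return done;
}
```

A caller would invoke thread_create_suspended, do whatever setup it needs, call thread_start, and later pthread_join the thread. Until thread_start runs, work_is_done keeps returning 0 no matter how the scheduler orders the two threads.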
(Working with the Win32 API, in a C environment with VS2010)
I have a two-thread app. The first thread spawns the second, waits for a given interval, 'TIMEOUT', and then calls TerminateThread() on it.
Meanwhile, the second thread calls NetServerEnum().
It appears that when the timeout is reached, whether or not NetServerEnum returned successfully, the first thread gets deadlocked.
I've already noticed that NetServerEnum creates worker threads of its own.
I ultimately end up with one of those threads in deadlock, typically in ntdll.dll!RtlInitializeExceptionChain, and I am unable to exit my process gracefully.
As this is too long for a comment, allow me to use the answer form.
Verbatim from MSDN (emphasis by me):
TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
*If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
From reading this, it is easy to understand why it is a bad idea to cancel (terminate) a thread that is stuck in a system call.
A possible alternative approach to the OP's design might be to spawn off a thread calling NetServerEnum() and simply let it run until the system call returns.
In the meantime, the main thread could do other things, for example informing the user that scanning the net is taking longer than expected.
Is it possible for two threads to use a single function "ThreadProc" as its thread procedure when CreateThread() is used?
HANDLE thread1 = CreateThread(
    NULL,                                // Choose default security
    0,                                   // Default stack size
    (LPTHREAD_START_ROUTINE)&ThreadProc,
    // Routine to execute. I want this routine to be different each time as I want each thread to perform a different functionality.
    (LPVOID)&i,                          // Thread parameter
    0,                                   // Immediately run the thread
    &dwThreadId                          // Thread Id
);
HANDLE thread2 = CreateThread(
    NULL,                                // Choose default security
    0,                                   // Default stack size
    (LPTHREAD_START_ROUTINE)&ThreadProc,
    // Routine to execute. I want this routine to be different each time as I want each thread to perform a different functionality.
    (LPVOID)&i,                          // Thread parameter
    0,                                   // Immediately run the thread
    &dwThreadId                          // Thread Id
);
Would the above code create two threads each with same functionality(since thread procedure for both of the threads is same.) Am I doing it correctly?
If it is possible then would there be any synchronization issues since both threads are using same Thread Procedure.
Please help me with this. I am really confused and could not find anything over the internet.
It is fine to use the same function as a thread entry point for multiple threads.
However, in the posted code the address of i is passed to both threads. If either thread modifies this memory while the other reads it, there is a race condition on i. The declaration of i is not shown, but it is probably a local variable. That is dangerous, because the threads require i to exist for their whole lifetime; if it does not, the threads are left with a dangling pointer. It is common practice to dynamically allocate thread arguments and have each thread free its own arguments.
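The practice of heap-allocating each thread's arguments can be sketched as follows. This uses POSIX threads rather than CreateThread so that it matches the other examples on this page; the ownership pattern is the same for the lpParameter argument of CreateThread. The task_args struct and the function names are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument block; each thread gets its own copy. */
struct task_args {
    int id;
    int *slot;                      /* where this thread writes its result */
};

/* The same entry point serves every thread; it owns and frees its argument. */
static void *thread_proc(void *p)
{
    struct task_args *args = p;

    *args->slot = args->id * 10;
    free(args);
    return NULL;
}

/* Start n threads (n <= 16) on the same routine; results[i] receives i * 10.
   Returns 0 on success, -1 on failure. */
int run_same_proc(int n, int *results)
{
    pthread_t tids[16];
    int started = 0, rc = 0;

    if (n > 16)
        return -1;
    for (int i = 0; i < n; i++) {
        struct task_args *args = malloc(sizeof *args);
        if (args == NULL) {
            rc = -1;
            break;
        }
        args->id = i;
        args->slot = &results[i];   /* distinct slot per thread: no sharing */
        if (pthread_create(&tids[i], NULL, thread_proc, args) != 0) {
            free(args);             /* thread never started, so free here */
            rc = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

Because every thread receives its own allocation and writes to its own slot, the threads share an entry point but no mutable data, so no locking is needed in this sketch.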
Yes, it is very well possible to have multiple (concurrent) threads that start with the same entry point.
Apart from the fact that the OS/threading library specifies the signature and calls it, there is nothing special about a thread entry point function. It can be used to start off multiple threads with the same caveats as for calling any other function from multiple threads: you need synchronization to access non-atomic shared variables.
Each thread uses its own stack area, but that gets allocated by the OS before the Thread Procedure get invoked, so by the time the Thread Procedure gets called all the special actions that are needed to create and start a new thread have already taken place.
Whether the threads are using the same code or not is irrelevant. It has no effect whatsoever on synchronization. It behaves precisely the same as if they were different functions. The issues with potential races are the same.
You probably don't want to pass both threads the same pointers. That will likely lead to data races. (Though we'd have to see the code to know for sure.)
Your code is right. Using the same thread procedure does NOT cause any synchronization issues between the two threads. If they need synchronization, it may be because they change the same global variable, not because they use the same thread procedure.