Static vs dynamic pthread creation in C

I have a question regarding creating threads.
Specifically, I want to know the difference between looping through thread[i]
and not looping but repeatedly calling pthread_create().
For example:
A. Initialize 5 threads
for (i = 0; i < 5; i++) {
    pthread_create(&t[i], NULL, &routine, NULL);
}
B. Incoming clients connecting to a server
while (true) {
    client_connects_to_server = accept(sock, (struct sockaddr *)&server,
                                       (socklen_t *)&server_len);
    pthread_create(&t, NULL, &routine, NULL); // no iteration
}
Is the proper method of creating threads for incoming clients to keep track of the connections already made, maybe something like this?
pthread_create(&t[connections_made + 1], NULL, &routine, NULL);
My concern is that option B might not be able to handle concurrent pthreads if it is terminating threads or "overwriting" client connections.
Here is an example where no iteration is done:
https://gist.github.com/oleksiiBobko/43d33b3c25c03bcc9b2b
Why is this correct?

Contrary to your apparent assertion, both your examples call pthread_create() inside loops. In example A, it is a for loop that will iterate a known number of times, whereas in example B, it is a while loop that will iterate an unbounded number of times. I guess the known number vs unbounded number is what you mean by "static" and "dynamic", but that is not a conventional usage of those terms.
In any event, pthread_create() does what it is documented to do. Like any other function, it knows nothing about the context from which it is called other than the arguments passed to it. It can fail, but that is not influenced by the caller's looping, at least not directly. When pthread_create() succeeds, it creates and starts a new thread, which runs until the top-level call to its thread function returns, the thread calls pthread_exit(), the thread is canceled, or the process terminates.
The main significant difference between your two examples is that A keeps all the thread IDs by recording them in different elements of an array, whereas B overwrites the previous thread ID each time it creates a new thread. But the thread IDs are not the threads themselves. If you lose the ID of a thread then you can no longer join it, among other things, but that does not affect the thread's operation, including its interactions with memory, with files, or with synchronization objects such as semaphores. In this regard, example B is better suited to a thread function that detaches the thread in which it is called, so that the joining issue is moot. Example A's careful preservation of all the thread IDs would be pointless for threads that detach themselves, but necessary if the threads need to be joined later.

Related

c - pthread_create identifier

The first argument of pthread_create() is a thread object that is used to identify the newly created thread. However, I'm not sure I completely understand the implications of this.
For instance, I am writing a simple chat server and I plan on using threads. Threads will be coming and going at all times, so keeping track of thread objects could be complicated. However, I don't think I should need to identify individual threads. Could I simply use the same thread object as the first argument of pthread_create() over and over again, or are there other ramifications?
If you throw away the thread identifiers by overwriting the same variable with the ID of each thread you create, you'll not be able to use pthread_join() to collect the exit status of threads. So, you may as well make the threads detached (non-joinable) when you call pthread_create().
If you don't make the threads detached, threads that have exited will continue to hold some resources, so continually creating joinable (non-detached) threads that exit will use up system resources: effectively a memory leak.
Read the manual at http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_create.html
According to it:
"Upon successful completion, pthread_create() shall store the ID of the created thread in the location referenced by thread."
I think pthread_create just overwrites the value in the first argument. It does not read it, doesn't care what is inside it. So you can get a new thread from pthread_create, but you can't make it reuse an existing thread. If you would like to reuse your threads, that is more complicated.

How to implement a barrier using semaphores

I have the following problem to solve:
Consider an application with three types of threads: Calculus-A, Calculus-B, and Finalization. Whenever a Calculus-A thread finishes, it calls the routine endA(), which returns immediately. Whenever a Calculus-B thread finishes, it calls the routine endB(), which returns immediately. Finalization threads call the routine wait(), which returns only once two Calculus-A threads and two Calculus-B threads have completed. In other words, for exactly 2 completions of Calculus-A and 2 completions of Calculus-B, one Finalization thread is allowed to continue.
There is an undetermined number of threads of the 3 types, and the order in which the routines are called is not known. Finalization threads are served in order of arrival.
Implement the routines endA(), endB(), and wait() using semaphores. Besides initializing the variables, the only allowed operations are P and V. Solutions with busy-waiting are not acceptable.
Here is my solution:
semaphore calcA = 2;
semaphore calcB = 2;
semaphore wait = -3;

void endA()
{
    P(calcA);
    V(wait);
}

void endB()
{
    P(calcB);
    V(wait);
}

void wait()
{
    P(wait);
    P(wait);
    P(wait);
    P(wait);
    V(calcA);
    V(calcA);
    V(calcB);
    V(calcB);
}
I believe there will be a deadlock, both because of wait's initialization and if wait() executes before endA() and endB(). Is there another solution?
I tend to view semaphore problems as problems where one must identify "sources of waiting" and define for each a semaphore and a protocol for their access.
With that in mind, the "sources of waiting" are
Completions of CalcA
Completions of CalcB
Maybe, if I understood this right, a wait on whole completion groups, consisting of two CalcAs and two CalcBs. I say maybe because I'm not sure what "Threads Completion are answered in the order of arrival." means.
Completions of CalcA and CalcB should therefore increment their respective counters. At the other end, one Finalization thread gains exclusive access to the counters and waits in any order for the needed number of completions to constitute a completion group. It then unlocks access to the next group.
My code is below, although since I'm unfamiliar with the Dutch V and P I will use take()/give().
semaphore calcA = 0;
semaphore calcB = 0;
semaphore groupSem = 1;

void endA() {
    give(calcA);
}

void endB() {
    give(calcB);
}

void wait() {
    take(groupSem);
    take(calcA);
    take(calcA);
    take(calcB);
    take(calcB);
    give(groupSem);
}
The groupSem semaphore ensures all-or-nothing: the thread that enters the critical section will get the next two completions of each of CalcA and CalcB. If groupSem weren't there, the first thread to enter wait could take two As and block, then be overtaken by another thread that grabs two As and two Bs and runs away.
A worse problem without groupSem arises if this second thread takes two As and one B and then blocks, and the first thread grabs the second B. If the result of the finalization is what allows more runs of Calculus-A and Calculus-B, you may have a deadlock: there may be no further opportunity for A and B instances to complete, leaving the finalization threads hanging, unable to produce more calculation instances.

Can two threads use the same thread procedure?

Is it possible for two threads to use a single function "ThreadProc" as its thread procedure when CreateThread() is used?
HANDLE thread1 = CreateThread(
    NULL,                                // default security
    0,                                   // default stack size
    (LPTHREAD_START_ROUTINE)&ThreadProc,
    // Routine to execute. I want this routine to be different each time,
    // as I want each thread to perform different functionality.
    (LPVOID)&i,                          // thread parameter
    0,                                   // run the thread immediately
    &dwThreadId);                        // thread id
HANDLE thread2 = CreateThread(
    NULL,                                // default security
    0,                                   // default stack size
    (LPTHREAD_START_ROUTINE)&ThreadProc,
    (LPVOID)&i,                          // thread parameter
    0,                                   // run the thread immediately
    &dwThreadId);                        // thread id
Would the above code create two threads, each with the same functionality (since the thread procedure for both threads is the same)? Am I doing it correctly?
If it is possible then would there be any synchronization issues since both threads are using same Thread Procedure.
Please help me with this. I am really confused and could not find anything over the internet.
It is fine to use the same function as a thread entry point for multiple threads.
However, in the posted code the address of i is being passed to both threads. If either thread modifies this memory while the other reads it, there is a race condition on i. Without seeing its declaration, i is probably a local variable. This is dangerous, because the threads require that i exist for their lifetime; if it does not, the threads will have a dangling pointer. It is common practice to dynamically allocate thread arguments and have each thread free its own arguments.
Yes, it is very well possible to have multiple (concurrent) threads that start with the same entry point.
Apart from the fact that the OS/threading library specifies the signature and calls it, there is nothing special about a thread entry point function. It can be used to start off multiple threads with the same caveats as for calling any other function from multiple threads: you need synchronization to access non-atomic shared variables.
Each thread uses its own stack area, but that gets allocated by the OS before the Thread Procedure get invoked, so by the time the Thread Procedure gets called all the special actions that are needed to create and start a new thread have already taken place.
Whether the threads are using the same code or not is irrelevant. It has no effect whatsoever on synchronization. It behaves precisely the same as if they were different functions. The issues with potential races is the same.
You probably don't want to pass both threads the same pointers. That will likely lead to data races. (Though we'd have to see the code to know for sure.)
Your code is right. There are no synchronization issues merely because both threads use the same thread procedure. If they need synchronization, it is because they change the same global variable, not because they share a thread procedure.

c threads and resource locking

I have a 2-dimensional array and 8 concurrent threads writing to it. If each thread reads/writes a different sub-array, can it result in a seg fault?
For example:
char **buffer;

// each thread has its own thread ID
void set(short ID, short elem, char var)
{
    buffer[ID][elem] = var;
}
Would this be ok? I know this is pseudocode-ish, but you get the idea.
If each thread writes to a different sub-array, this aspect of your code will be fine and you will not need locking.
Multiple threads reading or writing memory does not, by itself, lead to seg faults. What it can do is result in a race condition, where the results depend indeterminately on the ordering of operations of the multiple threads. The consequences depend on what you do with the memory you read: if you read a value and then use it as an index or dereference it as a pointer, that might result in an out-of-bounds access that the same code, run by just one thread, could never produce.
In your specific case, if each thread writes to non-overlapping memory because it uses a different ID, there's no possibility of a race condition when accessing the array. However, there could be a race condition when assigning the ID, resulting in two threads receiving the same ID ... so you need to use a lock or other way of guaranteeing that doesn't happen.
The main thing you will need to be careful of is how or when the 2D array is allocated. If all allocation occurs before the worker threads begin to access the array(s), and each worker thread reads and writes to only one of the "rows" of the master array for the lifetime of the thread, and it is the only thread to access that row, then you should not have any threading issues accessing or updating entries in the arrays.
If only one thread is writing to a row, but multiple threads could be reading from that same row, then you may need to work out some synchronization plan or else your readers may occasionally see inconsistent / incoherent data due to partial writes by a concurrent writer.
If each worker thread is hard-bound to a single "row" in the master array, it's also possible to allocate and reallocate the memory needed for each row by the worker thread itself, including updating the slot in the main array to point to the row data (re)allocated by the thread. There should be no contention for the pointer slot in the main array because only this worker thread is interested in that slot. Make sure the master array is allocated before any worker threads get started. For this scenario, also make sure that your C RTL malloc implementation is thread safe. (you may have to select a thread-safe RTL in your build options)

How to reuse threads - pthreads c

I am programming using pthreads in C.
I have a parent thread which needs to create 4 child threads with id 0, 1, 2, 3.
When the parent thread gets data, it will split the data and assign it to 4 separate context variables - one for each sub-thread.
The sub-threads have to process this data and in the mean time the parent thread should wait on these threads.
Once these sub-threads have finished executing, they will set the output in their corresponding context variables and wait (to be reused).
Once the parent thread knows that all these sub-threads have completed this round, it computes the global output and prints it out.
Now it waits for new data(the sub-threads are not killed yet, they are just waiting).
If the parent thread gets more data the above process is repeated - albeit with the already created 4 threads.
If the parent thread receives a kill command (assume a specific kind of data), it indicates to all the sub-threads and they terminate themselves. Now the parent thread can terminate.
I am a Masters research student and I am encountering the need for the above scenario. I know that this can be done using pthread_cond_wait and pthread_cond_signal. I have written the code, but it just runs indefinitely and I cannot figure out why.
My guess is that, the way I have coded it, I have over-complicated the scenario. It will be very helpful to know how this can be implemented. If there is a need, I can post a simplified version of my code to show what I am trying to do(even though I think that my approach is flawed!)...
Can you please give me any insights into how this scenario can be implemented using pthreads?
As far as can be seen from your description, there seems to be nothing wrong in principle.
What you are trying to implement is a worker pool, I guess; there are a lot of implementations out there. But if the work that your threads are doing is a substantial computation (say, at least a CPU second or so), such a scheme is complete overkill. Modern implementations of POSIX threads are efficient enough that they support the creation of a lot of threads, really a lot, and the overhead is not prohibitive.
The only thing that would be important if you have your workers communicate through shared variables, mutexes etc (and not via the return value of the thread) is that you start your threads detached, by using the attribute parameter to pthread_create.
Once you have such an implementation for your task, measure. Only then, if your profiler tells you that you spend a substantial amount of time in the pthread routines, start thinking of implementing (or using) a worker pool to recycle your threads.
One producer-consumer queue with 4 threads hanging off it. The thread that wants to queue the four tasks assembles the four context structs containing, as well as all the other data stuff, a function pointer to an 'OnComplete' func. Then it submits all four contexts to the queue, atomically incrementing a taskCount up to 4 as it does so, and waits on an event/condvar/semaphore.
The four threads get a context from the P-C queue and work away.
When done, the threads call the 'OnComplete' function pointer.
In OnComplete, the threads atomically count down taskCount. If a thread decrements it to zero, it signals the event/condvar/semaphore and the originating thread runs on, knowing that all the tasks are done.
It's not that difficult to arrange it so that the assembly of the contexts and the synchro waiting is done in a task as well, so allowing the pool to process multiple 'ForkAndWait' operations at once for multiple requesting threads.
I have to add that operations like this are a great deal easier in an OO language. The latest Java, for example, has a fork/join thread-pool class that should do exactly this kind of thing, and C++ (or even C#, if you're into serfdom) is better than plain C.
