c threads and resource locking - c

I have a two-dimensional array and 8 concurrent threads writing to it. If each thread reads and writes only its own sub-array, will it result in a seg fault?
For example:
char **buffer;
//each thread has its own thread ID
void set(short ID, short elem, char var)
{
buffer[ID][elem] = var;
}
Would this be ok? I know this is pseudocode-ish, but you get the idea.

If each thread writes to a different sub-array, this aspect of your code will be fine and you will not need locking.

Multiple threads reading or writing to memory does not, by itself, lead to seg faults. What it can do is result in a race condition, where the results depend indeterminately on the ordering of operations of the multiple threads. The consequences depend on what you do with the memory you read ... if you read a value and then use it as an index or dereference a pointer, that might result in an out of bounds access even though the logic of the code if run by just one thread could not.
In your specific case, if each thread writes to non-overlapping memory because it uses a different ID, there's no possibility of a race condition when accessing the array. However, there could be a race condition when assigning the ID, resulting in two threads receiving the same ID ... so you need to use a lock or other way of guaranteeing that doesn't happen.
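A minimal sketch of the point above, under some assumptions: the IDs are handed out by the single creating thread (so there is no race in assigning them), and the buffer is a fixed-size static array rather than the `char **` from the question. `ROW_LEN`, `worker`, `struct worker_arg`, and `run_rows_demo` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 8   /* thread count from the question */
#define ROW_LEN 16      /* assumed row length */

static char buffer[NUM_THREADS][ROW_LEN];

/* Same shape as set() in the question: a thread only touches buffer[ID]. */
static void set(short ID, short elem, char var)
{
    buffer[ID][elem] = var;
}

struct worker_arg { short id; };

static void *worker(void *p)
{
    short id = ((struct worker_arg *)p)->id;
    for (short i = 0; i < ROW_LEN; i++)
        set(id, i, (char)('a' + id));
    return NULL;
}

/* Returns 0 on success, -1 if some thread could not be created. */
int run_rows_demo(void)
{
    pthread_t tid[NUM_THREADS];
    struct worker_arg args[NUM_THREADS];
    int created = 0, rc = 0;

    for (short i = 0; i < NUM_THREADS; i++) {
        args[i].id = i;   /* unique ID chosen by one thread: no race */
        if (pthread_create(&tid[created], NULL, worker, &args[i]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    /* Joining makes every row safe to read from the calling thread. */
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

Calling `run_rows_demo()` from `main` should leave row 0 filled with 'a', row 1 with 'b', and so on, without any lock around `set()`.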

The main thing you will need to be careful of is how or when the 2D array is allocated. If all allocation occurs before the worker threads begin to access the array(s), and each worker thread reads and writes to only one of the "rows" of the master array for the lifetime of the thread, and it is the only thread to access that row, then you should not have any threading issues accessing or updating entries in the arrays.
If only one thread is writing to a row, but multiple threads could be reading from that same row, then you may need to work out some synchronization plan or else your readers may occasionally see inconsistent / incoherent data due to partial writes by a concurrent writer.
If each worker thread is hard-bound to a single "row" in the master array, the worker thread itself can also allocate and reallocate the memory needed for its row, including updating the slot in the main array to point to the row data it (re)allocated. There should be no contention for the pointer slot in the main array because only this worker thread is interested in that slot. Make sure the master array is allocated before any worker threads get started. For this scenario, also make sure that your C RTL malloc implementation is thread safe (you may have to select a thread-safe RTL in your build options).
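The last scenario could look roughly like the sketch below: the master array of row pointers is allocated before any thread starts, and each worker allocates its own row and stores the pointer in its own slot. The names `rows`, `row_worker`, `struct row_arg`, `run_alloc_demo`, and `free_rows`, and the row sizes, are assumptions made for the example; it also assumes a thread-safe `malloc`, as is usual on POSIX systems.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NUM_ROWS 4

static char **rows;   /* master array of row pointers */

struct row_arg { int id; size_t len; };

static void *row_worker(void *p)
{
    struct row_arg *a = p;
    char *row = malloc(a->len);      /* relies on a thread-safe malloc */
    if (row != NULL)
        memset(row, 'A' + a->id, a->len);
    rows[a->id] = row;               /* only this thread uses slot a->id */
    return NULL;
}

void free_rows(void)
{
    if (rows == NULL)
        return;
    for (int i = 0; i < NUM_ROWS; i++)
        free(rows[i]);
    free(rows);
    rows = NULL;
}

/* Returns 0 if every row was allocated and filled, -1 otherwise. */
int run_alloc_demo(void)
{
    pthread_t tid[NUM_ROWS];
    struct row_arg args[NUM_ROWS];
    int created = 0, ok;

    /* The master array exists before any worker starts. */
    rows = calloc(NUM_ROWS, sizeof *rows);
    if (rows == NULL)
        return -1;

    for (int i = 0; i < NUM_ROWS; i++) {
        args[i].id = i;
        args[i].len = 8 + (size_t)i;
        if (pthread_create(&tid[created], NULL, row_worker, &args[i]) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    ok = (created == NUM_ROWS);
    for (int i = 0; i < NUM_ROWS; i++)
        if (rows[i] == NULL || rows[i][0] != 'A' + i)
            ok = 0;
    return ok ? 0 : -1;
}
```

After `run_alloc_demo()` returns, the caller owns the rows and is expected to release them with `free_rows()`.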

Related

Static vs Dynamic pthread creation

I have a question in regards to creating threads.
Specifically I want to know the difference between looping through thread[i]
and not looping but recalling pthread_create
For Example
A. Initializes 5 threads
for(i=0;i<5;i++){
pthread_create(&t[i],NULL,&routine,NULL);
}
B. Incoming clients connecting to a server
while(true){
client_connects_to_server = accept(sock, (struct sockaddr *)&server,
(socklen_t*)&server_len);
pthread_create(&t,NULL,&routine,NULL); //no iteration
}
Is the proper method of creating threads for incoming clients, to keep track of the connections already made, maybe something like this ?
pthread_create(&t[connections_made+1],NULL,&routine,NULL);
My concern is not being able to handle concurrent pthreads if option B is terminating threads or "re-writing" client connections.
Here is an example where no iteration is done
https://gist.github.com/oleksiiBobko/43d33b3c25c03bcc9b2b
Why is this correct ?
Contrary to your apparent assertion, both your examples call pthread_create() inside loops. In example A, it is a for loop that will iterate a known number of times, whereas in example B, it is a while loop that will iterate an unbounded number of times. I guess the known number vs unbounded number is what you mean by "static" and "dynamic", but that is not a conventional usage of those terms.
In any event, pthread_create() does what it is documented to do. Like any other function, it does not know anything about the context from which it is called other than the arguments passed to it. It can fail, but that's not influenced by looping by the caller, at least not directly. When pthread_create() succeeds, it creates and starts a new thread, which runs until the top-level call to its thread function returns, pthread_exit() is called by the thread, the thread is canceled, or the process terminates.
The main difference between your two examples is that A keeps all the thread IDs by recording them in different elements of an array, whereas B overwrites the previous thread ID each time it creates a new thread. But the thread IDs are not the threads themselves. If you lose the ID of a thread then you can no longer join it, among other things, but that doesn't affect the thread's operation, including its interactions with memory, with files, or with synchronization objects such as semaphores. In this regard, example B is more suited for a thread function that will detach the thread in which it is called, so that the joining issue is moot. Example A's careful preservation of all the thread IDs would be pointless for threads that detach themselves, but necessary if the threads need to be joined later.
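To illustrate the example-B pattern, here is a hedged sketch in which a single `pthread_t` variable is reused for every thread and each thread detaches itself. No sockets are involved: a heap-allocated `int` stands in for the accepted client descriptor, and a counter guarded by a mutex and condition variable (`done_lock`, `done_cond`, `done_count`, all hypothetical) exists only so the demo can tell when the detached threads have finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static int done_count = 0;

static void *routine(void *arg)
{
    int *client = arg;               /* stand-in for an accepted socket */
    pthread_detach(pthread_self());  /* nobody will join this thread */
    free(client);                    /* the thread owns its argument */

    pthread_mutex_lock(&done_lock);
    done_count++;
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_lock);
    return NULL;
}

/* Starts up to n detached threads; returns how many were started. */
int serve_clients(int n)
{
    pthread_t t;   /* overwritten on every iteration, as in example B */
    int started = 0;

    pthread_mutex_lock(&done_lock);
    done_count = 0;
    pthread_mutex_unlock(&done_lock);

    for (int i = 0; i < n; i++) {
        int *client = malloc(sizeof *client);
        if (client == NULL)
            break;
        *client = i;
        if (pthread_create(&t, NULL, routine, client) != 0) {
            free(client);
            break;
        }
        started++;
    }

    /* Demo only: wait until every started thread has reported in. */
    pthread_mutex_lock(&done_lock);
    while (done_count < started)
        pthread_cond_wait(&done_cond, &done_lock);
    pthread_mutex_unlock(&done_lock);
    return started;
}
```

Overwriting `t` does not disturb the threads already running; it only means they can no longer be joined, which is why each one detaches itself. Each thread also gets its own heap-allocated argument, so no two threads share the same client data.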

Do I really need mutex lock in this case?

Suppose we have three threads and a bool status_flag[500] array, used in one of the following ways:
Two threads only write to the status_flag array, at different indices, while the third thread only reads, at any index.
All three threads write at different indices, and all three threads read at any index.
A write operation only ever sets a flag; the flag is never reset:
status_flag [i] = true;
A read operation looks something like this:
for(;;){ //spinning to get flag true
if(status_flag [i] == true){
//do_something ;
break;
}
}
What happens if the compiler optimizes the code (branch prediction)?
I have read a lot about locks but am still too confused to reach a conclusion. Please help me decide.
POSIX is quite clear on this:
Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it.
So without locking you're not allowed to read memory that some other thread may be writing. Furthermore, that part of POSIX describes which functions will synchronize the memory between the threads. Before both threads have called any of the functions listed in there, you have no guarantee that the changes made by one thread will be visible to the other thread.
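One way to follow the POSIX rule quoted above is to guard the flags with a mutex and replace the spin loop with a condition variable wait. This is only a sketch: `flag_lock`, `flag_cond`, `set_flag`, `wait_for_flag`, and `run_flag_demo` are names invented here, and a single lock for the whole array is an assumption made for simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_FLAGS 500

static bool status_flag[NUM_FLAGS];
static pthread_mutex_t flag_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t flag_cond = PTHREAD_COND_INITIALIZER;

/* Writer side: set the flag under the lock and wake any waiters. */
void set_flag(int i)
{
    pthread_mutex_lock(&flag_lock);
    status_flag[i] = true;
    pthread_cond_broadcast(&flag_cond);
    pthread_mutex_unlock(&flag_lock);
}

/* Reader side: sleep until flag i becomes true instead of spinning. */
void wait_for_flag(int i)
{
    pthread_mutex_lock(&flag_lock);
    while (!status_flag[i])
        pthread_cond_wait(&flag_cond, &flag_lock);
    pthread_mutex_unlock(&flag_lock);
}

static void *writer(void *arg)
{
    set_flag(*(int *)arg);
    return NULL;
}

/* Returns 0 once the reader has observed the writer's flag. */
int run_flag_demo(void)
{
    pthread_t t;
    int index = 42;

    if (pthread_create(&t, NULL, writer, &index) != 0)
        return -1;
    wait_for_flag(index);   /* blocks until the writer has run */
    pthread_join(t, NULL);
    return status_flag[index] ? 0 : -1;
}
```

Because both the write and the read happen with `flag_lock` held, the compiler cannot hoist the read out of the loop, and the reader is guaranteed to see the update.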
If all the threads are operating on different index value, then you do not need a lock. Basically it is equivalent to using different variables.
In your code, the value of the variable i is never set or modified, so the loop reads only one particular index of the flag array. If the writes use different indices from that one, there is no need for a lock.

Does a thread's cache get flushed to main memory when it exits?

The title is the question: when a thread exits, does its cached memory get flushed to the main memory?
I am wondering because cases are common where the main thread creates some threads, they do some work on independent parts of the array (no data dependencies between each other), the main thread joins all the worker threads, then does more calculations with the array values that result from the worker threads computations. Do the arrays need to be declared volatile for the main thread to see the side-effects on it?
The pthreads specification requires that pthread_join() is one of the functions that "synchronizes memory with respect to other threads", so in the case of pthreads you are OK - after pthread_join() has returned, the main thread will see all updates to shared memory made by the joined thread.
Assuming you are doing this in C, and if the array is global or you have passed a structure to the threads which contains the indices on which the threads need to do the computation on and a pointer to the array, then the array need not be volatile for the main thread to see the changes since the array memory is shared between the worker threads and the main thread.
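The pattern described in the question can be sketched as follows: worker threads fill independent slices of a plain (non-volatile) global array, and the main thread reads the results only after `pthread_join()` has returned for each worker. `N`, `WORKERS`, `struct slice`, `fill`, and `run_join_demo` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <pthread.h>

#define N 1000
#define WORKERS 4

static int data[N];   /* plain array: no volatile qualifier */

struct slice { int begin, end; };

/* Each worker writes only to its own half-open range [begin, end). */
static void *fill(void *p)
{
    struct slice *s = p;
    for (int i = s->begin; i < s->end; i++)
        data[i] = i * 2;
    return NULL;
}

/* Returns the sum of the array after all workers have been joined,
   or -1 if a worker could not be created. */
long run_join_demo(void)
{
    pthread_t tid[WORKERS];
    struct slice s[WORKERS];
    int created = 0;
    long sum = 0;

    for (int w = 0; w < WORKERS; w++) {
        s[w].begin = w * (N / WORKERS);
        s[w].end = (w == WORKERS - 1) ? N : (w + 1) * (N / WORKERS);
        if (pthread_create(&tid[created], NULL, fill, &s[w]) != 0)
            break;
        created++;
    }
    /* pthread_join() synchronizes memory with the joined thread. */
    for (int w = 0; w < created; w++)
        pthread_join(tid[w], NULL);
    if (created != WORKERS)
        return -1;

    for (int i = 0; i < N; i++)
        sum += data[i];
    return sum;
}
```

With these sizes the sum is 2 * (0 + 1 + ... + 999) = 999000, which the main thread computes correctly without any `volatile` qualifier.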

Fastest way to share data with different threads?

Consider the following scenario with OpenMP:
We have a pointer A that points to a very large buffer in memory, and we have several threads. One thread (let's call it thread #1) keeps updating the contents of A, while the other threads process the data stored in A, based on signals controlled by #1.
What is the fastest way to ensure the following (assuming the signal that tells the other threads when to copy A is atomic, so there are no race conditions there)?
The threads that process the data always use the updated data (e.g. no risk of some data being cached in registers).
Good performance.
You may try read-write lock. Thread #1 holds write lock, and other threads hold read lock. This way all other threads can read in parallel, only read and write are mutually exclusive.
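A hedged sketch of the read-write lock suggestion, using the POSIX `pthread_rwlock_t` rather than an OpenMP construct (an assumption; the answer does not name an API). The buffer here is tiny and every name (`a_lock`, `update_A`, `snapshot_A`, `run_rwlock_demo`, the iteration counts) is invented for illustration. Each update fills the buffer with a single byte value, so a reader can check that it never sees a half-updated buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define BUF_LEN 64
#define READERS 2
#define ROUNDS 1000

static char A[BUF_LEN];
static pthread_rwlock_t a_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Thread #1: exclusive access while the buffer is being rewritten. */
void update_A(char fill)
{
    pthread_rwlock_wrlock(&a_lock);
    memset(A, fill, BUF_LEN);
    pthread_rwlock_unlock(&a_lock);
}

/* Readers: shared access. Returns 1 if the copy is all one value. */
int snapshot_A(char *out)
{
    pthread_rwlock_rdlock(&a_lock);
    memcpy(out, A, BUF_LEN);
    pthread_rwlock_unlock(&a_lock);
    for (int i = 1; i < BUF_LEN; i++)
        if (out[i] != out[0])
            return 0;
    return 1;
}

static void *writer_thread(void *unused)
{
    (void)unused;
    for (int r = 0; r < ROUNDS; r++)
        update_A((char)('a' + r % 26));
    return NULL;
}

static void *reader_thread(void *result)
{
    char local[BUF_LEN];
    int ok = 1;
    for (int r = 0; r < ROUNDS; r++)
        if (!snapshot_A(local))
            ok = 0;
    *(int *)result = ok;
    return NULL;
}

/* Returns 1 if no reader ever saw a partially updated buffer. */
int run_rwlock_demo(void)
{
    pthread_t w, r[READERS];
    int ok[READERS] = {0};
    int started = 0, all_ok;

    if (pthread_create(&w, NULL, writer_thread, NULL) != 0)
        return 0;
    for (int i = 0; i < READERS; i++) {
        if (pthread_create(&r[i], NULL, reader_thread, &ok[i]) != 0)
            break;
        started++;
    }
    pthread_join(w, NULL);
    for (int i = 0; i < started; i++)
        pthread_join(r[i], NULL);

    all_ok = (started == READERS);
    for (int i = 0; i < started; i++)
        all_ok = all_ok && ok[i];
    return all_ok;
}
```

Readers hold the lock only while copying, so several of them can copy in parallel, while the writer waits for exclusive access before each update. Whether this is the fastest option for a very large buffer depends on how often thread #1 writes, which the question does not specify.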

Thread-safe ring buffer for producers-consumer in C

In C, I have several threads producing long values, and one thread consuming them. Therefore I need a fixed-size buffer implemented in a similar fashion to, e.g., the Wikipedia implementation, and methods that access it in a thread-safe manner.
On a general level, the following should hold:
When adding to a full buffer, the thread should be blocked (no overwriting old values).
The consumer thread should be blocked until the buffer is full: its job has a high constant cost, so it should do as much work as possible each time it runs. (Does this call for a double-buffered solution?)
I would like to use a tried implementation, preferably from a library. Any ideas?
Motivation & explanation:
I am writing JNI code dealing with deleting global references kept as tags in heap objects.
When an ObjectFree JVMTI event occurs, I get a long tag representing a global reference I need to free using DeleteGlobalRef. For this, I need a JNIEnv reference, and getting it is really costly, so I want to buffer the requests and remove as many as possible at once.
There might be many threads receiving the ObjectFree event, and there will be one thread (mine) doing the reference deletion.
You can use a single buffer, with a mutex when accessed. You'll need to keep track of how many elements are used. For "signaling", you can use condition variables. One that is triggered by the producer threads whenever they place data in the queue; this releases the consumer thread to process the queue until empty. Another that is triggered by the consumer thread when it has emptied the queue; this signals any blocked producer threads to fill the queue. For the consumer, I recommend locking the queue and taking out as much as possible before releasing the lock (to avoid too many locks), especially since the dequeue operation is simple and fast.
Update
A few useful links:
* Wikipedia explanation
* POSIX Threads
* MSDN
Two possibilities:
a) malloc() a *Buffer struct with an array to hold some longs and an index - no locking required. Have each producer thread malloc its own *Buffer and start loading it up. When a producer thread fills the last array position, queue the *Buffer to the consumer thread on a producer-consumer queue and immediately malloc() a new *Buffer. The consumer gets the *Buffers and processes them and then free()s them (or queues them off somewhere else, or pushes them back onto a pool for re-use by producers). This avoids any locks on the buffers themselves, leaving only the lock on the P-C queue. The snag is that producers that only occasionally generate their longs will not get their data processed until their *Buffer gets filled up, which may take some time (in such a thread, you could push off the Buffer before the array gets full).
b) Declare a Buffer struct with an array to hold some longs and an index. Protect with a mutex/futex/CS lock. malloc() just one shared *Buffer and have all the threads get the lock, push on their long and release the lock. If a thread pushes in the last array position, queue the *Buffer to the consumer thread on a producer-consumer queue, immediately malloc a new *Buffer and then release the lock. The consumer gets the *Buffers and processes them and then free()s them, (or queues them off somewhere else, or pushes them back onto a pool for re-use by producers).
You may want to take condition variables into consideration. Take a look at this piece of code for the consumer:
while( load == 0 )
    pthread_cond_wait( &notEmpty, &mutex );
It checks whether load (where you store the number of elements in your list) is zero; if it is, the consumer waits until a producer produces a new item and puts it in the list.
You should implement the same kind of condition for the producer (for when it wants to put an item in a full list).
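Putting the mutex and the two condition variables together, a bounded buffer along the lines described above might look like this. It is a sketch rather than a tested library: `struct ring`, `ring_put`, `ring_take_all`, `CAPACITY`, and the demo functions are hypothetical names, and the consumer here wakes as soon as the buffer is non-empty and drains everything in one locked pass, rather than waiting for a completely full buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CAPACITY 8
#define PRODUCERS 3
#define PER_PRODUCER 100

struct ring {
    long items[CAPACITY];
    size_t head;               /* index of the oldest element */
    size_t load;               /* number of stored elements */
    pthread_mutex_t mutex;
    pthread_cond_t notEmpty;
    pthread_cond_t notFull;
};

static struct ring queue = {
    .mutex = PTHREAD_MUTEX_INITIALIZER,
    .notEmpty = PTHREAD_COND_INITIALIZER,
    .notFull = PTHREAD_COND_INITIALIZER
};

/* Producer side: blocks while the buffer is full (no overwriting). */
void ring_put(struct ring *q, long value)
{
    pthread_mutex_lock(&q->mutex);
    while (q->load == CAPACITY)
        pthread_cond_wait(&q->notFull, &q->mutex);
    q->items[(q->head + q->load) % CAPACITY] = value;
    q->load++;
    pthread_cond_signal(&q->notEmpty);
    pthread_mutex_unlock(&q->mutex);
}

/* Consumer side: blocks while empty, then takes everything at once.
   out must have room for CAPACITY values. Returns the count taken. */
size_t ring_take_all(struct ring *q, long *out)
{
    size_t n;
    pthread_mutex_lock(&q->mutex);
    while (q->load == 0)
        pthread_cond_wait(&q->notEmpty, &q->mutex);
    n = q->load;
    for (size_t i = 0; i < n; i++)
        out[i] = q->items[(q->head + i) % CAPACITY];
    q->head = 0;
    q->load = 0;
    pthread_cond_broadcast(&q->notFull);   /* wake all blocked producers */
    pthread_mutex_unlock(&q->mutex);
    return n;
}

static void *producer(void *unused)
{
    (void)unused;
    for (long v = 1; v <= PER_PRODUCER; v++)
        ring_put(&queue, v);
    return NULL;
}

/* Runs the producers and consumes in the calling thread; returns the
   sum of all values received. */
long run_ring_demo(void)
{
    pthread_t tid[PRODUCERS];
    long batch[CAPACITY], sum = 0;
    size_t received = 0;
    int started = 0;

    for (int i = 0; i < PRODUCERS; i++) {
        if (pthread_create(&tid[started], NULL, producer, NULL) != 0)
            break;
        started++;
    }
    while (received < (size_t)started * PER_PRODUCER) {
        size_t n = ring_take_all(&queue, batch);
        for (size_t i = 0; i < n; i++)
            sum += batch[i];
        received += n;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return sum;
}
```

Both waits use a `while` loop rather than an `if`, because `pthread_cond_wait()` may return spuriously and because another thread may have changed `load` before the woken thread reacquires the mutex. To match the question's "block the consumer until the buffer is full" requirement, the consumer's condition could be changed to wait while `load < CAPACITY`, at the cost of delaying values from slow producers.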

Resources