I wrote a simple program that implements a master/worker scheme, where the master is the main thread and the workers are created by it.
The main thread writes something to a shared buffer and the worker threads read it; access to the shared buffer is coordinated by a read/write lock.
Unfortunately, this scheme clearly leads to starvation of the main thread, since a single write has to wait for several reads to complete. One possible solution is to raise the priority of the master thread, so that when it wants to write it gets immediate access to the shared buffer.
From a helpful post on a similar issue, I gathered that changing the priority of a thread under the SCHED_OTHER policy is probably not allowed; only the nice value can be changed.
I wrote a procedure to give the worker threads lower priority than the master thread, but it does not seem to work correctly:
void assignWorkerThreadPriority(pthread_t* worker)
{
    struct sched_param worker_sched_param;
    worker_sched_param.sched_priority = 0; /* any value other than 0 gives an error? */

    int policy = SCHED_OTHER;
    int ret = pthread_setschedparam(*worker, policy, &worker_sched_param);
    /* pthread_setschedparam returns the error code directly; it does not set errno */
    printf("Result of changing priority is: %d - %s\n", ret, strerror(ret));
}
I have a two-fold question:
How can I set the nice value of a worker thread to avoid starving the main thread?
If that is not possible, how can I change the scheduling policy to one that allows changing the priority?
Edit: I managed to run the program using other policies, such as SCHED_FIFO; all I had to do was run the program as a superuser.
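Roughly what that looked like, as a sketch (the priority value 10 is arbitrary, the function name is just for illustration, and SCHED_FIFO needs root or CAP_SYS_NICE):

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

void assignWorkerThreadRealtimePriority(pthread_t worker)
{
    struct sched_param sp;
    sp.sched_priority = 10;  /* arbitrary value within sched_get_priority_min/max(SCHED_FIFO) */

    int ret = pthread_setschedparam(worker, SCHED_FIFO, &sp);
    if (ret != 0)
        printf("pthread_setschedparam failed: %s\n", strerror(ret));
}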
You cannot avoid problems using a read/write lock when the read and write usage is so even. You need a different method. You need a lock-free message queue or independent work queues or one of many other techniques.
Here is another way to do the job, the way I would do it. The worker can take the buffer away and work on it rather than keeping it shared:
Write thread:
Create work item.
Lock the mutex or CriticalSection protecting the current queue and pointer to queue.
Add work item to queue.
Release the lock.
Optionally signal a condition variable or Event. Another option is for worker threads to check for work on a timer.
Worker thread:
Create a new queue.
Wait for a condition variable or event or other signal, or wait on a timer.
Lock the mutex or CriticalSection protecting the current queue and pointer to queue.
Set the current queue pointer to the new queue.
Release the lock.
Proceed to work on the now private queue.
Delete the queue when all work items complete.
Meanwhile, the write thread creates more work items. Once each worker thread has taken its own queue to work on, the write thread will be able to write many items in peace.
You can modify this. For example, a worker thread may lock the queue and move a limited number of work items off into its own internal queue instead of taking the whole thing.
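A rough sketch of the queue swap with a pthread mutex (work_queue_t, work_item_t, queue_create, queue_pop and process are placeholders for whatever queue implementation you use):

pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
work_queue_t *current_queue;               /* the producer appends work items here */

void worker_drain(void)
{
    work_queue_t *fresh = queue_create();  /* step 1: create a new, empty queue */

    pthread_mutex_lock(&queue_lock);
    work_queue_t *mine = current_queue;    /* take the shared queue away... */
    current_queue = fresh;                 /* ...and leave the empty one for the producer */
    pthread_mutex_unlock(&queue_lock);

    work_item_t *item;
    while ((item = queue_pop(mine)) != NULL)  /* no lock needed: this queue is private now */
        process(item);
    queue_destroy(mine);
}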
Related
Given a work-stealing thread pool system where each work item can generate new tasks in a thread's local work queue, which can spill out to a global queue if it is full:
How would you safely and efficiently coordinate the shutdown of such a system, assuming you have only basic atomic operations and critical-section locks available?
To clarify and simplify: say each thread grabs tasks from its local work queue only (no stealing from other threads' queues). If its local work queue is exhausted, it takes a lock on the global work queue and steals work to add to its local work queue. The local work queues require no locks, as they are specific to each worker thread.
Using a simple flag or an atomic count of 'active' worker threads won't work, because another worker may spill new work onto the global queue at a moment when, from a given worker's point of view, it believes it is the only worker left with work.
All workers should exit only when there is no work left.
The biggest requirement would be to have some way of saving the definition of each task, so the state of pending tasks can be saved to persistent storage. Then implement a "stop" flag (with a mutex on it). The method that gets a task from the pool for execution checks that flag and, if it's set, returns a "terminate work thread" indication (distinct from a "no tasks available" result, which makes the thread wait and try again). Threads terminate when they get that indication, and the overall pool management thread waits until all work threads have terminated and then terminates the pool. The main program has to wait until the pool is terminated and the pool management thread exits; once that happens, it's safe to terminate the program. If the program needs to continue running and restart the pool later, that's also the condition that must be met before it can do anything that would affect the pool configuration or restart the pool.
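A sketch of what the "get a task" path could look like with such a flag (task_t, dequeue_global and the status names are made up for illustration):

typedef enum { TASK_OK, TASK_RETRY, TASK_TERMINATE } task_status_t;

pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
int stop_requested = 0;                    /* the "stop" flag, set when shutting the pool down */

task_status_t get_task(task_t **out)
{
    pthread_mutex_lock(&pool_lock);
    if (stop_requested) {                  /* distinct from "no tasks": tells the worker to exit */
        pthread_mutex_unlock(&pool_lock);
        return TASK_TERMINATE;
    }
    *out = dequeue_global();               /* placeholder for the real global-queue lookup */
    pthread_mutex_unlock(&pool_lock);
    return *out ? TASK_OK : TASK_RETRY;    /* TASK_RETRY: wait a bit and ask again */
}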
I have a multi-threaded application which has a producer-consumer model.
Basically I have 2 structs.
the first one is a struct which contains all the necessary information for the work to be done.
the second one is a struct which is tied to a worker thread, it contains a pointer to the first struct.
like this:
typedef struct worker_struct {
    /* information for the work to be done */
} worker_struct;

typedef struct thread_specs {
    worker_struct  *work;
    unsigned short  thread_id;
    unsigned short  pred_cond;
    pthread_mutex_t busy_mutex;
    pthread_cond_t  work_signal;
} thread_specs;
Now, this is all fine and dandy, but from my producer I know which work needs to be done and I want to link a worker thread to that work. My problem is that I have no idea how to figure out whether a worker thread is currently busy or not.
I have a predicate condition with a conditional wait like so:
while (thread_stuff->pred_cond == 0) {
    retval = pthread_cond_wait(&(thread_stuff->work_signal), &(thread_stuff->busy_mutex));
    if (retval != 0) {
        strerror_r(retval, strerror_buf, ERRNO_BUFSIZE);
        printf("cond wait error! thread: %u, error: %s\n", thread_stuff->thread_id, strerror_buf);
    }
}
Now, how can I make sure a thread is not busy? If I set a variable protected by a mutex AFTER the thread has woken up from the signal, I get a race condition, as I have no guarantee that the variable gets set before my consumer checks again for waiting threads.
The only way I can see to do this is a pthread_mutex_trylock() on the same mutex that is coupled with the condition wait, but that seems rather expensive and inelegant.
Is there some other, better way to do something like this, that is, to figure out whether a thread is currently waiting at the predicate condition?
regards
In a typical producer-consumer relationship, producers and consumers are disconnected from each other and communicate through some shared data structure such as a FIFO queue. Producers create jobs and place them on a queue. Consumers remove items from the queue and process them. So there is no need for producers to know whether there is an available consumer. They just queue a job, and the next available consumer will pick it up.
Such a design makes it easy to add or remove producers or consumers because they can be independent of each other.
If you need some kind of signal that a job is currently processing or is complete, you would typically use some kind of signaling mechanism such as an event.
If you want to limit the number of scheduled but not yet processed work items, you would limit the size of the FIFO queue.
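A minimal sketch of such a shared FIFO with a mutex and a condition variable (job_t and the fifo_* functions are placeholders for whatever queue implementation you use):

pthread_mutex_t q_lock     = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  q_nonempty = PTHREAD_COND_INITIALIZER;
job_fifo_t      jobs;                      /* placeholder FIFO type */

void producer_submit(job_t *job)
{
    pthread_mutex_lock(&q_lock);
    fifo_push(&jobs, job);                 /* placeholder push */
    pthread_cond_signal(&q_nonempty);      /* wake one waiting consumer, if any */
    pthread_mutex_unlock(&q_lock);
}

job_t *consumer_take(void)
{
    pthread_mutex_lock(&q_lock);
    while (fifo_empty(&jobs))              /* loop guards against spurious wakeups */
        pthread_cond_wait(&q_nonempty, &q_lock);
    job_t *job = fifo_pop(&jobs);
    pthread_mutex_unlock(&q_lock);
    return job;
}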
A similar question of mine was answered quite well some time back.
C - Guarantee condvars are ready for signalling
It guarantees, via a protected status boolean, that your condvar is indeed ready to be signalled, i.e. that the thread is asleep waiting on it. If you're just using a simple setup (i.e. one condvar per thread), then you can use this to detect whether a thread is asleep or not.
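Applied to your structs, the idea is roughly this, assuming a hypothetical is_waiting field added to thread_specs:

/* worker side, with busy_mutex already held */
thread_stuff->is_waiting = 1;              /* hypothetical extra field in thread_specs */
while (thread_stuff->pred_cond == 0)
    pthread_cond_wait(&(thread_stuff->work_signal), &(thread_stuff->busy_mutex));
thread_stuff->is_waiting = 0;              /* cleared before busy_mutex is released again */

/* producer side: the check and the hand-off happen under the same mutex, so no race */
pthread_mutex_lock(&thread_stuff->busy_mutex);
if (thread_stuff->is_waiting) {            /* the thread is parked in pthread_cond_wait */
    thread_stuff->work = new_work;
    thread_stuff->pred_cond = 1;
    pthread_cond_signal(&thread_stuff->work_signal);
}
pthread_mutex_unlock(&thread_stuff->busy_mutex);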
I have one sender thread and 40 worker threads. There is a single queue. All of the 40 threads write to the queue, and the sender thread periodically reads from the shared queue and sends the data over a TCP socket (say every 1 second, the sender thread must read data from the queue and send it over the socket). I have a question here:
Suppose one of the 40 threads is in the critical section, the other threads are waiting to enter it, and at that moment the 1-second timer fires. I want the requests of all the other waiting threads to be ignored, and the sender thread to be given priority and handed the critical section.
In other words, I want to give the sender thread priority, so that when it calls EnterCriticalSection() all other threads waiting to enter the critical section are passed over, and as soon as the critical section becomes free it is given to the sender thread.
Is there any way to achieve this functionality?
You cannot achieve this with priority alone, because if a worker thread is holding the lock, priority cannot force it to release it. Here is one implementation I can think of:
As soon as the sender thread wakes up after its 1-second interval, it sends a signal to the worker threads, and the signal handler takes the lock away from the workers (I guess a binary semaphore would be good here, so set its value to 0 in the signal handler); any worker thread that then tries to acquire it will block. On the sender side, send all the packets and at the end set the semaphore back to 1.
This is one implementation; you can design your own along the same lines, but eventually it should work. :)
You likely just want some variant of a reader-writer lock. And probably just a plain Win32 critical section lock is all that is needed.
Here's why. The operations inside the critical section, appending data to the queue (or reading from it), are non-blocking. In other words, no operation on the queue will take longer than a fraction of a millisecond. If you use the Windows critical section lock (EnterCriticalSection, LeaveCriticalSection), fairness is guaranteed to threads waiting to enter the CS (I'm fairly certain of this).
So even if all 40 writer threads need to enter the CS to append to the queue, it shouldn't take more than a millisecond or two for the reader thread to wait its turn to acquire the lock. This of course assumes that the writer threads only copy memory into the queue and do not perform any long blocking I/O operations while holding the lock.
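To make the point concrete, here is a sketch of what the two sides might look like (packet_list_t and the list_* / send_all helpers are placeholders; the slow TCP send happens outside the lock):

CRITICAL_SECTION g_cs;                     /* InitializeCriticalSection(&g_cs) once at startup */
packet_list_t    g_queue;                  /* placeholder list type with O(1) append */

/* any of the 40 writer threads */
void enqueue_packet(packet_t *p)
{
    EnterCriticalSection(&g_cs);
    list_append(&g_queue, p);              /* a few pointer writes, nothing that blocks */
    LeaveCriticalSection(&g_cs);
}

/* sender thread, once per second */
void drain_and_send(SOCKET s)
{
    EnterCriticalSection(&g_cs);
    packet_list_t local = g_queue;         /* grab the whole list... */
    list_init(&g_queue);                   /* ...and leave an empty one behind */
    LeaveCriticalSection(&g_cs);

    send_all(s, &local);                   /* the slow TCP send happens outside the lock */
}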
Hope this helps in solving your problem http://man7.org/linux/man-pages/man3/pthread_getschedparam.3.html
One of the possible solutions to your issue lies in the way threads are implemented in Linux. Try using a mutex together with named FIFOs. Let your sender thread create a named FIFO (using the mkfifo() call), and when you create your 40 worker threads, have each of them, in its thread function, create a named FIFO of its own for receiving. Whenever your sender thread wants to communicate with one of the worker threads, it open()s that worker's FIFO, writes to it, and closes it. Whenever you open a FIFO, take a mutex lock, do whatever you want (read/write), and unlock the mutex when you are done with it.
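A rough sketch of that scheme (paths, error handling and message format simplified; the function names are just for illustration):

#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t fifo_lock = PTHREAD_MUTEX_INITIALIZER;

/* worker thread: create the named fifo it will read from */
void worker_setup(unsigned short id, char *path, size_t len)
{
    snprintf(path, len, "/tmp/worker_fifo_%u", (unsigned)id);
    mkfifo(path, 0600);                    /* ignoring EEXIST and other errors for brevity */
}

/* sender thread: write a message to one worker's fifo */
void send_to_worker(const char *path, const void *msg, size_t n)
{
    pthread_mutex_lock(&fifo_lock);
    int fd = open(path, O_WRONLY);         /* blocks until the worker has opened its end for reading */
    if (fd >= 0) {
        write(fd, msg, n);
        close(fd);
    }
    pthread_mutex_unlock(&fifo_lock);
}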
I am implementing a condition variable's wait operation. I have a struct for my condition variable. So far, my struct has a monitor, a queue, and a spinlock, but I am not sure whether a condition variable should have a queue of its own. My notify looks like this:
void uthread_cv_notify (uthread_cv_t* cv) {
    uthread_t* waiter_thread;
    spinlock_lock(&cv->spinlock);
    waiter_thread = dequeue(&cv->waiter_queue);
    if (waiter_thread)
    {
        uthread_monitor_exit(cv->mon);
        uthread_stop(TS_BLOCKED);
        uthread_monitor_enter(cv->mon);
        spinlock_unlock(&cv->spinlock);
    }
}
But I wonder whether, in the notify and wait functions, I should instead just enqueue and dequeue on the monitor's waiting queue?
Thanks
The signal operation (that you're calling notify) should not require that the monitor be entered. This is inefficient.
It seems like you're trying to implement some clumsy old fashioned condition/monitor system in which the caller of "notify" must be inside the monitor, and it is guaranteed that if a thread is waiting, that thread gets the monitor before the "notify" caller returns to the monitor. (And that waiting thread does not have to have a loop re-testing the condition, either.)
That may be how C. A. R. Hoare initially described monitors and conditions, but the formalism is impractical/inefficient on modern multiprocessor systems, and also on threading implementations which do not have the luxury of being extremely tightly integrated with the low level scheduler (to be able to precisely control which thread gets to run when, so there are no races about who acquires a mutex first: for instance, to be able to transfer a thread from one wait queue to another, etc.)
Note how you're extending the critical section of the monitor over the spinlock_lock operation and over the dequeue operation. Neither of these belongs under the monitor. The spinlock is independent, and the queue is guarded by the spinlock, not by the monitor. The monitor should protect the shared variables of the user code only (plus the special atomic property of the wait operation).
So why do you need an extra queue? You are already storing all the threads that need to be notified.
Also, you probably want to do something like this:
void uthread_cv_notify (uthread_cv_t* cv) {
    uthread_t* waiter_thread;
    spinlock_lock(&cv->spinlock);
    waiter_thread = dequeue(&cv->waiter_queue);
    if (waiter_thread)
    {
        uthread_monitor_exit(cv->mon);
        uthread_stop(TS_BLOCKED);
        uthread_monitor_enter(cv->mon);
    }
    spinlock_unlock(&cv->spinlock);
}
This will ensure that the spin lock is always released.
I do understand what an APC is, how it works, and how Windows uses it, but I don't understand when I (as a programmer) should use QueueUserAPC instead of, say, a fiber, or thread pool thread.
When should I choose to use QueueUserAPC, and why?
QueueUserAPC is a neat tool that can often be a shortcut for some tasks that are otherwise handled with synchronization objects. It allows you to tell a particular thread to do something whenever it is convenient for that thread (i.e. when it finishes its current work and starts waiting on something).
Let's say you have a main thread and a worker thread. The worker thread opens a socket to a file server and starts downloading a 10GB file by calling recv() in a loop. The main thread wants to have the worker thread do something else in its downtime while it is waiting for net packets; it can queue a function to be run on the worker while it would otherwise be waiting and doing nothing.
You have to be careful with APCs, because, as in the scenario I mentioned, you would not want to make another blocking WinSock call (which would result in undefined behavior). You really have to look carefully to find good uses of this functionality, because you can usually do the same thing in other ways, for example by having the other thread check an event every time it is about to go to sleep, rather than giving it a function to run while it is waiting. Obviously the APC is simpler in this scenario.
It is like when you have a call desk employee sitting and waiting for phone calls, and you give that person little tasks to do during their downtime. "Here, solve this Rubik's cube while you're waiting." Although, when a phone call comes in, the person would not put down the Rubik's cube to answer the phone (the APC has to return before the thread can go back to waiting).
QueueUserAPC is also useful if there is a single thread (Thread A) that is in charge of some data structure, and you want to perform some operation on the data structure from another thread (Thread B), but you don't want to have the synchronization overhead / complexity of trying to share that data between two threads. By having Thread B queue the operation to run on Thread A, which solely maintains that structure, you are executing any arbitrary function you want on that data without having to worry about synchronization.
It is just another tool, like a thread pool. However, with a thread pool you cannot send a task to a particular thread; you have no control over where the work is done. Queuing a task may end up creating a whole new thread, and two queued tasks may be done simultaneously on two different threads. With QueueUserAPC, you can be guaranteed that the tasks get done in order and on the thread you designate.
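For completeness, a minimal sketch of the mechanics (error handling omitted): the worker sits in an alertable wait, and the main thread queues a function onto it.

#include <windows.h>
#include <stdio.h>

VOID CALLBACK do_extra_work(ULONG_PTR param)    /* runs on the worker thread */
{
    printf("APC running on worker, param = %lu\n", (unsigned long)param);
}

DWORD WINAPI worker_main(LPVOID arg)
{
    for (;;)
        SleepEx(INFINITE, TRUE);                /* TRUE = alertable: queued APCs run here */
    return 0;
}

int main(void)
{
    HANDLE worker = CreateThread(NULL, 0, worker_main, NULL, 0, NULL);
    QueueUserAPC(do_extra_work, worker, 42);    /* "here, solve this while you're waiting" */
    Sleep(100);                                 /* give the APC a chance to run */
    CloseHandle(worker);
    return 0;
}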