Measuring efficiency of Mutex and Busy waiting - c

The program creates several threads, and each thread increments a shared variable by 10000 using a for loop that adds 1 in every iteration. Both a mutex-lock version and a spin-lock (busy-waiting) version are required. According to what I've learned, the mutex version should be faster than the spin-lock version, but my implementation gives the opposite result...
This is the implementation of each thread in the mutex version:
void *incr(void *tid)
{
    int i;
    for (i = 0; i < 10000; i++)
    {
        pthread_mutex_lock(&the_mutex);   //Grab the lock
        sharedVar++;                      //Increment the shared variable
        pthread_mutex_unlock(&the_mutex); //Release the lock
    }
    pthread_exit(0);
}
And this is the implementation in the spin lock version:
void *incr(void *tid)
{
    int i;
    for (i = 0; i < 10000; i++)
    {
        enter_region((int)tid); //Grab the lock
        sharedVar++;            //Increment the shared variable
        leave_region((int)tid); //Release the lock
    }
    pthread_exit(0);
}

void enter_region(int tid)
{
    interested[tid] = true;                       //Show this thread is interested
    turn = tid;                                   //Set flag
    while (turn == tid && other_interested(tid)); //Busy waiting
}

bool other_interested(int tid) //interested[] is initialized to all false
{
    int i;
    for (i = 0; i < tNumber; i++)
        if (i != tid)
            if (interested[i] == true) //There are other threads that are interested
                return true;
    return false;
}

void leave_region(int tid)
{
    interested[tid] = false; //Depart from critical region
}
I also iterated the process of creating and running the threads hundreds of times, to make sure the difference in execution time is measurable.
For example, with tNumber = 4 and 1000 iterations of the program, the mutex version takes 2.22 s and the spin-lock version takes 1.35 s. The difference grows as tNumber increases. Why is this happening? Is my code wrong?
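(For reference, a minimal harness along these lines could produce such timings. This is only a sketch of the measurement setup, which the question does not show: tNumber is hard-coded to 4 here, and incr is either version above.)

/* Hypothetical timing harness: repeatedly create the threads running
 * incr() and measure total wall-clock time with clock_gettime(). */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ITERATIONS 1000

extern void *incr(void *tid); /* mutex or spin-lock version */

int main(void)
{
    struct timespec start, end;
    pthread_t threads[4]; /* tNumber = 4 */
    long t;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int run = 0; run < ITERATIONS; run++) {
        for (t = 0; t < 4; t++)
            pthread_create(&threads[t], NULL, incr, (void *)t);
        for (t = 0; t < 4; t++)
            pthread_join(threads[t], NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (end.tv_sec - start.tv_sec)
                   + (end.tv_nsec - start.tv_nsec) / 1e9;
    printf("total: %.2f s\n", elapsed);
    return 0;
}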

The code between enter_region and leave_region is not protected.
You can prove this by making the work in the critical section more complicated, so that it trips itself up.
Create an array of bools (check) of length 10000, initialized to all false, and make the code between enter and leave:
if (check[sharedVar]) printf("ERROR\n");
else check[sharedVar++] = true;

The "difference" in speed comes from how you implement the synchronization:
interested[tid] = true; //Show this thread is interested
turn = tid; //Set flag
while(turn == tid && other_interested(tid));
These are sequential operations. Any thread can be preempted in the middle of them, and the next thread then reads an erroneous state.
This needs to be done atomically, by implementing either a compare-and-swap or a test-and-set. These instructions are usually provided by the hardware; for example, on x86 you have xchg, cmpxchg/cmpxchg8b, and xadd.
Your test could be rewritten as
while (compare_and_swap_atomic(myid, id_meaning_it_is_free) == false);
The issue is that atomicity is expensive.
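As an illustration, a minimal test-and-set spin lock built on C11 atomics (a sketch, not the question's enter_region/leave_region protocol) might look like this:

/* Sketch of a test-and-set spin lock using C11 atomics.
 * atomic_flag_test_and_set() compiles down to an atomic hardware
 * instruction (e.g. xchg on x86). */
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

void spin_lock(void)
{
    /* Atomically set the flag and test its previous value;
     * keep spinning while someone else already holds it. */
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ; /* busy wait */
}

void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);
}

With this, enter_region/leave_region collapse to spin_lock()/spin_unlock(), and no interested[]/turn bookkeeping is needed.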

Related

Is there a way to create a thread that will check on other threads in C? (dining philosophers implementation)

I have a question regarding threads in C. I know that the function pthread_create is needed to create a thread, and I'm currently working on the dining philosophers problem. In this implementation of that problem I have to check whether a philosopher has died of starvation.
I tested my program and it works well, but to check whether a philosopher has died I create another thread that always runs and checks whether a philosopher has died.
A philosopher dies of starvation if he has not eaten for a certain amount of time since his last meal.
Defining the general structure of the program and headers.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <sys/time.h>

struct t_phil;
typedef struct t_info t_info;
typedef struct t_phil t_phil;

typedef struct t_info
{
    int num_of_phil;
    t_phil *philo;
    int min_dinner;
    pthread_mutex_t *mutex;
    int plate_eaten;
    int num_of_dead_phil;
    int time_to_die;
    int time_to_sleep;
    int time_to_eat;
    pthread_mutex_t t_mut;
} t_info;

typedef struct t_phil
{
    int number;
    int has_eaten_all;
    int finished_meal;
    int is_dead;
    t_info *data;
    pthread_mutex_t *right;
    pthread_mutex_t *left;
    struct timeval last_dinner;
    pthread_t thread;
} t_phil;

int count = 0;
Here is the function that simulates the dinner. This one works as intended, but I am open to possible errors or improvements.
void *routine(void *args)
{
    t_phil *philo = (t_phil *)(args);
    int i = philo->number;
    if ((philo->number % 2))
        sleep(1);
    gettimeofday(&philo->last_dinner, 0);
    while (!philo->is_dead)
    {
        pthread_mutex_lock(philo->left);
        printf("Philosopher : %i has take left fork\n", philo->number + 1);
        pthread_mutex_lock(philo->right);
        printf("Philosopher : %i has take right fork\n", philo->number + 1);
        gettimeofday(&philo->last_dinner, 0);
        printf("Philosopher :%i is eating in at %li\n", philo->number + 1, philo->last_dinner.tv_sec * 1000);
        pthread_mutex_lock(&philo->data->t_mut);
        // if (philo->data->num_of_dead_phil && !philo->data->min_dinner)
        //     break;
        if (philo->is_dead)
            break;
        gettimeofday(&philo->last_dinner, NULL);
        philo->finished_meal++;
        if (philo->finished_meal == philo->data->min_dinner)
        {
            philo->data->plate_eaten++;
            philo->has_eaten_all = 1;
        }
        sleep(philo->data->time_to_eat);
        pthread_mutex_unlock(&philo->data->t_mut);
        pthread_mutex_unlock(philo->left);
        pthread_mutex_unlock(philo->right);
        if (philo->has_eaten_all)
            break;
        printf("Philosopher : %i is now sleeping at %li\n", philo->number + 1, philo->last_dinner.tv_sec * 1000);
        sleep(philo->data->time_to_sleep);
        printf("Philosopher : %i is now thinking at %li\n", philo->number + 1, philo->last_dinner.tv_sec * 1000);
    }
    return (NULL);
}
This next function is the one not working as intended, and I don't know why. My if statement seems to have the right condition, but I never enter the if block, meaning the condition is never met while it should be.
I tested a lot of values and the same result happens each time.
void *watchers_phil(void *args)
{
    t_info *data = (t_info *)args;
    t_phil *phil = data->philo;
    int i = 0;
    struct timeval now;
    while (1)
    {
        if (data->plate_eaten == data->num_of_phil)
            break;
        while (i < data->num_of_phil)
        {
            if ((phil[i].last_dinner.tv_sec) >= ((phil[i].last_dinner.tv_sec) + (long int)data->time_to_die))
            {
                gettimeofday(&now, NULL);
                printf("Unfortunately Philosopher : %i, is dead because of starvation at %li....", phil[i].number, (now.tv_sec * 1000));
                phil[i].is_dead = 1;
            }
            i++;
        }
        i = 0;
    }
    return (NULL);
}
int main(int argc, char *argv[])
{
    t_info data;
    pthread_t watchers;
    memset(&data, 0, sizeof(t_info));
    data.num_of_phil = atoi(argv[1]);
    data.min_dinner = atoi(argv[2]);
    data.time_to_eat = atoi(argv[3]);
    data.time_to_die = atoi(argv[4]);
    data.time_to_sleep = atoi(argv[5]);
    t_phil *philo = malloc(sizeof(t_phil) * data.num_of_phil);
    if (!philo)
        return (1);
    pthread_mutex_t *mutex = malloc(sizeof(pthread_mutex_t) * data.num_of_phil);
    data.mutex = mutex;
    if (!mutex)
    {
        free(philo);
        return (1);
    }
    int i = 0;
    while (i < data.num_of_phil)
    {
        pthread_mutex_init(&data.mutex[i], NULL);
        i++;
    }
    printf("Number : %i\n", data.num_of_phil);
    pthread_mutex_init(&data.t_mut, NULL);
    i = 0;
    while (i < data.num_of_phil)
    {
        philo[i].number = i;
        philo[i].has_eaten_all = 0;
        philo[i].data = &data;
        philo[i].is_dead = 0;
        philo[i].right = &data.mutex[i];
        if (i == (data.num_of_phil - 1))
            philo[i].left = &data.mutex[0];
        else
            philo[i].left = &data.mutex[i + 1];
        i++;
    }
    data.philo = philo;
    i = 0;
    while (i < data.num_of_phil)
    {
        pthread_create(&data.philo[i].thread, NULL, routine, &data.philo[i]);
        i++;
    }
    pthread_create(&watchers, NULL, watchers_phil, &data);
    i = 0;
    while (i < data.num_of_phil)
    {
        pthread_join(data.philo[i].thread, NULL);
        i++;
    }
    pthread_join(watchers, NULL);
    printf("Dinner eaten : %i\n", data.plate_eaten);
    i = 0;
    while (i < data.num_of_phil)
    {
        pthread_mutex_destroy(&data.mutex[i]);
        i++;
    }
    pthread_mutex_destroy(&data.t_mut);
}
I recommend the following approach:
Have an array containing a timestamp for each philosopher thread.
Each philosopher thread updates its respective timestamp either on beginning or on ending eating. It can use this timestamp to determine its own death as well. You could store the eating time or the time when starvation occurs; I personally would rather do the latter, as it requires calculating this starvation time just once (and you don't need the eating time anywhere else), while the former would need to do it on every test for starvation.
The monitoring thread iterates over all these timestamps, doing the following (see the sketch below):
Determine if a thread has starved (current time – just get it once in front of the loop! – being after the stored timestamp, if you follow my recommendation above). Flag the thread as starved in some suitable way so that you can ignore it next time.
From the non-starved threads, remember the minimum value.
After the iteration, sleep until this minimum time is reached (maybe waking up minimally earlier for better precision). If a philosopher updates the timestamp corresponding to this minimum – never mind, the monitor will wake up in vain, just doing nothing, but that won't hurt apart from consuming a bit of CPU time.
Updating the timestamps: you need to be aware that – unless you can guarantee atomic timestamp updates – you need to protect them against race conditions between the philosopher threads and the monitoring thread.
I see two options:
One single mutex for the entire array. Easy to implement, but any of the involved threads might block another one (two philosophers if both try to update at the same time, a philosopher the monitor, or the monitor one or more philosophers). If one thread is preempted right while holding this mutex, the blocked one(s) need(s) to wait until this thread is scheduled again.
One mutex for each timestamp. The advantage is that blocking can only occur between the monitor and one single philosopher, while all other philosophers can go on as usual. The implementation is more complex, though, and the monitor thread will run its tests a bit slower (you won't notice...) due to having to lock and unlock mutexes again and again (system calls involved, these are rather expensive).
A philosopher should hold the mutex over the entire time between reading the timestamp for checking its own starvation and writing the updated starvation time. This assures the philosopher cannot starve right while eating, though in between you should do as little work as possible, especially if you opt for one single mutex (if you cannot avoid that, then rather opt for multiple ones).
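A minimal sketch of that monitor loop, assuming one single mutex and a deadline[] array holding the absolute time (in ms) at which each philosopher would starve (the names deadline, starved, now_ms, and NUM_PHIL are illustrative, not from the question's code):

/* Illustrative sketch only: one mutex guards deadline[], which stores for
 * each philosopher the absolute time (ms) at which it would starve. */
#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

#define NUM_PHIL 5

long deadline[NUM_PHIL];   /* starvation deadlines, guarded by ts_mutex */
int starved[NUM_PHIL];     /* flagged once, then ignored */
pthread_mutex_t ts_mutex = PTHREAD_MUTEX_INITIALIZER;

static long now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

void *monitor(void *arg)
{
    for (;;)
    {
        long now = now_ms();       /* get the current time once per pass */
        long min_deadline = -1;
        pthread_mutex_lock(&ts_mutex);
        for (int i = 0; i < NUM_PHIL; i++)
        {
            if (starved[i])
                continue;          /* already flagged, ignore */
            if (now >= deadline[i])
            {
                starved[i] = 1;    /* flag as starved */
                printf("Philosopher %i starved at %li\n", i + 1, now);
            }
            else if (min_deadline < 0 || deadline[i] < min_deadline)
                min_deadline = deadline[i];
        }
        pthread_mutex_unlock(&ts_mutex);
        if (min_deadline < 0)
            break;                 /* no one left to watch */
        long wait_ms = min_deadline - now; /* sleep until the next deadline */
        struct timespec ts = { wait_ms / 1000, (wait_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
    }
    return NULL;
}

Each philosopher would then update deadline[i] = now_ms() + time_to_die (and check starved[i]) while holding ts_mutex, per the last paragraph above.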

Noticing if all threads are at pthread_cond_wait

I'm currently playing around with the POSIX library and trying out condition variables.
At the moment I'm using a queue to schedule tasks; if there is a task, one thread uses pthread_cond_signal to wake up a thread that is waiting at pthread_cond_wait.
However, there might be a point where no new tasks are created, so every thread is waiting at pthread_cond_wait and the program is stuck there.
Is there any way for me to notice if all my threads are waiting at pthread_cond_wait? I've tried using a counter and a leave variable, but I did not get it working.
My code looks similar to this code here: https://code-vault.net/lesson/j62v2novkv:1609958966824, except each thread is adding new tasks (in executeTask) instead of only the main method adding them.
EDIT:
Here is my version of the code from the website above (which is loading for me :/):
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>

#define THREAD_NUM 4

typedef struct Task
{
    int a, b;
} Task;

Task taskQueue[256];
int taskCount = 0;
// THIS VARIABLE IS ADDED
int moreTasks = 10;
pthread_mutex_t mutexQueue;
pthread_cond_t condQueue;

void submitTask(Task task); // forward declaration, so executeTask can call it

void executeTask(Task *task)
{
    usleep(50000);
    int result = task->a + task->b;
    printf("The sum of %d and %d is %d\n", task->a, task->b, result);
    // THIS PART IS ADDED
    if (moreTasks > 0)
    {
        pthread_mutex_lock(&mutexQueue);
        moreTasks--;
        pthread_mutex_unlock(&mutexQueue);
        Task t = {
            .a = rand() % 100,
            .b = rand() % 100};
        submitTask(t);
    }
}
void submitTask(Task task)
{
    pthread_mutex_lock(&mutexQueue);
    taskQueue[taskCount] = task;
    taskCount++;
    pthread_mutex_unlock(&mutexQueue);
    pthread_cond_signal(&condQueue);
}

void *startThread(void *args)
{
    while (1)
    {
        Task task;
        pthread_mutex_lock(&mutexQueue);
        while (taskCount == 0)
        {
            pthread_cond_wait(&condQueue, &mutexQueue);
        }
        task = taskQueue[0];
        int i;
        for (i = 0; i < taskCount - 1; i++)
        {
            taskQueue[i] = taskQueue[i + 1];
        }
        taskCount--;
        pthread_mutex_unlock(&mutexQueue);
        executeTask(&task);
    }
}
int main(int argc, char *argv[])
{
    pthread_t th[THREAD_NUM];
    pthread_mutex_init(&mutexQueue, NULL);
    pthread_cond_init(&condQueue, NULL);
    int i;
    for (i = 0; i < THREAD_NUM; i++)
    {
        if (pthread_create(&th[i], NULL, &startThread, NULL) != 0)
        {
            perror("Failed to create the thread");
        }
    }
    srand(time(NULL));
    for (i = 0; i < 100; i++)
    {
        Task t = {
            .a = rand() % 100,
            .b = rand() % 100};
        submitTask(t);
    }
    for (i = 0; i < THREAD_NUM; i++)
    {
        if (pthread_join(th[i], NULL) != 0)
        {
            perror("Failed to join the thread");
        }
    }
    pthread_mutex_destroy(&mutexQueue);
    pthread_cond_destroy(&condQueue);
    return 0;
}
My problem now is that after a while taskCount is always zero, and therefore all my threads are at pthread_cond_wait(&condQueue, &mutexQueue) in the startThread method.
Is there any way of noticing when all threads are at this place? As stated above, I already tried using a counter for the threads that are currently waiting (and also a version where I counted the threads that are working), but I could not figure out how to stop all the threads once every task is done.
If you want to know how many threads are waiting on a condition variable, just increase a global counter before you go into wait mode and decrease it on wake-up. With that you'll know how many threads are currently waiting.
E.g.:
//global scope
int num_waiting_threads = 0;

//in function startThread
while (taskCount == 0)
{
    //if num_waiting_threads == THREAD_NUM, all threads are waiting
    ++num_waiting_threads;
    pthread_cond_wait(&condQueue, &mutexQueue);
    --num_waiting_threads;
}
Since you have a fixed number of tasks, after processing every task, every thread will sooner or later just be waiting around for newly submitted tasks. Your execute function will only add 10 new tasks in total (moreTasks is initialized with 10 and decreased continuously, and once that counter hits zero, no more tasks will be submitted).
Therefore, you could/should use pthread_cond_timedwait and wake up by yourself to check the num_waiting_threads state. If all threads are waiting, break out of the loop and exit the thread.
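For illustration, a rough sketch of that approach (the done flag, the 100 ms timeout, and the exact bookkeeping are illustrative assumptions, not part of the original code):

// Sketch: wait with a timeout so each thread can periodically check
// whether everyone is idle; if so, wake the others and exit.
int done = 0; // guarded by mutexQueue

void *startThread(void *args)
{
    while (1)
    {
        pthread_mutex_lock(&mutexQueue);
        while (taskCount == 0 && !done)
        {
            struct timespec ts;
            clock_gettime(CLOCK_REALTIME, &ts);
            ts.tv_nsec += 100 * 1000000L; // wake up after at most ~100 ms
            if (ts.tv_nsec >= 1000000000L) { ts.tv_sec++; ts.tv_nsec -= 1000000000L; }

            ++num_waiting_threads;
            pthread_cond_timedwait(&condQueue, &mutexQueue, &ts);
            --num_waiting_threads;

            // All other threads waiting too, and nothing queued: we're done.
            if (taskCount == 0 && num_waiting_threads == THREAD_NUM - 1)
            {
                done = 1;
                pthread_cond_broadcast(&condQueue);
            }
        }
        if (done)
        {
            pthread_mutex_unlock(&mutexQueue);
            return NULL;
        }
        /* ...dequeue and execute the task as before... */
        pthread_mutex_unlock(&mutexQueue);
    }
}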

What is the behavioral difference between vTaskDelay and _delay_ms?

1. Introduction
I cannot seem to find information or a detailed explanation about the behavioral differences between the following functions in a FreeRTOS task:
vTaskDelay
_delay_ms
2. Code
Suppose you have the following code:
IdleHook + task creation
long value = 0;

void vApplicationIdleHook( void ) {
    while(1)
    {
        // empty
    }
}

int main(void)
{
    xTaskCreate(TaskIncrement, (const portCHAR *)"up",   256, NULL, 2, NULL);
    xTaskCreate(TaskDecrement, (const portCHAR *)"down", 256, NULL, 1, NULL);
    vTaskStartScheduler();
}
Tasks with vTaskDelay
static void TaskDecrement(void *param)
{
    while(1)
    {
        for(unsigned long i = 0; i < 123; i++) {
            //semaphore take
            value--;
            //semaphore give
        }
        vTaskDelay(100);
    }
}

static void TaskIncrement(void *param)
{
    while(1)
    {
        for(unsigned long i = 0; i < 123; i++) {
            //semaphore take
            value++;
            //semaphore give
        }
        vTaskDelay(100);
    }
}
Tasks with _delay_ms
static void TaskDecrement(void *param)
{
    while(1)
    {
        for(unsigned long i = 0; i < 123; i++) {
            //semaphore take
            value--;
            //semaphore give
        }
        _delay_ms(100);
    }
}

static void TaskIncrement(void *param)
{
    while(1)
    {
        for(unsigned long i = 0; i < 123; i++) {
            //semaphore take
            value++;
            //semaphore give
        }
        _delay_ms(100);
    }
}
3. Question
What happens to the flow of the program when the tasks use vTaskDelay versus _delay_ms?
Note: the two given example tasks have different priorities.
From https://www.freertos.org/Documentation/FreeRTOS_Reference_Manual_V10.0.0.pdf:
Places the task that calls vTaskDelay() into the Blocked state for a fixed number of tick interrupts. Specifying a delay period of zero ticks will not result in the calling task being placed into the Blocked state, but will result in the calling task yielding to any Ready state tasks that share its priority. Calling vTaskDelay(0) is equivalent to calling taskYIELD().
I think you get the idea already, but if you have multiple tasks created, then vTaskDelay() will put the running task into the "Blocked" state for the specified number of tick interrupts (not milliseconds!) and allow the task with the next highest priority to run until it yields control (or gets preempted, depending on your FreeRTOS configuration).
I don't think _delay_ms() is part of the FreeRTOS library. Are you sure it's not a platform specific function? My guess is, if the highest priority task calls _delay_ms(), then it will result in a busy wait. Otherwise, a task with a higher priority may preempt the task that calls _delay_ms() as it is delaying (that is, _delay_ms() will not yield control immediately).
Perhaps a better summary of the above: in a multitasking application, _delay_ms() is not deterministic.
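Note that since vTaskDelay() takes ticks, a millisecond-based delay is normally written with FreeRTOS's pdMS_TO_TICKS() conversion macro, so the task still blocks (and yields) instead of busy-waiting:

/* Block this task for roughly 100 ms worth of tick interrupts,
 * letting lower-priority tasks (or the idle hook) run meanwhile. */
vTaskDelay(pdMS_TO_TICKS(100));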

Simple pipelining using pthreads

Very recently I have started working with pthreads and am trying to implement software pipelining with them. To that end I have written a toy program myself, something similar to which will be part of my main project.
So in this program the main thread creates an input and an output buffer of integer type and then creates a single master thread and passes those buffers to the master thread. The master thread in turn creates two worker threads.
The input and the output buffer that are passed from the main to the master thread are of size n x k (e.g. 5x10 ints). The master thread iterates over a chunk of size k (i.e. 10), n (i.e. 5) times.
There is a loop running in the master thread for n (5 here) iterations. In each iteration the master thread does some operation on a portion of the input data of size k and places it in the common buffer shared between the master and the worker threads. The master thread then signals the worker threads that the data has been placed in the common buffer.
The two worker threads wait for the signal from the master thread that the common buffer is ready. The work on the common buffer is divided in half between the worker threads: one worker thread works on the first half and the other worker thread on the second half of the common buffer.
Once the worker threads get the signal from the master thread, each worker thread does some operation on its half of the data and copies it to the output buffer. The worker threads then inform the master thread that their operation on the common buffer is complete by setting flag values. An array of flags is created for the worker threads. The master thread keeps checking if all the flags are set, which basically means all the worker threads finished their operation on the common buffer, so the master thread can safely place the next data chunk into the common buffer for the worker threads' consumption.
So essentially there is communication between the master and the worker threads in a pipelined fashion. At the very end I print the output buffer in the main thread. But I am getting no output at all. I have copy-pasted my code with comments on almost all steps.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/time.h>
#include <semaphore.h>
#include <unistd.h>
#include <stdbool.h>
#include <string.h>

#define MthNum 1  // Number of master threads
#define WthNum 2  // Number of worker threads
#define times 5   // Number of iterations (n in the explanation)
#define elNum 10  // Chunk size during each iteration (k in the explanation)

pthread_mutex_t mutex;       // mutex variable declaration
pthread_cond_t cond_var;     // condition variable declaration
bool completion_flag = true; // This global flag indicates the completion of the worker threads.
                             // Turned false once all operation ends, marking the completion.
int *commonBuff;             // common buffer between master and worker threads
int *commFlags;              // array of flags that are turned to 1 by each worker thread. So worker thread i turns
                             // commFlags[i] to 1; the master thread turns commFlags[i] = 0 for i = 0 to (WthNum - 1)
int *commFlags_s;
int counter;                 // This counter is used by the master thread to check whether all the commFlags[i] are set,
                             // which shows that all the threads finished their work on the common buffer
// static pthread_barrier_t barrier;

// Arguments structure passed to the master thread
typedef struct{
    int *input;  // input buffer
    int *output; // output buffer
}master_args;

// Arguments structure passed to the worker threads
typedef struct{
    int threadId;
    int *outBuff;
}worker_args;

void* worker_func(void *arguments);
void *master_func(void *);

int main(int argc, char *argv[]){
    int *ipData, *opData;
    int i, j;
    // allocation of input buffer and initializing to 0
    ipData = (int *)malloc(times*elNum*sizeof(int));
    memset(ipData, 0, times*elNum*sizeof(int));
    // allocation of output buffer and initializing to 0
    opData = (int *)malloc(times*elNum*sizeof(int));
    memset(opData, 0, times*elNum*sizeof(int));
    pthread_t thread[MthNum];
    master_args* args[MthNum];
    // creating the single master thread and passing the arguments
    for(i = 0; i < MthNum; i++){
        args[i] = (master_args *)malloc(sizeof(master_args));
        args[i]->input = ipData;
        args[i]->output = opData;
        pthread_create(&thread[i], NULL, master_func, (void *)args[i]);
    }
    // joining the master thread
    for(i = 0; i < MthNum; i++){
        pthread_join(thread[i], NULL);
    }
    // printing the output buffer values
    for(j = 0; j < times; j++){
        for(i = 0; i < elNum; i++){
            printf("%d\t", opData[i + j*times]);
        }
        printf("\n");
    }
    return 0;
}
//This is the master thread function
void *master_func(void *arguments){
    // copying the arguments pointer to local variables
    master_args* localMasterArgs = (master_args *)arguments;
    int *indataArgs = localMasterArgs->input;   // input buffer
    int *outdataArgs = localMasterArgs->output; // output buffer
    // worker thread declaration
    pthread_t Workers[WthNum];
    // worker thread arguments declaration
    worker_args* wArguments[WthNum];
    int i, j;
    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond_var, NULL);
    counter = 0;
    commonBuff = (int *)malloc(elNum*sizeof(int));
    commFlags = (int *)malloc(WthNum*sizeof(int));
    memset(commFlags, 0, WthNum*sizeof(int));
    commFlags_s = (int *)malloc(WthNum*sizeof(int));
    memset(commFlags_s, 0, WthNum*sizeof(int));
    for(i = 0; i < WthNum; i++){
        wArguments[i] = (worker_args* )malloc(sizeof(worker_args));
        wArguments[i]->threadId = i;
        wArguments[i]->outBuff = outdataArgs;
        pthread_create(&Workers[i], NULL, worker_func, (void *)wArguments[i]);
    }
    for (i = 0; i < times; i++) {
        for (j = 0; j < elNum; j++)
            indataArgs[i + j * elNum] = i + j;
        while (counter != 0) {
            counter = 0;
            pthread_mutex_lock(&mutex);
            for (j = 0; j < WthNum; j++) {
                counter += commFlags_s[j];
            }
            pthread_mutex_unlock(&mutex);
        }
        pthread_mutex_lock(&mutex);
        memcpy(commonBuff, &indataArgs[i * elNum], sizeof(int));
        pthread_mutex_unlock(&mutex);
        counter = 1;
        while (counter != 0) {
            counter = 0;
            pthread_mutex_lock(&mutex);
            for (j = 0; j < WthNum; j++) {
                counter += commFlags[j];
            }
            pthread_mutex_unlock(&mutex);
        }
        // printf("master broad cast\n");
        pthread_mutex_lock(&mutex);
        pthread_cond_broadcast(&cond_var);
        // releasing the lock
        pthread_mutex_unlock(&mutex);
    }
    pthread_mutex_lock(&mutex);
    completion_flag = false;
    pthread_mutex_unlock(&mutex);
    for (i = 0; i < WthNum; i++) {
        pthread_join(Workers[i], NULL);
    }
    pthread_mutex_destroy(&mutex);
    pthread_cond_destroy(&cond_var);
    return NULL;
}
void* worker_func(void *arguments){
    worker_args* localArgs = (worker_args*)arguments;
    // copying the thread ID and the output buffer
    int tid = localArgs->threadId;
    int *localopBuffer = localArgs->outBuff;
    int i, j;
    bool local_completion_flag = false;
    while(local_completion_flag){
        pthread_mutex_lock(&mutex);
        commFlags[tid] = 0;
        commFlags_s[tid] = 1;
        pthread_cond_wait(&cond_var, &mutex);
        commFlags_s[tid] = 0;
        commFlags[tid] = 1;
        if (tid == 0) {
            for (i = 0; i < (elNum / 2); i++) {
                localopBuffer[i] = commonBuff[i] * 5;
            }
        } else { // Thread ID 1 operates on the other half of the common buffer data
                 // and places it in the output buffer
            for (i = 0; i < (elNum / 2); i++) {
                localopBuffer[elNum / 2 + i] = commonBuff[elNum / 2 + i] * 10;
            }
        }
        local_completion_flag = completion_flag;
        pthread_mutex_unlock(&mutex); // releasing the lock
    }
    return NULL;
}
But I have no idea where I have gone wrong in my implementation, since logically it seems correct, yet definitely something is wrong. I have spent a long time trying different things to fix it, but nothing worked. Sorry for this long post, but I am unable to narrow down a section where I might have gone wrong, so I couldn't make the post more concise. If anybody could take a look at the problem and the implementation and suggest what changes are needed to make it run as intended, that would be really helpful. Thank you for your help and assistance.
There are several errors in this code.
You may start by fixing the creation of the worker threads:
wArguments[i] = (worker_args* )malloc(sizeof(worker_args));
wArguments[i]->threadId = i;
wArguments[i]->outBuff = outdataArgs;
pthread_create(&Workers[i], NULL, worker_func, (void *)wArguments);
You are initializing the worker_args structs but passing them incorrectly: a pointer to the array, (void *)wArguments, instead of pointers to the array elements you just initialized.
pthread_create(&Workers[i],NULL,worker_func, (void *)wArguments[i]);
// ^^^
Initialize counter before starting the threads that use its value:
void *master_func(void *arguments)
{
    /* (...) */
    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond_var, NULL);
    counter = WthNum;
When starting the master thread, you incorrectly pass a pointer to a pointer:
pthread_create(&thread[i],NULL,master_func,(void *)&args[i]);
Please change this to:
pthread_create(&thread[i],NULL,master_func,(void *) args[i]);
All accesses to the counter variable (as with any other shared memory) must be synchronized between threads.
I think you should use a semaphore-based producer-consumer model, like this:
https://jlmedina123.wordpress.com/2014/04/08/255/
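For illustration, a minimal sketch of that semaphore-based pattern (an assumption about the linked article's approach, using unnamed POSIX semaphores; the buffer size and item type are arbitrary):

/* Sketch: classic bounded-buffer producer-consumer with two counting
 * semaphores (free slots / available items) plus a mutex for the indices. */
#include <pthread.h>
#include <semaphore.h>

#define BUF_SIZE 8

int buffer[BUF_SIZE];
int in = 0, out = 0;
sem_t empty_slots, full_slots; /* initialized in setup() */
pthread_mutex_t buf_mutex = PTHREAD_MUTEX_INITIALIZER;

void setup(void)
{
    sem_init(&empty_slots, 0, BUF_SIZE); /* all slots free initially */
    sem_init(&full_slots, 0, 0);         /* no items yet */
}

void produce(int item)
{
    sem_wait(&empty_slots);           /* wait for a free slot */
    pthread_mutex_lock(&buf_mutex);
    buffer[in] = item;
    in = (in + 1) % BUF_SIZE;
    pthread_mutex_unlock(&buf_mutex);
    sem_post(&full_slots);            /* announce a new item */
}

int consume(void)
{
    int item;
    sem_wait(&full_slots);            /* wait for an item */
    pthread_mutex_lock(&buf_mutex);
    item = buffer[out];
    out = (out + 1) % BUF_SIZE;
    pthread_mutex_unlock(&buf_mutex);
    sem_post(&empty_slots);           /* announce a free slot */
    return item;
}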

pthread_cond_wait deadlock in fifo circular queue

My code is only used in a one-producer, one-consumer situation.
Here is my test code:
static void *afunc(void * arg) {
    Queue* q = arg;
    for(int i = 0; i < 100000; i++) {
        *queue_pull(q) = i; //get one element space
        queue_push(q);      //increase the write pointer
    }
    return NULL;
}

static void *bfunc(void * arg) {
    Queue* q = arg;
    for(;;) {
        int *i = queue_fetch(q); //get the first element in queue
        printf("%d\n", *i);
        queue_pop(q);            //increase the read pointer
    }
}

int main() {
    Queue queue;
    pthread_t a, b;
    queue_init(&queue);
    pthread_create(&a, NULL, afunc, &queue);
    pthread_create(&b, NULL, bfunc, &queue);
    sleep(100000);
    return 0;
}
And here is the implementation of the circular queue:
#define MAX_QUEUE_SIZE 3

typedef struct Queue{
    int data[MAX_QUEUE_SIZE];
    int read, write;
    pthread_mutex_t mutex, mutex2;
    pthread_cond_t not_empty, not_full;
}Queue;

int queue_init(Queue *queue) {
    memset(queue, 0, sizeof(Queue));
    pthread_mutex_init(&queue->mutex, NULL);
    pthread_cond_init(&queue->not_empty, NULL);
    pthread_mutex_init(&queue->mutex2, NULL);
    pthread_cond_init(&queue->not_full, NULL);
    return 0;
}

int* queue_fetch(Queue *queue) {
    int* ret;
    if (queue->read == queue->write) {
        pthread_mutex_lock(&queue->mutex);
        pthread_cond_wait(&queue->not_empty, &queue->mutex);
        pthread_mutex_unlock(&queue->mutex);
    }
    ret = &(queue->data[queue->read]);
    return ret;
}

void queue_pop(Queue *queue) {
    nx_atomic_set(queue->read, (queue->read+1)%MAX_QUEUE_SIZE);
    pthread_cond_signal(&queue->not_full);
}

int* queue_pull(Queue *queue) {
    int* ret;
    if ((queue->write+1)%MAX_QUEUE_SIZE == queue->read) {
        pthread_mutex_lock(&queue->mutex2);
        pthread_cond_wait(&queue->not_full, &queue->mutex2);
        pthread_mutex_unlock(&queue->mutex2);
    }
    ret = &(queue->data[queue->write]);
    return ret;
}

void queue_push(Queue *queue) {
    nx_atomic_set(queue->write, (queue->write+1)%MAX_QUEUE_SIZE);
    pthread_cond_signal(&queue->not_empty);
}
After a few moments, it seems the two child threads end up in a deadlock.
EDIT: I tried using two semaphores instead, but that also has a problem. It's pretty weird: if I just execute ./main, it seems fine, but if I redirect the output into a file, like ./main > a.txt, and then run wc -l a.txt, the result does not equal the number of enqueued items.
int queue_init(Queue *queue) {
    memset(queue, 0, sizeof(Queue));
    pthread_mutex_init(&queue->mutex, NULL);
    sem_unlink("/not_empty");
    queue->not_empty = sem_open("/not_empty", O_CREAT, 644, 0);
    sem_unlink("/not_full");
    queue->not_full = sem_open("/not_full", O_CREAT, 644, MAX_QUEUE_SIZE);
    return 0;
}

int* queue_fetch(Queue *queue) {
    sem_wait(queue->not_empty);
    return &(queue->data[queue->read]);
}

void queue_pop(Queue *queue) {
    nx_atomic_set(queue->read, (queue->read+1)%MAX_QUEUE_SIZE);
    sem_post(queue->not_full);
}

int* queue_pull(Queue *queue) {
    sem_wait(queue->not_full);
    return &(queue->data[queue->write]);
}

void queue_push(Queue *queue) {
    nx_atomic_set(queue->write, (queue->write+1)%MAX_QUEUE_SIZE);
    sem_post(queue->not_empty);
}
You are manipulating the state of the queue outside the mutex; this is inherently racy.
I would suggest using a single mutex, but take it whenever you change or test the read and write indices. This also means that you don't need the atomic sets.
Quite possibly one of your threads is waiting for a condition to be signalled after the signalling has occurred, causing both threads to wait for each other indefinitely.
Pthreads condition variables don't remain signalled -- the signalling is a momentary action. The condition variable isn't used to determine whether to wait -- it's just used to wake up a thread that's already waiting; you need a different means of determining whether or not to wait, such as checking a flag or some sort of test condition.
Normally, you signal as follows:
Lock the mutex
Do your updates, generally leaving your test condition 'true' (eg. setting your flag)
Call pthread_cond_signal() or pthread_cond_broadcast()
Unlock the mutex
...and wait as follows:
Lock the mutex
Loop until your test expression is 'true' (eg. until your flag is set), calling pthread_cond_wait() only if the test is false (inside the loop).
After the loop, when your test has succeeded, do your work.
Unlock the mutex
For example, signalling might go something like this:
pthread_mutex_lock(&mtx);           /* 1: lock mutex */
do_something_important();           /* 2: do your work... */
ready_flag = 1;                     /* ...and set the flag */
pthread_cond_signal(&cond);         /* 3: signal the condition (before unlocking) */
pthread_mutex_unlock(&mtx);         /* 4: unlock mutex */
and waiting might be something like this:
pthread_mutex_lock(&mtx);           /* 1: lock mutex */
while (ready_flag == 0)             /* 2: Loop until flag is set... */
    pthread_cond_wait(&cond, &mtx); /* ...waiting when it isn't */
do_something_else();                /* 3: Do your work... */
ready_flag = 0;                     /* ...and clear the flag if it's all done */
pthread_mutex_unlock(&mtx);         /* 4: unlock mutex */
The waiter won't miss the condition this way, because the mutex ensures that the waiter's test-and-wait and the signaller's set-and-signal cannot occur simultaneously.
This section of your queue_fetch() function:
if (queue->read == queue->write) {
    pthread_mutex_lock(&queue->mutex);
    pthread_cond_wait(&queue->not_empty, &queue->mutex);
    pthread_mutex_unlock(&queue->mutex);
}
ret = &(queue->data[queue->read]);
...might be rewritten as follows:
pthread_mutex_lock(&queue->mutex);
while (queue->read == queue->write)
    pthread_cond_wait(&queue->not_empty, &queue->mutex);
ret = &(queue->data[queue->read]);
pthread_mutex_unlock(&queue->mutex);
...where:
The lock/unlock of the mutex is moved to surround the whole sequence, so the mutex is held while the test expression is evaluated and is still held when the condition wait starts
The if is changed to a while in case the condition wait is prematurely interrupted
Access to queue->read and queue->write is done with the mutex held
Similar changes would be made to queue_pull().
As for the signalling code, the following section of queue_pop():
nx_atomic_set(queue->read, (queue->read+1)%MAX_QUEUE_SIZE);
pthread_cond_signal(&queue->not_full);
...might be changed to:
pthread_mutex_lock(&queue->mutex);
queue->read = (queue->read + 1) % MAX_QUEUE_SIZE;
pthread_cond_signal(&queue->not_full);
pthread_mutex_unlock(&queue->mutex);
...where:
The mutex is held while signalling the condition (this ensures the condition can't be signalled between the waiter deciding whether to wait and actually starting to wait, since the waiter would hold the mutex during that interval)
The mutex is held while changing queue->read as well rather than using nx_atomic_set() since the mutex is needed when signalling the condition anyway
Similar changes would be made to queue_push().
Additionally, you should just use a single mutex (so that the same mutex is always held when accessing read and write), and once the while loops are added to the condition waits there's little compelling reason to use more than one condition variable. If switching to a single condition variable, just signal the condition again after completing a wait:
pthread_mutex_lock(&queue->mutex);
while (queue->read == queue->write) {
    pthread_cond_wait(&queue->cond, &queue->mutex);
    pthread_cond_signal(&queue->cond); /* <-- signal next waiter, if any */
}
ret = &(queue->data[queue->read]);
pthread_mutex_unlock(&queue->mutex);
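Putting all of that together, a corrected version of the queue for the one-producer/one-consumer case, using one mutex and one condition variable, might look like this (a sketch following the advice above, not tested against the asker's full program):

/* Sketch: single mutex + single condition variable, with all index
 * tests and updates done under the mutex, per the advice above. */
typedef struct Queue {
    int data[MAX_QUEUE_SIZE];
    int read, write;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} Queue;

int* queue_pull(Queue *queue) {          /* producer: reserve a slot */
    pthread_mutex_lock(&queue->mutex);
    while ((queue->write + 1) % MAX_QUEUE_SIZE == queue->read)
        pthread_cond_wait(&queue->cond, &queue->mutex); /* queue full */
    int *ret = &queue->data[queue->write];
    pthread_mutex_unlock(&queue->mutex);
    return ret;
}

void queue_push(Queue *queue) {          /* producer: publish the slot */
    pthread_mutex_lock(&queue->mutex);
    queue->write = (queue->write + 1) % MAX_QUEUE_SIZE;
    pthread_cond_signal(&queue->cond);   /* wake a waiting consumer */
    pthread_mutex_unlock(&queue->mutex);
}

int* queue_fetch(Queue *queue) {         /* consumer: wait for an item */
    pthread_mutex_lock(&queue->mutex);
    while (queue->read == queue->write)
        pthread_cond_wait(&queue->cond, &queue->mutex); /* queue empty */
    int *ret = &queue->data[queue->read];
    pthread_mutex_unlock(&queue->mutex);
    return ret;
}

void queue_pop(Queue *queue) {           /* consumer: release the slot */
    pthread_mutex_lock(&queue->mutex);
    queue->read = (queue->read + 1) % MAX_QUEUE_SIZE;
    pthread_cond_signal(&queue->cond);   /* wake a waiting producer */
    pthread_mutex_unlock(&queue->mutex);
}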
