Thread Monitoring in c using heartbeat signals

Thread Monitoring in c using heartbeat signals - c

I want to monitor threads. I used condition variables for send & receive HeartBeat & Acknowlagement signals for that.
scnMonitor_t is a monitor structure. As new threads are added it register with monitor & added to scnThreadlist_t.
monitorHeartbeatCheck is the thread that starts with program,
monitorHeartbeatProcess is API which are added to all thread functions.
Actually my problem is that the index of process is not properly followed
It ends with a wait HB condition for 3rd Thread & dead-lock is created.
what should be the problem?
thanks in advance.
typedef struct scnThreadList_{
osiThread_t thread;
struct scnThreadList_ *next;
} scnThreadList_t;
typedef struct scnMonitor_{
bool started;
osiThread_t heartbeatThread;
osiMutex_t heartbeatMutex;
osiMutex_t ackMutex;
osiCond_t heartbeatCond;
scnThreadList_t *threads;
} scnMonitor_t;
static scnMonitor_t *s_monitor = NULL;
// Main heartbeat check thread
void* monitorHeartbeatCheck( void *handle )
{
scnThreadList_t *pObj = NULL;
static int idx = 0;
static bool waitAck = false;
while ( 1 ) {
pObj = s_monitor->threads;
while ( pObj && ( pObj != s_monitor->heartbeatThread ) ) { //skip it-self from monitoring.
++idx;
printf("\"HB Check No.%d\"\n",idx);
// send heartbeat
usleep( 250 * 1000 );
pthread_mutex_lock( s_monitor->heartbeatMutex, 1 );
pthread_cond_signal( s_monitor->heartbeatCond );
printf("-->C %d HB sent\n",idx);
pthread_mutex_unlock( s_monitor->heartbeatMutex );
// wait for ACK
while( !waitAck ){
pthread_mutex_lock( s_monitor->ackMutex, 1 );
printf("|| C %d wait Ack\n",idx);
waitAck = true;
pthread_cond_wait( s_monitor->heartbeatCond, s_monitor->ackMutex );
waitAck = false;
printf("<--C %d received Ack\n",idx);
pthread_mutex_unlock( s_monitor->ackMutex );
LOG_INFO( SCN_MONITOR, "ACK from thread %p \n", pObj->thread );
}
pObj = pObj->next;
}
} // while, infinite
return NULL;
}
// Waits for hearbeat and acknowledges
// Call this API from every thread function that are registered
int monitorHeartbeatProcess( void )
{
static int id = 0;
static bool waitHb = false;
++ id;
printf("\"HB Process No.%d\"\n",id);
// wait for HB
while(!waitHb){
pthread_mutex_lock( s_monitor->heartbeatMutex, 1 );
printf("|| P %d wait for HB\n",id);
waitHb = true;
pthread_cond_wait( s_monitor->heartbeatCond, s_monitor->heartbeatMutex );
waitHb = false;
printf("<--P %d HB received \n",id);
pthread_mutex_unlock( s_monitor->heartbeatMutex );
}
// send ACK
uleep( 250 * 1000 );
pthread_mutex_lock( s_monitor->ackMutex, 1 );
pthread_cond_signal( s_monitor->heartbeatCond );
printf("-->P %d ACK sent\n",id);
pthread_mutex_unlock( s_monitor->ackMutex );
return 1;
}

You should always associate only one mutex with a condition at a time. Using two different mutexes with the same condition at the same time could lead to unpredictable serialization issues in your application.
http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=%2Fapis%2Fusers_78.htm
You have 2 different mutexes with your condition heartbeatCond.

I think you are experiencing a deadlock here. The thread calling monitorHeartbeatProcess() takes mutex on heartbeatMutex and waits for signal on the condition variable, heartbeatCond. While thread calling monitorHeartbeatCheck() takes mutex on ackMutex and waits for sognal on condition variable, heartbeatCond. Thus both threads waits on the condition variable heartbeatCond causing deadlock. If you are so particular in using two mutexes, why not two condition variables?

Related

Using two semaphores in one method

I'm stuck on this assignment and I just can't get my head around it.
Let's say there's a queue of visitors waiting to get into one of the cars which are also queued. Only one car at a time is supposed to drive up to the platform to pick up 2 waiting people. As soon as 2 visitors enter the car it has to leave the platform.
I need to change the following methods "carArrives()" and "visitorArrives()" from Busy - Waiting to using either only Mutex or Semaphores.
Sorry for any mistakes.
int availableCars = 0;
int availableSeats = 0;
void carArrives(){
while(availableCars > 0){noop;} //exchange this with Mutex/Semaphore
availableCars = 1;
driveToPlatform();
openDoors();
availableSeats = 2;
while(availableSeats > 0){noop;} //exchange this with Mutex/Semaphore
closeDoors();
leavePlatform();
availableCars = 0;
}
void visitorArrives(){
while(availableSeats < 1){noop;} //exchange this with Mutex/Semaphore
enterCar();
availableSeats = availableSeats - 1;
}

if you generate a mutex:
pthread_mutex MyMutex = PTHREAD_MUTEX_INIT;
Then to stop any other part of the program from modifying some resource while the current thread is modifying/accessing the resource:
pthread_mutex_lock( &MyMutex );
// modify or access resource here
pthread_mutex_unlock( &MyMutex );
Note: to be effective, all places in the code that access that resource must use the same 'MyMutex'
regarding:
availableSeats = 2;
while(availableSeats > 0){noop;}
Suggest between locking a mutex for that same resource, to pause a bit, while not locked, to allow some other thread some time to modify the 'availableSeats' value. For instance:
pthread_mutex SeatsFilled = PTHREAD_MUTEX_INIT;
...
pthread_mutex_lock( &SeatsFilled );
availableSeats = 2;
pthread_mutex_unlock( &SeatsFilled );
...
do
{
int numSeats;
pthread_mutex_lock( &SeatsFilled );
numSeats = availableSeats;
pthread_mutex_unlock( &SeatsFilled );
if( !numSeats )
{
nanosleep( 1000 );
}
} while( numSeats );
...

pthread_cond_wait is too slow, is that a better way?

I used a thread pool, the main thread continuous add tasks into a queue, and throw a pthread_cond_signal that indicate the queue is not empty to the work threads. Then the work threads read the queue and throw a pthread_cond_signal indicate the queue is not full to the main thread. And when the program ran, I found the pthread_cond_wait is very slow, if I delete the pthread_cond_wait in function threadpool_add, it's much quicker, but I doubt it may has something wrong, How can I let the threadpool_add function faster? it's the performance bottleneck of my project, Here is the code:
/*
* #function int threadpool_add(threadpool_t *pool, void*(*function)(void *arg), void *arg, char* buf)
* #desc add tasks to the queue
* #param [thread_num] number of thread in pool,
* [queue_max_size] size of the queue for task
*/
int threadpool_add(threadpool_t *pool, void*(*function)(void *arg), void *arg)
{
ulogd_log(ULOGD_NOTICE, "in threadpool_add\n");
if(pool == NULL || function == NULL || arg == NULL)
{
return -1;
}
pthread_mutex_lock(&(pool->lock));
/*
* wait untill queue is not full
*/
while ((pool->queue_size == pool->queue_max_size) && (!pool->shutdown))
{
pthread_cond_wait(&(pool->queue_not_full), &(pool->lock)); // if delete this, it's quicker
}
if (pool->shutdown)
{
pthread_mutex_unlock(&(pool->lock));
return 0;
}
pool->task_queue[pool->queue_rear].function = function;
pool->task_queue[pool->queue_rear].arg = arg;
pool->queue_rear = (pool->queue_rear + 1)%pool->queue_max_size;
pool->queue_size++;
/*
* notify to the threads, there has a task
*/
pthread_cond_signal(&(pool->queue_not_empty));
pthread_mutex_unlock(&(pool->lock));
ulogd_log(ULOGD_NOTICE, "threadpool_add, tpool:0x%x, shutdown: %d, queue_size:%d, queue_not_empty:0x%x\n", pool, pool->shutdown, pool->queue_size, &(pool->queue_not_empty));
return 0;
}
void *threadpool_thread(void *threadpool)
{
threadpool_t *pool = (threadpool_t *)threadpool;
threadpool_task_t task;
while(1)
{
ulogd_log(ULOGD_NOTICE, "in thread 0x%x, tpool: 0x%x\n", pthread_self(), pool);
/* Lock must be taken to wait on conditional variable */
pthread_mutex_lock(&(pool->lock));
while ((pool->queue_size == 0) && (!pool->shutdown))
{
ulogd_log(ULOGD_NOTICE, "thread 0x%x is waiting, queue_size:%d, queue_not_empty:0x%x\n", pthread_self(), pool->queue_size, &(pool->queue_not_empty));
pthread_cond_wait(&(pool->queue_not_empty), &(pool->lock));
}
if (pool->shutdown)
{
pthread_mutex_unlock(&(pool->lock));
ulogd_log(ULOGD_NOTICE,"thread 0x%x is exiting\n", pthread_self());
pthread_exit(NULL);
}
//get a task from queue
task.function = pool->task_queue[pool->queue_front].function;
task.arg = pool->task_queue[pool->queue_front].arg;
pool->queue_front = (pool->queue_front + 1)%pool->queue_max_size;
pool->queue_size--;
//now queue must be not full
pthread_mutex_unlock(&(pool->lock));
pthread_cond_signal(&(pool->queue_not_full));
// Get to work
ulogd_log(ULOGD_NOTICE, "thread 0x%x start working\n", pthread_self());
(*(task.function))(task.arg);
// task run over
ulogd_log(ULOGD_NOTICE, "thread 0x%x end working\n", pthread_self());
}
pthread_exit(NULL);
return (NULL);
}

Why does my multi-threaded program blocks sometimes?

We have to write a program, which has 2 threads. One of them reads the content token by token and stores them into a array. The other reads the tokens from the array and writes it into a file.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define MAX 10
int buffer[MAX];
int buf_pos; // the actual position of the buffer
void copy();
int flag; // if flag is 0, the whole content of the file was read
FILE *source;
FILE *dest;
// read the content of a file token by token and stores it into a buffer
void *read_from_file();
// write the content of a buffer token by token into a file
void *write_to_file();
pthread_cond_t condition = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex;
int main( int argc, char *argv[] )
{
flag = 1;
pthread_t writer;
pthread_t reader;
pthread_mutex_init( &mutex, NULL );
if( argc != 3 )
{
printf( "Error\n" );
exit( EXIT_FAILURE );
}
source = fopen( argv[1], "r" );
dest = fopen( argv[2], "w" );
if( source == NULL || dest == NULL )
{
printf( "Error\n" );
exit( EXIT_FAILURE );
}
pthread_create( &reader, NULL, read_from_file, NULL );
pthread_create( &writer, NULL, write_to_file, NULL );
pthread_join( reader, NULL );
pthread_join( writer, NULL );
}
void *read_from_file()
{
int c;
while( ( c = getc( source ) ) != EOF )
{
if( buf_pos < MAX - 1 ) // the buffer is not full
{
pthread_mutex_lock( &mutex );
buffer[buf_pos++] = c;
pthread_mutex_unlock( &mutex );
pthread_cond_signal( &condition );
}
else
{
buffer[buf_pos++] = c; // store the last token
pthread_mutex_lock( &mutex );
pthread_cond_wait( &condition, &mutex ); // wait until the other thread sends a signal
pthread_mutex_unlock( &mutex );
}
}
flag = 0; // EOF
return NULL;
}
void *write_to_file()
{
int c;
while( flag || buf_pos > 0 )
{
if( buf_pos > 0 ) // The buffer is not empty
{
fputc( buffer[0], dest ); // write the first token into file
pthread_mutex_lock( &mutex );
copy();
--buf_pos;
pthread_mutex_unlock( &mutex );
pthread_cond_signal( &condition );
}
else
{
pthread_mutex_lock( &mutex );
pthread_cond_wait( &condition, &mutex );
pthread_mutex_unlock( &mutex );
}
}
return NULL;
}
void copy()
{
int i = 0;
for( ; i < buf_pos - 1; ++i )
buffer[i] = buffer[i + 1];
}
If I want to run the program, it sometimes blocks, but i do not know why. But if the program terminates, the outputfile is the same as the inputfile.
Can somebody explain me, why this could happen?

There are a number of problems with the code, but the reason for your lockups is most probably that you send out signals without the mutex locked, and without checking your condition with the mutex locked. Both are necessary to make sure that you don't end up with lost signals.
As noted by Useless, make sure that you know what your shared variables are, and that they are mutex-protected where necessary. For instance, your buf_pos is modified without protection in your reader thread, and used as a condition for waiting without mutex-protection in both threads.
Also, when doing pthread_cond_wait, you typically want a guard-expression to make sure you don't react to socalled spurios wakeups (have a look at the "Condition Wait Semantics" section at http://linux.die.net/man/3/pthread_cond_wait), and to make sure that the condition you're waiting for hasn't actually happened between testing for it, and starting the wait.
For instance, in your writer thread you could do:
pthread_mutex_lock(&mutex);
while(buf_pos == 0) pthread_cond_wait(&condition, &mutex);
pthread_mutex_unlock(&mutex);
But still, have a thorough look at your program, identify your shared variables, and make sure that they are mutex-protected where necessary. Also, it's far from correct that you only need to mutex-protect shared data for write access. In some cases you can get away with removing the mutex-protection when reading shared variables, but you'll have to analyze your code to make sure that that's actually the case.

Your problem with the deadlock is that you have both threads do this:
pthread_mutex_lock( &mutex );
pthread_cond_wait( &condition, &mutex );
pthread_mutex_unlock( &mutex );
on the same mutex, and the same condition variable. Suppose your reading thread executes those three lines immediately before your writing thread does. They're both going to be waiting on the same condition variable forever. Remember, signaling a condition variable does nothing if there are no threads actually waiting on it at the time. Right now you only avoid deadlock when one of your threads waits on that variable if, by chance, it happens to signal it before waiting for it itself.
Sonicwave's answer does a good job of identifying the other problems in your code, but in general you're not being careful enough about protecting your shared data, and you're not using condition variables properly so your threads are not properly synchronizing with each other. Among other things, if one of your threads is waiting on a condition variable, you need to make sure that there are no circumstances in which the other thread can wait on the same or different one before the other one receives a signal and wakes up.

C pthread: How to wake it up after some time?

I would like to wake up a pthread from another pthread - but after some time. I know signal or pthread_signal with pthread_cond_wait can be used to wake another thread, but I can't see a way to schedule this. The situation would be something like:
THREAD 1:
========
while(1)
recv(low priority msg);
dump msg to buffer
THREAD 2:
========
while(1)
recv(high priority msg);
..do a little bit of processing with msg ..
dump msg to buffer
wake(THREAD3, 5-seconds-later); <-- **HOW TO DO THIS? **
//let some msgs collect for at least a 5 sec window.
//i.e.,Don't wake thread3 immediately for every msg rcvd.
THREAD 3:
=========
while(1)
do some stuff ..
Process all msgs in buffer
sleep(60 seconds).
Any simple way to schedule a wakeup (short of creating a 4th thread that wakes up every second and decides if there is a scheduled entry for thread-3 to wakeup). I really don't want to wakeup thread-3 frequently if there are only low priority msgs in queue. Also, since the messages come in bursts (say 1000 high priority messages in a single burst), I don't want to wake up thread-3 for every single message. It really slows things down (as there is a bunch of other processing stuff it does every time it wakes up).
I am using an ubuntu pc.

How about the use of the pthread_cond_t object available through the pthread API ?
You could share such an object within your threads and let them act on it appropriately.
The resulting code should look like this :
/*
* I lazily chose to make it global.
* You could dynamically allocate the memory for it
* And share the pointer between your threads in
* A data structure through the argument pointer
*/
pthread_cond_t cond_var;
pthread_mutex_t cond_mutex;
int wake_up = 0;
/* To call before creating your threads: */
int err;
if (0 != (err = pthread_cond_init(&cond_var, NULL))) {
/* An error occurred, handle it nicely */
}
if (0 != (err = pthread_mutex_init(&cond_mutex, NULL))) {
/* Error ! */
}
/*****************************************/
/* Within your threads */
void *thread_one(void *arg)
{
int err = 0;
/* Remember you can embed the cond_var
* and the cond_mutex in
* Whatever you get from arg pointer */
/* Some work */
/* Argh ! I want to wake up thread 3 */
pthread_mutex_lock(&cond_mutex);
wake_up = 1; // Tell thread 3 a wake_up rq has been done
pthread_mutex_unlock(&cond_mutex);
if (0 != (err = pthread_cond_broadcast(&cond_var))) {
/* Oops ... Error :S */
} else {
/* Thread 3 should be alright now ! */
}
/* Some work */
pthread_exit(NULL);
return NULL;
}
void *thread_three(void *arg)
{
int err;
/* Some work */
/* Oh, I need to sleep for a while ...
* I'll wait for thread_one to wake me up. */
pthread_mutex_lock(&cond_mutex);
while (!wake_up) {
err = pthread_cond_wait(&cond_var, &cond_mutex);
pthread_mutex_unlock(&cond_mutex);
if (!err || ETIMEDOUT == err) {
/* Woken up or time out */
} else {
/* Oops : error */
/* We might have to break the loop */
}
/* We lock the mutex again before the test */
pthread_mutex_lock(&cond_mutex);
}
/* Since we have acknowledged the wake_up rq
* We set "wake_up" to 0. */
wake_up = 0;
pthread_mutex_unlock(&cond_mutex);
/* Some work */
pthread_exit(NULL);
return NULL;
}
If you want your thread 3 to exit the blocking call to pthread_cond_wait() after a timeout, consider using pthread_cond_timedwait() instead (read the man carefully, the timeout value you supply is the ABSOLUTE time, not the amount of time you don't want to exceed).
If the timeout expires, pthread_cond_timedwait() will return an ETIMEDOUT error.
EDIT : I skipped error checking in the lock / unlock calls, don't forget to handle this potential issue !
EDIT² : I reviewed the code a little bit

You can have the woken thread do the wait itself. In the waking thread:
pthread_mutex_lock(&lock);
if (!wakeup_scheduled) {
wakeup_scheduled = 1;
wakeup_time = time() + 5;
pthread_cond_signal(&cond);
}
pthread_mutex_unlock(&lock);
In the waiting thread:
pthread_mutex_lock(&lock);
while (!wakeup_scheduled)
pthread_cond_wait(&cond, &lock);
pthread_mutex_unlock(&lock);
sleep_until(wakeup_time);
pthread_mutex_lock(&lock);
wakeup_scheduled = 0;
pthread_mutex_unlock(&lock);

Why not just compare the current time to one save earlier?
time_t last_uncond_wakeup = time(NULL);
time_t last_recv = 0;
while (1)
{
if (recv())
{
// Do things
last_recv = time(NULL);
}
// Possible other things
time_t now = time(NULL);
if ((last_recv != 0 && now - last_recv > 5) ||
(now - last_uncond_wakeup > 60))
{
wake(thread3);
last_uncond_wakeup = now;
last_recv = 0;
}
}

pthread_cancel always crashes

I have a program that is trying to use create and cancel through an implemented pool.
The creation is as follows:
while (created<threadsNum){
pthread_t newThread;
pthread_struct *st; //Open the thread that handle the deleting of the sessions timeout.
st = (pthread_struct*)malloc(sizeof(pthread_struct));
st->id = created;
st->t = &newThread;
pthread_mutex_lock( &mutex_threadsPool );
readingThreadsPool[created] = st;
pthread_mutex_unlock( &mutex_threadsPool );
if((threadRes1 = pthread_create( &newThread, NULL, pcapReadingThread, (void*)created)))
{
syslog(LOG_CRIT, "Creating Pcap-Reading Thread %d failed.",created);
printf( "Creating Pcap-Reading Thread %d failed.\n",created);
exit(1);
}
syslog(LOG_INFO, "Created Pcap-Reading Thread %d Successfully.",created);
created++;
}
Later I try to cancel them and restart them :
pthread_t* t;
pthread_struct* tstr;
int i;
pthread_mutex_unlock( &mutex_threadsPool );
//first go on array and kill all threads
for(i = 0; i<threadsNum ; i++ ){
tstr = readingThreadsPool[i];
if (tstr!=NULL){
t = tstr->t;
//Reaches here :-)
if (pthread_cancel(*t)!=0){
perror("ERROR : Could not kill thread");
}
else{
printf("Killed Thread %d \n",i);
}
//doesnt reach here
}
}
I checked the addresses in the memory of the created thread in part one and the address of the about to be cancelled thread in the second part..they match..
I read about the thread manager that can't work if one calls killall().
But I don't..
Anyone have any idea?
Thanks

while (created<threadsNum){
pthread_t newThread;
pthread_struct *st;
/* ... */
st->t = &newThread;
/* ... */
}
You've got st->t pointing to a local variable newThread. newThread is only in scope during the current loop iteration. After this iteration st->t will contain an invalid address.
newThread is on the stack, so after it goes out of scope that stack space will be used for other variables. That could be different pthread_ts on successive iterations, or once the loop is over then that stack space will be used for completely different types of values.
To fix this I'd probably change pthread_struct.t to be a pthread_t instead of a pthread_t *, and then change the pthread_create call to:
pthread_create(&st->t, /*...*/)
Also, you should be careful about adding st to the thread pool before you've called pthread_create. It should probably be added after. As it stands, there's a small window where st->t is on the thread pool but has not been initialized.