why does msgrcv() sets msqid to 0? - c

I have a program in c which is supposed to send and receive ipc messages through msgq.
The problem I have is that when I run msgrcv() it sets my global int msqid to 0. And of course I need it at other methods, like in a signal handler.
here is some code:
/* all the includes and some variables*/
#include "msg.h" // include the one I made
int msgQ; // global int
int main(int argc, char *argv[])
{
key = ftok("progfile", 65);
msgQ = msgget(key, 0666 | IPC_CREAT);
printf("msg queue id: %d \n", msgQ);
start_tik_tok(); // setting up the timer and the signal handler
/* irrelevant code */
void read_msgs(msgQ);
}
void read_msgs(int msgQid)
{
while (1)
{
printf("before the read local:%d goval:%d\n", msgQid, msgQ);
int ret = msgrcv(msgQid, &message, sizeof(message), 1, 0);
printf("after the read local:%d global :%d\n", msgQid, msgQ);
if (ret == -1)
/* error handling */
switch (message.action_type)
{
/* mesage handling */
}
}
void signal_handler(int signo)
{
/*I need the global int here to send some messages */
}
void start_tik_tok()
{
//timer interval for setitimer function
struct itimerval timer;
timer.it_interval.tv_sec = 1; //every 1 seconds
timer.it_interval.tv_usec = 0;
timer.it_value.tv_sec = 1; //start in 1 seconds
timer.it_value.tv_usec = 0;
//action for the signal
struct sigaction new_sa;
memset(&new_sa, 0, sizeof(new_sa));
new_sa.sa_handler = &signal_handler;
sigaction(SIGALRM, &new_sa, NULL);
setitimer(ITIMER_REAL, &timer, NULL);
}
the msg.h file:
#include <sys/msg.h>
struct msg_buff{
long mesg_type; //reciver
int sender; //sender
char action_type;
char time_tiks; //time in tiks
} message;
output:
msg queue id: 45416448
before the read local:45416448 global:45416448
after the read local:45416448 global:0
...
you can see that after I run msgrcv(), the value of msgQ turns to 0, even though I'm using a variable to pass the value to the method read_msgs().

The msgrcv function takes a pointer to a structure that starts with a "header" of type long, followed by the message data. The third argument to msgrcv, msgsz, is the size of the message data body, not including the long that's the header. So you should pass something like sizeof message - sizeof(long). By passing sizeof message, you're asking it to overflow the buffer sizeof(long) bytes, and this is clobbering some other global variable.

I found the solution, I'm not sure why is that but it solved it.
I just initialized the int from the beginning.
changed:
int msgQ; // global int
for:
int msgQ = 0; // global int

Related

Prevent msgrcv from waiting indefinitely

I have a multihreaded program whose 2 threads communicate with each other via a message queue. The first thread (sender) periodically sends a message, while the second thread (receiver) processes the information.
The sender has code similar to this:
// Create queue
key_t key = ftok("/tmp", 'B');
int msqid = msgget(key, 0664 | IPC_CREAT);
// Create message and send
struct request_msg req_msg;
req_msg.mtype = 1;
snprintf(req_msg.mtext, MSG_LENGTH, "Send this information");
msgsnd(msqid, &req_msg, strlen(req_msg.mtext) + 1, 0);
On the receiving thread, I do this:
// Subscribe to queue
key_t key = ftok("/tmp", 'B');
int msqid = msgget(key, 0664);
struct request_msg req_msg;
while(running)
{
msgrcv(msqid, &req_msg, sizeof(req_msg.mtext), 0, 0);
// Do sth with the message
}
As you can see, the receiver sits within a while loop that is controlled by a global variable named "running". Error handlers do set the boolean to false, if an error is encountered within the process. This works in most cases, but if an error occurs before being able to send a message to the queue, the receiver will not exit the while loop because it waits for a message before continuing and thus, checking the running variable. That means it will hang there forever, as the sender will not send anything for the rest of the runtime.
I would like to avoid this, but I do not know how to let msgrcv know that it cannot expect any more messages. I was unable to learn how msgrcv behaves if I kill the queue, assuming this is the easiest version. Maybe timeouts or sending some kind of termination message (possibly using the mtype member of the message struct) are also possible.
Please, let me know what the most robust solution to this problem is. Thanks!
EDIT: based on suggestions I have reworked the code to make the signal handlers action atomic.
#include <stdbool.h> // bool data type
#include <stdio.h>
#include <signal.h>
#include <stdint.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#define ALARM_INTERVAL_SEC 1
#define ALARM_INTERVAL_USEC 0
struct message
{
uint64_t iteration;
char req_time[28];
};
static volatile bool running = true;
static volatile bool work = false;
static struct itimerval alarm_interval;
static struct timeval previous_time;
static uint64_t loop_count = 0;
static struct message msg;
pthread_mutex_t mutexmsg;
pthread_cond_t data_updated_cv;
static void
termination_handler(int signum)
{
running = false;
}
static void
alarm_handler(int signum)
{
work = true;
}
static void
write_msg(void)
{
// Reset the alarm interval
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
raise(SIGTERM);
return;
}
struct timeval current_time;
gettimeofday(&current_time, NULL);
printf("\nLoop count: %lu\n", loop_count);
printf("Loop time: %f us\n", (current_time.tv_sec - previous_time.tv_sec) * 1e6 +
(current_time.tv_usec - previous_time.tv_usec));
previous_time = current_time;
// format timeval struct
char tmbuf[64];
time_t nowtime = current_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
// write values
pthread_mutex_lock(&mutexmsg);
msg.iteration = loop_count;
snprintf(msg.req_time, sizeof(msg.req_time), "%s.%06ld", tmbuf, current_time.tv_usec);
pthread_cond_signal(&data_updated_cv);
pthread_mutex_unlock(&mutexmsg);
loop_count++;
}
static void*
process_msg(void *args)
{
while(1)
{
pthread_mutex_lock(&mutexmsg);
printf("Waiting for condition\n");
pthread_cond_wait(&data_updated_cv, &mutexmsg);
printf("Condition fulfilled\n");
if(!running)
{
break;
}
struct timeval process_time;
gettimeofday(&process_time, NULL);
char tmbuf[64];
char buf[64];
time_t nowtime = process_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
snprintf(buf, sizeof(buf), "%s.%06ld", tmbuf, process_time.tv_usec);
// something that takes longer than the interval time
// sleep(1);
printf("[%s] Req time: %s loop cnt: %lu\n", buf, msg.req_time, msg.iteration);
pthread_mutex_unlock(&mutexmsg);
}
pthread_exit(NULL);
}
int
main(int argc, char* argv[])
{
pthread_t thread_id;
pthread_attr_t attr;
// for portability, set thread explicitly as joinable
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
if(pthread_create(&thread_id, NULL, process_msg, NULL) != 0)
{
perror("pthread_create");
exit(1);
}
pthread_attr_destroy(&attr);
// signal handling setup
struct sigaction t;
t.sa_handler = termination_handler;
sigemptyset(&t.sa_mask);
t.sa_flags = 0;
sigaction(SIGINT, &t, NULL);
sigaction(SIGTERM, &t, NULL);
struct sigaction a;
a.sa_handler = alarm_handler;
sigemptyset(&a.sa_mask);
a.sa_flags = 0;
sigaction(SIGALRM, &a, NULL);
// Set the alarm interval
alarm_interval.it_interval.tv_sec = 0;
alarm_interval.it_interval.tv_usec = 0;
alarm_interval.it_value.tv_sec = ALARM_INTERVAL_SEC;
alarm_interval.it_value.tv_usec = ALARM_INTERVAL_USEC;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
gettimeofday(&previous_time, NULL);
while(1)
{
// suspending main thread until a signal is caught
pause();
if(!running)
{
// signal the worker thread to stop execution
pthread_mutex_lock(&mutexmsg);
pthread_cond_signal(&data_updated_cv);
pthread_mutex_unlock(&mutexmsg);
break;
}
if(work)
{
write_msg();
work = false;
}
}
// suspend thread until the worker thread joins back in
pthread_join(thread_id, NULL);
// reset the timer
alarm_interval.it_value.tv_sec = 0;
alarm_interval.it_value.tv_usec = 0;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
printf("EXIT\n");
pthread_exit(NULL);
}
You have not justified the use of a message queue other than as a synchronization primitive. You could be passing the message via a variable and an atomic flag to indicate message readiness. This answer then describes how to implement thread suspension and resuming using a condition variable. That’s how it’d be typically done between threads, although of course is not the only way.
I do not know how to let msgrcv know that it cannot expect any more messages
No need for that. Just send a message that tells the thread to finish! The running variable doesn’t belong: you are trying to communicate with the other thread, so do it the way you chose to: message it!
I have spent the last day to read a lot about threading and mutexes and tried to get my example program to work. It does, but unfortunately, it gets stuck when I try to shut it down via Ctrl+C. Reason being (again) that this time, that the worker thread waits for a signal from the main thread that won't send a signal anymore.
#Rachid K. and #Unslander Monica: if you want to take a look again, is this more state of the art code for doing this? Also, I think I have to use pthread_cond_timedwait instead of pthread_cond_wait to avoid the termination deadlock. Could you tell me how to handle that exactly?
Note that the program does simply periodically (interval 1 s) hand a timestamp and a loop counter to the processing thread, that prints out the data. The output also shows when the print got called.
Thanks again!
#include <stdbool.h> // bool data type
#include <stdio.h>
#include <signal.h>
#include <stdint.h>
#include <pthread.h>
#include <stdlib.h>
#define ALARM_INTERVAL_SEC 1
#define ALARM_INTERVAL_USEC 0
static bool running = true;
static struct itimerval alarm_interval;
static struct timeval previous_time;
static uint64_t loop_count = 0;
struct message
{
uint64_t iteration;
char req_time[28];
} msg;
pthread_mutex_t mutexmsg;
pthread_cond_t data_updated_cv;
static void
signal_handler(int signum)
{
if (signum == SIGINT || signum == SIGTERM)
{
running = false;
}
}
static void
write_msg(int signum)
{
if(!running)
{
return;
}
// Reset the alarm interval
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
raise(SIGTERM);
return;
}
struct timeval current_time;
gettimeofday(&current_time, NULL);
printf("\nLoop count: %lu\n", loop_count);
printf("Loop time: %f us\n", (current_time.tv_sec - previous_time.tv_sec) * 1e6 +
(current_time.tv_usec - previous_time.tv_usec));
previous_time = current_time;
// format timeval struct
char tmbuf[64];
time_t nowtime = current_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
// write values
pthread_mutex_lock(&mutexmsg);
msg.iteration = loop_count;
snprintf(msg.req_time, sizeof(msg.req_time), "%s.%06ld", tmbuf, current_time.tv_usec);
pthread_cond_signal(&data_updated_cv);
pthread_mutex_unlock(&mutexmsg);
loop_count++;
}
static void*
process_msg(void *args)
{
while(running)
{
pthread_mutex_lock(&mutexmsg);
printf("Waiting for condition\n");
pthread_cond_wait(&data_updated_cv, &mutexmsg);
printf("Condition fulfilled\n");
struct timeval process_time;
gettimeofday(&process_time, NULL);
char tmbuf[64];
char buf[64];
time_t nowtime = process_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
snprintf(buf, sizeof(buf), "%s.%06ld", tmbuf, process_time.tv_usec);
printf("[%s] Message req time: %s loop cnt: %lu\n", buf, msg.req_time, msg.iteration);
pthread_mutex_unlock(&mutexmsg);
}
pthread_exit(NULL);
}
int
main(int argc, char* argv[])
{
pthread_t thread_id;
pthread_attr_t attr;
// for portability, set thread explicitly as joinable
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
if(pthread_create(&thread_id, NULL, process_msg, NULL) != 0)
{
perror("pthread_create");
exit(1);
}
pthread_attr_destroy(&attr);
// signal handling setup
struct sigaction s;
s.sa_handler = signal_handler;
sigemptyset(&s.sa_mask);
s.sa_flags = 0;
sigaction(SIGINT, &s, NULL);
sigaction(SIGTERM, &s, NULL);
struct sigaction a;
a.sa_handler = write_msg;
sigemptyset(&a.sa_mask);
a.sa_flags = 0;
sigaction(SIGALRM, &a, NULL);
// Set the alarm interval
alarm_interval.it_interval.tv_sec = 0;
alarm_interval.it_interval.tv_usec = 0;
alarm_interval.it_value.tv_sec = ALARM_INTERVAL_SEC;
alarm_interval.it_value.tv_usec = ALARM_INTERVAL_USEC;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
gettimeofday(&previous_time, NULL);
// suspend thread until the worker thread joins back in
pthread_join(thread_id, NULL);
// reset the timer
alarm_interval.it_value.tv_sec = 0;
alarm_interval.it_value.tv_usec = 0;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
pthread_exit(NULL);
return 0;
}
On the receiving thread, I do this:
...
while(running)
{
msgrcv(msqid, &req_msg, sizeof(req_msg.mtext), 0, 0);
Hopefully, in reality you do more than that.
Because you're not checking any error status in the code you've posted. And that's flat-out wrong for a blocking function call that is likely specified to never be restarted on receipt of a signal (as is true on Linux and Solaris). Per Linux `signal(2):
The following interfaces are never restarted after being interrupted
by a signal handler, regardless of the use of SA_RESTART; they
always fail with the error EINTR when interrupted by a signal
handler:
...
System V IPC interfaces: msgrcv(2), msgsnd(2), semop(2), and
semtimedop(2).
and Solaris sigaction():
SA_RESTART
If set and the signal is caught, functions that are interrupted by the execution of this signal's handler are transparently restarted by the system, namely fcntl(2), ioctl(2), wait(3C), waitid(2), and the following functions on slow devices like terminals: getmsg() and getpmsg() (see getmsg(2)); putmsg() and putpmsg() (see putmsg(2)); pread(), read(), and readv() (see read(2)); pwrite(), write(), and writev() (see write(2)); recv(), recvfrom(), and recvmsg() (see recv(3SOCKET)); and send(), sendto(), and sendmsg() (see send(3SOCKET)). Otherwise, the function returns an EINTR error.
So your code need to look more like this in order to handle both errors and signal interrupts:
volatile sig_atomic_t running;
...
while(running)
{
errno = 0;
ssize_t result = msgrcv(msqid, &req_msg, sizeof(req_msg.mtext), 0, 0);
if ( result == ( ssize_t ) -1 )
{
// if the call failed or no longer running
// break the loop
if ( ( errno != EINTR ) || !running )
{
break;
}
// the call was interrupted by a signal
continue
}
...
}
And that opens up the opportunity to use a alarm() and a SIGALRM signal handler to set running to 0 for use as a timeout:
volatile sig_atomic_t running;
void handler( int sig );
{
running = 0;
}
...
struct sigaction sa;
memset( &sa, 0, sizeof( sa ) );
sa.sa_handler = handler;
sigaction( SIGALRM, &sa, NULL );
while(running)
{
// 10-sec timeout
alarm( 10 );
errno = 0;
ssize_t result = msgrcv( msqid, &req_msg, sizeof(req_msg.mtext), 0, 0 );
// save errno as alarm() can munge it
int saved_errno = errno;
// clear alarm if it hasn't fired yet
alarm( 0 );
if ( result == ( ssize_t ) -1 )
{
// if the call failed or no longer running
// break the loop
if ( ( saved_errno != EINTR ) || !running )
{
break;
}
// the call was interrupted by a signal
continue
}
...
}
That can almost certainly be improved upon - the logic is rather complex to catch all the corner cases and there's likely a simpler way to do it.
Answer to the new proposal in the question
The cyclic timer should be rearmed in the event loop of the main thread for better visibility (subjective proposal);
When the secondary thread goes out of its loop, it must release the
mutex otherwise the main thread will enter into a deadlock (waiting
for the mutex which was locked by the terminated secondary thread).
So, here is the last proposal with the above fixes/enhancements:
#include <stdbool.h> // bool data type
#include <stdio.h>
#include <signal.h>
#include <stdint.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#define ALARM_INTERVAL_SEC 1
#define ALARM_INTERVAL_USEC 0
struct message
{
uint64_t iteration;
char req_time[28];
};
static volatile bool running = true;
static volatile bool work = false;
static struct itimerval alarm_interval;
static struct timeval previous_time;
static uint64_t loop_count = 0;
static struct message msg;
pthread_mutex_t mutexmsg;
pthread_cond_t data_updated_cv;
static void
termination_handler(int signum)
{
running = false;
}
static void
alarm_handler(int signum)
{
work = true;
}
static void
write_msg(void)
{
struct timeval current_time;
gettimeofday(&current_time, NULL);
printf("\nLoop count: %lu\n", loop_count);
printf("Loop time: %f us\n", (current_time.tv_sec - previous_time.tv_sec) * 1e6 +
(current_time.tv_usec - previous_time.tv_usec));
previous_time = current_time;
// format timeval struct
char tmbuf[64];
time_t nowtime = current_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
// write values
pthread_mutex_lock(&mutexmsg);
msg.iteration = loop_count;
snprintf(msg.req_time, sizeof(msg.req_time), "%s.%06ld", tmbuf, current_time.tv_usec);
pthread_cond_signal(&data_updated_cv);
pthread_mutex_unlock(&mutexmsg);
loop_count++;
}
static void*
process_msg(void *args)
{
while(1)
{
pthread_mutex_lock(&mutexmsg);
printf("Waiting for condition\n");
pthread_cond_wait(&data_updated_cv, &mutexmsg);
printf("Condition fulfilled\n");
if(!running)
{
pthread_mutex_unlock(&mutexmsg); // <----- To avoid deadlock
break;
}
struct timeval process_time;
gettimeofday(&process_time, NULL);
char tmbuf[64];
char buf[64];
time_t nowtime = process_time.tv_sec;
struct tm *nowtm = localtime(&nowtime);
strftime(tmbuf, sizeof(tmbuf), "%Y-%m-%d %H:%M:%S", nowtm);
snprintf(buf, sizeof(buf), "%s.%06ld", tmbuf, process_time.tv_usec);
// something that takes longer than the interval time
//sleep(2);
printf("[%s] Req time: %s loop cnt: %lu\n", buf, msg.req_time, msg.iteration);
pthread_mutex_unlock(&mutexmsg);
}
printf("Thread exiting...\n");
pthread_exit(NULL);
}
int
main(int argc, char* argv[])
{
pthread_t thread_id;
pthread_attr_t attr;
// for portability, set thread explicitly as joinable
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
if(pthread_create(&thread_id, NULL, process_msg, NULL) != 0)
{
perror("pthread_create");
exit(1);
}
pthread_attr_destroy(&attr);
// signal handling setup
struct sigaction t;
t.sa_handler = termination_handler;
sigemptyset(&t.sa_mask);
t.sa_flags = 0;
sigaction(SIGINT, &t, NULL);
sigaction(SIGTERM, &t, NULL);
struct sigaction a;
a.sa_handler = alarm_handler;
sigemptyset(&a.sa_mask);
a.sa_flags = 0;
sigaction(SIGALRM, &a, NULL);
// Set the alarm interval
alarm_interval.it_interval.tv_sec = 0;
alarm_interval.it_interval.tv_usec = 0;
alarm_interval.it_value.tv_sec = ALARM_INTERVAL_SEC;
alarm_interval.it_value.tv_usec = ALARM_INTERVAL_USEC;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
gettimeofday(&previous_time, NULL);
while(1)
{
// Reset the alarm interval <-------- Rearm the timer in the main loop
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
raise(SIGTERM);
break;
}
// suspending main thread until a signal is caught
pause();
if(!running)
{
// signal the worker thread to stop execution
pthread_mutex_lock(&mutexmsg);
pthread_cond_signal(&data_updated_cv);
pthread_mutex_unlock(&mutexmsg);
break;
}
if(work)
{
write_msg();
work = false;
}
}
// suspend thread until the worker thread joins back in
pthread_join(thread_id, NULL);
// reset the timer
alarm_interval.it_value.tv_sec = 0;
alarm_interval.it_value.tv_usec = 0;
if(setitimer(ITIMER_REAL, &alarm_interval, NULL) < 0)
{
perror("setitimer");
exit(1);
}
printf("EXIT\n");
pthread_exit(NULL);
}
==================================================================
Answer to the original question
It is possible to use a conditional variable to wait for a signal from the senders. This makes the receiver wake up and check for messages in the message queue by passing IPC_NOWAIT in the flags parameter of msgrcv(). To end the communication, an "End of communication" message can be posted. It is also possible to use pthread_con_timedwait() to wake up periodically and check a "end of communication" or "end of receiver condition" (e.g. by checking your global "running" variable).
Receiver side:
// Mutex initialization
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
// Condition variable initialization
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
[...]
while (1) {
// Lock the mutex
pthread_mutex_lock(&mutex);
// Check for messages (non blocking thanks to IPC_NOWAIT)
rc = msgrcv(msqid, &req_msg, sizeof(req_msg.mtext), 0, IPC_NOWAIT);
if (rc == -1) {
if (errno == ENOMSG) {
// message queue empty
// Wait for message notification
pthread_cond_wait(&cond, &mutex); // <--- pthread_cond_timedwait() can be used to wake up and check for the end of communication or senders...
} else {
// Error
}
}
// Handle the message, end of communication (e.g. "running" variable)...
// Release the lock (so that the sender can post something in the queue)
pthread_mutex_unlock(&mutex);
}
Sender side:
// Prepare the message
[...]
// Take the lock
pthread_mutex_lock(&mutex);
// Send the message
msgsnd(msqid, &req_msg, strlen(req_msg.mtext) + 1, 0);
// Wake up the receiver
pthread_cond_signal(&cond);
// Release the lock
pthread_mutex_unlock(&mutex);
N.B.: SYSV message queues are obsolete. It is preferable to use the brand new Posix services.

Which synchronization primitive should I employ here?

while(1) {
char message_buffer[SIZE];
ssize_t message_length = mq_receive(mq_identifier, message_buffer, _mqueue_max_msg_size NULL);
if(message_len == -1) { /* error handling... */}
pthread_t pt1;
int ret = pthread_create(&pt1, NULL, handle_message, message_buffer);
if(ret) { /* error handling ... */}
}
void * handle_message (void * message) {
puts((char *) message);
return NULL;
}
The above example is not an MRE but it is extremely simple:
I've got a main thread with a loop that constantly consumes messages from a message queue. Once a new message is received, it is stored in the local message_buffer buffer. Then, a new thread is spawned to "take care" of said new message, and thus the message buffer's address is passed into handle_message, which the new thread subsequently executes.
The problem
Often, 2 threads will print the same message, even though I can verify with a 100% certainty that the messages in the queue were not the same.
I am not completely certain, but I think I understand why this is happening:
say that I push 2 different messages to the mqueue and only then I begin consuming them.
In the first iteration of the while loop, the message will get consumed from the queue and saved to message_buffer. A new thread will get spawned and the address of message_length passed to it. But that thread may not be fast enough to print the buffer's contents to the stream before the next message gets consumed (on the next iteration of the loop), and the contents of message_buffer subsequently overridden. Thus the first and second thread now print the same value.
My question is: what is the most efficient way to solve this? I'm pretty new to parallel programming and threading/pthreads and I'm pretty overwhelmed by the different synchronization primitives.
Mutex trouble
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
while(1) {
char message_buffer[SIZE];
pthread_mutex_lock(&m);
ssize_t message_length = mq_receive(mq_identifier, message_buffer, _mqueue_max_msg_size NULL);
pthred_mutex_unlock(&m);
if(message_len == -1) { /* error handling... */}
pthread_t pt1;
int ret = pthread_create(&pt1, NULL, handle_message, message_buffer);
if(ret) { /* error handling ... */}
}
void * handle_message (void * message) {
char own_buffer[SIZE];
pthread_mutex_lock(&m);
strncpy(own_buffer, (char *) message, SIZE);
pthread_mutex_unlock(&m);
puts(own_buffer);
return NULL;
}
I don't think my current mutex implementation is right as the threads are still receiving duplicate messages. The main thread can lock the mutex, consume a message into the buffer, unlock the mutex, spawn a thread, but that thread still may hang and the main one could just rewrite the buffer again (as the buffer mutex was never locked by the new thread), effectively making my current mutex implementation useless? How do I overcome this?
The problem is that you end the loop that contains message_buffer before guaranteeing that the thread has finished with that memory.
while (1) {
char message_buffer[SIZE];
ssize_t message_length = mq_receive(...);
if (message_len == -1) { /* error handling */ }
pthread_t pt1;
int ret = pthread_create(&pt1, NULL, handle_message, message_buffer);
if (ret) { /* error handling */ }
/****** Can't go beyond here until thread is done with message_buffer. ******/
}
void * handle_message (void * message) {
char own_buffer[SIZE];
strncpy(own_buffer, (char *) message, SIZE);
/******* Only now can the caller loop back. ******/
puts(own_buffer);
return NULL;
}
You could use a semaphore or similar.
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int copied = 0;
while (1) {
char message_buffer[SIZE];
ssize_t message_length = mq_receive(...);
if (message_len == -1) { /* error handling */ }
pthread_t pt1;
int ret = pthread_create(&pt1, NULL, handle_message, message_buffer);
if (ret) { /* error handling */ }
// Wait until threads is done with message_buffer.
pthread_mutex_lock(&mutex);
while (!copied) pthread_cond_wait(&cond, &mutex);
copied = 0;
pthread_mutex_unlock(&mutex);
}
void * handle_message (void * message) {
char own_buffer[SIZE];
strncpy(own_buffer, (char *) message, SIZE);
// Done with caller's buffer.
// Signal caller to continue.
pthread_mutex_lock(&mutex);
copied = 1;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);
puts(own_buffer);
return NULL;
}
(The added chunks effectively perform semaphore operations. See the last snippet of this answer for a more generic implementation.)
But there's a simpler solution: Make the copy before creating the thread.
while (1) {
char message_buffer[SIZE];
ssize_t message_length = mq_receive(...);
if (message_len == -1) { /* error handling */ }
pthread_t pt1;
int ret = pthread_create(&pt1, NULL, handle_message, strdup(message_buffer));
if (ret) { /* error handling */ }
}
void * handle_message (void * message) {
char * own_buffer = message;
puts(own_buffer);
free(own_buffer);
return NULL;
}

Segfault on Server after Multithreading in C

So I'm trying to code a multi-threading server. I've spent an enormous time on the internet figuring out the correct way to do this and the answer as always seems to be it depends. Whenever I execute my code, the client successfully connects, and executes but when the thread terminates and returns to the while loop the whole program segfaults.
I probably could use a good spanking on a few other things as well such as my usage of global variables. The entirety of code is below, sorry for the inconsistent space/tabbing.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdbool.h>
#include <signal.h>
#include <math.h>
#include <pthread.h>
#include <sys/stat.h>
#include <fcntl.h>
/* ---------------------------------------------------------------------
This is a basic whiteboard server. You can query it, append to it and
clear in it. It understands both encrypted and unencrypted data.
--------------------------------------------------------------------- */
struct whiteboard {
int line;
char type;
int bytes;
char string[1024];
} *Server;
int serverSize, threadcount, id[5];
bool debug = true;
struct whiteboard *Server;
pthread_mutex_t mutex;
pthread_t thread[5];
/* -------------------------------------------
function: sigint_handler
Opens a file "whiteboard.all" in writemode
and writes all white board information in
command mode.
------------------------------------------- */
void sigint_handler(int sig)
{
if (debug) printf("\nInduced SIGINT.\n");
FILE *fp;
fp=fopen("whiteboard.all","w");
int x=0;
for (x;x<serverSize;x++) // Loop Responsible for iterating all the whiteboard entries.
{
if (debug) printf("#%d%c%d\n%s\n",Server[x].line,Server[x].type,Server[x].bytes,Server[x].string);
fprintf(fp,"#%d%c%d\n%s\n",Server[x].line,Server[x].type,Server[x].bytes,Server[x].string);
}
if (debug) printf("All values stored.\n");
free(Server); // Free dynamically allocated memory
exit(1);
}
/* -------------------------------------------
function: processMessage
Parses '!' messages into their parts -
returns struct in response.
------------------------------------------- */
struct whiteboard processMessage(char * message)
{
int lineNumber, numBytes;
char stringType, entry[1028];
if (debug) printf("Update Statement!\n");
// Read line sent by Socket
sscanf(message,"%*c%d%c%d\n%[^\n]s",&lineNumber,&stringType,&numBytes,entry);
if (debug) printf("Processed: Line: %d, Text: %s\n",lineNumber,entry);
// Parse information into local Struct
struct whiteboard Server;
Server.line = lineNumber;
Server.type = stringType;
Server.bytes = numBytes;
strcpy(Server.string,entry);
// If there is no bytes, give nothing
if (numBytes == 0)
{
strcpy(Server.string,"");
}
return Server;
}
/* -------------------------------------------
function: handleEverything
Determines type of message recieved and
process and parses accordingly.
------------------------------------------- */
char * handleEverything(char* message, struct whiteboard *Server, char* newMessage)
{
bool updateFlag = false, queryFlag = false;
// If message is an Entry
if (message[0] == '#')
{
if (debug) printf("Triggered Entry!\n");
// Create Temporary Struct
struct whiteboard messageReturn;
messageReturn = processMessage(message);
// Store Temporary Struct in Correct Heap Struct
Server[messageReturn.line] = messageReturn;
sprintf(newMessage,"!%d%c%d\n%s\n",messageReturn.line, messageReturn.type, messageReturn.bytes, messageReturn.string);
return newMessage;
}
// If message is a query
if (message[0] == '?')
{
if (debug) printf("Triggered Query!\n");
int x;
queryFlag = true;
sscanf(message,"%*c%d",&x); // Parse Query
if (x > serverSize) // Check if Query out of Range
{
strcpy(newMessage,"ERROR: Query out of Range.\n");
return newMessage;
}
sprintf(newMessage,"!%d%c%d\n%s\n",Server[x].line,Server[x].type,Server[x].bytes,Server[x].string);
if (debug) printf("newMessage as of handleEverything:%s\n",newMessage);
return newMessage;
}
}
/* -------------------------------------------
function: readFile
If argument -f given, read file
process and parse into heap memory.
------------------------------------------- */
void readFile(char * filename)
{
FILE *fp;
fp=fopen(filename,"r");
int line, bytes, count = 0, totalSize = 0;
char type, check, string[1028], individualLine[1028];
// Loop to determine size of file. **I know this is sloppy.
while (fgets(individualLine, sizeof(individualLine), fp))
{
totalSize++;
}
// Each line shoud have totalSize - 2 (to account for 0)
// (answer) / 2 to account for string line and instruction.
totalSize = (totalSize - 2) / 2;
serverSize = totalSize+1;
if (debug) printf("Total Size is: %d\n",serverSize);
// Open and Allocate Memory
fp=fopen(filename,"r");
if (debug) printf("File Mode Calloc Initialize\n");
Server = calloc(serverSize+2, sizeof(*Server));
// Write to Heap Loop
while (fgets(individualLine, sizeof(individualLine), fp)) {
if (individualLine[0] == '#') // Case of Header Line
{
sscanf(individualLine,"%c%d%c%d",&check,&line,&type,&bytes);
if (debug) printf("Count: %d, Check:%c, Line:%d, Type: %c, Bytes:%d \n",count,check,line,type,bytes);
Server[count].line = line;
Server[count].type = type;
Server[count].bytes = bytes;
count++;
}
else
{
// For case of no data
if (individualLine[0] == '\n')
{
strcpy(string,"");
}
// Then scan data line
sscanf(individualLine,"%[^\n]s",string);
if (debug) printf("String: %s\n",string);
strcpy(Server[count-1].string,string);
}
}
return;
}
void *threadFunction(int snew)
{
char tempmessage[1024], message[2048];
// Compile and Send Server Message
strcpy(tempmessage, "CMPUT379 Whiteboard Server v0\n");
send(snew, tempmessage, sizeof(tempmessage), 0);
// Recieve Message
char n = recv(snew, message, sizeof(message), 0);
pthread_mutex_lock(&mutex);
if (debug) printf("Attempt to Malloc for newMessage\n");
char * newMessage = malloc(1024 * sizeof(char));
if (debug) printf("goto: handleEverything\n");
newMessage = handleEverything(message, Server, newMessage);
if (debug) printf("returnMessage:%s\n",newMessage);
strcpy(message,newMessage);
free(newMessage);
pthread_mutex_unlock(&mutex);
if (debug) printf("message = %s\n", message);
send(snew, message, sizeof(message), 0);
printf("End of threadFunction\n");
return;
}
/* -------------------------------------------
function: main
Function Body of Server
------------------------------------------- */
int main(int argc, char * argv[])
{
int sock, fromlength, outnum, i, socketNumber, snew;
bool cleanMode;
// Initialize Signal Handling
struct sigaction act;
act.sa_handler = sigint_handler;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
sigaction(SIGINT, &act, 0);
// For correct number of arguments.
if (argc == 4)
{
// If "-n" parameter (cleanMode)
if (strcmp(argv[2], "-n") == 0)
{
// Get size + 1
cleanMode = true;
sscanf(argv[3],"%d",&serverSize);
serverSize += 1;
if (debug) printf("== Clean Mode Properly Initiated == \n");
if (debug) printf("serverSize: %d\n",serverSize);
if (debug) printf("Clean Mode Calloc\n");
Server = calloc(serverSize, sizeof(*Server));
int i = 0;
for (i; i < serverSize; i++) // Initialize allocated Memory
{
Server[i].line = i;
Server[i].type = 'p';
Server[i].bytes = 0;
strcpy(Server[i].string,"");
}
}
// If "-f" parameter (filemode)
else if (strcmp(argv[2], "-f") == 0)
{
// Read File
cleanMode = false;
readFile(argv[3]);
if (debug) printf("== Statefile Mode Properly Initiated == \n");
if (debug) printf("serverSize: %d\n",serverSize);
}
// Otherwise incorrect parameter.
else
{
printf("Incorrect Argument. \n");
printf("Usage: wbs279 pornumber {-n number | -f statefile}\n");
exit(1);
}
sscanf(argv[1],"%d",&socketNumber);
}
// Send Error for Incorrect Number of Arguments
if (argc != 4)
{
printf("Error: Incorrect Number of Input Arguments.\n");
printf("Usage: wbs279 portnumber {-n number | -f statefile}\n");
exit(1);
}
// == Do socket stuff ==
char tempmessage[1024], message[2048];
struct sockaddr_in master, from;
if (debug) printf("Assrt Socket\n");
sock = socket (AF_INET, SOCK_STREAM, 0);
if (sock < 0)
{
perror ("Server: cannot open master socket");
exit (1);
}
master.sin_family = AF_INET;
master.sin_addr.s_addr = INADDR_ANY;
master.sin_port = htons (socketNumber);
if (bind (sock, (struct sockaddr*) &master, sizeof (master)))
{
perror ("Server: cannot bind master socket");
exit (1);
}
// == Done socket stuff ==
listen (sock, 5);
int threadNumber = 0;
while(1)
{
printf("But what about now.\n");
if (debug) printf("-- Wait for Input --\n");
printf("Enie, ");
fromlength = sizeof (from);
printf("Meanie, ");
snew = accept (sock, (struct sockaddr*) & from, & fromlength);
printf("Miney, ");
if (snew < 0)
{
perror ("Server: accept failed");
exit (1);
}
printf("Moe\n");
pthread_create(&thread[threadNumber],NULL,threadFunction(snew), &id[threadNumber]);
//printf("Can I join?!\n");
//pthread_join(thread[0],NULL);
//printf("Joined?!\n");
threadNumber++;
close (snew);
}
}
I'm also curious as to how exactly to let multiple clients use the server at once. Is how I've allocated the whiteboard structure data appropriate for this process?
I'm very sorry if these don't make any sense.
You seem to somehow expect this:
pthread_create(&thread[threadNumber],NULL,threadFunction(snew), &id[threadNumber]);
/* ... */
close (snew);
To make sense, while it clearly doesn't.
Instead of starting a thread that runs threadFunction, passing it snew, you call the thread function and pass the return value to pthread_create(), which will interpret it as a function pointer. This will break, especially considering that the thread function incorrectly ends with:
return;
This shouldn't compile, since it's declared to return void *.
Also assuming you managed to start the thread, passing it snew to use as its socket: then you immediately close that socket, causing any reference to it from the thread to be invalid!
Please note that pthread_create() does not block and wait for the thread to exit, that would be kind of ... pointless. It starts off the new thread to run in parallel with the main thread, so of course you can't yank the carpet away from under it.
This signal handler is completely unsafe:
void sigint_handler(int sig)
{
if (debug) printf("\nInduced SIGINT.\n");
FILE *fp;
fp=fopen("whiteboard.all","w");
int x=0;
for (x;x<serverSize;x++) // Loop Responsible for iterating all the whiteboard entries.
{
if (debug) printf("#%d%c%d\n%s\n",Server[x].line,Server[x].type,Server[x].bytes,Server[x].string);
fprintf(fp,"#%d%c%d\n%s\n",Server[x].line,Server[x].type,Server[x].bytes,Server[x].string);
}
if (debug) printf("All values stored.\n");
free(Server); // Free dynamically allocated memory
exit(1);
}
Per 2.4.3 Signal Actions of the POSIX standard (emphasis added):
The following table defines a set of functions that shall be
async-signal-safe. Therefore, applications can call them, without
restriction, from signal-catching functions. ...
[list of async-signal-safe functions]
Any function not in the above table may be unsafe with respect to signals. Implementations may make other interfaces
async-signal-safe. In the presence of signals, all functions defined
by this volume of POSIX.1-2008 shall behave as defined when called
from or interrupted by a signal-catching function, with the exception
that when a signal interrupts an unsafe function or equivalent
(such as the processing equivalent to exit() performed after a return
from the initial call to main()) and the signal-catching function
calls an unsafe function, the behavior is undefined. Additional
exceptions are specified in the descriptions of individual functions
such as longjmp().
Your signal handler invokes undefined behavior.

C uninitialized mutex works and initialized mutex fails?

My C program creates a producer thread, saving data as fast as possible. The main thread consumes and prints these.
After days of bug finding, I noticed that if the mutex was initialized, then the program stops within 30 seconds (deadlock?).
However if the mutex is left uninitialized it works perfectly.
Can anyone explain this?? To avoid undefined behavior, I would prefer to initialize them if possible.
Further Info: Specifically it's locking up if "pthread_mutex_t signalM" (the signaling mutex) is initialized
Initialized
#include <stdlib.h> // exit_failure, exit_success
#include <stdio.h> // stdin, stdout, printf
#include <pthread.h> // threads
#include <string.h> // string
#include <unistd.h> // sleep
#include <stdbool.h> // bool
#include <fcntl.h> // open
struct event {
pthread_mutex_t critical;
pthread_mutex_t signalM;
pthread_cond_t signalC;
int eventCount;
};
struct allVars {
struct event inEvents;
struct event outEvents;
int bufferSize;
char buffer[10][128];
};
/**
* Advance the EventCount
*/
void advance(struct event *event) {
// increment the event counter
pthread_mutex_lock(&event->critical);
event->eventCount++;
pthread_mutex_unlock(&event->critical);
// signal await to continue
pthread_mutex_lock(&event->signalM);
pthread_cond_signal(&event->signalC);
pthread_mutex_unlock(&event->signalM);
}
/**
* Wait for ticket and buffer availability
*/
void await(struct event *event, int ticket) {
int eventCount;
// get the counter
pthread_mutex_lock(&event->critical);
eventCount = event->eventCount;
pthread_mutex_unlock(&event->critical);
// lock signaling mutex
pthread_mutex_lock(&event->signalM);
// loop until the ticket machine shows your number
while (ticket > eventCount) {
// wait until a ticket is called
pthread_cond_wait(&event->signalC, &event->signalM);
// get the counter
pthread_mutex_lock(&event->critical);
eventCount = event->eventCount;
pthread_mutex_unlock(&event->critical);
}
// unlock signaling mutex
pthread_mutex_unlock(&event->signalM);
}
/**
* Add to buffer
*/
void putBuffer(struct allVars *allVars, char data[]) {
// get the current write position
pthread_mutex_lock(&allVars->inEvents.critical);
int in = allVars->inEvents.eventCount;
pthread_mutex_unlock(&allVars->inEvents.critical);
// wait until theres a space free in the buffer
await(&allVars->outEvents, in - allVars->bufferSize + 1); // set to 2 to keep 1 index distance
// add data to buffer
strcpy(allVars->buffer[in % allVars->bufferSize], data);
// increment the eventCounter and signal
advance(&allVars->inEvents);
}
/**
* Get from buffer
*/
char *getBuffer(struct allVars *allVars) {
// get the current read position
pthread_mutex_lock(&allVars->outEvents.critical);
int out = allVars->outEvents.eventCount;
pthread_mutex_unlock(&allVars->outEvents.critical);
// wait until theres something in the buffer
await(&allVars->inEvents, out + 1);
// allocate memory for returned string
char *str = malloc(128);
// get the buffer data
strcpy(str, allVars->buffer[out % allVars->bufferSize]);
// increment the eventCounter and signal
advance(&allVars->outEvents);
return str;
}
/** child thread (producer) */
void *childThread(void *allVars) {
char str[10];
int count = 0;
while (true) {
sprintf(str, "%d", count++);
putBuffer(allVars, str);
}
pthread_exit(EXIT_SUCCESS);
}
int main(void) {
// init structs
struct event inEvents = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
};
struct event outEvents = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
};
struct allVars allVars = {
inEvents, // events
outEvents,
10, // buffersize
{"", {""}} // buffer[][]
};
// create child thread (producer)
pthread_t thread;
if (pthread_create(&thread, NULL, childThread, &allVars)) {
fprintf(stderr, "failed to create child thread");
exit(EXIT_FAILURE);
}
// (consumer)
while (true) {
char *out = getBuffer(&allVars);
printf("buf: %s\n", out);
free(out);
}
return (EXIT_SUCCESS);
}
Uninitialized
#include <stdlib.h> // exit_failure, exit_success
#include <stdio.h> // stdin, stdout, printf
#include <pthread.h> // threads
#include <string.h> // string
#include <unistd.h> // sleep
#include <stdbool.h> // bool
#include <fcntl.h> // open
struct event {
pthread_mutex_t critical;
pthread_mutex_t signalM;
pthread_cond_t signalC;
int eventCount;
};
struct allVars {
struct event inEvents;
struct event outEvents;
int bufferSize;
char buffer[10][128];
};
/**
* Advance the EventCount
*/
void advance(struct event *event) {
// increment the event counter
pthread_mutex_lock(&event->critical);
event->eventCount++;
pthread_mutex_unlock(&event->critical);
// signal await to continue
pthread_mutex_lock(&event->signalM);
pthread_cond_signal(&event->signalC);
pthread_mutex_unlock(&event->signalM);
}
/**
* Wait for ticket and buffer availability
*/
void await(struct event *event, int ticket) {
int eventCount;
// get the counter
pthread_mutex_lock(&event->critical);
eventCount = event->eventCount;
pthread_mutex_unlock(&event->critical);
// lock signaling mutex
pthread_mutex_lock(&event->signalM);
// loop until the ticket machine shows your number
while (ticket > eventCount) {
// wait until a ticket is called
pthread_cond_wait(&event->signalC, &event->signalM);
// get the counter
pthread_mutex_lock(&event->critical);
eventCount = event->eventCount;
pthread_mutex_unlock(&event->critical);
}
// unlock signaling mutex
pthread_mutex_unlock(&event->signalM);
}
/**
* Add to buffer
*/
void putBuffer(struct allVars *allVars, char data[]) {
// get the current write position
pthread_mutex_lock(&allVars->inEvents.critical);
int in = allVars->inEvents.eventCount;
pthread_mutex_unlock(&allVars->inEvents.critical);
// wait until theres a space free in the buffer
await(&allVars->outEvents, in - allVars->bufferSize + 1); // set to 2 to keep 1 index distance
// add data to buffer
strcpy(allVars->buffer[in % allVars->bufferSize], data);
// increment the eventCounter and signal
advance(&allVars->inEvents);
}
/**
* Get from buffer
*/
char *getBuffer(struct allVars *allVars) {
// get the current read position
pthread_mutex_lock(&allVars->outEvents.critical);
int out = allVars->outEvents.eventCount;
pthread_mutex_unlock(&allVars->outEvents.critical);
// wait until theres something in the buffer
await(&allVars->inEvents, out + 1);
// allocate memory for returned string
char *str = malloc(128);
// get the buffer data
strcpy(str, allVars->buffer[out % allVars->bufferSize]);
// increment the eventCounter and signal
advance(&allVars->outEvents);
return str;
}
/** child thread (producer) */
void *childThread(void *allVars) {
char str[10];
int count = 0;
while (true) {
sprintf(str, "%d", count++);
putBuffer(allVars, str);
}
pthread_exit(EXIT_SUCCESS);
}
int main(void) {
// init structs
struct event inEvents; /* = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
}; */
struct event outEvents; /* = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
}; */
struct allVars allVars = {
inEvents, // events
outEvents,
10, // buffersize
{"", {""}} // buffer[][]
};
// create child thread (producer)
pthread_t thread;
if (pthread_create(&thread, NULL, childThread, &allVars)) {
fprintf(stderr, "failed to create child thread");
exit(EXIT_FAILURE);
}
// (consumer)
while (true) {
char *out = getBuffer(&allVars);
printf("buf: %s\n", out);
free(out);
}
return (EXIT_SUCCESS);
}
Jonathan explained why the code that didn't initialize mutexes didn't deadlock (essentially because trying to use an uninitialized mutex would never block, it would just immediately return an error).
The problem causing the infinite wait in the version of the program that does properly initialize mutexes is that you aren't using your condition variables properly. The check of the predicate expression and the wait on the condition variable must be done atomically with respect to whatever other thread might be modifying the predicate. You code is checking a predicate that is a local variable that the other thread doesn't even have access to. Your code reads the actual predicate into a local variable within a critical section, but then the mutex for reading the predicate is released and a different mutex is acquired to read the 'false' predicate (which cannot be modified by the other thread anyway) atomically with the condition variable wait.
So you have a situation where the actual predicate, event->eventCount, can be modified and the signal for that modification be issued in between when the waiting thread reads the predicate and blocks on the condition variable.
I think the following will fix your deadlock, but I haven't had a chance to perform much testing. The change is essentially to remove the signalM mutex from struct event and replace any use of it with the critical mutex:
#include <stdlib.h> // exit_failure, exit_success
#include <stdio.h> // stdin, stdout, printf
#include <pthread.h> // threads
#include <string.h> // string
#include <unistd.h> // sleep
#include <stdbool.h> // bool
#include <fcntl.h> // open
struct event {
pthread_mutex_t critical;
pthread_cond_t signalC;
int eventCount;
};
struct allVars {
struct event inEvents;
struct event outEvents;
int bufferSize;
char buffer[10][128];
};
/**
* Advance the EventCount
*/
void advance(struct event *event) {
// increment the event counter
pthread_mutex_lock(&event->critical);
event->eventCount++;
pthread_mutex_unlock(&event->critical);
// signal await to continue
pthread_cond_signal(&event->signalC);
}
/**
* Wait for ticket and buffer availability
*/
void await(struct event *event, int ticket) {
// get the counter
pthread_mutex_lock(&event->critical);
// loop until the ticket machine shows your number
while (ticket > event->eventCount) {
// wait until a ticket is called
pthread_cond_wait(&event->signalC, &event->critical);
}
// unlock signaling mutex
pthread_mutex_unlock(&event->critical);
}
/**
* Add to buffer
*/
void putBuffer(struct allVars *allVars, char data[]) {
// get the current write position
pthread_mutex_lock(&allVars->inEvents.critical);
int in = allVars->inEvents.eventCount;
pthread_mutex_unlock(&allVars->inEvents.critical);
// wait until theres a space free in the buffer
await(&allVars->outEvents, in - allVars->bufferSize + 1); // set to 2 to keep 1 index distance
// add data to buffer
strcpy(allVars->buffer[in % allVars->bufferSize], data);
// increment the eventCounter and signal
advance(&allVars->inEvents);
}
/**
* Get from buffer
*/
char *getBuffer(struct allVars *allVars) {
// get the current read position
pthread_mutex_lock(&allVars->outEvents.critical);
int out = allVars->outEvents.eventCount;
pthread_mutex_unlock(&allVars->outEvents.critical);
// wait until theres something in the buffer
await(&allVars->inEvents, out + 1);
// allocate memory for returned string
char *str = malloc(128);
// get the buffer data
strcpy(str, allVars->buffer[out % allVars->bufferSize]);
// increment the eventCounter and signal
advance(&allVars->outEvents);
return str;
}
/** child thread (producer) */
void *childThread(void *allVars) {
char str[10];
int count = 0;
while (true) {
sprintf(str, "%d", count++);
putBuffer(allVars, str);
}
pthread_exit(EXIT_SUCCESS);
}
int main(void) {
// init structs
struct event inEvents = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
};
struct event outEvents = {
PTHREAD_MUTEX_INITIALIZER,
PTHREAD_COND_INITIALIZER,
0
};
struct allVars allVars = {
inEvents, // events
outEvents,
10, // buffersize
{"", {""}} // buffer[][]
};
// create child thread (producer)
pthread_t thread;
if (pthread_create(&thread, NULL, childThread, &allVars)) {
fprintf(stderr, "failed to create child thread");
exit(EXIT_FAILURE);
}
// (consumer)
while (true) {
char *out = getBuffer(&allVars);
printf("buf: %s\n", out);
free(out);
}
return (EXIT_SUCCESS);
}
I modified the getBuffer() and putBuffer() routines as shown (in both the initialized and uninitialized versions of the code):
static
void putBuffer(struct allVars *allVars, char data[])
{
int lock_ok = pthread_mutex_lock(&allVars->inEvents.critical);
if (lock_ok != 0)
printf("%s(): lock error %d (%s)\n", __func__, lock_ok, strerror(lock_ok));
int in = allVars->inEvents.eventCount;
int unlock_ok = pthread_mutex_unlock(&allVars->inEvents.critical);
if (unlock_ok != 0)
printf("%s(): unlock error %d (%s)\n", __func__, unlock_ok, strerror(unlock_ok));
await(&allVars->outEvents, in - allVars->bufferSize + 1);
strcpy(allVars->buffer[in % allVars->bufferSize], data);
advance(&allVars->inEvents);
}
static
char *getBuffer(struct allVars *allVars)
{
int lock_ok = pthread_mutex_lock(&allVars->outEvents.critical);
if (lock_ok != 0)
printf("%s(): lock error %d (%s)\n", __func__, lock_ok, strerror(lock_ok));
int out = allVars->outEvents.eventCount;
int unlock_ok = pthread_mutex_unlock(&allVars->outEvents.critical);
if (unlock_ok != 0)
printf("%s(): unlock error %d (%s)\n", __func__, unlock_ok, strerror(unlock_ok));
await(&allVars->inEvents, out + 1);
char *str = malloc(128);
strcpy(str, allVars->buffer[out % allVars->bufferSize]);
advance(&allVars->outEvents);
return str;
}
Then running the uninitialized code yields a lot of messages like:
buf: 46566
putBuffer(): lock error 22 (Invalid argument)
getBuffer(): lock error 22 (Invalid argument)
putBuffer(): unlock error 22 (Invalid argument)
getBuffer(): unlock error 22 (Invalid argument)
Basically, it appears to me that your locking and unlocking is being ignored. There are other places in your code that you should check too.
Fundamentally, if you ignore the errors reported, you don't notice that the locking and unlocking is not working at all, and there's no reason for the code to stop running.
Always check the return values from system calls that can fail.
I don't have an immediate explanation for why the initialized code locks up. It does for me, running on Mac OS X 10.10.3 with GCC 5.1.0, after anywhere from about 100,000 to 800,000 iterations.

Pthread runtime errors

I'm having trouble debugging the following program I wrote. The idea is to have two seperate threads; one thread executes a 5 second countdown while the other waits for key input from the user. Whichever thread completes first should cancel the sibling thread and exit the program. However, the following code just hangs.
Any help would be appreciated, but I would be most grateful for an explanation as to the problem.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // For sleep()
#define NUM_THREADS 2
// The stuct to be passed as an argument to the countdown routine
typedef struct countdown_struct {
pthread_t *thread;
signed int num_secs;
} CountdownStruct;
// Struct for passing to the input wait routine
typedef struct wait_struct {
pthread_t *thread;
int *key;
} WaitStruct;
// Countdown routine; simply acts as a timer counting down
void * countdown(void *args)
{
CountdownStruct *cd_str = (CountdownStruct *)args;
signed int secs = cd_str->num_secs;
printf("Will use default setting in %d seconds...", secs);
while (secs >= 0)
{
sleep(1);
secs -= 1;
printf("Will use default setting in %d seconds...", secs);
}
// Cancel the other struct
pthread_cancel(*(cd_str->thread));
return NULL;
}
// Waits for the user to pass input through the tty
void * wait_for_input(void *args)
{
WaitStruct *wait_str = (WaitStruct *) args;
int c = 0;
do {
c = getchar();
} while (!(c == '1' || c == '2'));
*(wait_str->key) = c;
// Cancel the other thread
pthread_cancel(*(wait_str->thread));
return NULL;
}
int main(int argc, char **argv)
{
pthread_t wait_thread;
pthread_t countdown_thread;
pthread_attr_t attr;
int key=0;
long numMillis=5000;
int rc=0;
int status=0;
// Create the structs to be passe as paramaters to both routines
CountdownStruct *cd_str = (CountdownStruct *) malloc(sizeof(CountdownStruct));
if (cd_str == NULL)
{
printf("Couldn't create the countdown struct. Aborting...");
return -1;
}
cd_str->thread = &wait_thread;
cd_str->num_secs = 5;
WaitStruct *wait_str = (WaitStruct *) malloc(sizeof(WaitStruct));
if (wait_str == NULL)
{
printf("Couldn't create the iput wait struct. Aborting...");
return -1;
}
wait_str->thread = &countdown_thread;
wait_str->key = &key;
// Create the joinable attribute
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
// Create both threads
rc = pthread_create(&countdown_thread, &attr, countdown, (void *) cd_str);
if (rc) { printf("Error with the thread creation!"); exit(-1); }
rc = pthread_create(&wait_thread, &attr, wait_for_input, (void *) wait_str);
if (rc) { printf("Error with the thread creation!"); exit(-1); }
// Destroy the pthread_attribute
pthread_attr_destroy(&attr);
// now join on the threads and wait for main
pthread_join(wait_thread, NULL);
pthread_join(countdown_thread, NULL);
// Call pthread_exit
pthread_exit(NULL);
// Free the function structs
free(cd_str);
free(wait_str);
}
Getchar is not required to be a cancellation point. Select and pselect are. Even if you want to continue to use a countdown thread you could still provide a cancellation point in the opposing thread by use of select.
I had reasonable behavior with the following modified wait_for_input()
// Waits for the user to pass input through the tty
void * wait_for_input(void *args)
{
WaitStruct *wait_str = (WaitStruct *) args;
int c = 0;
fd_set readFds;
int numFds=0;
FD_ZERO(&readFds);
do {
struct timeval timeout={.tv_sec=8,.tv_usec=0};
/* select here is primarily to serve as a cancellation
* point. Though there is a possible race condition
* still between getchar() getting called right as the
* the timeout thread calls cancel.().
* Using the timeout option on select would at least
* cover that, but not done here while testing.
*******************************************************/
FD_ZERO(&readFds);
FD_SET(STDOUT_FILENO,&readFds);
numFds=select(STDOUT_FILENO+1,&readFds,NULL,NULL,&timeout);
if(numFds==0 )
{
/* must be timeout if no FD's selected */
break;
}
if(FD_ISSET(STDOUT_FILENO,&readFds))
{
printf("Only get here if key pressed\n");
c = getchar();
}
} while (!(c == '1' || c == '2'));
*(wait_str->key) = c;
// Cancel the other thread
pthread_cancel(*(wait_str->thread));
return NULL;
}

Resources