Why doesn't sem_wait() wait on the semaphore on Mac OS X? - c

The following code shows a producer-consumer example.
Once a product is produced, the consumer gets it.
But I'm surprised that the consumer still gets a product when there is none.
#include <stdlib.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <semaphore.h>
#define NUM 5
int queue[NUM];
int i;
sem_t *blank_number, *product_number;
void *producer(void *arg) {
int p = 0;
while (1) {
sem_wait(blank_number);
queue[p] = rand() % 1000 + 1;
printf("Produce queue[%d]:%d\n", p, queue[p]);
i = sem_post(product_number);
//printf("i_p=%d\n", i);
p = (p+1)%NUM;
sleep(rand()%5);
}
}
void *consumer(void *arg) {
int c = 0;
while (1) {
sem_wait(product_number);
printf("Consume queue[%d]:%d\n", c, queue[c]);
queue[c] = 0;
i = sem_post(blank_number);
//printf("i_c=%d\n", i);
c = (c+1)%NUM;
sleep(rand()%5);
}
}
int main(int argc, char *argv[]) {
pthread_t pid, cid;
//set blank_number to NUM
blank_number = sem_open("blank_number", O_CREAT, S_IRWXU, NUM);
if(blank_number == SEM_FAILED){
perror("open blank_number");
return 1;
}
//set product_number to 0
product_number = sem_open("product_number", O_CREAT, S_IRWXU, 0);
if(product_number == SEM_FAILED){
perror("open product_number");
return 1;
}
pthread_create(&pid, NULL, producer, NULL);
pthread_create(&cid, NULL, consumer, NULL);
pthread_join(pid, NULL);
pthread_join(cid, NULL);
sem_close(blank_number);
sem_close(product_number);
return 0;
}
In my test run only one product is produced (808), but the consumer gets two products, 808 and 0:
$ sudo ./a.out
Produce queue[0]:808
Consume queue[0]:808
Consume queue[1]:0
Is there anything wrong with my code?

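Your problem is that you never delete your semaphores. Named semaphores persist after the process exits, so when you open them again you pick up the old (stale) count from a previous run, and the initial value you pass to sem_open is ignored. If you open them with O_EXCL you will be able to observe the problem: sem_open fails because the semaphore already exists.
Write a simple command to delete them with sem_unlink(), or unlink them at the start of the program before creating them. (semctl only works on System V semaphores, so it cannot be used to reset these.)
You also need to set the appropriate values in sem_open, not 022...
Also note that POSIX named semaphores should have a name starting with /.
Change the beginning of your main to: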
sem_unlink("/blank_number");
sem_unlink("/product_number");
//set blank_number to 1
blank_number = sem_open("/blank_number", O_CREAT|O_EXCL, S_IRWXU, 1);
if(blank_number == SEM_FAILED){
perror("open blank_number");
return 1;
}
//set product_number to 0
product_number = sem_open("/product_number", O_CREAT|O_EXCL, S_IRWXU, 0);
if(product_number == SEM_FAILED){
perror("open product_number");
return 1;
}
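The unlink-then-create pattern above can be wrapped in a small helper. This is only a sketch: open_fresh_semaphore is a made-up name (not a POSIX function), and it assumes that no other process is meant to keep using the old semaphore.

```c
#include <errno.h>
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>
#include <sys/stat.h>   /* S_IRUSR, S_IWUSR */

/* Hypothetical helper: drop any leftover semaphore called `name`, then
   create a new one that starts at `initial_value`.
   Returns SEM_FAILED on error, like sem_open. */
sem_t *open_fresh_semaphore(const char *name, unsigned int initial_value)
{
    /* ENOENT only means there was nothing left over from an earlier run. */
    if (sem_unlink(name) == -1 && errno != ENOENT)
        return SEM_FAILED;

    /* O_EXCL makes sem_open fail instead of silently reusing an existing
       semaphore together with its old count. */
    return sem_open(name, O_CREAT | O_EXCL, S_IRUSR | S_IWUSR, initial_value);
}
```

With this, the two calls in main would become something like open_fresh_semaphore("/blank_number", NUM) and open_fresh_semaphore("/product_number", 0), each still followed by the SEM_FAILED check.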

Maybe try to use sem_init with an unnamed semaphore instead of sem_open:
sem_t semaphore;
int ret = sem_init(&semaphore, 0, 0);
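Whichever variant is used, the return value of sem_init should be checked. As far as I know, macOS declares sem_init but does not implement unnamed semaphores, so there the call fails with errno set to ENOSYS and a later sem_wait on that semaphore will not block as expected. A minimal sketch (init_unnamed_semaphore is an illustrative name):

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: returns 0 on success, otherwise the errno value
   reported by sem_init (ENOSYS where unnamed semaphores are missing). */
int init_unnamed_semaphore(sem_t *sem, unsigned int initial_value)
{
    /* Second argument 0: shared between the threads of this process only. */
    if (sem_init(sem, 0, initial_value) == -1)
        return errno;
    return 0;
}
```

If this returns ENOSYS, falling back to a named semaphore created with sem_open is the usual workaround.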

Related

Shared memory corrupting data

I'm trying to write a program that uses counting semaphores, a mutex, and two threads. One thread is a producer that writes items to shared memory. Each item has a sequence number, timestamp, checksum, and some data. The consumer thread copies the original checksum from an item then calculates its own checksum from the item's data and compares the two to make sure the data wasn't corrupted.
My program runs, however, it reports incorrect checksums far more than correct checksums. I did some print statements to see what was going on, and it looks like the item's data is changing between writing to shared memory and reading from it. The item's stored checksum is also changing, and I have no idea what is causing this.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <errno.h>
#include <stdint.h>
#include <semaphore.h>
#include <time.h>
#include <pthread.h>
typedef struct{
int seqNo;
uint16_t checksum;
uint32_t timeStamp;
uint8_t data[22];
}Item;
char* shm_name = "buffer";
int shm_fd;
uint8_t *shm_ptr;
pthread_t producers;
pthread_t consumers;
pthread_mutex_t mutex;
sem_t *empty, *full;
int shmSize;
int in = 0;
int out = 0;
//method for initializing shared memory
void CreateSharedMemory(){
shm_fd = shm_open(shm_name, O_CREAT | O_RDWR, 0644);
if (shm_fd == -1) {
fprintf(stderr, "Error unable to create shared memory, '%s, errno = %d (%s)\n", shm_name,
errno, strerror(errno));
return -1;
}
/* configure the size of the shared memory segment */
if (ftruncate(shm_fd, shmSize) == -1) {
fprintf(stderr, "Error configure create shared memory, '%s, errno = %d (%s)\n", shm_name,
errno, strerror(errno));
shm_unlink(shm_name);
return -1;
}
printf("shared memory create success, shm_fd = %d\n", shm_fd);
}
uint16_t checksum(char *addr, uint32_t count)
{
register uint32_t sum = 0;
uint16_t *buf = (uint16_t *) addr;
// Main summing loop
while(count > 1)
{
sum = sum + *(buf)++;
count = count - 2;
}
// Add left-over byte, if any
if (count > 0)
sum = sum + *addr;
// Fold 32-bit sum to 16 bits
while (sum>>16)
sum = (sum & 0xFFFF) + (sum >> 16);
return(~sum);
}
Item CreateItem(){
Item item;
uint16_t cksum;
int j = 0;
time_t seconds;
seconds = time(NULL);
item.seqNo = j;
item.timeStamp = seconds; //FIX this
for(int i = 0; i < 22; ++i){
item.data[i] = rand() % 256;
}
cksum = checksum(&item.data[0], shmSize-2);
item.checksum = cksum;
++j;
return item;
}
void* producer() {
shm_fd = shm_open(shm_name, O_RDWR, 0644);
shm_ptr = (uint8_t *)mmap(0, 32, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
while(1) {
Item tempItem = CreateItem();
tempItem.seqNo = in;
sem_wait(empty);
pthread_mutex_lock(&mutex);
while (((in + 1)%shmSize) == out)
; // waiting
if(in < shmSize) {
//&shm_ptr[counter] = item;
memcpy(&shm_ptr[in], &tempItem, 32);
printf("%d\n", tempItem.seqNo);
in = (in + 1) % shmSize;
printf("Producer: %x\n", tempItem.checksum);
}
sleep(1);
pthread_mutex_unlock(&mutex);
sem_post(full);
}
}
void* consumer() {
uint16_t cksum1, cksum2;
shm_fd = shm_open(shm_name, O_RDWR, 0644);
shm_ptr = (uint8_t *)mmap(0, shmSize, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
while(1) {
sem_wait(full);
pthread_mutex_lock(&mutex);
while (in == out)
; // waiting
if(out > 0) {
Item tempItem;
memcpy(&tempItem, &shm_ptr[out], 32);
cksum1 = tempItem.checksum;
cksum2 = checksum(&tempItem.data[0], shmSize-2);
if (cksum1 != cksum2) {
printf("Checksum mismatch: expected %02x, received %02x \n", cksum2, cksum1);
}
else{
printf("removed from shm\n");
}
//printf("Checksums match !!! \n");
out = (out + 1)%shmSize;
}
sleep(1);
pthread_mutex_unlock(&mutex);
sem_post(empty);
}
}
int main(int argc, char **argv){
sem_unlink(&empty);
sem_unlink(&full);
shm_unlink(shm_name);
shmSize = atoi(argv[1]);
out = shmSize;
if(shmSize < 0){
printf("Error: Size of buffer cannot be negative. ");
return -1;
}
pthread_mutex_init(&mutex, NULL);
empty = sem_open("/empty", O_CREAT, 0644, shmSize);
full = sem_open("/full", O_CREAT, 0644, 0);
CreateSharedMemory();
pthread_create(&producers, NULL, producer, NULL);
pthread_create(&consumers, NULL, consumer, NULL);
pthread_exit(NULL);
}
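One thing that may be worth checking in the code above is how items are addressed in the mapping. &shm_ptr[in] is a byte offset, so consecutive items overlap; the copies use a fixed 32 bytes rather than sizeof(Item); and the checksum is taken over shmSize-2 bytes rather than the 22 data bytes. The following is a minimal sketch, assuming the intent is a ring of fixed-size Item slots. slot_write, slot_read and item_checksum are illustrative names, and the byte-wise checksum is a stand-in for the original routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int seqNo;
    uint16_t checksum;
    uint32_t timeStamp;
    uint8_t data[22];
} Item;

/* 16-bit one's-complement checksum over exactly `len` bytes. */
uint16_t item_checksum(const uint8_t *bytes, size_t len)
{
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2)              /* add 16-bit words */
        sum += (uint32_t)bytes[i] | ((uint32_t)bytes[i + 1] << 8);
    if (i < len)                             /* left-over byte, if any */
        sum += bytes[i];
    while (sum >> 16)                        /* fold carries into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Copy a whole Item into slot number `slot`; slots are sizeof(Item) apart. */
void slot_write(uint8_t *base, size_t slot, const Item *item)
{
    memcpy(base + slot * sizeof(Item), item, sizeof(Item));
}

/* Copy a whole Item out of slot number `slot`. */
void slot_read(const uint8_t *base, size_t slot, Item *item)
{
    memcpy(item, base + slot * sizeof(Item), sizeof(Item));
}
```

With this layout the mapping has to be at least (number of slots) * sizeof(Item) bytes long, and both sides compute the checksum as item_checksum(item.data, sizeof item.data).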

Process-shared condition variable : how to recover after one process dies?

I'm working on a simple FIFO queue to synchronize multiple instances of a server process.
This is very similar to
Linux synchronization with FIFO waiting queue, except dealing with multiple processes instead of threads. I adapted caf's ticket lock to use process-shared mutex and condition variable from a shared memory segment. It also handles timeouts in case one process dies while processing a request:
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <pthread.h>
#include <signal.h>
#include <errno.h>
static inline void fail(char *str)
{
perror(str);
exit(1);
}
/***************************************************************************************************/
/* Simple ticket lock queue with pthreads
* https://stackoverflow.com/questions/3050083/linux-synchronization-with-fifo-waiting-queue
*/
typedef struct ticket_lock {
pthread_mutex_t mutex;
pthread_cond_t cond;
int queue_head, queue_tail;
} ticket_lock_t;
static void
ticket_init(ticket_lock_t *t)
{
pthread_mutexattr_t mattr;
pthread_mutexattr_init(&mattr);
pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
pthread_mutexattr_setrobust(&mattr, PTHREAD_MUTEX_ROBUST);
pthread_mutex_init(&t->mutex, &mattr);
pthread_mutexattr_destroy(&mattr);
pthread_condattr_t cattr;
pthread_condattr_init(&cattr);
pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED);
pthread_cond_init(&t->cond, &cattr);
pthread_condattr_destroy(&cattr);
t->queue_head = t->queue_tail = 0;
}
static void
ticket_broadcast(ticket_lock_t *ticket)
{
pthread_cond_broadcast(&ticket->cond);
}
static int
ticket_lock(ticket_lock_t *ticket)
{
pthread_mutex_lock(&ticket->mutex);
int queue_me = ticket->queue_tail++;
while (queue_me > ticket->queue_head) {
time_t sec = time(NULL) + 5; /* 5s timeout */
struct timespec ts = { .tv_sec = sec, .tv_nsec = 0 };
fprintf(stderr, "%i: waiting, current: %i me: %i\n", getpid(), ticket->queue_head, queue_me);
if (pthread_cond_timedwait(&ticket->cond, &ticket->mutex, &ts) == 0)
continue;
if (errno != ETIMEDOUT) fail("pthread_cond_timedwait");
/* Timeout, kick current user... */
fprintf(stderr, "kicking stale ticket %i\n", ticket->queue_head);
ticket->queue_head++;
ticket_broadcast(ticket);
}
pthread_mutex_unlock(&ticket->mutex);
return queue_me;
}
static void
ticket_unlock(ticket_lock_t *ticket, int me)
{
pthread_mutex_lock(&ticket->mutex);
if (ticket->queue_head == me) { /* Normal case: we haven't timed out. */
ticket->queue_head++;
ticket_broadcast(ticket);
}
pthread_mutex_unlock(&ticket->mutex);
}
/***************************************************************************************************/
/* Shared memory */
#define SHM_NAME "fifo_sched"
#define SHM_MAGIC 0xdeadbeef
struct sched_shm {
int size;
int magic;
int ready;
/* sched stuff */
ticket_lock_t queue;
};
static unsigned int shm_size = 256;
static struct sched_shm *shm = 0;
/* Create new shared memory segment */
static void
create_shm()
{
int fd = shm_open(SHM_NAME, O_RDWR | O_CREAT | O_TRUNC, 0644);
assert(fd != -1);
int r = ftruncate(fd, shm_size); assert(r == 0);
void *pt = mmap(0, shm_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
assert(pt != MAP_FAILED);
fprintf(stderr, "Created shared memory.\n");
shm = pt;
memset(shm, 0, sizeof(*shm));
shm->size = shm_size;
shm->magic = SHM_MAGIC;
shm->ready = 0;
ticket_init(&shm->queue);
shm->ready = 1;
}
/* Attach existing shared memory segment */
static int
attach_shm()
{
int fd = shm_open(SHM_NAME, O_RDWR, 0);
if (fd == -1) return 0; /* Doesn't exist yet... */
shm = mmap(0, shm_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (shm == MAP_FAILED) fail("mmap");
fprintf(stderr, "Mapped shared memory.\n");
assert(shm->magic == SHM_MAGIC);
assert(shm->ready);
return 1;
}
static void
shm_init()
{
fprintf(stderr, "shm_init()\n");
assert(shm_size >= sizeof(struct sched_shm));
if (!attach_shm())
create_shm();
}
/***************************************************************************************************/
int main()
{
shm_init();
while (1) {
int ticket = ticket_lock(&shm->queue);
printf("%i: start %i\n", getpid(), ticket);
printf("%i: done %i\n", getpid(), ticket);
ticket_unlock(&shm->queue, ticket);
}
return 0;
}
This works well standalone and while adding extra processes:
$ gcc -g -Wall -std=gnu99 -o foo foo.c -lpthread -lrt
$ ./foo
$ ./foo # (in other term)
...
26370: waiting, current: 134803 me: 134804
26370: start 134804
26370: done 134804
26370: waiting, current: 134805 me: 134806
26370: start 134806
26370: done 134806
26370: waiting, current: 134807 me: 134808
However, killing the 2nd instance breaks pthread_cond_timedwait() in the 1st:
pthread_cond_timedwait: No such file or directory
That makes sense in a way: the condition variable was tracking this process, and it's not there anymore.
Surely there must be a way to recover from this?
[too long for a comment]
pthread_cond_timedwait: No such file or directory
Hu! :-)
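The pthread_*() functions do not set errno; they return the error number directly. The message above therefore comes from a stale errno value (ENOENT), not from pthread_cond_timedwait() itself.
So to get any usable results, change this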
if (pthread_cond_timedwait(&ticket->cond, &ticket->mutex, &ts) == 0)
continue;
if (errno != ETIMEDOUT) fail("pthread_cond_timedwait");
to be
if ((errno = pthread_cond_timedwait(&ticket->cond, &ticket->mutex, &ts)) == 0)
continue;
if (errno != ETIMEDOUT) fail("pthread_cond_timedwait");
OK, quoting the POSIX pthread_mutex_lock() reference:
If mutex is a robust mutex and the process containing the owning thread terminated while holding the mutex lock, a call to pthread_mutex_lock() shall return the error value [EOWNERDEAD]. [...] In these cases, the mutex is locked by the thread but the state it protects is marked as inconsistent. The application should ensure that the state is made consistent for reuse and when that is complete call pthread_mutex_consistent(). If the application is unable to recover the state, it should unlock the mutex without a prior call to pthread_mutex_consistent(), after which the mutex is marked permanently unusable.
So, in addition to alk's comment, to robustly handle processes dying with the mutex locked we need to watch for EOWNERDEAD when calling pthread_mutex_lock() and pthread_cond_timedwait(), and call pthread_mutex_consistent() on the mutex.
Something like:
if ((errno = pthread_cond_timedwait(&ticket->cond, &ticket->mutex, &ts)) == 0)
continue;
if (errno == EOWNERDEAD) /* Recover mutex owned by dead process */
pthread_mutex_consistent(&ticket->mutex);
else if (errno != ETIMEDOUT)
fail("pthread_cond_timedwait");

How to create a file Lock with timeout under linux

I'm trying to lock some critical resources that are accessed by multiple applications under Linux.
All the applications call the acquireLock function on the same file when entering the critical section, and releaseLock when leaving.
If the lock cannot be acquired within the timeout, the caller goes ahead and does something else.
The code below works with slow processes, but under stress the lock is easily broken: it ends up acquired by multiple processes at once, so I guess I'm running into a race condition somewhere.
Can somebody point out why it's not working and what the correct implementation would be?
Thanks a lot!
MV
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/file.h>
//************************************************************
#define CYCLETIME 1000
//************************************************************
//************************************************************
int acquireLock(char *lockFile, int msTimeout)
{
int lockFd;
int cntTimeout = 0;
if ((lockFd = open(lockFile, O_CREAT | O_RDWR, S_IRWXU | S_IRWXG | S_IRWXO)) < 0)
return -1;
while (flock(lockFd, LOCK_EX | LOCK_NB) < 0){
usleep(CYCLETIME);
cntTimeout++;
if(cntTimeout >= msTimeout){
return -1;
}
}
return lockFd;
}
//*************************************************************
void releaseLock (int lockFd)
{
flock(lockFd, LOCK_UN);
close(lockFd);
}
//************************************************************
It turned out that the mistake was in another part of the code; the lock works as expected.
I'm sharing the code I'm using in case it is helpful to somebody else.
These are the locking functions:
/* ----------------------------------------------------------------------- *
* Code derived by the flock.c in the "util-linux" ubuntu package
* by Peter Anvin
* ----------------------------------------------------------------------- */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/file.h>
#include <sys/time.h>
#include <signal.h>
//************************************************************
static sig_atomic_t timeout_expired = 0;
//************************************************************
static void timeout_handler(int sig)
{
(void)sig;
timeout_expired = 1;
}
//************************************************************
int acquireLock(char *lockFile, int msTimeout)
{
struct itimerval timeout, old_timer;
struct sigaction sa, old_sa;
int err;
int sTimeout = msTimeout/1000;
memset(&timeout, 0, sizeof timeout);
timeout.it_value.tv_sec = sTimeout;
timeout.it_value.tv_usec = ((msTimeout-(sTimeout*1000))*1000);
memset(&sa, 0, sizeof sa);
sa.sa_handler = timeout_handler;
sa.sa_flags = SA_RESETHAND;
sigaction(SIGALRM, &sa, &old_sa);
setitimer(ITIMER_REAL, &timeout, &old_timer);
int lockFd;
int cntTimeout = 0;
if ((lockFd = open(lockFile, O_CREAT | O_RDWR, S_IRWXU | S_IRWXG | S_IRWXO)) < 0)
return -1;
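while (flock(lockFd, LOCK_EX))
{
switch( (err = errno) ) {
case EINTR: /* Signal received */
if ( timeout_expired ) {
setitimer(ITIMER_REAL, &old_timer, NULL); /* Cancel itimer */
sigaction(SIGALRM, &old_sa, NULL); /* Cancel signal handler */
close(lockFd); /* Do not leak the descriptor */
return -1; /* Timed out waiting for the lock */
}
continue; /* otherwise try again */
default: /* Other errors */
setitimer(ITIMER_REAL, &old_timer, NULL); /* Cancel itimer */
sigaction(SIGALRM, &old_sa, NULL); /* Cancel signal handler */
close(lockFd);
return -1;
}
}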
setitimer(ITIMER_REAL, &old_timer, NULL); /* Cancel itimer */
sigaction(SIGALRM, &old_sa, NULL); /* Cancel signal handler */
return lockFd;
}
//***************************************************************
void releaseLock (int lockFd)
{
flock(lockFd, LOCK_UN);
close(lockFd);
}
//************************************************************
... and they can be tested by reading from and writing to a FIFO:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include "lock.h"
#define LOCKED 1
void main(int argc, char **argv)
{
const char *filename;
const char *fifo_name;
const char *message;
int lockfd, fifoHandle;
filename = argv[1];
fifo_name = argv[2];
message = argv[3];
char bufin[1024];
char bufout[1024];
struct stat st;
int bufsize = strlen(message)+1;
int sleeptime = 0;
int j = 0;
if (stat(fifo_name, &st) != 0)
mkfifo(fifo_name, 0666);
while (1){
if (LOCKED)
lockfd=acquireLock(filename, 15000);
if (lockfd==-1)
printf("timeout expired \n");
fifoHandle= open(fifo_name, O_RDWR);
strcpy(bufin, message);
bufin[bufsize-1] = 0x0;
write(fifoHandle, bufin, sizeof(char)*bufsize);
sleeptime = rand() % 100000;
usleep(sleeptime);
read(fifoHandle, &bufout, sizeof(char)*(bufsize+1));
printf("%s - %d \n", bufout, j);
j= j+1;
if (LOCKED)
releaseLock(lockfd);
sleeptime = rand() % 10000;
}
unlink(fifo_name);
return;
}
by running the following in two terminals:
./locktestFIFO ./lck ./fifo messageA
./locktestFIFO ./lck ./fifo messageB
If LOCKED is not set to 1 the messages will get mixed up; otherwise the two processes will take and release the resource correctly.

Producer, consumer POSIX

I am trying to write a simple producer-consumer app using POSIX semaphores in C.
Consumer:
int memoryID;
struct wrapper *memory;
int main(int argc, char **argv) {
srand(time(NULL));
key_t sharedMemoryKey = ftok(".",MEMORY_KEY);
if(sharedMemoryKey==-1)
{
perror("ftok():");
exit(1);
}
memoryID=shmget(sharedMemoryKey,sizeof(struct wrapper),0);
if(memoryID==-1)
{
perror("shmget(): ");
exit(1);
}
memory = shmat(memoryID,NULL,0);
if(memory== (void*)-1)
{
perror("shmat():");
exit(1);
}
while(1)
{
int r = rand();
sem_wait(&memory->full);
sem_wait(&memory->mutex);
int n;
sem_getvalue(&memory->full,&n);
printf("Removed item: %d",(memory->array)[n]);
usleep(1000000);
sem_post(&memory->mutex);
sem_post(&memory->empty);
}
}
Producer:
int memoryID;
struct wrapper *memory;
int rc;
void atexit_function() {
rc = shmctl(memoryID, IPC_RMID, NULL);
rc = shmdt(memory);
}
int main(int argc, char **argv) {
atexit(atexit_function);
//creating key for shared memory
srand(time(NULL));
key_t sharedMemoryKey = ftok(".", MEMORY_KEY);
if (sharedMemoryKey == -1) {
perror("ftok():");
exit(1);
}
memoryID = shmget(sharedMemoryKey, sizeof(struct wrapper), IPC_CREAT | 0600);
if (memoryID == -1) {
perror("shmget():");
exit(1);
}
memory = shmat(memoryID, NULL, 0);
if (memory == (void *) -1) {
perror("shmat():");
exit(1);
}
//initialization
memset(&memory->array, 0, sizeof(memory->array));
sem_init(&memory->mutex, 1, 1);
sem_init(&memory->empty, 1, SIZE_OF_ARRAY);
sem_init(&memory->full, 1, 0);
if (memoryID == -1) {
perror("shmget(): ");
exit(1);
}
while(1)
{
int r = rand();
sem_wait(&memory->empty);
sem_wait(&memory->mutex);
int n;
sem_getvalue(&memory->full,&n);
printf("Adding task\t Value:%d\tNumber of tasks waiting:%d \n",r,n);
(memory->array)[n]=r;
usleep(1000000);
sem_post(&memory->mutex);
sem_post(&memory->full);
}
}
common.h:
#define MEMORY_KEY 5
#define SIZE_OF_ARRAY 10
struct wrapper
{
int array[SIZE_OF_ARRAY];
sem_t empty;
sem_t mutex;
sem_t full;
};
What is happening:
The producer starts successfully.
The producer successfully adds elements to the array and prints them out.
Shortly after starting the producer, I start the consumer.
The consumer does not take an element from the array even once.
The producer fills up the array and then waits.
I do not really see where the problem is. I suspect that the problem is in the implementation, not the algorithm, because the algorithm is taken from Wikipedia.
Your consumer works fine. It's just not flushing to stdout. Do as nos suggested by putting a \n at the end of your consumer printf call. You can also see it working by just waiting longer. Your producer will start producing again after the consumer has executed a few iterations.
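The reason is that stdout is buffered: text without a trailing newline can sit in the buffer until it fills up or the stream is flushed. A small sketch of a print helper that does both (report_removed is an illustrative name, not part of the original code):

```c
#include <stdio.h>

/* Hypothetical helper: print one removed item on its own line and flush
   right away, so the message appears even when `out` is a pipe or file.
   Returns the number of characters written, or -1 on error. */
int report_removed(FILE *out, int item)
{
    int n = fprintf(out, "Removed item: %d\n", item);
    if (n < 0 || fflush(out) != 0)
        return -1;
    return n;
}
```

In the consumer loop this would replace the printf call, for example report_removed(stdout, (memory->array)[n]).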

POSIX sem_wait() SIGABRT

I am working on a school project where we have to make a multithreaded web server. My problem is that when I call sem_wait on my semaphore (which should be initialized to 0 but already seems to have been sem_post()ed to 1), I get a SIGABRT.
I am attaching my code below, and I put a comment on the line that is causing my problem. I've spent a few hours with the debugger with little luck.
#include <iostream>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <netinet/in.h>
#include <netdb.h>
#include <string>
#include <string.h>
#include <iostream>
#include <fcntl.h>
#include <errno.h>
#include <pthread.h>
#include <vector>
#include <semaphore.h>
#include <stdio.h>
#include <cstdlib>
#include <strings.h>
#define PORTNUM 7000
#define NUM_OF_THREADS 5
#define oops(msg) { perror(msg); exit(1);}
#define FCFS 0
#define SJF 1
void bindAndListen();
void acceptConnection(int socket_file_descriptor);
void* dispatchJobs(void*);
void* replyToClient(void* pos);
//holds ids of worker threads
pthread_t threads[NUM_OF_THREADS];
//mutex variable for sleep_signal_cond
pthread_mutex_t sleep_signal_mutex[NUM_OF_THREADS];
//holds the condition variables to signal when the thread should be unblocked
pthread_cond_t sleep_signal_cond[NUM_OF_THREADS];
//mutex for accessing sleeping_thread_list
pthread_mutex_t sleeping_threads_mutex = PTHREAD_MUTEX_INITIALIZER;
//list of which threads are sleeping so they can be signaled and given a job
std::vector<bool> *sleeping_threads_list = new std::vector<bool>();
//number of threads ready for jobs
sem_t available_threads;
sem_t waiting_jobs;
//holds requests waiting to be given to one of the threads for execution
//request implemented as int[3] with int[0]== socket_descriptor int[1]== file_size int[2]== file_descriptor of requested file
//if file_size == 0 then HEAD request
std::vector<std::vector<int> >* jobs = new std::vector<std::vector<int> >();
pthread_mutex_t jobs_mutex = PTHREAD_MUTEX_INITIALIZER;
int main (int argc, char * const argv[]) {
//holds id for thread responsible for removing jobs from ready queue and assigning them to worker thread
pthread_t dispatcher_thread;
//initializes semaphores
if(sem_init(&available_threads, 0, NUM_OF_THREADS) != 0){
oops("Error Initializing Semaphore");
}
if(sem_init(&waiting_jobs, 0, 0) !=0){
oops("Error Initializing Semaphore");
}
//initializes condition variables and guarding mutexes
for(int i=0; i<NUM_OF_THREADS; i++){
pthread_cond_init(&sleep_signal_cond[i], NULL);
pthread_mutex_init(&sleep_signal_mutex[i], NULL);
}
if(pthread_create(&dispatcher_thread, NULL, dispatchJobs, (void*)NULL) !=0){
oops("Error Creating Distributer Thread");
}
for (int i=0; i<NUM_OF_THREADS; i++) {
pthread_mutex_lock(&sleeping_threads_mutex);
printf("before");
sleeping_threads_list->push_back(true);
printf("after");
pthread_mutex_unlock(&sleeping_threads_mutex);
}
printf("here");
for (int i=0; i<NUM_OF_THREADS; i++) {
//creates threads and stores ID in threads
if(pthread_create(&threads[i], NULL, replyToClient, (void*)i) !=0){
oops("Error Creating Thread");
}
}
/*
if(sem_init(&available_threads, 0, NUM_OF_THREADS) !=0){
oops("Error Initializing Semaphore");
}
if(sem_init(&waiting_jobs, 0, 0) !=0){ //this is the semaphore thats used in the sem_wait
oops("Error Initializing Semaphore");
}*/
bindAndListen();
}
//binds to socket and listens for connections
//being done by main thead
void bindAndListen(){
struct sockaddr_in saddr;
struct hostent *hp;
char hostname[256];
int sock_id, sock_fd;
gethostname(hostname, 256);
hp = gethostbyname(hostname);
bzero(&saddr, sizeof(saddr));
//errno = 0;
bcopy(hp->h_addr, &saddr.sin_addr, hp->h_length);
saddr.sin_family = AF_INET;
saddr.sin_port = htons(PORTNUM);
saddr.sin_addr.s_addr = INADDR_ANY;
sock_id = socket(AF_INET, SOCK_STREAM, 0);
if(sock_id == -1){
oops("socket");
printf("socket");
}
if(bind(sock_id, (const sockaddr*)&saddr, sizeof(saddr)) ==0){
if(listen(sock_id, 5) ==-1){
oops("listen");
}
//each time a new connection is accepted, get file info and push to ready queue
while(1){
int addrlen = sizeof(saddr);
sock_fd = accept(sock_id, (sockaddr*)&saddr, (socklen_t*)&addrlen);
if (sock_fd > 0) {
acceptConnection(sock_fd);
}else {
oops("Error Accepting Connection");
}
}
}else{
oops("there was an error binding to socket");
}
}// end of bindAndListen()
//accepts connection and gets file info of requested file
//being done by main thread
void acceptConnection(int sock_fd){
printf("**Server: A new client connected!");
//only using loop so on error we can break out on error
while(true){
//used to hold input from client
char* inputBuff = new char[BUFSIZ];
int slen = read(sock_fd, inputBuff, BUFSIZ);
//will sit on space between HEAD/GET and path
int pos1 = 0;
//will sit on space between path and HTTP version
int pos2 = 0;
//need duplicate ptr so we can manipulate one in the loop
char* buffPtr = inputBuff;
//parses client input breaks up query by spaces
for(int i=0; i<slen; i++){
if(*buffPtr == ' '){
if (pos1 == 0) {
pos1 = i;
}else {
pos2 = i;
break;
}
}
buffPtr++;
}
if((pos1 - pos2) >=0){
std::string str = "Invalid Query";
write(sock_fd, str.c_str(), strlen(str.c_str()));
break;
}
printf("slen length %d\n", slen);
std::string* method = new std::string(inputBuff, pos1);
printf("method length %lu\n",method->length());
//increment the ptr for buff to the starting pos of the path
inputBuff += ++pos1;
printf("pos2 - pos1 %d\n", (pos2 - pos1));
printf("pos1 = %d pos2 = %d\n", pos1, pos2);
std::string* path = new std::string(inputBuff, (pos2 - pos1));
printf("path length %lu\n", path->length());
printf("part1 %s\n", method->c_str());
printf("part2 %s\n", path->c_str());
//opens file requested by client
int fd = open(path->c_str(), O_RDONLY);
if(fd < 0){
std::string* error = new std::string("Error Opening File");
*error += *path + std::string(strerror(errno), strlen(strerror(errno)));
write(sock_fd, error->c_str(), strlen(error->c_str()));
break;
}
int file_size;
if(method->compare("GET") == 0){
//gets file info and puts the resulting struct in file_info
struct stat file_info;
if(fstat(fd, &file_info) !=0){
oops("Error getting file info");
}
file_size = file_info.st_size;
}else if(method->compare("HEAD") == 0){
file_size = 0;
}else{
write(sock_fd, "Invalid Query", strlen("Invalid Query"));
break;
}
//job to be pushed to ready queue
std::vector<int> job;
job.push_back(sock_fd);
job.push_back(file_size);
job.push_back(fd);
//check mutex guarding the ready queue
pthread_mutex_lock(&jobs_mutex);
//push job to back of ready queue
jobs->push_back(job);
//unlock mutex guarding the ready queue
pthread_mutex_unlock(&jobs_mutex);
//increment number of jobs in ready queue
sem_post(&waiting_jobs);
} //end of while(true)
// we only end up here if there was an error
fflush(stdout);
close(sock_fd);
}// end of acceptConnection()
//routine run by dispather thread
void *dispatchJobs(void*){
while(true){
//wait for a thread to be available to execute a job
sem_wait(&available_threads);
//wait for a job to be waiting in the ready queue
sem_wait(&waiting_jobs); //this is the line thats crashing
//aquire lock to check which threads are waiting
pthread_mutex_lock(&sleeping_threads_mutex);
//go through list of threads to see which is waiting
for(int i=0; i<sleeping_threads_list->size(); i++){
if(sleeping_threads_list->at(i)){
//unlocks lock for access to list of waiting threads
pthread_mutex_unlock(&sleeping_threads_mutex);
//allows us access to the list of condition variables to signal the thread to resume execution
pthread_mutex_lock(&sleep_signal_mutex[i]);
pthread_cond_signal(&sleep_signal_cond[i]);
pthread_mutex_unlock(&sleep_signal_mutex[i]);
}
}
}//end of while(true)
}//end of dispatchJobs()
//sends file or metadata to client
//run by worker thread
//pos is position of condition variable that it waits to be signaled in the sleep_signal_cond[] array
void* replyToClient(void* pos){
int position = (long)pos;
while(true){
//waits for dispather thread to signal it
pthread_mutex_lock(&sleep_signal_mutex[position]);
pthread_cond_wait(&sleep_signal_cond[position], &sleep_signal_mutex[position]);
pthread_mutex_unlock(&sleep_signal_mutex[position]);
//lock mutex to get job to be executed
pthread_mutex_lock(&jobs_mutex);
std::vector<int> job = jobs->front();
//removes job from front of vector
jobs->erase(jobs->begin());
//releases mutex
pthread_mutex_unlock(&jobs_mutex);
//socket file descriptor, used for writing to socket
int sock_fd =job[0];
int file_size = job[1];
//file descriptor for requested job
int fd = job[2];
//holds output to be written to socket
char* outputBuffer = new char[BUFSIZ];
//GET request, send file
if(file_size !=0){
int readResult = 0;
while ((readResult = read(fd, outputBuffer, BUFSIZ)) > 0) {
if(write(sock_fd, outputBuffer, readResult) != readResult){
printf("We may have a write error");
}
}
if(readResult < 0){
oops("Error Reading File");
}
if(readResult == 0){
printf("finished sending file");
}
}else{ // HEAD request
}
//increment number of available threads
sem_post(&available_threads);
}
}// end of replyToClient()
Check the whole logic of the code again. It is possible to reach this point:
pthread_mutex_lock(&jobs_mutex);
std::vector<int> job = jobs->front();
//removes job from front of vector
jobs->erase(jobs->begin());
//releases mutex
pthread_mutex_unlock(&jobs_mutex);
with jobs->size () == 0, in which case front() and erase() invoke undefined behavior, which may well result in the effects you observe.
Check whether your program still crashes after the following change:
//lock mutex to get job to be executed
pthread_mutex_lock(&jobs_mutex);
if (jobs->size () == 0)
{
pthread_mutex_unlock (&jobs_mutex);
continue;
}
std::vector<int> job = jobs->front();
//removes job from front of vector
jobs->erase(jobs->begin());
//releases mutex
pthread_mutex_unlock(&jobs_mutex);
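Skipping the iteration avoids the crash, but a more common pattern is to make the worker wait on a condition variable in a loop until the queue really has something in it. The sketch below shows that pattern in plain C with int jobs and a fixed-size ring; job_queue and its functions are illustrative names, not the structures from the question.

```c
#include <pthread.h>

#define QUEUE_CAPACITY 16

typedef struct {
    int items[QUEUE_CAPACITY];
    int head;                    /* index of the oldest job */
    int count;                   /* number of queued jobs */
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;
} job_queue;

void job_queue_init(job_queue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Append a job. Returns 0 on success, -1 if the queue is full. */
int job_queue_push(job_queue *q, int job)
{
    int rc = -1;
    pthread_mutex_lock(&q->mutex);
    if (q->count < QUEUE_CAPACITY) {
        q->items[(q->head + q->count) % QUEUE_CAPACITY] = job;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->not_empty);   /* wake one waiting worker */
    }
    pthread_mutex_unlock(&q->mutex);
    return rc;
}

/* Remove and return the oldest job, blocking while the queue is empty. */
int job_queue_pop(job_queue *q)
{
    int job;
    pthread_mutex_lock(&q->mutex);
    /* Re-check the predicate after every wake-up: wake-ups can be spurious,
       and another worker may have taken the job first. */
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->mutex);
    job = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    pthread_mutex_unlock(&q->mutex);
    return job;
}
```

Because the emptiness check and the removal happen under the same mutex, a worker can never call the equivalent of front() on an empty queue.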
I haven't used POSIX semaphores, but I believe this is what is happening. I'm only familiar with Linux kernel semaphores, and you don't mention your system. The init function's 3rd parameter probably sets the count variable. You set it to 0 (= busy but no other processes waiting). The wait function probably invokes down(), which begins by decreasing the count variable by 1: to -1, which means the semaphore you mean to use is locked now. There is nothing in your program to ever unlock it I believe (from browsing your code - it's pretty long), so you are in trouble. Try setting it to 1 in init. This might be all that is needed.
