Trying to pass a struct between threads in plain C using reference counting. I have pthreads and gcc atomics available. I can get it to work, but I'm looking for something bulletproof.
At first, I used a pthread mutex owned by the struct itself:
struct item {
    int ref;
    pthread_mutex_t mutex;
};

void ref(struct item *item) {
    pthread_mutex_lock(&item->mutex);
    item->ref++;
    pthread_mutex_unlock(&item->mutex);
}

void unref(struct item *item) {
    pthread_mutex_lock(&item->mutex);
    item->ref--;
    pthread_mutex_unlock(&item->mutex);
    if (item->ref <= 0)
        free(item);
}

struct item *alloc_item(void) {
    struct item *item = calloc(1, sizeof(*item));
    return item;
}
But, realized the mutex shouldn't be owned by the item:
static pthread_mutex_t mutex;

struct item {
    int ref;
};

void ref(struct item *item) {
    pthread_mutex_lock(&mutex);
    item->ref++;
    pthread_mutex_unlock(&mutex);
}

void unref(struct item *item) {
    pthread_mutex_lock(&mutex);
    item->ref--;
    if (item->ref <= 0)
        free(item);
    pthread_mutex_unlock(&mutex);
}

struct item *alloc_item(void) {
    struct item *item = calloc(1, sizeof(*item));
    return item;
}
Then, further realized pointers are passed by value, so I now have:
static pthread_mutex_t mutex;

struct item {
    int ref;
};

void ref(struct item **item) {
    pthread_mutex_lock(&mutex);
    if (item != NULL) {
        if (*item != NULL) {
            (*item)->ref++;
        }
    }
    pthread_mutex_unlock(&mutex);
}

void unref(struct item **item) {
    pthread_mutex_lock(&mutex);
    if (item != NULL) {
        if (*item != NULL) {
            (*item)->ref--;
            if ((*item)->ref == 0) {
                free(*item);
                *item = NULL;
            }
        }
    }
    pthread_mutex_unlock(&mutex);
}

struct item *alloc_item(void) {
    struct item *item = calloc(1, sizeof(*item));
    if (item != NULL)
        item->ref = 1;
    return item;
}
Are there any logical missteps here? Thanks!
I don't know of a general purpose solution.
It would be nice to be able to reduce this down to an atomic add/subtract of the reference count. Indeed, most of the time that is all that is required... so stepping through a mutex or whatever hurts.
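That bare atomic add/subtract looks like this with the gcc builtins the question mentions. A sketch only -- and note it is only safe while the calling thread already holds a reference of its own, which is exactly the limitation discussed below:

/* Sketch: atomic add/subtract with the gcc builtins. Safe ONLY if the
 * calling thread already holds a reference of its own, so the count
 * cannot hit zero underneath it. */
void ref(struct item *item)
{
    __atomic_fetch_add(&item->ref, 1, __ATOMIC_RELAXED);
}

void unref(struct item *item)
{
    /* release/acquire so the freeing thread sees all prior writes */
    if (__atomic_sub_fetch(&item->ref, 1, __ATOMIC_ACQ_REL) == 0)
        free(item);
}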
But the real problem is managing the reference count and the pointer to the item, at the same time.
When a thread comes to ref() an item, how does it find it? If it doesn't already exist, presumably it must create it. If it does already exist, it must avoid some other thread freeing it before the reference-count is incremented.
So... your void ref(struct item** item) works on the basis that the mutex protects the struct item** pointer... while you hold the mutex, no other thread can change the pointer -- so only one thread can create the item (and increment the count 0->1), and only one thread can destroy the item (after decrementing the count 1->0).
It is said that many problems in computer science can be solved by introducing a new level of indirection, and that is what is going on here. The problem is how do all the threads obtain the address of the item -- given that it may (softly and suddenly) vanish away? Answer: invent a level of indirection.
BUT, now we are assuming that the pointer to the item cannot itself vanish. This can be trivially achieved if the pointer to the item can be held as a process global (static storage duration). If the pointer to the item is (part of) an allocated storage duration object, then we must ensure that this higher-level object is somehow locked -- so that the address of the pointer to the item is "stable" while it is in use. That is, the higher-level object won't move around in memory and won't be destroyed while we are using it!
So, the if (item != NULL) checks after locking the mutex are suspect. If the mutex also protects the pointer to the item, then that mutex needs to have been locked before establishing the address of the pointer to the item -- and in that case checking after the lock is too late. Or the address of the pointer to the item is protected in some other way (perhaps by another mutex) -- and in that case the check can be done before the lock (and moving it there makes it clear what the mutex protects, and what it does not protect).
However, if the item is part of a larger data structure, and that structure is locked, you may (well) not need a lock to cover the pointer to the item at all. It depends... as I said, I'm not aware of a general solution.
I have some large, dynamic data structures (hash tables, queues, trees, etc.) which are shared by a number of threads. Mostly, threads look up and hold on to items for some time. When the system is busy, it is very busy, and the destruction of items can be deferred until things are quieter. So I use read/write locks on the large structures, atomic add/subtract for the reference counts, and a garbage collector to do the actual destruction of items. The point here is that the choice of mechanism for the (apparently simple and self contained) increment/decrement of the reference count, depends on how the creation and destruction of items is managed, and how threads come to be in possession of a pointer to an item (which is what the reference count counts, after all).
If you have 128-bit atomic operations to hand, you can put a 64-bit address and a 64-bit reference count together and do something along the lines of:
ref:    bar = fetch_add(*foo, 1);
        ptr = bar >> 64;
        if (ptr == NULL)
        {
            if ((bar & 0xF...F) == 1)
                ...create item etc.
            else
                ...wait for item
        }

unref:  bar = fetch_sub(*foo, 1);
        if ((bar & 0xF...F) == 0)
        {
            if (cmp_xchg(*foo, bar, (NULL << 64) | 0))
                ...free(bar >> 64);
        }
where foo is the 128-bit combined ptr/ref-count (whose existence is protected by some external means) -- assuming a 64-bit ptr and a 64-bit count -- bar is a local variable of the same form (taking fetch_add()/fetch_sub() here to return the new value), and ptr is a void*.
If finding the pointer NULL triggers the item creation, then the first thread to move the count from 0->1 knows who they are, and any threads that arrive before the item is created, and the pointer set, also know who they are and can wait. Setting the pointer requires a cmp_xchg(), and the creator then discovers how many threads are waiting for it.
This mechanism moves the reference count out of the item, and bundles it with the address of the item, which seems neat enough -- though you now need the address of the item when operating on that, and the address of the reference to the item when you are operating on its reference count.
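For concreteness, that pseudocode might render into gcc builtins roughly as follows. A sketch only: it assumes a 64-bit target with unsigned __int128 support, the 16-byte operations may be routed through libatomic (link with -latomic) and may not be lock-free, and all the names here are mine:

#include <stdint.h>
#include <stdlib.h>

typedef unsigned __int128 slot_t;   /* high 64 bits: item pointer, low 64: count */

#define SLOT_PTR(s) ((void *)(uintptr_t)((s) >> 64))
#define SLOT_CNT(s) ((uint64_t)(s))

void ref_slot(slot_t *slot)
{
    slot_t bar = __atomic_add_fetch(slot, 1, __ATOMIC_ACQ_REL);
    if (SLOT_PTR(bar) == NULL) {
        if (SLOT_CNT(bar) == 1) {
            /* we moved the count 0->1: create the item, then publish it
               with a compare-exchange that swaps the pointer in */
        } else {
            /* someone else is creating it: wait for the pointer to be set */
        }
    }
}

void unref_slot(slot_t *slot)
{
    slot_t bar = __atomic_sub_fetch(slot, 1, __ATOMIC_ACQ_REL);
    if (SLOT_CNT(bar) == 0) {
        slot_t expected = bar;   /* pointer still set, count now zero */
        if (__atomic_compare_exchange_n(slot, &expected, (slot_t)0,
                                        0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE))
            free(SLOT_PTR(bar));
    }
}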
This replaces the mutex in your ref and unref functions... but does NOT solve the problem of how the reference itself is protected.
I'm working on a C project for an embedded target running a Linux distribution (built with Yocto). I'm new to the embedded Linux world, and I have to build a data logging system.
I'm already learning how to use threads, and I'm thinking about how to organize the project.
Here is what I imagine:
multiple threads to collect data from different interfaces, CAN bus, I2C... (different sample rates)
one thread with a sample rate of 200 ms to populate CSV files
one thread with a sample rate of 3 seconds to send data with HTTP requests
threads will stop on CAN info or an external event
I don't know the best way to organize this project. I see two ways: in the first, a startup program creates each thread and waits in a while loop, watching for the event that stops them; in the second, a startup program executes other binaries as threads.
Either way, I don't know how to share data between the threads.
Can you share your experience?
Thank you
EDIT:
First, thanks a lot to @Glärbo for the explanations. They're really helpful for learning the multithreading mechanics.
I've tested the approach with success.
For future readers, I've drawn diagrams to illustrate @Glärbo's answer:
main thread
producer-sensor thread
datalogger thread
I would do it simpler, using a multiple-producer, single-consumer approach.
Let's assume each data item can be described using a single numerical value:
struct value {
    struct value *next;    /* Forming a singly-linked list of data items */
    struct sensor *from;   /* Identifies which sensor value this is */
    struct timespec when;  /* Time of sensor reading in UTC */
    double value;          /* Numerical value */
};
I would use two lists of values: one for sensor readings received but not stored, and one for unused value buckets. This way you don't need to dynamically allocate or free value buckets, unless you want to (by manipulating the unused list).
Both lists are protected by a mutex. Since the unused list may be empty, we need a condition variable (signaled whenever a new unused value is added) so that threads can wait for one to become available. The received list similarly needs a condition variable, so that if it happens to be empty when the consumer (data storer) wants readings, it can wait for at least one to appear.
static pthread_mutex_t unused_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t unused_wait = PTHREAD_COND_INITIALIZER;
static struct value *unused_list = NULL;
static pthread_mutex_t received_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t received_wait = PTHREAD_COND_INITIALIZER;
static struct value *received_list = NULL;
For the unused list, we need three helpers. The first creates new unused value items from scratch; you call it initially to create, say, two or three value items per sensor plus a few extra, and later on if you find you need more (say, if you add new sensors at run time):
int unused_create(void)
{
    struct value *v;

    v = malloc(sizeof *v);
    if (!v)
        return ENOMEM;
    v->from = NULL;

    pthread_mutex_lock(&unused_lock);
    v->next = unused_list;
    unused_list = v;
    pthread_cond_signal(&unused_wait);
    pthread_mutex_unlock(&unused_lock);
    return 0;
}
The other two are needed to get and put value items from/back to the list:
struct value *unused_get(void)
{
    struct value *v;

    pthread_mutex_lock(&unused_lock);
    while (!unused_list)
        pthread_cond_wait(&unused_wait, &unused_lock);
    v = unused_list;
    unused_list = unused_list->next;
    pthread_mutex_unlock(&unused_lock);

    v->from = NULL;
    return v;
}

void unused_put(struct value *v)
{
    v->from = NULL;
    pthread_mutex_lock(&unused_lock);
    v->next = unused_list;
    unused_list = v;
    pthread_cond_signal(&unused_wait);
    pthread_mutex_unlock(&unused_lock);
}
The idea above is that when the from member is NULL, the item is unused (as it is not from any sensor). Technically, we don't need to clear it to NULL at every stage, but I like to be thorough: it's not like setting it is a costly operation.
Sensor-accessing producers take the sensor reading, get the current time using e.g. clock_gettime(CLOCK_REALTIME, &timespec), and then use unused_get() to grab a new unused item. (The order is important, because unused_get() may take some time, if there are no free items.) Then, they fill in the fields, and call the following received_put() to prepend the reading to the list:
void received_put(struct value *v)
{
    pthread_mutex_lock(&received_lock);
    v->next = received_list;
    received_list = v;
    pthread_cond_signal(&received_wait);
    pthread_mutex_unlock(&received_lock);
}
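Putting the producer side together, a sensor thread body could look roughly like this. A sketch: read_sensor(), the stop_requested flag, and the fixed interval are my placeholders, not part of the design above:

void *sensor_worker(void *arg)
{
    struct sensor *const s = arg;

    while (!stop_requested) {                  /* hypothetical stop flag */
        const double reading = read_sensor(s); /* hypothetical helper */
        struct timespec now;
        clock_gettime(CLOCK_REALTIME, &now);

        struct value *v = unused_get();        /* may block until a bucket is free */
        v->from  = s;
        v->when  = now;
        v->value = reading;
        received_put(v);

        usleep(200000);                        /* sensor-specific interval */
    }
    return NULL;
}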
There is only one thread that periodically collects all received sensor readings and stores them. It can keep a set of the most recent readings, and send those periodically. Instead of calling some received_get() repeatedly until there are no more unhandled received values, we should use a function that returns the whole list in one go:
struct value *received_getall(void)
{
    struct value *v;

    pthread_mutex_lock(&received_lock);
    while (!received_list)
        pthread_cond_wait(&received_wait, &received_lock);
    v = received_list;
    received_list = NULL;
    pthread_mutex_unlock(&received_lock);
    return v;
}
The consumer thread, storing/sending the summaries and readings, should obtain the whole list, then handle the items one by one. After each item has been processed, it should be added back to the unused list. In other words, something like:
struct value *all, *v;

while (1) {
    all = received_getall();
    while (all) {
        v = all;
        all = all->next;
        v->next = NULL;
        /* Store/summarize value item v */
        unused_put(v);
    }
}
As you can see, while the consumer thread is handling the sensor value items, the sensor threads can add new readings for the next round, as long as there are enough free value item buckets to use.
Of course, you can also allocate lots of values with one malloc() call, but then you must somehow remember which pool each value belongs to in order to free them. So:
struct value {
    struct value *next;    /* Forming a singly-linked list of data items */
    struct owner *owner;   /* Part of which value array, NULL if standalone */
    struct sensor *from;   /* Identifies which sensor value this is */
    struct timespec when;  /* Time of sensor reading in UTC */
    double value;          /* Numerical value */
};

struct owner {
    size_t size;           /* Number of values in the array */
    size_t used;           /* Number of values not freed yet */
    struct value value[];  /* Flexible array member: must come after struct value is complete */
};
int unused_add_array(const size_t size)
{
    struct owner *o;
    size_t i;

    o = malloc(sizeof (struct owner) + size * sizeof (struct value));
    if (!o)
        return ENOMEM;
    o->size = size;
    o->used = size;
    for (i = 0; i < size; i++) {
        o->value[i].owner = o;   /* so unused_free() can find the pool */
        o->value[i].from = NULL;
    }

    i = size - 1;
    pthread_mutex_lock(&unused_lock);
    o->value[i].next = unused_list;
    while (i-- > 0)
        o->value[i].next = &o->value[i + 1];
    unused_list = &o->value[0];
    pthread_cond_broadcast(&unused_wait);
    pthread_mutex_unlock(&unused_lock);
    return 0;
}
/* Instead of unused_put(), call unused_free() to discard a value */
void unused_free(struct value *v)
{
    pthread_mutex_lock(&unused_lock);
    v->from = NULL;
    if (v->owner) {
        if (v->owner->used > 1) {
            v->owner->used--;    /* Other values from this pool remain in use */
            pthread_mutex_unlock(&unused_lock);
            return;
        }
        v->owner->size = 0;
        v->owner->used = 0;
        free(v->owner);          /* Last value in the pool: free the whole pool */
        pthread_mutex_unlock(&unused_lock);
        return;
    }
    free(v);                     /* Standalone value from unused_create() */
    pthread_mutex_unlock(&unused_lock);
}
The reason unused_free() uses unused_lock is that we must be sure that no other thread is accessing the bucket when we free it. Otherwise, we can have a race window, where the other thread may use the value after we free()d it.
Remember that the Linux C library, like most other C libraries, does not return dynamically allocated memory to the operating system at free(); memory is only returned if it is large enough to matter. (Currently on x86 and x86-64, the glibc default threshold is 128 KiB, i.e. 131,072 bytes; anything smaller is left in the process heap, and used to satisfy future malloc()/calloc()/realloc() calls.)
The contents of the struct sensor are up to you, but personally, I'd put at least
struct sensor {
    pthread_t worker;
    int connfd;          /* Device or socket descriptor */
    const char *name;    /* Some kind of identifier, perhaps a header in the CSV */
    const char *units;   /* Optional, could be useful */
};
plus possibly the sensor reading interval (in, say, milliseconds).
In practice, because there is only one consumer thread, I'd use the main thread for it.
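To make the shape of the whole thing explicit, the wiring might look like this. A sketch: sensors[], nsensors, sensor_worker(), and consumer_loop() are assumptions following the earlier sketches, not fixed names:

int main(void)
{
    /* Create two or three value buckets per sensor, plus a few extra */
    for (size_t i = 0; i < 3 * nsensors + 4; i++)
        if (unused_create())
            return EXIT_FAILURE;

    /* Start one producer thread per sensor */
    for (size_t i = 0; i < nsensors; i++)
        pthread_create(&sensors[i].worker, NULL, sensor_worker, &sensors[i]);

    /* The main thread is the single consumer: run the
       received_getall()/unused_put() loop shown earlier here. */
    consumer_loop();
    return EXIT_SUCCESS;
}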
I am working on a multithreaded client using C and the pthreads library, with a boss/worker architecture, and am having issues understanding/debugging a stack-use-after-scope error that causes my client to fail. (I am kinda new to C.)
I have tried multiple things, including defining the variable globally, passing a double pointer reference, etc.
Boss logic within main:
for (i = 0; i < nrequests; i++)
{
    struct Request_work_item *request_ctx = malloc(sizeof(*request_ctx));
    request_ctx->server = server;
    request_ctx->port = port;
    request_ctx->nrequests = nrequests;
    req_path = get_path(); // Gets a file path to work on
    request_ctx->path = req_path;
    steque_item work_item = &request_ctx; // steque_item is a void*, so passing it a pointer to the Request_work_item
    pthread_mutex_lock(&mutex);
    while (steque_isempty(&work_queue) == 0) // Wait for the queue to be empty to add more work
    {
        pthread_cond_wait(&c_boss, &mutex);
    }
    steque_enqueue(&work_queue, work_item); // Queue the work_item in work_queue (type steque_t, can hold any number of steque_items)
    pthread_mutex_unlock(&mutex);
    pthread_cond_signal(&c_worker);
}
Worker logic inside a defined function:
struct Request_work_item **wi;

while (1)
{
    pthread_mutex_lock(&mutex);
    while (steque_isempty(&work_queue) == 1) // Wait for work to be added to the queue
    {
        pthread_cond_wait(&c_worker, &mutex);
    }
    wi = steque_pop(&work_queue); // Pull the steque_item into a Request_work_item type
    pthread_mutex_unlock(&mutex);
    pthread_cond_signal(&c_boss);
    char *path_to_file = (*wi)->path; // When executing, I get the error on this line: SUMMARY: AddressSanitizer: stack-use-after-scope
    ...
    ...
    ...
    continues with additional worker logic
I expect the worker to pull the work_item from the queue, dereference the values and then perform some work. However, I keep getting AddressSanitizer: stack-use-after-scope, and the information for this error online is not very abundant so any pointers would be greatly appreciated.
The red flag here is that &request_ctx is the address of a local variable. It's not the pointer to the storage allocated with malloc, but the address of the variable which holds that storage. That variable is gone once this scope terminates, even though the malloc-ed block endures.
Maybe the fix is simply to delete the address-of & operator in this line?
steque_item work_item = &request_ctx; // steque_item is a void* so passing
// it a pointer to the Request_work_item
If we do that, then the comment actually tells the truth. Because otherwise we're making work_item a pointer to a pointer to the Request_work_item.
Since work_item has type void*, it compiles either way, unfortunately.
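So, with the & dropped, the producer and the worker would pair up like this (a sketch of the single-pointer version):

/* producer: enqueue the malloc'd pointer itself */
steque_item work_item = request_ctx;
steque_enqueue(&work_queue, work_item);

/* worker: pop it back out as the same single pointer */
struct Request_work_item *wi = steque_pop(&work_queue);
char *path_to_file = wi->path;
/* ... do the work, then free(wi) when finished ... */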
If the consumer of the item on the other end of the queue is extracting it as a Request_work_item *, then you not only have an access to an object that has gone out of scope, but also a type mismatch even if that object happens to still be in the producer's scope when the consumer uses it. The consumer ends up using a piece of the producer's stack as if it were a Request_work_item structure.
Edit: I see that you are using a pointer-to-pointer when dequeuing the item and accessing it as (*wi)->path. Think about changing the design to avoid doing that. Or else, that wi pointer has to be dynamically allocated also, and freed. The producer has to do something like:
struct Request_work_item **p_request_ctx = malloc(sizeof *p_request_ctx);
struct Request_work_item *request_ctx = malloc(sizeof *request_ctx);

if (p_request_ctx && request_ctx) {
    *p_request_ctx = request_ctx;
    request_ctx->field = init_value;
    // ... etc.
    // Then p_request_ctx is enqueued.
}
The consumer then has to free the structure, and also free the pointer. That extra pointer just seems like pure overhead here; it doesn't provide any essential or useful level of indirection.
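For completeness, if you did keep the pointer-to-pointer design, the consumer side would have to mirror it (a sketch):

struct Request_work_item **wi = steque_pop(&work_queue);
char *path_to_file = (*wi)->path;
/* ... do the work ... */
free(*wi);   /* the Request_work_item itself */
free(wi);    /* the extra pointer that carried it */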
I have a program that enables multiple threads to insert entries into a hashtable and retrieve them. The hashtable itself is a very simple implementation with a struct defining each bucket entry and a table (array) to hold each bucket. I'm very new to concurrency and multithreading, but I think that to avoid data being lost in the table during insert and read operations, some kind of synchronization (something like mutex locking) needs to be added, so that one thread's data operation cannot be preempted partway through by another's.
In practice, though, I'm not really sure how to tell where a thread could be preempted during a read or write operation on the hashtable, where exactly locks should be placed to avoid such problems, and how to avoid deadlocks. As per this website, for the hashtable insert method, I added a mutex lock before each key gets inserted into the table and unlock it at the end of the function. I do essentially the same thing in the function that reads data from the hash table. When I run the code, the keys seem to be inserted successfully at first, but the program hangs when the keys are supposed to be retrieved. Here is how I implemented the locking for each function:
// Inserts a key-value pair into the table
void insert(int key, int val) {
    pthread_mutex_lock(&lock);
    int i = key % NUM_BUCKETS;
    bucket_entry *e = (bucket_entry *) malloc(sizeof(bucket_entry));
    if (!e) panic("No memory to allocate bucket!");
    e->next = table[i];
    e->key = key;
    e->val = val;
    table[i] = e;
    pthread_mutex_unlock(&lock);
    pthread_exit(NULL);
}
// Retrieves an entry from the hash table by key
// Returns NULL if the key isn't found in the table
bucket_entry * retrieve(int key) {
    pthread_mutex_lock(&lock);
    bucket_entry *b;
    for (b = table[key % NUM_BUCKETS]; b != NULL; b = b->next) {
        if (b->key == key) return b;
    }
    pthread_mutex_unlock(&lock);
    pthread_exit(NULL);
    return NULL;
}
So the main problems here are:
How to tell where data is being lost between each thread operation
What could cause the program to hang when the keys are being retrieved from the hashtable?
First, you should read more about pthreads. Read also pthreads(7). Notice in particular that every locking call like pthread_mutex_lock should always be later followed by a call to pthread_mutex_unlock on the same mutex (and conventionally you should adopt the discipline that each lock and unlock happens in the same block). Hence the return in the for loop of your retrieve is wrong; you should code:
bucket_entry *
retrieve(int key)
{
    bucket_entry *res = NULL;
    pthread_mutex_lock(&lock);
    for (bucket_entry *b = table[key % NUM_BUCKETS];
         b != NULL;
         b = b->next) {
        if (b->key == key) {
            res = b;
            break;
        }
    }
    pthread_mutex_unlock(&lock);
    return res;
}
Then you could use valgrind and a recent GCC compiler (e.g. 5.2 in November 2015). Compile with all warnings and debug info (gcc -Wall -Wextra -g -pthread). Read about the sanitizer debugging options; in particular, consider using -fsanitize=thread.
There are few reasons to call pthread_exit (likewise, you rarely call exit in a program). When you do, the calling thread is terminated.
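The same discipline applies to your insert(): do the allocation outside the critical section, keep the lock and unlock in the same block, and drop the pthread_exit() call. A sketch:

void insert(int key, int val)
{
    bucket_entry *e = malloc(sizeof *e);
    if (!e)
        panic("No memory to allocate bucket!");
    e->key = key;
    e->val = val;

    pthread_mutex_lock(&lock);
    e->next = table[key % NUM_BUCKETS];
    table[key % NUM_BUCKETS] = e;
    pthread_mutex_unlock(&lock);
}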
I know there are a lot of answered questions about casting void* to a struct, but I can't manage to get it to work the right way.
Well, I want to create a thread which will play music in the background. I have a structure which gathers the loaded music file array and the start and end indices:
typedef unsigned char SoundID;
typedef unsigned char SongID;

typedef struct {
    Mix_Music *songs[9];  // array of songs
    SongID startId;       // index of first song to play
    SongID endId;         // index of last song to play
} SongThreadItem;
Then I want to play the songs by creating a thread, passing the function which actually plays the songs to thrd_create():
int play_songs(Mix_Music *songs[9], SongID startId, SongID endId, char loop){
    thrd_t thrd;
    SongThreadItem _item;
    SongThreadItem *item = &_item;
    memcpy(item->songs, songs, sizeof(item->songs));
    item->startId = startId;
    item->endId = endId;
    printf("item->startId is %i\n", item->startId);
    printf("item->endId is %i\n", item->endId);
    thrd_create_EC(thrd_create(&thrd, audio_thread_run, item));
    return 0;
}
int audio_thread_run(void *arg){
    SongThreadItem *item = arg; // also tried with = (SongThreadItem *)arg
    printf("item->startId is %i\n", item->startId);
    printf("item->endId is %i\n", item->endId);
    free(item);
    return 0;
}
Then I get the following output:
item->startId is 0
item->endId is 8
item->startId is 6
item->endId is 163
The values retrieved inside audio_thread_run() aren't the ones expected. I don't know if I've put enough code here for someone to find my error; I tried to keep it minimal because it's part of a bigger project.
Thanks in advance for your help.
SongThreadItem _item;
SongThreadItem *item = &_item; // bug
That's a problem there: you're giving the thread a pointer to a stack variable. The stack will get overwritten by pretty much anything going on in the main thread. You need to allocate dynamic memory here (with malloc), and take care of freeing it when no longer needed (perhaps in the thread routine itself).
Other options would be a global structure that keeps track of all the active threads and their starting data, or something like that. But it will involve dynamic allocations unless the count of threads is fixed at compile time.
The thread runs asynchronously but you are passing it a pointer to SongThreadItem that is on the stack of the thread that calls play_songs().
If you have only a single thread calling play_songs(), and it is not called again until you are done with the item, you can make the definition of _item like this:
static SongThreadItem _item;
so that it is in the data segment and will not be overwritten.
If you don't know when and who will call play_songs(), then just malloc the _item and free it in the thread when you are done:
...
SongThreadItem *item = (SongThreadItem *)malloc(sizeof(SongThreadItem));
...
The latter is usually the better idea. Think of it as passing ownership of the data to the new thread. Of course, production-quality code should free the item if the thread creation fails.
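Applied to the play_songs() in the question, that would look something like this (a sketch; error handling reduced to returning -1):

int play_songs(Mix_Music *songs[9], SongID startId, SongID endId, char loop)
{
    thrd_t thrd;
    SongThreadItem *item = malloc(sizeof *item);
    if (!item)
        return -1;

    memcpy(item->songs, songs, sizeof item->songs);
    item->startId = startId;
    item->endId = endId;

    if (thrd_create(&thrd, audio_thread_run, item) != thrd_success) {
        free(item);   /* the thread never started, so we still own item */
        return -1;
    }
    return 0;         /* audio_thread_run() frees item when done */
}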
I want to make every element in an array of structures thread-safe, by using a mutex lock for access to each element of the array.
This is my structure:
typedef struct {
    void *value;
    void *key;
    uint32_t value_length;
    uint32_t key_length;
    uint64_t access_count;
    void *next;
    pthread_mutex_t *mutex;
} lruc_item;
I have an array of this structure, and want to use mutex locks in order to make structure elements thread safe.
I tried taking the lock on one of the array elements in a function and then intentionally didn't unlock it, just to check that my locks were working. The strange thing was that there was no deadlock, and a second function accessing the same array element was still able to access it.
Can someone please guide me on how to use mutexes to lock every element in a structure array (so as to make each element of the structure thread-safe)?
sample code to explain my point:
/** FUNCTION THAT CREATES ELEMENTS OF THE STRUCTURE **/
lruc_item *create_item(lruc *cache) {
    lruc_item *item = NULL;
    item = (lruc_item *) calloc(1, sizeof(lruc_item));
    item->mutex = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
    if (pthread_mutex_init(item->mutex, NULL)) {
        perror("LRU Cache unable to initialise mutex for page");
        return NULL;
    }
    return item;
}
set()
{
    item = create_item(cache);
    pthread_mutex_lock(item->mutex);
    item->value = value;
    item->key = key;
    item->value_length = value_length;
    item->key_length = key_length;
    item->access_count = ++cache->access_count;
    pthread_mutex_unlock(item->mutex); /** (LINE P) tried commenting this out to check the mutex works (deadlock expected if the same "item" is accessed in another function) **/
}
get(lruc_item *item)
{
    pthread_mutex_lock(item->mutex); /** deadlock doesn't occur when "LINE P" is commented out **/
    *value = item->value;
    item->access_count = ++cache->access_count;
    pthread_mutex_unlock(item->mutex);
}
It's important to note that a mutex only locks out code from other threads. If you tried to execute WaitForMultipleObjects with the same mutex in the same thread it wouldn't block. I'm assuming Windows, because you haven't detailed that.
But, if you provide more detail, maybe we can pin-point where the issue really is.
Now, assuming again Windows, if you want to make accesses to the individual elements "thread-safe", you might want to consider the InterlockedExchange-class of functions instead of a mutex. For example:
InterlockedExchange(&s.value_length, newValue);
or
InterlockedExchange64(&s.access_count, new64Value);
or
InterlockedExchangePointer(&s.value, newPointer);
If what you want is to make multiple accesses to the structure's elements thread-safe as a single transaction, then a mutex can do that for you. A mutex is useful across process boundaries; if you are only working within a single process, a critical section might be a better idea.
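Since the code in the question is actually pthreads: note that relocking a default pthread mutex from the thread that already holds it is undefined behaviour, so the "no deadlock" you observed proves nothing about your locking. An error-checking mutex makes such mistakes visible (a sketch):

pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
pthread_mutex_init(item->mutex, &attr);
pthread_mutexattr_destroy(&attr);
/* A second pthread_mutex_lock() from the owning thread now returns
   EDEADLK instead of silently "working". */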