I have a problem accessing an array from several threads. I have written a struct that gathers all the information needed for the job I want to do.
The structure is defined like this:
struct thread_blk
{
size_t th_blk_count;
size_t th_blk_size;
size_t th_blk_current;
size_t* th_blk_i_start;
void* data;
pthread_t* th_id;
ThreadBlockAdaptive th_blk_adapt;
};
The idea is to fill an array from multiple threads, each one working on its own delimited region of the array.
The th_blk_count field is the number of blocks to process.
The th_blk_size field is the size of a block.
The th_blk_current field is the index of the block being processed (blocks are numbered from 0 to n).
The th_blk_i_start field is an array holding the start index of each block in the array to be filled.
Only one function that operates on the thread_blk struct does not work properly:
int startAllThreadBlock(struct thread_blk* th_blk, worker_func f)
{
int res = 0;
for(size_t i = 0; i < th_blk->th_blk_count; ++i)
{
res |= pthread_create(th_blk->th_id + i, NULL, f, th_blk);
th_blk->th_blk_current++;
}
return res;
}
In fact, the th_blk_current field is not incremented properly. I use it to retrieve the th_blk_i_start indexes, which serve as interval bounds. As a result, my worker (shown below) processes the same indexes of the array of doubles.
Here is the function I use in the startAllThreadBlock function:
void* parallel_for(void* th_blk_void)
{
ThreadBlock th_blk = (ThreadBlock)th_blk_void;
size_t i = getThreadBlockStartIndex(th_blk, getThreadBlockCurrentIndex(th_blk));
printf(
"Running thread %p\n"
" -Start index %zu\n\n",
pthread_self(),
i
);
if(getThreadBlockCurrentIndex(th_blk) == (getThreadBlockCount(th_blk) - 1))
{
for(; i < MAX; ++i)
{
result[i] = tan(atan((double)i));
}
}
else
{
size_t threshold = getThreadBlockStartIndex(th_blk, getThreadBlockCurrentIndex(th_blk) + 1);
for(; i < threshold; ++i)
{
result[i] = tan(atan((double)i));
}
}
return NULL;
}
ThreadBlock is just a typedef over a thread_blk*; result is the array of double.
I am pretty sure that the problem lies in startAllThreadBlock (if I add a 1 second sleep, everything runs as expected). But I don't know how to fix it.
Does someone have an idea?
Thanks for your answers.
Update
Placing the incrementation in the worker solved the problem. But I think it is not safe though, for the reason Some programmer dude mentioned.
void* parallel_for(void* th_blk_void)
{
ThreadBlock th_blk = (ThreadBlock)th_blk_void;
size_t i = getThreadBlockStartIndex(th_blk, getThreadBlockCurrentIndex(th_blk));
size_t n;
if(getThreadBlockCurrentIndex(th_blk) == (getThreadBlockCount(th_blk) - 1))
{
n = MAX;
}
else
{
n = getThreadBlockStartIndex(th_blk, getThreadBlockCurrentIndex(th_blk) + 1);
}
incThreadBlockCurrent(th_blk);
printf(
"Running thread %p\n"
" -Start index %zu\n\n",
pthread_self(),
i
);
for(; i < n; ++i)
{
result[i] = tan(atan((double)i));
}
return NULL;
}
A mutex on th_blk_current would fix it, no?
I think the problem here is that you think the thread gets passed a copy of the structure. It doesn't, it gets a pointer. All the threads get a pointer to the same structure. So any changes to the structure will affect all threads.
You need to come up with a way to pass individual data to the individual threads. For example a thread-specific structure containing only the thread-specific data, and you dynamically allocate an instance of that structure to pass to the thread.
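A minimal sketch of the per-thread structure idea described above. The names range_arg, fill_range, fill_in_parallel and NTHREADS are hypothetical, and the per-element work is a stand-in for the question's tan(atan(i)). Instead of allocating each argument dynamically, this version keeps one slot per thread in an array and joins every thread before the array goes out of scope.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX 1000        /* hypothetical array length */
#define NTHREADS 4      /* hypothetical thread count */

static double result[MAX];

/* Hypothetical per-thread argument: each thread gets its own half-open range. */
struct range_arg {
    size_t begin;
    size_t end;
};

static void *fill_range(void *p)
{
    const struct range_arg *r = p;
    for (size_t i = r->begin; i < r->end; ++i)
        result[i] = (double)i * 2.0;   /* stand-in for the real per-element work */
    return NULL;
}

/* Starts NTHREADS workers on disjoint ranges; returns 0 on success. */
int fill_in_parallel(void)
{
    pthread_t tid[NTHREADS];
    struct range_arg args[NTHREADS];   /* one slot per thread, never shared */
    size_t chunk = MAX / NTHREADS;
    size_t started = 0;
    int res = 0;

    for (size_t t = 0; t < NTHREADS; ++t) {
        args[t].begin = t * chunk;
        /* The last thread also takes any remainder. */
        args[t].end = (t == NTHREADS - 1) ? MAX : (t + 1) * chunk;
        res = pthread_create(&tid[t], NULL, fill_range, &args[t]);
        if (res != 0)
            break;
        ++started;
    }
    /* args lives in this stack frame, so join before returning. */
    for (size_t t = 0; t < started; ++t)
        pthread_join(tid[t], NULL);
    return res;
}
```

Because no thread reads a field that another thread or the creating loop writes, no mutex is needed here; the range is fixed before pthread_create is called.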
Related
I need buffers that I will use from several different types of threads, so the array needs to be global.
The buffer size and the number of buffers are given as input to the program.
As an alternative, I could maybe implement a linked list.
What is the best way to implement such buffers? Can you provide a sample?
Any help is appreciated!
I don't understand what you mean by "without knowing length": if you pass the size of each buffer and the number of buffers as input parameters, then you know every required length.
Maybe this is not the best, but that would be my way.
First declare global buffer and threads.
static void ** buffer;
pthread_t tid[2];
Here is how the threads work. The first thread fills the first two sub-buffers with data; the second does the same with the other two.
void *assignBuffer(void *threadid) {
pthread_t id = pthread_self();
if (pthread_equal(id, tid[0])) {
strcpy(buffer[0], "foo");
strcpy(buffer[1], "bar");
} else {
strcpy(buffer[2], "oof");
strcpy(buffer[3], "rab");
}
return NULL;
}
1. Convert the program arguments from strings to integers.
2. Allocate the buffer as an array of pointers of unknown type (void *).
3. Allocate each sub-buffer with its size in bytes.
4. Finally, create the worker threads. The important thing is that they run simultaneously.
5. Wait until all threads have finished their job.
6. Print the buffer contents.
Ok, here is the code.
int main(int argc, char **argv) {
//1
int bufferSize = atoi(argv[1]);
int buffersAmount = atoi(argv[2]);
//2
buffer = malloc(sizeof(void *)*buffersAmount);
//3
int i;
for (i = 0; i < buffersAmount; ++i) {
buffer[i] = malloc(bufferSize);
}
//4
i = 0;
while (i < 2) {
pthread_create(&tid[i], NULL, &assignBuffer, NULL);
++i;
}
//5
for (i = 0; i < 2; i++)
pthread_join(tid[i], NULL);
//6
for (i = 0; i < 4; ++i) {
printf("%d %s\n", i, (char*)buffer[i]);
}
for (i = 0; i < buffersAmount; ++i) {
free(buffer[i]);
}
free(buffer); /* also release the array of pointers itself */
return 0;
}
Feel free to ask if you don't understand something. Also, sorry for my English; it is not my native language.
I want to create an array of structs on the heap from another data structure. Say there are N total elements to traverse, and (N-x) pointers (computed_elements) will be added to the array.
My naive strategy for this is to create an array (temp_array) size N on the stack and traverse the data structure, keeping track of how many elements need to be added to the array, and adding them to temp_array when I encounter them. Once I've finished, I malloc(computed_elements) and populate this array with the temp_array.
This is suboptimal because the second loop is unnecessary. However, I am weighing this against the tradeoff of constantly reallocating memory every iteration. Some rough code to clarify:
void *temp_array[N];
int count = 0;
for (int i = 0; i < N; i++) {
if (check(arr[i])) {
temp_array[count] = arr[i];
count++;
}
}
void **results = malloc(count * sizeof *results); /* array of pointers, like temp_array */
for (int i = 0; i < count; i++) {
results[i] = temp_array[i];
}
return results;
Thoughts would be appreciated.
One common strategy is to try to estimate the number of elements you're going to need (not a close estimate, more of a "On the order of..." type estimate). Malloc that amount of memory, and when you get "close" to that limit ("close" also being up for interpretation), realloc some more. Personally, I typically double the array when I get close to filling it.
-EDIT-
Here is the "ten minute version". (I've ensured that it builds and doesn't segfault)
Obviously I've omitted things like checking for the success of malloc/realloc, zeroing memory, etc...
#include <stdlib.h>
#include <stdbool.h>
#include <string.h> /* for the "malloc only version" (see below) */
/* Assume 500 elements estimated*/
#define ESTIMATED_NUMBER_OF_RECORDS 500
/* "MAX" number of elements that the original question seems to be bound by */
#define N 10000
/* Included only to allow for test compilation */
typedef struct
{
int foo;
int bar;
} MyStruct;
/* Included only to allow for test compilation */
MyStruct arr[N] = { 0 };
/* Included only to allow for test compilation */
bool check(MyStruct valueToCheck)
{
bool baz = true;
/* ... */
return baz;
}
int main (int argc, char** argv)
{
int idx = 0;
int actualRecordCount = 0;
int allocatedSize = 0;
MyStruct *tempPointer = NULL;
MyStruct *results = malloc(ESTIMATED_NUMBER_OF_RECORDS * sizeof(MyStruct));
allocatedSize = ESTIMATED_NUMBER_OF_RECORDS;
for (idx = 0; idx < N; idx++)
{
/* Ensure that we're not about to walk off the current array */
if (actualRecordCount == (allocatedSize))
{
allocatedSize *= 2;
/* "malloc only version"
* If you want to avoid realloc and just malloc everything...
*/
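/*
tempPointer = malloc(allocatedSize * sizeof(MyStruct));
memcpy(tempPointer, results, actualRecordCount * sizeof(MyStruct));
free(results);
results = tempPointer;
*/
/* Using realloc... (allocatedSize counts elements, so scale it to bytes) */
tempPointer = realloc(results, allocatedSize * sizeof(MyStruct));
results = tempPointer;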
}
/* Check validity of original array element */
if (check(arr[idx]))
{
results[actualRecordCount] = arr[idx];
actualRecordCount++;
}
}
if (results != NULL)
{
free(results);
}
return 0;
}
One possibility is malloc for the size N, then run your loop, then realloc for size N-x. Memory fragmentation may result for small x.
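The malloc-then-shrink idea above could look roughly like the sketch below. MyStruct mirrors the test struct used earlier in this thread; keep() and filter_copy() are hypothetical names, and the even-number predicate is only a placeholder for the question's check().

```c
#include <stdlib.h>

/* Hypothetical element type and predicate, standing in for the question's. */
typedef struct { int foo; int bar; } MyStruct;

static int keep(const MyStruct *s) { return s->foo % 2 == 0; }

/*
 * Copies the elements of arr[0..n) that pass keep() into a new heap array.
 * Stores the number kept in *out_count. Returns NULL if nothing was kept
 * or if allocation failed.
 */
MyStruct *filter_copy(const MyStruct *arr, size_t n, size_t *out_count)
{
    *out_count = 0;
    if (n == 0)
        return NULL;

    MyStruct *results = malloc(n * sizeof *results);   /* worst case: keep all */
    if (results == NULL)
        return NULL;

    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (keep(&arr[i]))
            results[count++] = arr[i];

    if (count == 0) {            /* avoid realloc(ptr, 0), whose behavior varies */
        free(results);
        return NULL;
    }

    /* Shrink to fit; if realloc fails, the original block is still valid. */
    MyStruct *shrunk = realloc(results, count * sizeof *results);
    if (shrunk != NULL)
        results = shrunk;

    *out_count = count;
    return results;
}
```

This does a single pass over the input and at most one reallocation, at the cost of temporarily holding N elements on the heap.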
The best re-usable code is very often scalable. Unless obligated to use small sizes, assume the code will grow in subsequent applications and needs to be reasonably efficient for large N. You do want to re-use good code.
realloc() an array by a factor of 4 or so as the need arises. Arrays may also shrink - no need to have a bloated array lying around. I've used grow by a factor of 4 at intervals 1,4,16,64... and shrink at intervals 2,8,32,128... Keeping the grow and shrink intervals apart from each other avoids lots of activity should N waver around an interval.
Even in small ways, like using size_t vs. int. Sure, with sizes like 1000, it makes no difference, but with code re-use, an application may push the limit: size_t is better for array indexes.
/* Instead of: */
void *temp_array[N];
for (int i = 0; i < N; i++) {
/* prefer: */
void *temp_array[N];
for (size_t i = 0; i < N; i++) {
I have a dynamic 2d array inside this struct:
struct myStruct{
int mySize;
int **networkRep;
};
In my code block I use it as follows:
struct myStruct astruct[100];
astruct[0].networkRep = declareMatrix(astruct[0].networkRep, 200, 200);
// do stuff...
int i;
for(i=0; i<100; i++)
freeMatrix(astruct[i].networkRep, 200);
This is how I declare the 2d array:
int** declareMatrix(int **mymatrix, int rows, int columns)
{
mymatrix = (int**) malloc(rows*sizeof(int*));
if (mymatrix==NULL)
printf("Error allocating memory!\n");
int i,j;
for (i = 0; i < rows; i++)
mymatrix[i] = (int*) malloc(columns*sizeof(int));
for(i=0; i<rows; i++){
for(j=0; j<columns; j++){
mymatrix[i][j] = 0;
}
}
return mymatrix;
}
And this is how I free the 2d array:
void freeMatrix(int **matrix, int rows)
{
int i;
for (i = 0; i < rows; i++){
free(matrix[i]);
}
free(matrix);
matrix = NULL;
}
The strange behavior that I'm seeing is that when I compile and run my program everything looks OK. But when I pipe the stdout to a txt file, I'm getting a seg fault. However, the seg fault doesn't occur if I comment out the loop containing the "freeMatrix" call. What am I doing wrong?
I don't see any problem in the freeing code, except that freeMatrix gets called 100 times whereas you allocate only once.
So, either you allocate as below:
for(int i=0; i<100; i++) //Notice looping over 100 elements.
astruct[i].networkRep = declareMatrix(astruct[i].networkRep, 200, 200);
Or free only the 0th element, which is the one your original code allocates.
freeMatrix(astruct[0].networkRep, 200);
As a side note: initialize your astruct array.
struct myStruct astruct[100] = {0};
struct myStruct astruct[100];
astruct[0].networkRep = declareMatrix(astruct[0].networkRep, 200, 200);
// do stuff...
int i;
for(i=0; i<100; i++)
freeMatrix(astruct[i].networkRep, 200);
You allocated one astruct but free 100 of them; that will crash if any of the 99 extra ones isn't NULL, which probably happens when you do your redirection. (Since astruct is on the stack, it will contain whatever was left there.)
Other issues:
You're using numeric literals rather than manifest constants ... define NUMROWS and NUMCOLS and use them consistently.
Get rid of the first parameter to declareMatrix ... you pass a value but never use it.
In freeMatrix,
matrix = NULL;
does nothing. With optimization turned on, the compiler won't even generate any code.
if (mymatrix==NULL)
printf("Error allocating memory!\n");
You should exit(1) upon error, otherwise your program will crash and you may not even see the error message because a) stdout is buffered and b) you're redirecting it to a file. Which is also a reason to write error messages to stderr, not stdout.
astruct[0].networkRep = declareMatrix(astruct[0].networkRep, 200, 200);
You're not passing the address of the pointer. The call just passes the value currently in memory to the function, which is unnecessary.
And you're only initializing the first element of the struct array, but when you free the memory you are deallocating memory that was never allocated (astruct[1] and so on up to 99).
When you use malloc, it actually allocates a bit more memory than you specified. The extra memory is used to store information such as the size of the block, a link to the next free/used block, and sometimes some guard data that helps the system detect whether you write past the end of your allocated block.
If you pass in a different address, it will access memory that contains garbage, and hence its behaviour is undefined (but most frequently will result in a crash)
An unsigned integer type is enough for indexing and counting. size_t is the type of choice for this, as it is guaranteed to be large enough to address/index every byte of memory or array element on the target machine.
struct myStruct
{
size_t mySize;
int ** networkRep;
};
Always properly initialise variables:
struct myStruct astruct[100] = {0};
Several issues with the allocator:
Give it a chance to return specific error codes. This is typically done through the function's return value.
Use size_t for counters, indices and sizes ("rows", "columns"); for why, please see above.
Do proper error checking.
Clean up in case an error occurs during work.
Do not cast the value returned by malloc(); in C it is neither necessary nor recommended.
Use perror() to log errors, as it reports as much as possible of what the OS knows about the failure.
A possible way to do this:
int declareMatrix(int *** pmymatrix, size_t rows, size_t columns)
{
int result = 0; /* Be optimistic. */
assert(NULL != pmymatrix);
*pmymatrix = malloc(rows * sizeof(**pmymatrix));
if (NULL == *pmymatrix)
{
perror("malloc() failed");
result = -1;
goto lblExit;
}
{
size_t i, j;
for (i = 0; i < rows; ++i)
{
(*pmymatrix)[i] = NULL; /* So freeMatrix() is safe after a partial allocation. */
}
for (i = 0; i < rows; i++)
{
(*pmymatrix)[i] = malloc(columns * sizeof(***pmymatrix));
if (NULL == (*pmymatrix)[i])
{
perror("malloc() failed");
freeMatrix(pmymatrix, rows); /* Clean up. */
result = -1;
goto lblExit;
}
for(j = 0; j < columns; ++j)
{
(*pmymatrix)[i][j] = 0;
}
}
}
lblExit:
return result;
}
Two issues for the de-allocator:
Mark its work as done by properly de-initialising the pointer.
Perform validation of input prior to acting on it.
A possible way to do this:
void freeMatrix(int *** pmatrix, size_t rows)
{
if (NULL != pmatrix)
{
if (NULL != *pmatrix)
{
size_t i;
for (i = 0; i < rows; ++i)
{
free((*pmatrix)[i]);
}
}
free(*pmatrix);
*pmatrix = NULL;
}
}
Then use the stuff like this:
struct myStruct astruct[100] = {0};
...
int result = declareMatrix(&astruct[0].networkRep, 200, 200);
if (0 != result)
{
fprintf(stderr, "declareMatrix() failed.\n");
}
else
{
/* Note: Arriving here has only the 1st element of astruct initialised! */
// do stuff...
}
{
size_t i;
for(i = 0; i < 100; ++i)
{
freeMatrix(&astruct[i].networkRep, 200);
}
}
Need help in getting the following to work.
I have multiple producer threads (each writing, say, 100 bytes of data) to a ring buffer.
A single reader (consumer) thread reads 100 bytes at a time and writes them to stdout. (Eventually I want to write to files based on the data.)
With this implementation, the data read from the ring buffer is sometimes wrong; see below.
Since the ring buffer is small, it becomes full and some data is lost. This is not my current problem.
** Questions:
When printing the data read from the ring buffer, some data gets interchanged! I'm unable to find the bug.
Is the logic/approach correct, or is there a better way to do this?
ringbuffer.h
#define RING_BUFFER_SIZE 500
struct ringbuffer
{
char *buffer;
int wr_pointer;
int rd_pointer;
int size;
int fill_count;
};
ringbuffer.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "ringbuffer.h"
int init_ringbuffer(char *rbuffer, struct ringbuffer *rb, size_t size)
{
rb->buffer = rbuffer;
rb->size = size;
rb->rd_pointer = 0;
rb->wr_pointer = 0;
rb->fill_count = 0;
return 0;
}
int rb_get_free_space (struct ringbuffer *rb)
{
return (rb->size - rb->fill_count);
}
int rb_write (struct ringbuffer *rb, unsigned char * buf, int len)
{
int availableSpace;
int i;
availableSpace = rb_get_free_space(rb);
printf("In Write AVAIL SPC=%d\n",availableSpace);
/* Check if Ring Buffer is FULL */
if(len > availableSpace)
{
printf("NO SPACE TO WRITE - RETURN\n");
return -1;
}
i = rb->wr_pointer;
if(i == rb->size) //At the end of Buffer
{
i = 0;
}
else if (i + len > rb->size)
{
memcpy(rb->buffer + i, buf, rb->size - i);
buf += rb->size - i;
len = len - (rb->size - i);
rb->fill_count += len;
i = 0;
}
memcpy(rb->buffer + i, buf, len);
rb->wr_pointer = i + len;
rb->fill_count += len;
printf("w...rb->write=%tx\n", rb->wr_pointer );
printf("w...rb->read=%tx\n", rb->rd_pointer );
printf("w...rb->fill_count=%d\n", rb->fill_count );
return 0;
}
int rb_read (struct ringbuffer *rb, unsigned char * buf, int max)
{
int i;
printf("In Read,Current DATA size in RB=%d\n",rb->fill_count);
/* Check if Ring Buffer is EMPTY */
if(max > rb->fill_count)
{
printf("In Read, RB EMPTY - RETURN\n");
return -1;
}
i = rb->rd_pointer;
if (i == rb->size)
{
i = 0;
}
else if(i + max > rb->size)
{
memcpy(buf, rb->buffer + i, rb->size - i);
buf += rb->size - i;
max = max - (rb->size - i);
rb->fill_count -= max;
i = 0;
}
memcpy(buf, rb->buffer + i, max);
rb->rd_pointer = i + max;
rb->fill_count -= max;
printf("r...rb->write=%tx\n", rb->wr_pointer );
printf("r...rb->read=%tx\n", rb->rd_pointer );
printf("DATA READ ---> %s\n",(char *)buf);
printf("r...rb->fill_count=%d\n", rb->fill_count );
return 0;
}
The producer also needs to wait on a condition variable for the "has empty space" condition. Both condition variables should be signaled unconditionally: when a consumer removes an element from the ring buffer it should signal the producers, and when a producer puts something in the buffer it should signal the consumers.
Also, I would move this waiting/signaling logic into rb_read and rb_write implementations, so your ring buffer is a 'complete to use solution' for the rest of your program.
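A minimal sketch of a ring buffer that does its own waiting and signaling, along the lines suggested above. The struct sync_rb and the sync_rb_* functions are hypothetical names, not the question's API, and the byte-by-byte copy favors clarity over speed.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical blocking ring buffer; names are illustrative only. */
struct sync_rb {
    unsigned char *buf;
    size_t size, rd, wr, fill;
    pthread_mutex_t lock;
    pthread_cond_t not_full;    /* signaled after a read frees space */
    pthread_cond_t not_empty;   /* signaled after a write adds data */
};

int sync_rb_init(struct sync_rb *rb, unsigned char *storage, size_t size)
{
    rb->buf = storage;
    rb->size = size;
    rb->rd = rb->wr = rb->fill = 0;
    if (pthread_mutex_init(&rb->lock, NULL) != 0) return -1;
    if (pthread_cond_init(&rb->not_full, NULL) != 0) return -1;
    if (pthread_cond_init(&rb->not_empty, NULL) != 0) return -1;
    return 0;
}

/* Blocks until len bytes fit, then copies them in. Requires len <= size. */
int sync_rb_write(struct sync_rb *rb, const unsigned char *src, size_t len)
{
    if (len > rb->size) return -1;
    pthread_mutex_lock(&rb->lock);
    while (rb->size - rb->fill < len)          /* loop guards against spurious wakeups */
        pthread_cond_wait(&rb->not_full, &rb->lock);
    for (size_t i = 0; i < len; ++i) {
        rb->buf[rb->wr] = src[i];
        rb->wr = (rb->wr + 1) % rb->size;      /* wrap around at the end */
    }
    rb->fill += len;
    pthread_cond_broadcast(&rb->not_empty);
    pthread_mutex_unlock(&rb->lock);
    return 0;
}

/* Blocks until len bytes are available, then copies them out. */
int sync_rb_read(struct sync_rb *rb, unsigned char *dst, size_t len)
{
    if (len > rb->size) return -1;
    pthread_mutex_lock(&rb->lock);
    while (rb->fill < len)
        pthread_cond_wait(&rb->not_empty, &rb->lock);
    for (size_t i = 0; i < len; ++i) {
        dst[i] = rb->buf[rb->rd];
        rb->rd = (rb->rd + 1) % rb->size;
    }
    rb->fill -= len;
    pthread_cond_broadcast(&rb->not_full);
    pthread_mutex_unlock(&rb->lock);
    return 0;
}
```

Since every access to rd, wr and fill happens under the same mutex, several producers can call sync_rb_write concurrently without interleaving their records. Broadcast is used rather than signal so that a waiter whose request still does not fit cannot swallow the only wakeup.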
As to your questions --
1. I can't find that bug either -- in fact, I've tried your code and don't see that behavior.
2. You ask if this is logic/approach correct -- well, as far as it goes, this does implement a kind of ring buffer. Your test case happens to have an integer multiple of the size, and the record size is constant, so that's not the best test.
In trying your code, I found that there is a lot of thread starvation -- the 1st producer thread to run (the last created) hits things really hard, trying and failing after the 1st 5 times to stuff things into the buffer, not giving the consumer thread a chance to run (or even start). Then, when the consumer thread starts, it stays cranking for quite some time before it releases the cpu, and the next producer thread finally starts. That's how it works on my machine -- it will be different on different machines, I'm sure.
It's too bad that your current code doesn't have a way to end -- creating files of 10's or 100's of MB ... hard to wade through.
(Probably a bit late for the author, but in case anyone else searches for "multiple producers single consumer".)
I think the fundamental problem in that implementation is that rb_write modifies shared state (rb->fill_count and the other rb->XX fields) without any synchronization between multiple writers.
For alternative ideas check the: http://www.linuxjournal.com/content/lock-free-multi-producer-multi-consumer-queue-ring-buffer.
This question is about the best practices to handle this pointer problem I've dug myself into.
I have an array of structures that is dynamically generated in a function that reads a csv.
int init_from_csv(instance **instances,char *path) {
... open file, get line count
*instances = (instance*) malloc( (size_t) sizeof(instance) * line_count );
... parse and set values of all instances
return count_of_valid_instances_read;
}
// in main()
instance *instances;
int ins_len = init_from_csv(&instances, "some/path/file.csv");
Now, I have to perform functions on this raw data, split it, and perform the same functions again on the splits. This data set can be fairly large so I do not want to duplicate the instances, I just want an array of pointers to structs that are in the split.
instance **split = (instance**) malloc (sizeof(instance*) * split_len_max);
int split_function(instance *instances, int ins_len, instance **split){
int i, c;
c = 0;
for (i = 0; i < ins_len; i++) {
if (some_criteria_is_true) {
split[c++] = &instances[i];
}
}
return c;
}
Now my question what would be the best practice or most readable way to perform a function on both the array of structs and the array of pointers? For a simple example count_data().
int count_data (instance **ins, int ins_len, float crit) {
int i,c;
c = 0;
for (i = 0; i < ins_len; i++) {
if (ins[i]->data > crit) {
++c;
}
}
return c;
}
// code smell-o-vision going off by now
int c1 = count_data (split, ins_len, 0.05); // works
int c2 = count_data (&instances, ins_len, 0.05); // obviously seg faults
I could make my init_from_csv malloc an array of pointers to instances, and then malloc my array of instances. I want to learn how a seasoned C programmer would handle this sort of thing, though, before I start changing a bunch of code.
This might seem a bit grungey, but if you really want to pass that instances** pointer around and want it to work for both the main data set and the splits, you really need to make an array of pointers for the main data set too. Here's one way you could do it...
size_t i, mem_reqd;
instance **list_seg, *data_seg;
/* Allocate list and data segments in one large block */
mem_reqd = (sizeof(instance*) + sizeof(instance)) * line_count;
list_seg = (instance**) malloc( mem_reqd );
data_seg = (instance*) &list_seg[line_count];
/* Index into the data segment */
for( i = 0; i < line_count; i++ ) {
list_seg[i] = &data_seg[i];
}
*instances = list_seg;
Now you can always operate on an array of instance* pointers, whether it's your main list or a split. I know you didn't want to use extra memory, but if your instance struct is not trivially small, then allocating an extra pointer for each instance to prevent confusing code duplication is a good idea.
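To make the point concrete, here is a small sketch in which one count_data serves both the main list and a split. The instance type with a single float field and the helper name alloc_indexed are assumptions for illustration; the layout also assumes that instance needs no stricter alignment than a pointer, which holds for this toy struct but should be checked for a real one.

```c
#include <stdlib.h>

/* Hypothetical record type; the question's instance has at least a data field. */
typedef struct { float data; } instance;

/* Allocates n pointers followed by n instances in one block; free() it once. */
instance **alloc_indexed(size_t n)
{
    instance **list = malloc(n * (sizeof(instance *) + sizeof(instance)));
    if (list == NULL)
        return NULL;
    instance *data = (instance *)&list[n];   /* data segment starts after the pointers */
    for (size_t i = 0; i < n; ++i)
        list[i] = &data[i];
    return list;
}

/* Works on the main list and on any split, since both are arrays of instance*. */
int count_data(instance **ins, size_t len, float crit)
{
    int c = 0;
    for (size_t i = 0; i < len; ++i)
        if (ins[i]->data > crit)
            ++c;
    return c;
}
```

A split is then just another array of instance* whose entries point into the main block, so it must not outlive the main list.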
When you're done with your main instance list, you can do this:
void free_instances( instance** instances )
{
free( instances );
}
I would be tempted to implement this as a struct:
typedef struct instance_list {
instance ** data;
size_t length;
int owner;
} instance_list;
That way, you can return this from your functions in a nicer way:
instance_list* alloc_list( size_t length, int owner )
{
size_t i, mem_reqd;
instance_list *list;
instance *data_seg;
/* Allocate list and data segments in one large block */
mem_reqd = sizeof(instance_list) + sizeof(instance*) * length;
if( owner ) mem_reqd += sizeof(instance) * length;
list = (instance_list*) malloc( mem_reqd );
list->data = (instance**) &list[1];
list->length = length;
list->owner = owner;
/* Index the list */
if( owner ) {
data_seg = (instance*) &list->data[length];
for( i = 0; i < length; i++ ) {
list->data[i] = &data_seg[i];
}
}
return list;
}
void free_list( instance_list * list )
{
free(list);
}
void erase_list( instance_list * list )
{
if( list->owner ) return;
memset((void*)list->data, 0, sizeof(instance*) * list->length);
}
Now, your function that loads from CSV doesn't have to focus on the details of creating this monster, so it can simply do the task it's supposed to do. You can now return lists from other functions, whether they contain the data or simply point into other lists.
instance_list* load_from_csv( char *path )
{
/* get line count... */
instance_list *list = alloc_list( line_count, 1 );
/* parse csv ... */
return list;
}
etc... Well, you get the idea. No guarantees this code will compile or work, but it should be close. I think it's important, whenever you're doing something with arrays that's even slightly more complicated than just a simple array, it's useful to make that tiny extra effort to encapsulate it. This is the major data structure you'll be working with for your analysis or whatever, so it makes sense to give it a little bit of stature in that it has its own data type.
I dunno, was that overkill? =)