Reading and writing in-memory data in C

In the following example, I am taking a struct that occupies 32 bytes in memory and writing that to a file and reading it back in -- i.e., serializing the data to a binary format:
#include <stdio.h>
typedef struct _Person {
char name[20];
int age;
double weight;
} Person;
int main(void)
{
Person tom = (Person) {.name="Tom", .age=20, .weight=125.0};
// write the struct to a binary file
FILE *fout = fopen("person.b", "wb");
fwrite(&tom, sizeof tom, 1, fout);
fclose(fout);
// read the binary data and set the person to that
Person unknown;
FILE *fin = fopen("person.b", "rb");
fread(&unknown, sizeof unknown, 1, fin);
fclose(fin);
// confirm all looks ok
printf("{name=%s, age=%d, weight=%f}", unknown.name, unknown.age, unknown.weight);
}
Note, however, that these are all values on the stack and no pointers or indirection are involved. How might data be serialized to a file when, for example, multiple pointers are involved, or multiple variables point to the same memory location? Is this effectively what protocol buffers do?

OK, so you want a binary file. I used to do it this way long ago, and it's fine, but it breaks when you move to another platform or bitness. I'm showing the old way because it's a good place to start; newer formats are popular now because they keep working across platforms and bit widths.
When writing records to the file I would use structs like this:
typedef struct _Person {
char name[20];
int age;
double weight;
} Person;
typedef struct _Thing {
char name[20];
} Thing;
typedef struct _Owner {
int personId;
int thingId;
} Owner;
See how the Owner structure has Id members. These are just indices into the arrays of the other structures.
These can be written out to a file one after another, each array prefixed by a single integer that says how many records of that kind follow. The reader just allocates an array of structs with malloc big enough to hold them. As we add more items in memory we grow the arrays with realloc. We can (and should) also mark records as deleted (say, by setting the first character of name to 0) so the slot can be reused later.
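For example, resolving an Owner back to the records it refers to is just array indexing, and growing or soft-deleting records can be done as sketched below (my own illustration using the structs above; needs <stdlib.h> for realloc):

// Resolve an Owner record back to the structs it points at by index.
Person *ownerPerson(Owner *o, Person *allPeople) { return &allPeople[o->personId]; }
Thing  *ownerThing(Owner *o, Thing *allThings)   { return &allThings[o->thingId]; }

// Append a Person, growing the array with realloc when it is full.
// nPeople is how many slots are used, aPeople is how many are allocated.
int addPerson(Person **allPeople, int *nPeople, int *aPeople, Person p)
{
    if (*nPeople == *aPeople) {
        int newCap = *aPeople ? *aPeople * 2 : 16;
        Person *tmp = realloc(*allPeople, sizeof(**allPeople) * newCap);
        if (!tmp) return 0; // OOM; the old array is still valid
        *allPeople = tmp;
        *aPeople = newCap;
    }
    (*allPeople)[(*nPeople)++] = p;
    return 1;
}

// "Delete" a record by tombstoning it; the slot can be reused later.
void deletePerson(Person *p) { p->name[0] = 0; }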
The writer looks something like this:
void writeall(FILE *h, Person *allPeople, int nPeople, Thing *allThings, int nThings, Owner *allOwners, int nOwners)
{
// Error checking omitted for brevity
fwrite(&nPeople, sizeof(nPeople), 1, h);
fwrite(allPeople, sizeof(*allPeople), nPeople, h);
fwrite(&nThings, sizeof(nThings), 1, h);
fwrite(allThings, sizeof(*allThings), nThings, h);
fwrite(&nOwners, sizeof(nOwners), 1, h);
fwrite(allOwners, sizeof(*allOwners), nOwners, h);
}
The reader in turn looks like this:
int readall(FILE *h, Person **allPeople, int *nPeople, int *aPeople, Thing **allThings, int *nThings, int *aThings, Owner **allOwners, int *nOwners, int *aOwners)
{
// Returns 1 on success, 0 on failure
*aPeople = 0; // Don't crash on bad read
*aThings = 0;
*aOwners = 0;
*allPeople = NULL;
*allThings = NULL;
*allOwners = NULL;
if (1 != fread(nPeople, sizeof(*nPeople), 1, h)) return 0;
*allPeople = malloc(sizeof(**allPeople) * *nPeople);
if (!*allPeople) return 0; // OOM
*aPeople = *nPeople;
if ((size_t)*nPeople != fread(*allPeople, sizeof(**allPeople), *nPeople, h)) return 0;
if (1 != fread(nThings, sizeof(*nThings), 1, h)) return 0;
*allThings = malloc(sizeof(**allThings) * *nThings);
if (!*allThings) return 0; // OOM
*aThings = *nThings;
if ((size_t)*nThings != fread(*allThings, sizeof(**allThings), *nThings, h)) return 0;
if (1 != fread(nOwners, sizeof(*nOwners), 1, h)) return 0;
*allOwners = malloc(sizeof(**allOwners) * *nOwners);
if (!*allOwners) return 0; // OOM
*aOwners = *nOwners;
if ((size_t)*nOwners != fread(*allOwners, sizeof(**allOwners), *nOwners, h)) return 0;
return 1;
}
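Tying the two together, a usage sketch of my own (not part of the original answer) might look like this, assuming the structs, writeall and readall above plus <stdio.h> and <stdlib.h>:

int main(void)
{
    Person people[] = { {.name="Tom", .age=20, .weight=125.0} };
    Thing  things[] = { {.name="Bike"} };
    Owner  owners[] = { {.personId=0, .thingId=0} }; // Tom owns the bike

    FILE *out = fopen("records.b", "wb");
    if (!out) return 1;
    writeall(out, people, 1, things, 1, owners, 1);
    fclose(out);

    Person *p; Thing *t; Owner *o;
    int nP, aP, nT, aT, nO, aO;
    FILE *in = fopen("records.b", "rb");
    if (!in) return 1;
    if (!readall(in, &p, &nP, &aP, &t, &nT, &aT, &o, &nO, &aO)) {
        fprintf(stderr, "bad or truncated file\n");
        fclose(in);
        free(p); free(t); free(o); // readall NULLs these first, so this is safe
        return 1;
    }
    fclose(in);

    // Resolve the Owner record through its indices.
    printf("%s owns %s\n", p[o[0].personId].name, t[o[0].thingId].name);
    free(p); free(t); free(o);
    return 0;
}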
There was an old technique for writing a heap arena directly to disk and reading it back again. I recommend never using it, and never storing pointers on disk. That way lies security nightmares.
Back when memory was scarce I would have talked about how to use block allocation and linked blocks to partially update the on-disk records in place; but for the problems you're going to encounter at this level I say don't bother, and just read the whole thing into RAM and write it back out again. Eventually you will learn databases, which handle that stuff for you.
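As a small taste of the "newer ways" mentioned at the top of this answer: instead of dumping raw struct bytes, you can write each field with a fixed width and byte order, which survives a change of platform, compiler or bitness. A rough sketch of my own, not part of the original answer:

#include <stdint.h>
#include <stdio.h>

// Write a 32-bit value as 4 little-endian bytes, regardless of host endianness.
static void write_u32(FILE *h, uint32_t v)
{
    unsigned char b[4] = { (unsigned char)(v & 0xff), (unsigned char)((v >> 8) & 0xff),
                           (unsigned char)((v >> 16) & 0xff), (unsigned char)((v >> 24) & 0xff) };
    fwrite(b, 1, 4, h);
}

static void write_person(FILE *h, const Person *p)
{
    fwrite(p->name, 1, sizeof p->name, h); // fixed 20-byte name field
    write_u32(h, (uint32_t)p->age);
    // doubles are trickier: one common choice is to memcpy the double into a
    // uint64_t and write those 8 bytes in a fixed order the same way.
}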

Related

How do I write a serializer in C using MessagePack (Mpack)

I would like to write a serializer application that will encode any type of C data structure into MessagePack format. All examples I have seen show the encoding of known structures such as this using MPack:
// encode to memory buffer
char* data;
size_t size;
mpack_writer_t writer;
mpack_writer_init_growable(&writer, &data, &size);
// write the example on the msgpack homepage
mpack_start_map(&writer, 2);
mpack_write_cstr(&writer, "compact");
mpack_write_bool(&writer, true);
mpack_write_cstr(&writer, "schema");
mpack_write_uint(&writer, 0);
mpack_finish_map(&writer);
// finish writing
if (mpack_writer_destroy(&writer) != mpack_ok) {
fprintf(stderr, "An error occurred encoding the data!\n");
return;
}
// use the data
do_something_with_data(data, size);
free(data);
But what I would like is to be able to encode any C data structure. For example if I had the following:
struct My_Struct{
int number1;
float number2;
char array[6];
};
struct My_Struct ms = {10, 4.44, "Hello"};
Programmatically, how would I know that the first 4 bytes represent an int, so that I could call mpack_write_int to pack that int into MessagePack format?
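One common workaround for C's lack of reflection is to describe each struct by hand with a table of field descriptors built from offsetof, and drive the MPack calls from that table. A rough, hypothetical sketch (the field_desc type, the table and write_described are my own; only the mpack_* calls come from the MPack API), assuming struct My_Struct from above:

#include <stddef.h>  // offsetof
#include <stdint.h>
#include "mpack.h"

enum field_type { F_INT, F_FLOAT, F_CSTR };

struct field_desc {
    const char *name;
    enum field_type type;
    size_t offset;
};

// Hand-written description of My_Struct; C will not generate this for you.
static const struct field_desc my_struct_fields[] = {
    { "number1", F_INT,   offsetof(struct My_Struct, number1) },
    { "number2", F_FLOAT, offsetof(struct My_Struct, number2) },
    { "array",   F_CSTR,  offsetof(struct My_Struct, array)   },
};

static void write_described(mpack_writer_t *w, const void *obj,
                            const struct field_desc *fields, size_t nfields)
{
    mpack_start_map(w, (uint32_t)nfields);
    for (size_t i = 0; i < nfields; i++) {
        const char *base = (const char *)obj + fields[i].offset;
        mpack_write_cstr(w, fields[i].name);
        switch (fields[i].type) {
        case F_INT:   mpack_write_int(w, *(const int *)base);     break;
        case F_FLOAT: mpack_write_float(w, *(const float *)base); break;
        case F_CSTR:  mpack_write_cstr(w, base);                  break;
        }
    }
    mpack_finish_map(w);
}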

Two-dimensional char array too large exit code 139

Hey guys, I'm attempting to read in workersinfo.txt and store it in a two-dimensional char array. The file is around 4,000,000 lines with around 100 characters per line. I want to store each file line in the array. Unfortunately, I get exit code 139 (not enough memory). I'm aware I have to use malloc() and free(), but I've tried a couple of things and I haven't been able to make them work. Eventually I have to sort the array by ID number, but I'm stuck on declaring the array.
The file looks something like this:
First Name, Last Name,Age, ID
Carlos,Lopez,,10568
Brad, Patterson,,20586
Zack, Morris,42,05689
This is my code so far:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *ptr_file;
char workers[4000000][1000];
ptr_file =fopen("workersinfo.txt","r");
if (!ptr_file)
perror("Error");
int i = 0;
while (fgets(workers[i],1000, ptr_file)!=NULL){
i++;
}
int n;
for(n = 0; n < 4000000; n++)
{
printf("%s", workers[n]);
}
fclose(ptr_file);
return 0;
}
The Stack memory is limited. As you pointed out in your question, you MUST use malloc to allocate such a big (need I say HUGE) chunk of memory, as the stack cannot contain it.
You can use ulimit to review the limits of your system (usually including the stack size limit).
On my Mac, the limit is 8Mb. After running ulimit -a I get:
...
stack size (kbytes, -s) 8192
...
Or, test the limit programmatically:
#include <sys/resource.h>
struct rlimit rlim;
getrlimit(RLIMIT_STACK, &rlim);
// rlim.rlim_cur is the current (soft) stack size limit
I truly recommend you process each database entry separately.
As mentioned in the comments, assigning the memory as static memory would, in most implementations, circumvent the stack.
Still, IMHO, allocating 400MB of memory (or 4GB, depending on which part of your question I look at) is bad form unless totally required - especially for a single function.
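If you really do want the whole file in memory at once, a minimal sketch of moving the array from the stack to the heap (sizes taken from your code) would be:

#include <stdio.h>
#include <stdlib.h>

#define MAX_LINES 4000000
#define LINE_LEN  1000

int main(void) {
    // One big heap allocation instead of a stack array - roughly 4GB here,
    // so check the result and be sure the machine can actually hold it.
    char (*workers)[LINE_LEN] = malloc(sizeof(*workers) * MAX_LINES);
    if (!workers) {
        perror("malloc");
        return 1;
    }
    FILE *fp = fopen("workersinfo.txt", "r");
    if (!fp) {
        perror("fopen");
        free(workers);
        return 1;
    }
    int i = 0;
    while (i < MAX_LINES && fgets(workers[i], LINE_LEN, fp))
        i++;
    fclose(fp);
    printf("read %d lines\n", i);
    free(workers);
    return 0;
}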
Follow-up Q1: How to deal with each DB entry separately
I hope I'm not doing your homework or anything... but I doubt your homework would include an assignment to load 400MB of data into the computer's memory... so, to answer the question in your comment:
The following sketch of single-entry processing isn't perfect - it's limited to 1KB of data per entry (which I thought to be more than enough for such simple data).
Also, I didn't allow for UTF-8 encoding or anything like that (I assumed English would be used).
As you can see from the code, we read each line separately and perform error checks to make sure the data is valid.
To sort the file by ID, you might consider either reading two lines at a time and sorting them (this would be a slow sort), or creating a sorted node tree holding the ID and the position of the line in the file (get the position before reading the line). Once the tree is built, you can write the data out in sorted order...
... The binary tree might get a bit big, though. Did you look up sorting algorithms?
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the age - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 1 on success or 0 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return 0;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' separators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a divider; we should handle it.
// if the ID field's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
return 0;
}
// replace the comma with 0 (separate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the ID string's position (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 1;
} else {
// we reached an EOL without enough ',' separators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
return 0;
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return 0;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE* ptr_file;
ptr_file = fopen("workersinfo.txt", "r");
if (!ptr_file) {
perror("File Error");
return 1;
}
struct DBEntry line;
while (read_db_line(ptr_file, &line)) {
// do what you want with the data... print it?
printf(
"First name:\t%s\n"
"Last name:\t%s\n"
"Age:\t\t%s\n"
"ID:\t\t%s\n"
"--------\n",
line.first_name, line.last_name, line.age, line.id);
}
// close file
fclose(ptr_file);
return 0;
}
Followup Q2: Sorting array for 400MB-4GB of data
IMHO, 400MB is already touching on the issues related to big data. For example, implementing a bubble sort on your database should be agonizing as far as performance goes (unless it's a single time task, where performance might not matter).
Creating an array of DBEntry objects will eventually give you a larger memory footprint than the actual data...
This is not the optimal way to sort large data.
The correct approach will depend on your sorting algorithm. Wikipedia has a decent primer on sorting algorithms.
Since we are handling a large amount of data, there are a few things to consider:
It would make sense to partition the work, so different threads/processes sort a different section of the data.
We will need to minimize IO to the hard drive (as it will slow the sorting significantly and prevent parallel processing on the same machine/disk).
One possible approach is to create a heap for a heap sort, but only storing a priority value and storing the original position in the file.
Another option would probably be to employ a divide and conquer algorithm, such as quicksort, again, only sorting a computed sort value and the entry's position in the original file.
Either way, writing a decent sorting method will be a complicated task, probably involving threading, forking, tempfiles or other techniques.
Here's some simplified demo code... it is far from optimized, but it demonstrates the idea of the binary sort-tree that holds the sorting value and the position of the data in the file.
Be aware that using this code will be both relatively slow (although not that slow) and memory intensive...
On the other hand, it requires only about 24 bytes per entry. For 4 million entries, that's 96MB - somewhat better than 400MB and definitely better than 4GB.
#include <stdlib.h>
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the age - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// this might be a sorting node for a sorted bin-tree:
struct SortNode {
struct SortNode* next; // a pointer to the next node
fpos_t position; // the DB entry's position in the file
long value; // The computed sorting value
}* top_sorting_node = NULL;
// this function will free all the memory used by the global Sorting tree
void clear_sort_heap(void) {
struct SortNode* node;
// as long as there is a first node...
while ((node = top_sorting_node)) {
// step forward.
top_sorting_node = top_sorting_node->next;
// free the original first node's memory
free(node);
}
}
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 0 on success or -1 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return -1;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' separators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a divider; we should handle it.
// if the ID field's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
clear_sort_heap();
exit(2);
}
// replace the comma with 0 (separate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the ID string's position (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 0;
} else {
// we reached an EOL without enough ',' separators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
clear_sort_heap();
exit(1);
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return -1;
}
// read and sort a single line from the database.
// return 0 if there was no data to sort. return 1 if data was read and sorted.
int sort_line(FILE* fp) {
// allocate the memory for the node - use calloc for zeroed-out data
struct SortNode* node = calloc(1, sizeof(*node));
if (!node)
return 0; // OOM - treat it as "no more data to sort"
// store the position in the file
fgetpos(fp, &node->position);
// use a stack allocated DBEntry for processing
struct DBEntry line;
// check that the read succeeded (read_db_line will return -1 on error)
if (read_db_line(fp, &line)) {
// free the node's memory
free(node);
// return no data (0)
return 0;
}
// compute sorting value - I'll assume all IDs are numbers up to long size.
sscanf(line.id, "%ld", &node->value);
// heap sort?
// This is a questionable sort algorithm... or a questionable implementation.
// Also, I'll be using pointers to pointers, so it might be a headache to read
// (it's a headache to write, too...) ;-)
struct SortNode** tmp = &top_sorting_node;
// walk along the list until we encounter a node with a value greater than ours,
// OR until the list is finished.
while (*tmp && (*tmp)->value <= node->value)
tmp = &((*tmp)->next);
// update the node's `next` value.
node->next = *tmp;
// inject the new node into the tree at the position we found
*tmp = node;
// return 1 (data was read and sorted)
return 1;
}
// writes the next line in the sorting
int write_line(FILE* to, FILE* from) {
struct SortNode* node = top_sorting_node;
if (!node) // are we done? top_sorting_node == NULL ?
return 0; // return 0 - no data to write
// step top_sorting_node forward
top_sorting_node = top_sorting_node->next;
// read data from one file to the other
fsetpos(from, &node->position);
char* buffer = NULL;
ssize_t length;
size_t buff_size = 0;
length = getline(&buffer, &buff_size, from);
if (length <= 0) {
perror("Line Copy Error - Couldn't read data");
return 0;
}
fwrite(buffer, 1, length, to);
free(buffer); // getline allocates memory that we're in charge of freeing.
return 1;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE *fp_read, *fp_write;
fp_read = fopen("workersinfo.txt", "r");
fp_write = fopen("sorted_workersinfo.txt", "w+");
if (!fp_read) {
perror("File Error");
goto cleanup;
}
if (!fp_write) {
perror("File Error");
goto cleanup;
}
printf("\nSorting");
while (sort_line(fp_read))
printf(".");
// write all sorted data to a new file
printf("\n\nWriting sorted data");
while (write_line(fp_write, fp_read))
printf(".");
// clean up - close files and make sure the sorting tree is cleared
cleanup:
printf("\n");
if (fp_read)
fclose(fp_read);
if (fp_write)
fclose(fp_write);
clear_sort_heap();
return 0;
}

Allocating memory for struct within a struct in cycle

I'm working on an INI-style configuration parser for a project, and I've run into the following problem.
I have 3 structures:
typedef struct {
const char* name;
unsigned tract;
int channel;
const char* imitation_type;
} module_config;
typedef struct {
int channel_number;
int isWorking;
int frequency;
int moduleCount;
} channel_config;
typedef struct {
int mode;
module_config* module;
channel_config* channel;
} settings;
And I have a function for handling the data in my INI file (I'm working with the inih parser): [pasted to pastebin because it's too long]. Finally, in main(), I did the following:
settings* main_settings;
main_settings = (settings*)malloc(sizeof(settings));
main_settings->module = (module_config*)malloc(sizeof(module_config));
main_settings->channel = (channel_config*)malloc(sizeof(channel_config));
if (ini_parse("test.ini", handler, &main_settings) < 0) {
printf("Can't load 'test.ini'\n");
return 1;
}
As a result, the binary crashes with a memory fault. I think (no, I KNOW) that I'm allocating the memory incorrectly in handler(), but I don't understand where I'm doing it wrong. I spent all night trying to understand memory allocation, and I'm very tired, but now I'm simply curious what I'm doing wrong and HOW to make this work.
P.S. Sorry for my ugly English
The problem seems to be related to the reallocation of your structs:
pconfig = (settings *) realloc(pconfig, (module_count + channel_count) * sizeof(channel_config));
pconfig->module = (module_config *) realloc(pconfig->module, module_count * sizeof(module_config));
pconfig->channel = (channel_config *) realloc(pconfig->channel, channel_count * sizeof(channel_config));
First of all, you must not reallocate the main settings struct. Since your handler will always be called with the original pconfig value, the reallocation of the module and channel arrays has no effect, and you'll access freed memory.
Also when reallocating the module and channel arrays you should allocate count + 1 elements, since the next invocation of handler might assign to the [count] slot.
So try to replace the three lines above with:
pconfig->module = (module_config *) realloc(pconfig->module, (module_count + 1) * sizeof(module_config));
pconfig->channel = (channel_config *) realloc(pconfig->channel, (channel_count + 1) * sizeof(channel_config));
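As a side note (my addition, not part of the original answer): assigning the result of realloc straight back to the same pointer loses the old block if realloc fails. A safer sketch of the count + 1 growth pattern uses a temporary pointer:

// Grow the module array by one slot without leaking on allocation failure.
module_config *tmp = realloc(pconfig->module, (module_count + 1) * sizeof(*tmp));
if (!tmp) {
    // out of memory: pconfig->module is still valid here, handle the error
    return 0;
}
pconfig->module = tmp;
pconfig->module[module_count] = (module_config){0}; // start the new slot zeroed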

Timing behavior of null pointers in C

The below is mostly tested for Microsoft CL Version 17.00.50727.1 on Windows 7, but I see something similar with g++. I'm quite sure that the logical function is correct. The question is only about the timing.
Essentially I have a function that dynamically returns new "blocks" of data as needed. If it runs out of space in its "page", it makes a new page.
The purpose of the blocks is to match incoming data keys. If the key is found, that's great. If not, then a new data key is added to the block. If the block runs out of space, then a new block is created and linked to the old one.
The code works whether or not the block-making function explicitly sets the "next" pointer in the new block to NULL. The theory is that calloc() has set the content to 0 already.
The first odd thing is that the block-making function takes about 5 times(!) longer to run when that "next" pointer is explicitly set to NULL. However, when that is done, the timing of the overall example behaves as expected: it takes linearly longer to match a new key the more entries there are in the key list. The only difference occurs when a key is added that causes a new block to be fetched; the overhead of doing that is similar to the time taken to call the block-making function.
The only problem is that the block-making function is then unacceptably slow.
When the pointer is NOT explicitly set to NULL, the block-making function becomes nice and fast -- maybe half to a quarter of the time of the key-matching function, instead of as long or even longer.
But then the key-matching function starts to exhibit odd timing behavior. It does mostly increase linearly with the number of keys. It still has jumps at 16 and 32 keys (since the list length is 16). But it also has a large jump at key number 0, and it has large jumps at keys number 17, 33 etc.
These are the key numbers when the program first has to look at the "next" pointer. Apparently it takes a long time to figure out that the 0 value from calloc is really a NULL pointer? Once it knows this, the next times are faster.
The second weird thing is that the effect goes away if the data struct consists exclusively of the key. Now the jumps at 0, 17, 33 etc. go away whether or not the "next" pointer is explicitly set to NULL. But when "int unused[4]" is also in the struct, then the effect returns.
Maybe the compiler (with option /O2 or with -O3 for g++) optimizes away the struct when it consists of a single number? But I still don't see why that would affect the timing behavior in this way.
I've tried to simplify the example as much as I could from the real code, but I'm sorry that it's still quite long. It's not that complicated, though.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#ifdef _WIN32
#include <windows.h>
#endif
void timer_start(int n);
void timer_end(int n);
void print_times();
// There are pages of blocks, and data entries per block.
// We don't know ahead of time how many there will be.
// GetNextBlock() returns new blocks, and if necessary
// makes new pages. MatchOrStore() goes through data in
// a block to try to match the key. It won't ever match
// in this example, so the function makes a new data entry.
struct dataType
{
// Surprise number 1: If the line with "unused" is
// commented out, things behave as expected, even if
// Surprise number 2 is in effect.
int unused[4];
int key;
};
#define DATA_PER_BLOCK 16
struct blockType
{
char nextEntryNo;
struct dataType list[DATA_PER_BLOCK];
struct blockType * next;
};
struct pageType
{
int nextBlockNo;
struct blockType * list;
struct pageType * next;
struct pageType * prev;
};
struct blockType * GetNextBlock();
void MatchOrStore(
struct dataType * dp,
struct blockType * bp);
struct pageType * pagep;
int main(int argc, char * argv[])
{
pagep = (struct pageType *) 0;
struct dataType data;
for (int j = 0; j < 50000; j++)
{
struct blockType * blockp = GetNextBlock();
// Make different keys each time.
for (data.key = 0; data.key < 40; data.key++)
{
// One timer per key number, useful for statistics.
timer_start(data.key);
MatchOrStore(&data, blockp);
timer_end(data.key);
}
}
print_times();
exit(0);
}
#define BLOCKS_PER_PAGE 5000
struct blockType * GetNextBlock()
{
if (pagep == NULL ||
pagep->nextBlockNo == BLOCKS_PER_PAGE)
{
// If this runs out of page space, it makes some more.
struct pageType * newpagep = (struct pageType *)
calloc(1, sizeof(struct pageType));
newpagep->list = (struct blockType *)
calloc(BLOCKS_PER_PAGE, sizeof(struct blockType));
// I never actually free this, but you get the idea.
newpagep->nextBlockNo = 0;
newpagep->next = NULL;
newpagep->prev = pagep;
if (pagep)
pagep->next = newpagep;
pagep = newpagep;
}
struct blockType * bp = &pagep->list[ pagep->nextBlockNo++ ];
// Surprise number 2: If this line is active, then the
// timing behaves as expected. If it is commented out,
// then presumably calloc still sets next to NULL.
// But the timing changes in an unexpected way.
// bp->next = (struct blockType *) 0;
return bp;
}
void MatchOrStore(
struct dataType * dp,
struct blockType * blockp)
{
struct blockType * bp = blockp;
while (1)
{
for (int i = 0; i < bp->nextEntryNo; i++)
{
// This will spend some time traversing the list of
// blocks, failing to find the key, because that's
// the way I've set up the data for this example.
if (bp->list[i].key != dp->key) continue;
// It will never match.
return;
}
if (! bp->next) break;
bp = bp->next;
}
if (bp->nextEntryNo == DATA_PER_BLOCK)
{
// Once in a while it will run out of space, so it
// will make a new block and add it to the list.
timer_start(99);
struct blockType * bptemp = GetNextBlock();
bp->next = bptemp;
bp = bptemp;
timer_end(99);
}
// Since it didn't find the key, it will store the key
// in the list here.
bp->list[ bp->nextEntryNo++ ].key = dp->key;
}
#define NUM_TIMERS 100
#ifdef _WIN32
#include <time.h>
LARGE_INTEGER
tu0[NUM_TIMERS],
tu1[NUM_TIMERS];
#else
#include <sys/time.h>
struct timeval
tu0[NUM_TIMERS],
tu1[NUM_TIMERS];
#endif
int ctu[NUM_TIMERS],
number[NUM_TIMERS];
void timer_start(int no)
{
number[no]++;
#ifdef _WIN32
QueryPerformanceCounter(&tu0[no]);
#else
gettimeofday(&tu0[no], NULL);
#endif
}
void timer_end(int no)
{
#ifdef _WIN32
QueryPerformanceCounter(&tu1[no]);
ctu[no] += (tu1[no].QuadPart - tu0[no].QuadPart);
#else
gettimeofday(&tu1[no], NULL);
ctu[no] += 1000000 * (tu1[no].tv_sec - tu0[no].tv_sec )
+ (tu1[no].tv_usec - tu0[no].tv_usec);
#endif
}
void print_times()
{
printf("%5s %10s %10s %8s\n",
"n", "Number", "User ticks", "Avg");
for (int n = 0; n < NUM_TIMERS; n++)
{
if (number[n] == 0)
continue;
printf("%5d %10d %10d %8.2f\n",
n,
number[n],
ctu[n],
ctu[n] / (double) number[n]);
}
}

External Functions and Parameter Size Limitation (C)

I am very much stuck on the following issue. Any help is very much appreciated!
Basically I have a program which contains an array of structs, and I am getting a segmentation fault when I call an external function. The error only happens when I have more than 170 items in the array being passed.
Nothing in the function is processed. The program stops exactly when the function is called.
Is there a limit on the size of the parameters passed to external functions?
Main.c
struct ratingObj {
int uid;
int mid;
double rating;
};
void *FunctionLib; /* Handle to shared lib file */
void (*Function)(); /* Pointer to loaded routine */
const char *dlError; /* Pointer to error string */
int main( int argc, char * argv[]){
// ... some code ...
asprintf(&query, "select mid, rating "
"from %s "
"where uid=%d "
"order by rand()", itable, uid);
if (mysql_query(conn2, query)) {
fprintf(stderr, "%s\n", mysql_error(conn2));
exit(1);
}
res2 = mysql_store_result(conn2);
int movieCount = mysql_num_rows(res2);
// withhold is a variable that defines a percentage of the entries
// to be used for calculations (generally 20%)
int listSize = round((movieCount * ((double)withhold/100)));
struct ratingObj moviesToRate[listSize];
int mvCount = 0;
int count =0;
while ((row2 = mysql_fetch_row(res2)) != NULL){
if(count<(movieCount-listSize)){
// adds to another table
}else{
moviesToRate[mvCount].uid = uid;
moviesToRate[mvCount].mid = atoi(row2[0]);
moviesToRate[mvCount].rating = 0.0;
mvCount++;
}
count++;
}
// ... more code ...
FunctionLib = dlopen("library.so", RTLD_LAZY);
dlError = dlerror();
if( dlError ) exit(1);
Function = dlsym( FunctionLib, "getResults");
dlError = dlerror();
(*Function)( moviesToRate, listSize );
// .. more code
}
library.c
struct ratingObj {
int uid;
int mid;
double rating;
};
typedef struct ratingObj ratingObj;
void getResults(struct ratingObj *moviesToRate, int listSize);
void getResults(struct ratingObj *moviesToRate, int listSize){
// ... more code
}
You are likely blowing up the stack. Move the array to outside of the function, i.e. from auto to static land.
Another option is that the // ... more code - array gets populated... part is corrupting the stack.
Edit 0:
After you posted more code - you are using a C99 variable-length array on the stack - Bad Idea™. Think what happens when your data set grows to thousands, or millions, of records. Switch to dynamic memory allocation; see malloc(3).
You don't show us what listSize is, but I suppose it is a variable and not a constant.
What you are using are variable-length arrays, VLAs. These are a bit dangerous if they are too large, since they are usually allocated on the stack.
To work around that you can allocate such a beast dynamically
struct ratingObj (*movies)[listSize] = malloc(sizeof(*movies));
// ...
free(movies);
You'd then have in mind though that movies then is a pointer to array, so you have to reference with one * more than before.
Another, more classical C version would be
struct ratingObj * movies = malloc(sizeof(*movies)*listSize);
// ...
free(movies);
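For the pointer-to-array version above, the extra level of indirection mentioned earlier looks like this (a small sketch of mine, not from the original answer):

struct ratingObj (*movies)[listSize] = malloc(sizeof(*movies));
(*movies)[0].uid = 42;          // note the (*movies)[i] dereference
(*movies)[0].rating = 3.5;
getResults(*movies, listSize);  // *movies decays to struct ratingObj * as before
free(movies);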
