Issue implementing dynamic array of structures - c

I am having an issue creating a dynamic array of structures. I have seen and tried to implement a few examples on here and other sites, the examples as well as how they allocate memory tend to differ, and I can't seem to get any of them to work for me. Any help would be greatly appreciated.
typedef struct node {
int index;
int xmin, xmax, ymin, ymax;
} partition;
partition* part1 = (partition *)malloc(sizeof(partition) * 50);
I can't even get this right. It gives me the following error:
error: initializer element is not constant
If anyone could explain how something like this should be implemented I would greatly appreciate it.
Also, once I have that part down, how would I add values into the elements of the structure? Would something like the below work?
part1[i]->index = x;

The compiler is complaining because you're doing:
partition* part1 = (partition *)malloc(sizeof(partition) * 50);
Do this instead:
partition* part1;
int
main(void)
{
part1 = (partition *)malloc(sizeof(partition) * 50);
...
}
Your version used an initializer on a global, which in C must be a constant value. By moving the malloc into a function, you are "initializing the value" with your code, but you aren't using an initializer as defined in the language.
Likewise, you could have had a global that was initialized:
int twenty_two = 22;
Here 22 is a constant and thus allowable.
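To answer the second part of the question: part1[i] is already a partition (not a pointer to one), so its members are reached with '.', not '->'. Here is a minimal sketch of that, assuming a hypothetical helper named make_partitions() that is not part of the original code and that does the allocation inside a function:

```c
#include <stdlib.h>

typedef struct node {
    int index;
    int xmin, xmax, ymin, ymax;
} partition;

/* Hypothetical helper: allocate `count` partitions and number them.
   Returns NULL if the allocation fails; the caller frees the result. */
partition *make_partitions(size_t count)
{
    partition *parts = calloc(count, sizeof *parts);  /* zero-filled */
    if (parts == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        parts[i].index = (int)i;   /* parts[i] is a struct: use '.'      */
        (parts + i)->xmax = 100;   /* (parts + i) is a pointer: use '->' */
    }
    return parts;
}
```

Calling make_partitions(50) from main() gives the same 50-element array as the malloc line above, and part1[i].index = x; is then the right way to store a value.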
UPDATE: Here's a somewhat lengthy example that will show most of the possible ways:
#define PARTMAX 50
partition static_partlist[PARTMAX];
partition *dynamic_partlist;
int grown_partmax;
partition *grown_partlist;
void
iterate_byindex_static_length(partition *partlist)
{
int idx;
for (idx = 0; idx < PARTMAX; ++idx)
do_something(&partlist[idx]);
}
void
iterate_byptr_static_length(partition *partlist)
{
partition *cur;
partition *end;
// these are all equivalent:
// end = partlist + PARTMAX;
// end = &partlist[PARTMAX];
end = partlist + PARTMAX;
for (cur = partlist; cur < end; ++cur)
do_something(cur);
}
void
iterate_byindex_dynamic_length(partition *partlist,int partmax)
{
int idx;
for (idx = 0; idx < partmax; ++idx)
do_something(&partlist[idx]);
}
void
iterate_byptr_dynamic_length(partition *partlist,int partmax)
{
partition *cur;
partition *end;
// these are all equivalent:
// end = partlist + partmax;
// end = &partlist[partmax];
end = partlist + partmax;
for (cur = partlist; cur < end; ++cur)
do_something(cur);
}
int
main(void)
{
partition *part;
int idx; // loop counter for the growth loop below
dynamic_partlist = malloc(sizeof(partition) * PARTMAX);
// these are all the same
iterate_byindex_static_length(dynamic_partlist);
iterate_byindex_static_length(dynamic_partlist + 0);
iterate_byindex_static_length(&dynamic_partlist[0]);
// as are these
iterate_byptr_static_length(static_partlist);
iterate_byptr_static_length(static_partlist + 0);
iterate_byptr_static_length(&static_partlist[0]);
// still the same ...
iterate_byindex_dynamic_length(dynamic_partlist,PARTMAX);
iterate_byindex_dynamic_length(dynamic_partlist + 0,PARTMAX);
iterate_byindex_dynamic_length(&dynamic_partlist[0],PARTMAX);
// yet again the same ...
iterate_byptr_dynamic_length(static_partlist,PARTMAX);
iterate_byptr_dynamic_length(static_partlist + 0,PARTMAX);
iterate_byptr_dynamic_length(&static_partlist[0],PARTMAX);
// let's grow an array dynamically and fill it ...
for (idx = 0; idx < 10; ++idx) {
// grow the list -- Note that realloc is smart enough to handle
// the fact that grown_partlist is NULL on the first time through
++grown_partmax;
grown_partlist = realloc(grown_partlist,
grown_partmax * sizeof(partition));
part = &grown_partlist[grown_partmax - 1];
// fill in part with whatever data ...
}
// once again, still the same
iterate_byindex_dynamic_length(grown_partlist,grown_partmax);
iterate_byindex_dynamic_length(grown_partlist + 0,grown_partmax);
iterate_byindex_dynamic_length(&grown_partlist[0],grown_partmax);
// sheesh, do things ever change? :-)
iterate_byptr_dynamic_length(grown_partlist,grown_partmax);
iterate_byptr_dynamic_length(grown_partlist + 0,grown_partmax);
iterate_byptr_dynamic_length(&grown_partlist[0],grown_partmax);
}
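One caveat about the growth loop in main: it assigns the result of realloc straight back to grown_partlist, so if realloc fails the old block is lost. A somewhat safer pattern keeps the result in a temporary first. This is only a sketch; append_partition() is a hypothetical helper name, not something from the code above:

```c
#include <stdlib.h>

typedef struct node {
    int index;
    int xmin, xmax, ymin, ymax;
} partition;

/* Hypothetical helper: grow *list by one element.
   Returns a pointer to the new slot, or NULL on failure, in which case
   *list and *count are unchanged and the old block is still valid. */
partition *append_partition(partition **list, int *count)
{
    /* realloc(NULL, n) behaves like malloc(n), so an empty list works too */
    partition *grown = realloc(*list, ((size_t)*count + 1) * sizeof **list);
    if (grown == NULL)
        return NULL;

    *list = grown;
    partition *slot = &grown[*count];
    slot->index = *count;
    slot->xmin = slot->xmax = slot->ymin = slot->ymax = 0;
    *count += 1;
    return slot;
}
```

Growing one element at a time is fine for an illustration; for large lists you would usually grow the capacity in bigger steps.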
There are two basic ways to iterate through an array: by index and by pointer. It does not matter how the array was defined (e.g. global/static --> int myary[37]; or via malloc/realloc --> int *myptr = malloc(sizeof(int) * 37);). The "by index" and "by pointer" syntaxes are interchangeable. If you wanted the element at index 12 (the 13th element), the following are all equivalent:
myary[12]
*(myary + 12)
*(&myary[12])
myptr[12]
*(myptr + 12)
*(&myptr[12])
That's why all of the above will produce the same results.
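If you want to convince yourself of that equivalence, a tiny check like the following can help. It is just a sketch, and same_element() is a made-up name for illustration:

```c
#include <stddef.h>

/* Returns 1 when the index form and the pointer forms name the same element. */
int same_element(const int *base, size_t i)
{
    const int *by_index   = &base[i];   /* address via indexing           */
    const int *by_pointer = base + i;   /* address via pointer arithmetic */

    return by_index == by_pointer
        && base[i] == *(base + i)
        && base[i] == *(&base[i]);
}
```

It returns 1 whether you pass a real array (which decays to a pointer to its first element) or a pointer obtained from malloc.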

Related

How can I correctly allocate memory for this MergeSort implementation in C (with the DS I am using)?

My goal here is to perform MergeSort on a dynamic array-like data structure I called a dictionary used to store strings and their relative weights. Sorry if the implementation is dumb, I'm a student and still learning.
Anyway, based on the segfaults I'm getting, I'm incorrectly allocating memory for my structs of type item to be copied over into the temporary lists I'm making. Not sure how to fix this. Code for mergesort and data structure setup is below, any help is appreciated.
/////// DICTIONARY METHODS ////////
typedef struct {
char *item;
int weight;
} item;
typedef struct {
item **wordlist;
//track size of dictionary
int size;
} dict;
//dict constructor
dict* Dict(int count){
//allocate space for dictionary
dict* D = malloc(sizeof(dict));
//allocate space for words
D->wordlist = malloc(sizeof(item*) * count);
//initial size
D->size = 0;
return D;
}
//word constructor
item* Item(char str[]){
//allocate memory for struct
item* W = malloc(sizeof(item));
//allocate memory for string
W->item = malloc(sizeof(char) * strlen(str));
W->weight = 0;
return W;
}
void merge(dict* D, int start, int middle, int stop){
//create ints to track lengths of left and right of array
int leftlen = middle - start + 1;
int rightlen = stop - middle;
//create new temporary dicts to store the two sides of the array
dict* L = Dict(leftlen);
dict* R = Dict(rightlen);
int i, j, k;
//copy elements start through middle into left dict- this gives a segfault
for (int i = 0; i < leftlen; i++){
L->wordlist[i] = malloc(sizeof(item*));
L->wordlist[i] = D->wordlist[start + i];
}
//copy elements middle through end into right dict- this gives a segfault
for (int j = 0; j < rightlen; j++){
R->wordlist[j] = malloc(sizeof(item*));
R->wordlist[j]= D->wordlist[middle + 1 + k];
}
i = 0;
j = 0;
k = leftlen;
while ((i < leftlen) && (j < rightlen)){
if (strcmp(L->wordlist[i]->item, R->wordlist[j]->item) <= 0) {
D->wordlist[k] = L->wordlist[i];
i++;
k++;
}
else{
D->wordlist[k] = R->wordlist[j];
j++;
k++;
}
}
while (i < leftlen){
D->wordlist[k] = L->wordlist[i];
i++;
k++;
}
while (j < rightlen){
D->wordlist[k] = L->wordlist[j];
j++;
k++;
}
}
void mergeSort(dict* D, int start, int stop){
if (start < stop) {
int middle = start + (stop - start) / 2;
mergeSort(D, start, middle);
mergeSort(D, middle + 1, stop);
merge(D, start, middle, stop);
}
I put print statements everywhere and narrowed it down to the mallocs in the section where I copy the dictionary to be sorted into 2 separate dictionaries. Also tried writing that malloc as malloc(sizeof(D->wordlist[start + i])). Is there something else I need to do to be able to copy the item struct into the wordlist of the new struct?
Again, I'm new to this, so cut me some slack :)
There are numerous errors in the code:
In merge() when copying elements to the R list, the wrong (and uninitialized) index variable k is being used instead of j. R->wordlist[j]= D->wordlist[middle + 1 + k]; should be R->wordlist[j]= D->wordlist[middle + 1 + j];.
In merge() before merging the L and R lists back to D, the index variable k for the D list is being initialized to the wrong value. k = leftlen; should be k = start;.
In merge() in the loop that should copy the remaining elements of the "right" list to D, the elements are being copied from the "left" list instead of the "right" list. D->wordlist[k] = L->wordlist[j]; should be D->wordlist[k] = R->wordlist[j];.
In Item(), the malloc() call is not reserving space for the null terminator at the end of the string. W->item = malloc(sizeof(char) * strlen(str)); should be W->item = malloc(sizeof(char) * (strlen(str) + 1)); (and since sizeof(char) is 1 by definition it can be simplified to W->item = malloc(strlen(str) + 1);).
Item() is not copying the string to the allocated memory. Add strcpy(W->item, str);.
There are memory leaks in merge():
L->wordlist[i] = malloc(sizeof(item*)); is not required and can be removed since L->wordlist[i] is changed on the very next line: L->wordlist[i] = D->wordlist[start + i];.
Similarly, R->wordlist[j] = malloc(sizeof(item*)); is not required and can be removed since R->wordlist[j] is changed on the very next line.
L and R memory is created but never destroyed. Add these lines to the end of merge() to free them:
free(L->wordlist);
free(L);
free(R->wordlist);
free(R);
None of the malloc() calls are checked for success.
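Putting those fixes together, a corrected merge step could look roughly like this. It is a sketch, not the original poster's code: it works directly on an array of item pointers instead of building temporary dict objects, and the name merge_items() and the int return value are my own assumptions.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *item;
    int weight;
} item;

/* Merge the sorted ranges [start..middle] and [middle+1..stop] (inclusive).
   Returns 0 on success, -1 if a temporary buffer cannot be allocated. */
int merge_items(item **list, int start, int middle, int stop)
{
    int leftlen = middle - start + 1;
    int rightlen = stop - middle;

    /* Only the pointers are copied, so one buffer per side is enough. */
    item **L = malloc(sizeof *L * (size_t)leftlen);
    item **R = malloc(sizeof *R * (size_t)rightlen);
    if (L == NULL || R == NULL) {
        free(L);
        free(R);
        return -1;
    }

    for (int i = 0; i < leftlen; i++)
        L[i] = list[start + i];
    for (int j = 0; j < rightlen; j++)
        R[j] = list[middle + 1 + j];      /* j, not k */

    int i = 0, j = 0, k = start;          /* k starts at start, not leftlen */
    while (i < leftlen && j < rightlen) {
        if (strcmp(L[i]->item, R[j]->item) <= 0)
            list[k++] = L[i++];
        else
            list[k++] = R[j++];
    }
    while (i < leftlen)
        list[k++] = L[i++];
    while (j < rightlen)
        list[k++] = R[j++];               /* from R, not L */

    free(L);
    free(R);
    return 0;
}
```

With the dict from the question you would call it as merge_items(D->wordlist, start, middle, stop).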
Allocate it all at once, before the merge sort even starts.
#include <stdlib.h>
#include <string.h>
// Weighted Word --------------------------------------------------------------
//
typedef struct {
char *word;
int weight;
} weighted_word;
// Create a weighted word
//
weighted_word* CreateWeightedWord(const char *str, int weight){
weighted_word* W = malloc(sizeof(weighted_word));
if (W){
W->word = malloc(strlen(str) + 1); // string length + nul terminator
if (W->word)
strcpy( W->word, str);
W->weight = weight;
}
return W;
}
// Free a weighted word
//
weighted_word *FreeWeightedWord(weighted_word *W){
if (W){
if (W->word)
free(W->word);
free(W);
}
return NULL;
}
// Dictionary (of Weighted Words) ---------------------------------------------
//
typedef struct {
weighted_word **wordlist; // this is a pointer to an array of (weighted_word *)s
int size; // current number of elements in use
int capacity; // maximum number of elements available to use
} dict;
// Create a dictionary with a fixed capacity
//
dict* CreateDict(int capacity){
dict* D = malloc(sizeof(dict));
if (D){
D->wordlist = malloc(sizeof(weighted_word*) * capacity);
D->size = 0;
D->capacity = capacity;
}
return D;
}
// Free a dictionary (and all weighted words)
//
dict *FreeDict(dict *D){
if (D){
for (int n = 0; n < D->size; n++)
FreeWeightedWord(D->wordlist[n]);
free(D->wordlist);
free(D);
}
return NULL;
}
// Add a new weighted word to the end of our dictionary
//
void DictAddWord(dict *D, const char *str, int weight){
if (!D) return;
if (D->size == D->capacity) return;
D->wordlist[D->size] = CreateWeightedWord(str, weight);
if (D->wordlist[D->size])
D->size += 1;
}
// Merge Sort the Dictionary --------------------------------------------------
// Merge two partitions of sorted words
// words • the partitioned weighted word list
// start • beginning of left partition
// middle • end of left partition, beginning of right partition
// stop • end of right partition
// buffer • temporary work buffer, at least as big as (stop-start), since
//          every element can end up in it when the right side runs out first
//
void MergeWeightedWords(weighted_word **words, int start, int middle, int stop, weighted_word **buffer){
int Lstart = start; int Rstart = middle; // Left partition
int Lstop = middle; int Rstop = stop; // Right partition
int Bindex = 0; // temporary work buffer output index
// while (left partition has elements) AND (right partition has elements)
while ((Lstart < Lstop) && (Rstart < Rstop)){
if (strcmp( words[Rstart]->word, words[Lstart]->word ) < 0)
buffer[Bindex++] = words[Rstart++];
else
buffer[Bindex++] = words[Lstart++];
}
// if (left partition has any remaining elements)
while (Lstart < Lstop)
buffer[Bindex++] = words[Lstart++];
// We don't actually need this. Think about it. Why not?
// // if (right partition has any remaining elements)
// while (Rstart < Rstop)
// buffer[Bindex++] = words[Rstart++];
// Copy merged data from temporary buffer back into source word list
for (int n = 0; n < Bindex; n++)
words[start++] = buffer[n];
}
// Merge Sort an array of weighted words
// words • the array of (weighted_word*)s to sort
// start • index of first element to sort
// stop • index ONE PAST the last element to sort
// buffer • the temporary merge buffer, at least as big as (stop-start)
//
void MergeSortWeightedWords(weighted_word **words, int start, int stop, weighted_word **buffer){
if (start < stop-1){ // -1 because a singleton array is by definition sorted
int middle = start + (stop - start) / 2;
MergeSortWeightedWords(words, start, middle, buffer);
MergeSortWeightedWords(words, middle, stop, buffer);
MergeWeightedWords(words, start, middle, stop, buffer);
}
}
// Merge Sort a Dictionary
//
void MergeSortDict(dict *D){
if (D){
// We only need to allocate a single temporary work buffer, just once, right here.
dict * Temp = CreateDict(D->size);
if (Temp){
MergeSortWeightedWords(D->wordlist, 0, D->size, Temp->wordlist);
}
FreeDict(Temp);
}
}
// Main program ---------------------------------------------------------------
#include <stdio.h>
int main(int argc, char **argv){
// Command-line arguments --> dictionary
dict *a_dict = CreateDict(argc-1);
for (int n = 1; n < argc; n++)
DictAddWord(a_dict, argv[n], 0);
// Sort the dictionary
MergeSortDict(a_dict);
// Print the weighted words
for (int n = 0; n < a_dict->size; n++)
printf( "%d %s\n", a_dict->wordlist[n]->weight, a_dict->wordlist[n]->word );
// Clean up
FreeDict(a_dict);
}
Notes for you:
Be consistent. You were inconsistent with capitalization and * placement and, oddly, vertical spacing. (You are waaay better than most beginners, though.) I personally hate the Egyptian brace style, but to each his own.
I personally think there are far too many levels of malloc()s in this code too, but I will leave it at this one comment. It works as is.
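Strings must be nul-terminated: each string takes strlen() characters plus one for a '\0' character. There is a convenient library function that can copy a string for you too, called strdup(). It is specified by POSIX and was added to standard C in C23, so it exists on practically every system; the copy it returns must be free()d like any other allocation.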
Always check that malloc() and friends succeed.
Don’t forget to free everything you allocate. Functions help.
“Item” was a terribly non-descript name, and it overlapped with the meaning of two different things in your code. I renamed them to separate things.
Your dictionary object should be expected to keep track of how many elements it can support. The above code simply refuses to add words after the capacity is filled, but you could easily make it realloc() a larger capacity if the need arises. The point is to prevent invalid array accesses by adding too many elements to a fixed-size array.
Printing the array could probably go in a function.
Notice how I set start as inclusive and stop as exclusive. This is a very C (and C++) way of looking at things, and it is a good one. It will help you with all kinds of algorithms.
Notice also how I split the Merge Sort up into two functions: one that takes a dictionary as argument, and a lower-level one that takes an array of the weighted words as argument that does all the work.
The higher-level merge sort a dictionary allocates all the temporary buffer the merge algorithm needs, just once.
The lower-level merge sort an array of (weighted_word*)s expects that temporary buffer to exist and doesn’t care (or know anything) about the dictionary object.
The merge algorithm likewise doesn't know much. It is simply given all the information it needs.
Right now the merge condition simply compares the weighted-word’s string value. But it doesn’t have to be so simple. For example, you could sort equal elements by weight. Create a function:
int CompareWeightedWords(const weighted_word *a, const weighted_word *b){
int rel = strcmp( a->word, b->word );
if (rel < 0) return -1;
if (rel > 0) return 1;
return a->weight < b->weight ? -1 : a->weight > b->weight;
}
And put it to use in the merge function:
if (CompareWeightedWords( words[Rstart], words[Lstart] ) < 0)
buffer[Bindex++] = words[Rstart++];
else
buffer[Bindex++] = words[Lstart++];
I don’t think I forgot anything.
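As an illustration of the note about capacity, here is a minimal sketch of an add function that grows the word list with realloc() instead of refusing new words. The name DictAddWordGrowing(), the doubling policy, the starting capacity of 4, and the int return value are all assumptions of mine, not part of the code above:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *word;
    int weight;
} weighted_word;

typedef struct {
    weighted_word **wordlist;
    int size;       /* elements in use    */
    int capacity;   /* elements allocated */
} dict;

/* Hypothetical growing variant of DictAddWord.
   Returns 1 on success, 0 on failure (the dictionary stays valid). */
int DictAddWordGrowing(dict *D, const char *str, int weight)
{
    if (!D || !str)
        return 0;

    if (D->size == D->capacity) {
        int newcap = D->capacity ? D->capacity * 2 : 4;
        weighted_word **grown =
            realloc(D->wordlist, sizeof *grown * (size_t)newcap);
        if (!grown)
            return 0;             /* the old list is untouched */
        D->wordlist = grown;
        D->capacity = newcap;
    }

    weighted_word *W = malloc(sizeof *W);
    if (!W)
        return 0;
    W->word = malloc(strlen(str) + 1);   /* +1 for the nul terminator */
    if (!W->word) {
        free(W);
        return 0;
    }
    strcpy(W->word, str);
    W->weight = weight;

    D->wordlist[D->size++] = W;
    return 1;
}
```

Doubling keeps the number of realloc() calls small; a production version would also guard against the capacity overflowing an int.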

Recovering elements of large array with multiple index ranges

This is a tricky problem that I have been thinking about for a long time and have yet to see a satisfactory answer anywhere. Let's say I have a large int array of size 10000. I can simply declare it in the following manner:
int main()
{
int foo[10000];
int i;
int n;
n = sizeof(foo) / sizeof(int);
for (i = 0; i < n; i++)
{
printf("Index %d is %d\n",i,foo[i] );
}
return 0;
}
It is pretty clear that each index in the array will hold a random assortment of numbers before I formally initialize them:
Index 0 is 0
Index 1 is 0
Index 2 is 0
Index 3 is 0
.
.
.
Index 6087 is 0
Index 6088 is 1377050464
Index 6089 is 32767
Index 6090 is 1680893034
.
.
.
Index 9996 is 0
Index 9997 is 0
Index 9998 is 0
Index 9999 is 0
Then let's say that I initialize select index ranges of my array with values that have a specific meaning for the program as a whole and must be preserved, with the goal of passing those values to some function for subsequent operations:
//Call this block 1
foo[0] = 0;
foo[1] = 7;
foo[2] = 99;
foo[3] = 0;
//Call this block 2
foo[9996] = 0;
foo[9997] = 444;
foo[9998] = 2;
foo[9999] = 0;
for (i = 0; i < (What goes here?); i++)
{
//I must pass in only those values initialized to select indices of foo[] (Blocks 1 and 2 uncorrupted)
//How to recover those values to pass into foo_func()?
foo_func(foo[]);
}
Some of those values that I initialized foo[] with overlap with pre-existing values in the array before formally initializing the array myself. How can I pass in just the indices of the array elements that I initialized, given that there are multiple index ranges? I just can't figure this out. Thanks for any and all help!
EDIT:
I should also mention that the array itself will be read from a .txt file. I just showed the initialization in the code for illustrative purposes.
There are a number of ways you can quickly zero out the memory in the array, either while initializing or after.
For an array on the stack, initialize it with zeros. {0} is shorthand for that.
int foo[10000] = {0};
For an array on the heap, use calloc to allocate memory and initialize it with 0's.
int *foo = calloc(10000, sizeof(int));
If the array already exists, use memset to quickly overwrite all the array's memory with zeros.
memset(foo, 0, sizeof(int) * 10000);
Now all elements are zero. You can set individual elements to whatever you like one by one. For example...
int main() {
int foo[10] = {0};
foo[1] = 7;
foo[2] = 99;
foo[7] = 444;
foo[8] = 2;
for( int i = 0; i < 10; i++ ) {
printf("%d - %d\n", i, foo[i]);
}
}
That will print...
0 - 0
1 - 7
2 - 99
3 - 0
4 - 0
5 - 0
6 - 0
7 - 444
8 - 2
9 - 0
As a side note, using only a few elements of a large array is a waste of memory. Instead, use a hash table, or if you need ordering, some type of tree. These can be difficult to implement correctly, but a library such as GLib can provide you with good implementations.
Introduction
I'm making a strong assumption about your problem, namely sparseness (the majority of the elements in your array will remain zero).
Under this assumption I would build the array as a list. I'm including some sample code; it is not complete and is not intended to be, so you should do your own homework :)
Each element of the vector has its own index and value, and a pointer to the next element in the list:
typedef struct vector_element vector_element_t;
struct vector_element {
int value;
size_t index;
vector_element_t *next;
};
The core object is a struct with a pointer to the first element and the size. It is declared after vector_element_t because it refers to that type:
typedef struct vector {
size_t size;
vector_element_t * begin;
} vector_t;
On this basis we can build a dynamic vector as a list, dropping any constraint on ordering (it is not needed, though you can modify this code to maintain ordering), using some simple custom functions:
vector_t * vector_init(); // Initialize an empty array
void vector_destroy(vector_t* v); // Destroy the content and the array itself
int vector_get(vector_t *v, size_t index); // Get an element from the array, by searching the index
size_t vector_set(vector_t *v, size_t index, int value); // Set an element at the index
void vector_delete(vector_t *v, size_t index); // Delete an element from the vector
void vector_each(vector_t *v, int(*f)(size_t index, int value)); // Executes a callback for each element of the list
// This last function may be the response to your question
The main example
This is a main that uses all of these functions and prints to the console:
int callback(size_t index, int value) {
printf("Vector[%zu] = %d\n", index, value); // %zu is the conversion for size_t
return value;
}
int main() {
vector_t * vec = vector_init();
vector_set(vec, 10, 5);
vector_set(vec, 23, 9);
vector_set(vec, 1000, 3);
printf("vector_get(vec, %d) = %d\n", 1000, vector_get(vec, 1000)); // This should print 3
printf("vector_get(vec, %d) = %d\n", 1, vector_get(vec, 1)); // this should print 0
printf("size(vec) = %zu\n", vec->size); // this should print 3 (the size of initialized elements)
vector_each(vec, callback); // Calling the callback on each element of the
// array that is initialized, as you asked.
vector_delete(vec, 23);
printf("size(vec) = %zu\n", vec->size); // still prints 3: vector_delete does not update size
vector_each(vec, callback); // Calling the callback on each element of the array
vector_destroy(vec);
return 0;
}
And the output:
vector_get(vec, 1000) = 3
vector_get(vec, 1) = 0
size(vec) = 3
Vector[10] = 5
Vector[23] = 9
Vector[1000] = 3
size(vec) = 3
Vector[10] = 5
Vector[1000] = 3
The callback with the function vector_each is something you really should look at.
Implementations
I'm giving you some trivial implementations of the functions from the introduction. They are not complete, and some checks on pointers should be added; I'm leaving that to you. As it is, this code is not for production and under some circumstances can also overflow.
The notable part is the search for a specific element in the vector. Every lookup traverses the list, which is convenient if and only if you have sparsity (the majority of your indexes will always return zero).
In this implementation, if you access an index that is not in the list, you get 0 as the result. If you don't want this, you should define an error callback.
Initialization and destruction
When we initialize, we allocate the memory for our vector, but with no elements inside, thus begin points to NULL. When we destroy the vector we have not only to free the vector, but also each element contained.
vector_t * vector_init() {
vector_t * v = (vector_t*)malloc(sizeof(vector_t));
if (v) {
v->begin = NULL;
v->size = 0;
return v;
}
return NULL;
}
void vector_destroy(vector_t *v) {
    if (v) {
        vector_element_t * curr = v->begin;
        while (curr) {
            // Save the link before freeing, so a freed node is never read
            vector_element_t * next = curr->next;
            free(curr);
            curr = next;
        }
        free(v);
    }
}
The get and set methods
In get you can see how the list works (the same concept is also used in set and delete): we start from begin and walk the list until we reach an element with an index equal to the one requested. If we cannot find it, we simply return 0. If we need to "raise some sort of signal" when the value is not found, it is easy to implement an "error callback".
As long as sparseness holds, searching the whole list for an index is a good compromise in terms of memory requirements, and efficiency may not be an issue.
int vector_get(vector_t *v, size_t index) {
vector_element_t * el = v->begin;
while (el != NULL) {
if (el->index == index)
return el->value;
el = el->next;
}
return 0;
}
// Gosh, this set function is really a mess... I hope you can understand it...
// -.-'
size_t vector_set(vector_t *v, size_t index, int value) {
vector_element_t * el = v->begin;
// Case 1: Initialize the first element of the array
if (el == NULL) {
el = (vector_element_t *)malloc(sizeof(vector_element_t));
if (el != NULL) {
v->begin = el;
v->size += 1;
el->index = index;
el->value = value;
el->next = NULL;
return v->size;
} else {
return 0;
}
}
// Case 2: Search for the element in the array
while (el != NULL) {
if (el->index == index) {
el->value = value;
return v->size;
}
// Case 3: if there is no element with that index creates a new element
if (el->next == NULL) {
el->next = (vector_element_t *)malloc(sizeof(vector_element_t));
if (el->next != NULL) {
v->size += 1;
el->next->index = index;
el->next->value = value;
el->next->next = NULL;
return v->size;
}
return 0;
}
el = el->next;
}
return 0; // not reached: the loop above always returns from one of its cases
}
Deleting an element
With this approach it is possible to delete an element quite easily, connecting
curr->next to curr->next->next. We must though free the previous curr->next...
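void vector_delete(vector_t * v, size_t index) {
    // Note: v->size is not decremented here, which is why the sample
    // output above still prints size(vec) = 3 after the deletion.
    vector_element_t *curr = v->begin;
    if (curr == NULL)
        return;                       // empty list, nothing to delete
    if (curr->index == index) {       // the first element has no predecessor
        v->begin = curr->next;
        free(curr);
        return;
    }
    vector_element_t *next = curr->next;
    while (next != NULL) {
        if (next->index == index) {
            curr->next = next->next;
            free(next);
            return;
        }
        curr = next;
        next = next->next;
    }
}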
An iteration function
I think this is the answer to the last part of your question: instead of passing a sequence of indexes, you pass a callback to the vector. The callback receives the index and the value of each stored element, and whatever it returns is stored back as the new value. If you want to operate only on some specific indexes, you may include a check in the callback itself. If you need to pass more data to the callback, check the very last section.
void vector_each(vector_t * v, int (*f)(size_t index, int value)) {
vector_element_t *el = v->begin;
while (el) {
el->value = f(el->index, el->value);
el = el->next;
}
}
Error callback
You may want to raise an out-of-bounds error or something similar. One solution is to enrich your list with a function pointer representing a callback that is called when the user asks for an undefined element:
typedef struct vector {
size_t size;
vector_element_t *begin;
void (*error_undefined)(struct vector *v, size_t index);
} vector_t;
and maybe at the end of your vector_get function you may want to do something like:
int vector_get(vector_t *v, size_t index) {
// [ . . .]
// you know at index the element is undefined:
if (v->error_undefined)
v->error_undefined(v, index);
// Do something to clean up the user mess... or simply
// return 0, so that every path through the function returns a value
return 0;
}
Usually it is also nice to add a helper function to set the callback...
Passing user data to "each" callback
If you want to pass more data to the user callback, you may add a void* as last argument:
void vector_each(vector_t * v, void * user_data, int (*f)(size_t index, int value, void * user_data));
void vector_each(vector_t * v, void * user_data, int (*f)(size_t index, int value, void * user_data)) {
[...]
el->value = f(el->index, el->value, user_data);
[...]
}
If the user does not need it, they can pass a wonderful NULL.
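To make the user-data idea concrete, here is a small self-contained sketch. The struct definitions repeat the ones from the introduction; vector_each_ud() and sum_callback() are hypothetical names chosen so they do not clash with the functions above:

```c
#include <stddef.h>

typedef struct vector_element vector_element_t;
struct vector_element {
    int value;
    size_t index;
    vector_element_t *next;
};

typedef struct vector {
    size_t size;
    vector_element_t *begin;
} vector_t;

/* Like vector_each, but forwards an opaque user_data pointer to the callback. */
void vector_each_ud(vector_t *v, void *user_data,
                    int (*f)(size_t index, int value, void *user_data))
{
    for (vector_element_t *el = v->begin; el != NULL; el = el->next)
        el->value = f(el->index, el->value, user_data);
}

/* Example callback: adds each value to a running total reached through
   user_data, and returns the value unchanged so the vector is not modified. */
int sum_callback(size_t index, int value, void *user_data)
{
    (void)index;                      /* the index is not needed here */
    *(long *)user_data += value;
    return value;
}
```

The caller declares a long total = 0; and passes its address as user_data. After the call, total holds the sum of all stored values.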

Need help understanding logic of function

monthly->maxTemperature = yearData[i].high;
monthly->minTemperature = yearData[i].low;
I just can't seem to understand the logic of what the iterations will look like or how to access the proper elements in the array of data to get the proper data for each month.... without corrupting data. Thanks!
You're on the right track:
void stats(int mth, const struct Data yearData[], int size, struct Monthly* monthStats)
{
    // These are used to calc averages
    int highSum = 0;
    int lowSum = 0;
    int days = 0;
    // Initialize data (INT_MIN and INT_MAX come from <limits.h>)
    monthStats->maxTemperature = INT_MIN;
    monthStats->minTemperature = INT_MAX;
    monthStats->totalPrecip = 0;
    for (int i = 0; i < size; ++i) {
        // Only use data from given month
        if (yearData[i].month == mth) {
            days += 1;
            if (yearData[i].high > monthStats->maxTemperature) monthStats->maxTemperature = yearData[i].high;
            if (yearData[i].low < monthStats->minTemperature) monthStats->minTemperature = yearData[i].low;
            highSum += yearData[i].high;
            lowSum += yearData[i].low;
            monthStats->totalPrecip += yearData[i].precip;
        }
    }
    // Avoid dividing by zero when no entry matches the month
    if (0 != days) {
        monthStats->avgHigh = highSum / days;
        monthStats->avgLow = lowSum / days;
    }
}
Before working on the assignment it's a good idea to examine the API that you need to implement for clues. First thing to notice is that the reason the struct Monthly is passed to your function by pointer is so that you could set the result into it. This is different from the reason for passing struct Data as a pointer*, which is to pass an array using the only mechanism for passing arrays available in C. const qualifier is a strong indication that you must not be trying to modify anything off of the yearData, only the monthStats.
This tells you what to do with the min, max, average, and total that you are going to find in your function: these need to be assigned to fields of monthStats, like this:
monthStats->maxTemperature = maxTemperature;
monthStats->minTemperature = minTemperature;
...
where maxTemperature, minTemperature, and so on are local variables that you declare before entering the for loop.
As far as the for loop goes, your problem is that you ignore the mth variable completely. You need to use its value to decide if an element of yearData should be considered for your computations or not. The simplest way is to add an if to your for loop:
int maxTemperature = INT_MIN; // you need to include <limits.h>
int minTemperature = INT_MAX; // to get definitions of INT_MIN and INT_MAX
for(int i = 0; i<size; ++i) {
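// Skipping ahead and stopping early like this assumes yearData is sorted by month;
// if it is not, use a single test instead: if (yearData[i].month != mth) continue;
if (yearData[i].month < mth) continue;
if (yearData[i].month > mth) break;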
... // Do your computations here
}
* Even though it looks like an array, it is still passed as a pointer
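Combining both answers, the whole function might look something like the sketch below. The question never shows struct Data or struct Monthly, so the layouts here are assumptions (field names taken from the answers, types guessed), and stats_sketch() returns the day count only to make it easy to check:

```c
#include <limits.h>

/* Assumed layouts: the field names come from the answers, the types are guesses. */
struct Data {
    int month;
    int high;
    int low;
    double precip;
};

struct Monthly {
    int maxTemperature;
    int minTemperature;
    double avgHigh;
    double avgLow;
    double totalPrecip;
};

/* Fills *monthStats from the entries whose month equals mth.
   Returns the number of matching days; when that is 0,
   *monthStats is left untouched. */
int stats_sketch(int mth, const struct Data yearData[], int size,
                 struct Monthly *monthStats)
{
    int maxTemperature = INT_MIN;
    int minTemperature = INT_MAX;
    long highSum = 0, lowSum = 0;
    double totalPrecip = 0.0;
    int days = 0;

    for (int i = 0; i < size; ++i) {
        if (yearData[i].month != mth)
            continue;                 /* works for unsorted data too */
        if (yearData[i].high > maxTemperature) maxTemperature = yearData[i].high;
        if (yearData[i].low < minTemperature) minTemperature = yearData[i].low;
        highSum += yearData[i].high;
        lowSum += yearData[i].low;
        totalPrecip += yearData[i].precip;
        ++days;
    }

    if (days > 0) {
        monthStats->maxTemperature = maxTemperature;
        monthStats->minTemperature = minTemperature;
        monthStats->avgHigh = (double)highSum / days;
        monthStats->avgLow = (double)lowSum / days;
        monthStats->totalPrecip = totalPrecip;
    }
    return days;
}
```

Working in local variables and writing to monthStats only at the end means the const yearData is never modified and the output struct is never left half-filled.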

Sorting an array of coordinates by their distance from origin

The code should take an array of coordinates from the user, then sort that array, putting the coordinates in order of their distance from the origin. I believe my problem lies in the sorting function (I have used a quicksort).
I am trying to write the function myself to get a better understanding of it, which is why I'm not using qsort().
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define MAX_SIZE 64
typedef struct
{
double x, y;
}POINT;
double distance(POINT p1, POINT p2);
void sortpoints(double distances[MAX_SIZE], int firstindex, int lastindex, POINT data[MAX_SIZE]);
void printpoints(POINT data[], int n_points);
int main()
{
int n_points, i;
POINT data[MAX_SIZE], origin = { 0, 0 };
double distances[MAX_SIZE];
printf("How many values would you like to enter?\n");
scanf("%d", &n_points);
printf("enter your coordinates\n");
for (i = 0; i < n_points; i++)
{
scanf("%lf %lf", &data[i].x, &data[i].y);
distances[i] = distance(data[i], origin); //data and distances is linked by their index number in both arrays
}
sortpoints(distances, 0, i, data);
return 0;
}
double distance(POINT p1, POINT p2)
{
return sqrt(pow((p1.x - p2.x), 2) + pow((p1.y - p2.y), 2));
}
void printpoints(POINT *data, int n_points)
{
int i;
printf("Sorted points (according to distance from the origin):\n");
for (i = 0; i < n_points; i++)
{
printf("%.2lf %.2lf\n", data[i].x, data[i].y);
}
}
//quicksort
void sortpoints(double distances[MAX_SIZE], int firstindex, int lastindex, POINT data[MAX_SIZE])
{
int indexleft = firstindex;
int indexright = lastindex;
int indexpivot = (int)((lastindex + 1) / 2);
int n_points = lastindex + 1;
double left = distances[indexleft];
double right = distances[indexright];
double pivot = distances[indexpivot];
POINT temp;
if (firstindex < lastindex) //this will halt the recursion of the sorting function once all the arrays are 1-size
{
while (indexleft < indexpivot || indexright > indexpivot) //this will stop the sorting once both selectors reach the pivot position
{
//reset the values of left and right for the iterations of this loop
left = distances[indexleft];
right = distances[indexright];
while (left < pivot)
{
indexleft++;
left = distances[indexleft];
}
while (right > pivot)
{
indexright--;
right = distances[indexright];
}
distances[indexright] = left;
distances[indexleft] = right;
temp = data[indexleft];
data[indexleft] = data[indexright];
data[indexright] = temp;
}
//recursive sorting to sort the sublists
sortpoints(distances, firstindex, indexpivot - 1, data);
sortpoints(distances, indexpivot + 1, lastindex, data);
}
printpoints(data, n_points);
}
Thanks for your help, I have been trying to debug this for hours, even using a debugger.
Ouch! You call sortpoints() with i as the lastindex argument. According to your prototype and code, that argument should be the last valid index, but after the input loop i equals n_points, which is one past the last index.
int indexleft = firstindex;
int indexright = lastindex; // indexright is pointing to a non-existent element.
int indexpivot = (int)((lastindex + 1) / 2);
int n_points = lastindex + 1;
double left = distances[indexleft];
double right = distances[indexright]; // now right is an undefined value, or segfault.
To fix that, call your sortpoints() function as:
sortpoints(distances, 0, n_points - 1, data);
The problem is in your sortpoints function. The first while loop is looping infinitely. To check whether it really is an infinite loop, place a printf statement
printf("Testing first while loop\n");
inside that first while loop. If the message prints endlessly, the loop never terminates, and that is what you have to fix.
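One way the outer loop can get stuck: neither index is advanced after a swap, so if both selectors stop on values equal to the pivot, the same swap repeats forever. Below is a minimal sketch of a partition loop that always makes progress, assuming POINT is a struct with two double members x and y (the typedef is not shown in the question). The names sortpoints_sketch and swap_pair are made up for illustration, and this is only one common variant of quicksort, not the only possible fix.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } POINT;   /* assumed layout of the question's POINT */

/* Swap entries a and b in both arrays so distances[i] stays paired with data[i]. */
static void swap_pair(double distances[], POINT data[], int a, int b)
{
    double d = distances[a];
    POINT p = data[a];

    distances[a] = distances[b];
    data[a] = data[b];
    distances[b] = d;
    data[b] = p;
}

/* Sorts distances[firstindex..lastindex] ascending; both bounds are inclusive. */
void sortpoints_sketch(double distances[], int firstindex, int lastindex, POINT data[])
{
    int left = firstindex;
    int right = lastindex;
    double pivot;

    assert(distances != NULL && data != NULL);
    if (firstindex >= lastindex)
        return;                           /* zero or one element: nothing to sort */

    /* Pivot VALUE taken from the middle of the current range. */
    pivot = distances[firstindex + (lastindex - firstindex) / 2];

    while (left <= right) {
        while (distances[left] < pivot)
            left++;
        while (distances[right] > pivot)
            right--;
        if (left <= right) {
            swap_pair(distances, data, left, right);
            left++;                       /* advancing after the swap guarantees progress */
            right--;
        }
    }

    /* The indices have crossed: sort what lies on each side. */
    sortpoints_sketch(distances, firstindex, right, data);
    sortpoints_sketch(distances, left, lastindex, data);
}
```

From main it would be called as sortpoints_sketch(distances, 0, n_points - 1, data), with the printing done once afterwards rather than inside the recursion.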
There are quite a number of problems, but one of them is:
int indexpivot = (int)((lastindex + 1) / 2);
The cast is unnecessary, but that's trivia. Much more fundamental is that if you are sorting a segment from, say, 48..63, you will be pivoting on element 32, which is not in the range you are supposed to be working on. You need to use:
int indexpivot = (lastindex + firstindex) / 2;
or perhaps:
int indexpivot = (lastindex + firstindex + 1) / 2;
For the example range, these will pivot on element 55 or 56, which is at least within the range.
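To make the arithmetic concrete, here is a small sketch of a midpoint helper. The name pivot_index is made up for illustration. It uses firstindex + (lastindex - firstindex) / 2, which gives the same result as (lastindex + firstindex) / 2 for valid ranges but cannot overflow when the indices are large.

```c
#include <assert.h>

/* Returns an index in the middle of firstindex..lastindex (both inclusive). */
int pivot_index(int firstindex, int lastindex)
{
    assert(firstindex <= lastindex);
    return firstindex + (lastindex - firstindex) / 2;
}
```

For the 48..63 range this returns 55, whereas the original (lastindex + 1) / 2 gives 32.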
I strongly recommend:
1. Create a print function similar to printpoints() but with the following differences:
   - It takes a 'tag' string to identify what it is printing.
   - It takes and prints the distance array too.
   - It takes the arrays and a pair of offsets.
2. Use this function inside the sort function before recursing.
3. Use this function inside the sort function before returning.
4. Use this function in the main function after you've read the data.
5. Use this function in the main function after the data is sorted.
6. Print key values (the pivot distance, the pivot index) at appropriate points.
This allows you to check that your partitioning is working correctly (it isn't at the moment).
Then, when you've got the code working, you can remove or disable (comment out) the printing code in the sort function.
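A debugging helper along those lines might look roughly like this. It is only a sketch: the name dump_points, the FILE * parameter and the return value are my own choices, and POINT is assumed to be a struct with two double members x and y.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { double x, y; } POINT;   /* assumed layout of the question's POINT */

/*
 * Prints distances[firstindex..lastindex] together with the matching points,
 * prefixed by a tag that says where in the program the dump was taken.
 * Returns the number of elements printed.
 */
int dump_points(FILE *out, const char *tag, const double distances[],
                const POINT data[], int firstindex, int lastindex)
{
    int i;
    int printed = 0;

    assert(out != NULL && tag != NULL);
    fprintf(out, "%s: range %d..%d\n", tag, firstindex, lastindex);
    for (i = firstindex; i <= lastindex; i++) {
        fprintf(out, "  [%d] distance=%.2f point=(%.2f, %.2f)\n",
                i, distances[i], data[i].x, data[i].y);
        printed++;
    }
    return printed;
}
```

Typical calls would be dump_points(stderr, "after read", distances, data, 0, n_points - 1) in main, and dump_points(stderr, "before recursing", distances, data, firstindex, lastindex) inside the sort function.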

Optimization for large strings C/C++

I'm looking for a way to optimize my implementation. Basically this is a "reduce"-like function (as in the MapReduce framework). It takes a key and its values. The goal is to find the distinct values and output them as a single string in the form of a list: value1;value2;value3;...valuen; where n can be very large (in the thousands).
void unique(char *key, int keybytes, char *multivalue, int nvalues,
int *valuebytes, KeyValue *kv, void *ptr) {
char * value = NULL;
char * elem[nvalues];
int i, j, cx;
char adj[3858905] = "";
The big problem is that I have to specify the length of char adj[] for every input, and I don't know ahead of time how many values there will be. (That takes a huge amount of memory.)
for (i = 0; i < nvalues; i++) {
if (i == 0) {
value = multivalue;
} else {
value = multivalue + valuebytes[i - 1];
multivalue = multivalue + valuebytes[i - 1];
}
elem[i] = value;
}
size_t elem_length = sizeof(elem)/sizeof(char *);
qsort(elem, elem_length, sizeof(char *), cstring_cmp);
cx = sprintf(adj, "%s;", elem[0]);
j = 0;
for (i = 1; i < nvalues; i++) {
bool matching = false;
if (!strcmp(elem[i], elem[j]))
matching = true;
j++;
if (!matching) //{;}
cx += snprintf(adj + cx, 3858905 - cx - 1, "%s;", elem[i]);
}
adj is the output string: the list of values.
kv->add(key, keybytes, adj, strlen(adj) + 1); //this outputs key-value pairs.
}
I have to use C/C++ only though.
Try Huffman coding. It is an old and fairly complex technique, but I think it is efficient. I don't know whether there are newer or better algorithms for this.
http://www.cprogramming.com/tutorial/computersciencetheory/huffman.html
http://en.wikipedia.org/wiki/Huffman_coding
struct node {
int value;
struct node *next;
};
I suggest using a linked list to store all the values and then converting it into a string.
You can keep a count of the values stored in the linked list (and their lengths) and use that to calculate the string length, then allocate enough memory using malloc().
Later on, as more values are added to the list, you can resize the allocated memory using realloc(). Note that calloc() only allocates a fresh, zeroed block; it cannot resize an existing one.
I don't know if this is exactly what you wanted, but it looks feasible to me.
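As a rough illustration of the "allocate, then grow as needed" idea, here is a sketch of an output buffer that starts empty and is enlarged with realloc() whenever the next value does not fit. The strbuf type and strbuf_append function are hypothetical names, not part of any library, and doubling the capacity is just one reasonable growth policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer; start with {NULL, 0, 0}. */
typedef struct {
    char *text;        /* NUL-terminated contents, or NULL while empty */
    size_t length;     /* number of characters, not counting the NUL */
    size_t capacity;   /* number of bytes allocated for text */
} strbuf;

/* Appends value followed by ';'. Returns 0 on success, -1 if out of memory. */
int strbuf_append(strbuf *buf, const char *value)
{
    size_t value_len;
    size_t needed;

    assert(buf != NULL && value != NULL);
    value_len = strlen(value);
    needed = buf->length + value_len + 2;     /* value + ';' + NUL */

    if (needed > buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity : 64;
        char *grown;

        while (new_capacity < needed)
            new_capacity *= 2;                /* doubling keeps reallocations rare */
        grown = (char *)realloc(buf->text, new_capacity);
        if (grown == NULL)
            return -1;                        /* the old block is still valid */
        buf->text = grown;
        buf->capacity = new_capacity;
    }

    memcpy(buf->text + buf->length, value, value_len);
    buf->length += value_len;
    buf->text[buf->length++] = ';';
    buf->text[buf->length] = '\0';
    return 0;
}
```

In the question's loop, each snprintf into the fixed-size adj would become a strbuf_append(&buf, elem[i]) call, the final output would use buf.text with buf.length + 1 bytes, and free(buf.text) would follow once the pair has been added.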

Resources