I pasted code at the bottom that allocates lots of pointers but doesn't free any. I have a struct named Node that has fields of type struct Node**. In my main function I have the variable: Node** nodes = malloc(size * sizeof(Node*));. I would like to know how to properly deallocate nodes.
typedef struct Node {
size_t id; // identifier of the node
int data; // actual data
size_t num_parents; // actual number of parent nodes
size_t size_parents; // current maximum capacity of array of parent nodes
struct Node** parents; // all nodes that connect from "upstream"
size_t num_children; // actual number of child nodes
size_t size_children; // current maximum capacity of array of children nodes
struct Node** children; // all nodes that connect "downstream"
} Node;
I've pasted the whole code down at the bottom because it is already almost minimal (only things we don't need here are the printing function and find_smallest_value function). VS2019 also gives me two warnings for two lines within the main loop in the main function where I'm allocating each node:
Node** nodes = malloc((num_nodes + 1) * sizeof(Node*));
for (size_t i = 1; i <= num_nodes; i++) {
nodes[i] = malloc(sizeof(Node)); // WARNING Buffer overrun while writing to 'nodes': the writable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes might be written.
nodes[i]->id = i; // WARNING Reading invalid data from 'nodes': the readable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes may be read.
I don't understand these warnings at all. Finally, you can obtain large input for this program from this website. Just save it to a text file and modify the hardcoded file name in the main function. The program runs fine if I comment out the last lines where I try to deallocate my nodes. My attempt at deallocating crashes the program. I'd greatly appreciate if anyone could explain the correct way to do it.
Explaining the purpose of the code:
The code at the bottom has the following goal. I'm trying to build a directed graph where every vertex has a label and a value. An example of such a graph. The graphs I'm interested in all represent hierarchies. I am to perform two operations on these graphs: I. given a vertex, find the one with the smallest value that is above it in the hierarchy and print its value; II. given a pair of vertices, swap their places. For example, given vertices 4 and 2 in that figure, the result of operation II would be the same graph but the vertices labelled 2 and 4 would have their labels and data swapped. Given vertex 6, the result of operation I would be "18". I implemented both operations successfully, I believe.
My main function reads from a txt file in order to build the data structure, which I chose to be a multiply linked list. Any input file should be of the following format (this file generates the graph shown in the figure and performs some operations on it):
7 8 9
21 33 33 18 42 22 26
1 2
1 3
2 5
3 5
3 6
4 6
4 7
6 7
P 7
T 4 2
P 7
P 5
T 1 4
P 7
T 4 7
P 2
P 6
First line has three numbers: number of vertices (nodes), number of edges (k, connections) and number of instructions (l, either operation I or II).
Second line is the data in each node. Labels correspond to the index of the node.
The next k lines consist of two node labels: left is a parent node, right is a child node.
The next l lines consist of instructions. P stands for operation I and it's followed by the label of the node. T stands for operation II and it's followed by the two labels of the nodes to be swapped.
The entire pattern can repeat.
The code:
#include<stdlib.h>
#include<stdio.h>
typedef unsigned int uint;
typedef struct Node {
size_t id; // identifier of the node
int data; // actual data
size_t num_parents; // actual number of parent nodes
size_t size_parents; // current maximum capacity of array of parent nodes
struct Node** parents; // all nodes that connect from "upstream"
size_t num_children; // actual number of child nodes
size_t size_children; // current maximum capacity of array of children nodes
struct Node** children; // all nodes that connect "downstream"
} Node;
Node** reallocate_node_array(Node** array, size_t* size) {
Node** new_array = realloc(array, sizeof(Node*) * (*size) * 2);
if (new_array == NULL) {
perror("realloc");
exit(1);
}
*size *= 2;
return new_array;
}
// The intention is to pass `num_children` or `num_parents` as `size` in order to decrease them
void remove_node(Node** array, size_t* size, size_t index) {
for (size_t i = index; i < *size - 1; i++) {
array[i] = array[i + 1];
}
(*size)--; // the decrement to either `num_children` or `num_parents`
}
void remove_parent(Node* node, size_t id) {
for (size_t i = 0; i < node->num_parents; i++) {
if (node->parents[i]->id == id) {
remove_node(node->parents, &node->num_parents, i);
}
}
}
void remove_child(Node* node, size_t id) {
for (size_t i = 0; i < node->num_children; i++) {
if (node->children[i]->id == id) {
remove_node(node->children, &node->num_children, i);
}
}
}
void add_parent(Node* node, Node* parent) {
if (node->num_parents >= node->size_parents) {
node->parents = reallocate_node_array(node->parents, &node->size_parents);
}
node->parents[node->num_parents++] = parent;
}
void add_child(Node* node, Node* child) {
if (node->num_children >= node->size_children) {
node->children = reallocate_node_array(node->children, &node->size_children);
}
node->children[node->num_children++] = child;
}
uint number_of_digits(int n) {
uint d = 0;
do { d++; n /= 10; } while (n != 0);
return d;
}
// output format: "{ parent1.id parent2.id ... } [ id data ] { child1.id child2.id ... }"
void print_node(Node node) {
printf("{ ");
for (size_t i = 0; i < node.num_parents; i++) {
printf("%zu ", node.parents[i]->id);
}
printf("} [ %zu %d ] { ", node.id, node.data);
for (size_t i = 0; i < node.num_children; i++) {
printf("%zu ", node.children[i]->id);
}
printf("}\n");
}
void switch_nodes(Node* n1, Node* n2, Node** array) {
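size_t temp_id = n1->id; // same type as Node.id, so large ids are not truncated
int temp_data = n1->data; // same type as Node.data, so negative values survive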
n1->id = n2->id;
n1->data = n2->data;
n2->id = temp_id;
n2->data = temp_data;
Node* temp = array[n1->id];
array[n1->id] = array[n2->id];
array[n2->id] = temp;
}
int find_smallest_valued_parent(Node* node, uint depth) {
// has no parents
if (node->num_parents == 0 || node->parents == NULL) {
if (depth == 0) return -1; // there was no parent on first call (nothing to report)
else return node->data;
}
else {
depth++;
int minimum_value = node->parents[0]->data; // we're guaranteed 1 parent
for (size_t i = 0; i < node->num_parents; i++) {
int next_value = find_smallest_valued_parent(node->parents[i], depth);
if (node->parents[i]->data < next_value) next_value = node->parents[i]->data;
if (next_value < minimum_value) minimum_value = next_value;
}
return minimum_value;
}
}
void free_node_array(Node** array, size_t start, size_t end) {
for (size_t i = start; i < end; i++) {
free(array[i]);
}
free(array);
}
int main() {
char* file_name = "input_feodorv.txt";
FILE* data_file = fopen(file_name, "r");
if (data_file == NULL) {
printf("Error: invalid file %s", file_name);
return 1;
}
for (;;) {
size_t num_nodes, num_relationships, num_instructions;
if (fscanf(data_file, "%zu %zu %zu\n", &num_nodes, &num_relationships, &num_instructions) == EOF)
break;
Node** nodes = malloc((num_nodes + 1) * sizeof(Node*));
for (size_t i = 1; i <= num_nodes; i++) {
nodes[i] = malloc(sizeof(Node)); // WARNING Buffer overrun while writing to 'nodes': the writable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes might be written.
nodes[i]->id = i; // WARNING Reading invalid data from 'nodes': the readable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes may be read.
fscanf(data_file, "%d ", &nodes[i]->data); // data is an int, so read it with %d
nodes[i]->num_children = 0;
nodes[i]->size_children = 2;
nodes[i]->children = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->children[j] = malloc(sizeof(Node));
nodes[i]->num_parents = 0;
nodes[i]->size_parents = 2;
nodes[i]->parents = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->parents[j] = malloc(sizeof(Node));
}
for (size_t i = 0; i < num_relationships; i++) {
size_t parent_id, child_id;
fscanf(data_file, "%zu %zu\n", &parent_id, &child_id);
add_child(nodes[parent_id], nodes[child_id]);
add_parent(nodes[child_id], nodes[parent_id]);
}
for (size_t i = 0; i < num_instructions; i++) {
char instruction;
fscanf(data_file, "%c ", &instruction);
if (instruction == 'P') {
size_t id;
fscanf(data_file, "%zu\n", &id);
int minimum_value = find_smallest_valued_parent(nodes[id], 0);
if (minimum_value == -1) printf("*\n");
else printf("%d\n", minimum_value);
}
else {
size_t n1_id, n2_id;
fscanf(data_file, "%zu %zu\n", &n1_id, &n2_id);
switch_nodes(nodes[n1_id], nodes[n2_id], nodes);
}
}
/**/
for (size_t i = 1; i <= num_nodes; i++) {
free_node_array(nodes[i]->parents, 0, nodes[i]->size_parents);
free_node_array(nodes[i]->children, 0, nodes[i]->size_children);
}
free_node_array(nodes, 0, num_nodes);
/**/
}
}
There is a memory leak in your code. In the main() function, you are doing:
nodes[i]->children = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->children[j] = malloc(sizeof(Node));
and
nodes[i]->parents = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->parents[j] = malloc(sizeof(Node));
That means you are allocating memory for each of the nodes[i]->children[j] and nodes[i]->parents[j] pointers.
In the add_child() and add_parent() functions, you then make those pointers point to some other node, which loses the only reference to the memory allocated above:
void add_parent(Node* node, Node* parent) {
.....
node->parents[node->num_parents++] = parent;
}
void add_child(Node* node, Node* child) {
.....
node->children[node->num_children++] = child;
}
You don't actually need to allocate memory for the nodes[i]->children[j] and nodes[i]->parents[j] pointers in main(), because these pointers are supposed to point to existing nodes of the graph, and you are already allocating memory for those nodes here in main():
nodes[i] = malloc(sizeof(Node));
nodes[i] is an element of the array of all the nodes of the given graph, and the children and parents pointers should point only to these nodes.
Now coming to freeing these pointers:
The way you are freeing the nodes of graph is not correct. Look at free_node_array() function:
void free_node_array(Node** array, size_t start, size_t end) {
for (size_t i = start; i < end; i++) {
free(array[i]);
}
free(array);
}
and you are calling it in this way:
for (size_t i = 1; i <= num_nodes; i++) {
free_node_array(nodes[i]->parents, 0, nodes[i]->size_parents);
free_node_array(nodes[i]->children, 0, nodes[i]->size_children);
}
That means you are freeing the nodes pointed to by the elements of the nodes[i]->parents and nodes[i]->children arrays. Those elements are pointers to elements of the nodes array. A node can be a child of one or more parents, and a parent node can have more than one child. Now assume a child node is pointed to by two parent nodes, say n1 and n2. When free_node_array() is called on n1's children array, it ends up freeing that child node; when it is later called on n2's children array, it tries to free the same node again, which is a double free.
So this way of freeing the memory is not correct. The correct way is simply to free the elements of the nodes array, because that array contains all the nodes of the given graph and the parents and children pointers only point to those same nodes. There is no need to traverse the hierarchy of parent and child nodes. To free the graph appropriately, you should do the following:
Traverse the nodes array and, for each element of the array:
Free the array of parent pointers (free(nodes[i]->parents)).
Free the array of child pointers (free(nodes[i]->children)).
Free that element of the nodes array (free(nodes[i])).
Once this is done, free the nodes array itself (free(nodes)).
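The steps above can be sketched as follows. This is only a minimal sketch, not the asker's final code: node_create() and free_graph() are hypothetical helper names, and it assumes the layout from the question, where nodes are stored at indices 1..num_nodes and slot 0 of the nodes array is unused. Note that the question's call free_node_array(nodes, 0, num_nodes) walks indices 0..num_nodes-1, so it passes the never-assigned nodes[0] to free() and skips nodes[num_nodes]; the loop below uses the same 1-based range as the allocation loop.

```c
#include <stdlib.h>

typedef struct Node {
    size_t id;
    int data;
    size_t num_parents, size_parents;
    struct Node **parents;
    size_t num_children, size_children;
    struct Node **children;
} Node;

/* Hypothetical helper: allocate one node whose edge arrays have capacity 2.
   The slots themselves are left unset; they are only ever made to point at
   existing nodes, so no Node is allocated per slot. */
Node *node_create(size_t id, int data) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->id = id;
    n->data = data;
    n->num_parents = n->num_children = 0;
    n->size_parents = n->size_children = 2;
    n->parents = malloc(n->size_parents * sizeof *n->parents);
    n->children = malloc(n->size_children * sizeof *n->children);
    if (n->parents == NULL || n->children == NULL) {
        free(n->parents);   /* free(NULL) is a no-op */
        free(n->children);
        free(n);
        return NULL;
    }
    return n;
}

/* Hypothetical helper: free every node stored in nodes[1..num_nodes],
   then the nodes array itself. Each node is freed exactly once, no matter
   how many parents or children point at it. */
void free_graph(Node **nodes, size_t num_nodes) {
    if (nodes == NULL) return;
    for (size_t i = 1; i <= num_nodes; i++) {
        if (nodes[i] == NULL) continue;
        free(nodes[i]->parents);   /* the pointer array only, not the nodes in it */
        free(nodes[i]->children);
        free(nodes[i]);
    }
    free(nodes);
}
```

With this in place, the cleanup at the end of each pass of the main loop would reduce to a single free_graph(nodes, num_nodes) call.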
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct student{
int grade;
int enrollCode;
}student;
typedef struct colVoidStar{
int capacity;
int num_itens_curr;
void **arr;
int current_pos;
}colVoidStar;
colVoidStar *colCreate(int capacity){
if(capacity > 0){
colVoidStar *c = malloc(sizeof(colVoidStar));
if(c != NULL){
c->arr = (void**)malloc(sizeof(void*)*capacity);
if( c->arr != NULL){
c->num_itens_curr = 0;
c->capacity = capacity;
return c;
}
free(c->arr);
}
free(c);
}
return NULL;
}
int colInsert(colVoidStar *c, void *item){
if(c != NULL){
if(c->num_itens_curr < c->capacity){
c->arr[c->num_itens_curr] = (student*)item;
c->num_itens_curr++;
return 1;
}
}
return 0;
}
void *colRemove(colVoidStar *c, void *key, int compar1(void* a, void* b)){
int(*ptrCompar)(void*, void*) = compar1;
student* eleRemoved;
if(c != NULL){
if(c->num_itens_curr > 0){
int i = 0;
for(i; i < c->num_itens_curr; i++){
if(ptrCompar((void*)key, (void*)c->arr[i]) == 0){
eleRemoved = (student*)c->arr[i];
for(int j = i; j < c->num_itens_curr; j++){
c->arr[i] = c->arr[i + 1];
c->arr[i + 1] = 0;
}
return (void*)eleRemoved;
}
return NULL;
}
}
}
return NULL;
}
int compar1(void *a, void*b){
int key;
student *item;
key = *(int*)a;
item = (student*)b;
return (int)(key - item->enrollCode);
}
int main(){
int finishProgram = 0, choose, capacity, returnInsert, removeEnroll;
colVoidStar *c;
student *a, *studentRemoved;
while(finishProgram != 9){
printf("-----------------panel-----------------------\n");
printf("Type: \n");
printf("[1] to create a collection;\n");
printf("[2] to insert a student;\n");
printf("[3] to remove some student of collection;\n");
printf("--------------------------------------------------------\n");
scanf("%d", &choose);
switch(choose){
case 1:
printf("Type the maximum of students the collection will have: \n");
scanf("%d", &capacity);
c = (colVoidStar*)colCreate(capacity);
if(c == NULL){
printf("Error in create collection!\n");
}
break;
case 2:
if(c->num_itens_curr < capacity){
a = (student*)malloc(sizeof(student));
printf("%d student:(type the Grade and the Enroll code, back-to-back)\n", c->num_itens_curr + 1);
scanf("%d %d", &a->grade, &a->enrollCode);
returnInsert = colInsert(c, (void*)a);
if(returnInsert == 1){
for(int i = 0; i < c->num_itens_curr; i++){
printf("The student added has grade = %d e enrollCode = %d \n", (((student*)c->arr[i])->grade), ((student*)c->arr[i])->enrollCode);
}
}else{
printf("the student wasn't added in the collection\n");
}
}else{
printf("it's not possible to add more students to the colletion, since the limit of elements was reached!");
}
break;
case 3:
printf("Type an enrollcode to remove the student attached to it:\n");
scanf("%d", &removeEnroll);
studentRemoved = (student*)colRemove(c, &removeEnroll, compar1(&removeEnroll, c->arr[0]));
if(studentRemoved != NULL)
printf("the student removed has grade = %d and enrollcode %d.", studentRemoved->grade, studentRemoved->enrollCode);
else
printf("the number typed wasn't found");
break;
}
}
return 0;
}
As you can see, what I'm trying to do, at least at this point, is access and remove an item (a student* that is initially stored as a void*) from a collection of students (void **arr) using an enrollment code. However, I'm getting a segmentation fault and can't understand why or how to fix it, hence my question. Debugging the code, I found that the errors occur at if(ptrCompar((void*)key, (void*)c->arr[i]) == 0) inside colRemove and at return (int)(key - item->enrollCode) inside compar1.
Also, if you can point me to any articles or documentation that would help me understand how to deal with problems like this, I'd appreciate it a lot.
Here are the problems I see in colRemove:
(Not really a problem, just a matter of style) Although the function parameter int compar1(void* a, void* b) is OK, it is more conventional to use the syntax int (*compar1)(void* a, void* b).
(Not really a problem) Having both compar1 and ptrCompar pointing to the same function is redundant. It is probably better to name the parameter ptrCompar to avoid reader's confusion with the compar1 function defined elsewhere in the code.
The function is supposed to be general-purpose and shouldn't be using student* for the eleRemoved variable. Perhaps that was just for debugging? It should be void*.
After the element to be removed has been found, the remaining code is all wrong:
c->num_itens_curr has not been decremented to reduce the number of items.
The code is accessing c->arr[i] and c->arr[i + 1] instead of c->arr[j] and c->arr[j + 1].
c->arr[j + 1] may be accessing beyond the last element because the loop termination condition is off by 1. This may be because c->num_itens_curr was not decremented.
The assignment c->arr[j + 1] = 0; is not really needed because all but the last element will be overwritten on the next iteration, and the value of the old last element does not matter because the number of items should be reduced by 1.
(Not really a problem) There is unnecessary use of type cast operations in the function (e.g. casting void * to void *).
Here is a corrected and maybe improved version of the function (using fewer variables):
void *colRemove(colVoidStar *c, void *key, int (*ptrCompar)(void* a, void* b)){
void* eleRemoved = NULL;
if(c != NULL){
int i;
/* Look for item to be removed. */
for(i = 0; i < c->num_itens_curr; i++){
if(ptrCompar(key, c->arr[i]) == 0){
/* Found it. */
eleRemoved = c->arr[i];
c->num_itens_curr--; /* There is now one less item. */
break;
}
}
/* Close the gap. */
for(; i < c->num_itens_curr; i++){
c->arr[i] = c->arr[i + 1];
}
}
return eleRemoved;
}
In addition, this call of colRemove from main is incorrect:
studentRemoved = (student*)colRemove(c, &removeEnroll, compar1(&removeEnroll, c->arr[0]));
The final argument should be a pointer to the compar1 function, but the code is actually passing the result of a call to the compar1 function which is of type int. It should be changed to this:
studentRemoved = (student*)colRemove(c, &removeEnroll, compar1);
or, removing the unnecessary type cast of the void* to student*:
studentRemoved = colRemove(c, &removeEnroll, compar1);
The colInsert function is also supposed to be general-purpose so should not use this inappropriate type cast to student*:
c->arr[c->num_itens_curr] = (student*)item;
Perhaps that was also for debugging purposes, but it should just be using item as-is:
c->arr[c->num_itens_curr] = item;
As pointed out by @chux in the comments on the question, the expression key - item->enrollCode in the return statement of compar1 may overflow. I recommend changing it to something like this:
return key < item->enrollCode ? -1 : key > item->enrollCode ? 1 : 0;
or changing it to use this sneaky trick:
return (key > item->enrollCode) - (key < item->enrollCode);
Like other users, I am using this Wikipedia algorithm, but I have tried to reimplement it using pointer arithmetic, and I'm having difficulty finding where I've gone wrong.
I think that this if statement is probably the cause, but I'm not sure.
...
if (left >= right) {
ret = (right - ptr);
return ret;
}
temp = *left;
*left = *right;
*right = temp;
/* sortstuff.h */
extern void quicksort(const size_t n, int * ptr);
/* sortstuff.c */
size_t quicksortpartition(const size_t n, int * ptr);
void quicksort(const size_t n, int * ptr) {
int* end = ptr + n - 1;
// for debug purposes
if (original_ptr == NULL) {
original_ptr = ptr;
original_count = n;
}
if (n > 1) {
size_t index = quicksortpartition(n, ptr);
quicksort(index, ptr);
quicksort(n - index - 1, ptr + index + 1);
}
return;
}
size_t quicksortpartition(const size_t n, int * ptr) {
int* right = ptr + n - 1;
int* pivot = ptr + (n - 1) / 2;
int* left = ptr;
int temp;
size_t ret = 0;
while (1) {
while (*left <= *pivot && left < pivot) {
++left;
}
while (*right > *pivot) {
--right;
}
if (left >= right) {
ret = (right - ptr);
return ret;
}
temp = *left;
*left = *right;
*right = temp;
//print_arr();
}
}
int main(void) {
}
/* main.c */
int array0[] = {5, 22, 16, 3, 1, 14, 9, 5};
const size_t array0_count = sizeof(array0) / sizeof(array0[0]);
int main(void) {
quicksort(array0_count, array0);
printf("array out: ");
for (size_t i = 0; i != array0_count; ++i) {
printf("%d ", array0[i]);
}
puts("");
}
I don't think there are any off by one errors
The code you have presented does not accurately implement the algorithm you referenced. Consider in particular this loop:
while (*left <= *pivot && left < pivot) {
++left;
}
The corresponding loop in the algorithm description has no analog of the left < pivot loop-exit criterion, and its analog of *left <= *pivot uses strict less-than (<), not (<=).
It's easy to see that the former discrepancy must constitute an implementation error. The final sorted position of the pivot is where the left and right pointers meet, but the condition prevents the left pointer from ever advancing past the initial position of the pivot. Thus, if the correct position is to the right of the initial position, the partition function cannot return the correct value. It takes a more thoughtful analysis to realize that the partition function is also prone to looping infinitely in that case, though I think that's somewhat data-dependent.
The latter discrepancy constitutes a provisional error. It risks overrunning the end of the array in the event that the selected pivot value happens to be the largest value in the array, but that's based in part on the fact that the left < pivot condition is erroneous and must be removed. You could replace the latter with left < right to resolve that issue, but although you could form a working sort that way, it probably would not be an improvement on the logic details presented in the algorithm description.
Note, however, that with the <= variation, either quicksortpartition() needs to do extra work (not presently provided for) to ensure that the pivot value ends up at the computed pivot position, or else the quicksort function needs to give up its assumption that that will happen. The former is more practical, supposing you want your sort to be robust.
Pivot needs to be an int, not a pointer. Also to more closely follow the Wiki algorithm, the parameters should be two pointers, not a count and a pointer. I moved the partition logic into the quick sort function.
void QuickSort(int *lo, int *hi)
{
int *i, *j;
int p, t;
if(lo >= hi)
return;
p = *(lo + (hi-lo)/2);
i = lo - 1;
j = hi + 1;
while (1){
while (*(++i) < p);
while (*(--j) > p);
if (i >= j)
break;
t = *i;
*i = *j;
*j = t;
}
QuickSort(lo, j);
QuickSort(j+1, hi);
}
The call would be:
QuickSort(array0, array0+array0_count-1);
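If you would rather keep the original quicksort(n, ptr) interface, the same Hoare scheme can be sketched with a count and a pointer. This is only a sketch under a few assumptions: hoare_partition is a hypothetical helper name, the pivot is copied by value (so swapping elements cannot change it), and the two pointers start on the first and last elements and step inward after each swap, which avoids ever forming a pointer before the start of the array.

```c
#include <stddef.h>

/* Hoare partition of ptr[0..n-1], for n >= 2. Returns an index j such that
   every element of ptr[0..j] is <= every element of ptr[j+1..n-1]. */
static size_t hoare_partition(int *ptr, size_t n) {
    int pivot = ptr[(n - 1) / 2];   /* copy the value, not a pointer to it */
    int *i = ptr;
    int *j = ptr + n - 1;
    for (;;) {
        while (*i < pivot) ++i;     /* strict comparisons, as in the reference algorithm */
        while (*j > pivot) --j;
        if (i >= j)
            return (size_t)(j - ptr);
        int t = *i;
        *i = *j;
        *j = t;
        ++i;                        /* step past the pair just swapped */
        --j;
    }
}

void quicksort(size_t n, int *ptr) {
    if (n < 2)
        return;
    size_t j = hoare_partition(ptr, n);
    quicksort(j + 1, ptr);                  /* left part includes index j */
    quicksort(n - j - 1, ptr + j + 1);      /* right part starts at j + 1 */
}
```

The call from main.c stays quicksort(array0_count, array0). Note that with Hoare partitioning the returned index is a split point, not the final position of the pivot, which is why the left recursion covers j + 1 elements instead of j.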
This is a tricky problem that I have been thinking about for a long time and have yet to see a satisfactory answer anywhere. Let's say I have a large int array of size 10000. I can simply declare it in the following manner:
int main()
{
int foo[10000];
int i;
int n;
n = sizeof(foo) / sizeof(int);
for (i = 0; i < n; i++)
{
printf("Index %d is %d\n",i,foo[i] );
}
return 0;
}
It is pretty clear that each index in the array will hold a random assortment of numbers before I formally initialize them:
Index 0 is 0
Index 1 is 0
Index 2 is 0
Index 3 is 0
.
.
.
Index 6087 is 0
Index 6088 is 1377050464
Index 6089 is 32767
Index 6090 is 1680893034
.
.
.
Index 9996 is 0
Index 9997 is 0
Index 9998 is 0
Index 9999 is 0
Then let's say that I initialize select index ranges of my array with values that have a specific meaning for the program as a whole and must be preserved, with the goal of passing those values to some function for subsequent operations:
//Call this block 1
foo[0] = 0;
foo[1] = 7;
foo[2] = 99;
foo[3] = 0;
//Call this block 2
foo[9996] = 0;
foo[9997] = 444;
foo[9998] = 2;
foo[9999] = 0;
for (i = 0; i < (What goes here?); i++)
{
//I must pass in only those values initialized to select indices of foo[] (Blocks 1 and 2 uncorrupted)
//How to recover those values to pass into foo_func()?
foo_func(foo[]);
}
Some of those values that I initialized foo[] with overlap with pre-existing values in the array before formally initializing the array myself. How can I pass in just the indices of the array elements that I initialized, given that there are multiple index ranges? I just can't figure this out. Thanks for any and all help!
EDIT:
I should also mention that the array itself will be read from a .txt file. I just showed the initialization in the code for illustrative purposes.
There are a number of ways you can quickly zero out the memory in an array, either while initializing it or afterwards.
For an array on the stack, initialize it with zeros. {0} is shorthand for that.
int foo[10000] = {0};
For an array on the heap, use calloc to allocate memory and initialize it with 0's.
int *foo = calloc(10000, sizeof(int));
If the array already exists, use memset to quickly overwrite all the array's memory with zeros.
memset(foo, 0, sizeof(int) * 10000);
Now all elements are zero. You can set individual elements to whatever you like one by one. For example...
int main() {
int foo[10] = {0};
foo[1] = 7;
foo[2] = 99;
foo[7] = 444;
foo[8] = 2;
for( int i = 0; i < 10; i++ ) {
printf("%d - %d\n", i, foo[i]);
}
}
That will print...
0 - 0
1 - 7
2 - 99
3 - 0
4 - 0
5 - 0
6 - 0
7 - 444
8 - 2
9 - 0
As a side note, using only a few elements of a large array is a waste of memory. Instead, use a hash table, or if you need ordering, some type of tree. These can be difficult to implement correctly, but a library such as GLib can provide you with good implementations.
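If 0 is itself a meaningful value (as with foo[0] and foo[3] in the question), zeroing alone cannot tell "assigned 0" apart from "never assigned". One simple alternative, which is not the hash table suggested above but a plain parallel flag array, is sketched below. All names here (tracked_array, tracked_set, tracked_each, add_to_sum, FOO_LEN) are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define FOO_LEN 10000   /* array size taken from the question */

/* Hypothetical wrapper: the values, plus one "was this index assigned?"
   flag per element. */
typedef struct tracked_array {
    int  value[FOO_LEN];
    bool is_set[FOO_LEN];
} tracked_array;

/* Store v at index and remember that the slot now holds real data.
   Returns false if index is out of range. */
bool tracked_set(tracked_array *t, size_t index, int v) {
    if (index >= FOO_LEN)
        return false;
    t->value[index] = v;
    t->is_set[index] = true;
    return true;
}

/* Call visit() for every assigned index, in ascending order.
   Returns the number of elements visited. */
size_t tracked_each(const tracked_array *t,
                    void (*visit)(size_t index, int value, void *user_data),
                    void *user_data) {
    size_t visited = 0;
    for (size_t i = 0; i < FOO_LEN; i++) {
        if (t->is_set[i]) {
            visit(i, t->value[i], user_data);
            visited++;
        }
    }
    return visited;
}

/* Example visitor: adds each value to the long pointed to by user_data. */
void add_to_sum(size_t index, int value, void *user_data) {
    (void)index;
    *(long *)user_data += value;
}
```

The structure must start out zeroed so that every flag is false, for example by declaring it as static tracked_array foo; or with = {0}. When reading the values from a file, you would call tracked_set() for each index that the file provides and then let tracked_each() hand only those elements to your function. This still costs memory for all 10000 slots, so the side note above about sparse data applies here as well.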
Introduction
I'm making a strong assumption about your problem: sparsity (a majority of the elements in your array will remain zero).
Under this assumption I would build the array as a list. I'm including some sample code that is not complete and is not intended to be; you should do your own homework :)
The core object is a struct with a pointer to a begin element and the size:
typedef struct vector {
size_t size;
vector_element_t * begin;
} vector_t;
each element of the vector has its own index and value and a pointer to the next element in a list:
typedef struct vector_element vector_element_t;
struct vector_element {
int value;
size_t index;
vector_element_t *next;
};
On this basis we can build a dynamic vector as a list, dropping the ordering constraint (it is not needed, though you can modify this code to maintain ordering), using some simple custom functions:
vector_t * vector_init(); // Initialize an empty array
void vector_destroy(vector_t* v); // Destroy the content and the array itself
int vector_get(vector_t *v, size_t index); // Get an element from the array, by searching the index
size_t vector_set(vector_t *v, size_t index, int value); // Set an element at the index
void vector_delete(vector_t *v, size_t index); // Delete an element from the vector
void vector_each(vector_t *v, int(*f)(size_t index, int value)); // Executes a callback for each element of the list
// This last function may be the response to your question
The main example
This is a main that uses all these functions and prints to the console:
int callback(size_t index, int value) {
    printf("Vector[%zu] = %d\n", index, value); // %zu is the format for size_t
    return value;
}
int main() {
vector_t * vec = vector_init();
vector_set(vec, 10, 5);
vector_set(vec, 23, 9);
vector_set(vec, 1000, 3);
printf("vector_get(vec, %d) = %d\n", 1000, vector_get(vec, 1000)); // This should print 3
printf("vector_get(vec, %d) = %d\n", 1, vector_get(vec, 1)); // this should print 0
printf("size(vec) = %zu\n", vec->size); // this should print 3 (the number of initialized elements)
vector_each(vec, callback); // Calling the callback on each element of the
// array that is initialized, as you asked.
vector_delete(vec, 23);
printf("size(vec) = %zu\n", vec->size);
vector_each(vec, callback); // Calling the callback on each element of the array
vector_destroy(vec);
return 0;
}
And the output:
vector_get(vec, 1000) = 3
vector_get(vec, 1) = 0
size(vec) = 3
Vector[10] = 5
Vector[23] = 9
Vector[1000] = 3
size(vec) = 3
Vector[10] = 5
Vector[1000] = 3
The callback with the function vector_each is something you really should look at.
Implementations
I'm giving you some trivial implementations of the functions from the introduction. They are not complete, and some checks on pointers should be added. I'm leaving that to you. As it is, this code is not for production and under some circumstances can also overflow.
The delicate part is searching for a specific element in the vector. Every lookup traverses the list, and this is convenient only if you have sparsity (the majority of your indexes will always return zero).
In this implementation, if you access an index that is not in the list, you get 0 as a result. If you don't want this, you should define an error callback.
Initialization and destruction
When we initialize, we allocate the memory for our vector, but with no elements inside, thus begin points to NULL. When we destroy the vector we have not only to free the vector, but also each element contained.
vector_t * vector_init() {
vector_t * v = (vector_t*)malloc(sizeof(vector_t));
if (v) {
v->begin = NULL;
v->size = 0;
return v;
}
return NULL;
}
void vector_destroy(vector_t *v) {
    if (v) {
        vector_element_t * curr = v->begin;
        while (curr) {
            // Save the successor before freeing the current element,
            // so that freed memory is never read
            vector_element_t * next = curr->next;
            free(curr);
            curr = next;
        }
        free(v);
    }
}
The get and set methods
In get you can see how the list works (the same concept is also used in set and delete): we start from begin and walk the list until we reach an element with an index equal to the one requested. If we cannot find it, we simply return 0. If we need to "raise some sort of signal" when the value is not found, it is easy to implement an "error callback".
As long as sparsity holds, searching the whole list for an index is a good compromise in terms of memory requirements, and efficiency may not be an issue.
int vector_get(vector_t *v, size_t index) {
vector_element_t * el = v->begin;
while (el != NULL) {
if (el->index == index)
return el->value;
el = el->next;
}
return 0;
}
// Gosh, this set function is really a mess... I hope you can understand it...
// -.-'
size_t vector_set(vector_t *v, size_t index, int value) {
vector_element_t * el = v->begin;
// Case 1: Initialize the first element of the array
if (el == NULL) {
el = (vector_element_t *)malloc(sizeof(vector_element_t));
if (el != NULL) {
v->begin = el;
v->size += 1;
el->index = index;
el->value = value;
el->next = NULL;
return v->size;
} else {
return 0;
}
}
// Case 2: Search for the element in the array
while (el != NULL) {
if (el->index == index) {
el->value = value;
return v->size;
}
// Case 3: if there is no element with that index creates a new element
if (el->next == NULL) {
el->next = (vector_element_t *)malloc(sizeof(vector_element_t));
if (el->next != NULL) {
v->size += 1;
el->next->index = index;
el->next->value = value;
el->next->next = NULL;
return v->size;
}
return 0;
}
el = el->next;
}
return 0; // Not reached: the loop always returns from case 2 or case 3
}
Deleting an element
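With this approach it is possible to delete an element quite easily, by connecting curr->next to curr->next->next. We must, though, free the old curr->next, handle the case where the element to delete is the first one, and decrement size. (The sample output above was produced without the size update, which is why it still reports 3 after the delete.)
void vector_delete(vector_t * v, size_t index) {
    vector_element_t *curr = v->begin;
    if (curr == NULL)
        return; // Empty list: nothing to delete
    if (curr->index == index) { // Deleting the first element
        v->begin = curr->next;
        free(curr);
        v->size -= 1;
        return;
    }
    vector_element_t *next = curr->next;
    while (next != NULL) {
        if (next->index == index) {
            curr->next = next->next; // Unlink, then free the removed element
            free(next);
            v->size -= 1;
            return;
        } else {
            curr = next;
            next = next->next;
        }
    }
}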
An iteration function
I think this is the answer to the last part of your question: instead of passing a sequence of indexes, you pass a callback to the vector. The callback gets and sets the value at a specific index. If you want to operate only on some specific indexes, you may include a check in the callback itself. If you need to pass more data to the callback, check the very last section.
void vector_each(vector_t * v, int (*f)(size_t index, int value)) {
vector_element_t *el = v->begin;
while (el) {
el->value = f(el->index, el->value);
el = el->next;
}
}
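To illustrate the "check in the callback itself" idea, here is a small sketch. The type definitions are assumed from the fields used in this answer, and double_even is a hypothetical callback invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions, mirroring the fields used in this answer. */
typedef struct vector_element {
    size_t index;
    int value;
    struct vector_element *next;
} vector_element_t;

typedef struct vector {
    size_t size;
    vector_element_t *begin;
} vector_t;

/* Same function as in the answer. */
void vector_each(vector_t *v, int (*f)(size_t index, int value)) {
    vector_element_t *el = v->begin;
    while (el) {
        el->value = f(el->index, el->value);
        el = el->next;
    }
}

/* Hypothetical callback: doubles values at even indexes, keeps the others. */
int double_even(size_t index, int value) {
    return (index % 2 == 0) ? value * 2 : value;
}

/* Usage: elements live on the stack here just to keep the example short. */
void vector_each_demo(void) {
    vector_element_t second = { 5, 10, NULL };
    vector_element_t first = { 4, 10, &second };
    vector_t v = { 2, &first };
    vector_each(&v, double_even);
    assert(first.value == 20);   /* index 4 is even: doubled */
    assert(second.value == 10);  /* index 5 is odd: unchanged */
}
```

Returning the unchanged value is how a callback "skips" an index, since vector_each always stores the result back.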
Error callback
You may want to raise an out-of-bounds error or something similar. One solution is to enrich your list with a function pointer representing a callback that is called when the user asks for an undefined element:
typedef struct vector {
size_t size;
vector_element_t *begin;
void (*error_undefined)(struct vector *v, size_t index);
} vector_t;
and maybe at the end of your vector_get function you may want to do something like:
int vector_get(vector_t *v, size_t index) {
// [ . . .]
// you know at index the element is undefined:
if (v->error_undefined)
v->error_undefined(v, index);
// Do something to clean up the user mess... or simply
return 0; // every path must return a value, with or without a callback
}
Usually it is also nice to add a helper function to set the callback.
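Such a helper could look roughly like this. It is a sketch under assumptions: vector_on_undefined, vector_error_fn and record_missing are hypothetical names, and the type definitions are reconstructed from the fields used in this answer.

```c
#include <assert.h>
#include <stddef.h>

struct vector;  /* forward declaration so the callback type can refer to it */
typedef void (*vector_error_fn)(struct vector *v, size_t index);

typedef struct vector_element {
    size_t index;
    int value;
    struct vector_element *next;
} vector_element_t;

typedef struct vector {
    size_t size;
    vector_element_t *begin;
    vector_error_fn error_undefined;
} vector_t;

/* Hypothetical helper: installs fn and returns the previous callback. */
vector_error_fn vector_on_undefined(vector_t *v, vector_error_fn fn) {
    vector_error_fn old = v->error_undefined;
    v->error_undefined = fn;
    return old;
}

int vector_get(vector_t *v, size_t index) {
    for (vector_element_t *el = v->begin; el != NULL; el = el->next)
        if (el->index == index)
            return el->value;
    if (v->error_undefined)            /* undefined index: notify the user */
        v->error_undefined(v, index);
    return 0;
}

/* Hypothetical callback that just records what was asked for. */
size_t last_missing_index = 0;
int missing_count = 0;

void record_missing(vector_t *v, size_t index) {
    (void)v;
    missing_count += 1;
    last_missing_index = index;
}

/* Usage: install the callback, then read a defined and an undefined index. */
void error_callback_demo(void) {
    vector_element_t e = { 3, 42, NULL };
    vector_t v = { 1, &e, NULL };
    int before = missing_count;
    vector_on_undefined(&v, record_missing);
    assert(vector_get(&v, 3) == 42);   /* defined: callback not called */
    assert(vector_get(&v, 9) == 0);    /* undefined: callback called once */
    assert(missing_count == before + 1 && last_missing_index == 9);
}
```

Returning the old callback lets a caller restore it later, which is a design choice of this sketch rather than something the answer requires.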
Passing user data to "each" callback
If you want to pass more data to the user callback, you may add a void* as last argument:
void vector_each(vector_t * v, void * user_data, int (*f)(size_t index, int value, void * user_data));
void vector_each(vector_t * v, void * user_data, int (*f)(size_t index, int value, void * user_data)) {
[...]
el->value = f(el->index, el->value, user_data);
[...]
}
If the user does not need it, they can simply pass NULL.
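As an illustration of what user_data is good for, the following sketch sums all stored values without any global variable. The type definitions are assumed from the fields used in this answer; sum_into and vector_sum are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions, mirroring the fields used in this answer. */
typedef struct vector_element {
    size_t index;
    int value;
    struct vector_element *next;
} vector_element_t;

typedef struct vector {
    size_t size;
    vector_element_t *begin;
} vector_t;

void vector_each(vector_t *v, void *user_data,
                 int (*f)(size_t index, int value, void *user_data)) {
    for (vector_element_t *el = v->begin; el != NULL; el = el->next)
        el->value = f(el->index, el->value, user_data);
}

/* Hypothetical callback: adds each value to the long that user_data points to. */
int sum_into(size_t index, int value, void *user_data) {
    (void)index;
    *(long *)user_data += value;
    return value;                 /* leave the stored value unchanged */
}

/* Hypothetical wrapper showing how the accumulator is passed in. */
long vector_sum(vector_t *v) {
    long total = 0;
    vector_each(v, &total, sum_into);
    return total;
}

/* Usage: two elements holding 10 and 20. */
void user_data_demo(void) {
    vector_element_t second = { 8, 20, NULL };
    vector_element_t first = { 1, 10, &second };
    vector_t v = { 2, &first };
    assert(vector_sum(&v) == 30);
    assert(first.value == 10 && second.value == 20);
}
```

The callback has to know what user_data really points to, so the cast inside sum_into and the argument passed in vector_sum must agree.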
I am new to C, so I am having trouble making a hash table and malloc-ing space for it.
I am writing an anagram solver. Right now I am still at the step where I create the hash table for this program. I am trying to test my insert function to see if it is working properly by calling the function once with some random arguments.
However, I kept getting segmentation faults, and I used valgrind to track down where it crashes.
Can you point out what I am missing?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int hash(char *word)
{
int h = 0;
int i, j;
char *A;
char *a;
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
for (i = 0; i < 26; i++) {
A[i] = (char)(i + 65); // fill the array from A to Z
a[i] = (char)(i + 97); // fill the array from a to z
}
for (i = 0; i < strlen(word); i++) {
for (j = 0; j < 26; j++) {
// upper and lower case have the same hash value
if (word[i] == A[j] || word[i] == a[j]) {
h += j; // get the hash value of the word
break;
}
}
}
return h;
}
typedef struct Entry {
char *word;
int len;
struct Entry *next;
} Entry;
#define TABLE_SIZE 20 // test number
Entry *table[TABLE_SIZE] = { NULL };
void init() {
// create memory spaces for each element
struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));
int i;
// initialize
for (i = 0; i < TABLE_SIZE; i++) {
en->word = "";
en->len = 0;
en->next = table[i];
table[i] = en;
}
}
void insertElement(char *word, int len) {
int h = hash(word);
int i = 0;
// check if value has already existed
while(i < TABLE_SIZE && (strcmp(table[h]->word, "") != 0)) {
// !!!! NEXT LINE IS WHERE IT CRASHES !!!
if (strcmp(table[h]->word, word) == 0) { // found
table[h]->len = len;
return; // exit function and skip the rest
}
i++; // increment loop index
}
// found empty element
if (strcmp(table[h]->word, "") == 0) {
struct Entry *en;
en->word = word;
en->len = len;
en->next = table[h];
table[h] = en;
}
}
int main() {
init(); // initialize hash table
// test call
insertElement("kkj\0", 2);
int i;
for ( i=0; i < 10; i++)
{
printf("%d: ", i);
struct Entry *enTemp = table[i];
while (enTemp->next != NULL)
{
printf("Word: %s, Len:%d)", enTemp->word, enTemp->len);
enTemp = enTemp->next;
}
printf("\n");
}
return 0;
}
It's not necessary to cast the return value from malloc, and doing so can mask other errors.
The following lines malloc memory which is never freed, so there's a memory leak in your hash function.
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
sizeof(char) is guaranteed to be 1, by definition, so it's not necessary to multiply by sizeof(char).
Your code also assumes an ASCII layout of the characters (65 for 'A', 97 for 'a'), which the C standard does not guarantee.
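One possible way to address both points (the leak and the ASCII assumption) is to look each letter up in a constant alphabet string instead of filling malloc'd arrays. This is a sketch, not the only option; it is meant to compute the same sum of letter positions as the original hash.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Sums the alphabet positions (a = 0 ... z = 25) of the letters in word.
   Upper and lower case count the same; other characters are ignored. */
int hash(const char *word) {
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    int h = 0;
    for (size_t i = 0; word[i] != '\0'; i++) {
        int c = tolower((unsigned char)word[i]);
        const char *p = strchr(alphabet, c);   /* NULL if c is not a letter */
        if (p != NULL)
            h += (int)(p - alphabet);
    }
    return h;
}

/* Usage: k is position 10 and j is position 9, so "kkj" gives 10 + 10 + 9. */
void hash_demo(void) {
    assert(hash("kkj") == 29);
    assert(hash("KKJ") == hash("kkj"));
}
```

Nothing is allocated, so there is nothing to free, and the position comes from the string itself rather than from character codes.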
In the init() function, the following
// create memory spaces for each element
struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));
does not do what the comment says. It only allocates enough memory for one struct Entry. Perhaps you meant to put this inside the loop.
For a fixed table size you could also just have an array of struct Entry
directly rather than an array of pointers to such. I.e.
struct Entry table[TABLE_SIZE] = { 0 };
And then you wouldn't need to malloc memory for the entries themselves, just the contents.
In your initialization loop
for (i = 0; i < TABLE_SIZE; i++) {
en->word = "";
en->len = 0;
en->next = table[i];
table[i] = en;
}
there is only one en, so all of the table elements are set to the same value. The first time through the loop, en->next is set to table[0], which at this point is NULL due to your static initializer. table[0] is then set to en.
The second time through the loop, en->next is set to table[1], which is also null. And en hasn't changed, it is still pointing to the result from the earlier malloc. table[1] is then set to en, which is the same value you had before. So, when you're done, every element of table is set to the same value, and en->next is NULL.
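If the design with one empty placeholder Entry per slot is kept, moving the malloc inside the loop gives every slot its own Entry. The sketch below assumes that design; returning a status from init() is an addition for the example and not part of the question's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Entry {
    char *word;
    int len;
    struct Entry *next;
} Entry;

#define TABLE_SIZE 20

Entry *table[TABLE_SIZE] = { NULL };

/* Returns 1 on success, 0 if an allocation failed. */
int init(void) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        Entry *en = malloc(sizeof *en);   /* a fresh Entry for each slot */
        if (en == NULL)
            return 0;
        en->word = "";
        en->len = 0;
        en->next = table[i];              /* NULL on the first call */
        table[i] = en;
    }
    return 1;
}

/* Usage: after init(), neighbouring slots no longer share one Entry. */
void init_demo(void) {
    assert(init() == 1);
    assert(table[0] != table[1]);
}
```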
I haven't traced through the hash function, but I don't immediately see anything limiting the hash value to valid indexes of table. When I tested it, "kkj\0" had a hash value of 29, which is outside the valid indexes of table. (By the way, string literals in C are already null-terminated, so the \0 isn't needed.) So you are accessing memory outside the limits of the table array. At that point all bets are off and pretty much anything can happen. A seg fault in this case is actually a good result, since it's immediately obvious something's wrong. You need to take the hash value modulo the table size to fix the array bounds issue, i.e. h % TABLE_SIZE.
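Putting the modulo fix into insertElement could look like the sketch below. It makes a few assumptions that differ from the question's code: slots start out as NULL instead of holding placeholder entries, word is a const char *, the function returns a status, and hash is a compact stand-in that sums letter positions. It also allocates the new Entry with malloc, because the question's insertElement writes through an en pointer that was never initialized.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 20

typedef struct Entry {
    const char *word;
    int len;
    struct Entry *next;
} Entry;

Entry *table[TABLE_SIZE] = { NULL };

/* Stand-in hash (assumption): sum of letter positions, never negative. */
int hash(const char *word) {
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";
    int h = 0;
    for (size_t i = 0; word[i] != '\0'; i++) {
        const char *p = strchr(alphabet, tolower((unsigned char)word[i]));
        if (p != NULL)
            h += (int)(p - alphabet);
    }
    return h;
}

/* Returns 1 on success, 0 if malloc failed. */
int insertElement(const char *word, int len) {
    int h = hash(word) % TABLE_SIZE;       /* always a valid index of table */
    for (Entry *e = table[h]; e != NULL; e = e->next) {
        if (strcmp(e->word, word) == 0) {  /* already present: update it */
            e->len = len;
            return 1;
        }
    }
    Entry *en = malloc(sizeof *en);        /* en must point at real memory */
    if (en == NULL)
        return 0;
    en->word = word;   /* stores the pointer only; the caller keeps the string alive */
    en->len = len;
    en->next = table[h];
    table[h] = en;
    return 1;
}

/* Usage: "kkj" hashes to 29, and 29 % 20 selects slot 9. */
void insert_demo(void) {
    assert(insertElement("kkj", 3) == 1);
    assert(table[9] != NULL && strcmp(table[9]->word, "kkj") == 0);
}
```

Note that % only yields a valid index here because this hash is never negative; a hash that can go negative would need extra care.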