Address is not stack'd, malloc'd or (recently) free'd - c

I am new to C, so I am having trouble making a hash table and malloc-ing space for it.
I am writing an anagram solver. Right now I am still at the step where I create the hash table for this program. I am trying to test my insert function by calling it once with some arbitrary arguments.
However, I keep getting segmentation faults, and I used valgrind to track down where it crashes.
Can you point out what I am missing?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int hash(char *word)
{
int h = 0;
int i, j;
char *A;
char *a;
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
for (i = 0; i < 26; i++) {
A[i] = (char)(i + 65); // fill the array from A to Z
a[i] = (char)(i + 97); // fill the array from a to z
}
for (i = 0; i < strlen(word); i++) {
for (j = 0; j < 26; j++) {
// upper and lower case have the same hash value
if (word[i] == A[j] || word[i] == a[j]) {
h += j; // get the hash value of the word
break;
}
}
}
return h;
}
typedef struct Entry {
char *word;
int len;
struct Entry *next;
} Entry;
#define TABLE_SIZE 20 // test number
Entry *table[TABLE_SIZE] = { NULL };
void init() {
// create memory spaces for each element
struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));
int i;
// initialize
for (i = 0; i < TABLE_SIZE; i++) {
en->word = "";
en->len = 0;
en->next = table[i];
table[i] = en;
}
}
void insertElement(char *word, int len) {
int h = hash(word);
int i = 0;
// check if value has already existed
while(i < TABLE_SIZE && (strcmp(table[h]->word, "") != 0)) {
// !!!! NEXT LINE IS WHERE IT CRASHES !!!
if (strcmp(table[h]->word, word) == 0) { // found
table[h]->len = len;
return; // exit function and skip the rest
}
i++; // increment loop index
}
// found empty element
if (strcmp(table[h]->word, "") == 0) {
struct Entry *en;
en->word = word;
en->len = len;
en->next = table[h];
table[h] = en;
}
}
int main() {
init(); // initialize hash table
// test call
insertElement("kkj\0", 2);
int i;
for ( i=0; i < 10; i++)
{
printf("%d: ", i);
struct Entry *enTemp = table[i];
while (enTemp->next != NULL)
{
printf("Word: %s, Len:%d)", enTemp->word, enTemp->len);
enTemp = enTemp->next;
}
printf("\n");
}
return 0;
}

It's not necessary to cast the return value from malloc, and doing so can mask other errors.
The following lines malloc memory which is never freed, so there's a memory leak in your hash function.
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
sizeof(char) is guaranteed to be 1, by definition, so it's not necessary to multiply by sizeof(char).
Your code also assumes an ASCII layout of the characters, which is not guaranteed.
In the init() function, the line
// create memory spaces for each element
struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));
does not do what the comment says. It only allocates enough memory for one struct Entry. Perhaps you meant to put this inside the loop.
For a fixed table size you could also just have an array of struct Entry directly rather than an array of pointers to such, i.e.
struct Entry table[TABLE_SIZE] = { 0 };
And then you wouldn't need to malloc memory for the entries themselves, just the contents.
In your initialization loop
for (i = 0; i < TABLE_SIZE; i++) {
en->word = "";
en->len = 0;
en->next = table[i];
table[i] = en;
}
every table element is set to the same pointer, because en is never reallocated. The first time through the loop, en->next is set to table[0], which at this point is NULL due to your static initializer. table[0] is then set to en.
The second time through the loop, en->next is set to table[1], which is also NULL. And en hasn't changed; it is still pointing to the result from the earlier malloc. table[1] is then set to en, which is the same value you had before. So, when you're done, every element of table is set to the same value, and en->next is NULL.
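One way to avoid the shared-entry problem is to skip init() altogether: leave every bucket NULL and allocate a fresh Entry for each word as it is inserted. The following is only a minimal sketch under that assumption. insert_at is a hypothetical name, it expects an index that is already within the table, and it copies the word so the table owns its strings. It also allocates en before writing through it, which the insertElement() in the question does not do.

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 20

typedef struct Entry {
    char *word;
    int len;
    struct Entry *next;
} Entry;

/* Static storage duration: every bucket starts out as NULL. */
static Entry *table[TABLE_SIZE];

/* Hypothetical helper: insert or update `word` in bucket h.
 * Assumes h < TABLE_SIZE. Returns 0 on success, -1 if malloc fails. */
int insert_at(unsigned h, const char *word, int len)
{
    /* Walk the chain; an empty bucket simply skips this loop. */
    for (Entry *e = table[h]; e != NULL; e = e->next) {
        if (strcmp(e->word, word) == 0) {
            e->len = len;           /* already present: update it */
            return 0;
        }
    }

    Entry *en = malloc(sizeof *en); /* one allocation per new entry */
    if (en == NULL)
        return -1;

    en->word = malloc(strlen(word) + 1);
    if (en->word == NULL) {
        free(en);
        return -1;
    }
    strcpy(en->word, word);
    en->len = len;
    en->next = table[h];            /* push onto the front of the chain */
    table[h] = en;
    return 0;
}
```

With NULL meaning "empty bucket", the printing loop in main() would test enTemp != NULL rather than enTemp->next != NULL.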
I haven't traced through the hash function, but I don't immediately see anything limiting the hash value to valid indexes of table. When I tested it, "kkj\0" had a hash value of 29, which is outside the valid indexes of table. (By the way, string literals in C are already null-terminated, so the \0 isn't needed.) So you are accessing memory outside the limits of the table array. At that point all bets are off and pretty much anything can happen. A seg fault in this case is actually a good result, since it's immediately obvious something's wrong. You need to take the hash value modulo the table size to fix the array bounds issue, i.e. h % TABLE_SIZE.
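Putting the points about the hash function together, a possible rewrite might look like the sketch below. It is not the only way to do it: it assumes the same "sum of letter positions" scheme as the question, drops both malloc calls, looks letters up in a string instead of relying on character codes, and reduces the result modulo TABLE_SIZE.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 20

/* Sum of letter positions (a/A = 0 ... z/Z = 25), reduced to a valid index.
 * Non-letters contribute nothing, as in the original loop. */
unsigned hash(const char *word)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    unsigned h = 0;

    for (size_t i = 0; word[i] != '\0'; i++) {
        /* Look the letter up rather than doing arithmetic on character codes. */
        const char *p = strchr(letters, tolower((unsigned char)word[i]));
        if (p != NULL)
            h += (unsigned)(p - letters);
    }
    return h % TABLE_SIZE;  /* always in the range 0 .. TABLE_SIZE - 1 */
}
```

With TABLE_SIZE set to 20, "kkj" sums to 29 and therefore lands in bucket 9.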

Related

How can I correctly allocate memory for this MergeSort implementation in C (with the DS I am using)?

My goal here is to perform MergeSort on a dynamic array-like data structure I called a dictionary used to store strings and their relative weights. Sorry if the implementation is dumb, I'm a student and still learning.
Anyway, based on the segfaults I'm getting, I'm incorrectly allocating memory for my structs of type item to be copied over into the temporary lists I'm making. Not sure how to fix this. Code for mergesort and data structure setup is below, any help is appreciated.
/////// DICTIONARY METHODS ////////
typedef struct {
char *item;
int weight;
} item;
typedef struct {
item **wordlist;
//track size of dictionary
int size;
} dict;
//dict constructor
dict* Dict(int count){
//allocate space for dictionary
dict* D = malloc(sizeof(dict));
//allocate space for words
D->wordlist = malloc(sizeof(item*) * count);
//initial size
D->size = 0;
return D;
}
//word constructor
item* Item(char str[]){
//allocate memory for struct
item* W = malloc(sizeof(item));
//allocate memory for string
W->item = malloc(sizeof(char) * strlen(str));
W->weight = 0;
return W;
}
void merge(dict* D, int start, int middle, int stop){
//create ints to track lengths of left and right of array
int leftlen = middle - start + 1;
int rightlen = stop - middle;
//create new temporary dicts to store the two sides of the array
dict* L = Dict(leftlen);
dict* R = Dict(rightlen);
int i, j, k;
//copy elements start through middle into left dict- this gives a segfault
for (int i = 0; i < leftlen; i++){
L->wordlist[i] = malloc(sizeof(item*));
L->wordlist[i] = D->wordlist[start + i];
}
//copy elements middle through end into right dict- this gives a segfault
for (int j = 0; j < rightlen; j++){
R->wordlist[j] = malloc(sizeof(item*));
R->wordlist[j]= D->wordlist[middle + 1 + k];
}
i = 0;
j = 0;
k = leftlen;
while ((i < leftlen) && (j < rightlen)){
if (strcmp(L->wordlist[i]->item, R->wordlist[j]->item) <= 0) {
D->wordlist[k] = L->wordlist[i];
i++;
k++;
}
else{
D->wordlist[k] = R->wordlist[j];
j++;
k++;
}
}
while (i < leftlen){
D->wordlist[k] = L->wordlist[i];
i++;
k++;
}
while (j < rightlen){
D->wordlist[k] = L->wordlist[j];
j++;
k++;
}
}
void mergeSort(dict* D, int start, int stop){
if (start < stop) {
int middle = start + (stop - start) / 2;
mergeSort(D, start, middle);
mergeSort(D, middle + 1, stop);
merge(D, start, middle, stop);
}
I put print statements everywhere and narrowed it down to the mallocs in the section where I copy the dictionary to be sorted into 2 separate dictionaries. Also tried writing that malloc as malloc(sizeof(D->wordlist[start + i])). Is there something else I need to do to be able to copy the item struct into the wordlist of the new struct?
Again, I'm new to this, so cut me some slack :)
There are numerous errors in the code:
In merge() when copying elements to the R list, the wrong (and uninitialized) index variable k is being used instead of j. R->wordlist[j]= D->wordlist[middle + 1 + k]; should be R->wordlist[j]= D->wordlist[middle + 1 + j];.
In merge() before merging the L and R lists back to D, the index variable k for the D list is being initialized to the wrong value. k = leftlen; should be k = start;.
In merge() in the loop that should copy the remaining elements of the "right" list to D, the elements are being copied from the "left" list instead of the "right" list. D->wordlist[k] = L->wordlist[j]; should be D->wordlist[k] = R->wordlist[j];.
In Item(), the malloc() call is not reserving space for the null terminator at the end of the string. W->item = malloc(sizeof(char) * strlen(str)); should be W->item = malloc(sizeof(char) * (strlen(str) + 1)); (and since sizeof(char) is 1 by definition it can be simplified to W->item = malloc(strlen(str) + 1);).
Item() is not copying the string to the allocated memory. Add strcpy(W->item, str);.
There are memory leaks in merge():
L->wordlist[i] = malloc(sizeof(item*)); is not required and can be removed since L->wordlist[i] is changed on the very next line: L->wordlist[i] = D->wordlist[start + i];.
Similarly, R->wordlist[j] = malloc(sizeof(item*)); is not required and can be removed since R->wordlist[j] is changed on the very next line.
L and R memory is created but never destroyed. Add these lines to the end of merge() to free them:
free(L->wordlist);
free(L);
free(R->wordlist);
free(R);
None of the malloc() calls are checked for success.
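Applied together, the fixes listed above might look roughly like this. It is a sketch rather than a drop-in replacement: it keeps the question's item and dict types and its inclusive start/stop convention, but as a simplification of my own it uses plain temporary pointer arrays instead of temporary dict objects, and it returns an error code so allocation failures can be reported.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *item;
    int weight;
} item;

typedef struct {
    item **wordlist;
    int size;
} dict;

/* Word constructor: reserves room for the terminator and copies the string. */
item *Item(const char *str)
{
    item *W = malloc(sizeof *W);
    if (W == NULL)
        return NULL;
    W->item = malloc(strlen(str) + 1);
    if (W->item == NULL) {
        free(W);
        return NULL;
    }
    strcpy(W->item, str);
    W->weight = 0;
    return W;
}

/* Merge the sorted ranges [start..middle] and [middle+1..stop] (inclusive).
 * Assumes start <= middle < stop. Returns 0 on success, -1 on failure. */
int merge(dict *D, int start, int middle, int stop)
{
    int leftlen = middle - start + 1;
    int rightlen = stop - middle;
    item **L = malloc(sizeof *L * leftlen);
    item **R = malloc(sizeof *R * rightlen);
    if (L == NULL || R == NULL) {
        free(L);
        free(R);
        return -1;
    }

    /* Copy the pointers only; the items themselves are not duplicated. */
    for (int i = 0; i < leftlen; i++)
        L[i] = D->wordlist[start + i];
    for (int j = 0; j < rightlen; j++)
        R[j] = D->wordlist[middle + 1 + j];

    int i = 0, j = 0, k = start;    /* k starts at start, not leftlen */
    while (i < leftlen && j < rightlen) {
        if (strcmp(L[i]->item, R[j]->item) <= 0)
            D->wordlist[k++] = L[i++];
        else
            D->wordlist[k++] = R[j++];
    }
    while (i < leftlen)
        D->wordlist[k++] = L[i++];
    while (j < rightlen)
        D->wordlist[k++] = R[j++];  /* leftovers come from R here */

    free(L);
    free(R);
    return 0;
}
```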
Allocate it all at once, before the merge sort even starts.
#include <stdlib.h>
#include <string.h>
// Weighted Word --------------------------------------------------------------
//
typedef struct {
char *word;
int weight;
} weighted_word;
// Create a weighted word
//
weighted_word* CreateWeightedWord(const char *str, int weight){
weighted_word* W = malloc(sizeof(weighted_word));
if (W){
W->word = malloc(strlen(str) + 1); // string length + nul terminator
if (W->word)
strcpy( W->word, str);
W->weight = weight;
}
return W;
}
// Free a weighted word
//
weighted_word *FreeWeightedWord(weighted_word *W){
if (W){
if (W->word)
free(W->word);
free(W);
}
return NULL;
}
// Dictionary (of Weighted Words) ---------------------------------------------
//
typedef struct {
weighted_word **wordlist; // this is a pointer to an array of (weighted_word *)s
int size; // current number of elements in use
int capacity; // maximum number of elements available to use
} dict;
// Create a dictionary with a fixed capacity
//
dict* CreateDict(int capacity){
dict* D = malloc(sizeof(dict));
if (D){
D->wordlist = malloc(sizeof(weighted_word*) * capacity);
D->size = 0;
D->capacity = capacity;
}
return D;
}
// Free a dictionary (and all weighted words)
//
dict *FreeDict(dict *D){
if (D){
for (int n = 0; n < D->size; n++)
FreeWeightedWord(D->wordlist[n]);
free(D->wordlist);
free(D);
}
return NULL;
}
// Add a new weighted word to the end of our dictionary
//
void DictAddWord(dict *D, const char *str, int weight){
if (!D) return;
if (D->size == D->capacity) return;
D->wordlist[D->size] = CreateWeightedWord(str, weight);
if (D->wordlist[D->size])
D->size += 1;
}
// Merge Sort the Dictionary --------------------------------------------------
// Merge two partitions of sorted words
// words • the partitioned weighted word list
// start • beginning of left partition
// middle • end of left partition, beginning of right partition
// stop • end of right partition
// buffer • temporary work buffer, at least as big as (stop-start)
//
void MergeWeightedWords(weighted_word **words, int start, int middle, int stop, weighted_word **buffer){
int Lstart = start; int Rstart = middle; // Left partition
int Lstop = middle; int Rstop = stop; // Right partition
int Bindex = 0; // temporary work buffer output index
// while (left partition has elements) AND (right partition has elements)
while ((Lstart < Lstop) && (Rstart < Rstop)){
if (strcmp( words[Rstart]->word, words[Lstart]->word ) < 0)
buffer[Bindex++] = words[Rstart++];
else
buffer[Bindex++] = words[Lstart++];
}
// if (left partition has any remaining elements)
while (Lstart < Lstop)
buffer[Bindex++] = words[Lstart++];
// We don't actually need this. Think about it. Why not?
// // if (right partition has any remaining elements)
// while (Rstart < Rstop)
// buffer[Bindex++] = words[Rstart++];
// Copy merged data from temporary buffer back into source word list
for (int n = 0; n < Bindex; n++)
words[start++] = buffer[n];
}
// Merge Sort an array of weighted words
// words • the array of (weighted_word*)s to sort
// start • index of first element to sort
// stop • index ONE PAST the last element to sort
// buffer • the temporary merge buffer, at least as big as (stop-start)
//
void MergeSortWeightedWords(weighted_word **words, int start, int stop, weighted_word **buffer){
if (start < stop-1){ // -1 because a singleton array is by definition sorted
int middle = start + (stop - start) / 2;
MergeSortWeightedWords(words, start, middle, buffer);
MergeSortWeightedWords(words, middle, stop, buffer);
MergeWeightedWords(words, start, middle, stop, buffer);
}
}
// Merge Sort a Dictionary
//
void MergeSortDict(dict *D){
if (D){
// We only need to allocate a single temporary work buffer, just once, right here.
dict * Temp = CreateDict(D->size);
if (Temp){
MergeSortWeightedWords(D->wordlist, 0, D->size, Temp->wordlist);
}
FreeDict(Temp);
}
}
// Main program ---------------------------------------------------------------
#include <stdio.h>
int main(int argc, char **argv){
// Command-line arguments --> dictionary
dict *a_dict = CreateDict(argc-1);
for (int n = 1; n < argc; n++)
DictAddWord(a_dict, argv[n], 0);
// Sort the dictionary
MergeSortDict(a_dict);
// Print the weighted words
for (int n = 0; n < a_dict->size; n++)
printf( "%d %s\n", a_dict->wordlist[n]->weight, a_dict->wordlist[n]->word );
// Clean up
FreeDict(a_dict);
}
Notes for you:
Be consistent. You were inconsistent with capitalization and * placement and, oddly, vertical spacing. (You are waaay better than most beginners, though.) I personally hate the Egyptian brace style, but to each his own.
I personally think there are far too many levels of malloc()s in this code too, but I will leave it at this one comment. It works as is.
Strings must be nul-terminated: each string takes strlen() characters plus one for the '\0' character. There is a convenient library function that can copy a string for you too, called strdup(). It is specified by POSIX and was added to standard C in C23, so it is available on practically every system.
Always check that malloc() and friends succeed.
Don’t forget to free everything you allocate. Functions help.
“Item” was a terribly non-descript name, and it overlapped with the meaning of two different things in your code. I renamed them to separate things.
Your dictionary object should be expected to keep track of how many elements it can support. The above code simply refuses to add words after the capacity is filled, but you could easily make it realloc() a larger capacity if the need arises. The point is to prevent invalid array accesses by adding too many elements to a fixed-size array.
Printing the array could probably go in a function.
Notice how I set start as inclusive and stop as exclusive. This is a very C (and C++) way of looking at things, and it is a good one. It will help you with all kinds of algorithms.
Notice also how I split the Merge Sort up into two functions: one that takes a dictionary as argument, and a lower-level one that takes an array of the weighted words as argument that does all the work.
The higher-level function, which merge sorts a dictionary, allocates the temporary buffer the merge algorithm needs, just once.
The lower-level function, which merge sorts an array of (weighted_word*)s, expects that temporary buffer to exist and doesn’t care (or know anything) about the dictionary object.
The merge algorithm likewise doesn't know much. It is simply given all the information it needs.
Right now the merge condition simply compares the weighted-word’s string value. But it doesn’t have to be so simple. For example, you could sort equal elements by weight. Create a function:
int CompareWeightedWords(const weighted_word *a, const weighted_word *b){
int rel = strcmp( a->word, b->word );
if (rel < 0) return -1;
if (rel > 0) return 1;
return a->weight < b->weight ? -1 : a->weight > b->weight;
}
And put it to use in the merge function:
if (CompareWeightedWords( words[Rstart], words[Lstart] ) < 0)
buffer[Bindex++] = words[Rstart++];
else
buffer[Bindex++] = words[Lstart++];
I don’t think I forgot anything.
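The note about growing the dictionary with realloc() could be sketched as follows. DictReserveOne is a hypothetical helper that is not part of the code above, and the doubling strategy and the starting capacity of 4 are arbitrary choices of mine.

```c
#include <stdlib.h>

typedef struct {
    char *word;
    int weight;
} weighted_word;

typedef struct {
    weighted_word **wordlist;
    int size;       /* current number of elements in use */
    int capacity;   /* maximum number of elements available to use */
} dict;

/* Hypothetical helper: make sure there is room for one more word.
 * Doubles the capacity when the array is full.
 * Returns 1 if a free slot is available, 0 if realloc failed. */
int DictReserveOne(dict *D)
{
    if (D->size < D->capacity)
        return 1;

    int new_capacity = D->capacity > 0 ? D->capacity * 2 : 4;
    /* Keep the result in a temporary: if realloc fails, the old block
     * is still valid and must not be lost. */
    weighted_word **grown = realloc(D->wordlist, sizeof *grown * new_capacity);
    if (grown == NULL)
        return 0;

    D->wordlist = grown;
    D->capacity = new_capacity;
    return 1;
}
```

DictAddWord() could then call this helper in place of the D->size == D->capacity early return, and give up only when it returns 0.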

How do I free struct pointers with nested double pointers?

I pasted code at the bottom that allocates lots of pointers but doesn't free any. I have a struct named Node that has fields of type struct Node**. In my main function I have the variable: Node** nodes = malloc(size * sizeof(Node*));. I would like to know how to properly deallocate nodes.
typedef struct Node {
size_t id; // identifier of the node
int data; // actual data
size_t num_parents; // actual number of parent nodes
size_t size_parents; // current maximum capacity of array of parent nodes
struct Node** parents; // all nodes that connect from "upstream"
size_t num_children; // actual number of child nodes
size_t size_children; // current maximum capacity of array of children nodes
struct Node** children; // all nodes that connect "downstream"
} Node;
I've pasted the whole code down at the bottom because it is already almost minimal (only things we don't need here are the printing function and find_smallest_value function). VS2019 also gives me two warnings for two lines within the main loop in the main function where I'm allocating each node:
Node** nodes = malloc((num_nodes + 1) * sizeof(Node*));
for (size_t i = 1; i <= num_nodes; i++) {
nodes[i] = malloc(sizeof(Node)); // WARNING Buffer overrun while writing to 'nodes': the writable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes might be written.
nodes[i]->id = i; // WARNING Reading invalid data from 'nodes': the readable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes may be read.
I don't understand these warnings at all. Finally, you can obtain large input for this program from this website. Just save it to a text file and modify the hardcoded file name in the main function. The program runs fine if I comment out the last lines where I try to deallocate my nodes. My attempt at deallocating crashes the program. I'd greatly appreciate if anyone could explain the correct way to do it.
Explaining the purpose of the code:
The code at the bottom has the following goal. I'm trying to build a directed graph where every vertex has a label and a value. An example of such a graph. The graphs I'm interested in all represent hierarchies. I need to perform two operations on these graphs: I. given a vertex, find the one with the smallest value that is above it in the hierarchy and print its value; II. given a pair of vertices, swap their places. For example, given vertices 4 and 2 in that figure, the result of operation II would be the same graph but the vertices labelled 2 and 4 would have their labels and data swapped. Given vertex 6, the result of operation I would be "18". I implemented both operations successfully, I believe.
My main function reads from a txt file in order to build the data structure, which I chose to be a multiply linked list. Any input file should be of the following format (this file generates the graph shown in the figure and performs some operations on it):
7 8 9
21 33 33 18 42 22 26
1 2
1 3
2 5
3 5
3 6
4 6
4 7
6 7
P 7
T 4 2
P 7
P 5
T 1 4
P 7
T 4 7
P 2
P 6
First line has three numbers: number of vertices (nodes), number of edges (k, connections) and number of instructions (l, either operation I or II).
Second line is the data in each node. Labels correspond to the index of the node.
The next k lines consist of two node labels: left is a parent node, right is a child node.
The next l lines consist of instructions. P stands for operation I and it's followed by the label of the node. T stands for operation II and it's followed by the two labels of the nodes to be swapped.
The entire pattern can repeat.
The code:
#include<stdlib.h>
#include<stdio.h>
typedef unsigned int uint;
typedef struct Node {
size_t id; // identifier of the node
int data; // actual data
size_t num_parents; // actual number of parent nodes
size_t size_parents; // current maximum capacity of array of parent nodes
struct Node** parents; // all nodes that connect from "upstream"
size_t num_children; // actual number of child nodes
size_t size_children; // current maximum capacity of array of children nodes
struct Node** children; // all nodes that connect "downstream"
} Node;
Node** reallocate_node_array(Node** array, size_t* size) {
Node** new_array = realloc(array, sizeof(Node*) * (*size) * 2);
if (new_array == NULL) {
perror("realloc");
exit(1);
}
*size *= 2;
return new_array;
}
// The intention is to pass `num_children` or `num_parents` as `size` in order to decrease them
void remove_node(Node** array, size_t* size, size_t index) {
for (size_t i = index; i < *size - 1; i++) {
array[i] = array[i + 1];
}
(*size)--; // the decrement to either `num_children` or `num_parents`
}
void remove_parent(Node* node, size_t id) {
for (size_t i = 0; i < node->num_parents; i++) {
if (node->parents[i]->id == id) {
remove_node(node->parents, &node->num_parents, i);
}
}
}
void remove_child(Node* node, size_t id) {
for (size_t i = 0; i < node->num_children; i++) {
if (node->children[i]->id == id) {
remove_node(node->children, &node->num_children, i);
}
}
}
void add_parent(Node* node, Node* parent) {
if (node->num_parents >= node->size_parents) {
node->parents = reallocate_node_array(node->parents, &node->size_parents);
}
node->parents[node->num_parents++] = parent;
}
void add_child(Node* node, Node* child) {
if (node->num_children >= node->size_children) {
node->children = reallocate_node_array(node->children, &node->size_children);
}
node->children[node->num_children++] = child;
}
uint number_of_digits(int n) {
uint d = 0;
do { d++; n /= 10; } while (n != 0);
return d;
}
// return format: "{ parent1.id parent2.id ...} { id data } { child1.id child2.id ...}"
void print_node(Node node) {
printf("{ ");
for (size_t i = 0; i < node.num_parents; i++) {
printf("%zu ", node.parents[i]->id);
}
printf("} [ %zu %d ] { ", node.id, node.data);
for (size_t i = 0; i < node.num_children; i++) {
printf("%zu ", node.children[i]->id);
}
printf("}\n");
}
void switch_nodes(Node* n1, Node* n2, Node** array) {
uint temp_id = n1->id;
uint temp_data = n1->data;
n1->id = n2->id;
n1->data = n2->data;
n2->id = temp_id;
n2->data = temp_data;
Node* temp = array[n1->id];
array[n1->id] = array[n2->id];
array[n2->id] = temp;
}
int find_smallest_valued_parent(Node* node, uint depth) {
// has no parents
if (node->num_parents == 0 || node->parents == NULL) {
if (depth == 0) return -1; // there was no parent on first call (nothing to report)
else return node->data;
}
else {
depth++;
int minimum_value = node->parents[0]->data; // we're guaranteed 1 parent
for (size_t i = 0; i < node->num_parents; i++) {
int next_value = find_smallest_valued_parent(node->parents[i], depth);
if (node->parents[i]->data < next_value) next_value = node->parents[i]->data;
if (next_value < minimum_value) minimum_value = next_value;
}
return minimum_value;
}
}
void free_node_array(Node** array, size_t start, size_t end) {
for (size_t i = start; i < end; i++) {
free(array[i]);
}
free(array);
}
int main() {
char* file_name = "input_feodorv.txt";
FILE* data_file = fopen(file_name, "r");
if (data_file == NULL) {
printf("Error: invalid file %s", file_name);
return 1;
}
for (;;) {
size_t num_nodes, num_relationships, num_instructions;
if (fscanf(data_file, "%zu %zu %zu\n", &num_nodes, &num_relationships, &num_instructions) == EOF)
break;
Node** nodes = malloc((num_nodes + 1) * sizeof(Node*));
for (size_t i = 1; i <= num_nodes; i++) {
nodes[i] = malloc(sizeof(Node)); // WARNING Buffer overrun while writing to 'nodes': the writable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes might be written.
nodes[i]->id = i; // WARNING Reading invalid data from 'nodes': the readable size is '((num_nodes+1))*sizeof(Node *)' bytes, but '16' bytes may be read.
fscanf(data_file, "%u ", &nodes[i]->data);
nodes[i]->num_children = 0;
nodes[i]->size_children = 2;
nodes[i]->children = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->children[j] = malloc(sizeof(Node));
nodes[i]->num_parents = 0;
nodes[i]->size_parents = 2;
nodes[i]->parents = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->parents[j] = malloc(sizeof(Node));
}
for (size_t i = 0; i < num_relationships; i++) {
size_t parent_id, child_id;
fscanf(data_file, "%zu %zu\n", &parent_id, &child_id);
add_child(nodes[parent_id], nodes[child_id]);
add_parent(nodes[child_id], nodes[parent_id]);
}
for (size_t i = 0; i < num_instructions; i++) {
char instruction;
fscanf(data_file, "%c ", &instruction);
if (instruction == 'P') {
size_t id;
fscanf(data_file, "%zu\n", &id);
int minimum_value = find_smallest_valued_parent(nodes[id], 0);
if (minimum_value == -1) printf("*\n");
else printf("%u\n", minimum_value);
}
else {
size_t n1_id, n2_id;
fscanf(data_file, "%zu %zu\n", &n1_id, &n2_id);
switch_nodes(nodes[n1_id], nodes[n2_id], nodes);
}
}
/**/
for (size_t i = 1; i <= num_nodes; i++) {
free_node_array(nodes[i]->parents, 0, nodes[i]->size_parents);
free_node_array(nodes[i]->children, 0, nodes[i]->size_children);
}
free_node_array(nodes, 0, num_nodes);
/**/
}
}
There is a memory leak in your code. In the main() function, you are doing:
nodes[i]->children = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->children[j] = malloc(sizeof(Node));
and
nodes[i]->parents = (Node**)malloc(2 * sizeof(Node*));
for (size_t j = 0; j < 2; j++) nodes[i]->parents[j] = malloc(sizeof(Node));
That means memory is allocated for each of the nodes[i]->children[j] and nodes[i]->parents[j] pointers.
In the add_child() and add_parent() functions, you then make them point to some other node, losing the reference to that allocated memory:
void add_parent(Node* node, Node* parent) {
.....
node->parents[node->num_parents++] = parent;
}
void add_child(Node* node, Node* child) {
.....
node->children[node->num_children++] = child;
}
You actually don't need to allocate memory for the nodes[i]->children[j] and nodes[i]->parents[j] pointers in main(), because these pointers are supposed to point to existing nodes of the graph, and you are already allocating memory for those nodes here in main():
nodes[i] = malloc(sizeof(Node));
nodes[i] is an element of the array of all the nodes of the given graph, and the children and parents pointers should point only to these nodes.
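As an illustration, node setup without the extra per-slot allocations could look like the sketch below. create_node is a hypothetical helper that does not appear in the question; it allocates the node and the two pointer arrays, and nothing else.

```c
#include <stdlib.h>

typedef struct Node {
    size_t id;
    int data;
    size_t num_parents;
    size_t size_parents;
    struct Node** parents;
    size_t num_children;
    size_t size_children;
    struct Node** children;
} Node;

/* Hypothetical helper: allocate one node with room for 2 parent and
 * 2 child POINTERS. The slots are left unset; add_parent()/add_child()
 * later fill them with pointers to nodes that already exist.
 * Returns NULL if any allocation fails. */
Node* create_node(size_t id, int data) {
    Node* n = malloc(sizeof *n);
    if (n == NULL) return NULL;

    n->id = id;
    n->data = data;
    n->num_parents = 0;
    n->size_parents = 2;
    n->num_children = 0;
    n->size_children = 2;
    n->parents = malloc(n->size_parents * sizeof *n->parents);
    n->children = malloc(n->size_children * sizeof *n->children);

    if (n->parents == NULL || n->children == NULL) {
        free(n->parents);   /* free(NULL) is a harmless no-op */
        free(n->children);
        free(n);
        return NULL;
    }
    return n;
}
```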
Now, on to freeing these pointers:
The way you are freeing the nodes of the graph is not correct. Look at the free_node_array() function:
void free_node_array(Node** array, size_t start, size_t end) {
for (size_t i = start; i < end; i++) {
free(array[i]);
}
free(array);
}
and you are calling it in this way:
for (size_t i = 1; i <= num_nodes; i++) {
free_node_array(nodes[i]->parents, 0, nodes[i]->size_parents);
free_node_array(nodes[i]->children, 0, nodes[i]->size_children);
}
That means you are freeing the nodes pointed to by the arrays nodes[i]->parents and nodes[i]->children. The members of nodes[i]->parents and nodes[i]->children are pointers to elements of the nodes array. A node can be a child of one or more parents, and a parent node can have more than one child. Now consider a child node that is pointed to by two parent nodes, say n1 and n2. When you call free_node_array() for the first parent (n1), it ends up freeing that child node, and when free_node_array() is called for the second parent (n2), it tries to free the node that was already freed while processing n1.
So this way of freeing the memory is not correct. The correct way is simply to free the elements of the nodes array, because that array contains all the nodes of the given graph, and the parents and children pointers are supposed to point only to these nodes. There is no need to traverse the hierarchy of parent and child nodes. To free the graph properly, you should:
Traverse the nodes array and, for each element of the array:
Free the array of parent pointers (free(nodes[i]->parents)).
Free the array of children pointers (free(nodes[i]->children)).
Free that element of the nodes array (free(nodes[i])).
Once this is done, free the nodes array itself (free(nodes)).
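The steps above can be sketched as a single function. free_graph is a hypothetical name, and the sketch assumes the question's layout: nodes has num_nodes + 1 slots and index 0 is never used. It also assumes the parents and children slots were never given their own malloc'd nodes. The return value exists only to make the function easy to check.

```c
#include <stdlib.h>

typedef struct Node {
    size_t id;
    int data;
    size_t num_parents;
    size_t size_parents;
    struct Node** parents;
    size_t num_children;
    size_t size_children;
    struct Node** children;
} Node;

/* Hypothetical helper: free every node exactly once, then the array.
 * Assumes valid nodes live at indexes 1..num_nodes (slot 0 is unused).
 * Returns the number of nodes that were freed. */
size_t free_graph(Node** nodes, size_t num_nodes) {
    size_t freed = 0;
    if (nodes == NULL) return 0;

    for (size_t i = 1; i <= num_nodes; i++) {
        if (nodes[i] == NULL) continue;
        free(nodes[i]->parents);    /* the pointer array, not the nodes in it */
        free(nodes[i]->children);   /* likewise */
        free(nodes[i]);             /* each node is owned only by nodes[] */
        freed++;
    }
    free(nodes);
    return freed;
}
```

Because every node appears exactly once in nodes, even after switch_nodes() swaps slots, no node is freed twice regardless of how many parents or children refer to it.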

Try to split string but got messy substrings

I am trying to split one string into 3-gram strings, but the resulting substrings always turn out messy. The length and char ** input... are needed, since I will use them as args later when calling the function from Python.
This is the function I wrote.
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 3);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all the three bits are the same
if(same == 3){
flag = k;
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
valueArr[i-num] = 1;
}
}else{
valueArr[flag]++;
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
To call the function:
int main(int argc, const char * argv[]) {
int length = 30;
char* input = (char *)malloc(sizeof(char) * length);
input = "googleapis.com.wncln.wncln.org";
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
Below is the result. The strings from index 17 onward are messy.
----i------------0------
----i-num--------0------
-----string----goo
----i------------1------
----i-num--------1------
-----string----oog
----i------------2------
----i-num--------2------
-----string----ogl
----i------------3------
----i-num--------3------
-----string----gle
----i------------4------
----i-num--------4------
-----string----lea
----i------------5------
----i-num--------5------
-----string----eap
----i------------6------
----i-num--------6------
-----string----api
----i------------7------
----i-num--------7------
-----string----pis
----i------------8------
----i-num--------8------
-----string----is.
----i------------9------
----i-num--------9------
-----string----s.c
----i------------10------
----i-num--------10------
-----string----.co
----i------------11------
----i-num--------11------
-----string----com
----i------------12------
----i-num--------12------
-----string----om.
----i------------13------
----i-num--------13------
-----string----m.w
----i------------14------
----i-num--------14------
-----string----.wn
----i------------15------
----i-num--------15------
-----string----wnc
----i------------16------
----i-num--------16------
-----string----ncl
----i------------17------
----i-num--------17------
-----string----clnsole
----i------------18------
----i-num--------18------
-----string----ln.=C:
----i------------19------
----i-num--------19------
-----string----n.wgram 馻绚s
----i------------20------
----i-num--------20------
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.oiles(騛窑=
----i------------26------
----i-num--------21------
-----string----.orSModu鯽蓼t
----i------------27------
----i-num--------22------
-----string----org
under win10, codeblocks 17.12, gcc 8.1.0
You are making life complicated for yourself in several places:
Don't count backwards: Instead of making num the count of duplicates, make it the count of unique trigraphs.
Scope variable definitions in functions as closely as possible. You have several uninitialized variables. You have declared them at the start of the function, but you need them only in local blocks.
Initialize as soon as you allocate. In your code, you use a flag to determine whether to create a new string. The code that allocates the string and the code that initializes it are in different blocks. Those blocks have the same flag as their condition, but the flag is updated in between. This can let the two get out of sync, and even cause bugs when you try to initialize memory that wasn't allocated.
It's probably better to keep the strings and their counts together in a struct. If anything, this will help you with sorting later. This also offers some simplification: Instead of allocating chunks of 3 bytes, keep a char array of four bytes in the struct, so that all entries can be properly null-terminated. Those don't need to be allocated separately.
Here's an alternative implementation:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct tri {
char str[4]; // trigraph: 3 chars and NUL
int count; // count of occurrences
};
struct stat {
struct tri *tri; // list of trigraphs with counts
int size; // number of trigraphs
};
/*
* Find string 'key' in list of trigraphs. Return the index
* in the array, or -1 if it isn't found.
*/
int find_trigraph(const struct tri *tri, int n, const char *key)
{
for (int i = 0; i < n; i++) {
int j = 0;
while (j < 3 && tri[i].str[j] == key[j]) j++;
if (j == 3) return i;
}
return -1;
}
/*
* Create an array of trigraphs from the input string.
*/
struct stat getSearchArr(char* input, int length)
{
int num = 0;
struct tri *tri = malloc(sizeof(*tri) * (length - 2));
for(int i = 0; i < length - 2; i++) {
int index = find_trigraph(tri, num, input + i);
if (index < 0) {
snprintf(tri[num].str, 4, "%.3s", input + i); // see [1]
tri[num].count = 1;
num++;
} else {
tri[index].count++;
}
}
for(int i = 0; i < num; i++) {
printf("#%d %s: %d\n", i, tri[i].str, tri[i].count);
}
struct stat stat = { tri, num };
return stat;
}
/*
* Driver code
*/
int main(void)
{
char *input = "googleapis.com.wncln.wncln.org";
int length = strlen(input);
struct stat stat = getSearchArr(input, length);
// ... do stuff with stat ...
free(stat.tri);
return 0;
}
Footnote 1: I find that snprintf(dst, n, "%.*s", len, src + offset) is useful for copying substrings: the result will not overflow the buffer and it will be null-terminated. There really ought to be a standard function for this, but strcpy may overflow and strncpy may leave the buffer unterminated.
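The substring idiom from the footnote can be wrapped in a small helper. This is only a sketch: copy_substr is a hypothetical name, not a standard function, and it assumes that offset does not run past the end of the source string and that the two buffers do not overlap.

```c
#include <stdio.h>

/*
 * Copy at most `len` characters of `src`, starting at `offset`, into `dst`.
 * `dst_size` is the full size of `dst`, including room for the terminator.
 * Hypothetical helper; the caller must make sure that `offset` is not
 * larger than strlen(src) and that `dst` and `src` do not overlap.
 * Returns the length of the requested substring, which is larger than
 * dst_size - 1 if the result had to be truncated.
 */
static int copy_substr(char *dst, size_t dst_size,
                       const char *src, size_t offset, int len)
{
    /* The precision in "%.*s" limits how many characters are read from src. */
    return snprintf(dst, dst_size, "%.*s", len, src + offset);
}
```

Called as copy_substr(buf, sizeof buf, input, i, 3) with a char buf[4], this fills buf with the trigraph that starts at position i. Comparing the return value against the buffer size tells you whether the copy was truncated.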
This answer tries to fix the existing code instead of proposing alternative/better solutions.
After fixing the output
printf("-----string----%s\n", nameArr[i-num]);
in the question, there is still another important problem.
You want to store 3 characters in nameArr[i-num], and you allocate space for exactly 3 characters. Later you print it as a string in the code shown above. Printing with %s requires a trailing '\0' after the 3 characters, so you have to allocate memory for 4 characters and either append a '\0' or initialize the allocated memory with 0. Using calloc instead of malloc would automatically initialize the memory to 0.
Here is a modified version of the source code.
I also changed the initialization of the string value and its length in main() to avoid the memory leak.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct strArrIntArr {
int length;
char **charArr;
int *intArr;
};
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 4);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all the three bits are the same
if(same == 3){
flag = 1;
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
valueArr[i-num] = 1;
}
nameArr[i-num][3] = '\0';
}else{
valueArr[flag]++;
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
int main(int argc, const char * argv[]) {
int length;
char* input = strdup("googleapis.com.wncln.wncln.org");
length = strlen(input);
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
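The calloc variant mentioned above is not used in the modified code, so here is a minimal sketch of it. The helper name make_trigram is hypothetical, and the sketch assumes the source has at least three characters left at the given position.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated, NUL-terminated copy of the first three
 * characters of src, or NULL if the allocation fails.
 * Hypothetical helper; assumes src points at three or more characters.
 * The caller is responsible for calling free() on the result.
 */
char *make_trigram(const char *src)
{
    char *s = calloc(4, sizeof *s);   /* four zero bytes: 3 chars + '\0' */
    if (s != NULL)
        memcpy(s, src, 3);            /* byte 3 keeps the 0 from calloc */
    return s;
}
```

Because calloc zeroes the whole block, the terminator is already in place and no separate assignment to index 3 is needed.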
This other answer contains more improvements which I personally would prefer over the modified original solution.

How to find an element in an array of structs in C?

I have to write a function that finds a product with given code from the given array. If product is found, a pointer to the corresponding array element is returned.
My main problem is that the given code should first be truncated to seven characters and only after that compared with array elements.
Would greatly appreciate your help.
struct product *find_product(struct product_array *pa, const char *code)
{
char *temp;
int i = 0;
while (*code) {
temp[i] = (*code);
code++;
i++;
if (i == 7)
break;
}
temp[i] = '\0';
for (int j = 0; j < pa->count; j++)
if (pa->arr[j].code == temp[i])
return &(pa->arr[j]);
}
Why don't you just use strncmp in a loop?
struct product *find_product(struct product_array *pa, const char *code)
{
for (size_t i = 0; i < pa->count; ++i)
{
if (strncmp(pa->arr[i].code, code, 7) == 0)
return &pa->arr[i];
}
return 0;
}
temp is an uninitialized pointer, and dereferencing it leads to undefined behavior. Allocate memory for it first:
temp = malloc(size); // size = 8 in your case: 7 characters plus '\0'
One more mistake I see is
if (pa->arr[j].code == temp[i]) // i is already indexing `\0`
which should be
if (strcmp(pa->arr[j].code, temp) == 0) // strcmp returns 0 if both strings are equal
All of this code can be avoided if you use strncmp().
As pointed out by others, you are using temp uninitialized and you are always comparing characters with '\0'.
You don't need a temp variable:
int strncmp ( const char * str1, const char * str2, size_t num );
Compare characters of two strings
Compares up to num characters of the
C string str1 to those of the C string str2.
/* Don't use magic numbers like 7 in the body of function */
#define PRODUCT_CODE_LEN 7
struct product *find_product(struct product_array *pa, const char *code)
{
for (int i = 0; i < pa->count; i++) {
if (strncmp(pa->arr[i].code, code, PRODUCT_CODE_LEN) == 0)
return &(pa->arr[i]);
}
return NULL; /* Not found */
}
When you write char* temp; you are just declaring an uninitialized pointer.
In your case, since you say that the code is truncated to 7 characters, you could create a buffer on the stack with room for the code and its terminator:
char temp[8];
Writing
temp[i] = (*code);
code++;
i++;
Can be simplified to:
temp[i++] = *code++;
In your loop
for (int j = 0; j < pa->count; j++)
if (pa->arr[j].code == temp[i])
return &(pa->arr[j]);
You are comparing the address of code with the character value of temp[i]. At that point i indexes the terminating '\0', so the test never compares the two strings.
Instead what you want to do is compare what code points to and what temp contains:
for (int j = 0; j < pa->count; j++)
if (!strncmp(pa->arr[j].code, temp, 7))
return &(pa->arr[j]);
You should also return NULL if nothing was found; as written, the function falls off the end without returning a value.
It is probably also a good idea to make sure your temp[] always contains 7 characters.
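Putting these points together, the truncate-then-compare version could look like the sketch below. The question does not show struct product or struct product_array, so both definitions here are assumptions, in particular that code is a NUL-terminated char array of at most seven characters.

```c
#include <stddef.h>
#include <string.h>

#define PRODUCT_CODE_LEN 7

/* Assumed layouts: the question does not show these definitions. */
struct product {
    char code[PRODUCT_CODE_LEN + 1];   /* NUL-terminated product code */
};

struct product_array {
    struct product *arr;               /* array of products */
    int count;                         /* number of elements in arr */
};

struct product *find_product(struct product_array *pa, const char *code)
{
    char temp[PRODUCT_CODE_LEN + 1];

    /* Copy at most seven characters and terminate the copy explicitly,
       because strncpy adds no '\0' when the source is longer than that. */
    strncpy(temp, code, PRODUCT_CODE_LEN);
    temp[PRODUCT_CODE_LEN] = '\0';

    for (int j = 0; j < pa->count; j++) {
        if (strcmp(pa->arr[j].code, temp) == 0)
            return &pa->arr[j];
    }
    return NULL;   /* not found */
}
```

With stored codes of at most seven characters this gives the same result as the strncmp versions. It differs only if a stored code could be longer than seven characters, where strncmp with a limit of 7 would still report a match.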

realloc() seems to affect already allocated memory

I am experiencing an issue where the invocation of realloc seems to modify the contents of another string, keyfile.
It's supposed to run through a null-terminated char* (keyfile), which contains just above 500 characters. The problem, however, is that the reallocation I perform in the while-loop seems to modify the contents of the keyfile.
I tried removing the dynamic reallocation with realloc and instead initialize the pointers in the for-loop with a size of 200*sizeof(int) instead. The problem remains, the keyfile string is modified during the (re)allocation of memory, and I have no idea why. I have confirmed this by printing the keyfile-string before and after both the malloc and realloc statements.
Note: The keyfile only contains the characters a-z, no digits, spaces, linebreaks or uppercase. Only a text of 26, lowercase letters.
int **getCharMap(const char *keyfile) {
char *alphabet = "abcdefghijklmnopqrstuvwxyz";
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++) {
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
}
int letter;
int count = 0;
unsigned char c = keyfile[count];
while (c != '\0') {
int arr_count = charmap[c][0];
arr_count++;
charmap[c] = realloc(charmap[c], (arr_count+1)*sizeof(int));
charmap[c][0] = arr_count;
charmap[c][arr_count] = count;
c = keyfile[++count];
}
// Just inspecting the results for debugging
printf("\nCHARMAP\n");
for (int i = 0; i < 26; i++) {
letter = (int) alphabet[i];
printf("%c: ", (char) letter);
int count = charmap[letter][0];
printf("%d", charmap[letter][0]);
if (count > 0) {
for (int j = 1; j < count+1; j++) {
printf(",%d", charmap[letter][j]);
}
}
printf("\n");
}
exit(0);
return charmap;
}
charmap[(int) alphabet[i]] = malloc(sizeof(int));
charmap[(int) alphabet[i]][0] = 0; // place a counter at index 0
You are writing beyond the end of your charmap array. So, you are invoking undefined behaviour and it's not surprising that you are seeing weird effects.
You are using the character codes as an index into the array, but they do not start at 0! They start at the ASCII code for 'a', which is 97, so even charmap['a'] is far past the end of a 26-element array.
You should use alphabet[i] - 'a' as your array index.
The following piece of code is a source of trouble:
int **charmap = malloc(26*sizeof(int));
for (int i = 0; i < 26; i++)
charmap[...] = ...;
If sizeof(int) < sizeof(int*), then the loop will access memory outside the allocated block.
For example, on 64-bit platforms, the case is usually sizeof(int) == 4 < 8 == sizeof(int*).
Under that scenario, 26 * 4 = 104 bytes is room for only 13 pointers, so by writing into charmap[13...25] you will be accessing unallocated memory.
Change this:
int **charmap = malloc(26*sizeof(int));
To this:
int **charmap = malloc(26*sizeof(int*));
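Combining both fixes from the answers above, a corrected version of the function might look like the sketch below. It indexes rows by letter - 'a', allocates 26 pointers instead of 26 ints, and leaves out the debugging output and the exit(0). The freeCharMap helper and the choice to skip characters outside a to z are assumptions of this sketch, not part of the question.

```c
#include <stdlib.h>

/* Free a table returned by getCharMap (hypothetical helper). */
void freeCharMap(int **charmap)
{
    if (charmap == NULL)
        return;
    for (int i = 0; i < 26; i++)
        free(charmap[i]);              /* free(NULL) is a no-op */
    free(charmap);
}

/*
 * Build a table with one row per lowercase letter. Row r belongs to the
 * letter 'a' + r; element 0 of a row is the number of occurrences and the
 * elements after it are the positions of that letter in keyfile.
 * Returns NULL if an allocation fails. Assumes the letters are contiguous
 * in the character set, which holds for ASCII.
 */
int **getCharMap(const char *keyfile)
{
    int **charmap = malloc(26 * sizeof *charmap);   /* 26 pointers */
    if (charmap == NULL)
        return NULL;
    for (int i = 0; i < 26; i++)
        charmap[i] = NULL;             /* lets freeCharMap clean up safely */

    for (int i = 0; i < 26; i++) {
        charmap[i] = malloc(sizeof **charmap);
        if (charmap[i] == NULL) {
            freeCharMap(charmap);
            return NULL;
        }
        charmap[i][0] = 0;             /* counter at index 0 */
    }

    for (int pos = 0; keyfile[pos] != '\0'; pos++) {
        int row = keyfile[pos] - 'a';  /* 0..25 instead of 97..122 */
        if (row < 0 || row >= 26)
            continue;                  /* skip anything that is not a..z */

        int n = charmap[row][0] + 1;
        int *grown = realloc(charmap[row], (size_t)(n + 1) * sizeof *grown);
        if (grown == NULL) {
            freeCharMap(charmap);
            return NULL;
        }
        grown[0] = n;                  /* updated counter */
        grown[n] = pos;                /* position of this occurrence */
        charmap[row] = grown;
    }
    return charmap;
}
```

Assigning the result of realloc to a temporary first keeps the old row reachable if the call fails, so it can still be freed.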

Resources