An array of structures - c

How is an array of structures created in C without knowing the eventual amount of structures to be stored in the array?
I would like to loop in a for loop, create a tempStruct, set its variables, add this to an array and then loop again, creating a new tempStruct and adding it to the array.
I'm having some issues wrapping my head around how this is done in C while trying to relate it to Objective-C.

Dynamically allocated arrays (using malloc) can be reallocated (using realloc).
Therefore the solution will look something like this:
1. malloc initial array (arbitrary size)
2. while there is still space in the array, add structures
3. when the array is full, realloc to a bigger size
4. goto 2

You could create a doubly linked list in which each node points to its previous and next node:
struct list {
struct list *next;
struct list *prev;
special_data *data;
};
This is easy to do and flexible.

You can't create an array in C without knowing the number of its members up front.
Your options for adding are:
(Faster) Create a new array with +1 element, copy the entire array and add the new element to the end
(Better) Create your own implementation of a linked list, which will dynamically allocate memory for each new member.

You can use malloc to create your structure.
Edit: The following demonstrates one way to do what you're asking by creating a linked list:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int data;
void* next;
} tempStruct;
#define NUM_STRUCTS 4
int main(void) {
tempStruct* cur_ptr;
tempStruct* root_ptr;
int i;
root_ptr = malloc(sizeof(tempStruct));
root_ptr -> data = 0;
root_ptr -> next = NULL;
cur_ptr = root_ptr;
for (i = 1; i < NUM_STRUCTS; i ++ ) {
tempStruct* new_ptr = malloc(sizeof(tempStruct));
new_ptr -> data = i;
new_ptr -> next = NULL;
cur_ptr -> next = new_ptr;
cur_ptr = cur_ptr -> next;
}
cur_ptr = root_ptr;
while (cur_ptr != NULL) {
printf("cur_ptr -> data = %d\n", cur_ptr -> data);
cur_ptr = cur_ptr -> next;
}
return 0;
}
If you really want to create something that acts more like an array, you'll need to allocate all your memory at the same time, using something like:
the_data = malloc(NUM_STRUCTS * sizeof(tempStruct));
Then you'll have to access the members with the dot operator, e.g. the_data[i].data, instead of the arrow operator.
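To make the array-style variant concrete, here is a minimal sketch. It assumes a simplified tempStruct without the next member, and make_structs is a hypothetical helper name, not something from the answer above.

```c
#include <stdlib.h>

/* Simplified struct for illustration (assumption: no 'next' member needed). */
typedef struct {
    int data;
} tempStruct;

/* Hypothetical helper: allocate `count` structs in one contiguous block
   and fill them by index. Returns NULL if the allocation fails. */
tempStruct *make_structs(size_t count)
{
    tempStruct *the_data = malloc(count * sizeof *the_data);
    if (the_data == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        the_data[i].data = (int)i;   /* element access uses '.', not '->' */
    return the_data;
}
```

The caller owns the returned block and releases all elements with a single free().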

struct foo {int bar;};
size_t i = 0, n = 8;
struct foo *arr = malloc(n * sizeof *arr);
int bar;
while ((bar = get_next_bar()) != -1) {
if (i == n) { // no room for new element; expand array
n *= 2;
arr = realloc(arr, n * sizeof *arr); // size is in bytes, not elements
if (arr == NULL) abort(); // see note below.
}
arr[i++] = (struct foo){bar};
}
The number of assigned elements in the array is i. Don't forget to free() the array when you're done with it.
Note: In real programs you generally do not do p = realloc(p, s) directly. Instead you assign the result of realloc() to a new pointer, then do error detection & handling before clobbering your original pointer.
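A minimal sketch of that safer pattern, under the assumption that the array is wrapped in a small struct; foo_vec and foo_vec_push are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>

struct foo { int bar; };

/* Hypothetical growable array: `len` elements used, `cap` allocated.
   Initialise with {NULL, 0, 0}. */
struct foo_vec {
    struct foo *items;
    size_t len;
    size_t cap;
};

/* Append one element, doubling the capacity when the array is full.
   Returns 0 on success, -1 if realloc fails (the old data stays valid). */
int foo_vec_push(struct foo_vec *v, int bar)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* realloc(NULL, n) behaves like malloc(n), so this also handles
           the first allocation. */
        struct foo *tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;          /* v->items is untouched and still owned */
        v->items = tmp;
        v->cap = new_cap;
    }
    v->items[v->len++].bar = bar;
    return 0;
}
```

Because the result of realloc() goes into tmp first, a failed call leaves v->items pointing at the original block, which the caller can still use or free.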

Related

Howto manage freeing single pointers from a double-pointer block

I have a block of pointers to some structs which I want to handle (i.e. free) separately. As an example below there is an integer double-pointer which should keep other pointers to integer. I then would like to free the second of those integer pointers (in my program based on some filterings and calculations). If I do so however, I should keep track of int-pointers already set free so that when I iterate over the pointers in the double-pointer I do not take the risk of working with them further. Is there a better approach for solving this problem (in ANSI-C) without using other libs (e.g. glib or alike)?
Here is a small simulation of the problem:
#include <stdio.h>
#include <stdlib.h>
int main() {
int **ipp=NULL;
for (int i = 0; i < 3; i++) {
int *ip = malloc(sizeof (int));
printf("%p -> ip %d\n", ip, i);
*ip = i * 10;
if ((ipp = realloc(ipp, sizeof (int *) * (i + 1)))) {
ipp[i] = ip;
}
}
printf("%p -> ipp\n", ipp);
for (int i = 0; i < 3; i++) {
printf("%d. %p %p %d\n", i, ipp + i, *(ipp+i), **(ipp + i));
}
// free the middle integer pointer
free(*(ipp+1));
printf("====\n");
for (int i = 0; i < 3; i++) {
printf("%d. %p %p %d\n", i, ipp + i, *(ipp+i), **(ipp + i));
}
return 0;
}
which prints something like
0x555bcc07f2a0 -> ip 0
0x555bcc07f6f0 -> ip 1
0x555bcc07f710 -> ip 2
0x555bcc07f6d0 -> ipp
0. 0x555bcc07f6d0 0x555bcc07f2a0 0
1. 0x555bcc07f6d8 0x555bcc07f6f0 10
2. 0x555bcc07f6e0 0x555bcc07f710 20
====
0. 0x555bcc07f6d0 0x555bcc07f2a0 0
1. 0x555bcc07f6d8 0x555bcc07f6f0 0
2. 0x555bcc07f6e0 0x555bcc07f710 20
Here I have freed the middle int-pointer. In my actual program I create a new block for an integer double-pointer, iterate over the current one, create new integer pointers and copy the old values into them, realloc the double-pointer block and append the new pointer to it, and at the end free the old block and all the pointers it contains. This is a bit ugly, and resource-consuming if there is a huge amount of data, since I have to iterate over and create and copy all the data twice. Any help is appreciated.
Re: "This is a bit ugly, and resource-consuming if there is a huge amount of data, since I have to iterate over and create and copy all the data twice. Any help is appreciated."
First observation: It is not necessary to use realloc() when allocating new memory on a pointer that has already been freed. realloc() is useful when you need to preserve the contents of a particular area of memory while expanding its size. If that is not needed (and it is not in this case), malloc() or calloc() is sufficient. @Marco's suggestion is correct.
Second observation: the following code snippet:
if ((ipp = realloc(ipp, sizeof (int *) * (i + 1)))) {
ipp[i] = ip;
}
is a potential memory leak. If the call to realloc() fails, it returns NULL and the original block stays allocated; assigning that NULL to ipp overwrites the only pointer to the block, so the previously allocated memory is orphaned with no way to free it.
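One way to avoid that leak is to keep the result of realloc() in a temporary until it is known to be valid. The sketch below assumes the block and its element count are passed by address; append_int_ptr is a hypothetical helper name.

```c
#include <stdlib.h>

/* Hypothetical helper: append `ip` to the pointer block *ipp, which
   currently holds *n entries. Returns 0 on success. On failure it
   returns -1 and leaves *ipp and *n unchanged, so the caller can
   still free everything that was allocated so far. */
int append_int_ptr(int ***ipp, size_t *n, int *ip)
{
    int **tmp = realloc(*ipp, (*n + 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    tmp[*n] = ip;
    *ipp = tmp;
    (*n)++;
    return 0;
}
```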
Third observation: Your approach is described as needing:
Array of struct
dynamic memory allocation of a 2D array
need to delete elements of 2D array, and ensure they are not referenced once deleted
need to repurpose deleted elements of 2D array
Your initial reaction in the comments to considering an alternative approach notwithstanding, linked lists are a perfect fit for the needs stated in your post.
The fundamental element of a Linked List uses a struct
Nodes (elements) of a List are dynamically allocated when created.
Nodes of a List are not accessible to be used once deleted. (No need to track)
Once the need exists, a new node is easily created.
Example struct follows. I like to use a data struct to contain the payload, then use an additional struct as the conveyance, to carry the data when building a Linked List:
typedef struct {//to simulate your struct
int dNum;
char unique_name[30];
double fNum;
} data_s;
typedef struct Node {//conveyance of payload, forward and backward searchable
data_s data;
struct Node *next; // Pointer to next node in DLL
struct Node *prev; // Pointer to previous node in DLL
} list_t;
Creating a list is done by creating a series of nodes as needed at run-time. Typically, as records of a database or lines of a file are read, the elements of the table record (or elements of the line in a file) are read into an instance of the data part of the list_t struct. A function is typically defined to do this, for example
// needs <stdlib.h> and <string.h>
void insert_node(list_t **head, const data_s *new)
{
list_t *temp = malloc(sizeof(*temp));
if (temp == NULL)
return; // allocation failed; the list is left unchanged
// populate the payload
temp->data.dNum = new->dNum;
strcpy(temp->data.unique_name, new->unique_name);
temp->data.fNum = new->fNum;
// arrange list to accommodate new node: new list...
temp->next = temp->prev = NULL;
if (!(*head))
(*head) = temp;
else//...or existing list
{
temp->next = *head;
(*head)->prev = temp;
(*head) = temp;
}
}
Deleting a node can be done in multiple ways. In the following example a unique value of a node member is used, in this case unique_name
void delete_node(list_t **head_ref, list_t *del); // defined further down
void delete_node_by_name(list_t **head_ref, const char *name)
{
int not_found = 1; // plain int flag; BOOL/TRUE/FALSE are not standard C
// if list is empty
if ((*head_ref) == NULL)
return;
list_t *current = *head_ref;
list_t *next = NULL;
// traverse the list up to the end
while (current != NULL && not_found)
{
// if 'name' in node...
if (strcmp(current->data.unique_name, name) == 0)
{
//set loop to exit
not_found = 0;
//save current's next node in the pointer 'next'
next = current->next;
// delete the node pointed to by 'current'
delete_node(head_ref, current);
// reset the pointers
current = next;
}
// increment to next node
else
{
current = current->next;
}
}
}
Where delete_node() is defined as:
void delete_node(list_t **head_ref, list_t *del)
{
// base case
if (*head_ref == NULL || del == NULL)
return;
// If node to be deleted is head node
if (*head_ref == del)
*head_ref = del->next;
// Change next only if node to be deleted is NOT the last node
if (del->next != NULL)
del->next->prev = del->prev;
// Change prev only if node to be deleted is NOT the first node
if (del->prev != NULL)
del->prev->next = del->next;
// Finally, free the memory occupied by del
free(del);
}
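For reference, the pieces above can be condensed into one self-contained sketch. The names push_front, remove_by_name, list_length and list_free are hypothetical and chosen for this illustration only; the struct definitions repeat the ones shown earlier.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int dNum;
    char unique_name[30];
    double fNum;
} data_s;

typedef struct Node {
    data_s data;
    struct Node *next;
    struct Node *prev;
} list_t;

/* Push a copy of *item at the head; returns 0 on success, -1 on failure. */
int push_front(list_t **head, const data_s *item)
{
    list_t *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->data = *item;          /* struct assignment copies all members */
    node->prev = NULL;
    node->next = *head;
    if (*head != NULL)
        (*head)->prev = node;
    *head = node;
    return 0;
}

/* Unlink and free the first node whose unique_name matches.
   Returns 1 if a node was removed, 0 if no node matched. */
int remove_by_name(list_t **head, const char *name)
{
    for (list_t *cur = *head; cur != NULL; cur = cur->next) {
        if (strcmp(cur->data.unique_name, name) != 0)
            continue;
        if (cur->prev != NULL)
            cur->prev->next = cur->next;
        else
            *head = cur->next;   /* removing the head node */
        if (cur->next != NULL)
            cur->next->prev = cur->prev;
        free(cur);
        return 1;
    }
    return 0;
}

/* Count the nodes in the list. */
size_t list_length(const list_t *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Free every node in the list. */
void list_free(list_t *head)
{
    while (head != NULL) {
        list_t *next = head->next;
        free(head);
        head = next;
    }
}
```

Once a node is removed it is no longer reachable from the head, so there is nothing left to track for freed elements.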
This link is an introduction to Linked Lists, and has additional links to other related topic to expand the types of lists that are available.
You could use standard function memmove and then call realloc. For example
Let's assume that currently there are n pointers. Then you can write
free( *(ipp + i ) );
memmove( ipp + i, ipp + i + 1, ( n - i - 1 ) * sizeof( *ipp ) );
*( ipp + n - 1 ) = NULL; // if the call of realloc is not successful
// then the last pointer will be equal to NULL
int **tmp = realloc( ipp, ( n - 1 ) * sizeof( *tmp ) );
if ( tmp != NULL )
{
ipp = tmp;
--n;
}
else
{
// some other actions
}

Pointer seg faulting although I malloc-ed right

I don't understand why my program seg faults at this line: if ((**table->table).link == NULL){ I seem to have malloc-ed memory for it, and I tried looking at it with gdb. *table->table was accessible and not NULL, but **table->table was not accessible.
Definition of hash_t:
struct table_s {
struct node_s **table;
size_t bins;
size_t size;
};
typedef struct table_s *hash_t;
void set(hash_t table, char *key, int value){
unsigned int hashnum = hash(key)%table->bins;
printf("%d \n", hashnum);
unsigned int i;
for (i = 0; i<hashnum; i++){
(table->table)++;
}
if (*(table->table) == NULL){
struct node_s n = {key, value, NULL};
struct node_s *np = &n;
*(table->table) = malloc(sizeof(struct node_s));
*(table->table) = np;
}else{
while ( *(table->table) != NULL){
if ((**table->table).link == NULL){
struct node_s n = {key, value, NULL};
struct node_s *np = &n;
(**table->table).link = malloc(sizeof(struct node_s));
(**table->table).link = np;
break;
}else if (strcmp((**table->table).key, key) == 0){
break;
}
*table->table = (**(table->table)).link;
}
if (table->size/table->bins > 1){
rehash(table);
}
}
}
I'm calling set from here:
for (int i = 0; i < trials; i++) {
int sample = rand() % max_num;
sprintf(key, "%d", sample);
set(table, key, sample);
}
Your hashtable works like this: You have bins bins and each bin is a linked list of key / value pairs. All items in a bin share the same hash code modulo the number of bins.
You have probably created the table of bins when you created or initialised the hash table, something like this:
table->table = malloc(table->bins * sizeof(*table->table));
for (size_t i = 0; i < table->bins; i++) table->table[i] = NULL;
Now why does the member table have two stars?
The "inner" star means that the table stores pointers to nodes, not the nodes themselves.
The "outer" star is a handle to allocated memory. If your hash table were of a fixed size, for example always with 256 bins, you could define it as:
struct node_s *table[256];
If you passed this array around, it would become (or "decay into") a pointer to its first element, a struct node_s **, just as the array you got from malloc.
You access the contents of the bins via the linked lists, and the head of linked list i is table->table[i].
Your code has other problems:
What did you want to achieve with (table->table)++? This will make the handle to the allocated memory point not to the first element but to the next one. After doing that hashnum times, *table->table will now be at the right node, but you will have lost the original handle, which you must retain, because you must pass it to free later when you clean up your hash table. Don't lose the handle to allocated memory! Use another local pointer instead.
You create a local node n and then make a link in your linked list with a pointer to that node. But the node n will be gone after you leave the function and the link will be "stale": It will point to invalid memory. You must also create memory for the node with malloc.
A simple implementation of your hash table might be:
void set(hash_t table, char *key, int value)
{
unsigned int hashnum = hash(key) % table->bins;
// create (uninitialised) new node
struct node_s *nnew = malloc(sizeof(*nnew));
// initialise new node, point it to old head
nnew->key = strdup(key);
nnew->value = value;
nnew->link = table->table[hashnum];
// make the new node the new head
table->table[hashnum] = nnew;
}
This makes the new node the head of the linked list. This is not ideal, because if you overwrite items, the new ones will be found (which is good), but the old ones will still be in the table (which isn't good). But that, as they say, is left as an exercise to the reader.
(The strdup function wasn't part of standard C before C23, but it is in POSIX and widely available. It also allocates new memory, which you must free later, but it ensures that the string "lives" (is still valid) after you have created the hash table.)
Please note how few stars there are in the code. If there is one star too few, it is in hash_t, where you have hidden the pointer nature behind a typedef.
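As a possible take on the exercise of overwriting existing items, here is a sketch that searches the bin before inserting. It uses the table struct directly rather than hash_t, and hash_str (a djb2-style hash), table_set and table_get are assumptions made for this illustration; the original hash() is not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

struct node_s {
    char *key;
    int value;
    struct node_s *link;
};

struct table_s {
    struct node_s **table;
    size_t bins;
    size_t size;
};

/* Assumed hash function (djb2 style); stands in for the unseen hash(). */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert a key or update its value if it already exists.
   Returns 0 on success, -1 on allocation failure. */
int table_set(struct table_s *t, const char *key, int value)
{
    size_t bin = hash_str(key) % t->bins;
    for (struct node_s *n = t->table[bin]; n != NULL; n = n->link) {
        if (strcmp(n->key, key) == 0) {
            n->value = value;            /* overwrite, no duplicate node */
            return 0;
        }
    }
    struct node_s *nnew = malloc(sizeof *nnew);
    if (nnew == NULL)
        return -1;
    size_t len = strlen(key) + 1;
    nnew->key = malloc(len);             /* own copy of the key string */
    if (nnew->key == NULL) {
        free(nnew);
        return -1;
    }
    memcpy(nnew->key, key, len);
    nnew->value = value;
    nnew->link = t->table[bin];          /* old head becomes second node */
    t->table[bin] = nnew;
    t->size++;
    return 0;
}

/* Look up a key; returns a pointer to its value or NULL if absent. */
int *table_get(struct table_s *t, const char *key)
{
    size_t bin = hash_str(key) % t->bins;
    for (struct node_s *n = t->table[bin]; n != NULL; n = n->link)
        if (strcmp(n->key, key) == 0)
            return &n->value;
    return NULL;
}
```

Note that the handle t->table is only ever indexed, never incremented, so it can still be passed to free() during cleanup.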

Inserting into hash table

I am trying to insert an integer into a hash table. To do this, I'm creating an array of node*'s and I'm trying to make assignments like listarray[i]->data=5 possible. However, I'm still very confused with pointers and I'm crashing at the line with the comment '//crashes here' and I don't understand why. Was my initialization in main() invalid?
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
int data;
struct node * next;
} node;
//------------------------------------------------------------------------------
void insert (node **listarray, int size)
{
node *temp;
int value = 11; //just some random value for now, eventually will be scanned in
int index = value % size; // 11 modulo 8 yields 3
printf ("index is %d\n", index); //prints 3 fine
if (listarray[index] == NULL)
{
printf("listarray[%d] is NULL",index); //prints because of loop in main
listarray[index]->data = value; //crashes here
printf("listarray[%d] is now %d",index,listarray[index]->data); //never prints
listarray[index]->next = NULL;
}
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
}//end insert()
//------------------------------------------------------------------------------
int main()
{
int size = 8,i; //set default to 8
node * head=NULL; //head of the list
node **listarray = malloc (sizeof (node*) * size); //declare an array of Node *
//do i need double pointers here?
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
insert(*&listarray,size);
}
output:
index is 3
listarray[3] is NULL
(crash)
desired output:
index is 3
listarray[3] is NULL
listarray[3] is now 11
There are various issues here:
If you have a hash table of a certain size, then the hash code must map to a value between 0 and size - 1. Your default size is 8, but your hash code is x % 13, which means that your index might be out of bounds.
Your insert function should also pass the item to insert (unless that's the parameter called size, in which case it is severely misnamed).
if (listarray[index] == NULL) {
listarray[index]->data = value; //crashes here
listarray[index]->next = NULL;
}
It's no wonder that it crashes: When the node is NULL, you cannot dereference it with either * or ->. You should allocate new memory here.
And you shouldn't allocate memory here:
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
Allocating memory and then resetting it to NULL is nonsense. NULL is a special value that means that no memory is at the pointed-to location. Just set all nodes to NULL, which means that the hash table starts out without any nodes. Allocate when you need a node at a certain position.
In the else clause, you write:
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
Here temp hasn't been allocated, yet you dereference it. That's just as bad as dereferencing NULL.
Your hash table also needs a means to handle collisions. It looks as if at every index in the hash table, there is a linked list. That's a good way to deal with it, but you haven't implemented it properly.
You seem to have trouble understanding pointers. Perhaps you should start with a simpler data structure like a linked list, just to practice? When you have gotten a firm grasp of that, you can use what you've learned to implement your hash table.
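Putting those points together, a corrected insert might look like the sketch below. It assumes the value is passed in as a third parameter, that values are non-negative, and that every bucket starts out as NULL (for example by allocating the array with calloc); the return value is an addition for illustration.

```c
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Insert `value` at the head of its bucket's chain.
   Assumes value >= 0 so that value % size is a valid index.
   Returns the bucket index, or -1 if the allocation fails. */
int insert(node **listarray, int size, int value)
{
    int index = value % size;
    node *n = malloc(sizeof *n);     /* allocate only when a node is needed */
    if (n == NULL)
        return -1;
    n->data = value;
    n->next = listarray[index];      /* NULL for an empty bucket */
    listarray[index] = n;
    return index;
}
```

The same code path handles both the empty bucket and the collision case, because linking to a NULL head simply ends the chain.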

Freeing memory of used data leads to Segmentation Fault

I wrote a hashtable and it basically consists of these two structures:
typedef struct dictEntry {
void *key;
void *value;
struct dictEntry *next;
} dictEntry;
typedef struct dict {
dictEntry **table;
unsigned long size;
unsigned long items;
} dict;
dict.table is a multidimensional array, which contains all the stored key/value pair, which again are a linked list.
If half of the hashtable is full, I expand it by doubling the size and rehashing it:
dict *_dictRehash(dict *d) {
int i;
dict *_d;
dictEntry *dit;
_d = dictCreate(d->size * 2);
for (i = 0; i < d->size; i++) {
for (dit = d->table[i]; dit != NULL; dit = dit->next) {
_dictAddRaw(_d, dit);
}
}
/* FIXME memory leak because the old dict can never be freed */
free(d); // seg fault
return _d;
}
The function above uses the pointers from the old hash table and stores it in the newly created one. When freeing the old dict d a Segmentation Fault occurs.
How am I able to free the old hashtable struct without having to allocate the memory for the key/value pairs again?
Edit, for completeness:
dict *dictCreate(unsigned long size) {
dict *d;
d = malloc(sizeof(dict));
d->size = size;
d->items = 0;
d->table = calloc(size, sizeof(dictEntry*));
return d;
}
void dictAdd(dict *d, void *key, void *value) {
dictEntry *entry;
entry = malloc(sizeof *entry);
entry->key = key;
entry->value = value;
entry->next = '\0';
if ((((float)d->items) / d->size) > 0.5) d = _dictRehash(d);
_dictAddRaw(d, entry);
}
void _dictAddRaw(dict *d, dictEntry *entry) {
int index = (hash(entry->key) & (d->size - 1));
if (d->table[index]) {
dictEntry *next, *prev;
for (next = d->table[index]; next != NULL; next = next->next) {
prev = next;
}
prev->next = entry;
} else {
d->table[index] = entry;
}
d->items++;
}
The best way to debug this is to run your code under valgrind.
But to give you some perspective:
When you free(d), you are expecting something like a destructor call on your struct dict, which would internally free the memory allocated for the pointer-to-pointer to dictEntry. In C, free(d) releases only the dict struct itself.
Why do you have to delete the entire hash table to expand it? You have a next pointer anyway, so why not just append new hash entries to it?
The solution is not to free d, but rather to expand d by allocating more struct dictEntry and assigning them to the appropriate next.
When contracting d, you will have to iterate over next to reach the end and then start freeing the memory for the struct dictEntrys inside your d.
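Since C has no destructors, the cleanup has to be written out by hand. A minimal sketch of such a function follows; dictFree is a hypothetical name, and the sketch assumes the keys and values are owned by the caller and therefore not freed here.

```c
#include <stdlib.h>

typedef struct dictEntry {
    void *key;
    void *value;
    struct dictEntry *next;
} dictEntry;

typedef struct dict {
    dictEntry **table;
    unsigned long size;
    unsigned long items;
} dict;

/* Hypothetical destructor: frees every entry, then the bucket array,
   then the dict itself. Keys and values are NOT freed (assumed to be
   owned by the caller). Returns the number of entries released. */
unsigned long dictFree(dict *d)
{
    unsigned long freed = 0;
    if (d == NULL)
        return 0;
    for (unsigned long i = 0; i < d->size; i++) {
        dictEntry *e = d->table[i];
        while (e != NULL) {
            dictEntry *next = e->next;   /* read before freeing e */
            free(e);
            e = next;
            freed++;
        }
    }
    free(d->table);
    free(d);
    return freed;
}
```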
To clarify Graham's point, you need to pay attention to how memory is being accessed in this library. The user has one pointer to their dictionary. When you rehash, you free the memory referenced by that pointer. Although you allocated a new dictionary for them, the new pointer is never returned to them, so they don't know not to use the old one. When they try to access their dictionary again, it's pointing to freed memory.
One possibility is not to throw away the old dictionary entirely, but only the dictEntry table you allocated within the dictionary. That way your users will never have to update their pointer, but you can rescale the table to accommodate more efficient access. Try something like this:
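void _dictRehash(dict *d) {
printf("rehashing!\n");
unsigned long i;
dictEntry *dit, *next;
unsigned long old_size = d->size;
dictEntry** old_table = d->table;
unsigned long size = old_size * 2;
dictEntry** new_table = calloc(size, sizeof(dictEntry*));
if (new_table == NULL) return; // allocation failed; keep the old table
d->table = new_table;
d->size = size;
d->items = 0;
for (i = 0; i < old_size; i++) {
for (dit = old_table[i]; dit != NULL; dit = next) {
next = dit->next; // save the old successor before relinking
dit->next = NULL; // detach so no stale link is carried into the new table
_dictAddRaw(d, dit);
}
}
free(old_table);
}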
As a side note, I'm not sure what your hash function does, but it seems to me that the line
int index = (hash(entry->key) & (d->size - 1));
is a little unorthodox. You get a hash value and do a bitwise AND with the size of the table minus one. That is guaranteed to give a value in [0, size), but it only spreads entries over all buckets when size is a power of two; for any other size, I think you mean % for modulus.
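The difference between the two index computations is easy to check with a few numbers. The helpers below are hypothetical and exist only to illustrate the arithmetic.

```c
/* Returns 1 if n is a nonzero power of two, 0 otherwise. */
int is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bucket index by masking; matches the modulus only when size is a
   power of two. */
unsigned long index_mask(unsigned long h, unsigned long size)
{
    return h & (size - 1);
}

/* Bucket index by modulus; works for any nonzero size. */
unsigned long index_mod(unsigned long h, unsigned long size)
{
    return h % size;
}
```

For a hash of 13 and a size of 8, both functions give 5. With a size of 10 the mask is binary 1001, so index_mask can only ever produce 0, 1, 8 or 9, and it gives 9 where index_mod gives 3.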
You are freeing a pointer which is passed in to your function. This is only safe if you know that whoever's calling your function isn't still trying to use the old value of d. Check all the code which calls _dictRehash() and make sure nothing's hanging on to the old pointer.
What does dictCreate actually do?
I think you're getting confused between the (fixed size) dict object, and the (presumably variable sized) array of pointers to dictEntries in dict.table.
Maybe you could just realloc() the memory pointed to by dict.table, rather than creating a new 'dict' object and freeing the old one (which incidentally, isn't freeing the table of dictentries anyway!)

Why does this malloc not work in C?

Just trying to make a kind of hash table with each node being a linked list.
Having trouble just initializing the space, what am I doing wrong?
#include <stdlib.h>
typedef struct entry {
struct entry *next;
void *theData;
} Entry;
typedef struct HashTable {
Entry **table;
int size;
} HashTable;
int main(){
HashTable *ml;
ml = initialize();
return 0;
}
HashTable *initialize(void)
{
HashTable *p;
Entry **b;
int i;
if ((p = (HashTable *)malloc(sizeof(HashTable *))) == NULL)
return NULL;
p->size = 101;
if ((b = (Entry **)malloc(p->size * sizeof(Entry **))) == NULL)
return NULL;
p->table = b;
for(i = 0; i < p->size; i++) {
Entry * b = p->table[i];
b->theData = NULL;
b->next = NULL;
}
return p;
}
You need to change sizeof(HashTable*) to sizeof(HashTable) and similarly sizeof(Entry **) to sizeof(Entry *) . And the second thing is for every Entry you need to allocate memory using malloc again inside the loop.
if ((p = malloc(sizeof(HashTable))) == NULL)
return NULL;
p->size = 101;
if ((b = malloc(p->size * sizeof(Entry *))) == NULL)
return NULL;
I believe removing the malloc() result casts is best practice.
Plus, as @Naveen was first to point out, you also need to allocate memory for each Entry.
Firstly, your sizeofs are wrong. T *p = malloc(num * sizeof(T)) is correct. You can also use calloc.
You are reusing b for different purposes, so it is quite confusing. Single-character variable names are generally not a good idea.
p->table, which was b, is allocated but not initialised, i.e. its elements don't point to anything useful, yet you are trying to dereference them.
You need to fill it with Entry* pointers first, and they must point to valid Entry structs if you are going to dereference them.
Your process probably dies on the line b->theData = NULL
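A sketch of initialize() with the sizeof fixes applied is shown below. Note one deliberate difference from the suggestion to allocate every Entry up front: this version leaves each bucket as NULL and assumes entries are allocated later, on insertion, which is an assumption about the intended design.

```c
#include <stdlib.h>

typedef struct entry {
    struct entry *next;
    void *theData;
} Entry;

typedef struct HashTable {
    Entry **table;
    int size;
} HashTable;

/* Allocate a table with 101 empty buckets.
   Returns NULL if either allocation fails. */
HashTable *initialize(void)
{
    HashTable *p = malloc(sizeof *p);            /* size of the struct, not of a pointer */
    if (p == NULL)
        return NULL;
    p->size = 101;
    p->table = malloc(p->size * sizeof *p->table);   /* array of Entry * */
    if (p->table == NULL) {
        free(p);                                 /* do not leak the struct */
        return NULL;
    }
    for (int i = 0; i < p->size; i++)
        p->table[i] = NULL;                      /* empty bucket, nothing to dereference */
    return p;
}
```

Writing sizeof *p and sizeof *p->table ties each size to the pointer being assigned, which avoids the extra-star mistake from the question.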
Also, you can declare your HashTable as a variable instead of allocating it, either locally or in some function far enough up the call stack that it outlives every use of the table, and pass a pointer to the HashTable to your initialize function to avoid a malloc. malloc is comparatively slow.
So in main, you can do:
HashTable table;
InitializeHashTable(&table);
// use table (no need to free)
// just do not return table

Resources