Inserting into hash table - c

I am trying to insert an integer into a hash table. To do this, I'm creating an array of node*'s and I'm trying to make assignments like listarray[i]->data=5 possible. However, I'm still very confused with pointers and I'm crashing at the line with the comment '//crashes here' and I don't understand why. Was my initialization in main() invalid?
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
int data;
struct node * next;
} node;
//------------------------------------------------------------------------------
void insert (node **listarray, int size)
{
node *temp;
int value = 11; //just some random value for now, eventually will be scanned in
int index = value % size; // 11 modulo 8 yields 3
printf ("index is %d\n", index); //prints 3 fine
if (listarray[index] == NULL)
{
printf("listarray[%d] is NULL",index); //prints because of loop in main
listarray[index]->data = value; //crashes here
printf("listarray[%d] is now %d",index,listarray[index]->data); //never prints
listarray[index]->next = NULL;
}
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
}//end insert()
//------------------------------------------------------------------------------
int main()
{
int size = 8,i; //set default to 8
node * head=NULL; //head of the list
node **listarray = malloc (sizeof (node*) * size); //declare an array of Node *
//do i need double pointers here?
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
insert(*&listarray,size);
}
output:
index is 3
listarray[3] is NULL
(crash)
desired output:
index is 3
listarray[3] is NULL
listarray[3] is now 11

There are various issues here:
If you have a hash table of a certain size, then the hash code must map to a value between 0 and size - 1. Your default size is 8, but your hash code is x % 13, which means that your index might be out of bounds.
Your insert function should also pass the item to insert (unless that's the parameter called size, in which case it is severely misnamed).
if (listarray[index] == NULL) {
listarray[index]->data = value; //crashes here
listarray[index]->next = NULL;
}
It's no wonder that it crashes: When the node is NULL, you cannot dereference it with either * or ->. You should allocate new memory here.
And you shouldn't allocate memory here:
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
Allocating memory and then resetting it to NULL is nonsense. NULL is a special value that means that no memory is at the pointed-to location. Just set all nodes to NULL, which means that the hash table starts out without any nodes. Allocate when you need a node at a certain position.
In the else clause, you write:
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
but temp has never been allocated or initialised, yet you dereference it. That's just as bad as dereferencing NULL.
Your hash table also needs a means to handle collisions. It looks as if at every index in the hash table, there is a linked list. That's a good way to deal with it, but you haven't implemented it properly.
You seem to have trouble understanding pointers. Perhaps you should start with a simpler data structure like a linked list, just to practice? Once you have a firm grasp of that, you can use what you've learned to implement your hash table.
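A minimal sketch of how the points above could fit together, assuming separate chaining with insertion at the head of each bucket's list. The helpers table_create and table_free, the value parameter, and the int return code are additions for illustration and are not in the original code; non-negative values are assumed so that value % size is always a valid index.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Create a table of `size` empty buckets; every bucket starts as NULL. */
node **table_create(int size)
{
    assert(size > 0);
    node **listarray = malloc(sizeof *listarray * size);
    if (listarray == NULL)
        return NULL;
    for (int i = 0; i < size; i++)
        listarray[i] = NULL;
    return listarray;
}

/* Insert `value` at the head of its bucket's list.
   Returns 0 on success, -1 if no memory is available. */
int insert(node **listarray, int size, int value)
{
    assert(value >= 0);               /* assumption: keeps the index in range */
    int index = value % size;
    node *n = malloc(sizeof *n);      /* allocate only when a node is needed */
    if (n == NULL)
        return -1;
    n->data = value;
    n->next = listarray[index];       /* NULL when the bucket was empty */
    listarray[index] = n;
    return 0;
}

/* Free every node in every bucket, then the bucket array itself. */
void table_free(node **listarray, int size)
{
    for (int i = 0; i < size; i++) {
        node *cur = listarray[i];
        while (cur != NULL) {
            node *next = cur->next;
            free(cur);
            cur = next;
        }
    }
    free(listarray);
}
```

With size 8, inserting 11 and then 3 leaves both values in bucket 3, with 3 at the head of the list and 11 behind it.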

Related

How to manage freeing single pointers from a double-pointer block

I have a block of pointers to some structs which I want to handle (i.e. free) separately. As an example below there is an integer double-pointer which should keep other pointers to integer. I then would like to free the second of those integer pointers (in my program based on some filterings and calculations). If I do so however, I should keep track of int-pointers already set free so that when I iterate over the pointers in the double-pointer I do not take the risk of working with them further. Is there a better approach for solving this problem (in ANSI-C) without using other libs (e.g. glib or alike)?
Here is a small simulation of the problem:
#include <stdio.h>
#include <stdlib.h>
int main() {
int **ipp=NULL;
for (int i = 0; i < 3; i++) {
int *ip = malloc(sizeof (int));
printf("%p -> ip %d\n", ip, i);
*ip = i * 10;
if ((ipp = realloc(ipp, sizeof (int *) * (i + 1)))) {
ipp[i] = ip;
}
}
printf("%p -> ipp\n", ipp);
for (int i = 0; i < 3; i++) {
printf("%d. %p %p %d\n", i, ipp + i, *(ipp+i), **(ipp + i));
}
// free the middle integer pointer
free(*(ipp+1));
printf("====\n");
for (int i = 0; i < 3; i++) {
printf("%d. %p %p %d\n", i, ipp + i, *(ipp+i), **(ipp + i));
}
return 0;
}
which prints something like
0x555bcc07f2a0 -> ip 0
0x555bcc07f6f0 -> ip 1
0x555bcc07f710 -> ip 2
0x555bcc07f6d0 -> ipp
0. 0x555bcc07f6d0 0x555bcc07f2a0 0
1. 0x555bcc07f6d8 0x555bcc07f6f0 10
2. 0x555bcc07f6e0 0x555bcc07f710 20
====
0. 0x555bcc07f6d0 0x555bcc07f2a0 0
1. 0x555bcc07f6d8 0x555bcc07f6f0 0
2. 0x555bcc07f6e0 0x555bcc07f710 20
Here I have freed the middle int-pointer. In my actual program I create a new block for an integer double-pointer, iterate over the current one, create new integer pointers and copy the old values into them, realloc the double-pointer block and append each new pointer to it, and at the end free the old block and all the pointers it contains. This is a bit ugly, and resource-consuming if there is a huge amount of data, since I have to iterate over and create and copy all the data twice. Any help is appreciated.
Re: "This is a bit ugly, and resource-consuming if there is a huge amount of data, since I have to iterate over and create and copy all the data twice. Any help is appreciated."
First observation: It is not necessary to use realloc() when allocating new memory for a pointer that has already been freed. realloc() is useful when you need to preserve the contents of a block of memory while changing its size. If that is not needed (and it is not in this case), malloc() or calloc() is sufficient. @Marco's suggestion is correct.
Second observation: the following code snippet:
if ((ipp = realloc(ipp, sizeof (int *) * (i + 1)))) {
ipp[i] = ip;
}
is a potential memory leak. If the call to realloc() fails, it returns NULL and ipp is overwritten with that value, so the previously allocated block becomes orphaned, with no way to free it.
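One common way to avoid that leak is to assign the result of realloc() to a temporary pointer first and only overwrite the original on success. The sketch below assumes a hypothetical helper, append_ptr, which is not part of the original post.

```c
#include <assert.h>
#include <stdlib.h>

/* Append `ip` to the pointer array `*ipp`, which currently holds `*n` entries.
   Returns 0 on success. On failure returns -1 and leaves the old block
   untouched, so the caller still owns it and can free it. */
int append_ptr(int ***ipp, size_t *n, int *ip)
{
    assert(ipp != NULL && n != NULL);
    int **tmp = realloc(*ipp, (*n + 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;          /* *ipp is still valid here */
    tmp[*n] = ip;
    *ipp = tmp;
    *n += 1;
    return 0;
}
```

Starting from ipp = NULL and n = 0 works, because realloc() with a NULL pointer behaves like malloc().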
Third observation: Your approach is described as needing:
Array of struct
dynamic memory allocation of a 2D array
need to delete elements of 2D array, and ensure they are not referenced once deleted
need to repurpose deleted elements of 2D array
Your initial reaction in the comments to an alternative approach notwithstanding, linked lists are a perfect fit for the needs stated in your post.
The fundamental element of a Linked List uses a struct
Nodes (elements) of a List are dynamically allocated when created.
Nodes of a List are not accessible to be used once deleted. (No need to track)
Once the need exists, a new node is easily created.
Example struct follows. I like to use a data struct to contain the payload, then use an additional struct as the conveyance, to carry the data when building a Linked List:
typedef struct {//to simulate your struct
int dNum;
char unique_name[30];
double fNum;
} data_s;
typedef struct Node {//conveyance of payload, forward and backward searchable
data_s data;
struct Node *next; // Pointer to next node in DLL
struct Node *prev; // Pointer to previous node in DLL
} list_s;
Creating a list is done by creating a series of nodes as needed at run time. Typically, as records of a database or lines of a file are read, the elements of the record (or of the line) are read into an instance of the data part of the list_s struct. A function is typically defined to do this, for example:
void insert_node(list_s **head, data_s *new)
{
list_s *temp = malloc(sizeof(*temp));
if (temp == NULL)
return; //allocation failed; the list is unchanged
//populate the payload (strcpy requires <string.h>)
temp->data.dNum = new->dNum;
strcpy(temp->data.unique_name, new->unique_name);
temp->data.fNum = new->fNum;
//arrange list to accommodate the new node
temp->next = temp->prev = NULL;
if (!(*head))//empty list...
(*head) = temp;
else//...or existing list
{
temp->next = *head;
(*head)->prev = temp;
(*head) = temp;
}
}
Deleting a node can be done in multiple ways. In the following example, a unique value of a node member is used to find the node, in this case unique_name:
void delete_node_by_name(list_s** head_ref, const char *name)
{
int not_found = 1; // loop flag; a plain int, so no extra header is needed
// if list is empty
if ((*head_ref) == NULL)
return;
list_s *current = *head_ref;
list_s *next = NULL;
// traverse the list up to the end
while (current != NULL && not_found)
{
// if 'name' in node...
if (strcmp(current->data.unique_name, name) == 0)
{
//set loop to exit
not_found = 0;
//save current's next node in the pointer 'next'
next = current->next;
// delete the node pointed to by 'current'
delete_node(head_ref, current);
// reset the pointers
current = next;
}
// increment to next node
else
{
current = current->next;
}
}
}
Where delete_node() is defined as:
void delete_node(list_s **head_ref, list_s *del)
{
// base case
if (*head_ref == NULL || del == NULL)
return;
// If node to be deleted is head node
if (*head_ref == del)
*head_ref = del->next;
// Change next only if node to be deleted is NOT the last node
if (del->next != NULL)
del->next->prev = del->prev;
// Change prev only if node to be deleted is NOT the first node
if (del->prev != NULL)
del->prev->next = del->next;
// Finally, free the memory occupied by del
free(del);
}
This link is an introduction to linked lists, and has additional links to related topics covering the other types of lists that are available.
You could use the standard function memmove and then call realloc. For example, let's assume that there are currently n pointers. Then you can write
free( *( ipp + i ) );
memmove( ipp + i, ipp + i + 1, ( n - i - 1 ) * sizeof( *ipp ) );
*( ipp + n - 1 ) = NULL; // if the call to realloc is not successful,
// the last pointer will be equal to NULL
int **tmp = realloc( ipp, ( n - 1 ) * sizeof( *tmp ) );
if ( tmp != NULL )
{
ipp = tmp;
--n;
}
else
{
// some other actions
}
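The memmove approach can be wrapped in a small function. This is a sketch under assumptions: remove_at is a hypothetical name, the element count is always decremented because the freed element is gone whether or not the shrink succeeds, and the n == 1 case is handled separately because the behaviour of realloc() with a size of zero varies between implementations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free the element at index i, close the gap, and try to shrink the block.
   Returns the new element count. */
size_t remove_at(int ***ipp, size_t n, size_t i)
{
    assert(ipp != NULL && *ipp != NULL && i < n);
    int **p = *ipp;
    free(p[i]);
    /* Shift the tail down by one slot; regions overlap, so memmove is required. */
    memmove(p + i, p + i + 1, (n - i - 1) * sizeof *p);
    n--;
    if (n == 0) {                 /* avoid realloc(p, 0) */
        free(p);
        *ipp = NULL;
        return 0;
    }
    int **tmp = realloc(p, n * sizeof *tmp);
    if (tmp != NULL)              /* if shrinking fails, the old block is still usable */
        *ipp = tmp;
    return n;
}
```

Called as n = remove_at(&ipp, n, 1), this leaves no freed pointer in the block, so later loops over the first n entries need no extra bookkeeping.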

Pointer seg faulting although I malloc-ed right

I don't understand why my program seg faults at this line: if ((**table->table).link == NULL){ I seem to have malloc-ed memory for it, and I tried looking at it with gdb. *table->table was accessible and not NULL, but **table->table was not accessible.
Definition of hash_t:
struct table_s {
struct node_s **table;
size_t bins;
size_t size;
};
typedef struct table_s *hash_t;
void set(hash_t table, char *key, int value){
unsigned int hashnum = hash(key)%table->bins;
printf("%d \n", hashnum);
unsigned int i;
for (i = 0; i<hashnum; i++){
(table->table)++;
}
if (*(table->table) == NULL){
struct node_s n = {key, value, NULL};
struct node_s *np = &n;
*(table->table) = malloc(sizeof(struct node_s));
*(table->table) = np;
}else{
while ( *(table->table) != NULL){
if ((**table->table).link == NULL){
struct node_s n = {key, value, NULL};
struct node_s *np = &n;
(**table->table).link = malloc(sizeof(struct node_s));
(**table->table).link = np;
break;
}else if (strcmp((**table->table).key, key) == 0){
break;
}
*table->table = (**(table->table)).link;
}
if (table->size/table->bins > 1){
rehash(table);
}
}
}
I'm calling set from here:
for (int i = 0; i < trials; i++) {
int sample = rand() % max_num;
sprintf(key, "%d", sample);
set(table, key, sample);
}
Your hashtable works like this: You have bins bins and each bin is a linked list of key / value pairs. All items in a bin share the same hash code modulo the number of bins.
You have probably created the table of bins when you created or initialised the hash table, something like this:
table->table = malloc(table->bins * sizeof(*table->table));
for (size_t i = 0; i < table->bins; i++) table->table[i] = NULL;
Now why does the member table have two stars?
The "inner" star means that the table stores pointers to nodes, not the nodes themselves.
The "outer" star is a handle to allocated memory. If your hash table were of a fixed size, for example always with 256 bins, you could define it as:
struct node_s *table[256];
If you passed this array around, it would become (or "decay into") a pointer to its first element, a struct node_s **, just as the array you got from malloc.
You access the contents of the bins via the linked lists, and the head of linked list i is table->table[i].
Your code has other problems:
What did you want to achieve with (table->table)++? This makes the handle to the allocated memory point not to the first element but to the next one. After doing that hashnum times, *table->table will be at the right node, but you will have lost the original handle, which you must retain, because you must pass it to free later when you clean up your hash table. Don't lose the handle to allocated memory! Use another local pointer instead.
You create a local node n and then make a link in your linked list with a pointer to that node. But the node n will be gone after you leave the function and the link will be "stale": It will point to invalid memory. You must also create memory for the node with malloc.
A simple implementation for your hash table might be:
void set(hash_t table, char *key, int value)
{
unsigned int hashnum = hash(key) % table->bins;
// create (uninitialised) new node
struct node_s *nnew = malloc(sizeof(*nnew));
// initialise new node, point it to old head
nnew->key = strdup(key);
nnew->value = value;
nnew->link = table->table[hashnum];
// make the new node the new head
table->table[hashnum] = nnew;
}
This makes the new node the head of the linked list. This is not ideal, because if you overwrite items, the new ones will be found (which is good), but the old ones will still be in the table (which isn't good). But that, as they say, is left as an exercise to the reader.
(The strdup function comes from POSIX and only became part of the C standard with C23, but it is widely available. It also allocates new memory, which you must free later, but it ensures that the string "lives" (is still valid) after you have created the hash table.)
Please note how few stars there are in the code. If there is one star too few, it is in hash_t, where the typedef hides the pointer nature.
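The exercise mentioned above (updating an existing key instead of adding a second node for it) could be sketched as follows. Several things here are assumptions: the layout of struct node_s is inferred from the initialiser {key, value, NULL}, the hash function is a placeholder because the original hash() is not shown, the table is passed as struct table_s * so the pointer stays visible, and the key is copied with malloc and strcpy as a portable stand-in for strdup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout, inferred from the initialiser {key, value, NULL}. */
struct node_s {
    char *key;
    int value;
    struct node_s *link;
};

struct table_s {
    struct node_s **table;
    size_t bins;
    size_t size;
};

/* Placeholder hash (djb2-style); the original hash() is not shown. */
static unsigned int hash(const char *s)
{
    unsigned int h = 5381;
    while (*s != '\0')
        h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Update the value if key exists, otherwise insert a new head node.
   Returns 0 on success, -1 if allocation fails. */
int set(struct table_s *t, const char *key, int value)
{
    assert(t != NULL && t->bins > 0);
    size_t bin = hash(key) % t->bins;

    /* Walk the chain with a local pointer; t->table itself never moves. */
    for (struct node_s *n = t->table[bin]; n != NULL; n = n->link) {
        if (strcmp(n->key, key) == 0) {
            n->value = value;
            return 0;
        }
    }

    struct node_s *nnew = malloc(sizeof *nnew);
    if (nnew == NULL)
        return -1;
    nnew->key = malloc(strlen(key) + 1);   /* heap copy of the key */
    if (nnew->key == NULL) {
        free(nnew);
        return -1;
    }
    strcpy(nnew->key, key);
    nnew->value = value;
    nnew->link = t->table[bin];            /* old head, or NULL */
    t->table[bin] = nnew;
    t->size++;
    return 0;
}
```

Setting the same key twice now leaves one node holding the latest value, and size counts distinct keys only.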

Array pointer always points to NULL

I'm trying to create a linked list in an array of nodes. When I update the pointer arrTab->h_table[index] to the address of newNode, it points to newNode's address. But when I try to add to a list that already exists in the array, the pointer always points to NULL instead of the previous value in memory. Basically, the head of the linked list, arrTab->h_table[index], does not update to the address of newNode.
typedef struct node {
struct node* next;
int hash;
s_type symbol;
} node_t;
struct array {
int cap;
int size;
n_type** h_table;
};
int add_to_array (array* arrTab, const char* name, int address) {
if(s_search(arrTab, name, NULL, NULL) == NULL){
s_type *symbol = (s_type*) calloc(1, sizeof(s_type));
symbol->name = strdup(name);
symbol->addr = addr;
n_type *newNode = (n_type*) calloc(1, sizeof(n_type));
newNode->next = NULL;
newNode->hash = nameHash(name);
newNode->symbol = *symbol;
int index = newNode->hash % arrTab->cap;
if(arrTab->h_table[index] == NULL){
arrTab->h_table[index] = newNode;
} else {
newNode->next = arrTab->h_table[index];
arrTab->h_table[index] = newNode;
}
//
arrTab->size++;
return 1;
}
return 0;
}
struct node* s_search (array* arrTab, const char* name, int* hash, int* index) {
int hashVal = nameHash(name);
hash = &hashVal;
int indexVal = *hash % arrTab->cap;
index = &indexVal;
s_type *symCopy = arrTab;
while (symCopy->h_table[*index] != NULL){
if(*hash == symCopy->h_table[*index]->hash){
return symCopy->h_table[*index];
}
symCopy->h_table[*index] = symCopy->h_table[*index]->next;
}
return NULL;
}
I cannot say for sure why the pointer always points to NULL; there is not enough code. Consider posting an MCVE.
The posted code, however, presents a few problems to address.
First, it leaks memory like there is no tomorrow:
symbol_t *symbol = (symbol_t*) calloc(1, sizeof(symbol_t));
allocates some memory, and
newNode->symbol = *symbol;
copies the contents of that memory to the new location. The memory allocated still exists, and continues to exist after the function returns, but there's no way to get to it. I strongly recommend not allocating symbol at all, and working directly with newNode->symbol:
newNode->symbol.name = strdup(name);
newNode->symbol.addr = addr;
The hash and index parameters to symbol_search seem to be intended as out parameters. In that case, notice that the results of hash = &hashVal; and index = &indexVal; are invisible to the caller. You likely meant *hash = hashVal and *index = indexVal.
The biggest problem comes with sym_table_t *symCopy = symTab;.
symTab is a pointer. It points to an actual symbol table, a big piece of memory. After the assignment, symCopy points to the same piece of memory. Which means that
symCopy->hash_table[*index] = symCopy->hash_table[*index]->next;
modifies that piece of memory. Once the search is completed, the hash_table[index] is not the same as it was before the search. This could be a root of your problem. In any case, consider
node_t * cursor = symTab->hash_table[*index];
and work with this cursor instead.
As a side note, a search condition *hash == symCopy->hash_table[*index]->hash is strange. Every node in a given linked list has the same hash (check how you add them). The very first node would produce a match, even if the names are different.
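Putting these points together, a search could look roughly like the sketch below. The type definitions are simplified guesses (the real symbol_t and sym_table_t are not shown in full), name_hash is a placeholder for the unseen nameHash(), and the NULL checks on the out parameters are an assumption based on the call that passes NULL for both.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, assumed definitions. */
typedef struct { char *name; int addr; } symbol_t;

typedef struct node {
    struct node *next;
    int hash;
    symbol_t symbol;
} node_t;

typedef struct {
    int cap;
    int size;
    node_t **hash_table;
} sym_table_t;

/* Placeholder; the original nameHash() is not shown. */
static int name_hash(const char *s)
{
    unsigned int h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return (int)(h & 0x7fffffff);   /* keep it non-negative for % */
}

node_t *symbol_search(const sym_table_t *symTab, const char *name,
                      int *hash, int *index)
{
    assert(symTab != NULL && symTab->cap > 0);
    int hashVal = name_hash(name);
    int indexVal = hashVal % symTab->cap;
    if (hash != NULL)  *hash = hashVal;    /* write through the out parameters */
    if (index != NULL) *index = indexVal;

    /* The cursor is a local copy, so advancing it leaves the table intact. */
    for (node_t *cursor = symTab->hash_table[indexVal];
         cursor != NULL; cursor = cursor->next) {
        /* Compare names as well: nodes in one bucket can share a hash. */
        if (cursor->hash == hashVal && strcmp(cursor->symbol.name, name) == 0)
            return cursor;
    }
    return NULL;
}
```

Because only the local cursor advances, hash_table[index] still points at the head of the list after the search, so a later insert can link the new node in front of it.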

How to double the size of a dynamic array while keeping the old contents

For part of my C data structures assignment, I am tasked with taking an array of pointers to nodes of 2 doubly linked lists (one representing the main service queue, and the other representing a "bucket" of buzzers ready to be reused or used for the first time in the queue) and doubling its size while keeping the original contents intact. The idea is that each node has an associated ID which corresponds to its index in the pointer array. So, for example, the pointer at index 3 will always point to the node whose ID is 3. The boolean inQ is for something unrelated to this issue.
I've written most of the code, but it seems to be functioning incorrectly: it changes all the original pointers to the last node in the list before the array resizing. So, since the starting size of the array is 10 elements, when I print out the contents after the function, it displays 9 9 9 9 9 9 9 9 9 9.
Here are the structs I'm using:
typedef struct node {
int id;
int inQ;
struct node *next;
struct node *prev;
}NODE;
typedef struct list
{
NODE *front;
NODE *back;
int size;
} LIST;
//referred to as SQ in the separate header file
struct service_queue
{
LIST *queue;
LIST *bucket;
NODE **arr;
int arrSize;
int maxID;
};
Here is the function in question:
SQ sq_double_array(SQ *q)
{
NODE **arr2 = malloc(q->arrSize * 2 * sizeof(NODE*));
int i;
//fill the first half of the new array with the node pointers of the first array
for (i = 0; i < q->arrSize; i++)
{
arr2[i] = malloc(sizeof(NODE));
if (i > 0)
{
arr2[i - 1]->next = arr2[i];
arr2[i]->prev = arr2[i - 1];
}
arr2[i]->id = q->arr[i]->id;
arr2[i]->inQ = q->arr[i]->inQ;
arr2[i]->next = q->arr[i]->next;
arr2[i]->prev = q->arr[i]->prev;
}
//fill the second half with node pointers to the new nodes and place them into the bucket
for (i = q->arrSize; i < q->arrSize * 2; i++)
{
//Point the array elements equal to empty nodes, corresponding to the inidicies
arr2[i] = malloc(sizeof(NODE));
arr2[i]->id = i;
arr2[i]->inQ = 0;
//If the bucket is empty (first pass)
if (q->bucket->front == NULL)
{
q->bucket->front = arr2[i];
arr2[i]->prev = NULL;
arr2[i]->next = NULL;
q->bucket->back = arr2[i];
}
//If the bucket has at least 1 buzzer in it
else
{
q->bucket->back = malloc(sizeof(NODE));
q->bucket->back->next = arr2[i];
q->bucket->back = arr2[i];
q->bucket->back->next = NULL;
}
}
q->arrSize *= 2;
q->arr = arr2;
return *q;
}
Keep in mind this must be done in C only, which is why I'm not using 'new'.
You could use the realloc function:
void *realloc(void *ptr, size_t size);
Quoted from the man pages:
The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), calloc() or realloc(). If the area pointed to was moved, a free(ptr) is done.
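Applied to the array of node pointers from the question, a sketch might look like this. It assumes that only the pointer array needs to grow: the nodes themselves are not copied, so their next and prev links stay valid. double_ptr_array is a hypothetical helper, and creating the new buzzers for the added slots is left out.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int id;
    int inQ;
    struct node *next;
    struct node *prev;
} NODE;

/* Double the capacity of a NODE* array in place.
   Existing pointers are preserved by realloc; new slots are set to NULL.
   Returns 0 on success, -1 on failure (array and size left unchanged). */
int double_ptr_array(NODE ***arr, int *arrSize)
{
    assert(arr != NULL && arrSize != NULL && *arrSize > 0);
    int newSize = *arrSize * 2;
    NODE **tmp = realloc(*arr, (size_t)newSize * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (int i = *arrSize; i < newSize; i++)
        tmp[i] = NULL;          /* realloc leaves the added memory uninitialised */
    *arr = tmp;
    *arrSize = newSize;
    return 0;
}
```

After a successful call such as double_ptr_array(&q->arr, &q->arrSize), index i still refers to the node whose ID is i, and the upper half is ready to receive new nodes.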

An array of structures

How is an array of structures created in C without knowing the eventual amount of structures to be stored in the array?
I would like to loop in a for loop, create a tempStruct, set its variables, add it to an array, and then loop again, creating a new tempStruct and adding it to the array.
I'm having some issues wrapping my head around how this is done in C while trying to relate it to Objective-C.
Dynamically allocated arrays (using malloc) can be reallocated (using realloc).
Therefore the solution will look something like this:
1. malloc initial array (arbitrary size)
2. while there is still space in the array, add structures
3. when the array is full, realloc to a bigger size
4. goto 2
You could create a doubly linked list, where each node points to its predecessor and successor:
struct list{
struct list* next;
struct list* prev;
special_data* data;
};
It is easy to do and flexible.
You can't create an array in C without knowing the number of its members up front.
Your options for adding are:
(Faster) Create a new array with one more element, copy the entire array, and add the new element to the end
(Better) Create your own implementation of a linked list, which will dynamically allocate memory for each new member.
You can use malloc to create your structure.
Edit: The following demonstrates one way to do what you're asking by creating a linked list:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int data;
void* next;
} tempStruct;
#define NUM_STRUCTS 4
int main(void) {
tempStruct* cur_ptr;
tempStruct* root_ptr;
int i;
root_ptr = malloc(sizeof(tempStruct));
root_ptr -> data = 0;
root_ptr -> next = NULL;
cur_ptr = root_ptr;
for (i = 1; i < NUM_STRUCTS; i ++ ) {
tempStruct* new_ptr = malloc(sizeof(tempStruct));
new_ptr -> data = i;
new_ptr -> next = NULL;
cur_ptr -> next = new_ptr;
cur_ptr = cur_ptr -> next;
}
cur_ptr = root_ptr;
while (cur_ptr != NULL) {
printf("cur_ptr -> data = %d\n", cur_ptr -> data);
cur_ptr = cur_ptr -> next;
}
return 0;
}
If you really want to create something that acts more like an array, you'll need to allocate all your memory at the same time, using something like:
the_data = malloc(NUM_STRUCTS * sizeof(tempStruct));
Then you'll have to access the data with the dot operator (i.e. '.', without the quotes in your code), for example the_data[i].data.
struct foo {int bar;};
size_t i = 0, n = 8;
struct foo *arr = malloc(n * sizeof *arr);
int bar;
while ((bar = get_next_bar()) != -1) {
if (i == n) { // no room for new element; expand array
arr = realloc(arr, (n *= 2) * sizeof *arr); // size is in bytes, not elements
if (arr == NULL) abort(); // see note below.
}
arr[i++] = (struct foo){bar}; // store, then advance to the next free slot
}
The number of assigned elements in the array is i. Don't forget to free() the array when you're done with it.
Note: In real programs you generally do not do p = realloc(p, s) directly. Instead you assign the result of realloc() to a new pointer, then do error detection & handling before clobbering your original pointer.
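A sketch of what that note describes, wrapped in a small growable-array type. The names foo_vec and foo_vec_push and the starting capacity of 8 are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct foo { int bar; };

struct foo_vec {
    struct foo *items;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Append one element, doubling the capacity when the array is full.
   Returns 0 on success, -1 on failure (the vector is left unchanged). */
int foo_vec_push(struct foo_vec *v, struct foo item)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* Keep the result in a temporary so a failure does not lose v->items. */
        struct foo *tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        v->items = tmp;
        v->cap = new_cap;
    }
    v->items[v->len++] = item;
    return 0;
}
```

A vector starts as struct foo_vec v = {NULL, 0, 0}; and is released with free(v.items). Doubling the capacity keeps the number of reallocations small compared with growing by one element at a time.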

Resources