I am writing doubly linked list code in C. I had wrongly assumed that freeing the head node with free(head_node) would delete the whole list. I could see the computer slowing down as the run progressed (apparently due to a memory leak). I searched Stack Overflow and other sites, and the code I usually came across for deleting a linked list was this:
Node* current = head;
while( current != NULL ) {
    Node* next = current->Next;
    free( current );
    current = next;
}
When I tried this in my code, the program just hangs right there after the free statement without returning to the function that calls this one. Is the above code relevant for a doubly linked list? My list member data contains a lot of pointers too. When I do free on one of the links, does it free all data the members point to? Please suggest and clarify with code snippets or references to books.
Thank you.
When I do free on one of the links, does it free all data the members point to?
No. This is what would happen if you deleted the last reference to an object in a garbage-collected language, but C doesn't work like that. You need to manually free each bit of memory that you've allocated.
That code looks like what you'd usually use for a singly- or doubly-linked list, assuming none of its values were pointers.
My list member data contains a lot of pointers too.
Since it does, you need to free each current->value as well (and if they're pointers to pointers...).
The code you posted should work for singly or doubly linked lists, but makes some assumptions:
That there's no cleanup of the node to do before freeing it; this is often an incorrect assumption.
That the end of the list is marked with a NULL pointer (i.e. the last node's Next member is NULL)
Regarding the first assumption:
Since you have dynamically allocated data in your nodes, and presuming you don't have another pointer to it somewhere else that you'll use to clean it up later, you'll need to free that data before you free each node. In C, this is not done for you; the general rule is that if you had to allocate it yourself, you have to free it yourself too. A sensible way to deal with this is to write a function to clean up and free a node, and call that instead of just calling free(); your cleanup function would still free the node, but it would free the node's data first.
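A minimal sketch of such a cleanup function. The question does not show its struct, so the Node layout here (with name and samples as owned, heap-allocated members) is purely an assumption for illustration, as are the function names node_free and list_free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; the real struct is not shown in the question. */
typedef struct Node {
    struct Node *Next;
    struct Node *Prev;
    char *name;      /* heap-allocated, owned by the node */
    int  *samples;   /* heap-allocated, owned by the node */
} Node;

/* Free everything the node owns first, then the node itself. */
void node_free(Node *n)
{
    if (n == NULL)
        return;
    free(n->name);      /* free(NULL) is a no-op, so unset members are fine */
    free(n->samples);
    free(n);
}

/* Walk the list from its head and release every node with its data.
   Returns the number of nodes freed. */
size_t list_free(Node *head)
{
    /* The caller is expected to pass the real head of the list. */
    assert(head == NULL || head->Prev == NULL);

    size_t freed = 0;
    Node *current = head;
    while (current != NULL) {
        Node *next = current->Next;  /* save the link before freeing */
        node_free(current);
        freed++;
        current = next;
    }
    return freed;
}
```

If a member points to data that something else still owns, leave it out of node_free; only free what the node itself allocated.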
Regarding the second assumption:
It's a pretty common practice to set the last node's Next pointer to NULL to mark the end since it makes it easy to tell when you've walked all the way through the list. For a doubly linked list, the same goes for the first node's Prev pointer. However, if it's a circular list, the last node just points back to the first node instead -- and that would break the code you posted. In that situation, you'd start with the node head->Next instead of head, and check whether current is not head rather than not NULL. Then deal with head at the end, since you skipped it initially.
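The circular case described above could look roughly like this. The Node type and the name circular_list_free are assumptions made for the sketch, and the nodes are assumed to own no other heap data.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *Next;
    struct Node *Prev;
} Node;

/* Free a circular list, where the last node's Next points back to head.
   Returns the number of nodes freed. */
size_t circular_list_free(Node *head)
{
    if (head == NULL)
        return 0;

    /* In a well-formed circular list no Next link is ever NULL. */
    assert(head->Next != NULL);

    size_t freed = 0;
    Node *current = head->Next;      /* start one past head */
    while (current != head) {        /* stop once we are back at head */
        Node *next = current->Next;
        free(current);
        freed++;
        current = next;
    }
    free(head);                      /* head was skipped initially */
    return freed + 1;
}
```

A single-node circular list (head->Next == head) skips the loop entirely and only frees head.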
And one more thing:
Make sure after you're done freeing your list, that you don't leave head pointing to an invalid (already freed) node and then try to access the list again...
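One way to make that hard to get wrong is to pass the address of the head pointer, so the function can reset it. This is only a sketch; list_destroy and the one-member Node are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *Next;
} Node;

/* Free a NULL-terminated list and reset the caller's head pointer. */
void list_destroy(Node **head)
{
    assert(head != NULL);

    Node *current = *head;
    while (current != NULL) {
        Node *next = current->Next;
        free(current);
        current = next;
    }
    *head = NULL;   /* the caller's pointer no longer dangles */
}
```

After list_destroy(&head), head is NULL, so later code that checks for an empty list behaves correctly, and calling list_destroy(&head) a second time is harmless.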
In an interview, Linus Torvalds talked about the importance of having "good taste". He illustrated good taste with the following code.
Code with "bad taste"
remove_list_entry(entry)
{
    prev = NULL;
    walk = head;
    // Walk the list
    while (walk != entry) {
        prev = walk;
        walk = walk->next;
    }
    // Remove the entry by updating the head
    // or the previous entry
    if (!entry)
        head = entry->next;
    else
        prev-next = entry->next;
}
Code with "good taste"
remove_list_entry(entry)
{
    // The "indirect" pointer points to the
    // *address* of the thing we'll update
    indirect = &head;
    // Walk the list, looking for the thing that
    // points to the thing we want to remove
    while ((*indirect) != entry)
        indirect = &(*indirect)->next;
    // .. and just remove it
    *indirect = entry->next;
}
Neither example uses free() to release the memory of the node being removed. Can someone tell me why the code is written this way? Or do I have the wrong idea about C or linked lists?
It is important also to note the semantics of this function. It is intended to remove a node from a list, not to delete the node as you suggest in your question. At least that is implied by the function name. I would be surprised if it deleted the node without advertising that semantic.
Moreover we have no way of knowing in isolation how the object was allocated, so the function cannot reasonably assume it is a heap object and free() it - that would cause a run-time error if the object were static.
The caller may be removing the object from the list and continuing to use it. It is the responsibility of this function to maintain the list, not the objects in the list, which have an independent existence. It is the responsibility of the owner of entry to manage its memory.
In this context, good taste is being exemplified by minimalism; the second example is less ornate and more to the point than the first. The bad taste code has a few errors in it, but the point is that the extra variable and if statement are superfluous.
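For comparison, here is a compilable sketch of both versions with the typos in the quoted first version corrected (the test should be on prev, and the assignment should use prev->next). The struct entry type, the function names, and passing the head by address instead of using a global are all assumptions made for the example. Both functions assume entry really is in the list.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    int value;
    struct entry *next;
};

/* First version: track the previous node and special-case the head. */
void remove_with_prev(struct entry **head, struct entry *entry)
{
    assert(head != NULL && entry != NULL);

    struct entry *prev = NULL;
    struct entry *walk = *head;

    while (walk != entry) {
        prev = walk;
        walk = walk->next;
    }
    if (prev == NULL)
        *head = entry->next;         /* entry was the first node */
    else
        prev->next = entry->next;
}

/* Second version: walk a pointer to the link that points at entry,
   so the head needs no special case. */
void remove_with_indirect(struct entry **head, struct entry *entry)
{
    assert(head != NULL && entry != NULL);

    struct entry **indirect = head;

    while (*indirect != entry)
        indirect = &(*indirect)->next;
    *indirect = entry->next;
}
```

Neither function calls free(), which means they work just as well on nodes that live on the stack or in static storage, and the caller can keep using the removed entry afterwards.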
Taste is a subjective quality, and there are many reasons to favour either approach. To me, taste has always been about readability. Readability doesn't clearly favour either approach -- TL;DR vs Too Complex; Didn't Bother (TC;DB).
Compiler technology has a fashion maven's influence on taste. Long ago, when peephole optimisation was the thing, the second version was preferred; it generated significantly less code. In the not-so-old days, compilers dallied with data flow analysis and liked the first version because the pointer chasing was clearly not a hidden alias (unlike the second). Modern compilers, so hip with gobs of memory and CPU, permit better alias checking, so have gone back to the first. Even still, in your example, the difference is 3 instructions.
I am implementing a linked list in C and I am running into the issue where C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer. There is no concept of whether the value might be needed later on in the program.
The typical implementation I find online for a linked list basically deallocs the deleted node but does not dealloc the node's value.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list ? The linked list's or the normal flow of the program ?
example:
// allocate 10 bytes
char *text = malloc(sizeof(char) * 10);
// create the linked list
LinkedList *list = list_create();
// add the text pointer to the linked list
list_append(list, text);
// remove the pointer from the linked list
list_remove_last(list);
In this case text would end up not getting deallocated as list_remove_last just frees the memory that the new node takes up. What would be the proper way to release the memory taken up by text ?
That is a very common way of implementing containers in C.
Basically, you dynamically allocate the contents of the list and pass the pointer to the container; from then on the container is responsible for freeing it.
You can also pass a function pointer to list_create() so it knows how to do list_remove_last() properly. This is especially useful for a generic container that does not know what type of elements it will contain (it will just hold void * pointers).
Think of the case where the data itself is a struct that contains other pointers. In this case list_remove() cannot do a simple free() on its data field; instead it should use the function pointer that was passed in to free the data.
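A rough sketch of that idea. The question does not show its list implementation, so everything here (the free_fn typedef, the struct layouts, and a list_create() that takes the destructor as a parameter) is assumed for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*free_fn)(void *);

typedef struct ListNode {
    struct ListNode *next;
    void *data;
} ListNode;

typedef struct LinkedList {
    ListNode *head;
    free_fn   free_data;  /* may be NULL: the list then never frees data */
} LinkedList;

LinkedList *list_create(free_fn free_data)
{
    LinkedList *list = malloc(sizeof *list);
    if (list != NULL) {
        list->head = NULL;
        list->free_data = free_data;
    }
    return list;
}

/* Append data at the tail. Returns 0 on success, -1 if malloc fails. */
int list_append(LinkedList *list, void *data)
{
    assert(list != NULL);
    ListNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->next = NULL;
    node->data = data;

    ListNode **link = &list->head;
    while (*link != NULL)            /* find the terminating NULL link */
        link = &(*link)->next;
    *link = node;
    return 0;
}

/* Remove the last node, releasing its data through the callback. */
void list_remove_last(LinkedList *list)
{
    assert(list != NULL);
    if (list->head == NULL)
        return;

    ListNode **link = &list->head;
    while ((*link)->next != NULL)    /* find the link to the last node */
        link = &(*link)->next;

    ListNode *last = *link;
    *link = NULL;
    if (list->free_data != NULL)
        list->free_data(last->data);
    free(last);
}

/* Release every remaining node, then the list itself. */
void list_destroy(LinkedList *list)
{
    if (list == NULL)
        return;
    while (list->head != NULL)
        list_remove_last(list);
    free(list);
}
```

For plain malloc'd buffers like text in the question, list_create(free) is enough. For structs that contain their own pointers, pass a function that frees the members and then the struct. Passing NULL leaves ownership with the caller.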
Your approach has a small problem:
If you have list* as the return type of list_create(), then you will have to do a free(list) in your main function. Alternatively, you can have list_create() return a list, as opposed to a list*. This is a logical choice because a list has the bulk of its information dynamically allocated and accessible through a pointer anyway.
In the second case you would need a function list_destroy(list) that destroys any elements your list holds.
C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer
Yes, C lacks any kind of automatic memory management, so you have to be careful to deallocate any memory blocks that you instantiate.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list? The linked list's or the normal flow of the program?
It's your responsibility. You can do it however you like. You can write a general purpose linked list where the caller has to be responsible for allocating and deallocating space for each value in the list because the list management functions don't know how much space each value might require, or whether the values might be needed beyond the lifetime of the node. Or, you can write a list implementation that manages every aspect of the node, including space for the value stored in the node. In some cases, a list node includes the value in the node definition, like:
struct Node {
    struct Node *next;
    int value;
};
and other times the node has a pointer to some other block that has the actual value:
struct Node {
    struct Node *next;
    void *value;
};
Another approach is to define a structure with just the part needed for the list operation (i.e. the next pointer), and then piggyback data onto that structure:
struct Node {
    struct Node *next;
};
struct MyNode {
    struct Node node;
    int price;
    int quantity;
};
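To show how the piggyback style is used in practice, here is a small sketch. The helper names (list_push, my_node_from_link, total_cost) are made up for this example; the list code only ever sees struct Node, and the caller converts a link back to the enclosing struct MyNode with offsetof.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    struct Node *next;
};

struct MyNode {
    struct Node node;
    int price;
    int quantity;
};

/* Generic list code: pushes a link at the front, knows nothing of MyNode. */
void list_push(struct Node **head, struct Node *n)
{
    assert(head != NULL && n != NULL);
    n->next = *head;
    *head = n;
}

/* Recover the enclosing MyNode from a pointer to its embedded link. */
struct MyNode *my_node_from_link(struct Node *n)
{
    return (struct MyNode *)((char *)n - offsetof(struct MyNode, node));
}

/* Sum price * quantity over every MyNode on the list. */
int total_cost(struct Node *head)
{
    int total = 0;
    for (struct Node *n = head; n != NULL; n = n->next) {
        struct MyNode *item = my_node_from_link(n);
        total += item->price * item->quantity;
    }
    return total;
}
```

Because the list never allocates anything in this style, whoever created each MyNode (on the heap, on the stack, or statically) stays responsible for releasing it.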
So, there are lots of ways to do it, and none of them are wrong. You should choose the style that makes sense for your needs. Do you have big, complex values that you don't want to duplicate, that you want to store in a linked list, but which you want to continue to use even after they're removed from the list? Go with the first style above. Do you want to manage everything related to the linked list in one place? Then go with the second style.
The point is: C dictates a lot less than other languages do, and while that means that you have to think harder about program correctness, you also get the freedom to do things very directly and in a style of your choosing. Embrace that.
My guideline is: the one who allocates memory is also responsible for deallocating it.
If you implement a linked list that allocates the memory for the values, the implementation should also take care of freeing this memory when the entries are removed from the list. For strings this could be done by copying the strings to a newly allocated buffer of adequate size.
If your implementation of a linked list only stores plain values (e.g. pointers) without allocating extra memory for the values, it should also avoid freeing memory it did not allocate, because it doesn't know what the allocator planned for this memory in the future.
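As a sketch of the first case, here is a tiny string list that makes its own copy of each string and therefore also frees it. StrNode, strlist_push and strlist_pop are hypothetical names for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct StrNode {
    struct StrNode *next;
    char *text;            /* private copy, allocated and freed by the list */
} StrNode;

/* Push a copy of text at the front. Returns the new head, or NULL on
   allocation failure (the existing list is left untouched in that case). */
StrNode *strlist_push(StrNode *head, const char *text)
{
    assert(text != NULL);

    StrNode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;

    size_t len = strlen(text) + 1;   /* include the terminating '\0' */
    n->text = malloc(len);
    if (n->text == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->text, text, len);
    n->next = head;
    return n;
}

/* Remove the first node and free the copy it owns. Returns the new head. */
StrNode *strlist_pop(StrNode *head)
{
    if (head == NULL)
        return NULL;
    StrNode *rest = head->next;
    free(head->text);
    free(head);
    return rest;
}
```

The caller's original buffer is never touched by the list, so the caller remains free to modify or release it independently.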
The proper way would be to have list_remove_node(), a function that frees not only the list node itself but also the value that was allocated for that specific node. Also, you shouldn't need to search for a specific node by its text, since you can just call free(node->text) (which can be done even in the current list_remove_last() function).
The main C logic is that you are supposed to free() anything that you allocated yourself. Certain libraries will allocate memory for their own work, which most often you are supposed to clean up as well (as you were the one who asked for it).
I've been given an assignment to implement a dynamic queue in C without any loops or recursion.
The queue should provide the following functions: initialization, destruct, add, remove and peek.
I thought of making a linked structure, so that each link has a pointer to the next link and so on, but the problem is that I don't know how to write the destruct function without any loops. The only solution I can think of is a loop that sends each of the links to the remove function (but again, I need to do it without any loops). Is there any way to write the destruct function without any loops?
P.S. The destruct function should free all of the memory that we used for the queue.
If a recursive function doesn't count as a loop under your constraints, you could use recursion to traverse the list and destroy the items.
Another approach is to store items in an array, and maintain a pointer into the array for the head and tail of the queue. Destroying the queue just means freeing the array or resetting the head/tail pointers, and no loops would be required.
There's no real need to base a queue on a linked list. It would have all the downsides of randomly allocated elements and poor spatial locality, would be relatively harder to debug, and wouldn't use the main benefit of a linked list (O(1) insertion) since a queue has only one insertion point anyway (or two for a double-ended one).
Instead you could use an array, maintain head and tail index variables, wrap them around cyclically when they reach the end, and reallocate when required. If the queue holds basic data types, this also lets you deallocate the entire queue in one go by freeing the array (although for elements you had to allocate manually, I can't see any way to avoid iterated removal, unless you move to C++).
I am assuming that the item to be inserted in memory is of constant size. If needed, it could be a pointer to a block of memory. In that case, you can use a circular buffer with a head and tail pointer. When either pointer "gets to the end of the block" it should wrap - i.e. you increment / decrement modulo queue size.
Initialization:
    Create a memory space of finite size (max size of the buffer)
Add:
    Update memory location at the current tail (if add to end) or head (if add to beginning), and update the tail/head pointer.
Remove:
    Read the data at the head/tail, and update the pointer
Peek:
    Read the data at the head/tail, and don't move the pointer
Destruct:
    Free the memory block
No loops, no recursion. It uses the fact that a FIFO buffer only allows changes at the beginning / end of the queue- it is not possible to remove elements "in the middle".
If the head and tail pointers meet, the queue is "full". In that case, the "insert" function should return an error, unless you add an "insert destructively" flag that says "overwrite the oldest element". That seems beyond the scope of your homework, but it is important in real-life applications. Sometimes you care about the oldest data; at other times you care about the latest data. But usually, if your queue is filling up, there is a problem with the overall system design (basically, you didn't scale the process that empties the queue to deal with the rate at which it is filling).
Note - if each element in the queue is a pointer to dynamically allocated memory you WILL need to iterate over all elements to free that memory, or you will create a memory leak. But if the queue is of constant size, this is not needed. Given the constraints given, and the lack of specification that queue element size should be variable, I would recommend you write your solution for a fixed size queue element.
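A sketch of the fixed-size circular buffer described above, holding int elements. The names are made up for the example. One deviation, stated plainly: instead of comparing head and tail pointers, this version keeps a head index plus an element count, because meeting indices alone cannot distinguish a full queue from an empty one. No function contains a loop or recursion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Queue {
    int   *items;     /* one block holding every element */
    size_t capacity;  /* maximum number of elements */
    size_t head;      /* index of the oldest element */
    size_t count;     /* number of elements currently stored */
} Queue;

/* Initialization: allocate the struct and its fixed-size buffer. */
Queue *queue_create(size_t capacity)
{
    assert(capacity > 0);
    Queue *q = malloc(sizeof *q);
    if (q == NULL)
        return NULL;
    q->items = malloc(capacity * sizeof *q->items);
    if (q->items == NULL) {
        free(q);
        return NULL;
    }
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
    return q;
}

/* Add at the tail. Returns false if the queue is full. */
bool queue_add(Queue *q, int value)
{
    if (q->count == q->capacity)
        return false;
    q->items[(q->head + q->count) % q->capacity] = value;  /* wrap around */
    q->count++;
    return true;
}

/* Peek: read the oldest element without removing it. */
bool queue_peek(const Queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    return true;
}

/* Remove: read the oldest element and advance the head index. */
bool queue_remove(Queue *q, int *out)
{
    if (!queue_peek(q, out))
        return false;
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return true;
}

/* Destruct: two frees, no traversal needed. */
void queue_destroy(Queue *q)
{
    if (q == NULL)
        return;
    free(q->items);
    free(q);
}
```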
I apologize in advance if this is an incredibly dumb question...
Currently I have a circular linked list. The number of nodes is normally held static. When I want to add to it, I malloc a number of nodes (ex. 100000 or so) and splice it in. This part works fine when I malloc the nodes one by one.
I want to attempt to allocate by blocks:
NODE *temp_node = node->next;
NODE *free_nodes = malloc( size_block * sizeof( NODE ) );
node->next = free_nodes;
for ( i = 0; i < size_block - 1; i++ ) {
    free_nodes[i].src = 1;
    free_nodes[i].dst = 0;
    free_nodes[i].next = &free_nodes[i+1];
}
free_nodes[size_block - 1].next = temp_node;
The list works as long as I don't attempt to free anything ('glibc detected: double free or corruption' error). Intuitively, I think that is because freeing it doesn't free the single node, and looping through the normal way is attempting to free it multiple times (plus freeing the entire block probably screws up all the other pointers from the nodes that still exist?), but:
Could somebody please explain to me explicitly what is happening?
Is there a way to allocate the nodes by blocks and not break things?
The purpose of this is because I am calling malloc hundreds of thousands of times, and it would be nice if things were faster. If there is a better way around this, or I can't expect it to get faster, I would appreciate hearing that too. :)
Could somebody please explain to me explicitly what is happening?
Exactly what you said. You are allocating a single contiguous region of memory for all the nodes. When you free it, all of that memory is released at once; you cannot free an individual node from inside it.
Is there a way to allocate the nodes by blocks and not break things?
Allocate a separate memory segment for each node. Your code (which isn't complete) should be something like:
/* assumes free_nodes is an array of pointers, e.g. NODE **free_nodes */
for ( i = 0; i < size_block; i++ ) {
    free_nodes[i] = malloc( sizeof( NODE ) );
}
First, the way you allocated your nodes in blocks, you always have to free the whole block with exactly the same start address as you got from malloc. There is no way around this, malloc is designed like this.
Putting up your own ways around this is complicated and usually not worth it. Modern run-times have quite efficient garbage collection behind malloc/free (for its buffers, not for user allocations) and it will be hard for you to achieve something better, better meaning more efficient but still guaranteeing the consistency of your data.
Before losing yourself in such a project measure where the real bottlenecks of your program are. If the allocation part is a problem there is still another possibility that is more likely to be the cause, namely bad design. If you are using so many elements in your linked list such that allocation dominates, probably a linked list is just not the appropriate data structure. Think of using an array with a moving cursor or something like that.
When you free a node you free the entire allocation that the node was allocated with. You must somehow arrange to free the entire group of nodes at once.
Probably your best bet is to keep a list of "free" nodes and reuse those rather than allocating/freeing each node. And with some effort you can arrange to keep the nodes in blocks and allocate from the "most used" block first such that if an entire block goes empty you can free it.
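A simplified sketch of the free-list idea with a single block. The NODE layout is guessed from the question's code, and Pool, pool_init, pool_get, pool_put and pool_destroy are hypothetical names. The key point is that free() is only ever called on the pointer malloc returned, never on an individual node.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct NODE {
    int src;
    int dst;
    struct NODE *next;
} NODE;

typedef struct Pool {
    NODE *block;      /* the malloc'd block; the only pointer given to free() */
    NODE *free_list;  /* unused nodes, chained through their next members */
} Pool;

/* Allocate size_block nodes in one call and chain them as free nodes.
   Returns 0 on success, -1 if malloc fails. */
int pool_init(Pool *p, size_t size_block)
{
    assert(p != NULL && size_block > 0);
    p->block = malloc(size_block * sizeof *p->block);
    if (p->block == NULL)
        return -1;
    for (size_t i = 0; i + 1 < size_block; i++)
        p->block[i].next = &p->block[i + 1];
    p->block[size_block - 1].next = NULL;
    p->free_list = p->block;
    return 0;
}

/* Take a node from the free list; returns NULL when the pool is empty. */
NODE *pool_get(Pool *p)
{
    NODE *n = p->free_list;
    if (n != NULL) {
        p->free_list = n->next;
        n->next = NULL;
    }
    return n;
}

/* "Free" a node by returning it to the pool instead of calling free(). */
void pool_put(Pool *p, NODE *n)
{
    n->next = p->free_list;
    p->free_list = n;
}

/* Release the whole block at once. All nodes from it become invalid. */
void pool_destroy(Pool *p)
{
    free(p->block);
    p->block = NULL;
    p->free_list = NULL;
}
```

Nodes taken with pool_get can be spliced into the circular list as before; removing one from the list means calling pool_put on it, and pool_destroy is called only when no node from the block is still in use.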
How would you find if one of the pointers in a linked list is corrupted or not ?
Introduce a magic value in your node structures. Initialize it when a new node is allocated. Before every access, check that the node structure the pointer points to contains the valid magic. If the pointer points at an unreadable data block, your program will crash on that check. To guard against that on Windows, there's the VirtualQuery() API: call it before reading to make sure the pointer points at readable data.
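A portable sketch of the magic-value part only (the VirtualQuery() check is not shown). The constant and all names are arbitrary choices for this example, and the check can only catch pointers that still land in readable memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NODE_MAGIC 0x4C4E4F44u   /* arbitrary constant chosen for this sketch */

typedef struct Node {
    uint32_t magic;
    struct Node *next;
    int value;
} Node;

/* Allocate a node and stamp it with the magic value. */
Node *node_new(int value)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->magic = NODE_MAGIC;
        n->next = NULL;
        n->value = value;
    }
    return n;
}

/* Free a node, clearing the stamp so a stale pointer to it is more
   likely to fail a later check. */
void node_free(Node *n)
{
    if (n == NULL)
        return;
    assert(n->magic == NODE_MAGIC);  /* catches freeing a non-node */
    n->magic = 0;
    free(n);
}

/* Returns true if every node reachable from head carries the magic. */
bool list_is_valid(const Node *head)
{
    for (const Node *n = head; n != NULL; n = n->next) {
        if (n->magic != NODE_MAGIC)
            return false;
    }
    return true;
}
```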
There are several possibilities.
If the list is doubly linked, it's possible to check the back pointer from what a front pointer points to, or vice versa.
If you have some idea as to the range of expected memory addresses, you can check against it. This is particularly true if the linked list is allocated from a limited number of chunks of memory, rather than having each node allocated independently.
If the nodes have some recognizable data in them, you can run down the list and check for recognizable data.
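The first possibility, checking back pointers against forward pointers, might be sketched like this. DNode and both function names are assumptions for the example; like any such check, it can itself fault if a corrupted pointer leads into unreadable memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DNode {
    struct DNode *prev;
    struct DNode *next;
} DNode;

/* Returns true if, for every node reachable from head, the node after
   it points back to it, and head itself has no predecessor. */
bool dlist_links_consistent(const DNode *head)
{
    if (head == NULL)
        return true;
    if (head->prev != NULL)
        return false;
    for (const DNode *n = head; n->next != NULL; n = n->next) {
        if (n->next->prev != n)
            return false;
    }
    return true;
}

/* Debug-build helper: abort as soon as an inconsistency is seen. */
void dlist_check(const DNode *head)
{
    assert(dlist_links_consistent(head));
}
```

Calling dlist_check at the start of each list operation in debug builds tends to catch corruption close to where it happened instead of much later.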
This looks to me like one of those questions where the interviewer isn't expecting a snappy answer, but rather an analysis of the question including further questions from you.
It's sort of a pain, but you can record the values of each pointer as you come across them with your debugger and verify that it's consistent with what you'd expect to find (if you'd expect a pointer to be NULL, make sure it's NULL. if you'd expect a pointer to refer to an already existing object, verify that that object's address has that value, etc.).
You could keep a doubly linked list. Then you can check that node->child->parent == node (although if node->child has become corrupt, this has a reasonable chance of causing an exception).
Several debuggers / bounds-checkers will do this for you, but a cheap and quick solution to this question is to:
Alter the structure of the list's nodes to include one additional char[n] field (or more typically two, one as the first and the other as the last field in the structure, which allows bounds-checking in addition to detecting pointer corruption).
Initialize these fields with a short (but long enough...) constant string such as "VaL1D-LiST-NODE 1234" when the nodes are created.
Check that the values read from these fields match the expected text each time a node is dereferenced, before using the node in earnest.
When a field's value does not match, this indicates one of two things:
the pointer is invalid (it never pointed to a list node)
something else is overwriting the node structure (the pointer is "valid" but the data it points to has been corrupted).