Is free() missing in this "good taste" and "bad taste" code? - c

In an interview, Linus Torvalds talked about the importance of having "good taste". He illustrated good taste with the following code.
Code with "bad taste"
remove_list_entry(entry)
{
prev = NULL;
walk = head;
// Walk the list
while (walk != entry) {
prev = walk;
walk = walk->next;
}
// Remove the entry by updating the head
// or the previous entry
if (!entry)
head = entry->next;
else
prev-next = entry->next;
}
Code with "good taste"
remove_list_entry(entry)
{
// The "indirect" pointer points to the
// *address* of the thing we'll update
indirect = &head;
// Walk the list, looking for the thing that
// points to the thing we want to remove
while ((*indirect) != entry)
indirect = &(*indirect)->next;
// .. and just remove it
*indirect = entry->next;
}
Neither example uses free() to release the memory of the node being deleted. Can someone tell me why the code is written this way? Or do I have the wrong idea about C or linked lists?

It is important also to note the semantics of this function. It is intended to remove a node from a list, not to delete the node as you suggest in your question. At least that is implied by the function name. I would be surprised if it deleted the node without advertising that semantic.
Moreover we have no way of knowing in isolation how the object was allocated, so the function cannot reasonably assume it is a heap object and free() it - that would cause a run-time error if the object were static.
The caller may be removing the object from the list and continuing to use it. It is the responsibility of this function to maintain the list, not the objects in the list, which have an independent existence. It is the responsibility of the owner of entry to manage its memory.
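A minimal sketch of that division of responsibility follows. The `struct node` layout, the explicit `head` parameter (the original uses a global) and `remove_demo` are assumptions made for illustration. The removal function only unlinks; the caller, who knows how each node was allocated, decides whether free() is appropriate.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the original snippet does not show one. */
struct node {
    int value;
    struct node *next;
};

/* Unlink entry from the list. It is NOT freed here.
 * Like the original, this assumes entry really is in the list. */
static void remove_list_entry(struct node **head, struct node *entry)
{
    struct node **indirect = head;

    while (*indirect != entry)
        indirect = &(*indirect)->next;

    *indirect = entry->next;
    entry->next = NULL;   /* hand the caller a fully detached node */
}

/* One static node and one heap node share a list. */
static int remove_demo(void)
{
    static struct node fixed = { 1, NULL };     /* static storage: must never be freed */
    struct node *heap = malloc(sizeof *heap);   /* heap storage: the caller owns it */
    if (heap == NULL)
        return -1;
    heap->value = 2;
    heap->next = NULL;

    struct node *head = &fixed;
    fixed.next = heap;

    remove_list_entry(&head, &fixed);   /* removed, but still a valid object */
    int kept = fixed.value;

    remove_list_entry(&head, heap);
    free(heap);                         /* only the owner knows this came from malloc */

    return (head == NULL) ? kept : -1;
}
```

Had `remove_list_entry` called free() itself, the first removal would have passed a static object to free(), which is undefined behaviour.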

In this context, good taste is being exemplified by minimalism; the second example is less ornate and more to the point than the first. The bad taste code has a few errors in it, but the point is that the extra variable and if statement are superfluous.
Taste is a subjective quality, and there are many reasons to favour either approach. To me, taste has always been about readability. Readability doesn't clearly favour either approach -- TLDR vs Too Complex; Didn't Bother (TC;DB).
Compiler technology has a fashion maven's influence on taste. Long ago, when peephole optimisation was the thing, the second version was preferred; it generated significantly less code. In the not-so-old days, compilers dallied with data flow analysis and liked the first version, because the pointer chasing was clearly not a hidden alias (unlike in the second). Modern compilers, so hip with gobs of memory and CPU, permit better alias checking, so have gone back to the first. Even so, in your example, the difference is 3 instructions.

Related

Usage of void pointer type in C function

I am reading a book that has the following code:
int list_ins_next(List *list, ListElmt *element, const void *data) {
ListElmt *new_element;
/* Allocate storage for the element.*/
if ((new_element = (ListElmt *)malloc(sizeof(ListElmt))) == NULL)
return -1;
/* Insert the element into the list.*/
new_element->data = (void *)data;
/* and so on */
}
As we see, the function gets const void *data as one of the arguments.
Why is it re-cast in the line:
new_element->data = (void *)data; ?
Additional information:
/* Define a structure for linked list elements. */
typedef struct ListElmt_ {
void *data;
struct ListElmt_ *next;
} ListElmt;
/* Define a structure for linked lists.*/
typedef struct List_ {
ListElmt *head;
ListElmt *tail;
} List;
Without the cast, and with sufficient compiler warning levels enabled, your compiler would complain with something like the following (and the build would fail if warnings are treated as errors):
warning: assignment discards 'const' qualifier from pointer target type
What this means is that the following assignment is breaking some rules:
new_element->data = data; // <-- danger!
What's wrong?
Well, the message tells you that a 'const qualifier' is being discarded as a result of the assignment. Let's unpack what that implies.
The variable data has the type const void*, which means the memory it points at should never be modified as a result of following that pointer. In other words, the function list_ins_next is not allowed to do anything that will result in that memory being changed.
So now, the list node that the pointer is being stored in does not make such promises. It has a data member of type void* which means anyone following the pointer stored in the list is allowed to modify the memory it points to.
Some memory really is not allowed to be modified. String literals are an example. If you have the string "hello" in your program, that's actually stored in memory somewhere as part of the code. And you can have a pointer: const char* hello = "hello"; ... but you really should not drop the const and start messing with that data.
Do you see the problem? If this were allowed to pass, then the function list_ins_next might be unknowingly breaking the rules. Even though it's not allowed to modify memory pointed to by data, it is storing that pointer in a structure not bound by such restrictions. Later on, someone using that structure could happily go and use data as if it were non-const, without realizing it's supposed to be const.
Analogy time (naughty)
Imagine you went on holiday, and had someone look after your house. Before you left, you made them promise not to touch your top-shelf whisky. You return a week later, the whisky is gone. You ask them why they broke their promise, and they insist that they really didn't touch it, but they did invite a friend over and gave them full permission to drink it all.
Obviously, this is untenable behavior and something we want to avoid. Being in this situation could provoke a violent, irrational response that could damage friendships, property, and bodies. In computer land, we call that Undefined Behavior.
Anyway, let's put aside my anxieties about stewardship of expensive single malt, and return to talking about code.
Good behavior
The issue here is that the compiler wants to stop us from accidentally making mistakes, wherever it can. It rightly warns us: hey, this is bad, you might be using this wrong.
Now, back to why the cast is there. It is a little hard to say for sure without more context. What I will say is that the cast is making a promise:
Dear C compiler, by this cast I solemnly swear that I'm not going to break the rules, or maybe I might break the rules but I really really REALLY know what I'm doing and all of the ramifications, so please let's just pretend that this const pointer is not const.
Thanks <3 - A. C. Programmer.
This is the best case scenario. If the way that this list is used will really not result in modifications to the data, OR if the data didn't really need to be const (but for some reason the insert function made it const) then all is well. It's now the programmer's responsibility to keep their word.
Analogy time (nice)
You can actually trust your friend with your whisky. While you're away, they looked through the collection, but only to catalogue it. They had a friend around, and they spent time researching where it all came from and finding out information about distilleries. You return from holiday and it's all right where you left it, safe and secure.
Next to it there's a notebook containing detailed information about what you have and how much money you probably wasted on booze. There's also a note asking if you'll consider organizing a whisky tasting session one day.
Bad behavior
However, this sadly is not the most common reason for casting away const.
What we see very frequently is addition of a cast by someone who just wanted to get around a compiler error without fully understanding it. The compiler said "whoah buddy, hold on" and the programmer said "please shut up so I can run my code".
And, you know... I get the feeling that the author of this book might possibly fall into the latter category, but I won't outright claim this because I haven't seen the full context of the code example to comment on why it was written this way.
At face value, either the list should be storing const data or the insert function should be accepting non-const data. End of story.
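The two alternatives can be sketched as follows. The type and function names (`ConstElmt`, `MutElmt`, `const_elmt_new`, `mut_elmt_new`) are hypothetical and are not from the book; they only show that neither option needs a cast.

```c
#include <stdlib.h>

/* Option 1: the list only ever reads the data, so the node type says so. */
typedef struct ConstElmt_ {
    const void *data;
    struct ConstElmt_ *next;
} ConstElmt;

static ConstElmt *const_elmt_new(const void *data)
{
    ConstElmt *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->data = data;      /* const to const: no cast, no warning */
    e->next = NULL;
    return e;
}

/* Option 2: users of the list may modify the data, so the parameter
 * is not const and callers cannot pass read-only data by accident. */
typedef struct MutElmt_ {
    void *data;
    struct MutElmt_ *next;
} MutElmt;

static MutElmt *mut_elmt_new(void *data)
{
    MutElmt *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->data = data;      /* non-const to non-const: no cast needed */
    e->next = NULL;
    return e;
}
```

With option 1, an attempt to write through `e->data` later is rejected by the compiler. With option 2, passing a `const char *` to `mut_elmt_new` draws the same warning at the call site, where the mistake actually is.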
Summary
We commonly say "casts hide bugs". Keep this in mind. When you cast, do it with intent. Do it because you know it's valid, not just because the language will let you do it. C will happily let you hold a nailgun to your foot and pull the trigger.
Anyways, I hope this lengthy explanation has helped somewhat, even if I can't peer inside the head of whoever wrote that code.
And that really touches on another topic -- if you cast away a const, you should probably at the very least somewhere have a comment in the code explaining why. Because one day, someone is gonna read it and ask why. And that person might even be the same person who wrote the code years prior.

Traversing a list with hazard pointers

I'm working with hazard pointers in order to implement a lock-free linked list in C.
I couldn't find any example code other than very basic queues and stacks.
The problem is that I need to traverse the list, so my question is whether I can change the value of a hazard pointer once it is assigned.
For example:
t←Top
while(true) {
if t=null then
return null
*hp←t
if Top!=t then
continue
...
t←(t→next) //after this instruction pointer t will be still protected?
}
In the end I implemented my own version of Hazard Pointers (HP) according to the original paper. The answer to my question is NO, t is no longer safe to use. The reason is that, the way HP works, *hp protects the node that t pointed to when you declared it as a hazardous pointer, so when t moves to the next node, the HP mechanism keeps protecting the previous node. I have to assign the new value to *hp before I can use it safely.
Also, in the example of the paper it is not explicit, but when you finish using a hazard pointer you have to release it. That means, return *hp to its original state (NULL). This way, if another thread wants to delete (retire) this node, it won't be seen as being used.
In my example above, I have to release *hp before leaving the method. Inside the loop it is not necessary because I am overwriting the same *hp position (*hp ← t), so the previous node is no longer protected.
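The publish, validate and release steps described above can be sketched with C11 atomics. This is only an illustration under stated assumptions: `hp_node`, `hp_slot`, `hp_protect`, `hp_release` and `hp_demo` are hypothetical names, the demo runs in a single thread, and a real lock-free list traversal needs more than this (for example a second hazard pointer and a retire list, as in the original paper).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct hp_node {
    int value;
    _Atomic(struct hp_node *) next;
};

typedef _Atomic(struct hp_node *) hp_slot;

/* Publish the pointer read from *src in *hp, then re-read *src to check
 * that it did not change in between. The returned node stays protected
 * only until *hp is overwritten or cleared. */
static struct hp_node *hp_protect(hp_slot *hp, hp_slot *src)
{
    struct hp_node *p;
    do {
        p = atomic_load(src);
        atomic_store(hp, p);
    } while (p != atomic_load(src));
    return p;
}

/* Return the hazard pointer to its idle state. */
static void hp_release(hp_slot *hp)
{
    atomic_store(hp, NULL);
}

/* Single-threaded walk over two nodes, showing what *hp holds at each step. */
static int hp_demo(void)
{
    struct hp_node a, b;
    hp_slot top, hp;

    a.value = 1; atomic_init(&a.next, &b);
    b.value = 2; atomic_init(&b.next, NULL);
    atomic_init(&top, &a);
    atomic_init(&hp, NULL);

    struct hp_node *t = hp_protect(&hp, &top);
    assert(t == &a && atomic_load(&hp) == &a);

    /* Advancing t alone does not move the protection: *hp still names a. */
    t = atomic_load(&t->next);
    assert(t == &b && atomic_load(&hp) == &a);

    /* Re-publishing is what protects the new node (and unprotects a). */
    t = hp_protect(&hp, &a.next);
    assert(t == &b && atomic_load(&hp) == &b);

    hp_release(&hp);
    return atomic_load(&hp) == NULL ? 0 : 1;
}
```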
You do not need hazard pointers when you are only traversing the list. Hazard happens when different threads are reading and writing from and to the same resource (In particular, hazard pointers are to overcome ABA problem, when a resource's value is changed to something and then back to its original value, which makes noticing the change difficult). With traversing, you are only reading, hence no need for hazard pointers.
By the way, it seems to me that you have to change if Top=t to if Top!=t, so that you can proceed with your code if there is no hazard. Note that continue returns to the beginning of the loop. Also, your whole code should be in a while(true) loop.
You can read more about hazard pointers here http://www.drdobbs.com/lock-free-data-structures-with-hazard-po/184401890 , or just by googling!
EDIT You need to provide the code for insert and delete functions. In short, the part of the code that you've mentioned ends up being an infinite loop after execution of t←(t→next), since Top!=t will hold true afterwards.
What you need to do, instead of checking t against Top, is to check it against its previously captured value. Again, it depends on your implementation of the other methods, but you probably want to implement something similar to Tim Harris's algorithm, which uses a two-phase deletion (1: marking the node, 2: freeing it). Then, when you traverse the list, you need to check for marked nodes as well. There is also an implementation of a linked list with a find method, which you can use as a base for your implementation, in Fig 9 of http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf. Hope this helps.

Deleting links in a doubly linked list

I am writing doubly linked list code in C. I had wrongly assumed that I could delete the whole list just by calling free(head_node), and I could see the computer slowing down as the run progressed (apparently due to a memory leak). I searched Stack Overflow and other sites, and the code I usually came across for deleting a linked list was this:
Node* current = head;
while( current != NULL ) {
Node* next = current->Next;
free( current );
current = next;
}
When I tried this in my code, the program just hangs right there after the free statement without returning to the function that calls this one. Is the above code relevant for a doubly linked list? My list member data contains a lot of pointers too. When I do free on one of the links, does it free all data the members point to? Please suggest and clarify with code snippets or references to books.
Thank you.
When I do free on one of the links, does it free all data the members point to?
No. This is what would happen if you deleted the last reference to an object in a garbage-collected language, but C doesn't work like that. You need to manually free each bit of memory that you've allocated.
That code looks like what you'd usually use for a singly- or doubly-linked list, assuming none of its values were pointers.
My list member data contains a lot of pointers too.
Since it does, you need to free each current->value as well (and if they're pointers to pointers...).
The code you posted should work for singly or doubly linked lists, but makes some assumptions:
That there's no cleanup of the node to do before freeing it; this is often an incorrect assumption.
That the end of the list is marked with a NULL pointer (i.e. the last node's Next member is NULL)
Regarding the first assumption:
Since you have dynamically allocated data in your nodes, and presuming you don't have another pointer to it somewhere else that you'll use to clean it up later, you'll need to free that data before you free each node. In C, this is not done for you; the general rule is that if you had to allocate it yourself, you have to free it yourself too. A sensible way to deal with this is to write a function to clean up and free a node, and call that instead of just calling free(); your cleanup function would still free the node, but it would free the node's data first.
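A minimal sketch of such a cleanup function, assuming a hypothetical node layout with a single heap-allocated `name` member (the question does not show the real structure). `push_front` exists only to build a list for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: owns one heap-allocated string. */
typedef struct Node {
    char *name;
    struct Node *Next;
    struct Node *Prev;
} Node;

/* Free the data the node owns first, then the node itself. */
static void free_node(Node *n)
{
    if (n == NULL)
        return;
    free(n->name);
    free(n);
}

/* Free a NULL-terminated list and reset the caller's head pointer,
 * so it cannot be used as a dangling pointer afterwards.
 * Returns the number of nodes freed. */
static size_t free_list(Node **head)
{
    size_t count = 0;
    Node *current = *head;

    while (current != NULL) {
        Node *next = current->Next;   /* read Next before the node is gone */
        free_node(current);
        current = next;
        count++;
    }
    *head = NULL;
    return count;
}

/* Helper for the example: prepend a node holding a copy of name. */
static int push_front(Node **head, const char *name)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->name, name);
    n->Prev = NULL;
    n->Next = *head;
    if (*head != NULL)
        (*head)->Prev = n;
    *head = n;
    return 0;
}
```

If a node owns several allocations, each one gets its own free() call inside `free_node`, in any order, as long as the node itself goes last.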
Regarding the second assumption:
It's a pretty common practice to set the last node's Next pointer to NULL to mark the end since it makes it easy to tell when you've walked all the way through the list. For a doubly linked list, the same goes for the first node's Prev pointer. However, if it's a circular list, the last node just points back to the first node instead -- and that would break the code you posted. In that situation, you'd start with the node head->Next instead of head, and check whether current is not head rather than not NULL. Then deal with head at the end, since you skipped it initially.
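For the circular case, the adjusted loop could look like the sketch below. `CNode`, `free_circular` and `make_ring` are hypothetical names, and the nodes are assumed to own no other allocations.

```c
#include <stdlib.h>

typedef struct CNode {
    int value;
    struct CNode *Next;
    struct CNode *Prev;
} CNode;

/* Free a circular doubly linked list: start at head->Next, stop when the
 * walk comes back round to head, then free head itself last.
 * Returns the number of nodes freed and resets *head to NULL. */
static size_t free_circular(CNode **head)
{
    size_t count = 0;

    if (*head == NULL)
        return 0;

    CNode *current = (*head)->Next;
    while (current != *head) {
        CNode *next = current->Next;
        free(current);
        current = next;
        count++;
    }
    free(*head);          /* the node that was skipped at the start */
    *head = NULL;
    return count + 1;
}

/* Helper for the example: build a ring of n nodes (NULL if n is 0 or
 * an allocation fails). */
static CNode *make_ring(size_t n)
{
    CNode *head = NULL;

    for (size_t i = 0; i < n; i++) {
        CNode *node = malloc(sizeof *node);
        if (node == NULL) {
            free_circular(&head);
            return NULL;
        }
        node->value = (int)i;
        if (head == NULL) {
            node->Next = node->Prev = node;   /* a single node points at itself */
            head = node;
        } else {
            node->Prev = head->Prev;
            node->Next = head;
            head->Prev->Next = node;
            head->Prev = node;
        }
    }
    return head;
}
```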
And one more thing:
Make sure after you're done freeing your list, that you don't leave head pointing to an invalid (already freed) node and then try to access the list again...

Proper use of malloc [closed]

A chapter of the book I have been reading focuses on memory management: allocating space using malloc and related Linux functions.
Before I read this I would make relatively small programs without allocating space.
Is it acceptable to not do anything in the way of memory allocation for applications whose memory footprint remains under 50MB? What are the repercussions of not doing so?
I think the answers are missing an important point. The size of memory is a relatively specific technical detail which isn't of primary interest. The crucial difference is that between automatic and dynamic storage, and the associated lifetime:
Automatic storage ends at the end of the scope.
Dynamic storage begins with malloc() and ends with free(), entirely at the discretion (and responsibility) of the user.
If you can and if it makes sense, everything should be automatic. This entails locality and well-defined interfaces. However, in C (not so much in C++) there comes a time when you need to talk about objects that aren't local to the scope. That's when we need dynamic allocation.
The prime example is your typical linked list. The list consists of nodes:
typedef struct node_tmp
{
int data;
struct node_tmp * next;
struct node_tmp * prev;
} node;
Now to talk about such a list boils down to talking about any of its nodes and brachiating along the prev/next pointers. However, the actual nodes cannot sensibly be part of any local scope, so they are usually dynamically allocated:
/* requires #include <stdlib.h> for malloc() and free() */
node * create_list(void)
{
    node * p = malloc(sizeof(node)); // [1] parentheses are required for a type name
    if (p)                           // malloc() returns a null pointer on failure
        p->prev = p->next = 0;
    return p;
}
void free_list(node * p) // call with head node
{
    if (!p)
        return; // nothing to free
    while (p->next)
    {
        node * tmp = p;
        p = p->next;
        free(tmp); // [2a]
    }
    free(p); // [2b]
}
void append_to_end(node * p, int data); // etc.
Here the list nodes exist outside any scope, and you have to bring them to life manually using malloc(), and clean them up when you're done.
You can use linked lists even in the tiniest of programs, but there's no real way around the manual allocation.
Edit: I thought of another example that should really convince you: You might think that you can just make the list with automatically allocated nodes:
node n1, n2, n3; // an automatic linked list
n1.prev = n3.next = 0;
n1.next = &n2; n2.prev = &n1; n2.next = &n3; n3.prev = &n2;
But note that you cannot do this dynamically! "Dynamic" means "at runtime", but automatic variables have to be determined entirely at compile time.
Suppose you wanted a program that reads integers from the user. If it's even, you add it to the list, if it's odd you ignore it, and if it's zero you stop. You cannot possibly realize such a program with automatic allocation, because the allocation needs are only determined at runtime.
It is in such a scenario that you require malloc().
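A minimal sketch of that program's core, under the assumption that the caller feeds it one integer at a time. `even_node`, `even_list`, `evens_init`, `evens_feed` and `evens_free` are hypothetical names, and a singly linked node is used to keep the example short.

```c
#include <stdlib.h>

typedef struct even_node {
    int data;
    struct even_node *next;
} even_node;

typedef struct even_list {
    even_node *head;
    even_node **tail;   /* address of the pointer where the next node goes */
    size_t count;
} even_list;

static void evens_init(even_list *list)
{
    list->head = NULL;
    list->tail = &list->head;
    list->count = 0;
}

/* Feed one input value.
 * Returns 1 to keep reading, 0 when value is zero (stop),
 * and -1 if malloc() fails. Odd values are ignored.
 *
 * Typical use with user input:
 *     int v;
 *     while (scanf("%d", &v) == 1 && evens_feed(&list, v) == 1)
 *         ;
 */
static int evens_feed(even_list *list, int value)
{
    if (value == 0)
        return 0;
    if (value % 2 != 0)
        return 1;

    /* How many of these allocations happen is only known at runtime. */
    even_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = value;
    n->next = NULL;
    *list->tail = n;
    list->tail = &n->next;
    list->count++;
    return 1;
}

static void evens_free(even_list *list)
{
    even_node *p = list->head;
    while (p != NULL) {
        even_node *next = p->next;
        free(p);
        p = next;
    }
    evens_init(list);
}
```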
If you can do without malloc for small applications, you're probably just not needing to use any heap space. Little utility programs or toy programs often don't. The things you might be doing wrong though to get by when you should be using the heap are:
Arrays. If you find yourself allocating large arrays 'just to make sure everything fits' then you should perhaps be using malloc. At the least, handle the error condition that everything overflows to check they really are big enough. With dynamically allocated arrays, you can make bigger ones on the fly if you find you need more space.
Doing too much recursion. C sometimes benefits from flattening recursion out into loops over arrays because, unlike functional languages, it is not guaranteed to optimise deep recursion away. If you are getting your storage space by calling functions recursively just to create it, that's pretty dangerous (the program might overflow the stack and crash on you one day).
Using static pools of objects (structs, classes). Perhaps you have a ring buffer, and 15 objects that could be in it, and you have them statically allocated because you know that your buffer will never have more than 15 entries. That's kind of OK, but allowing the buffer to grow more by adding in more structs, created with malloc, might be nice.
Probably plenty more situations where programmes which don't need malloc could benefit from having it added.
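As an illustration of the "make bigger ones on the fly" point about arrays, here is a minimal sketch of a growable int array built on realloc(). The names `int_vec`, `int_vec_push` and `int_vec_free`, and the doubling policy, are assumptions made for this example.

```c
#include <stdlib.h>

typedef struct int_vec {
    int *items;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} int_vec;

/* Append value, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if realloc() fails (the old contents
 * stay valid in that case). */
static int int_vec_push(int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *grown = realloc(v->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        v->items = grown;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

static void int_vec_free(int_vec *v)
{
    free(v->items);
    v->items = NULL;
    v->len = 0;
    v->cap = 0;
}
```

Start from `int_vec v = { NULL, 0, 0 };` because realloc() with a null pointer behaves like malloc(), so no separate initial allocation is needed.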
The size of an application and the use of malloc() are two independent things. malloc() is used to allocate memory at runtime, when sizes are not known at compilation time.
Anyway, if you do know the maximum size of the structures that you want to play with, you can statically allocate them and build an application without using malloc(). Space critical software is an example of such applications.
You were probably allocating memory statically at compile time, but not dynamically.
The possible issues when allocating everything statically are:
you waste memory, because you always allocate an upper limit plus a margin.
in some cases your application will run out of memory (because your estimate was wrong, for example), and since you cannot add new memory resources at runtime, this can be fatal.
That being said, in some cases, such as real-time embedded systems, it is a requirement not to allocate any memory dynamically at runtime (because you have hard memory constraints, or because allocating memory can break real-time guarantees).

Already freed memory

Is there any way in C to know if a memory block has previously been freed with free()? Can I do something like...
if(isFree(pointer))
{
//code here
}
OK, if you need to check whether a pointer has already been freed, you may want to review your design. You should never have to track either a reference count on a pointer or whether it has been freed. Also, some pointers do not point to dynamically allocated memory, so I assume you mean ones obtained from malloc(). This is my opinion, but again, if you have a solid design you should know when the things your pointers point to are no longer in use.
The only place I have seen this not work is in monolithic kernels because pages in memory need a usage count because of shared mappings among other things.
In your case simply set unused pointers to NULL and check that. This gives you a guaranteed way of knowing in the case that you have unused fields in structures that were malloced. A simple rule is wherever you free a pointer that needs to be checked in the above way just set it to NULL and replace isFree() with if pointer == NULL. This way no reference count needs to be tracked and you know for sure if your pointer is valid and not pointing to garbage.
No, there is no way.
You can, however, use a little code discipline as follows:
Always always always guard allocations made with malloc:
void * vp;
if((vp = malloc(SIZE))==NULL){
/* do something dreadful here to respond to the out of mem */
exit(-1);
}
After freeing a pointer, set it to 0
free(vp); vp = (void*)0;
/* I like to put them on one line and think of them as one operation */
Anywhere you'd be tempted to use your "is freed" function, just say
if(vp == NULL){
/* it's been freed already */
}
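The "free and set to 0" pair can be wrapped so that it really is one operation. A minimal sketch, assuming a hypothetical macro name `FREE_AND_NULL`:

```c
#include <stdlib.h>

/* Free the block and clear the pointer variable in one step.
 * p must be a modifiable pointer variable (an lvalue); it is
 * evaluated twice, so avoid arguments with side effects. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Using it twice on the same variable is harmless, because free(NULL) does nothing. Note that this only clears the one variable named in the call: any other copy of the pointer still holds the old address, which is why the convention needs discipline rather than just a macro.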
Update
@Jesus says in the comments:
I can't really recommend this because as soon as you're done with that
memory the pointer should go out of scope immediately (or at least at
the end of the function that releases it) these dangling pointers
existence just doesn't sit right with me.
That's generally good practice when possible; the problem is that in real life in C it's often not possible. Consider as an example a text editor that contains a doubly-linked list of lines. The list is really simple:
struct line {
struct line * prev;
struct line * next;
char * contents;
};
I define a guarded_malloc function that allocates memory
void * guarded_malloc(size_t sz){
    void * p = malloc(sz);
    if (p == NULL)
        exit(-1); /* out of memory: give up */
    return p;
}
and create list nodes with newLine()
struct line * newLine(void){
    struct line * lp;
    lp = (struct line *) guarded_malloc(sizeof(struct line));
    lp->prev = lp->next = NULL;
    lp->contents = NULL; /* separate statement: contents is a char *, not a struct line * */
    return lp;
}
I add text in string s to my line
lp->contents = guarded_malloc(strlen(s)+1);
strcpy(lp->contents,s);
and don't quibble that I should be using the bounded-length forms, this is just an example.
Now, how can I implement deleting the contents of a line I created with the char * contents going out of scope after freeing?
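One way to read the point being made: the contents pointer lives inside the node, so it cannot go out of scope while the line exists, and clearing it to NULL is what records that the text has been released. A minimal sketch, where `setLineContents` and `clearLineContents` are hypothetical helpers that are not part of the original answer:

```c
#include <stdlib.h>
#include <string.h>

struct line {
    struct line *prev;
    struct line *next;
    char *contents;
};

/* Replace the line's text with a copy of s.
 * Returns 0 on success, -1 if allocation fails (old text is kept). */
static int setLineContents(struct line *lp, const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    free(lp->contents);   /* free(NULL) is a no-op for a fresh line */
    lp->contents = copy;
    return 0;
}

/* Delete the text but keep the line in the list. The member cannot go
 * out of scope, so NULL marks it as "no contents". */
static void clearLineContents(struct line *lp)
{
    free(lp->contents);
    lp->contents = NULL;
}
```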
I see nobody has addressed the reason why what you want is fundamentally impossible. To free a resource (in this case memory, but the same applies to basically any resource) means to return it to a resource pool where it's available for reuse. The only way the system could provide a reasonable answer to "Has the memory block at address X already been freed?" is to prevent this address from ever being reused, and store with it a status flag indicating whether it was "freed". But in this case, it has not actually been freed, since it is not available for reuse.
As others have said, the fact that you're trying to answer this question means you have fundamental design errors you need to address.
In general the only way to do this portably is to replace the memory allocation functions. But if you're only concerned about your own code, a fairly common technique is to set pointers to NULL after you free() them, so that any subsequent dereference will typically fail fast with a segfault (or an access-violation exception on Windows) instead of silently touching freed memory:
free(pointer);
pointer = NULL;
For a platform-specific solution, you may be interested in the Win32 function IsBadReadPtr (and others like it). This function will be able to (almost) predict whether you will get a segmentation fault when reading from a particular chunk of memory.
Note: IsBadReadPtr has been deprecated by Microsoft.
However, this does not protect you in the general case, because the operating system knows nothing of the C runtime heap manager, and if a caller passes in a buffer that isn't as large as you expect, then the rest of the heap block will continue to be readable from an OS perspective.
Pointers have no information with them other than where they point. The best you can do is say "I know how this particular compiler version allocates memory, so I'll dereference memory, move the pointer back 4 bytes, check the size, makes sure it matches..." and so on. You cannot do it in a standard fashion, since memory allocation is implementation defined. Not to mention they might have not dynamically allocated it at all.
On a side note, I recommend reading 'Writing Solid Code' by Steve Maguire. Excellent sections on memory management.

Resources