I am looking at two linked list implementations (queue.h [source] and utlist.h [source]), and I have a few questions regarding their respective implementations:
What is _Q_INVALIDATE all about in queue.h? I suppose it is part of some debugging, but I don't really understand the macro define logic.
Both implementations offer FOREACH and FOREACH_SAFE. The former is straightforward, but what is the logic behind the latter? Also, if the former is unsafe in any way, why was it implemented in the first place?
Why has queue.h implemented its struct as having different types for next and prev (struct *le_next and struct **le_prev)?
In both implementations, why are there extra parentheses inserted here and there? E.g. around head in #define LIST_FIRST(head) ((head)->lh_first)
For Question 1:
_Q_INVALIDATE is a macro that sets a pointer that should no longer be used to a value of -1. The intent is that if it is used subsequently, debugging will be made easier because using the pointer will cause an immediate crash. In non-debug mode the macro does nothing, so the pointer is left with its current value - if there's a bug that results in the pointer being used, the problem may be a much more subtle defect.
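As a rough illustration of that debug/non-debug split, here is a minimal sketch. The names MY_LIST_DEBUG, MY_INVALIDATE, struct item and pop_front are made up for this example and are not taken from queue.h; the sketch only shows the general pattern of poisoning a stale pointer when a debug switch is defined.

```c
#include <stddef.h>

/* Hypothetical debug switch; remove this line to get the "does nothing" variant. */
#define MY_LIST_DEBUG 1

#ifdef MY_LIST_DEBUG
/* Debug build: overwrite the stale pointer with an obviously bad address. */
#define MY_INVALIDATE(p) ((p) = (void *)-1)
#else
/* Non-debug build: expands to a no-op, the pointer keeps its old value. */
#define MY_INVALIDATE(p) ((void)0)
#endif

struct item {
    struct item *next;
    int value;
};

/* Unlink the first item and poison its link so later misuse fails loudly. */
static struct item *pop_front(struct item **head) {
    struct item *it = *head;
    if (it != NULL) {
        *head = it->next;
        MY_INVALIDATE(it->next);
    }
    return it;
}
```

With the switch defined, following the removed item's next pointer dereferences an invalid address right away instead of silently walking a list the item no longer belongs to.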
For Question 2:
The 'safe' versions of these macros take an additional pointer argument that the macro uses internally to point to the next item in the list while the current one is being processed. This allows the code inside the loop to remove the current item from the list. Since the next item has already been remembered in the temp pointer, the macro has no problem picking it up for the next iteration. The non-safe version of the macro doesn't use a temp pointer, so you can't remove the current item from the list while iterating it.
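The difference can be sketched with two hand-written macros. MY_FOREACH and MY_FOREACH_SAFE below are hypothetical stand-ins modelled on the idea described above, not the exact definitions from queue.h or utlist.h; struct node and the helper functions are assumptions as well.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Plain iteration: reads var->next AFTER the loop body has run. */
#define MY_FOREACH(var, head) \
    for ((var) = (head); (var) != NULL; (var) = (var)->next)

/* Safe iteration: saves var->next in tmp BEFORE the loop body runs. */
#define MY_FOREACH_SAFE(var, head, tmp) \
    for ((var) = (head); (var) != NULL && ((tmp) = (var)->next, 1); (var) = (tmp))

static struct node *push_front(struct node *head, int value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Read-only walk: the plain macro is enough. */
static int list_length(struct node *head) {
    struct node *it;
    int count = 0;
    MY_FOREACH(it, head) count++;
    return count;
}

/* Frees each node while iterating; returns how many nodes were freed. */
static int free_all(struct node *head) {
    struct node *it, *tmp;
    int freed = 0;
    MY_FOREACH_SAFE(it, head, tmp) {
        free(it);   /* fine: the next node is already stored in tmp */
        freed++;
    }
    return freed;
}
```

Writing free_all with MY_FOREACH would read it->next from memory that has just been freed. The plain version still exists because it needs no extra variable and is all a read-only walk requires.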
For Question 3:
This makes it easier to add a new element before the current one or to remove the current element from the list without concern about whether the current element is at the head of the list (and therefore only 'pointed to' by the list pointer) or if the current element is elsewhere in the list (and therefore pointed to by another element's le_next pointer). If le_prev were a struct type* then dealing with the first element in the list would need special case code. Since le_prev is a struct type** it can refer to a simple struct type* (like the list head pointer) just as easily as a struct type* that's embedded at some arbitrary offset inside type (like the le_next links in each element).
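A small sketch of why the pointer-to-pointer helps, using made-up names (struct entry, insert_head, remove_entry) rather than the real queue.h macros: removal is the same two statements whether the entry is first in the list or not.

```c
#include <stddef.h>

struct entry {
    int value;
    struct entry *next;
    struct entry **prev;  /* address of whichever pointer points at this entry */
};

static void insert_head(struct entry **head, struct entry *e) {
    e->next = *head;
    if (*head != NULL)
        (*head)->prev = &e->next;  /* old first entry is now reached via e->next */
    *head = e;
    e->prev = head;                /* e is reached via the list head pointer */
}

/* No special case for the first entry: *e->prev is either the head
 * pointer or the previous entry's next field. */
static void remove_entry(struct entry *e) {
    if (e->next != NULL)
        e->next->prev = e->prev;
    *e->prev = e->next;
}
```

Note that remove_entry does not even need the list head as an argument, which is exactly what the struct type** buys you.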
Question 4 was answered in a comment above.
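Since that comment is not reproduced here, a short sketch of the usual reason: a macro argument is pasted in as text, so without parentheses the surrounding operators can regroup it. TWICE_BAD, TWICE_GOOD, FIRST and the helper functions are illustrative names only; struct list_head is a simplified stand-in for the real list head type.

```c
#include <stddef.h>

struct item;
struct list_head {
    struct item *lh_first;
};

/* Unparenthesized argument: the expansion can regroup the caller's expression. */
#define TWICE_BAD(x)  x * 2
/* Parenthesized argument and result: the intended grouping survives. */
#define TWICE_GOOD(x) ((x) * 2)

/* Same idea as LIST_FIRST: (head) may be any pointer expression. */
#define FIRST(head) ((head)->lh_first)

static int twice_bad_of_sum(void)  { return TWICE_BAD(1 + 2); }   /* expands to 1 + 2 * 2 */
static int twice_good_of_sum(void) { return TWICE_GOOD(1 + 2); }  /* expands to ((1 + 2) * 2) */

/* Expands to ((heads + 1)->lh_first); without the inner parentheses it
 * would read heads + 1->lh_first, which does not even compile. */
static struct item *first_of_second(struct list_head *heads) {
    return FIRST(heads + 1);
}
```

The outer pair of parentheses serves the same purpose in the other direction: it keeps operators at the call site from binding to only part of the expansion.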
Hello, I am trying to learn and build data structures in C, and I want to store integers progressively in the stack.
My struct is like this:
typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s{
STACK_NODE forward;
void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s{
STACK_NODE top;
} L_STACK_t, *L_STACK;
In a while loop I want to read my chars and store them in integer form.
// assume that str is a proper string
// assume that we have a linked stack called LS
int i = 0;
int tmp;
while (str[i] != '\0') {
    tmp = str[i] - '0';
    push(LS, (void *)&tmp);  // pushes the address of the same variable every time
    i++;
}
I know this won't work properly, as we store the same variable's address over and over again.
Do I need to allocate an auxiliary array in order to store them one by one, or is there a better way to do this?
The answer must address two separate aspects of your question:
How to organize some collection of items, and where to get the memory from to do that.
First code snippet / Linked list format
The first code snippet is good the way it is.
It sets up a linked list, which has its pros and cons, but serves very well if you don't know the number of items in advance, if you want to be able to quickly remove or insert items somewhere in the middle of the list, and if you don't mind that looking up one certain entry inside the list costs you O(N) effort.
For a generic library-like implementation...
... void* is as good as it goes with ANSI C.
In C++, for example, you could make a template that leaves open the type that is stored in the list (or better yet, you would directly reuse the standard library's std::forward_list<int>).
Sadly, ANSI C doesn't have something comparable.
One solution is the one you picked, create int objects and hook their addresses into your list of void*.
Another solution for a generic library implementation is to use a preprocessor macro for the type, and to define this macro above a header file that holds the generic implementation. This tries to resemble the clean C++ solution, but the preprocessor only substitutes text and offers none of a template's type checking, so this approach is far from beautiful and comes with several risks.
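A compact variation of that macro idea, kept in a single file instead of a separately included header: DEFINE_STACK and the generated int_stack names are made up for this sketch. The macro stamps out a node type plus push and pop functions for whatever element type is passed in.

```c
#include <stdlib.h>

/* Generates a typed stack for element type T; every generated name starts with NAME. */
#define DEFINE_STACK(T, NAME)                                      \
    struct NAME##_node {                                           \
        struct NAME##_node *next;                                  \
        T value;                                                   \
    };                                                             \
    static int NAME##_push(struct NAME##_node **top, T value) {    \
        struct NAME##_node *n = malloc(sizeof *n);                 \
        if (n == NULL) return -1;                                  \
        n->value = value;                                          \
        n->next = *top;                                            \
        *top = n;                                                  \
        return 0;                                                  \
    }                                                              \
    static int NAME##_pop(struct NAME##_node **top, T *out) {      \
        struct NAME##_node *n = *top;                              \
        if (n == NULL) return -1;                                  \
        *out = n->value;                                           \
        *top = n->next;                                            \
        free(n);                                                   \
        return 0;                                                  \
    }

/* Instantiate the stack for int: creates int_stack_push and int_stack_pop. */
DEFINE_STACK(int, int_stack)
```

The values live inside the nodes here, so no separate allocation per int is needed. The price is the one mentioned above: errors inside the macro body show up as hard-to-read diagnostics at the point of instantiation.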
Second code snippet / Memory allocation
Creating the list with void* instead of int (or whatever non-pointer type) requires you to allocate further memory besides the list.
I.e., you have to allocate not only every list item (= variable of type STACK_NODE_t) but also the actual entry value (e.g., *(int*)(LS->top->storage)).
This means you have to allocate/deallocate the data in some other way that outlives the stack.
On most systems, you can use malloc/free for that, and you only have to take into account the size of the heap available for malloc and the time de-/allocating takes.
If the list must meet real-time requirements or runs on an embedded system, you may not have malloc or you may not be allowed to use it.
Then you have to allocate and implement your own heap (= memory pool of storage items) for your list.
How to implement such a memory pool with the desired properties is a separate question that would take us too far here.
In any case, you must not use the pointer to a stack variable (like a local variable inside a function) because the memory "behind" that variable will not be reserved for this purpose once the function exits, and the memory may be used for something different in the meantime.
This is, however, apparently what the second code snippet does.
As you noticed yourself, taking this path...
we store the same variable's address over and over again.
Reusing the memory position for another entry of the same list is an extreme case of the risk explained above.
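A minimal sketch of the malloc route, reusing the struct definitions from the question. The question does not show push, so the push and pop below are assumptions about how they might look, and push_digits is a made-up helper; the point is only that every digit gets its own heap-allocated int.

```c
#include <stdlib.h>

typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s {
    STACK_NODE forward;
    void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s {
    STACK_NODE top;
} L_STACK_t, *L_STACK;

/* Hypothetical push: links a new node that holds the caller's pointer. */
static int push(L_STACK stack, void *data) {
    STACK_NODE node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->storage = data;
    node->forward = stack->top;
    stack->top = node;
    return 0;
}

/* Hypothetical pop: frees the node and hands the stored pointer back. */
static void *pop(L_STACK stack) {
    STACK_NODE node = stack->top;
    if (node == NULL) return NULL;
    void *data = node->storage;
    stack->top = node->forward;
    free(node);
    return data;
}

/* Allocates one int per digit, so each node points at distinct memory. */
static int push_digits(L_STACK stack, const char *str) {
    for (size_t i = 0; str[i] != '\0'; i++) {
        int *value = malloc(sizeof *value);
        if (value == NULL) return -1;
        *value = str[i] - '0';
        if (push(stack, value) != 0) {
            free(value);
            return -1;
        }
    }
    return 0;
}
```

Whoever pops a value is then responsible for calling free() on the returned pointer once the int is no longer needed.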
I solved the problem using an auxiliary array, as I anticipated. If someone comes up with a better solution, it is more than welcome.
I am trying to work through a computer science course on coursebuffet.com, which referred me to saylor.org, which gave me this to learn how to implement a stack with a linked list in C.
First, I think I understood the concept, but if you scroll down in the link you will find, at the end, a link to a main file with which you should test your implementation. What has absolutely baffled me for the last two days (yes, that is how much time I have already sunk into this one problem) is the following passage:
/*
* Initialize the stack. Make it at least
* big enough to hold the string we read in.
*/
StackInit(&stack, strlen(str));
I can't understand how to initialise a linked list. I mean, that's against its concept, isn't it? I would need to create struct elements before filling them with push commands, but if I do that, I need to follow the stack in two directions: one direction for pushing and the opposite direction for popping. That would need two pointers. I thought the whole concept as described here would be one data element and one pointer per ADT unit.
Can someone please explain this to me?
If you initialize the list to the length of the string you want to read, the stack pointer will still point to the first element of the list, so basically nothing is lost. However, you are correct: there is no point in doing something like that.
There is no need for a doubly linked list. The stack pointer will always point to the first element. Whenever you push(), you add a new node at the beginning of the list, and whenever you pop(), you remove the first node of the list.
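To make that concrete, here is a minimal sketch of a linked-list stack whose init function accepts the size argument from the test file but has nothing to pre-allocate. Only the call StackInit(&stack, strlen(str)) comes from the question; the types stackT and stack_elem, and the StackPush/StackPop signatures, are assumptions made for this example.

```c
#include <stdlib.h>

typedef struct stack_elem {
    char data;
    struct stack_elem *next;
} stack_elem;

typedef struct {
    stack_elem *top;  /* the single pointer the stack needs */
} stackT;

/* A linked list has no capacity to reserve, so max_size is simply ignored. */
static void StackInit(stackT *stack, size_t max_size) {
    (void)max_size;
    stack->top = NULL;
}

/* Push: the new node becomes the head of the list. */
static int StackPush(stackT *stack, char c) {
    stack_elem *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->data = c;
    e->next = stack->top;
    stack->top = e;
    return 0;
}

/* Pop: the head node is removed; returns -1 if the stack is empty. */
static int StackPop(stackT *stack, char *out) {
    stack_elem *e = stack->top;
    if (e == NULL) return -1;
    *out = e->data;
    stack->top = e->next;
    free(e);
    return 0;
}
```

Both operations touch only the top pointer and one next link, so a single pointer per element is enough.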
Assume a stack signifies LIFO operation only, i.e. the last element in is the first to get out.
Now let's think about how we would like to implement it.
First choice: just have a fixed-size internal array. In that case, you never have the option to resize on the fly, so the comment above the stack init is valid: you should allocate a size that you think will be safe for the purpose.
Second choice: have an array, but use dynamic memory. In that case, even if you hit the limit of the existing stack, you can always expand it with realloc, so the comment does not make sense.
Third choice: use a linked list. In theory, even if you initialize the stack with size 0, it expands on each node insertion, so the comment is misleading. Just to add: the top is always the head of the linked list, and with every insertion the new node becomes the new head.
So to answer your question: the comment above is confusing and makes sense only when the internal implementation is array-based.
That said, in the general perception of the stack data structure, a stack is always associated with a stack depth. When implementing one, it is always safe to have a maximum limit on the number of elements that can be pushed, and I guess the comment may have meant that.
To illustrate with a real example: you must have heard of the function call stack. In theory it expands, but it has a maximum possible size, and that is the reason we see a stack overflow error on infinite recursion.
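For comparison, the second choice (a dynamic array grown with realloc) could look roughly like the sketch below. array_stack and its functions are hypothetical names; here the initial size passed to init is only a starting hint, not a hard limit.

```c
#include <stdlib.h>

typedef struct {
    char *items;
    size_t count;     /* number of elements currently stored */
    size_t capacity;  /* number of elements the buffer can hold */
} array_stack;

static int array_stack_init(array_stack *s, size_t capacity) {
    s->items = NULL;
    s->count = 0;
    s->capacity = 0;
    if (capacity == 0) return 0;  /* nothing to reserve yet */
    s->items = malloc(capacity);
    if (s->items == NULL) return -1;
    s->capacity = capacity;
    return 0;
}

/* Doubles the buffer when it is full, so pushing never hits a fixed limit. */
static int array_stack_push(array_stack *s, char c) {
    if (s->count == s->capacity) {
        size_t new_capacity = s->capacity ? s->capacity * 2 : 4;
        char *grown = realloc(s->items, new_capacity);
        if (grown == NULL) return -1;  /* old buffer is still valid */
        s->items = grown;
        s->capacity = new_capacity;
    }
    s->items[s->count++] = c;
    return 0;
}

static int array_stack_pop(array_stack *s, char *out) {
    if (s->count == 0) return -1;
    *out = s->items[--s->count];
    return 0;
}
```

A fixed maximum depth, as mentioned above, would be one extra comparison in array_stack_push before growing.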
I am implementing a linked list in C and I am running into the issue where C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer. There is no concept of whether the value might be needed later on in the program.
The typical implementation I find online for a linked list basically deallocs the deleted node but does not dealloc the node's value.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list ? The linked list's or the normal flow of the program ?
example:
// allocate 10 bytes
char *text = malloc(sizeof(char) * 10);
// create the linked list
LinkedList *list = list_create();
// add the text pointer to the linked list
list_append(list, text);
// remove the pointer from the linked list
list_remove_last(list);
In this case text would end up not getting deallocated as list_remove_last just frees the memory that the new node takes up. What would be the proper way to release the memory taken up by text ?
That is a very common way of implementing containers in C.
Basically, you dynamically allocate the contents of the list and pass the pointer to the container; from then on the container is responsible for freeing it.
You can also pass a function pointer to list_create() so that it knows how to do list_remove_last() properly. This is especially useful for a generic container that does not know what type of elements it will contain (it will just hold void * pointers).
Think of the case where the data itself is a struct that contains other pointers. In this case list_remove() cannot do a simple free() on its data field; instead it should use the function pointer that was passed in to free the data.
Your approach has a small problem:
If you have list* as the return type of list_create(), then you will have to do a free(list) in your main function. Alternatively, you can have list_create() return a list, as opposed to a list*. This is a logical choice because a list has the bulk of its information dynamically allocated and accessible through a pointer anyway.
In the second case you would need a function list_destroy(list) that destroys any elements your list still holds.
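A sketch of the function-pointer idea, under the assumption that list_create() takes the release callback as its only argument. The function names follow the question, but their signatures and the ListNode/LinkedList layouts are guesses made for this example; here list_create() returns a pointer and list_destroy() frees it.

```c
#include <stdlib.h>

typedef void (*free_fn)(void *);

typedef struct ListNode {
    void *value;
    struct ListNode *next;
} ListNode;

typedef struct {
    ListNode *head;
    free_fn release;  /* how to dispose of a value; NULL means "not owned" */
} LinkedList;

static LinkedList *list_create(free_fn release) {
    LinkedList *list = malloc(sizeof *list);
    if (list != NULL) {
        list->head = NULL;
        list->release = release;
    }
    return list;
}

static int list_append(LinkedList *list, void *value) {
    ListNode *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->value = value;
    n->next = NULL;
    ListNode **link = &list->head;
    while (*link != NULL) link = &(*link)->next;  /* find the last link */
    *link = n;
    return 0;
}

/* Frees the last node and, through the callback, the value it holds. */
static void list_remove_last(LinkedList *list) {
    ListNode **link = &list->head;
    if (*link == NULL) return;
    while ((*link)->next != NULL) link = &(*link)->next;
    if (list->release != NULL) list->release((*link)->value);
    free(*link);
    *link = NULL;
}

static void list_destroy(LinkedList *list) {
    while (list->head != NULL) list_remove_last(list);
    free(list);
}
```

For the example in the question, list_create(free) would be enough, since text comes straight from malloc. A struct with pointers inside would get its own release function instead.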
C does not implement any specific scheme for memory management other than just giving you the ability to allocate and free memory by passing a pointer
Yes, C lacks any kind of automatic memory management, so you have to be careful to deallocate any memory blocks that you instantiate.
Whose responsibility should it be to release the memory taken up by the value when deleted from the list? The linked list's or the normal flow of the program?
It's your responsibility. You can do it however you like. You can write a general purpose linked list where the caller has to be responsible for allocating and deallocating space for each value in the list because the list management functions don't know how much space each value might require, or whether the values might be needed beyond the lifetime of the node. Or, you can write a list implementation that manages every aspect of the node, including space for the value stored in the node. In some cases, a list node includes the value in the node definition, like:
struct Node {
struct Node *next;
int value;
};
and other times the node has a pointer to some other block that has the actual value:
struct Node {
struct Node *next;
void *value;
};
Another approach is to define a structure with just the part needed for the list operation (i.e. the next pointer), and then piggyback data onto that structure:
struct Node {
struct Node *next;
};
struct MyNode {
struct Node node;
int price;
int quantity;
};
So, there are lots of ways to do it, and none of them are wrong. You should choose the style that makes sense for your needs. Do you have big, complex values that you don't want to duplicate, that you want to store in a linked list, but which you want to continue to use even after they're removed from the list? Go with the first style above. Do you want to manage everything related to the linked list in one place? Then go with the second style.
The point is: C dictates a lot less than other languages do, and while that means that you have to think harder about program correctness, you also get the freedom to do things very directly and in a style of your choosing. Embrace that.
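For the piggyback layout, a short sketch of how generic list code and the payload fit together. node_push and total_cost are made-up helpers; the cast relies on struct Node being the first member of struct MyNode, which C guarantees has the same address as the enclosing struct.

```c
#include <stddef.h>

struct Node {
    struct Node *next;
};

struct MyNode {
    struct Node node;  /* must stay the first member for the cast below */
    int price;
    int quantity;
};

/* Generic list code: knows nothing about prices or quantities. */
static void node_push(struct Node **head, struct Node *n) {
    n->next = *head;
    *head = n;
}

/* Payload-aware code: converts each link back to its enclosing MyNode. */
static int total_cost(const struct Node *head) {
    int total = 0;
    for (const struct Node *n = head; n != NULL; n = n->next) {
        const struct MyNode *item = (const struct MyNode *)n;
        total += item->price * item->quantity;
    }
    return total;
}
```

In this layout the list never allocates or frees anything; the MyNode objects can live on the stack, in an array, or on the heap, and their owner releases them.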
My guideline is: the one who allocates memory is also responsible for deallocating it.
If you implement a linked list that allocates the memory for the values, the implementation should also take care of freeing this memory when the entries are removed from the list. For strings this could be done by copying the strings to a newly allocated buffer of adequate size.
If your implementation of a linked list only stores plain values (e.g. pointers) without allocating extra memory for the values, it should also avoid freeing memory it did not allocate, because it doesn't know what the allocator planned for this memory in the future.
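A sketch of the first case for strings, with hypothetical names (struct str_node, str_push, str_pop): the list makes its own copy on insertion, so it is also the one that frees that copy on removal, and the caller's buffer is never touched.

```c
#include <stdlib.h>
#include <string.h>

struct str_node {
    char *text;  /* private copy, allocated and freed by the list code */
    struct str_node *next;
};

/* Copies text into a new buffer; returns the new head, or NULL on failure. */
static struct str_node *str_push(struct str_node *head, const char *text) {
    struct str_node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    size_t len = strlen(text) + 1;
    n->text = malloc(len);
    if (n->text == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->text, text, len);
    n->next = head;
    return n;
}

/* Frees the first node together with its copy; returns the remaining list. */
static struct str_node *str_pop(struct str_node *head) {
    if (head == NULL) return NULL;
    struct str_node *rest = head->next;
    free(head->text);
    free(head);
    return rest;
}
```

Because the list only ever frees what it allocated itself, the caller may pass string literals, stack buffers, or heap strings without any ownership questions.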
The proper way would be to have a function, list_remove_node(), that frees not only the list node itself but also the value that was allocated for that specific node. Also, you shouldn't need to search for a specific node by its text, as you should be able to just call free(node->text) (which can be done even in the current list_remove_last() function).
The main C logic is that you are supposed to free() anything that you allocated yourself. Certain libraries will allocate memory for their own work, which most often you are supposed to clean up as well (as you were the one who asked for it).
I'm working with hazard pointers in order to implement a lock-free linked list in C.
I couldn't find any example code other than very basic queues and stacks.
The problem is that I need to traverse the list, so my question is whether I can change the value of a hazard pointer once it is assigned.
For example:
t←Top
while(true) {
if t=null then
return null
*hp←t
if Top!=t then
continue
...
t←(t→next) //after this instruction pointer t will be still protected?
}
In the end I implemented my own version of Hazard Pointers (HP) according to the original paper. The answer to my question is NO, t is no longer safe to use. The reason is that, the way HP works, *hp protects the node that t pointed to when you declared it as a hazardous pointer, so when t moves to the next node, the HP mechanism keeps protecting the previous node. I have to assign the new value to *hp before I can use it safely.
Also, in the example of the paper it is not explicit, but when you finish using a hazard pointer you have to release it. That means, return *hp to its original state (NULL). This way, if another thread wants to delete (retire) this node, it won't be seen as being used.
In my example above, I have to release *hp before leaving the method. Inside the loop it is not necessary because I am overwriting the same *hp position (*hp ← t), so the previous node is no longer protected.
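The publish-and-recheck step and the release step can be sketched with C11 atomics as below. This is only the per-pointer primitive, not a complete traversal or reclamation scheme: struct hp_node, hp_protect and hp_release are names invented for this sketch, a real list also needs the retire/scan side, and a full traversal needs further validation of the links it follows, as described in the paper.

```c
#include <stdatomic.h>
#include <stddef.h>

struct hp_node {
    int value;
    _Atomic(struct hp_node *) next;
};

/* Publish the pointer currently stored in *src into the hazard slot *hp,
 * then re-read *src. Retry until both reads agree, so the slot was already
 * visible while the pointer was still reachable through src. */
static struct hp_node *hp_protect(_Atomic(struct hp_node *) *hp,
                                  _Atomic(struct hp_node *) *src) {
    struct hp_node *p;
    do {
        p = atomic_load(src);
        atomic_store(hp, p);
    } while (p != atomic_load(src));
    return p;
}

/* Clear the slot once the node is no longer used, so it can be reclaimed. */
static void hp_release(_Atomic(struct hp_node *) *hp) {
    atomic_store(hp, (struct hp_node *)NULL);
}
```

Moving on to the next node means calling hp_protect again with the new source pointer. Overwriting the slot is what drops protection of the previous node, which matches the observation above that nothing has to be released inside the loop.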
You do not need hazard pointers if the list is only ever traversed, i.e. if no thread modifies it or frees nodes concurrently. Hazards arise when different threads read and write the same resource. (In particular, hazard pointers help overcome the ABA problem, where a resource's value is changed to something else and then back to its original value, which makes the change difficult to notice.) If every thread is only reading, there is no need for hazard pointers.
By the way, it seems to me that you have to change if Top=t to if Top!=t, so that you can proceed with your code if there is no hazard. Note that continue returns to the beginning of the loop. Also, your whole code should be in a while(true) loop.
You can read more about hazard pointers here http://www.drdobbs.com/lock-free-data-structures-with-hazard-po/184401890 , or just by googling!
EDIT You need to provide the code for insert and delete functions. In short, the part of the code that you've mentioned ends up being an infinite loop after execution of t←(t→next), since Top!=t will hold true afterwards.
What you need to do, instead of checking t against Top, is to check it against its previously captured value. Again, it depends on your implementation of the other methods, but you probably want to implement something similar to Tim Harris's algorithm, which uses a two-phase deletion (1: marking the node, 2: freeing it). Then, when you traverse the list, you need to check for marked nodes as well. There is also an implementation of a doubly linked list, with a find method that you can use as a base for your implementation, in Fig. 9 of http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf. Hope this helps.
How would you find if one of the pointers in a linked list is corrupted or not ?
Introduce a magic value in your node structures. Initialize it upon new node allocation. Before every access, check whether the node structure that the pointer points to contains the valid magic. If the pointer points at an unreadable data block, your program will crash on that check. To guard against that, on Windows there is the VirtualQuery() API: call it before reading, and make sure the pointer points at readable data.
There are several possibilities.
If the list is doubly linked, it's possible to check the back pointer from what a front pointer points to, or vice versa.
If you have some idea as to the range of expected memory addresses, you can check against it. This is particularly true if the linked list is allocated from a limited number of chunks of memory, rather than having each node allocated independently.
If the nodes have some recognizable data in them, you can run down the list and check for recognizable data.
This looks to me like one of those questions where the interviewer isn't expecting a snappy answer, but rather an analysis of the question including further questions from you.
It's sort of a pain, but you can record the values of each pointer as you come across them with your debugger and verify that it's consistent with what you'd expect to find (if you'd expect a pointer to be NULL, make sure it's NULL. if you'd expect a pointer to refer to an already existing object, verify that that object's address has that value, etc.).
You could keep a doubly linked list. Then you can check that node->child->parent == node (although if node->child has become corrupt, this has a reasonable chance of causing an exception).
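A sketch of that consistency check over a whole list, using prev/next instead of parent/child. struct dnode and dlist_links_consistent are illustrative names. Like the one-node check, it dereferences every pointer it follows, so a wild pointer can still crash it, and it assumes the list is not circular.

```c
#include <stddef.h>

struct dnode {
    struct dnode *prev;
    struct dnode *next;
    int value;
};

/* Returns 1 if every node's prev pointer matches the node we arrived from,
 * 0 as soon as a mismatch is found. */
static int dlist_links_consistent(const struct dnode *head) {
    const struct dnode *expected_prev = NULL;
    for (const struct dnode *n = head; n != NULL; n = n->next) {
        if (n->prev != expected_prev)
            return 0;
        expected_prev = n;
    }
    return 1;
}
```

A mismatch tells you that either the forward or the backward link at that position was overwritten, which narrows the search considerably.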
Several debuggers / bound-checkers will do this for you, but a cheap and quick solution to this question is to
Alter the structure of the list's nodes to include one additional char[n] field (or more typically two, one as the first field and the other as the last field in the structure, which allows bounds checking in addition to detecting pointer corruption).
Initialize these fields with a short (but long enough...) constant string such as "VaL1D-LiST-NODE 1234" when the nodes are created.
Check that the values read from these fields match the expected text each time a node is dereferenced, before using the node in earnest.
When a field's value does not match, this indicates either that:
the pointer is invalid (it never pointed to a list node)
something else is overwriting the node structure (the pointer is "valid" but the data it points to has been corrupted).
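The guard-field recipe could be sketched like this. The guard text is the one suggested above; NODE_GUARD, struct guarded_node and the two functions are hypothetical names for this example.

```c
#include <stdlib.h>
#include <string.h>

#define NODE_GUARD "VaL1D-LiST-NODE 1234"

struct guarded_node {
    char head_guard[sizeof NODE_GUARD];  /* first field */
    int value;
    struct guarded_node *next;
    char tail_guard[sizeof NODE_GUARD];  /* last field */
};

/* Creates a node and stamps both guard fields with the known text. */
static struct guarded_node *guarded_node_new(int value) {
    struct guarded_node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    memcpy(n->head_guard, NODE_GUARD, sizeof NODE_GUARD);
    memcpy(n->tail_guard, NODE_GUARD, sizeof NODE_GUARD);
    n->value = value;
    n->next = NULL;
    return n;
}

/* Returns 1 only if both guards still hold the expected text. */
static int guarded_node_ok(const struct guarded_node *n) {
    return n != NULL
        && memcmp(n->head_guard, NODE_GUARD, sizeof NODE_GUARD) == 0
        && memcmp(n->tail_guard, NODE_GUARD, sizeof NODE_GUARD) == 0;
}
```

Calling guarded_node_ok() before following n->next catches both cases listed above, at the cost of some extra bytes per node and two comparisons per access.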