Double pointer: pointer to struct member that is a pointer - c

I'm trying to write a program to play "Pangolin" (like this guy). It asks yes/no questions, walking down a binary tree until it gets to a leaf node. It then "guesses", and if the user says the answer was wrong, asks the user what they were thinking of and for a question that distinguishes that from the incorrect guess. It then adds the new data to the tree.
This is my struct for a tree node. NodeType is QUESTION_NODE for nodes containing a question or OBJECT_NODE for nodes containing an "object", that is, the thing the program deduces the user to be thinking of. Question nodes have pointers to child nodes: one for yes and one for no.
typedef struct _TreeNode {
NodeType type;
union {
char* question;
char* objectName;
} nodeString;
//children for yes and no answers: will be invalid when type is OBJECT_NODE
struct _TreeNode* yes;
struct _TreeNode* no;
} TreeNode;
As this is a learning exercise, I'm trying to do it with double pointers. Here is the function that is supposed to add a question node to the tree:
void addData(TreeNode** replace, char* wrongGuess) {
//create a new object node for what the user was thinking of
// ... (code to get user input and build the new object node struct) ... //
//create a new question node so we don't suck at pangolin so much
// ... (code to get a question from the user and put it in a question node struct) ... //
//link the question node up to its yes and no
printf("What is the answer for %s?\n", newObjectName);
if (userSaysYes()) {
newQuestionNodePtr->yes = newObjectNodePtr;
newQuestionNodePtr->no = *replace;
}
else {
newQuestionNodePtr->no = newObjectNodePtr;
newQuestionNodePtr->yes = *replace;
}
//redirect the arc that brought us to lose to the new question
*replace = newQuestionNodePtr;
}
The addData function is then called thus:
void ask(TreeNode node) {
//(... ask the question contained by "node" ...)//
//get a pointer to the yes/no member pointer of "node"
TreeNode** answerP2p;
answerP2p = userSaysYes() ? &(node.yes) : &(node.no);
//(... the user reports that the answer we guessed was wrong ...)//
puts("I am defeated!");
//if wrong, pass the pointer to pointer
addData(answerP2p, answerNode.nodeString.objectName);
My (presumably wrong) understanding is this:
In "ask()", I am passing addData a pointer which points to "node"'s member "yes" (or no). That member is in turn a pointer. When, in addData, I assign to "*replace", this should modify the struct, redirecting its "yes" (or no) member pointer to point to the new question node I have created.
I have debugged and found that the newQuestionNode and newObjectNode are created successfully. newQuestionNode's children are correctly assigned. However the new question node is not inserted into the tree. The "*replace = newQuestionNodePtr" line does not have the effect I would expect, and the node referred to by "node" in the "ask" scope does not have its child pointer redirected.
Can anyone see what is wrong in my understanding? Or perhaps a way in which I haven't expressed it right in my code? Sorry this question is so long.

You should not declare the pointer you pass to the function as a double pointer. Instead pass the address of a single pointer to the function:
TreeNode* answerP2p;
answerP2p = userSaysYes() ? node.yes : node.no;
addData(&answerP2p, answerNode.nodeString.objectName);

Unfortunately I don't quite understand Joachim Pileborg's answer above, but I eventually sussed my problem and I guess it's a fairly common mistake for new C-farers[1], so I'll post it here in my own terms.
In my hasty transition from Java to C I had told myself "OK, structs are just objects without methods". Assessing the validity of this simplification is left as an exercise to the reader. I also extended this assumption to "when an argument is of a struct type, it is automatically passed by reference". That's obviously false, but I hadn't even thought about it. Stupid.
So the real problem here is that I was passing ask() a variable of type TreeNode for its node argument. This entire struct was being passed by value (of course). When I passed answerP2p to addData(), it was actually working correctly, but it was modifying ask()'s local copy of the TreeNode. I changed ask() to take a TreeNode* and lo, there was a tree.
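To make the by-value pitfall concrete, here is a minimal sketch. It uses a simplified stand-in for TreeNode, and the names Node, redirectByValue, and redirectByPointer are hypothetical, not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for TreeNode; only one child pointer is needed here. */
typedef struct Node {
    struct Node *yes;
} Node;

/* Receives a COPY of the struct. &node.yes is the address of the copy's
   member, so the caller's struct is left unchanged. */
void redirectByValue(Node node, Node *replacement) {
    Node **slot = &node.yes;
    *slot = replacement;
}

/* Receives the address of the caller's struct. &node->yes is the address
   of the real member, so the write is visible to the caller. */
void redirectByPointer(Node *node, Node *replacement) {
    assert(node != NULL);
    Node **slot = &node->yes;
    *slot = replacement;
}
```

Calling redirectByValue(parent, &fresh) leaves parent.yes untouched, while redirectByPointer(&parent, &fresh) changes it, which mirrors the difference between the original ask() and the fixed one.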
C what I did there[1]?

Related

Removing First element of linked list in C

Seemingly simple C code is seemingly not allowing me to remove the first element from a linked list. I can, however, successfully remove any other individual element and can successfully delete the whole linked list.
typedef struct list{
int data;
struct list * next;
} list;
void remove_element(list * node, unsigned int index){
if (node == NULL)
exit(-1);
list *currElem = node;
if (index == 0) {
node = node->next;
currElem->next = NULL;
free(currElem);
return;
}
Produces the following:
"free(): invalid pointer: 0xbfabb964"
I've followed the same format for all of my other manipulation functions with no issues. Similar threads on forums don't seem to be dealing with this particular problem.
You can read the explanation in this pdf on the Push function which explains it:
http://cslibrary.stanford.edu/103/
This is where C gets funky psychologically. You instinctively want to label a pointer as a pointer, which it is. But it is a pointer value, not a pointer reference. It's like the holy spirit of the C divinity. The triumvirate. C passes arguments to functions by value, not by address/reference. So, what do you do to pass a variable by reference? Remember, the solution is so obvious, it really didn't make sense to me for a week, I swear to god.
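One way to read that hint: pass the address of the head pointer, so the function can change which node the caller considers first. The following is only a sketch under that assumption. The name remove_at and its return convention are hypothetical, and it may not be the exact fix for the reported free() error, since the calling code is not shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct list {
    int data;
    struct list *next;
} list;

/* Removes the element at `index`.
   `head` is the ADDRESS of the caller's head pointer, so removing
   index 0 can update the caller's variable.
   Returns 0 on success, -1 if `index` is past the end of the list. */
int remove_at(list **head, unsigned int index) {
    assert(head != NULL);          /* caller must pass &head_pointer */

    /* `link` always points at the pointer that refers to the current node:
       first the head pointer itself, then some node's `next` member. */
    list **link = head;
    while (*link != NULL && index > 0) {
        link = &(*link)->next;
        index--;
    }
    if (*link == NULL)
        return -1;

    list *victim = *link;
    *link = victim->next;          /* works the same for head and non-head */
    free(victim);
    return 0;
}
```

A caller holding `list *head` would write `remove_at(&head, 0)`; afterwards head refers to what used to be the second element.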

Linked List function explanation, subscription of a structure pointer

Programming a simple singly-linked-list in C, I came about this repository on Github: https://github.com/clehner/ll.c while looking for some examples.
There is the following function (_list_next(void *)):
struct list
{
struct list *next; // on 64-bit-systems, we have 8 bytes here, on 32-bit-systems 4 bytes.
void *value[]; // ISO C99 flexible array member, incomplete type, sizeof may not be applied and evaluates to zero.
};
void *_list_next(void *list)
{
return list ? ((struct list *)list)[-1].next : NULL; // <-- what is happening here?
}
Could you explain how this works?
It looks like he is casting a void pointer to a list pointer and then subscripting that pointer. How does that work and what exactly happens there?
I don't understand purpose of [-1].
This is undefined behavior that happens to work on the system where the author has tried it.
To understand what is going on, note the return value of _ll_new:
void * _ll_new(void *next, size_t size)
{
struct ll *ll = malloc(sizeof(struct ll) + size);
if (!ll)
return NULL;
ll->next = next;
return &ll->value;
}
The author gives you the address of value, not the address of the node. However, _list_next needs the address of struct list: otherwise it would be unable to access next. Therefore, in order to get to next member you need to find its address by walking back one member.
That is the idea behind indexing list at [-1] - it gets the address of next associated with this particular address of value. However, this indexes the array outside of its valid range, which is undefined behavior.
Other functions do that too, but they use pointer arithmetic instead of indexing. For example, _ll_pop uses
ll--;
which achieves the same result.
A better approach would be using something along the lines of container_of macro.
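A minimal sketch of that idea, using offsetof from <stddef.h> rather than the Linux kernel's container_of macro itself. The helper names node_from_value and list_next are hypothetical and are not part of the ll.c repository.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *next;
    void *value[];   /* flexible array member: the payload starts here */
};

/* Recovers the address of the enclosing node from the address of its
   `value` member by stepping back exactly offsetof(struct list, value)
   bytes, instead of assuming the payload sits one whole struct past the
   start of the node. */
struct list *node_from_value(void *value) {
    assert(value != NULL);
    return (struct list *)((char *)value - offsetof(struct list, value));
}

/* Same job as _list_next in the question, but the intent is spelled out. */
void *list_next(void *value) {
    return value ? node_from_value(value)->next : NULL;
}
```

The subtraction is done on a char pointer so that it counts bytes, and the offset comes from the compiler rather than from an assumption about struct layout.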

Pondering the purpose of TAILQ's tqe_prev not pointing to the previous node

sys/queue.h defines a data structure called TAILQ. It is very widely used throughout the Linux kernel. Its definition is like this:
#define TAILQ_ENTRY(type) \
struct { \
struct type *tqe_next; /* next element */ \
struct type **tqe_prev; /* address of previous next element */ \
}
I am a little baffled by this code: what is the advantage of having tqe_prev point to the tqe_next of the previous node? If it were me, I would have tqe_prev point directly to the previous node, just as tqe_next points to the next node.
One reason I'd think of, when we insert a node, we directly operate on the pointer to be updated, we do not need to go through its owning node first. But is that it? Any other advantages?
I am also wondering how we can traverse the queue backwards. When we have a pointer to a node, its tqe_prev does not point to the previous node, so we seem to have no way to walk back through the queue to the head. Or is backward traversal not supported by TAILQ by design?
Oh, interesting. I didn't know this technique had any other users (I came up with it myself).
The reason to do things this way is that there may not be a "previous node": The first element does not have a predecessor, but it does have a pointer pointing to it.
This simplifies several operations. For example, if you want to delete a node given only a pointer to it, you can do this:
void delete(struct node *p) {
*p->tqe_prev = p->tqe_next;
if (p->tqe_next) {
p->tqe_next->tqe_prev = p->tqe_prev;
}
free(p);
}
If you had a pointer to the preceding node, you'd have to write this:
void delete(struct node *p) {
if (p->tqe_prev) {
p->tqe_prev->tqe_next = p->tqe_next;
} else {
???
}
if (p->tqe_next) {
p->tqe_next->tqe_prev = p->tqe_prev;
}
free(p);
}
... but now you're stuck: You can't write the ??? part without knowing where the root of the list is.
Similar arguments apply to insert operations.
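As a sketch of the insert case, here is an "insert before" operation on the same kind of node. The struct layout and the name insert_before are assumptions made for illustration; the real TAILQ macros wrap the two link fields in a TAILQ_ENTRY member.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *tqe_next;    /* next element */
    struct node **tqe_prev;   /* address of the pointer that points at us */
};

/* Inserts `n` immediately before `p`.
   No special case is needed when `p` is the first element: in that case
   p->tqe_prev holds the address of the head pointer, and the same
   assignment through it updates the head. */
void insert_before(struct node *p, struct node *n) {
    assert(p != NULL && n != NULL);
    n->tqe_prev = p->tqe_prev;    /* n takes over p's incoming link */
    n->tqe_next = p;
    *p->tqe_prev = n;             /* whoever pointed at p now points at n */
    p->tqe_prev = &n->tqe_next;   /* p's incoming link is now n's next slot */
}
```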
Backwards traversal is indeed not a priority for this kind of structure. But it can be done if need be (but only if you know for sure that you are not at the root, i.e. you know there actually is a previous node):
#include <stddef.h>
struct node *prev(struct node *p) {
return (struct node *)((unsigned char *)p->tqe_prev - offsetof(struct node, tqe_next));
}
We know that p->tqe_prev is the address of a .tqe_next slot within a struct node. We cast this address to (unsigned char *) so we can do bytewise pointer arithmetic. We subtract the (byte) offset of .tqe_next within the struct node structure (offsetof macro courtesy of <stddef.h>). This gives us the address of the beginning of the struct node structure, which we finally cast to the right type.
Linus answered the question in https://meta.slashdot.org/story/12/10/11/0030249/linus-torvalds-answers-your-questions.
The quote is as follows:
At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like
if (prev)
prev->next = entry->next;
else
list_head = entry->next;
and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common.
People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a "*pp = entry->next".
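A minimal sketch of the technique described in the quote. The struct and the function name remove_entry are hypothetical. Unlike the quote's version, this one also tolerates an entry that is not in the list, which costs one extra check.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    int value;
    struct entry *next;
};

/* Unlinks `target` from the list whose head pointer lives at `list_head`.
   The node is only unlinked, not freed. */
void remove_entry(struct entry **list_head, struct entry *target) {
    assert(list_head != NULL);

    /* pp starts at the head pointer and then visits each node's `next`
       member, so it always holds the address of the pointer to rewrite. */
    struct entry **pp = list_head;
    while (*pp != NULL && *pp != target)
        pp = &(*pp)->next;

    if (*pp != NULL)              /* found: no head/non-head distinction */
        *pp = target->next;
}
```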

Confused about this C code I saw in a job interview

Disclaimer: I'm allowed to talk about the interview question since I already was rejected by the company and never had to sign an NDA before taking the interview test anyhow.
Also, this isn't a "Write my code for me" post. I'm just curious about understanding the setup of this problem. It wanted me to fill in the body of a function that removes an element from a linked list:
typedef struct node
{
int val;
struct node * next;
} node;
void remove_val(node ** arr, int i)
{
/* Write the procedure here */
}
I was confused about why the problem had a pointer-to-a-pointer as a parameter. I would expect that the parameter to a function would be the root of the list, which would be a pointer to a node. Right? Any idea what the first parameter was supposed to be???
What happens if you remove the head of the list? You need some way to communicate that back to the calling code. This function signature allows you to change the head of the list to point to its next, if necessary.

Pointers changing values [duplicate]

So, here's the story. I'm trying to create a recursive descent parser that tokenizes a string and then creates a tree of nodes out of those tokens.
All of the pointers for my major classes are working... if you've worked with an RDP before then you know what I'm talking about with program -> statement -> assignStmt... etc. The idea being that the program node has a child that points to the statement node, etc.
Here's the problem. When I get to the end of the treenode I'm pointing to the actual tokens that the tokenizer created from the string.
So, let's say the string is:
firstvar = 1;
In this case there are 4 tokens [{id} firstvar], [{assignment} =], [{number} 1], [{scolon}]
And I want my assignStmt node to point to the non-decorator portions of that statement.. namely, child1 of assignStmt would be [{id} firstvar] and child2 would be [{number} 1]...
HOWEVER. When I assign child1 to [{id} firstvar], and then move onward to the next tokens, the value of child1 changes as I move forward. So, if I change my global token to the next token ( in this case [{assignment} =] ) then child1 of the assignStmt changes with it.
Why is this? What can I do?! Thank you!
TOKEN* getNextToken(void);
//only shown here so you know the return type... it's working properly elsewhere
typedef struct node {
TOKEN *data;
struct node *child1, *child2, *child3, *child4, *parent;
} node;
TOKEN *token;
Symbol sym;
struct node *root;
void getsym()
{
token = getNextToken();
sym = token->sym;
}
int main()
{
getsym();
//So, right now, from getsym() the global token has the value {identifier; firstvar}
struct node* tempNode;
tempNode = (struct node*) calloc(1, sizeof(struct node));
tempNode->child1 = tempNode->child2 = tempNode->child3 = tempNode->child4 = NULL;
tempNode->data = token;
getsym();
//BUT NOW from getsym() the global token has the value {assignment; =}, and
//subsequently the tempNode->data has changed from what it should be
//{identifier; firstvar} to what the global token's new value is: {assignment; =}
}
Since I can't comment on this due to my low reputation, I will post this as an answer. If I have understood your problem, you are probably passing a pointer to a function where you need a pointer to a pointer instead.
In C, arguments are passed to a function by value, not by reference. The function makes a local copy of each argument and works only with that copy, so any changes affect only the copy and are lost when the function returns unless you handle this correctly.
You are returning a pointer to a global variable, and that pointer will always be the same even if you modify the global variable.
The solution is to either allocate a new object each time, or to not use pointers at all and return the structure directly and let the compiler handle copying of the structures internal values.
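A sketch of the first option, copying the token before the tokenizer reuses its storage. The TOKEN layout and the name copyToken are assumptions, since the original TOKEN definition and getNextToken implementation are not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token layout, for illustration only. */
typedef struct {
    int sym;
    char text[32];
} TOKEN;

/* Returns a heap-allocated copy of *src, so that later changes to the
   shared token cannot alter what a tree node refers to.
   Returns NULL if allocation fails; the caller owns and frees the copy. */
TOKEN *copyToken(const TOKEN *src) {
    assert(src != NULL);
    TOKEN *copy = malloc(sizeof *copy);
    if (copy != NULL)
        memcpy(copy, src, sizeof *copy);
    return copy;
}
```

In the question's main(), this would mean storing copyToken(token) in tempNode->data instead of token itself, and freeing it when the tree is torn down.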
