Pointers as arguments in C functions - c

In a lot of examples I've read a simple getListLength() function would look something like this:
int getListLength(struct node *head)
{
    struct node *temp = head;
    int iCount = 0;
    while (temp)
    {
        ++iCount;
        temp = temp->next;
    }
    return iCount;
}
What strikes me as unnecessary is the declaration of a local pointer (in this case temp) that copies the passed parameter. If I recall correctly, parameters are passed by value, so the function receives its own copy of each one. So there is no need for a local pointer that copies head, because head is already a copy, right?
In other words, would it be correct to discard the *temp pointer and use head everywhere instead?

Yes, it's a copy, so yes, it would be correct.
int getListLength(struct node* head)
{
    int iCount = 0;
    while (head)
    {
        ++iCount;
        head = head->next;
    }
    return iCount;
}
Why don't you execute it and see for yourself?
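One way to try it, as a minimal sketch: the struct node layout and the helper callerPointerUnchanged() below are assumptions made up for illustration, not part of the original question. The helper builds a three-node list, calls the version that advances head directly, and checks that the caller's own pointer still refers to the first node afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed node layout; the question does not show its definition. */
struct node {
    int data;
    struct node *next;
};

/* Walks the list by advancing the parameter itself, with no local copy. */
int getListLength(struct node *head)
{
    int iCount = 0;
    while (head) {
        ++iCount;
        head = head->next;   /* changes only this function's copy */
    }
    return iCount;
}

/* Hypothetical check: returns 1 if the caller's pointer is untouched. */
int callerPointerUnchanged(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };
    struct node *list = &a;

    int len = getListLength(list);
    assert(len == 3);
    return list == &a;       /* still points at the first node */
}
```

Calling callerPointerUnchanged() from main should return 1, since only the callee's copy of the pointer was advanced.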

While it's true that you don't need the local copy since the pointer is passed by value, it's probably there for stylistic reasons. Some consider it bad form to modify arguments passed in (though I do find it useful in some scenarios), but perhaps more importantly, you lose some of the self-documentation in the code; specifically, head no longer always points to the true head of the linked list. This isn't that confusing in your short piece of code, but having inaccurately-named variables can be much more confusing when the code is longer and more complex.

Often, the reason to make a local copy of a passed-in pointer is to reduce the side-effects of a function (by not modifying the function parameter).
If a function is only using the pointer to read (not write), and has no other interaction with the outside world, the function could be annotated as 'pure' in GCC and would be open for some nice optimizations.
Example:
__attribute__((pure)) int getListLength(struct node *head)
{
    struct node *temp = head;
    int iCount = 0;
    while (temp)
    {
        ++iCount;
        temp = temp->next;
    }
    return iCount;
}
If you aren't familiar with what side effects are, try reading the Side Effects and Functional Programming Wikipedia articles to get more information on the subject.

Related

Double pointer to binary search-tree node

This might seem like a silly question to some of you and I know that I get things mixed up quite often but I need to understand the code so I can stop obsessing about it and focus on the real matter of why I need to use it.
So, in the code I see several assignments like this:
struct bst_node** node = root;
node = &(*node)->left;
node = &(*node)->right;
is there an invisible parenthesis here?
node = &((*node)->right);
This example is taken from literateprograms.org.
So to me it seems &(*node) is unnecessary and I might as well just write node->left instead, but the code seems to work where I can't make sense of it and I'm wondering if it's because I'm misunderstanding what's happening at those lines. Particularly, at one place in the code where it is deleting a node by constantly moving the "deleted" data to the bottom of the tree to safely remove the node without having to "break things", I'm lost because I don't get how
old_node = *node;
if ((*node)->left == NULL) {
    *node = (*node)->right;
    free_node(old_node);
} else if ((*node)->right == NULL) {
    *node = (*node)->left;
    free_node(old_node);
} else {
    struct bst_node **pred = &(*node)->left;
    while ((*pred)->right != NULL) {
        pred = &(*pred)->right;
    }
    /* pseudo-code: swap the values of *pred and *node once the
       bottom-right node of old_node's left subtree has been found,
       then make a recursive call with pred */
}
can keep the tree structure intact. I don't understand how this guarantees that, and would appreciate some help from somebody who knows what's going on. As I read it, node is a local variable on the stack, created at the function call. Since it is a double pointer, it points to a location on the stack (I assume this, since they did &(*node) before the function call), either in its own stack frame or in the caller's, which in turn points to the node on the heap.
In the example code above what I think it is supposed to do is switch either left or right, since one of them is NULL, and then switch the one that isn't (assuming the other one isn't NULL?) As I said, I'm not sure about how this would work. My question mostly relates to the fact that I think &(*node) <=> node but I want to know if that's not the case etc.
node = &(*node)->right;
is there an invisible parenthesis here?
node = &((*node)->right);
Yes. It is taking the address of the right member of *node. The -> takes precedence over &; see C++ Operator Precedence (-> is 2 and & is 3 in that list) (it's the same general precedence as C).
So to me it seems &(*node) is unnecessary and I might as well just write node->left instead,
Your premise is off. There is no expression &(*node) in that code. As explained above, the & applies to the entire (*node)->left, not to (*node).
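The precedence claim can be checked directly. This is only a sketch: the struct bst_node layout and the function names sameAddress() and precedenceDemo() are assumptions for illustration. It compares the two spellings and then shows what node ends up pointing at after the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout, loosely modeled on the question's bst_node. */
struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Returns 1 if both spellings yield the address of the same member. */
int sameAddress(struct bst_node **node)
{
    struct bst_node **withoutParens = &(*node)->right;
    struct bst_node **withParens = &((*node)->right);
    return withoutParens == withParens;
}

/* Hypothetical demo: returns 1 if node ends up pointing at top.right. */
int precedenceDemo(void)
{
    struct bst_node leaf = { 2, NULL, NULL };
    struct bst_node top = { 1, NULL, &leaf };
    struct bst_node *root = &top;
    struct bst_node **node = &root;

    assert(sameAddress(node));
    node = &(*node)->right;   /* node now holds the address of top.right */
    return node == &top.right && *node == &leaf;
}
```

Note that after the assignment, node no longer points at root; it points at the right member inside top, which is why writing through *node later rewires the tree itself.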
In that code the double pointers are just that, a pointer to a pointer. Just as this works:
int x = 0;
int *xptr = &x;
*xptr = 5;
assert(x == 5);
This is the same, it changes the value of the pointer x:
int someint;
int *x = &someint;
int **xptr = &x;
*xptr = NULL;
assert(x == NULL);
In that code snippet you posted, assigning a pointer to *node changes the value of the pointer that node points to. So, e.g. (pseudo-ish code):
#include <stdlib.h>
typedef struct bst_node_ {
    struct bst_node_ *left;
    struct bst_node_ *right;
} bst_node;
bst_node *construct_node(void) {
    /* return a pointer to a new, zero-initialized bst_node (NULL on failure) */
    return calloc(1, sizeof(bst_node));
}
void create_node(bst_node **destination_ptr) {
    *destination_ptr = construct_node();
}
void somewhere(void) {
    bst_node *n = construct_node();
    create_node(&n->left);  // after this, n->left points to a new node
    create_node(&n->right); // after this, n->right points to a new node
}
Noting again that &n->left is the same as &(n->left) because of precedence rules. I hope that helps.
In C++ you can pass arguments to a function by reference, which is essentially the same as passing a pointer except syntactically it leads to code that is a bit easier to read.
That is useful
&(*node)->left <=> &((*node)->left)
The variable edited by this code is *node. I would need more context for this code to give more information.
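To see why writing through *node keeps the tree intact, here is a hedged sketch of insertion and deletion built on the same pointer-to-pointer idea. It is not the literateprograms.org code: the data field, the function names (find_link, bst_insert, bst_delete, bst_contains) and the use of free() instead of free_node() are all assumptions, and the two-child case copies the predecessor's value up and unlinks the predecessor directly rather than swapping and recursing.

```c
#include <assert.h>
#include <stdlib.h>

struct bst_node {
    int data;                 /* assumed payload field */
    struct bst_node *left;
    struct bst_node *right;
};

/* Returns the address of the link (the root pointer or some left/right
   member) that holds the matching node, or the NULL slot where it belongs. */
static struct bst_node **find_link(struct bst_node **node, int data)
{
    while (*node != NULL && (*node)->data != data)
        node = data < (*node)->data ? &(*node)->left : &(*node)->right;
    return node;
}

int bst_contains(struct bst_node *root, int data)
{
    return *find_link(&root, data) != NULL;
}

/* Returns 1 if inserted, 0 if already present or out of memory. */
int bst_insert(struct bst_node **root, int data)
{
    assert(root != NULL);
    struct bst_node **link = find_link(root, data);
    if (*link != NULL)
        return 0;
    *link = calloc(1, sizeof **link);  /* writes into the parent's member */
    if (*link == NULL)
        return 0;
    (*link)->data = data;
    return 1;
}

/* Returns 1 if a node was removed, 0 if the value was not found. */
int bst_delete(struct bst_node **root, int data)
{
    struct bst_node **node = find_link(root, data);
    struct bst_node *old_node = *node;
    if (old_node == NULL)
        return 0;
    if (old_node->left == NULL) {
        *node = old_node->right;       /* parent now skips old_node */
    } else if (old_node->right == NULL) {
        *node = old_node->left;
    } else {
        /* Two children: locate the link to the in-order predecessor. */
        struct bst_node **pred = &old_node->left;
        while ((*pred)->right != NULL)
            pred = &(*pred)->right;
        old_node->data = (*pred)->data; /* copy the value up */
        old_node = *pred;               /* the predecessor is freed instead */
        *pred = old_node->left;         /* it has no right child */
    }
    free(old_node);
    return 1;
}
```

In every branch the only thing modified is the one pointer that used to lead to the removed node, so whoever owned that pointer (the caller's root variable or a parent's left/right member) is updated in place and nothing else needs to be touched.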

If a parameter is a pointer type, is the parameter a pointer allocated in local memory

I'm just learning C, and I have a question about pointer parameters. My code is the following:
int Length(node *head)
{
    int length = 0;
    while (head) {
        length++;
        head = head->next;
    }
    return length;
}
The code in the book I'm reading says to do this though:
int Length(struct node* head)
{
    struct node* current = head;
    int count = 0;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    return count;
}
Is there really a difference? The way I'm reading my code is that I get a pointer to a node struct as a parameter. The pointer itself however, is a local variable that I am free to mutate as long as I don't dereference it first. Therefore, I can change the value of the pointer to point to a different node (the next node as it may be).
Will this cause a memory leak or is there some other difference I'm not seeing?
This code is for a linked list implementation. The node struct is defined as:
// Define our linked list node type
typedef struct node {
    int data;
    struct node *next;
} node;
Yes, they are both doing the same. But in the second example, it is more clear what the author is trying to do because of the code. In your first example, you're using the pointer head to reference nodes other than the head. That can be confusing.
You could write your function like this and your intent would be clear:
int GetLength(node* current)
{
    int length = 0;
    while (current != NULL)
    {
        length += 1;
        current = current->next;
    }
    return length;
}
Your solution and reasoning are correct. The head argument is a local variable: a copy of the pointer passed to your function, typically allocated on the stack. That's why you can modify it from within the function.
There is no difference between the two solutions, at least not in functionality; modern compilers will most likely optimize away the extra variable in the book's solution. The only slight difference is one of style: many programmers treat arguments as unmodifiable values, just in case, to avoid mistakes.
Your understanding of the argument-passing mechanics is correct. Some people simply prefer not to modify argument values, the reasoning being that modifying an argument tends to be bug-prone. There's a strong expectation that at any point in the function, if you want to get the value the caller passed as head, you can just write head. If you modify the argument and then don't pay attention, or if you're maintaining the code 6 months later, you might write head expecting the original value and get some other thing. This is true regardless of the type of the argument.

Why to use local pointer to iterate over a list?

Say I have the following struct to define list nodes:
struct node {
    int data;
    struct node* next;
};
And I have this function to get the length of a list:
int Length(struct node* head) {
    struct node* current = head;
    int count = 0;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    return count;
}
Why would I want to do this: struct node* current = head; instead of just iterating over the head?
So, why would this not be ok:
int Length(struct node* head) {
    int count = 0;
    while (head != NULL) {
        count++;
        head = head->next;
    }
    return count;
}
Isn't head local to the Length function once we are inside it, so that even if we do head = head->next, the caller's pointer won't be affected outside the function?
Thanks
Your two code snippets are equivalent.
However, there's a school of thought that says that you should never modify function arguments, in order to avoid potential programming errors, and to enhance readability (you're not really modifying the head). To that end, you will often see people defining as many arguments as possible as const.
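As a sketch of that const style (the constLengthDemo() helper is made up for illustration), declaring the parameter itself const makes the compiler enforce the "never modify arguments" rule, so the local iterator becomes mandatory rather than optional:

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* The second const makes head itself read-only inside the function;
   the first says the nodes are not written to while counting. */
int Length(const struct node *const head)
{
    const struct node *current = head;  /* the iterator may move; head may not */
    int count = 0;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    /* head = head->next;   <- the compiler rejects this assignment */
    return count;
}

/* Hypothetical usage: builds a three-node list and measures it. */
int constLengthDemo(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };
    assert(Length(NULL) == 0);
    return Length(&a);
}
```

Callers are unaffected by the top-level const: it only constrains the function body, and a plain struct node * argument converts implicitly.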
A smart compiler will optimize the extra variable away anyway. Some people keep it for clarity: to them, head means the head of the list and current is just the iterator. It's purely for readability.
The programmers I know all intuitively assume that the value of an argument passed by value (such as the address held by a pointer) remains unchanged throughout the function. Because of this assumption, it's easy to introduce small bugs when extending the function. Imagine I wanted to add a little bit of debug output to your Length function:
int Length(struct node* head) {
    int count = 0;
    while (head != NULL) {
        count++;
        head = head->next;
    }
    printf( "Length of list at %p is %d\n", head, count );
    return count;
}
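In this version, head is always NULL by the time printf runs, so the message shows a null pointer instead of the address of the list. The larger the function gets (or the more convoluted the logic is, or the less attention the person making the change is paying), the more easily this kind of mistake slips in.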
For short functions, such as Length, I personally consider it to be fine (I do it as well).
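For comparison, a sketch of the same debug output with head left alone (debugLengthDemo() is a hypothetical helper, and the cast to void * is added because %p expects that type). With a separate iterator, head still holds the caller's value when printf runs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct node {
    int data;
    struct node *next;
};

int Length(struct node *head)
{
    struct node *current = head;   /* iterate with a copy */
    int count = 0;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    /* head was never reassigned, so this prints the list's real address */
    printf("Length of list at %p is %d\n", (void *)head, count);
    return count;
}

/* Hypothetical usage: a two-node list on the stack. */
int debugLengthDemo(void)
{
    struct node b = { 2, NULL };
    struct node a = { 1, &b };
    int n = Length(&a);
    assert(n == 2);
    return n;
}
```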

How can I edit a pointer to a list node from a function in a recursion?

I have been writing a program that is quite complex compared to what I have dealt with until now. Anyways at some point I am supposed to write a function that will manipulate a struct list. I'm trying to make this question as simple as possible so I wrote down a pretty simple piece of code just for reference.
Here is the thing: at first I call testf from another function providing it with a valid current as well as an i with a value of 0. This means that testf will call itself about 100 times before it starts accessing the rest of the code. This is when all the generated instances of testf will start getting resolved.
void testf(listnode *current, int *i) {
    listnode *current2;
    current2 = current;
    if (*i < 100) {
        *i = *i + 1;
        current2 = current2->next;
        testf(current2, i);
    }
    current = current->next;
    return;
}
If, let's say, I have enough connected list nodes at my disposal, is current = current->next; the correct way for the "last" testf function to access and edit the caller's current2 value (which is this function's current), or am I horribly wrong?
If I am, what is the way to make changes to the caller function's variables from inside the called function and be sure they won't go away as soon as the called function returns? I find it kind of hard to get a good grasp on how pointers work.
It is very likely that I have left out important information or that I haven't asked my question clearly enough. Please inform me if that is the case so I can edit in whatever you need.
Thanks in advance.
You can pass a pointer to a pointer into your function and dereference it to get the listnode pointer back. Here is how the code would look after that change (not tested for compilation):
void testf(listnode **current, int *i) { // accept a pointer to the listnode pointer
    listnode *current2;
    current2 = *current; // retrieve the pointer value by dereferencing
    if (*i < 100) {
        *i = *i + 1;
        current2 = current2->next;
        testf(&current2, i); // recursive call, passing the address of the pointer
    }
    *current = (*current)->next; /* advance the caller's pointer to the next node */
    return;
}
Here is a list of really good articles for learning pointers :
a) http://cslibrary.stanford.edu/102/PointersAndMemory.pdf
b) http://cslibrary.stanford.edu/103/LinkedListBasics.pdf
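The difference between the question's version and the pointer-to-pointer version can be reduced to a small sketch. The listnode layout and the names advanceCopy(), advanceInPlace() and advanceDemo() are assumptions for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout; the question does not show listnode's definition. */
typedef struct listnode {
    int data;
    struct listnode *next;
} listnode;

/* Receives a copy of the caller's pointer: the assignment is lost on return. */
void advanceCopy(listnode *current)
{
    current = current->next;
    (void)current;            /* silence "set but not used" warnings */
}

/* Receives the address of the caller's pointer and writes through it. */
void advanceInPlace(listnode **current)
{
    *current = (*current)->next;
}

/* Hypothetical demo: returns 1 if only the second call moved the cursor. */
int advanceDemo(void)
{
    listnode second = { 2, NULL };
    listnode first = { 1, &second };
    listnode *cursor = &first;

    advanceCopy(cursor);
    assert(cursor == &first);   /* caller's variable is unchanged */

    advanceInPlace(&cursor);
    return cursor == &second;   /* caller's variable now points at the next node */
}
```

The same reasoning applies at every level of a recursion: each call can only change the caller's pointer if it was handed that pointer's address.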

Linked List finding length - whats the difference between these two functions?

Is there any difference between these two functions? I mean in terms of the result returned?
int Length(struct node* head) {
    struct node* current = head;
    int count = 0;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    return count;
}
and this function
int Length(struct node* head) {
    int count = 0;
    while (head != NULL) {
        count++;
        head = head->next;
    }
    return count;
}
They are the same. One uses a local 'current' variable to iterate over the list, while the other one uses the same variable that was received through the function arguments.
The returned value will be the same.
The former is the kind of code that would be written by a programmer subscribing to the style rule that says "it is bad practice to modify parameters because they are passed by value and would give the reader a false sense that the function would modify the corresponding argument."
Not necessarily bad advice. It makes the code a little longer, though, but it reads better. Many readers look at the second and have an initial reaction of "wait, what? changing the head? Oh... okay, no it's safe...."
No difference. The second version simply uses the function argument itself as a working variable in the body, and that's perfectly legitimate. In principle it also avoids the extra copy that the first version makes, although an optimizing compiler will typically generate the same code for both.
You couldn't use the second version if the argument were declared const, i.e. int Length(struct node* const head) -- but since it isn't, you're free to use the argument variable for your own purposes.
