In the following code I try to print all paths from the root to leaves in a Binary Tree.
If I write a recursive function as follows :
void printPath(BinaryTreeNode * n, int path[],int pathlen)
{
//assume base case and initializations taken care of
path[pathlen++] = n->data;
printPath(n->left,path,pathlen);
printPath(n->right,path,pathlen);
}
(I have purposefully removed base cases and edge cases handling to improve readability)
What happens to the path array? Is it just one global copy that gets modified during each recursive call? Does the pathlen variable overwrite some of the path values, giving the feeling that each stack frame has its own local copy of path, since pathlen is local to each stack frame?
Passing around an int[] variable is really almost like passing an int* around. The first invocation of the recursive function passes the real int[], which is nothing more than an address in memory; that same address is used in every recursive call.
Basically, if you place a debug print, e.g.
printf("%p\n", (void *)path);
in your recursive function, you will see that the address is always the same; it is never changed or modified. The only thing that is pushed onto the stack frame during the invocation is the address of the array, which always remains the same.
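The same check can be done without reading addresses by eye. This is only a minimal sketch: same_address_at_every_depth is a hypothetical helper, not part of the question's code, and it simply compares the pointer each call receives with the one the first call received.

```c
#include <stdbool.h>

/* Hypothetical helper: recurses `depth` more levels and reports whether
   every call received the same address in `path` as the first call did. */
bool same_address_at_every_depth(int path[], const int *first, int depth)
{
    if (path != first)      /* would mean some call got a different array */
        return false;
    if (depth == 0)
        return true;
    /* Only the pointer value is copied into the next stack frame. */
    return same_address_at_every_depth(path, first, depth - 1);
}
```

Calling it as same_address_at_every_depth(buf, buf, 5) on any int array buf should return true, since no call ever receives a copy of the array itself.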
Welcome to array-pointer-decay. When you pass around an array, two distinct things happen:
When you declare a function to take an array parameter, you are really defining the function to take a pointer parameter. I.e., the declaration
void foo(int bar[]);
is perfectly equivalent to
void foo(int* bar);
The declaration of the array has decayed into a declaration of a pointer to its first element.
Whenever you use an array, it also decays into a pointer to its first element, so the code
int baz[7];
foo(baz);
is again perfectly equivalent to
int baz[7];
foo(&baz[0]);
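There are only a few exceptions where this array-pointer-decay does not happen: when the array is the operand of sizeof or of the unary & operator, as in sizeof(baz) and &baz (and when a string literal is used to initialize a char array).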
Together, these two effects create the illusion that arrays are passed by reference, even though C only ever passes a pointer by value.
The array-pointer-decay was invented to allow the definition of the array subscript operator in terms of pointer arithmetic: the expression baz[3] is defined to be equivalent to *(baz + 3), an expression which tries to add 3 to an array. Due to array-pointer-decay, baz decays into an int*, on which pointer arithmetic is defined, and the addition yields a pointer to the fourth element in the array. That pointer can then be dereferenced to get the value at baz[3].
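A small sketch of the equivalence and of the sizeof exception described above. The function names are made up for illustration; baz is the same seven-element array used in the answer.

```c
#include <stddef.h>

/* Returns 1 when the subscript form and the pointer-arithmetic form
   name the same element, which the language definition guarantees. */
int subscript_is_pointer_arithmetic(void)
{
    int baz[7] = {0, 10, 20, 30, 40, 50, 60};
    return &baz[3] == baz + 3 && baz[3] == *(baz + 3);
}

/* sizeof is one of the places where the array does not decay:
   the result is the size of the whole array, not of a pointer. */
size_t whole_array_size(void)
{
    int baz[7];
    return sizeof baz;      /* 7 * sizeof(int) */
}
```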
void printPath(BinaryTreeNode * n, int path[],int pathlen);
The compiler really looks at that like this:
void printPath(BinaryTreeNode * n, int *path, int pathlen);
what happens to the path array? Is it just one global copy that gets modified during each recursive call
Nothing. The same path gets passed around, since in C array passing is just a pointer copy operation; and no, it isn't a global copy, but a parameter passed to the first call of the function, and will almost always live on the stack.
and the pathlen variable overwrites some of the path values giving the feeling that each stack frame has its own local copy of path since pathlen is local to each stack frame?
Since you modify the value of the array elements and not the pointer pointing to the beginning of the array, nothing changes what path itself is pointing to (which is the same array all the time). Like you said, it may give that feeling (particularly if you're used to that construct in other languages), but in reality only the same path gets passed around.
Aside: You don't seem to handle the exit condition, and as it stands it'll be infinite recursion and most probably undefined behaviour when you start modifying elements that are out of the array's bounds.
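For illustration, here is one way the function could look with the base cases restored. It is a sketch under assumptions: the BinaryTreeNode layout is guessed from the question, path is assumed to be at least as long as the tree is high, and the int return value (the number of paths printed) is an addition that the original void function does not have.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed node layout; the question does not show it. */
typedef struct BinaryTreeNode {
    int data;
    struct BinaryTreeNode *left, *right;
} BinaryTreeNode;

/* Prints every root-to-leaf path and returns how many were printed. */
int printPath(const BinaryTreeNode *n, int path[], int pathlen)
{
    if (n == NULL)                      /* empty subtree: nothing to print */
        return 0;
    /* pathlen is local to this frame, so this slot is simply overwritten
       when a sibling subtree is visited later at the same depth. */
    path[pathlen++] = n->data;
    if (n->left == NULL && n->right == NULL) {
        for (int i = 0; i < pathlen; i++)
            printf("%d ", path[i]);
        printf("\n");
        return 1;
    }
    return printPath(n->left, path, pathlen)
         + printPath(n->right, path, pathlen);
}
```

There is one array throughout; each frame only has its own pathlen, which is why stale entries past the current depth never show up in the output.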
Related
I understand the premise of pointers, but I find it very annoying, and I don't get why it's considered useful.
I've learned about pointers, and the next thing I know, I start seeing bubbles, asterisks, and ampersands everywhere.
#include <stdio.h>
int main () {
int *ptr, q;
q = 50;
ptr = &q;
printf("%d", *ptr);
return 0;
}
why is this important or useful?
First, parameters passed to a function can only be primitives (int, char, long, ...), structs or pointers. So if you need to pass a more complex element like an array (including strings) or a function, you have to pass a reference to this element.
The second thing that I can quickly think of is: parameters are always passed by "value". This means the called function only gets a copy of your variable. So, modifications will only affect the copy; the original variable will remain unchanged.
If you pass a variable by "reference" with a pointer, the pointer itself is still only a copy, but as it refers to the original variable, any modification to the pointed-to element will also affect the variable in the calling function.
In other words, if you want to create a function that can alter a variable, you have to pass it a pointer to that variable to achieve this.
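A minimal sketch of that difference, with two hypothetical functions: one receives a copy of the value, the other a pointer to the caller's variable.

```c
/* Works on a copy: the caller's variable is not affected. */
int increment_copy(int x)
{
    x++;
    return x;
}

/* Works through a pointer: the caller's variable is modified. */
void increment_through_pointer(int *x)
{
    (*x)++;
}
```

With int q = 50, calling increment_copy(q) returns 51 but leaves q at 50, while increment_through_pointer(&q) changes q itself to 51.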
I am dusting off my C skills working on some C libraries of mine. After having put together a first working implementation I am now going over the code to make it more efficient. Currently I am on the topic of passing function parameters by reference or value.
My question is, why would I ever pass any function parameter by value in C? The code might look cleaner, but wouldn't it always be less efficient than passing by reference?
Because it's not as important to code for the computer as it is to code for the next human being. If you are passing references around, then any reader must assume that any called function could change the value of its arguments, and would be obliged to check that or copy the argument before calling.
Your function signature is a contract. It divides your code up so that you don't have to fit the entire code base into your head in order to comprehend what is going on in some area. By passing references you are making the next guy's life worse. Your biggest job as a programmer should be making the next guy's life better, because the next guy will probably be you.
In C, all arguments are passed by value. A true pass by reference is when you see the effect of a modification without any explicit indirection at all:
void f(int c, int *p) {
c++; // in C you can't change the original argument passed like this
p++; // or this
}
Using values instead of pointers though, is frequently desirable:
int sum(int a, int b) {
return a + b;
}
You would not write this like:
int sum(int *a, int *b) {
return *a + *b;
}
Because it is not safe and it is inefficient. Inefficient because there is an additional indirection. Moreover, in C, a pointer argument suggests to the caller that the value will be modified through the pointer (especially true when the pointed-to type has a size less than or equal to that of the pointer itself).
Please refer to Passing by reference in C. Pass by reference is a misnomer in C. It refers to passing the address of a variable instead of the variable, but you are passing a pointer to the variable by value.
That said, if you were to pass the variable as a pointer, then yes, it can be marginally more efficient (mostly for large objects such as big structs), but the main reason is to be able to modify the original variable it points to. If you don't want to be able to do this, it is recommended you take it by value to make your intent clear.
Of course, all this is moot in terms of one of C's heavier data structures. Arrays are passed by a pointer to their first element whether you like it or not.
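When a large object has to be passed without copying and without inviting modification, a pointer to const is a common middle ground. The sketch below uses a made-up struct Samples to contrast the two signatures; whether the pointer version is actually faster depends on the compiler and platform.

```c
#include <stddef.h>

#define SAMPLE_COUNT 1024

/* Hypothetical large type, used only for this illustration. */
struct Samples {
    double values[SAMPLE_COUNT];
};

/* By value: the whole struct is copied into the callee. */
double mean_by_value(struct Samples s)
{
    double sum = 0.0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        sum += s.values[i];
    return sum / SAMPLE_COUNT;
}

/* By pointer to const: only an address is copied, and the const tells
   the reader that the callee will not modify the caller's data. */
double mean_by_pointer(const struct Samples *s)
{
    double sum = 0.0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
        sum += s->values[i];
    return sum / SAMPLE_COUNT;
}
```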
Two reasons:
Often times you will have to dereference the pointer you've passed in many times (think a long for-loop). You don't want to dereference every single time you want to look up the value at that address. Direct access is faster.
Sometimes you want to modify the passed-in value inside your function, but not in the caller. Example:
void foo( int count ){
while (count>0){
printf("%d\n",count);
count--;
}
}
If you wanted to do the above with something passed by reference, you would have to create yet another variable inside your function to store it first.
I have a doubt regarding passing of arrays to a function.
Consider the following code snippet.
#include <stdio.h>
void fun1(int a1[]);
int main(void)
{
int a[4]={10,20,30,40};
fun1(a);
return 0;
}
void fun1(int a1[])
{
for(int i=0;i<4;i++)
{
printf("%d\n",a1[i]);
}
}
Passing an array is nothing but passing the address of the first location.
And I should pass the above array with its name(starting address of the array).
My doubt is: since a[4] is an auto variable, it should die when control leaves the main function, and this should give unexpected results (the pointer should be dangling).
But it is working fine.
I am very confused by this; can you please clear it up?
Even if we pass a single element int a as f(&a), it should not exist in the function f if it is declared as automatic (a local variable in the main function).
Please clear this up as well.
Yes, variable a will be out of scope when main() terminates.
But when fun1 is executing, main() has not reached termination yet, so the content of a is still perfectly valid.
What you are doing is fine. The array a does indeed go out of scope but by that point your function has finished so you don't have to worry about accessing data that is no longer around. If you have concerns about passing the variable as the array name (which is fine) you can always step through your code to ensure you are accessing the data you think that you are.
You could also make your function safer by passing an additional integer argument that specifies the size of your array rather than having it hard coded as 4. If you used the function you have and passed an integer array of length less than 4, it will be accessing out of bounds memory.
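A sketch of that suggestion, assuming a hypothetical print_ints that takes the element count as a second parameter. Returning the number of elements printed is an addition for illustration only.

```c
#include <stdio.h>
#include <stddef.h>

/* Prints `count` elements of a1 and returns how many were printed.
   The function cannot find the length on its own, because a1 is only
   a pointer to the first element. */
size_t print_ints(const int a1[], size_t count)
{
    for (size_t i = 0; i < count; i++)
        printf("%d\n", a1[i]);
    return count;
}
```

In main, where a is still a real array, the count can be computed as sizeof a / sizeof a[0] and passed along: print_ints(a, sizeof a / sizeof a[0]). That expression does not work inside the callee, where only the pointer is visible.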
void fun1(int a1[]) does not create a copy of the array coming into the function; it only receives a copy of the pointer to the array's first element. The array itself stays in main, and it still exists for as long as fun1 runs.
You can also envision it as a stack. A stack frame will be created for main(). And since fun1() is called from main(), the stack frame of main() will be destroyed only after the stack frame for fun1() is destroyed.
int a[] in a function declaration/definition is equivalent to int *a, so nothing bad will happen and no memory will be freed implicitly.
I am a bit confused about the behaviour of a C program from another programmer I am now working with. What I cannot understand is the following:
1) a variable is defined this way
typedef float (array3d_i)[3];
array3d_i d_i[NMAX];
2) once some values are assigned to all the d_i's, a function is called which is like this:
void calc(elem3d_i d_element);
which is called from main using:
calc(d_i[i]);
in a loop.
When the d_i's are initialized in main, each element gets an address in memory, I guess on the stack or somewhere else. When we call the function "calc", I would expect that inside the function a copy of the variable is created, at another address. But I debugged the program, and I can see that inside the function "calc", the variable "d_element" gets the same address as d_i in main.
Is it normal or not?
I am even more confused because later there is a call to another function in a very similar situation, except that now the variables are of type float (an array of them is also initialized), and inside that function the variables are given a different address than the one in main.
How can this be? Why the difference? Is the code or the debugger doing something weird?
Thanks
Arrays are passed by reference in C, while simple values will be passed by value. Or, rather, arrays are also passed by value, but the "value" of an array in this context is a reference to its first element. This is the "decay" Charles refers to in his comment.
By "reference", I mean pointer of course since C doesn't have references like C++ does.
C does not have a higher-level array concept, which is also why you can't compute the length of the array in the called function.
It's the difference between pointers and variables. When you pass an array to a function you are passing a pointer (by value). When you pass a float, you are passing a float (by value). It's all pass by value, but with the array the value is the address of its first element.
Note that "passing arrays" is the same as passing pointers and that, in C all parameters are passed by value.
What you see in the different functions is the pointer value. The pointer itself (the parameter received by the function) is a different one in each function (you can check its address)
Imagine
int a = 42, b = 42, c = 42;
If you look at a, b, or c in the debugger you always see 42 (the value), but they're different variables.
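The same point, sketched for the array case. param_is_separate_object is a hypothetical function: it receives the array the usual way, plus a pointer to the whole array so that the two addresses can be compared.

```c
/* Returns 1 when the parameter holds the address of the caller's array
   (same value) while being a separate variable (different address). */
int param_is_separate_object(float d_element[3], float (*original)[3])
{
    /* The parameter's value is the address of the caller's first element. */
    int same_target = (d_element == *original);
    /* The parameter itself is a local pointer variable in this frame. */
    int distinct_variable = ((void *)&d_element != (void *)original);
    return same_target && distinct_variable;
}
```

A debugger that shows d_element displays the pointer's value, which is why it matches the address seen in main.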
As others have noted, everything is passed by value. However, that can be misleading for arrays. While similar to pointers, arrays are not pointers. Pointers are variables which hold addresses. Arrays are blocks of memory at a particular address. For arrays, there is no separate variable which holds the address, like a pointer. When you pass an array as an argument to a function, or get its address by assigning the name (without the [] indexing) to a pointer, then you do have its address contained in a variable. So, what is "passed by value" is a pointer, not an array, even though you called the function with an array as an argument. So the following are equivalent:
void func1(char *const arg);
void func2(char arg[]);
I know C pretty well, however I'm confused about how temporary storage works.
For example, when a function returns, everything allocated inside that function is freed (from the stack or however the implementation decides to do this).
For example:
void f() {
int a = 5;
} // a's value doesn't exist anymore
However we can use the return keyword to transfer some data to the outside world:
int f() {
int a = 5;
return a;
} // a's value exists because it's transferred to the outside world
Please stop me if any of this is wrong.
Now here's the weird thing, when you do this with arrays, it doesn't work.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
I know arrays are only accessible by pointers, and you can't pass arrays around like another data structure without using pointers. Is this the reason you can't return arrays and use them in the outside world? Because they're only accessible by pointers?
I know I could be using dynamic allocation to keep the data to the outside world, but my question is about temporary allocation.
Thanks!
When you return something, its value is copied. a does not exist outside the function in your second example; its value does. (It exists as an rvalue.)
In your last example, you implicitly convert the array a to an int*, and that copy is returned. a's lifetime ends, and you're pointing at garbage.
No automatic variable outlives the block it is declared in, ever.
In the first example the data is copied and returned to the calling function. The second, however, returns a pointer, so the pointer is copied and returned, but the data that it points to is cleaned up.
In implementations of C I use (primarily for embedded 8/16-bit microcontrollers), space is allocated for the return value in the stack when the function is called.
Before calling the function, assume the stack is this (the lines could represent various lengths, but are all fixed):
[whatever]
...
When the routine is called (e.g. sometype myFunc(arg1,arg2)), C throws the parameters for the function (arguments and space for the return value, which are all of fixed length) on to the stack, followed by the return address to continue code execution from, and possibly backs up some processor registers.
[myFunc local variables...]
[return address after myFunc is done]
[myFunc argument 1]
[myFunc argument 2]
[myFunc return value]
[whatever]
...
By the time the function fully completes and returns to the code it was called from, all of its variables have been deallocated off the stack (they might still be there in theory, but there is no guarantee).
In any case, in order to return the array, you would need to allocate space for it somewhere else, then return the address to the 0th element.
Some compilers will store return values in processor registers rather than using the stack. I have only seen it on some AVR compilers myself, but it is common elsewhere: the usual x86-64 and ARM calling conventions return small values in registers.
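Two common ways to "allocate space somewhere else", sketched with hypothetical function names: either the caller supplies the storage, or the function allocates it on the heap and the caller frees it.

```c
#include <stdlib.h>
#include <stddef.h>

/* Option 1: the caller owns the storage, so nothing dangles on return. */
void fill_squares(int out[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)(i * i);
}

/* Option 2: heap storage outlives the function call.
   Returns NULL on allocation failure; the caller must free() the result. */
int *make_squares(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    fill_squares(a, n);
    return a;       /* address of the 0th element, still valid after return */
}
```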
When you attempt to return a locally allocated array like that, the calling function gets a pointer to where the array used to live on the stack. This can make for some spectacularly gruesome crashes when, later on, something else writes to the array and clobbers a stack frame, which may not manifest itself until much later if the corrupted frame is deep in the calling sequence. The maddening thing about debugging this type of error is that the real error (returning a local array) can make some other, absolutely perfect function blow up.
You still return a memory address, and you can try to check its value, but the contents it points to are not valid beyond the scope of the function, so don't confuse value with reference.
int []f() {
int a[1] = {5};
return a;
} // a's value doesn't exist. WHY?
First, the compiler wouldn't know what size of array to return. I just got syntax errors when I used your code, but with a typedef I was able to get an error that said that functions can't return arrays, which I knew.
typedef int ia[1];
ia h(void) {
ia a = {5};
return a;
}
Secondly, you can't do that anyway. You also can't do
int a[1] = {4};
int b[1];
b = a; // Error: here both a and b are interpreted as pointer literals or pointer arithmetic, and an array cannot be assigned to
While you don't write it out like that, and the compiler really wouldn't even have to generate any code for it, this operation would have to happen semantically for this to be possible, so that a new variable name could be used to refer to the value that was returned by the function. If you enclosed it in a struct then the compiler would be just fine with copying the data.
Also, outside of the declaration and sizeof expressions (and possibly typeof operations if the compiler has that extension), whenever an array name appears in code it is thought of by the compiler as either a pointer literal or as a chunk of pointer arithmetic that results in a pointer. This means that the return statement would end up looking like you were returning the wrong type: a pointer rather than an array.
If you want to know why this can't be done -- it just can't. A compiler could implicitly think about the array as though it were in a struct and make it happen, but that's just not how the C standard says it is to be done.
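The struct workaround mentioned above can be sketched like this. IntArray1 and make_wrapped are made-up names; the point is only that a struct containing an array is copied as a whole on return and on assignment.

```c
/* Hypothetical wrapper type: a one-element array inside a struct. */
struct IntArray1 {
    int values[1];
};

/* Returning the struct copies the array it contains, so the caller
   receives its own copy rather than a pointer into a dead stack frame. */
struct IntArray1 make_wrapped(void)
{
    struct IntArray1 a = {{5}};
    return a;
}
```

Assignment works the same way: after struct IntArray1 b = make_wrapped(); a further struct IntArray1 c = b; copies the element, which plain arrays do not allow.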