I wonder why we can pass a structure to a C function by value, but can never do the same with an array (which is passed by address).
When I was learning C, I was told that arrays consume a lot of stack, so passing them by value is not preferred.
But it seems that structures are often (if not always) larger than arrays and are a more complex data structure, so this explanation makes no sense to me now!
Can anybody help, with as much detail as possible?
In C, an array used as a function argument is converted ("decays") to a pointer to its first element, so when you pass an array to a function you are really passing its memory address, hence a reference to it.
When you define a variable of struct type, you allocate all the memory needed to contain the struct, and if you write something like:
struct s a, b;   /* struct s is any previously defined struct type */
...
a = b;
you are copying all the values from b to a. In the same way, when you pass it to a function, the values of the original struct are copied (typically onto the stack). That's called passing a parameter by value.
What you state in your question is true. A struct may be more complex than an array, and passing it by value may be inefficient, but it is perfectly possible. The reason you can't pass an array by value is that the language converts an array argument to a pointer to its first element.
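A small sketch of the difference described above. The names struct wrapped, bump_array and bump_wrapped are made up for illustration; wrapping an array in a struct is one common way to get a by-value copy of it.

```c
/* Hypothetical type: an array wrapped in a struct so it can be copied. */
struct wrapped {
    int data[4];
};

/* The parameter is really `int *arr`, so the caller's array is modified. */
void bump_array(int arr[4])
{
    arr[0] += 1;
}

/* The whole struct, including its array, is copied into `w`.
   The caller's struct is untouched; the changed copy's value is returned. */
int bump_wrapped(struct wrapped w)
{
    w.data[0] += 1;
    return w.data[0];
}
```

After bump_array(a) the caller sees the change in a[0]; after bump_wrapped(w) the caller's w.data[0] still holds its old value.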
I feel like this is the final piece in me understanding pointers. "Why are pointers important?"
(I mean, I'm sure there's a lot of reasons, but is this not the biggest?)
For a while I've understood that int num = 5; was done because num is a way for us to refer to the value stored at whatever memory address we put 5 into. If I then do num = 10; it updates that memory address to store 10 instead, and num still points to the value at that particular memory address. Am I right so far?
So I was confused why we wouldn't just do char str = "string", or the same for other objects. Is it because what we're trying to store cannot be stored in one memory block, unlike int and other primitives?
We do it because we need multiple memory blocks, and pointers effectively give us a reference to where it starts and then we can go as far as we need to collect all the data needed for the object?
Is the importance of pointers due to the fact that non-primitive data requires multiple memory blocks? We need multiple memory blocks, and pointers effectively give us a reference to where it starts and then we can go as far as we need to collect all the data needed for the object?
No. What you are describing is the importance of arrays.
So what is the importance of pointers?
Suppose you have some data:
12
What can you do with that data? You can store it in a storage location and fetch it from that location later when you need it.
int height = 12;
You can pass it as a value to a method:
doit(12);
You can compare it for equality with other values:
if (height == 13)
and a few other things as well, but these are some of the big ones.
Well, the same thing is true of pointers. Pointers are values, so they can be stored, fetched, passed and compared. But any addressable storage location can be turned into a pointer. So this is the power of pointers in C: they allow you to treat storage locations like any other data.
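To make "stored, fetched, passed and compared" concrete, here is a minimal sketch. The function names set_height and same_location are invented for this example.

```c
#include <stdbool.h>

/* Pass a storage location to a function: write to whatever `where` points at. */
void set_height(int *where, int value)
{
    *where = value;
}

/* Compare two storage locations for identity, not their contents. */
bool same_location(const int *p, const int *q)
{
    return p == q;
}
```

With int height = 12, width = 12; and int *stored = &height;, the call set_height(stored, 13) changes height, and same_location(&height, &width) is false even though both variables started out holding 12.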
and num still points to the value at that particular memory address
No, because num is not a pointer. I'm counting beans here, but that's important to get your ideas right. When you define a variable num, the compiler assigns the name num to a memory address, which stays valid as long as the variable is in scope - for the whole program, if it's a global variable, or until your function returns, for local variables.
Objects can be any size, for example, a structure might consist of a lot of elements. And something like
char c[100] = "string";
is perfectly valid; there are no pointers involved (yet).
One of the reasons why you need pointers is when you call a function. Normally, all parameters to a function are called by value. So, if you have a function
void swap (int x, int y) {
    int temp=x;
    x=y;
    y=temp;
}
and call it
int a=3;
int b=5;
swap(a, b);
printf("%d %d\n", a, b);
you'll get output values 3 and 5 - the variables have, obviously, not been swapped. The reason for this is the compiler creates copies of the variables and passes the copies to swap.
Now, if you want to tell the compiler "I don't want copies, I want swap to change the memory locations that I named a and b", you need pointers:
void swap (int *x, int *y) {
    int temp=*x;
    *x=*y;
    *y=temp;
}
int a=3;
int b=5;
swap(&a, &b);
printf("%d %d\n", a, b);
The & operator tells the compiler "I want the memory address that I named a, not the value that I wrote into that memory address". Within the swap function, the * means "x is not the value that I want, it's the address of the memory I really want to change - the variable that belongs to the function that called me". Other programming languages like Pascal call this "call by reference", as opposed to the "call by value" that C uses.
Now your language has pointers. Let's reconsider strings, and arrays.
If I call a function
myfunction("Thisisaverylongstringthatjustwontendnoway....")
the call by value has to copy a lot of bytes from the caller to the callee, which is inefficient. So, one of the design decisions in C was:
Whenever a function calls another, passing an array as a parameter, the array is not passed by value; instead, we automatically pass a pointer to the start of the array, making it a call by reference.
So, in function calls, arrays (and strings are just a special case of an array) are always passed as a pointer. Which is why, in C, you have to learn about pointers quite soon; you can get along without them much longer in, say, Pascal. Or Java (where, behind the scenes, you work with pointers all the time when you're working with Objects, but Java hides that from you).
And this is why pointers are, very often, introduced shortly after arrays and strings, when you learn C. As soon as you know what a function is, and what an array, or a string, is, you need to know about pointers or you won't get functions and arrays together.
Your last sentence about "multiple memory blocks" isn't correct in the sense that anything that's larger than a few bytes needs a pointer - structures do not - but, in most cases, your program will just be faster with pointers. If you have two strings, each 100 bytes long, and want to swap them, you'd need to do a lot of copying stuff around. Just swapping the pointers is much faster (typically, a pointer needs 4-8 bytes), so as soon as you're dealing with larger objects, you want to deal with pointers pointing to them, for efficiency reasons.
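The string-swapping point can be sketched like this. swap_strings is a made-up name; it is the same pattern as the int swap above, one level of indirection higher.

```c
/* Swap two string pointers; the characters themselves are never copied. */
void swap_strings(const char **x, const char **y)
{
    const char *temp = *x;
    *x = *y;
    *y = temp;
}
```

Called as swap_strings(&first, &second), this moves only two pointer-sized values, no matter how long the strings are.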
Now I haven't even begun with dynamic memory allocation, but I guess my answer is large enough already.
I have a function written in C
FindBeginKey(KeyListTraverser, BeginPage, BeginKey, key1);
BeginKey is a pointer before function invoking, and I didn't initiate it, like
BeginKey = NULL;
In the FindBeginKey() function, I assign BeginKey to another pointer and print the current address of BeginKey inside the function; that works correctly.
But when the code returns from the function and I print the address of BeginKey again, it shows 0x0.
Why does this happen, and if I want to preserve the address assigned in the function, what should I do?
To pass a value out of a function you have to pass by reference rather than by value, which is the default for C functions. To do this, make the parameter a pointer to the type you want to pass out. Then pass the variable into the call with the & (address-of) operator.
e.g.
void FindFoo(FOO **BeginKey);
and call it:
FindFoo(&BeginKey);
and in the function:
*BeginKey = (FOO *)0xDEADC0DE;  /* placeholder address, for illustration only */
From what I understand, you are calling the function like:
FindBeginKey(KeyListTraverser, BeginPage, BeginKey, key1);
However, when you try to write at the BeginKey address, you're basically passing in a pointer to 0x00. Rather, you need to pass a pointer to BeginKey.
FindBeginKey(KeyListTraverser, BeginPage, &BeginKey, key1);
If this isn't what you meant, it would certainly help if you posted a code sample.
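Since the question does not show FindBeginKey itself, here is a minimal sketch of the two situations using invented names (Key, find_by_value, find_by_address). It only illustrates why the caller's pointer stays NULL in one case and not in the other.

```c
#include <stddef.h>

/* Hypothetical key type; the question does not show the real one. */
typedef struct Key {
    int id;
} Key;

/* Pointer passed by value: only the local copy `begin` is changed. */
void find_by_value(Key *keys, Key *begin)
{
    begin = &keys[0];
    (void)begin;    /* silence "set but not used" warnings */
}

/* Pointer to the caller's pointer: the assignment survives the return. */
void find_by_address(Key *keys, Key **begin)
{
    *begin = &keys[0];
}
```

With Key *begin_key = NULL;, calling find_by_value(keys, begin_key) leaves begin_key at NULL, while find_by_address(keys, &begin_key) makes it point at keys[0].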
If you want to modify a parameter in a subroutine, you should pass a pointer to the thing you want to modify.
void subroutine(int *x) {
    *x = 5; // modifies the variable that x points to
    x = 5;  // INVALID! x is a pointer, not an integer
}
I don't know what all the C parameter passing rules are now, so this answer might be a little dated. From common practice in building applications and libraries that those applications called, the return from a C function would contain status, so the caller of the function could make a decision depending on the status code.
If you wanted the function to modify its input parameters, you would pass those parameters by reference, as &my_val, where my_val is declared as int my_val;. The function then receives an int * and must dereference that pointer with * to get at the value.
Also, for performance reasons, passing an address (by reference) might be preferable, so that your application does not bother copying the parameter's value into a local variable. That prolog code is generated by the compiler. Single parameters such as char, int, and so on are fairly straightforward.
I am so used to C++ that passing by reference in C++ does not require dereferencing. The compiler's code takes care of that for you.
However, think about passing a pointer to a structure.
struct my_struct
{
int iType;
char szName[100];
} struct1;
struct my_struct *pStruct1 = &struct1;
If the structure contains lookup data that is filled in once on initialization and then referenced throughout your program, then pass a pointer to the structure by value pStruct1. If you are writing a function to fill that structure or alter already present data, then pass a pointer to the structure by value. You still get to alter what the structure pointer points to.
If on the other hand you are writing a function to assign memory to the pointer, then pass the address of the pointer (a pointer to the pointer) &pStruct1, so you will get your pointer pointing to the right memory.
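A sketch of both cases, reusing struct my_struct from above. The function names fill_struct and alloc_struct and the 0 / -1 status convention are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

struct my_struct {
    int iType;
    char szName[100];
};

/* Fills an existing structure: a plain pointer, passed by value, is enough. */
void fill_struct(struct my_struct *p, int type, const char *name)
{
    p->iType = type;
    strncpy(p->szName, name, sizeof p->szName - 1);
    p->szName[sizeof p->szName - 1] = '\0';   /* always terminate */
}

/* Allocates a structure and stores its address in the caller's pointer.
   Returns 0 on success, -1 if the allocation fails. */
int alloc_struct(struct my_struct **pp)
{
    *pp = calloc(1, sizeof **pp);
    return *pp != NULL ? 0 : -1;
}
```

A caller would write struct my_struct *p = NULL; alloc_struct(&p); fill_struct(p, 7, "lookup"); and later free(p);, checking the status code in between.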
I am dusting off my C skills working on some C libraries of mine. After having put together a first working implementation I am now going over the code to make it more efficient. Currently I am on the topic of passing function parameters by reference or value.
My question is, why would I ever pass any function parameter by value in C? The code might look cleaner, but wouldn't it always be less efficient than passing by reference?
Because it's not as important to code for the computer as it is to code for the next human being. If you are passing references around then any reader must assume that any called function could change the value of his parameters and would be obligated to check it or copy the parameter before calling.
Your function signature is a contract. It divides your code up so that you don't have to fit the entire code base into your head in order to comprehend what is going on in some area. By passing references you are making the next guy's life worse. Your biggest job as a programmer should be making the next guy's life better, because the next guy will probably be you.
In C, all arguments are passed by value. A true pass by reference is when you see the effect of a modification without any explicit indirection at all:
void f(int c, int *p) {
    c++; // in C you can't change the original parameter passed like this
    p++; // or this
}
Using values instead of pointers though, is frequently desirable:
int sum(int a, int b) {
    return a + b;
}
You would not write this like:
int sum(int *a, int *b) {
    return *a + *b;
}
Because it is not safe and it is inefficient. Inefficient because there is an additional indirection. Moreover, in C, a pointer argument suggests to the caller that the value will be modified through the pointer (especially true when the pointed-to type has a size less than or equal to that of the pointer itself).
Please refer to Passing by reference in C. Pass by reference is a misnomer in C. It refers to passing the address of a variable instead of the variable, but you are passing a pointer to the variable by value.
That said, if you were to pass the variable as a pointer, then yes it would be marginally more efficient, but the main reason is to be able to modify the original variable it points to. If you don't want to be able to do this, it is recommended you take it by value to make your intent clear.
Of course, all this is moot for one of C's heavier data structures. Arrays are passed as a pointer to their first element whether you like it or not.
Two reasons:
Often you will have to dereference the pointer you've passed in many times (think of a long for-loop). You don't want to dereference every single time you want to look up the value at that address. Direct access is faster.
Sometimes you want to modify the passed-in value inside your function, but not in the caller. Example:
void foo( int count ){
    while (count>0){
        printf("%d\n",count);
        count--;
    }
}
If you wanted to do the above with something passed by reference, you would have to create yet another variable inside your function to store it first.
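The two variants side by side, as a sketch. The functions sum_down and sum_down_ptr are invented for this comparison and add up the numbers instead of printing them, so the result is easy to check.

```c
/* By value: `count` is already a private copy, so it can be consumed. */
int sum_down(int count)
{
    int total = 0;
    while (count > 0) {
        total += count;
        count--;
    }
    return total;
}

/* By pointer: a local copy is needed to leave the caller's variable alone. */
int sum_down_ptr(const int *count)
{
    int remaining = *count;   /* the extra variable mentioned above */
    int total = 0;
    while (remaining > 0) {
        total += remaining;
        remaining--;
    }
    return total;
}
```

Both return the same result and neither changes the caller's variable, but the pointer version needs the extra local and one dereference to get there.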
Why would you do this:
void f(Struct** struct)
{
...
}
If I wish to operate on a list of structs, is it not enough to pass in a Struct*? This way I can do struct++ to address the next struct or am I very confused here? :)
Wouldn't it only be useful if I want to rearrange the list of structs in some way? However if I'm just reading I don't see the point.
It depends on what your data structure looks like. Assuming that p is a null-terminated array of pointers to struct s, you can run through it using a loop like this:
void f(struct s **p)
{
    while (*p != NULL) {
        /* some stuff with *p */
        p++;    /* advance to the next pointer in the array */
    }
}
Generally, using a pointer to a pointer is useful only if you attempt to modify the pointer itself.
If you want to modify the pointer in caller which was passed to this function, you'd typically do this.
Because everything is passed by value in C, passing struct * will only pass a copy of the pointer and won't modify the pointer in the caller. Why passing struct * is not enough is explained in this C-FAQ.
If you don't intend to modify the pointer in the caller, it's not necessary to pass struct **.
There are a number of uses for this kind of parameter...
One already mentioned, and quite common is to allow the caller to use the function to modify a pointer. The obvious case here would be when getting some blob of data...
void getData( void** pData, int* size )
{
    *pData = getMyDataPointer();
    *size = getMyDataSize();
}
Another option is that perhaps the extra level of indirection allows for the list to behave in some way? e.g. by using indices to refer to specific elements they can be allocated and reallocated without having the risk of dangling pointers.
Yet another option is that the list is very large and lives in fragmented memory, or is rapidly accessed so that the list is actually several smaller lists grouped together. This sort of technique can also be used to 'lazily' allocate huge arrays, e.g. providing an interface to an array of a billion elements, but then allocating chunks of 100k on demand as they are read/written with struct** pointing at the whole thing, and each struct* being either null or pointing to 100k structs...
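A rough sketch of the lazily allocated variant, with a much smaller chunk size so it is easy to try. Everything here (struct cell, struct lazy_array, lazy_init, lazy_at, lazy_free, CHUNK_SIZE) is hypothetical and only meant to show how a struct ** can stand for "a table of chunks that may or may not exist yet".

```c
#include <stdlib.h>

#define CHUNK_SIZE 1000u   /* elements per chunk; arbitrary for this sketch */

struct cell { int value; };

/* Hypothetical container: a table of chunk pointers, all NULL at first. */
struct lazy_array {
    struct cell **chunks;
    size_t n_chunks;
};

/* Sets up an empty table. Returns 0 on success, -1 on allocation failure. */
int lazy_init(struct lazy_array *a, size_t n_elements)
{
    a->n_chunks = (n_elements + CHUNK_SIZE - 1) / CHUNK_SIZE;
    a->chunks = malloc(a->n_chunks * sizeof *a->chunks);
    if (a->chunks == NULL) {
        a->n_chunks = 0;
        return -1;
    }
    for (size_t i = 0; i < a->n_chunks; i++) {
        a->chunks[i] = NULL;          /* nothing allocated yet */
    }
    return 0;
}

/* Returns the element at `index`, allocating its chunk on first use.
   Returns NULL if the index is out of range or the allocation fails. */
struct cell *lazy_at(struct lazy_array *a, size_t index)
{
    size_t c = index / CHUNK_SIZE;
    if (c >= a->n_chunks) {
        return NULL;
    }
    if (a->chunks[c] == NULL) {
        a->chunks[c] = calloc(CHUNK_SIZE, sizeof **a->chunks);
        if (a->chunks[c] == NULL) {
            return NULL;
        }
    }
    return &a->chunks[c][index % CHUNK_SIZE];
}

/* Releases every allocated chunk and the table itself. */
void lazy_free(struct lazy_array *a)
{
    for (size_t i = 0; i < a->n_chunks; i++) {
        free(a->chunks[i]);
    }
    free(a->chunks);
    a->chunks = NULL;
    a->n_chunks = 0;
}
```

Only the chunks that are actually touched through lazy_at get allocated; the rest of the table stays at NULL.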
To be honest the context is quite important... there are also plenty of uses for triple pointers as function parameters that follow similar reasoning (e.g. combine the first thing I mention with the second, or the second with the third, etc.).
You are correct, there is no reason to pass a pointer to pointer unless your function is intended to modify the pointer passed in. In case of accessing an array of structs, a single level of indirection is definitely sufficient.
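For comparison, a small sketch of both cases. struct item, total_value and skip_first are names invented for this example.

```c
#include <stddef.h>

struct item { int value; };

/* Reading a list: a single pointer plus a length is sufficient. */
int total_value(const struct item *items, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += items[i].value;
    }
    return total;
}

/* Changing the caller's pointer: this is where the second level is needed. */
void skip_first(const struct item **items)
{
    (*items)++;
}
```

total_value can walk the whole array without ever touching the caller's pointer, whereas skip_first exists only to move that pointer, which is why it takes its address.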
The creator of the API probably thought that the argument list would be easier to memorize if the first argument of every function is the same.
I am a bit confused about the behaviour of a C program from another programmer I am working now with. What I can not understand is the following:
1) a variable is defined this way
typedef float (array3d_i)[3];
array3d_i d_i[NMAX];
2) once some values are assigned to all the d_i's, a function is called which is like this:
void calc(elem3d_i d_element);
which is called from main using:
calc(d_i[i]);
in a loop.
When the d_i's are initialized in main, each element gets an address in memory, I guess on the stack or somewhere else. When we call the function "calc", I would expect that inside the function a copy of the variable is created at another address. But I debugged the program, and I can see that inside the function "calc", the variable "d_element" gets the same address as d_i in main.
Is it normal or not?
I am even more confused because later there is a call to another function, a very similar situation except that now the variables are of type float and an array of them is also initialized, and inside the function the variables are given a different address than the one in main.
How can this be? Why the difference? Is the code or the debugger doing something weird?
Thanks
Arrays are passed by reference in C, while simple values will be passed by value. Or, rather, arrays are also passed by value, but the "value" of an array in this context is a reference to its first element. This is the "decay" Charles refers to in his comment.
By "reference", I mean pointer of course since C doesn't have references like C++ does.
C does not have a higher-level array concept, which is also why you can't compute the length of the array in the called function.
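A short sketch of the length problem. ARRAY_LEN and sum are names chosen for this example; the macro only works where the argument is a real array, not a parameter.

```c
#include <stddef.h>

/* Number of elements; only valid where `a` is a real array, not a parameter. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Inside the function `values` is just a pointer, so the length is passed in. */
float sum(const float values[], size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return total;
}
```

The caller, which still sees the array type, computes the length and hands it over: sum(v, ARRAY_LEN(v)).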
It's the difference between pointers and variables. When you pass an array to a function you are passing a pointer (by value). When you pass a float, you are passing a float (by value). It's all pass by value, but with the array the value is the address of the array's first element.
Note that "passing arrays" is the same as passing pointers and that, in C all parameters are passed by value.
What you see in the different functions is the pointer value. The pointer itself (the parameter received by the function) is a different one in each function (you can check its address)
Imagine
int a = 42, b = 42, c = 42;
If you look at a, b, or c in the debugger you always see 42 (the value), but they're different variables.
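The debugger observation from the question can be reproduced with a sketch like the following. It reuses the array3d_i typedef from the question; the two function names are invented, and the functions report addresses instead of doing any calculation.

```c
typedef float array3d_i[3];

/* `d` is adjusted to `float *`; returns the address it points at. */
const float *seen_by_array_param(array3d_i d)
{
    return d;
}

/* `f` is a fresh copy; reports whether it lives at the caller's address. */
int float_param_is_same_object(float f, const float *callers)
{
    return &f == callers;
}
```

For an array d_i of array3d_i, seen_by_array_param(d_i[1]) returns the address of d_i[1][0] in the caller, while float_param_is_same_object(x, &x) is 0 because the parameter is a separate object holding a copy.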
As others have noted, everything is passed by value. However, that can be misleading for arrays. While similar to pointers, arrays are not pointers. Pointers are variables which hold addresses. Arrays are blocks of memory, at a particular address. For arrays, there is no separate variable which holds the address, like a pointer. When you pass an array as an argument to a function, or get its address by assigning the name (without the [] indexing) to a pointer, then you do have its address contained in a variable. So, what is "passed by value" is a pointer, not an array, even though you called the function with an array as an argument. So the following are equivalent:
void func1(char *const arg);
void func2(char arg[]);