Why would you do this:
void f(Struct** struct)
{
...
}
If I wish to operate on a list of structs, is it not enough to pass in a Struct*? This way I can do struct++ to address the next struct or am I very confused here? :)
Wouldn't it only be useful if I want to rearrange the list of structs in some way? However if I'm just reading I don't see the point.
It depends on what your data structure looks like. Assuming that p is a null-terminated array of pointers to struct s, you can run through it using a loop like this:
void f(struct s **p)
{
while (*p != NULL) {
/* some stuff */
p++; /* advance to the next pointer in the array */
}
}
Generally, a pointer to a pointer is useful only if you intend to modify the pointer itself.
You'd typically do this if you want to modify, in the caller, the pointer that was passed to the function.
Because everything is passed by value in C, passing a struct * passes only a copy of the pointer, so the function cannot modify the pointer in the caller. Why passing a struct * is not enough is explained in this C-FAQ.
If you don't intend to modify the pointer in the caller, it's not necessary to pass a struct **.
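To make the pass-by-value point concrete, here is a minimal sketch. The names struct node, retarget_copy, retarget and demo_retarget are invented for illustration and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>

struct node { int value; };

/* Two objects the caller's pointer might refer to (illustrative only). */
struct node first_node  = { 1 };
struct node second_node = { 2 };

/* Receives a copy of the caller's pointer: reassigning it has no effect outside. */
void retarget_copy(struct node *p)
{
    p = &second_node;   /* changes only the local copy */
    (void)p;
}

/* Receives the address of the caller's pointer: the caller's pointer is updated. */
void retarget(struct node **pp)
{
    assert(pp != NULL);
    *pp = &second_node;
}

/* Encodes the value seen after each call as (after copy) * 10 + (after retarget). */
int demo_retarget(void)
{
    struct node *cur = &first_node;
    retarget_copy(cur);          /* cur still points at first_node */
    int after_copy = cur->value;
    retarget(&cur);              /* cur now points at second_node */
    return after_copy * 10 + cur->value;
}
```

Only the second function can change where cur points, which is the one case where the extra level of indirection pays for itself.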
There are a number of uses for this kind of parameter...
One, already mentioned and quite common, is to let the caller use the function to modify a pointer. The obvious case here would be when getting some blob of data...
void getData( void** pData, int* size )
{
*pData = getMyDataPointer();
*size = getMyDataSize();
}
Another option is that perhaps the extra level of indirection allows for the list to behave in some way? e.g. by using indices to refer to specific elements they can be allocated and reallocated without having the risk of dangling pointers.
Yet another option is that the list is very large and lives in fragmented memory, or is rapidly accessed so that the list is actually several smaller lists grouped together. This sort of technique can also be used to 'lazily' allocate huge arrays, e.g. providing an interface to an array of a billion elements, but then allocating chunks of 100k on demand as they are read/written with struct** pointing at the whole thing, and each struct* being either null or pointing to 100k structs...
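The lazy allocation idea could look roughly like the sketch below. The chunk size, chunk count, struct item and the function names are all assumptions chosen to keep the example small; a real interface would pick its own sizes and error handling.

```c
#include <assert.h>
#include <stdlib.h>

struct item { int value; };

enum { CHUNK_LEN = 1024, CHUNK_COUNT = 64 };   /* illustrative sizes */

/* table[i] is either NULL or points to CHUNK_LEN zero-initialized items.
   Returns the address of element `index`, allocating its chunk on first use,
   or NULL if the allocation fails. */
struct item *lazy_get(struct item **table, size_t index)
{
    assert(table != NULL && index < (size_t)CHUNK_LEN * CHUNK_COUNT);
    size_t chunk = index / CHUNK_LEN;
    if (table[chunk] == NULL) {
        table[chunk] = calloc(CHUNK_LEN, sizeof **table);
        if (table[chunk] == NULL)
            return NULL;
    }
    return &table[chunk][index % CHUNK_LEN];
}

/* Releases every chunk that was allocated. */
void lazy_free(struct item **table)
{
    for (size_t i = 0; i < (size_t)CHUNK_COUNT; i++) {
        free(table[i]);
        table[i] = NULL;
    }
}

/* Touches one element and reports (chunks allocated) * 100 + (stored value). */
int demo_lazy(void)
{
    struct item *table[CHUNK_COUNT] = { NULL };
    struct item *it = lazy_get(table, 5000);   /* 5000 / 1024 selects chunk 4 */
    if (it == NULL)
        return -1;
    it->value = 7;
    int allocated = 0;
    for (size_t i = 0; i < (size_t)CHUNK_COUNT; i++)
        allocated += (table[i] != NULL);
    int result = allocated * 100 + lazy_get(table, 5000)->value;
    lazy_free(table);
    return result;
}
```

Only the chunk that is actually touched gets allocated; the other table entries stay NULL.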
To be honest, the context is quite important... there are also plenty of uses for triple pointers as function parameters that follow similar reasoning (e.g. combine the first thing I mention with the second, or the second with the third, etc.).
You are correct, there is no reason to pass a pointer to pointer unless your function is intended to modify the pointer passed in. In case of accessing an array of structs, a single level of indirection is definitely sufficient.
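For the read-only case, a single pointer plus an element count is enough. A minimal sketch, with struct point and sum_x as made-up names:

```c
#include <assert.h>
#include <stddef.h>

struct point { int x; int y; };

/* Read-only walk over `count` contiguous structs through a single pointer. */
int sum_x(const struct point *p, size_t count)
{
    assert(p != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++, p++)   /* p++ steps to the next struct */
        total += p->x;
    return total;
}

int demo_sum_x(void)
{
    struct point pts[] = { {1, 10}, {2, 20}, {3, 30} };
    return sum_x(pts, sizeof pts / sizeof pts[0]);
}
```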
The creator of the API probably thought that the argument list would be easier to memorize if the first argument of every function is the same.
I want to add something to the end of the array passed to the function.
Which is better, declaring a new larger array or using alloc ()?
1.
void array_append(int *block, size_t size)
{
int new_block[size + 2];
memcpy(new_block, block, size * sizeof *block); /* memcpy counts bytes, not elements */
(...append)
}
2.
void array_append(int *block, size_t size)
{
int *new_block = calloc(size + 2, sizeof *new_block); /* room for size + 2 ints */
memcpy(new_block, block, size * sizeof *block); /* memcpy counts bytes, not elements */
(...append)
free(new_block);
}
I am not returning the newly created array anywhere.
I only use new_block inside functions.
Does not modify the original array in the function.
Declaring new_block as static is omitted.
I know how calloc() / malloc() works, I know that this operation has to be validated.
new_block is only meant to live in a function.
I just wonder which solution is better and why ...
regards
You should allocate the array dynamically rather than use a variable-length array. In general the VLA version can be unsafe, because a comparatively large array can overflow the stack.
I want to add something to the end of the array
But you cannot, really, except with realloc(). That is how your ...append step can be done, whatever it means.
If you need a temporary array to work with and then copy into your array (but not at the end!), then all methods for allocation are allowed - it really depends on how often and with which sizes.
If it is called very often with limited sizes, it could be a static array.
There is no easy solution for growing arrays (or for memory management in general). At the extreme you allocate every element individually and link them together: a linked list.
So avoid reaching the end of your arrays: define a higher maximum, or else implement a linked list.
In certain situations realloc() also makes sense (big changes in size, but not often). The problem is that sometimes the whole array has to be copied to keep the larger array contiguous: it is a "realloc", not an "append", so it is "expensive".
I am not returning the newly created array anywhere.
That is part of the problem. You actually seem to be doing half of what realloc() does: allocate the new space, memcpy() the old contents...and then free the old and return the new array(-pointer) to the caller.
The first version cannot return the array pointer, because the lifetime of a local automatic array, VLA or not, ends when the function returns.
If the append can be done to the existing array (which it can if the caller expects this and the memory of the array has room), you can merely append to the existing array.
Otherwise, you need a new array. In this case, the array must be returned to the caller. You can do this by returning a pointer to its first element or by having the caller pass a pointer to a pointer, and you modify the pointed-to pointer to point to the first element of the new array.
When you provide a new array, you must allocate memory for it with malloc or a similar routine. You should not use an array defined inside your function without static, as the memory for such an array is reserved only until execution of the function ends. When your function returns to the caller, that memory is released for other uses. (Generally, you also should not use an array declared with static, but for reasons involving good design, reducing bugs, and multiple serial or parallel calls to the function.)
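As a sketch of the pointer-to-pointer variant described above: the function below grows a heap array with realloc and hands the new array back through its parameters. It reuses the name array_append, but the signature is an assumption that differs from the one in the question, and it appends a single value.

```c
#include <assert.h>
#include <stdlib.h>

/* Appends `value` to the heap array *block holding *size ints.
   On success updates *block and *size and returns 0; on failure
   returns -1 and leaves both unchanged. */
int array_append(int **block, size_t *size, int value)
{
    assert(block != NULL && size != NULL);
    int *grown = realloc(*block, (*size + 1) * sizeof **block);
    if (grown == NULL)
        return -1;          /* the old array is still valid */
    grown[*size] = value;
    *block = grown;         /* realloc may have moved the array */
    *size += 1;
    return 0;
}

/* Appends 10, 20, 30 and reports (element count) * 100 + (last element). */
int demo_append(void)
{
    int *data = NULL;       /* realloc(NULL, n) behaves like malloc(n) */
    size_t n = 0;
    for (int v = 1; v <= 3; v++) {
        if (array_append(&data, &n, v * 10) != 0) {
            free(data);
            return -1;
        }
    }
    int result = (int)n * 100 + data[2];
    free(data);
    return result;
}
```

The caller keeps ownership of the array throughout and is the one who eventually calls free.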
Generally it is preferred to pass pointer to structure to a function in C, in order to avoid copying during function call. This has an unwanted side effect that the called function can modify the elements of the structure inadvertently. What is a good programming practice to avoid such errors without compromising on the efficiency of the function call ?
Passing a pointer-to-const is the obvious answer:
void foo(const struct some_struct *p)
That will prevent you from modifying the immediate members of the struct inadvertently. That's what const is for.
In fact, your question sounds like a copy-paste from some quiz card, with const being the expected answer.
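The phrase "immediate members" matters: const on the struct does not extend to memory that a pointer member refers to. A small sketch, with struct buffer and inspect as invented names:

```c
#include <assert.h>
#include <stddef.h>

struct buffer {
    int   length;
    char *data;    /* pointer member: const on the struct does not reach through it */
};

int inspect(const struct buffer *b)
{
    assert(b != NULL);
    /* b->length = 0;     would not compile: member of a const-qualified struct */
    /* b->data = NULL;    would not compile either */
    b->data[0] = 'X';     /* compiles: the pointed-to chars are not const */
    return b->length;
}

/* Returns the length if the buffer contents were changed through the const pointer. */
int demo_inspect(void)
{
    char text[] = "abc";
    struct buffer b = { 3, text };
    int len = inspect(&b);
    return (text[0] == 'X') ? len : -1;
}
```

If the pointed-to data should be protected as well, the member itself would have to be declared as const char *.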
In general, when it comes to simple optimizations like what you've described, it is often preferable to use a pointer-to-struct rather than passing a struct itself, as passing a whole struct means more overhead from extra data being copied onto the call stack.
The example below is a fairly common approach:
#include <errno.h>
typedef struct myStruct {
int i;
char c;
} myStruct_t;
int myFunc(myStruct_t* pStruct) {
if (!pStruct) {
return EINVAL;
}
// Do some stuff
return 0;
}
If you want to avoid modifying the data passed to the function, just make sure that the data is immutable by modifying the function prototype.
int myFunc(const myStruct_t* pStruct)
You will also benefit from reading up on "const correctness".
A very common idiom, particularly in unix/posix style system code is to have the caller allocate a struct, and pass a pointer to that struct through the function call.
This is a little different from what I think you're asking about, where you are passing data into a function with a struct (in which case, as others have mentioned, you may want the function to treat the struct as const). In these cases, the struct is empty (or only partially filled) before the function call. The caller allocates an empty struct and then passes a pointer to it. Probably different from your general question, but relevant to the discussion, I think.
This accomplishes a couple of handy things. It avoids copying a possibly large structure, and it lets the caller fill in some fields and the callee fill in others (giving an effective shared space for communication).
The most important aspect of this idiom is that the caller has full control over the allocation of the struct. It can live on the stack or on the heap, or the same one can be reused repeatedly, but wherever it comes from, the caller is responsible for handling the memory.
This addresses one of the problems with passing around struct pointers: you can easily lose track of who allocated the struct and whose responsibility it is to free it. This idiom gives you the advantage of not having to copy the struct around, while making it clear who has the job of freeing the memory.
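A sketch of the caller-allocates idiom follows. struct file_info and get_file_info are hypothetical, loosely modelled on the shape of calls such as stat(); the "size" computed here is a placeholder, not a real file query.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

struct file_info {       /* hypothetical result structure */
    long size;
    int  is_dir;
};

/* Fills *out; the caller owns the storage. Returns 0 or an errno value. */
int get_file_info(const char *name, struct file_info *out)
{
    if (name == NULL || out == NULL)
        return EINVAL;
    memset(out, 0, sizeof *out);
    out->size = (long)strlen(name);   /* placeholder: pretend the size is the name length */
    out->is_dir = 0;
    return 0;
}

int demo_file_info(void)
{
    struct file_info info;            /* on the stack here; heap or static would work too */
    int rc = get_file_info("test.txt", &info);
    assert(rc == 0);                  /* this sketch cannot fail for non-NULL arguments */
    return (int)info.size;
}
```

Because the callee never allocates, there is nothing for it to leak and no question about who frees the result.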
I have a function which takes a structure as a parameter, like:
add_new_structure(structure s);
then store it inside
structure structure_list[200];
question:
1. when I want to use the structure, I have a function like
structure *getStructure(int id)
{
return &structure_list[id];
}
is it gonna work if I add one structure like this:
void init()
{
structure test;
memset(&test,0,sizeof(structure));
add_new_structure(test);
}
and then call getStructure from another function? like this:
void anotherFunction()
{
structure *got_test = getStructure(0);
}
because I remember I can't have local variable and then call it from another function right?
2.is it better to just store it like this?
change the add_new_structure() parameter to structure *s;
then store it inside
structure *structure_list[200]; by calling add_new_structure(&test);
3. which one is better? or what is the right way to do it?
The first approach, i.e. passing the instance directly as a parameter, works, because the whole instance is copied when the function is called. What you store is a copy of the original struct instance.
However, you can't pass and store a pointer to a local variable. The problem you mentioned above will occur in this case.
IMHO, neither of the above approaches is right. The first introduces too much overhead when passing parameters to the function, while the second cannot achieve what you want.
It is better to allocate the object dynamically with malloc/calloc and store the pointer in the array. Remember to free the object when you are done with it, to avoid a memory leak. Like this:
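void add_new_structure(structure *s); /* declared before its first use */
void init()
{
structure *test = calloc(1, sizeof *test); /* zero-initialized, lives until free() */
if (test == NULL) {
return; /* allocation failed */
}
add_new_structure(test);
}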
Option 2, as I think you point out, will not work. It's a bit more subtle than saying that pointers to local variables can't be used outside of a function; it's that they are only valid while the function is still "active", so to speak. In option 2, a pointer to structure test would be stored inside of structure *structure_list[200] when you call add_new_structure. At this point, some function is calling init which is calling add_new_structure. When you return from init, the memory address you put into structure_list is no longer owned by the original owner, and this is dangerous. If this is too mechanical of an explanation, you should look at how stacks work to see why.
Without using malloc and its friends, which can introduce a lot of complexity, I would be inclined to keep the memory stored in structure_list, with the minor modification that you can pass structure test by reference and not by value. This is probably a reasonable compromise between the two stylistically.
void init() {
structure test;
memset(&test,0,sizeof(structure));
add_new_structure(&test);
}
void add_new_structure(structure *s) {
if (structure_count < 200) {
structure_list[structure_count++] = *s;
}
}
A lot of this depends on what structure is (if it contains pointers itself, who owns those?), but hopefully this provides some intuition.
I wonder why we can pass structure to C function by value, but we can never do the same with array (which is passed by address).
When I was learning C, they told me that arrays consume much stack, so it's not preferred to pass them by value.
But it seems that structures are often (if not always) larger than arrays and are more complex data structure, so this explanation makes no sense for me now !
Can anybody help with as much details as possible ?
In C, an array is not itself a pointer, but in most expressions, including function arguments, it is converted ("decays") to a pointer to its first element. So when you pass an array to a function you are actually passing its memory address, hence a reference to the original elements.
When you define a variable of a struct type, you're allocating all the space in memory needed to contain this struct, and if you do something like:
struct s a, b;
...
a = b;
You are copying all the values from b to a, and in the same way, when you are passing it to a function, you are copying the values of the original struct to the stack. That's called passing a parameter by value.
What you state in your question is true. A struct may be more complex than an array, and passing it by value is perfectly possible, though it may be inefficient. The reason you can't pass an array by value is that an array argument is always converted to a pointer to its first element, and an array parameter is adjusted to a pointer type.
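The difference can be observed directly. In this sketch (all names are made up), a function that receives an array modifies the caller's elements, while a function that receives a struct containing the same array only modifies its own copy:

```c
#include <assert.h>
#include <stddef.h>

struct wrapped { int values[4]; };

/* `int a[4]` in a parameter list is adjusted to `int *a`, so this writes
   into the caller's array. */
void zero_first(int a[4])
{
    a[0] = 0;
}

/* A struct is copied, including the array inside it; the caller's object
   is untouched. */
void zero_first_copy(struct wrapped w)
{
    w.values[0] = 0;
    (void)w;
}

/* Returns arr[0] * 10 + w.values[0] after both calls. */
int demo_decay(void)
{
    int arr[4] = { 1, 2, 3, 4 };
    struct wrapped w = { { 1, 2, 3, 4 } };
    assert(sizeof arr == 4 * sizeof(int));   /* here arr is still an array, not a pointer */
    zero_first(arr);
    zero_first_copy(w);
    return arr[0] * 10 + w.values[0];
}
```

Wrapping an array in a struct is therefore one way to get by-value semantics for it, at the cost of copying every element on each call.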
void my_cool_function()
{
obj_scene_data scene;
obj_scene_data *scene_ptr = &scene;
parse_obj_scene(scene_ptr, "test.txt");
}
Why would I ever create a pointer to a local variable as above if I can just do
void my_cool_function()
{
obj_scene_data scene;
parse_obj_scene(&scene, "test.txt");
}
Just in case it's relevant:
int parse_obj_scene(obj_scene_data *data_out, char *filename);
In the specific code you linked, there isn't really a reason.
It could be functionally necessary if you have a function taking an obj_scene_data **. You can't do &&scene, so you'd have to create a local variable before passing the address on.
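A sketch of that situation, using a stand-in definition of obj_scene_data and a hypothetical function select_scene that takes an obj_scene_data **:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int vertex_count; } obj_scene_data;   /* stand-in definition */

/* Hypothetical function that may redirect the caller's pointer. */
void select_scene(obj_scene_data **scene_io, obj_scene_data *fallback)
{
    assert(scene_io != NULL && *scene_io != NULL);
    if ((*scene_io)->vertex_count == 0)
        *scene_io = fallback;
}

int demo_select(void)
{
    obj_scene_data scene  = { 0 };
    obj_scene_data backup = { 42 };
    /* &scene is a value, not an object, so its address cannot be taken;
       a named pointer variable is needed. */
    obj_scene_data *scene_ptr = &scene;
    select_scene(&scene_ptr, &backup);
    return scene_ptr->vertex_count;
}
```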
Yes absolutely you can do this for many reasons.
For example if you want to iterate over the members of a stack allocated array via a pointer.
Or, in other cases, you may want to point sometimes to one memory address and other times to another. You can set up a pointer to point to one or the other via an if statement and then use your common code afterwards, all within the same scope.
Typically in these cases your pointer variable goes out of scope at the same time as your stack allocated memory goes out of scope. There is no harm if you use your pointer within the same scope.
In your exact example there is no good reason to do it.
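The "one address or the other" case might look like this sketch, where struct config and effective_verbosity are illustrative names only:

```c
#include <assert.h>
#include <stddef.h>

struct config { int verbosity; };

int effective_verbosity(int use_debug)
{
    assert(use_debug == 0 || use_debug == 1);
    struct config normal = { 1 };
    struct config debug  = { 3 };

    /* Pick the target once... */
    struct config *active = &normal;
    if (use_debug)
        active = &debug;

    /* ...then the shared code does not care which one was chosen. */
    active->verbosity += 1;
    return active->verbosity;
}
```

The pointer and both structs go out of scope together, so nothing dangles.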
If the function accepts a NULL pointer as input, and you want to decide whether to pass NULL based on some condition, then a pointer to a stack variable is useful to avoid having to call the same function in separate code paths, especially if the rest of the parameters are the same otherwise. For example, instead of this:
void my_function()
{
obj_data obj = {0};
if( some condition )
other_function(&obj, "test.txt");
else
other_function(NULL, "test.txt");
}
You could do this:
void my_function()
{
obj_data obj = {0};
obj_data *obj_ptr = (condition is true) ? &obj : NULL;
other_function(obj_ptr, "test.txt");
}
If parse_obj_scene() is a function there may be no good reason to create a separate pointer. But if for some unholy reason it is a macro it may be necessary to reassign the value to the pointer to iterate over the subject data.
Not in terms of semantics, and in fact there is a more general point that you can replace all local variables with function calls with no change in semantics, and given suitable compiler optimisations, equal efficiency. (see section 2.3 of "Lambda: The Ultimate Imperative".)
But the point of writing code is to communicate with the next person who maintains it, and in an imperative language without tail call optimisation, it is usual to use local variables for things which are iterated over, for automatic structures, and to simplify expressions. So if it makes the code more readable, then use it.