Function returning array practices [duplicate] - c

This question already has answers here:
Declaring a C function to return an array
(5 answers)
Closed 9 years ago.
I have just started my back-to-the-roots project; learning C. I'll be honest, abstraction is nice when you need to plow out a desktop/server application, but with C, things get personal. Which is nice, for a change!
Now to the point: I'm reading up on using arrays with functions (Programming in C, Stephen G. Kochan). I have learnt that when an array is passed as a parameter, the compiler always treats the parameter as a pointer, like so:
void foo (char *array);
Take this, for example:
void foo (char *a)
{
a[0] = 'R'; // side effect
}
int main (void)
{
char a[] = "The quick brown fox jumps over the lazy dog!";
foo (a);
}
I haven't come across any information on functions returning arrays or pointers to arrays. Instead, the function changes the array passed as its parameter, causing side effects (as seen above).
Is it possible for a function to return a pointer to an array? Or is the above method just preferred?
Thanks in advance for those that can offer an explanation.

A function can return a pointer to the first element of an array, but that is more problematic: you either have to malloc the memory for the array (like strdup) or use a static array (like ctime).
With malloc, the caller has to remember to free the memory, or it leaks; with a static array, the contents change with each call. Passing in an existing array is therefore easier.
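The three options mentioned above can be sketched side by side. This is only an illustration: the function names (make_greeting_heap, digit_static, make_greeting_into) are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: the callee allocates; the caller must free() the result. */
char *make_greeting_heap(void)
{
    const char text[] = "hello";
    char *copy = malloc(sizeof text);   /* sizeof includes the '\0' */
    if (copy != NULL)
        memcpy(copy, text, sizeof text);
    return copy;                        /* NULL on allocation failure */
}

/* Option 2: the callee owns one static buffer; every call overwrites it. */
const char *digit_static(unsigned n)
{
    static char buffer[2];
    buffer[0] = (char)('0' + (n % 10));
    buffer[1] = '\0';
    return buffer;
}

/* Option 3: the caller supplies the array; nothing to free, nothing shared. */
void make_greeting_into(char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    strncpy(out, "hello", out_size - 1);
    out[out_size - 1] = '\0';           /* strncpy may not terminate */
}
```

With option 2, two results obtained from successive calls are the same pointer, so the first result silently changes when the second call is made. Option 3 avoids both that and the need to call free().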

I am not sure I follow. Your function is of type void. Are you trying to modify the array by passing it by reference?
In C, if you return a pointer to the first element of an array, you effectively return the entire array, because C arrays are contiguous in the program's memory (though maybe not in physical memory). So when you return the pointer to the first element, you tell your main function where the entire array is stored.

Related

C, Dynamic allocation of a matrix: Why is this not allowed?

So I have the following example in some lecture notes
void f(int **p){}
void g(int *p[]){}
void h(int p[2][3]){}
int main(){
int **a;
allocate_mem(a); // allocate memory for a
f(a); // OK!
g(a); // OK!
// h(a); // NOT OK
int b[2][3];
// f(b); // NOT OK
// g(b); // NOT OK
h(b); // OK!
return 0;
}
(without any further explanation/comments). I am struggling to understand exactly why f(b) and g(b) would be illegal. Both of these functions are designed to accept two-dimensional arrays, and we are calling them with one. How does that not work? I assume the difference lies in the allocation of memory, but how would that affect how a function accepts it as input?
You're conflating pointers with arrays, and pointers-to-pointers with two-dimensional arrays.
That's an understandable mistake, due to "array-to-pointer decay" in C (and C++). Sometimes an expression that names an array gives you a pointer to its first element; sometimes it's the actual array. It depends on the context. With two-dimensional arrays it gets even weirder: a two-dimensional array decays to a pointer to its first row (a pointer to an array), not to a pointer to a pointer, so it can stand in for a pointer-to-pointer in far fewer places.
Please spend a few minutes reading about how pointers and arrays relate and differ, in Section 6 of the C language FAQ. Your specific question appears there as well:
Q 6.18: "My compiler complained when I passed a two-dimensional array to a function expecting a pointer to a pointer."
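A small sketch may make the difference concrete. The function names sum_rows and sum_row_pointers are invented for this illustration; the first takes what a true int[2][3] decays to, the second takes an array of separate row pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Accepts a true 2D array: p points to rows of exactly 3 ints each. */
int sum_rows(int (*p)[3], size_t rows)
{
    assert(p != NULL);
    int total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 3; c++)
            total += p[r][c];
    return total;
}

/* Accepts an array of row pointers: each p[r] is a separate int *. */
int sum_row_pointers(int **p, size_t rows, size_t cols)
{
    assert(p != NULL);
    int total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += p[r][c];
    return total;
}
```

Given int b[2][3], the call sum_rows(b, 2) works directly. To use sum_row_pointers, the caller first has to build the missing level of pointers, for example int *rows[2] = { b[0], b[1] };, because b itself contains no pointers at all, only six ints in a row.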

C a function that returns an array [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
If I need to write a function that returns an array: int*, which way is better?
int* f(..data..)
or: void f(..data..,int** arr)
and we call f like this: int* x; f(&x);. (Maybe they are both the same, but I am not sure. If I need to return an ErrorCode (it's an enum) too, then in the first way f will take an ErrorCode* parameter, and in the second way f will return an ErrorCode.)
Returning an array is just returning a variable amount of data.
That's a really old problem, and C programmers developed many answers for it:
Caller passes in buffer.
The necessary size is documented and not passed, too short buffers are Undefined Behavior: strcpy()
The necessary size is documented and passed, errors are signaled by the return value: strcpy_s()
The buffer size is passed by pointer, and the called function reallocates with the documented allocator as needed: POSIX getline()
The necessary size is unknown, but can be queried by calling the function with buffer-length 0: snprintf()
The necessary size is unknown and cannot be queried, as much as fits in a buffer of passed size is returned. If necessary, additional calls must be made to get the rest: fread()
⚠ The necessary size is unknown, cannot be queried, and passing too small a buffer is Undefined Behavior. This is a design defect, therefore the function is deprecated / removed in newer versions, and just mentioned here for completeness: gets().
Caller passes a callback:
The callback-function gets a context-parameter: qsort_s()
The callback-function gets no context-parameter. Getting the context requires magic: qsort()
Caller passes an allocator: Not found in the C standard library. All allocator-aware C++ containers support that though.
Callee contract specifies the deallocator. Calling the wrong one is Undefined Behavior: fopen()->fclose() strdup()->free()
Callee returns an object which contains the deallocator: COM-Objects
Callee uses an internal shared buffer: asctime()
Be aware that either the returned array must contain a sentinel object or other marker, you have to return the length separately, or you have to return a struct containing a pointer to the data and the length.
Pass-by-reference (pointer to size or such) helps there.
In general, whenever the user has to guess the size or look it up in the manual, he will sometimes get it wrong. If he does not get it wrong, a later revision might invalidate his careful work, so it doesn't matter he was once right. Anyway, this way lies madness (UB).
For the rest, choose the most comfortable and efficient one you can.
Regarding an error code: Remember there's errno.
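As one illustration of the list above, the "query the size with buffer-length 0" pattern can be sketched with snprintf(). The helper name format_value and the "x=%d" format are assumptions chosen for the example only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "x=<value>" into a freshly allocated string.
   The caller owns the result and must free() it. */
char *format_value(int value)
{
    /* First call: a NULL buffer of size 0 only reports the needed length. */
    int needed = snprintf(NULL, 0, "x=%d", value);
    if (needed < 0)
        return NULL;

    char *text = malloc((size_t)needed + 1);    /* + 1 for the '\0' */
    if (text == NULL)
        return NULL;

    /* Second call: write for real into a buffer that is known to fit. */
    int written = snprintf(text, (size_t)needed + 1, "x=%d", value);
    assert(written == needed);  /* same format and arguments, same length */
    (void)written;              /* avoids an unused warning under NDEBUG */
    return text;
}
```

The cost is formatting twice; the benefit is that the caller never has to guess a size.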
Usually it's more convenient and semantic to return the array
int* f(..data..)
If you ever need more complex error handling (e.g., returning error values), return the error as an int and pass the array back through an output parameter.
There is no "better" here: you decide which approach fits the needs of the callers better.
Note that both functions are bound to give a user an array that they allocate internally, so deallocating the resultant array becomes a responsibility of the caller. In other words, somewhere inside f() you would have a malloc, and the user who receives the data must call free() on it.
You have another option here - let the caller pass the array into you, and return back a number that says how many items you put back into it:
size_t f(int *buffer, size_t max_length)
This approach lets the caller pass you a buffer in a static or in the automatic memory, thus improving flexibility.
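A minimal sketch of that signature in use follows. The function collect_evens and what it computes are invented for illustration; only the shape (caller-owned buffer, capacity in, count out) is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: writes the even numbers in [0, limit) into buffer.
   Returns how many were stored, never more than max_length. */
size_t collect_evens(int limit, int *buffer, size_t max_length)
{
    assert(buffer != NULL || max_length == 0);
    size_t count = 0;
    for (int i = 0; i < limit && count < max_length; i += 2)
        buffer[count++] = i;
    return count;
}
```

The caller can pass an automatic array such as int buf[3]; together with its capacity, and the function simply stops when the buffer is full. Nothing is allocated, so nothing needs to be freed.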
the classic model is (assuming you need to return error code too)
int f(...., int **arr)
even though it doesn't flow as nicely as a function returning the array
Note this is why the lovely Go language supports multiple return values.
It's also one of the reasons for exceptions: they get the error indicators out of the function's I/O space
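The "error code as return value, array through an int ** parameter" model might look like the sketch below. The enum values and the function make_squares are assumptions made up for this example; only the name ErrorCode comes from the question.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ERR_OK = 0, ERR_BAD_ARG, ERR_NO_MEMORY } ErrorCode;

/* Hypothetical: stores n squares (0, 1, 4, ...) in a new array at *arr.
   On success the caller owns *arr and must free() it. */
ErrorCode make_squares(size_t n, int **arr)
{
    assert(arr != NULL);
    *arr = NULL;                 /* defined value on every failure path */
    if (n == 0)
        return ERR_BAD_ARG;

    int *result = malloc(n * sizeof *result);
    if (result == NULL)
        return ERR_NO_MEMORY;

    for (size_t i = 0; i < n; i++)
        result[i] = (int)(i * i);

    *arr = result;
    return ERR_OK;
}
```

A call then reads int *x; if (make_squares(4, &x) == ERR_OK) { ... free(x); }, which keeps the error check and the data clearly apart.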
The first one is better if there is no requirement to deal with an already existent pointer in the function.
The second one is used when you already have a defined pointer that points to an already allocated container (for example a list) and inside the function the value of the pointer can be changed.
If you must call f like int* x; f(&x);, you do not have much of a choice. You must use the second syntax, i.e., void f(..data..,int** arr). This is because you are not using return value anyways in your code.
The approach depends on a specific task and perhaps on your personal taste or a coding convention adopted in your project.
In general, I'd like to pass pointers as "output" parameters instead of return'ing an array for a number of reasons.
You likely want to return a number of elements in the array together with the array itself. But if you do this:
int f(const void* data, int** out_array);
Then if you see the signature first time, you can't quite tell what the function returns, the number of elements, or an error code, so I prefer to do this:
void f(const void* data, int** out_array, int* out_array_nelements);
Or even better:
void f(const void* data, int** out_array, size_t* out_array_nelements);
The function signature must be self-explanatory, and the parameter names help to achieve that.
The output array needs to be stored somewhere. You need to allocate some memory for the array. If you return a pointer to the array without passing the same pointer as argument, then you can't allocate memory on the stack. I mean, you cannot do this:
int *f (const void *data) {
int array[10];
return array; /* wrong: array stops existing when the function returns, so the returned pointer dangles */
}
Instead, you have to use static int array[10] (which is not thread-safe) or int *array = malloc(...), which leaks memory unless the caller remembers to free it.
So I suggest passing a pointer to an array that is already allocated before the function call, like this:
void f(const void *data, int* out_array, size_t* out_nelements, size_t max_nelements);
The benefit is you are free to choose where to allocate the array:
On the stack:
int array[10] = { 0 };
size_t max_nelements = sizeof(array)/sizeof(array[0]);
size_t nelements = 0;
f(data, array, &nelements, max_nelements);
Or in the heap:
size_t nelements = 0;
size_t max_nelements = 10;
int *array = malloc(max_nelements * sizeof(int));
f(data, array, &nelements, max_nelements);
See, with this approach you are free to choose how to allocate the memory.

For function pointer "fptr", why are the values of "fptr" and *fptr the same? What does *fptr even mean? I only knew (*fptr)() or fptr() [duplicate]

This question already has answers here:
How do function pointers in C work?
(12 answers)
Closed 9 years ago.
Why does a function pointer behave like an array pointer as far as this behavior goes? I mean, let's begin with the case of an array list[] where we'll consider &list and list.
char name[5]= "Eric",(*aptr)[5]=&name;
printf("%p,%p",*aptr,name); //BOTH ARE NUMERICALLY SAME
and we can also refer to array elements as (*aptr)[1], (*aptr)[2], etc. I understand what's going on here.
But why does the same work for functions? After all, a "function" as such is not a contiguous memory block of similar elements, as an array is. Consider this.
Suppose fptr is a function pointer, as in my program. Why do fptr and *fptr give the same value when printed? What does *fptr even mean? I only knew that we can invoke a function through its pointer as (*fptr)() or as fptr(), but what is *fptr alone, then?
#include<stdio.h>
void foo(){};
int main(void)
{
void (*fptr)()=foo;
printf("%p,%p",fptr,*fptr);
}
Result- 00401318 00401318
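A pointer holds a memory location. A function resides in memory and has a starting address. In most expressions a function name (a function designator) is converted to a pointer to the function. Dereferencing that pointer yields the function designator again, which is immediately converted back to the same pointer. That is why fptr and *fptr print the same value.
From Stephen Prata, "C++ Primer Plus":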
History Versus Logic Holy syntax!
How can pf and (*pf) be equivalent?
One school of thought maintains that because pf is a pointer to a
function, *pf is a function; hence, you should use (*pf)() as a
function call. A second school maintains that because the name of a
function is a pointer to that function, a pointer to that function
should act like the name of a function; hence you should use pf() as a
function call. C++ takes the compromise view that both forms are
correct, or at least can be allowed, even though they are logically
inconsistent with each other. Before you judge that compromise too
harshly, reflect that the ability to hold views that are not logically
self-consistent is a hallmark of the human mental process.
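The equivalence discussed above can be checked with a short sketch. The helper call_every_way and the counter are made up for this illustration; foo stands in for the function from the question.

```c
#include <assert.h>

static int call_count = 0;

void foo(void) { call_count++; }

/* Calls foo through several spellings and returns how many calls happened. */
int call_every_way(void)
{
    void (*fptr)(void) = foo;   /* foo converts to a pointer here */
    assert(fptr == *fptr);      /* *fptr converts straight back to a pointer */
    assert(fptr == &foo);       /* &foo is the same pointer, spelled out */

    call_count = 0;
    fptr();
    (*fptr)();
    (****fptr)();               /* any number of * changes nothing */
    (&foo)();
    return call_count;
}
```

Every spelling ends up calling through the same pointer value, so the function returns 4.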

What about functions pointer list? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
how to use array of function pointers?
I am wondering how one can declare in C a dynamic list of function pointers ? Using a pointer to function pointer ?
Use a pointer to a function pointer, which you can point at a block of dynamically allocated memory. Let's say the function type is void (int):
void (*(*))(int) // Bare type
void (*(*f))(int) // Variable
Or you may want an array of function pointers, in which case:
void (*[10])(int) // Bare type
void (*af[10])(int) // Variable
You can change the parameter and the return type of the functions.
An array of function pointers can be used as a map from a number to the function to be executed. It is one way to write an emulator: after you have obtained the opcode from the instruction, you can execute the operation by indexing the array of pointers. (This is one way to do it, but I'm not sure whether it is actually done in emulation code.)
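As a rough sketch of that idea, assume a toy machine with three made-up opcodes and a single accumulator. None of the names below come from a real emulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-instruction machine with a single accumulator. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_COUNT };

static int op_inc(int acc)    { return acc + 1; }
static int op_dec(int acc)    { return acc - 1; }
static int op_double(int acc) { return acc * 2; }

/* The array index is the opcode; the element is the handler to run. */
static int (*const handlers[OP_COUNT])(int) = {
    [OP_INC] = op_inc, [OP_DEC] = op_dec, [OP_DOUBLE] = op_double
};

/* Runs each opcode in order and returns the final accumulator. */
int run_program(const int *opcodes, size_t length, int acc)
{
    for (size_t i = 0; i < length; i++) {
        assert(opcodes[i] >= 0 && opcodes[i] < OP_COUNT);
        acc = handlers[opcodes[i]](acc);
    }
    return acc;
}
```

Running the program { OP_INC, OP_INC, OP_DOUBLE, OP_DEC } from an accumulator of 0 goes 1, 2, 4, 3. The table replaces a switch statement; which of the two is faster or clearer depends on the compiler and the code base.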
For a dynamic list, you can use a basic linked list:
struct list {
int (*func)(int, float, double);
struct list *next;
};
Or an array of function pointers which is resized as necessary.
Yes, you can have an array of function pointers.
This might be useful, for instance for collecting a set of callback functions for a specific event.
As an example, consider implementing addEventListener to DOM elements in a browser, where any DOM element can have multiple click events. A simple data structure might be:
void (*clickHandlers[MAX_EVENT_HANDLERS])(EventData arg);
For a dynamic list, allocate a buffer of function pointers using malloc, and realloc it as you need more space:
void (**clickHandlers)(EventData arg);
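A growable list along those lines could be sketched as follows. The types Handler and HandlerList, the doubling policy, and the int * argument (standing in for EventData) are all assumptions for the sake of a self-contained example.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*Handler)(int *counter);   /* hypothetical handler type */

typedef struct {
    Handler *items;      /* malloc'ed block of function pointers */
    size_t   count;
    size_t   capacity;
} HandlerList;

/* Appends h, doubling the block when full. Returns 0 on success, -1 on failure. */
int handler_list_add(HandlerList *list, Handler h)
{
    assert(list != NULL && h != NULL);
    if (list->count == list->capacity) {
        size_t new_capacity = list->capacity ? list->capacity * 2 : 4;
        Handler *grown = realloc(list->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;               /* the old block is still valid */
        list->items = grown;
        list->capacity = new_capacity;
    }
    list->items[list->count++] = h;
    return 0;
}

/* Calls every registered handler in insertion order. */
void handler_list_fire(const HandlerList *list, int *counter)
{
    for (size_t i = 0; i < list->count; i++)
        list->items[i](counter);
}

/* Two sample handlers for trying the list out. */
void add_one(int *counter) { *counter += 1; }
void add_ten(int *counter) { *counter += 10; }
```

Start from HandlerList list = { NULL, 0, 0 }; since realloc(NULL, n) behaves like malloc(n), and call free(list.items) when the list is no longer needed.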

a few beginner C questions

I'm sort of learning C, I'm not a beginner to programming though, I "know" Java and python, and by the way I'm on a mac (leopard).
Firstly,
1: could someone explain when to use a pointer and when not to?
2:
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
And then all but the last would set indexes 0, 1, 2 and 3 to 'f', 'u', 'n' and '\0' respectively. My question is, why isn't the second one a pointer? Why char fun[4] and not char *fun[4]? And how come it seems that a pointer to a struct or an int is always an array?
3:
I understand this:
typedef struct car
{
...
};
is a shortcut for
struct car
{
...
};
typedef struct car car;
Correct? But something I am really confused about:
typedef struct A
{
...
}B;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
4. I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
5. And lastly, know any good tutorial for C game programming (simple) ? And for mac/OS X, not windows?
PS. Is there any other name people use to refer to just C, not C++? I hate how they're all named almost the same thing, so hard to try to google specifically C and not just get C++ and C# stuff.
Thanks!!
It was hard to pick a best answer, they were all great, but the one I picked was the only one that made me understand my 3rd question, which was the only one I was originally going to ask. Thanks again!
My question is, why isn't the second one a pointer?
Because it declares an array. In the two other cases, you have a pointer that refers to data that lives somewhere else. Your array declaration, however, declares an array of data that lives where it's declared. If you declared it within a function, then data will die when you return from that function. Finally char *fun[4] would be an array of 4 pointers - it wouldn't be a char pointer. In case you just want to point to a block of 4 chars, then char* would fully suffice, no need to tell it that there are exactly 4 chars to be pointed to.
The first way which creates an object on the heap is used if you need data to live from thereon until the matching free call. The data will survive a return from a function.
The last way just creates data that's not intended to be written to. It's a pointer which refers to a string literal - it's often stored in read-only memory. If you write to it, then the behavior is undefined.
I understand what pointers do, but I don't understand what the point of them is (no pun intended).
Pointers are used to point to something (no pun, of course). Look at it like this: If you have a row of items on the table, and your friend says "pick the second item", then the item won't magically walk its way to you. You have to grab it. Your hand acts like a pointer, and when you move your hand back to you, you dereference that pointer and get the item. The row of items can be seen as an array of items:
And how come it seems that a pointer to a struct or an int is always an array?
item row[5];
When you do item i = row[1]; then you first point your hand at the first item (get a pointer to the first one), and then you advance till you are at the second item. Then you take your hand with the item back to you :) So, the row[1] syntax is not something special to arrays, but rather special to pointers - it's equivalent to *(row + 1), and a temporary pointer is made up when you use an array like that.
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
typedef struct car
{
...
};
That's not valid code. You basically said "define the type struct car { ... } to be referable by the following ordinary identifier" but you forgot to give it the identifier. The two following snippets are equivalent instead, as far as I can see
1)
struct car
{
...
};
typedef struct car car;
2)
typedef struct car
{
...
} car;
What is the difference between A and B? A is the 'tag-name', but what's that? When do I use which? Same thing for enums.
In our case, the identifier car was declared two times in the same scope. But the declarations won't conflict because each of the identifiers is in a different namespace. The two namespaces involved are the ordinary namespace and the tag namespace. A tag identifier needs to be used after a struct, union or enum keyword, while an ordinary identifier doesn't need anything around it. You may have heard of the POSIX function stat, whose interface looks like the following
struct stat {
...
};
int stat(const char *path, struct stat *buf);
In that code snippet, stat is registered into the two aforementioned namespaces too. struct stat will refer to the struct, and merely stat will refer to the function. Some people don't like to precede identifiers always with struct, union or enum. Those use typedef to introduce an ordinary identifier that will refer to the struct too. The identifier can of course be the same (both times car), or they can differ (one time A the other time B). It doesn't matter.
3) It's bad style to use two different names A and B:
typedef struct A
{
...
} B;
With that definition, you can say
struct A a;
B b;
b.field = 42;
a.field = b.field;
because the variables a and b have the same type. C programmers usually say
typedef struct A
{
...
} A;
so that you can use "A" as a type name, equivalent to "struct A" but it saves you a lot of typing.
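One place where the tag is still needed, even with a typedef, is a struct that refers to itself. The sketch below uses the A/B naming from the question; the next member and the function copy_field are additions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct A {
    int field;
    struct A *next;   /* inside the body only the tag exists, B does not yet */
} B;

/* struct A and B name the same type, so the two pointers are compatible. */
int copy_field(const struct A *from, B *to)
{
    assert(from != NULL && to != NULL);
    to->field = from->field;
    return to->field;
}
```

Writing B *next; inside the braces would not compile, because the typedef name B is only introduced after the closing brace.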
Use them when you need to. Read some more examples and tutorials until you understand what pointers are, and this ought to be a lot clearer :)
The second case creates an array in memory, with space for four bytes. When you use that array's name, you magically get back a pointer to the first (index 0) element. And then the [] operator then actually works on a pointer, not an array - x[y] is equivalent to *(x + y). And yes, this means x[y] is the same as y[x]. Sorry.
Note also that when you add an integer to a pointer, it's multiplied by the size of the pointed-to elements, so if you do someIntArray[1], you get the second (index 1) element, not somewhere in between starting at the first byte.
Also, as a final gotcha - array types in function argument lists - eg, void foo(int bar[4]) - secretly get turned into pointer types - that is, void foo(int *bar). This is only the case in function arguments.
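These rules can be checked in a few lines. The names sum, sum_ptr and indexing_is_pointer_arithmetic are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with [4], but the parameter type is adjusted to int *. */
int sum(int bar[4], size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += bar[i];
    return total;
}

/* Compiles without a cast because sum really has type int (int *, size_t). */
int (*const sum_ptr)(int *, size_t) = sum;

/* Returns 1 when every spelling of "element 1" agrees. */
int indexing_is_pointer_arithmetic(void)
{
    int values[4] = { 10, 20, 30, 40 };
    int *first = values;                        /* decays to &values[0] */

    assert(sizeof values == 4 * sizeof(int));   /* no decay under sizeof */
    return values[1] == *(values + 1)
        && values[1] == 1[values]               /* x[y] is *(x + y) is y[x] */
        && first + 1 == &values[1];             /* + 1 advances by one int */
}
```

Because the [4] in the parameter list is not part of the type, sum can be called with an array of six ints and n set to 6; the length has to travel separately.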
Your third example declares a struct type with two names - struct A and B. In pure C, the struct is mandatory for A - in C++, you can just refer to it as either A or B. Apart from the name change, the two types are completely equivalent, and you can substitute one for the other anywhere, anytime without any change in behavior.
C has three places things can be stored:
The stack - local variables in functions go here. For example:
void foo() {
int x; // on the stack
}
The heap - things go here when you allocate them explicitly with malloc, calloc, or realloc.
void foo() {
int *x; // on the stack
x = malloc(sizeof(*x)); // the value pointed to by x is on the heap
}
Static storage - global variables and static variables, allocated once at program startup.
int x; // static
void foo() {
static int y; // essentially a global that can only be used in foo()
}
No idea. I wish I didn't need to answer all questions at once - this is why you should split them up :)
char *fun = malloc(sizeof(char) * 4);
or
char fun[4];
or
char *fun = "fun";
The first one can be set to any size you want at runtime, and be resized later - you can also free the memory when you are done.
The second one is an array, not a pointer, but in most expressions 'fun' decays to a pointer to its first element, the same as char *ptr = &fun[0].
I understand what pointers do, but I don't understand what the point of them is (no pun intended). And when does something get allocated on the stack vs. the heap? How do I know where it gets allocated? Do pointers have something to do with it?
When you define something in a function like "char fun[4]" it is defined on the stack and the memory isn't available outside the function.
Using malloc (or new in C++) reserves memory on the heap. You can make this data available anywhere in the program by passing the pointer around. This also lets you decide the size of the memory at runtime. Finally, the size of the stack is limited (typically around 1 MB, depending on the platform), while on the heap you can reserve all the memory you have available.
edit 5. Not really - I would say pure C. C++ is (almost) a superset of C so unless you are working on a very limited embedded system it's usually OK to use C++.
5. Chipmunk
Fast and lightweight 2D rigid body physics library in C.
Designed with 2D video games in mind.
Lightweight C99 implementation with no external dependencies outside of the Std. C library.
Many language bindings available.
Simple, read the documentation and see!
Unrestrictive MIT license.
Makes you smarter, stronger and more attractive to the opposite gender!
...
In your second question:
char *fun = malloc(sizeof(char) * 4);
vs
char fun[4];
vs
char *fun = "fun";
These all involve an array of 4 chars, but that's where the similarity ends. Where they differ is in the lifetime, modifiability and initialisation of those chars.
The first one creates a single pointer to char object called fun - this pointer variable will live only from when this function starts until the function returns. It also calls the C standard library and asks it to dynamically create a memory block the size of an array of 4 chars, and assigns the location of the first char in the block to fun. This memory block (which you can treat as an array of 4 chars) has a flexible lifetime that's entirely up to the programmer - it lives until you pass that memory location to free(). Note that this means that the memory block created by malloc can live for a longer or shorter time than the pointer variable fun itself does. Note also that the association between fun and that memory block is not fixed - you can change fun so it points to different memory block, or make a different pointer point to that memory block.
One more thing - the array of 4 chars created by malloc is not initialised - it contains garbage values.
The second example creates only one object - an array of 4 chars, called fun. (To test this, change the 4 to 40 and print out sizeof(fun)). This array lives only until the function it's declared in returns (unless it's declared outside of a function, when it lives for as long as the entire program is running). This array of 4 chars isn't initialised either.
The third example creates two objects. The first is a pointer-to-char variable called fun, just like in the first example (and as usual, it lives from the start of this function until it returns). The other object is a bit strange - it's an array of 4 chars, initialised to { 'f', 'u', 'n', 0 }, which has no name and that lives for as long as the entire program is running. It's also not guaranteed to be modifiable (although what happens if you try to modify it is left entirely undefined - it might crash your program, or it might not). The variable fun is initialised with the location of this strange unnamed, unmodifiable, long-lived array (but just like in the first example, this association isn't permanent - you can make fun point to something else).
The reason why there's so many confusing similarities and differences between arrays and pointers is down to two things:
The "array syntax" in C (the [] operator) actually works on pointers, not arrays!
Trying to pin down an array is a bit like catching fog - in almost all cases the array evaporates and is replaced by a pointer to its first element instead.
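The differences between the three declarations can be put into one small sketch. The function compare_three_ways and its variable names are made up for illustration; the literal is declared const char * here so that the compiler rejects accidental writes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if all three ways of getting "fun" behave as described. */
int compare_three_ways(void)
{
    char *heap = malloc(sizeof(char) * 4);   /* uninitialised until written */
    char array[4];                           /* also uninitialised */
    const char *literal = "fun";             /* writing through it is undefined */

    if (heap == NULL)
        return 0;

    strcpy(heap, "fun");                     /* 3 chars + '\0' fit exactly */
    strcpy(array, "fun");

    /* sizeof sees the whole array, but only a pointer for the other two. */
    assert(sizeof array == 4);
    assert(sizeof heap == sizeof(char *));

    int same_text = strcmp(heap, array) == 0 && strcmp(array, literal) == 0;

    heap[0] = 'F';                           /* fine: malloc'ed memory is writable */
    array[0] = 'F';                          /* fine: the array is ours */
    int modified = heap[0] == 'F' && array[0] == 'F' && literal[0] == 'f';

    free(heap);                              /* ends the block's lifetime */
    return same_text && modified;
}
```

The array disappears when the function returns, the malloc'ed block lives until free(), and the string literal lives for the whole program but must not be written to.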

Resources