Why is alloca different from just creating a local variable? - c

I read that there is a function called alloca that allocates memory from the stack frame of the current function rather than the heap. The memory is automatically destroyed when the function exits.
What is the point of this, and how is it any different from just creating an array of a structure or a local variable within the function? They would go on the stack and would be destroyed at the end of the function as well.
PS: I saw the other alloca question and it didn't answer how these two things are different :)

When you use alloca, you get to specify how many bytes you want at run time. With a local variable, the amount is fixed at compile time. Note that alloca predates C's variable-length arrays.

With alloca you can create a dynamic array (something that normally requires malloc) AND it's VERY fast. Here are the advantages and disadvantages of GCC alloca:
http://www.gnu.org/s/hello/manual/libc/Variable-Size-Automatic.html#Variable-Size-Automatic

I think the following are different:
void f()
{
{
int x;
int * p = &x;
}
// no more x
}
void g()
{
{
int * p = alloca(sizeof(int));
}
// memory still allocated
}
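The difference in lifetime can be made concrete with a small sketch. It assumes a glibc-style system where alloca is declared in <alloca.h> (alloca is not part of standard C), and the function name length_via_alloca is made up for illustration. The pointer is obtained inside a nested block and is still used after that block has closed, which would be invalid for a pointer to a block-local variable:

```c
#include <alloca.h>   /* assumption: glibc-style header; alloca is non-standard */
#include <assert.h>
#include <string.h>

/* Copies src into memory obtained from alloca inside a nested block,
   then reads the copy back after the block has ended. */
size_t length_via_alloca(const char *src)
{
    assert(src != NULL);
    size_t n = strlen(src);
    char *p;
    {
        p = alloca(n + 1);       /* allocated inside the inner block */
        memcpy(p, src, n + 1);   /* includes the terminating '\0' */
    }
    /* The block is closed, but alloca memory stays valid until
       length_via_alloca itself returns, so reading through p is fine. */
    return strlen(p);
}
```

Returning p itself from the function would still be a bug, since the memory is released when the function returns.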

Until gcc and C99 adopted Variable-length arrays, alloca offered significantly more power than simple local variables in that you could allocate arrays whose length is not known until runtime.
The need for this can arise at the boundary between two data representations. In my postscript interpreter, I use counted strings internally; but if I want to use a library function, I have to convert to a nul-terminated representation to make the call.
OPFN_ void SSsearch(state *st, object str, object seek) {
//char *s, *sk;
char s[str.u.c.n+1], sk[seek.u.c.n+1]; /* VLA */
//// could also be written:
//char *s,*sk;
//s = alloca(str.u.c.n+1);
//sk = alloca(seek.u.c.n+1);
char *r;
//if (seek.u.c.n > str.u.c.n) error(st,rangecheck);
//s = strndup(STR(str), str.u.c.n);
//sk = strndup(STR(seek), seek.u.c.n);
memcpy(s, STR(str), str.u.c.n); s[str.u.c.n] = '\0';
memcpy(sk, STR(seek), seek.u.c.n); sk[seek.u.c.n] = '\0';
r = strstr(s, sk);
if (r != NULL) { int off = r-s;
push(substring(str, off + seek.u.c.n, str.u.c.n - seek.u.c.n - off)); /* post */
push(substring(str, off, seek.u.c.n)); /* match */
push(substring(str, 0, off)); /* pre */
push(consbool(true));
} else {
push(str);
push(consbool(false));
}
//free(sk);
//free(s);
}
There is also a dangerous usage of alloca, which is easily avoided by preferring VLAs. You cannot use alloca safely within the argument list of a function call. So don't ever do this:
char *s = strcpy(alloca(strlen(t)+1), t);
That's what VLAs are for:
char s[strlen(t)+1];
strcpy(s,t);
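As a self-contained variant of the conversion described above, here is a minimal sketch that searches one counted string inside another by copying both into VLAs first. The function name counted_find and its (pointer, length) parameters are assumptions for illustration, not part of the interpreter shown earlier:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Finds needle in hay, both given as (pointer, length) counted strings
   without a terminating '\0'. Returns the match offset, or -1 if absent. */
long counted_find(const char *hay, size_t hay_n,
                  const char *needle, size_t needle_n)
{
    assert(hay != NULL && needle != NULL);
    /* VLAs sized at run time: one extra byte each for the terminator. */
    char h[hay_n + 1], nd[needle_n + 1];
    memcpy(h, hay, hay_n);
    h[hay_n] = '\0';
    memcpy(nd, needle, needle_n);
    nd[needle_n] = '\0';
    const char *r = strstr(h, nd);
    return r ? (long)(r - h) : -1;
}
```

Like alloca, a VLA has no way to report failure, so this is only reasonable when the lengths are known to be small. It also assumes the counted strings contain no embedded '\0' bytes.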

Related

Returning allocated buffer, vs buffer passed to a function

When passing values to my functions, I often consider returning an allocated buffer from my function rather than letting the function take a buffer as an argument. I was trying to figure out if there was any significant benefit to passing a buffer to my function, e.g.:
void f(char **buff) {
/* operations */
strcpy(*buff, value);
}
Versus
char *f() {
char *buff = malloc(BUF_SIZE);
/* operations */
return buff;
}
These are obviously not super advanced examples, but I think the point stands. But yeah, are there any benefits to letting the user pass an allocated buffer, or is it better to return an allocated buffer?
Are there any benefits to using one over the other, or is it just useless?
This is a specific case of the more general question of whether a function should return data to its caller via its return value or via an out parameter. Both approaches work fine, and the pros and cons are mostly stylistic, not technical.
The main technical consideration is that each function has only one return value, but can have any number of out parameters. That can be worked around, but doing so might not be acceptable. For example, if you want to reserve your functions' return values for use as status codes such as many standard library functions produce, then that limits your options for sending back other data.
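A minimal sketch of that trade-off, using a made-up function div_mod: the return value is reserved for a status code, and two results come back through out parameters.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stores a / b in *quot and a % b in *rem.
   Returns 0 on success, -1 if the division cannot be performed. */
int div_mod(int a, int b, int *quot, int *rem)
{
    assert(quot != NULL && rem != NULL);   /* NULL here is a caller bug */
    if (b == 0 || (a == INT_MIN && b == -1))
        return -1;                         /* would be undefined behaviour */
    *quot = a / b;
    *rem = a % b;
    return 0;
}
```

The caller checks the status first and only then reads the outputs, for example `if (div_mod(7, 2, &q, &r) == 0) { /* use q and r */ }`.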
Some of the stylistic considerations are
using the return value is more aligned with the idiom of a mathematical function;
many people have trouble understanding pointers; and in particular,
non-local modifications effected through pointers sometimes confuse people. On the other hand,
the return value of a function can be used directly in an expression.
With respect to modifications to the question since this answer was initially posted, if the question is about whether to dynamically allocate and populate a new object vs populating an object presented by the caller, then there are these additional considerations:
allocating the object inside the function frees the caller from allocating it themselves, which is a convenience. On the other hand,
allocating the object inside the function prevents the caller from allocating it themselves (maybe automatically or statically), and does not provide for re-initializing an existing object. Also,
returning a pointer to an allocated object can obscure the fact that the caller has an obligation to free it.
Of course, you can have it both ways:
void init_thing(thing *t, char *name) {
    t->name = name;
}
thing *create_thing(char *name) {
    thing *t = malloc(sizeof(*t));
    if (t) {
        init_thing(t, name);
    }
    return t;
}
Both options work.
But in general, returning information through the parameters (the second option) is preferable because we usually reserve the function's return value to report an error, and we can return several pieces of information through multiple parameters. Hence, it is easier for the caller to check whether the function succeeded by first checking the returned value. Most of the services from the C library or the Linux system calls work like this.
Concerning your examples, both options work because you are referencing a constant string which is globally allocated at program's loading time. So, in both solutions, you return the address of this string.
But if you do something like the following:
char *func(void) {
char buff[] = "example";
return buff;
}
You actually copy the content of the constant string "example" into the stack area of the function pointed by buff. In the caller the returned address is no longer valid as it refers to a stack location which can be reused by any other function called by the caller.
Let's compile a program using this function:
#include <stdio.h>
char *func(void) {
char buff[] = "example";
return buff;
}
int main(void) {
char *p = func();
printf("%s\n", p);
return 0;
}
If the relevant warnings are enabled, the compiler raises a first red flag with a warning like this:
$ gcc -g bad.c -o bad
bad.c: In function 'func':
bad.c:5:11: warning: function returns address of local variable [-Wreturn-local-addr]
5 | return buff;
| ^~~~
The compiler points out the fact that func() is returning the address of a local space in its stack which is no longer valid when the function returns. This is the compiler option -Wreturn-local-addr which triggers this warning. Let's deactivate this option to remove the warning:
$ gcc -g bad.c -o bad -Wno-return-local-addr
So now we have a program that compiles with no warnings, but this is misleading, as the execution fails or may trigger unpredictable behavior:
$ ./bad
Segmentation fault (core dumped)
You can't return the address of local memory.
Your first example works because the memory in "example" will not be deallocated. But if you allocate local (aka automatic) memory, it will automatically be deallocated when the function returns; the returned pointer will be invalid.
char *func() {
char buff[10];
// Copy into local memory
strcpy(buff, "example");
// buff will be deallocated after returning.
// warning: function returns address of local variable
return buff;
}
You either return dynamic memory, using malloc, which the caller must then free.
char *func() {
    char *buf = malloc(10);
    strcpy(buf, "example");
    return buf;
}
int main() {
char *buf = func();
puts(buf);
free(buf);
}
Or you let the caller allocate the memory and pass it in.
void func(char *buff) {
    // Copy a string into the caller's memory
    strcpy(buff, "example");
}
int main() {
    char buf[10];
    func(buf);
    puts(buf);
}
The upside is the caller has full control of the memory. They can reuse existing memory, and they can use local memory.
The downside is the caller must allocate the correct amount of memory. This might lead to allocating too much memory, or too little.
An additional downside is the function has no control over the memory which has been passed in. It cannot grow nor shrink nor free the memory.
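One common way to soften the "correct amount of memory" problem is to pass the buffer size along with the buffer, so the function can refuse to overflow it. The following is only a sketch; fill_example is a hypothetical name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "example" into buf, which has room for size bytes.
   Returns 0 on success, -1 if the buffer is too small. In that case
   buf holds a truncated, NUL-terminated prefix (if size > 0). */
int fill_example(char *buf, size_t size)
{
    assert(buf != NULL || size == 0);
    int needed = snprintf(buf, size, "%s", "example");
    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

The caller still chooses where the memory lives, but a buffer that is too small now produces an error code instead of a silent overrun.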
You can only return one thing from a function.
For example, if you want to convert a string to an integer you could return the integer like atoi does. int atoi( const char *str ).
int num = atoi("42");
But then what happens when the conversion fails? atoi returns 0, but how do you tell the difference between atoi("0") and atoi("purple")?
You can instead pass in an int * for the converted value. int my_atoi( const char *str, int *ret ).
int num;
int err = my_atoi("42", &num);
if(err) {
exit(1);
}
else {
printf("%d\n", num);
}
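The answer does not show how my_atoi is implemented. One possible sketch, built on the standard strtol and assuming that a nonzero return means failure (as the calling code above implies), looks like this:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 0 and stores the value in *ret on success. Returns -1 if str
   is not a complete decimal integer or does not fit in an int. */
int my_atoi(const char *str, int *ret)
{
    assert(str != NULL && ret != NULL);
    char *end;
    errno = 0;
    long v = strtol(str, &end, 10);
    if (end == str || *end != '\0')
        return -1;                  /* no digits at all, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                  /* value out of range for int */
    *ret = (int)v;
    return 0;
}
```

With this version "0" and "purple" are distinguishable: the first returns 0 with *ret set to 0, the second returns -1 and leaves *ret untouched.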

When should one use dynamic memory allocation function versus direct variable declaration?

Below is an example of direct variable declaration.
double multiplyByTwo (double input) {
double twice = input * 2.0;
return twice;
}
Below is an example of dynamic memory allocation.
double *multiplyByTwo (double *input) {
double *twice = malloc(sizeof(double));
*twice = *input * 2.0;
return twice;
}
If I had a choice, I will use direct variable declaration all the time because the code looks more readable. When are circumstances when dynamic memory allocation is more suitable?
When are circumstances when dynamic memory allocation is more suitable?
When the allocation size is not known at compile time, we need to use dynamic memory allocation.
Other than the above case, there are some other scenarios, like
If we want to have a data-structure which is re-sizeable at runtime, we need to go for dynamic memory allocation.
Dynamically allocated memory remains valid until it is free()d. At times this comes in handy when returning the address of some variable from a function, which, with an auto variable, would have gone out of scope.
Usually the stack size is moderately limited. If you want to create and use a huge array, it is better to use dynamic memory allocation, which allocates the memory from the heap.
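As a rough illustration of the last point, the sketch below works on an array whose size is chosen at run time and may be far larger than a typical stack allows. The function name mean_of_sequence is made up for this example:

```c
#include <assert.h>
#include <stdlib.h>

/* Fills a heap array with 0, 1, ..., n-1 and returns its mean.
   Returns -1.0 if the allocation fails. */
double mean_of_sequence(size_t n)
{
    assert(n > 0);
    double *vals = malloc(n * sizeof *vals);   /* heap, not stack */
    if (vals == NULL)
        return -1.0;
    for (size_t i = 0; i < n; ++i)
        vals[i] = (double)i;
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += vals[i];
    free(vals);
    return sum / (double)n;
}
```

Unlike a huge local array, a failed malloc can be detected and handled; a stack overflow usually cannot.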
Dynamic memory allocation with malloc places the memory on the heap, so it is not destroyed when leaving the function.
At a later point you would need to manually free the memory.
Direct declaration lands on the stack and is deleted on leaving the function. What happens on the return statement is that a copy of the variable is made before it is destroyed.
Consider this example:
On heap
void createPeople(void) {
    struct person *p = makePerson();
    addToOffice(p);
    addToFamily(p);
}
Vs. on stack
void createPeople(void) {
    struct person p = makePerson();
    addToOffice(p);
    addToFamily(p);
}
In the first case only one person is created and added to office and family. Now if the person is deleted, it is invalidated in both office and family and moreover, if his data is changed, it is changed in both, too.
In the second case a copy of the person is created for the office and family. Now it can happen that you change data of the copy in office and the copy in family remains the same.
So basically, if you want to give several parties access to the same object, it should be on the heap.
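The difference between sharing one object and handing out copies can be checked with a small sketch. All names here (struct person with a single age field, birthday, sharing_demo) are hypothetical stand-ins for the office and family example:

```c
#include <assert.h>
#include <stdlib.h>

struct person { int age; };

/* Modifies the person the caller points at. */
static void birthday(struct person *p)
{
    assert(p != NULL);
    p->age += 1;
}

/* Returns 1 when a change is visible through the shared pointer but not
   in the copy, 0 otherwise, and -1 if allocation fails. */
int sharing_demo(void)
{
    struct person *original = malloc(sizeof *original);
    if (original == NULL)
        return -1;
    original->age = 30;

    struct person *office_member = original;   /* shares the one object */
    struct person  family_member = *original;  /* independent copy */

    birthday(original);

    int ok = (office_member->age == 31) && (family_member.age == 30);
    free(original);
    return ok;
}
```

Strictly speaking it is the pointer, not the heap, that makes the sharing possible; the heap matters because the shared object usually has to outlive the function that created it.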
"If I had a choice, I will use direct variable declaration all the time"
As well you should. You don't use heap memory unless you need to. Which obviously begs the question: When do I need dynamic memory?
The stack space is limited, if you need more space, you'll have to allocate it yourself (think big arrays, like struct huge_struct array[10000]). To get an idea of how big the stack is see this page. Note that the actual stack size may differ.
C passes arguments, and returns values by value. If you want to return an array, which decays into a pointer, you'll end up returning a pointer to an array that is out of scope (invalid), resulting in UB. Functions like these should allocate memory and return a pointer to it.
When you need to change the size of something (realloc), or you don't know how much memory you'll need to store something. An array that you've declared on the stack is fixed in size, a pointer to a block of memory can be re-allocated (malloc new block >= current block size + memcpy + free original pointer is basically what realloc does)
When a certain piece of memory needs to remain valid over various function calls. In certain cases globals won't do (think threading). Besides: globals are in almost all cases regarded as bad practice.
Shared libs generally use heap memory. This is because their authors can't assume that their code will have tons of stack space readily available. If you want to write a shared library, you'll probably find yourself writing a lot of memory management code.
So, some examples to clarify:
//perfectly fine
double sum(double a, double b)
{
return a + b;
}
//call:
double result = sum(double_a, double_b);
//or to reassign:
double_a = sum(double_a, double_b);
//valid, but silly
double *sum_into(double *target, double b)
{
if (target == NULL)
target = calloc(1, sizeof *target);//zero-initialised, so the sum below starts from 0
if (target != NULL)
*target += b;
return target;
}
//call
sum_into(&double_a, double_b);//pass pointer to stack var
//or allocate new pointer, set to value double_b
double *double_a = sum_into(NULL, double_b);
//or pass double pointer (heap)
sum_into(ptr_a, double_b);
Returning "arrays"
//Illegal
double[] get_double_values(double *vals, double factor, size_t count)
{
double return_val[count];//VLA if C99
for (int i=0;i<count;++i)
return_val[i] = vals[i] * factor;
return return_val;
}
//valid
double *get_double_values(const double *vals, double factor, size_t count)
{
double *return_val = malloc(count * sizeof *return_val);
if (return_val == NULL)
exit( EXIT_FAILURE );
for (int i=0;i<count;++i)
return_val[i] = vals[i] * factor;
return return_val;
}
Having to resize the object:
double * double_vals = get_double_values(
my_array,
2,
sizeof my_array/ sizeof *my_array
);
//store the current size of double_vals here
size_t current_size = sizeof my_array/ sizeof *my_array;
//some code here
//then:
double_vals = realloc(
double_vals,
(current_size + 1) * sizeof *double_vals //size is in bytes, not elements
);
if (double_vals == NULL)
exit( EXIT_FAILURE );
double_vals[current_size] = 0.0;
++current_size;
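The resize step above can be wrapped in a helper that keeps the old block if realloc fails, instead of overwriting the only pointer to it. This is a sketch under that assumption; append_double is a hypothetical name:

```c
#include <assert.h>
#include <stdlib.h>

/* Appends v to the heap array *vals, which currently holds *count elements.
   Returns 0 on success. On failure returns -1 and leaves *vals and *count
   untouched, so the caller still owns a valid block. */
int append_double(double **vals, size_t *count, double v)
{
    assert(vals != NULL && count != NULL);
    /* realloc takes a size in bytes, so scale by the element size. */
    double *tmp = realloc(*vals, (*count + 1) * sizeof **vals);
    if (tmp == NULL)
        return -1;
    tmp[*count] = v;
    *vals = tmp;
    ++*count;
    return 0;
}
```

Starting from `double *a = NULL; size_t n = 0;` works as well, because realloc on a null pointer behaves like malloc.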
Variables that need to stay in scope for longer:
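struct callback_params * some_func( void )
{
    struct callback_params *foo = malloc(sizeof *foo);//allocate memory
    if (foo == NULL)
        exit( EXIT_FAILURE );
    foo->lib_sum = 0;
    call_some_lib_func(foo, callback_func);
    return foo;//hand the pointer back so the caller can inspect and free it
}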
void callback_func(int lib_param, void *opaque)
{
struct callback_params * foo = (struct callback_params *) opaque;
foo->lib_sum += lib_param;
}
In this scenario, our code is calling some library function that processes something asynchronously. We can pass a callback function that handles the results of the library-stuff. The lib also provides us with a means of passing some data to that callback through a void *opaque.
call_some_lib_func will have a signature along the lines of:
void call_some_lib_func(void *, void (*)(int, void *))
Or in a more readable format:
void call_some_lib_func(void *opaque, void (*callback)(int, void *))
So it's a function, called call_some_lib_func, that takes 2 arguments: a void * called opaque, and a function pointer to a function that returns void, and takes an int and a void * as arguments.
All we need to do is cast the void * to the correct type, and we can manipulate it. Also note that some_func returns the opaque pointer, so we can use it wherever we need to:
int main ( void )
{
struct callback_params *params = some_func();
while (params->lib_sum < 100)
printf("Waiting for something: %d%%\r", params->lib_sum);
puts("Done!");
free(params);//free the memory, we're done with it
//do other stuff
return 0;
}
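To try the opaque-pointer pattern without a real library, the sketch below replaces call_some_lib_func with a synchronous stand-in that simply invokes the callback four times. That stand-in and the single-field struct callback_params are assumptions; the real library in this answer works asynchronously:

```c
#include <assert.h>
#include <stdlib.h>

struct callback_params { int lib_sum; };

/* Stand-in for the library: calls back once per value, synchronously. */
static void call_some_lib_func(void *opaque, void (*callback)(int, void *))
{
    for (int i = 1; i <= 4; ++i)
        callback(i * 10, opaque);
}

static void callback_func(int lib_param, void *opaque)
{
    struct callback_params *foo = opaque;  /* void * converts implicitly in C */
    assert(foo != NULL);
    foo->lib_sum += lib_param;
}

/* Allocates the parameter block on the heap so it outlives this call.
   Returns NULL if allocation fails; the caller frees the result. */
struct callback_params *some_func(void)
{
    struct callback_params *foo = malloc(sizeof *foo);
    if (foo == NULL)
        return NULL;
    foo->lib_sum = 0;
    call_some_lib_func(foo, callback_func);
    return foo;
}
```

With a truly asynchronous library, the polling loop in main would additionally need some form of synchronization or a volatile/atomic field, which this sketch does not cover.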
Dynamic memory allocation is needed when you intend to transport data out of a local scope (for example of a function).
Also, when you can not know in advance how much memory you need (for example user input).
And finally, when you do know the amount of memory needed but it overflows the stack.
Otherwise, you should not use dynamic memory allocation because of readability, runtime overhead and safety.

Memory allocation and changing values

I am very new to C so sorry in advance if this is really basic. This is related to homework.
I have several helper functions, and each changes the value of a given variable (binary operations mostly), i.e.:
void helper1(unsigned short *x, arg1, arg2) --> x = &some_new_x
The main function calls other arguments arg3, arg4, arg5. The x is supposed to start at 0 (16-bit 0) at first, then be modified by helper functions, and after all the modifications, should be eventually returned by mainFunction.
Where do I declare the initial x and how/where do I allocate/free memory? If I declare it within mainFunc, it will reset to 0 every time helpers are called. If I free and reallocate memory inside helper functions, I get the "pointer being freed was not allocated" error even though I freed and allocated everything, or so I thought. A global variable doesn't do, either.
I would say that I don't really fully understand memory allocation, so I assume that my problem is with this, but it's entirely possible I just don't understand how to change variable values in C on a more basic level...
The variable x will exist while the block in which it was declared is executed, even during helper execution, and giving a pointer to the helpers allows them to change its value. If I understand your problem right, you shouldn't need dynamic memory allocation. The following code returns 4 from mainFunction:
void plus_one(unsigned short* x)
{
*x = *x + 1;
}
unsigned short mainFunction(void)
{
unsigned short x = 0;
plus_one(&x);
plus_one(&x);
plus_one(&x);
plus_one(&x);
return x;
}
By your description I'd suggest declaring x in your main function as a local variable (allocated from the stack) which you then pass by reference to your helper functions and return it from your main function by value.
int main()
{
int x; //local variable
helper(&x); //passed by reference
return x; //returned by value
}
Inside your helper you can modify the variable by dereferencing it and assigning whatever value needed:
void helper(int * x)
{
*x = ...; //change value of x
}
The alternative is declaring a pointer to x (which gets allocated from the heap) passing it to your helper functions and free-ing it when you have no use for it anymore. But this route requires more careful consideration and is error-prone.
Functions receive a value-wise copy of their inputs to locally scoped variables. Thus a helper function cannot possibly change the value it was called with, only its local copy.
void f(int n)
{
n = 2;
}
int main()
{
int n = 1;
f(n);
return 0;
}
Despite having the same name, n in f is local to the invocation of f. So the n in main never changes.
The way to work around this is to pass by pointer:
void f(int *n)
{
*n = 2;
}
int main()
{
int n = 1;
f(&n);
// now we also see n == 2.
return 0;
}
Note that, again, n in f is local, so if we changed the pointer n in f, it would have no effect on main's perspective. If we wanted to change the address n in main, we'd have to pass the address of the pointer.
void f1(int* nPtr)
{
nPtr = malloc(sizeof(int));
*nPtr = 2;
}
void f2(int** nPtr)
{
// since nPtr is a pointer-to-a-pointer,
// we have to dereference it once to
// reach the "pointer-to-int"
// typeof nPtr = (int*)*
// typeof *nPtr = int*
*nPtr = malloc(sizeof(int));
// deref once to get to int*, deref that for int
**nPtr = 2;
}
int main()
{
int *nPtr = NULL;
f1(nPtr); // passes 'NULL' to param 1 of f1.
// after the call, our 'nPtr' is still NULL
f2(&nPtr); // passes the *address* of our nPtr variable
// nPtr here should no-longer be null.
return 0;
}
---- EDIT: Regarding ownership of allocations ----
The ownership of pointers is a messy can of worms; the standard C library has a function strdup which returns a pointer to a copy of a string. It is left to the programmer to understand that the pointer is allocated with malloc and is expected to be released to the memory manager by a call to free.
This approach becomes more onerous as the thing being pointed to becomes more complex. For example, if you get a directory structure, you might be expected to understand that each entry is an allocated pointer that you are responsible for releasing.
dir = getDirectory(dirName);
for (i = 0; i < numEntries; i++) {
printf("%d: %s\n", i, dir[i]->de_name);
free(dir[i]);
}
free(dir);
If this was a file operation you'd be a little surprised if the library didn't provide a close function and made you tear down the file descriptor on your own.
A lot of modern libraries tend to assume responsibility for their resources and provide matching acquire and release functions, e.g. to open and close a MySQL connection:
// allocate a MySQL descriptor and initialize it.
MYSQL* conn = mysql_init(NULL);
DoStuffWithDBConnection(conn);
// release everything.
mysql_close(conn);
LibEvent has, e.g.
bufferevent_new();
to allocate an event buffer and
bufferevent_free();
to release it, even though what it actually does is little more than malloc() and free(), but by having you call these functions, they provide a well-defined and clear API which assumes responsibility for knowing such things.
This is the basis for the concept known as "RAII" in C++.
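A matching acquire/release pair for a nested structure could be sketched as follows. The type name_list and the functions name_list_new and name_list_free are hypothetical; the point is only that the caller never has to know how many individual allocations sit behind the handle:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct name_list { size_t count; char **names; };

/* Releases every entry, the entry table, and the list itself. */
void name_list_free(struct name_list *list)
{
    if (list == NULL)
        return;
    for (size_t i = 0; i < list->count; ++i)
        free(list->names[i]);
    free(list->names);
    free(list);
}

/* Copies count strings into a new list. Returns NULL on allocation failure. */
struct name_list *name_list_new(const char *const *names, size_t count)
{
    assert(names != NULL || count == 0);
    struct name_list *list = malloc(sizeof *list);
    if (list == NULL)
        return NULL;
    list->count = 0;
    list->names = calloc(count ? count : 1, sizeof *list->names);
    if (list->names == NULL) {
        free(list);
        return NULL;
    }
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(names[i]) + 1;
        list->names[i] = malloc(len);
        if (list->names[i] == NULL) {
            name_list_free(list);   /* frees only the entries made so far */
            return NULL;
        }
        memcpy(list->names[i], names[i], len);
        list->count = i + 1;
    }
    return list;
}
```

The caller's obligations shrink to one rule: whatever name_list_new returns goes to name_list_free exactly once.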

Is it OK to malloc an array in a called function but free it in the calling function?

I'm not an expert in C, but here's what I'm trying to do:
int main(void) {
double *myArray;
...
myFunction(myArray);
...
/* save myArray contents to file */
...
free(myArray);
...
return 0;
}
int myFunction(double *myArray) {
int len=0;
...
/* compute len */
...
myArray = malloc( sizeof(double) * len );
if (myArray == NULL)
exit(1);
...
/* populate myArray */
...
return 0;
}
I'd like to save the contents of myArray inside main, but I don't know the size required until the program is inside myFunction.
Since I'm using CentOS 6.2 Linux, for which I could only find a gcc build available up to 4.4.6 (which doesn't support the C99 feature of declaring a variable-length array; see "broken" under "Variable-length arrays" in http://gcc.gnu.org/gcc-4.4/c99status.html), I'm stuck using -std=c89 to compile.
Simple answer is no.
You are not passing back the pointer.
use
int main(void) {
double *myArray;
...
myFunction(&myArray);
...
/* save myArray contents to file */
...
free(myArray);
...
return 0;
}
int myFunction(double **myArray) {
int len=0;
...
/* compute len */
...
*myArray = malloc( sizeof(double) * len );
if (NULL == *myArray)
exit(1);
...
/* EDIT: populating through a local alias should simplify things for you */
populateThis = *myArray; /* populateThis is a double * declared at the top of the function */
/* populate populateThis */
...
return 0;
}
What you are doing is not OK since myFunction doesn't change the value myArray holds in main; it merely changes its own copy.
Other than that, it's OK even if stylistically debatable.
As a question of good design and practice (apart from syntax issues pointed out in other answers) this is okay as long as it is consistent with your code base's best practices and transparent. Your function should be documented so that the caller knows it has to free and furthermore knows not to allocate its own memory. Furthermore consider making an abstract data type such as:
// myarray.h
typedef struct myarray_t myarray_t; // lets callers write myarray_t without the struct keyword
int myarray_init(myarray_t* array); //int for return code
int myarray_cleanup(myarray_t* array); // will clean up
myarray_t will hold a dynamic pointer that will be encapsulated from the calling function, although in the init and cleanup functions it will respectively allocate and deallocate.
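A possible implementation of that pair is sketched below. The struct layout (a data pointer plus a length) and the extra len parameter to myarray_init are assumptions added for illustration; the answer only gives the two prototypes. The struct is defined in full here so that callers can declare a myarray_t themselves:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct myarray {
    double *data;   /* owned by the init/cleanup pair */
    size_t  len;
} myarray_t;

/* Allocates len zeroed elements. Returns 0 on success, -1 on failure. */
int myarray_init(myarray_t *array, size_t len)
{
    assert(array != NULL);
    array->data = calloc(len ? len : 1, sizeof *array->data);
    if (array->data == NULL) {
        array->len = 0;
        return -1;
    }
    array->len = len;
    return 0;
}

/* Releases the storage and resets the struct so a second cleanup is harmless. */
int myarray_cleanup(myarray_t *array)
{
    assert(array != NULL);
    free(array->data);
    array->data = NULL;
    array->len = 0;
    return 0;
}
```

Because malloc and free both live behind these two functions, the question of who frees the array is answered by the API itself.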
What you want to do is fine, but your code doesn't do it -- main never gets to see the allocated memory. The parameter myArray of myFunction is initialized with the value passed in the function call, but modifying it thereafter doesn't modify the otherwise-unrelated variable of the same name in main.
It appears in your code snippet that myFunction always returns 0. If so then the most obvious way to fix your code is to return myArray instead (and take no parameter). Then the call in main would look like myArray = myFunction();.
If myFunction in fact already uses its return value then you can pass in a pointer to double*, and write the address to the referent of that pointer. This is what Ed Heal's answer does. The double ** parameter is often called an "out-param", since it's a pointer to a location that the function uses to store its output. In this case, the output is the address of the buffer.
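A complete, compilable version of the out-param approach might look like this sketch. The second out parameter len_out and the placeholder computation (five squares) are assumptions standing in for the elided "compute len" and "populate" steps:

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates an array, fills it, and hands both the buffer and its length
   back through out parameters. Returns 0 on success, -1 on failure. */
int myFunction(double **myArray, size_t *len_out)
{
    assert(myArray != NULL && len_out != NULL);
    size_t len = 5;                          /* placeholder for "compute len" */
    double *buf = malloc(len * sizeof *buf);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < len; ++i)         /* placeholder for "populate" */
        buf[i] = (double)(i * i);
    *myArray = buf;                          /* the caller's pointer is updated */
    *len_out = len;
    return 0;
}
```

In main this is called as `myFunction(&myArray, &len)`, and main later calls `free(myArray)`.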
An alternative would be to do something like this:
size_t myFunction(double *myArray, size_t buf_len) {
size_t len = 0;
...
/* compute len */
...
if (buf_len < len) {
return len;
}
/* populate myArray */
...
return len;
}
Then the callers have the freedom to allocate memory any way they like. Typical calling code might look like this:
size_t len = myFunction(NULL, 0);
// warning -- watch the case where len == 0, if that is possible
double *myArray = malloc(len * sizeof(*myArray));
if (!myArray) exit(1);
myFunction(myArray, len);
...
free(myArray);
What you've gained is that the caller can allocate the memory from anywhere that's convenient. What you've lost is that the caller has to write more code.
For an example of how to use that freedom, a caller could write:
#define SMALLSIZE 10
void one_of_several_jobs() {
    // doesn't usually require much space, occasionally does
    double smallbuf[SMALLSIZE];
    double *buf = 0;
    size_t len = myFunction(smallbuf, SMALLSIZE);
    if (len > SMALLSIZE) {
        buf = malloc(len * sizeof(*buf));
        if (!buf) {
            puts("this job is too big, skipping it and moving to the next one");
            return;
        }
        // the first call only reported the required length, so fill the larger buffer now
        myFunction(buf, len);
    } else {
        buf = smallbuf;
    }
    // use buf and len for something
    ...
    if (buf != smallbuf) free(buf);
}
It's usually an unnecessary optimization to avoid a malloc in the common case where only a small buffer is needed -- this is only one example of why the caller might want a say in how the memory is allocated. A more pressing reason might be that your function is compiled into a different dll from the caller's function, perhaps using a different compiler, and the two don't use compatible implementations of malloc/free.

How to return an integer from a function

Which is considered better style?
int set_int (int *source) {
*source = 5;
return 0;
}
int main(){
int x;
set_int (&x);
}
OR
int *set_int (void) {
int *temp = NULL;
temp = malloc(sizeof (int));
*temp = 5;
return temp;
}
int main (void) {
int *x = set_int ();
}
Coming from a higher-level programming background, I gotta say I like the second version more. Any tips would be very helpful. Still learning C.
Neither.
// "best" style for a function which sets an integer taken by pointer
void set_int(int *p) { *p = 5; }
int i;
set_int(&i);
Or:
// then again, minimise indirection
int an_interesting_int() { return 5; /* well, in real life more work */ }
int i = an_interesting_int();
Just because higher-level programming languages do a lot of allocation under the covers, does not mean that your C code will become easier to write/read/debug if you keep adding more unnecessary allocation :-)
If you do actually need an int allocated with malloc, and to use a pointer to that int, then I'd go with the first one (but bugfixed):
void set_int(int *p) { *p = 5; }
int *x = malloc(sizeof(*x));
if (x == 0) { /* do something about the error */ }
set_int(x);
Note that the function set_int is the same either way. It doesn't care where the integer it's setting came from, whether it's on the stack or the heap, who owns it, whether it has existed for a long time or whether it's brand new. So it's flexible. If you then want to also write a function which does two things (allocates something and sets the value) then of course you can, using set_int as a building block, perhaps like this:
int *allocate_and_set_int() {
int *x = malloc(sizeof(*x));
if (x != 0) set_int(x);
return x;
}
In the context of a real app, you can probably think of a better name than allocate_and_set_int...
Some errors:
int main(){
int x*; //should be int* x; or int *x;
set_int(x);
}
Also, you are not allocating any memory in the first code example.
int *x = malloc(sizeof(int));
About the style:
I prefer the first one, because you have less chances of not freeing the memory held by the pointer.
The first one is incorrect (apart from the syntax error) - you're passing an uninitialised pointer to set_int(). The correct call would be:
int main()
{
int x;
set_int(&x);
}
If they're just ints, and it can't fail, then the usual answer would be "neither" - you would usually write that like:
int get_int(void)
{
return 5;
}
int main()
{
int x;
x = get_int();
}
If, however, it's a more complicated aggregate type, then the second version is quite common:
struct somestruct *new_somestruct(int p1, const char *p2)
{
struct somestruct *s = malloc(sizeof *s);
if (s)
{
s->x = 0;
s->j = p1;
s->abc = p2;
}
return s;
}
int main()
{
struct somestruct *foo = new_somestruct(10, "Phil Collins");
free(foo);
return 0;
}
This allows struct somestruct * to be an "opaque pointer", where the complete definition of type struct somestruct isn't known to the calling code. The standard library uses this convention - for example, FILE *.
Definitely go with the first version. Notice that this allowed you to omit a dynamic memory allocation, which is SLOW, and may be a source of bugs, if you forget to later free that memory.
Also, if you decide for some reason to use the second style, notice that you don't need to initialize the pointer to NULL. This value will either way be overwritten by whatever malloc() returns. And if you're out of memory, malloc() will return NULL by itself, without your help :-).
So int *temp = malloc(sizeof(int)); is sufficient.
Memory management rules usually state that the allocator of a memory block should also deallocate it. This is impossible when you return allocated memory. Therefore, the first should be better.
For a more complex type like a struct, you'll usually end up with a function to initialize it and maybe a function to dispose of it. Allocation and deallocation should be done separately, by you.
C gives you the freedom to allocate memory dynamically or statically, and having a function work only with one of the two modes (which would be the case if you had a function that returned dynamically allocated memory) limits you.
typedef struct
{
int x;
float y;
} foo;
void foo_init(foo* object, int x, float y)
{
object->x = x;
object->y = y;
}
int main()
{
foo myFoo;
foo_init(&myFoo, 1, 3.1416f);
}
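To show why an init function that does not allocate is flexible, the sketch below uses the same foo_init on an automatic, a static, and a heap-allocated object. The wrapper init_three_ways is a made-up name for this demonstration:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int   x;
    float y;
} foo;

void foo_init(foo *object, int x, float y)
{
    assert(object != NULL);
    object->x = x;
    object->y = y;
}

/* Initializes one automatic, one static, and one heap object with the same
   function. Returns the sum of their x fields, or -1 if malloc fails. */
int init_three_ways(void)
{
    foo on_stack;
    static foo in_static;
    foo *on_heap = malloc(sizeof *on_heap);
    if (on_heap == NULL)
        return -1;

    foo_init(&on_stack, 1, 3.1416f);
    foo_init(&in_static, 2, 2.0f);
    foo_init(on_heap, 3, 1.0f);

    int total = on_stack.x + in_static.x + on_heap->x;
    free(on_heap);
    return total;
}
```

A function that returned a freshly malloc'ed foo could only ever serve the third case.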
In the second one you would need a pointer to a pointer for it to work, and in the first you are not using the return value, though you should.
I tend to prefer the first one, in C, but that depends on what you are actually doing, as I doubt you are doing something this simple.
Keep your code as simple as you need to get it done, the KISS principle is still valid.
It is best not to return a piece of allocated memory from a function: if somebody does not know how it works, they might not deallocate the memory.
The memory deallocation should be the responsibility of the code allocating the memory.
The first is preferred (assuming the simple syntax bugs are fixed) because it is how you simulate an Out Parameter. However, it's only usable where the caller can arrange for all the space to be allocated to write the value into before the call; when the caller lacks that information, you've got to return a pointer to memory (maybe malloced, maybe from a pool, etc.)
What you are asking more generally is how to return values from a function. It's a great question because it's so hard to get right. What you can learn are some rules of thumb that will stop you making horrid code. Then, read good code until you internalize the different patterns.
Here is my advice:
In general any function that returns a new value should do so via its return statement. This applies for structures, obviously, but also arrays, strings, and integers. Since integers are simple types (they fit into one machine word) you can pass them around directly, not with pointers.
Never pass pointers to integers, it's an anti-pattern. Always pass integers by value.
Learn to group functions by type so that you don't have to learn (or explain) every case separately. A good model is a simple OO one: a _new function that creates an opaque struct and returns a pointer to it; a set of functions that take the pointer to that struct and do stuff with it (set properties, do work); a set of functions that return properties of that struct; a destructor that takes a pointer to the struct and frees it. Hey presto, C becomes much nicer like this.
When you do modify arguments (only structs or arrays), stick to conventions, e.g. stdc libraries always copy from right to left; the OO model I explained would always put the structure pointer first.
Avoid modifying more than one argument in one function. Otherwise you get complex interfaces you can't remember and you eventually get wrong.
Return 0 for success, -1 for errors, when the function does something which might go wrong. In some cases you may have to return -1 for errors, 0 or greater for success.
The standard POSIX APIs are a good template but don't use any kind of class pattern.
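The "-1 for errors, 0 or greater for success" convention from the list above can be illustrated with a small search function. The name find_index is hypothetical, and the sketch assumes the array is small enough for its indices to fit in an int:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index (0 or greater) of the first element equal to wanted,
   or -1 if no element matches. */
int find_index(const int *values, size_t count, int wanted)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        if (values[i] == wanted)
            return (int)i;
    return -1;
}
```

A caller then writes `if ((i = find_index(v, n, 15)) < 0) { /* not found */ }`, which is the same shape as many POSIX calls.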
