I have a question about C structs and data types. I have a struct called test:
struct test
{
char* c;
char* c2;
};
And I am returning this struct from a function:
struct test a()
{
struct test t = { "yeah!", "string" };
return t;
}
My question is whether the memory for the struct is freed automatically or if I have to do this manually via free().
[update from comment:]
The function a is in a DLL and I want to use the struct in the main program.
You should only free something which you malloced (or used another similar function) first. Since nothing was malloced, nothing should be freed.
TL;DR Version: You do not need to manually free anything; you can treat this struct instance the way you would treat any scalar variable.
Slightly Longer Version: The struct instance t has automatic storage duration, meaning its lifetime extends over the lifetime of the a function; once a exits, any memory allocated for t is released. A copy of the contents of t is returned to the caller.
As for those contents...
c and c2 are pointing to string literals; string literals are allocated such that their lifetime extends over the entire program's execution. So the pointer values in c and c2 will be valid after t is returned from a; indeed, those pointer values will be valid over the lifetime of the program.
You should only have to call free on something that was allocated via malloc, calloc, or realloc.
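A minimal sketch of the point above, assuming a hypothetical function name make_test in place of a and a hypothetical helper print_test (neither is from the question). It shows the struct being copied back to the caller while the string literals stay valid, so no free() is involved:

```c
#include <stdio.h>

struct test {
    const char *c;
    const char *c2;
};

/* Hypothetical stand-in for the function a() from the question. */
struct test make_test(void)
{
    /* t has automatic storage; the literals it points to have static storage. */
    struct test t = { "yeah!", "string" };
    return t; /* the caller receives a copy of both pointer values */
}

/* Hypothetical helper: uses the copy after make_test() has returned. */
void print_test(void)
{
    struct test copy = make_test();
    printf("%s %s\n", copy.c, copy.c2); /* still valid: literals outlive the call */
    /* no free() here: nothing came from malloc, calloc, or realloc */
}
```

The members are declared const char * here only because string literals must not be modified; that is a choice made for this sketch, not something the question's struct requires.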
You do not have to free a non-dynamic allocation. However, if you want another function to modify the struct, you have to pass the address of the struct and have the function take it as a pointer (struct test *); if you pass the struct by value, the function works on a copy and its changes are not visible to the caller.
Could someone please explain the difference between creating a structure with and without malloc? When should malloc be used, and when should regular initialization be used?
For example:
struct person {
char* name;
};
struct person p = {.name="apple"};
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
What is really the difference between the two? When would one approach be used over the other?
Given a data structure like:
struct myStruct {
int a;
char *b;
};
struct myStruct p; // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct)); // alternative 2
Alternative 1: Allocates enough memory for one myStruct on the stack; &p gives you the address of the first byte of the struct. If p is declared in a function, its lifetime ends when the function exits (once execution leaves that scope, you can no longer safely reach it).
Alternative 2: Allocates enough memory for one myStruct on the heap, plus a pointer of type struct myStruct * on the stack. The pointer on the stack is assigned the address of the struct (which is on the heap), and you work with the struct through that pointer value. The struct's lifetime does not end until you call free(q).
In the latter case, say, myStruct sits on memory address 0xabcd0000 and q sits on memory address 0xdddd0000; then, the pointer value on memory address 0xdddd0000 is assigned as 0xabcd0000 and this is returned back to you.
printf("%p\n", (void *)&p); // prints the address of the struct p on the stack (a different address)
printf("%p\n", (void *)q); // will print "0xabcd0000" (the address of the heap struct)
printf("%p\n", (void *)&q); // will print "0xdddd0000" (the address of the pointer)
Addressing the second part of your question, when to use which:
If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
If this struct is temporary and you do not need its value after, you do not need to malloc memory.
Usage with an example:
struct myStruct {
int a;
char *b;
};
struct myStruct *foo() {
struct myStruct p;
p.a = 5;
return &p; // after this point, it's out of scope; possible warning
}
struct myStruct *bar() {
struct myStruct *q = malloc(sizeof(struct myStruct));
q->a = 5;
return q;
}
int main() {
struct myStruct *pMain = foo();
// memory is allocated in foo. p.a was assigned as '5'.
// a memory address is returned.
// but be careful!!!
// memory is susceptible to be overwritten.
// it is out of your control.
struct myStruct *qMain = bar();
// memory is allocated in bar. q->a was assigned as '5'.
// a memory address is returned.
// memory is *not* susceptible to be overwritten
// until you use 'free(qMain);'
}
If we assume both examples occur inside a function, then in:
struct person p = {.name="apple"};
the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:
You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
You are working with a small number of objects at one time.
In:
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:
The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.
Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.
If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.
Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the type of p_tr requires edits in two places, which gives a human an opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.
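The last two points can be sketched together. This is only an illustration: the age member and the function name person_new are assumptions added here so that there is a second member to initialize.

```c
#include <stdlib.h>

struct person {
    char *name;
    int age; /* hypothetical extra member, added to show initialization */
};

/* Returns a heap-allocated person, or NULL if allocation fails. */
struct person *person_new(void)
{
    /* sizeof *p follows the type of p, so the size cannot get out of sync */
    struct person *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    /* malloc leaves the members indeterminate, so set every one explicitly */
    p->name = "apple";
    p->age = 0;
    return p;
}
```

The caller owns the result and is expected to pass it to free() when done with it.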
struct person p = {.name="apple"};
^This is Automatic allocation for a variable/instance of type person.
struct person* p_tr = malloc(sizeof(struct person));
^This is dynamic allocation for a variable/instance of type person.
Static memory allocation occurs at Compile Time.
Dynamic memory allocation means it allocates memory at runtime when the program executes that line of instruction
Judging by your comments, you are interested in when to use one or the other. Note that all types of allocation reserve enough memory to hold the value of the variable. The size depends on the type of the variable. Statically allocated variables are pinned to a place in memory by the compiler. Automatically allocated variables are pinned to a place on the stack by the same compiler. Dynamically allocated variables do not exist before the program starts and do not have any place in memory until they are allocated by 'malloc' or other functions.
All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.
The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.
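As a rough sketch of that idea, assuming a hypothetical node type and hypothetical function names (none of them come from the question), a singly linked list can grow one malloc at a time and be reached through a single named pointer:

```c
#include <stdlib.h>

/* Hypothetical list node holding one int read from some input. */
struct node {
    int value;
    struct node *next;
};

/* Prepends a new node to *head; returns 0 on success, -1 if malloc fails. */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Counts the nodes by walking from the single named pointer. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Frees every node; each malloc gets exactly one free. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

A caller would start with struct node *head = NULL; and call list_push(&head, value) once per item read, without knowing the item count in advance.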
Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.
There are other uses as well.
In your simple example there is not much difference between the two cases. The second requires additional work: a call to the 'malloc' function to allocate the memory for your struct. In the first case, by contrast (assuming the declaration is at file scope), the memory for the struct is allocated in a static program region defined at program start-up. Note that the pointer in the second case is also allocated statically. It just holds the address of the memory region for the struct.
Also, as a general rule, the dynamically allocated data should be eventually freed by the 'free' function. You cannot free the static data.
Is it safe to return the pointer to a local struct in C? I mean is doing this
struct myStruct* GetStruct()
{
struct myStruct *str = (struct myStruct*)malloc(sizeof(struct myStruct));
//initialize struct members here
return str;
}
safe?
Thanks.
In your code, you aren't returning a pointer to a local structure. You are returning a pointer to a malloc()'d buffer that will reside upon the heap.
Thus, perfectly safe.
However, the caller (or the caller's caller or the caller's caller's callee, you get the idea) will then be responsible for calling free().
What isn't safe is this:
char *foo() {
char bar[100];
// fill bar
return bar;
}
As that returns a pointer to a chunk of memory that is on the stack -- is a local variable -- and, upon return, that memory will no longer be valid.
Tinkertim refers to "statically allocating bar and providing mutual exclusion".
Sure:
char *foo() {
static char bar[100];
// fill bar
return bar;
}
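This will work in that it will return a pointer to the statically allocated buffer bar. Statically allocated means that bar has static storage duration: there is a single instance that lives for the whole program, like a global, although its name is only visible inside foo().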
Thus, the above will not work in a multi-threaded environment where there may be concurrent calls to foo(). You would need to use some kind of synchronization primitive to ensure that two calls to foo() don't stomp on each other. There are many, many, synchronization primitives & patterns available -- that combined with the fact that the question was about a malloc()ed buffer puts such a discussion out of scope for this question.
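The shared-buffer effect shows up even without threads: a second call overwrites what the first call returned. The sketch below uses hypothetical function names (label_static, label_into) and a made-up "item-%d" format purely for illustration. It contrasts the static buffer with a caller-supplied buffer, which is one common way to avoid the shared state:

```c
#include <stdio.h>

/* Hypothetical formatter that returns a pointer to a static buffer. */
const char *label_static(int id)
{
    static char buf[32]; /* one shared instance for all calls */
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf; /* valid after return, but overwritten by the next call */
}

/* Alternative: the caller owns the storage, so calls do not share state. */
void label_into(char *out, size_t size, int id)
{
    snprintf(out, size, "item-%d", id);
}
```

With label_static, two results cannot be held at the same time because both pointers refer to the same array; with label_into, each caller passes its own array.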
To be clear:
// this is an allocation on the stack and cannot be safely returned
char bar[100];
// this is just like the above; don't return it!!
char *bar = alloca(100);
// this is an allocation on the heap and **can** be safely returned, but you gotta free()
malloc(100);
// this is a global or static allocation of which there is only one per app session
// you can return it safely, but you can't write to it from multiple threads without
// dealing with synchronization issues!
static char bar[100];
Think of it this way: You can return a pointer from a function if the memory allocated to that pointer is not local to that function (i.e. on the stack frame of that instance of that function - to be precise)
Is it allowed to dynamically allocate memory for a static variable like this:
#include <stdio.h>
#include <stdlib.h>
struct person
{
int age;
int number;
};
static struct person* person_p = NULL;
int main()
{
person_p = (struct person*)malloc(10 * sizeof(struct person));
}
The above code builds, but is it really allowed to dynamically allocate memory for a static variable?
Yes, it's valid and allowed. Unless you're using the pointer only as a placeholder, you can (and need to) allocate memory for it before using it and free() that memory afterwards.
Also, please note that you should not cast the return value of malloc() and family in C.
I don't see why not. Even though static means there can only be one instance of the object, you still need space for that object. Keep in mind however that anything that is malloc'd needs to be free'd, so you will want to do that at the end of your main() function.
Memory isn't "owned" by pointers to it. You can do the following things:
Dynamically allocate memory
Make a pointer point to that memory
It doesn't really make sense to say "dynamically allocate memory for a pointer".
Any pointer can point to any object (subject to alignment and aliasing restrictions); the storage duration of the pointer makes no difference.
Note that it is the pointer that is static, not the memory it points to.
static means two unrelated things:
1. give static memory allocation (memory allocated at start of program, and only released at end of program)
2. give internal linkage (do not allow other compilation units - modules - to access the identifier. Here a variable, could be a function)
In your code, 1. static memory allocation is irrelevant, since the variable is global anyway and as such already has it.
Then, 2. internal linkage does not matter either, because what you are trying to do is inside the module anyway.
In other words, person_p is exactly as a usual global variable within your module, and you can do whatever you want to it.
It's only the pointer that is defined by this line of code, so you can dynamically allocate memory elsewhere, and assign the memory address to person_p if you wish.
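A small sketch of that separation, with hypothetical helper names people_init and people_cleanup (not part of the question's code): the pointer itself is static, while the block it points to is obtained and released at run time.

```c
#include <stdlib.h>

struct person {
    int age;
    int number;
};

/* The pointer has static storage duration; it points at nothing yet. */
static struct person *person_p = NULL;

/* Hypothetical helper: allocates `count` zeroed persons; returns 0 on success. */
int people_init(size_t count)
{
    person_p = calloc(count, sizeof *person_p);
    return person_p != NULL ? 0 : -1;
}

/* Hypothetical helper: releases the block and resets the pointer. */
void people_cleanup(void)
{
    free(person_p);
    person_p = NULL;
}
```

calloc is used here instead of malloc only so that the members start out as zero; malloc(count * sizeof *person_p) would work the same way with indeterminate contents.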
I am a Java programmer learning C. I have a question regarding functions. What are the differences between this:
main()
{
struct person myperson;
myperson = myfunction();
return;
}
struct person myfunction()
{
struct person myPerson;
myPerson.firstname = "John";
myPerson.lastname = "Doe";
return myPerson;
}
VS
main()
{
struct person *myperson;
myperson = myfunction();
return;
}
struct person* myfunction()
{
struct person *myPerson;
myPerson = malloc(sizeof(struct person));
myPerson->firstname = "John";
myPerson->lastname = "Doe";
return myPerson;
}
Are these legal in C? And y would 1 choose one over the other.
Thanks so much guys!
first code sample:
you create a struct in myfunction() on your stack and return it. then, you create another stack struct, and you copy the first to the second. the first is destroyed. the second will be automatically destroyed when you are out of the scope.
2 structs were actually created.
second code sample:
you create a struct in myfunction(), and then you copy only the address. the struct in main will actually be the same struct.
only one struct is created in here.
both code samples work, but for the latter you will have to explicitly free the memory allocated for the struct to avoid a memory leak; performance should be better, though, since you don't need to copy the struct!
EDIT:
as mentioned by @Mat: this of course neglects the overhead of malloc(), so the performance claim is not true for small structs.
The first version allocates the object on the stack and returns a copy of it. The second version creates the object on the heap and returns a pointer to it (this is closest to Java references except that the memory isn't automatically freed). You should not forget to call free() later on the returned pointer.
Btw, your main function is bad. It should be
int main(void)
{
...
return 0;
}
I suggest that you should read a good C book. This is really basic stuff you're asking.
I'm not sure if all this talk of "heap" and "stack" is cutting to the core of the language, so let me try something more language-intrinsic.
Your first version uses only automatic allocation, which means that all variables have automatic lifetime. That is, all variables end their life at the end of their enclosing scope: myfunction creates a local variable of type struct person and returns a copy of that variable; the main function declares a local variable of the same type and assigns to it the result of the function call. At the end of each scope, the local variables end as well.
The second version uses dynamic or manual allocation. You explicitly allocate storage for a person variable with the malloc() call, and that storage will remain allocated until someone deallocates it (via free()). Since you never deallocate it, this is in effect a memory leak.
The fundamental difference is one of lifetime and responsibility.
A few pros and cons: Automatic allocation means that responsibility is local, and you generally don't have to worry about anything. However, it comes at the price of having to copy arguments and return values by value, which may be expensive or undesirable. Manual allocation allows you to refer to large amounts of memory via a simple, cheap pointer, and is often the only way to implement certain constructions, but carries the burden of having the author remember who's responsible for which resource.
Both are legal, both work.
The 1st version is simpler, you avoid having to deal with memory allocation and releasing.
The 2nd version will perform better for bigger structs because you avoid putting the whole struct on stack for handing it over.
I would actually choose a third way. Let the caller worry about providing storage space (auto or dynamically allocated):
void myfunction(struct person* myPerson)
{
myPerson->firstname = "John";
myPerson->lastname = "Doe";
}
The function can be called either with an automatically or dynamically allocated variable:
struct person autoperson;
myfunction(&autoperson);
struct person *dynamic_person = malloc(sizeof *dynamic_person);
myfunction(dynamic_person);
The first will allocate a struct person on the stack, and pass a copy of it back, then free the original. The second one will allocate it on the heap and pass a pointer to the location which was allocated, and will not free it.
The first one allocates the variables on the stack. The person object from myfunction is copied from the function and returned which is less efficient, but you can't get a memory leak which is good.
The second example returns a pointer (the *) to a person object that is dynamically allocated (with malloc). The person object allocated by malloc will never be destroyed unless you explicitly call free() on it, which you haven't, so you have a memory leak.
You need to explicitly free memory in C, it doesn't have garbage-collection like Java.
The first option creates a struct on the stack, when returning it, it gets copied to your struct defined in the main() function. Also copied are the fields. For larger structs this can be a costly operation.
The second option allocates dynamic memory, which does not get copied when you return it. You have to free() the pointer to avoid a memory leak.
Of course it depends on your needs, but for more important and long living objects I'd go for the second option. Also I would recommend to write allocation/initialization functions and a corresponding deallocation function. (see below why)
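One caveat about the strings: the two strings you set in myfunction() are string literals, which have static storage duration, so those pointers remain valid outside of the function. If the names came from a local buffer on the stack instead, they would be invalid after the function returns; in that case you have to use strdup() or a similar function to make this fail-safe. Of course, to avoid memory leaks you then have to free() the strdup()ed pointers, just as with malloc().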
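A minimal sketch of such a paired allocation/deallocation interface, assuming hypothetical names person_create, person_destroy, and copy_string. The strings are copied so that the object owns them regardless of where the caller's strings live; copy_string is written out by hand because strdup() is a POSIX function rather than part of older C standards.

```c
#include <stdlib.h>
#include <string.h>

struct person {
    char *firstname;
    char *lastname;
};

/* Hypothetical helper: heap copy of a string (what POSIX strdup does). */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Releases a person and the strings it owns; accepts NULL. */
void person_destroy(struct person *p)
{
    if (p == NULL)
        return;
    free(p->firstname);
    free(p->lastname);
    free(p);
}

/* Allocates a person owning copies of both names; NULL on failure. */
struct person *person_create(const char *first, const char *last)
{
    struct person *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->firstname = copy_string(first);
    p->lastname = copy_string(last);
    if (p->firstname == NULL || p->lastname == NULL) {
        person_destroy(p); /* free(NULL) is a no-op, so partial failure is fine */
        return NULL;
    }
    return p;
}
```

Keeping creation and destruction in one place means callers never need to know which members were individually allocated; they pair every person_create with one person_destroy.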
In the first code, myPerson is an object of type struct person that is managed (*) by the implementation itself. In the second code, it is an object of type struct person * (a pointer to a struct person). In the second code, the object itself must be managed by the programmer (malloc, realloc, free).
Also, in the first code, the object itself is copied around a few times whereas in the 2nd code "only" the pointer gets copied. Usually a pointer is much smaller than an object of a struct type.
Use the 2nd approach but remember to free the object.
Even better, create the object in the parent function and pass a pointer to functions: struct person *myfunction(struct person *data) { /* ... */ }
(*) with object management I mean the time it gets created and deleted and stuff
First One:
main()
{
// create a person struct on the stack
struct person myperson;
// copy the struct returned by myfunction to myperson.
myperson = myfunction();
}
struct person myfunction()
{
// create a person struct on the stack.
struct person myPerson;
myPerson.firstname = "John";
myPerson.lastname = "Doe";
// return the myPerson struct. After myFunction returns, the memory
// holding the myPerson struct on the stack will be freed.
return myPerson;
}
Second one:
main()
{
// create a pointer to a person struct on the stack
struct person *myperson;
// assign the pointer returned by myfunction to myperson
myperson = myfunction();
}
struct person* myfunction()
{
// create a pointer to a person struct on the stack
struct person *myPerson;
// allocate memory for a person struct in dynamic memory and set myPerson
// to point to that memory. This memory will remain valid until it's freed by
// a call to the "free" function. Using malloc is much slower than creating
// an object on the stack. There is also the added performance cost of
// freeing the allocated memory at a later stage.
myPerson = malloc(sizeof(struct person));
myPerson->firstname = "John";
myPerson->lastname = "Doe";
// return the myPerson pointer
return myPerson;
}
1)
For which datatypes must I allocate memory with malloc?
Only for types like structs and pointers, but not for basic data types like int?
For all types?
2)
Why can I run this code? Why does it not crash? I assumed that I need to allocate memory for the struct first.
#include <stdio.h>
#include <stdlib.h>
typedef unsigned int uint32;
typedef struct
{
int a;
uint32* b;
}
foo;
int main(int argc, char* argv[])
{
foo foo2;
foo2.a = 3;
foo2.b = (uint32*)malloc(sizeof(uint32));
*foo2.b = 123;
}
Wouldn't it be better to use
foo* foo2 = malloc(sizeof(foo));
3)
How is foo.b set? Does it reference random memory or NULL?
#include <stdio.h>
#include <stdlib.h>
typedef unsigned int uint32;
typedef struct
{
int a;
uint32* b;
}
foo;
int main(int argc, char* argv[])
{
foo foo2;
foo2.a = 3;
}
All types in C can be allocated either dynamically, automatically (on the stack) or statically. The issue is not the type, but the lifetime you want - you use malloc when you want an object to exist outside of the scope of the function that created it, or when you don't know in advance how big a thing you need.
Edit to address your numbered questions.
There are no data types you must allocate with malloc. Only if you want a pointer type to point to valid memory must you use the unary & (address-of) operator or malloc() or some related function.
There is nothing wrong with your code - the line:
foo foo2;
Allocates a structure on the stack - then everything works as normal. Structures are no different in this sense than any other variable. It's not better or worse to use automatic variables (stack allocation) or globals or malloc(), they're all different, with different semantics and different reasons to choose them.
In your example in #3, foo2.b's value is indeterminate. Any automatic variable has an indeterminate value until you explicitly initialize it, so it must not be read before then.
You must allocate with malloc any memory that you wish to be managed manually, as opposed to automatically. It doesn't matter if what's stored there is an int or a double or a struct or anything; malloc is all about manual memory management.
When you create a local variable without malloc, it is stored on the stack and when it falls out of scope, its memory is automatically reclaimed. A variable falls out of scope when the variable is no longer accessible; e.g. when the block or function that the variable was declared in ends.
When you allocate memory with malloc, it is stored on the heap, and malloc returns a pointer to this memory. This memory will not be reclaimed until you call free on it, regardless of whether or not a pointer to it remains accessible (when no pointers remain to heap-allocated memory, this is a memory leak). This means that you gain the ability to continue to use the memory you allocated after the block or function that it was allocated in ends, but on the other hand you now have the responsibility to manually deallocate it when you are finished with it.
In your example, foo2 is on the stack, and will be automatically deallocated when main ends. However, the memory pointed to by foo2.b will not be automatically deallocated, since it is on the heap. This isn't a problem in your example because all memory is returned to the OS when the program ends, but if it were in a function other than main it would have been a memory leak.
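To illustrate the "function other than main" case, here is a sketch that moves the question's code into a hypothetical function use_foo (the name and the return value are assumptions made for this example) and releases the heap block before returning:

```c
#include <stdlib.h>

typedef unsigned int uint32;

typedef struct {
    int a;
    uint32 *b;
} foo;

/* Hypothetical function: same work as the question's main, without the leak. */
uint32 use_foo(void)
{
    foo foo2; /* automatic: released when use_foo returns */
    uint32 result = 0;

    foo2.a = 3;
    foo2.b = malloc(sizeof *foo2.b); /* heap: released only by free() */
    if (foo2.b != NULL) {
        *foo2.b = 123;
        result = *foo2.b;
        free(foo2.b); /* without this, each call would leak one uint32 */
    }
    return result;
}
```

The struct foo2 disappears on its own at the closing brace; only the block that foo2.b points to needs the explicit free().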
foo foo2; automatically allocates the structure on the stack, and it is automatically deallocated when the enclosing function (main in this case) ends.
You only need to allocate memory on the heap, using malloc, if you need the structure to persist after the enclosing scope ends. You may also need to do this when the object is too large to fit on the stack.
2) Why can I run this code? Why does it not crash?
The code never refers to undefined memory or NULL. Why would it crash? (You have a memory leak as written, but it's probably because you're only showing part of the code and in the given program it's not a problem anyway.)
The alternative code you suggest will also work, though memory returned from malloc is also uninitialized by default. (I once worked with a custom memory allocator that filled returned memory blocks with ? characters by default. Still perfectly legal by the rules. Note that calloc returns a pointer to zero-initialized memory; use it if that's what you want.)
3) How is foo.b set? Does it reference random memory or NULL?
Random memory. Stack allocated structures are not initialized for you.
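If a known starting value is wanted, one option is to initialize the struct at the point of declaration. The sketch below uses a hypothetical helper make_foo (not from the question) and relies on the rule that members without an explicit initializer are set to zero, which for a pointer means a null pointer:

```c
#include <stddef.h>

typedef unsigned int uint32;

typedef struct {
    int a;
    uint32 *b;
} foo;

/* Hypothetical helper: returns a foo whose pointer member is known to be NULL. */
foo make_foo(int a)
{
    foo f = {0}; /* initializes a to 0 and b to a null pointer */
    f.a = a;
    return f;
}
```

After foo foo2 = make_foo(3);, testing foo2.b == NULL is meaningful, whereas with a bare foo foo2; the value of foo2.b could be anything.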
You can do it, but this is not enough.
Because, the second field is a pointer, which has to be set to a valid address. One of the ways to do it is by allocating memory (with malloc).
To your first question -
use malloc when you MUST manage object's lifetime manually.
For instance, the second code could be rewritten as:
int main(int argc, char* argv[])
{
foo foo2; uint32 b;
foo2.a = 3;
foo2.b = &b;
*foo2.b = 123;
}
This is better, because lifetime is the same, and the memory is now on stack - and doesn't need to be freed.
The memory for the struct instance ("foo2") will be allocated on the stack, so there's no need to allocate memory for it yourself. If you do allocate using malloc, be sure to free the memory later.