Returning a pointer to an automatic variable - c

Say you have the following function:
char *getp()
{
char s[] = "hello";
return s;
}
Since the function is returning a pointer to a local variable in the function to be used outside, will it cause a memory leak?
P.S. I am still learning C so my question may be a bit naive...
[Update]
So, if say you want to return a new char[] array (ie maybe for a substring function), what do you return exactly? Should it be pointer to an external variable ? ie a char[] that is not local to the function?

It won't cause a memory leak. It'll cause a dangling reference. The local variable is allocated on the stack and will be freed as soon as it goes out of scope. As a result, when the function ends, the pointer you are returning no longer points to a memory you own. This is not a memory leak (memory leak is when you allocate some memory and don't free it).
[Update]:
To be able to return an array allocated in a function, you should allocate it outside stack (e.g. in the heap) like:
char *test() {
char* arr = malloc(100);
arr[0] = 'M';
return arr;
}
Now, if you don't free the memory in the calling function after you finished using it, you'll have a memory leak.

No, it wont leak, since its destroyed after getp() ends;
It will result in undefined behaviour, because now you have a pointer to a memory area that no longer holds what you think it does, and that can be reused by anyone.
A memory leak would happen if you stored that array on the heap, without executing a call to free().
char* getp(){
char* p = malloc(N);
//do stuff to p
return p;
}
int main(){
char* p = getp();
//free(p) No leak if this line is uncommented
return 0;
}
Here, p is not destroyed because its not in the stack, but in the heap. However, once the program ends, allocated memory has not been released, causing a memory leak ( even though its done once the process dies).
[UPDATE]
If you want to return a new c-string from a function, you have two options.
Store it in the heap (as the example
above or like this real example that returns a duplicated string);
Pass a buffer parameter
for example:
//doesnt exactly answer your update question, but probably a better idea.
size_t foo (const char* str, size_t strleng, char* newstr);
Here, you'd have to allocate memory somewhere for newstr (could be stack OR heap) before calling foo function. In this particular case, it would return the amount of characters in newstr.

It's not a memory leak because the memory is being release properly.
But it is a bug. You have a pointer to unallocated memory. It is called a dangling reference and is a common source of errors in C. The results are undefined. You wont see any problems until run-time when you try to use that pointer.

Auto variables are destroyed at the end of the function call; you can't return a pointer to them. What you're doing could be described as "returning a pointer to the block of memory that used to hold s, but now is unused (but might still have something in it, at least for now) and that will rapidly be filled with something else entirely."

It will not cause memory leak, but it will cause undefined behavior. This case is particularly dangerous because the pointer will point somewhere in the program's stack, and if you use it, you will be accessing random data. Such pointer, when written through, can also be used to compromise program security and make it execute arbitrary code.

No-one else has yet mentioned another way that you can make this construct valid: tell the compiler that you want the array "s" to have "static storage duration" (this means it lives for the life of the program, like a global variable). You do this with the keyword "static":
char *getp()
{
static char s[] = "hello";
return s;
}
Now, the downside of this is that there is now only one instance of s, shared between every invocation of the getp() function. With the function as you've written it, that won't matter. In more complicated cases, it might not do what you want.
PS: The usual kind of local variables have what's called "automatic storage duration", which means that a new instance of the variable is brought into existence when the function is called, and disappears when the function returns. There's a corresponding keyword "auto", but it's implied anyway if you don't use "static", so you almost never see it in real world code.

I've deleted my earlier answer after putting the code in a debugger and watching the disassembly and the memory window.
The code in the question is invalid and returns a reference to stack memory, which will be overwritten.
This slightly different version, however, returns a reference to fixed memory, and works fine:
char *getp()
{
char* s = "hello";
return s;
}

s is a stack variable - it's automatically de-referenced at the end of the function. However, your pointer won't be valid and will refer to an area of memory that could be overwritten at any point.

Related

Freeing a pointer inside a function, and using it in main

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
free(t);
return t;
};
int main(void) {
printf("%s\n", test());
return 0;
};
I would like to allocate and de-allocate memory inside the function. I tested this code and works, but I am wondering:
Why does this work?
Is it good practice to use the value of a freed pointer in main ?
Once you call free on a pointer, the memory it pointed to is no longer valid. Attempting to use a pointer to freed memory triggers undefined behavior. In this particular case it happened to work, but there's no guarantee of that.
If the function returns allocated memory, it is the responsibility of the caller to free it:
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
return t;
};
int main(void) {
char *t = test();
printf("%s\n", t);
free(t);
return 0;
};
malloc reserves memory for use.
free releases that reservation. In general, it does not make the memory go away, it does not change the contents of that memory, and it does not alter the value of the pointer that held the address.
After free(t), the bytes of t still contain the same bit settings they did before the free. Then return t; returns those bits to the caller.
When main passes those bits to printf, printf uses them as the address to get the characters for %s. Since nothing has changed them, they are printed.
That is why you got the behavior you did with this program. However, none of it is guaranteed. Once free was called with t, the memory reservation was gone. Something else in your program could have used that memory. For example, printf might have allocated a buffer for its own internal use, and that could have used the same memory.
For the most part, malloc and free are just methods of coordinating use of memory, so that different parts of your program do not try to use the same memory at the same time for different purposes. When you only have one part of your program using allocated memory, there are no other parts of your program to interfere with that. So the lack of coordination did not cause your program to fail. If you had multiple routines in your program using allocated memory, then attempting to use memory after it has been released is more likely to encounter problems.
Additionally, once the memory has been freed, the compiler may treat a pointer to it as if it has no fixed value. The return t; statement is not required to return any particular value.
It doesn't matter where do you free() a pointer. Once it is free()d, the pointer is not deferrenciable anymore (neither inside nor ouside the function where it was free()d)
The purpose of free() is to return the memory allocated with malloc() so the semantics are that, once you have freed a chunk of memory, it is not anymore usable.
In C, all parameters are passed by value, so free() cannot change the value expression you passed to it, and this is the reason the pointer is not changed into an invalid pointer value (like NULL) but you are advised that no more uses of the pointer can be done without incurring in Undefined Behaviour.
There could be a solution in the design of free() and it is to pass the pointer variable that holds the pointer by address, and so free() would be able to turn the pointer into a NULL. But this not only takes more work to do, but free() doesn't know how many copies you have made of the value malloc() gave to you... so it is impossible to know how many references you have over there to be nullified. That approach makes it impossible to give free() the responsibility of nullifying the reference to the returned memory.
So, if you think that free doesn't turn the pointer into NULL and for some strange reason you can still use the memory returned, don't do it anymore, because you'll be making mistakes.
You are adviced! :)

When to use malloc in function?

I am learning the basic of C programming and find it quite strange and difficult. Especially with dynamic memory allocation, pointer and similar things. I came upon this function and don't quite understand what is wrong with it.
char *strdup(const char *p)
{
char *q;
strcpy(q, p);
return q;
}
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
When it is appropriate to use malloc? As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true? Is there any other situation where malloc is needed?
The type of q is a pointer, and pointers hold addresses -- so what you are returning is the address that pointer holds.
Until you give that pointer a valid address, it points off to who-knows-where, memory that you may or may not own and have the right to access. So, the strdup call will copy a string from the address held in p into some location you probably don't own.
If you had done a malloc first, and given q the results of the malloc, then q would hold a valid address, and your strdup would put the copy into memory that you did own (assuming you malloc'd enough space for the string -- a strlen on p would tell you how much you needed).
Then, when you returned q, you would be giving the caller the address as well. Any code with that address can see the string you put there. If some future code were to free that address, then what it holds is up in the air -- it could be anything at all.
So, you don't want to free q before you return the address that it holds -- you need to let the caller free the address it gets from you, when it is ready to do so.
In terms of when you malloc, yes, if you want to return an address that will remain viable after your function completes, you need to malloc it -- giving the caller the address of a local variable, for example, would be bad: the memory is freed when the function returns, you don't own it anymore.
Another typical use of malloc is for building up dynamic data structures like trees and lists -- you can't know how much memory you need up front, so you build the list or tree up as you need to, malloc'ing more memory for each node in the structure.
My personal rules are: use malloc() when an object is too big to put on the stack or/and when it has to live outside the scope of the current block. In your case, I believe, you should do something like the following:
char *strdup(const char *p)
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
malloc(X) creates space (of size X bytes) on the heap for you to play with. The data that you write to these X byes stays put when your function returns, as a result, you can read what your strdup() function wrote to that space on the heap.
free() is used for freeing space on the heap. You can pass a pointer to this function that you obtained only as a result of a malloc() or realloc() call. This function may not clear out the data, it just means that a subsequent call to malloc() may return the address of the same space that you just freed. I say "may" because these are all implementaion defined and should not be relied upon.
In the piece of code you wrote, the strcpy() function copies the bytes one by one from p to q until it finds a \0 at the location pointed to by q (and then it copies the \0 as well). In order to write data somewhere, you need to allocate space first for the data to be written, hence, one option is that you call malloc() to create some space and then write data there.
Well, calling free() is not mandatory as your OS will reclaim the space allocated by malloc() when you program ends, but as long as your program runs, you may be ocupying more space than you need to - and that's bad for your program, for other programs, for the OS and the universe as a whole.
I think I have to malloc and free q. But the function " return q". Doesn't that mean that it will store q value in its own memory. So the data still be saved after the function executed?
No, your data won't be saved. In fact your pointer q is being used without allocating it's size can cause problems. Also, once this function execution complete, variable char* q will be destroyed.
You need to allocate memory to pointer q before copying data as suggested by #Michael's answer.But once you finish using the data return by this function, you will need to manually free() the memory you allocated or else it will cause memory leak (a situation where the memory is allocated but there is no pointer refer to that chunk of memory you allocated and hence will be inaccessible throughout program execution)
char *strdup(const char *p) // From #Michael's answer
{
char *q;
q = malloc(strlen(p) + 1);
if(NULL != q)
strcpy(q, p);
return q;
}
void someFunction()
{
char* aDupString = strdup("Hello World!!!");
/*... somecode use aDupString */
free(aDupString); // If not freed, will cause memory leaks
}
When it is appropriate to use malloc? Is there any other situation where malloc is needed?
It is appropriate to use in following situations:
1> The usage size of array are unknown at compile time.
2> You need size flexibility. For example, your function need to work with small data size and large data size. (like Data structure such as Link list,Stacks, Queues, etc.)
As I understand so far is that I have to malloc a new variable every time I need that variable declared in a function to be used elsewhere. Is that true?
I think this one is partially true. depending on what you are trying to achive, there might be a way to get around using malloc though. For example, your strdup can also be rewrite in following way:
void strdup2(const char *p, char* strOut)
{
// malloc not require
strcpy(strOut, p);
}
void someFunction()
{
char aString[15] = "Hello World!!!";
char aDupStr[sizeof(aString)];
strdup2(aString, aDupStr);
// free() memory not required. but size is not dynamic.
}

Does char* cause memory leak if passed to a function

I have a function like this:
void readString(char* str){
str="asd";
}
Can I know if str will be dealloced? Or must I free it?
Note: I can't use string library as I am programming a microprocessor.
free() must only be called if malloc(), calloc() or realloc() was used to allocate memory. This is not the case in the posted code so calling free() is unrequired.
The "asd" is a string literal and exists for the lifetime of the program (has static storage duration).
Your function does nothing.
It doesn't "read" a string. All it does it assign the address of a string literal (a constant block of memory somewhere that is initialized to the text of the string) to the function's local variable str. The function then exits, causing that local variable to stop existing.
Nothing is returned, and the pointer is not de-referenced (which would in turn be wrong since it's only a char *, not a char * *), so nothing happens outside the function. The caller doesn't "get" any value, and thus has nothing to call free() on, so that problem can never even occur.
String will not be deallocated because it is stored in static memory. You didn't allocate it, you don't free it
No, there is no memory leak. In your case it is statically allocated.
In general you have to make up your own rules about who can or must free memory, and you should document your code so it is clear what the requirements are.
In the example given, readString() only overwrites its own private copy of the pointer, and when it returns the caller will not see that anything has changed. Consequently the caller will have the same duty to free() its pointer as it had before it called readString(), and there will be no leak.
However, if readString() instead accepted a char **, so that it could modify the caller's copy of the pointer, then the outcome would be that it would not be legal to call free() after calling readString(), as the pointer's new value is not part of the malloc heap.
If the previous value of that pointer variable had been a malloc()ed object, then the caller should have freed it before allowing the pointer to be overwritten. It would be truly horrible to have readString() call free() in that case, because it would turn a variable which must eventually be freed into one which must never be freed, and the program flow would be very hard to follow.
This code is useless and meaningless as for as I am concerned. Here are different ways of calling your function definition and why I say this!
int main (int argc, char *argv[], char *envp[])
{
char a, *b, *c;
b = malloc (10);
readString(&a); // Case-1, Valid calling.
readString(b); // Case-2, Valid calling.
readString(c); // Case-3, Invalid calling. Unallocated location.
}
Case-1: This is the only case, where it matters to the caller about what you do in your function. You may use the passed character as you wish. The only meaningful assignment would be something like this. Doing 'str = "asd";' would probably dump the core or mess with the caller's stack or data segment memory(if address of a global variable was passed) and create a complicated debugging nightmare!
void readString(char* str){
*str='a';
}
Case-2: There is nothing Fatal or Syntax error in the code, but it is meaningless to do this. The only meaningful thing would be, just using what ever passed to your function from the caller. What is the reason for assigning like this on the passed parameter? Your definition can just have a local variable and avoid parameter passing completely. That function can be called as "readString();"...
void readString(void){
char *str='asd';
}

Why can a function return an array setup by malloc but not one setup by "int cat[3] = {0,0,0};"

Why can I return from a function an array setup by malloc:
int *dog = (int*)malloc(n * sizeof(int));
but not an array setup by
int cat[3] = {0,0,0};
The "cat[ ]" array is returned with a Warning.
Thanks all for your help
This is a question of scope.
int cat[3]; // declares a local variable cat
Local variables versus malloc'd memory
Local variables exist on the stack. When this function returns, these local variables will be destroyed. At that point, the addresses used to store your array are recycled, so you cannot guarantee anything about their contents.
If you call malloc, you will be allocating from the heap, so the memory will persist beyond the life of your function.
If the function is supposed to return a pointer (in this case, a pointer-to-int which is the first address of the integer array), that pointer should point to good memory. Malloc is the way to ensure this.
Avoiding Malloc
You do not have to call malloc inside of your function (although it would be normal and appropriate to do so).
Alternatively, you could pass an address into your function which is supposed to hold these values. Your function would do the work of calculating the values and would fill the memory at the given address, and then it would return.
In fact, this is a common pattern. If you do this, however, you will find that you do not need to return the address, since you already know the address outside of the function you are calling. Because of this, it's more common to return a value which indicates the success or failure of the routine, like an int, than it is to return the address of the relevant data.
This way, the caller of the function can know whether or not the data was successfully populated or if an error occurred.
#include <stdio.h> // include stdio for the printf function
int rainCats (int *cats); // pass a pointer-to-int to function rainCats
int main (int argc, char *argv[]) {
int cats[3]; // cats is the address to the first element
int success; // declare an int to store the success value
success = rainCats(cats); // pass the address to the function
if (success == 0) {
int i;
for (i=0; i<3; i++) {
printf("cat[%d] is %d \r", i, cats[i]);
getchar();
}
}
return 0;
}
int rainCats (int *cats) {
int i;
for (i=0; i<3; i++) { // put a number in each element of the cats array
cats[i] = i;
}
return 0; // return a zero to signify success
}
Why this works
Note that you never did have to call malloc here because cats[3] was declared inside of the main function. The local variables in main will only be destroyed when the program exits. Unless the program is very simple, malloc will be used to create and control the lifespan of a data structure.
Also notice that rainCats is hard-coded to return 0. Nothing happens inside of rainCats which would make it fail, such as attempting to access a file, a network request, or other memory allocations. More complex programs have many reasons for failing, so there is often a good reason for returning a success code.
There are two key parts of memory in a running program: the stack, and the heap. The stack is also referred to as the call stack.
When you make a function call, information about the parameters, where to return, and all the variables defined in the scope of the function are pushed onto the stack. (It used to be the case that C variables could only be defined at the beginning of the function. Mostly because it made life easier for the compiler writers.)
When you return from a function, everything on the stack is popped off and is gone (and soon when you make some more function calls you'll overwrite that memory, so you don't want to be pointing at it!)
Anytime you allocate memory you are allocating if from the heap. That's some other part of memory, maintained by the allocation manager. Once you "reserve" part of it, you are responsible for it, and if you want to stop pointing at it, you're supposed to let the manager know. If you drop the pointer and can't ask to have it released any more, that's a leak.
You're also supposed to only look at the part of memory you said you wanted. Overwriting not just the part you said you wanted, but past (or before) that part of memory is a classic technique for exploits: writing information into part of memory that is holding computer instructions instead of data. Knowledge of how the compiler and the runtime manage things helps experts figure out how to do this. Well designed operating systems prevent them from doing that.
heap:
int *dog = (int*)malloc(n*sizeof(int*));
stack:
int cat[3] = {0,0,0};
Because int cat[3] = {0,0,0}; is declaring an automatic variable that only exists while the function is being called.
There is a special "dispensation" in C for inited automatic arrays of char, so that quoted strings can be returned, but it doesn't generalize to other array types.
cat[] is allocated on the stack of the function you are calling, when that stack is freed that memory is freed (when the function returns the stack should be considered freed).
If what you want to do is populate an array of int's in the calling frame pass in a pointer to an that you control from the calling frame;
void somefunction() {
int cats[3];
findMyCats(cats);
}
void findMyCats(int *cats) {
cats[0] = 0;
cats[1] = 0;
cats[2] = 0;
}
of course this is contrived and I've hardcoded that the array length is 3 but this is what you have to do to get data from an invoked function.
A single value works because it's copied back to the calling frame;
int findACat() {
int cat = 3;
return cat;
}
in findACat 3 is copied from findAtCat to the calling frame since its a known quantity the compiler can do that for you. The data a pointer points to can't be copied because the compiler does not know how much to copy.
When you define a variable like 'cat' the compiler assigns it an address. The association between the name and the address is only valid within the scope of the definition. In the case of auto variables that scope is the function body from the point of definition onwards.
Auto variables are allocated on the stack. The same address on the stack is associated with different variables at different times. When you return an array, what is actually returned is the address of the first element of the array. Unfortunately, after the return, the compiler can and will reuse that storage for completely unrelated purposes. What you'd see at a source code level would be your returned variable mysteriously changing for no apparent reason.
Now, if you really must return an initialized array, you can declare that array as static. A static variable has a permanent rather than a temporary storage allocation. You'll need to keep in mind that the same memory will be used by successive calls to the function, so the results from the previous call may need to be copied somewhere else before making the next call.
Another approach is to pass the array in as an argument and write into it in your function. The calling function then owns the variable, and the issues with stack variables don't arise.
None of this will make much sense unless you carefully study how the stack works. Good luck.
You cannot return an array. You are returning a pointer. This is not the same thing.
You can return a pointer to the memory allocated by malloc() because malloc() has allocated the memory and reserved it for use by your program until you explicitly use free() to deallocate it.
You may not return a pointer to the memory allocated by a local array because as soon as the function ends, the local array no longer exists.
This is a question of object lifetime - not scope or stack or heap. While those terms are related to the lifetime of an object, they aren't equivalent to lifetime, and it's the lifetime of the object that you're returning that's important. For example, a dynamically alloced object has a lifetime that extends from allocation to deallocataion. A local variable's lifetime might end when the scope of the variable ends, but if it's static its lifetime won't end there.
The lifetime of an object that has been allocated with malloc() is until that object has been freed using the free() function. Therefore when you create an object using malloc(), you can legitimately return the pointer to that object as long as you haven't freed it - it will still be alive when the function ends. In fact you should take care to do something with the pointer so it gets remembered somewhere or it will result in a leak.
The lifetime of an automatic variable ends when the scope of the variable ends (so scope is related to lifetime). Therefore, it doesn't make sense to return a pointer to such an object from a function - the pointer will be invalid as soon as the function returns.
Now, if your local variable is static instead of automatic, then its lifetime extends beyond the scope that it's in (therefore scope is not equivalent to lifetime). So if a function has a local static variable, the object will still be alive even when the function has returned, and it would be legitimate to return a pointer to a static array from your function. Though that brings in a whole new set of problems because there's only one instance of that object, so returning it multiple times from the function can cause problems with sharing the data (it basically only works if the data doesn't change after initialization or there are clear rules for when it can and cannot change).
Another example taken from another answer here is regarding string literals - pointers to them can be returned from a function not because of a scoping rule, but because of a rule that says that string literals have a lifetime that extends until the program ends.

Is this use of C pointers safe from leaking memory?

char *a() {
char *t = malloc(8);
t[0] = 'a';
t[1] = 'b';
//...
t[7] = 'h';
return t;
}
int main(void) {
char *x = a();
//do something with x
//...
free(x);
return 0;
}
Does this code have any potential problems since memory is allocated in a() and used memory in main()?
First, a() is declared as returning void, but you attempt to return a char*. Change the signature to return a char*.
Second, your function is fine, but your example code has a memory leak because you never free the memory that the returned pointer points to.
Third, as gbrandt pointed out, you are not checking for success after the call to malloc. malloc can fail, and checking to see if it did is a good habit to get into.
Another way to accomplish this would be to pass a pointer to a pointer into a() instead, and then the caller has to create the pointer themselves before passing it to a(), but either way you still need to free the memory. To be honest, I would go with your approach over this in this case. There is no compelling reason to do it this way, just thought I would mention it.
void a(char **p)
{
*p = malloc(8);
if (*p)
{
**p[0] = 'a';
**p[1] = 'b';
...
**p[7] = 'h';
}
}
int main(void)
{
char *x;
a(&x);
//do something with x
.....
free(x);
}
If this alternate approach confuses you, please let me know as I would be happy to provide an explanation (though, at the moment, I need to get back to work!)
All good advice above. Small problems with the code, the big one is...
You are not checking for success in the malloc or after the function call. Never forget error handling.
No problems—in fact, this is the correct way to allocate memory that will still be used after a function returns—but you may want to free() the memory when you're done using it. Eight bytes wouldn't be a problem, but coding against memory leaks is a good habit to get into.
Apart from having the wrong return type on a() (should be char * instead of void), the code doesn't have any problems.
Just be sure to free() the memory you allocated when you're done with it.
Most of the answers here indicate that this isn't a problem, just remember to free() it later. This is technically true, but is really a problem. Any time you allocate memory in one scope and expect it to be freed in another, you are asking for a memory leak. People don't remember to free the memory. There is also the problem that the caller will have to know that you used malloc and not alloca() or new() or something else. If they call anything but the matched deallocation routine, the results are undefined.
In short, it is usually a mistake to allocate memory and pass it back to the caller.
It is better to expect the memory to be allocated by the caller because if they allocate it, they will remember to free it and to do so correctly.
No you won't have any problems explicitly because you allocated the function in a and use it in main.
As long as you remember to free the memory that x points to then you won't have any problems. By using malloc() you've allocated memory on the heap which is left alone when returning to the calling function.
Thanks. I just edited the code. I guess this time it should have no problem. Btw, in C, when a function finishes, does the compiler clear all variables (including array & pointer) in the stack?

Resources