Do we get a dangling pointer by copying a dangling pointer? - c

On one hand, if I write the following code :
int* fortytwo_pointer () {
int v = 42;
return &v;
}
I get a dangling pointer and I can check this by calling
printf ("forty two : %d\n", *(fortytwo_pointer()));
On the other hand
int* copy (int *pointer) {
return pointer;
}
int* fortytwo_pointer () {
int v = 42;
return copy (&v);
}
does not produce the same 'Segmentation fault' error.
I would expect it to also be an instance of a dangling pointer because the value v goes out of scope just the same. Is it true ? If so, how to check that the pointer is indeed dangling ?
Edit :
This question is more focused on dynamically checking that a pointer is valid, for example if we get a pointer from a call of a library function. I'm more interested in checking if the code could produce invalid pointers.

Yes, you do. As pointed out in the comments on your question, you cannot "check it" by using printf. From the moment that the function stack is popped, you're dealing with undefined behavior.
Use valgrind to run your program, and it will point out any read/write errors. Those may not cause your program to fail every time, but should be taken care of nonetheless. I'm sure your dangling pointer will show up in there, and valgrind will be kind enough to show you the full trace of where each error occurs.

I would expect it to also be an instance of a dangling pointer because the value v goes out of scope just the same. Is it true ? If so, how to check that the pointer is indeed dangling ?
Yes, the variable you have declared in int* fortytwo_pointer() is a local variable. This means the variable you have declared stores in a stack frame and at the end of this function, the variable is going to become freed and there's no a good address to dereference and that's why v is a dangling pointer.
My suggestion to resolve this problem is allocating memory from heap by using some functions such as malloc() or calloc(). Functions delete all variables in stack, not heap.
So, you can write your code below :
int* fortytwo_pointer () {
int *v = (int*) malloc (sizeof(int));
if(v == NULL)
{
printf("An error occurred.\n");
getch();
}
*v = 42;
return v;
}
and finally, don't forget to free *v pointer. You can free that by using free().

Related

Freeing a pointer inside a function, and using it in main

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
free(t);
return t;
};
int main(void) {
printf("%s\n", test());
return 0;
};
I would like to allocate and de-allocate memory inside the function. I tested this code and works, but I am wondering:
Why does this work?
Is it good practice to use the value of a freed pointer in main ?
Once you call free on a pointer, the memory it pointed to is no longer valid. Attempting to use a pointer to freed memory triggers undefined behavior. In this particular case it happened to work, but there's no guarantee of that.
If the function returns allocated memory, it is the responsibility of the caller to free it:
char* test() {
char* s = "Hello World";
size_t len = strlen(s);
char* t = malloc(sizeof(char)*(len+1));
strcpy(t, s);
return t;
};
int main(void) {
char *t = test();
printf("%s\n", t);
free(t);
return 0;
};
malloc reserves memory for use.
free releases that reservation. In general, it does not make the memory go away, it does not change the contents of that memory, and it does not alter the value of the pointer that held the address.
After free(t), the bytes of t still contain the same bit settings they did before the free. Then return t; returns those bits to the caller.
When main passes those bits to printf, printf uses them as the address to get the characters for %s. Since nothing has changed them, they are printed.
That is why you got the behavior you did with this program. However, none of it is guaranteed. Once free was called with t, the memory reservation was gone. Something else in your program could have used that memory. For example, printf might have allocated a buffer for its own internal use, and that could have used the same memory.
For the most part, malloc and free are just methods of coordinating use of memory, so that different parts of your program do not try to use the same memory at the same time for different purposes. When you only have one part of your program using allocated memory, there are no other parts of your program to interfere with that. So the lack of coordination did not cause your program to fail. If you had multiple routines in your program using allocated memory, then attempting to use memory after it has been released is more likely to encounter problems.
Additionally, once the memory has been freed, the compiler may treat a pointer to it as if it has no fixed value. The return t; statement is not required to return any particular value.
It doesn't matter where do you free() a pointer. Once it is free()d, the pointer is not deferrenciable anymore (neither inside nor ouside the function where it was free()d)
The purpose of free() is to return the memory allocated with malloc() so the semantics are that, once you have freed a chunk of memory, it is not anymore usable.
In C, all parameters are passed by value, so free() cannot change the value expression you passed to it, and this is the reason the pointer is not changed into an invalid pointer value (like NULL) but you are advised that no more uses of the pointer can be done without incurring in Undefined Behaviour.
There could be a solution in the design of free() and it is to pass the pointer variable that holds the pointer by address, and so free() would be able to turn the pointer into a NULL. But this not only takes more work to do, but free() doesn't know how many copies you have made of the value malloc() gave to you... so it is impossible to know how many references you have over there to be nullified. That approach makes it impossible to give free() the responsibility of nullifying the reference to the returned memory.
So, if you think that free doesn't turn the pointer into NULL and for some strange reason you can still use the memory returned, don't do it anymore, because you'll be making mistakes.
You are adviced! :)

When writing a function that returns pointer, why should I allocate memory to the pointer I'm going to return?

I'm a bit weak when it comes to memory allocation and pointers.
So, I want to understand why do I have to allocate memory to pointers in functions as follow:
char *cstring(char c, int n)
{
int i = 0;
char * res;
res = malloc ((n+1)*sizeof(char));
while (i<n)
{
res[i]=c;
i++;
}
res[i] ='\0';
return res;
}
and why is the following not valid?
char *cstring(char c, int n)
{
int i = 0;
char * res;
while (i<n)
{
res[i]=c;
i++;
}
res[i] ='\0';
return res;
}
I understand that I should allocate memory (generally) to pointers so that they have defined memory.
However, I want to mainly understand how is it related to the concept of stack and heap memories!
Thanks in advance!
Pointers need to point to a valid memory location before they can be dereferenced.
In your first example, res is made to point at a block of allocated memory which can subsequently be written to and read from.
In your second example, res remains uninitialized when you attempt to dereference it. This causes undefined behavior. The most likely outcome in this case is that whatever garbage value it happens to contain will not be a valid memory address, so when you attempt to dereference that invalid address your program will crash.
If you declare a variable like that,
int A = 5;
then that means the variable will be on the stack. When functions are called, their local variables are pushed to the stack. The main function is also an example of that. So you don't have to allocate memory manually, your compiler will do this for you in the background before it calls your main function. And that also means if you examine the stack during the execution of the function you can see the value 5.
With this,
int A = 5;
int *PtrToA = &A;
The pointer will be on the stack again. But this time, the value on the stack just shows the memory address of the actual integer value we want. It points to the address of the memory block that holds the value 5. Since A is held in the stack here, pointer will show a memory address on the stack.
Like the case in your question you can allocate memory dynamically. But you have to initialize it before you read it. Because when you request to allocate the memory, your operating system searches for a valid memory field in your programs heap and reserves that for you. Than it gives you back its adddress and gives you the read write permissions so you can use it. But the values in it won't contain what you want. When compiler allocates on stack, the initial values will be unset again. If you do this,
char *res;
res[1] = 3;
variable res will be on the stack and it will contain some random value. So accessing it is just like that,
(rand())[1] = 3;
You can get an access violation error because you may not have permission to write to that memory location.
An important note; after your function call returns, values of local variables on the stack are no more valid. So be careful with that. Do not dereference them after the function call ends.
In conclusion; if you want to use a pointer, be sure it points to a valid memory location. You can allocate it yourself or make it point another memory address.
The second version of your code declares a pointer, but does not initialize it to point to a valid memory address.
Then, the code dereferences that pointer -- during the loop. So, your code would access uninitialized memory in this case. Remember, array indexing is just syntactic sugar for dereferencing -- so your code accesses memory its not supposed to.
The first version of your code initializes the pointer to actually point to something, and hence when you dereference it during the loop, it works.
Of course, in either case, you return the pointer from the function -- its just that in the first version it points to something valid, whereas in the second version it points anywhere.
The moral here is to always initialize your variables. Not doing so could result in undefined behavior in your code (even if it appears to work sometimes). The general advice here is to always compile your code using at least some compilation flags. For example in gcc/clang, consider -Wall -Werror -Wextra. Such options often pick up on simple cases of not initializing variables.
Also, valgrind is a brilliant tool for memory profiling. It can easily detect uses of uninitialized memory at runtime, and also memory leaks.
Simple: because you do not have any allocated memory for the data you wite. In your example you define pointer, you do not initialize it so it will reference random (or rather not possible to predict) place in the memory, then you try to write to this random memory location.
You have 2 Undefined Behaviours here in 5 lines example. Your pointer is not initialized, and you did not allocate any valid memory this pointer to reference.
EDIT:
VALID
char *cstring(char c, int n)
{
char * res;
res = malloc ((n+1)*sizeof(char));
char *cstring(char c, int n)
{
char * res;
static char buff[somesize];
res = buff;
char buff[somesize];
char *cstring(char c, int n)
{
char * res;
res = buff;
INVALID
char *cstring(char c, int n)
{
char * res;
char buff[somesize];
res = buff;

Is malloc needed for this int pointer example?

The following application works with both the commented out malloced int and when just using an int pointer to point to the local int 'a.' My question is if this is safe to do without malloc because I would think that int 'a' goes out of scope when function 'doit' returns, leaving int *p pointing at nothing. Is the program not seg faulting due to its simplicity or is this perfectly ok?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct ht {
void *data;
} ht_t;
ht_t * the_t;
void doit(int v)
{
int a = v;
//int *p = (int *) malloc (sizeof(int));
//*p = a;
int *p = &a;
the_t->data = (void *)p;
}
int main (int argc, char *argv[])
{
the_t = (ht_t *) malloc (sizeof(ht_t));
doit(8);
printf("%d\n", *(int*)the_t->data);
doit(4);
printf("%d\n", *(int*)the_t->data);
}
Yes, dereferencing a pointer to a local stack variable after the function is no longer in scope is undefined behavior. You just happen to be unlucky enough that the memory hasn't been overwritten, released back to the OS or turned into a function pointer to a demons-in-nose factory before you try to access it again.
Not every UB (undefined behaviour) results in a segfault.
Normally, the stack's memory won't be released back to the OS (but it might!) so that accessing that memory works. But at the next (bigger) function call, the memory where the pointer points to might be overwritten, so your data you felt like bing safe there is lost.
malloc() inside a function call will work because it stores on heap.
The pointer remains until you free the memory.
And yes, not every undefined behaviour will result in segmentation fault.
It does not matter that p does not point at anything on return from doit.
Because, you know, p isn't there any more either.
It does matter that you are reading the pointer to a no-longer-existing object in main though. That's UB, but harmless on most modern platforms.
Even worse though, you are reading the non-existent object it once pointed to, which is straight Undefined Behavior.
Still, nobody is obliged to catch you:
Can a local variable's memory be accessed outside its scope?
As an aside, Don't cast the result of malloc (and friends).
Also, be aware that not free-ing memory is not harmless on all platforms: Can I avoid releasing allocated memory in C with modern OSes?
Yes, the malloc is needed, otherwise the pointer would point on the 4-byte space of the stack, which would be used by other data, if you called other functions or created now local variables.
You can see that if you call that function afterwards with a value of 10:
void use_memory(int i)
{
int f[128]={};
if(i>0)
{
use_memory(i-1);
}
}

What could be the possible reason behind the warning which comes up when the following piece of code is compiled

This is a simple piece of code which i wrote to check whether it is legitimate to return the address of a local variable and my assumptions were proved correct by the compiler which gives a warning saying the same:
warning: function returns address of local variable
But the correct address is printed when executed... Seems strange!
#include<stdio.h>
char * returnAddress();
main()
{
char *ptr;
ptr = returnAddress();
printf("%p\n",ptr);
}
char * returnAddress()
{
int x;
printf("%p\n",&x);
return &x;
}
The behaviour is undefined.
Anything is allowed to happen when you invoke undefined behaviour - including behaving semi-sanely.
The address of a local variable is returned. It remains an address; it might even be a valid address if you're lucky. What you get if you access the data that it points to is anyone's guess - though you're best off not knowing. If you call another function, the space pointed at could be overwritten by new data.
You should be getting warnings about the conversion between int pointer and char pointer - as well as warnings about returning the address of a local variable.
What you are trying to do is usually dangerous:
In returnAddress() you declare a local, non-static variable i on the stack. Then you return its address which will be invalid once the function returned.
Additionally you try to return a char * while you actually have an int *.
To get rid of the warning caused by returning a pointer to a local var, you could use this code:
void *p = &x;
return p;
Of course printing it is completely harmless but dereferencing (e.g. int x = *ptr;) it would likely crash your program.
However, what you are doing is a great way to break things - other people might not know that you return an invalid pointer that must never be dereferenced.
Yes, the same address is printed both times. Except that, when the address is printed in main(), it no longer points to any valid memory address. (The variable x was created in the stack frame of returnAddress(), which was scrapped when the function returned.)
That's why the warning is generated: Because you now have an address that you must not use.
Because you can access the memory of the local variable, doesn't mean it is a correct thing to do. After the end of a function call, the stack pointer backtracks to its previous position in memory, so you could access the local variables of the function, as they are not erased. But there is no guaranty that such a thing won't fail (like a segmentation fault), or that you won't read garbages.
Which warning? I get a type error (you're returning an int* but the type says char*) and a warning about returning the address of a local variable.
The type error is because the type you've declared for the function is lies (or statistics?).
The second is because that is a crazy thing to do. That address is going to be smack in the middle (or rather, near the top) of the stack. If you use it you'll be stomping on data (or have your data stomped on by subsequent function calls).
Its not strange. The local variables of a function is allocated in the stack of that function. Once the control goes out of the function, the local variables are invalid. You may have the reference to the address but the same space of memory can be replaced by some other values. This is why the behavior is undefined. If you want reference a memory throughout your program, allocate using malloc. This will allocate the memory in heap instead of stack. You can safely reference it until you free the memory explicitly.
#include<stdio.h>
#include<stdlib.h>
char * returnAddress();
main()
{
char *ptr;
ptr = returnAddress();
printf("%p\n",ptr);
}
char * returnAddress()
{
char *x = malloc(sizeof(char));
printf("%p\n",x);
return x;
}

Returning a pointer to an automatic variable

Say you have the following function:
char *getp()
{
char s[] = "hello";
return s;
}
Since the function is returning a pointer to a local variable in the function to be used outside, will it cause a memory leak?
P.S. I am still learning C so my question may be a bit naive...
[Update]
So, if say you want to return a new char[] array (ie maybe for a substring function), what do you return exactly? Should it be pointer to an external variable ? ie a char[] that is not local to the function?
It won't cause a memory leak. It'll cause a dangling reference. The local variable is allocated on the stack and will be freed as soon as it goes out of scope. As a result, when the function ends, the pointer you are returning no longer points to a memory you own. This is not a memory leak (memory leak is when you allocate some memory and don't free it).
[Update]:
To be able to return an array allocated in a function, you should allocate it outside stack (e.g. in the heap) like:
char *test() {
char* arr = malloc(100);
arr[0] = 'M';
return arr;
}
Now, if you don't free the memory in the calling function after you finished using it, you'll have a memory leak.
No, it wont leak, since its destroyed after getp() ends;
It will result in undefined behaviour, because now you have a pointer to a memory area that no longer holds what you think it does, and that can be reused by anyone.
A memory leak would happen if you stored that array on the heap, without executing a call to free().
char* getp(){
char* p = malloc(N);
//do stuff to p
return p;
}
int main(){
char* p = getp();
//free(p) No leak if this line is uncommented
return 0;
}
Here, p is not destroyed because its not in the stack, but in the heap. However, once the program ends, allocated memory has not been released, causing a memory leak ( even though its done once the process dies).
[UPDATE]
If you want to return a new c-string from a function, you have two options.
Store it in the heap (as the example
above or like this real example that returns a duplicated string);
Pass a buffer parameter
for example:
//doesnt exactly answer your update question, but probably a better idea.
size_t foo (const char* str, size_t strleng, char* newstr);
Here, you'd have to allocate memory somewhere for newstr (could be stack OR heap) before calling foo function. In this particular case, it would return the amount of characters in newstr.
It's not a memory leak because the memory is being release properly.
But it is a bug. You have a pointer to unallocated memory. It is called a dangling reference and is a common source of errors in C. The results are undefined. You wont see any problems until run-time when you try to use that pointer.
Auto variables are destroyed at the end of the function call; you can't return a pointer to them. What you're doing could be described as "returning a pointer to the block of memory that used to hold s, but now is unused (but might still have something in it, at least for now) and that will rapidly be filled with something else entirely."
It will not cause memory leak, but it will cause undefined behavior. This case is particularly dangerous because the pointer will point somewhere in the program's stack, and if you use it, you will be accessing random data. Such pointer, when written through, can also be used to compromise program security and make it execute arbitrary code.
No-one else has yet mentioned another way that you can make this construct valid: tell the compiler that you want the array "s" to have "static storage duration" (this means it lives for the life of the program, like a global variable). You do this with the keyword "static":
char *getp()
{
static char s[] = "hello";
return s;
}
Now, the downside of this is that there is now only one instance of s, shared between every invocation of the getp() function. With the function as you've written it, that won't matter. In more complicated cases, it might not do what you want.
PS: The usual kind of local variables have what's called "automatic storage duration", which means that a new instance of the variable is brought into existence when the function is called, and disappears when the function returns. There's a corresponding keyword "auto", but it's implied anyway if you don't use "static", so you almost never see it in real world code.
I've deleted my earlier answer after putting the code in a debugger and watching the disassembly and the memory window.
The code in the question is invalid and returns a reference to stack memory, which will be overwritten.
This slightly different version, however, returns a reference to fixed memory, and works fine:
char *getp()
{
char* s = "hello";
return s;
}
s is a stack variable - it's automatically de-referenced at the end of the function. However, your pointer won't be valid and will refer to an area of memory that could be overwritten at any point.

Resources