I'm currently learning C and I'm struggling to wrap my head around pointers and the malloc() function.
My book's example defines the following function:
island* create(char *name) {
island *i = malloc(sizeof(island));
i->name = strdup(name);
i->opens = "09:00";
i->closes = "17:00";
i->next = NULL;
return i;
}
Then it's called like this:
char name[80];
fgets(name, 80, stdin);
island *p_island0 = create(name);
There are several things I struggle to understand in this code example:
What happens to the i variable when it is assigned malloc(sizeof(island));? Does it just temporarily store a reference to the new memory allocated on the HEAP?
After island *p_island0 = create(name);, what is eventually stored in p_island0? The address returned by malloc(), or was another pointer created and the value of the previous i variable copied into p_island0 on the ... STACK?
When you do return i; the pointer value stored in i is copied to the variable p_island0 in the calling function, and then i goes out of scope. The allocated memory never goes out of scope, it has a life-time of the full program or until you call free with the pointer value. Which variable is storing the pointer value doesn't matter, as long as it is the original pointer value returned by the malloc call.
How the value is returned by the function is not specified by the C standard; it depends on the compiler, operating system and underlying hardware. Most likely the stack is not involved and the returned value is passed back in a CPU register.
1. What happens to the i variable when it is assigned malloc(sizeof(island));? Does it just temporarily store a reference to the new memory allocated on the HEAP?
i stores the pointer returned by malloc(). Later, that is returned as the return value of the function. Dynamic memory has a lifetime equal to the program runtime (unless manually deallocated by free()), so the values stored into the memory area pointed by the pointer are valid and accessible after the function returns.
FWIW, one point to note: before using the return value of malloc(), always check it against NULL to avoid UB in case malloc() fails.
2. After island *p_island0 = create(name);, what is eventually stored in p_island0? The address returned by malloc(), or was another pointer created and the value of the previous i variable copied into p_island0?
The same pointer returned by malloc() is returned.
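To make the NULL check concrete, here is a minimal sketch of create() with both allocations checked. The struct layout is an assumption (the book's definition is not shown in the question), destroy() is a hypothetical helper added for the example, and the name is copied by hand because strdup() is POSIX rather than ISO C before C23.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: the real struct from the book is not shown. */
typedef struct island {
    char *name;
    const char *opens;
    const char *closes;
    struct island *next;
} island;

/* Returns a heap-allocated island, or NULL if any allocation fails. */
island *create(const char *name)
{
    assert(name != NULL);

    island *i = malloc(sizeof *i);
    if (i == NULL)
        return NULL;

    /* Copy the name into its own heap block. */
    size_t len = strlen(name) + 1;
    i->name = malloc(len);
    if (i->name == NULL) {
        free(i);            /* do not leak the struct on failure */
        return NULL;
    }
    memcpy(i->name, name, len);

    i->opens = "09:00";
    i->closes = "17:00";
    i->next = NULL;
    return i;               /* the address is copied to the caller */
}

/* Hypothetical helper: releases everything create() allocated. */
void destroy(island *i)
{
    if (i != NULL) {
        free(i->name);
        free(i);
    }
}
```

The caller would then write island *p_island0 = create(name); and test p_island0 against NULL before using it, and call destroy(p_island0) when done.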
malloc returns the address of the allocated memory block as a void*, which is a generic pointer type; the address (like any other value) is copied to i.
The address returned from malloc is then stored in p_island0.
*The address returned from malloc refers to heap memory; the allocated memory lives until free is called or until the program ends.
island is probably a struct.
The function create(char *name) has its own scope
island *i = malloc(sizeof(island));
This statement allocates memory and stores its address in i.
The variable i itself is limited to this function's scope and is not accessible outside the function.
However, the function returns the value held in i (the memory location), which is ultimately stored in p_island0.
While I was doing my assignment I came across an issue. I know that double freeing causes undefined behavior.
When a program calls free() twice with the same argument, the program's memory management data structures become corrupted. This corruption can cause the program to crash or, in some circumstances, cause two later calls to malloc() to return the same pointer. If malloc() returns the same value twice and the program later gives the attacker control over the data that is written into this doubly-allocated memory, the program becomes vulnerable to a buffer overflow attack.
CWE-415: Double Free
However, while fuzzing I encountered a "double NULL", where one pointer variable was set to NULL twice in a row.
Is that the same as double freeing, and does it cause undefined behaviour?
e.g.
int *p;
p = (int*)malloc(10*sizeof(int));
p = NULL;
p = NULL; /* second time */
I'm not quite sure what you're asking about, but:
if you try to free() the same memory block twice in a row:
void* ptr = malloc(120);
free(ptr);
free(ptr);
then the second call to free() will probably corrupt the heap;
however, if you set your pointer variable to NULL after freeing the pointed block:
void* ptr = malloc(120);
free(ptr);
ptr = NULL;
// ....do some other stuff
// (but no assignment to ptr!)
free(ptr);
ptr = NULL;
then the second call to free() passes a NULL pointer to it, and free() returns with no harm.
Setting ptr to NULL does not free the allocated memory. This would work e.g. in Java, where the virtual machine tracks all references to objects and deletes those no longer used. In that context the first ptr = null; would make a pointed object eligible for deleting (provided no other references to it exist), and another ptr = null; would not change anything.
In C you must 'manually' release memory you allocated and that is quite separate from nullifying any pointer variables.
If you do
int *p;
p = (int*)malloc(10*sizeof(int));
p = NULL;
p = NULL;
then you: declare a p variable; allocate a block of memory and store its address in the p variable; overwrite the stored address with a NULL pointer (thus effectively you lose access to the allocated memory block); and overwrite it once again with the same NULL value.
That does not affect the allocated block. It remains allocated till the end of your program execution, but without the pointer value returned from malloc() you can't use it or even free() it. That's what we call a memory leak.
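A small sketch of the point above, assuming a hypothetical function name: the block stays allocated and intact after p is set to NULL twice, which we can only observe (and clean up) because a second copy of the address was kept.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the value read back through a second copy of the address,
 * or -1 if malloc() fails. */
int value_after_nulling(void)
{
    int *p = malloc(10 * sizeof *p);
    if (p == NULL)
        return -1;

    int *keep = p;          /* second copy of the same address */
    p[0] = 42;

    p = NULL;               /* only the variable changes ... */
    p = NULL;               /* ... and assigning NULL again is harmless */
    assert(p == NULL);

    int value = keep[0];    /* the block is still allocated and intact */
    free(keep);             /* without keep, the block would be leaked */
    return value;
}
```

Without the keep variable there would be no way left to call free() on the block, which is exactly the leak described above.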
A pointer can be "nulled" twice, because it's just assigning a value to it. The memory, however, is still allocated. Setting the pointer to null won't change that. This is why you can do this as often as you like.
When you free() the pointer, however, you're deallocating the memory. This means it could be reused for other allocations.
Once the memory has been deallocated, you can't deallocate it a second time; this is why a double free causes undefined behaviour.
Have a look at this question for reference.
NULL is a macro that expands to a null pointer constant, typically 0 or ((void *)0). So all you do is assign the same null value to a variable twice, which is perfectly well-defined behaviour.
My question is about returning a value from a function call in c. I've read many questions and answers on this topic such as:
Returning a local variable confusion in C
I also understand that returning a pointer to a local variable is a problem, since the local variable no longer exists after the return. In my case below I am returning a pointer value, not a pointer to a local variable. This should be ok, yes? (i.e. returning &x would be bad)
int *foo(int a) {
int *x; //local scope local lifetime
...
x = &something_non-local;
...
return x; //return value of pointer
}
int main(void) {
int *bar;
bar = foo(10);
}
You can look at it this way. A variable represents a place in memory. Lifetime of a variable is defined by validity of this place in memory. The place in memory keeps a value. Any copy operation just copies values from one place in memory to another.
Every such memory place has an address. A pointer to a variable is just another variable whose value is the address of another variable. A copy of a pointer is just a copy of an address from one place to another.
If a variable is declared in the global scope, then its memory will be valid until the program exits. If the variable is declared non-statically in a procedure, then its memory is valid until the end of this procedure. Technically it is probably allocated on the stack, which is unwound when the procedure returns, making this memory invalid.
In your case, if pointer x points to a variable in global scope, x itself is valid only until the exit from the procedure. However, the return statement copies the value of x into a different location just before x becomes invalid. As a result the value of x ends up in bar. This value is the address of a static variable, which is still valid.
It would be a different story if you tried to return the address of x, i.e. &x. That would be the address of memory that existed only inside the procedure. After the return it would point to an invalid memory location.
So, if your something_non-local points to such a thing, then you are in trouble. It should point to something static or something on the heap.
BTW, malloc allocates memory on the heap, which stays valid until you call free.
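A rough sketch of the two safe cases, under the assumption that the non-local target is a static array (the question does not say what it actually is). The names table, TABLE_SIZE and foo_heap are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 4

/* Static storage duration: exists for the whole run of the program. */
static int table[TABLE_SIZE] = { 10, 20, 30, 40 };

/* Fine: x itself is local, but the address it holds refers to
 * static storage. */
int *foo(int a)
{
    assert(a >= 0 && a < TABLE_SIZE);
    int *x = &table[a];     /* local pointer, non-local target */
    return x;               /* the address is copied out before x disappears */
}

/* Also fine: the block stays allocated until the caller calls free(). */
int *foo_heap(int a)
{
    int *x = malloc(sizeof *x);
    if (x != NULL)
        *x = a;
    return x;
}

/* Not fine, so left as a comment: returning the address of a local.
 *   int *bad(void) { int n = 1; return &n; }
 */
```

In main() something like int *bar = foo(2); is then safe to dereference, while a pointer from foo_heap() must be checked against NULL and eventually passed to free().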
I pass a pointer of a given type to a function.
In the function I allocate the memory needed:
Pointer = (mytype * ) malloc (N* sizeof (mytype));
And it all goes well. After the function ends, another function uses the pointer.
But the pointer I filled earlier now has no memory behind it.
Shouldn't the pointer have kept its memory?
Or does the ending of a function deallocate the memory?
Sorry, but I am unable to paste my code because I work on an offline PC.
No. Memory allocated by malloc is not deallocated at the end of a function.
Otherwise, that would be a disaster, because you would be unable to write a function that creates a data structure by allocating memory for it, filling it with data, and returning it to the caller.
No, but you're not returning the pointer to the caller. The argument inside the function is not the same as the value at the calling site, so changing it by assigning the return value from malloc() doesn't change the caller's value.
This:
Foo *foo;
AllocateAFoo(foo);
has no chance of changing the value of foo after the function returns, even if the argument is assigned to inside the function. This is why malloc() returns the new value.
You need to do the same:
mytype * Allocate(size_t num)
{
return malloc(num * sizeof (mytype));
}
This means that there's no point in sending the uninitialized pointer from the caller to the function, so don't.
Also, you shouldn't cast the return value of malloc() in C.
Also, you need to be aware that malloc() is just a function like any other. How would you write a function that reacts when execution leaves other functions? The answer is of course "you can't", and thus malloc() can't either.
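If you actually want memory that is released automatically when the function returns, you can use alloca(), which allocates on the stack, but it is non-standard and not supported on all platforms.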
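Here is a minimal sketch of the difference, using int instead of mytype and two hypothetical function names. The first function assigns to its own copy of the pointer, so the caller's variable is untouched; the second returns the address.

```c
#include <assert.h>
#include <stdlib.h>

/* Assigns to its own copy of the pointer; the caller never sees it. */
void allocate_into_copy(int *p, size_t num)
{
    p = malloc(num * sizeof *p);
    free(p);    /* released here, because nobody else can reach the block */
}

/* Hands the new address back through the return value. */
int *allocate(size_t num)
{
    assert(num > 0);
    return malloc(num * sizeof(int));
}
```

After int *foo = NULL; allocate_into_copy(foo, 4); the variable foo is still NULL, whereas foo = allocate(4); gives the caller a usable block (once it has been checked against NULL) that it must later free().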
I do not quite understand the difference between passing to a function *mode1 or **mode2.
I wrote some examples. (note: type can be any type)
[Code 1]
#include <stdio.h>
void function (type *vet)
{
/* other */
}
int main ()
{
type *vet;
function (vet);
/* other */
return 0;
}
[Code 2]
#include <stdio.h>
void function (type **vet)
{
/* other */
}
int main ()
{
type *vet;
function (&vet);
/* other */
return 0;
}
I know: in the first case it is a pointer, in the second case it is a pointer to a pointer. But why, for example, in the second case, if I pass &vet, can I allocate memory in function() and free it in main(), and in the first one not?
I am looking for someone who can explain the differences well. What can I do in the two cases? How and where to malloc or realloc? And free? And modify vet in the function?
Original questions
The main (most significant) difference is whether the value in the calling function can be changed.
In the first example, the called function gets a copy of the pointer in the calling code, and cannot modify the pointer in the calling code.
In the second example, the called function gets a pointer to the pointer, and by assigning to *vet in the called function, you can modify the pointer in the calling function.
Why in the second case if I pass &vet can I allocate memory in function() and free it in main() and in the first one not?
In the second case, the code in function() can modify the actual pointer in main(), so the value of vet in main() ends up with the allocated pointer value, which can therefore be freed. In the first case, the value in main() is not modified by the called function, so the data can't be freed by main().
How and where to malloc or realloc? And free?
In the first case, you can use malloc() or realloc() in the function, but you should also free the allocated memory before return unless your code stores the value in a global variable (in which case you can delegate to some other code to handle the free(), but it had better be very clear which code has the responsibility, and in any case using global variables is probably not a good idea). Or unless you change the function signature and return a pointer to the allocated data which should be freed by the calling code.
In the second case, you can allocate or reallocate memory in the called function and leave it allocated to be used by other functions and to be freed by the calling function.
And modify vet in the function?
In both functions, you can modify the local vet variable as you see fit; this is true of any parameter to any function. What you can't necessarily do is modify values in the calling function; you have to have a pointer to the value in the calling function to do that. In the first function, you can't change the value of vet in main(); in the second, you can. In both functions, you can change what vet points at. (One minor problem is the conflation of the name vet in the three contexts: main() and the two different functions. The name vet in the two functions points at different types of things.)
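As an illustration of the second case with realloc(), here is a minimal sketch. The function name resize_vector and the use of int for type are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Resizes *vet to hold count ints. Returns 0 on success, -1 on failure
 * (in which case *vet is left untouched and still valid). */
int resize_vector(int **vet, size_t count)
{
    assert(vet != NULL && count > 0);

    /* Use a temporary so the old block is not lost if realloc() fails. */
    int *tmp = realloc(*vet, count * sizeof **vet);
    if (tmp == NULL)
        return -1;

    *vet = tmp;     /* writes through to the caller's pointer */
    return 0;
}
```

main() would start with int *vet = NULL;, call resize_vector(&vet, n) as often as needed, and finish with free(vet);. With a plain int *vet parameter, the address returned by realloc() would never reach main().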
Extension questions
But is it freed in function() like, for example, this?
#include <stdlib.h> /* malloc(), free() */
#define NUM (10)
struct example
{
/* ... */
};
void dealloc (struct example *pointer)
{
free (pointer);
}
int main()
{
struct example *e;
e = malloc (NUM * sizeof(struct example));
if (e == NULL)
return -1;
struct example *e_copy = e;
dealloc (e_copy);
return 0;
}
This code is legitimate. You pass (a copy of) the value of the pointer to the dealloc() function; it passes that pointer to free() which releases the allocated memory. After dealloc() returns, the pointer values in e and e_copy are the same as before, but they no longer point to allocated memory and any use of the value leads to undefined behaviour. A new value could be assigned to them; the old value cannot be dereferenced reliably.
And what is the difference from this?
#include <stdlib.h> /* malloc(), free() */
#define NUM (10)
struct example
{
/* ... */
};
int main()
{
struct example *e;
e = malloc (NUM * sizeof(struct example));
if (e == NULL)
return -1;
struct example *e_copy = e;
free (e_copy);
return 0;
}
The difference between this example and the last is that you call free() directly in main(), rather than from a function dealloc().
What would make a difference is:
void dealloc(struct example **eptr)
{
free(*eptr);
*eptr = 0;
}
int main()
{
...
dealloc(&e_copy);
return 0;
}
In this case, after dealloc() returns, e_copy is a null pointer. You could pass it to free() again because freeing the null pointer is a no-op. Freeing a non-null pointer twice is undefined behaviour; it generally leads to problems and should be avoided at all costs. Note that even now, e contains the pointer that was originally returned by malloc(), but any use of that pointer value leads to undefined behaviour again (setting e = 0; or e = NULL; or e = e_copy; is fine, and using e = malloc(sizeof(struct example)); or suchlike also works).
As you said, the argument in the first case is a pointer, and a copy of the pointer vet is passed. We can modify the value that vet points to (e.g., *vet = new value). But we cannot modify the caller's pointer vet, because the function only has a copy of it. Therefore, after the first function returns, the value of *vet may have changed, but the value of vet has not.
So how can we modify the value of the pointer vet? We use a pointer to a pointer. In the second function, we can allocate memory and store its address in *vet, and this modified value is kept after the second function returns. So we can free it in main.
We cannot do this in the first case, because if we try to allocate memory in the function, we only store the address in the copy of the pointer vet, not in the original vet.
You are correct in your understanding that type *var; is a pointer to data of type and type **var; is a pointer to a pointer to data of type.
The difference you asked about, allocating memory in the function and keeping track of it, is because of the ability to assign the value to the pointer.
In C, any time you want to modify a value in a function, you must provide it with a pointer to the data it is to modify.
If you want to allocate memory, you must know where it is in order to use it and later free it. If you pass only a pointer to a function and allocate memory inside it, the function cannot change the caller's pointer: it received a copy, and that copy (holding the new address) is discarded when the function returns. The function can only read the value it was passed, which in this use is rather pointless.
Consider this: the pointer variable type *vet; is created on the stack of main(). When function_1() is called from main(), a stack frame for this new function is created, and any argument passed to the function is stored in that frame. In this case, the argument is a pointer variable. function_1() can very well change the value of this pointer variable, but as soon as the function returns its stack frame is released and any changes are lost.
But when you pass a pointer to a pointer, what you pass is actually the address of a pointer variable and not the pointer variable itself. So when you work on this pointer variable inside the called function, you are actually working on memory in the stack frame of the calling function. And since this memory belongs to the calling function, any changes made by the called function persist even after the stack frame of the called function is released.
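The two situations can be sketched as follows. Both function names are hypothetical. The first shows that a pointer parameter is a separate object that merely holds the same address; the second writes through a pointer to the caller's variable.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the parameter vet is a distinct object that holds the
 * same address as the caller's pointer variable. */
int is_separate_copy(int *vet, int **callers_vet)
{
    assert(callers_vet != NULL);
    return (&vet != callers_vet) && (vet == *callers_vet);
}

/* Writes through the pointer-to-pointer, so the caller's variable changes. */
void retarget(int **callers_vet, int *new_target)
{
    assert(callers_vet != NULL);
    *callers_vet = new_target;
}
```

Calling is_separate_copy(vet, &vet) from main() reports that the parameter lives at a different location than main()'s vet, which is why assigning to it has no lasting effect, while retarget(&vet, &other) does change main()'s vet.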
First thing is, there is no garbage collector in C. So if you don't explicitly free an allocated memory block, it will eat up memory until the process exits. That is a mighty source of memory leaks.
So, once you've allocated a memory block using malloc or similar functions, you must keep a pointer to it in order to free it someday.
If you allocate a block within a function and you plan this block to remain useable after function termination, you must pass its value to some higher level piece of code that will eventually free it, long after the function that created it has exited.
To do so, you have three basic choices:
store the pointer in some global variable
return the pointer as the function's result
have one of the function arguments specify where the pointer is to be stored
case 1
void * global_address_of_buffer;
void alloc_a_buffer (int size)
{
global_address_of_buffer = malloc (size); // block reference in global var
}
alloc_a_buffer (10); // the size argument is required
// ...
free (global_address_of_buffer);
This is clearly impractical. If you call your function twice, you will lose the address of the first buffer.
One of the innumerable illustrations of why using globals will drag you screaming right into hell.
Nevertheless, it's a possibility.
case 2
void * alloc_a_buffer (int size)
{
return malloc (size); // block reference as return value
}
void * new_buffer;
new_buffer = alloc_a_buffer (10); // retrieve pointer through return value
// ...
free (new_buffer);
This is not always possible. For instance you might want all your functions to return a status indicating success or failure, in which case the return value will not be available for your pointer.
case 3
void alloc_a_buffer (int size, void ** buffer)
{
*buffer = malloc (size); // block reference set through 2nd parameter
}
void * new_buffer;
alloc_a_buffer (10, &new_buffer); // pass pointer address to the function
// ...
free (new_buffer);
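Case 3 combines naturally with the status return mentioned under case 2. A minimal sketch, assuming a hypothetical function try_alloc_buffer and the convention that 0 means success and -1 means failure:

```c
#include <assert.h>
#include <stdlib.h>

/* Stores the address of a new block in *buffer.
 * Returns 0 on success, -1 if malloc() fails (then *buffer is NULL).
 * Assumes size > 0, since malloc(0) may legitimately return NULL. */
int try_alloc_buffer(size_t size, void **buffer)
{
    assert(buffer != NULL && size > 0);

    *buffer = malloc(size);     /* block reference set through 2nd parameter */
    return (*buffer != NULL) ? 0 : -1;
}
```

The caller checks the status first, for example if (try_alloc_buffer(10, &new_buffer) != 0) { /* handle the error */ }, and otherwise uses new_buffer and frees it later.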
It is known that if we pass a pointer by value to a function, it cannot be freed inside the function, like so:
void func(int *p)
{
free(p);
p = NULL;
}
p holds a copy of a (presumably valid) address, so free(p) tries to, well, free it. But since it is a copy, it cannot really free it. How does the call to free() know that it cannot really free it ?
The code above does not produce an error. Does that mean free() just fails silently, "somehow" knowing that address passed in as argument cannot be worked upon ?
p holds a copy of a (presumably valid) address, so free(p) tries to, well, free it. But since it is a copy, it cannot really free it.
It's not true. free() can work just fine if p is a valid address returned by malloc() (or NULL).
In fact, this is a common pattern for implementing custom "destructor" functions (when writing OO-style code in C).
What you probably mean is that p won't change to NULL after this - but that's natural, since you're passing it by value. If you want to free() and null out the pointer, then pass it by pointer ("byref"):
void func(int **p)
{
if (p != NULL) {
free(*p);
*p = NULL;
}
}
and use this like
int *p = someConstructor();
func(&p);
// here 'p' will actually be NULL
The only problem is if this function is in a different DLL (Windows). Then, it may be linked with a different version of the standard library and have different ideas on how the heap is built.
Otherwise no problem.
Passing p to func() by value copies the pointer, creating a local copy inside func(), which frees the memory. func() then sets its own copy of the pointer p to NULL, which is useless: once the function completes, the parameter p ceases to exist. In the calling function you still have a pointer p holding the address, but the block is now on the free list and not usable for storage until it is allocated again.
What everybody is saying is that your memory will be freed by free(p);, but your original pointer (the one you passed to the function) will still hold the (now invalid) address. If a new block of memory including that address is allocated at a later stage, then your original pointer will become valid (for the memory manager) again, but will now point to completely different data, causing all sorts of problems and confusion.
No, you really do free the block of memory. After the function call, the pointer that was passed to the function points nowhere useful: it holds the same address, but the memory allocator no longer treats that address as allocated.