Garbage collector in c for variables inside a loop - c

I am mainly a Java person recently working on some projects involving C so please bear with me if it's a basic C question.
So inside my main I have a while loop and I declare a variable each iteration.
int main()
{
int done = 0;
while(!done)
{
char input[1024];
scanf("%s", input);
//parse the input string
...
}
}
Now since the input variable will change every time depending on what the user wants I have to use a "new" variable each time. However, I think the above declaration will be causing memory leak ultimately(or will it?). I would like to know if gcc takes care of garbage collection.
Is there any better approach without allocating and freeing after every iteration?

I think the above declaration will be causing memory leak ultimately(or will it?).
No, it wouldn't: input is an automatic (AKA "Stack") variable, it will get "deallocated" as soon as it goes out of scope (i.e. after the closing brace).
Is there any better approach without allocating and freeing after every iteration?
There is no actual allocation or deallocation going on: the space in the automatic memory (AKA "on the stack") is allocated by some compile-time bookkeeping around the stack pointer. The access of automatic variables is a very fast operation heavily assisted by hardware, so there is no loss of efficiency there.
Dynamic memory allocation (Java-style) is done with malloc/calloc/realloc in C. These are not garbage collected - you need to explicitly free every pointer that you allocated.
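To make the contrast concrete, here is a minimal sketch. The function names total_length and copy_line are made up for illustration and are not part of the question. The first function declares a buffer inside the loop body and needs no cleanup; the second hands out heap memory that the caller has to release with free.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies each line into a block-scoped buffer and
   sums the lengths. Nothing leaks, because `input` has automatic storage
   and its lifetime ends at the closing brace of the loop body. */
size_t total_length(const char *const *lines, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        char input[1024];
        snprintf(input, sizeof input, "%s", lines[i]); /* truncates long lines */
        total += strlen(input);
    }
    return total;
}

/* Hypothetical helper: returns a heap copy of `line`, or NULL on failure.
   This memory is NOT reclaimed automatically; the caller must free() it. */
char *copy_line(const char *line)
{
    size_t size = strlen(line) + 1;
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, line, size);
    return copy;
}
```

Calling total_length in a loop forever uses the same 1024 bytes of stack each time, whereas every successful copy_line call needs exactly one matching free.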

However, I think the above declaration will be causing memory leak ultimately(or will it?)
It will not. The input object is discarded every time the trailing } of the loop is encountered.

As others have already said, stack variables are automatically freed when they go out of scope and are usually overwritten soon.
That being said, the reason I'm writing this answer is to stress this: there is no garbage collector in C, at least not by default. This means that heap allocated memory (usually initialized with malloc, calloc and others) must be manually freed by you (using free).

Related

If I want a global VLA, could I use alloca() in the main function?

I have a main function for my app, and I allocate, for example, paths to configuration files, etc. Currently I use malloc for them, but they are never freed and always available for use throughout the lifetime of the app. I never even free them because the OS already automatically reclaims allocated memory when an application terminates. At this point, is there any reason not to use alloca instead of malloc? The program ends when main returns, and alloca memory is only released once the function it was allocated in returns. So, based on this logic, memory allocated in main with alloca is only deallocated once the program ends, which is what I want. Are these statements correct, and is there any reason not to use alloca in main for a 'global VLA'-like object that lasts until the program terminates? (I know alloca is considered bad practice, so by alloca I mean either alloca or a VLA in main.)
You can use alloca/VLA in main, but why?
The typical reason to use them is if you have some performance sensitive part that is called a lot, and you don't want the overhead of malloc/free. For main, your data is allocated once at the beginning of the program, so the overhead of a few malloc calls is negligible.
Another reason to not use alloca/VLA's in main is that they consume stack space, which is a very limited resource compared to heap space.
Depends on how much memory you need. If it is small enough (say a few hundred bytes or so), you can safely do alloca in main() or use VLAs.
But then, if the sizes of these arrays have a known upper-limit which is not very large, it would be even better and safer to declare them globally with that upper-limit as the size. That way you don't consume stack space and you don't have to malloc and then ensure the allocation succeeded. It is also then clear to whoever is reading that this piece of memory lives as long as the program does.
If the sizes can be arbitrarily large then the best thing to do is to continue using malloc() like you are already. Btw even if you are calling malloc() in main() and use it for the lifetime of the program, it is still considered good practice to free it before exit.
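A minimal sketch of the "declare it globally with an upper limit" suggestion, assuming a limit of 256 bytes is enough for a configuration path. CONFIG_PATH_MAX, set_config_path and get_config_path are names invented for this example.

```c
#include <stdio.h>

#define CONFIG_PATH_MAX 256   /* assumed upper limit, chosen for illustration */

/* Static storage duration: exists for the whole run and uses no stack space. */
static char config_path[CONFIG_PATH_MAX];

/* Builds "dir/file" in the static buffer.
   Returns the path length, or -1 if the result would not fit. */
int set_config_path(const char *dir, const char *file)
{
    int n = snprintf(config_path, sizeof config_path, "%s/%s", dir, file);
    if (n < 0 || (size_t)n >= sizeof config_path)
        return -1;
    return n;
}

/* Read-only access for the rest of the program. */
const char *get_config_path(void)
{
    return config_path;
}
```

There is nothing to free here and no allocation that can fail at runtime; the only failure mode is a path longer than the chosen limit, which the return value reports.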
Technically no, because any variable declared in a function will not be global. But you can do something like this:
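char *buffer; // file-scope pointer, visible to the whole program
int main(void) {
    char buf[size]; // size is assumed to be computed before this point
    buffer = buf; // valid only until main returns
    // ... rest of the program ...
}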
That would give you an interface to access the buffer globally.
At this point, is there any reason not to use alloca instead of malloc
This is one question that typically should be asked the other way around. Is there any reason to use alloca instead of malloc? Consider changing if you have performance issues, but if you just want to avoid using free, I'd say that's a bad reason.
But I don't really see the point here. If you have an allocated buffer that you want to live from when the program starts to when it ends, then just free it in the end of the main function.
int main(void) {
char *buf = malloc(size);
// Do work
free(buf);
}
I wrote a long answer about alloca and VLAs that you might find useful: Do I really need malloc?
VLA (as defined by the standard) and non-standard alloca are both meant to be used for allocating temporary, small arrays at local scope. Nothing else.
Allocating large objects on the stack is a well-known source for subtle & severe stack overflow bugs. This is the reason you should avoid large VLA and alloca objects. Whenever you need large objects at file scope, they should either be static arrays or dynamically allocated with malloc.
It should be noted that stack allocation is usually faster than heap allocation, because stack allocation doesn't need to concern itself with look-ups, fragmentation and other heap implementation-specific concerns. Stack allocation just says "these 100 bytes are mine" and then you are ready to go.
Regarding general confusion about "stack vs heap" please see What gets allocated on the stack and the heap?
You can't even place a standard VLA at file scope, because the array size needs to be an integer constant expression there. Plus the standard (C17 6.7.6) explicitly says that you aren't allowed to:
If an identifier is declared to be an object with static or thread storage
duration, it shall not have a variable length array type.
As for alloca it isn't standard C and bad for that reason. But it's also bad because it doesn't have any type safety, so VLA is preferred over alloca - it is safer and more portable.
It should be noted that the main purpose of VLA in modern programming is however to enable pointers to VLA, rather than allocating array objects of VLA type, which is a feature of limited use.
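As an illustration of the "pointers to VLA" use mentioned above, here is a small sketch that assumes a compiler with VLA support (for example gcc or clang in C99 mode or later). The function name sum_matrix is hypothetical. No array is allocated on the stack here; the VLA type only describes the shape of memory that the caller owns.

```c
#include <stddef.h>

/* The parameter m is adjusted to `int (*m)[cols]`, a pointer to a
   variable-length row, so m[r][c] indexes correctly for any column count. */
long sum_matrix(size_t rows, size_t cols, int m[rows][cols])
{
    long sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += m[r][c];
    return sum;
}
```

The same function works for an ordinary 2D array, a static array, or a block obtained from malloc, as long as the dimensions passed in match the actual layout.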
I never even free them because the OS already automatically reclaims allocated memory when an application terminates.
While that is correct, it is still considered good practice to call free() manually. Because if you have any heap corruption or pointer-related bugs somewhere in the program, you'll get a crash upon calling free(). Which is a good thing, since it allows you to catch such (common) bugs early on during development.
(If you are concerned about the performance of free(), you can exclude the free() calls from the release build and only use them in debug build. Though performance is rarely an issue when closing down the program - usually you can just shut down the GUI if any then let the program chew away on clean-up code in the background.)

C: Malloc and Free

I am trying to understand the C functions malloc and free. I know this has been discussed a lot on StackOverflow. However, I think I kind of know what these functions do by now. I want to know why to use them. Let's take a look at this piece of code:
int n = 10;
char* array;
array = (char*) malloc(n * sizeof(char));
// Check whether memory could be allocated or not...
// Do whatever with array...
free(array);
array = NULL;
I created a pointer of type char which I called array. Then I used malloc to find a chunk of memory that is currently not used and (10 * sizeof(char)) bytes large. That address I cast to type char pointer before assigning it to my previously created char pointer. Now I can work with my char array. When I am done, I'll use free to free that chunk of memory since it's not being used anymore.
I have one question: Why wouldn't I just do char array[10];? Wikipedia has only one small sentence to give to answer that, and that sentence I unfortunately don't understand:
However, the size of the array is fixed at compile time. If one wishes to allocate a similar array dynamically...
The slide from my university is similarly concise:
It is also possible to allocate memory from the heap.
What is the heap? I know a data structure called heap. :)
However, if someone could explain to me in which case it makes sense to use malloc and free instead of the regular declaration of a variable, that'd be great. :)
C provides three different possible "storage durations" for objects:
Automatic - local storage that's specific to the invocation of the function it's in. There may be more than one instance of objects created with automatic storage, if a function is called recursively or from multiple threads. Or there may be no instances (if/when the function isn't being called).
Static - storage that exists, in exactly one instance, for the entire duration of the running program.
Allocated (dynamic) - created by malloc, and persists until free is called to free it or the program terminates. Allocated storage is the only type of storage with which you can create arbitrarily large or arbitrarily many objects which you can keep even when functions return. This is what malloc is useful for.
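The three storage durations can be seen side by side in a short sketch. The names boxes_made, make_box and boxes_made_so_far are invented for this example.

```c
#include <stdlib.h>

/* Static: exactly one instance, alive for the whole run of the program. */
static int boxes_made;

/* Returns a heap-allocated int holding value * 2, or NULL on failure. */
int *make_box(int value)
{
    int doubled = value * 2;          /* automatic: gone when make_box returns */
    int *box = malloc(sizeof *box);   /* allocated: lives until free(box) */
    if (box == NULL)
        return NULL;
    *box = doubled;
    boxes_made++;
    return box;
}

int boxes_made_so_far(void)
{
    return boxes_made;
}
```

Only the allocated object can be handed back to the caller and kept for as long as needed; returning the address of doubled instead would leave the caller with a dangling pointer.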
First of all there is no need to cast the malloc
array = malloc(n * sizeof(char));
I have one question: Why wouldn't I just do char array[10];?
What will you do if you don't know how much storage space you need (say, if you wanted a structure of arbitrary size, like a stack or a linked list)?
In this case you have to rely on malloc (since C99 you can also use variable length arrays, but only for small sizes).
The function malloc is used to allocate a certain amount of memory during the execution of a program. The malloc function will request a block of memory from the heap. If the request is granted, the operating system will reserve the requested amount of memory.
When the memory is no longer needed, you must release it by calling free, which hands it back to the allocator for reuse.
Simply put: you use an array when you know at compile time how many elements it will need to hold; you use malloc with pointers when you don't.
For more detail read Heap Management With malloc() and free().
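A minimal sketch of the "size only known at run time" case. The function name make_squares is hypothetical.

```c
#include <stdlib.h>

/* Allocates an array of n ints filled with 0, 1, 4, 9, ...
   Returns NULL if the allocation fails. The caller must free() the result. */
int *make_squares(size_t n)
{
    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);
    return squares;
}
```

Here n can come from user input or a file, which is exactly what a declaration such as char array[10] cannot express.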
Imagine you want to allocate 1,000 arrays.
If you did not have malloc and free... but needed a declaration in your source for each array, then you'd have to make 1,000 declarations. You'd have to give them all names. (array1, array2, ... array1000).
The idea in general of dynamic memory management is to handle items when the quantity of items is not something you can know in advance at the time you are writing your program.
Regarding your question: Why wouldn't I just do char array[10];?. You can, and most of the time, that will be completely sufficient. However, what if you wanted to do something similar, but much much bigger? Or what if the size of your data needs to change during execution? These are a few of the situations that point to using dynamically allocated memory (calloc() or malloc()).
Understanding a little about how/when the stack and heap are used would be good: when you use malloc() or calloc(), the memory comes from the heap, whereas automatic variables are given memory on the stack and are freed when you leave the scope of that variable, i.e. the function or block it was declared in. (Static variables live in neither place: they occupy a fixed region for the whole run of the program.)
Using malloc and calloc become very useful when the size of the data you need is not known until run-time. When the size is determined, you can easily call one of these to allocate memory onto the heap, then when you are finished, free it with free()
Regarding What is the heap? There is a good discussion on that topic here (slightly different topic, but good discussion)
In response to However, if someone could explain to me in which case it makes sense to use malloc() and free()...?
In short, If you know what your memory requirements are at build time (before run-time) for a particular variable(s), use static / automatic creation of variables (and corresponding memory usage). If you do not know what size is necessary until run-time, use malloc() or calloc() with a corresponding call to free() (for each use) to create memory. This is of course a rule-of-thumb, and a gross generalization. As you gain experience using memory, you will find scenarios where even when size information is known before run-time, you will choose to dynamically allocate due to some other criteria. (size comes to mind)
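For the case where the size of the data has to change during execution, a common approach is a buffer that grows with realloc. The following is only a sketch; struct int_vec, int_vec_push and int_vec_free are names made up for illustration.

```c
#include <stdlib.h>

/* A growable array of ints. Initialise with: struct int_vec v = {0}; */
struct int_vec {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends value, doubling the capacity when full.
   Returns 0 on success, -1 on allocation failure (vector left unchanged). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* One free() for the whole buffer, however many times it grew. */
void int_vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Assigning the result of realloc to a temporary first matters: if realloc fails it returns NULL and leaves the old block intact, so overwriting v->data directly would leak it.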
If you know in advance that you only require an array of 10 chars, you should just say char array[10]. malloc is useful if you don't know in advance how much storage you need. It is also useful if you need storage that is valid after the current function returns. If you declare array as char array[10], it will be allocated on the stack. This data will not be valid after your function returns. Storage that you obtain from malloc is valid until you call free on it.
Also, there is no need to cast the return value of malloc.
Why to use free after malloc can be understood this way: it is good style to free memory as soon as you no longer need it. If you don't free the memory, the program will usually still work, but its memory use keeps growing for as long as it runs.
You may also choose to leave memory unfreed when you exit the program. malloc() uses the heap and the complete heap of a process is freed when the process exits. The only reason why people insist on freeing the memory is to avoid memory leaks.
From here:
Allocation Myth 4: Non-garbage-collected programs should always
deallocate all memory they allocate.
The Truth: Omitted deallocations in frequently executed code cause
growing leaks. They are rarely acceptable. but Programs that retain
most allocated memory until program exit often perform better without
any intervening deallocation. Malloc is much easier to implement if
there is no free.
In most cases, deallocating memory just before program exit is
pointless. The OS will reclaim it anyway. Free will touch and page in
the dead objects; the OS won't.
Consequence: Be careful with "leak detectors" that count allocations.
Some "leaks" are good!
Also, the wiki makes a good point about heap-based memory allocation:
The heap method suffers from a few inherent flaws, stemming entirely
from fragmentation. Like any method of memory allocation, the heap
will become fragmented; that is, there will be sections of used and
unused memory in the allocated space on the heap. A good allocator
will attempt to find an unused area of already allocated memory to use
before resorting to expanding the heap. The major problem with this
method is that the heap has only two significant attributes: base, or
the beginning of the heap in virtual memory space; and length, or its
size. The heap requires enough system memory to fill its entire
length, and its base can never change. Thus, any large areas of unused
memory are wasted. The heap can get "stuck" in this position if a
small used segment exists at the end of the heap, which could waste
any magnitude of address space, from a few megabytes to a few hundred.

Am I overusing malloc in C?

I am working on learning C. I understand that malloc() allocates a block of bytes that cannot be changed or corrupted without user request; however, I find myself using it very often. To be exact, I am using malloc every time I want to create either a struct or any of its contents that I want to reference in the future. I also understand that I need to free() the allocated memory when I am done with it.
Is my use of malloc correct?
Dynamic memory allocation (malloc and family) are there for two reasons:
Your data needs to persist beyond the scope that allocated it (e.g. multithreading)
Whatever you are allocating is too large for your stack
You should really avoid allocating dynamic memory for any other reason. Automatic (stack) variables are far less error-prone and are deallocated for you at the end of the scope.
Having "corrupted memory" like you call it can only really arise from bad programming and can happen on both the stack and the heap and you should not rely on dynamic memory to provide safety from buffer overflows or other mistakes that lead to memory corruption.
There is a reason why many functions in the C standard library get a pointer to a buffer as an argument to put results in: it allows you to allocate those buffers on your stack. e.g:
ssize_t read(int fd, void *buf, size_t count);
Also, as mentioned in another answer: your stack memory is most likely already in the CPU cache and is thus faster to access.
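The caller-provided-buffer pattern can be sketched like this. The function format_point is hypothetical; it follows the same shape as read in that the caller decides where the result lives.

```c
#include <stdio.h>

/* Writes "(x, y)" into the caller's buffer.
   Returns the number of characters written (excluding the terminator),
   or -1 if the buffer is too small. No malloc, so nothing to free. */
int format_point(char *buf, size_t size, int x, int y)
{
    int n = snprintf(buf, size, "(%d, %d)", x, y);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

A caller can then write char buf[32]; format_point(buf, sizeof buf, 3, 4); and the buffer disappears on its own when the enclosing block ends.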
Please also consider the other types of allocation:
int foo;
outside of a block will allocate a global variable, which is alive during the whole lifetime of your process, and visible for other modules of the program.
static int foo;
outside of a block is the same but visible in the actual module only.
int foo;
inside a block is alive only while the code in the block runs, then it's destroyed.
static int foo;
inside a block is visible in the block only, but it preserves its value for the entire lifetime of the process.
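The difference between the two in-block declarations shows up as soon as the function is called twice. A small sketch, with next_ticket and fresh_each_call as made-up names:

```c
/* `last` is static inside a block: initialised to 0 once and kept
   between calls, but not visible outside this function. */
int next_ticket(void)
{
    static int last;
    return ++last;
}

/* `tmp` is automatic: a new object on every call, destroyed on return. */
int fresh_each_call(void)
{
    int tmp = 0;
    tmp++;
    return tmp;
}
```

Successive calls to next_ticket yield increasing values, while fresh_each_call returns the same value every time.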
I do a lot of embedded C coding, where using malloc() is absolutely prohibited, and that is entirely workable. You typically need malloc() if you don't know the size of your problem at compile time. But even in some cases like that, you can replace dynamic memory allocation with other techniques such as recursion, line-based processing, etc.
It depends on what you mean by
cannot be changed or corrupted without user request
If you are referring to code, then it's usually called the client, not the user. And it's still unclear what you mean by that. But that's not the point.
The point is that malloc() is one of the functions used for dynamic memory allocation. It means that you can pass an address returned by this function somewhere else and the data stored there will stay there until it's manually deallocated. This is unlike automatic allocation, which is released automatically when the variable goes out of scope.
So you probably shouldn't be using malloc() if the memory is freed in the same scope that allocated it: it gains you nothing, and automatic allocation is faster because it is cheaper to set up and easier for the CPU to cache than heap-allocated memory.

Why are the contents pointed to by a pointer not changed when memory is deallocated using free()?

I am a newbie when it comes to dynamic memory allocation. When we free the memory using void free(void *ptr) the memory is deallocated but the contents of the pointer are not deleted. Why is that? Is there any difference in more recent C compilers?
Computers don't "delete" memory as such, they just stop using all references to that memory cell and forget that anything of value is stored there. For example:
int* func (void)
{
int x = 5;
return &x;
}
printf("%d", *func()); // undefined behavior
Once the function has finished, the program stops reserving the memory location where x is stored, any other part of the program (or perhaps another program) is free to use it. So the above code could print 5, or it could print garbage, or it could even crash the program: referencing the contents of a memory cell that has ceased to be valid is undefined behavior.
Dynamic memory is no exception to this and works in the same manner. Once you have called free(), the contents of that part of the memory can be used by anyone.
Also, see this question.
The thing is that accessing memory after it has been freed is undefined behavior. It's not only that the memory contents are undefined; accessing them could lead to anything. Some compilers, when you build a debug version of the code, actually do change the contents of freed memory to aid in debugging. In release versions that is generally unnecessary, so the memory is just left as is. Either way, it is not something you can safely rely upon: don't access freed memory, it's unsafe!
In C, parameters are passed by value. So free just can't change the value of ptr.
Any change it would make would only change the value within the free function, and won't affect the caller's variable.
Also, changing it won't be so much help. There can be multiple pointers pointing to the same piece of memory, and they should all be reset when freeing. The language can't keep track of them all, so it leaves the programmer to handle the pointers.
This is very normal, because clearing the memory location after free is an overhead and generally not necessary. If you have security concerns, you can wrap the free call within a function which clears the region before freeing. You'll also notice that this requires the knowledge of the allocation size, which is another overhead.
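A wrapper along those lines might look like the sketch below. The name wipe_and_free is hypothetical, and the caller is assumed to know the allocation size. It takes the address of the pointer so that, unlike plain free, it can also reset the caller's variable.

```c
#include <stdlib.h>

/* Zeroes `size` bytes at *pp, frees the block, and sets *pp to NULL.
   The caller must pass the size it originally allocated. */
void wipe_and_free(void **pp, size_t size)
{
    if (pp == NULL || *pp == NULL)
        return;

    /* Writing through a volatile pointer discourages the compiler from
       optimising the wipe away as a dead store. */
    volatile unsigned char *bytes = *pp;
    for (size_t i = 0; i < size; i++)
        bytes[i] = 0;

    free(*pp);
    *pp = NULL;   /* only this one pointer; other copies still dangle */
}
```

Note that this resets just the pointer whose address was passed in. Any other copies of the address elsewhere in the program are still dangling, which is the bookkeeping problem described above.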
Actually the C programming language specifies that after the lifetime of the object, even the value of any pointer pointing to it becomes indeterminate, i.e. you can't even depend on the pointer to even retain the original value.
That is because a good compiler will try to aggressively store all the variables into the CPU registers instead of memory. So after it sees that the program flow calls a function named free with the argument ptr, it can mark the register of the ptr free for other use, until it has been assigned to again, for example ptr = malloc(42);.
In between these two, the pointer could be seen changing its value, or comparing unequal to its original value, or showing other similar behaviour. Here's an example of what might happen.

malloc and scope

I am struggling to wrap my head around malloc in C, specifically when it needs to be free()'d. I am getting weird errors in gcc such as:
... free(): invalid next size (fast): ...
when I try to free a char pointer. For example, when reading from an input file, it will crash on certain lines when doing the following:
FILE *f = fopen(file,"r");
char x[256];
while(1) {
if(fgets(x,sizeof x,f)==NULL) break;
char *tmp = some_function_return_char_pointer(x); //OR malloc(nbytes);
// do some stuff
free(tmp); // this is where I get the error, but only sometimes
}
I checked for obvious things, such as x being NULL, but it's not; it just crashes on random lines.
But my REAL question is - when do I need to use free()? Or, probably more correctly, when should I NOT use free? What if malloc is in a function, and I return the var that used malloc()? What about in a for or while loop? Does malloc-ing for an array of struct have the same rules as for a string/char pointer?
I gather from the errors I'm getting in gcc on program crash that I'm just not understanding malloc and free. I've spent my quality time with Google and I'm still hitting brick walls. Are there any good resources you've found? Everything I see says that whenever I use malloc I need to use free. But then I try that and my program crashes. So maybe it's different based on a variable's scope? Does C free the memory at the end of a loop when a variable is declared inside of it? At the end of a function?
So:
for(i=0;i<100;i++) char *x=malloc(n); // no need to use free(x)?
but:
char *x;
for(i=0;i<100;i++) {
x=malloc(n);
free(x); //must do this, since scope of x greater than loop?
}
Is that right?
Hopefully I'm making sense...
malloc() is C's dynamic allocator. You have to understand the difference between automatic (scoped) and dynamic (manual) variables.
Automatic variables live for the duration of their scope. They're the ones you declare without any decoration: int x;
Most variables in a C program should be automatic, since they are local to some piece of code (e.g. a function, or a loop), and they communicate via function calls and return values.
The only time you need dynamic allocation is when you have some data that needs to outlive any given scope. Such data must be allocated dynamically, and eventually freed when it is no longer necessary.
The prime usage example for this is your typical linked list. The list nodes cannot possibly be local to any scope if you are going to have generic "insert/erase/find" list manipulation functions. Thus, each node must be allocated dynamically, and the list manipulation functions must ensure that they free those nodes that are no longer part of the list.
In summary, variable allocation is fundamentally and primarily a question of scope. If possible keep everything automatic and you don't have to do anything. If necessary, use dynamic allocation and take care to deallocate manually whenever appropriate.
(Edit: As @Oli says, you may also want to use dynamic allocation in a strictly local context at times, because most platforms limit the size of automatic variables to a much smaller limit than the size of dynamic memory. Think "huge array". Exceeding the available space for automatic variables usually has a colourful name such as "pile overrun" or something similar.)
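The linked-list case can be sketched as follows. struct node, list_push, list_length and list_free are illustrative names, not taken from the question. Each node is allocated inside list_push but outlives that call, which is why it cannot be an automatic variable.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Inserts a new node at the front of the list.
   Returns 0 on success, -1 if the allocation failed (list unchanged). */
int list_push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Frees every node: one free() per malloc() made in list_push. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;  /* read before freeing */
        free(head);
        head = next;
    }
}
```

The next pointer is saved before each free because reading head->next after freeing head would be a use of freed memory.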
In general, every call to malloc must have one corresponding call to free.* This has nothing to do with scope (i.e. nothing to do with functions or loops).
* Exceptions to this rule include using functions like strdup, but the principle is the same.
Broadly speaking, every pointer that is ever returned by malloc() must eventually be passed to free(). The scope of the variable that you store the pointer in does not affect this, because even after the variable is no longer in scope, the memory that the pointer points to will still be allocated until you call free() on it.
Well, the lifetime of malloc'd memory lies between the calls to malloc and free, or otherwise lasts until the process stops (that is when the OS cleans up for the process). If you never call free you get a memory leak. That can happen when the address that you could pass to free goes out of scope before you actually used it: it is like losing your car keys; the car is still there but you can't really drive it. The error you are getting is most likely either because the function returns a pointer to memory that was not allocated with malloc, or because something wrote past the end of an allocated block and corrupted the heap's bookkeeping. (Passing a null pointer to free is allowed and does nothing.)
You should free memory when you will no longer be accessing it. You should not free memory if you will be accessing it. This will give you a lot of pain.
If you don't want memory leak, you have to free the memory from malloc.
It can be very tricky. For example, if the // do some stuff has a continue, the free will be skipped and lead to memory leak. It is tricky, so we have shared_ptr in C++; and rumor has it salary of C programmer is higher than C++ programmer.
Sometimes we don't care about a memory leak. If the memory holds something that is needed for the whole lifetime of the program, you can choose not to free it. Example: a string holding an environment variable.
PS: Valgrind is a tool to help detect memory bugs. Especially useful for memory leak.
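To illustrate the continue pitfall, here is a sketch of a loop that allocates on every iteration and makes sure each exit path from the body releases the memory. The function count_nonempty is a made-up example, not the asker's code.

```c
#include <stdlib.h>
#include <string.h>

/* Counts the non-empty strings, working on a heap copy of each one.
   Every malloc() in the loop is matched by a free() on every path. */
size_t count_nonempty(const char *const *lines, size_t count)
{
    size_t nonempty = 0;
    for (size_t i = 0; i < count; i++) {
        size_t size = strlen(lines[i]) + 1;
        char *tmp = malloc(size);
        if (tmp == NULL)
            break;                 /* nothing was allocated, nothing to free */
        memcpy(tmp, lines[i], size);

        if (tmp[0] == '\0') {
            free(tmp);             /* without this, `continue` would leak tmp */
            continue;
        }

        nonempty++;
        free(tmp);
    }
    return nonempty;
}
```

Declaring tmp inside or outside the loop makes no difference to this rule: what has to be freed is the block returned by malloc, not the variable that happens to hold its address.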
malloc(n) allocates n bytes of memory from a region called the heap and returns a void* pointer to it. The memory is allocated at runtime. Once you have allocated memory dynamically, scope does not matter as long as you keep a pointer to it (that is, its address). For example:
int* allocate_an_integer_array(int n)
{
int* p = (int*) (malloc(sizeof(int)*n));
return p;
}
This function simply allocates enough heap memory for n integers and returns a pointer to the first one. The pointer can be used in the calling function however you want. The scope does not matter as long as you still hold the pointer.
free(p) returns the memory to the heap.
The only thing you need to remember is to free it: if you don't free it and lose its address, there will be a memory leak, because as far as the OS is concerned you are still using the memory.
Also, after freeing, set the pointer to NULL so that you don't use it again, as the same memory may be handed out again at any time for a different purpose.
So, all you need to do is to be careful...
Hope it helps!

Resources