C Array Instantiation - Stack or Heap Allocation?

I guarantee that this question has been asked before, but I haven't been able to find it via search; sorry in advance for any redundancies.
It's my (potentially wrong) understanding that you only allocate to the stack when you know the size of an object at compile time. So in the case of initializing an array, you could do one of these (and this should go on the stack):
char charArray[50];
Since the size of this array is known at compile time, this should have no issues.
On the other hand, this (I believe) is also valid code:
char anotherCharArray[someVariable + 50];
Would this go on the stack as well? I am pretty sure the code segfaults if you free() this, so it makes me think it does, but it doesn't really make sense to me. Similarly, is the 100% sole situation where you have to use free() when the data was allocated via malloc?
Thanks in advance for your help.

If char charArray[50]; is defined at file scope (outside of all functions) or is static, it's not going to be on the stack; it's going to be a global variable preallocated at program start. If it's not static and is defined at function scope, it's going to be on the stack.
char anotherCharArray[someVariable + 50]; can only be defined at function scope and is going to be on the stack.
All of the above applies to typical implementations of C. Atypical ones may use the heap instead of the stack and instead of the preallocated space in the data section of the program.
You don't free() what hasn't been allocated with malloc(), calloc() or realloc(). Simple. Some functions may imply the use of one of the above, e.g. POSIX strdup().
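The cases above can be put side by side in a minimal sketch, assuming a typical hosted implementation with VLA support (optional since C11). The names `file_scope_buffer`, `storage_demo` and `extra`, and the 1024-byte cap, are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* File scope: static storage duration, set up before main() runs. */
char file_scope_buffer[50];

/* Touches one array of each kind and returns the total number of bytes
   involved, or 0 if the heap allocation fails. */
size_t storage_demo(size_t extra)
{
    assert(extra <= 1024);                  /* keep the VLA small */

    static char static_buffer[50];          /* static storage, not the stack */
    char fixed_array[50];                   /* automatic: typically the stack */
    char vla[extra + 50];                   /* automatic VLA: typically the stack */
    char *heap_block = malloc(extra + 50);  /* allocated storage: the heap */

    if (heap_block == NULL)
        return 0;

    memset(static_buffer, 0, sizeof static_buffer);
    memset(fixed_array, 0, sizeof fixed_array);
    memset(vla, 0, sizeof vla);             /* sizeof a VLA is computed at run time */
    memset(heap_block, 0, extra + 50);

    size_t total = sizeof file_scope_buffer + sizeof static_buffer
                 + sizeof fixed_array + sizeof vla + (extra + 50);

    free(heap_block);   /* only the malloc'd block is ever passed to free() */
    return total;       /* the automatic arrays go away on return */
}
```

Only `heap_block` is passed to free(); the other four arrays are released by the implementation without any call from the program.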

Similarly, is the 100% sole situation where you have to use free() when the data was allocated via malloc?
Yes. (Apart from calloc and realloc, their return value is also to be free()'d. Similarly, there are functions that use malloc() and this fact is documented, for example strdup() - the return value of these functions is also to be freed using free(), obviously.)
char anotherCharArray[someVariable + 50];
Would this go on the stack as well?
Yes, it does in most implementations (the C standard itself never mentions a stack, but on most platforms that is where automatic arrays live). And yes, this is valid code, but only since C99, which introduced variable-length arrays; C11 made them an optional feature.

Related

If I want a global VLA, could I use alloca() in the main function?

I have a main function for my app, and I allocate, for example, paths to configuration files, etc. Currently I use malloc for them, but they are never freed and always available for use throughout the lifetime of the app. I never even free them because the OS already automatically reclaims allocated memory when an application terminates. At this point, is there any reason not to use alloca instead of malloc, because the program ends when main returns and alloca memory is only deleted once the function it was allocated in is freed. So based on this logic, memory allocated in the main function with alloca is only deallocated once the program ends which is desired. Are these statements correct, and is there any reason not to use alloca (alloca is bad practice so when I said alloca meant alloca or making a VLA in main) in main for a 'global VLA' like object that lasts until the program terminates?
You can use alloca/VLA in main, but why?
The typical reason to use them is if you have some performance sensitive part that is called a lot, and you don't want the overhead of malloc/free. For main, your data is allocated once at the beginning of the program, so the overhead of a few malloc calls is negligible.
Another reason to not use alloca/VLA's in main is that they consume stack space, which is a very limited resource compared to heap space.
Depends on how much memory you need. If it is small enough (say a few hundred bytes or so), you can safely do alloca in main() or use VLAs.
But then, if the sizes of these arrays have a known upper-limit which is not very large, it would be even better and safer to declare them globally with that upper-limit as the size. That way you don't consume stack space and you don't have to malloc and then ensure the allocation succeeded. It is also then clear to whoever is reading that this piece of memory lives as long as the program does.
If the sizes can be arbitrarily large then the best thing to do is to continue using malloc() like you are already. Btw even if you are calling malloc() in main() and use it for the lifetime of the program, it is still considered good practice to free it before exit.
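The "declare it globally with an upper limit" suggestion might look like the sketch below. The limit `CONFIG_PATH_MAX` and the helper names `set_config_path` and `get_config_path` are assumptions for illustration, not something from the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CONFIG_PATH_MAX 256   /* assumed upper limit for the example */

/* Static storage: lives for the whole program, uses no stack space,
   and never needs free(). */
static char config_path[CONFIG_PATH_MAX];

/* Copies `path` into the global buffer; returns false if it does not fit. */
bool set_config_path(const char *path)
{
    assert(path != NULL);
    size_t length = strlen(path);
    if (length >= sizeof config_path)
        return false;
    memcpy(config_path, path, length + 1);   /* +1 copies the terminator */
    return true;
}

/* Read-only access for the rest of the program. */
const char *get_config_path(void)
{
    return config_path;
}
```

The only failure mode left is a path that exceeds the limit, which the caller can check once at startup.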
Technically no, because any variable declared in a function will not be global. But you can do something like this:
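#include <stddef.h>
char *buffer;                 /* file-scope pointer, visible everywhere */
int main(void) {
    size_t size = 256;        /* example value; any run-time size works */
    char buf[size];           /* VLA that lives until main returns */
    buffer = buf;
    /* ... the rest of the program accesses the array through buffer ... */
}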
That would give you an interface to access the buffer globally.
At this point, is there any reason not to use alloca instead of malloc
This is one question that typically should be asked the other way around. Is there any reason to use alloca instead of malloc? Consider changing if you have performance issues, but if you just want to avoid using free, I'd say that's a bad reason.
But I don't really see the point here. If you have an allocated buffer that you want to live from when the program starts to when it ends, then just free it in the end of the main function.
int main(void) {
char *buf = malloc(size);
// Do work
free(buf);
}
I wrote a long answer about alloca and VLAs that you might find useful: Do I really need malloc?
VLA (as defined by the standard) and non-standard alloca are both meant to be used for allocating temporary, small arrays at local scope. Nothing else.
Allocating large objects on the stack is a well-known source for subtle & severe stack overflow bugs. This is the reason you should avoid large VLA and alloca objects. Whenever you need large objects at file scope, they should either be static arrays or dynamically allocated with malloc.
It should be noted that stack allocation is usually faster than heap allocation, because stack allocation doesn't need to concern itself with look-ups, fragmentation and other heap implementation-specific concerns. Stack allocation just says "these 100 bytes are mine" and then you are ready to go.
Regarding general confusion about "stack vs heap" please see What gets allocated on the stack and the heap?
You can't even place a standard VLA at file scope, because the array size needs to be an integer constant expression there. Plus the standard (C17 6.7.6) explicitly says that you aren't allowed to:
If an identifier is declared to be an object with static or thread storage duration, it shall not have a variable length array type.
As for alloca it isn't standard C and bad for that reason. But it's also bad because it doesn't have any type safety, so VLA is preferred over alloca - it is safer and more portable.
It should be noted that the main purpose of VLA in modern programming is however to enable pointers to VLA, rather than allocating array objects of VLA type, which is a feature of limited use.
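To illustrate what "pointers to VLA" buys you, here is a minimal sketch, assuming a compiler that supports variably modified types. The function names `fill_matrix` and `last_element` are hypothetical. The array itself lives on the heap; only its type depends on a run-time value.

```c
#include <assert.h>
#include <stdlib.h>

/* `matrix` is a pointer to a VLA type: the row length `cols` is a run-time
   value, yet matrix[r][c] indexing works as for a fixed-size 2D array. */
void fill_matrix(size_t rows, size_t cols, int (*matrix)[cols])
{
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            matrix[r][c] = (int)(r * cols + c);
}

/* Allocates a rows x cols matrix on the heap, fills it, and returns the
   value stored in the last element (or -1 if the allocation fails). */
int last_element(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    int (*matrix)[cols] = malloc(rows * sizeof *matrix);  /* heap, not stack */
    if (matrix == NULL)
        return -1;
    fill_matrix(rows, cols, matrix);
    int value = matrix[rows - 1][cols - 1];
    free(matrix);
    return value;
}
```

No VLA object is created here, so the stack-overflow concern does not apply; the VLA type is used purely for indexing.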
I never even free them because the OS already automatically reclaims allocated memory when an application terminates.
While that is correct, it is still considered good practice to call free() manually. Because if you have any heap corruption or pointer-related bugs somewhere in the program, you'll get a crash upon calling free(). Which is a good thing, since it allows you to catch such (common) bugs early on during development.
(If you are concerned about the performance of free(), you can exclude the free() calls from the release build and only use them in debug build. Though performance is rarely an issue when closing down the program - usually you can just shut down the GUI if any then let the program chew away on clean-up code in the background.)

How do I know when I ought to free strings in C returned by library functions?

Which strings ought I to free in C on my own, using free()¹?
My state of knowledge:
char a[256];: no
char *a = "abcdefg";: no
char *a = malloc(15L);: yes
a string returned by getenv(): no
strings returned by Windows functions²: ???
¹ or LocalFree()/GlobalFree()/VirtualFree()
² in particular by GetCommandLineW()
This will always be mentioned in the documentation for any API you use that returns strings (or other data larger than a single simple value such as an integer).
Yes, this means you have to read the documentation thoroughly for all such API functions, in order to keep track and not leak memory.
The only chunks of memory that must be freed are those that were previously malloced.
Now the questions are "is this pointer a pointer to memory that was created by malloc ?" and, if so, "am I supposed to free it myself or will some other function take care of it ?"
There are no easy answers to these questions, generally the documentation will tell you so, but the rule of thumb is that the module that takes care of memory creation also takes care of deallocation. So, if the library you use expects you to provide already allocated memory, you are supposed to free it, too, when needed.
In case of explicit dynamic allocation, i.e. malloc/calloc/realloc, you have to explicitly free it. But since you mention strings, there is a special function, strdup(), which calls malloc under the hood for you.
In the case of strdup(), even though you did not allocate the memory yourself, you MUST free it.
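A function that allocates on your behalf can be sketched as follows. `duplicate_string` is a hypothetical stand-in written for this example; it mirrors what POSIX strdup() is documented to do, but it is not the library's implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper that behaves like POSIX strdup(): it calls malloc()
   internally, so the caller owns the result and must free() it. */
char *duplicate_string(const char *source)
{
    assert(source != NULL);
    size_t length = strlen(source) + 1;   /* include the terminator */
    char *copy = malloc(length);
    if (copy != NULL)
        memcpy(copy, source, length);
    return copy;                          /* NULL on allocation failure */
}
```

A caller would write `char *s = duplicate_string("abc");`, use `s`, and then call `free(s);`, even though no malloc() appears in the calling code. This is exactly the kind of ownership rule that has to come from the function's documentation.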

malloc and scope

I am struggling to wrap my head around malloc in c - specifically when it needs to be free()'d. I am getting weird errors in gcc such as:
... free(): invalid next size (fast): ...
when I try to free a char pointer. For example, when reading from an input file, it will crash on certain lines when doing the following:
FILE *f = fopen(file,"r");
char x[256];
while(1) {
if(fgets(x,sizeof x,f)==NULL) break;
char *tmp = some_function_return_char_pointer(x); //OR malloc(nbytes);
// do some stuff
free(tmp); // this is where I get the error, but only sometimes
}
I checked for obvious things, such as x being NULL, but it's not; it just crashes on random lines.
But my REAL question is - when do I need to use free()? Or, probably more correctly, when should I NOT use free? What if malloc is in a function, and I return the var that used malloc()? What about in a for or while loop? Does malloc-ing for an array of struct have the same rules as for a string/char pointer?
I gather from the errors I'm getting in gcc on program crash that I'm just not understanding malloc and free. I've spent my quality time with Google and I'm still hitting brick walls. Are there any good resources you've found? Everything I see says that whenever I use malloc I need to use free. But then I try that and my program crashes. So maybe it's different based on a variable's scope? Does C free the memory at the end of a loop when a variable is declared inside of it? At the end of a function?
So:
for(i=0;i<100;i++) char *x=malloc(n); // no need to use free(x)?
but:
char *x;
for(i=0;i<100;i++) {
x=malloc(n);
free(x); //must do this, since scope of x greater than loop?
}
Is that right?
Hopefully I'm making sense...
malloc() is C's dynamic allocator. You have to understand the difference between automatic (scoped) and dynamic (manual) variables.
Automatic variables live for the duration of their scope. They're the ones you declare without any decoration: int x;
Most variables in a C program should be automatic, since they are local to some piece of code (e.g. a function, or a loop), and they communicate via function calls and return values.
The only time you need dynamic allocation is when you have some data that needs to outlive any given scope. Such data must be allocated dynamically, and eventually freed when it is no longer necessary.
The prime usage example for this is your typical linked list. The list nodes cannot possibly be local to any scope if you are going to have generic "insert/erase/find" list manipulation functions. Thus, each node must be allocated dynamically, and the list manipulation functions must ensure that they free those nodes that are no longer part of the list.
In summary, variable allocation is fundamentally and primarily a question of scope. If possible keep everything automatic and you don't have to do anything. If necessary, use dynamic allocation and take care to deallocate manually whenever appropriate.
(Edit: As @Oli says, you may also want to use dynamic allocation in a strictly local context at times, because most platforms limit the size of automatic variables to a much smaller limit than the size of dynamic memory. Think "huge array". Exceeding the available space for automatic variables usually has a colourful name such as "pile overrun" or something similar.)
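The linked-list case mentioned above might be sketched like this. The `struct node` layout and the function names are assumptions for illustration. Each node is created in one function and destroyed in another, so no single scope could own it.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Pushes a value at the front; returns the new head, or NULL on failure
   (the old list is left untouched in that case). */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Removes the front node, frees it, and returns its value. */
int list_pop(struct node **head)
{
    assert(head != NULL && *head != NULL);
    struct node *first = *head;
    int value = first->value;
    *head = first->next;
    free(first);            /* the node left the list, so it is freed here */
    return value;
}

/* Frees every remaining node. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;   /* read before freeing */
        free(head);
        head = next;
    }
}
```

Every node allocated by `list_push` is freed exactly once, either by `list_pop` or by `list_free`.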
In general, every call to malloc must have one corresponding call to free.* This has nothing to do with scope (i.e. nothing to do with functions or loops).
* Exceptions to this rule include using functions like strdup, but the principle is the same.
Broadly speaking, every pointer that is ever returned by malloc() must eventually be passed to free(). The scope of the variable that you store the pointer in does not affect this, because even after the variable is no longer in scope, the memory that the pointer points to will still be allocated until you call free() on it.
Well, the lifetime of malloc'd memory runs from the call to malloc until the call to free, or otherwise until the process stops (that is when the OS cleans up after the process). If you never call free you get a memory leak. That can happen when the address you need to pass to free goes out of scope before you have actually freed it - that is like losing your car keys: the car is still there but you can't drive it. The error you are getting is most likely either because the function returns a pointer to memory that was not allocated using malloc, or because something wrote past the end of a malloc'd block and corrupted the allocator's bookkeeping, which is what glibc's "invalid next size" message usually indicates. Passing a null pointer to free is not the cause: free(NULL) is defined to do nothing.
You should free memory when you will no longer be accessing it. You should not free memory if you will be accessing it. This will give you a lot of pain.
If you don't want memory leak, you have to free the memory from malloc.
It can be very tricky. For example, if the // do some stuff part contains a continue, the free will be skipped and memory will leak. Because it is tricky, C++ has shared_ptr; and rumor has it that C programmers are paid more than C++ programmers.
Sometimes we don't care about a memory leak. If the memory holds something that is needed for the whole lifetime of the program, you can choose not to free it. Example: a string holding an environment variable.
PS: Valgrind is a tool to help detect memory bugs. Especially useful for memory leak.
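The continue pitfall can be avoided by freeing on every path out of the loop body. The sketch below is an illustration under assumed names (`count_non_empty`, `lines`); it is not the asker's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the non-empty strings among `count` inputs, copying each one to a
   heap buffer first. Every path out of the loop body frees the buffer. */
size_t count_non_empty(const char *const *lines, size_t count)
{
    assert(lines != NULL || count == 0);
    size_t non_empty = 0;
    for (size_t i = 0; i < count; i++) {
        size_t size = strlen(lines[i]) + 1;
        char *copy = malloc(size);
        if (copy == NULL)
            break;                 /* nothing to free: the allocation failed */
        memcpy(copy, lines[i], size);

        if (copy[0] == '\0') {
            free(copy);            /* free before the early continue */
            continue;
        }
        non_empty++;
        free(copy);                /* normal path */
    }
    return non_empty;
}
```

Running such a loop under Valgrind is a quick way to confirm that no path was missed.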
malloc(n) allocates n bytes from a memory region called the heap and returns a void* pointer to it (or a null pointer if the allocation fails). The memory is allocated at runtime. Once you have allocated memory dynamically, scope does not matter as long as you keep a pointer to it (that is, its address). For example:
int* allocate_an_integer_array(int n)
{
int* p = (int*) (malloc(sizeof(int)*n));
return p;
}
This function simply allocates enough heap memory for n integers and returns a pointer to the first one. The pointer can be used in the calling function as you want. The SCOPE does not matter as long as you still have the pointer.
free(p) returns the memory to the heap.
The only thing you need to remember is to free it: if you don't free it and lose the value of its address, there will be a memory leak, because as far as the OS is concerned you are still using the memory.
Also, after freeing, set the pointer to NULL so that you don't use it again, as the same memory may be allocated again at any time for a different purpose.
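One common way to make the "set it to NULL afterwards" habit automatic is a small macro. `FREE_AND_CLEAR` and `clear_demo` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical convenience macro: frees the block and resets the pointer
   variable so a stale address cannot be reused by accident. */
#define FREE_AND_CLEAR(pointer) \
    do { free(pointer); (pointer) = NULL; } while (0)

/* Returns 1 if the pointer ends up NULL, or -1 if allocation fails. */
int clear_demo(void)
{
    int *numbers = malloc(4 * sizeof *numbers);
    if (numbers == NULL)
        return -1;
    numbers[0] = 42;

    FREE_AND_CLEAR(numbers);
    assert(numbers == NULL);

    FREE_AND_CLEAR(numbers);   /* harmless: free(NULL) does nothing */
    return numbers == NULL;
}
```

This only protects the one variable that is cleared; other copies of the same address still dangle.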
So, all you need to do is to be careful...
Hope it helps!

Why exactly should I not call free() on variables not allocated by malloc()?

I read somewhere that it is disastrous to use free to get rid of an object not created by calling malloc, is this true? why?
That's undefined behavior - never try it.
Let's see what happens when you try to free() an automatic variable. The heap manager will have to deduce how to take ownership of the memory block. To do so it will either have to use some separate structure that lists all allocated blocks, which is very slow and rarely used, or hope that the necessary data is located near the beginning of the block.
The latter is used quite often, and here's how it is supposed to work. When you call malloc() the heap manager allocates a slightly bigger block, stores service data at the beginning and returns an offset pointer. Something like:
void* malloc( size_t size )
{
void* block = tryAlloc( size + sizeof( size_t) );
if( block == 0 ) {
return 0;
}
// the following is for illustration, more service data is usually written
*((size_t*)block) = size;
return (size_t*)block + 1;
}
Then free() will try to access that data by offsetting the passed pointer, but if the pointer is to an automatic variable, whatever data happens to be located there will be read as service data. Hence undefined behavior. Often free() also modifies the service data so the heap manager can take ownership of the block, so if the pointer passed is to an automatic variable, some unrelated memory will be read from and modified.
Implementations may vary but you should never make any specific assumptions. Only call free() on addresses returned by malloc() family functions.
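The header scheme can be made concrete with a toy wrapper around the real allocator. This is a sketch for illustration only: `toy_malloc`, `toy_size`, `toy_free` and the `toy_header` layout are invented here and do not describe any actual C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Service data stored just before the user's block. The union pads the
   header so the address handed out stays suitably aligned. */
typedef union {
    size_t size;
    max_align_t alignment_padding;
} toy_header;

void *toy_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(toy_header))
        return NULL;                          /* size + header would overflow */
    toy_header *block = malloc(sizeof(toy_header) + size);
    if (block == NULL)
        return NULL;
    block->size = size;                       /* write the service data */
    return block + 1;                         /* user memory starts after it */
}

/* Steps back over the header to recover the recorded size. */
size_t toy_size(const void *pointer)
{
    assert(pointer != NULL);
    return ((const toy_header *)pointer - 1)->size;
}

void toy_free(void *pointer)
{
    if (pointer == NULL)
        return;
    free((toy_header *)pointer - 1);          /* release header and data */
}
```

Calling `toy_size` or `toy_free` on the address of a local variable would step back into whatever happens to precede that variable in memory, which is the failure described above.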
By the standard, it's "undefined behavior" - i.e. "anything can happen". That's usually bad things, though.
In practice: freeing a pointer means modifying the heap. The C runtime virtually never validates whether the pointer passed comes from the heap - that would be too costly in either time or memory. Combine these two facts, and you get "free(non-malloced-ptr) will write something somewhere" - the result may be some of "your" data modified behind your back, an access violation, or trashed vital runtime structures, such as a return address on the stack.
Example: A disastrous scenario:
Your heap is implemented as a simple list of free blocks. malloc means removing a suitable block from the list, free means adding it to the list again. (a typical if trivial implementation)
You free() a pointer to a local variable on the stack. You are "lucky" because the modification goes into irrelevant stack space. However, part of the stack is now on your free list.
Because of the allocator design and your allocation patterns, malloc is unlikely to return this block. Later, in a completely unrelated part of the program, you actually do get this block as a malloc result; writing to it trashes some local variables up the stack, and on return some vital pointer contains garbage and your app crashes. Symptoms, repro and location are completely unrelated to the actual cause.
Debug that.
It is undefined behaviour. And logically, if behaviour is undefined, you cannot be sure what has happened, and if the program is still operating properly.
Some people have pointed out here that this is "undefined behavior". I'm going to go farther and say that on some implementations, this will either crash your program or cause data corruption. It has to do with how "malloc" and "free" are implemented.
One possible way to implement malloc/free is to put a small header before each allocated region. On a malloc'd region, that header would contain the size of the region. When the region is freed, that header is checked and the region is added to the appropriate freelist. If this happens to you, this is bad news. For example, if you free an object allocated on the stack, suddenly part of the stack is in the freelist. Then malloc might return that region in response to a future call, and you'll scribble data all over your stack. Another possibility is that you free a string constant. If that string constant is in read-only memory (it often is), this hypothetical implementation would cause a segfault and crash either after a later malloc or when free adds the object to its freelist.
This is a hypothetical implementation I am talking about, but you can use your imagination to see how it could go very, very wrong. Some implementations are very robust and are not vulnerable to this precise type of user error. Some implementations even allow you to set environment variables to diagnose these types of errors. Valgrind and other tools will also detect these errors.
Strictly speaking, this is not true. calloc() and realloc() are valid object sources for free(), too. ;)
Please have a look at what undefined behavior means. malloc() and free() on a conforming hosted C implementation are built to standards. The standards say the behavior of calling free() on a heap block that was not returned by malloc() (or something wrapping it, e.g. calloc()) is undefined.
This means, it can do whatever you want it to do, provided that you make the necessary modifications to free() on your own. You won't break the standard by making the behavior of free() on blocks not allocated by malloc() consistent and even possibly useful.
In fact, there could be platforms that (themselves) define this behavior. I don't know of any, but there could be some. There are several garbage collecting / logging malloc() implementations that might let it fail more gracefully while logging the event. But that's implementation-defined, not standards-defined, behavior.
Undefined simply means don't count on any kind of consistent behavior unless you implement it yourself without breaking any defined behavior. Finally, implementation defined does not always mean defined by the host system. Many programs link against (and ship) uclibc. In that case, the implementation is self contained, consistent and portable.
It would certainly be possible for an implementation of malloc/free to keep a list of the memory blocks that have been allocated and, if the user tries to free a block that isn't in this list, do nothing.
However, since the standard says that this isn't a requirement, most implementations will treat all pointers coming into free as valid.
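A list-keeping allocator of that kind could be sketched as a thin layer over malloc/free. Everything here (`tracked_malloc`, `tracked_free`, the fixed table of `MAX_TRACKED` entries) is hypothetical and far simpler than a real allocator would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_TRACKED 64   /* arbitrary table size for the sketch */

static void *tracked_blocks[MAX_TRACKED];   /* NULL marks an empty slot */

/* Allocates memory and records the address; returns NULL if the table is
   full or the allocation fails. */
void *tracked_malloc(size_t size)
{
    assert(size > 0);   /* malloc(0) may return NULL, which looks like an empty slot */
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked_blocks[i] == NULL) {
            tracked_blocks[i] = malloc(size);
            return tracked_blocks[i];
        }
    }
    return NULL;
}

/* Frees `pointer` only if tracked_malloc() handed it out; otherwise does
   nothing and reports false. */
bool tracked_free(void *pointer)
{
    if (pointer == NULL)
        return false;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked_blocks[i] == pointer) {
            free(pointer);
            tracked_blocks[i] = NULL;
            return true;
        }
    }
    return false;       /* unknown address: ignored instead of corrupting the heap */
}
```

The linear search on every call shows where the cost comes from, which is one reason general-purpose allocators usually skip this check.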

Is the memory of a (character) array freed by going out of scope?

Very much related to my previous question, but I found this to be a separate issue and am unable to find a solid answer to this.
Is the memory used by a (character) array freed by going out of scope?
An example:
void method1()
{
char str[10];
// manipulate str
}
So after the method1 call, is the memory used by str (10 bytes) freed, or do I need to explicitly call free on this as well?
My intuition tells me this is just a simple array of primitive types, so it's automatically freed. I'm in doubt because in C you can't assume anything to be automatically freed.
In this case no you do not need to call free. The value "str" is a stack based value which will be cleaned up when that particular method / scope is exited.
You only need to call free on values which are explicitly created via malloc.
It is automatically freed. If you didn't malloc it, you don't need to free it. But this has nothing to do with it being a "simple array of primitive types" - it would be freed if it was an array of structures. It is freed because it is a local variable.
Given that you are asking these very basic questions, I have to ask which C textbook you are using. Personally, I don't believe that you can usefully learn C without reading Kernighan & Ritchie's The C Programming Language, which explains all this stuff very clearly.
Yes, it is "freed." (Not free()'ed, though.)
Since str is an automatic variable, it will only last as long as its scope, which is until the end of the function block.
Note that you only free() what you malloc().
Yes, the memory is freed automatically once method1 returns. The memory for str is allocated on the stack and is freed once the method's stack frame is cleaned up. Compare this to memory allocated on the heap (via malloc) which you must explicitly free.
No, local variables of this sort are allocated on the stack, so when you return from the procedure the memory is available for the next function call, which will use the memory for its stack frame.
If you use malloc() the space is allocated on the heap, which must be explicitly freed.
I think it's freed not because it's primitives but that it's a local variable and that will be allocated on the stack not the heap. If you don't malloc it then you can't free it as far as I remember.
I'm a bit rusty in C/C++ lately, but I think you're right. As long as you didn't dynamically allocate that memory, you should be fine.
Yes, it is "freed" when it goes out of scope.
No, you don't have to explicitly free it.
The char array is allocated on the stack, so when you return from the function, that stack space is re-usable. You do not need to explicitly free the memory.
Good rule of thumb: if you malloc, you must free.
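The two situations can be contrasted in a short sketch. The function names `length_of_greeting` and `make_greeting` are made up for the example. The first function uses a local array that disappears on return; the second copies the text to the heap so it can outlive the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The local array is released automatically when the function returns,
   so only its length (a plain value) is handed back. */
size_t length_of_greeting(void)
{
    char str[10];                       /* automatic: no free() needed */
    static_assert(sizeof "hello" <= sizeof str, "greeting must fit in str");
    strcpy(str, "hello");
    return strlen(str);
}

/* To let the text outlive the function, copy it to the heap; the caller
   then owns the block and must free() it. */
char *make_greeting(void)
{
    char str[10];
    static_assert(sizeof "hello" <= sizeof str, "greeting must fit in str");
    strcpy(str, "hello");
    char *copy = malloc(strlen(str) + 1);
    if (copy != NULL)
        strcpy(copy, str);
    return copy;                        /* returning str itself would dangle */
}
```

The caller of `make_greeting` is the one who must call free(), which matches the rule of thumb: the malloc'd block is freed explicitly, the local array is not.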
