Growing an array on the stack - c

This is my problem in essence. In the life of a function, I generate some integers, then use the array of integers in an algorithm that is also part of the same function. The array of integers will only be used within the function, so naturally it makes sense to store the array on the stack.
The problem is I don't know the size of the array until I'm finished generating all the integers.
I know how to allocate a fixed size and variable sized array on the stack. However, I do not know how to grow an array on the stack, and that seems like the best way to solve my problem. I'm fairly certain this is possible to do in assembly, you just increment stack pointer and store an int for each int generated, so the array of ints would be at the end of the stack frame. Is this possible to do in C though?

I would disagree with your assertion that "so naturally it makes sense to store the array on the stack". Stack memory is really designed for when you know the size at compile time. I would argue that dynamic memory is the way to go here

C doesn't define what the "stack" is. It only has static, automatic and dynamic allocations. Static and automatic allocations are handled by the compiler, and only dynamic allocation puts the controls in your hands. Thus, if you want to manually deallocate an object and allocate a bigger one, you must use dynamic allocation.

Don't use dynamic arrays on the stack (compare Why is the use of alloca() not considered good practice?), better allocate memory from the heap using malloc and resize it using realloc.

Never Use alloca()
IMHO this point hasn't been made well enough in the standard references.
One rule of thumb is:
If you're not prepared to statically allocate the maximum possible size as a
fixed length C array then you shouldn't do it dynamically with alloca() either.
Why? The reason you're trying to avoid malloc() is performance.
alloca() will be slower and won't work in any circumstance static allocation will fail. It's generally less likely to succeed than malloc() too.
One thing is sure. Statically allocating the maximum will outdo both malloc() and alloca().
Static allocation is typically damn near a no-op. Most systems will advance the stack pointer for the function call anyway. There's no appreciable difference for how far.
So what you're telling me is you care about performance but want to hold back on a no-op solution? Think about why you feel like that.
The overwhelming likelihood is you're concerned about the size allocated.
But as explained it's free and it gets taken back. What's the worry?
If the worry is "I don't have a maximum or don't know if it will overflow the stack" then you shouldn't be using alloca() because you don't have a maximum and know it if it will overflow the stack.
If you do have a maximum and know it isn't going to blow the stack then statically allocate the maximum and go home. It's a free lunch - remember?
That makes alloca() either wrong or sub-optimal.
Every time you use alloca() you're either wasting your time or coding in one of the difficult-to-test-for arbitrary scaling ceilings that sleep quietly until things really matter then f**k up someone's day.
Don't.
PS: If you need a big 'workspace' but the malloc()/free() overhead is a bottle-neck for example called repeatedly in a big loop, then consider allocating the workspace outside the loop and carrying it from iteration to iteration. You may need to reallocate the workspace if you find a 'big' case but it's often possible to divide the number of allocations by 100 or even 1000.
Footnote:
There must be some theoretical algorithm where a() calls b() and if a() requires a massive environment b() doesn't and vice versa.
In that event there could be some kind of freaky play-off where the stack overflow is prevented by alloca(). I have never heard of or seen such an algorithm. Plausible specimens will be gratefully received!

The innards of the C compiler requires stack sizes to be fixed or calculable at compile time. It's been a while since I used C (now a C++ convert) and I don't know exactly why this is. http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html provides a useful comparison of the pros and cons of the two approaches.
I appreciate your assembly code analogy but C is largely managed, if that makes any sense, by the Operating System, which imposes/provides the task, process and stack notations.

In order to address your issue dynamic memory allocation looks ideal.
int *a = malloc(sizeof(int));
and dereference it to store the value .
Each time a new integer needs to be added to the existing list of integers
int *temp = realloc(a,sizeof(int) * (n+1)); /* n = number of new elements */
if(temp != NULL)
a = temp;
Once done using this memory free() it.

Is there an upper limit on the size? If you can impose one, so the size is at most a few tens of KiB, then yes alloca is appropriate (especially if this is a leaf function, not one calling other functions that might also allocate non-tiny arrays this way).
Or since this is C, not C++, use a variable-length array like int foo[n];.
But always sanity-check your size, otherwise it's a stack-clash vulnerability waiting to happen. (Where a huge allocation moves the stack pointer so far that it ends up in the middle of another memory region, where other things get overwritten by local variables and return addresses.) Some distros enable hardening options that make GCC generate code to touch every page in between when moving the stack pointer by more than a page.
It's usually not worth it to check the size and use alloc for small, malloc for large, since you also need another check at the end of your function to call free if the size was large. It might give a speedup, but this makes your code more complicated and more likely to get broken during maintenance if future editors don't notice that the memory is only sometimes malloced. So only consider a dual strategy if profiling shows this is actually important, and you care about performance more than simplicity / human-readability / maintainability for this particular project.
A size check for an upper limit (else log an error and exit) is more reasonable, but then you have to choose an upper limit beyond which your program will intentionally bail out, even though there's plenty of RAM you're choosing not to use. If there is a reasonable limit where you can be pretty sure something's gone wrong, like the input being intentionally malicious from an exploit, then great, if(size>limit) error(); int arr[size];.
If neither of those conditions can be satisfied, your use case is not appropriate for C automatic storage (stack memory) because it might need to be large. Just use dynamic allocation autom don't want malloc.
Windows x86/x64 the default user-space stack size is 1MiB, I think. On x86-64 Linux it's 8MiB. (ulimit -s). Thread stacks are allocated with the same size. But remember, your function will be part of a chain of function calls (so if every function used a large fraction of the total size, you'd have a problem if they called each other). And any stack memory you dirty won't get handed back to the OS even after the function returns, unlike malloc/free where a large allocation can give back the memory instead of leaving it on the free list.
Kernel thread stack are much smaller, like 16 KiB total for x86-64 Linux, so you never want VLAs or alloca in kernel code, except maybe for a tiny max size, like up to 16 or maybe 32 bytes, not large compared to the size of a pointer that would be needed to store a kmalloc return value.

Related

C allocation and memory overhead

Apologies if this is a stupid question, but it's been kinda bothering me for a long time.
I'd like to know some details on how the memory manager knows what memory is in use.
Imagine a one-chip microcomputer with 1024B of RAM - not much to spare.
Now you allocate 100 ints - each int is 4 bytes, each pointer 4 bytes too (yea, 8bit one-chip will most likely have smaller pointers, but w/e).
So you've just used 800B of ram for 100 ints? But it's worse - the allocation system must somehow take note on where the memory is malloc'd and where it's free - 200 extra bytes or something? Or some bit marks?
If this is true, why is C favoured over assembler so often?
Is this really how it works? So super inefficient?
(Or am I having a totally incorrect idea about it?)
It may surprise younger developers to learn that greying old ones like myself used to write in C on systems with 1 or 2k of RAM.
In systems this size, dynamic memory allocation would have been a luxury that we could not afford. It's not just the pointer overhead of managing the free store, but also the effects of free store fragmentation making memory allocation inefficient and very probably leading to a fatal out-of-memory condition (virtual memory was not an option).
So we used to use static memory allocation (i.e. global variables), kept a very tight control on the depth of function all nesting, and an even tighter control over nested interrupt handling.
When writing on these systems, I didn't even link the standard library. I wrote my own C startup routines and provided custom minimal I/O routines.
One program I wrote in a 2k ram system used the lower part of RAM as the data area and the upper part as the stack. In the final cut, I proved that the maximal use of stack reached so far down in memory that it was 1 byte away from the last variable in the data area.
Ah, the good old days...
EDIT:
To answer your question specifically, the original K&R free store manager used to add a header block to the beginning of each block of memory allocated via malloc.
The header block looked something like this:
union header {
struct {
union header *ptr;
unsigned size;
} s;
};
Where ptr is the address of the next header block and size is the size of the memory allocated (in blocks). The malloc function would actually return the address computed by &header + sizeof(header). The free function would subtract the size of the header from the pointer you provided in order to re-link the block back into the free list.
There are several approaches how you can do that:
as you write, malloc() one memory block for every int you have. Completely inefficient, thus I strike it out.
malloc() an array of 100 ints. That needs in total 100*sizeof(int) + 1* sizeof(int*) + whatever malloc() internally needs. Much better.
statically allocate the array. Here you just need 100*sizeof(int).
allocate the array on the stack. That needs the same, but only for the current function call.
Which of these you need depends on how long you need the memory and other criteria.
If you have that few RAM, it might even be questionable if it is useful to use malloc() at all. It can be an option if several code blocks need a lot of RAM, but not at the same time.
On how the memory addresses are tracked, that as well depends:
for malloc(), you have to put the pointer in a place where you don't lose it.
for an array on the stack, it is expressed relatively to the current frame pointer. The code sets up the frame and thus knows about the offset, so it is normally not needed to store it anywhere.
for an array in the data segment, the compiler and linker know about the address and statically put the address where it is needed.
If this is true, why is C favoured over assembler so often?
You're simplifying the problem too much. C or assembler - doesn't matter, you still need to manage the memory chunks. The main issue is fragmentation, not the actual management overhead. In a system like the one you described, you would probably just allocate the memory and never ever release it, thus no need to check what's free - whatever is below the watermark is free, and that's it.
Is this really how it works? So super inefficient?
There are many algorithms around this problem, but if you're simplifying - yes, it basically is. In reality its a much more complicated problem - and there are much more complicated systems revolving around servicing memory, dealing with fragmentation, garbage collection (on a OS level), etc etc.

Arrays of which size require malloc (or global assignment)?

While taking my first steps with C, I quickly noticed that int array[big number] causes my programs to crash when called inside a function. Not quite as quickly, I discovered that I can prevent this from happening by defining the array with global scope (outside the functions) or using malloc.
My question is:
Starting at which size is it necessary to use one of the above methods to make sure my programs won't crash?
I mean, is it safe to use just, e.g., int i; for counters and int chars[256]; for small arrays or should I just use malloc for all local variables?
You should understand what the difference is between int chars[256] in a function and using malloc().
In short, the former places the entire array on the stack. The latter allocates the memory you requested from the heap. Generally speaking, the heap is much larger than the stack, but the size of each can be adjusted.
Another key difference is that a variable allocated on the stack will technically be gone after you return from the method. (Oh, your program may function as though it's not gone for a bit if you continue to access that array, but ho ho ho danger lurks.) A hunk of memory allocated with malloc will remain allocated until you explicitly free it or the program exits.
You should use malloc for your dynamic memory allocation. For statically sized arrays (or any other object) within functions, if the memory required is to big you will quickly get a segmentation fault. I don't think a 'safe limit' can be defined, its probably implementation specific and other factors come in play too, like the current stack and objects within it created by callers to the current function. I would be tempted to say that anything below the page size (usually 4kb) should be safe as long as there is no recursion involved, but I do not think there are such guarantees.
It depends. If you have some guarantee that a line will never be longer than 100 ... 1000 characters you can get away with a fixed size buffer. If you don't: you don't. There is a difference between reading in a neat x KB config file and a x GB XML file (with no CR/LF). It depends.
The next choice is: do you want your program to die gracefully? It is only a design choice.

Is it bad practice to declare an array mid-function

in an effort to only ask what I'm really looking for here... I'm really only concerned if it's considered bad practice or not to declare an array like below where the size could vary. If it is... I would generally malloc() instead.
void MyFunction()
{
int size;
//do a bunch of stuff
size = 10; //but could have been something else
int array[size];
//do more stuff...
}
Generally yes, this is bad practice, although new standards allow you to use this syntax. In my opinion you must allocate (on the heap) the memory you want to use and release it once you're done with it. Since there is no portable way of checking if the stack is enough to hold that array you should use some methods that can really be checked - like malloc/calloc & free. In the embedded world stack size can be an issue.
If you are worried about fragmentation you can create your own memory allocator, but this is a totally different story.
That depends. The first clearly isn't what I'd call "proper", and the second is only under rather limited circumstances.
In the first, you shouldn't cast the return from malloc in C -- doing so can cover up the bug of accidentally omitting inclusion of the correct header (<stdlib.h>).
In the second, you're restricting the code to C99 or a gcc extension. As long as you're aware of that, and it works for your purposes, it's all right, but hardly what I'd call an ideal of portability.
As far as what you're really asking: with the minor bug mentioned above fixed, the first is portable, but may be slower than you'd like. If the second is portable enough for your purposes, it'll normally be faster.
For your question, I think each has its advantages and disadvantages.
Dynamic Allocation:
Slow, but you can detect when there is no memory to be given to your programmer by checking the pointer.
Stack Allocation:
Only in C99 and it is blazingly fast but in case of stackoverflow you are out of luck.
In summary, when you need a small array, reserve it on the stack. Otherwise, use dynamic memory wisely.
The argument against VLAs runs that because of the absolute badness of overflowing the stack, by the time you've done enough thinking/checking to make them safe, you've done enough thinking/checking to use a fixed-size array:
1) In order to safely use VLAs, you must know that there is enough stack available.
2) In the vast majority of cases, the way that you know there's enough stack is that you know an upper bound on the size required, and you know (or at least are willing to guess or require) a lower bound on the stack available, and the one is smaller than the other. So just use a fixed-size array.
3) In the vast majority of the few cases that aren't that simple, you're using multiple VLAs (perhaps one in each call to a recursive function), and you know an upper bound on their total size, which is less than a lower bound on available stack. So you could use a fixed-size array and divide it into pieces as required.
4) If you ever encounter one of the remaining cases, in a situation where the performance of malloc is unacceptable, do let me know...
It may be more convenient, from the POV of the source code, to use VLAs. For instance you can use sizeof (in the defining scope) instead of maintaining the size in a variable, and that business with dividing an array into chunks might require passing an extra parameter around. So there's some small gain in convenience, sometimes.
It's also easier to miss that you're using a humongous amount of stack, yielding undefined behavior, if instead of a rather scary-looking int buf[1920*1024] or int buf[MAX_IMG_SIZE] you have an int buf[img->size]. That works fine right up to the first time you actually handle a big image. That's broadly an issue of proper testing, but if you miss some possible difficult inputs, then it won't be the first or last test suite to do so. I find that a fixed-size array reminds me either to put in fixed-size checks of the input, or to replace it with a dynamic allocation and stop worrying whether it fits on the stack or not. There is no valid option to put it on the stack and not worry whether it fits...
two points from a UNIX/C perspective -
malloc is only slow when you force it to call brk(). Meaning for reasonable arrays it is the same as allocating stack space for a variable. By the way when you use method #2 (via alloca and in the code libc I have seen) also invokes brk() for huge objects. So it is a wash. Note: with #2 & #1 you still have to invoke directly or indirectly a memset-type call to zero the bytes in the array. This is just a side note to the real issue (IMO):
The real issue is memory leaks. alloca cleans up after itself when the function retruns so #2 is less likely to cause a problem. With malloc/calloc you have to call free() or start a leak.

Why would you ever want to allocate memory on the heap rather than the stack? [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
When is it best to use a Stack instead of a Heap and vice versa?
I've read a few of the other questions regarding the heap vs stack, but they seem to focus more on what the heap/stack do rather than why you would use them.
It seems to me that stack allocation would almost always be preferred since it is quicker (just moving the stack pointer vs looking for free space in the heap), and you don't have to manually free allocated memory when you're done using it. The only reason I can see for using heap allocation is if you wanted to create an object in a function and then use it outside that functions scope, since stack allocated memory is automatically unallocated after returning from the function.
Are there other reasons for using heap allocation instead of stack allocation that I am not aware of?
There are a few reasons:
The main one is that with heap allocation, you have the most flexible control over the object's lifetime (from malloc/calloc to free);
Stack space is typically a more limited resource than heap space, at least in default configurations;
A failure to allocate heap space can be handled gracefully, whereas running out of stack space is often unrecoverable.
Without the flexible object lifetime, useful data structures such as binary trees and linked lists would be virtually impossible to write.
You want an allocation to live beyond a function invocation
You want to conserve stack space (which is typically limited to a few MBs)
You're working with re-locatable memory (Win16, databases, etc.), or want to recover from allocation failures.
Variable length anything. You can fake around this, but your code will be really nasty.
The big one is #1. As soon as you get into any sort of concurrency or IPC #1 is everywhere. Even most non-trivial single threaded applications are tricky to devise without some heap allocation. That'd practically be faking a functional language in C/C++.
So I want to make a string. I can make it on the heap or on the stack. Let's try both:
char *heap = malloc(14);
if(heap == NULL)
{
// bad things happened!
}
strcat(heap, "Hello, world!");
And for the stack:
char stack[] = "Hello, world!";
So now I have these two strings in their respective places. Later, I want to make them longer:
char *tmp = realloc(heap, 20);
if(tmp == NULL)
{
// bad things happened!
}
heap = tmp;
memmove(heap + 13, heap + 7);
memcpy(heap + 7, "cruel ", 6);
And for the stack:
// umm... What?
This is only one benefit, and others have mentioned other benefits, but this is a rather nice one. With the heap, we can at least try to make our allocated space larger. With the stack, we're stuck with what we have. If we want room to grow, we have to declare it all up front, and we all know how it stinks to see this:
char username[MAX_BUF_SIZE];
The most obvious rationale for using the heap is when you call a function and need something of unknown length returned. Sometimes the caller may pass a memory block and size to the function, but at other times this is just impractical, especially if the returned stuff is complex (e.g. a collection of different objects with pointers flying around, etc.).
Size limits are a huge dealbreaker in a lot of cases. The stack is usually measured in the low megabytes or even kilobytes (that's for everything on the stack), whereas all modern PCs allow you a few gigabytes of heap. So if you're going to be using a large amount of data, you absolutely need the heap.
just to add
you can use alloca to allocate memory on the stack, but again memory on the stack is limited and also the space exists only during the function execution only.
that does not mean everything should be allocated on the heap. like all design decisions this is also somewhat difficult, a "judicious" combination of both should be used.
Besides manual control of object's lifetime (which you mentioned), the other reasons for using heap would include:
Run-time control over object's size (both initial size and it's "later" size, during the program's execution).
For example, you can allocate an array of certain size, which is only known at run time.
With the introduction of VLA (Variable Length Arrays) in C99, it became possible to allocate arrays of fixed run-time size without using heap (this is basically a language-level implementation of 'alloca' functionality). However, in other cases you'd still need heap even in C99.
Run-time control over the total number of objects.
For example, when you build a binary tree stucture, you can't meaningfully allocate the nodes of the tree on the stack in advance. You have to use heap to allocated them "on demand".
Low-level technical considerations, as limited stack space (others already mentioned that).
When you need a large, say, I/O buffer, even for a short time (inside a single function) it makes more sense to request it from the heap instead of declaring a large automatic array.
Stack variables (often called 'automatic variables') is best used for things you want to always be the same, and always be small.
int x;
char foo[32];
Are all stack allocations, These are fixed at compile time too.
The best reason for heap allocation is that you cant always know how much space you need. You often only know this once the program is running. You might have an idea of limits but you would only want to use the exact amount of space required.
If you had to read in a file that could be anything from 1k to 50mb, you would not do this:-
int readdata ( FILE * f ) {
char inputdata[50*1024*1025];
...
return x;
}
That would try to allocate 50mb on the stack, which would usually fail as the stack is usually limited to 256k anyway.
The stack and heap share the same "open" memory space and will have to eventually come to a point where they meet, if you use the entire segment of memory. Keeping the balance between the space that each of them use will have amortized cost later for allocation and de-allocation of memory a smaller asymptotic value.

What is causing a stack overflow?

You may think that this is a coincidence that the topic of my question is similar to the name of the forum but I actually got here by googling the term "stack overflow".
I use the OPNET network simulator in which I program using C. I think I am having a problem with big array sizes. It seems that I am hitting some sort of memory allocation limitation. It may have to do with OPNET, Windows, my laptop memory or most likely C language. The problem is caused when I try to use nested arrays with a total number of elements coming to several thousand integers. I think I am exceeding an overall memory allocation limit and I am wondering if there is a way to increase this cap.
Here's the exact problem description:
I basically have a routing table. Let's call it routing_tbl[n], meaning I am supporting 30 nodes (routers). Now, for each node in this table, I keep info. about many (hundreds) available paths, in an array called paths[p]. Again, for each path in this array, I keep the list of nodes that belong to it in an array called hops[h]. So, I am using at least nph integers worth of memory but this table contains other information as well. In the same function, I am also using another nested array that consumes almost 40,000 integers as well.
As soon as I run my simulation, it quits complaining about stack overflow. It works when I reduce the total size of the routing table.
What do you think causes the problem and how can it be solved?
Much appreciated
Ali
It may help if you post some code. Edit the question to include the problem function and the error.
Meanwhile, here's a very generic answer:
The two principal causes of a stack overflow are 1) a recursive function, or 2) the allocation of a large number of local variables.
Recursion
if your function calls itself, like this:
int recurse(int number) {
return (recurse(number));
}
Since local variables and function arguments are stored on the stack, then it will in fill the stack and cause a stack overflow.
Large local variables
If you try to allocate a large array of local variables then you can overflow the stack in one easy go. A function like this may cause the issue:
void hugeStack (void) {
unsigned long long reallyBig[100000000][1000000000];
...
}
There is quite a detailed answer to this similar question.
Somehow you are using a lot of stack. Possible causes include that you're creating the routing table on the stack, you're passing it on the stack, or else you're generating lots of calls (eg by recursively processing the whole thing).
In the first two cases you should create it on the heap and pass around a pointer to it. In the third case you'll need to rewrite your algorithm in an iterative form.
Stack overflows can happen in C when the number of embedded recursive calls is too high. Perhaps you are calling a function from itself too many times?
This error may also be due to allocating too much memory in static declarations. You can switch to dynamic allocations through malloc() to fix this type of problem.
Is there a reason why you cannot use the debugger on this program?
It depends on where you have declared the variable.
A local variable (i.e. one declared on the stack is limited by the maximum frame size) This is a limit of the compiler you are using (and can usually be adjusted with compiler flags).
A dynamically allocated object (i.e. one that is on the heap) is limited by the amount of available memory. This is a property of the OS (and can technically by larger the physical memory if you have a smart OS).
Many operating systems dynamically expand the stack as you use more of it. When you start writing to a memory address that's just beyond the stack, the OS assumes your stack has just grown a bit more and allocates it an extra page (usually 4096Kib on x86 - exactly 1024 ints).
The problem is, on the x86 (and some other architectures) the stack grows downwards but C arrays grow upwards. This means if you access the start of a large array, you'll be accessing memory that's more than a page away from the edge of the stack.
If you initialise your array to 0 starting from the end of the array (that's right, make a for loop to do it), the errors might go away. If they do, this is indeed the problem.
You might be able to find some OS API functions to force stack allocation, or compiler pragmas/flags. I'm not sure about how this can be done portably, except of course for using malloc() and free()!
You are unlikely to run into a stack overflow with unthreaded compiled C unless you do something particularly egregious like have runaway recursion or a cosmic memory leak. However, your simulator probably has a threading package which will impose stack size limits. When you start a new thread it will allocate a chunk of memory for the stack for that thread. Likely, there is a parameter you can set somewhere that establishes the the default stack size, or there may be a way to grow the stack dynamically. For example, pthreads has a function pthread_attr_setstacksize() which you call prior to starting a new thread to set its size. Your simulator may or may not be using pthreads. Consult your simulator reference documentation.

Resources