Checking if stack allocation fails? - c

Is there a way to recover from a failed static allocation, or does the program just fail with a Segmentation or Bus Fault when run?
This post was inspired by how C99 allows crazy stuff like char text[n];
EDIT: Thanks. I now understand that the part in bold is not a static alloc. So just to check: if something like char text[1234]; fails, would the possible recovery strategies be the same?

char text[n] allocates a variable-length array on the stack. It simply involves adjusting the stack pointer by n bytes.
There is not much a userspace process can do if a stack overflow occurs - it's up to the operating system either to send a signal to the process and terminate it, or to resize the stack.

You can probably catch the signal(s), but there's not much else you can do. Of course, checking n before using it to make sure it has a sensible value would solve this instantly.
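For instance, a minimal sketch of that "check n first" approach (the 4096-byte cap and the process() helper are illustrative choices, not anything from the original post):

#include <stdio.h>
#include <string.h>

#define MAX_TEXT_LEN 4096   /* arbitrary cap, chosen only for illustration */

int process(size_t n, const char *src)
{
    if (n == 0 || n > MAX_TEXT_LEN) {
        fprintf(stderr, "refusing a %zu-byte stack allocation\n", n);
        return -1;                  /* report an error instead of overflowing */
    }
    char text[n];                   /* C99 VLA, now bounded */
    strncpy(text, src, n - 1);
    text[n - 1] = '\0';
    printf("%s\n", text);
    return 0;
}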

Never check for an error condition you don't know how to handle.
Seriously, what are you planning on doing? There is only a small subset of functions you are allowed to call from a signal handler (see man 7 signal), and printf and longjmp (longjmp is the only way I can think of to recover from such a problem) are not among them. If you are going to the trouble of re-exec'ing the process, you might as well have a nanny process do that job and avoid the mess.
Note that according to man alloca you don't actually get told that the "allocation" failed; you just get a SIGSEGV when you try to access the bad memory, and of course that might not happen inside the text[] array itself, or perhaps not even in the function that allocates text[].
While the above two paragraphs are based on Linux, the overarching theory is true for all platforms.
Use malloc and have clean handling. Be sane.
[EDIT]
Actually there is one way to try to do this, and that is by computing the start of the stack (by recording the address of a local variable in main) and the stack limit (hoping the OS doesn't run out of pages). Then, before you do the large stack allocation, you can compute how close you are to the end. Give yourself generous wiggle-room and fail before you allocate.
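A hedged sketch of that idea follows; the 8 MiB stack size and 64 KiB margin are assumptions (8 MiB matches a common Linux ulimit -s default), and the pointer subtraction is a practical trick rather than strictly portable C:

#include <stdio.h>
#include <stdlib.h>

static char *stack_base;    /* address of a local in main(), recorded at startup */

#define ASSUMED_STACK_SIZE (8u * 1024 * 1024)   /* guessed, not reported by the OS */
#define WIGGLE_ROOM        (64u * 1024)         /* generous safety margin */

static int stack_can_fit(size_t want)
{
    char probe;     /* its address approximates the current stack pointer */
    size_t used = (size_t)(stack_base - &probe);    /* assumes a downward-growing stack */
    return used + want + WIGGLE_ROOM < ASSUMED_STACK_SIZE;
}

void work(size_t n)
{
    if (!stack_can_fit(n)) {
        fprintf(stderr, "refusing stack allocation of %zu bytes\n", n);
        return;
    }
    char text[n];
    /* ... use text ... */
    (void)text;
}

int main(void)
{
    char base;
    stack_base = &base;
    work(1u << 20);     /* try to place 1 MiB on the stack */
    return 0;
}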

This is a stack allocation rather than a static one. The failure mode is stack overflow. The most rational policy for stack overflow is to regard it as terminal.
Design your code so that it won't overflow the stack rather than trying to make it resilient to stack overflow.
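As a hedged illustration of that design advice, a large or unbounded buffer can simply live on the heap, where failure is reportable (handle_text() is an invented name for this sketch):

#include <stdio.h>
#include <stdlib.h>

int handle_text(size_t n, const char *src)
{
    char *text = malloc(n);
    if (text == NULL) {
        fprintf(stderr, "out of memory for %zu bytes\n", n);
        return -1;      /* a recoverable error, unlike a stack overflow */
    }
    snprintf(text, n, "%s", src);
    /* ... work with text ... */
    free(text);
    return 0;
}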

Related

Does linux provide a guaranteed inaccessible memory area below the lower stack end?

Does Linux provide an inaccessible memory area below the lower stack end that has a guaranteed minimum size? And if such a guaranteed minimum size exists, what is it?
Or in other words, when should I start to worry about alloca() or so giving me pointers into valid, non-stack memory?
As the alloca man page says:
There is no error indication if the stack frame cannot be extended. (However, after a failed allocation, the program is likely to receive a SIGSEGV signal if it attempts to access the unallocated space.)
So there is no indication at all and it also says:
If the allocation causes stack overflow, program behavior is undefined.
The stack overflow problem is a general issue with recursion and not really particular to alloca or, say, variable-length arrays. Typically you either need to find a way to limit the depth of the recursion, refactor to an iterative solution, or use your own dynamic stack (which probably does not apply to this case).
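A minimal sketch of an explicit depth limit, assuming a simple linked-list walk (the node type and the MAX_DEPTH value are invented for illustration):

#include <stddef.h>

#define MAX_DEPTH 1000      /* illustrative cap; tune to your real stack budget */

struct node { struct node *next; /* ... payload ... */ };

/* Returns 0 on success, -1 if the recursion would go too deep. */
int walk(const struct node *n, int depth)
{
    if (depth > MAX_DEPTH)
        return -1;          /* bail out instead of overflowing the stack */
    if (n == NULL)
        return 0;
    /* ... process n ... */
    return walk(n->next, depth + 1);
}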
Update
As the OP discovered, Linux does provide an after-the-fact indication of stack overflow: a guard page below the stack generates a SIGBUS signal when touched, which addresses the first part of the question.
Thanks to @ElliottFrisch for making me google this with the proper name... whoops.
Looks like the answer is "in newer kernels: one page, in older kernels: no such protection".
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=320b2b8de12698082609ebbc1a17165727f4c893

Strcpy a bigger string to a smaller array of char

Why when I do this:
char teststrcpy[5];
strcpy(teststrcpy,"thisisahugestring");
I get this message at run time:
Abort trap: 6
Shouldn't it just overwrite whatever is to the right of teststrcpy in memory? If not, what does Abort trap mean?
I'm using the GCC compiler under Mac OS X.
As a note, and in answer to some comments, I am doing this just to play around with C; I'm not going to do this in production. Don't you worry folkz! :)
Thanks
I don't own one, but I've read that Mac OS treats overflows differently; it won't allow you to overwrite memory in certain instances, strcpy() being one of them.
On a Linux machine this code successfully overwrites the adjacent stack memory, but it is prevented on Mac OS (Abort trap) due to a stack canary.
You might be able to get around that with the gcc option -fno-stack-protector
Ok, since you're seeing an abort from __strcpy_chk, that would mean it's specifically checking strcpy (and probably friends). So in theory you could do the following*:
char teststrcpy[5];
gets(teststrcpy);
Then enter your really long string and it should misbehave as badly as you wish.
*I am only advising gets in this specific instance, in an attempt to get around the OS's protection mechanisms that are in place. Under NO other circumstances would I suggest anyone use it. gets is not safe.
Shouldn't it just overwrite what is in the right of the memory of teststrcpy?
Not necessarily, it's undefined behaviour to write outside the allocated memory. In your case, something detected the out-of-bounds write and aborted the programme.
In C there is nobody who tells you that the "buffer is too small"; if you insist on copying too many characters into a buffer that is too small, you go into undefined-behaviour territory.
If you would LIKE to overwrite what's after the 5th char of teststrcpy, you are a scary man. You can copy a string of size 4 into teststrcpy (the fifth char SHOULD be reserved for the terminating null).
Most likely your compiler is using a canary for buffer overflow protection and, thus, raising this exception when there is an overflow, preventing you from writing outside the buffer.
See http://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
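If the aim is simply to copy into the small buffer without tripping those checks, a bounded copy sidesteps the problem entirely; a minimal sketch (truncation to "this" is the expected outcome here):

#include <stdio.h>

int main(void)
{
    char teststrcpy[5];
    /* snprintf never writes past the buffer and always NUL-terminates,
       so "thisisahugestring" is silently truncated. */
    snprintf(teststrcpy, sizeof teststrcpy, "%s", "thisisahugestring");
    printf("%s\n", teststrcpy);     /* prints "this" */
    return 0;
}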

Why some people don't check for NULL after calling malloc?

Some time ago I downloaded some source code from the Internet. There were several malloc calls, and after them there was no check for NULL. As far as I know you need to check for NULL after calling malloc.
Is there a good reason for somebody not to check for NULL after calling malloc? Am I missing something?
As Jens Gustedt mentioned in a comment, by the time malloc() returns an error your program is likely to be in a heap of trouble already. Does it make sense to put in a bunch of error handling code to handle the situation, when the program is likely not going to be able to do much of anything anyway? For many programs the answer might be 'no', for others it might be very important to do something appropriate.
You can try allocating your memory through a simple 'malloc-or-die' wrapper function that guarantees that the allocation succeeds or the program will terminate:
void* m_malloc(size_t size)
{
    void* p;

    // make sure a size request of `0` doesn't trigger
    // an error situation needlessly
    if (size == 0)
        size = 1;
    p = malloc(size);
    if (!p) {
        // attempt to log the error or whatever
        abort();
    }
    return p;
}
One problem that you then run into is that there's not much you can reliably do except maybe terminate the program. Even logging the problem is likely to require some memory allocation, so the logging facility will probably have its own problems (unless your allocation failure is due to trying to allocate an unreasonably large block of memory).
You might try to solve that issue by allocating a 'fail-safe' block early in your program that can be freed when you need to log the problem (I think there are quite a few programs that use this strategy). But how much work you are willing to put into this kind of error handling depends on your specific needs. If your program needs to ensure that something of significant complexity is done when malloc() returns an error, you'll need to have corresponding safeguards to make sure you can do those things in a very low-memory situation. Generally this means additional complexity, and it may not always be worth the effort.
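A hedged sketch of that 'fail-safe block' strategy (the reserve size and the xmalloc/init_reserve names are invented for illustration):

#include <stdio.h>
#include <stdlib.h>

static void *emergency_reserve;     /* freed only when allocation fails */

void init_reserve(void)
{
    emergency_reserve = malloc(64 * 1024);  /* size chosen arbitrarily */
}

void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        /* Release the reserve so logging and cleanup have some memory to work with. */
        free(emergency_reserve);
        emergency_reserve = NULL;
        fprintf(stderr, "allocation of %zu bytes failed\n", size);
        abort();
    }
    return p;
}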
People don't check because they're lazy, it makes their code uglier, and they don't want to figure out how to recover from errors everywhere.
I've heard a few programmers say, "If I can't malloc a block the system is going to crash soon anyway because VM is full, so why should I bother checking?"
I disagree. You should check for errors, even if it means just logging the error and calling exit() or throwing an exception. While we were trending towards systems with huge disks and always-on paged memory, the industry has flipped and now we have smartphones and tablets with limited RAM and no on-demand paging. Plus even on the desktop our datasets have grown so much that sometimes malloc will fail.
If you don't want to add extra lines of code everywhere, just write your own malloc replacement that calls malloc and checks for errors and use it instead of malloc.
They just don't care about unexpected crashes!
When you do malloc, it's very likely you are going to store something there immediately. So if you don't check for NULL, the program may crash subsequently when trying to store something there.
This is unlikely in small programs, where malloc rarely fails when asked for small amounts of memory, so malloc doesn't return NULL.
But it's usually good practice to check malloc for NULL even in small programs, in my opinion.
If you need more memory and malloc cannot give you more, can you do anything about it?
I guess exit gracefully.
But if you exit, I guess they think it doesn't really matter how you exit (might as well crash and avoid what they see as the "overhead" of checking for NULL).
Perhaps the functionality was such that they didn't have any need for cleanup code?
I don't agree, though. You should check for NULL on malloc's return.

Can I rely on malloc returning NULL?

I read that on Unix systems, malloc can return a non-NULL pointer even if the memory is not actually available, and trying to use the memory later on will trigger an error. Since I cannot catch such an error by checking for NULL, I wonder how useful it is to check for NULL at all?
On a related note, Herb Sutter says that handling C++ memory errors is futile, because the system will go into spasms of paging long before an exception will actually occur. Does this apply to malloc as well?
Quoting Linux manuals:
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
You ought to check for a NULL return, especially on 32-bit systems, as the process address space can be exhausted long before the RAM is: on 32-bit Linux, for example, user processes might have a usable address space of 2G - 3G as opposed to over 4G of total RAM. On 64-bit systems it might be useless to check the malloc return code, but it might be considered good practice anyway, and it does make your program more portable. And, remember, dereferencing the null pointer certainly kills your process; some swapping might not hurt much compared to that.
If malloc happens to return NULL when one tries to allocate only a small amount of memory, then one must be cautious when trying to recover from the error condition as any subsequent malloc can fail too, until enough memory is available.
The default C++ operator new is often a wrapper over the same allocation mechanisms employed by malloc().
On Linux, you can indeed not rely on malloc returning NULL if sufficient memory is not available due to the kernel's overallocation strategy, but you should still check for it because in some circumstances malloc will return NULL, e.g. when you ask for more memory than is available in the machine in total. The Linux malloc(3) manpage calls the overallocation "a really bad bug" and contains advice on how to turn it off.
I've never heard about this behavior also occurring in other Unix variants.
As for the "spasms of paging", that depends on the machine setup. E.g., I tend not to setup a swap partition on laptop Linux installations, since the exact behavior you fear might kill the hard disk. I would still like the C/C++ programs that I run to check malloc return values, give appropriate error messages and when possible clean up after themselves.
Checking the return of malloc doesn't help much on its own to make your allocations safer or less error prone. It can even be a trap if this is the only test that you implement.
When called with an argument of 0 the standard allows malloc to return a sort of unique address, which is not a null pointer but which you nevertheless don't have the right to access. So if you just test whether the return is 0 but don't test the arguments to malloc, calloc or realloc, you might encounter a segfault much later.
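As a hedged illustration of testing the arguments rather than only the result, a small wrapper (alloc_array is an invented name) can reject the zero case and a count-times-size overflow before malloc is ever called:

#include <stdint.h>
#include <stdlib.h>

/* Allocate an array of `count` elements of `size` bytes each,
   refusing zero-sized and overflowing requests up front. */
void *alloc_array(size_t count, size_t size)
{
    if (count == 0 || size == 0)
        return NULL;                /* treat zero as an error, not as malloc(0) */
    if (count > SIZE_MAX / size)
        return NULL;                /* count * size would overflow */
    return malloc(count * size);
}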
This error condition (memory exhausted) is quite rare in "hosted" environments. Usually you are in trouble long before you hassle with this kind of error. (But if you are writing runtime libraries, are a kernel hacker or rocket builder this is different, and there the test makes perfect sense.)
People then tend to decorate their code with complicated captures of that error condition that span several lines, doing perror and stuff like that, which can have an impact on the readability of the code.
I think that this "check the return of malloc" is much overestimated, sometimes even defended quite dogmatically. Other things are much more important:
always initialize variables, always; for pointer variables this is crucial,
let the program crash nicely before things get too bad; uninitialized pointer members in structs are an important cause of errors that are difficult to find,
always check the argument to malloc and Co. If this is a compile-time constant like sizeof toto there can't be a problem, but always ensure that your vector allocation handles the zero case properly.
An easy way to check the return of malloc is to wrap it with something like memset(malloc(n), 0, 1). This just writes a 0 to the first byte and crashes nicely if malloc had an error or n was 0 to start with.
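Wrapped into a helper, that idiom might look like the following sketch (the function name is invented; if n was 0, the one-byte write is into memory you don't own, so this is a debugging aid rather than a guarantee):

#include <stdlib.h>
#include <string.h>

void *malloc_or_crash(size_t n)
{
    /* Dereferences a null pointer and crashes immediately if malloc failed;
       writes one byte past a zero-size allocation if n was 0, which tools
       like Valgrind or ASan will flag. */
    return memset(malloc(n), 0, 1);
}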
To view this from an alternative point of view:
"malloc can return a non-NULL pointer even if the memory is not actually available" does not mean that it always returns non-NULL. There might (and will) be cases where NULL is returned (as others already said), so this check is necessary nevertheless.

What is causing a stack overflow?

You may think that this is a coincidence that the topic of my question is similar to the name of the forum but I actually got here by googling the term "stack overflow".
I use the OPNET network simulator in which I program using C. I think I am having a problem with big array sizes. It seems that I am hitting some sort of memory allocation limitation. It may have to do with OPNET, Windows, my laptop memory or most likely C language. The problem is caused when I try to use nested arrays with a total number of elements coming to several thousand integers. I think I am exceeding an overall memory allocation limit and I am wondering if there is a way to increase this cap.
Here's the exact problem description:
I basically have a routing table. Let's call it routing_tbl[n], meaning I am supporting 30 nodes (routers). Now, for each node in this table, I keep info about many (hundreds of) available paths, in an array called paths[p]. Again, for each path in this array, I keep the list of nodes that belong to it in an array called hops[h]. So I am using at least n*p*h integers' worth of memory, but this table contains other information as well. In the same function, I am also using another nested array that consumes almost 40,000 integers as well.
As soon as I run my simulation, it quits complaining about stack overflow. It works when I reduce the total size of the routing table.
What do you think causes the problem and how can it be solved?
Much appreciated
Ali
It may help if you post some code. Edit the question to include the problem function and the error.
Meanwhile, here's a very generic answer:
The two principal causes of a stack overflow are 1) a recursive function, or 2) the allocation of a large number of local variables.
Recursion
If your function calls itself, like this:
int recurse(int number) {
    return (recurse(number));
}
Since local variables and function arguments are stored on the stack, each call consumes more of it, so this will eventually fill the stack and cause a stack overflow.
Large local variables
If you try to allocate a large array of local variables then you can overflow the stack in one easy go. A function like this may cause the issue:
void hugeStack (void) {
    unsigned long long reallyBig[100000000][1000000000];
    ...
}
There is quite a detailed answer to this similar question.
Somehow you are using a lot of stack. Possible causes include that you're creating the routing table on the stack, you're passing it on the stack, or else you're generating lots of calls (eg by recursively processing the whole thing).
In the first two cases you should create it on the heap and pass around a pointer to it. In the third case you'll need to rewrite your algorithm in an iterative form.
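A hedged sketch of that heap-based fix, with an invented struct layout standing in for the OP's real table (about 1.2 MB here, far too large for a stack frame but fine on the heap):

#include <stdlib.h>

/* Illustrative sizes only; the OP's real table has more fields. */
struct routing_table {
    int hops[30][500][20];      /* nodes x paths x hops */
};

int simulate(void)
{
    struct routing_table *tbl = malloc(sizeof *tbl);    /* lives on the heap */
    if (tbl == NULL)
        return -1;
    /* ... pass tbl (a pointer) to other functions instead of the whole table ... */
    free(tbl);
    return 0;
}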
Stack overflows can happen in C when the number of nested recursive calls is too high. Perhaps you are calling a function from itself too many times?
This error may also be due to allocating too much memory in local (stack) declarations. You can switch to dynamic allocation through malloc() to fix this type of problem.
Is there a reason why you cannot use the debugger on this program?
It depends on where you have declared the variable.
A local variable (i.e. one declared on the stack) is limited by the maximum frame/stack size. This is a limit of the compiler you are using (and can usually be adjusted with compiler flags).
A dynamically allocated object (i.e. one that is on the heap) is limited by the amount of available memory. This is a property of the OS (and can technically be larger than physical memory if you have a smart OS).
Many operating systems dynamically expand the stack as you use more of it. When you start writing to a memory address that's just beyond the stack, the OS assumes your stack has just grown a bit more and allocates it an extra page (usually 4096 bytes on x86 - exactly 1024 ints).
The problem is, on the x86 (and some other architectures) the stack grows downwards but C arrays grow upwards. This means if you access the start of a large array, you'll be accessing memory that's more than a page away from the edge of the stack.
If you initialise your array to 0 starting from the end of the array (that's right, make a for loop to do it), the errors might go away. If they do, this is indeed the problem.
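A minimal sketch of that end-first initialization (the array size is arbitrary; on a downward-growing stack the highest indices sit closest to the pages the OS has already mapped):

#include <stddef.h>

void touch_from_the_end(void)
{
    int big[200000];    /* roughly 800 KiB of stack */
    /* Write from the highest index down so each newly touched page is
       adjacent to stack pages that are already mapped. */
    for (size_t i = sizeof big / sizeof big[0]; i-- > 0; )
        big[i] = 0;
    /* ... use big ... */
}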
You might be able to find some OS API functions to force stack allocation, or compiler pragmas/flags. I'm not sure about how this can be done portably, except of course for using malloc() and free()!
You are unlikely to run into a stack overflow with unthreaded compiled C unless you do something particularly egregious like have runaway recursion or a cosmic memory leak. However, your simulator probably has a threading package which will impose stack size limits. When you start a new thread it will allocate a chunk of memory for that thread's stack. Likely, there is a parameter you can set somewhere that establishes the default stack size, or there may be a way to grow the stack dynamically. For example, pthreads has a function pthread_attr_setstacksize() which you call prior to starting a new thread to set its size. Your simulator may or may not be using pthreads. Consult your simulator reference documentation.
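If the simulator does happen to use pthreads (an assumption), setting a larger per-thread stack looks roughly like this sketch (the 8 MiB figure is arbitrary):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    /* ... run the code that needs the big stack ... */
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;

    pthread_attr_init(&attr);
    /* Request an 8 MiB stack for the new thread before creating it. */
    pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);

    if (pthread_create(&tid, &attr, worker, NULL) != 0) {
        perror("pthread_create");
        return 1;
    }
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}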
