detecting stack overwrite error - c

I'm developing software for a Cortex-M3 embedded microcontroller (Atmel SAM3S), using the IAR EWARM IDE and compiler.
I suspect that I have a buffer overflow or a memory leak that corrupts the stack, because I suddenly find myself stuck outside of my code space.
I'm asking because it's really hard to find out what actually caused this mess-up, and I want to know which techniques you use to find the cause of such an issue.
Do you use memory debuggers, in-circuit trace debugging hardware, etc.?

You should try using canary values. This is how it basically goes - say you have some struct:
struct foo {
    unsigned long bar;
    void *baz;
};
Modify it so it looks like this:
struct foo {
    unsigned long canary1;
    unsigned long bar;
    void *baz;
    unsigned long canary2;
};
When you initialize the struct, put some arbitrary values into canary1 and canary2. Whenever you do some operation on your struct, check if the values stay the same. This way, if you have a buffer overflow or stack smashing, you'll detect it. You can do the same inside functions with automatic variables:
int foo(int bar) {
    unsigned long canary1 = 0xDEADBABE;
    char baz[20];
    unsigned long canary2 = 0xBAD0C0DE;
    ...
}
And so on. Don't forget to check that the values remain the same before you return. Also, if you can get your code to consistently jump to the same location, try putting some code there (or a breakpoint) and get a stack trace.
GCC can add these canary values by itself (the -fstack-protector family of options), but I don't know whether your compiler can do that. You can still do it manually.
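The struct variant described above can be sketched as follows. This is a minimal illustration, not the answerer's code: struct guarded_buf, guarded_buf_init() and guarded_buf_intact() are hypothetical names, and the two magic values are simply the ones used in the answer. Keep in mind that for automatic variables the compiler is free to reorder or eliminate locals, so hand-placed canaries there are best-effort; inside a struct the member order is fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CANARY_HEAD 0xDEADBABEUL   /* arbitrary marker values, as in the answer */
#define CANARY_TAIL 0xBAD0C0DEUL

/* Hypothetical struct: a payload buffer with a canary on each side. */
struct guarded_buf {
    unsigned long canary_head;
    char          data[20];
    unsigned long canary_tail;
};

/* Set both canaries and clear the payload. */
void guarded_buf_init(struct guarded_buf *g)
{
    g->canary_head = CANARY_HEAD;
    memset(g->data, 0, sizeof g->data);
    g->canary_tail = CANARY_TAIL;
}

/* Returns true while both canaries still hold their initial values. */
bool guarded_buf_intact(const struct guarded_buf *g)
{
    return g->canary_head == CANARY_HEAD && g->canary_tail == CANARY_TAIL;
}
```

On the target you would call guarded_buf_intact() after each operation that touches the buffer, and stop in the debugger (or log) as soon as it returns false. The first failing check brackets the code that did the overwrite.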

The program counter is a register, and as such it cannot be "overwritten". What might happen is, as you say, that the stack gets overwritten and then you execute a return instruction which reads an invalid return address from the stack, thus causing a jump into la-la-land.
My favourite debugging method is printing things out, which might be difficult on an embedded target, of course. The second-best would be to step through the suspect routine.
You should also investigate things that are known to cause jumps, such as interrupt service routines.

I had a similar issue using IAR EWARM on an STM32. Memory dumps, disassembly, canaries, all turned up nothing. Finally rolled back to an earlier version of EWARM and the problem went away. I sent a message to IAR support but never heard back. I'm sorry I don't remember which version of EWARM this was. It was a few projects ago.
I would keep a memory window open and try the canary test first. If it still randomly jumps out of code space, try installing an older version of EWARM.

One thing I can add is that with ARM chips, it is possible that there was a BL somewhere instead of a BX or BLX causing the chip to go into the wrong Thumb/ARM mode. Not as common with later chips, but still...
When I find jumps to nowhere, I look for bad function pointer tables, overwrites of any interrupt vector tables, and, yes, stack overflow, which is the easiest to test. Write known byte values into your stack area and, when the crash occurs, use a debugger to see how much stack you had remaining. If none, there you go.
I'd also do the standard "see what's changed in the last X days" exercise to try to isolate any problems. Finally, just printf the heck out of your code to narrow down where the bad jump is occurring. If you can get it down to a function or two, you can trace the assembler and see pretty quickly whether it's a compiler issue, a memory issue, or an interrupt issue. Good luck!
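The "drop known byte values into your stack area" check can be sketched like this. It is a host-side illustration under stated assumptions: stack_paint() and stack_unused_bytes() are hypothetical helpers, 0xAA is an arbitrary fill value, and the stack is assumed to be descending (it starts at the highest address and grows toward the lowest, as on Cortex-M). On a real target the region would be the stack area defined by your linker configuration, and the painting would happen in startup code before the stack is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xAAu   /* arbitrary pattern (assumption); pick a value unlikely in real data */

/* Fill a region with the known pattern, once, before it is used as a stack. */
void stack_paint(uint8_t *base, size_t size)
{
    for (size_t i = 0; i < size; i++)
        base[i] = STACK_FILL;
}

/* Count bytes that still hold the pattern, scanning up from the lowest address.
 * With a descending stack the untouched part sits at the low end, so this is
 * the headroom that was never used. A result of 0 means the stack reached
 * (or ran past) its lower limit. */
size_t stack_unused_bytes(const uint8_t *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_FILL)
        n++;
    return n;
}
```

The count is a high-water mark, not the current depth: it tells you the deepest the stack has ever been since it was painted. A pushed value that happens to equal the fill byte can make the result slightly optimistic.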

Related

Strange memory issue with sscanf

Using the gcc-arm-none-eabi-5_4-2016q3-20160926 toolchain in Eclipse.
Processor: STM32F030
I have a four-line program that runs before any hardware initialization to isolate the problem:
int a;
char *num = "3";
memset((void *)0x20000970, 0xAA, 0x20001f00 - 0x20000970);
sscanf(num, "%i", &a);
I have set the RAM to 0xAA so I can see what gets clobbered, leaving plenty of room for the stack.
After the memset call, the stack pointer is at 0x20001F78 and the memory is 0xAA up to 0x20001F00, as expected. After the sscanf call, the stack pointer is back at 0x20001F78, but memory has been clobbered all the way down to 0x20001BB4. That makes me think that either this simple call took almost 1K of stack (0x20001F78 - 0x20001BB4 = 0x3C4 = 964 bytes) or there is some other error in the routine. I have stopped using this function, but I am curious whether this is expected behavior. Also, is there a list of C functions that one should avoid in embedded systems? This was a surprise to me, but from searching I see I am not alone.

Stack overflow happening when changing a line which is never reached - why and how to prevent it?

I'm developing something in an embedded context with Zephyr.
Essentially I'm dealing with a boot-loop caused by a stack overflow. The stack overflow goes away when I change an unused parameter of a function call deep inside my main. To make sure that the problem is not with the inside of the function, I hard-coded its implementation to be return 0;.
Written like this, the offending line causes a boot loop:
uint8_t port;
ret = foo(&port, NULL, NULL);
But with NULL passed instead of the address of port, the code runs normally:
uint8_t port;
ret = foo(NULL, NULL, NULL);
Mind you, as I've already said, the implementation of foo is hard-coded to return 0. The parameters are at no point used. Furthermore, I'm sure the line is never actually reached at runtime (in this case) as it lives behind some conditionals requiring my interaction to actually go through.
I had started to give up and blame faulty memory or ESD damage, but when I tried the same code with the same changes on a spare piece of hardware I had lying around, the same thing happened. What am I missing? I genuinely don't know what else I could do to find out why this is happening and how to fix it. I don't have access to a debugger for this microcontroller (SAMD21), so I'm at a bit of a loss... Any ideas (or at least sympathy)?
When you remove that parameter does it run without any errors or are there other errors? If you are writing to the wrong memory (e.g. memory that was allocated with a size of zero) somewhere in your program, changes to unrelated parts of the program's code, such as changing the size of a struct, or the parameters of a function, could change where a fatal error occurs and what kind of fatal error it is.
Never mind, I've found the culprit: a simple stack overflow. I was one byte away from it before I added the uint8_t port variable declaration to main. When the variable was not passed to foo(), the compiler optimised it away. Having one byte fewer on the call stack was apparently enough to prevent the overflow.
Solution: increase the stack size and be more careful about clogging the stack with unnecessary items.

Function crashes on returning if malloc() has been used

I'm having one of those moments where I'm sure there is some obvious thing I'm missing but I can't see it for looking.
We have some code (Not Invented Here, natch) which looks something like this (I've made it pseudocode for ease of reading):
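#include <stdlib.h>
#include <string.h>
/* struct allthings_struct and struct inputs_struct are defined elsewhere (omitted here). */
struct outputs_struct {
    char *SomeString;
};
/* Declared up front because DoSomething() calls it before its definition. */
int DoGetOutputsFromInputs(struct allthings_struct *AllThings, struct inputs_struct *Inputs, struct outputs_struct *Outputs);
int DoSomething(struct allthings_struct *AllThings)
{
    struct inputs_struct The_Inputs;
    struct outputs_struct The_Outputs;
    int error = 0;
    // Populate input data, then:
    error = DoGetOutputsFromInputs(AllThings, &The_Inputs, &The_Outputs);
    return error;
}
int DoGetOutputsFromInputs(struct allthings_struct *AllThings, struct inputs_struct *Inputs, struct outputs_struct *Outputs)
{
    // Some reading of input data, then:
    Outputs->SomeString = malloc(100);
    if (Outputs->SomeString == NULL)
        return -1; // allocation failed
    strcpy(Outputs->SomeString, "Hello,world");
    // Some other stuff
    return 0;
}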
As soon as this function returns, we get a SEGFAULT.
It SEGFAULTs immediately on coming back from DoGetOutputsFromInputs(). Likewise if I print markers & pause before the return statement in DoGetOutputsFromInputs() it is fine right up to the moment it actually returns.
I have also tried upping my caffeine dosage, experiments are ongoing in that department, so far: no progress.
Edit 1: Further testing reveals it's not the malloc() that's at fault / causing the issue, the code actually crashes if we return sooner than that part, so I think there is some oddness going on elsewhere that I will have to chase down.
Apologies for the vagueness and pseudocode, it's a huge steaming pile of code auto-generated by gSoap (which doesn't auto-generate any sort of comments or documentation, of course...) from ONVIF WSDL's, we're developing in Ubuntu and the target is a TI DaVinci DSP/ARM9 SoC. This code is a subsection of a corner of the TI SDK and hence various things are outside our immediate influence / too time-consuming to delve into.
Your example does not reproduce the problem. I suspect the culprit is the reference to The_Outputs, which is declared in the caller's stack frame: somewhere in the code a cast fools the compiler into writing a few bytes higher on the stack, exactly where the saved ebp and the return address would be, which triggers the fault when the ret executes (I assume an x86-like stack layout).
Running under gdb should make this fairly trivial to capture. Break in DoGetOutputsFromInputs and use watch to set a break-on-write on the stack slot that holds the return address (see Can I set a breakpoint on 'memory access' in GDB?). Let it run; if my hypothesis is correct, it should break when the overwrite occurs, and that instruction is your culprit.
Of course, compiling with stack-smashing protection would also catch the problem fairly easily, but where is the fun in that?
Well to answer my own question and close this off / avoid wasting anyone's time... basically, it's not the malloc, it's unlikely it's even that function, there is something lurking in the code which isn't quite right and which I will have to devote a fair bit more time & coffee to tracking down.
Thanks all for the input.
Nurse, fetch the valium!
It's impossible to say without the actual code, but this could be due to memory corruption (e.g., buffer overflow or underflow) or UB (undefined behavior). If it is, chances are the actual issue is happening somewhere else and just happens to show up at this point.
A few things you can do to narrow down the cause:
Use Valgrind or a similar tool to look for memory issues.
Create a minimal example code that replicates the issue.
Double-check all memory allocations, frees, and copies.
Test the DoGetOutputsFromInputs() to ensure it works as expected.

Allocating a new call stack

(I think there's a high chance of this question either being a duplicate or otherwise answered here already, but searching for the answer is hard thanks to interference from "stack allocation" and related terms.)
I have a toy compiler I've been working on for a scripting language. In order to be able to pause the execution of a script while it's in progress and return to the host program, it has its own stack: a simple block of memory with a "stack pointer" variable that gets incremented using the normal C code operations for that sort of thing and so on and so forth. Not interesting so far.
At the moment I compile to C. But I'm interested in investigating compiling to machine code as well - while keeping the secondary stack and the ability to return to the host program at predefined control points.
So... I figure it's not likely to be a problem to use the conventional stack registers within my own code, I assume what happens to registers there is my own business as long as everything is restored when it's done (do correct me if I'm wrong on this point). But... if I want the script code to call out to some other library code, is it safe to leave the program using this "virtual stack", or is it essential that it be given back the original stack for this purpose?
Answers like this one and this one indicate that the stack isn't a conventional block of memory, but that it relies on special, system specific behaviour to do with page faults and whatnot.
So:
is it safe to move the stack pointers into some other area of memory? Stack memory isn't "special"? I figure threading libraries must do something like this, as they create more stacks...
assuming any area of memory is safe to manipulate using the stack registers and instructions, I can think of no reason why it would be a problem to call any functions with a known call depth (i.e. no recursion, no function pointers) as long as that amount is available on the virtual stack. Right?
stack overflow is obviously a problem in normal code anyway, but would there be any extra-disastrous consequences to an overflow in such a system?
This is obviously not actually necessary, since simply returning the pointers to the real stack would be perfectly serviceable, or for that matter not abusing them in the first place and just putting up with fewer registers, and I probably shouldn't try to do it at all (not least due to being obviously out of my depth). But I'm still curious either way. Want to know how these sorts of things work.
EDIT: Sorry of course, should have said. I'm working on x86 (32-bit for my own machine), Windows and Ubuntu. Nothing exotic.
All of these answers are based on "common processor architectures", and since this involves generating assembler code, it has to be target-specific: if you decide to do this on processor X, which has some weird handling of the stack, the below is obviously not worth the screen surface it's written on [substitute for paper]. For x86 in general, the below holds unless otherwise stated.
is it safe to move the stack pointers into some other area of memory?
Stack memory isn't "special"? I figure threading libraries
must do something like this, as they create more stacks...
The memory as such is not special. This does, however, assume that you are not on an x86 system where the stack segment is used to limit stack usage. Whilst that is possible, it's rather rare to see in an implementation. I know that some years ago Nokia had a special operating system using segments in 32-bit mode. As far as I can think of right now, that's the only one I've had any contact with that uses the stack segment the way the x86 segmentation model describes.
Assuming any area of memory is safe to manipulate using the stack
registers and instructions, I can think of no reason why it would be a
problem to call any functions with a known call depth (i.e. no
recursion, no function pointers) as long as that amount is available
on the virtual stack. Right?
Correct. Just as long as you don't expect to be able to get back to some other function without switching back to the original stack. Limited level of recursion would also be acceptable, as long as the stack is deep enough [there are certain types of problems that are definitely hard to solve without recursion - binary tree search for example].
stack overflow is obviously a problem in normal code anyway,
but would there be any extra-disastrous consequences to an overflow in
such a system?
Indeed, it would be a tough bug to crack if you are a little unlucky.
I would suggest that you use a call to VirtualProtect() (Windows) or mprotect() (Linux etc) to mark the "end of the stack" as unreadable and unwriteable so that if your code accidentally walks off the stack, it crashes properly rather than some other more subtle undefined behaviour [because it's not guaranteed that the memory just below (lower address) is unavailable, so you could overwrite some other useful things if it does go off the stack, and that would cause some very hard to debug bugs].
Add a bit of code that occasionally checks the stack depth: you know where your stack starts and ends, so it shouldn't be hard to check whether a particular stack pointer value is outside the range. Give yourself some extra buffer space between the top of the stack and the "we're dead" zone that you protected (a "crumple zone", as they would call it if it were a car in a crash). You can also fill the entire stack with a recognisable pattern and check how much of it is untouched.
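The guard-page suggestion can be sketched as follows for POSIX systems such as Linux. This is an illustration under assumptions, not a drop-in implementation: struct guarded_stack and the three functions are hypothetical names, the stack is assumed to grow downward (so the guard sits at the lowest address), and write_faults() exists only to demonstrate the effect by letting a child process take the fault. On Windows the analogous calls would be VirtualAlloc() and VirtualProtect() with PAGE_NOACCESS, which are not shown here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical descriptor for a privately allocated stack. */
struct guarded_stack {
    unsigned char *mapping;   /* start of the whole mapping (guard page first) */
    size_t         map_size;  /* guard page plus usable bytes */
    unsigned char *limit;     /* lowest usable address, just above the guard page */
    unsigned char *top;       /* one past the highest usable byte (initial stack pointer) */
};

/* Map one guard page plus usable_pages of stack, then revoke all access to the guard. */
bool guarded_stack_alloc(struct guarded_stack *s, size_t usable_pages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || usable_pages == 0)
        return false;

    size_t total = (size_t)page * (usable_pages + 1);
    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;

    /* Any read or write that walks off the low end of the stack now faults. */
    if (mprotect(p, (size_t)page, PROT_NONE) != 0) {
        munmap(p, total);
        return false;
    }
    s->mapping  = p;
    s->map_size = total;
    s->limit    = (unsigned char *)p + page;
    s->top      = (unsigned char *)p + total;
    return true;
}

void guarded_stack_free(struct guarded_stack *s)
{
    munmap(s->mapping, s->map_size);
}

/* Demonstration helper: a forked child writes one byte at addr.
 * Returns true if the child did not exit cleanly, i.e. the write faulted. */
bool write_faults(volatile unsigned char *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return false;
    if (pid == 0) {
        *addr = 1;
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return false;
    return !(WIFEXITED(status) && WEXITSTATUS(status) == 0);
}
```

Generated code would load its stack pointer with top and must never go below limit. A single guard page only catches overruns that touch it, so a function that reserves a frame larger than one page can still jump straight over it.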
Typically, on x86, you can use the existing stack without any problems so long as:
you don't overflow it
you don't increment the stack pointer register (with pop or add esp, positive_value / sub esp, negative_value) beyond what your code starts with (if you do, interrupts or asynchronous callbacks (signals) or any other activity using the stack will trash its contents)
you don't cause any CPU exception (if you do, the exception handling code might not be able to unwind the stack to the nearest point where the exception can be handled)
The same applies to using a different block of memory for a temporary stack and pointing esp to its end.
The problem with exception handling and stack unwinding has to do with the fact that your compiled C and C++ code contains some exception-handling-related data structures like the ranges of eip with the links to their respective exception handlers (this tells where the closest exception handler is for every piece of code) and there's also some information related to identification of the calling function (i.e. where the return address is on the stack, etc), so you can bubble up exceptions. If you just plug in raw machine code into this "framework", you won't properly extend these exception-handling data structures to cover it, and if things go wrong, they'll likely go very wrong (the entire process may crash or become damaged, despite you having exception handlers around the generated code).
So, yeah, if you're careful, you can play with stacks.
You can use any region you like for the processor's stack (modulo the memory protections).
Essentially, you simply load the ESP register ("MOV ESP, ...") with a pointer to the new area, however you managed to allocate it.
You have to have enough for your program, and whatever it might call (e.g., a Windows OS API), and whatever funny behaviours the OS has. You might be able to figure out how much space your code needs; a good compiler can easily do that. Figuring how much is needed by Windows is harder; you can always allocate "way too much" which is what Windows programs tend to do.
If you decide to manage this space tightly, you'll probably have to switch stacks to call Windows functions. That won't be enough; you'll likely get burned by various Windows surprises. I describe one of them here Windows: avoid pushing full x86 context on stack. I have mediocre solutions, but not good solutions for this.

how to find if stack increases upwards or downwards?

How can I find out whether the stack grows upwards or downwards?
This is very platform-dependent, and even application-dependent.
The code posted by Vino only works in targets where parameters are passed on the stack AND local variables are allocated from the stack, in that order. Many compilers will assign fixed memory addresses to parameters, or pass parameters in registers. While common, passing parameters on the stack is one of the least efficient ways to get data into and out of a function.
Look at the disassembly for your compiled app and see what code the compiler is generating. If your target has native stack manipulation commands (like PUSH and POP) that the compiler is using, then the CPU datasheet/reference manual will tell you which direction the stack is growing. However, the compiler may choose to implement its own stack, in which case you'll have to do some digging.
Or, read the stack pointer, push something on the stack, and read the stack pointer again. Compare the results of the first and second read to determine the direction in which the pointer moves.
For future reference: if you include some details about your target architecture (embedded? PC? Linux, Windows? GCC? VC? Watcom? blah blah blah) you'll get more meaningful answers.
One possible way is...
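#include <stdio.h>
#include <stdint.h>
void call(int *a)
{
    int b;
    /* Compare as integers: a relational comparison of pointers to unrelated objects is undefined in C. */
    if ((uintptr_t)&b > (uintptr_t)a)
        printf("Stack grows up.\n");
    else
        printf("Stack grows down.\n");
}
int main(void)
{
    int a;
    call(&a);
    return 0;
}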
Brute force approach is to fill your memory with a known value say 0xFF. Push some items on the stack. Do a memory dump. Push some more items on the stack. Do another memory dump.
Create a function with many local variables.
Turn off optimizations.
Either print the assembly language,
or, when debugging, display mixed source and assembly language.
Note the stack pointer (or register) before the function is executed.
Single-step through the function and watch the stack pointer.
In general, whether a compiler uses an incrementing or a decrementing stack pointer is a very minor issue, as long as the behaviour is consistent and works. This is one issue that rarely occupies my mind. I tend to concentrate on more important topics, such as quality, correctness and robustness.
I'll trust the compiler to correctly handle stack manipulation. I don't trust recursive functions, especially on embedded or restricted platforms.

Resources