How to debug a stackoverflow problem on targets

How to debug a stackoverflow problem on targets - c

I want to know how do we proceed to debug a STACKOVERFLOW issue on targets .
I mean what are the steps we should follow to reach a conclusion.

Put a memory write watchpoint for one word past the end of your stack space. Then the debugger will break in when that spot gets written to, and you can see what's at fault.

All stacks can be filled at start up with certain hex value (for example 0xAAAAAAAA). And then using special routine you can monitor all stack's maximum usage periodically by calculating the quantity of known values (0xAA..) from end of stack until finds the first difference.

Run it through a debugger such as gdb. The backtrace at the time of the stack overflow will tell you exactly which function or functions are repeating indefinitely. From there, figure out which input(s) to those functions are not changing, and not moving the function (if it's recursive) towards a base-case that will end the recursion.

Related

Running own code with a buffer overflow exploit

I am trying to understand the buffer overflow exploit and more specifically, how it can be used to run own code - e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As a homework, I am trying to exploit a very simple program that just asks for an input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function which calls it and the return, I just can't figure out how to actually implement the exploit.

You understand how changing the return address lets you jump to an arbitrary location.
But as you have correctly identified you don't know where you have loaded the code you want to execute. You just copied it into a local buffer(which was mostly some where on the stack).
But there is something that always points to this stack and it is the stack pointer register. (Lets assume x64 and it would be %rsp).
Assuming your custom code is on the top of the stack. (It could be at an offset but that too can be managed similarly).
Now we need an instruction that
1. Allows us to jump to the esp
2. Is located at a fixed address.
So most binaries use some kind of shared libraries. On windows you have kernel32.dll. In all the programs this library is loaded, it is always mapped at the same address. So you know the exact location of every instruction in this library.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will return then and then jump to the stack (ofcourse you need an executable stack for this). Then it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes you can place your code in the buffer directly. Or if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
Yes, gets would stop at \n and 0. But usually you can get away by changing your instructions a bit to write code that doesn't use these bytes at all.
You try different instructions and check the assembled bytes.

Getting a stack trace from GDB

I must have a misunderstanding of the stack, or how functions are called, the backtrace results I'm getting from GDB make no sense to me. I'm trying to find out where things get called in a program so that I can add my component.
The tool draws bounding boxes on videos, what I made is an interpolator. I thought it only made sense to open GDB and put a breakpoint in when a box was being drawn, and run a backtrace. Here's mu output (after running the program from ffmpeg.c main())
#0 draw_glyphs (vidatbox=0x10183d200, picref=0x10141e340, width=720, height=480,
rgbcolor=0x10183d284 "????", yuvcolor=0x10183d278 "뀀?\020???\020???????", x=0, y=0) at
libavfilter/vf_VidAT.c:627
#1 0x000000010001ce4c in draw_text (ctx=0x10120df20, picref=0x10141e340, width=720,
height=480) at libavfilter/vf_VidAT.c:787
Disregarding all the non ascii chars,how are the two functions draw_glyphs and draw_text being called? How come there is nothing else on the stack? When I select Frame #1 and try and go up, it tells me:
Initial frame selected; you cannot go up.
EDIT:
I've looked more, and I'm even more confused then I was when I asked. The function draw_glyphs is not even called inside of the main that I'm running. I've grepped through all the files that this uses to compile and well...it's not called anywhere!
Does this mean that it's a dynamically created function pointer or something? If so, would that make the stack innaccessible like mine is?

If the stack trace is informative but terminates unexpectedly (especially after just one or two entries), then that indicates that gdb was unable to follow the call stack past that point.
Compiler options that interfere with stack unwinding include higher optimisation levels (esp. -O3) and -fomit-frame-pointer, so search your Makefiles and remove those options. The frame pointer is not usually necessary to the execution of code, so using it as a general purpose register will improve performance on register-starved architectures such as x86, but can interfere with debugging.
More recently frame-based stack unwinding is being replaced with unwind tables, but gdb still relies on the frame pointer if unwind tables are not present.

Buffer Overflow - why some ascii's work and not others

I'm sorry if this question is stupid or has been asked, but I couldn't find it.
I have a program that I was attempting to use a buffer over flow. It is a simple program that uses getchar() to retrieve the input from the user. The buffer is set to size 12. I can get the program to crash by typing >12 x's or using >12 \x78's, but it won't seg fault if I type in hundreds of A's or \x41's.
Any help or pointing in the right direction would be greatly appreciated.

0x41414141 may be a valid address within a text page of the process. Look at the segment map of the process for details.

To eliminate guessing, look at the assembly code and then at machine instructions of your program. Run it in a debugger and see what happens in the memory. You can see at what addresses on the stack local variables are placed and and what addresses registers and especially the instruction pointer are saved on a function call.
Have you look at examples like the stack overflow on Wikipedia?

Need help with buffer overrun

I've got a buffer overrun I absolutely can't see to figure out (in C). First of all, it only happens maybe 10% of the time or so. The data that it is pulling from the DB each time doesn't seem to be all that much different between executions... at least not different enough for me to find any discernible pattern as to when it happens. The exact message from Visual Studio is this:
A buffer overrun has occurred in
hub.exe which has corrupted the
program's internal state. Press
Break to debug the program or Continue
to terminate the program.
For more details please see Help topic
'How to debug Buffer Overrun Issues'.
If I debug, I find that it is broken in __report_gsfailure() which I'm pretty sure is from the /GS flag on the compiler and also signifies that this is an overrun on the stack rather than the heap. I can also see the function it threw this on as it was leaving, but I can't see anything in there that would cause this behavior, the function has also existed for a long time (10+ years, albeit with some minor modifications) and as far as I know, this has never happened.
I'd post the code of the function, but it's decently long and references a lot of proprietary functions/variables/etc.
I'm basically just looking for either some idea of what I should be looking for that I haven't or perhaps some tools that may help. Unfortunately, nearly every tool I've found only helps with debugging overruns on the heap, and unless I'm mistaken, this is on the stack. Thanks in advance.

You could try putting some local variables on either end of the buffer, or even sentinels into the (slightly expanded) buffer itself, and trigger a breakpoint if those values aren't what you think they should be. Obviously, using a pattern that is not likely in the data would be a good idea.

While it won't help you in Windows, Valgrind is by far the best tool for detecting bad memory behavior.
If you are debugging the stack, your need to get to low level tools - place a canary in the stack frame (perhaps a buffer filled with something like 0xA5) around any potential suspects. Run the program in a debugger and see which canaries are no longer the right size and contain the right contents. You will gobble up a large chunk of stack doing this, but it may help you spot exactly what is occurring.

One thing I have done in the past to help narrow down a mystery bug like this was to create a variable with global visibility named checkpoint. Inside the culprit function, I set checkpoint = 0; as the very first line. Then, I added ++checkpoint; statements before and after function calls or memory operations that I even remotely suspected might be able to cause an out-of-bounds memory reference (plus peppering the rest of the code so that I had a checkpoint at least every 10 lines or so). When your program crashes, the value of checkpoint will narrow down the range you need to focus on to a handful of lines of code. This may be a bit overkill, I do this sort of thing on embedded systems (where tools like valgrind can't be used) but it should still be useful.

Wrap it in an exception handler and dump out useful information when it occurs.

Does this program recurse at all? If so, I check there to ensure you don't have an infinite recursion bug. If you can't see it manually, sometimes you can catch it in the debugger by pausing frequently and observing the stack.

Bizarre bug in C [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 3 years ago.
Improve this question
So I have a C program. And I don't think I can post any code snippets due to complexity issues. But I'll outline my error, because it's weird, and see if anyone can give any insights.
I set a pointer to NULL. If, in the same function where I set the pointer to NULL, I printf() the pointer (with "%p"), I get 0x0, and when I print that same pointer a million miles away at the end of my program, I get 0x0. If I remove the printf() and make absolutely no other changes, then when the pointer is printed later, I get 0x1, and other random variables in my structure have incorrect values as well. I'm compiling it with GCC on -O2, but it has the same behavior if I take off optimization, so that's not hte problem.
This sounds like a Heisenbug, and I have no idea why it's happening, nor how to fix it. Does anyone who has dealt with something like this in the past have advice on how they approached this kind of problem? I know this may sound kind of vague.
EDIT: Somehow, it works now. Thank you, all of you, for your suggestions.
The debugger told me interesting things - that my variable was getting optimized away. So I rewrote the function so it didn't need the intermediate variable, and now it works with and without the printf(). I have a vague idea of what might have been happening, but I need sleep more than I need to know what was happening.

Are you using multiple threads? I've often found that the act of printing something out can be enough to effectively suppress a race condition (i.e. not remove the bug, just make it harder to spot).
As for how to diagnose/fix it... can you move the second print earlier and earlier until you can see where it's changing?
Do you always see 0x1 later on when you don't have the printf in there?
One way of avoiding the delay/synchronization of printf would be to copy the pointer value into another variable at the location of the first printf and then print out that value later on - so you can see what the value was at that point, but in a less time-critical spot. Of course, as you've got odd value "corruption" going on, that may not be as reliable as it sounds...
EDIT: The fact that you're always seeing 0x1 is encouraging. It should make it easier to track down. Not being multithreaded does make it slightly harder to explain, admittedly.
I wonder whether it's something to do with the extra printf call making a difference to the size of stack. What happens if you print the value of a different variable in the same place as the first printf call was?
EDIT: Okay, let's take the stack idea a bit further. Can you create another function with the same sort of signature as printf and with enough code to avoid it being inlined, but which doesn't actually print anything? Call that instead of printf, and see what happens. I suspect you'll still be okay.
Basically I suspect you're screwing with your stack memory somewhere, e.g. by writing past the end of an array on the stack; changing how the stack is used by calling a function may be disguising it.

If you're running on a processor that supports hardware data breakpoints (like x86), just set a breakpoint on writes to the pointer.

Do you have a debugger available to you? If so, what do the values look like in that? Can you set any kind of memory/hardware breakpoint on the value? Maybe there's something trampling over the memory elsewhere, and the printf moves things around enough to move or hide the bug?
Probably worth looking at the asm to see if there's anything obviously wrong there. Also, if you haven't already, do a full clean rebuild. If the definition of the struct has changed recently, there's a vague change that the compiler could be getting it wrong if the dependency checking failed to correctly rebuild everything it needed to.

Have you tried setting a condition in your debugger which notifies you when that value is modified? Or running it through Valgrind? These are the two major things that I would try, especially Valgrind if you're using Linux. There's no better way to figure out memory errors.

Without code, it's a little hard to help, but I understand why you don't want to foist copious amounts on us.
Here's my first suggestion: use a debugger and set a watchpoint on that pointer location.
If that's not possible, or the bug disappears again, here's my second suggestion.
1/ Start with the buggy code, the one where you print the pointer value and you see 0x1.
2/ Insert another printf a little way back from there (in terms of code execution path).
3/ If it's still 0x1, go back to step 2, moving a little back through the execution path each time.
4/ If it's 0x0, you know where the problem lies.
If there's nothing obvious between the 0x0 printf and the 0x1 printf, it's likely to be corruption of some sort. Without a watchpoint, that'll be hard to track down - you need to check every single stack variable to ensure there's no possibility of overrun.
I'm assuming that pointer is a global since you set it and print it "a million miles away". If it is, lok at the variables you define on either side of it (in the source). They're the ones most likely to be causing overrun.
Another possibility is to turn off the optimization to see if the problem still occurs. We've occasionally had to ship code like that in cases where we couldn't fix the bug before deadlines (we'll always go back and fix it later, of course).