Curious thing when finding environment variable address in gdb - c

Recently I have been experimenting with a return-to-libc attack, based on the paper "Bypassing non-executable-stack during exploitation using return-to-libc", on Ubuntu 11.10.
Before the experiment I disabled ASLR.
According to the paper, I can find the address of the environment variable SHELL="/bin/bash" in gdb (using gdb to debug the program I want to attack).
But this address turned out to be wrong when I tried to use it in the return-to-libc experiment.
I then wrote a simple program to get the environment variable's address.
When I run this program in the terminal, I get the right address.
With this address I can do the attack.
I also found a related question about this, but the answers don't really make sense (the second one may be better).
Please give me some details about what is going on here.

From your screenshots, I'll assume you're running on a 32-bit Intel platform. I haven't spent the time to fully research an answer to this, but these points are worth noting:
I'll bet that your entire environment is in about the same place, packed together tightly as C-style strings (try x/100s **(char***)&environ).
When I tried this on my x86-64 installation, the only things I saw after the environment were my command line and some empty strings.
At 0xBffff47A, you're very close to the top of user address space (which ends at 0xC0000000).
So my guess is that the following is going on:
The environment block and command line parameters are, at some point during startup, shoved in a packed form right at the end of user address space.
The contents of your environment differ depending on whether you run your program in GDB or in the terminal. For example, I notice "_=/usr/bin/gdb" when running under GDB, and I'll just bet that's only there when running under GDB.
The result is that, while your fixed pointer tends to land somewhere in the middle of the environment block, it doesn't land in the same place every time, since the environment itself is changing between runs.
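To see why the contents of the environment matter, it may help to measure the packed block directly. The following is a minimal sketch, not taken from the original answer: env_block_size and env_lookup are hypothetical helper names, and any sample entries used to try them out (for instance "_=./a.out" for a terminal run) are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Total bytes occupied by a NULL-terminated list of "NAME=value" strings
 * when they are packed back to back, counting each terminating NUL. */
size_t env_block_size(char *const *envp)
{
    size_t total = 0;
    for (; *envp != NULL; ++envp)
        total += strlen(*envp) + 1;
    return total;
}

/* Return a pointer to the value of `name` inside envp, or NULL if absent.
 * The returned pointer is the address a debugger would show for the value. */
const char *env_lookup(char *const *envp, const char *name)
{
    size_t n = strlen(name);
    for (; *envp != NULL; ++envp)
        if (strncmp(*envp, name, n) == 0 && (*envp)[n] == '=')
            return *envp + n + 1;
    return NULL;
}
```

Called as env_block_size(environ) (with extern char **environ; declared), once under gdb and once from the terminal, the two totals would be expected to differ whenever the variables differ. For example, "_=/usr/bin/gdb" takes 15 bytes including its NUL while "_=./a.out" takes 10, and a difference like that is a plausible source of the shifted address.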

Related

where does a debugger save the pointer to the next command in the code segment?

I am working on a challenge for my school and teacher. Basically, they are given a .exe file that takes a password as input, and their goal is to break the password. They can use whatever they like to try to do it, and I'm already working on some defense mechanisms. One thing I want to do is prevent them from debugging the decompiled version of the .exe file (they will use Ghidra to reverse engineer it).
I know debuggers work by running the code one command after another, basically like a script. I thought of a possible (I hope) concept: accessing the software's memory in order to edit the value of the debugger's pointer to the next command.
Meaning, if the debugger is now running a certain command from the code, it must have stored a pointer to it somewhere, and perhaps I would be able to change the value of that pointer mid-run to mess with the debugging process. Is there a way to know where a certain debugger stores the pointer to the next command?

Running own code with a buffer overflow exploit

I am trying to understand the buffer overflow exploit and more specifically, how it can be used to run own code - e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to that address), there are a few things I am struggling to understand in a real application:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As homework, I am trying to exploit a very simple program that just asks for input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function that calls it and of the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But, as you have correctly identified, you don't know where the code you want to execute has been loaded. You just copied it into a local buffer (which is most likely somewhere on the stack).
But there is something that always points to this stack: the stack pointer register. (Let's assume x64, where it is %rsp.)
Assume your custom code is at the top of the stack. (It could be at an offset, but that can be handled similarly.)
Now we need an instruction that
1. Allows us to jump to the address held in %rsp
2. Is located at a fixed address.
Most binaries use some kind of shared library. On Windows you have kernel32.dll. Without ASLR, this library is mapped at the same address in every program that loads it, so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return to that instruction, which jumps to the stack (of course, you need an executable stack for this) and executes your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
Yes you can place your code in the buffer directly. Or if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
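gets() itself stops only at a newline (or end of input), although other functions, such as strcpy(), do stop at a 0 byte. Either way, you can usually get around the problem by changing your instructions a bit so that the assembled code doesn't contain these bytes at all.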
You try different instructions and check the assembled bytes.
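When checking the assembled bytes, a small scanner can flag any byte that the input routine would not pass through. This is only a sketch: first_bad_byte is a hypothetical helper, and which bytes count as "bad" is an assumption that depends on how the target reads its input (a newline for gets(), a 0 byte for functions such as strcpy()).

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the first byte in buf that also appears in the
 * `bad` set, or -1 if buf contains none of them. */
long first_bad_byte(const unsigned char *buf, size_t len,
                    const unsigned char *bad, size_t nbad)
{
    for (size_t i = 0; i < len; ++i)
        if (memchr(bad, buf[i], nbad) != NULL)
            return (long)i;
    return -1;
}
```

Running it over each candidate encoding shows which instruction needs to be replaced by an equivalent one.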

run code stored in memory

Problem:
Run a non-trivial C program, stored as machine instructions on the heap or in the data section of another C program.
My progress:
Ran a set of simple instructions that print something to stdout. The instructions are stored on the heap; I allowed the page containing them to be executed and then called into the raw data as though it were a function. This worked fine.
Next up, given any statically linked C program, I want to read its binary and run its main function from memory, from within another C program.
I believe the issues are:
* jumping to where the main function code is
* changing the binary file's addresses which were created when linking so they are relative to where the code lies now in memory
Please let me know if my approach is good or whether I missed something important and what is the best way to go about it.
Thank you
Modern OSes try not to let you execute code in your data exactly because it's a security nightmare. http://en.wikipedia.org/wiki/No-execute_bit
Even if you get past that, there will be lots more 'gotchas', because both programs will think that they 'own' the stack, heap, etc. Once the new program executes, various bits of RAM belonging to the old program will get stomped on. (exec exists for just this reason: to go cleanly from one program to another.)
If you really need to load code, you should make the first one a library, then use dlopen to load it. (You can use objcopy to extract just the subroutine you want and turn it into a library.)
Alternatively, you can start the program (in another process) and use ptrace (the mechanism behind strace and gdb) to inject a little bit of your code into its process to control it.
(If you're really trying to get into shell code, you should have said so. That's a whole 'nother can of worms.)
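For reference, the experiment described in the question (raw instructions copied into memory, the page made executable, then called like a function) can be sketched as follows. This is a minimal illustration assuming a POSIX system with mmap and mprotect; run_machine_code is a hypothetical name, and the bytes passed in must be valid, position-independent machine code for the CPU it runs on.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copy `len` bytes of machine code into a fresh page, make the page
 * executable, call it as `int (*)(void)`, and return its result.
 * Returns -1 if the mapping or the permission change fails. */
int run_machine_code(const unsigned char *code, size_t len)
{
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memcpy(page, code, len);

    /* Drop write permission and add execute permission. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return -1;
    }

    int (*fn)(void) = (int (*)(void))page;
    int result = fn();

    munmap(page, len);
    return result;
}
```

On x86-64, for example, the six bytes b8 2a 00 00 00 c3 encode mov eax, 42 followed by ret, so run_machine_code should return 42 for them. Nothing here relocates addresses or sets up a separate stack and heap, which is part of why the approach does not extend to a whole statically linked program.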

Detect write to string

Is there a way for me to detect, or trigger a crash on, a write into a string without using mprotect (which I can't use)?
Currently I can detect the write only at the following read, but that's too late (the following read can come from a completely different lib).
Note: Using gdb with watchpoints failed due to the optimizer moving the string around in the process memory.
Edit: The variable in question is a class member (char*) that contains some metadata as a prefix to a string. The string is the part that needs to be immutable, and the prefix must be writable. I've got a few millions of these objects in a class-static hash, and they are accessed from just about anywhere in our code.
You can try to wrap all the code that writes to memory in preprocessor macros that check the address being written, but since most people love to use bare pointers (instead of library calls that encapsulate things), it will probably be a lot of effort.
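A minimal sketch of such a checking macro, assuming a single guarded range and hypothetical names (guard_range, is_guarded, checked_store, CHECKED_STORE); real code with millions of objects would need a different lookup structure:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry holding one guarded range [ro_begin, ro_end). */
static uintptr_t ro_begin, ro_end;

/* Mark `len` bytes starting at `start` as not to be written. */
void guard_range(const void *start, size_t len)
{
    ro_begin = (uintptr_t)start;
    ro_end = ro_begin + len;
}

/* Non-zero if p points into the guarded range. */
int is_guarded(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= ro_begin && a < ro_end;
}

/* Store c through p unless p is guarded.
 * Returns 0 on success, -1 if the write was refused. */
int checked_store(char *p, char c)
{
    if (is_guarded(p))
        return -1;  /* a debug build could call abort() here instead */
    *p = c;
    return 0;
}

/* Every write in the code base would have to go through this macro. */
#define CHECKED_STORE(p, c) checked_store((p), (c))
```

Guarding only the string part of an object, for example guard_range(obj + prefix_len, string_len), leaves the metadata prefix writable while writes into the string are refused at the moment they happen.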
The only other options are mprotect(2) or GDB watchpoints, both of which typically rely on hardware support (page protection in the MMU and the CPU's debug registers, respectively) to catch accesses to the memory in question.
Since you can't use that either, the last option is to print the code on paper and sit down in a quiet corner for a couple of days to read it. This will usually work but most people shun the effort (and because it doesn't look like "real" work ;-).
I am not sure whether gdb has a command similar to "trace" in dbx, but in dbx I remember using "trace" to track individual variables in the code; it notifies you when a variable's value changes during execution.

setting up system for program debugging buffer overflow

I remember reading a long time ago that if I want to test for buffer overflows on my Linux box, I need to set something in the system to allow it to happen. I can't remember exactly what it was, but I was hoping someone would know what I am talking about.
I want to be able to test my programs for vulnerabilities, and see if the registers are overwritten.
EDIT: I am running Ubuntu 10.04
One option is to use a memory debugger such as Valgrind. Note, however, that Valgrind only detects buffer overflows on dynamically allocated memory.
If you have the option to use C++ instead of C, then you can switch to using containers rather than raw arrays, and harness GCC's "checked container" mode (see GCC STL bound checking). I'm sure other compilers offer similar tools.
Another hint (in addition to Oli's answer), when chasing memory bugs with the gdb debugger, is to disable address space layout randomization (ASLR), e.g. by running the following as root:
echo 0 > /proc/sys/kernel/randomize_va_space
After doing that, two consecutive runs of the same deterministic program will usually mmap regions at the same addresses, and this helps a lot when debugging with gdb (because malloc then usually returns the same result at the same point in each run).
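Before relying on stable addresses, it can be worth confirming what the setting currently is. A minimal sketch, assuming a Linux system that exposes the value in /proc/sys/kernel/randomize_va_space; read_int_setting is a hypothetical helper name:

```c
#include <stdio.h>

/* Read the integer stored in a sysctl-style file such as
 * /proc/sys/kernel/randomize_va_space.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
int read_int_setting(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value;
    int parsed = fscanf(f, "%d", &value);
    fclose(f);

    return parsed == 1 ? value : -1;
}
```

Calling read_int_setting("/proc/sys/kernel/randomize_va_space") should return 0 once randomization is disabled; 1 or 2 mean it is still enabled.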
You can also use gdb's watch command. In particular, if in a first run (with ASLR disabled) you find that the location 0x123456 is changing unexpectedly, you could give gdb the following command in its second run:
watch * (void**) 0x123456
Then gdb will break when this location changes (sadly, it has to be mmap-ed already).
