Stack output in C - c

In advance: I do not want a 'ready-to-use' solution; in my opinion that would defeat the purpose of learning something, which is my primary goal. What I'd like is a few explanations or hints that lead to a deeper understanding.
Now to the problem:
After setting a breakpoint in gdb, I get the following dump of the stack (C program):
0xbfa62f84:0x08048350 0xbfa62fe8 0xb7df0390 0x00000001
0xbfa62f94:0xbfa63014 0xbfa6301c 0xb7f262d0 0x00000000
The question that emerges now is: what do these values stand for? And how can they be disassembled or decomposed?
I assume that they encode the memory address + some OP-code like mov, sub etc.
But how? and why? Or asked in a different fashion: how can these instructions be 'read out'?
Thanks in advance
Dan

If you want to understand this flow, use a debugger such as Keil. It shows the assembly code, the generated hex file, and your source code at the same time, so when you step through the code you can see how the assembly relates to the hex file and the source.

Machine code is not stored in the stack; however, the return address stored in the stack frame points to machine code. 0x08048350 is a good candidate for a code address (on 32-bit x86 Linux, the code of a non-relocatable executable is typically loaded at a low address near 0x08048000); you can examine the memory starting at that address and try to puzzle out opcodes and registers.
Or you could use the gdb command x/i to display the instructions starting at that address - x/16i 0x08048350 will display the first 16 instructions starting at that address.
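As a rough aid for reading such a dump, the sketch below sorts a 32-bit word into a likely category by its numeric range. The ranges are assumptions that match a typical 32-bit Linux process without ASLR or PIE (executable image near 0x08048000, shared libraries around 0xb7000000, stack just below 0xc0000000); they are heuristics, not something gdb guarantees, and classify_word and kind_name are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Heuristic categories for a 32-bit word found in a stack dump. */
enum word_kind {
    WORD_SMALL_VALUE,  /* small integer such as a counter, flag, or NULL */
    WORD_EXECUTABLE,   /* inside the main executable: code or static data */
    WORD_LIBRARY,      /* inside a shared library mapping */
    WORD_STACK,        /* pointer to another stack location */
    WORD_UNKNOWN
};

/* Classify by address range. All ranges are assumptions for a classic
 * 32-bit Linux layout without ASLR; adjust them for your own system. */
enum word_kind classify_word(uint32_t w)
{
    if (w < 0x00010000u)                     return WORD_SMALL_VALUE;
    if (w >= 0x08048000u && w < 0x09000000u) return WORD_EXECUTABLE;
    if (w >= 0xb7000000u && w < 0xb8000000u) return WORD_LIBRARY;
    if (w >= 0xbf000000u && w < 0xc0000000u) return WORD_STACK;
    return WORD_UNKNOWN;
}

/* Human-readable name for a category. */
const char *kind_name(enum word_kind k)
{
    static const char *const names[] = {
        "small value", "executable image", "shared library", "stack", "unknown"
    };
    assert(k >= WORD_SMALL_VALUE && k <= WORD_UNKNOWN);
    return names[k];
}
```

Under these assumptions, 0x08048350 falls in the executable image, 0xbfa62fe8 is a stack pointer (for example a saved frame pointer or an argv-style pointer), 0xb7df0390 lies in a shared library, and 0x00000001 is just a small integer. Only the words that land in the executable or a library are worth feeding to x/i.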

Related

Running own code with a buffer overflow exploit

I am trying to understand the buffer overflow exploit and more specifically, how it can be used to run own code - e.g. by starting our own malicious application or anything similar.
While I do understand the idea of the buffer overflow exploit using the gets() function (overwriting the return address with a long enough string and then jumping to the said address), there are a few things I am struggling to understand in real application, those being:
Do I put my own code into the string just behind the return address? If so, how do I know the address to jump to? And if not, where do I jump and where is the actual code located?
Is the actual payload that runs the code my own software that's running and the other program just jumps into it or are all the instructions provided in the payload? Or more specifically, what does the buffer overflow exploit implementation actually look like?
What can I do when the address (or any instruction) contains 0? gets() function stops reading when it reads 0 so how is it possible to get around this problem?
As a homework, I am trying to exploit a very simple program that just asks for an input with gets() (ASLR turned off) and then prints it. While I can find the memory address of the function which calls it and the return, I just can't figure out how to actually implement the exploit.
You understand how changing the return address lets you jump to an arbitrary location.
But, as you have correctly identified, you don't know where the code you want to execute ends up in memory. You just copied it into a local buffer, which most likely lives somewhere on the stack.
There is, however, something that always points to this stack: the stack pointer register. (Let's assume x64, where it is %rsp.)
Assume your custom code sits at the top of the stack when the function returns. (It could be at an offset, but that can be handled similarly.)
Now we need an instruction that
1. Allows us to jump to the address held in the stack pointer
2. Is located at a fixed address.
Most binaries use some kind of shared library; on Windows you have kernel32.dll, for example. When ASLR is not in effect, such a library is mapped at the same address in every program that loads it, so you know the exact location of every instruction in it.
All you have to do is disassemble one such library and find an instruction like
jmp *%rsp // or a sequence of instructions that lets you jump to an offset
Then the address of this instruction is what you will place where the return address is supposed to be.
The function will then return and jump to the stack (of course, you need an executable stack for this), where it will execute your arbitrary code.
Hope that clears some confusion on how to get the exploit running.
To answer your other questions -
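Yes, you can place your code in the buffer directly. Or, if you can find the exact code you want to execute (again in a shared library), you can simply jump to that.
gets() stops at a newline (\n) or at end of input; a 0 byte is read like any other character, although string functions such as strcpy() do stop at it. When a particular byte value is a problem, you can usually get away with changing your instructions a bit so that the code doesn't contain that byte at all.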
You try different instructions and check the assembled bytes.

fetch and decode instruction located at X address with c

I want to fetch and decode an instruction at address X. After that I want to increment the address by 4 and then execute the decoded instruction. The registers are 32-bit, big endian. I am not asking for a solution, more a pointer or tips on how to do this in C, or if any of you know some good guides to follow.
You probably want assembly for this, not C. You could link assembly code into a C program, but you shouldn't write that in C.
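If the goal is a simulator in which "memory" is just a byte array inside the C program (an assumption; the question does not say), the fetch step can be written in portable C. The sketch below reads one 32-bit big-endian word and advances the program counter by 4. The field layout in decode (a 6-bit opcode in the top bits, the rest as operand) is purely hypothetical and only illustrates the idea; fetch_be32, decode and struct decoded are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one 32-bit big-endian instruction word at byte offset *pc
 * and advance *pc by 4. Works regardless of the host's endianness. */
uint32_t fetch_be32(const uint8_t *mem, size_t mem_size, uint32_t *pc)
{
    assert(mem != NULL && pc != NULL);
    assert(mem_size >= 4 && *pc <= mem_size - 4);

    const uint8_t *p = mem + *pc;
    uint32_t word = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                    ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    *pc += 4;
    return word;
}

/* Hypothetical instruction format: bits 31..26 opcode, bits 25..0 operand. */
struct decoded {
    uint8_t  opcode;
    uint32_t operand;
};

struct decoded decode(uint32_t word)
{
    struct decoded d;
    d.opcode  = (uint8_t)(word >> 26);
    d.operand = word & 0x03FFFFFFu;
    return d;
}
```

Executing the decoded instruction would then typically be a switch on d.opcode that updates an array of simulated registers.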

Write jump instruction in c

To preface this, yes this is a project to take control of an executable externally. No, I do not have any malicious intents with this, the end result of this project won't be anything useful anyway. I am writing this in cygwin on a 32-bit installation of XP.
What I need to do is change the first few bytes of a COM file to be a jump instruction so that on execution, it will jump to the very end of the COM file. I have looked in Assembler manuals to find what the bytes of that command would be so that I can just hard code it in C, but have had no luck.
First Question: Can I do this in C? It seems to me like I could just insert OpCodes in the beginning of any COM file so that it would execute that instead of the COM file.
Second Question: does someone know where I can find a resource for OpCodes so that I can insert them in my file? Or, does anyone know what the bytes would be for a Jump instruction?
If you have any question about the authenticity of this, feel free to ask.
The Intel® 64 and IA-32 Architectures Software Developer Manual Volume 2A Instruction Set Reference explains the encoding of the JMP instruction (real mode is a subset of IA-32).
For a near jump with a 16-bit displacement (within the current code segment) you'd use 0xE9 followed by the relative offset as a 16-bit little-endian value. If your jump is the first instruction of the COM file, the offset is relative to address 0x103: the first byte of a COM file is always loaded at offset 0x100, and the displacement is measured from the instruction following the 3-byte jump.
On XP there should be debug.exe. Simply start it, begin assembling with 'a', type jmp ff00, and [u]nassemble the result with 'u' if the corresponding hex dump was not shown.
Notice first that your program is necessarily operating system, ABI, and machine instruction set specific. (e.g. it won't run under Linux/x86-64 or Linux/PowerPC)
You could write the machine instructions in C as a sequence of bytes. Which bytes you have to write (i.e. the encoding of the appropriate jump instructions) is left to you.
Of course, that is not portable C. But you could basically do a memcpy with some appropriate source byte zone.
Maybe libraries like asmjit or GNU lightning might inspire you.
You probably cannot use them directly, but studying their code could help you.
See also x86 wikipedia pages for more references.

In ELF or DWARF, how can I get .PLT section values? -- Trying to get the address of a function on where an instrumentation tool is in

I am working on obtaining all the data of a program using its ELF and DWARF info and by hooking a pin tool to a process that is currently running -- It is kind of a debugger using a Pin tool.
For getting the local variables from the stack I am working with the registers EIP, EBP and ESP which I have access to from Pin.
What struck me as weird is that I was expecting EIP to be pointing to the current function that was running when the pin tool was attached to the process, but instead EIP is pointing to the section .PLT. In other words, if the pin tool was hooked into the process when Foo() was running, then I was expecting EIP to be pointing to some address inside the Foo function. However, it is pointing to the beginning of the .PLT section.
What I need to know is which function the process is currently in -- Is there any way to get the address of the function using the .PLT section? Is there any other ways to get the address of the function from the stack or using Pin? I hope I was clear enough, let me know if there are any questions though.
I might not be understanding exactly what is going on here... is the instruction pointer really in the .plt section, or are you just getting a garbage value from Pin?
You name the instruction pointer you are reading EIP, which might be a problem if you are running on a 64-bit system. Is that the case?
You see, the instruction pointer register is a 32-bit value on a 32-bit system and a 64-bit value on a 64-bit system. So Pin actually provides 3 REG_* names for the instruction pointer: EIP, RIP and INST_PTR. EIP is always the lower 32-bit half of the register, RIP the 64-bit value, and INST_PTR one of the two depending on your architecture. Asking for EIP on a 64-bit system gives you garbage, same for asking for RIP on a 32-bit one.
Otherwise, a quick look on Google gives me this. Quoting a bit:
By default the .plt entries are all initialized by the linker not to point to the correct target functions, but instead to point to the dynamic loader itself. Thus, the first time you call any given function, the dynamic loader looks up the function and fixes the target of the .plt so that the next time this .plt slot is used we call the correct function.
And more importantly:
It is possible to instruct the dynamic loader to bind addresses to all of the .plt slots before transferring control to the application—this is done by setting the environment variable LD_BIND_NOW=1 before running the program. This turns out to be useful in some cases when you are debugging a program, for example.
Hope that helps.

Getting instruction given address pointed by the instruction pointer

I am working on code where I need to get the instructions executed by a program, given the instruction pointers. Assume for now that I have a mechanism that provides me the addresses of the instructions; would it be possible to get the opcode from this (on an IA32 instruction set)?
You need an in-memory disassembler, such as BeaEngine or DiStorm; these can be passed a memory address to read from, just make sure the address is readable. If you know the length in bytes of the function, it's a little better to use the Run-Length-Disassemblers also provided on those sites.
If you are looking for hardware supported help, that's not how it works. This needs to be done in software. Your code needs a table of opcodes and instructions and just has to perform a lookup.
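To make the "table plus lookup" idea concrete, here is a deliberately tiny sketch. It knows only a handful of one-byte IA-32 opcodes and assumes 32-bit mode with no prefixes; a real disassembler also has to handle prefixes, ModRM/SIB bytes, and multi-byte opcodes. The names opcode_info, OPCODES and lookup_opcode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table row: first opcode byte, mnemonic, total instruction length. */
struct opcode_info {
    uint8_t     opcode;
    const char *mnemonic;
    uint8_t     length;   /* in bytes, assuming 32-bit operand size */
};

/* A very small subset of one-byte IA-32 opcodes. */
static const struct opcode_info OPCODES[] = {
    {0x55, "push ebp",   1},
    {0x90, "nop",        1},
    {0xC3, "ret",        1},
    {0xC9, "leave",      1},
    {0xCC, "int3",       1},
    {0xE8, "call rel32", 5},
    {0xE9, "jmp rel32",  5},
    {0xEB, "jmp rel8",   2},
};

/* Return the table entry for the instruction starting at code,
 * or NULL if its first byte is not in the table. */
const struct opcode_info *lookup_opcode(const uint8_t *code)
{
    assert(code != NULL);
    for (size_t i = 0; i < sizeof OPCODES / sizeof OPCODES[0]; i++) {
        if (OPCODES[i].opcode == code[0])
            return &OPCODES[i];
    }
    return NULL;
}
```

Given an instruction pointer value, you would cast it to const uint8_t * (after making sure the memory is readable) and call lookup_opcode on it; the length field tells you where the next instruction starts.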
What you describe is known as disassembly. There are many open source disassemblers and if you could use one of those it would make your task very simple. Look here: http://en.wikibooks.org/wiki/X86_Disassembly/Disassemblers_and_Decompilers
