Return from Shellcode instead of exit - c

I'm working on a shared mem assignment for uni and I've 'borrowed' and 'massaged' some shellcode I've seen in posts I've read here and other places. I've been able to construct an example that runs on my MacBook (Mojave) and almost does what I want.
However, since I'm not that experienced with assembly programming in an OS environment (MacOS in this case), and because I don't fully understand the assembly below, I need a little help to overcome my last issue.
In my C boilerplate wrapper, I have a loop that calls my shellcode periodically, but the loop only executes one iteration. This leads me to the conclusion that the second syscall (in the code below) is performing an exit 0, thus terminating the process.
How can I modify the assembly to return instead of exit?
Note if asked, I can post more information about the wrapper code and tools I'm using.
bits 64
Section .text
global start
start:
jmp short MESSAGE ;00000000 EB24 jmp short 0x26
GOBACK:
mov eax, 0x2000004 ;00000002 B804000002 mov eax,0x2000004 ; write
mov edi, 0x1 ;00000007 BF01000000 mov edi,0x1
lea rsi, [rel msg] ;0000000C 488D3518000000 lea rsi,[rel 0x2b]
mov edx, 0xf ;00000013 BA0F000000 mov edx,0xf
syscall ;00000018 0F05 syscall
mov eax,0x2000001 ;0000001A B801000002 mov eax,0x2000001 ;exit
mov edi,0x0 ;0000001F BF00000000 mov edi,0x0
syscall ;00000024 0F05 syscall
MESSAGE:
call GOBACK
msg: db "Hello, world!", 0dh, 0ah
.len: equ $ - msg
Here is my C boilerplate code, which includes the shellcode from above.
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
const char shellcode[] = "\xeb\x24\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\x48\x8d\x35\x18\x00\x00\x00\xba\x0f\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xd7\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x20\x77\x6f\x72\x6c\x64\x21\x0d\x0a";
int main(int argc, char **argv)
{
void *mem = mmap(0, sizeof(shellcode), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
memcpy(mem, shellcode, sizeof(shellcode));
mprotect(mem, sizeof(shellcode), PROT_READ|PROT_WRITE|PROT_EXEC);
for (int i = 0; i < 10; i++) {
int (*func)();
func = (int (*)())mem;
(int)(*func)();
sleep(5);
}
munmap(mem, sizeof(shellcode));
return 0;
}

I have a loop that calls my shellcode
So if it's an actual call, you are good to go: the only change you need is to remove the last 3 instructions (mov, mov and syscall) and replace them with a ret, since the return address will already be on the stack. But we additionally need to clean up the stack a little. The call GOBACK at MESSAGE pushed the address of the text string, and because the code loads rsi with lea instead of popping it, that address is still sitting on top of the return address. We want to get rid of it before the return.
Also, since your shellcode contains relative offsets (the first jmp opcode with the value of 0x24, and the call back to GOBACK), it's safer not to delete anything (as that would change the offsets) but to replace the bytes with NOPs. If you modify the shellcode's source code and regenerate it every time, then it's fine to remove stuff.
So what I would actually propose is to replace the bytes that make up the last 3 instructions with 0x90 (NOP), and then replace the last 2 of those bytes with pop rax / ret, i.e. bytes 0x58 and 0xc3. The final shellcode could look like this:
const char shellcode[] = "\xeb\x24\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\x48\x8d\x35\x18\x00\x00\x00\xba\x0f\x00\x00\x00\x0f\x05\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x58\xc3\xe8\xd7\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x20\x77\x6f\x72\x6c\x64\x21\x0d\x0a";

Related

Why do GDB and different assemblers give different and wrong jmp addresses?

I wanted to run this assembly jmp 0x8048540 in the C code (below) to run a function located at memory address 0x8048540. But I got seg fault. I decided to see where I went wrong...
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#define AMOUNT_OF_STUFF 10
//TODO: Ask IT why this is here
void win(){
system("/bin/cat ./flag.txt");
}
void vuln(){
char * stuff = (char *)mmap(NULL, AMOUNT_OF_STUFF, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
if(stuff == MAP_FAILED){
printf("Failed to get space. Please talk to admin\n");
exit(0);
}
printf("Give me %d bytes:\n", AMOUNT_OF_STUFF);
fflush(stdout);
int len = read(STDIN_FILENO, stuff, AMOUNT_OF_STUFF);
if(len == 0){
printf("You didn't give me anything :(");
exit(0);
}
void (*func)() = (void (*)())stuff;
func();
}
int main(int argc, char*argv[]){
printf("My mother told me to never accept things from strangers\n");
printf("How bad could running a couple bytes be though?\n");
fflush(stdout);
vuln();
return 0;
}
This is the function at the address:
Dump of assembler code for function win:
0x08048540 <+0>: push %ebp
0x08048541 <+1>: mov %esp,%ebp
0x08048543 <+3>: sub $0x8,%esp
0x08048546 <+6>: sub $0xc,%esp
0x08048549 <+9>: push $0x8048700
0x0804854e <+14>: call 0x80483f0 <system@plt>
0x08048553 <+19>: add $0x10,%esp
0x08048556 <+22>: leave
0x08048557 <+23>: ret
End of assembler dump.
I noticed that the opcodes my assemblers gave me were inconsistent. The jump addresses they gave me were also different from the intended address of 0x8048540.
According to defuse.ca for x86, my string literal is \xE9\x3C\x85\x04\x08. The address I see is 0x804853C
However, according to rasm2 for x86, my string literal is \xe9\x3b\x85\x04\x08. The address I see is 0x804853B
1st Qn: Why are the addresses different from my intended address and so different from each other? They were both supposed to give opcode for x86.
Nevertheless, I just decided to go with rasm2's opcode.
Then, I noticed something weird in GDB. (Note: the read() call reads 10 bytes to the memory address 0xf7fd3000.)
(gdb) x/8x 0xf7fd3000
0xf7fd3000: 0xe9 0x3b 0x85 0x04 0x08 0x00 0x00 0x00
Seems all well and good so far. The value in the memory address matches the string literal given by rasm2.
Then I decided to see the memory in terms of instructions:
(gdb) x/2i 0xf7fd3000
0xf7fd3000: jmp 0x1b540
0xf7fd3005: add BYTE PTR [eax],al
Woah. Why jump to address 0x1b540?? Could it just be a visual error?
So I ran it.
But GDB REALLY jumped to that address!
(gdb) si
0x0001b540 in ?? ()
=> 0x0001b540: Cannot access memory at address 0x1b540
I thought perhaps I made a mistake. Perhaps jmp 0x8048540 is illegal. But, according to this source, jmp accepts 32 bit pointers.
2nd Qn: Why is GDB giving me such a ridiculous address?
Could someone kindly enlighten me the reason behind the different addresses? All I want is just to jump to 0x8048540. defuse.ca gave me 0x804853C, rasm2 gave me 0x804853B, and GDB gave me 0x1b540. T.T
Thank you.
FYI, this is from Shells challenge in PicoCTF 2017.
The machine code for "jmp 0x8048540" is the input.
That's wrong:
There are different kinds of jmp instructions (like jmp ecx which takes the destination address from the ecx register) on x86 CPUs.
The jump instructions (jmp, call, je, jae ...) which take an immediate value however are PC-relative:
The destination address of the jump is calculated by the formula:
argument of "jmp" + address of the next instruction
So the following code:
0x12340000 E9 00 00 01 00
Disassembles to:
0x12340000 jmp 0x12350005
This is calculated the following way:
The jmp instruction is 5 bytes long and it is located at address 0x12340000. So the next instruction (the instruction following jmp) is located at 0x12340005.
The argument of jmp is 0x10000 and 0x12340005 + 0x10000 = 0x12350005.
And of course: The instruction will not only disassemble like this but also jump to 0x12350005.

Executed shellcode terminates main program

I am trying to execute shellcode in a memory region. While it works so far, I am confronted with another problem right now: The main-c-program exits after I called the shellcode-program. Is there a (simple) way around this other than working with threads?
I think that this has something to do with the mov rax, 60 and the following syscall, exiting the program. Right?
Main-C-Code
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
const char shellcode[] = "\xeb\x1e\xb8\x01\x00\x00\x00\xbf\x01\x00\x00\x00\x5e\xba\x0d\x00\x00\x00\x0f\x05\xb8\x3c\x00\x00\x00\xbf\x00\x00\x00\x00\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x2c\x20\x57\x6f\x72\x6c\x64\x21";
// Error checking omitted for expository purposes
int main(int argc, char **argv)
{
// Allocate some read-write memory
void *mem = mmap(0, sizeof(shellcode), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
// Copy the shellcode into the new memory
memcpy(mem, shellcode, sizeof(shellcode));
// Make the memory executable (it stays readable and writable)
mprotect(mem, sizeof(shellcode), PROT_READ|PROT_WRITE|PROT_EXEC);
// Call the shellcode
void (*func)();
func = (void (*)())mem;
(void)(*func)();
// This text will never appear
printf("This text never appears");
// Now, if we managed to return here, it would be prudent to clean up the memory:
// (I think that this line of code is also never reached)
munmap(mem, sizeof(shellcode));
return 0;
}
Basis of the Shellcode (assembler (Intel))
global _start
_start:
jmp message
code:
mov rax, 1
mov rdi, 1
pop rsi
mov rdx, 13
syscall
mov rax, 60
mov rdi, 0
syscall
message:
call code
db "Hello, World!"
IMO the simplest way would be to make a binary file, then exec() that. If you need output from it, set up pipes.
I actually found it out by myself. If anyone is interested, the simple solution was to alter the assembler-code as follows:
global _start
_start:
jmp message
code:
mov rax, 1
mov rdi, 1
pop rsi
mov rdx, 13
syscall
ret ; instead of "mov ..., mov ..., syscall"
message:
call code
db "Hello, World!"

Is it practical to create a C language addon for anonymous functions?

I know that C compilers are capable of taking standalone code, and generate standalone shellcode out of it for the specific system they are targetting.
For example, given the following in anon.c:
int give3() {
return 3;
}
I can run
gcc anon.c -o anon.obj -c
objdump -D anon.obj
which gives me (on MinGW):
anon.obj: file format pe-i386
Disassembly of section .text:
00000000 <_give3>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: b8 03 00 00 00 mov $0x3,%eax
8: 5d pop %ebp
9: c3 ret
a: 90 nop
b: 90 nop
So I can make main like this:
main.c
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv)
{
uint8_t shellcode[] = {
0x55,
0x89, 0xe5,
0xb8, 0x03, 0x00, 0x00, 0x00,
0x5d, 0xc3,
0x90,
0x90
};
int (*p_give3)() = (int (*)())shellcode;
printf("%d.\n", (*p_give3)());
}
My question is, is it practical to automate the process of converting the self contained anonymous function that does not refer to anything that is not within its scope or in arguments?
eg:
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv)
{
uint8_t shellcode[] = [#[
int anonymous() {
return 3;
}
]];
int (*p_give3)() = (int (*)())shellcode;
printf("%d.\n", (*p_give3)());
}
Which would compile the text into shellcode, and place it into the buffer?
The reason I ask is that I really like writing C, but writing pthreads code and callbacks is incredibly painful. As soon as you go one step above C to get the notion of "lambdas", you lose your language's ABI (e.g. C++ has lambdas, but everything you do in C++ is suddenly implementation-dependent), and "Lisp-like" scripting add-ons (e.g. plugging in Lisp, Perl, JavaScript/V8, or any other runtime that already knows how to generalize callbacks) make callbacks very easy, but also much more expensive than tossing shellcode around.
If this is practical, then it is possible to put functions which are only called once into the body of the function calling it, thus reducing global scope pollution. It also means that you do not need to generate the shellcode manually for each system you are targetting, since each system's C compiler already knows how to turn self contained C into assembly, so why should you do it for it, and ruin readability of your own code with a bunch of binary blobs.
So the question is: is this practical(for functions which are perfectly self contained, eg even if they want to call puts, puts has to be given as an argument or inside a hash table/struct in an argument)? Or is there some issue preventing this from being practical?
Apple has implemented a very similar feature in clang, where it's called "blocks". Here's a sample:
int main(int argc, char **argv)
{
int (^blk_give3)(void) = ^(void) {
return 3;
};
printf("%d.\n", blk_give3());
return 0;
}
More information:
Clang: Language Specification for Blocks
Wikipedia: Blocks (C language extension)
I know that C compilers are capable of taking standalone code, and generate standalone shellcode out of it for the specific system they are targeting.
Turning source into machine code is what compilation is. Shellcode is machine code with specific constraints, none of which apply to this use-case. You just want ordinary machine code like compilers generate when they compile functions normally.
AFAICT, what you want is exactly what you get from static foo(int x){ ...; }, and then passing foo as a function pointer. i.e. a block of machine code with a label attached, in the code section of your executable.
Jumping through hoops to get compiler-generated machine code into an array is not even close to worth the portability downsides (esp. in terms of making sure the array is in executable memory).
It seems the only thing you're trying to avoid is having a separately-defined function with its own name. That's an incredibly small benefit that doesn't come close to justifying doing anything like you're suggesting in the question. AFAIK, there's no good way to achieve it in ISO C11, but:
Some compilers support nested functions as a GNU extension:
This compiles (with gcc6.2). On Godbolt, I used -xc to compile it as C, not C++. It also compiles with ICC17, but not clang3.9.
#include <stdlib.h>
void sort_integers(int *arr, size_t len)
{
int bar(){return 3;} // gcc warning: ISO C forbids nested functions [-Wpedantic]
int cmp(const void *va, const void *vb) {
const int *a=va, *b=vb; // taking const int* args directly gives a warning, which we could silence with a cast
return *a > *b;
}
qsort(arr, len, sizeof(int), cmp);
}
The asm output is:
cmp.2286:
mov eax, DWORD PTR [rsi]
cmp DWORD PTR [rdi], eax
setg al
movzx eax, al
ret
sort_integers:
mov ecx, OFFSET FLAT:cmp.2286
mov edx, 4
jmp qsort
Notice that no definition for bar() was emitted, because it's unused.
Programs with nested functions built without optimization will have executable stacks. (For reasons explained below). So if you use this, make sure you use optimization if you care about security.
BTW, nested functions can even access variables in their parent (like lambdas). Changing cmp into a function that does return len results in this highly surprising asm:
__attribute__((noinline))
void call_callback(int (*cb)()) {
cb();
}
void foo(int *arr, size_t len) {
int access_parent() { return len; }
call_callback(access_parent);
}
## gcc5.4
access_parent.2450:
mov rax, QWORD PTR [r10]
ret
call_callback:
xor eax, eax
jmp rdi
foo:
sub rsp, 40
mov eax, -17599
mov edx, -17847
lea rdi, [rsp+8]
mov WORD PTR [rsp+8], ax
mov eax, OFFSET FLAT:access_parent.2450
mov QWORD PTR [rsp], rsi
mov QWORD PTR [rdi+8], rsp
mov DWORD PTR [rdi+2], eax
mov WORD PTR [rdi+6], dx
mov DWORD PTR [rdi+16], -1864106167
call call_callback
add rsp, 40
ret
I just figured out what this mess is about while single-stepping it: Those MOV-immediate instructions are writing machine-code for a trampoline function to the stack, and passing that as the actual callback.
gcc must ensure that the ELF metadata in the final binary tells the OS that the process needs an executable stack (note readelf -l shows GNU_STACK with RWE permissions). So nested functions that access outside their scope prevent the whole process from having the security benefits of NX stacks. (With optimization disabled, this still affects programs that use nested functions that don't access stuff from outer scopes, but with optimization enabled gcc realizes that it doesn't need the trampoline.)
The trampoline (from gcc5.2 -O0 on my desktop) is:
0x00007fffffffd714: 41 bb 80 05 40 00 mov r11d,0x400580 # address of access_parent.2450
0x00007fffffffd71a: 49 ba 10 d7 ff ff ff 7f 00 00 movabs r10,0x7fffffffd710 # address of `len` in the parent stack frame
0x00007fffffffd724: 49 ff e3 rex.WB jmp r11
# This can't be a normal rel32 jmp, and indirect is the only way to get an absolute near jump in x86-64.
0x00007fffffffd727: 90 nop
0x00007fffffffd728: 00 00 add BYTE PTR [rax],al
...
(trampoline might not be the right terminology for this wrapper function; I'm not sure.)
This finally makes sense, because r10 is normally clobbered without saving by functions. There's no register that foo could set that would be guaranteed to still have that value when the callback is eventually called.
The x86-64 SysV ABI says that r10 is the "static chain pointer", but C/C++ don't use that. (Which is why r10 is treated like r11, as a pure scratch register).
Obviously a nested function that accesses variables in the outer scope can't be called after the outer function returns. e.g. if call_callback held onto the pointer for future use from other callers, you would get bogus results. When the nested function doesn't do that, gcc doesn't do the trampoline thing, so the function works just like a separately-defined function, so it would be a function pointer you could pass around arbitrarily.
It seems possible, but unnecessarily complicated:
anon.c
int anon() { return 3; }
main.c
...
uint8_t shellcode[] = {
#include "anon.shell"
};
int (*p_give3)() = (int (*)())shellcode;
printf("%d.\n", (*p_give3)());
makefile:
anon.shell:
gcc anon.c -o anon.obj -c; objdump -D anon.obj | extractShellBytes.py anon.shell
Where extractShellBytes.py is a script you write which prints only the raw comma-separated code bytes from the objdump output.

Scan from stdin and print to stdout using inline assembly in gcc

How to read from stdin and write to stdout in inline assembly gcc, just like we do it in NASM:
_start:
mov ecx, buffer ;buffer is a data word initialised 0h in section .data
mov edx, 03
mov eax, 03 ;read
mov ebx, 00 ;stdin
int 0x80
;Output the number entered
mov eax, 04 ;write
mov ebx, 01 ;stdout
int 0x80
I tried reading from stdin in inline assembly and then assign the input to x:
#include<stdio.h>
int x;
int main()
{
asm("movl $5, %%edx \n\t"
"movl $0, %%ebx \n\t"
"movl $3, %%eax \n\t"
"int $0x80 \n\t"
"mov %%ecx, x"
::: "%eax", "%ebx", "%ecx", "%edx");
printf("%d",x);
return 0;
}
However it fails to do so.
The question "syscall from within GCC inline assembly" contains code that is only able to print a single character to stdout.
This code is based solely on my reading of linux references. I'm not on linux, so I cannot test it, but it should be pretty close. I would test it using redirection: a.out < foo.txt
#include <stdio.h>
#include <unistd.h> /* ssize_t, STDIN_FILENO */
#define SYS_READ 3
int main()
{
char buff[10]; /* Declare a buff to hold the returned string. */
ssize_t charsread; /* The number of characters returned. */
/* Use constraints to move buffer size into edx, stdin handle number
into ebx, address of buff into ecx. Also, "0" means this value
goes into the same place as parameter 0 (charsread). So eax will
hold SYS_READ (3) on input, and charsread on output. Lastly, you
MUST use the "memory" clobber since you are changing the contents
of buff without any of the constraints saying that you are.
This is a much better approach than doing the "mov" statements
inside the asm. For one thing, since gcc will be moving the
values into the registers, it can RE-USE them if you make a
second call to read more chars. */
asm volatile("int $0x80" /* Call the syscall interrupt. */
: "=a" (charsread)
: "0" (SYS_READ), "b" (STDIN_FILENO), "c" (buff), "d" (sizeof(buff))
: "memory", "cc");
printf("%d: %.*s", (int)charsread, charsread > 0 ? (int)charsread : 0, buff); /* buff is not NUL-terminated, so bound the print */
return 0;
}
Responding to Aanchal Dalmia's comments below:
1) As Timothy says below, even if you aren't using the return value, you must let gcc know that the ax register is being modified. In other words, it isn't safe to remove the "=a" (charsread), even if it appears to work.
2) I was really confused by your observation that this code wouldn't work unless buff was global. Now that I have a linux install to play with, I was able to reproduce the error and I suspect I know the problem. I'll bet you are using the int 0x80 on an x64 system. That's not how you are supposed to call the kernel in 64bit.
Here is some alternate code that shows how to do this call in x64. Note that the function number and the registers have changed from the example above (see http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64):
#include <stdio.h>
#define SYS_READ 0
#define STDIN_FILENO 0
int main()
{
char buff[10]; /* Declare a buff to hold the returned string. */
ssize_t charsread; /* The number of characters returned. */
/* Use constraints to move buffer size into rdx, stdin handle number
into rdi, address of buff into rsi. Also, "0" means this value
goes into the same place as parameter 0 (charsread). So eax will
hold SYS_READ on input, and charsread on output. Lastly, I
use the "memory" clobber since I am changing the contents
of buff without any of the constraints saying that I am.
This is a much better approach than doing the "mov" statements
inside the asm. For one thing, since gcc will be moving the
values into the registers, it can RE-USE them if you make a
second call to read more chars. */
asm volatile("syscall" /* Make the syscall. */
: "=a" (charsread)
: "0" (SYS_READ), "D" (STDIN_FILENO), "S" (buff), "d" (sizeof(buff))
: "rcx", "r11", "memory", "cc");
printf("%d: %.*s", (int)charsread, charsread > 0 ? (int)charsread : 0, buff); /* buff is not NUL-terminated, so bound the print */
return 0;
}
It's going to take a better linux expert than me to explain why the int 0x80 on x64 wouldn't work with stack variables. But using syscall does work, and syscall is faster on x64 than int.
Edit: It has been pointed out to me that the kernel clobbers rcx and r11 during syscalls. Failing to account for this can cause all sorts of problems, so I have added them to the clobber list.

How to write a buffer-overflow exploit in GCC,windows XP,x86?

void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret) += 8;//why is it 8??
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
The above demo is from here:
http://insecure.org/stf/smashstack.html
But it's not working here:
D:\test>gcc -Wall -Wextra hw.cpp && a.exe
hw.cpp: In function `void function(int, int, int)':
hw.cpp:6: warning: unused variable 'buffer2'
hw.cpp: At global scope:
hw.cpp:4: warning: unused parameter 'a'
hw.cpp:4: warning: unused parameter 'b'
hw.cpp:4: warning: unused parameter 'c'
1
And I don't understand why it's 8 though the author thinks:
A little math tells us the distance is
8 bytes.
My gdb dump as called:
Dump of assembler code for function main:
0x004012ee <main+0>: push %ebp
0x004012ef <main+1>: mov %esp,%ebp
0x004012f1 <main+3>: sub $0x18,%esp
0x004012f4 <main+6>: and $0xfffffff0,%esp
0x004012f7 <main+9>: mov $0x0,%eax
0x004012fc <main+14>: add $0xf,%eax
0x004012ff <main+17>: add $0xf,%eax
0x00401302 <main+20>: shr $0x4,%eax
0x00401305 <main+23>: shl $0x4,%eax
0x00401308 <main+26>: mov %eax,0xfffffff8(%ebp)
0x0040130b <main+29>: mov 0xfffffff8(%ebp),%eax
0x0040130e <main+32>: call 0x401b00 <_alloca>
0x00401313 <main+37>: call 0x4017b0 <__main>
0x00401318 <main+42>: movl $0x0,0xfffffffc(%ebp)
0x0040131f <main+49>: movl $0x3,0x8(%esp)
0x00401327 <main+57>: movl $0x2,0x4(%esp)
0x0040132f <main+65>: movl $0x1,(%esp)
0x00401336 <main+72>: call 0x4012d0 <function>
0x0040133b <main+77>: movl $0x1,0xfffffffc(%ebp)
0x00401342 <main+84>: mov 0xfffffffc(%ebp),%eax
0x00401345 <main+87>: mov %eax,0x4(%esp)
0x00401349 <main+91>: movl $0x403000,(%esp)
0x00401350 <main+98>: call 0x401b60 <printf>
0x00401355 <main+103>: leave
0x00401356 <main+104>: ret
0x00401357 <main+105>: nop
0x00401358 <main+106>: add %al,(%eax)
0x0040135a <main+108>: add %al,(%eax)
0x0040135c <main+110>: add %al,(%eax)
0x0040135e <main+112>: add %al,(%eax)
End of assembler dump.
Dump of assembler code for function function:
0x004012d0 <function+0>: push %ebp
0x004012d1 <function+1>: mov %esp,%ebp
0x004012d3 <function+3>: sub $0x38,%esp
0x004012d6 <function+6>: lea 0xffffffe8(%ebp),%eax
0x004012d9 <function+9>: add $0xc,%eax
0x004012dc <function+12>: mov %eax,0xffffffd4(%ebp)
0x004012df <function+15>: mov 0xffffffd4(%ebp),%edx
0x004012e2 <function+18>: mov 0xffffffd4(%ebp),%eax
0x004012e5 <function+21>: movzbl (%eax),%eax
0x004012e8 <function+24>: add $0x5,%al
0x004012ea <function+26>: mov %al,(%edx)
0x004012ec <function+28>: leave
0x004012ed <function+29>: ret
In my case the distance should be - = 5, right? But it doesn't seem to work.
Why does function need 56 bytes for local variables? (sub $0x38,%esp)
As joveha pointed out, the value of EIP saved on the stack (return address) by the call instruction needs to be incremented by 7 bytes (0x00401342 - 0x0040133b = 7) in order to skip the x = 1; instruction (movl $0x1,0xfffffffc(%ebp)).
You are correct that 56 bytes are being reserved for local variables (sub $0x38,%esp), so the missing piece is how many bytes past buffer1 on the stack is the saved EIP.
A bit of test code and inline assembly tells me that the magic value is 28 for my test. I cannot provide a definitive answer as to why it is 28, but I would assume the compiler is adding padding and/or stack canaries.
The following code was compiled using GCC 3.4.5 (MinGW) and tested on Windows XP SP3 (x86).
unsigned long get_ebp() {
__asm__("pop %ebp\n\t"
"movl %ebp,%eax\n\t"
"push %ebp\n\t");
}
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
/* distance in bytes from buffer1 to return address on the stack */
printf("test %lu\n", ((get_ebp() + 4) - (unsigned long)&buffer1));
ret = (int *)(buffer1 + 28);
(*ret) += 7;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
I could have just as easily used gdb to determine this value.
(compiled w/ -g to include debug symbols)
(gdb) break function
...
(gdb) run
...
(gdb) p $ebp
$1 = (void *) 0x22ff28
(gdb) p &buffer1
$2 = (char (*)[5]) 0x22ff10
(gdb) quit
(0x22ff28 + 4) - 0x22ff10 = 28
(ebp value + size of word) - address of buffer1 = number of bytes
In addition to Smashing The Stack For Fun And Profit, I would suggest reading some of the articles I mentioned in my answer to a previous question of yours and/or other material on the subject. Having a good understanding of exactly how this type of exploit works should help you write more secure code.
It's hard to predict what buffer1 + 12 really points to. Your compiler can put buffer1 and buffer2 in any location on the stack it desires, even going as far as to not save space for buffer2 at all. The only way to really know where buffer1 goes is to look at the assembler output of your compiler, and there's a good chance it would jump around with different optimization settings or different versions of the same compiler.
I have not tested the code on my own machine yet, but have you taken memory alignment into consideration?
Try disassembling the code with gcc. I think the assembly code may give you a better understanding of what is going on. :-)
This code prints out 1 as well on OpenBSD and FreeBSD, and gives a segmentation fault on Linux.
This kind of exploit is heavily dependent on both the instruction set of the particular machine, and the calling conventions of the compiler and operating system. Everything about the layout of the stack is defined by the implementation, not the C language. The article assumes Linux on x86, but it looks like you're using Windows, and your system could be 64-bit, although you can switch gcc to 32-bit with -m32.
The parameters you'll have to tweak are 12, which is the offset from the tip of the stack to the return address, and 8, which is how many bytes of main you want to jump over. As the article says, you can use gdb to inspect the disassembly of the function to see (a) how far the stack gets pushed when you call function, and (b) the byte offsets of the instructions in main.
The +8 bytes part is how much he wants the saved EIP to be incremented by. The EIP was saved so the program could return to the last assignment after the function is done; now he wants to skip over it by adding 8 bytes to the saved EIP.
So all he is trying to do is "skip" the
x = 1;
In your case the saved EIP will point to 0x0040133b, the first instruction after function returns. To skip the assignment you need to make the saved EIP point to 0x00401342. That's 7 bytes.
It's really a "mess with RET EIP" example rather than a buffer overflow example.
And as far as the 56 bytes for local variables goes, that could be anything your compiler comes up with like padding, stack canaries, etc.
Edit:
This shows how difficult it is to make buffer overflow examples in C. The offset of 12 from buffer1 assumes a certain padding style and certain compile options. GCC will happily insert stack canaries nowadays (which become a local variable that "protects" the saved EIP) unless you tell it not to. Also, the new address he wants to jump to (the start instruction for the printf call) really has to be resolved manually from the assembly. In his case, on his machine, with his OS, with his compiler, on that day... it was 8.
You're compiling a C program with the C++ compiler. Rename hw.cpp to hw.c and you'll find it will compile.

Resources