I'm using libcurl in my program and running into a segfault. Before filing a bug with the curl project, I thought I'd do a little debugging. What I found seemed very odd to me, and I haven't been able to make sense of it yet.
First, the segfault traceback:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe77f6700 (LWP 592)]
0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff5bc29e5 in x509_name_oneline (a=0x7fffe3d9c3c0,
buf=0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%", size=255) at ssluse.c:629
#2 0x00007ffff5bc2a6f in cert_verify_callback (ok=1, ctx=0x7fffe77f50b0)
at ssluse.c:645
#3 0x00007ffff72c9a80 in ?? () from /lib/libcrypto.so.0.9.8
#4 0x00007ffff72ca430 in X509_verify_cert () from /lib/libcrypto.so.0.9.8
#5 0x00007ffff759af58 in ssl_verify_cert_chain () from /lib/libssl.so.0.9.8
#6 0x00007ffff75809f3 in ssl3_get_server_certificate ()
from /lib/libssl.so.0.9.8
#7 0x00007ffff7583e50 in ssl3_connect () from /lib/libssl.so.0.9.8
#8 0x00007ffff5bc48f0 in ossl_connect_step2 (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:1724
#9 0x00007ffff5bc700f in ossl_connect_common (conn=0x7fffe315e9a8,
sockindex=0, nonblocking=false, done=0x7fffe77f543f) at ssluse.c:2498
#10 0x00007ffff5bc7172 in Curl_ossl_connect (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:2544
#11 0x00007ffff5ba76b9 in Curl_ssl_connect (conn=0x7fffe315e9a8, sockindex=0)
...
The call to memcpy looks like this:
memcpy(buf, biomem->data, size);
(gdb) p buf
$46 = 0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p biomem->data
$47 = 0x7fffe3e1ef60 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p size
$48 = 255
If I go up a frame, I see that the pointer passed in for buf came from a local variable defined in the calling function:
char buf[256];
Here's where it starts to get weird. I can manually inspect all 256 bytes of both buf and biomem->data without gdb complaining that the memory isn't accessible. I can also manually write all 256 bytes of buf using the gdb set command, without any error. So if all the memory involved is readable and writable, why does memcpy fail?
Also interesting is that I can use gdb to manually call memcpy with the pointers involved. As long as I pass a size <= 160, it runs without a problem. As soon as I pass 161 or higher, gdb gets a sigsegv. I know buf is larger than 160, because it was created on the stack as an array of 256. biomem->data is a little harder to figure, but I can read well past byte 160 with gdb.
I should also mention that this function (or rather the curl method I call that leads to this) completes successfully many times before the crash. My program uses curl to repeatedly call a web service API while it runs. It calls the API every five seconds or so, and runs for about 14 hours before it crashes. It's possible that something else in my app is writing out of bounds and stomping on something that creates the error condition. But it seems suspicious that it crashes at exactly the same point every time, although the timing varies. And all the pointers seem ok in gdb, but memcpy still fails. Valgrind doesn't find any bounds errors, but I haven't let my program run with valgrind for 14 hours.
Within memcpy itself, the disassembly looks like this:
(gdb) x/20i $rip-10
0x7ffff6a2ea52 <memcpy+242>: jbe 0x7ffff6a2ea74 <memcpy+276>
0x7ffff6a2ea54 <memcpy+244>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea58 <memcpy+248>: je 0x7ffff6a2ea90 <memcpy+304>
0x7ffff6a2ea5a <memcpy+250>: dec %ecx
=> 0x7ffff6a2ea5c <memcpy+252>: mov (%rsi),%rax
0x7ffff6a2ea5f <memcpy+255>: mov 0x8(%rsi),%r8
0x7ffff6a2ea63 <memcpy+259>: mov 0x10(%rsi),%r9
0x7ffff6a2ea67 <memcpy+263>: mov 0x18(%rsi),%r10
0x7ffff6a2ea6b <memcpy+267>: mov %rax,(%rdi)
0x7ffff6a2ea6e <memcpy+270>: mov %r8,0x8(%rdi)
0x7ffff6a2ea72 <memcpy+274>: mov %r9,0x10(%rdi)
0x7ffff6a2ea76 <memcpy+278>: mov %r10,0x18(%rdi)
0x7ffff6a2ea7a <memcpy+282>: lea 0x20(%rsi),%rsi
0x7ffff6a2ea7e <memcpy+286>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea82 <memcpy+290>: jne 0x7ffff6a2ea30 <memcpy+208>
0x7ffff6a2ea84 <memcpy+292>: data32 data32 nopw %cs:0x0(%rax,%rax,1)
0x7ffff6a2ea90 <memcpy+304>: and $0x1f,%edx
0x7ffff6a2ea93 <memcpy+307>: mov -0x8(%rsp),%rax
0x7ffff6a2ea98 <memcpy+312>: jne 0x7ffff6a2e969 <memcpy+9>
0x7ffff6a2ea9e <memcpy+318>: repz retq
(gdb) info registers
rax 0x0 0
rbx 0x7fffe77f50b0 140737077268656
rcx 0x1 1
rdx 0xff 255
rsi 0x7fffe3e1f000 140737016623104
rdi 0x7fffe77f4f60 140737077268320
rbp 0x7fffe77f4e90 0x7fffe77f4e90
rsp 0x7fffe77f4e48 0x7fffe77f4e48
r8 0x11 17
r9 0x10 16
r10 0x1 1
r11 0x7ffff6a28f7a 140737331236730
r12 0x7fffe3dde490 140737016358032
r13 0x7ffff5bc2a0c 140737316137484
r14 0x7fffe3d69b50 140737015880528
r15 0x0 0
rip 0x7ffff6a2ea5c 0x7ffff6a2ea5c <memcpy+252>
eflags 0x10203 [ CF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p/x $rsi
$50 = 0x7fffe3e1f000
(gdb) x/20x $rsi
0x7fffe3e1f000: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f010: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f020: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f030: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f040: 0x00000000 0x00000000 0x00000000 0x00000000
I'm using libcurl version 7.21.6, c-ares version 1.7.4, and openssl version 1.0.0d. My program is multithreaded, but I have registered mutex callbacks with openssl. The program is running on Ubuntu 11.04 desktop, 64-bit. libc is 2.13.
Clearly libcurl is over-reading the source buffer, and stepping into unreadable memory (page at 0x7fffe3e1f000 -- you can confirm that memory is unreadable by looking at /proc/<pid>/maps for the program being debugged).
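One way to avoid this kind of over-read is to clamp the copy to the number of bytes the source actually holds, rather than to the size of the destination. The following is a minimal sketch, assuming the true source length is available to the caller (for an OpenSSL memory BIO that would presumably come from the same structure as biomem->data). The helper name bounded_copy is hypothetical and this is not libcurl's code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most src_len bytes from src into dst, never
   writing more than dst_size bytes, and always NUL-terminate the result.
   Returns the number of bytes copied (not counting the terminator). */
size_t bounded_copy(char *dst, size_t dst_size,
                    const char *src, size_t src_len)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Leave one byte of room for the terminating NUL. */
    size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With a helper like this, the number of bytes read from the source never exceeds what the source is known to contain, regardless of how large the destination is.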
"Here's where it starts to get weird. I can manually inspect all 256 bytes of both buf and biomem->data without gdb complaining that the memory isn't accessible."
There is a well-known Linux kernel flaw: even for memory that has PROT_NONE (and causes SIGSEGV when the process itself attempts to read it), an attempt by GDB to ptrace(PEEK_DATA,...) succeeds. That explains why you can examine all 256 bytes of the source buffer in GDB, even though only 160 of them (0x7fffe3e1f000 - 0x7fffe3e1ef60 = 0xa0) are actually accessible to the process; the remaining 96 lie in the unreadable page.
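The 160-byte limit observed in the question follows from simple page arithmetic. Here is a small sketch, assuming the usual 4096-byte page size on x86-64 Linux; the function name bytes_until_page_end is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes from addr up to (but not including) the next page boundary.
   page_size must be a power of two; 4096 is an assumption for x86-64 Linux. */
size_t bytes_until_page_end(uintptr_t addr, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return page_size - (size_t)(addr & (page_size - 1));
}
```

For the source pointer in the question, bytes_until_page_end(0x7fffe3e1ef60, 4096) gives 160, which matches the largest size for which the manual memcpy call from gdb still worked.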
Try running your program under Valgrind; chances are it will tell you that you are memcpying from a heap-allocated buffer that is too small.
Is there any possibility of creating a "crumple zone"?
That is, deliberately increasing the size of the two buffers, or, in the case of the structure, putting an extra unused element after the destination?
You then seed the source crumple zone with something such as "0xDEADBEEF", and the destination one with something nice. If the destination ever changes, you've got something to work with.
256 is a bit suggestive. Is there any possibility it could somehow be treated as a signed quantity, becoming -1, and hence very big? I can't see how gdb wouldn't show it, but ...
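The crumple-zone idea can be sketched in C roughly as follows. This is only an illustration: the struct, the guard size, and the function names are all hypothetical, and the guard is checked explicitly by the program rather than by any tool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GUARD_WORDS   4
#define GUARD_PATTERN 0xDEADBEEFu

/* Hypothetical wrapper: the real buffer followed by an unused guard area. */
struct guarded_buf {
    char     data[256];
    uint32_t guard[GUARD_WORDS];
};

/* Clear the buffer and seed the guard area with a recognizable pattern. */
void guarded_init(struct guarded_buf *g)
{
    assert(g != NULL);
    memset(g->data, 0, sizeof g->data);
    for (int i = 0; i < GUARD_WORDS; i++)
        g->guard[i] = GUARD_PATTERN;
}

/* Returns 1 if the guard area still holds the seed pattern, 0 otherwise. */
int guarded_intact(const struct guarded_buf *g)
{
    assert(g != NULL);
    for (int i = 0; i < GUARD_WORDS; i++)
        if (g->guard[i] != GUARD_PATTERN)
            return 0;
    return 1;
}
```

Calling guarded_intact() at a few checkpoints (or setting a gdb watchpoint on the guard words) narrows down when something writes past the end of data.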
Background
I've built qemu-system-x86_64.exe on a Windows machine using MSYS2 (x86_64), and I'm debugging a segmentation fault that happens when I try to run it.
Actually, I don't think the problem is related to either QEMU or MSYS2; it's a matter of debugging a segmentation fault and possibly wrong code generation.
Debugging the Segmentation Fault
The program crashes with segmentation fault error right at the beginning.
When running with gdb, I found out the following:
Starting program: C:\msys64\home\Administrator\qemu\x86_64-softmmu\qemu-system-x86_64.exe
[New Thread 4656.0x1194]
Program received signal SIGSEGV, Segmentation fault.
0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
535 {
(gdb) bt
#0 0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
#1 0x000000000086dd39 in init_real_host_page_size () at util/pagesize.c:16
#2 0x00000000007ea1b2 in __do_global_ctors ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:67
#3 0x00000000007ea20f in __main ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:83
#4 0x000000000040137f in __tmainCRTStartup ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329
#5 0x00000000004014db in WinMainCRTStartup ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:195
This is strange.
The program crashes when running __do_global_ctors and calling init_real_host_page_size() which calls getpagesize(). These are really simple functions:
uintptr_t qemu_real_host_page_size;
intptr_t qemu_real_host_page_mask;
static void __attribute__((constructor)) init_real_host_page_size(void)
{
qemu_real_host_page_size = getpagesize();
qemu_real_host_page_mask = -(intptr_t)qemu_real_host_page_size;
}
...
int getpagesize(void)
{
SYSTEM_INFO system_info;
GetSystemInfo(&system_info);
return system_info.dwPageSize;
}
getpagesize() crashes right at the beginning of the function, before it even calls GetSystemInfo.
Here is the disassembly of that code fragment and register values:
(gdb) disassem
Dump of assembler code for function getpagesize:
0x00000000007d3250 <+0>: sub $0x68,%rsp
=> 0x00000000007d3254 <+4>: mov %fs:0x0,%rax
0x00000000007d325d <+13>: mov %rax,0x58(%rsp)
0x00000000007d3262 <+18>: xor %eax,%eax
0x00000000007d3264 <+20>: lea 0x20(%rsp),%rcx
0x00000000007d3269 <+25>: callq *0x68e8b9(%rip) # 0xe61b28 <__imp_GetSystemInfo>
0x00000000007d326f <+31>: mov 0x24(%rsp),%eax
0x00000000007d3273 <+35>: mov 0x58(%rsp),%rdx
0x00000000007d3278 <+40>: xor %fs:0x0,%rdx
0x00000000007d3281 <+49>: jne 0x7d3288 <getpagesize+56>
0x00000000007d3283 <+51>: add $0x68,%rsp
0x00000000007d3287 <+55>: retq
0x00000000007d3288 <+56>: callq 0x85bde0 <__stack_chk_fail>
0x00000000007d328d <+61>: nop
End of assembler dump.
(gdb) info registers
rax 0x6f4b868 116701288
rbx 0x86ec10 8842256
rcx 0x6f4b8b8 116701368
rdx 0xe5a780 15050624
rsi 0x86e220 8839712
rdi 0x6f4ad50 116698448
rbp 0x6f4ad10 0x6f4ad10
rsp 0x22fd80 0x22fd80
r8 0x0 0
r9 0x0 0
r10 0x5000016b 1342177643
r11 0x22f9d8 2292184
r12 0x0 0
r13 0x10 16
r14 0x0 0
r15 0x0 0
rip 0x7d3254 0x7d3254 <getpagesize+4>
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
It looks like something is wrong with the memory access mov %fs:0x0,%rax.
Who sets FS to 83?
(gdb) starti
Starting program: C:\msys64\home\Administrator\qemu\x86_64-softmmu\qemu-system-x86_64.exe
[New Thread 3508.0x14b0]
Program stopped.
0x00000000778b6fb1 in ntdll!CsrSetPriorityClass ()
from C:\Windows\SYSTEM32\ntdll.dll
(gdb) p $fs
$1 = 83
(gdb) watch $fs
Watchpoint 1: $fs
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
535 {
No one sets FS!
Questions
GCC generated code that uses an uninitialized register. What could cause that? Was there some initialization code that should have run but didn't?
Any ideas on how I can debug this issue further?
FS is an x86 segment register. These are generally not set by the user program, but instead set by the OS or by the runtime libraries, for various special purposes. For instance on Windows x86-64 GS is used to point to a per-thread data block: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block (and FS is not used).
In this case the problem is a bug in the GCC 8 compiler you are using: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86832
In some situations this compiler generates code that assumes FS has been set up for "native TLS", which is wrong because MINGW does not support "native TLS" and FS is not set to anything useful.
The workaround is to avoid compiling with the -fstack-protector-strong compiler option. For QEMU you can do that by passing configure the flag --disable-stack-protector.
(PS: if you want to know how I identified the cause of this segfault: I googled for 'qemu-devel sigsegv getpagesize', which brings up a mailing list thread where somebody else ran into and reported the bug, the problem was diagnosed and a link to the GCC bug found.)
I was recently experimenting with shared library injection in Linux and decided to write my own program to do it (instead of using say GDB to inject the library).
My program uses ptrace to overwrite the first 0x25 bytes of the loaded program (0x400000-0x400025) with assembly code that allocates space for the filename and calls dlopen. Once all of this is done, it restores the program state and detaches from it.
Here's the assembly:
global inject_library
global nullsub
section .data
section .text
inject_library:
; rdi -> Pointer to malloc()
; rsi -> Pointer to free()
; rdx -> Pointer to dlopen()
; rcx -> Size of the path to the .so to load
; Create a new stack frame
push rbp
; Save rbx because we're using it as scratch space
push rbx
; Save addresses of free & dlopen on the stack
push rsi
push rdx
; Move the pointer to malloc into rbx
mov rbx, rdi
; Move the size of the path as the first argument to malloc
mov rdi, rcx
; Call malloc(so_path_size)
call rbx
; Stop so that we can see what's happening from the injector process
int 0x3
; Move the pointer to dlopen into rbx
pop rbx
; Move the malloc'd space (now containing the path) to rdi for the first argument
mov rdi, rax
; Push rax because it'll be overwritten
push rax
; Second argument to dlopen (RTLD_NOW)
mov rsi, 0x2
; Call dlopen(path_to_library, RTLD_NOW)
call rbx
; Pass control to the injector
int 0x3
; Finally, begin free-ing the malloc'd area
pop rdi
; Get the address of free into rbx
pop rbx
; Call free(path_to_library)
call rbx
; Restore rbx
pop rbx
; Destroy the stack frame
pop rbp
; We're done
int 0x3
retn
nullsub:
retn
There's also a C program which calls this assembly routine and uses ptrace to handle these breakpoints.
This setup works just fine for small, single threaded programs like the following.
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char* argv[]) {
pid_t my_pid = getpid();
printf("PID: %ld\n", (long)my_pid);
getchar();
return 0;
}
I used a simple shared library that just did puts("Hi"); in its constructor. As stated above, everything up to here works perfectly.
However, when I try to inject the same library into a much bigger (external, closed-source) program, I run into a segfault.
Here's the backtrace:
#0 0x00007f6a7985d64d in _dl_relocate_object (scope=0x21fbc08, reloc_mode=reloc_mode@entry=0, consider_profiling=consider_profiling@entry=0)
at dl-reloc.c:259
#1 0x00007f6a79865723 in dl_open_worker (a=a@entry=0x7fff82d7cbf8) at dl-open.c:424
#2 0x00007f6a793cf5d4 in __GI__dl_catch_error (objname=objname@entry=0x7fff82d7cbe8, errstring=errstring@entry=0x7fff82d7cbf0,
mallocedp=mallocedp@entry=0x7fff82d7cbe7, operate=operate@entry=0x7f6a798654c0 <dl_open_worker>, args=args@entry=0x7fff82d7cbf8)
at dl-error-skeleton.c:198
#3 0x00007f6a79865069 in _dl_open (file=0x21fb830 "/home/umang/code/insertion/test_library.so", mode=-2147483646, caller_dlopen=0x40001a, nsid=-2,
argc=<optimized out>, argv=<optimized out>, env=0x7fff82d7cfe8) at dl-open.c:649
#4 0x00007f6a7964ef96 in dlopen_doit (a=a@entry=0x7fff82d7ce08) at dlopen.c:66
#5 0x00007f6a793cf5d4 in __GI__dl_catch_error (objname=objname@entry=0x7f6a798510f0 <last_result+16>,
errstring=errstring@entry=0x7f6a798510f8 <last_result+24>, mallocedp=mallocedp@entry=0x7f6a798510e8 <last_result+8>,
operate=operate@entry=0x7f6a7964ef40 <dlopen_doit>, args=args@entry=0x7fff82d7ce08) at dl-error-skeleton.c:198
#6 0x00007f6a7964f665 in _dlerror_run (operate=operate@entry=0x7f6a7964ef40 <dlopen_doit>, args=args@entry=0x7fff82d7ce08) at dlerror.c:163
#7 0x00007f6a7964f021 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#8 0x000000000040001a in ?? ()
#9 0x00000000021fb830 in ?? ()
#10 0x00007f6a79326a90 in ?? () at malloc.c:3071 from /lib64/libc.so.6
#11 0x00007f6a796488a0 in ?? () from /lib64/libc.so.6
#12 0x0000000000000d68 in ?? ()
#13 0x00007f6a7931e938 in _IO_new_file_underflow (fp=0x7f6a7964efe0 <__dlopen>) at fileops.c:600
#14 0x00007f6a7931fa72 in __GI__IO_default_uflow (fp=0x7f6a796488a0 <_IO_2_1_stdin_>) at genops.c:404
#15 0x00007f6a7931a20d in getchar () at getchar.c:37
#16 0x00000000004005d7 in main ()
This backtrace tells me something went (horribly) wrong in the dlopen call. Specifically, the error lies at glibc dl-reloc.c:259.
Here's the questionable glibc code.
254 l->l_lookup_cache.value = _lr; })) \
255 : l)
256
257 #include "dynamic-link.h"
258
259 ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
260
261 #ifndef PROF
262 if (__glibc_unlikely (consider_profiling)
263 && l->l_info[DT_PLTRELSZ] != NULL)
ELF_DYNAMIC_RELOCATE is a macro defined in dynamic-link.h as the following -
/* This can't just be an inline function because GCC is too dumb
to inline functions containing inlines themselves. */
# define ELF_DYNAMIC_RELOCATE(map, lazy, consider_profile, skip_ifunc) \
do { \
int edr_lazy = elf_machine_runtime_setup ((map), (lazy), \
(consider_profile)); \
ELF_DYNAMIC_DO_REL ((map), edr_lazy, skip_ifunc); \
ELF_DYNAMIC_DO_RELA ((map), edr_lazy, skip_ifunc); \
} while (0)
#endif
elf_machine_runtime_setup returns just fine, so I'm assuming that the problem lies with ELF_DYNAMIC_DO_REL. This is the source for the mentioned macro. The problem here is that the called method is inline, so GDB only displays the macro name and not the underlying source.
Using ni in GDB, I see the following after elf_machine_runtime_setup returns:
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling, skip_ifunc);
Stepping through assembly, the segfault happens after the following instruction: movaps %xmm0,-0x70(%rbp).
info local isn't of much help:
(gdb) info local
ranges = {{start = 140072440991568, size = 0, nrelative = 0, lazy = 670467104}, {start = 0, size = 140072438891376, nrelative = 140072441065920,
lazy = 672664367}}
textrels = 0x0
errstring = 0x0
lazy = <optimized out>
skip_ifunc = 0
Interestingly enough, when I use GDB to inject the shared library (using this code I found somewhere on the net), the library loads perfectly.
sudo gdb -n -q -batch \
-ex "attach $pid" \
-ex "set \$dlopen = (void*(*)(char*, int)) dlopen" \
-ex "call \$dlopen(\"$(pwd)/libexample.so\", 1)" \
-ex "detach" \
-ex "quit"
Thanks in advance!
After days of scratching my head and tearing my hair out, I decided to Google "MOVAPS segfault".
MOVAPS is a SIMD instruction that moves 16 bytes at a time (and here, it is used to quickly zero out a 16-byte chunk of stack memory). Here's some more info about the same.
On taking a closer look, I noticed the following paragraph:
When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) is generated.
Hmm. So I read the value of the offending address.
(gdb) print $rbp - 0x70
$2 = (void *) 0x7ffecd32e838
There. The address isn't aligned to a 16-byte boundary and thus the segfault occurs.
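The alignment check itself is a one-line mask test. A minimal sketch in C, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of alignment, 0 otherwise.
   alignment must be a power of two (16 for MOVAPS memory operands). */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

For the address printed above, is_aligned(0x7ffecd32e838, 16) is 0 because the low four bits are 0x8; the same address minus 8 would pass the check.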
Fixing this was easy.
; Create a new stack frame
push rbp
sub rsp, 0x8
; Do stuff
; Fix the stack pointer
add rsp, 0x8
; Destroy stack frame, return, etc.
I'm still doubtful if this is the right way to do it, but it works.
Oh, and GDB got it right the whole time - it made sure that the stack was aligned.
I have a piece of code that just strcpy()s argv[1] into a 100-byte buffer. After that, for testing purposes, I call exit(0) or exit(1). Nothing else is used. This is what I get back from gdb:
(gdb) i r eip
eip 0x8048455 0x8048455 <main+65>
(gdb) info frame
Stack level 0, frame at 0xbffff260:
eip = 0x8048455 in main (exploitable.c:9); saved eip 0x41414141
source language c.
Arglist at 0xbffff258, args: argc=1094795585, argv=0xbffff304
Locals at 0xbffff258, Previous frame's sp is 0xbffff260
Saved registers:
ebp at 0xbffff258, eip at 0xbffff25c
(gdb) i r eip
eip 0x8048455 0x8048455 <main+65>
(gdb) c
Continuing.
[Inferior 1 (process 2829) exited normally]
Since the saved eip is 0x41414141, why doesn't execution jump to the invalid address 0x41414141 after leaving the current stack frame? I'm sure it has something to do with the exit function, but I can't understand it :/
I know that the explanation is in the following code, but I can't work it out:
=> 0x08048455 <+65>: mov DWORD PTR [esp],0x0
0x0804845c <+72>: call 0x8048350 <exit@plt>
The last line implies that execution goes to the exit function, and I'm not sure whether the line at 0x08048455 sets up the 0 argument that is passed to exit. Doesn't the exit function execute any leave / ret instructions when it runs? After all, the saved eip of the frame that is "just" outside main is overwritten!
The exit function does not return. It calls the functions registered with atexit(), does some cleanup, and then terminates the process with an exit system call. As a result, main's ret instruction is never executed and the overwritten saved EIP is never used.
Use return 1 / return 0 instead of exit(1) / exit(0), if you want to check what happens with your EIP after main() is finished.
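The fact that exit() never returns to its caller can be observed from a parent process. Below is a small sketch for Linux or another POSIX system; the function name child_exit_status and the marker value 99 are chosen only for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls exit(code). The _exit(99) placed after it is never
   reached, so the parent always sees code, never 99.
   Returns the child's exit status, or -1 on error. */
int child_exit_status(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        exit(code);  /* runs atexit handlers, flushes stdio, ends the process */
        _exit(99);   /* unreachable: exit() does not return */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because control never comes back from exit(), nothing after the call runs in the child, and that includes the function epilogue that would pop a saved return address.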
I am working on Ubuntu 12.04 on a 64-bit machine. I was reading a good book on buffer overflows and, while playing with one example, ran into something strange.
I have this really simple C code:
void getInput (void){
char array[8];
gets (array);
printf("%s\n", array);
}
main() {
getInput();
return 0;
}
in the file overflow.c
I compile it with the 32-bit flag because all the examples in the book assume a 32-bit machine. I do it like this:
gcc -fno-stack-protector -g -m32 -o ./overflow ./overflow.c
In the code the char array is only 8 bytes, but looking at the disassembly I found that the array starts 16 bytes away from the saved EBP on the stack, so I executed this line:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x10\x10\x20" | ./overflow
And got:
aaaaaaaaaaaaaaaaaaaa
Segmentation fault (core dumped)
Then I opened core file:
gdb ./overflow core
#0 0x20101010 in ?? ()
(gdb) info registers
eax 0x19 25
ecx 0xffffffff -1
edx 0xf77118b8 -143583048
ebx 0xf770fff4 -143589388
esp 0xffef6370 0xffef6370
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0x20101010 0x20101010
As you can see, EIP did in fact get the new value I wanted. But when I try to put in a useful value such as 0x08048410:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x84\x04\x08" | ./overflow
The program crashes as usual, but then something strange happens when I try to observe the value in the EIP register:
#0 0xf765be1f in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) info registers
eax 0x61616151 1633771857
ecx 0xf77828c4 -143120188
edx 0x1 1
ebx 0xf7780ff4 -143126540
esp 0xff92dffc 0xff92dffc
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0xf765be1f 0xf765be1f
Suddenly EIP looks like 0xf765be1f rather than 0x08048410. In fact, I noticed that it's enough to use any hexadecimal value starting with 0 to get this garbled EIP value. Do you know why this might happen? Is it because I'm on a 64-bit machine?
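For reference, the layout behind the two inputs above is: 16 filler bytes to reach the saved EBP, 4 more to cover the saved EBP itself, and then the 4-byte return address stored least significant byte first. A short sketch of that arithmetic in C (the function name build_input is made up, and the 20-byte padding is simply the figure used in this question, not a general constant):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill out with pad_len filler bytes ('a') followed by addr in little-endian
   byte order. Returns the total length written, or 0 if out is too small. */
size_t build_input(unsigned char *out, size_t out_size,
                   size_t pad_len, uint32_t addr)
{
    assert(out != NULL);
    if (out_size < pad_len + 4)
        return 0;

    memset(out, 'a', pad_len);
    for (int i = 0; i < 4; i++)
        out[pad_len + i] = (unsigned char)(addr >> (8 * i));
    return pad_len + 4;
}
```

With pad_len = 20 and addr = 0x20101010, the last four bytes come out as \x10\x10\x10\x20, which is the same byte sequence as the first printf command.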
UPD
People in the comments asked for more information, so here is the disassembly of the getInput function:
(gdb) disas getInput
Dump of assembler code for function getInput:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: sub $0x28,%esp
0x0804840a <+6>: lea -0x10(%ebp),%eax
0x0804840d <+9>: mov %eax,(%esp)
0x08048410 <+12>: call 0x8048310 <gets@plt>
0x08048415 <+17>: lea -0x10(%ebp),%eax
0x08048418 <+20>: mov %eax,(%esp)
0x0804841b <+23>: call 0x8048320 <puts@plt>
0x08048420 <+28>: leave
0x08048421 <+29>: ret
Perhaps code at 0x08048410 was executed, and jumped to the area of 0xf765be1f.
What's in this address? I guess it's a function (libC?), so you can examine its assembly code and see what it would do.
Also note that in the successful run, you managed to overrun EBP, not EIP. EBP contains 0x61616161, which is aaaa, and EIP contains 0x20101010, which is \n\n\n. It seems like the corrupt EBP indirectly got EIP corrupt.
Try to make the overrun 4 bytes longer, and then it should overrun the return address too.
This is probably because modern operating systems (Linux at least; I don't know about Windows) and modern libc versions have mechanisms that prevent code located on the stack from being executed.
Buffer overflow is invoking undefined behavior, therefore anything can happen. Theorizing what might happen is futile.
This line is causing a segfault for me:
30 printf("st_name:\t%s\n", &p_str_tab[p->st_name]);
I've tried to trace it down in gdb:
(gdb) p p_str_tab[p->st_name]
$11 = 0 '\000'
(gdb) p &p_str_tab[p->st_name]
$12 = 0x2aaaaaab0000 ""
(gdb) x/16s 0x2aaaaaab0000
0x2aaaaaab0000: ""
0x2aaaaaab0001: ".symtab"
0x2aaaaaab0009: ".strtab"
(gdb) call printf("st_name:\t%s\n", 0x2aaaaaab0000)
Program received signal SIGSEGV, Segmentation fault.
0x00000034f4042729 in vfprintf () from /lib64/libc.so.6
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
I can print the memory with gdb's x command, but if I use printf I get a segmentation fault.
Why?
UPDATE as required in comment:
(gdb) x/1i $rip
0x34f4042729 <vfprintf+57>: mov 0xc0(%rdi),%eax
(gdb) info reg
rax 0x54 84
rbx 0x34f3e1bbc0 227429956544
rcx 0x0 0
rdx 0xffffffffffffffb0 -80
rsi 0x401b08 4201224
rdi 0x600908 6293768
rbp 0x7fffffffe6e0 0x7fffffffe6e0
rsp 0x7fffffffe040 0x7fffffffe040
r8 0x2aaaaaabf210 46912496202256
r9 0x34f4351780 227435419520
r10 0x1238 4664
r11 0x648 1608
r12 0x0 0
r13 0x7fffffffe9c0 140737488349632
r14 0x0 0
r15 0x0 0
rip 0x34f4042729 0x34f4042729 <vfprintf+57>
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
---Type <return> to continue, or q <return> to quit---
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]
You might want to check whether you're overflowing the stack.
The faulting instruction mov 0xc0(%rdi),%eax represents something like eax = rdi->member where member is at offset 0xc0. Without seeing more disassembly it's hard to know what that is for sure, but it seems likely that it's stdout or something inside stdout. It's not likely that the faulting instruction is dereferencing your input string.
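To make the "rdi->member at offset 0xc0" reading concrete, here is a small C sketch. The struct layout is entirely hypothetical and is not glibc's actual FILE definition; it only shows how a field access turns into a load at a fixed offset from the pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a field that sits 0xc0 bytes into the object. */
struct demo_stream {
    char other_fields[0xc0];
    int  member;
};

/* The field access below is a 4-byte load from offset 0xc0 of s. On x86-64
   with the System V calling convention, s arrives in %rdi, so the load has
   the same shape as mov 0xc0(%rdi),%eax. */
int read_member(const struct demo_stream *s)
{
    assert(s != NULL);
    return s->member;
}
```

If the pointer passed in is garbage, the load faults at exactly this kind of instruction, regardless of what the format string or its arguments look like.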
Have you done anything unusual to stdout? A brute force approach would be to sprinkle printf calls everywhere (what they print probably doesn't matter) and see where it starts crashing. Just before that is where something got corrupted.
This must be a pointer overrun issue; try valgrind.