I'm really new to LLDB. I'm trying to figure out why my C application sometimes crashes with a segmentation fault.
I compiled my application with -g and started lldb pointing at the binary. I ran the app with the "run" command, and when it crashes, LLDB shows me this message:
* thread #1: tid = 0x8817, 0x00007fffae0feb52 libsystem_c.dylib`strlen + 18, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x00007fffae0feb52 libsystem_c.dylib`strlen + 18
libsystem_c.dylib`strlen:
-> 0x7fffae0feb52 <+18>: pcmpeqb (%rdi), %xmm0
0x7fffae0feb56 <+22>: pmovmskb %xmm0, %esi
0x7fffae0feb5a <+26>: andq $0xf, %rcx
0x7fffae0feb5e <+30>: orq $-0x1, %rax
I saw a "tutorial" where the person did the same thing as me, but for him LLDB showed his C source code and pointed to the line where it crashed. All I can see is this hex and assembly, which I cannot trace.
What am I doing wrong?
Thank you guys.
EDIT:
I forgot to say that my app only prints char* strings to the terminal once per second. Sometimes it takes a few minutes to crash, sometimes it crashes after hours.
I have a function that gives the correct output, but the program can never get past it and fails with a __kernel_vsyscall error.
C function:
void f1(int* input, int* output, int nbElements);
Assembler function (GAS). I can't post the whole thing since it's for an assignment and I don't want someone to copy it.
f1:
push %ebp
mov %esp, %ebp
movl $0, %ecx
movl $0, %ebx
jmp for_loop1
for_loop1:
cmpl %ecx, 16(%ebp)
jb end
movl $0, %ebx
jmp for_loop2
for_loop2:
/*move input elements to output elements*/
cmpl %ecx, %ebx
jmp incr_2
incr_2:
addl $1, %ebx
cmpl %ebx, 16(%ebp)
jb incr_1
jmp for_loop2
incr_1:
addl $1, %ecx
jmp for_loop1
end:
addl $8, %esp
leave
ret
Error when the program terminated:
malloc(): invalid size (unsorted)
Aborted (core dumped)
Output when debugging the core dump file with gdb:
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./program_name...
[New LWP 2634]
warning: Loadable section ".note.gnu.property" outside of ELF segments
Core was generated by `./program_name'.
Program terminated with signal SIGABRT, Aborted.
#0 0xf7f35ac9 in __kernel_vsyscall ()
I've tried looking at what is at address 0xf7f35ac9 and it returned -402652697, or another random value.
Local variables are at addresses around 0xff83b1fc, the values stored in the pointers are at addresses around 0x80cab30, the functions declared previously are at addresses around 0x8049fc4, and the stack has values around 0xffaddedc, so I have no idea where this is.
Thanks
I was able to fix the problem by changing
cmpl %ecx, 16(%ebp)
jb end
to
cmpl 16(%ebp), %ecx
jge end
I made the same change to all the other comparisons. I looked around online, and apparently the difference between jb/ja and jl/jg is that jb/ja are for unsigned comparisons while jl/jg are for signed comparisons, but I'm still not sure why this caused the error.
I am trying to find the cause of a segfault, and narrowed it to the PLT using gdb's btrace. The segfault occurs during a jump from the PLT to the GOT, which I interpret to signify that the PLT became corrupted during execution. Based on the analysis presented below, is this interpretation correct? What are likely culprits for corruption of the PLT? Stack overflow? I believe that installing a watchpoint on the GOT address could be helpful in this instance. Would watch -l 0x55555562f048 be the correct approach? Other ideas for debugging are welcome.
For context, the segfault occurs during a call to strlen in function foo:
int foo(char * path, ...) {
...
if (strlen(path) >= PATH_MAX) {
The corresponding lines of assembly are:
0x58114 <foo+212> cmpq $0x0,-0x4c8(%rbp)
0x5811c <foo+220> jne 0x5812a <foo+234>
0x5811e <foo+222> lea 0xb96fb(%rip),%rdi # 0x111820
0x58125 <foo+229> callq 0x376c0 <__ubsan_handle_nonnull_arg@plt>
0x5812a <foo+234> mov -0x4c8(%rbp),%rax
0x58131 <foo+241> mov %rax,%rdi
0x58134 <foo+244> callq 0x37090 <strlen@plt>
First, path is compared to NULL (cmpq $0x0,-0x4c8(%rbp)), which I believe is added solely for ubsan instrumentation in this case. The NULL case was not taken, and the program jumped to <foo+234>, where it set up the strlen call by moving path to rax and then rdi, and finally called strlen@plt. Running record btrace in gdb just before this produced the following instruction history:
(gdb) record btrace
(gdb) c
136 0x00005555555ac114 <foo+212>: cmpq $0x0,-0x4c8(%rbp)
137 0x00005555555ac11c <foo+220>: jne 0x5555555ac12a <foo+234>
138 0x00005555555ac12a <foo+234>: mov -0x4c8(%rbp),%rax
139 0x00005555555ac131 <foo+241>: mov %rax,%rdi
140 0x00005555555ac134 <foo+244>: callq 0x55555558b090 <strlen@plt>
141 0x000055555558b090 <strlen@plt+0>: jmpq *0xa3fb2(%rip) # 0x55555562f048 <strlen@got.plt>
Here, we see that path was not null, and so the program jumped to <foo+234>, and set up for the strlen call (mov, mov, callq <strlen@plt>). The final instruction executed was jmpq *0xa3fb2(%rip) to the entry for strlen in the GOT (strlen@got.plt), whereupon the program crashed and gdb lost context (reporting that it Cannot find bounds of current function).
For my homework assignment I am trying to exploit a buffer overflow in a C program. I cannot edit the original file, and I also cannot recompile it.
I have gotten as far as returning to my intended address. However, after it has executed the code for a while, I get this:
0xffffcc38 pop %eax
0xffffcc39 push %eax
0xffffcc3a pop %ecx
→ 0xffffcc3b xor 0x30(%ebp), %eax
0xffffcc3e xor %eax, 0x30(%ebp)
0xffffcc41 xor %esi, 0x30(%ebp)
0xffffcc44 xor 0x30(%ebp), %esi
0xffffcc47 pop %ax
0xffffcc49 push $0x68736538
0xffffcc3b in ?? ()
gef➤
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
It was executing fine from 0xffffcc30 to 0xffffcc3a but suddenly stopped at 0xffffcc3b. Can I know why this could have happened?
If this instruction produced a SIGSEGV:
0xffffcc3b xor 0x30(%ebp), %eax
then it's a safe bet that $EBP + 0x30 points to inaccessible memory.
Most likely you "stepped" on $EBP earlier in your exploit.
I am a reader of your great book CSAPP 2e. I've a question regarding chapter 3.6.6.
In this chapter the author uses a function called cread to show that in some situations we should not use conditional moves.
The function cread() is as follows:
int cread(int* xp) {
return (xp? *xp : 0);
}
The assembly code of this function is:
1 movl $0 , %eax Set 0 as return value
2 testl %edx , %edx Test xp
3 cmovne (%edx), %eax if !0, dereference xp to get return value
The author emphasized the problem is that the dereference of xp is invalid if xp is null. But as I see it, line 1 is for the condition that xp is a null pointer, and in line 3, if xp is null, (%edx) will not be copied to %eax, so this code has avoided the possibility of dereferencing a null pointer.
Additionally, when I looked up this problem in CSAPP 3e, the assembly code of this function had changed as follows:
1 cread:
2 movq (%rdi), %rax
3 testq %rdi , %rdi
4 movl $0 , %edx
5 cmove %rdx , %rax
6 ret
I can see the problem in the second piece of assembly code: in line 2, if xp is a null pointer, the dereference is an error. However, I cannot figure out whether the same error exists in the first piece of assembly code (actually I think that piece of code is correct).
My question is: Is my understanding correct, or is there really an error in the first piece of assembly code?
In the third instruction:
cmovne (%edx), %eax
the processor first reads the value at (%edx), then evaluates the condition, and finally either moves that value to %eax or not, according to the condition.
So even if %edx = 0, it still accesses the memory at (%edx), which causes a segmentation fault.
While debugging a segmentation fault on x86 Linux, I've run into this problem.
Here is the segfault message from GDB:
0xe2a5a99f in my_function (pSt=pSt@entry=0xe1d09000, version=43)
Here is the faulting assembly:
0xe2a5a994 <my_function> push %ebp
0xe2a5a995 <my_function+1> push %edi
0xe2a5a996 <my_function+2> push %esi
0xe2a5a997 <my_function+3> push %ebx
0xe2a5a998 <my_function+4> lea -0x100b0c(%esp),%esp
0xe2a5a99f <my_function+11> call 0xe29966cb <__x86.get_pc_thunk.bx>
0xe2a5a9a4 <my_function+16> add $0x9542c,%ebx
As you can see above, the faulting line is "call get_pc_thunk", which just gets the pc value.
I also checked that the memory at 0xe29966cb is valid and accessible with the following command:
(gdb) x/10i 0xe29966cb
0xe29966cb <__x86.get_pc_thunk.bx>: nop
0xe29966cc <__x86.get_pc_thunk.bx+1>: nop
0xe29966cd <__x86.get_pc_thunk.bx+2>: nop
0xe29966ce <__x86.get_pc_thunk.bx+3>: nop
0xe29966cf <__x86.get_pc_thunk.bx+4>: nop
0xe29966d0 <__x86.get_pc_thunk.bx+5>: nop
0xe29966d1 <__x86.get_pc_thunk.bx+6>: nop
0xe29966d2 <__x86.get_pc_thunk.bx+7>: nop
0xe29966d3 <__x86.get_pc_thunk.bx+8>: mov (%esp),%ebx
0xe29966d6 <__x86.get_pc_thunk.bx+11>: ret
Which looks perfectly fine.
But strangely, if I use "si" to step into the "get_pc_thunk" function, it segfaults without even reaching the first nop.
Any help would be appreciated.
A crash on a CALL (or PUSH, or MOV (%esp)) instruction is almost always due to stack overflow.
Check your program for infinite (or just very deep) recursion.
Also, this:
0xe2a5a998 <my_function+4> lea -0x100b0c(%esp),%esp
means that my_function is allocating slightly over 1MB of local variables for the current stack frame. And now you know why that may not be a good idea.