Strange segmentation fault in x86 assembly - c

While debugging a segmentation fault on x86 Linux, I ran into this problem.
Here is the segfault message from GDB:
0xe2a5a99f in my_function (pSt=pSt@entry=0xe1d09000, version=43)
Here is the faulting assembly:
0xe2a5a994 <my_function> push %ebp
0xe2a5a995 <my_function+1> push %edi
0xe2a5a996 <my_function+2> push %esi
0xe2a5a997 <my_function+3> push %ebx
0xe2a5a998 <my_function+4> lea -0x100b0c(%esp),%esp
0xe2a5a99f <my_function+11> call 0xe29966cb <__x86.get_pc_thunk.bx>
0xe2a5a9a4 <my_function+16> add $0x9542c,%ebx
As you can see above, the faulting instruction is the call to __x86.get_pc_thunk.bx, which just loads the current PC value into %ebx.
I checked that the memory at 0xe29966cb is valid and accessible with the following command:
(gdb) x/10i 0xe29966cb
0xe29966cb <__x86.get_pc_thunk.bx>: nop
0xe29966cc <__x86.get_pc_thunk.bx+1>: nop
0xe29966cd <__x86.get_pc_thunk.bx+2>: nop
0xe29966ce <__x86.get_pc_thunk.bx+3>: nop
0xe29966cf <__x86.get_pc_thunk.bx+4>: nop
0xe29966d0 <__x86.get_pc_thunk.bx+5>: nop
0xe29966d1 <__x86.get_pc_thunk.bx+6>: nop
0xe29966d2 <__x86.get_pc_thunk.bx+7>: nop
0xe29966d3 <__x86.get_pc_thunk.bx+8>: mov (%esp),%ebx
0xe29966d6 <__x86.get_pc_thunk.bx+11>: ret
This looks perfectly fine.
Strangely, if I use "si" to step into the get_pc_thunk function, it segfaults without even reaching the first nop.
Any help would be appreciated.

A crash on a CALL (or PUSH, or MOV (%esp)) instruction is almost always due to stack overflow.
Check your program for infinite (or just very deep) recursion.
Also, this:
0xe2a5a998 <my_function+4> lea -0x100b0c(%esp),%esp
means that my_function is allocating slightly over 1MB of local variables for the current stack frame. And now you know why that may not be a good idea.
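One common way to avoid a frame of that size is to move the large buffer off the stack. The following is a minimal sketch in C, not the asker's actual code: `process_with_heap_buffer` is a hypothetical stand-in for `my_function` (whose locals are not shown in the question), and `BIG_SIZE` is simply the displacement from the `lea` instruction.

```c
#include <stdlib.h>
#include <string.h>

/* Same size as the lea displacement in the disassembly (0x100b0c bytes). */
#define BIG_SIZE 0x100b0cu

/* Hypothetical stand-in for my_function: instead of declaring
 * `unsigned char scratch[BIG_SIZE];` as a local (which lives on the stack),
 * the scratch space is taken from the heap. */
int process_with_heap_buffer(int version)
{
    unsigned char *scratch = malloc(BIG_SIZE);
    if (scratch == NULL)
        return -1;                  /* allocation failure is reported, not a crash */

    /* Placeholder work so the buffer is actually used. */
    memset(scratch, version & 0xff, BIG_SIZE);
    int result = scratch[0] + scratch[BIG_SIZE - 1];

    free(scratch);
    return result;
}
```

With this shape the function's stack frame stays small no matter how deep the call chain is, and running out of memory shows up as a checkable `NULL` rather than a fault on the next `call` or `push`.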

Related

segfault on jump from PLT

I am trying to find the cause of a segfault, and narrowed it to the PLT using gdb's btrace. The segfault occurs during a jump from the PLT to the GOT, which I interpret to signify that the PLT became corrupted during execution. Based on the analysis presented below, is this interpretation correct? What are likely culprits for corruption of the PLT? Stack overflow? I believe that installing a watchpoint on the GOT address could be helpful in this instance. Would watch -l 0x55555562f048 be the correct approach? Other ideas for debugging are welcome.
For context, the segfault occurs during a call to strlen in function foo:
int foo(char * path, ...) {
...
if (strlen(path) >= PATH_MAX) {
The corresponding lines of assembly are:
0x58114 <foo+212> cmpq $0x0,-0x4c8(%rbp)
0x5811c <foo+220> jne 0x5812a <foo+234>
0x5811e <foo+222> lea 0xb96fb(%rip),%rdi # 0x111820
0x58125 <foo+229> callq 0x376c0 <__ubsan_handle_nonnull_arg@plt>
0x5812a <foo+234> mov -0x4c8(%rbp),%rax
0x58131 <foo+241> mov %rax,%rdi
0x58134 <foo+244> callq 0x37090 <strlen@plt>
First, path is compared to NULL (cmpq $0x0,-0x4c8(%rbp)), which I believe is solely added for ubsan instrumentation in this case. That branch was not followed, and the program jumped to <foo+234>, where it set up for the strlen call, by moving path to rax and then rdi, and finally calling strlen@plt. Running record btrace in gdb just before this produced this instruction history:
(gdb) record btrace
(gdb) c
136 0x00005555555ac114 <foo+212>: cmpq $0x0,-0x4c8(%rbp)
137 0x00005555555ac11c <foo+220>: jne 0x5555555ac12a <foo+234>
138 0x00005555555ac12a <foo+234>: mov -0x4c8(%rbp),%rax
139 0x00005555555ac131 <foo+241>: mov %rax,%rdi
140 0x00005555555ac134 <foo+244>: callq 0x55555558b090 <strlen@plt>
141 0x000055555558b090 <strlen@plt+0>: jmpq *0xa3fb2(%rip) # 0x55555562f048 <strlen@got.plt>
Here, we see that path was not null, and so the program jumped to <foo+234>, and set up for the strlen call (mov, mov, callq <strlen@plt>). The final instruction executed was jmpq *0xa3fb2(%rip) to the entry for strlen in the GOT (strlen@got.plt), whereupon the program crashed and gdb lost context (reporting that it Cannot find bounds of current function).

exploit: SIGSEGV, segmentation fault

For a homework assignment I am trying to exploit a buffer overflow in a C program. I cannot edit the original source file, and I cannot recompile it.
I have gotten as far as returning to my intended address. However, after the code has executed for a while, I get this:
0xffffcc38 pop %eax
0xffffcc39 push %eax
0xffffcc3a pop %ecx
→ 0xffffcc3b xor 0x30(%ebp), %eax
0xffffcc3e xor %eax, 0x30(%ebp)
0xffffcc41 xor %esi, 0x30(%ebp)
0xffffcc44 xor 0x30(%ebp), %esi
0xffffcc47 pop %ax
0xffffcc49 push $0x68736538
0xffffcc3b in ?? ()
gef➤
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
It executed fine from 0xffffcc30 to 0xffffcc3a but suddenly stopped at 0xffffcc3b. Why could this have happened?
If this instruction produced a SIGSEGV:
0xffffcc3b xor 0x30(%ebp), %eax
then it's a safe bet that $EBP + 0x30 points to inaccessible memory.
Most likely you "stepped" on $EBP earlier in your exploit.

Interpreting this line in Assembly language?

Below are the first 5 lines of a disassembled C program that I am trying to reverse engineer back into its C code in order to learn assembly language better. At the beginning of this code I see it makes room on the stack and immediately executes
0x000000000040054e <+8>: mov %fs:0x28,%rax
I am confused about what this line does, and what in the corresponding C program might produce it. The only time I have seen this line so far is when a different function within a C program is called, but this time it is not followed by any callq instructions, so I am not so sure. Any ideas what else in this C program could be producing this instruction?
0x0000000000400546 <+0>: push %rbp
0x0000000000400547 <+1>: mov %rsp,%rbp
0x000000000040054a <+4>: sub $0x40,%rsp
0x000000000040054e <+8>: mov %fs:0x28,%rax
0x0000000000400557 <+17>: mov %rax,-0x8(%rbp)
0x000000000040055b <+21>: xor %eax,%eax
0x000000000040055d <+23>: movl $0x17,-0x30(%rbp)
...
I know this provides some form of stack protection against buffer overflow attacks; I just need to know what C code would prompt this protection if not a call to a separate function.
As you say, this is code used to defend against buffer overflows. The compiler generates this "stack canary check" for functions that have local variables that might be buffers that could be overflowed. Note the instructions immediately above and below the line you are asking about:
sub $0x40, %rsp
mov %fs:0x28, %rax
mov %rax, -0x8(%rbp)
xor %eax, %eax
The sub allocates 64 bytes of space on the stack, which is enough room for at least one small array. Then a secret value is copied from %fs:0x28 to the top of that space, just below the previous frame pointer and the return address, and then it is erased from the register file.
The body of the function does something with arrays; if it writes sufficiently far past the end of an array, it will overwrite the secret value. At the end of the function, there will be code along the lines of
mov -0x8(%rbp), %rax
xor %fs:0x28, %rax
jne 1f
mov %rbp, %rsp
pop %rbp
ret
1:
call __stack_chk_fail # does not return
This verifies that the secret value is unchanged, and crashes the program if it has changed. The idea is that someone trying to exploit a simple buffer overflow vulnerability, like you have when you use gets, won't be able to change the return address without also modifying the secret value.
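The logic of that check can be modelled in plain C. This is only a sketch of the idea, under stated assumptions: `struct frame_model` and `canary_survives` are hypothetical names, the struct stands in for the real stack frame, and the secret is passed as a parameter because the real one is read from `%fs:0x28`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a protected frame: the buffer sits below the canary,
 * just as the local array sits below -0x8(%rbp) in the real frame. */
struct frame_model {
    char     buf[16];
    uint64_t canary;
};

/* Copies len bytes starting at buf, then reports whether the canary survived.
 * Returns 1 if it is intact, 0 if it was clobbered (the point at which the
 * compiler-generated code would call __stack_chk_fail). */
int canary_survives(const void *data, size_t len, uint64_t secret)
{
    struct frame_model f;
    f.canary = secret;              /* prologue: store the secret in the frame */

    if (len > sizeof f)             /* keep the model itself memory-safe */
        len = sizeof f;
    memcpy(&f, data, len);          /* reaches the canary once len > 16 */

    return f.canary == secret;      /* epilogue: compare before returning */
}
```

A 16-byte write leaves the canary untouched, while a 24-byte write of `'A'` characters replaces it, which mirrors how an overflow that reaches the return address must pass through the secret value first.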
The compiler has several different heuristics, selectable with command line options, for deciding when it is necessary to generate stack-canary protection code.
You can't write C code corresponding to this assembly language yourself, because it uses the unusual %fs:nnnn addressing mode; the stack-canary code intentionally uses an addressing mode that no other code generation relies on, to make it as difficult as possible for the adversary to learn the secret value.
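As for what kind of C source tends to prompt the protection: a function with a local array is the typical case. The sketch below is a hypothetical example, not a reconstruction of the program in the question; `copy_and_measure` and the buffer size are made up for illustration. With GCC, `-fstack-protector` usually instruments functions containing local `char` arrays, and `-fstack-protector-strong` (the default on many distributions) extends this to local arrays of any type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: the local char array is the kind of variable that
 * typically makes the compiler's heuristic add a canary to this function. */
size_t copy_and_measure(const char *src)
{
    char buf[48];                       /* local buffer that could be overflowed */

    strncpy(buf, src, sizeof buf - 1);  /* bounded copy into the local buffer */
    buf[sizeof buf - 1] = '\0';         /* strncpy does not always terminate */
    return strlen(buf);
}
```

Compiling a file like this with `gcc -S -fstack-protector-strong` and searching the output for `%fs:0x28` (on x86-64 Linux) is one way to see whether a given function was instrumented.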

segmentation fault with .text .data and main (main in .data section)

I'm just trying to load the value of array_words[0] into eax:
.text
.data
# define an array of 3 words
array_words: .word 1, 2, 3
.globl main
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
But when I run this, I keep getting a segmentation fault.
Could someone please point out what I did wrong here?
It seems the label main is in the .data section.
This leads to a segmentation fault on systems that don't allow executing code in the .data section. (Most modern systems map .data with read + write but not exec permission.)
Program code should be in the .text section. (Read + exec)
Surprisingly, on GNU/Linux systems, hand-written asm often results in an executable .data unless you're careful to avoid that, so this is often not the real problem: See Why data and stack segments are executable? But putting code in .text where it belongs can make some debugging tools work better.
Also you need to ret from main or call exit (or make an _exit system call) so execution doesn't fall off the end of main into whatever bytes come next. See What happens if there is no exit system call in an assembly program?
You need to properly terminate your program, e.g. on Linux x86_64 by calling the sys_exit system call:
...
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
mov $60, %rax # System-call "sys_exit"
mov $0, %rdi # exit code 0
syscall
Otherwise, execution continues into whatever bytes follow your last instruction in memory, which are most likely invalid instructions (or even unmapped memory).
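Separately from the crash, it may be worth noting that `lea` only computes an address and never reads memory, and that `.word` emits 2-byte elements on x86 in the GNU assembler, so the `lea array_words(,%edi,4), %eax` in the question would not by itself load `array_words[0]`. The distinction can be sketched in C; the function names here are hypothetical and only illustrate the difference between taking an element's address and reading its value.

```c
#include <stddef.h>
#include <stdint.h>

/* Counterpart of `array_words: .word 1, 2, 3` (2-byte elements). */
static const uint16_t array_words[3] = {1, 2, 3};

/* What an lea does: compute where element i lives, without reading it. */
const uint16_t *element_address(size_t i)
{
    return &array_words[i];
}

/* What a mov from memory does: read the element stored at that address. */
uint16_t element_value(size_t i)
{
    return array_words[i];
}
```

Consecutive element addresses differ by 2 bytes here, which is why a scale factor of 4 would step over every other element.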

itoa implementation crashing?

I am trying to implement itoa in assembly (NASM, the Netwide Assembler). I have verified that my approach is valid by inspecting the register values with a debugger. The problem is that the application crashes when it is about to exit, and I am afraid my program is corrupting the stack somehow. I am linking against the C standard library via GCC so that I can use the printf function. I noticed that printf modified the registers, which caused unexpected behaviour (many extra iterations over values I did not recognize); I solved this by storing the value of EAX inside EBX (not modified by printf) and then restoring the value after the function call. With that fix, single-stepping through the algorithm confirms that it now behaves as intended, and that the program crashes just as it is about to terminate.
Here is the code:
global _main
extern _printf
section .data
_str: db "%d", 0
section .text
_main:
mov eax, 1234
mov ebx, 10
call _itoa
_terminate:
ret
_itoa:
test eax, eax
jz _terminate
xor edx, edx
div ebx
add edx, 30h
push eax
push edx
push _str
call _printf
add esp, 8
pop eax
jmp _itoa
And here is the stackdump:
Exception: STATUS_ACCESS_VIOLATION at eip=00402005
eax=00000000 ebx=00000000 ecx=20000038 edx=61185C40 esi=612A3A7C edi=0022CD84
ebp=0022ACF8 esp=0022AC20 program=C:\Cygwin\home\Benjamin\nasm\itoa.exe, pid 3556, thread main
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame Function Args
0022ACF8 00402005 (00000000, 0022CD84, 61007120, 00000000)
End of stack trace
EDIT: Please note that the stackdump really is not that relevant anymore as the program no longer crashes, it just displays an incorrect value.
I'm not familiar with your platform, but I would expect that you need to restore the stack by popping the pushed values after calling printf().
Since printf() doesn't know how many arguments were passed, it can't restore the stack itself. Your code pushes arguments that are never popped. So when your procedure returns, it takes its return address from the data that was pushed on the stack, which does not point to valid code. That would be your access violation.
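For reference, the digit loop from the question can be sketched in C. Repeated division yields the least significant digit first, so printing inside the loop emits the digits in reverse order, and printing `digit + 0x30` with `%d` shows the character code rather than the digit; either may be related to the incorrect value mentioned in the EDIT. The name `itoa_sketch` and the buffer handling are assumptions made for this illustration, not part of the original program.

```c
#include <limits.h>
#include <stddef.h>

/* Writes value in the given base (2..10) into out as a NUL-terminated string.
 * Returns out on success, or NULL if the base is unsupported or out is too small. */
char *itoa_sketch(unsigned value, unsigned base, char *out, size_t out_size)
{
    char tmp[sizeof(unsigned) * CHAR_BIT];  /* worst case: base 2 */
    size_t n = 0;

    if (base < 2 || base > 10)
        return NULL;

    /* Same idea as the div loop: the remainder is the next digit,
     * produced least significant first. */
    do {
        tmp[n++] = (char)('0' + value % base);
        value /= base;
    } while (value != 0);

    if (n + 1 > out_size)
        return NULL;

    /* Reverse while copying so the most significant digit comes first. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return out;
}
```

Calling `itoa_sketch(1234, 10, buf, sizeof buf)` with a 16-byte `buf` fills it with the string "1234", which can then be printed once with `%s`.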
