lldb issue on mac with debugging a basic c program - lldb

I am trying to learn how to use a debugger and I am having some difficulty getting lldb to work properly. In particular, I was trying to follow along with the code the lldb wikipedia page for starters, and after attempting to run the debugger with lldb run I obtained the following error:
(lldb) target create "./example"
Current executable set to '/Users/me/Desktop/c/old/example' (x86_64).
(lldb) run
Process 69451 launched: '/Users/me/Desktop/c/old/example' (x86_64)
Process 69451 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xffffff90)
frame #0: 0x00007ff80f9066b2 libsystem_platform.dylib`_platform_strlen + 18
libsystem_platform.dylib`_platform_strlen:
-> 0x7ff80f9066b2 <+18>: pcmpeqb (%rdi), %xmm0
0x7ff80f9066b6 <+22>: pmovmskb %xmm0, %esi
0x7ff80f9066ba <+26>: andq $0xf, %rcx
0x7ff80f9066be <+30>: orq $-0x1, %rax
Target 0: (example) stopped.
(lldb)
I searched around online and I could not seem to find what this was all about. I would try to use gdb instead, but it looks like a bit of a mess to set up on mac with permissions and all. Any thoughts on what this issue is?

That's not a debugger error, lldb is reporting that your program crashed in the _platform_strlen function because of a bad memory access (that's what EXC_BAD_ACCESS means). From the crash message, the address you were trying to access was 0xffffff90.
The frame you stopped in is from a system library, so all the debugger can show you is the disassembly of the function. It is interesting (for people who can read disassembly). (%rdi) means "fetch the memory at the address held in the register rdi. That's an argument passing register, so presumably one of the arguments to strlen was bogus. That makes sense with the crash reason.
The next thing to do is to issue the bt command to see the full stack trace for the crash. Then use the frame select - short alias f - command to choose the last stack frame with your code. Presumably the PC in that frame will be a call to strlen, and your code is passing a bad pointer to strlen. You can use the v <varname> command to see the variable you passed in to verify that.

Related

what's happened when process is attacked by "stack buffer overflow"?

I'm a student learning computer security. Recently, I learned stack buffer overflow on c.
I understood its concepts and run sample codes written by c.
void main(){
char buf[] = "\xeb\x0b\x31\xc0\xb0\x0b\x31\xd2\x31\xc9\x5b\xcd\x80\xe8\xf0\xff\xff\xff/bin/sh\x0";
int* p;
p = (int*)&p + 2;
*p = (int)buf;
return;
}
Runtime Environment
Architecture: i686
OS: ubuntu 16.04 32bit
Compiler: gcc
Turn off ASLR(sysctl -w kernel.randomize_va_space=0)
Options: gcc -z execstack -mpreferred-stack-boundary=2 -fno-stack-protector
But I confuse what stack is saved and which memories are overlapped.
Above binary code, "\xeb\x0b\x31\xc0\xb0\x0b\x31\xd2\x31\xc9\x5b\xcd\x80\xe8\xf0\xff\xff\xff/bin/sh\x0",
the same assembly code is
.global main
main:
jmp strings
start:
xor %eax, %eax
movb $0xb, %al
xor %edx, %edx
xor %ecx, %ecx
popl %ebx
int $0x80
strings:
call start
.string "/bin/sh"
means execve("/bin/sh", NULL, NULL);.
When buffer overflow occurs, the binaries are overlapped return addresses of main on stack. But, I'm understood to be that the stack stores data s.t local variables, previous frame pointers, and return address.
I think the above binaries are not data, actually instructions. If so, why is this valid? The stack stores instructions and executes one-by-one by popping them? Or I misunderstand something?
And if the stack stores instructions, how do previous stack frame pointers(fp) and return addresses(ra) work?
I learned that previous function's stack frame address is stored in fp and next instruction's address on code area is stored in ra. So, when called function is terminated, sp is popped and then ra does to restore previous function state and run next instruction. Is it correct? Or I misunderstand something?
I want to know really this..
Thank you for your help.
Data are instructions are instructions are instructions.
The stack is memory is memory is memory.
That's just that.
Since the stack is ordinary memory, just like what you get with malloc, only growing downward and used implicitly by some instructions, you can put any data on the stack.
Since instructions are data, it follows that you can put instructions on the stack.
This particular exploitation works by overwriting the return address with a specific value and everything above it with a sequence of instructions.
That's why you need to tell GCC to make the stack executable (the code is on the stack) and not to generate a canary (both of these protections will suffice to prevent the attack) and also you need to tell Linux not to randomize the process address space layout (or the specific, fixed, value used to overwrite the return address won't work).
The fp and ra thing is most likely for a RISC architecture, x86 doesn't have such registers.
The execution flow is redirected when main returns (with ret), that's what ret does.
Look in Intel's manuals how the call/ret pair works and then see it in practice by just stepping into a call with a debugger.
Make sure you understand the calling convention and keep an eye on the stack every time you step.

Why am I allowed to exit main using ret?

I am about to figure out how exactly a programm stack is set up.
I have learned that calling the function with
call pointer;
Is effectively the same as:
mov register, pc ;programcounter
add register, 1 ; where 1 is one instruction not 1 byte ...
push register
jump pointer
However, this would mean that when the Unix Kernel calls the main function that the stack base should point to reentry in the kernel function which calls main.
Therefore jumping "*rbp-1" in the C - Code should reenter the main function.
This, however, is not what happens in the following code:
#include <stdlib.h>
#include <unistd.h>
extern void ** rbp(); //pointer to stack pointing to function
int main() {
void ** p = rbp();
printf("Main: %p\n", main);
printf("&Main: %p\n", &main); //WTF
printf("*Main: %p\n", *main); //WTF
printf("Stackbasepointer: %p\n", p);
int (*c)(void) = (*p)-4;
asm("movq %rax, 0");
c();
return 0; //should never be executed...
}
Assembly file: rsp.asm
...
.intel_syntax
.text:
.global _rbp
_rbp:
mov rax, rbp
ret;
This is not allowed, unsurprisingly, maybe because the instruction at this point are not exactly 64 bits, maybe because UNIX does not allow this...
But also this call is not allowed:
void (*c)(void) = (*p);
asm("movq %rax, 0"); //Exit code is 11, so now it should be 0
c(); //this comes with stack corruption, when successful
This means I am not obliged to exit the main - calling function.
My question then is: Why am I when I use ret as seen in the end of every GCC main function?, which should do effectively the same as the code above. How does a unix - system check for such attempts effectively...
I hope my question is clear...
Thank you.
P.S.: Code compiles only on macOS, change assembly for linux
C main is called (indirectly) from CRT startup code, not directly from the kernel.
After main returns, that code calls atexit functions to do stuff like flushing stdio buffers, then passes main's return value to a raw _exit system call. Or exit_group which exits all threads.
You make several wrong assumptions, all I think based on a misunderstanding of how kernels work.
The kernel runs at a different privilege level from user-space (ring 0 vs. ring 3 on x86). Even if user-space knew the right address to jump to, it can't jump into kernel code. (And even if it could, it wouldn't be running with kernel privilege level).
ret isn't magic, it's basically just pop %rip and doesn't let you jump anywhere you couldn't jump to with other instructions. Also doesn't change privilege level1.
Kernel addresses aren't mapped / accessible when user-space code is running; those page-table entries are marked as supervisor-only. (Or they're not mapped at all in kernels that mitigate the Meltdown vulnerability, so entering the kernel goes through a "wrapper" block of code that changes CR3.)
Virtual memory is how the kernel protects itself from user-space. User-space can't modify page tables directly, only by asking the kernel to do it via mmap and mprotect system calls. (And user-space can't execute privileged instructions like mov cr3, rax to install new page tables. That's the purpose of having ring 0 (kernel mode) vs. ring 3 (user mode).)
The kernel stack is separate from the user-space stack for a process. (In the kernel, there's also a small kernel stack for each task (aka thread) that's used during system calls / interrupts while that user-space thread is running. At least that's how Linux does it, IDK about others.)
The kernel doesn't literally call user-space code; The user-space stack doesn't hold any return address back into the kernel. A kernel->user transition involves swapping stack pointers, as well as changing privilege levels. e.g. with an instruction like iret (interrupt-return).
Plus, leaving a kernel code address anywhere user-space can see it would defeat kernel ASLR.
Footnote 1: (The compiler-generated ret will always be a normal near ret, not a retf that could return through a call gate or something to a privileged cs value. x86 handles privilege levels via the low 2 bits of CS but nevermind that. MacOS / Linux don't set up call gates that user-space can use to call into the kernel; that's done with syscall or int 0x80 instructions.)
In a fresh process (after an execve system call replaced the previous process with this PID with a new one), execution begins at the process entry point (usually labeled _start), not at the C main function directly.
C implementations come with CRT (C RunTime) startup code that has (among other things) a hand-written asm implementation of _start which (indirectly) calls main, passing args to main according to the calling convention.
_start itself is not a function. On process entry, RSP points at argc, and above that on the user-space stack is argv[0], argv[1], etc. (i.e. the char *argv[] array is right there by value, and above that the envp array.) _start loads argc into a register and puts pointers to the argv and envp into registers. (The x86-64 System V ABI that MacOS and Linux both use documents all this, including the process-startup environment and the calling convention.)
If you try to ret from _start, you're just going to pop argc into RIP, and then code-fetch from absolute address 1 or 2 (or other small number) will segfault. For example, Nasm segmentation fault on RET in _start shows an attempt to ret from the process entry point (linked without CRT startup code). It has a hand-written _start that just falls through into main.
When you run gcc main.c, the gcc front-end runs multiple other programs (use gcc -v to show details). This is how the CRT startup code gets linked into your process:
gcc preprocesses (CPP) and compiles+assembles main.c to main.o (or a temporary file). On MacOS, the gcc command is actually clang which has a built-in assembler, but real gcc really does compile to asm and then run as on that. (The C preprocessor is built-in to the compiler, though.)
gcc runs something like ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie /usr/lib/Scrt1.o /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/crtbeginS.o main.o -lc -lgcc /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/crtendS.o. That's actually simplified a lot, with some of the CRT files left out, and paths canonicalized to remove ../../lib parts. Also, it doesn't run ld directly, it runs collect2 which is a wrapper for ld. But anyway, that statically links in those .o CRT files that contain _start and some other stuff, and dynamically links libc (-lc) and libgcc (for GCC helper functions like implementing __int128 multiply and divide with 64-bit registers, in case your program uses those).
.intel_syntax
.text:
.global _rbp
_rbp:
mov rax, rbp
ret;
This is not allowed, ...
The only reason that doesn't assemble is because you tried to declare .text: as a label, instead of using the .text directive. If you remove the trailing : it does assemble with clang (which treats .intel_syntax the same as .intel_syntax noprefix).
For GCC / GAS to assemble it, you'd also need the noprefix to tell it that register names aren't prefixed by %. (Yes you can have Intel op dst, src order but still with %rsp register names. No you shouldn't do this!) And of course GNU/Linux doesn't use leading underscores.
Not that it would always do what you want if you called it, though! If you compiled main without optimization (so -fno-omit-frame-pointer was in effect), then yes you'd get a pointer to the stack slot below the return address.
And you definitely use the value incorrectly. (*p)-4; loads the saved RBP value (*p) and then offsets by four 8-byte void-pointers. (Because that's how C pointer math works; *p has type void* because p has type void **).
I think you're trying to get your own return address and re-run the call instruction (in main's caller) that reached main, eventually leading to a stack overflow from pushing more return addresses. In GNU C, use void * __builtin_return_address (0) to get your own return address.
x86 call rel32 instructions are 5 bytes, but the call that called main was probably an indirect call, using a pointer in a register. So it might be a 2-byte call *%rax or a 3-byte call *%r12, you don't know unless you disassemble your caller. (I'd suggest single-stepping by instructions (GDB / LLDB stepi) off the end of main using a debugger in disassembly mode. If it has any symbol info for main's caller, you'll be able to scroll backward and see what the previous instruction was.
If not, you might have to try and see what looks sane; x86 machine code can't be unambiguously decoded backwards because it's variable-length. You can't tell the difference between a byte within an instruction (like an immediate or ModRM) vs. the start of an instruction. It all depends on where you start disassembling from. If you try a few byte offsets, usually only one will produce anything that looks sane.
asm("movq %rax, 0"); //Exit code is 11, so now it should be 0
This is a store of RAX to absolute address 0, in AT&T syntax. This of course segfaults. exit code 11 is from SIGSEGV, which is signal 11. (Use kill -l to see signal numbers).
Perhaps you wanted mov $0, %eax. Although that's still pointless here, you're about to call through your function pointer. In debug mode, the compiler might load it into RAX and step on your value.
Also, writing a register in an asm statement is never safe when you don't tell the compiler which registers you're modifying (using constraints).
printf("Main: %p\n", main);
printf("&Main: %p\n", &main); //WTF
main and &main are the same thing because main is a function. That's just how C syntax works for function names. main isn't an object that can have its address taken. & operator optional in function pointer assignment
It's similar for arrays: the bare name of an array can be assigned to a pointer or passed to functions as a pointer arg. But &array is also the same pointer, same as &array[0]. This is true only for arrays like int array[10], not for pointers like int *ptr; in the latter case the pointer object itself has storage space and can have its own address taken.
I think there are quite a few misunderstandings you have here. First, main is not what gets called by the kernel. The kernel allocates a process and loads our binary into memory - usually from an ELF file if you are using a Unix-based OS. This ELF file contains all of the sections that need to be mapped into memory and an address that is the "Entry Point" for the code in the ELF(among other things). The ELF can specify any address for the loader to jump to in order to start launching the program. In applications built with GCC, this is a function called _start. _start then sets up the stack and does any other initialization it needs to before calling __libc_start_main which is a libc function that can do additional set up before calling main main.
Here is an example of a start function:
00000000000006c0 <_start>:
6c0: 31 ed xor %ebp,%ebp
6c2: 49 89 d1 mov %rdx,%r9
6c5: 5e pop %rsi
6c6: 48 89 e2 mov %rsp,%rdx
6c9: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
6cd: 50 push %rax
6ce: 54 push %rsp
6cf: 4c 8d 05 0a 02 00 00 lea 0x20a(%rip),%r8 # 8e0 <__libc_csu_fini>
6d6: 48 8d 0d 93 01 00 00 lea 0x193(%rip),%rcx # 870 <__libc_csu_init>
6dd: 48 8d 3d 7c ff ff ff lea -0x84(%rip),%rdi # 660 <main>
6e4: ff 15 f6 08 20 00 callq *0x2008f6(%rip) # 200fe0 <__libc_start_main#GLIBC_2.2.5>
6ea: f4 hlt
6eb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
As you can see, this function sets the value of the stack and the stack base pointer. Therefore, there is no valid stack frame in this function. The stack frame is not even set to anything but 0 until you call main (at least by this compiler)
Now what is important to see here is that The stack was initialized in this code, and by the loader, it is not a continuation of the kernel's stack. Each program has its own stack, and these are all different from the kernel's stack. In fact, even if you knew the address of the stack in the kernel, you could not read from it or write to it from your program because your process can only see the pages of memory that have been allocated to it by the MMU which is controlled by the kernel.
Just to clarify, when I said the stack was "created" I did not mean that it was allocated. I only mean that the stack pointer and stack base are set here. The memory for it is allocated when the program is loaded, and pages are added to it as needed whenever a page fault is triggered by a write to an unallocated part of the stack. Upon entering start there is clearly some stack in existence as evidence from the pop rsi instruction however this is not the stack the final stack values that will be used by the program. those are the variables that get set up in _start (maybe these get changed in __libc_start_main later on, I'm not sure.)
However, this would mean that when the Unix Kernel calls the main function that the stack base should point to reentry in the kernel function which calls main.
Absolutely not.
This particular question covers the details for MacOS, please have a look. In any case main is most likely returning to start function of the C standard library. Details of implementation differ between different *nix operating systems.
Therefore jumping "*rbp-1" in the C - Code should reenter the main function.
You have no guarantee what the compiler will emit and what will be the state of rsp/rbp when you call rbp() function. You can't make such assumptions.
Btw if you want to access stack entry in 64bit you would do this in +-8 increments (so rbp+8 rbp-8 rsp+8 rsp-8 respectively).

Are ARM Cortex-M0 Stacking Registers Saved On $psp or $msp During Hardfault?

I have an issue where my Cortex-M0 is hard faulting, so I am trying to debug it. I am trying to print the contents of the ARM core registers that were pushed to the stack when the hard fault occurred.
Here is my basic assembly code:
__attribute__((naked)) void HardFaultVector(void) {
asm volatile(
// check LR to see whether the process stack or the main stack was being used at time of exception.
"mov r2, lr\n"
"mov r3, #0x4\n"
"tst r2, r3\n"
"beq _MSP\n"
//process stack was being used.
"_PSP:\n"
"mrs r0, psp\n"
"b _END\n"
//main stack was being used.
"_MSP:\n"
"mrs r0, msp\n"
"b _END\n"
"_END:\n"
"b fault_handler\n"
);
}
The function fault_handler will print the contents of the stack frame that was pushed to either the process stack or the main stack. Here's my question though:
When I print the contents of the stack frame that supposedly has the saved registers, here is what I see:
Stack frame at 0x20000120:
pc = 0xfffffffd; saved pc 0x55555554
called by frame at 0x20000120, caller of frame at 0x20000100
Arglist at unknown address.
Locals at unknown address, Previous frame's sp is 0x20000120
Saved registers:
r0 at 0x20000100, r1 at 0x20000104, r2 at 0x20000108, r3 at 0x2000010c, r12 at 0x20000110, lr at 0x20000114, pc at 0x20000118, xPSR at 0x2000011c
You can see the saved registers, these are the registers that are pushed by the ARM core when a hard fault occurs. You can also see the line pc = 0xfffffffd; which indicates that this is the LR's EXC_RETURN value. The value 0xfffffffd indicates to me that the process stack was being used at the time of the hard fault.
If I print the $psp value, I get the following:
gdb $ p/x $psp
$91 = 0x20000328
If I print the $msp value, I get the following:
gdb $ p/x $msp
$92 = 0x20000100
You can clearly see that the $msp is pointing to the top of the stack where supposedly the saved registers are located. Doesn't this mean that the main stack has the saved registers that the ARM core pushed to the stack?
If I print the memory contents, starting at the $msp address, I get the following:
gdb $ x/8xw 0x20000100
0x20000100 <__process_stack_base__>: 0x55555555 0x55555555 0x55555555 0x55555555
0x20000110 <__process_stack_base__+16>: 0x55555555 0x55555555 0x55555555 0x55555555
It's empty...
Now, if I print the memory contents, starting at the $psp address, I get the following:
gdb $ x/8xw 0x20000328
0x20000328 <__process_stack_base__+552>: 0x20000860 0x00000054 0x00000054 0x20000408
0x20000338 <__process_stack_base__+568>: 0x20000828 0x08001615 0x1ad10800 0x20000000
This looks more accurate. But I thought the saved registers are supposed to indicate where in flash memory they are located? So how does this make sense?
The comments by old_timer under your question are all correct. The registers will be pushed to the active stack on exception entry, whether this is PSP or MSP at the time. By default, all code uses the main stack (MSP), but if you're using anything other than complete bare metal it's likely that whatever kernel you're using has switched Thread mode to using the process stack (PSP).
Most of your investigations suggest that the PSP was in use, with your memory peek around the PSP and MSP being pretty much indisputable. The only bit of evidence you have for it having been the MSP is the results of the fault_handler function, for which you have not posted the source; so my first guess would be that this function is broken in some way.
Do also remember that one common reason for entering the HardFault handler is that another exception handler has caused an exception. This can easily happen in cases of memory corruption. In these cases (assuming Thread mode uses the PSP) the CPU will first enter Handler mode in response to the original exception, pushing r0-r3,r12,lr,pc,psr to the process stack. It will start executing the original exception handler, then fault again, pushing r0-r3,r12,lr,pc,psr to the main stack while entering the HardFault handler. There's often some unravelling to do.
old_timer also mentions using real assembly language, and I agree here too. Even though the ((naked)) attribute should be removing the prologue and epilogue (between them most of the possible 'compilerisms'), your code would simply be far more readable if it was written in bare assembly language. Inline assembly language has its uses, for example if you want to do something very low-level that you can't do from C but you want to avoid a call-return overhead. But when your entire function is written in assembly language, there's no reason to use it.

How do I use GDB Debugger to look at __asm__ content?

I'm trying to understand what is happening in this code, specifically within __asm__. How do I step through the assembly code, so I can print each variable and what not?
Specifically, I am trying to step through this to figure out what does the 8() mean and to see how does it know that it is going into array at index 2.
/* a[2] = 99 in assembly */
__asm__("\n\
movl $_a, %eax\n\
movl $99, 8(%eax)\n\
");
The stepi command steps through the assembly one instruction at a time. There is also a nexti for stepping over function calls. These commands do not adhere to the 'type only the unique prefix of a command is enough' rule that works for most commands -- partially because they the next and step commands are entirely prefixes of these commands and partially because these are not used too often and when they are they are typically used by someone who knows that they really want to use them.
info registers displays a lot of the register contents.
You'll also want to view the disassembly with the disassemble command.
More info on all of these commands is available with the help command, for instance:
(gdb) help info registers
tells you that info registers displays the integer registers and their contents, but it also tells you that if you supply a register name it will limit output to that register's value:
(gdb) info registers rax
rax 0x0 0
(rax is the x86_64 version of eax)
The first column is the register name, the second is the hex value, and the third is the integer value.
There is useful help for the disassemble command as well.
Remember that gdb has tab completion for many commands, and this can be used for more than just simple commands, though many times it offers you bad suggestions -- it's sometimes helpful, though.
Including a label within your inline assembly will allow you to easily make a break point at the beginning of it.
I was never any good at AT&T syntax, but I'm pretty sure the 8(%eax) part means "the address 8 bytes after the address stored in EAX", that is, it's the offset relative to the address stored in the register.
Approximate equivalent in Intel syntax would be something like this (off the top of my head, so it's entirely possible that there is some minor mistake here...)
mov eax, a
mov DWORD PTR [eax+8], 99
movl $_a, %eax // load the memory address of a into %eax
movl $99, 8(%eax) // jump 8 bytes and store number 99 (which is a[2])
It seems to me that a is an int array (int has 4 bytes in most platforms). So by increment 4 bytes you'll be accessing the next item of the array. Other examples of assigning values to this array would be:
movl $10, (%eax) // store number 10 on the the first position: a[0]
movl $20, 4(%eax) // jump 4 bytes from the address loaded in %eax
// and store number 20 on the next position (a[1])

pusha assembly language instruction

I am having a core dump which stack is corrupted.
I try to disassemble it and found the following plz help me to anaylyse it ..
(gdb) bt
#0 0x55a63c98 in ?? ()
#1 0x00000000 in ?? ()
(gdb) disassemble 0x55a63c90 0x55a63ca8
Dump of assembler code from 0x55a63c90 to 0x55a63ca8:
0x55a63c90: add %cl,%dh
0x55a63c92: cmpsb %es:(%edi),%ds:(%esi)
0x55a63c93: push %ebp
0x55a63c94: add %al,(%eax)
0x55a63c96: add %al,(%eax)
**0x55a63c98: pusha**
0x55a63c99: lret $0x9
0x55a63c9c: subb $0x56,0xd005598(%ebp)
0x55a63ca3: push %ebp
0x55a63ca4: jo 0x55a63cc5
0x55a63ca6: sahf
0x55a63ca7: push %ebp
End of assembler dump.
(gdb) q
Can this pusha instruction can lead to a core dump ?
No*, all pusha does is push all the general purpose registers to the stack, including the stack pointer! What is causing the core dump is the instruction after the pusha, lret, which is a long return with stack pop. The return address is the most recent value pushed to the stack, which in this case will be whatever was in esi:edi (since they are the last values to be pushed by the pusha instruction), and that's likely to point to somewhere random.
* Unless you run out of stack space.
Absolutely. PUSHA followed by RET is never correct, the return address will be junk. Seeing ADD AL,[EAX] in your disassembly is another dead give away, that's the disassembly for 0.
In other words: you are disassembling data, not code. Your program bombed because it was executing data. The classic way that happens is corrupting a stack frame with a buffer overflow. When the function returns it pops an invalid return address from the corrupted stack and jumps into never never land. Not getting a seg fault is very unlucky.
Hard to debug, the stack trace is junked. You'll need to set a breakpoint at the last known good code address and start single stepping. The last good function you stepped into just before it bombs is typically the trouble maker.
pusha can only lead to a core dump if stack overflow occurs at that point. This instruction pushes all the registers values onto the stack and that could lead to an overflow. However the root of the problem is likely elsewhere - most likely the call stack is too deep at that point and pusha just happens to cause such consequences because it is executed in such conditions.
check if you see aligned disassemble code:
x/i $eip
Also show registers values:
i r

Resources