Understanding OpenSSL's assembly implementation of AES decryption

I am trying to understand how OpenSSL implements AES decryption in assembly language, and I am using the gdb debugger to do so.
I want to understand from the basics how AES works: byte substitution, MixColumns, and how the lookup-table operations are performed. I am using a very old version, OpenSSL 0.9.8a, because it contains only one assembly file, ax86-elf.s (generated from aes-586.pl), whereas the latest OpenSSL contains multiple files (supporting AES-NI, SSE, etc.).
Once I understand the old version, I may look into the latest OpenSSL version.
Here is what I have done so far.
Generated assembly code of ax86-elf.s can be found here
https://pastebin.com/L3FsbqmK
It has a global label AES_Td which points to a long run of data. I think this data is the AES lookup table.
.globl AES_Td
.text
.globl _x86_AES_decrypt
.type _x86_AES_decrypt,#function
.align 16
_x86_AES_decrypt:
movl %edi, 12(%esp)
...
...
.align 64
AES_Td:
.long 1353184337,1353184337
.long 1399144830,1399144830
.long 3282310938,3282310938
.long 2522752826,2522752826
.long 3412831035,3412831035
.long 4047871263,4047871263
.long 2874735276,2874735276
.long 2466505547,2466505547
.long 1442459680,1442459680
To debug openssl, I have used gdb.
gdb /usr/local/ssl/bin/openssl
(gdb) set args enc -d -aes-128-cbc -in ciphertext2.bin -out plaintext2.txt -K 0123456789abcdef0123456789abcdef -iv 00000000000000000000000000000000
(gdb) set logging on
Copying output to gdb.txt.
(gdb) b main
Breakpoint 1 at 0x8055591
(gdb) b _x86_AES_decrypt
Function "_x86_AES_decrypt" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
(gdb) run
Starting program: /usr/local/ssl/bin/openssl enc -d -aes-128-cbc -in ciphertext2.bin -out plaintext2.txt -K 0123456789abcdef0123456789abcdef -iv 00000000000000000000000000000000
Breakpoint 1, 0x08055591 in main ()
(gdb) s
Single stepping until exit from function main,
which has no line number information.
__GI_getenv (name=0x8090a32 "OPENSSL_DEBUG_MEMORY") at getenv.c:35
35 getenv.c: No such file or directory.
(gdb)
__x86.get_pc_thunk.bx () at ../sysdeps/i386/i686/multiarch/strcat.S:55
55 ../sysdeps/i386/i686/multiarch/strcat.S: No such file or directory.
(gdb)
__GI_getenv (name=0x8090a32 "OPENSSL_DEBUG_MEMORY") at getenv.c:36
36 getenv.c: No such file or directory.
(gdb)
__strlen_ia32 () at ../sysdeps/i386/i686/multiarch/../../i586/strlen.S:43
43 ../sysdeps/i386/i686/multiarch/../../i586/strlen.S: No such file or directory.
This shows that getenv.c is called, then ../sysdeps/i386/i686/multiarch/../../i586/strlen.S (although I cannot view the source code because it is not available).
I doubt I will be able to understand the code this way, as it steps through so many functions whose source code is not available.
So I tried a second approach, without putting any breakpoint at main.
gdb /usr/local/ssl/bin/openssl
(gdb) set args enc -d -aes-128-cbc -in ciphertext2.bin -out plaintext2.txt -K 0123456789abcdef0123456789abcdef -iv 00000000000000000000000000000000
(gdb) set logging on
Copying output to gdb.txt.
(gdb) b _x86_AES_decrypt
Function "_x86_AES_decrypt" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_x86_AES_decrypt) pending.
(gdb) run
Starting program: /usr/local/ssl/bin/openssl enc -d -aes-128-cbc -in ciphertext2.bin -out plaintext2.txt -K 0123456789abcdef0123456789abcdef -iv 00000000000000000000000000000000
Breakpoint 1, 0xb7e57590 in _x86_AES_decrypt ()
from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb) s
Single stepping until exit from function _x86_AES_decrypt,
which has no line number information.
0xb7e586d0 in AES_cbc_encrypt () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb) s
Single stepping until exit from function AES_cbc_encrypt,
which has no line number information.
Breakpoint 1, 0xb7e57590 in _x86_AES_decrypt ()
from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
(gdb)
Single stepping until exit from function _x86_AES_decrypt,
which has no line number information.
0xb7e586d0 in AES_cbc_encrypt () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
Single stepping until exit from function AES_cbc_encrypt,
which has no line number information.
0xb7ea380e in aes_128_cbc_cipher () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
Single stepping until exit from function aes_128_cbc_cipher,
which has no line number information.
0xb7ea182d in EVP_EncryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
Single stepping until exit from function EVP_EncryptUpdate,
which has no line number information.
0xb7ea1a39 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
Single stepping until exit from function EVP_DecryptUpdate,
which has no line number information.
__memcpy_ssse3_rep () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S:111
111 ../sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S: No such file or directory.
I have the following doubts:
I am not able to understand where the AES lookup-table operations are performed. How do I find them?
Why are AES_cbc_encrypt and EVP_EncryptUpdate called although I am performing a decryption?
How do I debug EVP_DecryptUpdate() to understand how the function is executed? Simply putting a breakpoint at the function, like b EVP_DecryptUpdate, does not give any insight.
Here is what I get when I set the breakpoint:
(gdb) b EVP_DecryptUpdate
Breakpoint 3 at 0xb7ea19e0
(gdb) run
Breakpoint 3, 0xb7ea19e0 in EVP_DecryptUpdate ()
from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb) si
0xb7ea19e1 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb) si
0xb7ea19e2 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7ea19e3 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7ea19e4 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7e372c0 in __x86.get_pc_thunk.bx ()
from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7e372c3 in __x86.get_pc_thunk.bx ()
from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7ea19e9 in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
0xb7ea19ef in EVP_DecryptUpdate () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb)
The EVP_DecryptUpdate() function is defined in crypto/evp/evp_enc.c:
int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,
const unsigned char *in, int inl)
{
How can I display the corresponding C source lines while debugging with gdb?
I am not sure, but OpenSSL may use several C files as well as assembly to decrypt the AES-encrypted ciphertext. How can I find the flow of execution?
I am using 32-bit Debian Linux and gdb.
Any help or link for understanding the above flow would be greatly appreciated. Thanks in advance.

Related

Watchpoint doesn't work when reverse executing

Here's a debugging scenario:
1. Create start breakpoint A and finish breakpoint B.
2. Start recording. Continue.
3. Reach breakpoint B.
4. Set watchpoint to watch writes to some piece of memory.
5. Reverse continue until watchpoint breaks execution.
Let's suppose that setting the watchpoint is only possible in step 4, not earlier, since the location of memory that should be watched is only known at that point.
Here's a simple example.
main.c:
int main(void)
{
int num;
num = 5;
return 0;
}
Debugging session:
$ cc main.c
$ gdb -q -nx -ex 'set disassembly-flavor intel' ./a.out
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
(gdb) b *main + 0
Breakpoint 1 at 0x1129
(gdb) disas main
Dump of assembler code for function main:
0x0000000000001129 <+0>: push rbp
0x000000000000112a <+1>: mov rbp,rsp
0x000000000000112d <+4>: mov DWORD PTR [rbp-0x4],0x5
0x0000000000001134 <+11>: mov eax,0x0
0x0000000000001139 <+16>: pop rbp
0x000000000000113a <+17>: ret
End of assembler dump.
(gdb) b *main + 16
Breakpoint 2 at 0x1139
(gdb) r
Starting program: /var/tmp/test/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000555555555129 in main ()
(gdb) record
(gdb) c
Continuing.
Breakpoint 2, 0x0000555555555139 in main ()
(gdb) p/x $rbp-0x4
$1 = 0x7fffffffe63c
(gdb) watch *(int*)0x7fffffffe63c
Hardware watchpoint 3: *(int*)0x7fffffffe63c
(gdb) reverse-continue
Continuing.
No more reverse-execution history.
0x0000555555555129 in main ()
(gdb)
The last command shows that reverse execution didn't stop where it was supposed to stop, that is, at *main+4. Instead, it stopped where the recording started, ignoring the memory write.
My simple experiment was motivated by a CppCon 2015 talk highlighting the usefulness of the record & replay concept. I don't know what I've done wrong, because as far as I can tell I repeated every step from the video.
Reading the gdb docs didn't help either.
So why was the watchpoint ignored, and how can that be prevented?
Try telling gdb to use a software watchpoint instead of a hardware one with the command set can-use-hw-watchpoints 0.
It seems that the hardware watchpoint doesn't work here because, when moving back and forth through the reverse-execution history, no real process execution takes place; gdb only changes its internal record of memory to make it appear that the program executed backwards.
In other words, use hardware watchpoints when the real process executes on a CPU, and software watchpoints when gdb replays its recorded history.
Thanks to Peter Cordes for the explanation in the comments.

Inspecting caller frames with gdb

Suppose I have:
#include <stdlib.h>
int main()
{
int a = 2, b = 3;
if (a!=b)
abort();
}
Compiled with:
gcc -g c.c
Running this, I'll get a coredump (due to the SIGABRT raised by abort()), which I can debug with:
gdb a.out core
How can I get gdb to print the values of a and b from this context?
Here is another way to get the values of a and b: move to the frame you are interested in, and then info locals will give you the values.
a.out was compiled from your code (frame 2, i.e. main(), is the one you are interested in).
$ gdb ./a.out core
[ removed some not-so-interesting info here ]
Reading symbols from ./a.out...done.
[New LWP 14732]
Core was generated by `./a.out'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007fac16269f5d in __GI_abort () at abort.c:90
#2 0x00005592862f266d in main () at f.c:7
(gdb) frame 2
#2 0x00005592862f266d in main () at f.c:7
7 abort();
(gdb) info locals
a = 2
b = 3
(gdb) q
You can also use print once you are in frame 2:
(gdb) print a
$1 = 2
(gdb) print b
$2 = 3
Did you compile with debug symbols (-g)? The command should be bt for backtrace; you can also use bt full for a full backtrace.
More info: https://sourceware.org/gdb/onlinedocs/gdb/Backtrace.html

Why is it that gdb "can't compute CFA" when using a separate debug symbols file?

I'm trying to invoke gdb with a stripped executable and a separate debug symbols file, on a core dump generated from running the stripped executable.
But when I use the separate debug symbols file, gdb is unable to give information on local variables for me.
Here is a log showing entirely how I produce my 3 ELF files and the core file and then run them through gdb 3 times.
First I just run gdb with the stripped executable and of course can't see any file names or line numbers, and can't inspect variables.
Then I run gdb using the stripped executable and grabbing the debug symbols from the original unstripped executable. This works pretty well but does give a disturbing and apparently unwarranted warning about the core and executable possibly mismatching.
Finally I run gdb with the stripped executable and the separate debug file. This still gives filenames and line numbers, but I can't inspect local variables and I get a "can't compute CFA for this frame" error.
Here is the log:
2016-09-16 16:01:45 barry@somehost ~/proj/segfault/segfault
$ cat segfault.c
#include <stdio.h>
int main(int argc, char **argv) {
char *badpointer = (char *)0x2398723;
printf("badpointer: %s\n", badpointer);
return 0;
}
2016-09-16 16:03:31 barry@somehost ~/proj/segfault/segfault
$ gcc -g -o segfault segfault.c
2016-09-16 16:03:37 barry@somehost ~/proj/segfault/segfault
$ objcopy --strip-debug segfault segfault.stripped
2016-09-16 16:03:40 barry@somehost ~/proj/segfault/segfault
$ objcopy --only-keep-debug segfault segfault.debug
2016-09-16 16:03:43 barry@somehost ~/proj/segfault/segfault
$ ./segfault.stripped
Segmentation fault (core dumped)
2016-09-16 16:03:48 barry@somehost ~/proj/segfault/segfault
$ ll /tmp/core.segfault.stripp.11
-rw------- 1 barry bsm-it 188416 2016-09-16 16:03 /tmp/core.segfault.stripp.11
2016-09-16 16:03:51 barry@somehost ~/proj/segfault/segfault
$ gdb ./segfault.stripped /tmp/core.segfault.stripp.11
GNU gdb (GDB) Fedora (7.0.1-50.fc12)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/barry/proj/segfault/segfault/segfault.stripped...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/a6/8dce9115a92508af92ac4ccac24b9f0cc34d71
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./segfault.stripped'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.11.2-3.x86_64
(gdb) bt
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
#2 0x00000000004004f4 in main ()
(gdb) up
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
(gdb) up
#2 0x00000000004004f4 in main ()
(gdb) p argc
No symbol table is loaded. Use the "file" command.
(gdb) q
2016-09-16 16:04:19 barry@somehost ~/proj/segfault/segfault
$ gdb -q -e ./segfault.stripped -s ./segfault -c /tmp/core.segfault.stripp.11
Reading symbols from /home/barry/proj/segfault/segfault/segfault...done.
warning: core file may not match specified executable file.
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/a6/8dce9115a92508af92ac4ccac24b9f0cc34d71
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./segfault.stripped'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.11.2-3.x86_64
(gdb) bt
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
#2 0x00000000004004f4 in main (argc=1, argv=0x7fffd1c0a728) at segfault.c:4
(gdb) up
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
(gdb) up
#2 0x00000000004004f4 in main (argc=1, argv=0x7fffd1c0a728) at segfault.c:4
4 printf("badpointer: %s\n", badpointer);
(gdb) p argc
$1 = 1
(gdb) q
2016-09-16 16:04:39 barry@somehost ~/proj/segfault/segfault
$ gdb -q -e ./segfault.stripped -s ./segfault.debug -c /tmp/core.segfault.stripp.11
Reading symbols from /home/barry/proj/segfault/segfault/segfault.debug...done.
warning: core file may not match specified executable file.
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/a6/8dce9115a92508af92ac4ccac24b9f0cc34d71
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./segfault.stripped'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.11.2-3.x86_64
(gdb) bt
#0 0x00000035fec47cb7 in vfprintf () from /lib64/libc.so.6
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
#2 0x00000000004004f4 in main (argc=can't compute CFA for this frame
) at segfault.c:4
(gdb) up
#1 0x00000035fec4ec4a in printf () from /lib64/libc.so.6
(gdb) up
#2 0x00000000004004f4 in main (argc=can't compute CFA for this frame
) at segfault.c:4
4 printf("badpointer: %s\n", badpointer);
(gdb) p argc
can't compute CFA for this frame
(gdb) q
I have some questions about this:
Why does it display the warning "warning: core file may not match specified executable file.", even though I'm using the exact same executable path as was used when the core dump was originally generated?
Why does using the separate debug symbols (-s ./segfault.debug) result in the error "can't compute CFA for this frame" when attempting to inspect local variables?
What is a CFA anyway?
Am I using an incorrect method to produce the debug symbol file?
I confirmed that using "objcopy --strip-debug" gives the same result as "strip -g".
Am I using the right options to feed the debug info into gdb?
My intention is that the stripped executables will be installed on a binary-compatible production system and any core dumps generated due to segfaults can be copied back to the devel system where we can feed them into gdb with the debug info and analyse the crash position and stack variables. But as a first step I'm trying to sort out the issues with using separate debug info files on the devel system.
It seems that using a separate debug symbols file causes the "can't compute CFA for this frame" error, even when a core file is not used.
My gcc version:
2016-09-16 16:07:39 barry@somehost ~/proj/segfault/segfault
$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC)
I suspect that gdb might be looking for symbols related to the variables in the segfault.debug file when objcopy actually only put them in the segfault.stripped file. If this is the case, perhaps some small adjustment to the options to objcopy could put those symbols in the place gdb is looking?
I commend you for wanting to keep a set of symbol files for everything that is deployed to the production server; in my opinion this is an often overlooked practice, but you will not regret it -- one day it will save you a lot of debugging trouble.
As I have had similar issues in the past, I will try to answer some of your questions, although you have quite an ancient toolchain, if you don't mind me saying so, so I'm not sure how much of this really applies here. I'll post it here anyway.
CFA = Canonical Frame Address. DWARF typically defines it as the value of the stack pointer at the call site in the caller's frame; roughly, it serves as the base address of the stack frame, relative to which local variables are located. If you have done some traditional x86 assembly programming, the BP register played a similar role. So "can't compute CFA for this frame" basically says "I know of these local variables, but I don't know where they are located on the stack".
There used to be code in GDB that worked only for the DWARF-2 debugging format, and non-conformance triggered at least this particular error. That restriction was lifted some time ago, but that change won't be in your version.
The other thing is that debug information describing how variables may be moved around is not always generated. This usually happens with newer compilers, though, as they get better at optimizing.
I was able to get rid of my problems by compiling like this:
gcc -g3 -gdwarf-2 -fvar-tracking -fvar-tracking-assignments -o segfault segfault.c
You can try this to see if it solves your problem, too.
Regarding the message about the location of the symbol file, it seems that the debugger wants to load it from the system directory. Maybe you have to link the executable to the symbol file with:
objcopy --add-gnu-debuglink=segfault.debug segfault
I found this question while searching for an answer to the following part of the original question:
Why does it display the warning "warning: core file may not match
specified executable file.", even though I'm using the exact same
executable path as was used when the core dump was originally
generated?
There was not an answer to this particular question but through experimentation and research I believe I have found the answer.
Below is a transcript of using gdb to debug a core file. Notice that the "warning: core file may not match specified executable file." message appears when the file name of the executable that produced the core is longer than 15 characters.
[~/t]$cat do_abort.c
#include <stdlib.h>
int func4(int f) { if(f) {abort();} return 0;}
int func3(int f) { return func4(f); }
int func2(int f) { return func3(f); }
int func1(int f) { return func2(f); }
int main(void) { return func1(1); }
[~/t]$gcc -g -o 123456789012345 do_abort.c
[~/t]$./123456789012345
Aborted (core dumped)
[~/t]$ll core*
-rw-------. 1 dev wheel 240K Apr 22 03:19 core.42697
[~/t]$gdb -q -c core.42697 123456789012345
Reading symbols from /home/dev/t/123456789012345...done.
[New LWP 42697]
Core was generated by `./123456789012345'.
Program terminated with signal 6, Aborted.
#0 0x00007f0be67631d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f0be67631d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f0be67648c8 in __GI_abort () at abort.c:90
#2 0x0000000000400543 in func4 (f=1) at do_abort.c:3
#3 0x000000000040055f in func3 (f=1) at do_abort.c:4
#4 0x0000000000400576 in func2 (f=1) at do_abort.c:5
#5 0x000000000040058d in func1 (f=1) at do_abort.c:6
#6 0x000000000040059d in main () at do_abort.c:7
(gdb) quit
[~/t]$rm core.42697
[~/t]$
[~/t]$mv 123456789012345 1234567890123456
[~/t]$./1234567890123456
Aborted (core dumped)
[~/t]$ll core*
-rw-------. 1 dev wheel 240K Apr 22 03:20 core.42721
[~/t]$gdb -q -c core.42721 1234567890123456
Reading symbols from /home/dev/t/1234567890123456...done.
warning: core file may not match specified executable file.
[New LWP 42721]
Core was generated by `./1234567890123456'.
Program terminated with signal 6, Aborted.
#0 0x00007f5b271fa1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f5b271fa1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f5b271fb8c8 in __GI_abort () at abort.c:90
#2 0x0000000000400543 in func4 (f=1) at do_abort.c:3
#3 0x000000000040055f in func3 (f=1) at do_abort.c:4
#4 0x0000000000400576 in func2 (f=1) at do_abort.c:5
#5 0x000000000040058d in func1 (f=1) at do_abort.c:6
#6 0x000000000040059d in main () at do_abort.c:7
(gdb) quit
[~/t]$mv 1234567890123456 123456789012345
[~/t]$gdb -q -c core.42721 123456789012345
Reading symbols from /home/dev/t/123456789012345...done.
[New LWP 42721]
Core was generated by `./1234567890123456'.
Program terminated with signal 6, Aborted.
#0 0x00007f5b271fa1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f5b271fa1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f5b271fb8c8 in __GI_abort () at abort.c:90
#2 0x0000000000400543 in func4 (f=1) at do_abort.c:3
#3 0x000000000040055f in func3 (f=1) at do_abort.c:4
#4 0x0000000000400576 in func2 (f=1) at do_abort.c:5
#5 0x000000000040058d in func1 (f=1) at do_abort.c:6
#6 0x000000000040059d in main () at do_abort.c:7
(gdb) quit
Following through the gdb source code I discovered that the ELF core file structure only reserves sixteen bytes to hold the executable filename, pr_fname[16], including the nul terminator:
struct elf_external_linux_prpsinfo32_ugid32
{
    char pr_state; /* Numeric process state. */
    char pr_sname; /* Char for pr_state. */
    char pr_zomb; /* Zombie. */
    char pr_nice; /* Nice val. */
    char pr_flag[4]; /* Flags. */
    char pr_uid[4];
    char pr_gid[4];
    char pr_pid[4];
    char pr_ppid[4];
    char pr_pgrp[4];
    char pr_sid[4];
    char pr_fname[16]; /* Filename of executable. */
    char pr_psargs[80]; /* Initial part of arg list. */
};
The "warning: core file may not match specified executable file." warning will be issued by gdb when the name of the executable passed on the command-line to gdb doesn't match the value stored in pr_fname[] in the core file.
Using the demonstration I showed at the start of this answer, when the filename is 1234567890123456 the filename stored in the core file as pr_fname[] is 123456789012345 (truncated to 15 characters). If gdb is started using gdb -c core.XXXX 1234567890123456 then the warning will be issued. If gdb is started using gdb -c core.XXXX 123456789012345 then the warning will not be issued.
It should follow that in the example from the original question, if segfault.stripped was renamed to segfault.stripp and gdb was run using gdb ./segfault.stripp /tmp/core.segfault.stripp.11 then the warning should not be issued.

Trying to understand example char_array2.c from "the art of exploitation"

OK, so I am trying to understand what's going on in this example from "The Art of Exploitation", second edition. I am trying to see exactly what is happening by closely following the GDB output in the book. My greatest problem is with the last part; I included the whole thing so that everyone can see what's going on. Granted, I have only very basic knowledge of assembly code. I do understand basic C.
In the last part, the author says that in the second run of the program there is a minor difference from the first run in the address that strcpy() points to, and I just can't see it.
The program is simply
#include<stdio.h>
#include<string.h>
int main() {
char str_a[20];
strcpy(str_a, "Hello, world!\n");
printf(str_a);
}
After I compile it with the necessary options to be able to debug it, I load it in GDB and enter the following:
(gdb) break 6
Breakpoint 1 at 0x80483c4: file char_array2.c, line 6.
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
(gdb) break 8
Breakpoint 3 at 0x80483d7: file char_array2.c, line 8.
(gdb)
I have no problem with this; it is my understanding that the debugger can only do this sort of thing with user-defined functions. I also know how to get around this problem with gcc options.
I also know that when the program runs, the strcpy breakpoint is resolved. Let me continue.
(gdb) run
Starting program: /home/reader/booksrc/char_array2
Breakpoint 4 at 0xb7f076f4
Pending breakpoint "strcpy" resolved
Breakpoint 1, main() at char_array2.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) i r eip
eip 0x80483c4 0x80483c4 <main+16>
(gdb) x/5i $eip
0x80483c4 <main+16>: mov DWORD PTR [esp+4],0x80484c4
0x80483cc <main+24>: lea eax,[ebp-40]
0x80483cf <main+27>: mov DWORD PTR [esp],eax
0x80483d2 <main+30>: call 0x80482c4 <strcpy@plt>
0x80483d7 <main+35>: lea eax,[ebp-40]
(gdb) continue
Continuing.
Breakpoint 4, 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
(gdb) i r eip
eip 0xb7f076f4 0xb7f076f4 <strcpy+4>
(gdb) x/5i $eip
0xb7f076f4 <strcpy+4>: mov esi,DWORD PTR [ebp+8]
0xb7f076f7 <strcpy+7>: mov eax,DWORD PTR [ebp+12]
0xb7f076fa <strcpy+10>: mov ecx,esi
0xb7f076fc <strcpy+12>: sub ecx,eax
0xb7f076fe <strcpy+14>: mov edx,eax
(gdb) continue
Continuing.
Breakpoint 3, main () at char_array2.c:8
8 printf(str_a);
(gdb) i r eip
eip 0x80483d7 0x80483d7 <main+35>
(gdb) x/5i $eip
0x80483d7 <main+35>: lea eax,[ebp-40]
0x80483da <main+38>: mov DWORD PTR [esp],eax
0x80483dd <main+41>: call 0x80482d4 <printf@plt>
0x80483e2 <main+46>: leave
0x80483e3 <main+47>: ret
(gdb)
This is the second run of the program, in which the address of strcpy is supposedly different from the first run.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/reader/booksrc/char_array2
Error in re-setting breakpoint 4:
Function "strcpy" not defined.
Breakpoint 1, main () at char_array2.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) bt
#0 main () at char_array2.c:7
(gdb) cont
Continuing.
Breakpoint 4, 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
#1 0x080483d7 in main () at char_array2.c:7
(gdb) cont
Continuing.
Breakpoint 3, main () at char_array2.c:8
8 printf(str_a);
(gdb) bt
#0 main () at char_array2.c:8
(gdb)
Where is the difference? Am I wrong in thinking that 0xb7f076f4 is the address of strcpy? If I am correct, everything in the second run indicates that the address is 0xb7f076f4.
Also, what is ? I can't find an explanation for this anywhere earlier in the book. If someone could explain this to me from the top down, I would appreciate it very much, since I don't know any expert in real life who could help me. I find the explanations vague: he explains variables and loops as if to a five-year-old, but leaves much of the assembly code for us to figure out ourselves, and I have not been very successful at that.
Any help would be greatly appreciated.
Apparently gdb turns off ASLR for the debugged process to make (session-to-session) debugging easier.
From https://sourceware.org/gdb/current/onlinedocs/gdb/Starting.html
set disable-randomization
set disable-randomization on
This option (enabled by default in GDB) will turn off the native
randomization of the virtual address space of the started program.
This option is useful for multiple debugging sessions to make the
execution better reproducible and memory addresses reusable across
debugging sessions.
Run set disable-randomization off in gdb, or put it in a .gdbinit file, and try again. libc should now be loaded at a different address each time you run the binary.
Running watch -n 1 cat /proc/self/maps is also a nice way to see how the binary and the libraries are mapped at 'random' addresses.
As @Voo said in his comment above, the book probably refers to ASLR (Address Space Layout Randomization), which is a security feature. It changes how the address space is laid out for each execution, so you can't rely on always finding things in the same place.
If you don't see it in gdb, that means ASLR is turned off, either globally or locally in gdb. You can check the former using cat /proc/sys/kernel/randomize_va_space and the latter using the show disable-randomization command at the gdb prompt.

What does this mean in gdb?

Program received signal SIGSEGV, Segmentation fault.
0x08049795 in execute_jobs ()
Current language: auto; currently asm
(gdb) info symbol 0x08049795
execute_jobs + 22 in section .text
(gdb) ptype 0x08049795
type = int
How to get the line number at which the error occurred?
Your binary was not compiled with debugging information. Rebuild with at least -g (or -ggdb, or -ggdb -g3; see the GCC manual).
The exact lines from the GDB output:
(gdb) info symbol 0x08049795
execute_jobs + 22 in section .text
mean that the instruction at address 0x08049795, which is 22 bytes from the beginning of function execute_jobs, generated the segmentation fault.
(gdb) ptype 0x08049795
type = int
Here you are asking for the type of an integer, and GDB happily replies. Do
(gdb) x/10i 0x08049795
or
(gdb) disassemble execute_jobs
to see actual instructions.
The gdb command "bt" will show you a backtrace. Unless you've corrupted the stack, this should show the sequence of function calls that led to the segfault. To get more meaningful information, make sure that you've compiled your program with debug information by including -g on the gcc/g++ command line.

Resources