Tracing a program from its entry point with ptrace (Linux, C)

I want to trace the registers and instructions of a program using ptrace. For a better understanding, I have reduced my code to the point where it just counts the number of instructions executed by "/bin/ls".
Here is my code (ignore the unnecessary includes):
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <sys/reg.h>
#include <sys/syscall.h>

int main()
{
    pid_t child;

    child = fork(); // create child
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        char *child_argv[] = {"/bin/ls", NULL};
        execv("/bin/ls", child_argv);
    }
    else {
        int status;
        long long ins_count = 0;
        while (1)
        {
            // stop tracing if the child terminated successfully
            wait(&status);
            if (WIFEXITED(status))
                break;
            ins_count++;
            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
        }
        printf("\n%lld Instructions executed.\n", ins_count);
    }
    return 0;
}
When I run this code I get "484252 Instructions executed", which I really doubt. I googled and found out that most of these instructions come from loading libraries before the actual program (/bin/ls) is executed.
How can I skip the single-stepping up to the first actual instruction of /bin/ls and count from there?

You're right, your count includes the dynamic linker doing its job (and AFAIK a single phantom instruction just before the binary starts executing).
(I'm using shell commands but it can all be done from C code as well, using elf.h; see the musl dynamic linker for a nice example)
You could:
Parse /bin/ls's ELF header to find the entrypoint and the program header containing the entrypoint (I'm using cat here as it's easier to keep it running for a long time while I'm writing this)
# readelf -l /bin/cat
Elf file type is EXEC (Executable file)
Entry point 0x4025b0
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
(...)
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000b36c 0x000000000000b36c R E 0x200000
(...)
The entrypoint is between VirtAddr and VirtAddr+FileSiz and the flags include the executable bit (E) so it looks like we're on the right track.
Note: Elf file type is EXEC (and not DYN) means that the program headers are always mapped at the fixed locations specified in VirtAddr; this means that for my build of cat we could just use the entry point address we found above. DYN binaries can be (and are) loaded at an arbitrary address, so for those we need to do this relocation dance.
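If you'd rather do this step from C (as mentioned above, using elf.h), a minimal sketch could look like the following; it assumes a 64-bit ELF and skips most error handling:

#include <elf.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");          /* e.g. "/bin/cat" */
    if (!f) { perror("fopen"); return 1; }

    Elf64_Ehdr ehdr;
    fread(&ehdr, sizeof(ehdr), 1, f);        /* ELF header: e_entry, e_phoff, e_phnum */

    Elf64_Phdr phdr;
    fseek(f, ehdr.e_phoff, SEEK_SET);
    for (int i = 0; i < ehdr.e_phnum; i++) {
        fread(&phdr, sizeof(phdr), 1, f);
        if (phdr.p_type == PT_LOAD) {        /* first LOAD segment: lowest p_vaddr */
            printf("entry point:   0x%lx\n", (unsigned long)ehdr.e_entry);
            printf("first PT_LOAD: 0x%lx\n", (unsigned long)phdr.p_vaddr);
            break;
        }
    }
    fclose(f);
    return 0;
}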
Find the actual load address of the binary
AFAIK program headers are sorted by VirtAddr, so the first segment with the LOAD flag will be mapped to the lowest address. Open /proc/<pid>/maps and look for your binary:
# grep /bin/cat /proc/7431/maps
00400000-0040c000 r-xp 00000000 08:03 1046541 /bin/cat
0060b000-0060c000 r--p 0000b000 08:03 1046541 /bin/cat
0060c000-0060d000 rw-p 0000c000 08:03 1046541 /bin/cat
The first segment is mapped to 0x00400000 (which is expected from ELF type == EXEC). If it was not, you'd need to adjust the entrypoint address:
actual_entrypoint_addr = elf_entrypoint_addr - elf_virt_addr_of_first_phdr + actual_addr_of_first_phdr
Set a breakpoint on actual_entrypoint_addr and call ptrace(PTRACE_CONT). Once the breakpoint hits (waitpid() returns), proceed as you have so far (count the ptrace(PTRACE_SINGLESTEP)s).
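A minimal sketch of that breakpoint step, assuming x86-64 and that entry already holds the (relocated) entry point address; error handling is omitted and this is not production-quality:

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <errno.h>

static void run_to_entry(pid_t child, unsigned long entry)
{
    errno = 0;
    long orig = ptrace(PTRACE_PEEKTEXT, child, (void *)entry, NULL);  /* save original word */
    long trap = (orig & ~0xffL) | 0xcc;                               /* patch in int3 */
    ptrace(PTRACE_POKETEXT, child, (void *)entry, (void *)trap);

    ptrace(PTRACE_CONT, child, NULL, NULL);                           /* let the dynamic linker run */
    int status;
    waitpid(child, &status, 0);                                       /* stops with SIGTRAP at entry+1 */

    ptrace(PTRACE_POKETEXT, child, (void *)entry, (void *)orig);      /* restore the original byte */

    struct user_regs_struct regs;
    ptrace(PTRACE_GETREGS, child, NULL, &regs);
    regs.rip = entry;                                                 /* rewind over the int3 */
    ptrace(PTRACE_SETREGS, child, NULL, &regs);
    /* from here on, single-step and count as before */
}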
An example where we would need to handle the relocation:
# readelf -l /usr/sbin/nginx
Elf file type is DYN (Shared object file)
Entry point 0x24e20
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
(...)
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000010df54 0x000000000010df54 R E 0x200000
(...)
# grep /usr/sbin/nginx /proc/1425/maps
55e299e78000-55e299f86000 r-xp 00000000 08:03 660029 /usr/sbin/nginx
55e29a186000-55e29a188000 r--p 0010e000 08:03 660029 /usr/sbin/nginx
55e29a188000-55e29a1a4000 rw-p 00110000 08:03 660029 /usr/sbin/nginx
The entrypoint is at 0x55e299e78000 - 0 + 0x24e20 == 0x55e299e9ce20
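For completeness, here is a small sketch of finding that load base from C by reading /proc/<pid>/maps; it assumes, as noted above, that the first line mentioning the binary is the lowest mapping:

#include <stdio.h>
#include <string.h>
#include <sys/types.h>

unsigned long find_load_base(pid_t pid, const char *binary)
{
    char path[64], line[512];
    unsigned long base = 0;

    snprintf(path, sizeof(path), "/proc/%d/maps", (int)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof(line), f)) {
        if (strstr(line, binary)) {          /* first match == lowest-addressed mapping */
            sscanf(line, "%lx-", &base);
            break;
        }
    }
    fclose(f);
    return base;
}
/* actual_entrypoint_addr = find_load_base(pid, binary)
                            - elf_virt_addr_of_first_phdr + elf_entrypoint_addr */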

Related

Can't view C structures from an assembly program? [duplicate]

I recently started learning assembly language for the Intel x86-64 architecture using YASM. While solving one of the tasks suggested in a book (by Ray Seyfarth) I came to the following problem:
When I place some characters into a buffer in the .bss section, I still see an empty string while debugging it in gdb. Placing characters into a buffer in the .data section shows up as expected in gdb.
segment .bss
result resb 75
buf resw 100
usage resq 1
segment .data
str_test db 0, 0, 0, 0
segment .text
global main
main:
mov rbx, 'A'
mov [buf], rbx ; LINE - 1 STILL GET EMPTY STRING AFTER THAT INSTRUCTION
mov [str_test], rbx ; LINE - 2 PLACES CHARACTER NICELY.
ret
In gdb I get:
after LINE 1: x/s &buf, result - 0x7ffff7dd2740 <buf>: ""
after LINE 2: x/s &str_test, result - 0x601030: "A"
It looks like &buf isn't evaluating to the correct address, so it still sees all-zeros. 0x7ffff7dd2740 isn't in the BSS of the process being debugged, according to its /proc/PID/maps, so that makes no sense. Why does &buf evaluate to the wrong address, but &str_test evaluates to the right address? Neither are "global" symbols, but we did build with debug info.
Tested with GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10 on x86-64 Ubuntu 15.10.
I'm building with
yasm -felf64 -Worphan-labels -gdwarf2 buf-test.asm
gcc -g buf-test.o -o buf-test
nm on the executable shows the correct symbol addresses:
$ nm -n buf-test # numeric sort, heavily edited to omit symbols from glibc
...
0000000000601028 D __data_start
0000000000601038 d str_test
...
000000000060103c B __bss_start
0000000000601040 b result
000000000060108b b buf
0000000000601153 b usage
(editor's note: I rewrote a lot of the question because the weirdness is in gdb's behaviour, not the OP's asm!).
glibc includes a symbol named buf, as well.
(gdb) info variables ^buf$
All variables matching regular expression "^buf$":
File strerror.c:
static char *buf;
Non-debugging symbols:
0x000000000060108b buf <-- this is our buf
0x00007ffff7dd6400 buf <-- this is glibc's buf
gdb happens to choose the symbol from glibc over the symbol from the executable. This is why ptype buf shows char *.
Using a different name for the buffer avoids the problem, and so does declaring global buf to make it a global symbol. You also wouldn't have a problem if you wrote a stand-alone program that didn't link libc (i.e. define _start and make an exit system call instead of running a ret).
Note that 0x00007ffff7dd6400 (address of buf on my system; different from yours) is not actually a stack address. It visually looks like a stack address, but it's not: it has a different number of f digits after the 7. Sorry for that confusion in comments and an earlier edit of the question.
Shared libraries are also loaded near the top of the low 47 bits of virtual address space, near where the stack is mapped. They're position-independent, but a library's BSS space has to be in the right place relative to its code. Checking /proc/PID/maps again more carefully, gdb's &buf is in fact in the rwx block of anonymous memory (not mapped to any file) right next to the mapping for libc-2.21.so.
7ffff7a0f000-7ffff7bcf000 r-xp 00000000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7bcf000-7ffff7dcf000 ---p 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dcf000-7ffff7dd3000 r-xp 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd3000-7ffff7dd5000 rwxp 001c4000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd5000-7ffff7dd9000 rwxp 00000000 00:00 0 <--- &buf is in this mapping
...
7ffffffdd000-7ffffffff000 rwxp 00000000 00:00 0 [stack] <---- more FFs before the first non-FF than in &buf.
A normal call instruction with a rel32 encoding can't reach a library function, but it doesn't need to because GNU/Linux shared libraries have to support symbol interposition, so calls to library functions actually jump to the PLT, where an indirect jmp (with a pointer from the GOT) goes to the final destination.

How is the virtual address of the process memory calculated?

I wrote the following program to examine the process memory layout:
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

#define CHAR_LEN 255

char filepath[CHAR_LEN];
char line[CHAR_LEN];
char address[CHAR_LEN];
char perms[CHAR_LEN];
char offset[CHAR_LEN];
char dev[CHAR_LEN];
char inode[CHAR_LEN];
char pathname[CHAR_LEN];

int main() {
    printf("Hello world.\n");
    sprintf(filepath, "/proc/%u/maps", (unsigned)getpid());
    FILE *f = fopen(filepath, "r");
    printf("%-32s %-8s %-10s %-8s %-10s %s\n", "address", "perms", "offset",
           "dev", "inode", "pathname");
    while (fgets(line, sizeof(line), f) != NULL) {
        sscanf(line, "%s%s%s%s%s%s", address, perms, offset, dev, inode, pathname);
        printf("%-32s %-8s %-10s %-8s %-10s %s\n", address, perms, offset, dev,
               inode, pathname);
    }
    fclose(f);
    return 0;
}
I compile the program as gcc -static -O0 -g -std=gnu11 -o test_helloworld_memory_map test_helloworld_memory_map.c -lpthread. I first run readelf -l test_helloworld_memory_map and obtain:
Elf file type is EXEC (Executable file)
Entry point 0x400890
There are 6 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000c9e2e 0x00000000000c9e2e R E 200000
LOAD 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000001c98 0x0000000000003db0 RW 200000
NOTE 0x0000000000000190 0x0000000000400190 0x0000000000400190
0x0000000000000044 0x0000000000000044 R 4
TLS 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000000020 0x0000000000000050 R 8
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000000148 0x0000000000000148 R 1
Section to Segment mapping:
Segment Sections...
00 .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_atexit .stapsdt.base __libc_thread_subfreeres .eh_frame .gcc_except_table
01 .tdata .init_array .fini_array .jcr .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs
02 .note.ABI-tag .note.gnu.build-id
03 .tdata .tbss
04
05 .tdata .init_array .fini_array .jcr .data.rel.ro .got
Then, I run the program and obtain:
address perms offset dev inode pathname
00400000-004ca000 r-xp 00000000 fd:01 12551992 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006c9000-006cc000 rw-p 000c9000 fd:01 12551992 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
006cc000-006ce000 rw-p 00000000 00:00 0 /home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
018ac000-018cf000 rw-p 00000000 00:00 0 [heap]
7ffc2845c000-7ffc2847d000 rw-p 00000000 00:00 0 [stack]
7ffc28561000-7ffc28563000 r--p 00000000 00:00 0 [vvar]
7ffc28563000-7ffc28565000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
I'm confused about why the virtual address of a memory segment is different from the one shown in "/proc/[pid]/maps". For example, readelf shows 0xc9eb8 for the 2nd memory segment, but in the process memory it comes out as 0x6c9000. How is this calculation done?
I know the linker specifies 0x400000 as the starting address of the first memory segment, and the process memory shows addresses aligned to the page size (4K) (e.g., 0xc9e2e is aligned up to 0xca000, plus 0x400000). I think this has something to do with the "Align" column shown by readelf. However, reading the ELF header documentation confuses me:
p_align   This member holds the value to which the segments are aligned in
          memory and in the file. Loadable process segments must have
          congruent values for p_vaddr and p_offset, modulo the page size.
          Values of zero and one mean no alignment is required. Otherwise,
          p_align should be a positive, integral power of two, and p_vaddr
          should equal p_offset, modulo p_align.
Specifically, what does the last sentence mean?
Otherwise, p_align should be a positive, integral power of two, and p_vaddr should equal p_offset, modulo p_align.
What's the calculation formula it is talking about?
Thanks much!
CPU address mapping has page granularity, and 4K is still a very common page size. /proc/$pid/maps shows you the OS mappings; it doesn't show you which addresses the process actually cares about inside the mapped ranges. Your process only cares about what starts at offset eb8 into the first mapped page, but the CPU (and hence the OS that controls it for you) can't be bothered to map down to byte granularity, and the linker knows it, so it lays out the file in CPU-page-sized blocks.
It means that for segments other than loadable ones (i.e. those whose type is not LOAD), the last n bits of the offset must match the last n bits of the virtual address, where the value of the p_align field is 1 << n.
For example, the stack says it can be placed anywhere, just that the address needs to be 16-aligned.
For loadable they need to be at least page-aligned. Take the second one from your example:
Offset VirtAddr
LOAD 0x00000000000c9eb8 0x00000000006c9eb8 0x00000000006c9eb8
0x0000000000001c98 0x0000000000003db0 RW 200000
Given a page size of 4096, the last 12 bits of the offset must be the same as the last 12 bits of the virtual address. This is because a dynamic linker usually uses mmap to map the pages directly from the file into memory, and that can only be page-granular. So in fact the dynamic linker did map the first part of this range from the file:
006c9000-006cc000 rw-p 000c9000 fd:01 12551992
/home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
Note also that the file size is less than the virtual size; the rest of the data is zero-mapped in the other mapping:
006cc000-006ce000 rw-p 00000000 00:00 0
/home/zeyuanhu/share/380L-Spring19/lab3/src/test_helloworld_memory_map
If you read the bytes at 0x00000000006c9000 - 0x00000000006c9eb7 you should see the exact same bytes as those at 0x00000000004c9000 - 0x00000000004c9eb7. This is because the data segment and the code segment come right after each other in the file without padding - this saves lots of disk space, and it actually helps save RAM too, because the executable takes less space in the block device caches!
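To make the congruence concrete, here is a rough sketch of the page rounding and the kind of mmap() call a loader issues for that RW LOAD segment. The numbers are taken from the readelf output above; this is an illustration, not the actual dynamic-linker code:

#include <sys/mman.h>

void map_data_segment(int fd)
{
    unsigned long page     = 0x1000;                /* 4 KiB pages */
    unsigned long p_vaddr  = 0x6c9eb8;
    unsigned long p_offset = 0xc9eb8;
    unsigned long p_filesz = 0x1c98;

    unsigned long skew  = p_vaddr & (page - 1);     /* 0xeb8, identical for p_offset */
    unsigned long start = p_vaddr - skew;           /* 0x6c9000: start of the mapping */

    mmap((void *)start, skew + p_filesz,            /* length 0x2b50, rounded up by the kernel */
         PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED,
         fd, p_offset - skew);                      /* file offset 0xc9000, page-aligned */

    /* MemSiz > FileSiz: the remaining bytes (.bss) are covered by an anonymous
       zero-filled mapping, which is the second rw-p line in /proc/<pid>/maps above */
}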

Interesting binary dump of executable file

For some reason I made a simple program in C to output the binary representation of given input:
#include <unistd.h>

int main()
{
    char c;
    while (read(0, &c, 1) > 0)
    {
        unsigned char cmp = 128;
        while (cmp)
        {
            if (c & cmp)
                write(1, "1", 1);
            else
                write(1, "0", 1);
            cmp >>= 1;
        }
    }
    return 0;
}
After compilation:
$ gcc bindump.c -o bindump
I made a simple test to check whether the program is able to print binary:
$ cat bindump | ./bindump | fold -b100 | nl
The output is the following: http://pastebin.com/u7SasKDJ
I expected the output to look like a random series of ones and zeros. However, parts of it turn out to be more interesting. For example, take a look at the output between lines 171 and 357. I wonder why there are so many zeros compared to other sections of the executable?
My architecture is:
$ lscpu
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 28
Stepping: 10
CPU MHz: 1000.000
BogoMIPS: 3325.21
Virtualization: VT-x
L1d cache: 24K
L1i cache: 32K
L2 cache: 512K
When you compile a program into an executable on Linux (and a number of other unix systems), it is written in the ELF format. The ELF format has a number of sections, which you can examine with readelf or objdump:
readelf -a bindump | less
For example, section .text contains the CPU instructions, .data contains initialized global variables, .bss uninitialized global variables (it is actually empty in the ELF file itself, but is created in main memory when the program is executed), .plt and .got are jump tables, plus debugging information, etc.
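If you want to do the same exploration from C instead of readelf, a rough sketch using elf.h could look like this (32-bit ELF to match the i686 build here, minimal error handling); it lists section names, file offsets and sizes:

#include <elf.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    Elf32_Ehdr eh;
    fread(&eh, sizeof(eh), 1, f);

    Elf32_Shdr *sh = malloc(eh.e_shnum * sizeof(*sh));
    fseek(f, eh.e_shoff, SEEK_SET);
    fread(sh, sizeof(*sh), eh.e_shnum, f);

    /* section names live in the section-header string table */
    char *shstr = malloc(sh[eh.e_shstrndx].sh_size);
    fseek(f, sh[eh.e_shstrndx].sh_offset, SEEK_SET);
    fread(shstr, 1, sh[eh.e_shstrndx].sh_size, f);

    for (int i = 0; i < eh.e_shnum; i++)
        printf("%-20s off=0x%06x size=0x%06x\n",
               shstr + sh[i].sh_name,
               (unsigned)sh[i].sh_offset, (unsigned)sh[i].sh_size);

    fclose(f);
    return 0;
}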
By the way, it is much more convenient to examine the binary content of files with hexdump:
hexdump -C bindump | less
There you can see that starting with offset 0x850 (approx. line 171 in your dump) there is a lot of zeros, and you can also see the ASCII representation on the right.
Let us look at which sections correspond to the block of your interest between 0x850 and 0x1160 (the field Off – offset in the file is important here):
> readelf -a bindump
...
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
...
[28] .shstrtab STRTAB 00000000 00074c 000106 00 0 0 1
[29] .symtab SYMTAB 00000000 000d2c 000440 10 30 45 4
...
You can examine the content of an individual section with -x:
> readelf -x .symtab bindump | less
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 34810408 00000000 03000100 ....4...........
0x00000020 00000000 48810408 00000000 03000200 ....H...........
0x00000030 00000000 68810408 00000000 03000300 ....h...........
0x00000040 00000000 8c810408 00000000 03000400 ................
0x00000050 00000000 b8810408 00000000 03000500 ................
0x00000060 00000000 d8810408 00000000 03000600 ................
You would see that there are many zeros. The section is composed of 16-byte entries (= one line in the -x output) defining symbols. From readelf -a you can see that it has 68 entries, and the first 27 of them (excluding the very first one) are of type SECTION:
Symbol table '.symtab' contains 68 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 08048134 0 SECTION LOCAL DEFAULT 1
2: 08048148 0 SECTION LOCAL DEFAULT 2
3: 08048168 0 SECTION LOCAL DEFAULT 3
4: 0804818c 0 SECTION LOCAL DEFAULT 4
...
According to the specification (page 1-18), each entry has the following format:
typedef struct {
Elf32_Word st_name;
Elf32_Addr st_value;
Elf32_Word st_size;
unsigned char st_info;
unsigned char st_other;
Elf32_Half st_shndx;
} Elf32_Sym;
Without going into too much detail, I think what matters here is that st_name and st_size are both zero for these SECTION entries. Both are 32-bit numbers, which means lots of zeros in this particular section.
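For illustration, here is one of those SECTION entries decoded byte by byte (the 16 bytes at offset 0x10 of the dump above, little-endian):

    00 00 00 00   st_name  = 0           (no name, hence zeros)
    34 81 04 08   st_value = 0x08048134  (matches symbol 1 in the table above)
    00 00 00 00   st_size  = 0           (no size, hence zeros)
    03            st_info  = 0x03        (LOCAL binding, SECTION type)
    00            st_other = 0
    01 00         st_shndx = 1           (section index)

At least 8 of the 16 bytes of every such entry are guaranteed to be zero, which accounts for a good part of the zero runs in that region of the file.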
This is not really a programming question, but anyway...
A binary normally consists of different sections: code, data, debugging info, etc. Since these sections' contents differ by type, I would pretty much expect them to look different.
E.g. the symbol table consists of address offsets into your binary. If I read your lscpu correctly, you are on a 32-bit system. That means each offset is four bytes, and given the size of your program, in most cases two of those bytes will be zero. And there are more effects like this.
You didn't strip your program, which means there's still lots of information (symbol table etc.) present in the binary. Try stripping the binary and have a look at it again.

C function profiling (addresses seem to be offset)

I'm trying to profile function calls using the -finstrument-functions option.
Basically, what I have done is write the following into one of the compiled sources:
#include <stdio.h>

static int __stepper = 0;

void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    int i = 0;
    for (; i < __stepper; i++) printf(" ");
    printf("E: %p %p\n", this_fn, call_site);
    __stepper++;
} /* __cyg_profile_func_enter */

void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    int i = 0;
    __stepper--;
    for (; i < __stepper; i++) printf(" ");
    printf("L: %p %p\n", this_fn, call_site);
} /* __cyg_profile_func_exit */
And got the following results:
E: 0xb7597ea0 0xb75987a8
E: 0xb7597de0 0xb7597ef5
L: 0xb7597de0 0xb7597ef5
L: 0xb7597ea0 0xb75987a8
All the function call addresses are around that region (0xb7.......).
But if I try to read the function symbols using readelf -s, it gives the following:
2157: 00101150 361 FUNC LOCAL DEFAULT 13 usb_audio_initfn
2158: 00100940 234 FUNC LOCAL DEFAULT 13 usb_audio_handle_reset
2159: 00100de0 867 FUNC LOCAL DEFAULT 13 usb_audio_handle_control
The address region of all the functions in the binary is around 0x00......
So I am not able to get the function name from the function pointers.
It looks like somehow the function pointer gets an offset or something.
Does anybody have any idea?
From the question it looks like you're profiling a library function.
To know which functions are being measured, you have two options:
1. Run the program which uses the library under gdb and stop at main. At this point, get the pid of the program (PID=...) and do cat /proc/$PID/maps. There you should see something like this:
➜ ~ ps
PID TTY TIME CMD
18533 pts/4 00:00:00 zsh
18664 pts/4 00:00:00 ps
➜ ~ PID=18533
➜ ~ cat /proc/$PID/maps
00400000-004a2000 r-xp 00000000 08:01 3670052 /bin/zsh5
006a1000-006a2000 r--p 000a1000 08:01 3670052 /bin/zsh5
006a2000-006a8000 rw-p 000a2000 08:01 3670052 /bin/zsh5
006a8000-006bc000 rw-p 00000000 00:00 0
...
7fa174cc9000-7fa174ccd000 r-xp 00000000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ccd000-7fa174ecc000 ---p 00004000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ecc000-7fa174ecd000 r--p 00003000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ecd000-7fa174ece000 rw-p 00004000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
...
Here 7fa174cc9000 is the base address of the /lib/x86_64-linux-gnu/libcap.so.2.22 library. So all the addresses you get from readelf -s will be offset by that value. Knowing the base address, you can calculate back what the original offset in the file was.
I.e. if you got the value 7fa174206370 and the base address of the library is 7fa1741cf000, then the offset is 7fa174206370 - 7fa1741cf000 = 37370. In my example it's sigsuspend from glibc:
94: 0000000000037370 132 FUNC WEAK DEFAULT 12 sigsuspend@@GLIBC_2.2.5
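If you want to automate option 1, here is a small sketch that finds which /proc/self/maps line contains a given address (e.g. this_fn from the hooks) and prints the module plus the file-relative offset, which you can then look up with readelf -s; resolve is just a hypothetical helper name:

#include <stdio.h>

void resolve(void *addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512], path[256];
    unsigned long lo, hi, off;

    while (f && fgets(line, sizeof(line), f)) {
        path[0] = '\0';
        if (sscanf(line, "%lx-%lx %*4s %lx %*s %*s %255s", &lo, &hi, &off, path) >= 3
            && (unsigned long)addr >= lo && (unsigned long)addr < hi) {
            printf("%p -> %s + 0x%lx\n", addr, path, (unsigned long)addr - lo + off);
            break;
        }
    }
    if (f) fclose(f);
}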
2. Run gdb on the program which uses these libraries. It'll either immediately find the loaded library in memory, or it will need to be pointed to the .text section of the library.
> gdb
(gdb) attach YOUR_PID
(a lot of output about symbols)
(gdb) x/i 0x00007fa174206386
=> 0x7fa174206386 <sigsuspend+22>: cmp $0xfffffffffffff000,%rax
This way you know that 0x7fa174206386 is inside sigsuspend.
In case gdb doesn't load any symbols by itself (no output like Reading symbols from ... Loading symbols for ... after attach), you can look up the base address of the library as in option 1, then add to it the offset of the .text section:
➜ ~ readelf -S /lib/x86_64-linux-gnu/libcap.so.2.22 | grep '.text.'
[11] .text PROGBITS 0000000000001620 00001620
7fa174cc9000 + 0000000000001620 in hexadecimal gives 7FA174CCA620, and then you attach with gdb as above and do
(gdb) add-symbol-file /lib/x86_64-linux-gnu/libcap.so.2.22 7FA174CCA620
Then you should be able to find symbols (via x/i ADDRESS as in option 1) even if gdb doesn't load them by itself.
Please ask if anything is unclear, I'll try to explain.
Clarification on why this is so:
The observed behavior is due to the libraries being compiled as Position-Independent Code (PIC), which is what makes dynamic libraries easy to support. PIC essentially means that the library's ELF has .plt and .got sections and can be loaded at any base address. The PLT is the procedure linkage table; it contains trampolines for calls to functions located in other modules. These first go to the program interpreter so it can resolve the called function, and after that first call they jump straight to the function. This works because the program interpreter updates the GOT (Global Offset Table), which contains the addresses of the functions to call; initially the GOT is set up so that the first call of each function jumps to the resolver routine of the program interpreter.
On x86-64, PLT entries typically look like this:
0000000000001430 <free@plt>:
1430: ff 25 e2 2b 20 00 jmpq *0x202be2(%rip) # 204018 <_fini+0x201264>
1436: 68 00 00 00 00 pushq $0x0
143b: e9 e0 ff ff ff jmpq 1420 <_init+0x28>
The first jmpq is a jump to the address stored in the GOT at location %rip + 0x202be2:
[20] .got PROGBITS 0000000000203fd0 00003fd0
0000000000000030 0000000000000008 WA 0 0 8
%rip + 0x202be2 will be 0x204018 (%rip points at the next instruction, 0x1436), and that gets added to the base address of the library to produce the absolute address appropriate for where the library is actually loaded. I.e. if it's loaded at 0x7f66dfc03000, then the resulting address of the corresponding GOT entry will be 0x7F66DFE07018. The address stored at that location is the address of (in this example) the free function. It's maintained by the program interpreter to point to the actual free in libc.
More information on this can be found here.
What you need is the dladdr function. If the module (your main program or the shared library) in which the function in question is defined exports its symbols, then by calling dladdr you'll get the function name based on its address, and also the base address at which the module (e.g. your shared library) is loaded:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>

void find_func(void *pfnFuncAddr)
{
    Dl_info info;
    memset(&info, 0, sizeof(info));
    if (dladdr(pfnFuncAddr, &info) && info.dli_sname)
    {
        /* here: 'info.dli_sname' contains the function (symbol) name,
                 'info.dli_fname' the path of the module that contains it, and
                 'info.dli_fbase' the address at which that module is loaded */
    }
    else
    {
        /* if we got here it means the symbol could not be resolved
           (e.g. the module doesn't export it, or some other funny thing
           happened, such as calling a function written purely in assembly) */
    }
}
You have to add -ldl when linking.
Bear in mind that:
Function find_func needs to be called from your profiled process (read: somewhere from your __cyg_profile_func_enter or __cyg_profile_func_exit functions) because the address pfnFuncAddr is the actual function address (read: should be equal to the this_fn or call_site arguments of the __cyg_* functions); a sketch follows this list.
The function name that you'll get may be mangled (if it is a C++ function or a method of a class). You can demangle the name using the command-line tool c++filt. If you want to demangle from your profiler code, then you need to look at the bfd library and functions like bfd_read_minisymbols, bfd_demangle and friends. If you really want to profile your code, demangling all the function names later (after profiling) may be a good idea.
The difference in address values that you observed is exactly the difference between the actual address of the function(s) in question and the base address at which the module that contains the function was loaded (read: the info.dli_fbase).
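As a sketch of the first point, the enter hook could resolve names directly; this assumes you link with -ldl and, for symbols of the main executable, build with -rdynamic so they appear in the dynamic symbol table:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    Dl_info info;
    /* dli_sname may be NULL for static (non-exported) functions */
    if (dladdr(this_fn, &info) && info.dli_sname)
        printf("E: %s (%p) in %s\n", info.dli_sname, this_fn, info.dli_fname);
    else
        printf("E: %p (unknown)\n", this_fn);
}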
I hope that helps.

Location of destructors in elf files: not where it should be?

I register a token destructor function with
static void cleanup(void) __attribute__ ((destructor));
The function just prints a debug message; the token program runs fine (main() just prints another message; token function prints upon exit).
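For reference, a complete version of such a token program could look like this (hypothetical message strings):

#include <stdio.h>

static void cleanup(void) __attribute__ ((destructor));

static void cleanup(void)
{
    printf("destructor: cleanup() called\n");
}

int main(void)
{
    printf("main: running\n");
    return 0;
}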
When I look at the file with
nm ./a.out,
I see: 
08049f10 d __DTOR_END__
08049f0c d __DTOR_LIST__
However, the token destructor function's address should be at 0x08049f10 - an address which contains 0, indicating the end of the destructor list, as I can check using:
objdump -s ./a.out
At 0x08049f0c, I see 0xffffffff, as is expected for this location. It is my understanding that what I see in the ELF file would mean that no destructor was registered, yet the program executes one.
If someone could explain, I'd appreciate. Is this part of the security suite to prevent inserting malicious destructors? How does the compiler keep track of the destructors' addresses?
My system:
Ubuntu 12.04.
elf32-i386
Kernel: 3.2.0-30-generic-pae
gcc version: 4.6.3
__DTOR_LIST__ is the start of a table of destructors. Have a look at which section it is in (probably .dtors):
~> objdump -t test | grep DTOR_LIST
0000000000600728 l O .dtors 0000000000000000 __DTOR_LIST__
Then dump that section with readelf (or whatever):
~> readelf --hex-dump=.dtors test
Hex dump of section '.dtors':
0x00600728 ffffffff ffffffff 1c054000 00000000 ..........@.....
0x00600738 00000000 00000000 ........
Which in my test case contains a couple of entries that are presumably -1, a real pointer (the destructor), and then zero termination.
