vdso gettimeofday with 64 bit kernel & application compiled for 32 bit - c

Is the vDSO supported for a 32-bit application running on a 64-bit kernel with glibc 2.15? If so, how do I make it work? Even though dlopen on "linux-vdso.so.1" succeeds, dlsym on "__vdso_gettimeofday" fails.
On the same system I am able to dlopen "linux-vdso.so.1" and dlsym "__vdso_gettimeofday" from an application compiled for 64-bit.

On my 64-bit Linux 4.4.15, the 32-bit vdso has these symbols:
readelf -Ws vdso32
Symbol table '.dynsym' contains 9 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000ce0 9 FUNC GLOBAL DEFAULT 12 __kernel_sigreturn@@LINUX_2.5
2: 00000d00 13 FUNC GLOBAL DEFAULT 12 __kernel_vsyscall@@LINUX_2.5
3: 00000ad0 438 FUNC GLOBAL DEFAULT 12 __vdso_gettimeofday@@LINUX_2.6
4: 00000c90 42 FUNC GLOBAL DEFAULT 12 __vdso_time@@LINUX_2.6
5: 00000770 853 FUNC GLOBAL DEFAULT 12 __vdso_clock_gettime@@LINUX_2.6
6: 00000cf0 8 FUNC GLOBAL DEFAULT 12 __kernel_rt_sigreturn@@LINUX_2.5
7: 00000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.5
8: 00000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.6
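So a 32-bit process can find __vdso_gettimeofday on a recent kernel. Note that LINUX_2.6 is only the symbol-version tag, not the release that introduced the function: according to vdso(7), the i386 vDSO has exported __vdso_gettimeofday only since Linux 3.15. The most likely reason your dlsym fails is that your kernel is older and its 32-bit vDSO does not export the symbol.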
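To check what the vDSO of the running kernel actually exports to your process, something like the following can be used. It is only a minimal sketch: the helper name lookup_vdso_symbol is made up for illustration, and it assumes the vDSO is visible to the dynamic loader under the soname "linux-vdso.so.1" (x86-64) or "linux-gate.so.1" (i386).

```c
#define _GNU_SOURCE   /* for RTLD_NOLOAD */
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: look up `name` in the vDSO that the kernel already
 * mapped into this process. Returns NULL when the vDSO is not visible under
 * either soname or does not export the symbol.
 */
void *lookup_vdso_symbol(const char *name)
{
    /* Assumption: sonames used on x86-64 and on i386 respectively. */
    static const char *const sonames[] = { "linux-vdso.so.1", "linux-gate.so.1" };

    for (size_t i = 0; i < sizeof sonames / sizeof sonames[0]; i++) {
        /* RTLD_NOLOAD: only hand back a handle if the object is already mapped. */
        void *handle = dlopen(sonames[i], RTLD_LAZY | RTLD_NOLOAD);
        if (handle == NULL)
            continue;

        void *sym = dlsym(handle, name);
        dlclose(handle);          /* drops our reference; the vDSO stays mapped */
        if (sym != NULL)
            return sym;
    }
    return NULL;
}
```

Build it once with -m32 and once without (add -ldl on older glibc versions) and call lookup_vdso_symbol("__vdso_gettimeofday") from main. If the 32-bit build gets NULL while the 64-bit build gets a pointer, the 32-bit vDSO of that kernel simply does not export the symbol, and calling plain gettimeofday() is the fallback.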

Related

Hide function name in GCC compilation

I am compiling a C "hello world" program that just includes one simple function and a main function.
I am using GCC under Linux.
When I run readelf on the binary, I can see the symbol table, with the function names in clear text.
Is there a way to tell GCC (or the linker) to not generate this symbol table?
Is it possible to tell GCC to store only functions addresses, without storing function names in clear?
Use the -s option to strip the symbol table:
gcc -s -o hello hello.c
The utility strip discards symbols from object files.
Consider:
#include <stdio.h>
static void static_func(void)
{
puts(__FUNCTION__);
}
void func(void)
{
puts(__FUNCTION__);
}
int main(void)
{
static_func();
func();
return 0;
}
On a freshly compiled binary, readelf produces:
Symbol table '.symtab' contains 71 entries:
Num: Value Size Type Bind Vis Ndx Name
....
37: 0000000000000000 0 FILE LOCAL DEFAULT ABS hide.c
38: 0000000000400526 17 FUNC LOCAL DEFAULT 14 static_func
....
61: 0000000000400537 17 FUNC GLOBAL DEFAULT 14 func
....
66: 0000000000400548 21 FUNC GLOBAL DEFAULT 14 main
....
And after stripping the binary, the whole output is:
Symbol table '.dynsym' contains 4 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2)
3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__

How to load library defined symbols to a specified location?

The test is on Ubuntu 12.04, 32-bit, with gcc 4.6.3.
Basically I am doing some binary manipulation work on ELF binaries, and what I have to do now is assemble an assembly program and guarantee that the libc symbols are loaded at an address I have predefined.
Let me elaborate with a simple example.
Suppose the original code uses the libc symbol stdout@GLIBC_2.0.
#include <stdio.h>
int main() {
FILE* fout = stdout;
fprintf( fout, "hello\n" );
}
When I compile it and check the symbol address using these commands:
gcc main.c
readelf -s a.out | grep stdout
I got this:
0804a020 4 OBJECT GLOBAL DEFAULT 25 stdout@GLIBC_2.0 (2)
0804a020 4 OBJECT GLOBAL DEFAULT 25 stdout@@GLIBC_2.0
and the .bss section is like this:
readelf -S a.out | grep bss
[25] .bss NOBITS 0804a020 001014 00000c 00 WA 0 0 32
Now what I am trying to do is to load the stdout symbol in a predefined address, so I did this:
echo "stdout = 0x804a024;" > symbolfile
gcc -Wl,--just-symbols=symbolfile main.c
Then when I check the .bss section and symbol stdout, I got this:
[25] .bss NOBITS 0804a014 001014 000008 00 WA 0 0 4
4: 0804a024 0 NOTYPE GLOBAL DEFAULT ABS stdout
49: 0804a024 0 NOTYPE GLOBAL DEFAULT ABS stdout
It seems that I didn't successfully load the symbol stdout@@GLIBC_2.0, but just a weird, unversioned stdout. (I tried writing stdout@@GLIBC_2.0 in symbolfile, but then it doesn't compile...)
Because this didn't work, the start address of the .bss section has also changed, which puts the address of the stdout symbol outside any section. At runtime, the program segfaults when loading from 0x804a024.
Could anyone help me with how to load the library symbol at a predefined address? Thanks!

Interesting binary dump of executable file

For some reason I wrote a simple program in C that outputs the binary representation of its input:
#include <unistd.h> /* read, write */
int main()
{
    char c;
    while(read(0,&c,1) > 0)
    {
        unsigned char cmp = 128;
        while(cmp)
        {
            if(c & cmp)
                write(1,"1",1);
            else
                write(1,"0",1);
            cmp >>= 1;
        }
    }
    return 0;
}
After compilation:
$ gcc bindump.c -o bindump
I made a simple test to check that the program is able to print binary:
$ cat bindump | ./bindump | fold -b100 | nl
Output is following: http://pastebin.com/u7SasKDJ
I expected the output to look like a random series of ones and zeroes. However, parts of the output turn out to be more interesting. For example, take a look at the output between lines 171 and 357. I wonder why there are so many zeros there compared to other sections of the executable?
My architecture is:
$ lscpu
Architecture: i686
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 28
Stepping: 10
CPU MHz: 1000.000
BogoMIPS: 3325.21
Virtualization: VT-x
L1d cache: 24K
L1i cache: 32K
L2 cache: 512K
When you compile a program into an executable on Linux (and a number of other unix systems), it is written in the ELF format. The ELF format has a number of sections, which you can examine with readelf or objdump:
readelf -a bindump | less
For example, .text contains the CPU instructions, .data the initialized global variables, and .bss the uninitialized global variables (it takes no space in the ELF file itself and is created in main memory when the program is executed); .plt and .got are jump tables, and there are also sections with debugging information, etc.
Btw. it is much more convenient to examine the binary content of files with hexdump:
hexdump -C bindump | less
There you can see that starting with offset 0x850 (approx. line 171 in your dump) there are a lot of zeros, and you can also see the ASCII representation on the right.
Let us look at which sections correspond to the block you are interested in, between 0x850 and 0x1160 (the field Off, the offset in the file, is the important one here):
> readelf -a bindump
...
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
...
[28] .shstrtab STRTAB 00000000 00074c 000106 00 0 0 1
[29] .symtab SYMTAB 00000000 000d2c 000440 10 30 45 4
...
You can examine the content of an individual section with -x:
> readelf -x .symtab bindump | less
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 34810408 00000000 03000100 ....4...........
0x00000020 00000000 48810408 00000000 03000200 ....H...........
0x00000030 00000000 68810408 00000000 03000300 ....h...........
0x00000040 00000000 8c810408 00000000 03000400 ................
0x00000050 00000000 b8810408 00000000 03000500 ................
0x00000060 00000000 d8810408 00000000 03000600 ................
You will see that there are many zeros. The section is composed of 16-byte entries (= one line in the -x output), each defining one symbol. From readelf -a you can see that it has 68 entries, and the first 27 of them (excluding the very first one) are of type SECTION:
Symbol table '.symtab' contains 68 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 08048134 0 SECTION LOCAL DEFAULT 1
2: 08048148 0 SECTION LOCAL DEFAULT 2
3: 08048168 0 SECTION LOCAL DEFAULT 3
4: 0804818c 0 SECTION LOCAL DEFAULT 4
...
According to the specification (page 1-18), each entry has the following format:
typedef struct {
Elf32_Word st_name;
Elf32_Addr st_value;
Elf32_Word st_size;
unsigned char st_info;
unsigned char st_other;
Elf32_Half st_shndx;
} Elf32_Sym;
Without going into too much detail here, I think what matters here is that st_name and st_size are both zeros for these SECTION entries. Both are 32-bit numbers, which means lots of zeros in this particular section.
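The layout can be checked directly against the Elf32_Sym definition that <elf.h> ships on Linux. The following is just an illustrative sketch (both helper names are made up): it counts the SECTION entries in an in-memory copy of a 32-bit symbol table and adds up the bytes that st_name and st_size occupy in each entry.

```c
#include <elf.h>
#include <stddef.h>

/* Count the entries of type SECTION in an in-memory copy of a 32-bit .symtab. */
size_t count_section_symbols(const Elf32_Sym *syms, size_t count)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        /* The symbol type is stored in the low four bits of st_info. */
        if (ELF32_ST_TYPE(syms[i].st_info) == STT_SECTION)
            n++;
    }
    return n;
}

/* Bytes taken by st_name and st_size, the two fields that are zero
   in the SECTION entries discussed above. */
size_t name_and_size_bytes(void)
{
    return sizeof(((Elf32_Sym *)0)->st_name) + sizeof(((Elf32_Sym *)0)->st_size);
}
```

sizeof(Elf32_Sym) is 16, and the .symtab size of 0x440 bytes from the section header divided by 16 gives the 68 entries readelf reports. For each SECTION entry, 8 of those 16 bytes are the zeroed st_name and st_size.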
This is not really a programming question, but anyway...
A binary normally consists of different sections: code, data, debugging info, etc. Since the contents of these sections differ by type, I would pretty much expect them to look different.
For instance, the symbol table consists of address offsets in your binary. If I read your lscpu output correctly, you are on a 32-bit system. That means each offset has four bytes, and given the size of your program, in most cases two of those bytes will be zero. And there are more effects like this.
You didn't strip your program, which means there's still lots of information (symbol table etc.) present in the binary. Try stripping the binary and have a look at it again.

C function profiling (addresses seem to be offset)

I'm trying to profile the function calls using -finstrument-functions option.
Basically, what I have done is to write the following into any compiled source:
static int __stepper=0;
void __cyg_profile_func_enter(void *this_fn, void *call_site)
__attribute__((no_instrument_function));
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
int i=0;
for( ; i<__stepper; i++ ) printf(" ");
printf("E: %p %p\n", this_fn, call_site);
__stepper ++;
} /* __cyg_profile_func_enter */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
__attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
int i=0;
__stepper --;
for( ; i<__stepper; i++ ) printf(" ");
printf("L: %p %p\n", this_fn, call_site);
} /* __cyg_profile_func_exit */
And got the following results:
E: 0xb7597ea0 0xb75987a8
E: 0xb7597de0 0xb7597ef5
L: 0xb7597de0 0xb7597ef5
L: 0xb7597ea0 0xb75987a8
All the function addresses are in that region (0xb7.......).
But if I try to read the function symbols using 'readelf -s', it gives the following:
2157: 00101150 361 FUNC LOCAL DEFAULT 13 usb_audio_initfn
2158: 00100940 234 FUNC LOCAL DEFAULT 13 usb_audio_handle_reset
2159: 00100de0 867 FUNC LOCAL DEFAULT 13 usb_audio_handle_control
The addresses of all the functions in the binary are around 0x00......
So I am not able to get the function name from the function pointers.
It looks like the function pointers somehow get an offset or something.
Does anybody have any idea?
From the question it looks like you're profiling a library function.
To know what are the functions being measured you have 2 options:
1 Run the program which uses the library under gdb and stop at main. At this point, get the pid of the program (PID=...) and do 'cat /proc/$PID/maps'. There you should see something like this:
➜ ~ ps
PID TTY TIME CMD
18533 pts/4 00:00:00 zsh
18664 pts/4 00:00:00 ps
➜ ~ PID=18533
➜ ~ cat /proc/$PID/maps
00400000-004a2000 r-xp 00000000 08:01 3670052 /bin/zsh5
006a1000-006a2000 r--p 000a1000 08:01 3670052 /bin/zsh5
006a2000-006a8000 rw-p 000a2000 08:01 3670052 /bin/zsh5
006a8000-006bc000 rw-p 00000000 00:00 0
...
7fa174cc9000-7fa174ccd000 r-xp 00000000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ccd000-7fa174ecc000 ---p 00004000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ecc000-7fa174ecd000 r--p 00003000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
7fa174ecd000-7fa174ece000 rw-p 00004000 08:01 528003 /lib/x86_64-linux-gnu/libcap.so.2.22
...
Here 7fa174cc9000 is base address of the /lib/x86_64-linux-gnu/libcap.so.2.22 library. So all the addresses you get by readelf -s will be offset by that value. Knowing base address you can calculate back what the original offset in file was.
For example, if you got the value 7fa174206370 and the base address of the library is 7fa1741cf000, then the offset is 7fa174206370 - 7fa1741cf000 = 37370. In my example it's sigsuspend from glibc:
94: 0000000000037370 132 FUNC WEAK DEFAULT 12 sigsuspend@@GLIBC_2.2.5
2 Run gdb on the program which uses these libraries. It'll either immediately find the loaded library in memory, or will need to be pointed to the .text section of the library.
> gdb
(gdb) attach YOUR_PID
(a lot of output about symbols)
(gdb) x/i 0x00007fa174206386
=> 0x7fa174206386 <sigsuspend+22>: cmp $0xfffffffffffff000,%rax
This way you know that 0x7fa174206386 is inside sigsuspend.
In case gdb doesn't load any symbols by itself (no output like Reading symbols from ... Loading symbols for ... after attach), you can look up the base address of library as in option 1, then add to it the offset of .text section
➜ ~ readelf -S /lib/x86_64-linux-gnu/libcap.so.2.22 | grep '.text.'
[11] .text PROGBITS 0000000000001620 00001620
7fa174cc9000 + 0000000000001620 in hexadecimal gives 7FA174CCA620, and then you attach by gdb as above and do
(gdb) add-symbol-file /lib/x86_64-linux-gnu/libcap.so.2.22 7FA174CCA620
Then you should be able to find symbols (via x/i ADDRESS as shown above) even if gdb doesn't load them by itself.
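The base-address lookup from option 1 can also be done from inside the profiled process by reading /proc/self/maps. Below is a rough sketch of that idea; the function name addr_to_module_offset is made up. It assumes the usual case of a shared library or PIE executable whose first mapping corresponds to link-time address 0, so that the computed offset can be compared directly with the values printed by readelf -s.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find the file-backed mapping in /proc/self/maps that contains `addr` and
 * report the module's path plus the offset of `addr` from the module's first
 * mapping (its load base). Returns 0 on success, -1 if nothing matches.
 */
int addr_to_module_offset(unsigned long addr, char *path, size_t path_len,
                          unsigned long *offset)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[1024], name[512], module[512] = "";
    unsigned long base = 0;
    int result = -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        name[0] = '\0';
        /* Line format: start-end perms offset dev inode [path] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %511[^\n]",
                   &start, &end, name) < 2)
            continue;

        /* Lines are sorted by address, so the first line naming a file is
           that module's lowest mapping, which we treat as its base. */
        if (strcmp(name, module) != 0) {
            strcpy(module, name);
            base = start;
        }
        if (addr >= start && addr < end && name[0] == '/') {
            snprintf(path, path_len, "%s", module);
            *offset = addr - base;
            result = 0;
            break;
        }
    }
    fclose(maps);
    return result;
}
```

Called from __cyg_profile_func_enter with (unsigned long)this_fn, this yields a path and an offset that you can then look up in the readelf -s output of that file.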
Please ask if anything is unclear, I'll try to explain.
Clarification on why is this so:
The observed behavior is due to the libraries being compiled as Position-Independent Code (PIC), which is what lets a shared library be loaded at any base address. A PIC library's ELF has .plt and .got sections. The PLT (Procedure Linkage Table) contains stubs for calls to functions located in other modules. The first call through a stub goes to the program interpreter (the dynamic linker), which resolves the called function; later calls jump straight to the function. This works because the program interpreter updates the GOT (Global Offset Table), which holds the addresses of the functions to call. Initially each GOT entry is set up so that the first call jumps into the interpreter's resolver for the function being called.
On x86-64, a PLT entry typically looks like this:
0000000000001430 <free@plt>:
1430: ff 25 e2 2b 20 00 jmpq *0x202be2(%rip) # 204018 <_fini+0x201264>
1436: 68 00 00 00 00 pushq $0x0
143b: e9 e0 ff ff ff jmpq 1420 <_init+0x28>
The first jmpq jumps to the address stored in the GOT at location %rip + 0x202be2:
[20] .got PROGBITS 0000000000203fd0 00003fd0
0000000000000030 0000000000000008 WA 0 0 8
%rip holds the address of the next instruction (0x1436), so %rip + 0x202be2 is 0x204018, which is the address objdump prints in the comment. That gets added to the base address of the library to produce the absolute address for the location where the library is actually loaded. For example, if it's loaded at 0x7f66dfc03000, then the address of the corresponding GOT entry will be 0x7f66dfe07018. The address stored at that location is the address of (in this example) the free function. It's maintained by the program interpreter to point to the actual free in libc.
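The address arithmetic for the PLT entry above can be written out as a small sketch. The two function names are made up for illustration; the only facts used are that a RIP-relative displacement is applied to the address of the next instruction, and that link-time addresses of a PIC library are relative to its load base.

```c
#include <stdint.h>

/* Target of a RIP-relative operand: the displacement is added to the address
   of the instruction that FOLLOWS the one being decoded. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len, int32_t disp)
{
    uint64_t next_insn = insn_addr + insn_len;
    return next_insn + (uint64_t)(int64_t)disp;   /* sign-extend, wrap-around add */
}

/* Runtime address of a link-time address once the library is loaded at load_base. */
uint64_t runtime_address(uint64_t load_base, uint64_t link_time_addr)
{
    return load_base + link_time_addr;
}
```

For the first instruction (at 0x1430, 6 bytes long, displacement 0x202be2) this gives 0x204018, the value objdump shows in its comment. The last instruction (at 0x143b, 5 bytes, displacement bytes e0 ff ff ff, i.e. -0x20) gives 0x1420, matching the jmpq target in the listing.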
More information on this can be found here.
What you need is the dladdr function. It looks the address up in the dynamic symbol tables of the loaded modules, so the function has to be exported there (for functions in the main program, link with -rdynamic; static functions are not found). Given an address, dladdr gives you the name of the nearest symbol and also the base address at which the containing module (e.g. your shared library) is loaded:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
void find_func(void* pfnFuncAddr)
{
    Dl_info info;
    memset(&info,0,sizeof(info));
    if(dladdr(pfnFuncAddr,&info) && info.dli_sname)
    {
        /* here: 'info.dli_sname' contains the function name */
        /* 'info.dli_fname' contains the path of the module that defines it */
        /* 'info.dli_fbase' contains the address at which that module is loaded */
    }
    else
    {
        /* if we got here, the symbol is not in any dynamic symbol table
           (e.g. a static function, or a main program linked without -rdynamic)
           or some other funny thing happened (e.g. we called a function
           written purely in assembly) */
    }
}
You have to add -ldl when linking.
Bear in mind that:
Function find_func needs to be called from your profiled process (read: somewhere in your __cyg_profile_func_enter or __cyg_profile_func_exit functions), because the address pfnFuncAddr is the actual function address (read: it should be equal to the this_fn or call_site argument of the __cyg_* functions).
The function name that you get may be mangled (if it is a C++ function or a method of a class). You can demangle the name using the command line tool c++filt. If you want to demangle from your profiler code, then you need to look at the bfd library and functions like bfd_read_minisymbols, bfd_demangle and friends. If you really want to profile your code, demangling all the function names later (after profiling) may be a good idea.
The difference in address values that you observed is exactly the difference between the actual address of the function(s) in question and the base address at which the module that contains the function was loaded (read: the info.dli_fbase).
I hope that helps.

glibc function strtoull() failure

I am facing an issue with the C library function strtoull, which is returning the wrong output.
int main(int argc, char *argv[])
{
unsigned long long int intValue;
if(atoi(argv[2]) == 1)
{
intValue = strtoull((const char *)argv[1], 0, 10 );
}
else
{
// ...
}
printf("intValue of %s is %llu \n", argv[1], intValue);
return 0;
}
I built it as 32-bit and 64-bit executables, str32_new and str64_new.
But the output from the 32-bit executable is erroneous, as the wrong number is returned:
strtoull should have returned 5368709120 for the string "5368709120", but it returned 1073741824.
# ./str32_new "5368709120" 1
intValue of 5368709120 is 1073741824
I note that when I remove one character from the string, it shows the proper output.
# ./str32_new "536870912" 1
intValue of 536870912 is 536870912
glibc attached to 32 bit exe is
# readelf -Wa /home/str32_new | grep strt
[39] .shstrtab STRTAB 00000000 002545 000190 00 0 0 1
[41] .strtab STRTAB 00000000 0032f8 0002a4 00 0 0 1
0804a014 00000607 R_386_JUMP_SLOT 00000000 strtoull
6: 00000000 0 FUNC GLOBAL DEFAULT UND strtoull@GLIBC_2.0 (2)
55: 00000000 0 FILE LOCAL DEFAULT ABS strtoull.c
75: 00000000 0 FUNC GLOBAL DEFAULT UND strtoull@@GLIBC_2.0
77: 08048534 915 FUNC GLOBAL DEFAULT 15 my_strtoull
glibc attached to 64 bit exe is
# readelf -Wa /home/str64_new | grep strt
[39] .shstrtab STRTAB 0000000000000000 001893 000192 00 0 0 1
[41] .strtab STRTAB 0000000000000000 002cd0 00029b 00 0 0 1
0000000000601028 0000000700000007 R_X86_64_JUMP_SLOT 0000000000000000 strtoull + 0
7: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtoull@GLIBC_2.2.5 (2)
57: 0000000000000000 0 FILE LOCAL DEFAULT ABS strtoull.c
73: 00000000004006cc 804 FUNC GLOBAL DEFAULT 15 my_strtoull
82: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strtoull@@GLIBC_2.2.5
The 64-bit executable shows the proper output, but on some systems it misbehaves too.
Why is strtoull behaving like this in the 32-bit executable, and how do I resolve the issue?
Ok, so we've established that this is quite obviously happening due to an overflow, as the value matches what you would get if the result were converted to a 32-bit int.
This however does not explain everything: you did use strtoull, not the shorter strtoul, and it does work in the 64-bit binary. If anything, I was surprised to see you were even able to call the longer version in your 32-bit build (how did you build it, by the way: with -m32, or on a special machine?).
This link raises the possibility that there's some linkage phenomenon that makes strtoull get declared as int strtoll() (presumably the system can't support the original lib version), so the value gets implicitly converted through int before being copied back to your unsigned long long.
Either way, the compiler should have warned about this. Try setting it to C99 and raising the warning level; maybe that will make it complain.
I think that is due to overflow. A 32-bit int cannot hold a number that large (its maximum is 2147483647, and even an unsigned 32-bit value tops out at 4294967295). As Leeor said, 5368709120 & 0xffffffff = 1073741824.
The C standard only guarantees 16 bits for int, but it is 32 bits wide on most (if not all) current Linux systems, 64-bit ones included.
You most likely forgot to #include <stdlib.h> and you probably did not enable any compiler warnings (such as the one for calling undeclared functions).
When a C compiler that still accepts implicit declarations sees a call to an undeclared function, it assumes the function returns int and takes unspecified arguments. In your case the return value of strtoull() is treated as an int, so the value is truncated to 32 bits.
(It is indeed quite strange that you get the correct result on a 64-bit system, where int is usually also just 32 bits.)
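To see both halves of that explanation side by side, here is a small sketch. The helper names parse_u64 and low_32_bits are made up, and low_32_bits only emulates the suspected truncation with a mask; it is not what the compiler literally emits. With <stdlib.h> included, strtoull returns the full 64-bit value; keeping just the low 32 bits reproduces the number from the question.

```c
#include <errno.h>
#include <stdlib.h>   /* declares strtoull; leaving this out is the suspected bug */

/* Parse a decimal string with the real prototype in scope.
   Returns 0 on success, -1 on malformed input or overflow. */
int parse_u64(const char *text, unsigned long long *out)
{
    char *end;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;
    *out = value;
    return 0;
}

/* Emulation of what survives when only a 32-bit return value is read. */
unsigned long long low_32_bits(unsigned long long value)
{
    return value & 0xffffffffULL;
}
```

parse_u64("5368709120", &v) stores 5368709120 in v, and low_32_bits(v) is 1073741824, the wrong value from the question; "536870912" fits in 32 bits and survives unchanged. Building with gcc -std=c99 -Wall reports the missing declaration via -Wimplicit-function-declaration.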
