I recently discovered the linker option "-Bsymbolic-functions" in GNU ld:
-Bsymbolic
When creating a shared library, bind references to global symbols to the
definition within the shared library, if any. Normally, it is possible
for a program linked against a shared library to override the definition
within the shared library.
This option is only meaningful on ELF platforms which support shared libraries.
-Bsymbolic-functions
When creating a shared library, bind references to global function symbols
to the definition within the shared library, if any.
This option is only meaningful on ELF platforms which support shared libraries.
This seems to be the inverse of the GCC option -fvisibility=hidden: instead of preventing the export of the referenced function to other shared objects, it prevents library-internal references to that function from being bound to an exported function of a different shared object. I have read that -Bsymbolic-functions also prevents the creation of PLT entries for those functions, which is a nice side effect.
But I was wondering whether there is perhaps a finer-grained control over this, like overriding -Bsymbolic for individual function definitions of a library.
Should I be aware of any pitfalls of using -Bsymbolic-functions? I plan to only use that, because the -Bsymbolic will break exceptions, I think (it will make it so that references to typeinfo objects are not unified, I think).
Thanks!
Answering my own question because I just earned a Tumbleweed badge for it... and I found out subsequently
But I was wondering whether there is perhaps a finer-grained control over this, like overriding -Bsymbolic for individual function definitions of a library.
Yes, there is the option --dynamic-list, which does exactly that.
Should I be aware of any pitfalls of using -Bsymbolic-functions? I plan to only use that, because the -Bsymbolic will break exceptions, I think (it will make it so that references to typeinfo objects are not unified, I think).
I looked into it further, and there seems to be no issue. The libstdc++ library apparently does this, or at least considered it, and only had to add --dynamic-list-cpp-new to keep operator new unified (to prevent problems when multiple allocators and deallocators get mixed up in one program, though I would argue such programs are broken anyway). Ubuntu uses it, or used it, by default, and it seems to cause conflicts with some packages. Overall, though, I expect it to work well.
I recently discussed this with one of the toolchain experts at SUSE. Here are his remarks:
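As a rough illustration of how --dynamic-list can exempt individual functions from -Bsymbolic-functions, here is a minimal sketch. The file names (libdemo.c, demo.list) and the functions demo_hook, demo_scale and demo_api are hypothetical, and the build commands in the comment assume GCC with GNU ld on an ELF platform.

```c
/*
 * libdemo.c (hypothetical library source).
 *
 * Assumed build, GCC with GNU ld on an ELF platform:
 *
 *   gcc -shared -fPIC libdemo.c -o libdemo.so
 *       -Wl,-Bsymbolic-functions -Wl,--dynamic-list=demo.list
 *
 * demo.list (hypothetical) names the symbols that should stay
 * overridable from outside the library:
 *
 *   { demo_hook; };
 */

/* Listed in demo.list: references from inside the library are expected
 * to keep going through normal dynamic symbol resolution, so an
 * executable or a preloaded library can supply its own definition. */
int demo_hook(int value)
{
    return value;
}

/* Not listed: with -Bsymbolic-functions, references from inside the
 * library are bound to this definition when the library is linked. */
int demo_scale(int value)
{
    return value * 2;
}

/* Public entry point that uses both helpers. */
int demo_api(int value)
{
    return demo_scale(demo_hook(value));
}
```

Compiled on its own, this behaves like ordinary C; the difference only shows up once the code lives in a shared library and another object defines demo_hook or demo_scale as well.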
"-Bsymbolic-functions is a thing from an old world which doesn't
exist anymore. It completely bypasses everything about what ELF can
provide, including visibility. When you're using it, everything is bound
locally. IOW: don't use it :)
No one should use -Bsymbolic-functions, it's a too big hammer for
most purposes."
How does -Bsymbolic-functions relate to library versioning (--version-script) ?
"-Bsymbolic-functions overrides anything, from linker
scripts, from GCC attributes or anywhere, about symbol visibilities or anything. It makes everything bind local, always, irrespective of
anything else that you might have added on command lines, or extra files,
or object files. (And yes, --dynamic-list= was a mis-guided attempt to
fix some of that and make -Bsymbolic* somewhat more friendly). So, yes, it takes precedence over linker script. It's a big hammer :)"
"To be extra precise: -Bsymbolic-functions is not quite the same as linker
script global/local, which is probably a reason why people still use it
sometimes. While -Bsymbolic-functions does bind references to definitions
locally (like local: in linker scripts), it also keeps them exported
(like the global: ones). In ELF speak that would be somewhat like
PROTECTED visibility. Unfortunately that can't be expressed in a symbol
version script right now, only via GCC's __attribute__((visibility)). So,
when people try to get the speed advantage of local binding (fewer symbol
lookups at library load time), while still exporting all their functions
from the shared lib, they unfortunately often end up first finding that
-Bsymbolic-functions "does what I want", without realizing that it creates
problems down the line."
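For comparison, the per-symbol PROTECTED visibility mentioned in the remarks can be sketched with GCC's visibility attribute. The function names demo_lookup and demo_entry are hypothetical, and the sketch assumes GCC or Clang targeting an ELF platform.

```c
/*
 * Hypothetical library source; demo_lookup and demo_entry are made-up
 * names. Assumes GCC or Clang targeting an ELF platform.
 */

/* PROTECTED visibility: the symbol is still exported from the shared
 * library, but references from inside the same shared object bind to
 * this definition instead of being resolved through the global scope. */
__attribute__((visibility("protected")))
int demo_lookup(int key)
{
    return key + 1;
}

/* Default visibility: exported, and still overridable by other objects. */
int demo_entry(int key)
{
    return demo_lookup(key) * 2;
}
```

Unlike the linker flag, the attribute applies to one symbol at a time, so only the functions that are explicitly marked lose the ability to be overridden.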
Well, you could call it a "hardening" option, since it ensures that your calls to in-library functions really end up there. But one issue I found concerns some projects' test suites.
For example, the libvirt test suite wants to call into the just-built libvirt0.so but also mock some of the calls made from there.
Because -Bsymbolic-functions is used in the build, the test breaks: the original function is called instead of the mocked one.
Example backtraces
Good case:
#0 virHostCPUGetThreadsPerSubcore (arch=VIR_ARCH_PPC64) at ../../../tests/virhostcpumock.c:30
#1 0x00007ffff7c1e4c4 in virHostCPUGetInfoPopulateLinux (cpuinfo=<optimized out>, arch=VIR_ARCH_PPC64, cpus=0x7fffffffdf38, mhz=<optimized out>, nodes=0x7fffffffdf40, sockets=0x7fffffffdf44, cores=0x7fffffffdf48, threads=0x7fffffffdf4c)
at ../../../src/util/virhostcpu.c:661
#2 0x0000555555557e6f in linuxTestCompareFiles (outputfile=0x55555558f150 "/build/libvirt-OUKR8i/libvirt-4.10.0/tests/virhostcpudata/linux-ppc64-subcores2.expected", arch=VIR_ARCH_PPC64,
cpuinfofile=0x5555555a3f10 "/build/libvirt-OUKR8i/libvirt-4.10.0/tests/virhostcpudata/linux-ppc64-subcores2.cpuinfo") at ../../../tests/virhostcputest.c:44
#3 linuxTestHostCPU (opaque=<optimized out>) at ../../../tests/virhostcputest.c:189
#4 0x000055555555914d in virTestRun (title=0x55555555c0a1 "subcores2", body=0x555555557cc0 <linuxTestHostCPU>, data=0x7fffffffe0c0) at ../../../tests/testutils.c:176
#5 0x000055555555781a in mymain () at ../../../tests/virhostcputest.c:263
#6 0x0000555555559df4 in virTestMain (argc=1, argv=0x7fffffffe2c8, func=0x5555555577b0 <mymain>) at ../../../tests/testutils.c:1114
#7 0x00007ffff79bb09b in __libc_start_main (main=0x5555555576a0 <main>, argc=1, argv=0x7fffffffe2c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe2b8) at ../csu/libc-start.c:308
#8 0x00005555555576ea in _start () at ../../../tests/virhostcputest.c:278
Bad case:
#0 virHostCPUGetThreadsPerSubcore (arch=arch@entry=VIR_ARCH_PPC64) at ../../../src/util/virhostcpu.c:1119
#1 0x00007ffff7c27e04 in virHostCPUGetInfoPopulateLinux (cpuinfo=<optimized out>, arch=VIR_ARCH_PPC64, cpus=0x7fffffffdea8, mhz=<optimized out>, nodes=0x7fffffffdeb0, sockets=0x7fffffffdeb4, cores=0x7fffffffdeb8, threads=0x7fffffffdebc)
at ../../../src/util/virhostcpu.c:661
#2 0x0000555555557e6f in linuxTestCompareFiles (outputfile=0x5555555a5c30 "/build/libvirt-4biJ7f/libvirt-4.10.0/tests/virhostcpudata/linux-ppc64-subcores2.expected", arch=VIR_ARCH_PPC64,
cpuinfofile=0x55555558fd20 "/build/libvirt-4biJ7f/libvirt-4.10.0/tests/virhostcpudata/linux-ppc64-subcores2.cpuinfo") at ../../../tests/virhostcputest.c:44
#3 linuxTestHostCPU (opaque=<optimized out>) at ../../../tests/virhostcputest.c:189
#4 0x000055555555914d in virTestRun (title=0x55555555c0a1 "subcores2", body=0x555555557cc0 <linuxTestHostCPU>, data=0x7fffffffe030) at ../../../tests/testutils.c:176
#5 0x000055555555781a in mymain () at ../../../tests/virhostcputest.c:263
#6 0x0000555555559df4 in virTestMain (argc=1, argv=0x7fffffffe238, func=0x5555555577b0 <mymain>) at ../../../tests/testutils.c:1114
#7 0x00007ffff79b009b in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x00005555555576ea in _start () at ../../../tests/virhostcputest.c:278
Compare the source location of virHostCPUGetThreadsPerSubcore in the two backtraces and you will see the difference: tests/virhostcpumock.c in the good case, src/util/virhostcpu.c in the bad case.
Other cases I have seen are:
static variables becoming multiple instances in singularity
segfaulting tests in sssd
autofs issues with global variables
Since the original question was about potential drawbacks, I thought it worth mentioning this fairly common category of related issues as well.
There are cases with side effects. A documented one:
https://bugs.launchpad.net/ubuntu/+source/xfe/+bug/644645
I would also like to figure out more about it, because I have such a case right now.
Building glibc with -Bsymbolic-functions is not recommended either. Here is the result I got:
Core was generated by `/home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/elf/ld-linux .'.
Program terminated with signal 11, Segmentation fault.
#0 0x400a3e90 in _int_free ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
(gdb) where
#0 0x400a3e90 in _int_free ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#1 0x4016b94b in __libc_dlsym ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#2 0x4004c2c7 in __gconv_find_shlib ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#3 0x40042320 in find_derivation ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#4 0x40042889 in __gconv_find_transform ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#5 0x400d6f00 in __wcsmbs_load_conv ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#6 0x400c86f6 in mbrtowc ()
from /home/lano1106/dev/packages/glibc/repos/core-i686/src/glibc-build/libc.so.6
#7 0x08048914 in ?? ()
#8 0x00000000 in ?? ()
I'm reading a textbook which describes how loader works:
When the loader runs, it copies chunks of the executable object file into the code and data segments. Next, the loader jumps to the program’s entry point, which is always the address of the _start function. The _start function calls the system startup function, __libc_start_main
From the answer to the question "What is __libc_start_main and _start?" we have the pseudo-code below for the execution flow:
_start:
call __setup_for_c ; set up C environment
call __libc_start_main ; set up standard library
call _main ; call your main
call __libc_stop_main ; tear down standard library
call __teardown_for_c ; tear down C environment
jmp __exit ; return to OS
My questions are:
I used objdump to check the assembly code of the program and found that _start only calls __libc_start_main.
What about the rest of the functions, like __setup_for_c, _main, etc.? In particular, I can't see how my program's main function gets called. So is the pseudo-code about the execution flow correct?
What does it mean that __libc_start_main sets up the standard library? Why does the standard library need to be set up? Doesn't the standard library just need to be linked by the dynamic linker when the program is loaded?
Pseudo-code isn't code ;) __libc_start_main() can call the application's main() because the address of main() will have been fixed up by the linker. The order in which the code generated by the compiler does initialization might be interesting, but you shouldn't assume it will be the same from one compiler to another, or even from one release to another. It's probably best not to rely on things being done in a particular way if you can avoid it.
As to what needs to be initialized -- standard C libraries like glibc are hugely complex, and a lot of stuff needs to be initialized. To take one example, the memory allocator's block table has to be set up, so that malloc() doesn't start with a random pattern of memory allocation.
The other function calls described in the linked answer give a synopsis of what needs to happen; the actual implementation details in the GNU C library are different, either using “constructors” (_dl_start_user), or explicitly in __libc_start_main. __libc_start_main also takes care of calling the user’s main, which is why you don’t see it called in your disassembly, but its address is passed along (see the lea just before the callq). __libc_start_main also takes care of the program exit, and never returns; that’s the reason for the hlt just after the callq, which will crash the program if the function returns.
The library needs quite a lot of setup nowadays:
some of its own relocation
thread-local storage setup
pthread setup
destructor registration
vDSO setup (on Linux)
ctype initialisation
copying the program name, arguments and environment to various library variables
etc. See the x86-64-specific sysdeps/x86_64/start.S and the generic csu/libc-start.c, csu/init-first.c, and misc/init-misc.c among others.
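The idea that main's address is handed to a startup routine, which does its setup first and only then calls main, can be modelled with a toy sketch. Everything here (toy_libc_start_main, toy_init_library, sample_main and the variables) is hypothetical and greatly simplified; it is not how glibc implements __libc_start_main, which in particular calls exit() and never returns.

```c
#include <stddef.h>

/* Shape of a user main as the startup code sees it. */
typedef int (*main_fn)(int argc, char **argv, char **envp);

/* Toy stand-ins for library state that must exist before main runs. */
int toy_init_steps = 0;
const char *toy_program_name = NULL;

/* Toy "library setup": record the program name, count the step. */
static void toy_init_library(char **argv)
{
    toy_program_name = argv[0];
    toy_init_steps++;
}

/* Toy model of the startup routine: it receives the address of main,
 * performs the setup, and only then calls main. The real function would
 * pass the result to exit() instead of returning it. */
int toy_libc_start_main(main_fn user_main, int argc, char **argv, char **envp)
{
    toy_init_library(argv);
    return user_main(argc, argv, envp);
}

/* Hypothetical user main: returns the number of real arguments. */
int sample_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc - 1;
}
```

In this model, the code playing the role of _start only has to load the address of sample_main and call toy_libc_start_main, which matches the single call seen in the disassembly.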
what about the rest of functions like call __setup_for_c ,_main etc?
Those are just fancy made-up readable names used in the linked answer to transfer the meaning of that answer better.
how it get called
Your standard library implementation doesn't provide functions named __setup_for_c or _main, so they don't exist and don't get called. Every implementation may choose different names for its functions.
is the pseudo-code about the execution flow correct?
Yes - and the word "pseudo-code" you used implies that you are aware that it's not real code.
what does __libc_start_main setup standard library mean?
It means a symbol with the name __libc_start_main. __libc_start_main is a function that initializes all standard library things and runs main in glibc. It initializes libc, pthreads, atexit and finally runs main. glibc is open source, so just look at it.
why standard library needs to be setup?
Because it was written in the way that it depends on it. The simplest is, when you write:
int var = 42; // variable with static storage duration
int main() {
return var == 42;
}
(Assuming the optimizer doesn't kick in) the value 42 has to be in the memory reserved for var before main is executed, so something has to run before main and put it there. This is the simplest case of why something has to execute before main. Global variables are used in many places and all of them need to be set up. For example, a variable named program_invocation_name in glibc holds the name of the program, so some code needs to query the environment or the kernel for the program name and store the (possibly parsed) string in a global variable, and also remember to free() that string on exit if it was dynamically allocated. Some code "has to do it", and that code is in the standard library initialization.
There are many more cases: C++ and other languages have constructors, and GCC has the GNU extension __attribute__((__constructor__)) as well as the .init and .preinit_array sections, all of which are executed before main. Destructors have to execute on exit, but not on _exit, so the atexit machinery is initialized before main and all destructors may be registered with it, depending on the implementation.
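A small sketch of code running before main and after it returns, using the GNU constructor attribute and atexit. The names before_main, after_main and constructor_has_run are hypothetical, and the attribute assumes GCC or Clang.

```c
#include <stdio.h>
#include <stdlib.h>

/* Set by the constructor; starts at 0 as part of the loaded image. */
static int ctor_ran = 0;

/* Registered with atexit(): runs on exit() or when main returns,
 * but not when the program leaves through _exit(). */
static void after_main(void)
{
    puts("atexit handler: running after main returned");
}

/* GNU extension: the startup code calls this before main. */
__attribute__((constructor))
static void before_main(void)
{
    ctor_ran = 1;
    atexit(after_main);
}

/* Lets main (or a test) check that the constructor already ran. */
int constructor_has_run(void)
{
    return ctor_ran;
}
```

Calling constructor_has_run() as the first statement of main should already report 1, because the constructor is invoked by the startup code rather than by anything in main.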
The environment needs to be initialized, potentially the stack and some more things. Thread-local variables need to be allocated for the current thread only, so that when you pthread_create another thread they are not shared the way non-thread-local variables are.
doesn't the standard library just need to be linked by the dynamic linker when the program is loaded?
It does: when the program is loaded, the standard library is simply linked in. When generating the program, the compiler uses crt code to include some startup code in the program, for example the call to __libc_start_main.
Env: Ubuntu 18.04, Linux Kernel 5.3
I'm debugging some binary with gdb. Here is what I found when catching stat system call:
(gdb) bt
#0 0x00007f2d8ecae775 in __GI___xstat (vers=vers@entry=1, name=name@entry=0x7f2d882d7d60 "/etc/app/cfg", buf=buf@entry=0x7f2d8f3a14f0) at ../sysdeps/unix/sysv/linux/wordsize-64/xstat.c:35
#1 0x00007f2d592294e4 in stat64 (__statbuf=0x7f2d8f3a14f0, __path=0x7f2d882d7d60 "/etc/app/cfg") at /usr/include/x86_64-linux-gnu/sys/stat.h:500
#2 0x00007f2d6fac1990 in ?? ()
#3 0x00007f2d8f3a15c8 in ?? ()
#4 0x00007f2d8f3a1620 in ?? ()
#5 0x00007f2d6fabbcb3 in ?? ()
#6 0x00000007170a2ae8 in ?? ()
#7 0x00007f2d8f3a15d0 in ?? ()
#8 0x0000000000000000 in ?? ()
The line #1 0x00007f2d592294e4 in stat64 (__statbuf=0x7f2d8f3a14f0, __path=0x7f2d882d7d60 "/etc/app/cfg") at /usr/include/x86_64-linux-gnu/sys/stat.h:500 got me confused.
I have no idea why one would use stat64 explicitly. First of all, it requires _GNU_SOURCE to be defined. Secondly, to my knowledge glibc's stat already handles all the kernel-specific 32/64-bit differences.
Besides, both stat and stat64 use the same stat system call on my kernel.
The most likely explanation is the program did a #define _FILE_OFFSET_BITS 64 before including any system headers. This causes calls to plain stat to be remapped to stat64, open to open64, etc. Nowadays all applications should do this.
However, there is a reason to use stat64 etc directly. In a library whose public interfaces logically should involve off_t or any of the other types that are changed by defining _FILE_OFFSET_BITS, you can’t use that define or any of those types in your interface headers because then your own ABI will depend on the setting of that macro, which is controlled by the library user, not you. Instead you have to define _LARGEFILE64_SOURCE and use the explicitly sized types (off64_t, etc.) and functions (stat64, etc.) in your interface headers. In principle, .c and .h files that aren't exposed to external macro defines can still use _FILE_OFFSET_BITS and the ordinary functions, but in practice it’s easier to enforce a style rule that all of the library’s code must use only the explicitly sized types and functions.
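The remapping described above can be sketched as follows. The helper names off_t_bits and file_size are hypothetical; the only assumption about the platform is a POSIX system whose headers honour _FILE_OFFSET_BITS, as glibc does.

```c
/* Must be defined before any system header is included to take effect. */
#define _FILE_OFFSET_BITS 64

#include <sys/types.h>
#include <sys/stat.h>

/* Width in bits of off_t as seen by this translation unit. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}

/* Calls plain stat(); with the macro above, the headers may redirect
 * this to the 64-bit variant behind the scenes. Returns the file size
 * in bytes, or -1 if stat() fails. */
long long file_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

The source only ever mentions stat, so a stat64 frame in a backtrace does not by itself mean that the program's author called stat64 explicitly.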
Edit: I appear to have been mistaken, the backtrace works wonderfully from anywhere on Linux -- it is only when remote debugging from gdb on ubuntu to remote windows that the stacktrace gets absolutely destroyed after entering one of the memory allocation functions in msvcrt... dammit microsoft.
And this happens for both 64bit and 32bit windows, so I'm not sure this is related to the unwind information...
Edit: It appears adding -g3 and -Og has helped with part of the issue in some programs but the problem still persists in other programs, cannot post their source here as it is IP of my company -- sorry!
Background
I am using gcc to compile ubuntu->ubuntu and mingw to compile ubuntu->windows.
I have created a cross platform (linux + windows) memory tracking & leak detection library which hooks malloc/calloc/realloc/free with an assembly bytepatch on the first instructions (not IAT/PLT hooking).
The hook redirects to a gate which checks if the hooks are enabled in the current thread and redirects to the memory tracking hook function if they are, otherwise it just redirects to the trampoline of the real function if they are disabled for that thread.
The library works great and detects leaks on linux/windows (probably would work on mac but I don't have one).
I use the library to programmatically detect leaks from within my code, I can install callbacks on the memory allocation routines and programmatically raise breakpoints (by looping and waiting for debugger to attach then executing asm("int3")) inside the callbacks so that I can attach to my program while it's inside of a call that leaks memory.
Everything works great up until I try to view a backtrace from within my callback. I understand this is probably because the unwind information no longer matches my stack, since I have inserted new frames and data via my hook routines.
Edit: If I am mistaken about the unwind info mismatching the stack being the cause of the incorrect backtrace then please correct me!
The Question
Are there any small hacks I can do to trick GDB into correctly rebuilding the backtrace from within my hook callbacks?
I understand that I can manually walk and edit the unwind info with libdwarf or something but I imagine that would be incredibly cumbersome and large.
So I am wondering if perhaps there is a hack or a cheat I can do which would trick GDB into properly rebuilding the backtrace?
If there are no easy hacks or tricks then what are all of my options for fixing this issue?
Edit: Just to clear up the exact call order of everything:
program
V
malloc
V
hook_malloc -> hooks are disabled -> return malloc trampoline -> real malloc -> program
V
hooks are enabled
V
Call original malloc -> malloc trampoline -> real malloc -> returns to hook
V
Record memory size/info etc from malloc
V
Call user defined callback -> User defined callback -> returns to hook
V
return to program
It is the "User Defined Callback" where I want to capture a backtrace
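The call order above can be approximated in plain C with a wrapper function and a per-thread switch. This is only a sketch of the gate logic under simplifying assumptions: it uses an ordinary wrapper (hooked_malloc) instead of the byte-patch and trampoline described in the question, and all names in it are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Signature of the user callback: requested size and returned pointer. */
typedef void (*alloc_callback)(size_t size, void *result);

/* Per-thread state, so one thread can track while others do not. */
static _Thread_local int hooks_enabled = 0;
static _Thread_local size_t tracked_bytes = 0;
static alloc_callback user_callback = NULL;

/* Example callback that only counts how often it was invoked. */
int callback_calls = 0;
void count_calls(size_t size, void *result)
{
    (void)size;
    (void)result;
    callback_calls++;
}

void set_hooks_enabled(int enabled) { hooks_enabled = enabled; }
void set_alloc_callback(alloc_callback cb) { user_callback = cb; }
size_t get_tracked_bytes(void) { return tracked_bytes; }

/* The gate: pass straight through when hooks are off for this thread,
 * otherwise allocate, record the size, and notify the callback. */
void *hooked_malloc(size_t size)
{
    if (!hooks_enabled)
        return malloc(size);

    hooks_enabled = 0;              /* allocations made by the callback are not tracked */
    void *result = malloc(size);    /* stands in for the trampoline to the real malloc */
    if (result != NULL)
        tracked_bytes += size;
    if (user_callback != NULL)
        user_callback(size, result);
    hooks_enabled = 1;
    return result;
}
```

Because the wrapper is an ordinary compiled function, the compiler emits unwind information for it; a hand-written gate or trampoline has none unless it is supplied separately, which is one plausible source of the broken frames.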
Apparently this is the same problem as in "GDB Windows ?? in Backtraces".
And the solution was to simply add -g3 to the mingw compile flags, and voilà, I have non-broken backtraces!
Edit: Never mind, this isn't the whole answer. It appears this fix worked for some test programs, but other programs still show incorrect backtraces like:
(gdb) bt
#0 malloc_callback (s=38, rv=0x2c5058) at test_dll.c:729
#1 0x000000000040731d in hook_malloc_raw (file=0x410ea1 <__FUNCTION__.63079+55> "", function=0x410ea1 <__FUNCTION__.63079+55> "", line=0, s=38, rv=8791758343065)
#2 0x0000000000407367 in hook_malloc (s=38)
#3 0x000007fefda20b9e in ?? ()
#4 0x0000000000000026 in ?? ()
#5 0x0000000000410ea1 in __FUNCTION__.63079 ()
#6 0x0000000000000000 in ?? ()
Obviously Frame #4 isn't actually a stack frame, and I'm not sure why frame #5 is labeled "__FUNCTION__.63079".
Edit2: If people are going to downvote this at least leave a comment saying why
I'm trying to understand how the glibc dynamic linker works. I know that _dl_fixup is called in _dl_runtime_resolve and resolves relocations. So I thought it's called only after the linker starts and has loaded some libraries. But when I added some print statements to it, I found the function is called even before _dl_start. It's confusing: why was it called? What work did it do?
According to my print statements, the function is working on symbols like strncpy, fopen, fread64 and so on, but the object name (l->l_name) seems to be null.
I use gdb to debug the linker, and I think gdb itself used _dl_fixup to complete some tasks. If I don't use gdb, _dl_fixup is called only after _dl_start.
So I thought it's called only after linker starts and has loaded some libraries
That is correct.
I find the function is called even before _dl_start
This is not correct: _dl_fixup is called only after _dl_start.
Unfortunately you didn't provide any details on how you came to the incorrect conclusion, so it's impossible to tell where you made a mistake, but you did make (at least one) mistake.
I am compiling a C library on Mac OS X Snow Leopard with the following GCC:
Diderot:~ brandizzi$ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure --disable-checking --enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 --with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)
When I run some unit tests of this library (which are written with CuTest), one of the tests has a problem: an EXC_BAD_ACCESS signal. That is a common problem and I have some understanding of this kind of issue: I'm a Linux guy who calls it a "segmentation fault", and I understand what is happening and the usual ways to solve it. What is amazing is that the bad access happens inside the malloc function. Look at this backtrace I got in GDB:
(gdb) bt
#0 0x00007fff89000a34 in tiny_free_list_add_ptr ()
#1 0x00007fff88ffe147 in tiny_malloc_from_free_list ()
#2 0x00007fff88ffcfdd in szone_malloc_should_clear ()
#3 0x00007fff88ffceaa in malloc_zone_malloc ()
#4 0x00007fff88ffb1a8 in malloc ()
#5 0x0000000100008c72 in util_copy_string (string=0x100008e48 "libsecretary") at src/util.c:7
#6 0x0000000100008126 in project_new (name=0x100008e48 "libsecretary") at src/project.c:8
#7 0x00000001000078b9 in secretary_start (secretary=0x10080b000, name=0x100008e48 "libsecretary") at src/secretary.c:23
#8 0x00000001000020f8 in test_secretary_move_task_from_project_to_project (test=0x1001005b0) at src/test/secretary.c:146
#9 0x0000000100006eae in CuTestRun (tc=0x1001005b0) at cutest/CuTest.c:143
#10 0x00000001000075c1 in CuSuiteRun (testSuite=0x100800000) at cutest/CuTest.c:289
#11 0x0000000100001527 in RunAllTests () at src/test/run_all.c:22
#12 0x000000010000156b in main () at src/test/run_all.c:32
This test case has the following lines, and the error always happens at the fourth one. If I switch the lines in any way, the problem still happens at the fourth one:
Secretary *secretary = secretary_new();
Task *task = secretary_appoint(secretary, "Test task transference");
Project *destination = secretary_start(secretary, "Chocrotary");
Project *origin = secretary_start(secretary, "libsecretary");
So, how can malloc() cause such a problem? I do not even pass a pointer to it! Is it a bug? Has someone seen something like this?
Thanks in advance!
Most likely, something earlier in the program's execution is writing to memory it isn't entitled to, corrupting the heap's data structures. Then, later, malloc gets called and tries to follow a pointer that's been overwritten with nonsense (or index something via a value that's being overwritten with nonsense, or whatever), and boom.
You might want to try running your test suite under valgrind to see where things first start going wrong.
There are an awful lot of possible reasons for this problem: the memory was never allocated, the pointer points to the wrong place, etc.
In my case, I was allocating an array of Project with MAX_PROJECT_COUNT positions. I wrote
Project *array = malloc(MAX_PROJECT_COUNT);
but that does not take the size of the Project struct into account! The correct solution would be
Project *array = malloc(MAX_PROJECT_COUNT*sizeof(Project));
Note, however, that your problem may be considerably different, so the same solution may not apply.
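One way to make this kind of sizing mistake harder to repeat is to let the element size follow the pointer's type and to use calloc, which takes the count and the element size separately. The Project struct, the MAX_PROJECT_COUNT value and the helper project_array_new below are hypothetical stand-ins, not the library's real definitions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the real Project struct. */
typedef struct Project {
    char name[64];
    int task_count;
} Project;

#define MAX_PROJECT_COUNT 16 /* hypothetical value */

/* Allocates a zero-initialized array of `count` projects.
 * sizeof *array tracks the element type automatically, and calloc
 * multiplies count by the element size itself.
 * Returns NULL on failure; the caller releases the array with free(). */
Project *project_array_new(size_t count)
{
    Project *array = calloc(count, sizeof *array);
    return array;
}
```

With this helper, the array from the answer would be created as project_array_new(MAX_PROJECT_COUNT), and the last valid index is MAX_PROJECT_COUNT - 1.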