Why is _dl_fixup called before dynamic linker start? - c

I'm trying to understand how glibc dynamic linker works. I know that _dl_fixup is called in _dl_runtime_resolve, and solves the relocation problems. So I thought it's called only after linker starts and has loaded some libraries. But when I do some print work in it, I find the function is called even before _dl_start. It's confusing: why it was called? What work it has done?
I did some print work, the function is working on symbols like strncpy, fopen, fread64 and so on, but the object name(l->l_name) seems to be null.
I use gdb to debug the linker, and I think gdb itself used _dl_fixup to complete some tasks. If I didn't use gdb, the _dl_fixup will be called only after _dl_start.

So I thought it's called only after linker starts and has loaded some libraries
That is correct.
I find the function is called even before _dl_start
This is not correct: _dl_fixup is called only after _dl_start.
Unfortunately you didn't provide any details on how you've came to the incorrect conclusion, so it's impossible to tell you where you made a mistake, but you did make (at least one) mistake.

Related

When I debug a C program with gdb, and key in 'p system', what exactly do I get?

Before I go deep into my questions, I need to confess that I am still fairly inexperienced to this subject, and am confused over quite a number of concepts, so please bear with me if my manner of asking those questions seems unorganized.
I recently learnt that as standard C library would be loaded into every C program we compiled (is this because we have #include at the beginning of the source file?[quesiton1]), we would have its functions loaded into the memory. So, I would know that the system() function had already been loaded and stored somewhere in the memory, and then I was made know that I could find the exact address of where the system() function was stored by debugging a random C program with gdb, and issuing the command 'p system', which would print out the address of the function. I understand that 'p' is used to print variable in gdb, and 'system' in this case probably indicates the address of the system() function, so it seems to make sense to do so, but then I think to myself, wait a second, it does not appear that I have used the system() function anywhere in my code, why would the inventor of gdb include such a variable for me to print out the address of some function that I don't even use? and does this imply that the address of every function in stand C library can be found out in the same fashion? and they all have a corresponding variable name in gdb? [question2]
One more question unrelated to stuff I talked above is whether functions like system(), execve() and many others are specific to Linux OS, or they are also used in Windows OS? [question3]
Hope that you guys can help me out. Thanks in advance!
The standard C library is linked with every program because it's necessary for it to be there to be able to run your program. There's a lot of things happening in your program before your main function gets called and after it returns, the standard library takes care of this. It also provides you with most of the standard functions you can call. You can compile things without a standard library, but that's an advanced topic. This is pretty much unrelated to #include.
Gdb can see system with p because it prints more than just variables. It prints anything that is in scope. system just happens to be a symbol that's visible to you in that scope. You could print any symbol that's visible to you, including all the globally visible variables and functions in libc and your program. Symbols in this context means "names of various things that need to be findable by programs and other libraries", this includes all functions, variables, section boundaries and many other things that the compiler/linker/runtime/debugger need to find to do its job.
Usually the standard library gets linked dynamically, which means that every program has the exact same copy of the library. In that case all symbols in it will be visible to your program because there's no reason to exclude them. If you link your program statically only the necessary parts of libc will be included and you would probably not see the system symbol unless you actually use that function.

Changing glibc but nothing happens

I want to modify glibc. So I have downloaded a version of it and made some changes in the code. For example I've made changes to memset. However, I don't see any difference if I use the .so file produced by the compilation (using LD_PRELOAD) as compared to when I don't do any LD_PRELOAD. memset behaves like it does. Why is that so? Is that maybe the compiler is inlining memset and not using anything from the shared object? I don't understand this. I even made changes to printf, but still nothing. Why is that so. How can I modify glibc (for testing purposes), so that I do see a change?
Moreover, when I tried to change pthread_create (and ofcourse LD_PRELOAded libpthread.so) by introducing printf( "pthread_create") in the beginning of that function, I just get a segmentation fault. What is going on here? Also if I check the difference in the libc.so after making changes in glibc source, I see no difference in the produced versions. What is happening here. This is driving me nuts!
GCC provides built-in versions of several functions, including memset() and printf(). It does not link to glibc's implementation of these functions.
Try passing the -fno-builtin compiler option to inhibit this behavior.

How to debug a crash before main?

My program links statically to many libraries and crashes before getting to main in GDB. How do I diagnose what the problem is?
It's a good bet that LD_DEBUG can help you here. Try this: LD_DEBUG=all ./a.out. This will allow you to easily identify the library which is being loaded when your program crashes.
(Edit: if it wasn't clear, a.out is meant to refer to a generic binary file -- in this case, replace it with the name of your executable).
Edit 2:
To clarify, LD_DEBUG is an environment variable which is examined by the dynamic linker when a program begins execution. If LD_DEBUG is set to some value, the dynamic linker will output a lot of information about the dynamic libraries being loaded during program execution, symbol binding, and so on.
For starters, execute the following on your machine:
LD_DEBUG=help ls
You will see the valid options for LD_DEBUG on your system listed. The most verbose setting is all, which will display all available information.
Now, to use this is as simple as the ls example, only replace ls with the name of your program. There is no need for gdb in order to use LD_DEBUG, as it is functionality provided solely by the dynamic linker, and not by gdb.
It may crash because some component throws an exception and nobody catches it since main() hasn't been entered yet. Set a breakpoint on throwing an exception:
catch throw
run
(If catch throw doen't work the first time you start it, run it once to let it load the dynamic libraries and then do catch throw and run again).
This post has the answer, you have to set a breakpoint before main in the crt0 startup code:
Using GDB without debugging symbols on x86?
starti
starti breaks at the very first instruction executed, see also: Stopping at the first machine code instruction in GDB
An alternative if your GDB is not new enough:
break _start
if you know the that the name of the entry point method is _start, or:
info files
search for Entry point:
Entry point: 0x400440
and run:
break *0x400440
TODO: find out how to compile crt* objects with debug symbols and step into them: How to compile my own glibc C standard library from source and use it?
Start taking the libraries out one by one until it stops crashing.
Then examine the culprit.
I haven't run into this in C but if you link to a c++ library static initialization can crash. You can create it easily by having an assert in a constructor of a static scope variable.
If you can, link your program dynamically instead of statically and follow #denniston.t answer. Maybe debug trace from dynamic linker will help to fix this problem.

dlopen: Is it possible to trap unresolved symbols, "manually" resolving them as they happen?

Is it possible to trap unresolved symbol references when they happen, so that a function is called to try to resolve the symbol as needed? Or is it possible to add new symbols to the dynamic symbol table at runtime without creating a library file and dlopen'ing it? I am on GNU/Linux, using GCC. (Portability to other Unixes would be nice, but is not a key concern.)
Thanks in advance!
Edit: I should have given more detail about what I am trying to do. I want to write an interpreter for a programming language, which is expected to support both compiled (dlopen'ed) and interpreted modules. I wanted calls from a compiled module to functions defined elsewhere to be resolved by the linker, to avoid a lookup for the function at every call, but calls to interpreted code would be left unresolved. I wanted to trap those calls, so that I could call the appropriate interpreted function when needed (or signal an error if the function does not exist).
If you know what symbols are missing, you could write a library just with them, and LD_PRELOAD it prior to the application execution.
If you don't have the list of the symbols that are missing, you could discover them by using either 'nm' or 'objdump' on the binary, and, with base on that, write a script which will build the library with the missing symbols prior to the application execution, and then LD_PRELOAD it as well.
Also, you could use gdb to inject new 'code' into applications, making the functions point to what you need.
Finally, you could also override some of the ld.so functions to detect the missing symbols, and do something about them.
But in any case, if you could explain what you are trying to accomplish, it would be easier to provide a proper solution.
I'm making a wild guess that the problem you're trying to address is the case where you dlopen and start using a loadable module, then suddenly crash due to unresolved symbols. If so, this is a result of lazy binding, and you can disable it by exporting LD_BIND_NOW=1 (or any value, as long as it's set) in the environment. This will ensure that all symbols can be resolved before dlopen returns, and if any can't, the dlopen operation will fail, letting you handle the situation gracefully.

How to get the function name of a C function pointer

I have the following problem: when I get a backtrace in C using the backtrace(3) function the symbols returned the name of the functions is easily determinable with the dwarf library and dladdr(3).
The problem is that if I have a simple function pointer (e.g. by passing it &function) the dladdr + dwarf functions can't help. It seems that the pointer is different from the one returned by backtrace(3) (it's not strange, since backtrace gets these function pointers straight from the stack).
My question is whether there is some method for resolving these names too? Also, I'd like to know exactly what is the difference between the two pointers.
Thanks!
UPDATE:
The difference between the pointers is quite significant:
The one I get with backtrace is: 0x8048ca4
The direct pointer version: 0x3ba838
Seems to me that the second one needs an offset.
Making a guess from the substantial differences in the typical addresses you cited, one is from an actual shared library, and the other from your main executable. Reading between the lines of a man page for dladdr(3), it could be the case that if the symbol isn't located in a module that was loaded by dlopen(3) then it might not be able to reconstruct the matching file and symbol names.
I assume you haven't stripped the symbols off any module you care about here, or all bets are off. If the executable has symbols, then it should be possible to look in them for one that is an exact match for the address of any nameable function. A pointer to a function is just that, after all.
addr2line(1) may be just the thing you're looking for.

Resources