I am a bit stuck on what the different kinds of kernel symbols mean.
Plain static symbols mean the same as static in C: a local static variable has local scope and static storage, and a static function is scoped to its file. But what about static symbols that are exported? How should EXPORT_SYMBOL(), EXPORT_PER_CPU_SYMBOL() and EXPORT_UNUSED_SYMBOL() be understood when the macro exports a static symbol? What is the difference between global and exported symbols? Is it the linker's responsibility to add the extra information for exported symbols? Is a global static variable that is built into the kernel visible throughout the kernel and in loadable modules?
Exported kernel symbols can be accessed from loadable modules. Is it good style to use such symbols inside the kernel itself?
When the kernel resolves symbols, does it look them up through the kernel symbol table?
Conceptually, using the static keyword on a function declaration means internal linkage, so the function is visible only within a single translation unit (one *.o file). The compiler may inline such a function (in which case no standalone copy is left to refer to), but since EXPORT_SYMBOL() takes the address of the static function, the compiler has to keep an out-of-line copy of it.
The implementation is a bit more complicated. The internal and external linkage rules apply only to the static linker ld, which runs when vmlinux or a kernel module is built. Normally a symbol with external linkage is added to the ELF symbol table (.symtab, plus .dynsym in a shared object), and when the dynamic linker ld.so loads a shared object it reads the dynamic symbol table.
But when a module is loaded, the Linux kernel uses a separate symbol table, ksymtab. EXPORT_SYMBOL() adds the symbol to that table, and this happens independently of the linkage rules applied by the compiler and linker, so it is not related to internal or external linkage at all.
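To make the last point concrete, here is a minimal user-space sketch of the idea, not the kernel's actual implementation. It assumes a hand-written table of name/address pairs; the names export_entry, export_table, lookup_export and double_it are hypothetical. The function has internal linkage, yet it can still be found by name, because the table stores its address and the lookup never consults the linker's symbol table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all functions in this toy table. */
typedef int (*helper_fn)(int);

/* One entry of the toy export table: a name and the function's address. */
struct export_entry {
    const char *name;
    helper_fn fn;
};

/* Internal linkage: invisible to the linker outside this translation unit. */
static int double_it(int x)
{
    return 2 * x;
}

/* Taking the address forces the compiler to keep an out-of-line copy. */
static const struct export_entry export_table[] = {
    { "double_it", double_it },
};

/* Look a function up by name, the way a loader could scan such a table. */
helper_fn lookup_export(const char *name)
{
    size_t count = sizeof(export_table) / sizeof(export_table[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(export_table[i].name, name) == 0)
            return export_table[i].fn;
    }
    return NULL; /* not exported */
}
```

Calling lookup_export("double_it") returns a usable pointer even though double_it is static, while an unknown name yields NULL.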
I am reading this article on the PLT (Procedure Linkage Table) and GOT (Global Offset Table). While the purpose of the PLT is clear to me, I'm still confused about the GOT. What I've understood from the article is that the GOT is only necessary for variables declared as extern in a shared library. For global variables declared as static in shared library code, it is not required.
Is my understanding right, or am I completely missing the point?
Perhaps your confusion is with the meaning of extern. Since the default linkage is external, any variable declared outside function scope without the static keyword is extern.
The reason the GOT is necessary is because the address of variables accessed by the shared library code is not known at the time the shared library is generated. It depends either on the load address the library gets loaded at (if the definition is in the library itself) or the third-party code the variable is defined in (if the definition is elsewhere). So rather than putting the address inline in the code, the compiler generates code to read the shared library's GOT and then loads the address from the GOT at runtime.
If the variable is known to be defined within the same shared library (either because it's static or because the hidden or protected visibility attribute is used), then its address relative to the code in the library can be fixed at the time the shared library file is generated. In this case, rather than performing a lookup through the GOT, the compiler just generates code to access the variable with program-counter-relative addressing. This is less expensive both at runtime and at load time (because the whole symbol lookup and relocation process can be skipped at load time).
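As a rough illustration of the three cases, here is a small sketch. It assumes GCC or Clang (the visibility attribute is a compiler extension), and all variable and function names are made up for the example. Built with something like gcc -O2 -fPIC -shared on x86-64 and inspected with objdump -d, the first two accessors would typically use PC-relative addressing, while the third would typically load the address from the GOT; the exact output depends on compiler, flags and target.

```c
/* Case 1: static, so the definition is known to be in this file. */
static int file_local_counter = 1;

/* Case 2: hidden visibility (GCC/Clang extension): visible to the rest of
 * the library, but never exported from it or interposed by another object. */
__attribute__((visibility("hidden"))) int library_private_counter = 2;

/* Case 3: default visibility: another object could provide the definition,
 * so position-independent code usually reaches it through the GOT. */
int exported_counter = 3;

int read_file_local(void)      { return file_local_counter; }
int read_library_private(void) { return library_private_counter; }
int read_exported(void)        { return exported_counter; }

/* Writes keep the compiler from folding the variables into constants. */
void bump_all(void)
{
    file_local_counter++;
    library_private_counter++;
    exported_counter++;
}
```

The C-level behaviour is identical in all three cases; only the generated addressing code differs.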
I have two dynamically loadable libraries, lib_smtp.so and libpop.so. Both have a global variable named protocol, which is initialized to "SMTP" and "POP" respectively. I have another, static library libhttp.a where protocol is initialized to "HTTP".
Now for some reason I need to build all of the dynamically linked and loaded libraries statically and include them in the executable. Doing so, I get the error "multiple definition of symbol" when linking the static libraries.
I am curious to know how the linker resolves duplicate symbols during dynamic linking, where all three libraries mentioned above get linked.
Is there some way I can do the same statically, as the linker does in dynamic linking, i.e. add all the static libraries that share the same symbols to the executable without any conflict? If not, why is the process different for statically linked libraries?
Dynamic linking in modern Linux and several other operating systems is based on the ELF binary format. The (ELF) dynamic libraries on which an executable or other shared library relies are prioritized. To resolve a given symbol, the dynamic linker checks each library in priority order until it finds one that defines the symbol.
That can be dicey when multiple dynamic objects define the same symbol and also multiple dynamic objects use that symbol. It can then be the case that the symbol is resolved differently in different dynamic objects.
Full details are out of scope for SO, but I don't know a better technical explanation than the one in Ulrich Drepper's paper "How to Write Shared Libraries".
In dynamic linking a facility called "symbol visibility" kicks in. Essentially this lets you expose only certain symbols across the object's boundaries (object in the sense of shared object). It is good style to compile and link shared objects with symbols hidden by default and to expose explicitly only those that callers require.
Symbol visibility is applied during linking and so far is only implemented in dynamic linkers. It's certainly possible to have it in static linkage as well: Apple's GCC variant implements so-called Mach-O relocatable object files, which can be statically linked with visibility applied. But I don't know if vanilla GCC, binutils ld or the gold linker can do this for plain old ELF.
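One way to sidestep the clash when everything is linked statically is to stop giving protocol external linkage in the first place. This is a different technique from the visibility mechanism described above, and only a sketch under that assumption: the accessor names smtp_protocol, pop_protocol and http_protocol are hypothetical. In a real project each library would more likely keep a file-scope static protocol in its own source file; function-local statics are used here only so the three can sit in one file.

```c
/* Each accessor owns a private object named protocol. None of them has
 * external linkage, so the static linker never sees a duplicate symbol. */

const char *smtp_protocol(void)
{
    static const char protocol[] = "SMTP";
    return protocol;
}

const char *pop_protocol(void)
{
    static const char protocol[] = "POP";
    return protocol;
}

const char *http_protocol(void)
{
    static const char protocol[] = "HTTP";
    return protocol;
}
```

Only the uniquely named accessors are visible to the linker, so the three libraries can be combined into one executable without a "multiple definition" error.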
When we build a program, some symbols are resolved at link time (like those in a .lib),
but some can be resolved at run time (those in a .dll).
My question is: how does the compiler know about this, or how do we tell the compiler about it?
When you link your code, the linker searches both static and dynamic libraries for the undefined symbols. If it finds a symbol exported by a dynamic library, it defers symbol resolution to runtime; if it finds the symbol in a static library, it resolves the symbol right away; and if it doesn't find the symbol at all, it reports an error (unless you're building a shared library, in which case it's OK).
You can examine the dynamic symbols exported by a shared library using nm -D.
You must declare a prototype for functions whose bodies are not available at compile time.
You do this by including the appropriate header (.h file), which will contain a declaration like so:
int foo(int bar);
Note the lack of a body there.
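A small sketch of how that plays out, with the caller name call_foo_twice invented for the example: the compiler only needs the declaration to compile the call, and the body can arrive later. Here the definition is placed at the end of the same file so the example is complete; normally it would come from a library at link time or at run time.

```c
/* Declaration only: this is what the header would provide. */
int foo(int bar);

/* The call compiles because the compiler knows foo's signature,
 * even though it has not seen a body yet. */
int call_foo_twice(int value)
{
    return foo(foo(value));
}

/* Definition: normally supplied by the library, not by this file. */
int foo(int bar)
{
    return bar + 1;
}
```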
Often with shared libraries there is also a layer of indirection where a struct containing function pointers is formed. When the library is loaded, it adjusts the function pointers to reference the functions contained in the shared library.
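The indirection can be sketched roughly like this. It is a simplified, single-file illustration, not how any particular loader does it: math_api, bind_math_api, add_impl and sub_impl are hypothetical names, and the "load" step is just a function that fills in the pointers.

```c
#include <stddef.h>

/* Table of function pointers that callers use instead of direct calls. */
struct math_api {
    int (*add)(int, int);
    int (*sub)(int, int);
};

/* Stand-ins for the functions that would live in the shared library. */
static int add_impl(int a, int b) { return a + b; }
static int sub_impl(int a, int b) { return a - b; }

/* Stand-in for the load step: point each slot at its implementation.
 * Returns 0 on success, -1 if no table was supplied. */
int bind_math_api(struct math_api *api)
{
    if (api == NULL)
        return -1;
    api->add = add_impl;
    api->sub = sub_impl;
    return 0;
}
```

Until bind_math_api() has run the slots are unusable, which mirrors the fact that calls through such a table only work once the library has been loaded.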
Those that can be resolved at link time are; those that can't are then searched for in shared libraries at run time.
The linker does the job.
For static libraries, the linker copies the library code into your executable. Calls go to fixed positions in memory.
For dynamic libraries, the linker puts in a runtime "searcher" for the library. Dynamic libraries publish the list of their functions and their relative memory addresses, so the runtime can fill in the list of function pointers to them.
The original code for dynamic functions could be compiled as a call through a function pointer.
[Indeed, that's the job of the linker: replace the function calls with references to the functions to produce the executable.]
The compiler needs to know the function declaration at compile time. The linker will then link the call to the definition at link time to make an executable.
For dynamically loaded libraries you insert code to fetch the symbols at runtime using dlopen, dlsym and dlclose. These calls search for the symbols, and if a symbol is not found in the dynamic library they return an error, so you need to handle that error as well. Loading a dynamic library does not by itself ensure that the symbols have been resolved and linked: each symbol still has to be present in the library when it is loaded.
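Here is a minimal sketch of that load, look up, close sequence with the error handling included. It assumes a POSIX system with dlfcn.h (on older glibc versions you may need to link with -ldl); the function name symbol_available and its return convention are made up for the example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if the symbol is found, 0 if the library loads but the symbol
 * is missing, and -1 if the library itself cannot be loaded.
 * A NULL lib_path asks dlopen for a handle to the running program. */
int symbol_available(const char *lib_path, const char *name)
{
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL)
        return -1;              /* dlerror() would describe the failure */

    dlerror();                  /* clear any stale error state */
    (void)dlsym(handle, name);
    /* A symbol's value may legitimately be NULL, so check dlerror()
     * rather than the pointer dlsym returned. */
    int found = (dlerror() == NULL);

    dlclose(handle);
    return found;
}
```

A real program would keep the handle open for as long as it uses the pointer returned by dlsym; this sketch closes it immediately because it only reports whether the lookup succeeded.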