process of running linux executable - c

Is there good documentation of what happens when I run an executable in Linux? For example: I start ./a.out, so presumably some bootstrap assembly is run (does it come with the C runtime?), which finds the start symbol in the program, performs dynamic relocation, and finally calls main.
I know the above is not correct, but I am looking for detailed documentation of how this process happens. Can you please explain, or point to links or books that do?

For dynamically linked programs, the kernel detects the PT_INTERP header in the ELF file and first mmaps the dynamic linker (/lib/ld-linux.so.2 or similar), and starts execution at the e_entry address from the main ELF header of the dynamic linker. The initial state of the stack contains the information the dynamic linker needs to find the main program binary (already in memory). The dynamic linker is responsible for reading this and finding all the additional libraries that must be loaded, loading them, performing relocations, and jumping to the e_entry address of the main program.
For statically linked programs, the kernel uses the e_entry address from the main program's ELF header directly.
In either case, the main program begins with a routine written in assembly traditionally called _start (but the name is not important as long as its address is in the e_entry field of the ELF header). It uses the initial stack contents to determine argc, argv, environ, etc. and calls the right implementation-internal functions (usually written in C) to run global constructors (if any) and perform any libc initialization needed prior to the entry to main. This usually ends with a call to exit(main(argc, argv)); or equivalent.
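As a rough illustration of the two header fields mentioned above, here is a minimal sketch that reads e_entry and the PT_INTERP path straight out of an ELF file. The function name elf_entry_and_interp is made up for this example, and the sketch assumes a glibc-style <link.h> (for the ElfW macro) and a file of the host's native ELF class and byte order.

```c
#include <link.h>    /* ElfW() macro; pulls in <elf.h> */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: stores e_entry of `path` in *entry and copies the
 * PT_INTERP string (if any) into interp.
 * Returns 1 if PT_INTERP is present, 0 if absent, -1 on error. */
int elf_entry_and_interp(const char *path, unsigned long *entry,
                         char *interp, size_t interp_len)
{
    if (interp_len == 0) return -1;
    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    ElfW(Ehdr) eh;
    int result = -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        goto out;                      /* short read or not an ELF file */

    *entry = (unsigned long)eh.e_entry;
    result = 0;                        /* valid ELF, no interpreter seen yet */

    /* Walk the program header table looking for PT_INTERP. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            break;
        }
        if (ph.p_type != PT_INTERP) continue;

        /* The segment contents are the NUL-terminated interpreter path. */
        size_t n = ph.p_filesz < interp_len ? ph.p_filesz : interp_len - 1;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(interp, 1, n, f) != n) {
            result = -1;
            break;
        }
        interp[n] = '\0';
        result = 1;
        break;
    }
out:
    fclose(f);
    return result;
}
```

Calling it on /proc/self/exe inspects the running program itself; a return value of 0 would indicate a statically linked binary. readelf -h and readelf -l show the same fields from the command line.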

The book "Linkers and Loaders" by John R. Levine gives a detailed description of the loading process. It may help you with this problem.

Related

Does linker add the library function into the source code?

The linker's basic function is to link object code with other object code (which can be standard library code).
#include <stdio.h>
int main()
{
    printf("hello");
}
I want to know whether the linker will replace the printf() call with its definition (like an inline function in C++), or whether it will paste the printf() function outside the main() function and pass "hello" as an argument to that function.
For printf("hello");, the compiler generates an instruction to call a subroutine. It leaves the address of the subroutine not completely filled in. The object module the compiler generates has some notes about what routine’s address should be filled in there.
The linker may work in different ways. For static linking, the linker will find the implementation of printf in a library and copy the object module for it from the library into the executable file it is building. Depending on certain characteristics of the link, the linker might then complete the call instruction with the final address of the printf routine or it might leave notes in the executable file about the relationship between the call instruction and the printf routine. Later, when the program is being loaded into memory, the program loader will complete the address in the instruction.
For dynamic linking, the linker will find the implementation of printf in a library (or in a file with sufficient information about the library). It will not copy the printf function’s object module into the executable file, but it will include notes about the relationship between the call instruction and the printf routine and its library in the executable file. Later, the program loader will copy the printf function’s object module into the memory of the process. (This might be done by mapping part of the process’ virtual address space to physical memory that already contains the object module from the library and that is shared by other processes on the system. This sharing reduces the load on the system and makes dynamic loading more favorable in this regard.) And the loader will complete the address in the call instruction.
Some dynamic loading is not done as soon as the program is loaded. When a process is started, the loader might load just the program entry point and some essential parts. Some call instructions might be left incomplete. They will have been filled in with the addresses of special subroutines of the program loader (or dynamic library loader). When one of these subroutines is called, it will then load the desired routine, change the address in the call instruction (or otherwise arrange for future calls to call the desired routine), and then jump to the desired routine. This is beneficial because routines that are not used by your program in a particular run do not have to be loaded into memory at all. For example, if your program has a lot of code and data to log errors and inform the user when certain errors occur, that code and data does not have to be loaded into memory if those errors do not occur in a particular session.
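The "special subroutine that patches the call and then jumps" idea can be imitated in plain C with a function pointer. This is only a toy model under that assumption, not how any real loader is implemented; all names (real_square, square_slot, lazy_stub, call_square) are invented for the illustration.

```c
/* Stand-in for a routine that would live in a shared library. */
static int real_square(int x) { return x * x; }

static int resolve_count = 0;      /* how many times the resolver ran */
static int lazy_stub(int x);       /* forward declaration */

/* One slot per imported function; it initially points at the stub. */
static int (*square_slot)(int) = lazy_stub;

/* The first call lands here: "find" the real routine, patch the slot,
 * then forward the call so the caller still gets its result. */
static int lazy_stub(int x)
{
    resolve_count++;
    square_slot = real_square;     /* later calls bypass the stub */
    return square_slot(x);
}

/* What the caller's call instruction amounts to: an indirect call. */
int call_square(int x) { return square_slot(x); }

/* Lets a caller observe how often resolution happened. */
int lazy_resolve_count(void) { return resolve_count; }
```

After call_square(3) and call_square(4), lazy_resolve_count() reports 1: only the first call pays for the lookup.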

How are shared libraries referenced by various programs?

I understand that shared libraries are loaded into memory and used by various programs.
How can a program know where in memory the library is?
When a shared library is used, there are two parts to the linkage process. At compile time, the linker program, ld in Linux, links against the shared library in order to learn which symbols are defined by it. However, none of the code or data initializers from the shared library are actually included in the ultimate a.out file. Instead, ld just records which dynamic libraries were linked against and the information is placed into an auxiliary section of the a.out file.
The second phase takes place at execution time, before main gets invoked. The kernel loads a small helper program, ld.so, into the address space and this gets executed. Therefore, the start address of the program is not main or even _start (if you have heard of it). Rather, it is actually the start address of the dynamic library loader.
In Linux, the kernel maps the ld.so loader code into a convenient place in the process address space and sets up the stack so that the list of required shared libraries (and other necessary info) is present. The dynamic loader finds each of the required libraries by looking at a sequence of directories, which are often specified in the LD_LIBRARY_PATH environment variable. There is also a pre-defined list which is hard-coded into ld.so (and additional search places can be hard-coded into the a.out during link time). For each of the libraries, the dynamic loader reads its header and then uses mmap to create memory regions for the library.
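The directory search can be sketched roughly as follows. This is a simplified model: the helper name find_in_dirs is hypothetical, empty path entries are skipped here (a real loader may treat them differently), and the other search locations mentioned above are ignored.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: look for `name` in each directory of the
 * colon-separated list `dirs` (the format LD_LIBRARY_PATH uses).
 * Writes the first existing candidate into out and returns 1,
 * or returns 0 if no directory contains it. */
int find_in_dirs(const char *dirs, const char *name, char *out, size_t out_len)
{
    while (dirs && *dirs) {
        size_t len = strcspn(dirs, ":");          /* length of this entry */
        if (len > 0) {
            int n = snprintf(out, out_len, "%.*s/%s", (int)len, dirs, name);
            /* Accept only untruncated paths that actually exist. */
            if (n > 0 && (size_t)n < out_len && access(out, F_OK) == 0)
                return 1;
        }
        dirs += len;
        if (*dirs == ':') dirs++;                 /* skip the separator */
    }
    if (out_len) out[0] = '\0';
    return 0;
}
```

For instance, find_in_dirs(getenv("LD_LIBRARY_PATH"), "libfoo.so.1", buf, sizeof buf) would mimic the first step of the lookup, where libfoo.so.1 is a placeholder name.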
Now for the fun part.
Since the actual libraries used at run-time to satisfy the requirements are not known at link-time, we need to figure out a way to access functions defined in the shared library and global variables that are exported by the shared library (this practice is deprecated since exporting global variables is not thread-safe, but it is still something we try to handle).
Global variables are assigned a static address at link time and are then accessed by absolute memory address.
For functions exported by the library, the user of the library is going to emit a series of call assembly instructions, which reference an absolute memory address. But, the exact absolute memory address of the referenced function is not known at link time. How do we deal with this?
Well, the linker creates what is known as a Procedure Linkage Table, which is a series of jmp (assembly jump) instructions. The target of the jump is filled in at run time.
Now, when dealing with the dynamic portions of the code (i.e. the .o files that have been compiled with -fpic), there are no absolute memory references whatsoever. In order to access global variables which are also visible to the static portion of the code, another table called the Global Offset Table is used. This table is an array of pointers. At link time, since the absolute memory addresses of the global variables are known, the linker populates this table. Then, at run time, dynamic code is able to access the global variables by first finding the Global Offset Table, then loading the address of the correct variable from the appropriate slot in the table, and finally dereferencing the pointer.

When is dynamic linking between a program and a shared library performed?

In C, when is dynamic linking between a program and a shared library performed:
Once loading of the program into the memory, but before executing the main() of the program, or
After executing the main() of the program, when the first call to a routine from the library is executed? Will dynamic linking happen again when a second or third or... call to a routine from the library is executed?
I was thinking the first, until I read the following quote, and now I am not sure.
Not sure if OS matters, I am using Linux.
From Operating System Concepts:
With dynamic linking, a stub is included in the image for each library-routine reference. The stub is a small piece of code that indicates how to locate the appropriate memory-resident library routine or how to load the library if the routine is not already present.
When the stub is executed, it checks to see whether the needed routine is already in memory. If it is not, the program loads the routine into memory. Either way, the stub replaces itself with the address of the routine and executes the routine. Thus, the next time that particular code segment is reached, the library routine is executed directly, incurring no cost for dynamic linking. Under this scheme, all processes that use a language library execute only one copy of the library code.
I was thinking the first, until I read the following quote, and now I am not sure.
It's complicated (and depends on exactly what you call "dynamic linking").
The Linux kernel loads a.out into memory. It then looks for a PT_INTERP segment (if any).
If that segment is not present, the binary is statically linked and the kernel transfers control to Elf{32,64}_Ehdr.e_entry (usually the _start routine).
If the PT_INTERP segment is present, the kernel loads the interpreter it names into memory, and transfers control to the interpreter's e_entry. It is here that the dynamic linking begins.
The dynamic loader relocates itself, then looks in a.out's PT_DYNAMIC segment for instructions on what else is necessary.
For example, it will usually find one or more DT_NEEDED entries -- shared libraries that a.out was directly linked against. The loader loads any such libraries, initializes them, and resolves any data references between them.
If a.out's PT_DYNAMIC has a DT_FLAGS entry, and if that entry contains the DF_BIND_NOW flag, then function references from a.out will also be resolved. Otherwise (and assuming that LD_BIND_NOW is not set in the environment), lazy PLT resolution will be performed (resolving functions as part of the first call to any given function). Details here.
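To make the DT_NEEDED and DF_BIND_NOW checks concrete, here is a small sketch that walks a dynamic array the way a loader might. The struct and function names are invented for this example. Besides DT_FLAGS, it also treats the older DT_BIND_NOW tag and the GNU DT_FLAGS_1/DF_1_NOW flag as requests for immediate binding, which is an addition to what the text above describes.

```c
#include <link.h>   /* ElfW() macro and the DT_*/DF_* constants via <elf.h> */

/* Hypothetical summary of the entries discussed above. */
struct dyn_summary {
    int needed;     /* number of DT_NEEDED entries */
    int bind_now;   /* 1 if eager binding was requested at link time */
};

/* Walks a DT_NULL-terminated dynamic array. */
struct dyn_summary summarize_dynamic(const ElfW(Dyn) *dyn)
{
    struct dyn_summary s = {0, 0};
    if (!dyn) return s;
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_NEEDED)
            s.needed++;
        else if (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_BIND_NOW))
            s.bind_now = 1;
        else if (dyn->d_tag == DT_BIND_NOW)
            s.bind_now = 1;
        else if (dyn->d_tag == DT_FLAGS_1 && (dyn->d_un.d_val & DF_1_NOW))
            s.bind_now = 1;
    }
    return s;
}
```

In a dynamically linked program built against glibc, passing the _DYNAMIC array (declared in <link.h>) should summarize the running executable's own dynamic section; readelf -d prints the same entries.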
When the stub is executed, it checks to see whether the needed routine is already in memory. If it is not, the program loads the routine into memory.
I don't know which book you are quoting from, but no current UNIX OS works that way.
The OS (and compiler, etc.) certainly matters: the language itself has nothing to say about dynamic libraries (and very little about linking in general). Even if we know that dynamic linking is occurring, a strictly-conforming program cannot observe any effect from timing among its translation units (since non-local initialization cannot have side effects).
That said, the common toolchains on Linux do support automatic initialization upon loading a dynamic library (for implementing C++, among other things). Executables and the dynamic libraries on which they depend (usually specified with -l) are loaded and initialized recursively to allow initialization in each module to (successfully) use functions from its dependencies. (There is an unfortunate choice of order in some cases.) Of course, dlopen(3) can be used to load and initialize more libraries later.

Is Dynamic Linker part of Kernel or GCC Library on Linux Systems?

Is Dynamic Linker (aka Program Interpreter, Link Loader) part of Kernel or GCC Library ?
UPDATE (28-08-16):
I have found that the default path for dynamic linker that every binary (i.e linked against a shared library) uses /lib64/ld-linux-x86-64.so.2 is a link to the shared library /lib/x86_64-linux-gnu/ld-2.23.so which is the actual dynamic linker.
And It is part of libc6 (2.23-0ubuntu3) package viz. GNU C Library: Shared libraries in ubuntu for AMD64 architectures.
My actual question was
What would happen to all the applications that are dynamically linked (nearly all of them, nowadays) if this helper program (ld-2.23.so) didn't exist?
And the answer to that is "no application would run, not even the shell program". I've tried it on a virtual machine.
In an ELF executable, this is referred to as the "ELF interpreter". On linux (e.g.) this is /lib64/ld-linux-x86-64.so.2
This is not part of the kernel; it is [generally] shipped with glibc et al.
When the kernel executes an ELF executable, it must map the executable into userspace memory. It then looks inside for a special segment known as PT_INTERP (the .interp section) [which contains a string that is the full path of the interpreter].
The kernel then maps the interpreter into userspace memory and transfers control to it. Then, the interpreter does the necessary linking/loading and starts the program.
Because ELF ("Executable and Linkable Format") is extensible, it allows many different sub-sections within the ELF file.
Rather than burdening the kernel with having to know about all the myriad of extensions, the ELF interpreter that is paired with the file knows.
Although usually only one format is used on a given system, there can be several different variants of ELF files on a system, each with its own ELF interpreter.
This would allow [say] a BSD ELF file to be run on a linux system [with other adjustments/support] because the ELF file would point to the BSD ELF interpreter rather than the linux one.
UPDATE:
every process(vlc player, chrome) had the shared library ld.so as part of their address space.
Yes. I assume you're looking at /proc/<pid>/maps. These are mappings (e.g. like using mmap) to the files. That is somewhat different than "loading", which can imply [symbol] linking.
So primarily loader after loading the executable(code & data) onto memory , It loads& maps dynamic linker (.so) to its address space
The best way to understand this is to rephrase what you just said:
So primarily the kernel after mapping the executable(code & data) onto memory, the kernel maps dynamic linker (.so) to the program address space
That is essentially correct. The kernel also maps other things, such as the bss segment and the stack. It then "pushes" argc, argv, and envp [the space for environment variables] onto the stack.
Then, having determined the start address of ld.so [by reading a special section of the file], it sets that as the resume address and starts the thread.
Up until now, it has been the kernel doing things. The kernel does little to no symbol linking.
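Alongside argc, argv, and envp, the kernel also leaves an "auxiliary vector" on the stack, and some of what it set up can be read back from there. The following is a minimal sketch assuming glibc 2.16 or later, which provides getauxval(); the struct and function names are made up for illustration.

```c
#include <sys/auxv.h>   /* getauxval(); AT_* constants */

/* Hypothetical container for a few values the kernel passed at exec time. */
struct start_info {
    unsigned long entry;        /* AT_ENTRY: entry point of the main program */
    unsigned long interp_base;  /* AT_BASE: base address of the ELF interpreter, 0 if none */
    unsigned long phdr;         /* AT_PHDR: address of the program headers in memory */
    unsigned long phnum;        /* AT_PHNUM: number of program headers */
};

struct start_info read_start_info(void)
{
    struct start_info si;
    si.entry       = getauxval(AT_ENTRY);
    si.interp_base = getauxval(AT_BASE);
    si.phdr        = getauxval(AT_PHDR);
    si.phnum       = getauxval(AT_PHNUM);
    return si;
}
```

Note that entry is the executable's start address, not ld.so's. Running any dynamically linked program with LD_SHOW_AUXV=1 set in the environment makes glibc's loader print the same vector.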
Now, ld.so takes over ...
which further Loads shared Libraries , map & resolve references to libraries. It then calls entry function (_start)
Because the original executable (e.g. vlc) has been mapped into memory, ld.so can examine it for the list of shared libraries that it needs. It maps these into memory, but does not necessarily link the symbols right away.
Mapping is easy and quick--just an mmap call.
The start address of the executable [not to be confused with the start address of ld.so], is taken from a special section of the ELF executable. Although, the symbol associated with this start address has been traditionally called _start, it could actually be named anything (e.g. __my_start) as it is what is in the section data that determines the start address and not address of the symbol _start
Linking symbol references to symbol definitions is a time consuming process. So, this is deferred until the symbol is actually used. That is, if a program has references to printf, the linker doesn't actually try to link in printf until the first time the program actually calls printf
This is sometimes called "link-on-demand" or "on-demand-linking". See my answer here: Which segments are affected by a copy-on-write? for a more detailed explanation of that and what actually happens when an executable is mapped into userspace.
If you're interested, you could do ldd /usr/bin/vlc to get a list of the shared libraries it uses. If you looked at the output of readelf -a /usr/bin/vlc, you'll see these same shared libraries. Also, you'd get the full path of the ELF interpreter and could do readelf -a <full_path_to_interpreter> and note some of the differences. You could repeat the process for any .so files that vlc wanted.
Combining all that with /proc/<pid>/maps et al. might help with your understanding.
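A process can also look at its own mappings. Here is a small sketch that counts the lines of /proc/self/maps containing a given substring; the helper name is hypothetical, and the fixed-size line buffer assumes no single line exceeds it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines of /proc/self/maps that contain
 * `needle`. Returns -1 if the file cannot be opened. */
int count_self_mappings(const char *needle)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) return -1;

    char line[4096];
    int count = 0;
    while (fgets(line, sizeof line, f)) {
        if (strstr(line, needle))
            count++;
    }
    fclose(f);
    return count;
}
```

For example, count_self_mappings("ld-") would typically be nonzero in a dynamically linked process, since the ELF interpreter's file name usually starts with "ld-"; that naming is a convention, not a guarantee.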

how to get minimum executable opcodes for c program?

to get opcodes author here does following:
[bodo#bakawali testbed8]$ as testshell2.s -o testshell2.o
[bodo#bakawali testbed8]$ ld testshell2.o -o testshell2
[bodo#bakawali testbed8]$ objdump -d testshell2
and then he gets three sections (or mentions only these 3):
<_start>
< starter>
< ender>
I have tried to get hex opcodes the same way but cannot ld correctly. Of course I can produce .o and prog file for example with:
gcc main.o -o prog -g
however when
objdump --prefix-addresses --show-raw-insn -Srl prog
to see complete code with annotations and symbols, I have many additional sections there, for example:
.init
.plt
.text (yes, I know, main is here) [many parts here: _start(), call_gmon_start(), __do_global_dtors_aux(), frame_dummy(), main(), __libc_csu_init(), __libc_csu_fini(), __do_global_ctors_aux()]
.fini
I assume these are additions introduced by gcc linking to runtime libraries. I think i don't need these all sections to call opcode from c code (author uses only those 3 sections) however my problem is I don't know which exactly I might discard and which are necessary. I want to use it like this:
#include <unistd.h>
char code[] = "\x31\xed\x49\x89\x...x00\x00";
int main(int argc, char **argv)
{
    /* creating a function pointer */
    int (*func)();
    func = (int (*)()) code;
    (int)(*func)();
    return 0;
}
so I have created this :
#include <unistd.h>

int main() {
    char *shell[2];
    shell[0] = "/bin/sh";
    shell[1] = NULL;
    execve(shell[0], shell, NULL);
    return 0;
}
and I did disassembly as I described. I tried to use opcode from .text main(), this gave me segmentation fault, then .text main() + additionally .text _start(), with same result.
So, what to choose from above sections, or how to generate only as minimized "prog" as with three sections?
char code[] = "\x31\xed\x49\x89\x...x00\x00";
This will not work.
Reason: The code definitely contains addresses. Mainly the address of the function execve() and the address of the string constant "/bin/sh".
The executable using the "code[]" approach will not contain a string constant "/bin/sh" at all and the address of the function execve() will be different (if the function will be linked into the executable at all).
Therefore the "call" instruction to the "execve()" function will jump to anywhere in the executable using the "code[]" approach.
Some theory about executables - just for your information:
There are two possibilities for executables:
Statically linked: These executables contain all necessary code. Therefore they do not access dynamic libraries like "libc.so"
Dynamically linked: These executables do not contain code that is frequently used. Such code is stored in files common to all executables: The dynamic libraries (e.g. "libc.so")
When the same C code is used then statically linked executables are much bigger than dynamically linked executables because all C functions (e.g. "printf", "execve", ...) must be bundled into the executable.
When not using any of these library functions the statically linked executables are simpler and therefore easier to understand.
Statically linked executable behaviour
A statically linked executable is loaded into the memory by the operating system (when it is started using execve()). The executable contains an entry point address. This address is stored in the file header of the executable. You can see it using "objdump -f ..." (or "readelf -h ...").
The operating system performs a jump to that address so the program execution starts at this address. The address is typically the function "_start" however this can be changed using command line options when linking using "ld".
The code at "_start" will prepare the executable (e.g. initialize variables, calculate the values for "argc" and "argv", ...) and call the "main()" function. When "main()" returns, the startup code will pass the value returned by "main()" to the "exit()" function.
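The "calculate the values for argc and argv" step can be sketched as follows, assuming the usual Linux layout at process entry: one word holding argc, then the argv pointers, a NULL, the environment pointers, and another NULL. The names start_args and parse_initial_stack are invented here; real startup code is partly assembly and differs per architecture and C library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for the values startup code hands to main(). */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp points at the first word of the initial stack, viewed as an array of
 * pointer-sized slots: [argc][argv0]...[argvN-1][NULL][env0]...[NULL]. */
struct start_args parse_initial_stack(char **sp)
{
    struct start_args a;
    a.argc = (int)(intptr_t)sp[0];   /* first slot holds a count, not a pointer */
    a.argv = sp + 1;                 /* argv[] starts right after argc */
    a.envp = a.argv + a.argc + 1;    /* skip argv[] and its NULL terminator */
    return a;
}
```

Startup code would then end with something equivalent to exit(main(a.argc, a.argv, a.envp)).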
Dynamically linked executable behaviour
Such executables contain two additional sections. The first section contains the file name of the dynamic linker (e.g. "/lib/ld-linux.so.1"). The operating system will then load the executable and the dynamic linker and jump to the entry point of the dynamic linker (and not to that of the executable).
The dynamic linker will read the second additional section: It contains information about dynamic libraries (e.g. "libc.so") required by the executable. It will load all these libraries and initialize a lot of variables. Then it calls the initialization function ("_init()") of all libraries and of the executable.
Note that both the operating system and the dynamic linker ignore the function and section names! The address of the entry point is taken from the file header and the addresses of the "_init()" functions are taken from the additional section - the functions may be named differently!
When all this is done the dynamic linker will jump to the entry point ("_start") of the executable.
About the "GOT", "PLT", ... sections:
These sections contain information about the addresses where the dynamic libraries have been loaded by the linker. The "PLT" section contains wrapper code that will contain jumps to the dynamic libraries. This means: The section "PLT" will contain a function "printf()" that will actually do nothing but jump to the "printf()" function in "libc.so". This is done because directly calling a function in a dynamic library from C code would make linking much more difficult so C code will not call functions in a dynamic library directly. Another advantage of this implementation is that "lazy linking" is possible.
Some words about Windows
Windows only knows dynamically linked executables. Windows XP even refused to load an executable not requiring DLLs. The "dynamic linker" is integrated into the operating system and not a separate file. There is also an equivalent of the "PLT" section. However many compilers support "directly" calling DLL code from C code without calling the code in the PLT section first (theoretically this would also be possible under Linux). Lazy linking is not supported.
You should read this article: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html.
It explains all you need to create really tiny program in great detail.
