What is __libc_start_main and _start? - c

For the past few days I have been trying to understand what happens behind the curtain when we execute a C program. However, even after reading numerous posts, I cannot find a detailed and accurate explanation. Can someone please help me out?

You would usually find special names like this for specific uses when compiling and linking programs.
Keeping in mind that this answer is of a general nature rather than a specific implementation of starting up a C environment, you would typically have something like a _start label, which would be the actual entry point for an executable (from the hosting environment's point of view).
This would be located in some object file or library (like crt0.o for the C runtime start-up code) and would normally be added automagically to your executable file by the linker, similar to the way the C runtime library is added(a).
The operating system code for starting a program would then be akin to (pseudo-code, obviously, and with much less error checking than it should have):
def spawnProg(progName):
    id = newProcess()                          # make process space
    loadProgram(pid = id, file = progName)     # load program into it
    newThread(pid = id, initialPc = '_start')  # make thread to run it
Even though you yourself create a main when coding in C, that's not really where things start happening. There's a whole slew of things that need to be done even before your main program starts. Hence the content of the C start-up code would be along the lines of (at its most simplistic):
_start: ;; Weave magic here to set up C and libc.
        ;; Note this is example code for a mythical implementation,
        ;; intended to show how it could work. It is not specifically
        ;; bound to any given implementation.
        call __setup_for_c       ; Set up C environment.
        call __libc_start_main   ; Set up standard library.
        call _main               ; Call your main.
        call __libc_stop_main    ; Tear down standard library.
        call __teardown_for_c    ; Tear down C environment.
        jmp  __exit              ; Return to OS.
The "weaving of magic" is whatever it takes to make the environment ready for a C program. This may include things like:
setting up static data (this is supposed to be initialised to zeros so it's probably just an allocation of a chunk of memory, which is then zeroed by the start-up code - otherwise you would need to store a chunk of that size, already zeroed, in the executable file);
preparing argc and argv on the stack, and even preparing the stack itself (there are specific calling conventions that may be used for C, and it's likely the operating system doesn't necessarily set up the stack at all when calling _start since the needs of the process are not known);
setting up thread-specific data structures (things like random number generators, or error variables, per thread);
initialising the C library in other ways; and so on.
Only once all that is complete will it be okay to call your main function. There's also the likelihood that work needs to be done after your main exits, such as:
invoking atexit handlers (things you want run automatically on exit, no matter where the exit occurs);
detaching from shared resources (for example, shared memory if the OS doesn't do this automatically when it shuts down a process); and
freeing up any other resources not automatically cleaned when the process exits, that would otherwise hang around.
(a) Many linkers can be told to not do that if, for example, you're writing something that doesn't use the standard C library, or if you want to provide your own _start routine for low-level work.

Related

Does _start call my program's main function and other essential setup functions?

I'm reading a textbook which describes how loader works:
When the loader runs, it copies chunks of the executable object file into the code and data segments. Next, the loader jumps to the program’s entry point, which is always the address of the _start function. The _start function calls the system startup function, __libc_start_main
From the answer of this question What is __libc_start_main and _start? we have the below pseudo-code about the execution flow:
_start:
call __setup_for_c ; set up C environment
call __libc_start_main ; set up standard library
call _main ; call your main
call __libc_stop_main ; tear down standard library
call __teardown_for_c ; tear down C environment
jmp __exit ; return to OS
My questions are:
I used objdump to check the program's assembly code and found that _start only calls __libc_start_main.
What about the rest of the functions, like call __setup_for_c, _main, etc.? In particular, I can't see how my program's main function gets called. So is the pseudo-code about the execution flow correct?
What does "__libc_start_main sets up the standard library" mean? Why does the standard library need to be set up? Doesn't the standard library just need to be linked by the dynamic linker when the program is loaded?
Pseudo-code isn't code ;) __libc_start_main() can call the application's main() because the address of main() will have been fixed up by the linker. The order in which the code generated by the compiler does initialization might be interesting, but you shouldn't assume it will be the same from one compiler to another, or even one release to another. It's probably best not to rely on things being done in a particular way if you can avoid it.
As to what needs to be initialized -- standard C libraries like glibc are hugely complex, and a lot of stuff needs to be initialized. To take one example, the memory allocator's block table has to be set up, so that malloc() doesn't start with a random pattern of memory allocation.
The other function calls described in the linked answer give a synopsis of what needs to happen; the actual implementation details in the GNU C library are different, either using "constructors" (_dl_start_user), or explicitly in __libc_start_main. __libc_start_main also takes care of calling the user's main, which is why you don't see it called in your disassembly, but its address is passed along (see the lea just before the callq). __libc_start_main also takes care of the program exit, and never returns; that's the reason for the hlt just after the callq, which will crash the program if the function returns.
The library needs quite a lot of setup nowadays:
some of its own relocation
thread-local storage setup
pthread setup
destructor registration
vDSO setup (on Linux)
ctype initialisation
copying the program name, arguments and environment to various library variables
etc. See the x86-64-specific sysdeps/x86_64/start.S and the generic csu/libc-start.c, csu/init-first.c, and misc/init-misc.c among others.
what about the rest of functions like call __setup_for_c ,_main etc?
Those are just fancy made-up readable names used in the linked answer to transfer the meaning of that answer better.
how it get called
Your standard library implementation doesn't provide functions named __setup_for_c or _main, so they don't exist and don't get called. Every implementation may choose different names for the functions.
is the pseudo-code about the execution flow correct?
Yes - and the word "pseudo-code" you used implies that you are aware that it's not real code.
what does __libc_start_main setup standard library mean?
It means a symbol with the name __libc_start_main. __libc_start_main is a function that initializes all standard library things and runs main in glibc. It initializes libc, pthreads, atexit and finally runs main. glibc is open source, so just look at it.
why standard library needs to be setup?
Because it was written in the way that it depends on it. The simplest is, when you write:
int var = 42; // variable with static storage duration
int main() {
    return var == 42;
}
(Assuming the optimizer doesn't kick in) the value 42 has to be in the memory held for var before main is executed. So something has to run before main and make sure the 42 ends up in the memory of var. This is the simplest case of why something has to execute before main. Global variables are used in many places and all of them need to be set up. For example, a variable named program_invocation_name in glibc holds the name of the program, so some code needs to query the environment or kernel for the name of the program and store (and potentially parse) that string into a global variable (and also remember to free() that string on exit if it was dynamically allocated). Some code "has to do it", and that code is in the standard library initialization.
There are many more cases - in C++ and other languages there are constructors, there is gcc GNU extension __attribute__((__constructor__)) and .init/.preinit sections - all of them executed before main. And destructors have to execute on exit, but not on _exit - thus atexit stuff is initialized before main and all destructors may be registered with it, depending on implementation.
The environment needs to be initialized, potentially the stack, and some more stuff. And thread-local variables need to be allocated only for the current thread, so that when you pthread_create another thread they don't get copied along with non-thread-local variables.
isn't that standard library just need to be linked by the dynamic linker when the program is loaded?
It is - when the program is loaded, the standard library is just linked. The compiler, when generating the program, uses crt code to include some startup code into the program - for example a call to __libc_start_main.

When is dynamic linking between a program and a shared library performed?

In C, when is dynamic linking between a program and a shared library performed:
Once loading of the program into the memory, but before executing the main() of the program, or
After executing the main() of the program, when the first call to a routine from the library is executed? Will dynamic linking happen again when a second or third or... call to a routine from the library is executed?
I was thinking the first, until I read the following quote, and now I am not sure.
Not sure if OS matters, I am using Linux.
From Operating System Concepts:
With dynamic linking, a stub is included in the image for each
library- routine reference. The stub is a small piece of code that
indicates how to locate the appropriate memory-resident library
routine or how to load the library if the routine is not already
present.
When the stub is executed, it checks to see whether the needed routine is already in memory. If it is not, the program loads the
routine into memory. Either way, the stub replaces itself with the
address of the routine and executes the routine. Thus, the next time
that particular code segment is reached, the library routine is
executed directly, incurring no cost for dynamic linking. Under this
scheme, all processes that use a language library execute only one
copy of the library code.
I was thinking the first, until I read the following quote, and now I am not sure.
It's complicated (and depends on exactly what you call "dynamic linking").
The Linux kernel loads a.out into memory. It then examines the PT_INTERP segment (if any).
If that segment is not present, the binary is statically linked and the kernel transfers control to the Elf{32,64}_Ehdr.e_entry (usually the _start routine).
If the PT_INTERP segment is present, the kernel loads the interpreter it names (the dynamic loader) into memory, and transfers control to its .e_entry. It is here that the dynamic linking begins.
The dynamic loader relocates itself, then looks in a.out's PT_DYNAMIC segment for instructions on what else is necessary.
For example, it will usually find one or more DT_NEEDED entries -- shared libraries that a.out was directly linked against. The loader loads any such libraries, initializes them, and resolves any data references between them.
If a.out's PT_DYNAMIC has a DT_FLAGS entry, and if that entry contains the DF_BIND_NOW flag, then function references from a.out will also be resolved. Otherwise (and assuming that LD_BIND_NOW is not set in the environment), lazy PLT resolution will be performed (resolving functions as part of the first call to any given function).
When the stub is executed, it checks to see whether the needed routine is already in memory. If it is not, the program loads the routine into memory.
I don't know which book you are quoting from, but no current UNIX OS works that way.
The OS (and compiler, etc.) certainly matters: the language itself has nothing to say about dynamic libraries (and very little about linking in general). Even if we know that dynamic linking is occurring, a strictly-conforming program cannot observe any effect from timing among its translation units (since non-local initialization cannot have side effects).
That said, the common toolchains on Linux do support automatic initialization upon loading a dynamic library (for implementing C++, among other things). Executables and the dynamic libraries on which they depend (usually specified with -l) are loaded and initialized recursively to allow initialization in each module to (successfully) use functions from its dependencies. (There is an unfortunate choice of order in some cases.) Of course, dlopen(3) can be used to load and initialize more libraries later.

Possible drawbacks of overriding the entry point of a main program

So I was trying to set my own custom name for main in my C program, and I found this answer.
You can specify an entry point to your program using the -e flag to ld.
That means you can override the entry point if you like, but you may not want to do that for a C program you intend to run normally on your machine, since _start might do all kinds of OS-specific stuff that's required before your program runs.
What would be the (possible) drawbacks of not calling _start from crt0.o and writing my own that simply does whatever I want it to?
The entry point usually does stuff like
Prepare arguments, call main, and handle its exit
Call global constructors before main and destructors after
Populate global variables like environ and the like
Initialize the C runtime, e.g. timezone, stdio streams and such
Maybe configure x87 to use 80-bit floating point
Inflate and zero .bss if your loader doesn't
Whatever else is necessary for hosted C programs to run on your platform
These things are tightly coupled to your C implementation, so usually you provide your own _start only when you are targeting a freestanding environment.

What is the need for C startup routine?

Quoting from one of the unix programming books,
When a C program is executed by the
kernel (by one of the exec functions),
a special start-up routine is called
before the main function is called.
The executable
program file specifies this routine as
the starting address for the program;
this is set up by the link editor when
it is invoked by the C compiler. This
start-up routine takes values from the
kernel (the command-line arguments and
the environment) and sets things up so
that the main function is called as
shown earlier.
Why do we need a middle-man start-up routine? The exec function could have called the main function straight away, and the kernel could have passed the command-line arguments and environment directly to the main function. Why do we need the start-up routine in between?
Because C has no concept of "plug in". So if you want to use, say, malloc() someone has to initialize the necessary data structures. The C programmers were lazy and didn't want to have to write code like this all the time:
main() {
    initialize_malloc();
    initialize_stdio();
    initialize_...();
    initialize_...();
    initialize_...();
    initialize_...();
    initialize_...();
    ... oh wow, can we start already? ...
}
So the C compiler figures out what needs to be done, generates the necessary code and sets up everything so you can start with your code right away.
The start-up routine initializes the CRT (i.e. creates the CRT heap so that malloc/free work, initializes standard I/O streams, etc.); in case of C++ it also calls the globals' constructors. There may be other system-specific setup, you should check the sources of your run-time library for more details.
Calling main() is a C thing, while calling _start() is a kernel thing, indicated by the entry point in the binary format header. (for clarity: the kernel doesn't want or need to know that we call it _start)
If you would have a non-C binary, you might not have a main() function, you might not even have the concept of a "function" at all.
So the actual question would be: why doesn't a compiler give the address of main() as a starting point? That's because typical libc implementations want to do some initializations before really starting the program, see the other answers for that.
edit as an example, you can change the entry point like this:
$ cat entrypoint.c
#include <stdio.h>
#include <stdlib.h>
int blabla() { printf("Yes it works!\n"); exit(0); }
int main() { printf("not called\n"); }
$ gcc entrypoint.c -e blabla
$ ./a.out
Yes it works!
It is also important to know that an application program executes in user mode, and that any system call switches the processor into privileged (kernel) mode. This improves OS security by preventing user code from directly executing privileged operations, and avoids a myriad of other complications. So when a call such as printf eventually issues a system call to write its output, that call traps into the kernel, the kernel code runs in kernel mode, and the processor then switches back to user mode and returns to your application.
The CRT is required to help you and allow you to use the languages you want on Windows and Linux. It provides some very fundamental bootstrapping into the OS to give you feature sets for development.

Mixing Assembly language and C programs

I am using a bootloader program which is in Assembly and I am calling a C function frequently to SEND and RECEIVE a Character at a time. The controller I am using seems to have just 3 general purpose registers which it uses frequently. Apart from that I am storing some bytes in fixed RAM locations.
SO, my question is:
Will C function overwrite these RAM location, which were defined in Assembly?
I am doing PUSH and PULL of the concerned registers before going and after coming from these C functions.
If I understand your question correctly, you are concerned about the RAM locations used in your assembly module overlapping with some variable declared in a C module. You can examine the list file output by your linker to determine if this is the case. The linker list file will show all of the RAM addresses used by your C modules which you can compare to the fixed RAM locations used in the assembly module.
Note that if your linker does not produce a list file automatically, you will have to read through your linker's documentation to find the right command line option to do so.
As long as you are keeping the previous values on the stack when doing the c calls you should be fine. Just make sure that you are pushing onto stack before the call and popping off the stack after returning.
It all depends on the calling convention that the C code was compiled with. The calling convention is how the caller and callee communicate with regard to passing data into the function and returning values afterwards. This includes who will do things like back up registers onto the stack before/after the call, whether it is necessary to prepare the registers before calling the C function, whether you can rely on the registers coming back the way they were, etc.
You'll need to find out how the C code was compiled (with what Calling Convention setting). Note that this is also architecture specific. A summary of the different calling conventions and a description of what each entails can be found at Wikipedia here:
http://en.wikipedia.org/wiki/Calling_convention
http://en.wikipedia.org/wiki/X86_calling_conventions
On x86, cdecl and stdcall are the most popular conventions. cdecl means the caller (your ASM code) removes the arguments from the stack after the call, while stdcall makes the function being called responsible for it. If you have the source code for the C function, you could pass the necessary flags to the compiler to make it use a "callee cleanup" convention (usually stdcall, but safecall and fastcall are also options), so that your assembly does not have to clean up the stack after each call. Note that stack cleanup is separate from register preservation: the calling convention also defines which registers the callee may overwrite, so check that list before assuming a register survives the call.

Resources