Is the linker retained after a call to exec?

I'm trying to watch the linker load libraries and search for symbols in zygote on Android.
Zygote is started by init (or a start/stop zygote command on the CLI). In the init.*.rc files, I've modified it to have an environment variable of LD_DEBUG=2 (via a setenv LD_DEBUG 2 line in the init.zygote[32|64].rc file). In the linker code, this should print debugging information to logcat, which it does if you invoke a program from the command line, e.g. LD_DEBUG=2 ./myprogram.
However, this does not work for zygote. I took a look at /proc/[zygote's pid]/environ, and the LD_DEBUG=2 is definitely there.
The init code for a service (among other things) sets up the environment array, calls fork(), does a little more work, and then calls exec() with the array it created.
So, I'm wondering, is there any way that the linker would be retained across this fork-exec? I didn't think that was possible, since I thought an exec completely wiped a process's memory space.
I could see how, if it were retained, it wouldn't be invoked again, since the linker has already been loaded; but if the process's memory space is being wiped, this doesn't make sense.
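For reference, the fork-then-exec-with-an-environment-array pattern can be reproduced with a small stand-alone sketch (this is not Android's init source; the command and paths are purely illustrative). The exec wipes the child's old image, but the new image, including a freshly mapped dynamic linker, starts up with exactly the environment array it was handed:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    /* Illustrative command; in init's case this would be the service binary
       named in the .rc file (e.g. the zygote executable). */
    char *argv[] = { "/bin/echo", "hello from the new image", NULL };
    char *envp[] = { "LD_DEBUG=2", NULL };

    pid_t pid = fork();
    if (pid == 0) {
        /* exec replaces the child's entire memory image; the new image
           (and its freshly mapped dynamic linker) sees only this envp. */
        execve(argv[0], argv, envp);
        perror("execve");
        _exit(127);
    }
    waitpid(pid, NULL, 0);
    return 0;
}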

Related

gdb core dump warning: Can't open file /memfd:magicringbuffer (deleted) during file-backed mapping note processing

I implemented a magic ring buffer (MRB) on linux using memfd_create, ftruncate, mmap, and munmap. The fd returned by memfd_create gets close()'d after the buffer is fully constructed. The MRB itself runs and works perfectly fine.
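For reference, a minimal sketch of that construction (sizes and names are illustrative; on older glibc, such as CentOS 7's, there is no memfd_create() wrapper, so you may need syscall(__NR_memfd_create, ...) instead):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    const size_t size = 1 << 16;            /* must be a multiple of the page size */

    int fd = memfd_create("magicringbuffer", 0);
    if (fd < 0 || ftruncate(fd, size) < 0) {
        perror("memfd_create/ftruncate");
        return 1;
    }

    /* Reserve 2*size of address space, then map the memfd twice, back to
       back, so accesses past the end wrap into the second view. */
    char *base = mmap(NULL, 2 * size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return 1;
    mmap(base,        size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    mmap(base + size, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);

    close(fd);                              /* the mappings outlive the descriptor */

    base[0] = 'A';
    printf("%c\n", base[size]);             /* prints 'A': same page, second view */

    munmap(base, 2 * size);
    return 0;
}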
The problem:
Someone creates a core file of a process running this MRB with gcore.
They then try to load it with gdb <executable> -c <core-file>
gdb then prints a warning:
warning: Can't open file /memfd:magicringbuffer (deleted) during file-backed mapping note processing
Additional notes:
"magicringbuffer" is the string passed as the name parameter in memfd_create(const char *name, unsigned int flags);
built and run on CentOS version 7
Questions:
What exactly does this warning mean? What causes it? Is it because the "file" is virtual, or because it was close()'d?
What are the implications? Could it lead to missing debug symbols? The <executable> is indeed a binary with debug symbols.
I tried to look for an answer on the internet, but I found nothing satisfactory.
GDB is trying to reconstruct the virtual address space of the former process, at the time of the core dump, as accurately as possible. This includes re-creating all mmap regions. The message means simply that GDB tried, and failed, to re-create the mmap region that was backed by the memfd. IIRC, the annotation in the core file that tells GDB that an mmap region existed -- the "file-backed mapping note" -- was designed before memfd_create was a thing, and so GDB doesn't know it should be calling memfd_create() instead of regular old open() for this one. And even if it did, it wouldn't be able to recover access to the original memfd area (which might be completely gone by the time you get around to debugging from the core dump).
The practical upshot of this is that, while debugging from the core dump, you won't be able to look at the contents of memory within your magic ring buffer. Debug symbols, however, should be unaffected.
This is arguably a bug in either the kernel or gcore (not sure which); the contents of memfd-backed memory regions should arguably be dumped into the core file like MAP_ANONYMOUS regions, rather than generating file-backed mapping notes.

How to run an arbitrary script or executable from memory?

I know I can use a call like execl("/bin/sh", "sh", "-c", some_string, (char *)NULL) to interpret a "snippet" of shell code using a particular shell/interpreter. But in my case I have an arbitrary string in memory that represents some complete script which needs to be run. That is, the contents of this string/memory buffer could be:
#! /bin/bash
echo "Hello"
Or they might be:
#! /usr/bin/env python
print "Hello from Python"
I suppose in theory the string/buffer could even include a valid binary executable, though that's not a particular priority.
My question is: is there any way to have the system launch a subprocess directly from a buffer of memory I give it, without writing it to a temporary file? Or at least, a way to give the string to a shell and have it route it to the proper interpreter?
It seems that all the system calls I've found expect a path to an existing executable, rather than something low level which takes an executable itself. I do not want to parse the shebang or anything myself.
You haven't specified the operating system, but since #! is specific to Unix, I assume that's what you're talking about.
As far as I know, there's no system call that will load a program from a block of memory rather than a file. The lowest-level system call for loading a program is the execve() function, and it requires a pathname of the file to load from.
Simple answer: no.
Detailed answer:
execl and shebang convention are POSIXisms, so this answer will focus on POSIX systems. Whether the program you want to execute is a script utilizing the shebang convention or a binary executable, the exec-family functions are the way for a userspace program to cause a different program to run. Other interfaces such as system() and popen() are implemented on top of these.
The exec-family functions all expect to load a process image from a file. Moreover, on success they replace the contents of the process in which they are called, including all memory assigned to it, with the new image.
More generally, substantially all modern operating systems enforce process isolation, and one of the central pillars of process isolation is that no process can access another's memory.
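As a concrete illustration (a minimal sketch; run_snippet is a hypothetical helper, not a standard API), handing a shell snippet to /bin/sh still comes down to fork() plus an exec-family call that loads the interpreter from a file, which is essentially what system() does under the hood:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int run_snippet(const char *snippet)        /* hypothetical helper */
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* argv[0] must be supplied and the argument list ends with NULL. */
        execl("/bin/sh", "sh", "-c", snippet, (char *)NULL);
        _exit(127);                         /* only reached if exec failed */
    }
    int status;
    waitpid(pid, &status, 0);
    return status;
}

int main(void)
{
    return run_snippet("echo \"Hello\"");
}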

Are functions marked with __attribute__((constructor)) run again when a shared library is reloaded?

Assume that no other executable linked against a shared library libshlib has already been loaded. And assume libshlib contains one function marked with __attribute__((constructor)) and one function marked with __attribute__((destructor)). When an executable that is linked against libshlib is started, libshlib will be loaded and the corresponding function marked with __attribute__((constructor)) is run exactly once. But what happens if the shared library is reloaded, e.g. via a user-defined signal such as SIGUSR1? From my testing it seems that the __attribute__((constructor)) function is not run again. Is that correct, or is there a standard saying otherwise?
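For concreteness, a minimal stand-in for such a library might look like this (the function names are illustrative; one way to build it would be cc -shared -fPIC -o libshlib.so shlib.c):

/* shlib.c */
#include <stdio.h>

__attribute__((constructor))
static void shlib_init(void)                /* runs when libshlib is loaded */
{
    fprintf(stderr, "libshlib: constructor\n");
}

__attribute__((destructor))
static void shlib_fini(void)                /* runs when libshlib is unloaded */
{
    fprintf(stderr, "libshlib: destructor\n");
}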
I assume you have a program that was linked via (e.g.):
cc -o mypgm mypgm.o -lshlib
Upon execution, once the ELF interpreter has loaded libshlib.so and executed the constructor(s), the library is never loaded again. Side note: To find your interpreter do: readelf -a mypgm | grep interpreter:
If the program receives a signal (e.g. SIGUSR1), the signal is either caught by a signal handler (assuming signal or sigaction has been called to set one up), or the default action is taken (which is [IIRC] program termination for SIGUSR1). This does not cause the library to be reloaded.
There is no other action that can cause the library to be reloaded either. Only the destructor(s) will be called upon program exit (e.g. main returns or exit is called).
Even manually calling the destructors has no effect because the constructors and destructors are independent. (e.g. The constructor could do able = malloc(...) and the destructor could do free(able). But, the destructor could do free(baker) instead.). Calling a destructor does not "reset" a constructor.
To get the "reload" effect, the library would need to be dynamically loaded/unloaded via dlopen/dlsym/dlclose. That is, the link command would be:
cc -o mypgm mypgm.o
Then, mypgm would [at some point] call dlopen("libshlib.so") (and the constructor(s) would be called). When [and if] mypgm calls dlclose, libshlib.so will be unloaded (and destructor(s) called).
If mypgm then called dlopen("libshlib.so") a second time, the constructors would be called [again].
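A minimal sketch of that dlopen/dlclose cycle might look as follows (the symbol name shlib_function is hypothetical, and the program is deliberately not linked with -lshlib so nothing holds a static reference; build with something like cc -o mypgm mypgm.c -ldl):

/* mypgm.c */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 2; i++) {
        void *h = dlopen("./libshlib.so", RTLD_NOW);   /* constructors run here */
        if (!h) {
            fprintf(stderr, "%s\n", dlerror());
            return 1;
        }

        /* Look up symbols only via dlsym; a link-time reference would pin
           the library and defeat the unload. */
        void (*fn)(void) = (void (*)(void))dlsym(h, "shlib_function");
        if (fn)
            fn();

        dlclose(h);          /* refcount drops to 0, destructors run, unload */
    }
    return 0;
}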
UPDATE:
Note that calling dlclose does not necessarily unload the library or call destructors.
I just checked the code [in glibc]. The library has a refcount. The library will be unloaded if the refcount is 1 upon entry to dlclose, which should be the case for libshlib.so above with dlopen [as nobody else bumps it up].
In other words, to force the "desired" behavior nothing else should refer to libshlib via an -lshlib. Not the program or any other .so. This lays the groundwork.
Note that if libshlib.so wanted glibc, but so did the program, unloading libshlib will bump down the glibc refcount, but glibc will remain because its refcount is [still] >0.
There are conditions under which the library can't be unloaded (in fact, these conditions are much more common than the conditions under which it can be unloaded).
Again, this is dependent upon the refcount and [possibly] some state. When the library is loaded from a "static" linkage (vs. dlopen), the refcount gets an extra increment, so it won't get yanked.
The code also handles the case where a constructor calls dlopen on its own library.
For a given libA, if it needs libB, B's refcount gets upped/downed by A's load/unload.
If the library is not unloaded, then it's not well defined whether destructors will run, or whether a subsequent dlopen will run constructors again.
The whole point of using dlopen this way for libshlib is to guarantee the loading at dlopen and unloading at dlclose [along with constructor/destructor action]. This will be true if there is no static reference to it or cyclic dependency, which was the starting criteria.
UPDATE #2:
The part about "as nobody else bumps it up" is way too simplistic.
Don't confuse prose with substance.
As mentioned above: This will be true if there is no static reference to it or cyclic dependency.
This means that only the executable/object that does the dlopen/dlclose for shlib refers to a symbol in shlib.
And, this is only via dlsym. Otherwise, it's a static reference (i.e. present in the object's symbol table as UNDEF).
And, no shared library that shlib drags in refers to a symbol defined in shlib [the cyclic dependency].
Look at all the places where DF_1_NODELETE is set during symbol resolution.
Yes, I did look.
DF_1_NODELETE is set in only the following places. None of them apply to this situation [or most dlopen scenarios].
If the flags argument to dlopen has RTLD_NODELETE, which we can avoid.
If profiling is enabled, the profile map [not related to our dlopen] gets DF_1_NODELETE set.
If a symbol has a bind type (e.g. LOCAL, GLOBAL, WEAK, etc.) that is type 10 (STB_GNU_UNIQUE).
A symbol is referenced by a non-dynamic object as an UNDEF [in its symbol table] that can't be removed [because the referent object has DF_1_NODELETE set]. This does not apply because of the pre-conditions above.
There was a malloc failure when adding a dependency [from a different object]. This code doesn't even execute for the case herein.
And, OP's usage aside, there are legitimate reasons for dlopen/dlclose to work as I've set up/described.
See my answer here: Is it possible to perform munmap based on information in /proc/self/maps?
There, the OP needed a nonstop program that could run for months or years (e.g. a high-reliability, mission-critical application). If an updated version of one of its shared libraries was installed [via a package manager, etc.], the program had to be able to load the newer version dynamically, on the fly, without a re-exec.

What and where exactly is the loader?

I understand every bit of the C compilation process (how the object files are linked to create the executable). But I have a few doubts about the loader itself (the thing that starts the program running).
Is the loader part of the kernel?
How exactly is ./firefox or some command like that loaded? I mean, you normally type such commands into the terminal, which loads the executable, I presume. So is the loader a component of the shell?
I think I'm also confused about where the terminal/shell fits into all of this and what its role is.
The format of an executable determines how it will be loaded. For example, executables with "#!" as the first two characters are loaded by the kernel by executing the named interpreter and feeding the file to it as the first argument. If the executable is formatted as a PE, ELF, or Mach-O binary, then the kernel uses an interpreter for that format that is built into the kernel in order to find the executable code and data and then choose the next step.
In the case of a dynamically linked ELF, the next step is to execute the dynamic loader (usually ld.so) in order to find the libraries, load them, and resolve the symbols. This all happens in userspace. The kernel is more or less unaware of dynamic linking, because it all happens in userspace after the kernel has handed control to the interpreter named in the ELF file.
The corresponding system call is exec. It is part of the kernel and is in charge of tearing down the old address space of the process that makes the call and giving it a fresh one with everything needed to run the new code. This is part of the kernel because the address space is a kind of sandbox that protects processes from one another, and since it is critical, it is the kernel's responsibility.
The shell is just in charge of interpreting what you type and transforming it into the proper structures (arrays of C strings) to pass to some exec call (after, most of the time, having spawned a new process with fork).
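A minimal sketch of that shell role (greatly simplified; real shells add quoting, redirection, job control, and so on) might look like this:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    char line[256];
    while (fgets(line, sizeof line, stdin)) {
        /* Turn the typed line into an array of C strings. */
        char *argv[32];
        int argc = 0;
        for (char *tok = strtok(line, " \t\n");
             tok != NULL && argc < 31;
             tok = strtok(NULL, " \t\n"))
            argv[argc++] = tok;
        argv[argc] = NULL;
        if (argc == 0)
            continue;

        pid_t pid = fork();                /* spawn a new process ... */
        if (pid == 0) {
            execvp(argv[0], argv);         /* ... and replace it with the command */
            perror(argv[0]);
            _exit(127);
        }
        waitpid(pid, NULL, 0);
    }
    return 0;
}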

Fork and dynamic library interaction

I considered the following experiment: a simple C program that only returns 0, but linked with all the libraries gcc allowed me to link, 207 in total. It takes a lot of time to run this program: 2.1 seconds from a cold start, 0.24 seconds warm. So the next step was to write a program, also linked against this heap of libraries, which will fork and exec the first program on request. The idea was that since it has already loaded the libraries, and fork creates an identical copy of the process, I would get the first program running very quickly. But I found no difference between running the first program via the shell and via the second program linked with all the libraries.
What is my mistake?
EDIT: Yeah, I missed the point of exec. But is there any possible refinement of my idea to speed up application startup? I know about prelink, but that works on a somewhat different idea.
The only advantage of what you're doing is that it gets all the libraries read from disk into the filesystem cache (same as your "warm start"). Otherwise, what you're doing is exactly how the shell loads a program (fork and exec) so I don't see how you expect it to be faster. The idea that this will "copy" a process is true if you just fork, but you also exec.
To make a "copying" analogy with the filesystem, it's like if you took a file that was really slow to generate, copied it, then rm'd it and generated it all over again rather than using the copy.
fork creates an exact copy of the process, however exec clears the process's memory. Therefore all the libraries have to be loaded again (or at least initialised; the code segments might be shared).
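A minimal sketch of the fork-without-exec variant implied by that answer (run_app is a hypothetical stand-in for the real program's entry point, compiled into the launcher itself): the parent pays the library-loading cost once, and each request is served by a forked child that inherits the already-mapped libraries instead of exec'ing a fresh image.

#include <unistd.h>
#include <sys/wait.h>

static int run_app(void)                   /* stand-in for the real program's work */
{
    return 0;
}

int main(void)
{
    /* Libraries are loaded once, here, when this launcher starts. */
    for (int request = 0; request < 3; request++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(run_app());              /* child reuses the parent's mappings */
        waitpid(pid, NULL, 0);
    }
    return 0;
}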
