Find program's code address at runtime? - c

When I use gdb to debug a program written in C, the command disassemble shows the codes and their addresses in the code memory segmentation. Is it possible to know those memory addresses at runtime? I am using Ubuntu OS. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to have the address of myfunction() in the code memory segmentation when I run my program.

Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on windows is similar) that then jumps to the function.

If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is given at run-time, I once wrote a function to find out the symbol address dynamically using bfd library. Here is the x86_64 code, you can get the address via find_symbol("a.out", "myfunction") in the example.
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <type.h>
#include <string.h>
long find_symbol(char *filename, char *symname)
{
bfd *ibfd;
asymbol **symtab;
long nsize, nsyms, i;
symbol_info syminfo;
char **matching;
bfd_init();
ibfd = bfd_openr(filename, NULL);
if (ibfd == NULL) {
printf("bfd_openr error\n");
}
if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
printf("format_matches\n");
}
nsize = bfd_get_symtab_upper_bound (ibfd);
symtab = malloc(nsize);
nsyms = bfd_canonicalize_symtab(ibfd, symtab);
for (i = 0; i < nsyms; i++) {
if (strcmp(symtab[i]->name, symname) == 0) {
bfd_symbol_info(symtab[i], &syminfo);
return (long) syminfo.value;
}
}
bfd_close(ibfd);
printf("cannot find symbol\n");
}

To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
compiled
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass (-rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]

About a comment in an answer (getting the address of an instruction), you can use this very ugly trick
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful as it looks machine dependent, so triple check my word. This is the output
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!

Related

Library interpositioning

I have been trying to intercept calls to malloc and free, following our textbook (CSAPP book).
I have followed their exact code, and nearly the same code that I found online and I keep getting a segmentation fault. I heard our professor saying something about printf that mallocs and frees memory so I think that this happens because I am intercepting a malloc and since I am using a printf function inside the intercepting function, it will call itself recursively.
However I can't seem to find a solution to solving this problem? Our professor demonstrated that intercepting worked ( he didn't show us the code) and prints our information every time a malloc occurs, so I do know that it's possible.
Can anyone suggest a working method??
Here is the code that I used and get nothing:
mymalloc.c
#ifdef RUNTIME
// Run-time interposition of malloc and free based on // dynamic linker's (ld-linux.so) LD_PRELOAD mechanism #define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h> #include <dlfcn.h>
void *malloc(size_t size) {
static void *(*mallocp)(size_t size) = NULL; char *error;
void *ptr;
// get address of libc malloc
if (!mallocp) {
mallocp = dlsym(RTLD_NEXT, "malloc"); if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(EXIT_FAILURE);
}
}
ptr = mallocp(size);
printf("malloc(%d) = %p\n", (int)size, ptr); return ptr;
}
#endif
test.c
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("main\n");
int* a = malloc(sizeof(int)*5);
a[0] = 1;
printf("end\n");
}
The result i'm getting:
$ gcc -o test test.c
$ gcc -DRUNTIME -shared -fPIC mymalloc.c -o mymalloc.so
$ LD_PRELOAD=./mymalloc.so ./test
Segmentation Fault
This is the code that I tried and got segmentation fault (from https://gist.github.com/iamben/4124829):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void* malloc(size_t size)
{
static void* (*rmalloc)(size_t) = NULL;
void* p = NULL;
// resolve next malloc
if(!rmalloc) rmalloc = dlsym(RTLD_NEXT, "malloc");
// do actual malloc
p = rmalloc(size);
// show statistic
fprintf(stderr, "[MEM | malloc] Allocated: %lu bytes\n", size);
return p;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STR_LEN 128
int main(int argc, const char *argv[])
{
char *c;
char *str1 = "Hello ";
char *str2 = "World";
//allocate an empty string
c = malloc(STR_LEN * sizeof(char));
c[0] = 0x0;
//and concatenate str{1,2}
strcat(c, str1);
strcat(c, str2);
printf("New str: %s\n", c);
return 0;
}
The makefile from the git repo didn't work so I manually compiled the files and got:
$ gcc -shared -fPIC libint.c -o libint.so
$ gcc -o str str.c
$ LD_PRELOAD=./libint.so ./str
Segmentation fault
I have been doing this for hours and I still get the same incorrect result, despite the fact that I copied textbook code. I would really appreciate any help!!
One way to deal with this is to turn off the printf when your return is called recursively:
static char ACallIsInProgress = 0;
if (!ACallIsInProgress)
{
ACallIsInProgress = 1;
printf("malloc(%d) = %p\n", (int)size, ptr);
ACallIsInProgress = 0;
}
return ptr;
With this, if printf calls malloc, your routine will merely call the actual malloc (via mallocp) and return without causing another printf. You will miss printing information about a call to malloc that the printf does, but that is generally tolerable when interposing is being used to study the general program, not the C library.
If you need to support multithreading, some additional work might be needed.
The printf implementation might allocate a buffer only once, the first time it is used. In that case, you can initialize a flag that turns off the printf similar to the above, call printf once in the main routine (maybe be sure it includes a nice formatting task that causes printf to allocate a buffer, not a plain string), and then set the flag to turn on the printf call and leave it set for the rest of the program.
Another option is for your malloc routine not to use printf at all but to cache data in a buffer to be written later by some other routine or to write raw data to a file using write, with that data interpreted and formatted by a separate program later. Or the raw data could be written by a pipe to a program that formats and prints it and that is not using your interposed malloc.

How to find load relocation for a PIE binary?

I need to get base address of stack inside my running process. This would enable me to print raw stacktraces that will be understood by addr2line (running binary is stripped, but addr2line has access to symbols).
I managed to do this by examining elf header of argv[0]: I read entry point and substract it from &_start:
#include <stdio.h>
#include <execinfo.h>
#include <unistd.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>
void* entry_point = NULL;
void* base_addr = NULL;
extern char _start;
/// given argv[0] will populate global entry_pont
void read_elf_header(const char* elfFile) {
// switch to Elf32_Ehdr for x86 architecture.
Elf64_Ehdr header;
FILE* file = fopen(elfFile, "rb");
if(file) {
fread(&header, 1, sizeof(header), file);
if (memcmp(header.e_ident, ELFMAG, SELFMAG) == 0) {
printf("Entry point from file: %p\n", (void *) header.e_entry);
entry_point = (void*)header.e_entry;
base_addr = (void*) ((long)&_start - (long)entry_point);
}
fclose(file);
}
}
/// print stacktrace
void bt() {
static const int MAX_STACK = 30;
void *array[MAX_STACK];
auto size = backtrace(array, MAX_STACK);
for (int i = 0; i < size; ++i) {
printf("%p ", (long)array[i]-(long)base_addr );
}
printf("\n");
}
int main(int argc, char* argv[])
{
read_elf_header(argv[0]);
printf("&_start = %p\n",&_start);
printf("base address is: %p\n", base_addr);
bt();
// elf header is also in memory, but to find it I have to already have base address
Elf64_Ehdr * ehdr_addr = (Elf64_Ehdr *) base_addr;
printf("Entry from memory: %p\n", (void *) ehdr_addr->e_entry);
return 0;
}
Sample output:
Entry point from file: 0x10c0
&_start = 0x5648eeb150c0
base address is: 0x5648eeb14000
0x1321 0x13ee 0x29540f8ed09b 0x10ea
Entry from memory: 0x10c0
And then I can
$ addr2line -e a.out 0x1321 0x13ee 0x29540f8ed09b 0x10ea
/tmp/elf2.c:30
/tmp/elf2.c:45
??:0
??:?
How can I get base address without access to argv? I may need to print traces before main() (initialization of globals). Turning of ASLR or PIE is not an option.
How can I get base address without access to argv? I may need to print traces before main()
There are a few ways:
If /proc is mounted (which it almost always is), you could read the ELF header from /proc/self/exe.
You could use dladdr1(), as Antti Haapala's answer shows.
You could use _r_debug.r_map, which points to the linked list of loaded ELF images. The first entry in that list corresponds to a.out, and its l_addr contains the relocation you are looking for. This solution is equivalent to dladdr1, but doesn't require linking against libdl.
Could you provide sample code for 3?
Sure:
#include <link.h>
#include <stdio.h>
extern char _start;
int main()
{
uintptr_t relocation = _r_debug.r_map->l_addr;
printf("relocation: %p, &_start: %p, &_start - relocation: %p\n",
(void*)relocation, &_start, &_start - relocation);
return 0;
}
gcc -Wall -fPIE -pie t.c && ./a.out
relocation: 0x555d4995e000, &_start: 0x555d4995e5b0, &_start - relocation: 0x5b0
Are both 2 and 3 equally portable?
I think they are about equally portable: dladdr1 is a GLIBC extension that is also present on Solaris. _r_debug predates Linux and would also work on Solaris (I haven't actually checked, but I believe it will). It may work on other ELF platforms as well.
This piece of code produces the same value as your base_addr on Linux:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
Dl_info info;
void *extra = NULL;
dladdr1(&_start, &info, &extra, RTLD_DL_LINKMAP);
struct link_map *map = extra;
printf("%#llx", (unsigned long long)map->l_addr);
The dladdr1 manual page says the following of RTLD_DL_LINKMAP:
RTLD_DL_LINKMAP
Obtain a pointer to the link map for the matched file. The
extra_info argument points to a pointer to a link_map structure (i.e., struct link_map **), defined in as:
struct link_map {
ElfW(Addr) l_addr; /* Difference between the
address in the ELF file and
the address in memory */
char *l_name; /* Absolute pathname where
object was found */
ElfW(Dyn) *l_ld; /* Dynamic section of the
shared object */
struct link_map *l_next, *l_prev;
/* Chain of loaded objects */
/* Plus additional fields private to the
implementation */
};
Notice that -ldl is required to link against the dynamic loading routines.

mmap load shared object and get function pointer

I want to dynamically load a library without using functions from dlfcn.h i have a folder full of .so files compiled with:
gcc -Wall -shared -fPIC -o filename.so filename.c
And all of them have an entry function named:
void __start(int size, char** cmd);
(I know, probably not the best name)
then i try to call open over the so, then read the elf header as an Elf64_Ehdr load it into memory with mmap (i know i should use mprotect but i want to make it work first and then add the security) and finally adding the the elf header entry to the pointer returned by mmap (plus an offset of 0x100).
The test code is the following:
#include <elf.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
typedef void (*ptr_t) (int, char**);
void* alloc_ex_memory(int fd) {
struct stat s;
fstat(fd, &s);
void * ptr = mmap(0, s.st_size, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_PRIVATE | MAP_ANONYMOUS, fd, 0);
if (ptr == (void *)-1) {
perror("mmap");
return NULL;
}
return ptr;
}
void* load_ex_file(const char* elfFile) {
Elf64_Ehdr header;
void * ptr;
FILE* file = fopen(elfFile, "rb");
if(file) {
fread(&header, 1, sizeof(header), file);
if (memcmp(header.e_ident, ELFMAG, SELFMAG) == 0) {
ptr = alloc_ex_memory(fileno(file));
printf("PTR AT -> %p\n", ptr);
printf("Entry at -> %lx\n", header.e_entry+256);
ptr = (header.e_entry + 256);
} else {
ptr = NULL;
}
fclose(file);
return ptr;
}
return NULL;
}
int main(int argc, char** argv) {
ptr_t func = load_ex_file(argv[1]);
printf("POINTER AT: %p\n", func);
getchar();
func(argc, argv);
return 0;
}
Let me explain a bit further:
When i run objdump -d filename.so i get that the _start is at 0x6c0. The elf header entry point returns 0x5c0 (i compensate it adding 256 in dec).
Also pmap shows an executable area being created at lets say 0x7fdf94d0c000
so the function pointer direction that i get and which i call is at 0x7fdf94d0c6c0 this should be the entry point and callable via function pointer. But what i get is, you guessed it: segfault.
Las thing that i would like to point out is that i have the same example running with (dlopen, dlsym, dlclose) but its required that i use the mmap trick. I have seen my professor implement it successfully but i cannot figure out how. (perhaps there is a simpler way that im missing).
I based the code on this jit example
Thank you in advance!
Edit: This is the code of the .c that i compile into .so:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void __start(int size, char** cmd) {
if (size==2) {
if (!strcmp(cmd[1], "-l")) {
printf("******.********.*****#udc.es\n");
return;
} else if (!strcmp(cmd[1], "-n")) {
printf("****** ******** *****\n");
return;
}
} else if (size==1) {
printf("****** ******** ***** (******.********.*****#udc.es)\n");
return;
}
printf("Wrong command syntax: autores [-l | -n]\n");
}
its required that i use the mmap trick. I have seen my professor implement it successfully but i cannot figure out how.
For this to work, your __start must be completely stand-alone, and not call any other library (you violated that requirement by calling strcmp and printf).
The moment you call an unresolved symbol, you ask the dynamic loader to resolve that symbol, and the loader needs all kinds of info for that. That info is normally set up during dlopen call. Since you bypassed dlopen, it is not at all surprising that your program crashes.
Your first step should be to change your code such that __start doesn't do anything at all (is empty), and verify that you can then call __start. That would confirm that you are computing its address correctly.
Second, add a crash to that code:
void __start(...)
{
int *p = NULL;
*p = 42; // should crash here
}
and verify that you are observing the crash now. That will confirm that your code is getting called.
Third, remove the crashing code and use only direct system calls (e.g. write(1, "Hello\n", 6) (but don't call write -- you must implement it directly in your library)) to implement whatever you need.

executing init and fini

I just read about init and fini sections in ELF files and gave it a try:
#include <stdio.h>
int main(){
puts("main");
return 0;
}
void init(){
puts("init");
}
void fini(){
puts("fini");
}
If I do gcc -Wl,-init,init -Wl,-fini,fini foo.c and run the result the "init" part is not printed:
$ ./a.out
main
fini
Did the init part not run, or was it not able to print somehow?
Is there a any "official" documentation about the init/fini stuff?
man ld says:
-init=name
When creating an ELF executable or shared object, call
NAME when the executable or shared object is loaded, by
setting DT_INIT to the address of the function. By
default, the linker uses "_init" as the function to call.
Shouldn't that mean, that it would be enough to name the init function _init? (If I do gcc complains about multiple definition.)
Don't do that; let your compiler and linker fill in the sections as they see fit.
Instead, mark your functions with the appropriate function attributes, so that the compiler and linker will put them in the correct sections.
For example,
static void before_main(void) __attribute__((constructor));
static void after_main(void) __attribute__((destructor));
static void before_main(void)
{
/* This is run before main() */
}
static void after_main(void)
{
/* This is run after main() returns (or exit() is called) */
}
You can also assign a priority (say, __attribute__((constructor (300)))), an integer between 101 and 65535, inclusive, with functions having a smaller priority number run first.
Note that for illustration, I marked the functions static. That is, the functions won't be visible outside the file scope. The functions do not need to be exported symbols to be automatically called.
For testing, I suggest saving the following in a separate file, say tructor.c:
#include <unistd.h>
#include <string.h>
#include <errno.h>
static int outfd = -1;
static void wrout(const char *const string)
{
if (string && *string && outfd != -1) {
const char *p = string;
const char *const q = string + strlen(string);
while (p < q) {
ssize_t n = write(outfd, p, (size_t)(q - p));
if (n > (ssize_t)0)
p += n;
else
if (n != (ssize_t)-1 || errno != EINTR)
break;
}
}
}
void before_main(void) __attribute__((constructor (101)));
void before_main(void)
{
int saved_errno = errno;
/* This is run before main() */
outfd = dup(STDERR_FILENO);
wrout("Before main()\n");
errno = saved_errno;
}
static void after_main(void) __attribute__((destructor (65535)));
static void after_main(void)
{
int saved_errno = errno;
/* This is run after main() returns (or exit() is called) */
wrout("After main()\n");
errno = saved_errno;
}
so you can compile and link it as part of any program or library. To compile it as a shared library, use e.g.
gcc -Wall -Wextra -fPIC -shared tructor.c -Wl,-soname,libtructor.so -o libtructor.so
and you can interpose it into any dynamically linked command or binary using
LD_PRELOAD=./libtructor.so some-command-or-binary
The functions keep errno unchanged, although it should not matter in practice, and use the low-level write() syscall to output the messages to standard error. The initial standard error is duplicated to a new descriptor, because in many instances, the standard error itself gets closed before the last global destructor -- our destructor here -- gets run.
(Some paranoid binaries, typically security sensitive ones, close all descriptors they don't know about, so you might not see the After main() message in all cases.)
It is not a bug in ld but in the glibc startup code for the main executable. For shared objects the function set by the -init option is called.
This is the commit to ld adding the options -init and -fini.
The _init function of the program isn't called from file glibc-2.21/elf/dl-init.c:58 by the DT_INIT entry by the dynamic linker, but called from __libc_csu_init in file glibc-2.21/csu/elf-init.c:83 by the main executable.
That is, the function pointer in DT_INIT of the program is ignored by the startup.
If you compile with -static, fini isn't called, too.
DT_INIT and DT_FINI should definitely not be used, because they are old-style, see line 255.
The following works:
#include <stdio.h>
static void preinit(int argc, char **argv, char **envp) {
puts(__FUNCTION__);
}
static void init(int argc, char **argv, char **envp) {
puts(__FUNCTION__);
}
static void fini(void) {
puts(__FUNCTION__);
}
__attribute__((section(".preinit_array"), used)) static typeof(preinit) *preinit_p = preinit;
__attribute__((section(".init_array"), used)) static typeof(init) *init_p = init;
__attribute__((section(".fini_array"), used)) static typeof(fini) *fini_p = fini;
int main(void) {
puts(__FUNCTION__);
return 0;
}
$ gcc -Wall a.c
$ ./a.out
preinit
init
main
fini
$

reading the environment when executing ELF IFUNC dispatch functions

The IFUNC mechanism in recent ELF tools on (at least) Linux allows to choose a implementation of a function at runtime. Look at the iunc attribute in the GCC documentation for more detailed description: http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
Another description of IFUNC mecanism : http://www.agner.org/optimize/blog/read.php?i=167
I would like to choose my implementation depending on the value of an environment variable. However, my experiments show me that the libc (at least the part about environment) is not yet initialized when the resolver function is run. So, the classical interfaces (extern char**environ or getenv()) do not work.
Does anybody know how to access the environment of a program in Linux at very early stage ? The environment is setup by the kernel at the execve(2) system call, so it is already somewhere (but where exactly ?) in the program address space at early initialization.
Thanks in advance
Vincent
Program to test:
#include <stdio.h>
#include <stdlib.h>
extern char** environ;
char** save_environ;
char* toto;
int saved=0;
extern int fonction ();
int fonction1 () {
return 1;
}
int fonction2 () {
return 2;
}
static typeof(fonction) * resolve_fonction (void) {
saved=1;
save_environ=environ;
toto=getenv("TOTO");
/* no way to choose between fonction1 and fonction2 with the TOTO envvar */
return fonction1;
}
int fonction () __attribute__ ((ifunc ("resolve_fonction")));
void print_saved() {
printf("saved: %dn", saved);
if (saved) {
printf("prev environ: %pn", save_environ);
printf("prev TOTO: %sn", toto);
}
}
int main() {
print_saved();
printf("main environ: %pn", environ);
printf("main environ[0]: %sn", environ[0]);
printf("main TOTO: %sn", getenv("TOTO"));
printf("main value: %dn", fonction());
return 0;
}
Compilation and execution:
$ gcc -Wall -g ifunc.c -o ifunc
$ env TOTO=ok ./ifunc
saved: 1
prev environ: (nil)
prev TOTO: (null)
main environ: 0x7fffffffe288
main environ[0]: XDG_VTNR=7
main TOTO: ok
main value: 1
$
In the resolver function, environ is NULL and getenv("TOTO") returns NULL. In the main function, the information is here.
Function Pointer
I found no way to use env in early stage legally. Resolver function runs in linker even earlier, than preinit_array functions. The only legal way to resolve this is to use function pointer and decide what function to use in function of .preinit_array section:
extern char** environ;
int(*f)();
void preinit(int argc, char **argv, char **envp) {
f = f1;
environ = envp; // actually, it is done a bit later
char *v = getenv("TOTO");
if (v && strcmp(v, "ok") == 0) {
f = f2;
}
}
__attribute__((section(".preinit_array"))) typeof(preinit) *__preinit = preinit;
ifunc & GNU ld inners
Glibc's linker ld contains a local symbol _environ and it is initialized, but it is rather hard to extract it. There is another way I found, but it is a bit tricky and rather unreliable.
At linker's entry point _start only stack is initialized. Program arguments and environmental values are sent to the process via stack. Arguments are stored in the following order:
argc, argv, argv + 1, ..., argv + argc - 1, NULL, ENV...
Linker ld shares a global symbol _dl_argv, which points to this place on the stack. With the help of it we can extract all the needful variables:
extern char** environ;
extern char **_dl_argv;
char** get_environ() {
int argc = *(int*)(_dl_argv - 1);
char **my_environ = (char**)(_dl_argv + argc + 1);
return my_environ;
}
typeof(f1) * resolve_f() {
environ = get_environ();
const char *var = getenv("TOTO");
if (var && strcmp(var, "ok") == 0) {
return f2;
}
return f1;
}
int f() __attribute__((ifunc("resolve_f")));

Resources