How to append to __preinit_array_start on Linux? - c

On Linux with GCC if I define
__attribute__((constructor)) static void myfunc(void) {}
, then the address of myfunc is added to the array that starts at __init_array_start, in the .init_array section (.ctors on older toolchains). But how can I append a function pointer to __preinit_array_start?
Is __preinit_array_start relevant in a statically linked binary?

As there's no __attribute__((preconstructor)), you can place the function pointers into the relevant section yourself using a section attribute, e.g.
#include <stdio.h>
int v;
int w;
int x;
__attribute__((constructor)) static void
foo(void)
{
printf("Foo %d %d %d\n", v, w, x);
}
static void
bar(void)
{
v = 3;
}
static void
bar1(void)
{
w = 2;
}
static void
bar2(void)
{
x = 1;
}
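/* "used" stops the compiler from dropping the otherwise unreferenced array when optimizing */
__attribute__((section(".preinit_array"), used)) static void (*y[])(void) = { &bar, &bar1, &bar2 };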
int
main(int argc, char **argv)
{
printf("Hello World\n");
}
File dumped into foo.c, compiled using: gcc -o foo foo.c, and then run yields an output of:
Foo 3 2 1
Hello World
File compiled using gcc -static -o foo foo.c, and then run yields the same output, so it does appear to work with statically linked binaries.
It will not work with .so files, though; the linker complains with:
/usr/bin/ld: /tmp/ccI0lMgd.o: .preinit_array section is not allowed in DSO
/usr/bin/ld: failed to set dynamic section sizes: Nonrepresentable section on output
I'd be inclined to avoid it, as code run in that section precedes all other initialization routines. If you're trying to perform some 'this is supposed to run first' initialization, then it's really not a good idea - you're just fighting a race condition which should be solved by some other mechanism.
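To check the ordering without relying on printed output, here is a minimal sketch that records which hook ran first and counts the entries in the section. The names order_log, record, early_hook, late_hook and preinit_entries are made up for illustration. It also assumes a GNU-style linker that provides __preinit_array_start and __preinit_array_end for executables; the extern declarations are my own and are not taken from any header.

```c
#include <stddef.h>

/* Records the order in which the start-up hooks ran. */
static char order_log[8];
static size_t order_len;

static void record(char tag)
{
    if (order_len < sizeof order_log - 1)
        order_log[order_len++] = tag;
}

/* Called through .preinit_array, before the executable's constructors. */
static void early_hook(void) { record('p'); }

/* Ordinary constructor; its pointer ends up in .init_array. */
__attribute__((constructor)) static void late_hook(void) { record('c'); }

/* "used" keeps the unreferenced pointer alive when optimizing. */
__attribute__((section(".preinit_array"), used))
static void (*early_hook_entry)(void) = early_hook;

/* Assumption: the linker defines these bounds for an executable. */
extern void (*__preinit_array_start[])(void);
extern void (*__preinit_array_end[])(void);

/* Number of function pointers the linker collected in .preinit_array. */
size_t preinit_entries(void)
{
    return (size_t)(__preinit_array_end - __preinit_array_start);
}
```

Linked into an executable together with a main, order_log should read "pc" by the time main starts, and preinit_entries() should report at least one entry.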

Related

How call and compile function from elf to my binary?

I have a binary file (ELF) that I didn't write, but I want to use one function from it (I know the address/offset of the function). The function is not exported from the binary.
My goal is to call this function from my C code and to compile this function statically into my binary (I compile with gcc).
How can I do that?
I am going to answer the
call to this function from my c code that I write
part.
The below works under certain assumptions, like dynamic linking and position independent code. I haven't thought for too long about what happens if they are broken (let's experiment/discuss, if there's interest).
$ cat lib.c
int data = 42;
static int foo () { return data; }
$ gcc -fpic -shared lib.c -o lib.so
$ nm lib.so | grep foo
00000000000010e9 t foo
The above reproduces having the address that you know. The address we know now is 0x10e9. It is the virtual address of foo before relocation. We'll model the relocation the dynamic loader does by hand by simply adding the base address at which lib.so gets loaded.
$ cat 1.c
#define _GNU_SOURCE
#include <stdio.h>
#include <link.h>
#include <dlfcn.h> /* dlopen, dlclose */
#include <string.h>
#include <elf.h>
#define FOO_VADDR 0x10e9
typedef int(*func_t)();
int callback(struct dl_phdr_info *info, size_t size, void *data)
{
if (!(strstr(info->dlpi_name, "lib.so")))
return 0;
Elf64_Addr addr = info->dlpi_addr + FOO_VADDR;
func_t f = (func_t)addr;
int res = f();
printf("res = %d\n", res);
return 0;
}
int main()
{
void *handle = dlopen("./lib.so", RTLD_LAZY);
if (!handle) {
puts("failed to load");
return 1;
}
dl_iterate_phdr(&callback, NULL);
dlclose(handle);
return 0;
}
And now...
$ gcc 1.c -ldl && ./a.out
res = 42
Voila -- it worked! That was fun.
Credit: this was helpful.
If you have questions, feel free to read the man pages and ask in the comments.
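The "base plus offset" arithmetic used in the callback can also be tried inside a single program, with no separate library. This is only a sketch of the arithmetic: image_anchor, image_base, foo_offset and call_at are hypothetical names, and the anchor merely stands in for the load base that dl_iterate_phdr reports as dlpi_addr.

```c
#include <stdint.h>

typedef int (*func_t)(void);

static int data = 42;

/* Stand-in for the unexported function; "static" gives it local linkage. */
static int foo(void) { return data; }

/* Stand-in for the load base: any known address in the same image will do. */
static void image_anchor(void) {}

uintptr_t image_base(void) { return (uintptr_t)&image_anchor; }

/* Offset of foo from the anchor, playing the role of FOO_VADDR. */
intptr_t foo_offset(void)
{
    return (intptr_t)((uintptr_t)&foo - (uintptr_t)&image_anchor);
}

/* Rebuild the address from base + offset and call through it. */
int call_at(uintptr_t base, intptr_t offset)
{
    func_t f = (func_t)(base + (uintptr_t)offset);
    return f();
}
```

Here call_at(image_base(), foo_offset()) ends up calling foo and returns the value of data. Converting between integers and function pointers is implementation-defined in C, but it behaves as expected on the usual Linux targets.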
As for
compile this function statically in my binary
I don't know off the bat. This would be trickier. Why do you want that? Also, do you know whether the function depends on some data (or maybe it calls other functions) in the original ELF file, like in the example above?

How to wrap existing function in C

I am trying to wrap an existing function.
The code below works perfectly.
#include<stdio.h>
int __real_main();
int __wrap_main()
{
printf("Wrapped main\n");
return __real_main();
}
int main()
{
printf("main\n");
return 0;
}
command:
gcc main.c -Wl,-wrap,main
output:
Wrapped main
main
So I replaced the main function with temp; my goal is to wrap the temp() function.
Below is the code:
temp.c
#include<stdio.h>
int temp();
int __real_temp();
int __wrap_temp()
{
printf("Wrapped temp\n");
return __real_temp();
}
int temp()
{
printf("temp\n");
return 0;
}
int main()
{
temp();
return 0;
}
command:
gcc temp.c -Wl,-wrap,temp
output:
temp
"Wrapped temp" is not printed. Please guide me on how to wrap the function temp.
The manpage for ld says:
--wrap=symbol
Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_symbol". Any
undefined reference to "__real_symbol" will be resolved to symbol.
The keyword here is undefined.
If you put the definition of temp in the same translation unit as the code that uses it, the reference is resolved within that object file, so it is not undefined when the linker sees it.
You need to split the code definition and the code that uses it:
#!/bin/sh
cat > user.c <<'EOF'
#include<stdio.h>
int temp(void);
int __real_temp(void);
int __wrap_temp()
{
printf("Wrapped temp\n");
return __real_temp();
}
int main()
{
temp();
return 0;
}
EOF
cat > temp.c <<'EOF'
#include<stdio.h>
int temp()
{
printf("temp\n");
return 0;
}
EOF
gcc user.c -Wl,-wrap,temp temp.c # OK
./a.out
Splitting the build into two separate compiles perhaps makes it clearer:
$ gcc -c user.c
$ gcc -c temp.c
$ nm user.o temp.o
temp.o:
U puts
0000000000000000 T temp
user.o:
0000000000000015 T main
U puts
U __real_temp
U temp
0000000000000000 T __wrap_temp
Now, since temp is undefined in user.o, the linker can do its __real_/__wrap_ magic on it.
$ gcc user.o temp.o -Wl,-wrap=temp
$ ./a.out
Wrapped temp
temp
The answer proposed by PSCocik works great if you can split the function you want to override from the function that calls it. However, if you want to keep the callee and the caller in the same source file, the --wrap option will not work.
Instead, you can put __attribute__((weak)) before the implementation of the callee so that someone can reimplement it without the linker complaining about multiple definitions.
For example, suppose you want to mock the world function in the following hello.c. You can prepend the attribute so that it can be overridden.
#include "hello.h"
#include <stdio.h>
__attribute__((weak))
void world(void)
{
printf("world from lib\n");
}
void hello(void)
{
printf("hello\n");
world();
}
You can then override it in another source file. This is very useful for unit testing/mocking:
#include <stdio.h>
#include "hello.h"
/* overrides */
void world(void)
{
printf("world from main.c\n");
}
int main(void)
{
hello();
return 0;
}
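As a quick check of the fallback behaviour, here is a minimal sketch of what happens when no override is linked in. The names world_id and hello_id are hypothetical and return integers only so the result is easy to inspect.

```c
/* Weak default: a non-weak world_id() in another object file
 * would replace this one at link time. */
__attribute__((weak)) int world_id(void)
{
    return 1; /* "world from lib" */
}

/* Caller in the same file. Because world_id is weak, the compiler
 * does not inline it, so an override is still honoured. */
int hello_id(void)
{
    return world_id();
}
```

With only this file linked, hello_id() returns 1. Adding a second file that defines a plain int world_id(void) returning 2 should make hello_id() return 2 without any change to this file.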

shared libraries and visibility to user's memory

When I use a shared library via dlopen, can the library code "see" memory of my process that calls dlopen? For example, I would like to pass a pointer to memory allocated by my application to the library API.
I'm on Linux/x86 if it is important.
The answer is yes, it can. Here is a simple quick example for illustration purposes.
The library code (in file myso.c):
void setInt( int * i )
{
*i = 12345;
}
The library can be built as follows:
gcc -c -fPIC myso.c
gcc -shared -Wl,-soname,libmy.so -o libmy.so myso.o -lc
Here is the client code (main.c):
#include <stdio.h>
#include <dlfcn.h>
typedef void (*setint_t)( int * );
int main()
{
void * h = dlopen("./libmy.so", RTLD_NOW);
if (h)
{
puts("Loaded library.");
setint_t setInt = dlsym( h, "setInt" );
if (setInt) {
puts("Symbol found");
int k;
setInt(&k);
printf("The int is %d\n", k);
}
}
return 0;
}
Now build and run the code. Make sure main.c and the library are in the same directory, in which we execute the following:
[user#fedora-21 ~]$ gcc main.c -ldl
[user#fedora-21 ~]$ ./a.out
Loaded library.
Symbol found
The int is 12345
As one can see, the library was able to write to the memory of the integer k.

executing init and fini

I just read about init and fini sections in ELF files and gave it a try:
#include <stdio.h>
int main(){
puts("main");
return 0;
}
void init(){
puts("init");
}
void fini(){
puts("fini");
}
If I do gcc -Wl,-init,init -Wl,-fini,fini foo.c and run the result the "init" part is not printed:
$ ./a.out
main
fini
Did the init part not run, or was it not able to print somehow?
Is there any "official" documentation about the init/fini stuff?
man ld says:
-init=name
When creating an ELF executable or shared object, call
NAME when the executable or shared object is loaded, by
setting DT_INIT to the address of the function. By
default, the linker uses "_init" as the function to call.
Shouldn't that mean that it would be enough to name the init function _init? (If I do, gcc complains about multiple definitions.)
Don't do that; let your compiler and linker fill in the sections as they see fit.
Instead, mark your functions with the appropriate function attributes, so that the compiler and linker will put them in the correct sections.
For example,
static void before_main(void) __attribute__((constructor));
static void after_main(void) __attribute__((destructor));
static void before_main(void)
{
/* This is run before main() */
}
static void after_main(void)
{
/* This is run after main() returns (or exit() is called) */
}
You can also assign a priority (say, __attribute__((constructor (300)))), an integer between 101 and 65535, inclusive, with functions having a smaller priority number run first.
Note that for illustration, I marked the functions static. That is, the functions won't be visible outside the file scope. The functions do not need to be exported symbols to be automatically called.
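To see the priority ordering in action, here is a minimal sketch that records the order in which three constructors run. The names run_order, note, prio_101, prio_300 and no_priority are made up for illustration. The sketch assumes a GNU-style toolchain, where constructors without an explicit priority run after the prioritized ones.

```c
#include <stddef.h>

static int run_order[4];
static size_t run_count;

static void note(int id)
{
    if (run_count < sizeof run_order / sizeof run_order[0])
        run_order[run_count++] = id;
}

/* Declared out of order on purpose: the priority, not the
 * position in the file, decides which one runs first. */
__attribute__((constructor))      static void no_priority(void) { note(0); }
__attribute__((constructor(300))) static void prio_300(void)    { note(300); }
__attribute__((constructor(101))) static void prio_101(void)    { note(101); }
```

By the time main() starts, run_order should hold 101, 300, 0 in that order.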
For testing, I suggest saving the following in a separate file, say tructor.c:
#include <unistd.h>
#include <string.h>
#include <errno.h>
static int outfd = -1;
static void wrout(const char *const string)
{
if (string && *string && outfd != -1) {
const char *p = string;
const char *const q = string + strlen(string);
while (p < q) {
ssize_t n = write(outfd, p, (size_t)(q - p));
if (n > (ssize_t)0)
p += n;
else
if (n != (ssize_t)-1 || errno != EINTR)
break;
}
}
}
void before_main(void) __attribute__((constructor (101)));
void before_main(void)
{
int saved_errno = errno;
/* This is run before main() */
outfd = dup(STDERR_FILENO);
wrout("Before main()\n");
errno = saved_errno;
}
static void after_main(void) __attribute__((destructor (65535)));
static void after_main(void)
{
int saved_errno = errno;
/* This is run after main() returns (or exit() is called) */
wrout("After main()\n");
errno = saved_errno;
}
so you can compile and link it as part of any program or library. To compile it as a shared library, use e.g.
gcc -Wall -Wextra -fPIC -shared tructor.c -Wl,-soname,libtructor.so -o libtructor.so
and you can interpose it into any dynamically linked command or binary using
LD_PRELOAD=./libtructor.so some-command-or-binary
The functions keep errno unchanged, although it should not matter in practice, and use the low-level write() syscall to output the messages to standard error. The initial standard error is duplicated to a new descriptor, because in many instances, the standard error itself gets closed before the last global destructor -- our destructor here -- gets run.
(Some paranoid binaries, typically security sensitive ones, close all descriptors they don't know about, so you might not see the After main() message in all cases.)
It is not a bug in ld but in the glibc startup code for the main executable. For shared objects the function set by the -init option is called.
This is the commit to ld adding the options -init and -fini.
The _init function of the program isn't called from file glibc-2.21/elf/dl-init.c:58 by the DT_INIT entry by the dynamic linker, but called from __libc_csu_init in file glibc-2.21/csu/elf-init.c:83 by the main executable.
That is, the function pointer in DT_INIT of the program is ignored by the startup.
If you compile with -static, fini isn't called either.
DT_INIT and DT_FINI should definitely not be used, because they are old-style, see line 255.
The following works:
#include <stdio.h>
static void preinit(int argc, char **argv, char **envp) {
puts(__FUNCTION__);
}
static void init(int argc, char **argv, char **envp) {
puts(__FUNCTION__);
}
static void fini(void) {
puts(__FUNCTION__);
}
__attribute__((section(".preinit_array"), used)) static typeof(preinit) *preinit_p = preinit;
__attribute__((section(".init_array"), used)) static typeof(init) *init_p = init;
__attribute__((section(".fini_array"), used)) static typeof(fini) *fini_p = fini;
int main(void) {
puts(__FUNCTION__);
return 0;
}
$ gcc -Wall a.c
$ ./a.out
preinit
init
main
fini
$
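The (argc, argv, envp) parameters in the example above are not just decoration: glibc passes them to every .init_array entry. Here is a minimal sketch that captures them for later use. The names saved_argc, saved_argv, capture_args and program_name are hypothetical, and other C libraries may call the entries without arguments, so treat this as glibc-specific.

```c
#include <stddef.h>

static int saved_argc = -1;
static char **saved_argv;

/* Assumption: glibc calls .init_array entries with (argc, argv, envp). */
static void capture_args(int argc, char **argv, char **envp)
{
    (void)envp; /* not needed in this sketch */
    saved_argc = argc;
    saved_argv = argv;
}

/* "used" keeps the unreferenced pointer from being optimized away. */
__attribute__((section(".init_array"), used))
static void (*capture_args_entry)(int, char **, char **) = capture_args;

/* Returns argv[0] as seen by the start-up hook, or NULL if unavailable. */
const char *program_name(void)
{
    return (saved_argv && saved_argc > 0) ? saved_argv[0] : NULL;
}
```

Any code that runs later, including main(), can then call program_name() without having been handed argv itself.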

With MACH-O is there a way to register a function that will run before main?

Under Linux, I can register a routine that will run before main. For example:
#include <stdio.h>
void myinit(int argc, char **argv, char **envp) {
printf("%s: %s\n", __FILE__, __FUNCTION__);
}
__attribute__((section(".init_array"))) typeof(myinit) *__init = myinit;
By compiling this with GCC and linking it in, the function myinit will be run before main.
Is there a way to do this under Mac OS X and Mach-O?
Thanks.
You could place a pointer to the function in the __mod_init_func data section of the Mach-O binary.
From Mach-O format reference:
__DATA,__mod_init_func
Module initialization functions. The C++ compiler places static constructors here.
example.c
#include <stdio.h>
void myinit(int argc, char **argv, char **envp) {
printf("%s: %s\n", __FILE__, __FUNCTION__);
}
__attribute__((section("__DATA,__mod_init_func"))) typeof(myinit) *__init = myinit;
int main() {
printf("%s: %s\n", __FILE__, __FUNCTION__);
return 0;
}
I built your example with clang on OS X:
$ clang -Wall example.c
$ ./a.out
example.c: myinit
example.c: main
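If the same source has to build on both Linux and OS X, the section name can be chosen with the preprocessor. This is a minimal sketch; INIT_SECTION, init_ran, myinit_portable and init_has_run are hypothetical names, and the function takes no arguments so that it does not depend on what each loader passes in.

```c
/* Pick the section that holds "run before main" function pointers. */
#if defined(__APPLE__)
#  define INIT_SECTION "__DATA,__mod_init_func" /* Mach-O */
#else
#  define INIT_SECTION ".init_array"            /* ELF */
#endif

static int init_ran;

static void myinit_portable(void) { init_ran = 1; }

/* "used" keeps the unreferenced pointer from being optimized away. */
__attribute__((section(INIT_SECTION), used))
static void (*myinit_portable_entry)(void) = myinit_portable;

/* Lets other code check whether the hook has already run. */
int init_has_run(void) { return init_ran; }
```

Called from main(), init_has_run() should return 1 on either platform.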
The easiest way is to mark the function as a constructor using the constructor attribute. The constructor attribute causes the function to be called automatically before execution enters main(). Similarly, the destructor attribute causes the function to be called automatically after main() completes or exit() is called. You can also specify an optional priority if you have several functions,
e.g. __attribute__((constructor(101))) (GCC reserves priorities 0 to 100 for the implementation)
#include <stdio.h>
__attribute__((constructor)) void myinit() {
printf("my init\n");
}
int main() {
printf("my main\n");
return 0;
}
__attribute__((destructor)) void mydeinit() {
printf("my deinit\n");
}
$ clang -Wall example.c
$ ./a.out
my init
my main
my deinit
Disclaimer: I generally discourage what I'm about to say. Having code running before or after main makes things less predictable. I'm not sure why you wouldn't just let the first line of main invoke your myinit, but I suppose everyone has a reason. Here goes.
I don't know much about Mach-O, but the simplest way to run code before main is to link in a C++ class that has a corresponding global instance defined. You can do this independently of your C code without having to alter anything else. You can also have this C++ code invoke C functions defined elsewhere in your code. The example below shows how I would invoke your myinit.
In a standalone .cpp (or .cc) file, declare a very simple C++ class with a constructor that calls your myinit function.
foo.cpp
#include <cstddef> // NULL
// forward declare your myinit function and designate "C" linkage
extern "C" void myinit(int, char**, char**);
class CodeToRunBeforeMain
{
public:
CodeToRunBeforeMain()
{
// invoke your myinit function here
myinit(0, NULL, NULL);
}
};
// global instance - constructor will run before main.
CodeToRunBeforeMain g_runBeforeMain;
The above approach doesn't recognize argc, argv, or envp. Hopefully, that isn't important.

Resources