Cannot get libc override function work with LD_PRELOAD?

I'm trying to override some libc functions using the LD_PRELOAD technique, but I cannot make it work.
This is strlib.c:
#include <stdio.h>
size_t strlen(const char *s){
printf("take strlen for %s\n",s);
return 0;
}
gcc -shared -fPIC -o strlib.so strlib.c (a .c file, so no name mangling here)
and the main app:
#include <stdio.h>
#include <string.h>
int main(){
const char* s = "hello";
printf("length=%zu\n",strlen(s));
}
gcc -o main main.c
Then I run it:
LD_PRELOAD=./strlib.so ./main
It runs, but it does not seem to call my override function:
$ LD_PRELOAD=./strlib.so ./main
length=5
Did I do anything wrong here?
Edit: as Emest mentioned, I changed main.c to keep the compiler from optimizing the call away, but it still did not work.
#include <stdio.h>
#include <string.h>
int main(int argc,char** argv){
const char* s = "hello";
printf("length=%zu\n",strlen(argv[1]));
}
$ LD_PRELOAD=./strlib.so ./main hehe
length=4

Check the assembly code (use the -S option to gcc). The compiler will optimize out the strlen() call on the compile-time constant string "hello", computing the length at compile time instead. Try calling your function on a string whose length isn't known until runtime, like one of the arguments to main(), and this should work as you expect.
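If you want to be sure the call is not folded or inlined regardless of the argument, one option is to call strlen through a volatile function pointer. This is only a sketch; the helper name strlen_at_runtime is made up for illustration and is not part of the question's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the volatile qualifier forces the pointer to be
   re-read at the call site, so the compiler cannot replace the call with
   a constant or with an inlined builtin strlen. */
size_t strlen_at_runtime(const char *s)
{
    size_t (*volatile fn)(const char *) = strlen;
    return fn(s);
}
```

Because the address of strlen is resolved by the dynamic linker, this call should reach whichever strlen definition comes first in the lookup order. GCC's -fno-builtin-strlen option is another way to keep the compiler from treating strlen specially.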

Related

How can I override a global symbol in a library loaded with dlopen?

There are 3 components involved:
main: the main program, loads loader.so
loader.so: compiled with -Bsymbolic, overrides puts and loads other.so
other.so: calls puts, and can't be modified
How can I have other.so use the overridden puts from loader.so?
Note that I want puts to be overridden only from loader.so onwards (other.so included); the main program should be unaffected.
Sample code follows.
main.c
#include <stdio.h>
#include <dlfcn.h>
int main(int argc, char *argv[]){
dlopen("./loader.so", RTLD_NOW | RTLD_GLOBAL | RTLD_DEEPBIND);
puts("Normal");
return 0;
}
loader.c
#include <stdio.h>
#include <dlfcn.h>
extern int puts(const char *s){
fputs("Hooked: ", stdout);
fputs(s, stdout);
fputc('\n', stdout);
return 0;
}
__attribute__((constructor))
void ctor(void) {
puts("Something");
void *other = dlopen("./other.so", RTLD_NOW);
}
other.c
#include <stdio.h>
__attribute__((constructor))
void ctor(void) {
puts("Hello!");
}
make.sh
#!/bin/bash
gcc main.c -o main -ldl
gcc loader.c -fPIC -shared -Wl,-Bsymbolic -o loader.so
gcc other.c -fPIC -shared -o other.so
Desired output
Hooked: Something
Hooked: Hello!
Normal
Actual output
Hooked: Something
Hello!
Normal

After playing with the problem a bit more, I have a solution that requires a bit of external help from patchelf, so I'll wait to accept this solution in case there's a different approach to the problem.
This solution works by making a new shared object, shared.so, with the modified puts, as follows:
int puts(const char *s){
fputs("Hooked: ", stdout);
fputs(s, stdout);
fputc('\n', stdout);
return 0;
}
We then need to force other.so to depend on this new shared object, which we can do with patchelf --add-needed shared.so other.so
This does involve a modification to other.so, but it doesn't require recompiling it from source (which makes this approach more feasible).
Now, when we load other.so, we need to specify RTLD_DEEPBIND inside loader.c like this:
void *other = dlopen("./other.so", RTLD_NOW | RTLD_GLOBAL | RTLD_DEEPBIND);
so that the symbol search starts from the library itself rather than from the global context.
Since other.so doesn't define puts, its direct dependencies are searched next, and puts is found in shared.so.
RTLD_DEEPBIND also makes this lookup take precedence over any LD_PRELOADed objects.
So if puts is disabled inside a preloaded shared object, we can work around that and call the real, unmodified puts from glibc (and only for calls originating from other.so).
We don't need patchelf or shared.so if all we want is to restore the original behaviour.
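A minimal sketch of the loading step with an error check, assuming glibc (RTLD_DEEPBIND is a glibc extension and may not exist on other C libraries). The helper name load_deepbound is made up for illustration; it only wraps the dlopen call shown above.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load a library so that its own dependency chain is
   searched before the global scope. Returns NULL and reports dlerror()
   if the library cannot be loaded. */
void *load_deepbound(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_DEEPBIND);
    if (handle == NULL)
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    return handle;
}
```

In loader.c this would be called as load_deepbound("./other.so") from the constructor. On older glibc versions you may need to link with -ldl.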

Try adding the flag -Wl,--no-as-needed:
gcc loader.c -fPIC -shared -Wl,-Bsymbolic -Wl,--no-as-needed -o loader.so
I successfully hooked time-related functions from the C library in time-machine.

How can I LD_PRELOAD my own compiled library?

I was wondering how this works: creating a library and preloading it so that a program uses it instead of the one declared in the include statement.
Here is what I am doing, which is not working so far.
//shared.cpp
int rand(){
return 33;
}
//prograndom.cpp
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(){
srand(time(NULL));
int i = 10;
while(i--) printf("%d\n", rand()%100);
return 0;
}
Then in the terminal:
$ gcc -shared -fPIC shared.cpp -o libshared.so
$ gcc prograndom.cpp -o prograndom
$ export LD_PRELOAD=/home/bob/desarrollo/libshared.so
and finally
$ LD_PRELOAD=/home/bob/desarrollo/libshared.so ./prograndom
which doesn't print 33, just random numbers...

Your programs are C programs, but the cpp file extension implies C++, and GCC will interpret it that way.
That's an issue because it means that your function rand (in shared.cpp) will be compiled as a C++ function, with its name mangled to include its type-signature. However, in main you #include <stdlib.h>, which has the effect of declaring:
extern "C" int rand();
and that is the rand that the linker will look for. So your PRELOAD will have no effect.
If you change the name of the file from shared.cpp to shared.c, then it will work as expected.
Other alternatives, of dubious value, are:
Declare rand to be extern "C" in your shared.cpp file. You can then compile it as C++.
Force compilation as C by using the GCC option -x c.
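For illustration, here is a sketch of the replacement written so that it exports an unmangled rand whether the file is compiled as C or as C++. The file name shared.c and the guard are my own additions, not something from the question.

```c
/* shared.c: hypothetical replacement for shared.cpp */
#ifdef __cplusplus
extern "C" {            /* only takes effect if compiled as C++ */
#endif

/* Same name and signature as the libc function being replaced. */
int rand(void)
{
    return 33;
}

#ifdef __cplusplus
}
#endif
```

Built with gcc -shared -fPIC shared.c -o libshared.so and preloaded as in the question, every rand()%100 in prograndom should then print 33.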

OCaml shared lib for another shared lib

I am exploring some adventurous ideas.
TL;DR: GNU make is able to use loadable modules. I am trying to use that C boundary to call OCaml, but I have trouble initializing the OCaml runtime.
I have this OCaml code:
(* This is speak_ocaml.ml *)
let do_speak () =
print_endline "This called from OCaml!!";
flush stdout;
"Some return value from OCaml"
let () =
Callback.register "speak" do_speak
and I also have this C code: (Yes, needs to use extra CAML macros but not relevant here)
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <gnumake.h>
#include <caml/mlvalues.h>
#include <caml/callback.h>
#include <caml/memory.h>
#include <caml/alloc.h>
int plugin_is_GPL_compatible;
char *ocaml_speaker(const char *func_name, int argc, char **argv)
{
char *answer =
String_val(caml_callback(*caml_named_value("speak"), Val_unit));
printf("Speaking and got: %s\n", answer);
char *buf = gmk_alloc(strlen(answer) + 1);
strcpy(buf, answer);
/* receive_arg */
return buf;
}
int do_speak_gmk_setup()
{
printf("Getting Called by Make\n");
// This is pretty critical, will explain below
char **argv = {"/home/Edgar/foo", NULL};
caml_startup(argv);
printf("Called caml_startup\n");
gmk_add_function("speak", ocaml_speaker, 1, (unsigned int)1, 1);
return 1;
}
and I'm compiling it with this Makefile
all:
ocamlopt -c speak_ocaml.ml
ocamlopt -output-obj -o caml_code.o speak_ocaml.cmx
clang -I`ocamlc -where` -c do_speak.c -o do_speak.o
clang -shared -undefined dynamic_lookup -fPIC -L`ocamlc -where` -ldl \
-lasmrun do_speak.o caml_code.o -o do_speak.so
show_off:
echo "Speaker?"
${speak 123}
clean:
#rm -rf *.{cmi,cmt,cmi,cmx,o,cmo,so}
My problem is that only printf("Getting Called by Make\n"); runs when I add the appropriate load do_speak.so line to the Makefile; caml_startup does not complete correctly. I am calling caml_startup because if I don't, I get this error:
Makefile:9: dlopen(do_speak.so, 9): Symbol not found: _caml_atom_table
Referenced from: do_speak.so
Expected in: flat namespace
in do_speak.so
Makefile:9: *** do_speak.so: failed to load. Stop.
And this is because of the way that clang on OS X does linking, see here for more details: http://psellos.com/2014/10/2014.10.atom-table-undef.html
I am kind of out of ideas... I need to create a C shared library out of OCaml code which then needs to be part of another C shared library from which I obviously don't have the original argv pointers that caml_startup wants. As my code sample show, I've tried faking it out, and also used caml_startup(NULL) and char **argv = {NULL}; caml_startup(argv) with similar lack of success. I don't know how else to initialize the runtime correctly.

I actually can't tell very well what you're asking. However, here's a comment on this part of your question:
I've tried faking it out, and also used caml_startup(NULL) and char **argv = {NULL}; caml_startup(argv) with similar lack of success. I don't know how else to initialize the runtime correctly.
As far as I know, the only reason for the argv argument of caml_startup is to establish the command-line arguments (for Sys.argv). If you don't need command-line arguments it should be OK to call like this:
char *arg = NULL;
caml_startup(&arg);
Technically argv is supposed to contain at least one string (the name of the program). So maybe it would be better to call like this:
char *argv[] = { "program", NULL };
caml_startup(argv);
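One detail worth spelling out: the question declares char **argv = {"/home/Edgar/foo", NULL};, which is a single pointer with a brace initializer, not an array of strings, so caml_startup ends up reading the characters of the string as if they were pointers. The array form above avoids that. Here is a small sketch of the difference that builds without the OCaml headers; count_args and fake_argv_length are made-up names, and caml_startup itself is only mentioned in a comment.

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated argument vector. */
size_t count_args(char *const argv[])
{
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Build the vector as an array of pointers and report its length. */
size_t fake_argv_length(void)
{
    /* "program" is a placeholder name, as in the answer above. */
    char *argv[] = { "program", NULL };
    /* In the real plugin, caml_startup(argv) would be called here. */
    return count_args(argv);
}
```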

Undefined reference to 'environ'?

I am trying to use the 'environ' variable, but it keeps giving me an error. It seems to be a makefile/build error and I can't seem to fix it. I have searched for answers, but I am still lost.
Here is my C file:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dirent.h>
#include "cmd.h"
int cmdExec() {
...
extern char **environ;
...
printf("Enter a command\n");
//gets (input);
scanf("%s%*[^\n]", input);
if (...) {
...
}
else if (strcmp(input, "environ") == 0) {
int i;
for (i = 0; environ[i] != NULL; i++) {
printf("%s\n", environ[i]);
}
exit(0);
}
else
...
return 0;
}
and here is the makefile:
CC = gcc
CFLAGS = -c
CFLAGS-y = -std=c99
all: cmd
cmd.o: cmd.c cmd.h
$(CC) $(CFLAGS) $(CFLAGS-y) cmd.c
cmd.exe: cmd.o
$(CC) -o cmd.exe cmd.o
clean:
rm -rf *.o cmd.exe a.out
This is the output:
make all
gcc -c -std=c99 cmd.c
gcc cmd.o -o cmd
cmd.o:cmd.c:(.text+0x105): undefined reference to `environ'
cmd.o:cmd.c:(.text+0x127): undefined reference to `environ'
collect2: ld returned 1 exit status
make: *** [cmd] Error 1
From what I've searched this deals with linking libraries, but I don't know how to apply that to my specific situation. If someone could give me a hand I'd appreciate it.

Not all (if any) compilers on Windows provide access to environment variables through a global symbol named environ.
You can use e.g. getenv() to access environment variables.
The win32 API provides GetEnvironmentStrings() to access all the variables.
Some platforms allow you to access the environment through an additional argument to main(), you'd declare your main function as:
int main(int argc, char *argv[], char *environ[])
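If you only need individual variables, a getenv()-based lookup is probably the most portable route. A minimal sketch; the helper name env_or_default is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: return the value of an environment variable, or
   the given fallback when the variable is not set. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}
```

For example, env_or_default("HOME", "(unset)") yields the value of HOME, or "(unset)" if it is not defined. Note that this does not let you list every variable the way the loop over environ does.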

The environ global variable is defined by POSIX, and is not supported by Windows (unless you're using Cygwin, which is a POSIX-like layer implemented on top of Windows).
As far as I know, the non-standard definition
int main(int argc, char **argv, char **envp) { /* ... */ }
is also not supported on Windows.
But a quick Google search turned up this answer, which points to the documentation for the Windows-specific GetEnvironmentStrings function:
LPTCH WINAPI GetEnvironmentStrings(void);
If the function succeeds, the return value is a pointer to the
environment block of the current process.
If the function fails, the return value is NULL.
The result points to a long string with the environment variables separated by '\0' null characters, with the environment terminated by two consecutive null characters.
LPTCH is Microsoft's typedef for a pointer to either char or a 16-bit wchar_t, depending on whether UNICODE is defined. See the referenced documentation for more information.
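To show how such a block is walked, here is a sketch in plain C that counts the entries of a double-NUL-terminated block of narrow strings. It does not call the Windows API, so it compiles anywhere; the function name count_block_entries is made up, and with the wide-character variant you would use wchar_t and wcslen instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the "NAME=value" strings in an environment
   block laid out as string\0string\0...\0\0. */
size_t count_block_entries(const char *block)
{
    size_t n = 0;
    /* Each entry is NUL-terminated; an empty string marks the end. */
    for (const char *p = block; *p != '\0'; p += strlen(p) + 1)
        n++;
    return n;
}
```

To print the variables instead of counting them, replace n++ with a printf("%s\n", p) inside the same loop.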

Wrapper routine for write() with unistd.h included results in error

I am writing a wrapper routine for write() to override the original system function, and within it I need to execute another program through execve(), for which I include the header file unistd.h. I get the error: conflicting types for 'write' /usr/include/unistd.h:363:16: note: previous declaration of 'write' was here. I would be very grateful if someone could help me out, as I need to call another program from inside the wrapper and also pass arguments to it from there.

The GNU linker has a --wrap <symbol> option which allows you to do this sort of thing.
If you link with --wrap write, references to write will redirect to __wrap_write (which you implement), and references to __real_write will redirect to the original write (so you can call it from within your wrapper implementation).
Here's a sophisticated test application using write() - I'm doing the compilation and linking steps separately because I'll want to use hello.o again in a minute:
$ cat hello.c
#include <unistd.h>
int main(void)
{
write(1, "Hello, world!\n", 14);
return 0;
}
$ gcc -Wall -c hello.c
$ gcc -o test1 hello.o
$ ./test1
Hello, world!
$
Here's an implementation of __wrap_write(), which calls __real_write(). (Note that we want a prototype for __real_write to match the original. I've added a matching prototype explicitly, but another possible option is to #define write __real_write before #include <unistd.h>.)
$ cat wrapper.c
#include <unistd.h>
extern ssize_t __real_write(int fd, const void *buf, size_t n);
ssize_t __wrap_write(int fd, const void *buf, size_t n)
{
__real_write(fd, "[wrapped] ", 10);
return __real_write(fd, buf, n);
}
$ gcc -Wall -c wrapper.c
$
Now, link the hello.o we made earlier with wrapper.o, passing the appropriate flags to the linker. (We can pass arbitrary options through gcc to the linker using the slightly odd -Wl,option syntax.)
$ gcc -o test2 -Wl,--wrap -Wl,write hello.o wrapper.o
$ ./test2
[wrapped] Hello, world!
$

An alternative to using the GNU linker --wrap symbol option as suggested by Matthew Slattery would be to use dlsym() to obtain the address of the execve() symbol at runtime in order to avoid the compile-time issues with including unistd.h.
I suggest reading Jay Conrod's blog post entitled Tutorial: Function Interposition in Linux for additional information on replacing calls to functions in dynamic libraries with calls to your own wrapper functions.
The following example provides a write() wrapper function that calls the original write() before calling execve() and does not include unistd.h. It is important to note that you cannot directly call the original write() from the wrapper because it will be interpreted as a recursive call to the wrapper itself.
Code:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>
size_t write(int fd, const void *buf, size_t count)
{
static size_t (*write_func)(int, const void *, size_t) = NULL;
static int (*execve_func)(const char *, char *const[], char *const[]) = NULL;
/* arguments for execve() */
char *path = "/bin/echo";
char *argv[] = { path, "hello world", NULL };
char *envp[] = { NULL };
if (!write_func)
{
/* get reference to original (libc provided) write */
write_func = (size_t(*)(int, const void *, size_t)) dlsym(RTLD_NEXT, "write");
}
if (!execve_func)
{
/* get reference to execve */
execve_func = (int(*)(const char *, char *const[], char *const[])) dlsym(RTLD_NEXT, "execve");
}
/* call original write() */
write_func(fd, buf, count);
/* call execve() */
return execve_func(path, argv, envp);
}
int main(int argc, char *argv[])
{
int filedes = 1;
char buf[] = "write() called\n";
size_t nbyte = sizeof buf / sizeof buf[0];
write(filedes, buf, nbyte);
return 0;
}
Output:
$ gcc -Wall -Werror test.c -o test -ldl
$ ./test
write() called
hello world
$
Note: This code is provided as an example of what is possible. I would recommend following Jonathan Leffler's advice on code segregation in constructing the final implementation.
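As a variation, here is a sketch that does include unistd.h. The "conflicting types" error in the question usually means the wrapper's prototype differs from the declaration in the header; if the wrapper uses exactly the same prototype (returning ssize_t), the header and the definition can coexist. The call counter and its accessor wrapped_write_calls are hypothetical additions so the effect is observable; this is not the original poster's code.

```c
#define _GNU_SOURCE   /* for RTLD_NEXT; must come before any #include */
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counter so the effect of the wrapper is observable. */
static unsigned long write_calls;

unsigned long wrapped_write_calls(void)
{
    return write_calls;
}

/* Same prototype as in <unistd.h>, so the header does not conflict. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);

    /* Look up the next definition of write (normally libc's) once. */
    if (real_write == NULL)
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");
    if (real_write == NULL)
        return -1;    /* lookup failed; nothing to forward to */

    write_calls++;
    return real_write(fd, buf, count);
}
```

Since unistd.h is included, execve() could be called directly from this file as well. The portability concerns raised in the other answers about redefining a POSIX name still apply.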

It is an utterly bad idea to try wrapping write() and use POSIX functions. If you chose to work in standard C, then you could wrap write() because it is not a name reserved to the standard. However, once you start using POSIX functions - and execve() is a POSIX function - then you are running into conflicts; POSIX reserves the name write().
If you want to try, you may get away with it if you segregate the code carefully. You have your write() wrapper in one source file which does not include <unistd.h> or use any functions not defined in the C standard for the headers you do include. You have your code that does the execve() in a second file that does include <unistd.h>. And you link those parts together with appropriate function calls.
If you are lucky, it will work as intended. If you aren't lucky, all hell will break loose. And note that your luck status might change on different machines depending on factors outside your control such as o/s updates (bug fixes) or upgrades. It is a very fragile design decision to wrap write().

Just making an illustration for Muggen's attention call (therefore community wiki):
You want to redefine write and call write from inside your redefinition. Something like
void write(int a) {
/* code code code */
write(42); /* ??? what `write`?
??? recursive `write`?
??? the other `write`? */
/* code code code */
}
Better to think it through first :)

If you segregate your code appropriately as suggested by Jonathan Leffler, you should be able to avoid compile-time issues related to including unistd.h. The following code is provided as an example of such segregation.
Note that you cannot interpose internal library function calls, since these are resolved before runtime. For instance, if some function in libc calls write(), it will never call your wrapper function.
Code:
exec.c
#include <unistd.h>
int execve_func(const char *path, char *const argv[], char *const envp[])
{
return execve(path, argv, envp);
}
test.c
#include <stdio.h>
extern int execve_func(const char *, char *const[], char *const[]);
size_t write(int fd, const void *buf, size_t count)
{
/* arguments for execve() */
char *path = "/bin/echo";
char *argv[] = { path, "hello world", NULL };
char *envp[] = { NULL };
return execve_func(path, argv, envp);
}
int main(int argc, char *argv[])
{
int filedes = 1;
char buf[] = "dummy";
size_t nbyte = sizeof buf / sizeof buf[0];
write(filedes, buf, nbyte);
return 0;
}
Output:
$ gcc -Wall -Werror test.c exec.c -o test
$ ./test
hello world
$