Function interposition in Linux without dlsym - c

I'm currently working on a project where I need to track the usage of several system calls and low-level functions like mmap, brk, sbrk. So far, I've been doing this using function interposition: I write a wrapper function with the same name as the function I'm replacing (mmap for example), and I load it in a program by setting the LD_PRELOAD environment variable. I call the real function through a pointer that I load with dlsym.
Unfortunately, one of the functions I want to wrap, sbrk, is used internally by dlsym, so the program crashes when I try to load the symbol. sbrk is not a system call in Linux, so I can't simply use syscall to call it indirectly.
So my question is, how can I call a library function from a wrapper function of the same name without using dlsym? Is there any compiler trick (using gcc) that lets me refer to the original function?

See ld's option --wrap symbol. From the man page:
    --wrap symbol
        Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be resolved to symbol.
        This can be used to provide a wrapper for a system function. The wrapper function should be called "__wrap_symbol". If it wishes to call the system function, it should call "__real_symbol".
        Here is a trivial example:
            void *
            __wrap_malloc (size_t c)
            {
              printf ("malloc called with %zu\n", c);
              return __real_malloc (c);
            }
        If you link other code with this file using --wrap malloc, then all calls to "malloc" will call the function "__wrap_malloc" instead. The call to "__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
        You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed. If you do this, you should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call before the linker has a chance to wrap it to "malloc".
The other option is to possibly look at the source for ltrace; it more or less does the same thing :-P.
Here's an idea though. You could have your LD_PRELOAD'ed library change the PLT entries to point to your code. This way, technically, the sbrk() function is still callable from your code natively.

You can examine function invocation unobtrusively using tools such as:
gdb
ltrace
systemtap
These tools allow a monitor program to inform you when a function is called, and allow you to interrogate the arguments.
The main differences are:
gdb is interactive, but powerful
ltrace is simple to use, but you can only print the function name
systemtap is not interactive, but it can be very fast, and is powerful.

If you are running a host system with glibc, the libc has some internal back end to the runtime dynamic linker that I used some time ago. If I recall correctly, I think it's called '__libc_dlsym'. (To check, "$ readelf -s /usr/lib/libc.a | grep dlsym" should help.) Declare it as an externally linked function with the same arguments and return value that dlsym has and use it to wrap dlsym itself.

Does truss not work on your system? It works perfectly for this kind of thing here on Solaris.

Related

ignore "default" functions in c [duplicate]

For example, if I want to override malloc(), what's the best way to do it?
Currently the simplest way I know of is:
malloc.h
    #include <stdlib.h>
    #define malloc my_malloc
    void* my_malloc (size_t size);
foobar.c
    #include "malloc.h"
    void foobar(void)
    {
        void* leak = malloc(1024);
    }
The problem with this approach is that we now have to use "malloc.h" and can never use "stdlib.h". Is there a way around this? I'm particularly interested in importing 3rd party libraries without modifying them at all, but forcing them into calling my custom libc functions (like malloc).
The short answer is you probably want to use the LD_PRELOAD trick: What is the LD_PRELOAD trick?
That approach basically inserts your own custom shared library at runtime before any other shared library is loaded, exporting the functions you want to override, such as malloc(). By the time the other shared libraries are loaded your symbol is already there and gets preference when resolving calls to that symbol name from other libraries. From within your malloc() wrapper/replacement you can even choose to call the next malloc symbol, which typically would be the actual libc symbol.
This blog post has a lot of comprehensive information about this method:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/
Note that example is overriding libc's write() and puts() functions, but the same logic applies for malloc():
LD_PRELOAD allows a shared library to be loaded before any other libraries. So all I need to do is to write a shared library that overrides write and puts functions. If we wrap these functions, we need a way to call the real functions to perform the system call. dlsym just does that for us [man 3 dlsym]:
> The function dlsym() takes a “handle” of a dynamic library returned by dlopen() and the null-terminated symbol name, returning the address where that symbol is loaded into memory. If the symbol is not found, in the specified library or any of the libraries that were automatically loaded by dlopen() when that library was loaded, dlsym() returns NULL…
So inside the wrapper function we can use dlsym to get the address of the related symbol in memory and call the glibc function. Another approach can be calling the syscall directly, both approaches will work.
That blog post also describes a compile-time method I did not know about that involves passing a linker flag to ld, "--wrap":
Another way of wrapping functions is by using the linker at link time. GNU linker provides an option to wrap a function for a symbol [man 1 ld]:
> Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to “__wrap_symbol”. Any undefined reference to “__real_symbol” will be resolved to symbol.
The handy thing about LD_PRELOAD is that it might allow you to change the malloc() implementation on production applications for quick testing, or even allow the user to select (I do this in some server applications) which implementation to use. The 'tcmalloc' library for example can be easily inserted into an application to evaluate performance gains in heavily threaded applications (where tcmalloc tends to perform a lot better than libc's malloc implementation).
Finally if you're on Windows, perhaps try this: LD_PRELOAD equivalent for Windows to preload shared libraries

How to implement 'wrapper feature/function' manually?

Let's first see what a wrapper function is, with an example:
https://en.wikipedia.org/wiki/Wrapper_function (also worth looking)
(https://goo.gl/nGiMkl)
--wrap symbol
Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to __wrap_symbol. Any undefined reference to __real_symbol will be resolved to symbol. This can be used to provide a wrapper for a system function. The wrapper function should be called __wrap_symbol. If it wishes to call the system function, it should call __real_symbol. Here is a trivial example:
    void *
    __wrap_malloc (size_t c)
    {
      printf ("malloc called with %zu\n", c);
      return __real_malloc (c);
    }
If you link other code with this file using --wrap malloc, then all calls to malloc will call the function __wrap_malloc instead. The call to __real_malloc in __wrap_malloc will call the real malloc function. You may wish to provide a __real_malloc function as well, so that links without the --wrap option will succeed. If you do this, you should not put the definition of __real_malloc in the same file as __wrap_malloc; if you do, the assembler may resolve the call before the linker has a chance to wrap it to malloc.
Problem statement:
Wrapper is a feature provided by the compiler.
In my project I need to implement a similar kind of program which needs a wrapper function. But the compiler my project is using (Wind River Diab Compiler) does not support wrapper functions.
So the real question is: can I implement similar functionality with a compiler which does not support the wrapper function/feature? That is, can I implement a wrapper function manually?
The complete project is in C only.
PS: The question is not about malloc(). It's just an example.

What is the equivalent of GNU's --wrap linker flag in OS X linker?

I'm trying to port a C program compiled with GNU toolchain to OS X but its default ld program does not support the --wrap flag, which is present in GNU's ld.
This is from the man page of GNU's ld:
    --wrap symbol
        Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be resolved to symbol.
        This can be used to provide a wrapper for a system function. The wrapper function should be called "__wrap_symbol". If it wishes to call the system function, it should call "__real_symbol".
        Here is a trivial example:
            void *
            __wrap_malloc (size_t c)
            {
              printf ("malloc called with %zu\n", c);
              return __real_malloc (c);
            }
        If you link other code with this file using --wrap malloc, then all calls to "malloc" will call the function "__wrap_malloc" instead. The call to "__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
        You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed. If you do this, you should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call before the linker has a chance to wrap it to "malloc".
Is there a portable way of achieving this?
I faced a similar issue but never found an equivalent. The following workarounds can help, depending on what exactly you intend to achieve.
1) Wrapping symbols during linking on OS X
2) https://discussions.apple.com/thread/617779?start=0&tstart=0

Is it possible to wrap calls to statically linked 3rd party library?

I would like to trace calls to some 3rd party library which are made from another 3rd party library.
Example: I want to trace calls to library A. My application statically links library B, which in turn is statically linked to library A. So basically what I have is libAB.a
In case of dynamic linking I could write library A2 with wrappers for functions which I want to trace of library A and use LD_PRELOAD=A2.so. Then, my wrappers will be called instead, and I will see the trace.
In my case I cannot use dynamic linking.
Is it possible to achieve the same using static linking?
In the ideal case I would like to link my application with libAB.a and trace library libA2.a and get the trace.
Thanks,
Robusta
Okay, I found it :)
man ld
    --wrap symbol
        Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be resolved to symbol.
        This can be used to provide a wrapper for a system function. The wrapper function should be called "__wrap_symbol". If it wishes to call the system function, it should call "__real_symbol".
        Here is a trivial example:
            void *
            __wrap_malloc (size_t c)
            {
              printf ("malloc called with %zu\n", c);
              return __real_malloc (c);
            }
        If you link other code with this file using --wrap malloc, then all calls to "malloc" will call the function "__wrap_malloc" instead. The call to "__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
Depending on how much performance matters you could do it with gdb... (Set a breakpoint on all the functions you care about and log the stack traces... but that involves learning how to script gdb)
There are also things like Oprofile http://oprofile.sourceforge.net/, LTTng http://lttng.org/, and perf (it comes with recent kernels; in the kernel source it's under tools/perf/ and you need to compile it; on Ubuntu I think it's in the linux-tools package).
I can't tell you how to achieve what you want with any of those tools but oprofile and LTTng have lots of documentation and an active user community.
Well, it seems like a deadlock :)
But I think you may solve it using macros. Although this solution might not be clean and may not work for all situations.
You can try this:
    void trace(void);   /* tracing hook, defined elsewhere */
    void functionFromLibA();
    #define functionFromLibA() trace(); functionFromLibA()
    int main()
    {
        functionFromLibA();
    }
This will be expanded to:
    void trace(void);
    void functionFromLibA();
    int main()
    {
        trace(); functionFromLibA();
    }
EDIT: But note that for this solution, all declarations of function prototypes should be done before defining the macros. Otherwise you will have the prototypes expanded in preprocessing as well.
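An added example (not from the original answers) of the macro approach in a form that also suits a toolchain without --wrap: the wrapper is defined before the macro so it can still reach the real function, and the macro expands to a single expression so it stays correct in an unbraced if. lib_read_sensor and the other names are hypothetical stand-ins for a library function.

```c
#include <stdio.h>

static int trace_count;   /* number of calls that went through the wrapper */

/* Stand-in for the third-party function being traced (constructed for
 * this example; in a real build only its prototype would be visible). */
int lib_read_sensor(int channel)
{
    return channel * 10;
}

/* The wrapper is defined BEFORE the macro below, so the name used inside
 * it still refers to the real function, not to the macro. */
static int traced_lib_read_sensor(int channel, const char *file, int line)
{
    trace_count++;
    fprintf(stderr, "%s:%d: lib_read_sensor(%d)\n", file, line, channel);
    return lib_read_sensor(channel);
}

/* Function-like macro that expands to ONE expression. Unlike a
 * "trace(); f()" expansion, this keeps the return value and does not
 * split into two statements under an unbraced if/else. */
#define lib_read_sensor(ch) traced_lib_read_sensor((ch), __FILE__, __LINE__)

/* Caller code compiled after the macro; the call sites are unchanged. */
int sum_if_enabled(int enabled)
{
    int total = 0;

    if (enabled)
        total += lib_read_sensor(1);    /* traced, only when enabled */

    /* Parenthesising the name suppresses function-like macro expansion,
     * so this call reaches the real function without tracing. */
    total += (lib_read_sensor)(2);

    return total;
}

/* Accessor so callers and tests can read the trace counter. */
int traced_calls(void)
{
    return trace_count;
}
```

Only call sites compiled after the macro definition are traced; calls made inside a prebuilt archive such as libAB.a are unaffected, which is the case --wrap covers.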
