Is there a way to overwrite the malloc/free function in C?

Is there a way to hook the malloc/free function calls from within the C application itself?

malloc() and free() are defined in the standard library; when linking code, the linker will search the library only for symbols that are not already resolved by earlier encountered object code, and object files generated from compilation are always linked before any libraries.
So you can override any library function simply by defining it in your own code, ensuring that it has the correct signature (same name, same number and types of parameters and same return type).

Yes you can. Here's an example program. It compiles and builds with gcc 4.8.2 but does not do anything useful since the implementations are not functional.
#include <stdlib.h>
int main()
{
int* ip = malloc(sizeof(int));
double* dp = malloc(sizeof(double));
free(ip);
free(dp);
}
void* malloc(size_t s)
{
return NULL;
}
void free(void* p)
{
}

Not sure if this counts as "overwriting", but you can effectively change the behavior of code that calls malloc and free by using a macro:
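#include <stdlib.h>
/* Define the wrappers before the macros so they can still call the real malloc/free. */
void *my_malloc(size_t nbytes)
{
/* Do your magic here! */
return malloc(nbytes);
}
void my_free(void *p)
{
/* Do your magic here! */
free(p);
}
/* From here on, every malloc/free call in this file is redirected. */
#define malloc(x) my_malloc(x)
#define free(x) my_free(x)
int main(void)
{
int *p = malloc(sizeof(int) * 4); /* calls my_malloc */
free(p); /* calls my_free */
}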

You may need the LD_PRELOAD mechanism to replace malloc and free.

As many have already mentioned, this is very platform specific. The most "portable" way is described in an accepted answer to this question. A port to non-POSIX platforms requires finding an appropriate replacement for dlsym.
Since you mention Linux/gcc, hooks for malloc would probably serve you best (note that glibc deprecated these hooks and removed them from its API in version 2.34).

Depending on the platform you are using, you may be able to remove the default malloc/free from the library and add your own using the linker or librarian tools. I'd suggest you only do this in a private area and make sure you can't corrupt the original library.

On the Windows platform there is the Detours library. It patches any given function at the assembly level. This allows interception of any C library or OS call, like CreateThread, HeapAlloc, etc. I used this library for overriding memory allocation functions in a working application.
This library is Windows specific. On other platforms most likely there are similar libraries.

C does not provide function overloading. So you cannot override.

Related

Overriding C library functions, calling original

I am a bit puzzled on how and why this code works as it does. I have not actually encountered this in any project I've worked on, and I have not even thought of doing it myself.
override_getline.c:
#include <stdio.h>
#define OVERRIDE_GETLINE
#ifdef OVERRIDE_GETLINE
ssize_t getline(char **lineptr, size_t *n, FILE *stream)
{
printf("getline &lineptr=%p &n=%p &stream=%p\n", lineptr, n, stream);
return -1; // note: errno has undefined value
}
#endif
main.c:
#include <stdio.h>
int main()
{
char *buf = NULL;
size_t len = 0;
printf("Hello World! %zd\n", getline(&buf, &len, stdin));
return 0;
}
And finally, example compile and run command:
gcc main.c override_getline.c && ./a.out
With the OVERRIDE_GETLINE define, the custom function gets called, and if it is commented out, normal library function gets called, and both work as expected.
Questions
1. What is the correct term for this? "Overriding", "shadowing", something else?
2. Is this gcc-specific, or POSIX, or ANSI C, or even undefined in all?
3. Does it make any difference if the function is an ANSI C function or (like here) a POSIX function?
4. Where does the overriding function get called? By other .o files in the same link, at least, and I presume .a files added to the link command too. How about static or dynamic libs added with the -l command line option of the linker?
5. If it is possible, how do I call the library version of getline from the overridden getline?
The linker will search the files you provide on the command line first for symbols, before it searches in libraries. This means that as soon as it sees that getline has been defined, it will no longer look for another getline symbol. This is how linkers work on all platforms.
This of course has implications for your fifth point, in that there is no possibility to call the "original" getline, as your function is the original from the point of view of the linker.
For the fifth point, you may want to look at e.g. this old answer.
There's no standard way to have two functions of the same name in your program, but with some UNIX-like implementations (notably GNU libc) you might be able to get away with this:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
ssize_t getline(char **lineptr, size_t *n, FILE *stream)
{
ssize_t (*realfunc)(char**, size_t *, FILE*) =
(ssize_t(*)(char**, size_t *, FILE*))(dlsym (RTLD_NEXT, "getline"));
return realfunc(lineptr, n, stream);
}
You will need to link with -ldl for this.
What is happening here is that you are relying on the behaviour of the linker. The linker finds your implementation of getline before it sees the version in the standard library, so it links to your routine. So in effect you are overriding the function via the mechanism of link order. Of course other linkers may behave differently, and I believe the gcc linker may even complain about duplicate symbols if you specify appropriate command line switches.
In order to be able to call both your custom routine and the library routine you would typically resort to macros, e.g.
#ifdef OVERRIDE_GETLINE
#define GETLINE(l, n, s) my_getline(l, n, s)
#else
#define GETLINE(l, n, s) getline(l, n, s)
#endif
#ifdef OVERRIDE_GETLINE
ssize_t my_getline(char **lineptr, size_t *n, FILE *stream)
{
// ...
return getline(lineptr, n, stream);
}
#endif
Note that this requires your code to call getline as GETLINE, which is rather ugly.
What you see is the expected behaviour when you link with shared libraries. The linker simply binds the symbol to your function, because it was seen first. Your function will also be called correctly from functions in any other external libraries, because the linker marks your function as exported when it scans the libraries being linked.
However, if no external library that you link against references your function (so it is not marked as exported and is not inserted into the symbol table), and you then dlopen() a library that wants to use it at run time, that library will not find the required function. Furthermore, if you first dlopen(RTLD_NOW|RTLD_GLOBAL) the original library, every subsequently dlopen()'d library will use that library's code, not yours. Your own code (and any libraries that you linked with at build time, not at run time) will still stick with your function, no matter what.

dlopen issue (OSX)

I have a main application which dynamically loads a dylib, from inside that dylib I would like to call exported functions from my main program. I'm using dlopen(NULL,flag) to retrieve my main applications handle and dlsym(handle, symbol) to get the function.
dlopen gives no error but when I try to dlsym my function I get the following error:
dlerror dlsym(RTLD_NEXT, CallMe): symbol not found
The symbol is exported correctly, as confirmed by nm.
I'm not sure why RTLD_NEXT is there. Is this the result of dlopen(NULL,flag)?
How can I solve this problem or achieve my goal?
Or are there other ways to call the main application (preferably not by passing on function pointers to the dylib)?
Thanks in advance!
Added:
Export:
extern "C" {
void CallMe(char* test);
}
__attribute__((visibility("default")))
void CallMe(char* test)
{
NSLog(@"CallMe with: %s",test);
}
Result of nm
...
0000000000001922 T _CallMe
..
Code in dylib:
void * m_Handle;
typedef void CallMe(char* test);
CallMe* m_Function;
m_Handle = dlopen(NULL,RTLD_LAZY); //Also tried RTLD_NOW|RTLD_GLOBAL
if(!m_Handle)
return EC_ERROR;
m_Function = (CallMe*)dlsym(m_Handle, "CallMe");
if(!m_Function)
return EC_ERROR;
m_Function("Hallo");
I think a better approach might be to establish a proprietary protocol with your dynamic library where you initialise it by passing it a struct of function pointers. The dynamic library needs to simply provide some sort of init(const struct *myfuncs), or some such, function and this makes it simpler to implement the dynamic library.
This would also make the implementation more portable.
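The function-pointer protocol suggested above could look roughly like the following. This is only a sketch under assumptions: struct host_api, plugin_init, plugin_do_work, host_call_me and host_table are hypothetical names, and in a real project the plugin part would live in the dylib while the host part would live in the main application, which would find plugin_init with dlsym() after dlopen().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback table shared by host and plugin
 * (it would normally live in a common header). */
struct host_api {
    void (*call_me)(const char *text);
};

/* ---- plugin side (would be compiled into the dylib) ---- */
static const struct host_api *g_host; /* set once by plugin_init() */

/* Entry point the host looks up with dlsym() after dlopen(). */
int plugin_init(const struct host_api *api)
{
    if (api == NULL || api->call_me == NULL)
        return -1;              /* refuse an incomplete table */
    g_host = api;
    return 0;
}

/* Later plugin code calls back into the host through the table. */
int plugin_do_work(void)
{
    if (g_host == NULL)
        return -1;              /* plugin_init() was never called */
    g_host->call_me("Hallo");
    return 0;
}

/* ---- host side (would be compiled into the main application) ---- */
static int host_calls;          /* counts callbacks, for demonstration only */

static void host_call_me(const char *text)
{
    printf("CallMe with: %s\n", text);
    host_calls++;
}

/* Table the host passes to the plugin's init function. */
static const struct host_api host_table = { host_call_me };
```

The host would call plugin_init(&host_table) once after loading the library; after that the plugin never needs dlopen(NULL, ...) or dlsym() to reach the host.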

How to compile glibc for use without an operating system

I would like to compile the functions of glibc to an object file which will then be linked to a program which I am running on a computer without any operating system. Some functions, such as open, I want to just fail with ENOSYS. Other functions I will write myself, such as putchar, and then have glibc use those functions in its own (like printf). I also want to use functions that don't need a file system or process management system or anything like that, such as strlen. How can I do this?
Most C libraries rely on the kernel heavily, so it's not reasonable to port 'em. But since most of it doesn't need to be implemented, you can get away easily with a few prototypes for stubs and gcc builtins.
You can implement stubs easily using weak symbols:
#define STUB __attribute__((weak, alias("__stub"))) int
#define STUB_PTR __attribute__((weak, alias("__stub0"))) void *
int __stub();
void *__stub0();
Then defining the prototypes becomes trivial:
STUB read(int, void*, int);
STUB printf(const char *, ...);
STUB_PTR mmap(void*, int, int, int, int, int);
And the actual functions could be:
#include <errno.h>
int __stub()
{
errno = ENOSYS;
return -1;
}
void *__stub0()
{
errno = ENOSYS;
return NULL;
}
If you need some non-trivial function, like printf, take it from uClibc instead of glibc (or some other smaller implementation).

How to run constructor even if "-nostdlib" option is defined

I have a dynamic library that contains a constructor.
__attribute__ ((constructor))
void construct() {
// This is initialization code
}
The library is compiled with -nostdlib option and I cannot change that. As a result there are no .ctor and .dtor sections in library and the constructor is not running on the library load.
As written there, there should be special measures that allow running the constructor even in this case. Could you please advise me what can be done and how?
Why do you need constructors? Most programmers I work with, myself included, refuse to use libraries with global constructors because all too often they introduce bugs by messing up the program's initial state when main is entered. One concrete example I can think of is OpenAL, which broke programs when it was merely linked, even if it was never called. I was not the one on the project who dealt with this bug, but if I'm not mistaken it had something to do with mucking with ALSA and breaking the main program's use of ALSA later.
If your library has nontrivial global state, instead see if you can simply use global structs and initializers. You might need to add flags with some pointers to indicate whether they point to allocated memory or static memory, though. Another method is to defer initialization to the first call, but this can have thread-safety issues unless you use pthread_once or similar.
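The "defer initialization to the first call" idea with pthread_once can be sketched as follows. This is a minimal illustration, not the library from the question: lib_table, lib_init, lib_lookup and init_runs are hypothetical names chosen for the example.

```c
#include <pthread.h>

/* Hypothetical library state that would otherwise need a constructor. */
static int lib_table[4];
static int init_runs;                        /* how many times lib_init ran */
static pthread_once_t lib_once = PTHREAD_ONCE_INIT;

/* One-time initialization; pthread_once() guarantees it runs exactly once,
 * even if several threads enter the library at the same time. */
static void lib_init(void)
{
    for (int i = 0; i < 4; i++)
        lib_table[i] = i * i;
    init_runs++;
}

/* Every public entry point calls pthread_once() before touching the state. */
int lib_lookup(int idx)
{
    pthread_once(&lib_once, lib_init);
    if (idx < 0 || idx >= 4)
        return -1;
    return lib_table[idx];
}
```

Because initialization happens on first use, no .ctors or .init_array entry is needed. Depending on the C library version, linking may require -pthread.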
Hmm, I missed the part that there were no .ctor and .dtor sections... forget about this.
#include <stdio.h>
#include <stdint.h>
typedef void (*func)(void);
__attribute__((constructor))
void func1(void) {
printf("func1\n");
}
__attribute__((constructor))
void func2(void) {
printf("func2\n");
}
extern func* __init_array_start;
int main(int argc, char **argv)
{
func *funcarr = (func*)&__init_array_start;
func f;
int idx;
printf("start %p\n", *funcarr);
// iterate over the array
for (idx = 0; ; ++idx) {
f = funcarr[idx];
// skip the end of array marker (0xFFFFFFFF) on 64 bit it's twice as long ;)
if (f == (void*)~0)
continue;
// till f is NULL which indicates the start of the array
if (f == NULL)
break;
printf("constructor %p\n", *f);
f();
}
return 0;
}
Which gives:
Compilation started at Fri Mar 9 09:28:29
make test && ./test
cc test.c -o test
func2
func1
start 0xffffffff
constructor 0x80483f4
func1
constructor 0x8048408
func2
You probably need to swap the continue and break if you are running on a big-endian system, but I'm not entirely sure.
But just like R.. stated using static constructors in libraries is not so nice to the developers using your library :p
On some platforms, .init_array/.fini_array sections are generated to include all global constructors/destructors. You may use that.

How to get function's name from function's pointer in Linux kernel?

How to get function's name from function's pointer in C?
Edit: The real case is: I'm writing a linux kernel module and I'm calling kernel functions. Some of these functions are pointers and I want to inspect the code of that function in the kernel source. But I don't know which function it is pointing to. I thought it could be done because, when the system fails (kernel panic) it prints out in the screen the current callstack with function's names. But, I guess I was wrong... am I?
I'm surprised why everybody says it is not possible. It is possible on Linux for non-static functions.
I know at least two ways to achieve this.
There are GNU functions for backtrace printing: backtrace() and backtrace_symbols() (See man). In your case you don't need backtrace() as you already have function pointer, you just pass it to backtrace_symbols().
Example (working code):
#include <stdio.h>
#include <execinfo.h>
void foo(void) {
printf("foo\n");
}
int main(int argc, char *argv[]) {
void *funptr = &foo;
backtrace_symbols_fd(&funptr, 1, 1);
return 0;
}
Compile with gcc test.c -rdynamic
Output: ./a.out(foo+0x0)[0x8048634]
It gives you binary name, function name, pointer offset from function start and pointer value so you can parse it.
Another way is to use dladdr() (another extension); I guess backtrace_symbols() uses dladdr(). dladdr() returns a Dl_info structure that has the function name in its dli_sname field. I don't provide a code example here, but it is obvious; see man dladdr for details.
NB! Both approaches require function to be non-static!
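For reference, the dladdr() approach might be sketched like this. It assumes glibc (or another libc that provides dladdr() and Dl_info in <dlfcn.h>); symbol_name_or is a hypothetical helper name, and foo is just a sample function.

```c
#define _GNU_SOURCE             /* dladdr() is an extension; must precede includes */
#include <dlfcn.h>
#include <assert.h>
#include <stddef.h>

/* Returns the name of the symbol that contains addr, or fallback when the
 * dynamic linker does not know a name for that address. */
const char *symbol_name_or(void *addr, const char *fallback)
{
    Dl_info info;

    assert(fallback != NULL);
    if (dladdr(addr, &info) == 0 || info.dli_sname == NULL)
        return fallback;
    return info.dli_sname;
}

/* Sample non-static function to look up. */
void foo(void)
{
}
```

A call such as symbol_name_or((void *)&foo, "?") is expected to find the name only when foo is in the dynamic symbol table, which for a function in the main executable typically means linking with -rdynamic, as in the backtrace example. Older glibc versions also need -ldl.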
Well, there is one more way - use debug information using libdwarf but it would require unstripped binary and not very easy to do so I don't recommend it.
That's not directly possible without additional assistance.
You could:
maintain a table in your program mapping function pointers to names
examine the executable's symbol table, if it has one.
The latter, however, is hard, and is not portable. The method will depend on the operating system's binary format (ELF, a.out, .exe, etc), and also on any relocation done by the linker.
EDIT: Since you've now explained what your real use case is, the answer is actually not that hard. The kernel symbol table is available in /proc/kallsyms, and there's an API for accessing it:
#include <linux/kallsyms.h>
const char *kallsyms_lookup(unsigned long addr, unsigned long *symbolsize,
unsigned long *offset, char **modname, char *namebuf);
void print_symbol(const char *fmt, unsigned long addr);
For simple debug purposes the latter will probably do exactly what you need - it takes the address, formats it, and sends it to printk, or you can use printk with the %pF format specifier.
In the Linux kernel, you can use directly "%pF" format of printk !
void *func = &foo;
printk("func: %pF at address: %p\n", func, func);
The following works for me on Linux:
printf the address of the function using %p
Then do an nm <program_path> | grep <address> (without the 0x prefix)
It should show you the function name.
It works only if the function in question is in the same program (not in a dynamically linked library or something).
If you can find out the load addresses of the loaded shared libraries, you can subtract the address from the printed number, and use nm on the library to find out the function name.
You can't directly, but you can implement a different approach to this problem if you want. Instead of a plain function pointer, you can use a struct that holds the function pointer as well as a descriptive string you can set to whatever you want.
I also added a debugging option, since you probably do not want these strings to be printed forever.
// Define it like this
typedef struct
{
#ifdef _DEBUG_FUNC
const char *dec_text; // descriptive string, only present in debug builds
#endif
void (*action)(char);
} func_Struct;
// Initialize it like this
func_Struct func[3] = {
#ifdef _DEBUG_FUNC
{"my_Set(char input)", &my_Set},
{"my_Get(char input)", &my_Get},
{"my_Clr(char input)", &my_Clr},
#else
{&my_Set},
{&my_Get},
{&my_Clr},
#endif
};
// And finally you can use it like this
func[0].action( 0x45 );
#ifdef _DEBUG_FUNC
printf("%s", func[0].dec_text);
#endif
There is no way how to do it in general.
If you compile the corresponding code into a DLL/shared library, you should be able to list all entry points and compare them with the pointer you've got. I haven't tried it yet, but I've got some experience with DLLs/shared libs and would expect it to work. This could even be implemented to work cross-platform.
Someone else already mentioned compiling with debug symbols; you could then try to find a way to analyse these from the running application, similar to what a debugger would do.
But this is absolutely proprietary and not portable.
If the list of functions that can be pointed to is not too big, or if you already suspect a small group of functions, you can print their addresses and compare them to the one used during execution. Example:
#include <stdio.h>
typedef void (*simpleFP)();
typedef struct functionMETA {
simpleFP funcPtr;
char * funcName;
} functionMETA;
void f1() {/*do something*/}
void f2() {/*do something*/}
void f3() {/*do something*/}
int main()
{
void (*funPointer)() = f2; // you ignore this
funPointer(); // this is all you see
printf("f1 %p\n", f1);
printf("f2 %p\n", f2);
printf("f3 %p\n", f3);
printf("%p\n", funPointer);
// if you want to print the name
struct functionMETA arrFuncPtrs[3] = {{f1, "f1"}, {f2, "f2"} , {f3, "f3"}};
int i;
for(i=0; i<3; i++) {
if( funPointer == arrFuncPtrs[i].funcPtr )
printf("function name: %s\n", arrFuncPtrs[i].funcName);
}
}
Output:
f1 0x40051b
f2 0x400521
f3 0x400527
0x400521
function name: f2
This approach will work for static functions too.
Use kallsyms_lookup_name() to find the address of kallsyms_lookup.
Use a function pointer that points to kallsyms_lookup, to call it.
Check out Visual Leak Detector to see how they get their callstack printing working. This assumes you are using Windows, though.
Alnitak's answer was very helpful to me when I was looking for a way to print a function's name in a kernel module. There is one thing I want to add: you might want to use %pS instead of %pF to print the function's name, because %pF no longer works in some newer kernel versions, for example 5.10.x.
Not exactly what the question is asking for, but after reading the answers here I thought of this solution to a similar problem of mine:
/**
* search methods */
static int starts(const char *str, const char *c);
static int fuzzy(const char *str, const char *c);
int (*search_method)(const char *, const char *);
/* assign the search_method and do other stuff */
[...]
printf("The search method is %s\n", search_method == starts ? "starts" : "fuzzy");
If your program needs this a lot you could define the method names along with a string in an XMacro and use #define X(name, str) ... #undef X in the code to get the corresponding string from the function name.
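The X-macro variant mentioned above might look something like this. It is a sketch under assumptions: the bodies of starts and fuzzy are placeholders, and search_fn, SEARCH_METHODS, search_names and search_method_name are hypothetical names introduced for the example.

```c
#include <stddef.h>

/* Placeholder search methods standing in for the real ones. */
static int starts(const char *str, const char *c) { return str[0] == c[0]; }
static int fuzzy(const char *str, const char *c)  { (void)str; (void)c; return 1; }

typedef int (*search_fn)(const char *, const char *);

/* Single list of (function, display string) pairs. */
#define SEARCH_METHODS \
    X(starts, "starts") \
    X(fuzzy,  "fuzzy")

/* Expand the list into a lookup table of pointer/name pairs. */
static const struct { search_fn fn; const char *name; } search_names[] = {
#define X(name, str) { name, str },
    SEARCH_METHODS
#undef X
};

/* Returns the display string for fn, or "unknown" if it is not in the list. */
const char *search_method_name(search_fn fn)
{
    for (size_t i = 0; i < sizeof search_names / sizeof search_names[0]; i++)
        if (search_names[i].fn == fn)
            return search_names[i].name;
    return "unknown";
}
```

Adding a new method then means adding one X(...) line; the table and the lookup pick it up automatically.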
You can't. The function name isn't attached to the function by the time it's compiled and linked. It's all by memory address at that point, not name.
You wouldn't know what you look like without a mirror. You'll have to use a reflection-capable language like C#.
