I am writing a memory profiler for C, and for that I am intercepting calls to the malloc, realloc and free functions via malloc_hooks. Unfortunately, these are deprecated because of their poor behavior in multithreaded environments. I could not find a document describing the recommended alternative for achieving the same thing. Can someone enlighten me?
I've read that a simple #define malloc(s) malloc_hook(s) would do the trick, but that does not work with the system setup I have in mind, because it is too intrusive to the original code base to be suitable for use in a profiling / tracing tool. Having to manually change the original application code is a killer for any decent profiler. Optimally, the solution I am looking for should be enabled or disabled just by linking to an optional shared library. For example, my current setup uses a function declared with __attribute__ ((constructor)) to install the intercepting malloc hooks.
Thanks
After trying some things, I finally managed to figure out how to do this.
First of all, in glibc, malloc is defined as a weak symbol, which means that it can be overridden by the application or a shared library. Hence, LD_PRELOAD is not necessarily needed. Instead, I implemented the following function in a shared library:
void*
malloc (size_t size)
{
[ ... ]
}
This function gets called by the application instead of glibc's malloc.
Now, to be equivalent to the __malloc_hooks functionality, a couple of things are still missing.
1.) the caller address
In addition to the original parameters to malloc, glibc's __malloc_hooks also provide the address of the calling function, which is actually the return address that malloc returns to. To achieve the same thing, we can use the __builtin_return_address function that is available in gcc. I have not looked into other compilers, because I am limited to gcc anyway, but if you happen to know how to do such a thing portably, please drop me a comment :)
Our malloc function now looks like this:
void*
malloc (size_t size)
{
void *caller = __builtin_return_address(0);
[ ... ]
}
2.) accessing glibc's malloc from within your hook
As I am limited to glibc in my application, I chose to use __libc_malloc to access the original malloc implementation. Alternatively, dlsym(RTLD_NEXT, "malloc") can be used, but with the possible pitfall that this function uses calloc on its first call, possibly resulting in an infinite loop leading to a segfault.
complete malloc hook
My complete hooking function now looks like this:
extern void *__libc_malloc(size_t size);
int malloc_hook_active = 0;
void*
malloc (size_t size)
{
void *caller = __builtin_return_address(0);
if (malloc_hook_active)
return my_malloc_hook(size, caller);
return __libc_malloc(size);
}
where my_malloc_hook looks like this:
void*
my_malloc_hook (size_t size, void *caller)
{
void *result;
// deactivate hooks for logging
malloc_hook_active = 0;
result = malloc(size);
// do logging
[ ... ]
// reactivate hooks
malloc_hook_active = 1;
return result;
}
Of course, the hooks for calloc, realloc and free work similarly.
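The global malloc_hook_active flag above is shared by all threads, so while one thread is inside its hook, allocations made by other threads go unrecorded. A minimal sketch of one possible variation, assuming glibc and a C11 compiler: a per-thread guard replaces the global flag, and free is wrapped the same way through __libc_free. The names in_hook, log_malloc, malloc_count and free_count are made up for this illustration, and the "logging" is reduced to counting.

```c
#include <stdatomic.h>
#include <stddef.h>

/* glibc-internal entry points, as used above; not portable. */
extern void *__libc_malloc(size_t size);
extern void __libc_free(void *ptr);

/* Per-thread guard: set while this thread runs its own logging code,
 * so allocations made by the logging itself are not recorded again. */
static _Thread_local int in_hook = 0;

/* Hypothetical bookkeeping; a real profiler would record size and caller. */
static atomic_ulong malloc_count;
static atomic_ulong free_count;

static void log_malloc(size_t size, void *result, void *caller)
{
    (void)size;
    (void)result;
    (void)caller;
    atomic_fetch_add(&malloc_count, 1);
}

void *malloc(size_t size)
{
    void *caller = __builtin_return_address(0);
    void *result = __libc_malloc(size);

    if (!in_hook) {
        in_hook = 1;
        log_malloc(size, result, caller);
        in_hook = 0;
    }
    return result;
}

void free(void *ptr)
{
    __libc_free(ptr);

    if (!in_hook) {
        in_hook = 1;
        atomic_fetch_add(&free_count, 1);
        in_hook = 0;
    }
}
```

Because the guard is thread-local, one thread being inside its logging code does not switch off recording for the others. Whether thread-local storage is safe to touch this early depends on how the library is built and loaded, so treat this as a starting point rather than a finished design.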
dynamic and static linking
With these functions, dynamic linking works out of the box. Linking the .so file containing the malloc hook implementation will result in all calls to malloc from the application, and also all library calls, being routed through my hook. Static linking is problematic though. I have not yet wrapped my head around it completely, but in static linking malloc is not a weak symbol, resulting in a multiple definition error at link time.
If you need static linking for whatever reason, for example translating function addresses in 3rd party libraries to code lines via debug symbols, then you can link these 3rd party libs statically while still linking the malloc hooks dynamically, avoiding the multiple definition problem. I have not yet found a better workaround for this. If you know one, feel free to leave me a comment.
Here is a short example:
gcc -o test test.c -lmalloc_hook_library -Wl,-Bstatic -l3rdparty -Wl,-Bdynamic
3rdparty will be linked statically, while malloc_hook_library will be linked dynamically, resulting in the expected behaviour, and addresses of functions in 3rdparty to be translatable via debug symbols in test. Pretty neat, huh?
Conclusion
The techniques above describe a non-deprecated, pretty much equivalent approach to __malloc_hooks, but with a couple of mean limitations:
__builtin_return_address only works with gcc
__libc_malloc only works with glibc
dlsym(RTLD_NEXT, [...]) is a GNU extension in glibc
the linker flags -Wl,-Bstatic and -Wl,-Bdynamic are specific to the GNU binutils.
In other words, this solution is utterly non-portable and alternative solutions would have to be added if the hooks library were to be ported to a non-GNU operating system.
You can use LD_PRELOAD & dlsym
See "Tips for malloc and free" at http://www.slideshare.net/tetsu.koba/presentations
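A minimal sketch of the LD_PRELOAD and dlsym approach, assuming glibc on Linux. The names real_malloc, resolving and traced_mallocs are invented for this illustration. The guard covers the case where dlsym itself allocates while the real malloc is still being looked up; returning NULL in that window is a simplification, and more careful interposers serve such requests from a small static buffer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* RTLD_NEXT is a GNU extension */
#endif
#include <dlfcn.h>
#include <stdatomic.h>
#include <stddef.h>

/* Fallback for builds where <dlfcn.h> was already processed without
 * _GNU_SOURCE; the value matches glibc's own definition. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1L)
#endif

typedef void *(*malloc_fn)(size_t);

static malloc_fn real_malloc;          /* resolved lazily on first use */
static _Thread_local int resolving;    /* set while dlsym is running */
static atomic_ulong traced_mallocs;    /* hypothetical statistic */

void *malloc(size_t size)
{
    if (real_malloc == NULL) {
        if (resolving)
            return NULL;               /* re-entered from inside dlsym */
        resolving = 1;
        real_malloc = (malloc_fn)dlsym(RTLD_NEXT, "malloc");
        resolving = 0;
        if (real_malloc == NULL)
            return NULL;               /* no underlying malloc found */
    }

    atomic_fetch_add(&traced_mallocs, 1);
    return real_malloc(size);
}
```

Built as a shared object (for example gcc -shared -fPIC -o libtrace.so trace.c, with -ldl added on older glibc versions) and loaded with LD_PRELOAD=./libtrace.so, the wrapper is found before libc's malloc. The file and library names here are placeholders.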
Just managed to NDK build code containing __malloc_hook.
Looks like it's been re-instated in Android API v28, according to https://android.googlesource.com/platform/bionic/+/master/libc/include/malloc.h, esp:
extern void* (*volatile __malloc_hook)(size_t __byte_count, const void* __caller) __INTRODUCED_IN(28);
The descriptions seem virtually identical. Are there any nuances between the two that should be noted? Why would someone use one over the other? The same question may be posed for Tcl_Alloc() and malloc().
They're used because Tcl supports being built on Windows with one tool chain, and loading a DLL built with a different toolchain. A key feature of that scenario is that it is fairly common for different toolchains to have their own implementations of the C library, and that means different implementations of malloc(). You must match malloc() and free() to the same library or you get some truly weird failures (crashes, memory leaks, etc.) By providing Tcl_Alloc and Tcl_Free (which are usually very thin wrappers) it makes it possible for user code to match up the allocations and releases correctly.
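The pairing rule can be sketched without Tcl itself. In the illustration below, Lib_Alloc, Lib_Free and Lib_CopyString are hypothetical names standing in for a library's exported allocator pair; the point is only that memory handed out by the library is released through the library's own function, so both calls end up in the same C runtime.

```c
#include <stdlib.h>
#include <string.h>

/* Thin wrappers compiled into the library, so they always use the
 * malloc/free of the C runtime the library itself was built against. */
void *Lib_Alloc(size_t size)
{
    return malloc(size);
}

void Lib_Free(void *ptr)
{
    free(ptr);
}

/* A library function that returns memory the caller must release. */
char *Lib_CopyString(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = Lib_Alloc(len);

    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}
```

User code built with a different toolchain would then call Lib_Free(copy) rather than its own free(copy).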
Normally, the best understood reason to use your own version of the memory allocation functions is to have a single definition that lets you swap the memory allocator for a different one (a debugging allocator, an extended one, one implemented with security options, etc.).
Just assume you have the following implementation:
void *my_malloc(size_t siz)
{
return malloc(siz);
}
void my_free(void *ptr)
{
free(ptr);
}
defined in allocator_malloc.c
and for a special customer X you have acquired a license of the new ACME allocator. For this customer you link your executable with the file allocator_ACME.c which contains:
void *my_malloc(size_t siz)
{
return ACME_malloc(siz);
}
void my_free(void *ptr)
{
ACME_free(ptr);
}
Then, simply by linking your executable with one file or the other, you either depend on the standard library's malloc() or have to provide an implementation of ACME_malloc(). In this way, swapping a single object file switches the whole set of dependencies to one of several different implementations (assuming your source files only call my_malloc() and my_free()).
The drawback is that you have one more level of function call, so in some cases a more sophisticated solution has to be used.
Assume that you buy an automatic garbage collector, so that you no longer need to return the memory allocated with malloc: the library detects that a block is no longer in use and collects it automatically:
void *my_malloc(size_t siz)
{
return GC_malloc(siz);
}
void my_free(void *ptr)
{
/* empty */
}
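One way to avoid the extra call level mentioned above is to choose the allocator at compile time in a header, using static inline wrappers that the compiler can usually fold away. This is only a sketch: the USE_ACME_ALLOCATOR macro is an assumption made for this example, and ACME_malloc and ACME_free are the hypothetical vendor functions from the text.

```c
#include <stdlib.h>

#if defined(USE_ACME_ALLOCATOR)
/* Hypothetical vendor API, declared here only for illustration. */
void *ACME_malloc(size_t siz);
void ACME_free(void *ptr);

static inline void *my_malloc(size_t siz) { return ACME_malloc(siz); }
static inline void my_free(void *ptr)     { ACME_free(ptr); }
#else
/* Default build: forward straight to the standard library. */
static inline void *my_malloc(size_t siz) { return malloc(siz); }
static inline void my_free(void *ptr)     { free(ptr); }
#endif
```

The trade-off is that the choice moves from link time to compile time, so every source file has to be rebuilt when the allocator changes.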
I have some C code that I want to do some tests on. It uses malloc, calloc, and free throughout the code. I want to change those functions to a custom function that internally calls the original function. For example:
emxArray->size = (int *)malloc((unsigned int)(sizeof(int) * numDimensions));
would become:
emxArray->size = (int *)myMalloc((unsigned int)(sizeof(int) * numDimensions));
where myMalloc is:
void* myMalloc(unsigned size)
{
if (size < 8)
{
//printf("*** Bumped from %d....\n", size);
size = 8;
}
allocated += size;
return malloc(size);
}
As you can see, myMalloc internally calls malloc. It just does some extra stuff. I wanted to replace the usage of malloc throughout the code with myMalloc. I have done this successfully by going through all the code and replacing malloc with myMalloc manually, but this is far from ideal. I will be replacing this code on a test-only basis, thus the production code should contain only malloc calls. I realize I could also do this with a script, but wanted to just use a define statement in the Makefile:
-Dmalloc=myMalloc
But that also replaces malloc in the myMalloc function, which causes an infinite recursive situation. I tried changing the malloc call in the myMalloc function to malloc_d, and added a second define to the Makefile.
-Dmalloc=myMalloc -Dmalloc_d=malloc
I was thinking that the first define would not replace malloc_d (which it didn't) and that the second define would only change malloc_d (which it didn't). I got the same recursive situation. Is there any way to do this with Makefile defines? Or are multipass preprocessing situations always going to mess this up?
UPDATE:
Ok, so I have started looking at the LD_PRELOAD option that has been pointed out. I thought I had a workable solution, however, I am still having trouble! Here is what I did...
I moved myMalloc() and myFree() out of the main file and into its own file. I then compiled it into a shared library using:
gcc -shared -o libMyMalloc.so -fPIC myMalloc.c
I then added the following 'dummy functions' to the main file:
void* myMalloc(unsigned size)
{
void* ptr;
return ptr;
}
void myFree(void* ptr)
{
}
As you can see, they do nothing.
I added the following defines to the make file:
-Dmalloc=myMalloc \
-Dfree=myFree
I compiled the code and ran it against the libMyMalloc.so library I created:
LD_PRELOAD=/home/rad/Desktop/myMalloc/libMyMalloc.so ./testRegress
However, I am still not getting it to run with the myMalloc functions that are defined in the libMyMalloc.so file.
The simplest solution is to not call malloc directly in your code: If you choose a different name (say MALLOC), it's trivial to switch to a custom allocator.
Example code:
#ifndef MALLOC
#define MALLOC malloc
#endif
For test builds, you'd do -DMALLOC=myMalloc.
It gets more complicated if for some reason you want to keep the calls to malloc. Then you'd have to add something like the following after all standard library headers have been included:
#ifdef USE_MY_MALLOC
#undef malloc
#define malloc(SIZE) myMalloc(SIZE)
#endif
You can call the standard library function by putting its name in parentheses, i.e.
void* myMalloc(unsigned size)
{
...
return (malloc)(size);
}
and enable it via -DUSE_MY_MALLOC.
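The two-define attempt in the question recursed because the preprocessor rescans the result of an expansion: malloc_d becomes malloc, which is then expanded again to myMalloc. A function-like macro avoids this, since it only expands when the name is directly followed by an opening parenthesis. Below is a minimal single-file sketch of that idea; the allocated counter comes from the question, while using size_t for the parameter is a choice made for this example.

```c
#include <stdlib.h>

static size_t allocated = 0;       /* running total of requested bytes */

void *myMalloc(size_t size);

/* From here on, every call written as malloc(...) becomes myMalloc(...). */
#undef malloc
#define malloc(SIZE) myMalloc(SIZE)

void *myMalloc(size_t size)
{
    if (size < 8)
        size = 8;                  /* bump tiny requests, as in the question */
    allocated += size;

    /* The parenthesized name is not followed by '(' directly, so the
     * function-like macro does not expand: this is the real malloc. */
    return (malloc)(size);
}
```

After this, a call such as malloc(3) elsewhere in the file goes through myMalloc and adds 8 to allocated.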
Considering the additional requirements given in the comments, two approaches come to mind:
pre-process the generated source, textually replacing calls to malloc
intercept inclusion of stdlib.h (assuming that's where MATLAB gets its malloc declaration from)
Your own version of stdlib.h would look something like this:
#ifndef MY_STDLIB_H_
#define MY_STDLIB_H_
#include "/usr/include/stdlib.h"
#undef malloc
#define malloc(SIZE) myMalloc(SIZE)
#endif
Then, you can conditionally add the directory where you've placed that file to the include path. Also note that this is not a particularly robust solution, but it might work for you anyway.
You can use a pointer to a function. In the normal case, make it point to malloc. In debugging mode, let it point to your function.
In some h file:
extern void *(*myMalloc)(size_t size);
In one of your c files:
#ifdef DEBUG
void *(*myMalloc)(size_t size) = dMalloc;
#else
void *(*myMalloc)(size_t size) = malloc; // derived from libc
#endif
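The snippet above leaves dMalloc undefined. A minimal sketch of what it could look like, assuming a debug allocator that only counts calls (the debug_allocations counter is made up for this illustration):

```c
#include <stdlib.h>

/* Hypothetical statistic kept by the debug allocator. */
size_t debug_allocations = 0;

/* Debug allocator: counts the call, then forwards to libc. */
void *dMalloc(size_t size)
{
    debug_allocations++;
    return malloc(size);
}

#ifdef DEBUG
void *(*myMalloc)(size_t size) = dMalloc;
#else
void *(*myMalloc)(size_t size) = malloc;   /* plain libc malloc */
#endif
```

Since myMalloc is an ordinary variable, it can also be reassigned at run time, for example myMalloc = dMalloc; around a single test, at the cost of an indirect call for every allocation.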
I found my solution and wanted to share. Thank you to everyone that contributed and pointed me in the right direction.
I ended up creating my custom library code and compiling it into a shared library:
gcc -shared -o libtmalloc.so -fPIC tmalloc.c
I then modified the makefile to link against the shared library and to globally redefine malloc, calloc and free to my custom function names (malloc_t, calloc_t and free_t, each of which internally calls the original function):
gcc -L/home/path/to/mallocTrace -Wall -o test test.c lib/coder/*.c -lm -ltmalloc \
-Dmalloc=malloc_t \
-Dcalloc=calloc_t \
-Dfree=free_t
The defines changed all the function calls for me which were linked to the implementation in the shared library. Because I am using a shared library (which is already compiled), I didn't have to worry about my makefile defines causing a recursive call situation in my custom functions. With this usage, I can take any pre-generated C code from my other tools and observe the memory usage with these simple makefile changes and using my custom malloc trace library.
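The contents of tmalloc.c are not shown above. A hypothetical sketch of what such a trace library might contain follows; the counters and the tmalloc_report function are assumptions for illustration, not the original code. The file has to be compiled without the -D flags so that its own calls reach the real allocator.

```c
/* tmalloc.c: hypothetical sketch of a tracing allocator library */
#include <stdio.h>
#include <stdlib.h>

static size_t bytes_requested = 0;   /* total bytes asked for so far */
static size_t live_blocks = 0;       /* allocations not yet freed */

void *malloc_t(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        bytes_requested += size;
        live_blocks++;
    }
    return p;
}

void *calloc_t(size_t count, size_t size)
{
    void *p = calloc(count, size);
    if (p != NULL) {
        bytes_requested += count * size;
        live_blocks++;
    }
    return p;
}

void free_t(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;
    free(ptr);
}

/* Print a summary, e.g. at the end of a test run. */
void tmalloc_report(void)
{
    fprintf(stderr, "requested %zu bytes, %zu blocks still live\n",
            bytes_requested, live_blocks);
}
```

The signatures mirror the standard ones on purpose: the -D flags also rewrite the declarations the application sees in <stdlib.h>, so malloc_t, calloc_t and free_t need to match them.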
I'm creating a cross platform library using C. I have a piece of code like the following, in which I'm using the libc memory management functions directly:
myObject* myObjectCreate(void)
{
...
myObject *pObject = (myObject*)malloc(sizeof(*pObject));
...
}
void myObjectDestroy(myObject *pObject)
{
...
free(pObject);
...
}
I understand these memory management functions are not always available, especially on embedded systems based on low-end microcontrollers. Unfortunately my library needs to be compilable on these systems.
To work around this problem, I suppose I'd have to make these functions customisable by my library client.
So, what are the recommended ways to achieve this?
There are many approaches.
I use #if, combined with compiler provided defines, to have per platform behaviour.
Should a given functionality (such as malloc) be found, #define MYLIB_MALLOC can be defined.
Then, later, you can check for #ifdef MYLIB_MALLOC and if not present, provide a dummy malloc function, which will allow your code to compile.
Use function pointers.
Define the following pointers in the library:
void* (*CustomMalloc)(size_t) = NULL;
void (*CustomFree)(void*) = NULL;
Before using any of the library functions, initialize these pointers to point to custom implementations of malloc() and free(), or to the real malloc() and free().
Inside of the library replace malloc(size) with CustomMalloc(size) and free(pointer) with CustomFree(pointer).
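The function-pointer approach can be sketched as follows for a target without a heap. MyLibSetAllocator, PoolMalloc and PoolFree are hypothetical names, the fixed-size pool is just one possible client-side allocator, and the myObject layout is invented for the example. The sketch assumes a C11 compiler.

```c
#include <stddef.h>

/* Allocator hooks owned by the library; unset until the client sets them. */
static void *(*CustomMalloc)(size_t) = NULL;
static void (*CustomFree)(void *) = NULL;

void MyLibSetAllocator(void *(*alloc_fn)(size_t), void (*free_fn)(void *))
{
    CustomMalloc = alloc_fn;
    CustomFree = free_fn;
}

/* Client-side allocator for systems without malloc: a static pool that
 * hands out aligned chunks and never reuses them. */
#define POOL_SIZE 1024
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

void *PoolMalloc(size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;

    if (rounded < size || rounded > POOL_SIZE - pool_used)
        return NULL;               /* overflow or pool exhausted */

    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}

void PoolFree(void *ptr)
{
    (void)ptr;                     /* this simple pool never releases memory */
}

typedef struct myObject { int value; } myObject;

myObject *myObjectCreate(void)
{
    if (CustomMalloc == NULL)
        return NULL;               /* no allocator installed yet */

    myObject *pObject = CustomMalloc(sizeof(*pObject));
    if (pObject != NULL)
        pObject->value = 0;
    return pObject;
}

void myObjectDestroy(myObject *pObject)
{
    if (CustomFree != NULL)
        CustomFree(pObject);
}
```

A hosted client would call MyLibSetAllocator(malloc, free) instead, and the library code stays the same.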
Use conditional compilation, i.e. define macros like LIBC_AVAIL and LIBC_NOT_AVAIL and include different code depending on which one is set.
I'm writing a library in C. I'd like to know if there is a way to divert every malloc() call my library makes to a different "augmented" testmalloc() function that I provide without (significantly) modifying my library. This question is inspired from p158 of Kernighan and Pike's "The Practice of Programming", where they say
Write a version of your storage allocator that intentionally fails early, to test your code for recovering from out-of-memory errors.
I am in a position where I could provide a wrapper mymalloc() and use that exclusively in my library. I suspect it will be necessary to use this freedom to avoid multiple symbol definitions during linking.
Yes. Include your library header last, after the standard headers, and use #define malloc myalloc
example:
library.h:
void * myalloc(int);
#define malloc myalloc
source.c:
#include <stdlib.h>
#include "library.h"
int* i = malloc(4);
-> uses myalloc
I guess writing your own malloc:
void *malloc(size_t sz)
{
return (void *)0; /* always fail */
}
and then linking it in doesn't work here?
(Background note: You can usually replace a function in a library with another by linking it in first in the link step. This doesn't replace the calls in the library, so the library still uses its own function, but everything that needed a link to malloc from your own code when the linker gets to your version will use your version.)
If you cannot modify the code you can consider using __malloc_hook.
See (http://www.gnu.org/s/libc/manual/html_node/Hooks-for-Malloc.html)
In addition to Yossarian's answer, you can use malloc hooks, defined at least for the GNU C library.
It is even possible to write a malloc() implementation that can succeed or fail depending on a global. Unix linkers won't look for the real malloc function as it finds one in the object file(s). I do not know how this would behave on Windows.
#include <stdlib.h>
int gNextMallocShallFail = 0; //--- set to 1 to make the next malloc fail
void *malloc(size_t aSize)
{
if (gNextMallocShallFail)
{
gNextMallocShallFail = 0; //--- next call will succeed
return NULL;
}
else
{
return realloc(NULL, aSize);
}
}
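A variant of the idea above that matches the Kernighan and Pike suggestion more closely is an allocator that starts failing after a chosen number of successful calls. In this sketch, testmalloc is the name from the question, while testmalloc_fail_after and fail_countdown are made up for the example. It is written as a separately named wrapper rather than a replacement for malloc itself.

```c
#include <stdlib.h>

/* Number of allocations that may still succeed; -1 means never fail. */
static long fail_countdown = -1;

/* Let the next n allocations succeed, then fail every later one
 * until this is called again. Pass -1 to disable failures. */
void testmalloc_fail_after(long n)
{
    fail_countdown = n;
}

void *testmalloc(size_t size)
{
    if (fail_countdown == 0)
        return NULL;               /* simulated out-of-memory */
    if (fail_countdown > 0)
        fail_countdown--;
    return malloc(size);
}
```

A test can then loop over n = 0, 1, 2, ... and check that the library recovers cleanly from a failure at each allocation site in turn.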
I would like to monitor the use of mallocs and frees in an application by using the malloc and free hooks.
Here's the documentation http://www.gnu.org/s/libc/manual/html_node/Hooks-for-Malloc.html
From the example page you can see that my_malloc_hook transiently switches the malloc hook off (or to the previous hook in the chain) before re-invoking malloc.
This is a problem when monitoring multi-threaded applications (see end of question for explanation).
Other examples of the use of malloc hook that I have found on the internet have the same problem.
Is there a way to re-write this function to work correctly in a multi-threaded application?
For instance, is there an internal libc function that the malloc hook can invoke that completes the allocation, without the need to deactivate my hook.
I can't look at the libc source code due to corporate legal policy, so the answer may be obvious.
My design spec says I cannot replace malloc with a different malloc design.
I can assume that no other hooks are in play.
UPDATE
Since the malloc hook is temporarily removed while servicing the malloc, another thread may call malloc and NOT get the hook.
It has been suggested that malloc has a big lock around it that prevents this from happening, but it's not documented, and the fact that I effectively recursively call malloc suggests any lock must either exist after the hook, or be jolly clever:
caller ->
malloc ->
malloc-hook (disables hook) ->
malloc -> # possible hazard starts here
malloc_internals
malloc <-
malloc-hook (enables hook) <-
malloc
caller
UPDATED
You are right to not trust __malloc_hooks; I have glanced at the code, and they are - staggeringly crazily - not thread safe.
Invoking the inherited hooks directly, rather than restoring and re-entering malloc, seems to deviate a little too much from the document you cite for me to feel comfortable suggesting it.
From http://manpages.sgvulcan.com/malloc_hook.3.php:
Hook variables are not thread-safe so they are deprecated now. Programmers should instead preempt calls to the relevant functions by defining and exporting functions like "malloc" and "free".
The appropriate way to inject debug malloc/realloc/free functions is to provide your own library that exports your 'debug' versions of these functions, and then defers itself to the real ones. C linking is done in explicit order, so if two libraries offer the same function, the first specified is used. You can also inject your malloc at load-time on unix using the LD_PRELOAD mechanisms.
http://linux.die.net/man/3/efence describes Electric Fence, which details both these approaches.
You can use your own locking in these debug functions if that is necessary.
I had the same problem and solved it with the example below. If THREAD_SAFE is not defined, the code matches the example given in the man page and crashes with a segmentation fault.
If we define THREAD_SAFE, we have no segmentation error.
#include <malloc.h>
#include <pthread.h>
#include <unistd.h> /* sleep */
#define THREAD_SAFE
#undef THREAD_SAFE /* delete this line to build the THREAD_SAFE variant */
/** rqmalloc_hook_ */
static void* (*malloc_call)(size_t,const void*);
static void* rqmalloc_hook_(size_t taille,const void* appel)
{
void* memoire;
__malloc_hook=malloc_call;
memoire=malloc(taille);
#ifndef THREAD_SAFE
malloc_call=__malloc_hook;
#endif
__malloc_hook=rqmalloc_hook_;
return memoire;
}
/** rqfree_hook_ */
static void (*free_call)(void*,const void*);
static void rqfree_hook_(void* memoire,const void* appel)
{
__free_hook=free_call;
free(memoire);
#ifndef THREAD_SAFE
free_call=__free_hook;
#endif
__free_hook=rqfree_hook_;
}
/** rqrealloc_hook_ */
static void* (*realloc_call)(void*,size_t,const void*);
static void* rqrealloc_hook_(void* memoire,size_t taille,const void* appel)
{
__realloc_hook=realloc_call;
memoire=realloc(memoire,taille);
#ifndef THREAD_SAFE
realloc_call=__realloc_hook;
#endif
__realloc_hook=rqrealloc_hook_;
return memoire;
}
/** memory_init */
void memory_init(void)
{
malloc_call = __malloc_hook;
__malloc_hook = rqmalloc_hook_;
free_call = __free_hook;
__free_hook = rqfree_hook_;
realloc_call = __realloc_hook;
__realloc_hook = rqrealloc_hook_;
}
/** f1/f2 */
void* f1(void* param)
{
void* m;
while (1) {m=malloc(100); free(m);}
}
void* f2(void* param)
{
void* m;
while (1) {m=malloc(100); free(m);}
}
/** main */
int main(int argc, char *argv[])
{
memory_init();
pthread_t t1,t2;
pthread_create(&t1,NULL,f1,NULL);
pthread_create(&t2,NULL,f2,NULL);
sleep(60);
return(0);
}
Since all calls to malloc() will go through your hook, you can synchronize on a semaphore (wait until it is free, lock it, juggle the hooks and free the semaphore).
[EDIT] IANAL but ... If you can use glibc in your code, then you can look at the code (since it's LGPL, anyone using it must be allowed to have a copy of the source). So I'm not sure you understood the legal situation correctly or maybe you're not legally allowed to use glibc by your company.
[EDIT2] After some thinking, I guess that this part of the call path must be protected by a lock of some kind which glibc creates for you. Otherwise, using hooks in multi-threaded code would never work reliably and I'm sure the docs would mention this. Since malloc() must be thread safe, the hooks must be as well.
If you're still worried, I suggest to write a small test program with two threads which allocate and free memory in a loop. Increment a counter in the hook. After a million rounds, the counter should be exactly two million. If this holds, then the hook is protected by the malloc() lock as well.
[EDIT3] If the test fails, then, because of your legal situation, it's not possible to implement the monitor. Tell your boss and let him make a decision about it.
[EDIT4] Googling turned up this comment from a bug report:
The hooks are not thread-safe. Period. What are you trying to fix?
This is part of a discussion from March 2009 about a bug in libc/malloc/malloc.c which contains a fix. So maybe a version of glibc after this date works but there doesn't seem to be a guarantee. It also seems to depend on your version of GCC.
There is no way to use the malloc hooks in a thread-safe way while recursing into malloc. The interface is badly designed, probably beyond repair.
Even if you put a mutex in your hook code, the problem is that calls into malloc do not see those locks until after they have passed through the hook mechanism, and to pass through the hook mechanism, they look at global variables (the hook pointers) without acquiring your mutex. As you're saving, changing and restoring these pointers in one thread, allocator calls in another thread are affected by them.
The main design problem is that the hooks are null pointers by default. If the interface simply provided non-null default hooks which are the allocator proper (the bottom-level allocator which doesn't call any more hooks), then it would be simple and safe to add hooks: you could just save the previous hooks, and in the new hooks, recurse into malloc by calling the old hooks, without fiddling with any global pointers (other than at hook installation time, which can be done before any threads start up).
Alternatively, glibc could provide an internal malloc interface which doesn't invoke the hooks.
Another sane design would be the use of thread-local storage for the hooks. Overriding and restoring a hook would be done in one thread, without disturbing the hooks seen by another thread.
As it stands, what you can do to use the glibc malloc hook safely is to avoid recursing into malloc. Do not change the hook pointers inside the hook callbacks, and simply call your own allocator.