I'm creating a cross platform library using C. I have a piece of code like the following, in which I'm using the libc memory management functions directly:
myObject* myObjectCreate(void)
{
...
myObject *pObject = (myObject*)malloc(sizeof(*pObject));
...
}
void myObjectDestroy(myObject *pObject)
{
...
free(pObject);
...
}
I understand these memory management functions are not always available, especially on embedded systems based on low-end microcontrollers. Unfortunately my library needs to be compilable on these systems.
To work around this problem, I suppose I'd have to make these functions customisable by my library client.
So, what are the recommended ways to achieve this?
There are many approaches.
I use #if, combined with compiler-provided defines, to get per-platform behaviour.
If a given piece of functionality (such as malloc) is available on the platform, define a macro such as MYLIB_MALLOC.
Later, check for it with #ifdef MYLIB_MALLOC and, if it is not defined, provide a dummy malloc function so that your code still compiles.
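A minimal sketch of this idea, assuming the standard __STDC_HOSTED__ macro is a good enough signal that malloc exists. The names mylib_alloc and mylib_free are hypothetical and not part of the question's code; a real project might test vendor-specific compiler defines instead.

```c
#include <stddef.h>

/* Assumption: a hosted implementation provides <stdlib.h> and malloc.
   Replace this test with whatever defines your target compilers offer. */
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
#include <stdlib.h>
#define MYLIB_MALLOC 1
#endif

#ifdef MYLIB_MALLOC
/* malloc is available: forward to libc. */
void *mylib_alloc(size_t size) { return malloc(size); }
void mylib_free(void *ptr) { free(ptr); }
#else
/* No malloc: dummy versions so the library still compiles.
   Every allocation fails, so callers must handle NULL. */
void *mylib_alloc(size_t size) { (void)size; return NULL; }
void mylib_free(void *ptr) { (void)ptr; }
#endif
```

The rest of the library would then call mylib_alloc() and mylib_free() only, never malloc() and free() directly.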
Use function pointers.
Define the following pointers in the library:
void* (*CustomMalloc)(size_t) = NULL;
void (*CustomFree)(void*) = NULL;
Before calling any library function, initialize these pointers to point to custom implementations of malloc() and free(), or to the real malloc() and free().
Inside the library, replace malloc(size) with CustomMalloc(size) and free(pointer) with CustomFree(pointer).
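Put together, the function-pointer approach might look like the sketch below. The setter myLibSetAllocator and the contents of myObject are assumptions made for illustration; the question does not show them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocator hooks, unset until the client installs them. */
static void *(*CustomMalloc)(size_t) = NULL;
static void (*CustomFree)(void *) = NULL;

/* Hypothetical setter the client calls before using the library. */
void myLibSetAllocator(void *(*mallocFn)(size_t), void (*freeFn)(void *))
{
    CustomMalloc = mallocFn;
    CustomFree = freeFn;
}

/* Hypothetical object layout, only here to make the sketch compile. */
typedef struct myObject { int value; } myObject;

myObject *myObjectCreate(void)
{
    if (CustomMalloc == NULL)
        return NULL;                 /* no allocator installed yet */
    myObject *pObject = CustomMalloc(sizeof(*pObject));
    if (pObject != NULL)
        pObject->value = 0;
    return pObject;
}

void myObjectDestroy(myObject *pObject)
{
    if (pObject != NULL && CustomFree != NULL)
        CustomFree(pObject);
}
```

A hosted client would call myLibSetAllocator(malloc, free) once at start-up; an embedded client could pass functions backed by a static pool instead.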
Use conditional compilation, i.e. define macros such as LIBC_AVAIL and LIBC_NOT_AVAIL and include different code depending on which one is set.
In C, we can force the linker to put a specific function in a specific section from the source code, using something like the following example.
Here, the function my_function is tagged with the preprocessor macro PUT_IN_USER_SECTION in order to tell the linker to put it in the section .user_section.
#define PUT_IN_USER_SECTION __attribute__((__section__(".user_section"))) __attribute__ ((noinline))
double PUT_IN_USER_SECTION my_function(double a, double b)
{
// Function content
}
Now, what I'd like to know is: when we use standard functions (for example the log functions from math.h) from glibc, musl, etc., and we link statically, is it possible to put those functions in specific sections? And how can that be done?
There is a technique called overlaying that comes from a time when memory resources were limited. It is still used in embedded systems with restricted resources:
https://en.wikipedia.org/wiki/Overlay_(programming)
source : https://de.wikipedia.org/wiki/Overlay_(Programmierung)#/media/Datei:Overlay_Programming.svg
An implementation of overlaying is described at https://developer.arm.com/documentation/100066/0611/mapping-code-and-data-to-target-memory/manual-overlay-support/writing-an-overlay-manager-for-manually-placed-overlays; the load addresses are given in the scatter file.
https://sourceware.org/gdb/onlinedocs/gdb/How-Overlays-Work.html
For the addresses of functions in memory see https://ftp.gnu.org/pub/old-gnu/Manuals/ld-2.9.1/html_node/ld_22.html
https://www.keil.com/support/man/docs/armlink/armlink_pge1362066004071.htm
The descriptions seem virtually identical. Are there any nuances between the two that should be noted? Why would someone use one over the other? The same question applies to Tcl_Alloc() and malloc().
They're used because Tcl supports being built on Windows with one toolchain and loading a DLL built with a different toolchain. A key feature of that scenario is that different toolchains commonly ship their own implementations of the C library, and therefore different implementations of malloc(). You must pair malloc() and free() from the same library or you get some truly weird failures (crashes, memory leaks, etc.). By providing Tcl_Alloc and Tcl_Free (which are usually very thin wrappers), Tcl makes it possible for user code to match up allocations and releases correctly.
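The same pattern can be sketched for any library. The names MyLib_Alloc, MyLib_Free and MyLib_DupString below are hypothetical and are not Tcl's API; the point is only that memory handed out by the library is released through the library, so both calls resolve to the same C runtime.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Thin wrappers compiled into the library, so they always use the
   C runtime the library itself was built against. */
void *MyLib_Alloc(size_t size) { return malloc(size); }
void MyLib_Free(void *ptr) { free(ptr); }

/* A library function that returns memory owned by the caller.
   The caller must release it with MyLib_Free(), not its own free(). */
char *MyLib_DupString(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = MyLib_Alloc(len);
    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}
```

Client code built with a different toolchain would call MyLib_Free() on the result of MyLib_DupString() and never pass that pointer to its own free().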
The most common, and best understood, reason to use your own version of the memory allocation functions is to have a single definition that lets you swap the memory allocator for a different one (a debugging allocator, an extended one, one built with security options, etc.).
Just assume you have the following implementation:
void *my_malloc(size_t siz)
{
return malloc(siz);
}
void my_free(void *ptr)
{
free(ptr);
}
defined in allocator_malloc.c,
and for a special customer X you have acquired a license for the new ACME allocator. For this customer you link your executable with the file allocator_ACME.c, which contains:
void *my_malloc(size_t siz)
{
return ACME_malloc(siz);
}
void my_free(void *ptr)
{
ACME_free(ptr);
}
Then, just by linking your executable with one file or the other, you either create a dependency on the standard library's malloc() or you have to supply an implementation of ACME_malloc(). In this way, swapping a single object file switches the whole set of dependencies to one of several implementations (assuming each of those files defines both my_malloc() and my_free()).
The drawback is one extra level of function call, so in some cases a more sophisticated solution has to be used.
Now assume that you buy an automatic garbage collector, so you no longer need to return the memory allocated with malloc: as if by magic, the library detects that a block is no longer in use and collects it automatically:
void *my_malloc(size_t siz)
{
return GC_malloc(siz);
}
void my_free(void *ptr)
{
/* empty */
}
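A debugging allocator, mentioned above as one reason for the indirection, fits the same scheme. Below is a minimal sketch of a hypothetical allocator_debug.c that counts live blocks; the my_live_blocks() accessor is an assumption added for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Number of blocks handed out and not yet released. */
static long live_blocks = 0;

void *my_malloc(size_t siz)
{
    void *ptr = malloc(siz);
    if (ptr != NULL)
        live_blocks++;      /* count only successful allocations */
    return ptr;
}

void my_free(void *ptr)
{
    if (ptr != NULL)
        live_blocks--;      /* free(NULL) is a no-op, so do not count it */
    free(ptr);
}

/* Hypothetical accessor: a non-zero value at exit hints at a leak. */
long my_live_blocks(void)
{
    return live_blocks;
}
```

Linking this file instead of allocator_malloc.c changes nothing for the callers of my_malloc() and my_free(). The counter is not thread-safe as written.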
I need to load a library dynamically at runtime, and the code doesn't have the custom types the library uses defined at compile time.
This seems to initialize the struct correctly:
void *data = malloc(128);
InitCustomType(data);
The problem with this approach is that the size of the struct is unknown.
This is an example of how the library is normally used: (CustomType is a struct)
CustomType customType;
InitCustomType(&customType);
// Now customType can be used in the library calls
"This is an example of how the library is normally used: (CustomType is a struct)"
If that is an example, then nothing should stop you from using CustomType, or using malloc(sizeof(CustomType)).
"the code doesn't have the custom types the library uses defined at compile time."
That statement is inconsistent with above. Either you do have a definition of CustomType, or you don't. If you do, you don't have a problem. If you don't, you can't really use this library.
I am writing a memory profiler for C and, for that, am intercepting calls to the malloc, realloc and free functions via malloc_hooks. Unfortunately, these are deprecated because of their poor behavior in multithreaded environments. I could not find a document describing the alternative best-practice solution to achieve the same thing; can someone enlighten me?
I've read that a simple #define malloc(s) malloc_hook(s) would do the trick, but that does not work with the system setup I have in mind, because it is too intrusive to the original code base to be suitable for use in a profiling / tracing tool. Having to manually change the original application code is a killer for any decent profiler. Optimally, the solution I am looking for should be enabled or disabled just by linking to an optional shared library. For example, my current setup uses a function declared with __attribute__ ((constructor)) to install the intercepting malloc hooks.
Thanks
After trying some things, I finally managed to figure out how to do this.
First of all, in glibc, malloc is defined as a weak symbol, which means that it can be overridden by the application or a shared library. Hence, LD_PRELOAD is not necessarily needed. Instead, I implemented the following function in a shared library:
void*
malloc (size_t size)
{
[ ... ]
}
This function gets called by the application instead of glibc's malloc.
Now, to be equivalent to the __malloc_hooks functionality, a couple of things are still missing.
1.) the caller address
In addition to the original parameters to malloc, glibc's __malloc_hooks also provide the address of the calling function, which is actually the return address that malloc would return to. To achieve the same thing, we can use the __builtin_return_address function that is available in gcc. I have not looked into other compilers, because I am limited to gcc anyway, but if you happen to know how to do such a thing portably, please drop me a comment :)
Our malloc function now looks like this:
void*
malloc (size_t size)
{
void *caller = __builtin_return_address(0);
[ ... ]
}
2.) accessing glibc's malloc from within your hook
As I am limited to glibc in my application, I chose to use __libc_malloc to access the original malloc implementation. Alternatively, dlsym(RTLD_NEXT, "malloc") can be used, but with the possible pitfall that dlsym itself uses calloc on its first call, which can lead to infinite recursion and a segfault.
complete malloc hook
My complete hooking function now looks like this:
extern void *__libc_malloc(size_t size);
int malloc_hook_active = 0;
void*
malloc (size_t size)
{
void *caller = __builtin_return_address(0);
if (malloc_hook_active)
return my_malloc_hook(size, caller);
return __libc_malloc(size);
}
where my_malloc_hook looks like this:
void*
my_malloc_hook (size_t size, void *caller)
{
void *result;
// deactivate hooks for logging
malloc_hook_active = 0;
result = malloc(size);
// do logging
[ ... ]
// reactivate hooks
malloc_hook_active = 1;
return result;
}
Of course, the hooks for calloc, realloc and free work similarly.
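One possible variation, given that the question mentions multithreading: malloc_hook_active above is a plain global shared by all threads. The sketch below uses a per-thread re-entrancy guard instead, assuming C11 _Thread_local is available. The names traced_malloc, log_allocation and traced_logged_calls are hypothetical, and plain malloc stands in for __libc_malloc so that the sketch compiles without interposing the real malloc symbol.

```c
#include <stddef.h>
#include <stdlib.h>

void *traced_malloc(size_t size);

/* Per-thread state: one thread logging does not disable
   tracing in the other threads. */
static _Thread_local int in_hook = 0;
static _Thread_local unsigned long logged_calls = 0;

/* Stand-in for the real logging code, which may itself allocate. */
static void log_allocation(size_t size, void *result, void *caller)
{
    (void)size; (void)result; (void)caller;
    logged_calls++;
    void *scratch = traced_malloc(32);   /* nested call, must not be logged */
    free(scratch);
}

void *traced_malloc(size_t size)
{
    void *caller = __builtin_return_address(0);   /* GCC/Clang builtin */
    void *result = malloc(size);                  /* stands in for __libc_malloc */
    if (!in_hook) {
        in_hook = 1;            /* block re-entry while logging */
        log_allocation(size, result, caller);
        in_hook = 0;
    }
    return result;
}

/* Number of allocations logged on the calling thread. */
unsigned long traced_logged_calls(void)
{
    return logged_calls;
}
```

In the actual hook library the function would be named malloc and would call __libc_malloc, as in the answer's code.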
dynamic and static linking
With these functions, dynamic linking works out of the box. Linking the .so file containing the malloc hook implementation will result in all calls to malloc from the application, and also all library calls, being routed through my hook. Static linking is problematic though. I have not yet wrapped my head around it completely, but in static linking malloc is not a weak symbol, resulting in a multiple definition error at link time.
If you need static linking for whatever reason, for example to translate function addresses in 3rd party libraries to code lines via debug symbols, then you can link these 3rd party libs statically while still linking the malloc hooks dynamically, avoiding the multiple definition problem. I have not yet found a better workaround for this; if you know one, feel free to leave me a comment.
Here is a short example:
gcc -o test test.c -lmalloc_hook_library -Wl,-Bstatic -l3rdparty -Wl,-Bdynamic
3rdparty will be linked statically, while malloc_hook_library will be linked dynamically, resulting in the expected behaviour, with addresses of functions in 3rdparty translatable via debug symbols in test. Pretty neat, huh?
Conclusion
The techniques above describe a non-deprecated, pretty much equivalent approach to __malloc_hooks, but with a couple of mean limitations:
__builtin_return_address is a GCC builtin (Clang provides it as well)
__libc_malloc only works with glibc
dlsym(RTLD_NEXT, [...]) is a GNU extension in glibc
the linker flags -Wl,-Bstatic and -Wl,-Bdynamic are specific to the GNU binutils.
In other words, this solution is utterly non-portable and alternative solutions would have to be added if the hooks library were to be ported to a non-GNU operating system.
You can use LD_PRELOAD & dlsym
See "Tips for malloc and free" at http://www.slideshare.net/tetsu.koba/presentations
Just managed to NDK build code containing __malloc_hook.
Looks like it's been re-instated in Android API v28, according to https://android.googlesource.com/platform/bionic/+/master/libc/include/malloc.h, esp:
extern void* (*volatile __malloc_hook)(size_t __byte_count, const void* __caller) __INTRODUCED_IN(28);
I'm writing a library in C. I'd like to know if there is a way to divert every malloc() call my library makes to a different "augmented" testmalloc() function that I provide without (significantly) modifying my library. This question is inspired from p158 of Kernighan and Pike's "The Practice of Programming", where they say
Write a version of your storage allocator that intentionally fails early, to test your code for recovering from out-of-memory errors.
I am in a position where I could provide a wrapper mymalloc() and use that exclusively in my library. I suspect it will be necessary to use this freedom to avoid multiple symbol definitions during linking.
Yes. Include your library header last and use #define malloc myalloc.
Example:
library.h:
#include <stddef.h>
void *myalloc(size_t);
#define malloc myalloc
source.c:
#include <stdlib.h>
#include "library.h" /* included last, after <stdlib.h> */
int* i = malloc(4);
-> uses myalloc
I guess writing your own malloc:
void *malloc(size_t sz)
{
return NULL;
}
and then linking it in doesn't work here?
(Background note: You can usually replace a function in a library with another by linking yours in first in the link step. This doesn't replace the calls inside the library, so the library still uses its own function, but every reference to malloc from your own code that is still unresolved when the linker reaches your version will use your version.)
If you cannot modify the code you can consider using __malloc_hook.
See (http://www.gnu.org/s/libc/manual/html_node/Hooks-for-Malloc.html)
In addition to Yossarian's answer, you can use malloc hooks, which are defined at least for the GNU C library.
It is even possible to write a malloc() implementation that can succeed or fail depending on a global. Unix linkers won't look for the real malloc function as it finds one in the object file(s). I do not know how this would behave on Windows.
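#include <stdlib.h>

int gNextMallocShallFail = 0; /* set to 1 to make the next malloc() fail */

void *malloc(size_t aSize)
{
if (gNextMallocShallFail)
{
gNextMallocShallFail = 0; //--- next call will succeed
return NULL;
}
else
{
return realloc(NULL, aSize); //--- realloc(NULL, n) behaves like malloc(n)
}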
}
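If the library already routes its allocations through a wrapper such as the mymalloc() mentioned in the question, the "fails early" idea from Kernighan and Pike can be sketched without replacing malloc itself. The function mymalloc_fail_after and its counter are hypothetical names; this is one possible shape, not the book's code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Number of allocations that may still succeed.
   A negative value means "never fail". */
static long allocs_until_failure = -1;

/* Hypothetical test control: let the next n allocations succeed,
   then fail every later one. Pass a negative n to disable failures. */
void mymalloc_fail_after(long n)
{
    allocs_until_failure = n;
}

void *mymalloc(size_t size)
{
    if (allocs_until_failure == 0)
        return NULL;                 /* simulated out-of-memory */
    if (allocs_until_failure > 0)
        allocs_until_failure--;
    return malloc(size);
}
```

A test can then loop over n = 0, 1, 2, ... and check that the library recovers cleanly from a failure at each allocation site.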