Converting malloc() calls into external Library Calls in C

Converting malloc() calls into external Library Calls in C - c

I am writing an open-source tool for run-time memory issue debugging:
https://github.com/sandeepsinghmails/S_malloc
The current version requires the user to change his/her wrapper functions for malloc() and free() and call two additional functions from my library.
I want to modify this code so that the user's malloc() and free() calls are automatically mapped to my own implementations. The user need not modify his source code (something which Valgrind provides).
Can somebody please guide me on this?

Take a look at malloc_hooks:
http://man7.org/linux/man-pages/man3/malloc_hook.3.html
The GNU C library lets you modify the behavior of malloc(3), realloc(3),
and free(3) by specifying appropriate hook functions. You can use these
hooks to help you debug programs that use dynamic memory allocation, for
example.

Related

cJSON_Delete() and cJSON_free()

am still new to the cJSON library and i cant fully understand the uses of cJSON_Delete() and cJSON_free(),
Is there any document that accurately describes what functions should be released, also when to use cJSON_free() and when to use cJSON_Delete().
What is cJSON_InitHooks() purpose and how to use it in my code.
Everytime i declare a variable cJSON *Variable; do i need to free it to minimize memory usage or it will free itself??
Thanks!!

A quick glance in the Readme and the header file shows:
From the letter: No, the project seems not to provide such documents.
Anyway, you don't need to call cJSON_free() if you don't call cJSON_malloc(). It's more a helper function to let you call the free() and malloc() hooked functions.
You need to call cJSON_Delete() for any cJSON object you receive from any of the allocating functions, like the parsers.
The purpose of cJSON_InitHooks() is to provide your own memory allocation functions to the library. This might be interesting if you don't want to use the default functions or if you use a target without (working) malloc() and free().
The declaration as such does not allocate memory for a cJSON object. If you don't obtain such an object, you cannot call cJSON_Delete() successfully. By calling cJSON_Delete(), the memory allocated by one of the parsers for example, will be freed.
It seems as if you need to learn about pointers and dynamic memory allocation to use this library correctly. This is fairly basic C stuff independent of this library.
However, reading the provided introduction (especially the examples) and if in doubt the source will help, too.

How can I detect the main executable's function definitions from a dynamic library - particularly malloc

So libgcc will use the application's malloc and free, if it has defined one, in order to satisfy the application's need to call free() after certain library calls, eg realpath.
Within my dynamic library, I really don't want to use that application malloc/free, because I don't trust it generally - I'm happy to use libgcc's implementation of malloc, which is what most applications use, so I declared my own malloc() function that calls libgcc's implementation via dlsym().
All was going well until... I want to call realpath (and perhaps others)! As a simple fix, I need a way to do the equivalent of dlsym() but on the main executable (which I don't own) to get the application's implementation of free, if any. Does such a thing exist?
I know it must, because the dynamic linker "does the right thing", but is it accessible to mere mortal programmers, and how?
In the particular case of realpath, I know I can provide a buffer, but that comes with its own unknown dangers about buffer size. For some other calls I can't do that.
I can also go down the winding path of symbol renaming with objcopy, but I'd prefer not to, if possible.
[laters]
I do take your point on malloc being possibly defined by another dynamic library, and I would want to use that version, while the application still uses its compiled in version (I have seen that it does continue to use it, even if tcmalloc is preloaded, for example.
I guess that extends the question to ask if any library has defined malloc, and if the application has defined malloc, I want to cherry-pick which version of malloc/free I use in each place in my code to match the behaviour of libgcc when necessary, and not when not, so I want to be able to get a reference to them both.
In the short term I have resolved my current issue by replacing realpath() in my code with a version that pre-defines the buffer using my malloc, before calling the libgcc implementation, but I feel this is very much a band-aide.

The interposing malloc can come from anywhere, not just the executable. It could be a shared object referenced by a DT_NEEDED entry in the dynamic section of another object, or a library injected into the process image using LD_PRELOAD.
In general, many libraries have functions which allocate something which then has to be deallocated using free. Software ported to Windows will not do this (because DLLs have separate heaps there), but otherwise, it is not uncommon. There is not just realpath, there is also strdup, asprintf, and probably I bunch of other functions I do not remember.
In your case, you should just call free on such pointers, and use a different name for your own memory deallocation functions. Once malloc has been interposed, it is not possible to safely use the original libc allocator because it has not been properly initialized. For example, if you call the glibc malloc function in a process which uses a different, interposed malloc, then malloc will not be thread-safe: the initialization is not thread-safe because the implementation knows that pthread_create will call malloc before creating the first new thread, thereby initializing malloc while the process is single-threaded. Which is why there is no synchronization in the initialization code.
(libgcc does not provide malloc, by the way. It comes from libc/glibc.)

malloc function interposition in the standard C and C++ libraries

I want to write a shared library in such a way that it is possible to isolate it’s memory usage from the application it is linked against. That is, if the shared library, let’s call it libmemory.so, calls malloc, I want to maintain that memory in a separate heap from the heap that is used to service calls to malloc made in the application. This question isn't about writing memory allocators, it more about linking and loading the library and application together.
So far I’ve been experimenting with combinations of function interposition, symbol visibility, and linking tricks. So far, I can’t get this right because of one thing: the standard library. I cannot find a way to distinguish between calls to the standard library that internally use malloc that happen in libmemory.so versus the application. This causes an issue since then any standard library usage within libmemory.so pollutes the application heap.
My current strategy is to interpose definitions of malloc in the shared library as a hidden symbol. This works nicely and all of the library code works as expected except, of course, the standard library which is loaded dynamically at runtime. Naturally, I’ve been trying to find a way to statically embed the standard library usage so that it would use the interposed malloc in libmemory.so at compile time. I’ve tried -static-libgcc and -static-libstdc++ without success (and anyway, it seems this is discouraged). Is this the right answer?
What do?
P.s., further reading is always appreciated, and help on the question tagging front would be nice.

I’ve tried -static-libgcc and -static-libstdc++ without success
Of course this wouldn't succeed: malloc doesn't live in libgcc or libstdc++; it lives in libc.
What you want to do is statically link libmemory.so with some alternative malloc implementation, such as tcmalloc or jemalloc, and hide all malloc symbols. Then your library and your application will have absolutely separate heaps.
It goes without saying that you must never allocate something in your library and free it in the application, or vice versa.
In theory you could also link the malloc part of system libc.a into your library, but in practice GLIBC (and most other UNIX C libraries) does not support partially-static link (if you link libc.a, you must not link libc.so).
Update:
If libmemory.so makes use of a standard library function, e.g., gmtime_r, which is linked in dynamically, thereby resolving malloc at runtime, then libmemory.so mistakenly uses malloc provided at runtime (the one apparently from glibc
There is nothing mistaken about that. Since you've hidden your malloc inside your library, there is no other malloc that gmtime_r could use.
Also, gmtime_r doesn't allocate memory, except for internal use by GLIBC itself, and such memory could be cleaned up by __libc_freeres, so it would be wrong to allocate this memory anywhere other than using GLIBC's malloc.
Now, fopen is another example you used, and fopen does malloc memory. Apparently you would like fopen to call your malloc (even though it's not visible to fopen) when called by your library, but call system malloc when called by the application. But how can fopen know who called it? Surely you are not suggesting that fopen walk the stack to figure out whether it was called by your library or by something else?
So, if you really want to make your library never call into system malloc, then you would have to statically link all other libc functions that you use and that may call malloc (and hide them in your library as well).
You could use something like uclibc or dietlibc to achieve that.

How to replace newlib's malloc

I'm using LPCXpresso with LPC1768. I'm trying to implement few memory pools. I have my old code that allows this, so I'm fine there. What I'm unable to do is to prevent newlib from using it's own malloc. There are few functions in newlib calling malloc. I dodged them all, except for _Csys_alloc, which is unfortunately called by _initio. Since malloc isn't weak, I can't simply replace it with my own implementation. So is there any other way to do it except for either modifying newlib and recompiling or writing my own _initio routine?
Thanks for your help.

It is perhaps simplest to let Newlib use its malloc as it wants and implement _sbrk() to limit its use and location to a static pool sized to just what is needed for library initialisation, then override malloc() for use in your own code - the linker will only link to standard library symbols if not previously found in another library of object code.

Compiling a custom malloc

I have written a custom library which implements malloc/calloc/realloc/free using the standard C prototypes, and I figured out how to compile it to an so. I want to test the library by linking a standard application against it? What would be a good way to do this? Once I have a working library I assume I can just load it with LD_PRELOAD, but how do I get my functions to co-exist with but take precedence over the system library ones? My functions need to make a call to malloc in order to get memory to run, so I can't just completely ditch stdlib... Help?

Functions that you are trying to replace are standard C functions, not macros, not system calls. So you have to simply give your functions the same names and compile them into a shared library.
Then, use LD_PRELOAD to pre-load your library before binary starts. Since all addresses are resolved once, linker will figure out addresses of your functions and remember their names and will not look for them in standard library later.
This approach might not work if your program is linked with the standard runtime statically. Also, it will not work on Mac OS X as there is another API for interpolation.
In Linux, for example, in order for your functions to co-exist (i.e. if you want to use system malloc in your own implementation of malloc), you have to open the standard library manually using dlopen, look up functions you need there using dlsym and call them later by address.

Don't write your malloc() in terms of malloc() -- write it using sbrk, which gets memory directly from the OS.

If you have control of the source code that is to use this library, here is one possibility. Use different function names: Rather than malloc, for example, call it newCoolMalloc. This method is sometimes simpler and doesn't depend on special linker options.
Then in your code, use #define to cause the code to call the desired set of functions. You can #define malloc to be something different. For example:
#define malloc newCoolMalloc
#define free newCoolFree
If you do that, though, you have to be very very careful to include that consistently. Otherwise you run the risk of using stdlib malloc in one place and then your own free in another leading to messy bugs. One way to help mitigate that situation is to (if possible) in your own code use custom names for the allocation and free functions. Then it is easier to ensure that the correct one is being called. You can define the various custom names to your own malloc functions or even the original stdlib malloc functions.
For example, you might use mallocPlaceHolder as the actual name in the code:
someThing = mallocPlaceHolder( nbytes );
Then your defines would look more like this:
#define mallocPlaceHolder myCoolMalloc
If no function of the form mallocPlaceHolder (and associated free) actually exist, it avoids mixing different libraries.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight