Changing glibc but nothing happens - c

I want to modify glibc. So I have downloaded a version of it and made some changes in the code. For example I've made changes to memset. However, I don't see any difference if I use the .so file produced by the compilation (using LD_PRELOAD) as compared to when I don't do any LD_PRELOAD. memset behaves like it does. Why is that so? Is that maybe the compiler is inlining memset and not using anything from the shared object? I don't understand this. I even made changes to printf, but still nothing. Why is that so. How can I modify glibc (for testing purposes), so that I do see a change?
Moreover, when I tried to change pthread_create (and ofcourse LD_PRELOAded libpthread.so) by introducing printf( "pthread_create") in the beginning of that function, I just get a segmentation fault. What is going on here? Also if I check the difference in the libc.so after making changes in glibc source, I see no difference in the produced versions. What is happening here. This is driving me nuts!

GCC provides built-in versions of several functions, including memset() and printf(). It does not link to glibc's implementation of these functions.
Try passing the -fno-builtin compiler option to inhibit this behavior.

Related

Providing a `malloc` implementation for `newlib-nano`

I'd like to provide an implementation of malloc for newlib-nano when using it with gcc. In my situation, I have some source file, say main.c, that calls strftime. The newlib-nano implementation of strftime uses malloc. In a header file, my_memory.h, I've declared a function void *malloc(size_t size) and provided an implementation in a corresponding my_memory.c file.
When linking the project using gcc, the linker fails at .../libc_nano.a(liba-malloc.o) because of multiple definitions of malloc. The behavior I'd like is for the linker to take my implementation of malloc rather than pulling newlib-nano's, but to retain using newlib-nano's implementation of other standard library functions, e.g. memset.
I've searched around for an "exclude object file from static library" option in gcc to try to exclude libc_nano.a(liba-malloc.o) but with no luck. Note that the compiler is pulling in this object file and I don't have access to the compiler's libc_nano.a to patch liba-malloc.o with my own object file.
Anyway, am I missing something, or is it not possible to achieve what I'm trying to achieve?
Likely liba-malloc.o contains other allocator function definitions like calloc, free, realloc, etc. and thus is getting pulled in for linking because of references to one of them. You can see this with the -t option to ld (pass -Wl,-t on gcc command line when linking to use it). If this is the case, you can avoid linking it just by ensuring you've provided definitions of all these functions yourself.
A better idea might be getting rid of the malloc dependency by using a different strftime. It's rather ridiculous for strftime, especially an embedded-oriented implementation, to be calling malloc; it has no fundamental need to and I'm somewhat baffled how they found a way to make malloc useful to it. Aside from some tie-in with locale which could be extricated fairly easily, musl libc's strftime.c (disclosure: author=me) is very self-contained and could probably serve as a drop-in replacement.

Disable default malloc with arm-none-eabi on cortex m3 (bare metal)

I want to provide my own or better no malloc function. So I want to make sure it's not linked at all.
I already pass -nostdlib and --specs=nano.specs to the linker.
When providing my own malloc function I get:
../lib/gcc/arm-none-eabi/7.2.1/../../../../arm-none-eabi/lib/thumb/v7-m\libc_nano.a(lib_a-malloc.o): In function `malloc':
malloc.c:(.text.malloc+0x0): multiple definition of `malloc'
I'm looking for a way to skip linking of the lib_a-malloc.o
As a clarification: It's more about having no malloc at all than about providing my own implementation. Providing my own implementation was just to check if there already is one or for debugging purpose.
Using the same name as a standard function's name is almost always a bad idea.
Even you, after some time not working on that project, will not remember that this malloc() you are reading in your code is not the malloc() that we all know and loved. Let aside anyone else.
So, for maintainability and readability, I suggest you name your function differently, plain example: my_malloc().
PS: You might want to read GCC - How to stop malloc being linked?

Can I run GCC as a daemon (or use it as a library)?

I would like to use GCC kind of as a JIT compiler, where I just compile short snippets of code every now and then. While I could of course fork a GCC process for each function I want to compile, I find that GCC's startup overhead is too large for that (it seems to be about 50 ms on my computer, which would make it take 50 seconds to compile 1000 functions). Therefore, I'm wondering if it's possible to run GCC as a daemon or use it as a library or something similar, so that I can just submit a function for compilation without the startup overhead.
In case you're wondering, the reason I'm not considering using an actual JIT library is because I haven't found one that supports all the features I want, which include at least good knowledge of the ABI so that it can handle struct arguments (lacking in GNU Lightning), nested functions with closure (lacking in libjit) and having a C-only interface (lacking in LLVM; I also think LLVM lacks nested functions).
And no, I don't think I can batch functions together for compilation; half the point is that I'd like to compile them only once they're actually called for the first time.
I've noticed libgccjit, but from what I can tell, it seems very experimental.
My answer is "No (you can't run GCC as a daemon process, or use it as a library)", assuming you are trying to use the standard GCC compiler code. I see at least two problems:
The C compiler deals in complete translation units, and once it has finished reading the source, compiles it and exits. You'd have to rejig the code (the compiler driver program) to stick around after reading each file. Since it runs multiple sub-processes, I'm not sure that you'll save all that much time with it, anyway.
You won't be able to call the functions you create as if they were normal statically compiled and linked functions. At the least you will have to load them (using dlopen() and its kin, or writing code to do the mapping yourself) and then call them via the function pointer.
The first objection deals with the direct question; the second addresses a question raised in the comments.
I'm late to the party, but others may find this useful.
There exists a REPL (read–eval–print loop) for c++ called Cling, which is based on the Clang compiler. A big part of what it does is JIT for c & c++. As such you may be able to use Cling to get what you want done.
The even better news is that Cling is undergoing an attempt to upstream a lot of the Cling infrastructure into Clang and LLVM.
#acorn pointed out that you'd ruled out LLVM and co. for lack of a c API, but Clang itself does have one which is the only one they guarantee stability for: https://clang.llvm.org/doxygen/group__CINDEX.html

Stand-alone portable snprintf(), independent of the standard library?

I am writing code for a target platform with NO C-runtime. No stdlib, no stdio. I need a string formatting function like snprintf but that should be able to run without any dependencies, not even the C library.
At most it can depend on memory alloc functions provided by me.
I checked out Trio but it needs stdio.h header. I can't use this.
Edit
Target platform : PowerPC64 home made OS(not by me). However the library shouldn't rely on OS specific stuff.
Edit2
I have tried out some 3rd-party open source libs, such as Trio(http://daniel.haxx.se/projects/trio/), snprintf and miniformat(https://bitbucket.org/jj1/miniformat/src) but all of them rely on headers like string.h, stdio.h, or(even worse) stdlib.h. I don't want to write my own implementation if one already exists, as that would be time-wasting and bug-prone.
Try using the snprintf implementation from uclibc. This is likely to have the fewest dependencies. A bit of digging shows that snprintf is implemented in terms of vsnprintf which is implemented in terms of vfprintf (oddly enough), it uses a fake "stream" to write to string.
This is a pointer to the code: http://git.uclibc.org/uClibc/tree/libc/stdio/_vfprintf.c
Also, a quick google search also turned up this:
http://www.ijs.si/software/snprintf/
http://yallara.cs.rmit.edu.au/~aholkner/psnprintf/psnprintf.html
http://www.jhweiss.de/software/snprintf.html
Hopefully one is suitable for your purposes. This is likely to not be a complete list.
There is a different list here:
http://trac.eggheads.org/browser/trunk/src/compat/README.snprintf?rev=197
You will probably at least need stdarg.h or low level knowledge of the specific compiler/architecture calling convention in order to be able to process the variadic arguments.
I have been using code based on Kustaa Nyholm's implementation It provides printf() (with user supplied character output stub) and sprintf(), but adding snprintf() would be simple enough. I added vprintf() and vsprintf() for example in my implementation.
No dynamic memory application is required, but it does have a dependency on stdarg.h, but as I said, you are unlikely to be able to get away without that for any variadic function - though you could potentially implement your own.
I am guessing you are in a norming enivronment where you need to explicitly document and verify COTS code.
However, I think in the case of stdarg.h this is worthwhile. You could pull in the source for just this and treat it like handwritten code (review, lint, unit-test, etc.). Any self-written replacement will be a lot of work, probably less stable and absolutely not portable.
That said, the actual snprintf implementation should not be too hard, and you could do this yourself, probably. Especially if you might be able to strip a few features away.
Keep in mind that vararg code has no typechecking and is prone to errors. For library snprintf you may find gcc's warnings helpful.

Change library load order at run time (like LD_PRELOAD but during execution)

How do I change the library a function loads from during run time?
For example, say I want to replace the standard printf function with something new, I can write my own version and compile it into a shared library, then put "LD_PRELOAD=/my/library.so" in the environment before running my executable.
But let's say that instead, I want to change that linkage from within the program itself. Surely that must be possible... right?
EDIT
And no, the following doesn't work (but if you can tell me how to MAKE it work, then that would be sufficient).
void* mylib = dlopen("/path/to/library.so",RTLD_NOW);
printf = dlsym(mylib,"printf");
AFAIK, that is not possible. The general rule is that if the same symbol appears in two libraries, ld.so will favor the library that was loaded first. LD_PRELOAD works by making sure the specified libraries are loaded before any implicitly loaded libraries.
So once execution has started, all implicitly loaded libraries will have been loaded and therefore it's too late to load your library before them.
There is no clean solution but it is possible. I see two options:
Overwrite printf function prolog with jump to your replacement function.
It is quite popular solution for function hooking in MS Windows. You can find examples of function hooking by code rewriting in Google.
Rewrite ELF relocation/linkage tables.
See this article on codeproject that does almost exactly what you are asking but only in a scope of dlopen()'ed modules. In your case you want to also edit your main (typically non-PIC) module. I didn't try it, but maybe its as simple as calling provided code with:
void* handle = dlopen(NULL, RTLD_LAZY);
void* original;
original = elf_hook(argv[0], LIBRARY_ADDRESS_BY_HANDLE(handle), printf, my_printf);
If that fails you'll have to read source of your dynamic linker to figure out what needs to be adapted.
It should be said that trying to replace functions from the libc in your application has undefined behavior as per ISO C/POSIX, regardless of whether you do it statically or dynamically. It may work (and largely will work on GNU/Linux), but it's unwise to rely on it working. If you just want to use the name "printf" but have it do something nonstandard in your program, the best way to do this is to #undef printf and #define printf my_printf AFTER including any system headers. This way you don't interfere with any internal use of the function by libraries you're using...and your implementation of my_printf can even call the system printf if/when it needs to.
On the other hand, if your goal is to interfere with what libraries are doing, somewhere down the line you're probably going to run into compatibility issues. A better approach would probably be figuring out why the library won't do what you want without redefining the functions it uses, patching it, and submitting patches upstream if they're appropriate.
You can't change that. In general *NIX linking concept (or rather lack of concept) symbol is picked from first object where it is found. (Except for oddball AIX which works more like OS/2 by default.)
Programmatically you can always try dlsym(RTLD_DEFAULT) and dlsym(RTLD_NEXT). man dlsym for more. Though it gets out of hand quite quickly. Why is rarely used.
there is an environment variable LD_LIBRARY_PATH where the linker searches for shred libraries, prepend your path to LD_LIBRARY_PATH, i hope that would work
Store the dlsym() result in a lookup table (array, hash table, etc). Then #undef print and #define print to use your lookup table version.

Resources