Can't preload a function in C program - c

I wrote a function in NASM and I'm trying to use it in a C program as a replacement for strlen through a shared library, but it doesn't work.
nasm code:
section .text
global strlen:function
strlen:
mov rax, 42
ret
C code:
#include <stdio.h>
size_t strlen(const char *s);
int main()
{
printf("%zu\n", strlen("foobar"));
return (0);
}
I compile the C program just using gcc without any arguments, and I create the shared library with the following commands:
nasm -f elf64 strlen.asm
gcc -shared -fPIC -o libasm.so strlen.o
Finally, I preload the shared library:
export LD_PRELOAD=`pwd`/libasm.so
But it prints 6 where I expect it to print 42.
I don't think the problem comes from my library, because I get a segmentation fault when I run the ls command with LD_PRELOAD set.
I'm working on Ubuntu 16.04.

This is not related to nasm at all. A C equivalent of your strlen() function does not work either.
$ cat strlen.c
#include <stddef.h>
size_t strlen(const char *s)
{
return 43;
}
$ cat s.c
#include <stdio.h>
size_t strlen(const char *s);
int main()
{
printf("%zu\n", strlen("foobar"));
return 0;
}
$ make s
cc s.c -o s
$ gcc -shared -fPIC -o strlen.so strlen.c
$ LD_PRELOAD=$PWD/strlen.so ./s
6
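What is happening here is that gcc treats strlen() as a built-in. Because the argument is a string literal, the compiler evaluates strlen("foobar") at compile time and emits the constant 6, so the binary never calls strlen() and there is nothing for LD_PRELOAD to override. If the C program that calls strlen() is recompiled with the built-in disabled, the call goes through the dynamic linker and your shared library can override it.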
$ rm s
$ make s CFLAGS=-fno-builtin-strlen
cc -fno-builtin-strlen s.c -o s
$ LD_PRELOAD=$PWD/strlen.so ./s
43
$ LD_PRELOAD=$PWD/libasm.so ./s
42

Related

__attribute__((weakref)) not work for external function

Recently I've been studying the linking process. When it comes to weak symbols, my textbook gives the code below to demonstrate how to use __attribute__((weakref)) to declare a weak reference to an external function:
//pthread.c
#include <stdio.h>
#include <pthread.h>
int pthread_create(
pthread_t*,
const pthread_attr_t*,
void* (*)(void*),
void*
)__attribute__((weak));
int main()
{
if(pthread_create){
printf("This is multi-thread version\n");
}else{
printf("This is single-thread version!\n");
}
}
Then the author links it in different ways, which gives these results:
$ gcc pthread.c -o pt
$ ./pt
This is single-thread version!
$ gcc pthread.c -lpthread -o pt
$ ./pt
This is multi-thread version!
I reproduced the same procedure on my machine, but both builds print This is single-thread version!
I tried to find out what's going on here, but I quickly got stuck on two problems:
I think the author may have mistakenly written __attribute__((weak)) instead of __attribute__((weakref)) here, because if a module declares a weak symbol, the linker does not look up the definition of that symbol in the library during static linking. Considering that GCC uses dynamic linking by default, I used static linking to verify that:
$ gcc -c pthread.c
$ nm pthread.o
U _GLOBAL_OFFSET_TABLE_
0000000000000000 T main
w pthread_create
U puts
$ gcc -static pthread.c -lpthread -o pt
$ nm pt | grep 'pthread_create'
$
The symbol pthread_create does not appear in the symbol table.
Now I use __attribute__((weakref)) to repeat the procedure.
//pthread.c modified
#include <stdio.h>
#include <pthread.h>
static int pthread_create_dup(
pthread_t*,
const pthread_attr_t*,
void* (*)(void*),
void*
)__attribute__((weakref,alias("pthread_create")));
int main()
{
if(pthread_create_dup){
printf("This is multi-thread version!\n");
}else{
printf("This is single-thread version!\n");
}
}
Still, after compiling and linking, the result is This is single-thread version!
$ gcc -static pthread.c -lpthread -o pt
$ nm pt | grep 'pthread_create'
$
I made another sample to reproduce the above:
test.c:
#include <stdio.h>
extern int foo(int a,int b);
static int foo_dup(int a,int b) __attribute__((weakref,alias("foo")));
int main(){
if(foo_dup){
printf("foo is linked\n");
}else{
printf("foo isn't linked\n");
}
}
ref_foo.c
int foo(int a ,int b){
return a+b;
}
Then compile those two:
$ gcc -c ref_foo.c test.c
$ ar rcs libfoo.a ref_foo.o
$ gcc -static test.o libfoo.a -o test_withlib
$ ./test_withlib
foo isn't linked
$ gcc -static test.o ref_foo.o -o test_withoutlib
$ ./test_withoutlib
foo is linked
So why does this happen? Apparently I cannot simply extract pthread_create.o from libpthread.a and run gcc pthread.c pthread_create.o -o pt. How do I correctly implement pthread.c so that it prints This is multi-thread version! when linked against libpthread?

Need rebuild to use it with different library that provides the same interface?

I have a project "A" that has dependency to "B". "B" is dynamically loaded in start up time.
Imagine there's a library "C" that satisfies the same interface as "B". Can A.so that is built using B.so be used with C.so without rebuild? Is it linker's job to find the correct function's addresses?
Can this be "safely" achieved just by configuring LD_LIBRARY_PATH that prioritizes C.so ?
Edit:
I seem to have misunderstood the question. If you want to keep both libB.so and libC.so but use libC.so for some binaries, keep them in separate folders and set LD_LIBRARY_PATH so that the folder containing libC.so is searched first. In that folder, create a soft link named libB.so that points to libC.so, since the binary still asks for libB.so by name. You may have to run the binary that needs libC.so separately from the others, because the environment variable differs. Test it.
Let's define a function, say str_mod(), with the same interface in two different libraries. One (A) changes all characters in the input string to upper case, while the other (B) reverses it.
We'll have a wrapper library (C) with a function str_mod_C() that replaces the string itself and then calls str_mod().
A test program calls the wrapper str_mod_C(). What happens when we replace the core library?
Creating files & test folder:
test_so/
├── lib_AAA.c
├── lib_BBB.c
├── lib_CCC.c
├── lib_if.h
└── test_so.c
Content of the files:
lib_if.h
#ifndef _LIB_IF_H_
#define _LIB_IF_H_ 1
#include <stddef.h> /* size_t */
char* str_mod (char* str, size_t slen);
char* str_mod_C (char* str, size_t slen);
#endif
lib_AAA.c : Just converts all chars to upper-case
#include <stdio.h>
#include <stddef.h>
#include <ctype.h>
#include "lib_if.h"
// change string to UPPER-CASE
char* str_mod (char* str, size_t slen) {
printf ("\nHello from [%s] = ", __FILE__);
for (size_t ci = 0; ci < slen; ++ci)
str[ci] = (char) toupper ((unsigned char) str[ci]); // toupper() needs an unsigned char value
return str;
}
lib_BBB.c : reverses the string
#include <stdio.h>
#include <stddef.h>
#include <ctype.h>
#include "lib_if.h"
// ESREVER/REVERSE the string in place
char* str_mod (char* str, size_t slen) {
printf ("\nHello from [%s] = ", __FILE__);
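if (slen < 2) return str; // nothing to swap; also keeps slen - 1 from wrapping around when slen is 0
for (size_t ai = 0, zi = slen -1; ai < zi; ++ai, --zi) {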
char tmp = str[ai];
str[ai] = str[zi];
str[zi] = tmp;
}
return str;
}
lib_CCC.c : replaces the string with a default string
#include <stdio.h>
#include <stddef.h>
#include <string.h>
#include "lib_if.h"
// Replaces the string itself!
char* str_mod_C (char* str, size_t slen) {
snprintf (str, slen, "Middle exists just because extremes don't want to meet.");
printf ("Hello from [%s] = [%s]\n", __FILE__, str);
return str_mod (str, strlen(str));
}
test_so.c : calls a function from lib_CCC.c
#include <stdio.h>
#include <string.h>
#include "lib_if.h"
int main () {
char ipStr[] = "DUMA: Power doesn't like to be shared; so, it concentrates.";
printf ("\nOriginal: [%s]\n\n", ipStr);
printf ("Modified: [%s]\n", str_mod_C (ipStr, strlen (ipStr)));
return 0;
}
Compiling Libraries & Programs
export LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH
gcc -c -Wall -Wextra -pedantic -g1 -O2 -std=c17 -march=native -fPIC lib_AAA.c
gcc --shared lib_AAA.o -o libAAA.so
gcc -c -Wall -Wextra -pedantic -g1 -O2 -std=c17 -march=native -fPIC lib_BBB.c
gcc --shared lib_BBB.o -o libBBB.so
gcc -c -Wall -Wextra -pedantic -g1 -O2 -std=c17 -march=native -fPIC lib_CCC.c
gcc --shared lib_CCC.o -o libCCC.so
gcc -Wall -Wextra -pedantic -g1 -O2 -std=c17 -march=native test_so.c -L./ -lCCC -lAAA -o test_CCC_AAA
gcc -Wall -Wextra -pedantic -g1 -O2 -std=c17 -march=native test_so.c -L./ -lCCC -lBBB -o test_CCC_BBB
Running Programs
Library C replaces the string & A capitalises it.
./test_CCC_AAA
Original: [DUMA: Power doesn't like to be shared; so, it concentrates.]
Hello from [lib_CCC.c] = [Middle exists just because extremes don't want to meet.]
Hello from [lib_AAA.c] = Modified: [MIDDLE EXISTS JUST BECAUSE EXTREMES DON'T WANT TO MEET.]
Again, C replaces the string, library B reverses it.
./test_CCC_BBB
Original: [DUMA: Power doesn't like to be shared; so, it concentrates.]
Hello from [lib_CCC.c] = [Middle exists just because extremes don't want to meet.]
Hello from [lib_BBB.c] = Modified: [.teem ot tnaw t'nod semertxe esuaceb tsuj stsixe elddiM]
Replacing libAAA.so with a soft-link to libBBB.so & running test_CCC_AAA
mv libAAA.so libAAA.so.BAK
ln -s libBBB.so libAAA.so
Created a soft-link libAAA.so -> libBBB.so
ldd test_CCC_AAA
linux-vdso.so.1 (0x00007fffb07e7000)
libCCC.so => ./libCCC.so (0x00007f692dac9000)
libAAA.so => ./libAAA.so (0x00007f692dac4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f692d8b5000)
/lib64/ld-linux-x86-64.so.2 (0x00007f692dad5000)
test_CCC_AAA is still linked to libAAA.so but executes libBBB.so's code
./test_CCC_AAA
Original: [DUMA: Power doesn't like to be shared; so, it concentrates.]
Hello from [lib_CCC.c] = [Middle exists just because extremes don't want to meet.]
Hello from [lib_BBB.c] = Modified: [.teem ot tnaw t'nod semertxe esuaceb tsuj stsixe elddiM]
Tested on:
OS : Ubuntu 20.04 LTS x64
GCC : 9.4

C - Print value stored at symbol location

I have a simple command line application (custom dir binary) which I would like to instrument. The debug symbols are enabled, and I can see the global string pointer I'm interested in, the_full_path_name in the output of objdump and nm -D.
Is it possible, in C, to somehow look up that symbol name/location and print the contents of the memory it points at using code injection (i.e. an LD_PRELOAD library with a custom GCC __attribute__((constructor)) and additional functions)? I need to accomplish this without having to attach gdb to the process.
Thank you.
I'm not really sure I understood your question, but does the following help anyway?
File containing global pointer
$ cat global.c
/* defined as a pointer so that it matches the extern char * declarations in the other files */
char *mylongstring = "myname is nulled pointer";
$ gcc -fPIC -shared global.c -o libglobal.so
Original library
$ cat get_orig.c
#include <stdio.h>
extern char * mylongstring;
char *get()
{
mylongstring = "get from orig";
return mylongstring;
}
$ gcc -fPIC -shared get_orig.c -o libget_orig.so -L. -lglobal
Fake Library
$ cat get_dup.c
#include <stdio.h>
extern char * mylongstring;
char *get()
{
mylongstring = "get from dup";
return mylongstring;
}
$ gcc -fPIC -shared get_dup.c -o libget_dup.so -L. -lglobal
Actual consumer of global variable:
$ cat printglobal.c
#include <stdio.h>
char *get();
int main(void)
{
printf("global value=%s\n",get());
return 0;
}
$ gcc printglobal.c -L. -lglobal -lget_orig -o a.out
otool output
$ otool -L a.out
a.out:
libglobal.so (compatibility version 0.0.0, current version 0.0.0)
libget_orig.so (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)
Running a.out
$ DYLD_LIBRARY_PATH=./ ./a.out
global value=get from orig
Replace library
$ cp /tmp/libget_dup.so libget_orig.so
$ DYLD_LIBRARY_PATH=./ ./a.out
global value=get from dup
BTW, I tested this on macOS, so the .so suffix is really a misnomer for .dylib.
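For the lookup that the question itself describes, a preloaded library can ask the dynamic linker for the symbol by name. The following is a minimal sketch, assuming the_full_path_name is a global char * that appears in the dynamic symbol table (as the nm -D output suggests); read_global_string and dump_full_path_name are hypothetical names. It uses a destructor rather than a constructor, on the assumption that the pointer is only set after the program has started.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Returns the string that the global `char *` called symbol_name points to,
 * or NULL if the symbol is not exported or the pointer itself is null. */
static const char *read_global_string(const char *symbol_name)
{
    /* dlopen(NULL, ...) gives a handle for the main program and its dependencies. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return NULL;

    /* For a data symbol, dlsym returns the address of the variable itself. */
    char **slot = dlsym(self, symbol_name);
    return slot != NULL ? *slot : NULL;
}

/* Runs when the process exits, after the program had a chance to set the pointer. */
__attribute__((destructor))
static void dump_full_path_name(void)
{
    const char *value = read_global_string("the_full_path_name");
    fprintf(stderr, "the_full_path_name = %s\n", value ? value : "(unavailable)");
}
```

Built as a shared library (for example with gcc -shared -fPIC and, on older glibc versions, -ldl) and loaded through LD_PRELOAD, this should print the string without attaching gdb. If the symbol only shows up in the regular symbol table and not in nm -D, dlsym will not find it.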

Linking two object files together causes segmentation fault 11

I am experimenting with externs and various methods of linking to better understand the linking process.
I have three files:
foo.c:
#include "foo.h"
int a = 4;
test.c:
#include <stdio.h>
#include "foo.h"
int main(int, char**);
int mymain();
int mymain() {
main(0, 0);
printf("test\r\n");
return 0;
}
int main(int argc, char** argv) {
printf("extern a has %d\r\n", a);
return 0;
}
foo.h:
extern int a; // defined in foo.c
If I build each file together and link at compile time using gcc like this:
gcc *.c -o final.bin
I can execute final.bin as:
./final.bin
and get expected output
extern a has 4
However, if I compile (but don't link) test.c and foo.c separately, then try to link the object files together in a separate step with ld to produce a binary, I get a segmentation fault 11 (which, from what I can gather, is some generic memory corruption bug like a normal segfault).
Here is my makefile I'm using to compile and link separately. Note I am specifying my own entry point and linking against libc to get printf()...
all: test.o foo.o
	#echo "Making all..."
	ld test.o foo.o -o together.bin -lc -e _mymain
test.o: test.c
	#echo "Making test..."
	gcc -c test.c -o test.o
foo.o: foo.c
	#echo "Making foo..."
	gcc -c foo.c -o foo.o
Output when running 'together.bin':
./together.bin
extern a has 4
test
Segmentation fault: 11
Perhaps my function signature for 'mymain' is wrong? My guess is that something is wrong with my 'myentry' usage.
Also, if anyone has any recommendations on good books for how linkers and loaders work, I am certainly in the market for one. I've heard mixed things about 'Linkers and Loaders', so I'm waiting on more opinions before I invest the time in that book in particular.
Thanks for any help on this... My understanding of linkers is sub-par to say the least.
Unless you have a good reason not to, just use gcc to link:
$ gcc test.o foo.o "-Wl,-e,_mymain" -o ./final.bin; ./final.bin
extern a has 4
test
gcc calls ld, though with a few more arguments than you provide in your example. If you want to know exactly how gcc invokes ld, use the -v option. Example:
$ gcc -v test.o foo.o "-Wl,-e,_mymain" -o ./final.bin
Apple LLVM version 8.0.0 (clang-800.0.38)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" -demangle -dynamic -arch x86_64 -macosx_version_min 10.12.0 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk -o ./final.bin test.o foo.o -e _mymain -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin/libclang_rt.osx.a

Only link certain symbols from a library

I am developing an embedded system with GCC, and would like to only use a few symbols from libc. For instance, I would like to use the basic memcpy, memmove, memset, strlen, strcpy, etc. However, I would like to provide my own (smaller) printf function, so I do not want libc to provide printf. I don't want dynamic allocation on this platform, so I do not want malloc to resolve at all.
Is there a way to tell GCC "only provide these symbols" from libc?
edit: To be clear, I am asking if there is a way I can only provide a few specific symbols from a library, not just override a library function with my own implementation. If the code uses a symbol that is in the library but not specified, the linker should fail with "unresolved symbol". If another question explains how to do this, I haven't yet seen it.
This should happen "automatically" as long as your libc and linker setup support it. You haven't said what your platform is, so here is one where it does work.
So, let's create a silly example using snprintf.
/*
* main.c
*/
#include <stdio.h>
int main(int argc, char **argv) {
char l[100];
snprintf(l, 100, "%s %d\n", argv[0], argc);
return 0;
}
try to compile and link it
$ CC=/opt/gcc-arm-none-eabi-4_7-2013q3/bin/arm-none-eabi-gcc
$ CFLAGS="-mcpu=arm926ej-s -Wall -Wextra -O6"
$ LDFLAGS="-nostartfiles -L. -Wl,--gc-sections,-emain"
$ $CC $CFLAGS -c main.c -o main.o
$ $CC $LDFLAGS main.o -o example
/opt/gcc-arm-none-eabi-4_7-2013q3/bin/../lib/gcc/arm-none-eabi/4.7.4/../../../../arm-none-eabi/lib/libc.a(lib_a-sbrkr.o): In function `_sbrk_r':
sbrkr.c:(.text._sbrk_r+0x18): undefined reference to `_sbrk'
collect2: error: ld returned 1 exit status
It needs _sbrk because newlib *printf functions use malloc which needs a way to allocate system memory. Let's provide it a dummy one.
/*
* sbrk.c
*/
#include <stdint.h>
#include <unistd.h>
void *_sbrk(intptr_t increment) {
return 0;
}
and compile it
$ $CC $CFLAGS -c sbrk.c -o sbrk.o
$ $CC $LDFLAGS -Wl,-Map,"sbrk.map" main.o sbrk.o -o with-sbrk
$ /opt/gcc-arm-none-eabi-4_7-2013q3/bin/arm-none-eabi-size with-sbrk
text data bss dec hex filename
28956 2164 56 31176 79c8 with-sbrk
Well, that's the reason you'd like to get rid of printf and friends, isn't it? Now, replace snprintf with our function
/*
* replace.c
*/
#include <stdio.h>
#include <string.h>
int snprintf(char *str, size_t size, const char *format, ...) {
return strlen(format);
}
then compile
$ $CC $CFLAGS -c replace.c -o replace.o
$ $CC $LDFLAGS -Wl,-Map,"replace.map" main.o replace.o -o with-replace
$ /opt/gcc-arm-none-eabi-4_7-2013q3/bin/arm-none-eabi-size with-replace
text data bss dec hex filename
180 0 0 180 b4 with-replace
Note that we did not use the _sbrk stub at all. As long as you don't provide _sbrk, you can be sure that malloc is not (can't be) linked and used.
The simplest solution is probably to use a wrapper which defines the symbols and resolves them at runtime using dlfcn:
#include <dlfcn.h>
void* (*memcpy)(void *dest, const void *src, size_t n);
char* (*strncpy)(char *dest, const char *src, size_t n);
...
void init_symbols (void) {
void *handle = dlopen("/lib/libc.so.6", RTLD_LAZY);
memcpy = dlsym(handle, "memcpy");
strncpy = dlsym(handle, "strncpy");
...
}
and link your binary with -nostdlib. This gives you the best control on which symbols to use from which source.
