I am trying to use AddressSanitizer for a multithreaded program that uses pthread library functions. However, instead of linking with the -lpthread flag when compiling with gcc, I link dynamically with -ldl -rdynamic $filename.so, and that library provides the implementation of the pthread functions. I have read that AddressSanitizer should be the first linked library, so I added it like this:
export LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libasan.so
The problem is that the program gets stuck at pthread_create when it runs. As I understand it, because this function is not provided statically, it is looked up in the dynamically linked libraries: libasan.so is searched first and the call blocks there, never reaching the second library, which actually contains the implementation of pthread_create.
I also tried using -fsanitize=address (with -lasan) to link with AddressSanitizer during compilation but I got the same result.
Is there anything else I could try to make these two libraries work together?
So essentially I want to compile a c program statically with gcc, and I want it to be able to link c stdlib functions, but I want it to start at main, and not include the _start function as well as the libc init stuff that happens before main. Normally when you want to compile a program without _start, you run gcc with the -nostdlib flag, but I want to also be able to include code from stdlib, just not the libc init. Is there any way to do this?
I know that this could cause a lot of problems, but for my use case I'm not actually running the c program itself so it makes sense to do this.
Thanks in advance
The -nostdlib option tells the linker not to use the startup files (i.e. the code that is executed before main) or the standard libraries. From the GCC documentation:
-nostdlib
Do not use the standard system startup files or libraries when linking.
No startup files and only the libraries you specify are
passed to the linker, and options specifying linkage of the system
libraries, such as -static-libgcc or -shared-libgcc, are ignored.
The compiler may generate calls to memcmp, memset, memcpy and memmove.
These entries are usually resolved by entries in libc. These
entry points should be supplied through some other mechanism when this
option is specified.
It is frequent to use this option in low-level bare-metal programming in order to control exactly what is going on.
You can still use the functions of your libc by adding -lc. However, keep in mind that some libc functions depend on the startup code. For example, in some implementations printf requires dynamic memory allocation, and the heap is initialized by the startup code.
I'm new to using C.
Header files for libraries like stdlib do not contain the actual implementation code for the functions they provide access to. I understand that the actual source text for libraries like this aren't needed to compile, but how does this work specifically? Are the implementation details for these libraries contained within the compiler?
When you use a function like printf(), including the header file essentially pastes in code for the declaration of the function, but normally the implementation code would need to be available as well.
What form is it stored in? (and where?) Is this compiler specific? Would it be possible to write custom code and reference it in this way without modifying the behavior of the compiler?
I've been searching around and found some info that is relevant but nothing specific. This could be related to not formulating the question well. Thanks.
When you link a program, the compiler will implicitly add some extra libraries to your program:
$ ls
main.c
$ cc -c main.c
$ cc main.o
$ ls
main.c main.o a.out
You can discover the extra libraries a program uses with ldd. Here, there are three libraries linked into the program, and I didn't ask for any of them:
$ ldd a.out
linux-vdso.so => (0x00...)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00...)
/lib64/ld-linux-x86-64.so.2 (0x00...)
So, what happens if we link without these libraries? That's easy enough, just use the linker (ld) directly, instead of calling it through cc. When you use ld, it doesn't give you these extra libraries, so you get an error:
$ ld main.o
Undefined symbols:
"_printf", referenced from:
_main in main.o
The implementation for printf() is stored in the standard C library, which is usually just another library on your system... the only difference is that it gets automatically included into your program when you compile C.
You can use nm to find out what symbols are in a library, so I can use it to find printf() in libc:
$ nm -D /lib/x86_64-linux-gnu/libc-2.13.so | grep printf
...
000000000004e4b0 T printf
...
So, now that we know that libc has printf(), we can use -lc to tell the linker to include libc, and that will get rid of the errors about printf() being missing:
$ ld main.o -lc
There might be some other bits missing, and that's why we use cc to link our programs instead of ld: cc gives us all the default libraries.
When you compile a file, you only need to promise the compiler that certain functions and symbols exist. A function call is compiled into a call [some_address].
The compiler compiles each C file into an object file that just has placeholders for calls to the functions declared in the headers. That is, [some_address] does not need to be known at this point.
A number of object files can be collected into what is known as a library.
After that, it is the linker's job to look through all the object files and libraries it knows of, find the real value of every unknown [some_address], and patch the call to, e.g., call 0x1234 if the function you are calling starts at 0x1234 (or to a relative offset from the current program counter).
Stdlib and other library functions are implemented in an object library. A library is a collection of code that is linked with your program. By default, C programs are linked against the standard C library, which is usually provided by the operating system. Most modern operating systems use a dynamic linker: your program is not actually linked against the library until it is executed. When the program is loaded, the linker-loader maps your code and the library code into your program's address space. Your code can then call the printf() code that is located in that library.
Usually a header file contains only a function prototype, while the implementation is either in a separate source file or in a precompiled library. In the case of stdlib (and other libraries, whether shipped with a compiler or available separately), the precompiled library gets linked at the end of the compilation process. (There's also a distinction between static and dynamic libraries, but I won't go into detail about that.)
The implementations of standard libraries (which are shipped with a compiler) are usually compiler specific: there is a standard describing which functions have to be in a library, but the compiler author can decide exactly how to implement them. It is (in theory) possible to replace these libraries with your own without modifying the behaviour of the compiler, though this is not recommended, as you would have to rewrite the entire library to ensure that all functions are present.
I recently wrote a few replacements for string routines (memcpy, memset, and memmove). It is my understanding that if the library containing these routines is specified on the compile / link line, these will take precedence over system standard library routines of the same name. If I'm wrong already, please let me know!
This works correctly in all testing I did (verified by disassembly that the correct routines are there and glibc routines don't exist), but further testing discovered an odd break caused by this:
1) build another file in the same library with -g (I had been building -O2)
2) this file has an explicit call to memset
3a) if the compile time options work in such a way that this memset is inlined by gcc, everything is OK
3b) if, however, the options disable the inlining of a memset call which would have been inlined otherwise, the library will build but using the library to statically link an application causes a duplicate symbol linker error - the other instance of the symbol is the system library's memset.
Basically I can build two versions of my library (100's of source files), and by changing the make CFLAGS in one directory from -O1 -g to just -g I can trigger the linker error when this library is used.
I can take the working version, run it through nm, and see that it has many undefined references to memset including in routines which are linked into my test case - so I know it should be trying to resolve memset in the working case. When I diff this against the nm output for the broken library, all I see is a few extra undefined memcpy and memset references. If memset resolved in the first case (to my routine), it should in the second.
I have also looked at the verbose compiler output and verified that the link lines are exactly the same in both cases, except for the path to this one library.
There are two super puzzling things here (among a myriad of other issues):
1) Why would a file in a library built -O1 -g link any differently than -g
2) Why would a replacement memset, in a user library, conflict with the system memset
And for the grand prize, how does 1) cause 2)
It took a long time to come up with this solution, but it makes sense now:
1) Higher optimization enabled gcc to inline bzero, which had no other references to it at link time. The memset calls it was inlining / not inlining here were red herrings.
2) bzero is in the same file as memset in libc.a : memset.o. When ld tried to pull in memset.o to satisfy the bzero request it got the duplicate memset symbol.
(1) causes (2).
The solution was to provide my own bzero routine in my library, stopping libc's memset.o from ever being needed.
GCC provides a large number of built-in versions of standard library functions. These are provided for optimization purposes.
Many of these functions are optimized in only certain cases. If they
are not optimized in a certain case, a call to the library function is
emitted.
Hence a library built -O1 -g would link differently than -g.
I have used dlsym to create a malloc/calloc wrapper in the efence code so that I can access the libc malloc (occasionally, in addition to the efence malloc/calloc). Now when I link it and run it, it gives the following error: "RTLD_NEXT used in code not dynamically loaded"
bash-3.2# /tool/devel/usr/bin/gcc -g -L/tool/devel/usr/lib/ efence_time_interval_measurement_test.c -o dev.out -lefence -ldl -lpthread
bash-3.2# export LD_LIBRARY_PATH=/tool/devel/usr/lib/
bash-3.2# ./dev.out
eFence: could not resolve 'calloc' in 'libc.so': RTLD_NEXT used in code not dynamically loaded
Now, if i use "libefence.a" it is happening like this:
bash-3.2# /tool/devel/usr/bin/gcc -g -L/tool/devel/usr/lib/ -static efence_time_interval_measurement_test.c -o dev.out -lefence -ldl -lpthread
/tool/devel/usr/lib//libefence.a(page.o): In function `stringErrorReport':
/home/raj/eFence/BUILD/electric-fence-2.1.13/page.c:50: warning: `sys_errlist' is deprecated; use `strerror' or `strerror_r' instead
/home/raj/eFence/BUILD/electric-fence-2.1.13/page.c:50: warning: `sys_nerr' is deprecated; use `strerror' or `strerror_r' instead
/tool/devel/usr/lib//libc.a(malloc.o): In function `__libc_free':
/home/rpmuser/rpmdir/BUILD/glibc-2.9/malloc/malloc.c:3595: multiple definition of `free'
/tool/devel/usr/lib//libefence.a(efence.o):/home/raj/eFence/BUILD/electric-fence-2.1.13/efence.c:790: first defined here
/tool/devel/usr/lib//libc.a(malloc.o): In function `__libc_malloc':
/home/rpmuser/rpmdir/BUILD/glibc-2.9/malloc/malloc.c:3551: multiple definition of `malloc'
/tool/devel/usr/lib//libefence.a(efence.o):/home/raj/eFence/BUILD/electric-fence-2.1.13/efence.c:994: first defined here
/tool/devel/usr/lib//libc.a(malloc.o): In function `__libc_realloc':
/home/rpmuser/rpmdir/BUILD/glibc-2.9/malloc/malloc.c:3647: multiple definition of `realloc'
/tool/devel/usr/lib//libefence.a(efence.o):/home/raj/eFence/BUILD/electric-fence-2.1.13/efence.c:916: first defined here
Please help me. Is there any problem in linking?
NO ONE IN STACK OVERFLOW WHO CAN RESOLVE THIS
The problem is with your question, not with us ;-)
First off, efence is most likely the wrong tool to use on a Linux system. For most bugs that efence can find, Valgrind can find them and describe them to you (so you could fix them) much more accurately. The only good reason for you to use efence is if your application runs for many hours, and Valgrind is too slow.
Second, efence is not intended to work with static linking, so the errors you get with -static flag are not at all surprising.
Last, you didn't tell us what libc is installed on your system (in /lib), and what libraries are present in /tool/devel/usr/lib/. It is exceedingly likely that there is a libc.so.6 present in /tool/devel/usr/lib, and that its version does not match the one installed in /lib.
That would explain the RTLD_NEXT used in code not dynamically loaded error. The problem is that glibc consists of multiple binaries, which all must match exactly. If the system has e.g. libc-2.7 installed, then you are using /lib/ld-linux.so.2 from glibc-2.7 (the dynamic loader is hard-coded into every executable and is not affected by environment variables), and mixing it with libc.so.6 from glibc-2.9. The usual result of doing this is a SIGSEGV, weird unresolved symbol errors, and other errors that make no sense.
I wrote a small thread program. When I compiled it with cc filename.c, I got some messages during compilation, but when I compiled it with -lpthread (cc filename.c -lpthread), it built and ran. What is -lpthread, and why is it required? Can anyone explain this in detail? It would be of great help.
The pthread_create() function that you use in your program is not a basic C function, and requires that you use a library.
This is why you have to use this command switch -lpthread.
This option tells gcc to look for a library named libpthread somewhere on your disk and to link it in to provide the thread creation mechanisms.
I suggest you read this to get familiar with the "library" concept: http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
The -l option is typically used to specify a library (in this case, the pthread library) that should be linked with your program.
Since the thread functions often live in a separate library, you need an option like this when building a program that uses them, or you will get linker errors.
pthread stands for POSIX Threads. It's the standard library for threads in Unix-like POSIX environments.
Since you are going to use pthread you need to tell the compiler to link to that library.
You can read more about what pthreads is and how it works here: https://computing.llnl.gov/tutorials/pthreads/