Threaded shared library for non threaded application - c

I have an application for which I need to write an extension as a shared library. My shared library needs to use threads, but the main application neither uses threads nor is linked with a threads library (libpthread.so, for example).
As first tests showed, my library crashes the main application. If I use the LD_PRELOAD hack, the crashes go away:
LD_PRELOAD=/path/to/libpthread.so ./app
The only OS where I get no segfaults without the LD_PRELOAD hack is OS X. On the others it just crashes. I tested Linux, FreeBSD, and NetBSD.
My question is: is there a way to make my threaded shared library safe for a non-threaded application without changing the main application and without LD_PRELOAD hacks?
To reproduce the problem I wrote a simple example:
mylib.c
#include <pthread.h>
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void *_thread(void *arg) {
int i;
struct addrinfo *res;
for (i=0; i<10000; i++) {
if (getaddrinfo("localhost", NULL, NULL, &res) == 0) {
if (res) freeaddrinfo(res);
}
}
pthread_mutex_lock(&mutex);
printf("Just another thread message!\n");
pthread_mutex_unlock(&mutex);
return NULL;
}
void make_thread() {
pthread_t tid[10];
int i, rc;
for (i=0; i<10; i++) {
rc = pthread_create(&tid[i], NULL, _thread, NULL);
assert(rc == 0);
}
void *rv;
for (i=0; i<10; i++) {
rc = pthread_join(tid[i], &rv);
assert(rc == 0);
}
}
main.c
#include <stdio.h>
#include <dlfcn.h>
int main() {
void *mylib_hdl;
void (*make_thread)();
mylib_hdl = dlopen("./libmy.so", RTLD_NOW);
if (mylib_hdl == NULL) {
printf("dlopen: %s\n", dlerror());
return 1;
}
make_thread = (void (*)()) dlsym(mylib_hdl, "make_thread");
if (make_thread == NULL) {
printf("dlsym: %s\n", dlerror());
return 1;
}
(*make_thread)();
return 0;
}
Makefile
all:
cc -pthread -fPIC -c mylib.c
cc -pthread -shared -o libmy.so mylib.o
cc -o main main.c -ldl
clean:
rm *.o *.so main
And all together: https://github.com/olegwtf/sandbox/tree/bbbf76fdefe4bacef8a0de7a2475995719ae0436/threaded-so-for-non-threaded-app
$ make
cc -pthread -fPIC -c mylib.c
cc -pthread -shared -o libmy.so mylib.o
cc -o main main.c -ldl
$ ./main
*** glibc detected *** ./main: double free or corruption (fasttop): 0x0000000001614c40 ***
Segmentation fault
$ ldd libmy.so | grep thr
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe7e2591000)
$ LD_PRELOAD=/lib/x86_64-linux-gnu/libpthread.so.0 ./main
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!
Just another thread message!

My question is: is there a way to make my threaded shared library safe
for non-threaded application without changing of the main application
and LD_PRELOAD hacks?
No, those are the two ways you can make it work. With neither in place, your program is invalid.

dlopen is supposed to do the right thing, and to open all the libraries your own .so depends upon.
In fact, your code is working for me if I comment out the address lookup code that you placed inside your thread function. So loading the pthread library works perfectly.
And if I run the code including the lookup, valgrind shows me that the crash is below getaddrinfo.
So the problem is not that the libraries aren't loaded; somehow their initialization code is either not executed or not executed in the right order.
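The claim that dlopen also brings in the dependencies can be checked from a test host after the dlopen call. The following is only a diagnostic sketch, not a fix: it assumes glibc's RTLD_NOLOAD flag (a non-POSIX extension), and the helper name is_library_loaded is hypothetical. On glibc 2.34 and later the thread functions live in libc itself, so the check says little there.

```c
#define _GNU_SOURCE /* not strictly required by glibc for RTLD_NOLOAD */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a shared object with the given
 * name is already mapped into this process, 0 otherwise.
 * With RTLD_NOLOAD, dlopen() never loads anything new; it only
 * returns a handle when the object is already resident. */
int is_library_loaded(const char *soname)
{
    void *handle = dlopen(soname, RTLD_LAZY | RTLD_NOLOAD);
    if (handle == NULL)
        return 0;
    dlclose(handle); /* drop the extra reference taken above */
    return 1;
}
```

In a test host like main.c from the question, calling is_library_loaded("libpthread.so.0") before and after dlopen("./libmy.so", RTLD_NOW) should show whether the threads library arrived together with the plugin. Older glibc versions need -ldl at link time.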

gdb helped me understand what's going on with this example.
After 3 tries gdb showed that the app always crashed at rewind.c line 36 inside libc. Since the tests were run on Debian 7, the libc implementation is eglibc. Here you can see line 36 of rewind.c:
http://www.eglibc.org/cgi-bin/viewvc.cgi/branches/eglibc-2_13/libc/libio/rewind.c?annotate=12752
_IO_acquire_lock() is a macro, and after grepping the eglibc source I found 2 places where it is defined:
bits/stdio-lock.h line 49: http://www.eglibc.org/cgi-bin/viewvc.cgi/branches/eglibc-2_13/libc/bits/stdio-lock.h?annotate=12752
sysdeps/pthread/bits/stdio-lock.h line 91: http://www.eglibc.org/cgi-bin/viewvc.cgi/branches/eglibc-2_13/libc/nptl/sysdeps/pthread/bits/stdio-lock.h?annotate=12752
The comment for the first says "Generic version" and for the second "NPTL version", where NPTL is the Native POSIX Thread Library. In short, the first defines a non-threaded implementation of this and several other macros, and the second a threaded implementation.
When our main application is not linked with pthreads, it starts up and loads the first, non-threaded implementation of _IO_acquire_lock() and the other macros. Then it opens our threaded shared library and executes a function from it. That function uses the already loaded, non-thread-safe version of _IO_acquire_lock(), when it should in fact use the thread-compatible version defined by pthreads. This is where the segfault occurs.
This is how it works on Linux. On *BSD the situation is even sadder. On FreeBSD your program hangs as soon as your threaded library tries to create a new thread. On NetBSD, instead of hanging, the program is terminated with SIGABRT.
So, answering the main question: is it possible to use a threaded shared library from an application not linked with pthreads?
In general, no. In particular, it depends on the libc implementation. On OS X, for example, this works without any problems. On Linux it works as long as you don't use libc functions that rely on those special macros redefined by pthreads. But how do you know which functions do? OK, you can compute 1+1; that looks safe. On *BSD your program will crash or hang immediately, no matter what your thread does.

Related

Where do Linux shells look for interpreters for ELF binaries? [duplicate]

So everyone probably knows that glibc's /lib/libc.so.6 can be executed in the shell like a normal executable in which cases it prints its version information and exits. This is done via defining an entry point in the .so. For some cases it could be interesting to use this for other projects too. Unfortunately, the low-level entry point you can set by ld's -e option is a bit too low-level: the dynamic loader is not available so you cannot call any proper library functions. glibc for this reason implements the write() system call via a naked system call in this entry point.
My question now is, can anyone think of a nice way how one could bootstrap a full dynamic linker from that entry point so that one could access functions from other .so's?
Update 2: see Andrew G Morgan's slightly more complicated solution which does work for any GLIBC (that solution is also used in libc.so.6 itself (since forever), which is why you can run it as ./libc.so.6 (it prints version info when invoked that way)).
Update 1: this no longer works with newer GLIBC versions:
./a.out: error while loading shared libraries: ./pie.so: cannot dynamically load position-independent executable
Original answer from 2009:
Building your shared library with -pie option appears to give you everything you want:
/* pie.c */
#include <stdio.h>
int foo()
{
printf("in %s %s:%d\n", __func__, __FILE__, __LINE__);
return 42;
}
int main()
{
printf("in %s %s:%d\n", __func__, __FILE__, __LINE__);
return foo();
}
/* main.c */
#include <stdio.h>
extern int foo(void);
int main()
{
printf("in %s %s:%d\n", __func__, __FILE__, __LINE__);
return foo();
}
$ gcc -fPIC -pie -o pie.so pie.c -Wl,-E
$ gcc main.c ./pie.so
$ ./pie.so
in main pie.c:9
in foo pie.c:4
$ ./a.out
in main main.c:6
in foo pie.c:4
$
P.S. glibc implements write(3) via system call because it doesn't have anywhere else to call (it is the lowest level already). This has nothing to do with being able to execute libc.so.6.
I have been looking to add support for this to pam_cap.so, and found this question. As @EmployedRussian notes in a follow-up to their own post, the accepted answer stopped working at some point. It took a while to figure out how to make this work again, so here is a worked example.
This worked example involves 5 files to show how things work with some corresponding tests.
First, consider this trivial program (call it empty.c):
int main(int argc, char **argv) { return 0; }
Compiling it, we can see how it resolves the dynamic symbols on my system as follows:
$ gcc -o empty empty.c
$ objcopy --dump-section .interp=/dev/stdout empty ; echo
/lib64/ld-linux-x86-64.so.2
$ DL_LOADER=/lib64/ld-linux-x86-64.so.2
That last line sets a shell variable for use later.
Here are the two files that build my example shared library:
/* multi.h */
void multi_main(void);
void multi(const char *caller);
and
/* multi.c */
#include <stdio.h>
#include <stdlib.h>
#include "multi.h"
void multi(const char *caller) {
printf("called from %s\n", caller);
}
__attribute__((force_align_arg_pointer))
void multi_main(void) {
multi(__FILE__);
exit(42);
}
const char dl_loader[] __attribute__((section(".interp"))) =
DL_LOADER ;
(Update 2021-11-13: The forced alignment is to help __i386__ code be SSE compatible - without it we get hard to debug glibc SIGSEGV crashes.)
We can compile and run it as follows:
$ gcc -fPIC -shared -o multi.so -DDL_LOADER="\"${DL_LOADER}\"" multi.c -Wl,-e,multi_main
$ ./multi.so
called from multi.c
$ echo $?
42
So, this is a .so that can be executed as a stand alone binary. Next, we validate that it can be loaded as shared object.
/* opener.c */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
void *handle = dlopen("./multi.so", RTLD_NOW);
if (handle == NULL) {
fprintf(stderr, "no multi.so load: %s\n", dlerror()); /* dlopen() reports errors via dlerror(), not errno */
exit(1);
}
void (*multi)(const char *) = dlsym(handle, "multi");
multi(__FILE__);
}
That is, we dynamically load the shared object and run a function from it:
$ gcc -o opener opener.c -ldl
$ ./opener
called from opener.c
Finally, we link against this shared object:
/* main.c */
#include "multi.h"
int main(int argc, char **argv) {
multi(__FILE__);
}
Where we compile and run it as follows:
$ gcc main.c -o main multi.so
$ LD_LIBRARY_PATH=./ ./main
called from main.c
(Note, because multi.so isn't in a standard system library location, we need to override where the runtime looks for the shared object file with the LD_LIBRARY_PATH environment variable.)
I suppose you'd have your ld -e point to an entry point which would then use the dlopen() family of functions to find and bootstrap the rest of the dynamic linker. Of course you'd have to ensure that dlopen() itself was statically linked, or else you might have to implement enough of your own linker stub to get at it (using system call interfaces such as mmap(), just as libc itself is doing).
None of that sounds "nice" to me. In fact just the thought of reading the glibc sources (and the ld-linux source code, as one example) enough to assess the size of the job sounds pretty hoary to me. It might also be a portability nightmare. There may be major differences between how Linux implements ld-linux and how the linkages are done under OpenSolaris, FreeBSD, and so on. (I don't know).

what can be called from -fini function of shared library?

I am using a shared library with LD_PRELOAD, and it seems that I can't call some functions from the function set with -fini= ld option. I am running Linux Ubuntu 20.04 on a 64-bit machine.
Here is the SSCCE:
shared.sh:
#!/bin/bash
gcc -shared -fPIC -Wl,-init=init -Wl,-fini=fini shared.c -o shared.so
LD_PRELOAD=$PWD/shared.so whoami
shared.c:
#include <stdio.h>
#include <unistd.h>
void init() {
printf("%s\n", __func__);
fflush(stdout);
}
void fini() {
int printed;
printed = printf("%s\n", __func__);
if (printed < 0)
sleep(2);
fflush(stdout);
}
When I call ./shared.sh , I get
init
mark
and 2 second pause.
So it seems printf() fails in fini() but sleep() succeeds (errno values are not specified for printf, so I don't check errno). Why, and what kind of functions can I call from fini? The ld manpage does not mention any restrictions.
The initialization functions of each dynamically linked component are executed in the order in which the components are loaded. In particular, if A depends on B but B does not depend on A, then B's initialization functions run before A's. The termination functions of each dynamically linked component are executed in the order in which the components are unloaded. In particular, if A depends on B but B does not depend on A, then B's termination functions run after A's. Generally, termination functions run in reverse order from initialization functions, but I don't know if that's true in all cases (for example when there are circular dependencies). You can find the rules in the System V ABI specification, which Linux and many other Unix variants follow. Note that the rules leave some cases unspecified; they might depend on the compiler and on the standard library (possibly on the kernel, but I think for this particular topic it doesn't matter).
A shared library loaded with LD_PRELOAD is loaded before the main executable, so its initialization functions run before the ones from libc and its termination functions run after the ones from libc. In particular, libc flushes standard streams and closes the file descriptors for the output streams. You can see this happening by tracing system calls:
$ strace env LD_PRELOAD=$PWD/shared.so whoami
…
write(1, "gilles\n", 6gilles
) = 6
close(1) = 0
close(2) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=2, tv_nsec=0}, 0x7ffc12bd2df0) = 0
exit_group(0) = ?
+++ exited with 0 +++
The call to clock_nanosleep is sleep(2). The calls to printf and fflush happen just before; since stdout has been closed, they do nothing and return -1. Check the return value or use a debugger to confirm this.
Contrast with what happens if shared.so is linked normally, rather than preloaded.
$ cat main.c
#include <stdio.h>
int main(void) {
puts("main");
return 0;
}
$ gcc -o main main.c -Wl,-rpath,. -Wl,--no-as-needed -L. -l:shared.so
$ ./main
init
main
fini
Here, since main loads shared.so, the shared library is initialized last and terminated first. So by the time the fini function in shared.so runs, libc hasn't run its termination functions and the standard streams are still available.
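A termination function can guard against the preloaded case by checking whether the descriptor is still open before it tries to print. The sketch below is an illustration under assumptions, not part of the original answer: the names fd_is_open and fini_checked are hypothetical, and it uses fcntl(F_GETFD), which fails with EBADF on a closed descriptor.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open file descriptor, 0 if it is closed. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical termination function: stays silent once standard
 * output has already been closed by the program or by libc. */
void fini_checked(void)
{
    static const char msg[] = "fini\n";
    if (fd_is_open(STDOUT_FILENO)) {
        /* write(2) goes straight to the descriptor and does not
         * depend on the state of the stdio buffer. */
        ssize_t n = write(STDOUT_FILENO, msg, sizeof msg - 1);
        (void)n;
    }
}
```

Built into shared.so with -Wl,-fini=fini_checked, this would print nothing in the preloaded whoami case, where the trace shows close(1) before the termination function runs, and print "fini" when the library is linked normally.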

Why shared library is unloaded while another program still uses it?

As far as I understand, if more than one program is using a shared library, the shared library won't get unloaded until all of those programs finish.
I am reading The Linux Programming Interface:
42.4 Initialization and Finalization Functions
It is possible to define one or more functions that are executed automatically when a
shared library is loaded and unloaded. This allows us to perform
initialization and finalization actions when working with shared
libraries. Initialization and finalization functions are executed
regardless of whether the library is loaded automatically or loaded
explicitly using the dlopen interface (Section 42.1).
Initialization and finalization functions are defined using the gcc
constructor and destructor attributes. Each function that is to be
executed when the library is loaded should be defined as follows:
void __attribute__ ((constructor)) some_name_load(void)
{
/* Initialization code */
}
Unload functions are similarly defined:
void __attribute__ ((destructor)) some_name_unload(void)
{
/* Finalization code */
}
The function names `some_name_load()` and `some_name_unload()` can be replaced by any desired names. ....
Then I wrote 3 files to test:
foo.c
#include <stdio.h>
void __attribute__((constructor)) call_me_when_load(void){
printf("Loading....\n");
}
void __attribute__((destructor)) call_me_when_unload(void){
printf("Unloading...\n");
}
int xyz(int a ){
return a + 3;
}
main.c
#include <stdio.h>
#include <unistd.h>
int main(){
int xyz(int);
int b;
for(int i = 0;i < 1; i++){
b = xyz(i);
printf("xyz(i) is: %d\n", b);
}
}
main_while_sleep.c
#include <stdio.h>
#include <unistd.h>
int main(){
int xyz(int);
int b;
for(int i = 0;i < 10; i++){
b = xyz(i);
sleep(1);
printf("xyz(i) is: %d\n", b);
}
}
Then I compile a shared library and 2 executables:
gcc -g -Wall -fPIC -shared -o libdemo.so foo.c
gcc -g -Wall -o main main.c libdemo.so
gcc -g -Wall -o main_while_sleep main_while_sleep.c libdemo.so
Finally, I run LD_LIBRARY_PATH=. ./main_while_sleep in one shell and LD_LIBRARY_PATH=. ./main in another:
main_while_sleep output:
Loading....
xyz(i) is: 3
xyz(i) is: 4
xyz(i) is: 5
xyz(i) is: 6
xyz(i) is: 7
xyz(i) is: 8
xyz(i) is: 9
xyz(i) is: 10
xyz(i) is: 11
xyz(i) is: 12
Unloading...
main output:
Loading....
xyz(i) is: 3
Unloading...
My question is: while main_while_sleep has not finished, why is "Unloading..." printed by main, which suggests the shared library has been unloaded? The shared library shouldn't be unloaded yet, because main_while_sleep is still running!
Am I getting something wrong?
My question is, while main_while_sleep is not finished, why Unloading is printed in main, which indicates the shared library has been unloaded? The shared library shouldn't be unloaded yet, main_while_sleep is still running!
You are confusing/conflating initialization/deinitialization with load/unload.
A constructor is an initialization function that is called after a shared library has been mapped into a given process's memory.
It does not affect any other process (which is in a separate, per-process address space).
Likewise, the mapping (or unmapping) of a shared library in a given process does not affect any other process.
When a process maps a library, nothing is "loaded". When the process tries to access a memory page that is part of the shared library, it receives a page fault and the given page is mapped, the page is marked resident, and the faulting instruction is restarted.
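That a constructor runs separately in each process can be illustrated with a small sketch that records which process ran it. This is a hypothetical example, not from the original answer: the names record_loading_process and demo_initialized_pid are made up for illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Every process that maps this object gets its own private copy. */
static pid_t initialized_in_pid;

/* Runs once in each process that maps this object, before main()
 * or before dlopen() returns. Other processes are not affected. */
__attribute__((constructor))
static void record_loading_process(void)
{
    initialized_in_pid = getpid();
}

/* Hypothetical accessor: the PID of the process whose constructor ran. */
pid_t demo_initialized_pid(void)
{
    return initialized_in_pid;
}
```

If this were compiled into libdemo.so, main and main_while_sleep would each see their own PID here, which matches the observation that each program prints its own "Loading...." and "Unloading..." lines.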
There is much more detail in my answers:
How does mmap improve file reading speed?
Which segments are affected by a copy-on-write?
read line by line in the most efficient way *platform specific*
Is Dynamic Linker part of Kernel or GCC Library on Linux Systems?
Malloc is using 10x the amount of memory necessary

Linker can't find semaphore functions

I'm trying to write a C program that will execute subprocesses which interact using semaphores.
When I compile the code, gcc reports a symbol referencing error because it doesn't know about the functions "sem_init", "sem_post" and "sem_wait", even though I include semaphore.h.
Here's how it looks:
Code:
#include <stdio.h>
#include <semaphore.h>
#include <pthread.h>
#include <unistd.h>
#define LETTER_COUNT 26
#define THREADS 2
char letter[LETTER_COUNT] = "aBCDefghiJklMNoPqrsTuvWxyZ";
pthread_t t[THREADS];
sem_t sem[THREADS];
void print_letter(void) {
//print string
}
void* reorder(void* d) {
(void)d;
//do some work
return NULL;
}
void* switch_case(void* d) {
(void)d;
//do some work
return NULL;
}
int main(void) {
int i;
for(i = 0; i < THREADS; i++) {
if(sem_init(&sem[i], 0, 0) == -1) {
perror("sem_init");
return -1;
}
}
pthread_create(&t[0], NULL, reorder, NULL);
pthread_create(&t[1], NULL, switch_case, NULL);
while(1) {
i = (i + 1) % (THREADS - 1);
sem_post(&sem[i]);
sem_wait(&sem[2]);
print_letter();
sleep(1);
}
return 0;
}
Error:
gcc -Wall task4.c -o task4.o
Undefined first referenced
symbol in file
sem_init /var/tmp//cc0i56ka.o
sem_post /var/tmp//cc0i56ka.o
sem_wait /var/tmp//cc0i56ka.o
ld: fatal: symbol referencing errors. No output written to task4.o
collect2: ld returned 1 exit status
I've tried to find information about this problem, but I can't find any working solution. Maybe I should use some compilation flag (like -lsocket)?
As per man sem_init (and friends)
gcc -Wall task4.c -o task4.o -lpthread
On some systems, the 'librt' shared library is built against the shared libpthread, so linking with -lrt implies -lpthread. However, the man page indicates that the proper way to link is to use -pthread; see below. Note that -pthread enables multithreading semantics as needed: usually that means -lpthread, but it can also involve other libraries, flags, or #defines. For example, with GCC on Mint 19 it defines _REENTRANT (as if -D_REENTRANT had been passed).
From man sem_init
NAME
sem_init - initialize an unnamed semaphore
SYNOPSIS
#include <semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
Link with -lpthread.
From man gcc
Options Controlling the Preprocessor
-pthread
Define additional macros required for using the POSIX threads library. You should use this option consistently for both compilation
and linking. This option is supported on GNU/Linux targets, most other Unix derivatives, and also on x86 Cygwin and MinGW targets.
Options for Linking
-pthread
Link with the POSIX threads library. This option is supported on GNU/Linux targets, most other Unix derivatives, and also on x86
Cygwin and MinGW targets. On some targets this option also sets flags for the preprocessor, so it should be used consistently for both
compilation and linking.
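As a minimal sketch of the calls that failed to link, the following uses an unnamed semaphore to wait for a worker thread. It is an illustration only, not the asker's program: the names worker and semaphore_handshake and the value 42 are hypothetical. It is meant to be built with the flag discussed above, for example gcc -Wall -pthread file.c.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t ready;      /* posted by the worker when its result is stored */
static int shared_value; /* written by the worker before the post */

static void *worker(void *arg)
{
    (void)arg;
    shared_value = 42;
    sem_post(&ready);
    return NULL;
}

/* Returns the value produced by the worker, or -1 on error. */
int semaphore_handshake(void)
{
    pthread_t tid;
    int result = -1;
    int rc;

    if (sem_init(&ready, 0, 0) == -1) /* pshared = 0: threads of one process */
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) == 0) {
        do {
            rc = sem_wait(&ready); /* retry if interrupted by a signal */
        } while (rc == -1 && errno == EINTR);
        if (rc == 0)
            result = shared_value;
        pthread_join(tid, NULL);
    }
    sem_destroy(&ready);
    return result;
}
```

On systems where the semaphore functions live in a separate threads library, leaving out -pthread should reproduce the undefined-symbol errors for sem_init, sem_post and sem_wait.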

Buffering `printf` outputs between different threads in Linux

Here is my code:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <pthread.h>
pthread_t ntid;
void
printids(const char *s) {
printf("%s \n", s);
}
void *
thr_fn(void *arg) {
printids("new thread: ");
return((void *)0);
}
int
main(void) {
pthread_create(&ntid, NULL, thr_fn, NULL);
printids("main thread:");
}
I'm running it on Red Hat Enterprise Linux Workstation release 6.5 .
Here is my compiling command
gcc -ansi -g -std=c99 -Wall -DLINUX -D_GNU_SOURCE threadid.c -o threadid -pthread -lrt -lbsd
Here is the output:
main thread:
new thread:
new thread:
Why has "new thread" been printed twice?
I suspect this may be related to the buffering mechanism in Linux. But after I added fflush(stdout) and fsync(1) at the end of each function, the output is almost the same.
If you run the program several times, the output differs:
main thread:
new thread:
or
main thread:
new thread:
new thread:
Or
main thread:
Most libc libraries do buffer the output as you mentioned. And at the end of the program (when the main thread exits), they flush all the buffers and exit.
There is a slight possibility that your new thread has flushed the output but before it could update the state of the buffer, the main program exited and the cleanup code flushed the same buffer again. Since these buffers are local to the thread I am sure they won't have concurrency mechanism. But because of this rare case it might get messed up.
You can try
pthread_create(&ntid, NULL, thr_fn, NULL);
printids("main thread:");
pthread_join(ntid, NULL);
At the end of the main function and check if the problem is solved.
This will cause your main function to wait till the new thread is finished (including the flushing operation it does).
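Put together as a self-contained sketch, the join-before-exit suggestion could look like the following. The function name print_from_both_threads is hypothetical and the error handling is an addition that is not in the original snippet.

```c
#include <pthread.h>
#include <stdio.h>

static void *thr_fn(void *arg)
{
    (void)arg;
    printf("new thread:\n");
    return NULL;
}

/* Starts one thread, prints from the calling thread, then waits for
 * the new thread to finish. Returns 0 on success, otherwise the
 * error number reported by pthread_create() or pthread_join(). */
int print_from_both_threads(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, thr_fn, NULL);
    if (err != 0)
        return err;
    printf("main thread:\n");
    /* After the join, no other thread is touching stdout when the
     * process exits and the buffers are flushed. */
    return pthread_join(tid, NULL);
}
```

Calling this from main and returning afterwards means the exit-time flush happens only after the new thread has finished its printf, so each line should appear exactly once, in either order.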
Double output is possible on glibc-based linux systems due to a nasty bug in glibc: if the FILE lock is already held at the time exit tries to flush, the lock is simply ignored and the buffer access is performed with no synchronization. This would be a great test case to report to glibc to pressure them to fix it, if you can reproduce it reliably.

Resources