C pthread: including the header file and linking with -nolibc

This example:
#include <stdio.h>
#include <pthread.h>
__attribute__((weak)) int pthread_create( pthread_t*, const pthread_attr_t*,
                                          void*(*)(void*), void*);
int main()
{
    if (pthread_create)
    {
        printf("This is multi-thread version!\n");
    }
    else
    {
        printf("This is single-thread version!\n");
    }
    return 0;
}
It says it is going to run in single thread mode if not linked to the pthread library but with the #include pthread isn't it going to be linked if compiled normally?
I think pthread is in glibc or libc but firstly is there a way to link excluding the standard library and if so when would you do that?
If there is code that can be run in multithread mode, is there ever any point in running it in single thread mode as in the example or is this just a bad example? If so, what is a better example of hard-coding in something as a weak symbol?

but with the #include pthread isn't it going to be linked if compiled normally?
No, including a header file is different from linking with a library.
I think pthread is in glibc or libc
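It is not, at least in glibc releases before 2.34, which ship the pthread functions in a separate libpthread. glibc 2.34 and later merged them into libc itself.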
is there a way to link excluding the standard library
Check your compiler documentation. gcc has several link options for this, such as -nolibc, -nostdlib and -nodefaultlibs.
if so when would you do that?
When I am compiling for a bare-metal target that has no C library. When I am writing my own standard library, or want to use a different C library than the default one distributed with the system. When cross-compiling with a custom C library in a custom location because the system doesn't ship one with the cross-compiler. Etc.
If there is code that can be run in multithread mode, is there ever any point in running it in single thread mode
Yes. In some cases multithreading results in worse performance than a single thread, for example on a single-core system, or in a realtime process that owns the CPU anyway. A particular algorithm may also be impossible to parallelize, or may run slower when multithreaded.
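As for a better example of a weak symbol: a common use is an optional hook that the program calls only when some object file or library actually defines it. The sketch below assumes GCC or Clang on an ELF platform such as Linux. The names audit_hook and process_value are made up for illustration.

```c
/* Hypothetical optional hook. Because the declaration is weak, the
   program still links when nothing defines audit_hook. */
__attribute__((weak)) int audit_hook(int value);

/* Returns the hook's result when a definition was linked in,
   otherwise returns the value unchanged. */
int process_value(int value)
{
    if (audit_hook)                 /* address is null when undefined */
        return audit_hook(value);
    return value;
}
```

Linking in any object file that defines int audit_hook(int) changes the behaviour without recompiling process_value. Note that on glibc 2.34 and later the pthread_create test from the question always takes the multi-thread branch, because the function is part of libc itself.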

Related

access a POSIX function using dlopen

POSIX 2008 introduces several file system functions, which rely on directory descriptor when determining a path to the file (I'm speaking about -at functions, such as openat, renameat, symlinkat, etc.). I doubt if all POSIX platforms support it (well, at least the most recent versions seem to support) and I'm looking for a way to determine if platform supports such functions. Of course one may use autoconf and friends for compile-time determination, but I'm looking for a possibility to find out whether implementation supports -at functions dynamically.
The first thing that comes to my mind is a dlopen()/dlsym()/dlclose() combo; at least I've successfully loaded the necessary symbols from the /usr/libc.so.6 shared library. However, libc may be (or is?) named differently on various platforms. Is there a list of standard locations to find libc? At least on Linux, /lib/libc.so appears to be not a symbolic link to the shared library but an ld script. Maybe there is some other way to check at runtime whether a POSIX function is supported? Thanks in advance!
#define _GNU_SOURCE 1
#include <dlfcn.h>
#include <stdio.h>
int main ()
{
    void * funcaddr = dlsym(RTLD_DEFAULT, "symlinkat");
    /* -----------------------^ magic! */
    printf ("funcaddr = %p\n", funcaddr);
}
Output:
funcaddr = 0x7fb62e44c2c0
Magic explanation: your program is already linked with libc, no need to load it again.
Note that this is actually a GNU libc feature, as hinted by _GNU_SOURCE. POSIX reserves RTLD_DEFAULT "for future use", and then proceeds to define it exactly like GNU libc does. So strictly speaking it is not guaranteed to work on all POSIX systems.
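If relying on RTLD_DEFAULT is a concern, one alternative is dlopen() with a null file name, which POSIX specifies as returning a handle for the running process image (the executable plus the libraries loaded with it). A minimal sketch, assuming a dynamically linked program; function_available is a hypothetical helper name, and older glibc versions may need -ldl at link time:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` resolves in the running process image, else 0.
   No library path is needed, so it does not matter what libc is
   called on the platform. */
int function_available(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY);   /* handle for the process itself */
    if (self == NULL)
        return 0;

    void *addr = dlsym(self, name);
    int found = (addr != NULL);

    dlclose(self);
    return found;
}
```

A caller could then write if (function_available("symlinkat")) and fall back to a path-based implementation otherwise. To actually call the function, you still need the pointer returned by dlsym(), cast to the right type.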

How to create a library which uses mutexes only if pthread is linked?

I'm creating a C library on Linux which has several functions, which together operate upon some global data. In order for these functions to be thread safe, they must employ mutexes at the appropriate points in the code.
In Linux, in order to use pthreads in an application, one needs to link in the appropriate library, -lpthread. In the case of my library once compiled, I'd like to make it work both if the user of it decided to use pthreads in their application, as well as if they don't.
In the case where a developer does not use threads in their application, they will not link against pthreads. Therefore I'd like my compiled library to not require it, and furthermore, employing mutexes in a single threaded application uses needless overhead (not to mention is silly).
Is there some kind of way to write code (with GCC extensions if necessary) that a certain block of code will only run if certain symbols were linked in? I'm aware I can use dlopen() and friends, but that in itself would require some of what I'm trying to avoid. I imagine what I'm looking for must exist, as several standard functions are in the same boat, and would require mutexes to be thread safe (and they are), but work even when not linked with pthreads.
On this point, I notice that FreeBSD's popen() function on line 66 & 67 employs a non-portable check, __isthreaded, to determine if threads are used or not, and whether to use mutexes. I doubt anything like that is standardized in any way. But more to the point, such code can't compile and link if the symbols aren't recognized, and in Linux the mutex symbols won't even be present if pthread is not linked.
To summarize: On Linux, how does one create a library, which knows when threads are also used, and if so, employs mutexes where appropriate, and does not require linking against pthreads, unless the application developer specifically wants to use threading somewhere?
After some testing, it seems that Linux already does what I want automatically! You only need to link against pthreads if you use threading, not if you just want pthread mutex support.
In this test case:
#include <stdio.h>
#include <errno.h>
#include <pthread.h>
int main()
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    if (!(errno = pthread_mutex_lock(&mutex))) { puts("Mutex locked!"); }
    else { perror("Could not lock mutex"); }
    if (!(errno = pthread_mutex_lock(&mutex))) { puts("Mutex locked!"); }
    else { perror("Could not lock mutex"); }
    return 0;
}
When compiling this without pthreads linked, I see "Mutex locked!" twice, which indicates that pthread_mutex_lock() is essentially a no-op. But with pthreads linked, running this application will stall after the first time "Mutex locked!" is printed.
Therefore, I can use mutexes in my library where appropriate without requiring pthreads, and with no (significant?) overhead where it isn't needed.
The usual solutions are:
Use a #define switch to control at build time whether to call the pthreads functions or not, and have your build process create two versions of your library: one pthread-aware and one not, with different names. Rely on the user of your library to link against the correct one.
Don't call the pthreads functions directly, but instead call user-provided lock and unlock callbacks (and thread-local-storage too, if you need that). The library user is responsible for allocating and calling the appropriate locking mechanisms, which also allows them to use a non-pthreads threading library.
Do nothing at all, and merely document that user code should ensure that your library functions aren't entered at the same time from multiple threads.
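The callback option from the list above could look roughly like this. It is only a sketch: mylib_lock_ops, mylib_set_lock_ops and mylib_increment are invented names, and the pthread adapter at the bottom stands in for whatever the application really uses.

```c
#include <stddef.h>

/* ---- library side: no dependency on any threading library ---- */

/* Hypothetical callback table supplied by the application. */
typedef struct {
    void (*lock)(void *ctx);
    void (*unlock)(void *ctx);
    void *ctx;
} mylib_lock_ops;

static mylib_lock_ops g_ops;     /* all null by default: no locking */
static long g_counter;

/* Install callbacks, or pass NULL to go back to lock-free operation.
   Must be called while no other thread is inside the library. */
void mylib_set_lock_ops(const mylib_lock_ops *ops)
{
    if (ops != NULL)
        g_ops = *ops;
    else
        g_ops = (mylib_lock_ops){ NULL, NULL, NULL };
}

long mylib_increment(void)
{
    long result;
    if (g_ops.lock)   g_ops.lock(g_ops.ctx);
    result = ++g_counter;
    if (g_ops.unlock) g_ops.unlock(g_ops.ctx);
    return result;
}

/* ---- application side: the only part that needs pthreads ---- */
#include <pthread.h>

static pthread_mutex_t app_mutex = PTHREAD_MUTEX_INITIALIZER;
static void app_lock(void *ctx)   { pthread_mutex_lock(ctx); }
static void app_unlock(void *ctx) { pthread_mutex_unlock(ctx); }

void app_enable_locking(void)
{
    mylib_lock_ops ops = { app_lock, app_unlock, &app_mutex };
    mylib_set_lock_ops(&ops);
}
```

A single-threaded program never calls app_enable_locking() and pays only for two null checks per call. A threaded one installs its callbacks once, before starting any threads.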
glibc does something different again - it uses tricks with lazy binding symbols to call the pthreads functions only if they are linked into the binary. This isn't portable though, because it relies on specific details of the glibc implementation of pthreads. See the definition of __libc_maybe_call():
#ifdef __PIC__
# define __libc_maybe_call(FUNC, ARGS, ELSE) \
  (__extension__ ({ __typeof (FUNC) *_fn = (FUNC); \
                    _fn != NULL ? (*_fn) ARGS : ELSE; }))
#else
# define __libc_maybe_call(FUNC, ARGS, ELSE) \
  (FUNC != NULL ? FUNC ARGS : ELSE)
#endif

compiling a thread program

I wrote a small thread program. When I compiled it with cc filename.c, I got some messages during compilation, but when I compiled it with -lpthread (cc filename.c -lpthread) it built and ran. What is this -lpthread, and why is it required? Can anyone explain this in detail? It would be of great help.
The pthread_create() function that you use in your program is not a basic C function, and requires that you use a library.
This is why you have to use this command switch -lpthread.
This switch tells gcc to look for a library named libpthread somewhere on your disk and to link it in, which provides the thread creation mechanisms.
I suggest you read this to get familiar with the "library" concept: http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
The -l option is typically used to specify a library (in this case, the pthread library) that should be linked with your program.
Since the thread functions often live in a separate library, you need an option like this when building a program that uses them, or you will get linker errors.
pthread stands for POSIX Threads. It's the standard threading library in Unix-like POSIX environments.
Since you are going to use pthread you need to tell the compiler to link to that library.
You can read more about what -lpthread is and how it works: https://computing.llnl.gov/tutorials/pthreads/
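To make this concrete, here is a small sketch of code that needs the thread library. The function names double_it and double_in_thread are made up for the example. Build it with something like cc file.c -pthread (or -lpthread). On toolchains where the thread functions live in a separate library, leaving the option out typically gives an "undefined reference to pthread_create" error from the linker.

```c
#include <pthread.h>

/* Thread entry point: doubles the int it is given a pointer to. */
static void *double_it(void *arg)
{
    int *value = arg;
    *value *= 2;
    return NULL;
}

/* Runs double_it() in a new thread and waits for it to finish.
   Returns the doubled value, or -1 if the thread could not be
   created or joined. */
int double_in_thread(int input)
{
    pthread_t tid;
    int value = input;

    if (pthread_create(&tid, NULL, double_it, &value) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return value;
}
```

The header only declares pthread_create() and pthread_join(), which is enough for the compiler. The linker option is what supplies their actual code.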

Patching code/symbols into a dynamic-linked ELF binary

Suppose I have an ELF binary that's dynamic linked, and I want to override/redirect certain library calls. I know I can do this with LD_PRELOAD, but I want a solution that's permanent in the binary, independent of the environment, and that works for setuid/setgid binaries, none of which LD_PRELOAD can achieve.
What I'd like to do is add code from additional object files (possibly in new sections, if necessary) and add the symbols from these object files to the binary's symbol table so that the newly added version of the code gets used in place of the shared library code. I believe this should be possible without actually performing any relocations in the existing code; even though they're in the same file, these should be able to be resolved at runtime in the usual PLT way (for what it's worth I only care about functions, not data).
Please don't give me answers along the line of "You don't want to do this!" or "That's not portable!" What I'm working on is a way of interfacing binaries with slightly-ABI-incompatible alternate shared-library implementations. The platform in question is i386-linux (i.e. 32-bit) if it matters. Unless I'm mistaken about what's possible, I could write some tools to parse the ELF files and perform my hacks, but I suspect there's a fancy way to use the GNU linker and other tools to accomplish this without writing new code.
I suggest the elfsh et al. tools from the ERESI (alternate) project, if you want to instrument the ELF files themselves. Compatibility with i386-linux is not a problem, as I've used it myself for the same purpose.
The relevant how-tos are here.
You could handle some of the dynamic linking in your program itself. Read the man page for dlsym(3) in particular, and dlopen(3), dlerror(3), and dlclose(3) for the rest of the dynamic linking interface.
A simple example -- say I want to override dup2(2) from libc. I could use the following code (let's call it "dltest.c"):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>
int (*prev_dup2)(int oldfd, int newfd);
int dup2(int oldfd, int newfd) {
    printf("DUP2: %d --> %d\n", oldfd, newfd);
    return prev_dup2(oldfd, newfd);
}
int main(void) {
    int i;
    prev_dup2 = dlsym(RTLD_NEXT, "dup2");
    if (!prev_dup2) {
        printf("dlsym failed to find 'dup2' function!\n");
        return 1;
    }
    if (prev_dup2 == dup2) {
        printf("dlsym found our own 'dup2' function!\n");
        return 1;
    }
    i = dup2(1,3);
    if (i == -1) {
        perror("dup2() failed");
    }
    return 0;
}
Compile with:
gcc -o dltest dltest.c -ldl
The statically linked dup2() function overrides the dup2() from the library. This works even if the function is in another .c file (and is compiled as a separate .o).
If your overriding functions are themselves dynamically linked, you may want to use dlopen() rather than trusting the linker to get the libraries in the correct order.
EDIT: I suspect that if a different function within the overridden library calls an overridden function, the original function gets called rather than the override. I don't know what will happen if one dynamic library calls another.
I don't seem to be able to just add comment to this question, so posting it as an "answer". Sorry about it, doing that just to hopefully help other folks who search an answer.
So, I seem to have similar usecase, but I explicitly find any modification to existing binaries unacceptable (for me), so I'm looking for standalone proxy approach: Proxy shared library (sharedlib, shlib, so) for ELF?

Thread Safety in C

imagine I write a library in C. Further, imagine this library to be used from a multi-threaded environment. How do I make it thread-safe? More specific: How do I assure, that certain functions are executed only by one thread at a time?
In opposite to Java or C# for example, C has no means to deal with threads/locks/etc., nor does the C standard library. I know, that operating systems support threads, but using their api would restrict the compatibility of my library very much. Which possibilities do I have, to keep my library as compatible/portable as possible? (for example relying on OpenMP, or on Posix threads to keep it compatible with at least all unix-like operating systems?)
You can create wrappers with #ifdef. It's really the best you can do. (Or you can use a third party library to do this).
I'll show how I did it as an example for windows and linux. It's in C++ and not C but again it's just an example:
#ifdef WIN32
typedef HANDLE thread_t;
typedef unsigned ThreadEntryFunction;
#define thread __declspec(thread)
class Mutex : NoCopyAssign
{
public:
    Mutex() { InitializeCriticalSection(&mActual); }
    ~Mutex() { DeleteCriticalSection(&mActual); }
    void Lock() { EnterCriticalSection(&mActual); }
    void Unlock() { LeaveCriticalSection(&mActual); }
private:
    CRITICAL_SECTION mActual;
};
class ThreadEvent : NoCopyAssign
{
public:
    ThreadEvent() { Actual = CreateEvent(NULL, false, false, NULL); }
    ~ThreadEvent() { CloseHandle(Actual); }
    void Send() { SetEvent(Actual); }
    HANDLE Actual;
};
#else
typedef pthread_t thread_t;
typedef void *ThreadEntryFunction;
#define thread __thread
extern pthread_mutexattr_t MutexAttributeRecursive;
class Mutex : NoCopyAssign
{
public:
    Mutex() { pthread_mutex_init(&mActual, &MutexAttributeRecursive); }
    ~Mutex() { pthread_mutex_destroy(&mActual); }
    void Lock() { pthread_mutex_lock(&mActual); }
    void Unlock() { pthread_mutex_unlock(&mActual); }
private:
    pthread_mutex_t mActual;
};
class ThreadEvent : NoCopyAssign
{
public:
    ThreadEvent() { pthread_cond_init(&mActual, NULL); }
    ~ThreadEvent() { pthread_cond_destroy(&mActual); }
    void Send() { pthread_cond_signal(&mActual); }
private:
    pthread_cond_t mActual;
};
inline thread_t GetCurrentThread() { return pthread_self(); }
#endif
/* Allows for easy mutex locking */
class MutexLock : NoAssign
{
public:
    MutexLock(Mutex &m) : mMutex(m) { mMutex.Lock(); }
    ~MutexLock() { mMutex.Unlock(); }
private:
    Mutex &mMutex;
};
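Since the question is about C, here is roughly how the same #ifdef wrapper idea might look without classes. This is a sketch, not a drop-in library: port_mutex and safe_counter are invented names, and it tests the compiler-defined _WIN32 macro rather than WIN32.

```c
#ifdef _WIN32
#include <windows.h>
typedef CRITICAL_SECTION port_mutex;
static void port_mutex_init(port_mutex *m)    { InitializeCriticalSection(m); }
static void port_mutex_lock(port_mutex *m)    { EnterCriticalSection(m); }
static void port_mutex_unlock(port_mutex *m)  { LeaveCriticalSection(m); }
static void port_mutex_destroy(port_mutex *m) { DeleteCriticalSection(m); }
#else
#include <pthread.h>
typedef pthread_mutex_t port_mutex;
static void port_mutex_init(port_mutex *m)    { pthread_mutex_init(m, NULL); }
static void port_mutex_lock(port_mutex *m)    { pthread_mutex_lock(m); }
static void port_mutex_unlock(port_mutex *m)  { pthread_mutex_unlock(m); }
static void port_mutex_destroy(port_mutex *m) { pthread_mutex_destroy(m); }
#endif

/* Example client of the wrapper: a counter guarded by the mutex. */
typedef struct {
    port_mutex lock;
    long value;
} safe_counter;

void safe_counter_init(safe_counter *c)
{
    port_mutex_init(&c->lock);
    c->value = 0;
}

/* Adds n under the lock and returns the new total. */
long safe_counter_add(safe_counter *c, long n)
{
    long result;
    port_mutex_lock(&c->lock);
    c->value += n;
    result = c->value;
    port_mutex_unlock(&c->lock);
    return result;
}

void safe_counter_destroy(safe_counter *c)
{
    port_mutex_destroy(&c->lock);
}
```

The rest of the library only ever sees port_mutex_*, so supporting another platform means adding one more #ifdef branch.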
You will need to use your OS's threading library. On Posix, that will usually be pthreads and you'll want pthread_mutex_lock.
Windows has its own threading library and you'll want to look at either critical sections or CreateMutex. Critical sections are more optimized but are limited to a single process and you can't use them in WaitForMultipleObjects.
You have two main options:
1) You specify which multi-threaded environment your library is thread-safe in, and use the synchronisation functions of that environment.
2) You specify that your library is not thread-safe. If your caller wants to use it in a multi-threaded environment, then it's their responsibility to make it thread-safe, by using external synchronisation if necessary to serialise all calls to your library. If your library uses handles and doesn't need any global state, this might for instance mean that if they have a handle they only use in a single thread, then they don't need any synchronisation on that handle, because it's automatically serialised.
Obviously you can take a multi-pack approach to (1), and use compile-time constants to support all the environments you know about.
You could also use a callback architecture, link-time dependency, or macros, to let your caller tell you how to synchronise. This is kind of a mixture of (1) and (2).
But there's no such thing as a standard multi-threaded environment, so it's pretty much impossible to write self-contained code that is thread-safe everywhere unless it's completely stateless (that is, the functions are all side-effect free). Even then you have to interpret "side-effect" liberally, since of course the C standard does not define which library functions are thread-safe. It's a bit like asking how to write C code which can execute in a hardware interrupt handler. "What's an interrupt?", you might very well ask, "and what things that I might do in C aren't valid in one?". The only answers are OS-specific.
You should also avoid static and global variables that can be modified; that way you avoid having to put synchronization code all over your module.
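One way to follow that advice is to keep all mutable state in a handle that the caller owns, instead of in a static variable. A small sketch with made-up names (running_mean and friends):

```c
#include <stddef.h>

/* All state lives in a caller-owned handle, not in a static variable. */
typedef struct {
    double sum;
    size_t count;
} running_mean;

void running_mean_init(running_mean *rm)
{
    rm->sum = 0.0;
    rm->count = 0;
}

/* Adds a sample and returns the mean so far. Touches only *rm. */
double running_mean_add(running_mean *rm, double sample)
{
    rm->sum += sample;
    rm->count += 1;
    return rm->sum / (double)rm->count;
}
```

Two threads that each use their own running_mean need no locking at all. Synchronization is only required if the same handle is shared between threads, and that decision stays with the caller.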
It is a misconception that the pthreads library doesn't work on Windows. Check out sourceforge.net. I would recommend pthreads because it is cross-platform and its mutexes are way faster than e.g. the Windows builtin mutexes.
Write your own lock.
Since you're targeting PCs you're dealing with the x86 architecture which natively supplies all the multi-threading support you should need. Go over your code and identify any functions that have shared resources. Give each shared resource a 32-bit counter. Then using the interlocked operations that are implemented by the CPUs keep track of how many threads are using each shared resource and make any thread that wants to use a shared resource wait until the resource is released.
Here's a really good blog post about interlocked operations: Using Interlocked Instructions from C/C++
The author focuses mostly on using the Win32 Interlocked wrappers, but pretty much every operating system has their own wrappers for the interlocked operations, and you can always write the assembly (each of these operations is only one instruction).
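For illustration, here is what a hand-written lock might look like using C11 <stdatomic.h> in place of the Win32 Interlocked wrappers or inline assembly. That is a substitution on my part, not what the blog post uses. The spinlock type and function names are invented, and a busy-waiting lock like this is usually a worse choice than an OS mutex when contention is possible.

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag busy;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Spins until the flag was previously clear, i.e. we now own the lock. */
void spinlock_acquire(spinlock *s)
{
    while (atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire)) {
        /* busy-wait */
    }
}

/* Single attempt: returns 1 if the lock was taken, 0 if already held. */
int spinlock_try_acquire(spinlock *s)
{
    return !atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire);
}

void spinlock_release(spinlock *s)
{
    atomic_flag_clear_explicit(&s->busy, memory_order_release);
}
```

The test-and-set is the atomic read-modify-write step. The acquire and release orderings keep accesses to the protected data from being reordered outside the locked region.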
If your goal is to be compatible on unix-like operating systems, I would use POSIX threading.
That being said, if you want to support windows as well, you'll need to have two code paths for this - pthreads on unix and Windows threads on Windows. It's fairly easy to just make your own "thread library" to wrap these.
There are quite a few that do this (like OpenThreads), but most of them I've used are C++, not C.
Using Posix threads sounds like a good idea to me (but I'm no expert). In particular, Posix has good primitives for ensuring mutual exclusion.
If you had to create a library without any dependencies, you would have to implement the mutual exclusion algorithms yourself, which is a bad idea.
"imagine I write a library in C. Further, imagine this library to be used from a multi-threaded environment. How do I make it thread-safe? More specific: How do I assure, that certain functions are executed only by one thread at a time?"
You can't. Write thread-safe or, better, re-entrant functions instead.
Unless you would like to write system-wide locks, which is a very bad idea.
"In opposite to Java or C# for example, C has no means to deal with threads/locks/etc."
This is a joke, right? Locks were invented and widely used as synchronization objects long before Java and C# were developed...
"I know, that operating systems support threads, but using their api would restrict the compatibility of my library very much."
The thing is that such libraries already exist, e.g. wxWidgets, which offers the portable wxThread... (but this is C++)
Anyway, there are 2 main "flavours" of C: the ANSI C and the GNU C -> two different worlds... pick one or the other.
