Where are syscalls located in glibc source - c

So I was looking through the linux glibc source and I don't see where it actually does anything. The following is from io/chdir.c but it is indicative of many of the source files. What's going on here? Obviously I am missing something. What's the secret, where does it make a system call or actually do something?
stub_warning is some legacy craziness. __set_errno seems to be a simple macro that sets errno. And while I find a million usages of weak_alias I don't see it defined anywhere.
Is there a helpful guide to understanding how glibc works somewhere?
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Change the current directory to PATH.  */
int
__chdir (path)
     const char *path;
{
  if (path == NULL)
    {
      __set_errno (EINVAL);
      return -1;
    }

  __set_errno (ENOSYS);
  return -1;
}

stub_warning (chdir)
weak_alias (__chdir, chdir)
#include <stub-tag.h>

What you've found is a stub function for systems it's not implemented on. You need to look under the sysdeps tree for the actual implementation. The following may be of interest:
sysdeps/unix/sysv/linux
sysdeps/posix
sysdeps/i386 (or x86_64 or whatever your cpu arch is)

The actual system call code for chdir() is auto-generated on most systems supported by glibc, by the script make-syscalls.sh. That's why you can't find it in the source tree.

That's a generic stub that is used if another definition doesn't exist; weak_alias is a cpp macro which tells the linker that __chdir should be used when chdir is requested, but only if no other definition is found. (See weak symbols for more details.)
chdir is actually a system call; there will be per-OS system call bindings in the glibc source tree, which will override the stub definition with a real one that calls into the kernel. This allows glibc to present a stable interface across systems which may not have all of the system calls that glibc knows about.
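For reference, the weak_alias macro lives in include/libc-symbols.h; a simplified sketch of the common GCC-attribute variant looks like this (the real macro has assembler-specific branches, so treat this as an approximation):

/* Simplified sketch of glibc's weak_alias (see include/libc-symbols.h). */
#define weak_alias(name, aliasname) \
  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));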

Note that the actual system calls aren't defined anywhere in the source tree - they're generated at build time from syscalls.list (the linked one is in sysdeps/unix; there are additional ones further down the tree), a series of macros in sysdep.h (the linked one is for linux/i386), and a script, make-syscalls.sh, that actually generates the source files.
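For a concrete example, the chdir entry in sysdeps/unix/syscalls.list looks like this (quoted from memory, so double-check against your glibc checkout):

chdir   -   chdir   i:s   __chdir   chdir

The fields give the file name, the caller, the kernel syscall name, the return/argument signature ("i:s" means it returns an int and takes one string argument), and the strong and weak symbol names; make-syscalls.sh expands each such line into a small generated wrapper that performs the actual trap into the kernel and sets up chdir as a weak alias for __chdir.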

Related

How to get dylibs in /opt/local/lib to be recognized by dlopen in MacOS Monterey?

I am trying to compile some Nim code that depends on libsass, and it fails with
dlopen(libsass.dylib, 0x0002): tried: 'libsass.dylib' (no such file), '/usr/local/lib/libsass.dylib' (no such file), '/usr/lib/libsass.dylib' (no such file), '/Users/emre/code/nimforum/libsass.dylib' (no such file)
could not load: libsass.dylib
On my system, that file is in /opt/local/lib, since I installed it with MacPorts. I tried setting LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, and DYLD_FALLBACK_LIBRARY_PATH to /opt/local/lib, but this did not help. I believe macOS's System Integrity Protection is the cause, but I am not sure how best to accommodate it.
This has nothing to do with SIP.
You need to pass the full path to the library you want to open to the first argument of dlopen(). From the man page:
SYNOPSIS
     #include <dlfcn.h>

     void*
     dlopen(const char* path, int mode);

DESCRIPTION
     dlopen() examines the mach-o file specified by path.
If you really need to use dlopen() to load this, you should be passing dlopen("/opt/local/lib/libsass.dylib", RTLD_NOW).
However, dlopen() is generally frowned upon as it bypasses a lot of the performance and correctness benefits of static linkage. You should aim to use static linkage (i.e., pass -lsass at build time) wherever possible.
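If you do need dlopen(), a minimal sketch of the full-path call with error reporting (the path assumes the MacPorts layout from the question):

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Full path, since dyld's default search locations don't include /opt/local/lib. */
    void *handle = dlopen("/opt/local/lib/libsass.dylib", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    /* ... look up symbols with dlsym(handle, "name") here ... */
    dlclose(handle);
    return 0;
}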

Redirecting assert fail messages

We have a software project with real-time constraints largely written in C++ but making use of a number of C libraries, running in a POSIX operating system. To satisfy real-time constraints, we have moved almost all of our text logging off of stderr pipe and into shared memory ring buffers.
The problem we have now is that when old code, or a C library, calls assert, the message ends up in stderr and not in our ring buffers with the rest of the logs. We'd like to find a way to redirect the output of assert.
There are three basic approaches here that I have considered:
1.) Make our own assert macro -- basically, don't use #include <cassert>, give our own definition for assert. This would work but it would be prohibitively difficult to patch all of the libraries that we are using that call assert to include a different header.
2.) Patch libc -- modify the libc implementation of __assert_fail. This would work, but it would be really awkward in practice because this would mean that we can't build libc without building our logging infra. We could make it so that at run-time, we can pass a function pointer to libc that is the "assert handler" -- that's something that we could consider. The question is if there is a simpler / less intrusive solution than this.
3.) Patch libc header so that __assert_fail is marked with __attribute__((weak)). This means that we can override it at link-time with a custom implementation, but if our custom implementation isn't linked in, then we link to the regular libc implementation. Actually I was hoping that this function already would be marked with __attribute__((weak)) and I was surprised to find that it isn't apparently.
My main question is: What are the possible downsides of option (3) -- patching libc so that this line: https://github.com/lattera/glibc/blob/master/assert/assert.h#L67
extern void __assert_fail (const char *__assertion, const char *__file,
                           unsigned int __line, const char *__function)
     __THROW __attribute__ ((__noreturn__));
is marked with __attribute__((weak)) as well?
Is there a good reason I didn't think of that the maintainers didn't already do this?
How could any existing program that is currently linking and running successfully against libc break after I patch the header in this way? It can't happen, right?
Is there a significant run-time cost to using weak-linking symbols here for some reason? libc is already a shared library for us, and I would think the cost of dynamic linking should swamp any case analysis regarding weak vs. strong resolution that the system has to do at load time?
Is there a simpler / more elegant approach here that I didn't think of?
Some functions in glibc, particularly strtod and malloc, are marked with a special gcc attribute __attribute__((weak)). This is a linker directive -- it tells gcc that these symbols should be marked as "weak symbols", which means that if two versions of the symbol are found at link time, the "strong" one is chosen over the weak one.
The motivation for this is described on wikipedia:
Use cases
Weak symbols can be used as a mechanism to provide default implementations of functions that can be replaced by more specialized (e.g. optimized) ones at link-time. The default implementation is then declared as weak, and, on certain targets, object files with strongly declared symbols are added to the linker command line.
If a library defines a symbol as weak, a program that links that library is free to provide a strong one for, say, customization purposes.
Another use case for weak symbols is the maintenance of binary backward compatibility.
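As a minimal illustration of that mechanism (a toy example, not taken from glibc):

/* lib.c -- library side: a weak default implementation. */
#include <stdio.h>
__attribute__((weak)) void report(void) { puts("default report"); }

/* app.c -- application side: a strong definition here overrides the weak one. */
#include <stdio.h>
void report(void);                            /* defined (weakly) in lib.c */
void report(void) { puts("custom report"); }  /* delete this to get the default */
int main(void) { report(); return 0; }

Building with gcc app.c lib.c -o demo prints "custom report"; remove the strong definition from app.c and the weak default from lib.c runs instead.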
However, in both glibc and musl libc, it appears to me that the __assert_fail function (to which the assert.h macro forwards) is not marked as a weak symbol.
https://github.com/lattera/glibc/blob/master/assert/assert.h
https://github.com/lattera/glibc/blob/master/assert/assert.c
https://github.com/cloudius-systems/musl/blob/master/include/assert.h
You don't need __attribute__((weak)) on the __assert_fail symbol from glibc. Just write your own implementation of __assert_fail in your program, and the linker should use your implementation. For example:
#include <stdio.h>
#include <stdlib.h>   /* for abort() */
#include <assert.h>

/* Our strong definition takes precedence over the one in libc. */
void __assert_fail(const char * assertion, const char * file, unsigned int line, const char * function)
{
    fprintf(stderr, "My custom message\n");
    abort();
}

int main()
{
    assert(0);
    printf("Hello World");
    return 0;
}
That's because, when the linker resolves symbols, __assert_fail will already be defined by your program, so the linker shouldn't pick the symbol defined by libc.
If you really need __assert_fail to be defined as a weak symbol inside libc, why not just objcopy --weaken-symbol=__assert_fail /lib/libc.so /lib/libc_with_weak_assert_fail.so. I don't think you need to rebuild libc from sources for that.
If I were you, I would probably opt for opening a pipe(2) and fdopen(2)'ing stderr to take the write end of that pipe. I'd service the read end of the pipe as part of the main poll(2) loop (or whatever the equivalent is in your system) and write the contents to the ring buffer.
This is obviously slower to handle actual output, but from your write-up, such output is rare, so the impact ought to be negligible (especially if you already have a poll or select this fd can piggyback on).
It seems to me that tweaking libc or relying on side-effects of the tools might break in the future and will be a pain to debug. I'd go for the guaranteed-safe mechanism and pay the performance price if at all possible.
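A rough sketch of that approach, with a stand-in for the ring buffer (ring_buffer_write here is hypothetical; substitute your shared-memory logger):

#include <poll.h>
#include <stdio.h>
#include <unistd.h>

static int log_fds[2];  /* [0] = read end, [1] = write end */

/* Stand-in for the real shared-memory ring buffer (hypothetical). */
static void ring_buffer_write(const char *buf, size_t len) {
    fwrite(buf, 1, len, stdout);
}

int main(void) {
    if (pipe(log_fds) == -1)
        return 1;
    /* Point stderr at the write end; assert()/perror() output lands in the pipe. */
    dup2(log_fds[1], STDERR_FILENO);

    fprintf(stderr, "this would have gone to the terminal\n");
    fflush(stderr);

    /* Service the read end as part of the main poll loop. */
    struct pollfd pfd = { .fd = log_fds[0], .events = POLLIN };
    if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN)) {
        char buf[4096];
        ssize_t n = read(log_fds[0], buf, sizeof buf);
        if (n > 0)
            ring_buffer_write(buf, (size_t)n);
    }
    return 0;
}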

Undefined reference to 'strnlen' despite "string.h" include

I am trying to create a project for LPC1769 on LPCXpresso. I have a C file containing
#include <string.h>

int main()
{
    //some stuff
    strnlen(SomeString, someInt);
}
to which I get an error:
Undefined reference to 'strnlen'
The weird part is that there is no problem with strcpy, strncpy or other common string functions.
I am building for a Cortex-M3 processor
Compiler used is: arm-none-eabi-gcc
In Eclipse, I have ticked the MCU linker option: No startup or default libs
I am running Eclipse on Ubuntu
While it may be easy enough to bypass this by just using strlen, I am actually facing a problem using a library which uses strnlen, and I don't want to mess with the library source.
The strnlen function was (until fairly recently) a Linux-specific function (some documentation such as the GNU libc manual still says that it is a "GNU extension"). The current manual page says it is part of POSIX.1-2008. Since you are cross-compiling, it is possible that the target machine's runtime library does not have this function. A forum posting from 2011 said just that.
I had the same problem and found out that using the -std=gnu++11 compiler flag solves it.
The following may work for you (since strnlen() is not a part of the runtime lib).
Define your own/local version of strnlen(), matching the standard signature, for example:
#include <stddef.h>

size_t strnlen(const char *s, size_t maxlen)
{
    size_t i;
    // Count characters until the terminator or the limit, whichever comes first.
    for (i = 0; i < maxlen && s[i] != '\0'; i++)
        ;
    return i;
}
You want this include instead:
#include <string.h>
The difference between <> and "" is that <> searches for header files only in the system include directories (plus any added with -I directory), while "" searches the directory of the including file first and then falls back to the same search path.
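For example (paths invented for illustration):

gcc -I/home/me/mylib/include -c main.c

This adds /home/me/mylib/include to the search path consulted by both "" and <> includes.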

access a POSIX function using dlopen

POSIX 2008 introduces several file system functions which rely on a directory descriptor when determining the path to a file (I'm speaking about the -at functions, such as openat, renameat, symlinkat, etc.). I doubt that all POSIX platforms support them (well, at least the most recent versions seem to), and I'm looking for a way to determine whether a platform supports such functions. Of course one may use autoconf and friends for compile-time determination, but I'm looking for a way to find out dynamically whether the implementation supports the -at functions.
The first thing that comes to mind is a dlopen()/dlsym()/dlclose() combo; at least I've successfully loaded the necessary symbols from the /usr/libc.so.6 shared library. However, libc may be (or is?) named differently on various platforms. Is there a list of standard locations to find libc? At least on Linux, /lib/libc.so appears to be not a symbolic link to the shared library but an ld script. Maybe there exists some other way to check at runtime whether a POSIX function is supported? Thanks in advance!
#define _GNU_SOURCE 1
#include <dlfcn.h>
#include <stdio.h>

int main ()
{
    void * funcaddr = dlsym(RTLD_DEFAULT, "symlinkat");
    /*----------------------^ magic! */
    printf ("funcaddr = %p\n", funcaddr);
}
Output:
funcaddr = 0x7fb62e44c2c0
Magic explanation: your program is already linked with libc, no need to load it again.
Note, this is actually GNU libc feature, as hinted by _GNU_SOURCE. POSIX reserves RTLD_DEFAULT "for future use", and then proceeds to define it exactly like GNU libc does. So strictly speaking it is not guaranteed to work on all POSIX systems.
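If you want something POSIX does sanction, dlopen(NULL, ...) is specified to return a handle for the global symbol scope of the running process, so a slightly more portable sketch (link with -ldl on glibc older than 2.34) is:

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* POSIX: dlopen(NULL, ...) yields a handle to the program's own
       global symbol table, which includes symbols from libc. */
    void *self = dlopen(NULL, RTLD_LAZY);
    void *funcaddr = self ? dlsym(self, "symlinkat") : NULL;
    printf("symlinkat is %savailable\n", funcaddr ? "" : "not ");
    if (self)
        dlclose(self);
    return 0;
}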

Patching code/symbols into a dynamic-linked ELF binary

Suppose I have an ELF binary that's dynamic linked, and I want to override/redirect certain library calls. I know I can do this with LD_PRELOAD, but I want a solution that's permanent in the binary, independent of the environment, and that works for setuid/setgid binaries, none of which LD_PRELOAD can achieve.
What I'd like to do is add code from additional object files (possibly in new sections, if necessary) and add the symbols from these object files to the binary's symbol table so that the newly added version of the code gets used in place of the shared library code. I believe this should be possible without actually performing any relocations in the existing code; even though they're in the same file, these should be able to be resolved at runtime in the usual PLT way (for what it's worth I only care about functions, not data).
Please don't give me answers along the line of "You don't want to do this!" or "That's not portable!" What I'm working on is a way of interfacing binaries with slightly-ABI-incompatible alternate shared-library implementations. The platform in question is i386-linux (i.e. 32-bit) if it matters. Unless I'm mistaken about what's possible, I could write some tools to parse the ELF files and perform my hacks, but I suspect there's a fancy way to use the GNU linker and other tools to accomplish this without writing new code.
I suggest elfsh and the other tools from the ERESI project, if you want to instrument the ELF files themselves. Compatibility with i386-linux is not a problem, as I've used it myself for the same purpose.
The relevant how-tos are here.
You could handle some of the dynamic linking in your program itself. Read the man page for dlsym(3) in particular, and dlopen(3), dlerror(3), and dlclose(3) for the rest of the dynamic linking interface.
A simple example -- say I want to override dup2(2) from libc. I could use the following code (let's call it "dltest.c"):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dlfcn.h>

int (*prev_dup2)(int oldfd, int newfd);

int dup2(int oldfd, int newfd) {
    printf("DUP2: %d --> %d\n", oldfd, newfd);
    return prev_dup2(oldfd, newfd);
}

int main(void) {
    int i;

    prev_dup2 = dlsym(RTLD_NEXT, "dup2");
    if (!prev_dup2) {
        printf("dlsym failed to find 'dup2' function!\n");
        return 1;
    }
    if (prev_dup2 == dup2) {
        printf("dlsym found our own 'dup2' function!\n");
        return 1;
    }

    i = dup2(1,3);
    if (i == -1) {
        perror("dup2() failed");
    }
    return 0;
}
Compile with:
gcc -o dltest dltest.c -ldl
The statically linked dup2() function overrides the dup2() from the library. This works even if the function is in another .c file (and is compiled as a separate .o).
If your overriding functions are themselves dynamically linked, you may want to use dlopen() rather than trusting the linker to get the libraries in the correct order.
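For instance (a sketch; "libc.so.6" is the glibc soname on Linux and will differ on other systems):

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *libc = dlopen("libc.so.6", RTLD_LAZY);
    if (!libc) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    /* Resolve the library's dup2 from this specific handle instead of
       relying on RTLD_NEXT or link order. */
    int (*real_dup2)(int, int) = (int (*)(int, int)) dlsym(libc, "dup2");
    printf("libc dup2 at %p\n", (void *) real_dup2);
    dlclose(libc);
    return 0;
}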
EDIT: I suspect that if a different function within the overridden library calls an overridden function, the original function gets called rather than the override. I don't know what will happen if one dynamic library calls another.
I don't seem to be able to just add a comment to this question, so I'm posting this as an "answer". Sorry about that; I'm doing it in the hope of helping other folks searching for an answer.
So, I seem to have a similar use case, but I explicitly find any modification to existing binaries unacceptable (for me), so I'm looking for a standalone proxy approach: Proxy shared library (sharedlib, shlib, so) for ELF?
