How to check the signal handler in Linux - c

I have read this discussion about how to check the signal actions of each process:
How can I check what signals a process is listening to?
However, I want to use C/C++, Python or some other way to get the name of the userspace signal handler of each process, just like psig in Solaris:
What is the meaning of every column when executing psig command?
Would it be possible to do that in Linux?

Because the Linux kernel does not expose the signal handlers (other than using the sigaction() (or signal()) syscall in the process itself), you need to inject executable code to the target process to obtain this information.
(Or, alternatively, create a Linux kernel module that exposes this information.)
So, this is definitely a c linux question.
Note that it is not possible to always obtain the name of a function, because the handler can be declared static void without any particular name in the binary.
The easiest approach would be to create a dynamic library to interpose signal() and sigaction(), with an ELF constructor function (not related to C++; Linux ELF binaries just support marking functions "constructor", in which case they are automatically executed prior to main()) that opens a suitable log file in write-only append mode, say /var/log/sighandlers/PID.log, and dumps the contents of /proc/self/maps to record the addresses the binaries are loaded in. The interposed functions will then simply write the addresses of newly assigned handlers to the log file.
It is important to note that both sigaction() and signal() are async-signal safe functions, so the interposed versions should be, too. (Fortunately, you only need write(), which is async-signal safe. I recommend using dlsym() to look up the original function pointers in the ELF constructor function.)
When examining the log file, the function addresses should be calculated relative to the beginning of the binary mapping; then use objdump -tT binary or objdump -d binary to find the symbol the address belongs to.
I would personally not bother with any other approaches, even if this requires one to execute each binary using a special command (setting LD_PRELOAD) to find them out; it is either that, or a kernel module.
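A minimal sketch of such an interposing library, covering sigaction() only, could look like the following. The log directory is the one suggested above; the helper names (fmt_uint, put_str, siglog_init) and the log line format are my own assumptions, and the directory must already exist and be writable, otherwise logging is silently disabled. A wrapper for signal() would follow the same pattern.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

typedef int (*sigaction_fn)(int, const struct sigaction *, struct sigaction *);

static int log_fd = -1;              /* -1 means logging is disabled */
static sigaction_fn real_sigaction;  /* next sigaction() in lookup order (libc) */

/* Write v in base 10 or 16 into buf, without a terminating NUL.
   Returns the number of characters written. Calls no library functions. */
static size_t fmt_uint(char *buf, uintptr_t v, unsigned base)
{
    char tmp[3 * sizeof v];
    size_t n = 0, len = 0;
    do {
        tmp[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v);
    while (n)
        buf[len++] = tmp[--n];
    return len;
}

/* Copy string s into buf without the NUL; returns the number of characters. */
static size_t put_str(char *buf, const char *s)
{
    size_t n = 0;
    while (s[n]) { buf[n] = s[n]; n++; }
    return n;
}

/* ELF constructor: runs before main(). Opens the log and records the mappings. */
__attribute__((constructor))
static void siglog_init(void)
{
    char path[64], buf[4096];
    ssize_t n;
    int maps;

    real_sigaction = (sigaction_fn)dlsym(RTLD_NEXT, "sigaction");
    snprintf(path, sizeof path, "/var/log/sighandlers/%ld.log", (long)getpid());
    log_fd = open(path, O_WRONLY | O_APPEND | O_CREAT | O_CLOEXEC, 0600);
    if (log_fd == -1)
        return;
    maps = open("/proc/self/maps", O_RDONLY | O_CLOEXEC);
    if (maps == -1)
        return;
    while ((n = read(maps, buf, sizeof buf)) > 0)
        if (write(log_fd, buf, (size_t)n) != n)
            break;
    close(maps);
}

/* Interposed sigaction(): log the new handler address, then call the original. */
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact)
{
    if (!real_sigaction)  /* fallback if called before the constructor ran */
        real_sigaction = (sigaction_fn)dlsym(RTLD_NEXT, "sigaction");
    if (!real_sigaction) {
        errno = ENOSYS;
        return -1;
    }
    if (act && log_fd != -1) {
        int saved_errno = errno;
        char line[80];
        size_t len = 0;
        uintptr_t addr = (act->sa_flags & SA_SIGINFO)
                       ? (uintptr_t)act->sa_sigaction
                       : (uintptr_t)act->sa_handler;
        len += put_str(line + len, "signal ");
        len += fmt_uint(line + len, (uintptr_t)signum, 10);
        len += put_str(line + len, " handler 0x");
        len += fmt_uint(line + len, addr, 16);
        line[len++] = '\n';
        if (write(log_fd, line, len) == -1) { /* best effort only */ }
        errno = saved_errno;
    }
    return real_sigaction(signum, act, oldact);
}
```

Assuming the file is saved as siglog.c (a made-up name), it could be built with something like gcc -shared -fPIC siglog.c -o libsiglog.so -ldl and used as LD_PRELOAD=./libsiglog.so ./program. SIG_DFL and SIG_IGN show up in the log as small constant values rather than real addresses.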

Related

dlclose gets implicitly called

So I was studying about shared libraries and I read that an implicit dlclose() is performed upon process termination.
I want to know who is responsible for this call. For example, if I wrote:
#include <stdio.h>
int main() {
printf("Hello World\n");
return 0;
}
And then if I did ldd ./a.out then I get a list of these libraries:
linux-vdso.so.1 => (0x00007ffd6675c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2569866000)
/lib64/ld-linux-x86-64.so.2 (0x0000562b69162000)
The dynamic linker is responsible for loading these, right? So who is responsible, upon termination of this ./a.out executable, for the implicit dlclose() of these libraries?
I do not have Kerrisk's book, but if you have accurately characterized its contents then they seem to be a bit simplified. It is not strictly correct to say that whenever a process terminates, the function dlclose() is called for each of its open shared libraries, but it is reasonable to say that whenever a process terminates, all its handles on open shared libraries are closed. As a result, the operating system recognizes that one fewer process references each of those shared libraries, and if that brings any shared libraries' reference counts to zero then the OS may choose to unload them from memory.
dlclose() does more work than that. In particular, it causes any destructor functions in the library to run. Those functions will also run when the process exits normally by returning from main() or by calling exit(), but not if the process terminates by other means, such as calling _exit() or in response to receiving a signal. In the normal-exit case, the net effect may be the same as if dlclose() were called for each open shared library, but even then, that's not necessarily achieved by actually calling dlclose().
Finally, do be aware that although the dl*() functions are defined by POSIX, substantially all details of dynamic / shared libraries are left at the discretion of implementations. Since you've asked about a Linux book, I've referenced a few Linux-specific details.
I suspect the book is only talking about normal process termination when calling exit() or returning from main(). dlopen() presumably registers an atexit() handler that executes all the termination functions of the dynamic libraries.
It's not feasible for libraries to execute any code when a process is terminated abnormally. If the process is terminated by the OS instead of by exiting normally, the OS just releases any file handles, but it won't execute code in the context of the process.
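The difference between normal termination and _exit() can be observed with a small experiment. The sketch below uses an atexit() handler as a stand-in for a library's termination function (a library destructor is not literally the same mechanism), and the function names are made up for illustration. A child process registers the handler, terminates either way, and the parent checks through a pipe whether the handler ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;  /* write end of the pipe, set in the child */

/* Runs only on normal termination: exit() or return from main(). */
static void on_normal_exit(void)
{
    if (write(report_fd, "x", 1) == -1) { /* nothing useful to do */ }
}

/* Fork a child that registers on_normal_exit() and then terminates with
   _exit() (if use_underscore_exit is nonzero) or exit().
   Returns 1 if the handler ran in the child, 0 if not, -1 on error. */
static int atexit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    char c;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_normal_exit);
        if (use_underscore_exit)
            _exit(0);   /* straight to the kernel, no handlers run */
        exit(0);        /* runs atexit handlers first */
    }
    close(fds[1]);
    n = read(fds[0], &c, 1);  /* returns 0 (EOF) if the child wrote nothing */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1;
}
```

With this, atexit_handler_ran(0) should report that the handler ran, and atexit_handler_ran(1) that it did not.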

Hook arbitrary function on to _exit(2)

I'm interested in whether I can call an arbitrary function on _exit(2) call, which bypasses other hooking architectures, so it doesn't seem easy to me.
If this is an ordinary exit(3) or return statement, obviously it's possible by atexit(3), on_exit(3), or __attribute__((destructor)) with gcc extension
It is possible by overriding _exit(2) with LD_PRELOAD; which I wish to avoid
Is there a way to do it without LD_PRELOAD, say, overriding _exit(2)?
Edit: The problem I'm facing is fork(2)ed Perl programs with CoW. The program's child processes run destructors on the exit(3) call, touching many memory locations and causing large memory copies even though the processes are about to exit.
It's hard to bypass destructors with an ordinary exit call in Perl, so one idea is to call POSIX::_exit instead.
However, there is a dynamically loaded library with LD_PRELOAD, and I want to call a function in it on process exit.
As far as I understand, it is simply not possible without LD_PRELOAD tricks, or ptrace(2) with PTRACE_SYSCALL from another process (e.g. the parent process running gdb). At the lowest level, _exit(2) is a system call, so it is an "atomic" operation using the SYSENTER machine instruction, e.g. through vdso(7).
Notice that a C program could use some asm to invoke the _exit syscall (or use the indirect syscall(2)).
Assuming an executable dynamically linked to GNU libc or musl-libc, your only way is to catch the exit(3) library function (not the _exit(2) syscall!) using atexit(3).
You could redefine _exit and hope that the dynamic linker would call your _exit, not the one in libc. I won't play such tricks.
Alternatively, write a small wrapping C program which fork, execve and waitpid the original program.
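A rough sketch of such a wrapper is below. The hook runs in the parent after the child has terminated, so it fires no matter how the child exits, including via _exit(2). The names run_and_wait and after_child_exit are invented for this example, and the convention of returning 128 plus the signal number mimics what shells do rather than anything required.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical hook: whatever should happen once the wrapped program is gone. */
static void after_child_exit(pid_t pid, int status)
{
    fprintf(stderr, "child %ld finished, wait status 0x%x\n",
            (long)pid, (unsigned)status);
}

/* Run the program at path argv[0] with arguments argv and wait for it.
   Returns its exit status, 128 + signal number if it was killed by a
   signal, or -1 on error. */
static int run_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, environ);
        _exit(127);  /* only reached if execve() failed */
    }
    while (waitpid(pid, &status, 0) == -1)
        if (errno != EINTR)
            return -1;
    after_child_exit(pid, status);
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

A main() for the wrapper would then be little more than return run_and_wait(argv + 1); after checking that argc is at least 2.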
Could you make use of the atexit() function?
Near the beginning of main(), call atexit() with the function that you want executed when the program exits.
You can call atexit() numerous times, thereby stacking several things to be executed when the application exits.

Forcefully remove fcntl locks from a different process

Is there any way I can remove fcntl byte range locks on a file from a process that did not lock these ranges?
I have several processes that put byte range locks on files. What I basically need to come up with is an external tool that would help me remove byte range locks for files I specify.
There are two options that immediately come to mind.
Write a kernel module to do this.
As far as I know, there is no kernel facility to do this as of right now.
(You could add a new command to fcntl(), that given superuser privileges or same user as the owner of the lock, does the force-unlock or lock stealing.)
Write a small library that installs a realtime signal handler, say for SIGRTMAX. When this signal is caught (sent by sigqueue(), with an int payload that describes an open file descriptor), release all byte locks on that descriptor.
Alternatively, you can have the signal handler open and read a file or pipe (say /tmp/PID.lock), where the file or pipe contains a data packet defining which file or file descriptor and byte range to unlock.
As long as the library is loaded when the process starts (and possibly interposing all signal() and sigaction() calls to make sure your signal is kept in the call chain), this should work fine.
The second option requires that you preload the library (via the LD_PRELOAD environment variable, or by preloading it for all binaries using /etc/ld.so.preload).
The interposing library is not difficult at all to write. I have shown an example of using an interposing library to monitor fork() calls. In your case, you'd have to think of a good way to define the byte ranges to be unlocked (in file or pipe, triggered by a signal), and handle all that in the signal handler context; but there are enough async-signal-safe low-level unistd.h I/O functions to do this.

Is there a way to close output of stderr in one thread but not others?

Say my program has several threads. Since file descriptors are shared among the threads, if I call close(2) (or fclose(stderr)), none of the threads can output to stderr. My question: is there a way to shut down stderr output in one thread, but not the others?
To be more specific, one thread of my program calls a third party library function, and it keeps outputting warning messages which I know are useless. But I have no access to this third party library's source.
No. File descriptors are global resources available to all threads in a process. Standard error is file descriptor number 2, of course, so it is a global resource and you can't stop the third party code from writing to it.
If the problem is serious enough to warrant the treatment, you can do:
int fd2_copy = dup(2);
int fd2_null = open("/dev/null", O_WRONLY);
Before calling your third-party library function:
dup2(fd2_null, 2);
third_party_library_function();
dup2(fd2_copy, 2);
Basically, for the duration of the third-party library, switch standard error to /dev/null, reinstating the normal output after the function.
You should, of course, error check the system calls.
The downside of this is that while this thread is executing the third party function, any other thread that needs to write to standard error will also write to /dev/null.
You'd probably have to think in terms of adding an 'error writing thread' (EWT) which can be synchronized with the 'third-party library executing thread' (TPLET). Other threads would write a message to the EWT. If the TPLET was executing the third-party library, the EWT would wait until it was done, and only then write any queued messages. (While that would 'work', it is hard work.)
One way around this would be to have the error reporting functions used by the general code (other than the third-party library code) write to fd2_copy rather than standard error per se. This would require a disciplined use of error reporting functions, but is a whole heap easier than an extra thread.
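Putting the dup()/dup2() steps together with the error checking, a sketch could look like this. noisy_library_call() is a made-up stand-in for the third-party function, and call_with_stderr_silenced() is a name I chose; the caveat about other threads writing to standard error in the meantime applies unchanged.

```c
#include <fcntl.h>
#include <unistd.h>

/* Stand-in for the third-party function (hypothetical). */
static void noisy_library_call(void)
{
    static const char msg[] = "useless warning\n";
    if (write(STDERR_FILENO, msg, sizeof msg - 1) == -1) { /* ignored */ }
}

/* Run fn() with file descriptor 2 pointing at /dev/null, then restore it.
   Returns 0 on success, -1 if any system call failed (fn() is not called
   if the redirection could not be set up). */
static int call_with_stderr_silenced(void (*fn)(void))
{
    int rc;
    int saved = dup(STDERR_FILENO);
    int devnull;

    if (saved == -1)
        return -1;
    devnull = open("/dev/null", O_WRONLY);
    if (devnull == -1) {
        close(saved);
        return -1;
    }
    if (dup2(devnull, STDERR_FILENO) == -1) {
        close(devnull);
        close(saved);
        return -1;
    }
    close(devnull);  /* fd 2 now refers to /dev/null on its own */

    fn();

    rc = (dup2(saved, STDERR_FILENO) == -1) ? -1 : 0;
    close(saved);
    return rc;
}
```

Usage would be call_with_stderr_silenced(noisy_library_call). If the library writes through stdio rather than write(), it may also be worth calling fflush(stderr) before and after the switch.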
stderr is per process not per thread, so closing it will close for all threads.
If you want to skip particular messages, maybe you can use grep -v.
On Linux it is possible to give the current thread its own private file descriptor table, using the unshare() function declared in <sched.h>:
unshare(CLONE_FILES);
After that call, you can call close(2); and it will affect only the current thread.
Note however that once the file descriptor table is unshared, you can't go back to sharing it again - it's a one-way operation. This is also Linux-specific, so it's not portable.

Reading shared data inside a signal handler

I am in a situation where I need to read a binary search tree (BST) inside a signal handler (SIGSEGV signal handler, which according to my knowledge is per thread base). The BST can be modified by the other threads in the application.
Now since a signal handler can't use semaphores, mutexes etc. and therefore can't access shared data, How do I solve this problem? Note that my application is multithreaded and running on a multicore system.
You shouldn't access shared data from a signal handler. You can find more information about signals in the following articles:
Linux Signals for the Application Programmer
The Linux Signals Handling Model
All about Linux signals
Looks like the safest way to deal with signals in linux so far is signalfd.
I can see two quite clean solutions:
Linux-specific: Create a dedicated thread handling signals. Catch signals using signalfd(). This way you will handle signals in a regular thread, not any limited handler.
Portable: Also use a dedicated thread that sleeps until signal is received. You may use a pipe to create a pair of file descriptors. The thread may read(2) from the first descriptor and in a signal handler you may write(2) to the second descriptor. Using write() in a signal handler is legal according to POSIX. When the thread reads something from the pipe it knows it must perform some action.
Assuming the SH can't access the shared data directly, then maybe you could do it indirectly:
Have some global variable that only signal handlers can write to, but can be read from elsewhere (even if only within the same thread).
SH sets the flag when it is invoked
Threads poll this flag when they are not in the middle of modifying the BST; when they find it set, they do the processing that is required by the original signal (using whatever synchronization is necessary), and then raise a different signal (like SIGUSR1) to indicate that the processing is done
The SH for THAT signal resets the flag
If you're worried about overlapping SIGSEGVs, add a counter to the mix to keep track. (Hey! You just built your own semaphore!)
The weak link here is obviously the polling, but it's a start.
You might consider mmap-ing a fuse file system (in user space).
Actually, you'll be more happy on Gnu Hurd which has support for external pagers
And perhaps your hack of reading a binary search tree in your signal handler could often work in practice, non-portably and in a kernel version dependent way. Perhaps serializing access with low-level non portable tricks (e.g. futexes and atomic gcc builtins) might work. Reading the (machine specific) source code of NPTL i.e. current Linux pthread routines should help.
It could probably be the case that pthread_mutex_lock etc are in fact usable from inside a Linux signal handler... (because it probably does only futex and atomic instructions).
