How can I switch between different fork()ed processes in gdb?

I'm debugging a multi-process application.
How can I switch between the fork()ed processes?

You can put the child process to sleep and then attach a new instance of GDB to it. The GDB User Manual describes this process as follows (emphasis is mine):
On most systems, gdb has no special
support for debugging programs which
create additional processes using the
fork function. When a program forks,
gdb will continue to debug the parent
process and the child process will run
unimpeded. If you have set a
breakpoint in any code which the child
then executes, the child will get a
SIGTRAP signal which (unless it
catches the signal) will cause it to
terminate.
However, if you want to debug the
child process there is a workaround
which isn't too painful. Put a call to
sleep in the code which the child
process executes after the fork. It
may be useful to sleep only if a
certain environment variable is set,
or a certain file exists, so that the
delay need not occur when you don't
want to run gdb on the child. While
the child is sleeping, use the ps
program to get its process ID. Then
tell gdb (a new invocation of gdb if
you are also debugging the parent
process) to attach to the child
process (see Attach). From that point
on you can debug the child process
just like any other process which you
attached to.
The long and the short of it is that when you start a program that later forks, GDB will stay connected to the parent process (though you can follow the child process, instead, by using set follow-fork-mode child). By putting the other process to sleep, you can have a new instance of GDB connect to it, as well.
Use set detach-on-fork off to hold both processes under the control of gdb. By default, the parent process will be debugged as usual and the child will be held suspended, but by calling set follow-fork-mode child you can change this behavior (so that the child process will be debugged as usual and the parent will be held suspended). The GDB User Manual describes this process as follows:
gdb will retain control of all forked
processes (including nested forks).
You can list the forked processes
under the control of gdb by using the
info inferiors command, and switch
from one fork to another by using the
inferior command (see Debugging
Multiple Inferiors and Programs).
To quit debugging one of the forked
processes, you can either detach from
it by using the detach inferiors
command (allowing it to run
independently), or kill it using the
kill inferiors command. See Debugging
Multiple Inferiors and Programs.

Show all the processes.
(gdb) info inferiors
Num Description Executable
1 process 1000 /tmp/a.out
* 2 <null> /tmp/a.out # current attach inferior
Switch between different processes.
(gdb) inferior 1
[Switching to inferior 1 [process 1000] (/tmp/a.out)]
[Switching to thread 1.1 (LWP 1000)]

Related

A C program process is waited by some OS routine?

Well, I'm learning about processes using the C language, and I have seen that when you call the exit function a process is terminated and, unless something waits for it, it becomes a zombie process. My question is: if the first process created when executing the program is a process itself, is there an OS routine that waits for it after an exit() call, so that it does not become a zombie process? I'm curious about it.
For Unix systems at least (and I expect Windows is similar), when the system boots, it creates one special first process. Every process after that is created by some existing process.
When you log into a windowed desktop interface, there is some desktop manager process (that has been created by the first process or one of its descendants) managing windows. When you start a program by clicking on it, that desktop manager or one of its children (maybe some file manager software) creates a process to run the program. When you start a program by executing a command in a terminal window, there is a command line shell process that is interpreting the things you type, and it creates a process to run the program.
So, in all cases, your user program has a parent process, either a command-line shell or some desktop software.
If a child process creates another child (even as the first instruction) then the parent also has to wait for it or it becomes a zombie.
Basically, a terminated process always remains a zombie until it is removed from the process table. The OS (via the init process) adopts orphans (processes whose parent has exited) and wait()s for them, so normally you won't have orphaned zombies lingering for very long.
On Linux, the topmost (parent) process is init. This is the only process that has no parent. Every other process, without exception, has a parent and hence is a child of another process.
See:
init
Section NOTES on wait
A child that terminates, but has not been waited for becomes a
"zombie". The kernel maintains a minimal set of information
about the zombie process (PID, termination status, resource usage
information) in order to allow the parent to later perform a wait
to obtain information about the child. As long as a zombie is
not removed from the system via a wait, it will consume a slot in
the kernel process table, and if this table fills, it will not be
possible to create further processes. If a parent process
terminates, then its "zombie" children (if any) are adopted by
init(1), ... init(1) automatically performs a wait to remove the
zombies.

How to programmatically backtrace crash of forked child using C

Is there a possibility to backtrace a location where child process crashed in Linux using C/C++ code?
What I want to do is the following:
fork a new child process and retrieve its PID
wait for forked child process to crash ... probably using signal handler for SIGCHLD, or using waitpid()/waitid()
retrieve stack trace of child at location where it crashed
This would make the parent process act similarly to a debugger when an attached process crashes.
You can assume that child process is compiled with debug symbols and parent process has root permissions.
What is the simplest way to achieve such functionality?
It is much simpler in Linux to use the libSegFault library provided as part of the GNU C library. On my system, it is installed in /lib/x86_64-linux-gnu/libSegFault.so.
All you need to do is to set SEGFAULT_SIGNALS environment variable to all (so you can catch all crash causes the library supports), optionally SEGFAULT_OUTPUT_NAME to point to the file the stack trace is written to (default is to standard error), and LD_PRELOAD to point to the segfault library. As long as the process does not modify these environment variables, they apply to all child processes as well.
For example, if ./yourprog was the program that forks a child that crashes, and you want the stack trace to ./yourprog.stacktrace, run
SEGFAULT_SIGNALS=all \
SEGFAULT_OUTPUT_NAME=./yourprog.stacktrace \
LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so \
./yourprog
or all in one line without the backslashes (\).
The only downside is that each crash overwrites the existing file, so you'll only see the latest one. If you have /proc mounted, then the crash dump includes both a backtrace and the memory map of the process at the crash moment.
If you insist on doing it in your own C program, I recommend you first take a look at the libSegFault sources.
The point is, the stack trace must be dumped by the process itself; it is not accessible to the parent. To do that, you inject code into the child process using e.g. LD_PRELOAD environment variable (which is one of the dynamic linker control variables in Linux). (Note that the stack tracing etc. is done in a signal handler context, so only async-signal-safe functions should be used.)
For example, the parent process can create a pipe, and move its write end to a specific descriptor in the child process before executing the target process, with your helper preload library path in LD_PRELOAD.
The helper preload library interposes signal(), sigaction(), and possibly sigprocmask(), sigwait(), sigwaitinfo(), pthread_sigmask(), to ensure the helper library's crash dump signal handlers are executed when such a signal is delivered (SIGSEGV, SIGBUS, SIGILL, possibly SIGTRAP). The signal handler does the stack dump (and prints /proc/PID/maps), then sets the signal disposition to default, and re-raises the signal (using raise()).
Essentially, it boils down to doing the same as above libSegFault, except with your own C code.
If you don't want to inject code into the child process, or managing the signal handlers is too complicated, you can use ptrace instead.
When the tracee is killed by a signal (other than SIGKILL), the thread receiving the signal is stopped first ("signal-delivery-stop"), so the tracer can examine its stack (and memory map of the tracee), before letting the child process continue/die.
In practice, ptracing is more invasive, as there are many events that cause the tracee's threads to stop. It is also much more complicated for multithreaded processes than the LD_PRELOAD approach, because ptrace controls individual threads in the tracee; there are many more details to get right.

waitpid for child process not succeeding

I am starting a process using execv and letting it write to a file. I start a thread simultaneously that monitors the file, using stat.st_size, so that its size does not exceed a certain limit. Now, when the limit is hit, I waitpid for the child process, but this throws an error and the process I start in the background becomes a zombie. When I do the stop using the same waitpid from the main thread, the process is killed without becoming a zombie. Any ideas?
Edit: The errno is 10 and waitpid returns -1. This is on a Linux platform.
This is difficult to debug without code, but errno 10 is ECHILD.
Per the man page, this is returned as follows:
ECHILD (for waitpid() or waitid()) The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the Linux Notes section about threads.)
In short, the pid you are specifying is not a child of the process calling waitpid() (or is no longer, perhaps because it has terminated).
Note the parenthetical section:
"This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN" - if you've set the disposition of SIGCHLD to SIG_IGN, the wait is effectively done automatically, and therefore waitpid won't work, as the child will already have been reaped (it does not go through the zombie state).
"See also the Linux Notes section about threads." - In Linux, threads are essentially processes. Modern Linux will allow one thread to wait for children of other threads (provided they are in the same thread group - broadly, the same parent process). If you are using Linux prior to 2.4, this is not the case. See the documentation on __WNOTHREAD for details.
I'm guessing the thread thing is a red herring, and the problem is actually the signal handler, as this accords with your statement 'the process is killed without becoming a zombie.'

How to make child process die whenever parent process gets restarted

Daemon X spawns process Y. Sometimes daemon X could die abruptly and in that case it did not have a chance to properly terminate its child process Y (that is, process Y would remain running in the background). How to make sure that Y gets always terminated whenever X abruptly died?
Currently I have implemented daemon X in such a way that, if it dies abruptly, it gets restarted, reads process Y's pid file, and terminates Y by using kill(pid, SIGTERM). This solution, however, has its drawbacks: before killing process Y, I need to make sure that it is indeed process Y (because some other, newer process could be reusing the same pid that was in Y's pid file). Even if X checks process Y's name against /proc/<pid>/, there is still a small window in which X could theoretically kill the wrong process.
Since process Y is not developed by me, I can't use prctl(PR_SET_PDEATHSIG, SIGTERM) from Y.
Also, system("killall Y") is too broad for my use case.
Is there a better way to solve this problem than what I currently have?
The process groups/sessions are probably the way to go. How about this:
If X does setsid(), remove it from the code. Write a very simple wrapper shell script to run the daemon:
#!/bin/sh
/usr/bin/xd || pkill -TERM -s 0
and run it with the setsid command. If the daemon process exits abnormally, pkill will send a fatal signal to all processes in the current session. The session is inherited through fork()/exec(), and the session leader (the shell process) still exists when pkill is executed, so unless the child calls setsid() as well, there is no chance of the child escaping or of killing the wrong process.
For Linux only:
Using the function prctl() with option PR_SET_PDEATHSIG, a process can choose a signal that is sent to it in case its parent dies. The setting applies only to the calling process; it is not inherited by children it creates with fork().
Verbatim from the Linux man page for prctl():
PR_SET_PDEATHSIG (since Linux 2.1.57)
Set the parent process death signal of the calling process to
arg2 (either a signal value in the range 1..maxsig, or 0 to
clear). This is the signal that the calling process will get
when its parent dies. This value is cleared for the child of
a fork(2) and (since Linux 2.4.36 / 2.6.23) when executing a
set-user-ID or set-group-ID binary.

How to efficiently debug processes created by fork()

I want to know whether there is any way to debug (using gdb) both child and parent processes simultaneously. I know the command used to change processes to child one. But that is not the solution I am looking for, as this method only has control of either the child or the parent. I am looking for simultaneous execution steps of both child and parent.
For instance, say the child is executing the a'th step in program b, while the parent is executing the c'th step in program d.
It seems that stepping through the processes of both parent and child is necessary. Is there any way to do this, and if so, how might I go about doing it?
I want to know whether there is any way to debug(using gdb) both child and parent processes simultaneously.
Yes, there is. Documentation here and here. If you are on Linux, you'll want to
(gdb) set detach-on-fork off
Regarding ctt's comment, it's not hard to do with gdb, as I've just found out. I found this resource, which works for both ddd and gdb:
Using gdb/ddd to debug child processes
If you have tried to debug a child process using ddd, you may have
noticed that ddd steps into the parent (and not the child) after the
call to fork(). It is possible to debug the child as well, but it
requires a special procedure. Since the child is a separate process,
it will require a second debugger window, and we will make use of
gdb's ability to "attach" to a process which is already running.
Before you start, you must do the following:
Make sure your call to fork() assigns a value to some variable so you
can read it easily, e.g. pid = fork().
Make sure you place a sleep() statement in the child as the first line
of code after the fork(), e.g. sleep(60) [make the sleep() long
enough for step 4 below ...]. The sleep() statement can be removed once
debugging is complete. [You only need to do this if you need to attach to the child process right away.]
Compile your program with the "-g" option set, e.g. gcc myProg.c -o myProg -g
Now you are ready to start:
1. Start 2 copies of ddd [or gdb] in the background, like ddd myProg & ddd myProg & [or for gdb, start gdb as normal (e.g. gdb myProg) in two separate terminals]. It is important that the two copies be running concurrently.
2. Pick (arbitrarily) one window to be the "parent" and set a breakpoint
after the call to fork() (but not in any code the child will be
executing ... that is, set the breakpoint somewhere in the parent's
code ... if you set the breakpoint in the child's code, DDD will kill
the child as soon as it is created!).
3. Run the parent to the breakpoint. Note the value returned by fork(),
i.e. the process ID of the child.
4. In the "child" window, type "attach <pid>" in the gdb console window,
where <pid> is the child's process ID. Note: the gdb console is found
at the bottom of the ddd window; this is where you can type commands
directly to gdb.
5. Set a breakpoint in the child after the sleep() statement, and click
on "cont" (in the popup "Command Tool" window) to allow the child to
continue execution to the breakpoint.
You will also need to type c or continue in the child gdb after you attach. Use print pid to see the value of pid.
If you have a process that kicks off a different process, then start the child gdb passing in the name of the child process. Ex. if program a kicks off program b, you want one terminal with gdb a (parent gdb) and one with gdb b (child gdb). In this case, you run from the gdb a terminal (it is no longer arbitrary).
This answer was more helpful to me because I'm using an older version of gdb, for which set detach-on-fork off wasn't working as I expected. This way you also don't have to suspend the process(es) you're not currently debugging, which may be required for some purposes.
