How to efficiently debug processes created by fork() - c

I want to know whether there is any way to debug (using gdb) both child and parent processes simultaneously. I know the command that switches gdb to the child process. But that is not the solution I am looking for, as that method only gives control of either the child or the parent. I am looking for a way to step through both child and parent at the same time.
For instance, say the child is executing the a'th step in program b, while the parent is executing the c'th step in program d.
It seems that stepping through the processes of both parent and child is necessary. Is there any way to do this, and if so, how might I go about doing it?

I want to know whether there is any way to debug(using gdb) both child and parent processes simultaneously.
Yes, there is. Documentation here and here. If you are on Linux, you'll want to
(gdb) set detach-on-fork off

Regarding ctt's comment, it's not hard to do with gdb, as I've just found out. I found this resource, which works for both ddd and gdb:
Using gdb/ddd to debug child processes
If you have tried to debug a child process using ddd, you may have
noticed that ddd steps into the parent (and not the child) after the
call to fork(). It is possible to debug the child as well, but it
requires a special procedure. Since the child is a separate process,
it will require a second debugger window, and we will make use of
gdb's ability to "attach" to a process which is already running.
Before you start, you must do the following:
1. Make sure your call to fork() assigns a value to some variable so you
   can read it easily, e.g. pid = fork().
2. Make sure you place a sleep() statement in the child as the first line
   of code after the fork(), e.g. sleep(60) [make the sleep() long
   enough for step 4 below ...]. The sleep() statement can be removed once
   debugging is complete. [You only need to do this if you need to attach to the child process right away.]
3. Compile your program with the "-g" option set, e.g. gcc myProg.c -o myProg -g
Now you are ready to start:
1. Start 2 copies of ddd [or gdb] in the background, like ddd myProg & ddd myProg & [or for gdb, start gdb as normal (e.g. gdb myProg) in two separate terminals]. It is important that the two copies be running concurrently.
2. Pick (arbitrarily) one window to be the "parent" and set a breakpoint
   after the call to fork() (but not in any code the child will be
   executing ... that is, set the breakpoint somewhere in the parent's
   code ... if you set the breakpoint in the child's code, DDD will kill
   the child as soon as it is created!).
3. Run the parent to the breakpoint. Note the value returned by fork(),
   i.e. the process ID of the child.
4. In the "child" window, type "attach <pid>" in the gdb console window,
   where <pid> is the child's process ID. Note: the gdb console is found
   at the bottom of the ddd window; this is where you can type commands
   directly to gdb.
5. Set a breakpoint in the child after the sleep() statement, and click
   on "cont" (in the popup "Command Tool" window) to allow the child to
   continue execution to the breakpoint.
You will also need to type c or continue in the child gdb after you attach. Use print pid to see the value of pid.
If you have a process that kicks off a different process, then start the child gdb passing in the name of the child process. Ex. if program a kicks off program b, you want one terminal with gdb a (parent gdb) and one with gdb b (child gdb). In this case, you run from the gdb a terminal (it is no longer arbitrary).
This answer was more helpful to me because I'm using an older version of gdb, for which set detach-on-fork off wasn't working as I expected. This way you also don't have to suspend the process(es) you're not currently debugging, which may be required for some purposes.

Related

child process changing memory image

I was reading about processes and I came across this:
Usually, the child process then executes execve or a similar system call to change its memory image
what I can derive from this is this pseudocode:
if (child_created_successfully)
{
    do_ABC_and_ignore_the_rest_of_the_parents_control_flow(); // is this what it meant by "change its memory image"?
}
(Question asked in the pseudocode's comment)
I completely don't understand this other part:
For example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The reason for this two-step
process is to allow the child to manipulate its file descriptors after the fork but
before the execve in order to accomplish redirection of standard input, standard
output, and standard error.
Regarding the first part
Usually, the child process then executes execve or a similar system
call to change its memory image
This simply means that the child replaces the program it is running: execve discards the code, data, heap, and stack inherited from the parent and loads a new program in their place. Until then the child is a copy of the parent. Since the new process is forked at time T, at time T + 1, when it starts to run, its memory contents are still identical to the parent's, so the kernel does not copy everything up front but uses an optimization called 'copy on write', more here.
Regarding the second part
For example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The
reason for this two-step process is to allow the child to manipulate
its file descriptors after the fork but before the execve in order to
accomplish redirection of standard input, standard output, and
standard error.
Simply put, this means that when you execute a shell command (like ls, ps, grep, nstat...), the shell forks itself, and the command is executed by the new child process. An easy way to understand this is ps | grep ps: the shell first forks and creates a new process, and then this part comes into play
this two-step process is to allow the child to manipulate its file
descriptors after the fork but before the execve
and the standard output of that process is connected to a pipe before it executes ps. The shell also forks one more process for grep ps, connects its standard input to the other end of the pipe, and that process executes grep. Because both commands run at the same time, you will typically see both the ps process and the grep ps process in the output.

Stepping through a multithreaded application with GDB

1st foray into using pthreads to create a multithreaded application
I'm trying to debug with gdb but getting some strange, unexpected behaviour
Trying to ascertain whether it's me or gdb at fault
Scenario:
Main thread creates a child thread.
I place a breakpoint on a line in the child thread fn
gdb stops on that breakpoint no problem
I confirm there are now 2 threads with info threads
I also check that the 2nd thread is starred, i.e. it is the current thread for gdb's purposes
Here is the problem, when I now hit n to step through to the next line in the thread fn, the parent thread (thread 1) simply resumes and completes and gdb exits.
Is this the correct behaviour?
How can I step through the thread fn code that is being executed in the 2nd thread line by line with gdb?
In other words, even though thread 2 is confirmed as the current thread by gdb, when I hit n, it seems to be the equivalent of hitting c in the parent thread, i.e. the parent thread (thread 1) just resumes execution, completes and exits.
At a loss as to how to debug multiple threads with gdb behaving as it is currently
I am using gdb from within emacs25, i.e. M-x gud-gdb
What GDB does here depends on your settings, and also your system (some vendors patch this area).
Normally, in all-stop mode, when the inferior stops, GDB stops all the threads. This gives you the behavior that you'd "expect" -- you can switch freely between threads and see what is going on in each one.
When the inferior continues, including via next or step, GDB lets all threads run. So, if your second thread doesn't interact with your first thread in any way (no locks, etc), you may well see it exit.
However, you can control this using set scheduler-locking. Setting this to on will make it so that only the current thread can be resumed. And, setting it to step will make it so that only the current thread can be resumed by step and next, but will let all threads run freely on continue and the like.
The default mode here is replay, which is basically off, except when using record-and-replay mode. However, the Fedora GDB is built with the default as step; I am not sure if other distros followed this, but you may want to check.
Yes, this is the correct behaviour of gdb. You are only debugging the currently active thread; the other threads are executing normally behind the scenes. Think about it: how else would the other threads make progress?
But your code has a bug. Your parent thread should not exit before the child thread is done. The best way to ensure this is to join the child thread in the main thread before exiting.

How can I switch between different processes fork() ed in gdb?

I'm debugging such a multiple process application,
how can I switch between the fork()ed processes?
You can put the child process to sleep and then attach a new instance of GDB to it. The GDB User Manual describes this process as follows (emphasis is mine):
On most systems, gdb has no special
support for debugging programs which
create additional processes using the
fork function. When a program forks,
gdb will continue to debug the parent
process and the child process will run
unimpeded. If you have set a
breakpoint in any code which the child
then executes, the child will get a
SIGTRAP signal which (unless it
catches the signal) will cause it to
terminate.
However, if you want to debug the
child process there is a workaround
which isn't too painful. Put a call to
sleep in the code which the child
process executes after the fork. It
may be useful to sleep only if a
certain environment variable is set,
or a certain file exists, so that the
delay need not occur when you don't
want to run gdb on the child. While
the child is sleeping, use the ps
program to get its process ID. Then
tell gdb (a new invocation of gdb if
you are also debugging the parent
process) to attach to the child
process (see Attach). From that point
on you can debug the child process
just like any other process which you
attached to.
The long and the short of it is that when you start a program that later forks, GDB will stay connected to the parent process (though you can follow the child process, instead, by using set follow-fork-mode child). By putting the other process to sleep, you can have a new instance of GDB connect to it, as well.
Use set detach-on-fork off to hold both processes under the control of gdb. By default, the parent process will be debugged as usual and the child will be held suspended, but by calling set follow-fork-mode child you can change this behavior (so that the child process will be debugged as usual and the parent will be held suspended). The GDB User Manual describes this process as follows:
gdb will retain control of all forked
processes (including nested forks).
You can list the forked processes
under the control of gdb by using the
info inferiors command, and switch
from one fork to another by using the
inferior command (see Debugging
Multiple Inferiors and Programs).
To quit debugging one of the forked
processes, you can either detach from
it by using the detach inferiors
command (allowing it to run
independently), or kill it using the
kill inferiors command. See Debugging
Multiple Inferiors and Programs.
Show all the processes.
(gdb) info inferiors
  Num  Description       Executable
  1    process 1000      /tmp/a.out
* 2    <null>            /tmp/a.out   # current attach inferior
Switch between different processes.
(gdb) inferior 1
[Switching to inferior 1 [process 1000] (/tmp/a.out)]
[Switching to thread 1.1 (LWP 1000)]

Using gdb for fork() system call

I want to use gdb to look into the various details of the fork() system call. To do this, I set a breakpoint at fork() and from there onwards I am using the step command, but this does not work well.
Can somebody explain to me how to use gdb to look into every single step occurring during the fork() system call?
Maybe you meant that you want to follow the child process instead of the parent once the fork is called? In that case:
If you want to follow the child
process instead of the parent process,
use the command set follow-fork-mode.
set follow-fork-mode mode
Set the debugger response to a program call of fork or vfork. A call
to fork or vfork creates a new
process. The mode argument can be:
parent: The original process is debugged after a fork. The child
process runs unimpeded. This is the
default.
child: The new process is debugged after a fork. The parent process runs
unimpeded.
If you want to see what's happening, it is best to look at the kernel code first; check it here.
I don't think you can single-step through the kernel from user space. You can use a virtual server to do the debugging using KGDB. Check the blog post here. Or you can use KGDB on the main kernel.

Passing the shell to a child before aborting

Current scenario: I launch a process that forks, and after a while it calls abort().
The thing is that both the fork and the original process print to the shell, but after the original one dies, the shell "returns" to the prompt.
I'd like to avoid the shell returning to the prompt and keep as if the process didn't die, having the child handle the situation there.
I'm trying to figure out how to do it but nothing yet, my first guess goes somewhere around tty handling, but not sure how that works.
I forgot to mention, the shell takeover for the child could be done on fork-time, if that makes it easier, via fd replication or some redirection.
I think you'll probably have to go with a third process that handles user interaction, communicating with the "parent" and "child" through pipes.
You can even make it a fairly lightweight wrapper, just passing data back and forth to the parent and terminal until the parent dies, and then switching to passing to/from the child.
To add a little further, as well, I think the fundamental problem you're going to run into is that the execution of a command by the shell just doesn't work that way. The shell is doing the equivalent of calling system() -- it's going to wait for the process it just spawned to die, and once it does, it's going to present the user with a prompt again. It's not really a tty issue, it's how the shell works.
bash (and I believe other shells) have the wait command:
wait: wait [n]
Wait for the specified process and report its termination status. If
N is not given, all currently active child processes are waited for,
and the return code is zero. N may be a process ID or a job
specification; if a job spec is given, all processes in the job's
pipeline are waited for.
Have you considered inverting the parent child relationship?
If the order in which the new processes will die is predictable, run the code that will abort in the "child" and the code that will continue in the parent.
