If given no arguments and no redirection is used, the cat command reads from standard input.
But when I execute it with execve() it doesn't behave as it does in bash.
Code:
#include <unistd.h>
#include <fcntl.h>
int main(int ac, char **av, char **env)
{
    char *args[] = {"/bin/cat",NULL};
    int ps = fork();
    if (!ps)
        execve("/bin/cat",args, env);
}
Output :
cat: stdin: Input/output error
I tried running it with no arguments, but it returns an error instead.
Assumption: you're running this program from a shell which is running in a terminal. (Otherwise you wouldn't see this behavior.)
Note: if you try to reproduce this, you may see different effects due to a race condition between the parent and the child. The behavior in the question is the most likely one, and you can force it with the child_sleeps variant below, but you might observe different behavior if the child gets a lot of CPU time before the parent exits — I'll explain below with the parent_sleeps variant.
cat with no argument tries to read from its standard input (i.e. file descriptor 0). Since nothing has redirected the standard input, it's still the terminal. What could go wrong when trying to read from a terminal? Let's consult some documentation about read, for example the OpenBSD man page [1] or the Linux man page or the POSIX specification. Citing POSIX (whose wording the others more or less copy), read fails with EIO when:
The process is a member of a background process group attempting to read from its controlling terminal, and either the calling thread is blocking SIGTTIN or the process is ignoring SIGTTIN or the process group of the process is orphaned.
To understand this, you need to understand the basics of process groups and how they interact with terminals.
The basic idea of a process group is that it consists of a process, its children, its grandchildren, etc., apart from the sub(sub…)processes that have moved into their own process group. When you run a command from a shell, the shell puts it in its own process group. So there is a process group which contains both the original process of your program and the child process created by your call to fork, and no other process.
Now for the part about the terminal. The basic idea is that only one program should have access to a terminal at a time. Otherwise, which program would receive the input? Some programs use subprocesses, so the ownership of the terminal goes to a process group, not just a process. The process group that owns the terminal is called the foreground process group, and other process groups are background process groups. The shell command fg makes a process group become the foreground process group.
When a process tries to read from a terminal, the kernel checks whether it “owns” the terminal. More precisely, the process should be in the foreground process group. Unrelated processes are also allowed to read from a terminal (as long as they have had permission to open it, that's fair game, even if it's an unusual thing to do). But a process belonging to a background process group is not allowed to read from the terminal. Normally, the kernel sends the process a SIGTTIN signal, and the default effect is to suspend the process [2][3]. But if the process ignores or blocks SIGTTIN, there is a further step to prevent the process from reading: the read system call errors out with EIO. This is to avoid a situation where an unrelated background program would accidentally “steal” some input from the foreground program.
Now we can connect this with what happens with cat. By the time cat runs, its parent has exited. (In principle the parent might not have exited yet, if cat starts sufficiently fast, but it's unlikely. I'll discuss this below with the parent_sleeps variant.) So the child cat process is alone in its process group. When the parent process exits, the shell takes back ownership of the terminal, so the process group of cat is a background process group, and the kernel will try to prevent it from reading from the terminal.
But we're still not quite there yet: cat does not try to handle SIGTTIN, so why doesn't the kernel send this signal? It's the other case for EIO: the orphaned process group. Once the parent process exits (and the shell notices), cat's parent process no longer exists. But processes must have a parent, so the init process (PID 1) “adopts” orphan processes: if a process's original parent disappears, the process's parent is set to 1. Since cat was alone in its process group, and the parent process is 1 which is not part of the same session, the process group is an orphan process group, and the kernel makes read return EIO.
By the way, the reason for the different treatment for orphan process groups is that in the normal case, a background process group might go back into the foreground if the user runs the fg command in the shell. So if the background program tries to read from the terminal, it's suspended until hopefully it regains access to the terminal. But if the process group is orphaned, there is no longer a shell job that the user can put in the foreground, so there is no “normal” way for the process to ever get back into a state where it would be allowed to read from the terminal. So there's no point in suspending it: reading is and will remain an error.
To allow cat to keep reading from the terminal, keep its parent process running, so that their process group stays in the foreground. You can run the following variation parent_waits where the parent waits for the child to exit.
/* parent_waits.c */
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>   /* declares wait() */
int main(int ac, char **av, char **env)
{
    char *args[] = {"/bin/cat",NULL};
    int ps = fork();
    if (ps) {
        int status;
        wait(&status);   /* the parent stays alive until cat exits */
    } else {
        execve("/bin/cat",args, env);
    }
}
I mentioned above that there is a race condition. If you can't reliably reproduce the behavior in the question, use the child_sleeps variant below, where the child sleeps for long enough for the parent to finish exiting.
/* child_sleeps.c */
#include <unistd.h>
#include <fcntl.h>
int main(int ac, char **av, char **env)
{
    char *args[] = {"/bin/cat",NULL};
    int ps = fork();
    if (!ps) {
        usleep(100000);
        execve("/bin/cat",args, env);
    }
}
If the parent is slow to exit, it's possible that cat will be able to read before the parent exits. You can force this behavior by adding a delay before starting cat, with the following parent_sleeps variant:
/* parent_sleeps.c */
#include <unistd.h>
#include <fcntl.h>
int main(int ac, char **av, char **env)
{
    char *args[] = {"/bin/cat",NULL};
    int ps = fork();
    if (ps) {
        sleep(1);
    } else {
        execve("/bin/cat",args, env);
    }
}
With this variant, until the sleep ends (1 second in the code above, adjust as desired), cat works normally. Then the parent exits and you get a shell prompt back. After that, when cat tries to read again, it receives EIO.
$ ./parent_sleeps
one
one
$ two
two
/bin/cat: -: Input/output error
A final note: you might be tempted to observe what's going on by looking in a debugger, or by tracing system calls. But you need to be careful not to change the situation with respect to process groups. For example, under Linux, if you try to trace the program normally with strace, the strace process is also in the process group and remains in the foreground.
strace -o strace_in_foreground.strace -f ./a.out
To observe the system calls leading to the EIO case, tell strace to detach the traced program.
strace -o program_in_background.strace -D -f ./a.out
[1] Disappointingly, the FreeBSD manual omits the relevant case.
[2] If this happens in a process group that is a job in a shell, the shell prints a message like “suspended (tty input)” (zsh) or “Stopped (SIGTTIN)” (ksh) or “Stopped” (bash).
[3] The same happens with an attempt to write and the SIGTTOU signal. With output however, the process can ignore SIGTTOU and the write will go through.
In this code (run on linux):
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
void child_process()
{
    int count=0;
    for(;count<1000;count++)
    {
        printf("Child Process: %04d\n",count);
    }
    printf("Child's process id: %d\n",getpid());
}
void parent_process()
{
    int count=0;
    for(;count<1000;count++)
    {
        printf("Parent Process: %04d\n",count);
    }
}
int main()
{
    pid_t pid;
    int status;
    if((pid = fork()) < 0)
    {
        printf("unable to create child process\n");
        exit(1);
    }
    if(pid == 0)
        child_process();
    if(pid > 0)
    {
        /* wait() returns the PID of the terminated child */
        printf("Return value of wait: %d\n",wait(&status));
        parent_process();
    }
    return 0;
}
If the wait() were not present in the code, one of the processes (child or parent) would finish its execution, then control would be given to the Linux terminal, and finally the remaining process (child or parent) would run. The output in such a case is:
Parent Process: 0998
Parent Process: 0999
guest#debian:~/c$ Child Process: 0645 //Control given to terminal & then child process is again picked for processing
Child Process: 0646
Child Process: 0647
In case wait() is present in the code, what should be the flow of execution?
When fork() is called, a process tree must be created containing the parent and child processes. In the above code, when the child process ends, the parent is informed about the death of the child (zombie) process via the wait() system call. But since parent and child are two separate processes, is it mandatory that control passes directly to the parent after the child process is over (with no control given to any other process, like the terminal, at all)? If yes, then it is as if the child process were a part of the parent process (like a function called from another function).
This comment is, at least, misleading:
//Control given to terminal & then child process is again picked for processing
The "terminal" process doesn't really enter into the equation. It's always running, assuming that you are using a terminal emulator to interact with your program. (If you're using the console, then there is no terminal process. But that's unlikely these days.)
The process in control of the user interface is whatever shell you're using. You type some command-line like
$ ./a.out
and the shell arranges for your program to run. (The shell is an ordinary user program without special privileges, by the way. You could write your own.)
Specifically, the shell:
Uses fork to create a child process.
Uses waitpid to wait for that child process to finish.
The child process sets up any necessary redirects and then uses some exec system call, typically execve, to replace itself with the ./a.out program, passing execve (or whatever) the command line arguments you specified.
That's it.
Your program, in ./a.out, uses fork to create a child and then possibly waits for the child to finish before terminating. As soon as your parent process terminates, the shell's waitpid() can return, and as soon as it returns, the shell prints a new command prompt.
So there are at least three relevant processes: the shell, your parent process, and your child process. In the absence of synchronisation functions like waitpid(), there are no guarantees about ordering. So when your parent process calls fork(), the created child could start executing immediately. Or not. If it does start executing immediately, it does not necessarily preempt your parent process, assuming your computer is reasonably modern and has more than one core. They could both be executing at the same time. But that's not going to last very long because your parent process will either immediately call exit or immediately call wait.
When a process calls wait (or waitpid), it is suspended and becomes runnable again when the process it is waiting for terminates. But again there are no guarantees. The mere fact that a process is runnable doesn't mean that it will immediately start running. But generally, in the absence of high load, the operating system will start running it pretty soon. Again, it might be running at the same time as another process, such as your child process (if your parent didn't wait for it to finish).
In short, if you performed your experiment a million times, and your parent waits for your child, then you will see the same result a million times; the child must finish before the parent is unsuspended, and your parent must finish before the shell is unsuspended. (If your parent process printed something before waiting, you would see different results; the parent and child outputs could be in any order, or even overlapped.)
If, on the other hand, your parent does not wait for the child, then you could see any of a number of results, and in a million repetitions you're likely to see more than one of them (but not with the same probability). Since there is no synchronisation between parent and child, the outputs could appear in either order (or be interleaved). And since the child is not synchronised with the shell, its output could appear before or after the shell's prompt, or be interleaved with the shell's prompt. No guarantees, other than that the shell will not resume until your parent is done.
Note that the terminal emulator, which is a completely independent process, is runnable the entire time. It owns a pseudo-terminal ("pty") which is how it emulates a terminal. The pseudo-terminal is a kind of pipe; at one end of the pipe is the process which thinks it's communicating with a console, and at the other end is the terminal emulator which interprets whatever is being written to the pty in order to render it in the GUI, and which sends any keystrokes it receives, suitably modified as a character stream back through the pipe. Since the terminal emulator is never suspended and its execution is therefore interleaved with whatever other processes are active on your computer, it will (more or less) immediately show you any output which is sent by your shell or the processes it starts up. (Again, assuming the machine is not overloaded.)
I am writing a shell in C and I am trying to add signal handling. In the shell, fork() is called and the child process executes a shell command. The child process is put into its own process group. This way, if Ctrl-C is pressed when a child process is in the foreground, it closes all of the processes that share the same process group id. The shell executes the commands as expected.
The problem is the signals. When, for example, I execute "sleep 5", and then I press Ctrl-C for SIGINT, the "shell>" prompt comes up as expected but the process is still running in the background. If I quickly run "ps" after I press Ctrl-C, the sleep call is still there. Then after the 5 seconds are up and I run "ps" again, it's gone. The same thing happens when I press Ctrl-Z (SIGTSTP). With SIGTSTP, the process goes to the background, as expected, but it doesn't pause execution. It keeps running until it's finished.
Why are these processes being sent to the background like this and continuing to run?
Here is the gist of my code...
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int status;
void sig_handler_parent()
{
    printf("\n");
}
void sig_handler_sigchild(int signum)
{
    waitpid(-1, &status, WNOHANG);
}
int main()
{
    signal(SIGCHLD, sig_handler_sigchild);
    signal(SIGINT, sig_handler_parent);
    signal(SIGQUIT, sig_handler_parent);
    signal(SIGTERM, sig_handler_parent);
    signal(SIGCONT, sig_handler_parent);
    signal(SIGTSTP, sig_handler_parent);
    while (1)
    {
        printf("shell> ");
        // GET COMMAND INPUT HERE
        pid = fork();
        if (pid == 0)
        {
            setpgid(getpid(), getpid());
            execvp(cmd[0], cmd);
            printf("%s: unknown command\n", cmd[0]);
            exit(1);
        }
        else
            waitpid(0, &status, WUNTRACED);
    }
    return 0;
}
p.s. I have already tried setting all of the signal handlers to SIG_DFL before the exec command.
The code you provide does not compile, and an attempt to fix it shows
that you omitted a lot. I am only guessing.
In order to bring you forward, I'll point out a number of facts that
you might have misunderstood. Together with a couple of documentation
links, I hope this is helpful.
Error Handling
First: please make a habit of handling errors, especially when you
know there's something that you don't understand. For example, the
parent (your shell) waits until the child terminates,
waitpid(0, &status, WUNTRACED);
You say,
When, for example, I execute "sleep 5", and then I press Ctrl-C for
SIGINT, the "shell>" prompt comes up as expected but the process is
still running in the background.
What actually happens is that once you press Ctrl-C, the parent (not the
child; see below for why) receives SIGINT (the kernel's terminal
subsystem handles keyboard input, sees that someone holds "Ctrl" and
"C" at the same time, and concludes that all processes in the terminal's
foreground process group must be sent SIGINT).
Change the parent branch to,
int error = waitpid(0, &status, WUNTRACED);
if (error == -1)   /* waitpid() returns the child's PID on success, -1 on error */
    perror("waitpid");
With this, you'd see perror() print something like:
waitpid: interrupted system call
You want SIGINT to go to the child, so something must be wrong.
Signal Handlers, fork(), and exec()
Next, what happens to your signal handlers across fork() and
exec()?
The signal overview man
page states,
A child created via fork(2) inherits a copy of its parent's signal
dispositions. During an execve(2), the dispositions of handled
signals are reset to the default; the dispositions of ignored
signals are left unchanged.
So, ideally, what this means is that:
The parent (shell) sees SIGINT, as observed above, and prints
"interrupted system call".
The child's signal handlers are reset back to their defaults. For SIGINT,
this means to terminate.
You do not fiddle with the controlling terminal, so the child
inherits the controlling terminal of the parent. This means that
SIGINT is delivered to both parent and child. Given that the
child's SIGINT behavior is to terminate, I'd bet that no process is
left running.
Except when you use setpgid() to create a new process group.
Process Groups, Sessions, and Controlling Terminal
Someone once called me a UNIX greybeard. While this is true from a
visual point of view, I must reject that compliment because I rarely
hang around in one of the darkest corners of UNIX - the terminal
subsystem. Shell writers have to understand that too though.
In this context, it's the "NOTES" section of the setpgid() man
page. I suggest
you read that, especially where it says,
At any time, one (and only one) of the process groups in the session
can be the foreground process group for the terminal; (...)
The shell (bash maybe) from which you start your shell program has
done so for the foreground invocation of your program, and marked that
as "foreground process group". Effectively this means, "Please, dear
terminal, whenever someone presses Ctrl-C, send a SIGINT to all
processes in that group. I (your parent) just sit and wait (waitpid()) until all is over, and will take control again then.".
You create a process group for the child, but don't tell the terminal
about it. What you want is to
Detach the parent from the terminal.
Set the child process group as the foreground process group of the terminal.
Wait for the child (you already do).
Regain terminal foreground.
Further down in the "NOTES" section of said man page, they give links
to how that is done. Follow those, read thoroughly, try out things,
and make sure you handle errors. In most cases, such errors are signs
of misunderstanding. And, in most cases, such errors are fixed by
re-reading the documentation.
Are you sure that your child process is actually receiving the signals from your tty? I believe you need to make a call to tcsetpgrp to actually tell the controlling terminal to send signals to the process group of your child process.
For example, after you call fork, and before exec, try this from within your child.
tcsetpgrp(STDIN_FILENO, getpid())
Here is the man page for tcsetpgrp(3)
I am trying to program a shell in C , and I found that each command is executed in a new process, my question is why do we make a new process to execute the command? can't we just execute the command in the current process?
It's because of how the UNIX system was designed, where the exec family of calls replace the current process. Therefore you need to create a new process for the exec call if you want the shell to continue afterward.
When you execute a command, one of the following happens:
You're executing a builtin command
You're executing an executable program
An executable program needs many things to work: different memory sections (stack, heap, code, ...), it is executed with a specific set of privileges, and many more things are happening.
If you run this new executable program in your current process, you're going to replace the current program (your shell) with the new one. It works perfectly fine but when the new executable program is done, you cannot go back to your shell since it's not in memory anymore. This is why we create a new process and run the executable program in this new process. The shell waits for this new process to be done, then it collects its exit status and prompts you again for a new command to execute.
can't we just execute the command in the current process?
Sure we can, but that would then replace the shell program with the program of the command called. But that's probably not something you want in this particular application. There are, in fact, many situations in which replacing the process program via execve is the most straightforward way to implement something. But in the case of a shell, that's likely not what you want.
You should not think of processes as something to be avoided or "feared". As a matter of fact, segregating different things into different processes is the foundation of reliability and security features. Processes are (mostly) isolated from each other, so if a process gets terminated for whatever reason (bug, crash, etc.), this affects, in the first instance, only that particular process.
Here's something to try out:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int segfault_crash()
{
    fprintf(stderr, "I will SIGSEGV...\n");
    fputs(NULL, stderr);    /* deliberate undefined behavior */
    return 0;
}
int main(int argc, char *argv[])
{
    int status = -1;
    pid_t const forked_pid = fork();
    if( -1 == forked_pid ){
        perror("fork");
        return 1;
    }
    if( 0 == forked_pid ){
        return segfault_crash();
    }
    waitpid(forked_pid, &status, 0);
    if( WIFSIGNALED(status) ){
        fprintf(stderr, "Child process %lld terminated by signal %d\n",
            (long long)forked_pid,
            (int)WTERMSIG(status) );
    } else {
        fprintf(stderr, "Child process %lld terminated normally\n",
            (long long)forked_pid);
    }
    return 0;
}
This little program forks itself, then the child calls a function that deliberately performs undefined behavior, which on commonplace systems triggers some kind of memory protection fault (Access Violation on Windows, Segmentation Fault on *nix systems). But because this crash has been isolated into a dedicated process, the parent process (and also any siblings) do not crash together with it.
Furthermore, processes may drop their privileges, limit themselves to only a subset of system calls, and be moved into namespaces/containers, each of which prevents a bug in the process from damaging the rest of the system. This is how modern browsers (for example) implement sandboxing, to improve security.
I'm writing a C program that launches another program using the system() function. I'd like to know whether there is a way to kill the launched program if the main program is killed. I'm programming it for a Linux machine.
Example:
/* foo.c */
#include <stdlib.h>   /* declares system() */
int main()
{
    system("./blah");
    return 0;
}
blah does whatever it has to do. If I kill foo, blah is still running.
Is there any way to make foo kill blah when it dies?
You'll need to work with signal handling to know when someone/something is trying to kill your application, read the below documentation for further information.
linuxjournal.com - The Linux Signal Model
Besides that you'll need to know the process id of your spawned child process. For this I'd recommend to use something more sophisticated than system to fire up your launched process.
yolinux.com - Fork, Exec and Process control
You'll also have to know how to kill the spawned child (using its pid).
pubs.opengroup.org - functions: kill
How does the program below work and create a zombie process under Linux?
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int main ()
{
    pid_t child_pid;
    child_pid = fork ();
    if (child_pid > 0) {
        sleep (60);
    }
    else {
        exit (0);
    }
    return 0;
}
It creates a child and doesn't wait for it (with one of the wait* system calls). And zombies are just that: children that the parent hasn't waited for yet. The kernel has to maintain some information about them, mainly the exit status, in order to be able to return it to the parent.
The setsid() command is missing.
Every *nix process produces an exit status that must be reaped. This is supposed to be reaped by the parent process using a wait() statement, if the child is supposed to terminate first.
Note that setsid() itself only starts a new session; a child is handed over to init automatically when its parent terminates before it does, and init then reaps it.
Zombies cannot be removed from the process list with kill -9, even by root, because they have already terminated; they disappear once the parent reaps them or exits. Inexperienced programmers sometimes omit setsid(), which will hide bugs that produce errors that would otherwise clog the disk drive.
In days of old, the system administrator would use zombies to identify inexperienced programmers that need additional training to produce good code.
The exit status harvested by init is sent to syslog when the kernel terminates a program prematurely. That exit status is used to identify the nature of the bug that caused the early termination (error conditions not handled by the programmer).
Exit status reported in this way becomes part of the syslog or klog files, which are commonly used to debug code.