Is there any way to know if a process has started to run from a call of exec() or has started from the terminal by the user?
Helpful to you: child and parent process id;
getppid() returns the process ID of the parent of the calling
process. This will be either the ID of the process that created this
process using fork(), or, (!!!CARE!!!) if that process has already terminated, the
ID of the process to which this process has been reparented;
I would also consider adding additional program arg.
All programs are started by a call to exec family of functions.
When you type a command in the terminal, for example, it searches for the binary executable, forks and calls exec in the child process. This will substitute the binary image of the calling process (the terminal) for the binary image of the new program. The program will execute and the terminal process will wait.
There is this absolutely awesome answer by paxdiablo on the question Please explain exec() function and its family that will surely help you understand how exec works.
In Unix, all processes are created by using the fork system call, optionally followed by the exec system call, even those started by a user (they are fork/exec'd by the user's shell).
Depending on what you really want to do, the library function isatty() will tell you if stdin, stdout or stderr are file descriptors of a tty device. i.e. input comes from a terminal, output goes to a terminal or errors go to a terminal. However, a command like
myprog < somefile 1>someotherfile 2>errorfile
will fool code using isatty. But maybe that is what you want. If you want to take different actions based on whether there is a user typing input from a keyboard or input is coming from a file, isatty is what you need.
Related
I was reading about processes and I came across this:
Usually, the child process then executes execve or a similar system call to change its memory image
what I can derive from this is this pseudocode:
if(child_created_sucessfully)
{
do_ABC_and_ignore_the_part_of_the_parent's_control_flow //is this what it meant to "change its memory image"?
}
(Question asked in the pseudocode's comment)
I completely don't understand this other part:
example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The reason for this twostep
process is to allow the child to manipulate its file descriptors after the fork but
before the execve in order to accomplish redirection of standard input, standard
output, and standard error.
Regarding the first part
Usually, the child process then executes execve or a similar system
call to change its memory image
This simply means that when you create a child process it initializes it's own stack and heap memory although this is not 100% true. Since the new process is forked at time T at the time T + 1 when the process starts to run it is pretty much identical when it comes to the data in memory so there is a smart optimization called 'copy on write' more here.
Regarding the second part
example, when a user types a command, say, sort, to the shell, the
shell forks off a child process and the child executes sort. The
reason for this twostep process is to allow the child to manipulate
its file descriptors after the fork but before the execve in order to
accomplish redirection of standard input, standard output, and
standard error.
Simply put this means that when you execute a shell command (like ls, ps, grep, nstat...) the OS forks the current process which executes the command and the command itself is executed by this new process. An easy way to understand this is by using ps | grep ps this will first fork and create a new process, then this part comes to play
this twostep process is to allow the child to manipulate its file
descriptors after the fork but before the execve
and the standard output of the process is changed. After the new ps process executes the ps it will then fork and create one more process for the grep ps which will execute the grep and you should be able to see the ps process which created this grep process.
the man page says that "The exec() family of functions replaces the current process image with a new process image." but I am not quite understand the meaning of "replaces the current process image with a new process image". For example, if exec succeed, perror would not be reached
execl("/bin/ls", /* Remaining items sent to ls*/ "/bin/ls", ".", (char *) NULL);
perror("exec failed");
Correct. If the exec works, the perror will not be called, simply because the call to perror no longer exists.
I find it's sometimes easier when educating newcomers to these concepts, to think of the UNIX execution model as being comprised of processes, programs and program instances.
Programs are executable files such as /bin/ls or /sbin/fdisk (note that this doesn't include things like bash or Python scripts since, in that case, the actual executable would be the bash or python interpreter, not the script).
Program instances are programs that have been loaded into memory and are basically running. While there is only one program like /bin/ls, there may be multiple instances of it running at any given time if, for example, both you and I run it concurrently.
That "loaded into memory" phrase is where processes come into the picture. Processes are just "containers" in which instances of programs can run.
So, when you fork a process, you end up with two distinct processes but they're still each running distinct instances of the same program. The fork call is often referred to as one which one process calls but two processes return from.
Likewise, exec will not have an effect on the process itself but it will discard the current program instance in that process and start a new instance of the requested program.
This discard in a successful exec call is what dictates that the code following it (perror in this case) will not be called.
It means your current process becomes the new process instead of what it was. You stop doing what you're doing and start doing,really being, something else instead, never to rebecome what that process once was.
Instead of starting a whole new process, however, your current pid and environment become the new process instead. That let's you setup things the way the new process will need it before doing the exec
You are correct. perror will not be called unless the execl fails. The exec functions are the means for starting new processes in a POSIX compliant OS (typically combined with a fork call). Maybe an example will help. Suppose your program, call it programX, is running. It then calls one of the exec functions like the one you have above. programX will no longer exist as a running process. Instead, ls will be running. It will have the same exact PID as programX, but pretty much be a whole new process otherwise.
I tried to see what happens if I read something from keyboard while I have multiple processes with fork() (in my case there are two children and a parent) and I discovered the following problem: I need to tell the parent to wait for children's processes, otherwise the program behaves strangely.
I did a research and I found that the problem is with the parent, he needs to wait for the child's process to end because if the parent's process ends first somehow he closes the STDIN, am I right? But also I found that every process has a copy of STDIN so my question is:
Why it works this way and why only the parent has the problem with STDIN and the children not, I mean why if the child's process ends first doesn't affect STDIN but if the parent's process ends first it does affect STDIN?
Here are my tests:
I ran the program without wait() and after I typed a number the program stopped, but then I pressed enter two more times and the other two messages from printf() appeared.
When I ran the program with wait() everything worked fine, every process called scanf() separately and read a different number.
Well, a lot of stuff is going on here. I will try to explain it step by step.
When you start your terminal, the terminal creates a special file having path /dev/pts/<some number>. Then it starts your shell (which is bash in this case) and links the STDIN, STDOUT and STDERR of the bash process to this special file. This file is called a special file because it doesn't actually exist on your hard disk. Instead, whatever you write to this file, it goes directly to the terminal and the terminal renders it on the screen. (Similarly, whenever you try to read from this file, the read blocks until someone types something at the terminal).
Now when you launch your program by typing ./main, bash calls the fork function in order create a new process. The child process execs your executable file, while the parent process waits for the child to terminate. Your program then calls fork twice and we have three processes trying to read their STDINs, ie the same file /dev/pts/something. (Remember that calling fork and exec duplicates and preserves the file descriptors respectively).
The three processes are in race condition. When you enter something at the terminal, one of the three processes will receive it (99 out of 100 times it would be the parent process since the children have to do more work before reaching scanf statement).
So, parent process prints the number and exits first. The bash process that was waiting for the parent to finish, resumes and puts the STDIN into a so called "non-canonical" mode, and calls read in order to read the next command. Now again, three processes (Child1, Child2 and bash) are trying to read STDIN.
Since the children are trying to read STDIN for a longer time, the next time you enter something it will be received by one of the children, rather than bash. So you think of typing, say, 23. But oops! Just after you press the 2 key, you get Your number is: 2. You didn't even press the Enter key! That happened because of this so called "non-canonical" mode. I won't be going into what and why is that. But for now, to make things easier, use can run your program on sh instead of bash, since sh doesn't put STDIN into non-canonical mode. That will make the picture clear.
TL;DR
No, parent process closing its STDIN doesn't mean that its children or other process won't be able to use it.
The strange behavior you are seeing is because when the parent exits, bash puts the pty (pseudo terminal) into non-canonical mode. If you use sh instead, you won't see that behavior. Read up on pseudo terminals, and line discipline if you want to have a clear understading.
The shell process will resume as soon as the parent exits.
If you use wait to ensure that parents exits last, you won't have any problem, since the shell won't be able to run along with your program.
Normally, bash makes sure that no two foreground processes read from STDIN simultaneously, so you don't see this strange behavior. It does this by either piping STDOUT of one program to another, or by making one process a background process.
Trivia: When a background process tries to read from its STDIN, it is sent a signal SIGTTIN, which stops the process. Though, that's not really relevant to this scenario.
There are several issues that can happen when multiple processes try to do I/O to the same TTY. Without code, we can't tell which may be happening.
Trying to do I/O from a background process group may deliver a signal: SIGTTIN for input (usually enabled), or SIGTTOU for output (usually disabled)
Buffering: if you do any I/O before the fork, any data that has been buffered will be there for both processes. Under some conditions, using fflush may help, but it's better to avoid buffering entirely. Remember that, unlike output buffering, it is impossible to buffer input on a line-by-line basis (although you can only buffer what is available, so it may appear to be line-buffered at first).
Race conditions: if more than one process is trying to read the same pipe-like file, it is undefined which one will "win" and actually get the input each time it is available.
I am working on a school project, and though it's not required, I want to implement this functionality. With that said, I can't share code, but I think it's irrelevant in this case.
When using fork(), my understanding is that the child process created inherits stdin and stdout, as the child inherits all the file streams from the parent.
My shell requires background capability, and while it technically already has that, if the "background" program runs, it still receives all the data from stdin and continues output to the screen which is just a jumbled mess. For the record, my instructor's compiled sample shell does the same thing, but I don't want that to happen!
I'm pretty certain I should be using a combination of pipe(), fork(), and dup2(), but I can't put it all together. I understand fork, but I don't understand how pipe or dup2 works and how I should implement it in the shell. I'm thinking something along these lines:
thePipe[2] = pipe();
pid = fork();
close stdin/out on child somehow if backgrounded
But I don't understand the functionality of pipe() or dup2() so I'm stuck.
Thanks!
You don't want pipes here. Processes run in an interactive shell should share their standard file descriptors with the shell — doing otherwise would break a lot more things (including the child processes' ability to determine they're running interactively, and to interact with the tty to handle things like window size changes). It'd also seriously complicate pipelines. Don't do it.
The missing piece here is process groups, which are described in the "General Terminal Interface" section of the Open Group UNIX specs. In brief, the kernel can be made to explicitly recognize a "foreground process group" for the terminal. If a process that isn't in this group tries to read from or write to the terminal, it is automatically stopped.
A brief walkthrough of what is necessary to make a properly functioning shell is available as part of the GNU libc manual, under "Implementing a Job Control Shell". Try following their instructions and see how that goes.
I don't want to use system() in my C program, because system(3) blocks and this is not what I want. What is the optimal way to do it?
I think that a quick and dirty action is to call sytem(command &). the & will spawn the new process.
Use fork() to create a new process and then use system() (or any exec function) in it. The original process will then be able to continue executing.
The answer depends on what your real goal is. You don't say what platform you're on, and I know very little about Windows, so this only covers your options on linux/unix.
You just want to spawn another program, and don't need to interact with it. In this case, call fork(), and then in the child process run execve() (or related function).
You want to interact with another program. In this case, use popen().
You want part of your program to run as a subprocess. In this case, use fork(), and call whatever functions you need to run in the child.
You need to interact with part of your program running as a subprocess. Call pipe() so you have a file descriptor to communicate through, then call fork() and use the file descriptor pair to communicate. Alternatively, you could communicate through a socket, message queue, shared memory, etc.
You might want to use popen. It creates new processes and allows you to redirect the process output to your own process.
If in windows, use the ShellExecute() function from the Windows API.
If in Unix, go for fork() then system() as mentioned.