This question already has answers here:
printf anomaly after "fork()"
(3 answers)
Duplicated output using printf() and fork() in C
(1 answer)
Closed 1 year ago.
I'm having difficulty understanding the behavior of fork(). I thought the child process would only execute the lines after the fork(). So I expected to see only one "Hello World!", but this code:
printf("Hello World!\n");
fork();
return 0;
outputs two "Hello World". Why is that?
I also noticed from online examples using pipe() that pipes are created before forking a child. How does the child also have a pipe when it was created after the creation of the pipe in the parent process?
The first question is most likely related to your program's stdio buffering mode: stdout is probably fully buffered, meaning the stream is only written out once the buffer is full (or flushed). That delays the output past the fork(), and the child process produces the output as well, because it inherits a copy of the same unflushed stdout buffer.
If you call fflush(stdout) right after the printf, or change the buffering mode (setvbuf) to line buffered or unbuffered, you will prevent the duplicate output, because the text will be written out before the fork.
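As a minimal sketch (assuming a POSIX system, with the message text taken from the question):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("Hello World!\n");   /* may still sit in the stdio buffer if stdout is fully buffered */
    fflush(stdout);             /* write it out now, so the child does not inherit it */
    fork();                     /* both processes continue from here with empty buffers */
    return 0;
}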
As for the second question: after the fork the child process duplicates the code of the parent process, as you correctly mentioned, but it also duplicates the file descriptor table. By creating the unnamed pipe before the fork you ensure the child has the same file descriptors. This is commonly used to set up pipe communication between child and parent processes.
You can check the /proc/<process id>/fd folder for each one of the two processes to confirm this.
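Here is a hedged sketch of that pattern (assuming a POSIX system; the message text is just a placeholder): the pipe is created before fork(), so both processes hold descriptors for the same pipe and can use it to communicate.

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];                      /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid == 0) {                  /* child: inherits both ends of the pipe */
        close(fds[0]);               /* not reading, so close the read end */
        const char *msg = "hello from the child\n";
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(0);
    }

    /* parent */
    close(fds[1]);                   /* not writing, so close the write end */
    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("parent read: %s", buf);
    }
    close(fds[0]);
    wait(NULL);
    return 0;
}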
Footnote
For future reference: these are two different subjects, albeit tangential ones, and they belong in separate questions. You can see one of the resulting problems here: the question was closed with duplicates related to the first question but not the second. Although by chance that part didn't remain unanswered, it's still buried in a different matter and virtually undiscoverable by other users.
When we call fork, the child process gets an exact copy of the parent process's address space, so this behaviour is to be expected. The pipe system call returns file descriptors, and both the parent and the child process have these descriptors.
Related
This question already has answers here:
What is the difference between fork() and vfork()?
(6 answers)
Closed 2 years ago.
I am a newbie in system programming and I have run into some misunderstanding about fork and vfork.
From my knowledge, fork duplicates the parent process, and the child process gets its own VM and its own file descriptor table.
As for vfork, the child shares the parent process's VM but still gets its own file descriptor table.
So here comes the problem:
As the child process shares the parent process's address space, why does it need its own file descriptor table?
Where will a variable be stored if I declare one in the child process? (Will it use the parent process's space?)
Thanks a lot.
Because it's common to redirect stdin and/or stdout before calling exec in the child process. If they shared the same file descriptor table, this would modify the parent process's I/O.
You shouldn't store any variables in the child process. vfork() should only be used if you're going to immediately call an exec function.
Note that vfork() is obsolete on modern operating systems. Instead of copying the address space they use copy-on-write.
For more information see What is the difference between fork() and vfork()?
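To illustrate the redirection point, here is a hedged sketch using plain fork() (a vfork() child is only supposed to call an exec function or _exit; the command and file name here are made up): the child redirects its stdout before exec, and the parent's stdout is untouched because the child has its own descriptor table.

#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* child: send stdout to a file, then exec; the parent's stdout is unaffected */
        int fd = open("child-output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        dup2(fd, STDOUT_FILENO);
        close(fd);
        execlp("ls", "ls", "-l", (char *)NULL);
        _exit(127);                  /* only reached if exec failed */
    }
    waitpid(pid, NULL, 0);
    printf("parent stdout still goes to the terminal\n");
    return 0;
}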
This question already has answers here:
printf anomaly after "fork()"
(3 answers)
Closed 8 years ago.
fork() creates a new process and the child process starts to execute from the current state of the parent process.
This is the thing I know about fork() in Linux.
So, accordingly the following code:
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hi");
    fork();
    return 0;
}
needs to print "Hi" only once as per the above.
But on executing the above in Linux, compiled with gcc, it prints "Hi" twice.
Can someone explain to me what is happening actually on using fork() and if I have understood the working of fork() properly?
(Incorporating some explanation from a comment by user #Jack)
When you print something to the "Standard Output" stdout (usually the computer monitor, although you can redirect it to a file), it initially gets stored in a temporary buffer.
Both sides of the fork inherit the unflushed buffer, so when each side of the fork hits the return statement and ends, it gets flushed twice.
Before you fork, you should call fflush(stdout), which will flush the buffer so that the child doesn't inherit it.
When stdout goes to the screen (as opposed to being redirected to a file) it is actually line-buffered, so if you had written printf("Hi\n"); you wouldn't have had this problem, because the newline would have flushed the buffer.
printf("Hi"); doesn't actually immediately print the word "Hi" to your screen. What it does do is fill the stdout buffer with the word "Hi", which will then be shown once the buffer is 'flushed'. In this case, stdout is pointing to your monitor (assumedly). In that case, the buffer will be flushed when it is full, when you force it to flush, or (most commonly) when you print out a newline ("\n") character. Since the buffer is still full when fork() is called, both parent and child process inherit it and therefore they both will print out "Hi" when they flush the buffer. If you call fflush(stout); before calling fork it should work:
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hi");
    fflush(stdout);
    fork();
    return 0;
}
Alternatively, as I said, if you include a newline in your printf it should work as well:
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hi\n");
    fork();
    return 0;
}
In general, it's very unsafe to have open handles / objects in use by libraries on either side of fork().
This includes the C standard library.
fork() makes two processes out of one, and no library can detect it happening. Therefore, if both processes continue to run with the same file descriptors / sockets etc, they now have differing states but share the same file handles (technically they have copies, but the same underlying files). This makes bad things happen.
Examples of cases where fork() causes this problem
stdio e.g. tty input/output, pipes, disc files
Sockets used by e.g. a database client library
Sockets in use by a server process - which can produce strange effects when a child meant to service one socket happens to inherit a file handle for another - getting this kind of programming right is tricky; see Apache's source code for examples.
How to fix this in the general case:
Either
a) Immediately after fork(), call exec(), possibly on the same binary (with necessary parameters to achieve whatever work you intended to do). This is very easy.
b) After forking, don't use any existing open handles or library objects which depend on them (opening new ones is fine); finish your work as quickly as possible, then call _exit() (not exit()). Do not return from the subroutine that calls fork, as that risks running C++ destructors etc. which may do bad things to the parent process's file descriptors. This is moderately easy (see the sketch after this list).
c) After forking, somehow clean up all the objects and put them into a sane state before letting the child continue, e.g. close the underlying file descriptors without flushing data sitting in a buffer which is duplicated in the parent. This is tricky.
c) is approximately what Apache does.
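A hedged sketch of option b) (the file name and the work done are placeholders): the child opens only new descriptors, writes with raw write() rather than the inherited stdio streams, and leaves via _exit() so no shared buffers are flushed and no destructors run.

#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static int do_work(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: use only descriptors it opens itself */
        int fd = open("child.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd >= 0) {
            const char *msg = "child did its work\n";
            write(fd, msg, strlen(msg));   /* raw write, no stdio buffering */
            close(fd);
        }
        _exit(0);    /* never exit() or return: that could flush inherited buffers */
    }
    waitpid(pid, NULL, 0);   /* parent carries on with its own handles intact */
    return 0;
}

int main(void)
{
    return do_work() == 0 ? 0 : 1;
}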
printf() does buffering. Have you tried printing to stderr?
Technical answer:
when using fork() you need to make sure that exit() is not called twice (falling off the end of main is the same as calling exit()). The child (or, more rarely, the parent) needs to call _exit instead. Also, don't use stdio in the child; that's just asking for trouble.
Some libraries have a fflushall() you can call before fork() that makes stdio in the child safe (in standard C, fflush(NULL) flushes all output streams). In this particular case it would also make exit() safe, but that is not true in the general case.
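A minimal sketch combining both points (assuming a POSIX system; fflush(NULL) plays the role of a fflushall()-style call):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    printf("buffered text");     /* no newline, so it may still be sitting in the buffer */
    fflush(NULL);                /* flush every output stream before forking */

    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                /* child: leave without touching stdio */

    waitpid(pid, NULL, 0);
    return 0;                    /* parent: its normal exit flushes its own buffers once */
}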
I tried to see what happens if I read something from the keyboard while I have multiple processes created with fork() (in my case two children and a parent), and I discovered the following problem: I need to tell the parent to wait for the children's processes, otherwise the program behaves strangely.
I did some research and found that the problem is with the parent: it needs to wait for the children's processes to end, because if the parent's process ends first it somehow closes STDIN. Am I right? But I also found that every process has a copy of STDIN, so my question is:
Why does it work this way, and why does only the parent have the problem with STDIN and not the children? I mean, why doesn't it affect STDIN if a child's process ends first, but it does if the parent's process ends first?
Here are my tests:
I ran the program without wait(), and after I typed a number the program stopped; but then I pressed Enter two more times and the other two messages from printf() appeared.
When I ran the program with wait(), everything worked fine: every process called scanf() separately and read a different number.
Well, a lot of stuff is going on here. I will try to explain it step by step.
When you start your terminal, the terminal creates a special file having path /dev/pts/<some number>. Then it starts your shell (which is bash in this case) and links the STDIN, STDOUT and STDERR of the bash process to this special file. This file is called a special file because it doesn't actually exist on your hard disk. Instead, whatever you write to this file, it goes directly to the terminal and the terminal renders it on the screen. (Similarly, whenever you try to read from this file, the read blocks until someone types something at the terminal).
Now when you launch your program by typing ./main, bash calls the fork function in order to create a new process. The child process execs your executable file, while the parent process waits for the child to terminate. Your program then calls fork twice and we have three processes trying to read their STDINs, i.e. the same file /dev/pts/something. (Remember that calling fork duplicates the file descriptors, and calling exec preserves them.)
The three processes are in a race condition. When you enter something at the terminal, one of the three processes will receive it (99 times out of 100 it will be the parent process, since the children have to do more work before reaching the scanf statement).
So the parent process prints the number and exits first. The bash process that was waiting for the parent to finish resumes, puts STDIN into so-called "non-canonical" mode, and calls read in order to read the next command. Now, again, three processes (Child1, Child2 and bash) are trying to read STDIN.
Since the children have been trying to read STDIN for longer, the next time you enter something it will be received by one of the children rather than bash. So you think of typing, say, 23. But oops! Just after you press the 2 key, you get Your number is: 2. You didn't even press the Enter key! That happened because of this so-called "non-canonical" mode. I won't go into what that is and why it happens here. But for now, to make things easier, you can run your program under sh instead of bash, since sh doesn't put STDIN into non-canonical mode. That will make the picture clearer.
TL;DR
No, the parent process closing its STDIN doesn't mean that its children or other processes won't be able to use it.
The strange behavior you are seeing occurs because when the parent exits, bash puts the pty (pseudo terminal) into non-canonical mode. If you use sh instead, you won't see that behavior. Read up on pseudo terminals and line discipline if you want a clear understanding.
The shell process will resume as soon as the parent exits.
If you use wait to ensure that parents exits last, you won't have any problem, since the shell won't be able to run along with your program.
Normally, bash makes sure that no two foreground processes read from STDIN simultaneously, so you don't see this strange behavior. It does this by either piping STDOUT of one program to another, or by making one process a background process.
Trivia: When a background process tries to read from its STDIN, it is sent a signal SIGTTIN, which stops the process. Though, that's not really relevant to this scenario.
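Since the original code isn't shown, here is a hypothetical reconstruction of the scenario with the fix applied: the parent fork()s twice and wait()s for both children before exiting, so bash doesn't resume and compete for the terminal.

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    for (int i = 0; i < 2; i++) {
        if (fork() == 0) {               /* child */
            int n;
            if (scanf("%d", &n) == 1)
                printf("child %d read: %d\n", i, n);
            fflush(stdout);
            _exit(0);
        }
    }

    int n;
    if (scanf("%d", &n) == 1)
        printf("parent read: %d\n", n);

    /* Without these waits the parent would exit (and bash would resume)
       while the children are still reading from the terminal. */
    wait(NULL);
    wait(NULL);
    return 0;
}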
There are several issues that can happen when multiple processes try to do I/O to the same TTY. Without code, we can't tell which may be happening.
Trying to do I/O from a background process group may deliver a signal: SIGTTIN for input (usually enabled), or SIGTTOU for output (usually disabled) - see the sketch after this list for a way to check whether you are in the foreground.
Buffering: if you do any I/O before the fork, any data that has been buffered will be there for both processes. Under some conditions, using fflush may help, but it's better to avoid buffering entirely. Remember that, unlike output buffering, it is impossible to buffer input on a line-by-line basis (although you can only buffer what is available, so it may appear to be line-buffered at first).
Race conditions: if more than one process is trying to read the same pipe-like file, it is undefined which one will "win" and actually get the input each time it is available.
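For the first point, one way to check before reading (a sketch, assuming the controlling terminal is open on STDIN_FILENO) is to compare the terminal's foreground process group with your own:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* tcgetpgrp() returns the terminal's foreground process group;
       if it differs from our own process group, a read would raise SIGTTIN. */
    if (tcgetpgrp(STDIN_FILENO) == getpgrp())
        printf("we are in the foreground; reading stdin is safe\n");
    else
        printf("we are in the background; reading stdin would stop us (SIGTTIN)\n");
    return 0;
}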
To my previous question about a segmentation fault I got very useful answers. Thanks to those who responded.
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main()
{
    printf("hello");
    int pid = fork();
    wait(NULL);
    return 0;
}
output: hellohello.
In this, the child process starts executing from the beginning.
If I am not wrong, then how does the program work if I put the sem_open before fork()?
(ref. answers to previous questions)
I need a clear explanation about the segmentation fault, which happens occasionally and not always. And why not always... If there is an error in the code, then it should occur every time, right...?
fork creates a clone of your process. Conceptually speaking, all state of the parent also ends up in the child. This includes:
CPU registers (including the instruction pointer, which defines where in the code your program is)
Memory (as an optimization your kernel will most likely mark all pages as copy-on-write, but semantically speaking it should be the same as copying all memory.)
File descriptors
Therefore... Your program will not "start running" from anywhere... All the state that you had when you called fork will propagate to the child. The child will return from fork just as the parent will.
As for what you can do after a fork... I'm not sure what POSIX says, but I wouldn't rely on semaphores doing the right thing after a fork. You might need an inter-process semaphore (see man sem_open, or the pshared parameter of sem_init). In my experience cross-process semaphores aren't really well supported on free Unix-type OSes... (Example: some BSDs always fail with ENOSYS if you ever try to create one.)
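A hedged sketch of the pshared route (assuming unnamed process-shared semaphores and MAP_ANONYMOUS are supported, which, as noted, is not the case everywhere): the sem_t lives in shared memory created before the fork, so both processes operate on the same semaphore.

#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* Place the semaphore in memory that stays shared across the fork. */
    sem_t *sem = mmap(NULL, sizeof(*sem), PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sem == MAP_FAILED || sem_init(sem, 1 /* pshared */, 0) == -1) {
        perror("semaphore setup");
        return 1;
    }

    if (fork() == 0) {
        printf("child runs first\n");
        fflush(stdout);
        sem_post(sem);          /* let the parent proceed */
        _exit(0);
    }

    sem_wait(sem);              /* parent blocks until the child posts */
    printf("parent runs second\n");
    wait(NULL);
    sem_destroy(sem);
    return 0;
}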
@GregS mentions the duplicated "hello" strings after a fork. He is correct to say that stdio (i.e. FILE*) buffers in user-space memory, and that a fork leads to the string being buffered in two processes. You might want to call fflush(stdout); fflush(stderr); and flush any other important FILE* handles before a fork.
No, it starts from the fork(), which returns 0 in the child or the child's process ID in the parent.
You see "hello" twice because the standard output is buffered, and has not actually been written at the point of the fork. Both parent and child then actually write the buffered output. If you fflush(stdout); after the printf(), you should see it only once.
I've got a little C server that needs to accept a connection and fork a child process. I need the stderr of the child process to go to an already existing named pipe, the stdout of the child to go to the stdout of the parent, and the stdin of the child to come from the same place as the stdin of the parent.
My initial attempts involved popen() but I could never seem to get quite what I wanted.
Finally, this particular solution only needs to work in Solaris. Thanks.
EDIT: Updated the question in hopes of more accurately portraying what I'm trying to accomplish. Thanks for being patient with me.
EDIT2: I also need the parent to get the return value of the child process and then do something with it if that makes any difference.
You might be using the wrong function - popen() is used when you want the invoking program either to write to the forked process's standard input or read from its standard output. It seems you want neither. It also takes two arguments.
Your requirements are also somewhat contradictory:
I want it to (ideally) inherit stdin and stdout from the parent
any input to the parent goes to the child and any output from the child goes back to the parent
but at a minimum, I'd like it to inherit stdin and write stdout to a named pipe
The first option is easy - it requires no special coding. Any data supplied to the stdin of the parent will also be available on the stdin of the child (but only one of the two processes will get to read it). The child's stdout will normally go to the same place as the parent's stdout. If you want the parent to read the child's stdout, then you do need a pipe - and popen() is then appropriate, but the 'at minimum' stuff is confusing.
So, let's define what you really want?
Option 1
The standard error of the child should go to a named pipe.
The standard output of the child should be read by the invoking process.
The standard input of the child should come from the same place as the standard input of the parent.
The named pipe already exists.
Hence:
FILE *fp = popen("/run/my/command -with arguments 2>/my/other/pipe", "r");
Note that the child will be hung until a process opens '/my/other/pipe' for reading; that in turn means that if the parent process reads from fp, it too will be hung until some other process opens '/my/other/pipe' for reading.
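Regarding EDIT2: with this approach the parent can read the child's standard output from fp and then recover the child's termination status from pclose(). A sketch (the command string is the hypothetical one above):

#include <stdio.h>
#include <sys/wait.h>

int main(void)
{
    FILE *fp = popen("/run/my/command -with arguments 2>/my/other/pipe", "r");
    if (fp == NULL)
        return 1;

    char line[256];
    while (fgets(line, sizeof(line), fp) != NULL)
        fputs(line, stdout);          /* consume the child's stdout */

    int status = pclose(fp);          /* waits for the child and returns its status */
    if (status != -1 && WIFEXITED(status))
        printf("child exited with %d\n", WEXITSTATUS(status));
    return 0;
}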
Option 2
The standard error of the child should go to a named pipe.
The standard output of the child should go to the standard output of the parent.
The standard input of the child should come from the same place as the standard input of the parent.
The named pipe already exists.
Now popen() is not appropriate, and we get into naked `fork & exec' code. What follows is more pseudo-code than operational C.
if ((pid = fork()) < 0)
    error
else if (pid > 0)
{
    /* Parent - might wait for child to complete */
}
else
{
    int fd = open("/my/other/pipe", O_WRONLY|O_NONBLOCK);
    if (fd < 0)
        error
    dup2(fd, 2);    /* There is a symbolic name (STDERR_FILENO) for stderr too */
    close(fd);      /* Do not want this open any more */
    char *cmd[4] = { "/bin/sh", "-c", "/run/my/command -with arguments", 0 };
    execv(cmd[0], cmd);
    error - if execv returns, it failed!
}
If you're totally confident no-one has pulled any stunts on you like closing stdout, you can avoid using dup2() by closing stderr (fd = 2) before calling open(). However, if you do that, you can't report any errors any more - because you closed stderr. So, I would do it as shown.
If you have a different requirement, state what you want to achieve.
As noted by p2vb, if you want the parent to wait for the child to finish, then simply using system() may be sufficient. If the parent should continue while the child is running, you might try system() where the command string ends with an ampersand (&) to put the child into the background, or you might use the code outlined in Option 2 above.
Using system(), the parent will have little chance to read /my/other/pipe, which gets the standard error from the child. You could easily deadlock if the child produces a lot of output.
Also, be careful with your FD_CLOEXEC flag - set it on files that you don't want the child modifying. On Linux, you can use the O_CLOEXEC flag on the open() call; with Solaris, you have to set it via fcntl() - carefully.
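The 'carefully' part is that fcntl(fd, F_SETFD, ...) replaces all of the descriptor flags, so the usual pattern is read-modify-write. A sketch (fd is whatever descriptor you want to protect):

#include <fcntl.h>

/* Mark fd close-on-exec without clobbering any other descriptor flags. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}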