What Race Causes the Output to Look Different? - c

I was given this question on a midterm last year.
Consider the following program.
#include <stdio.h>
#include <unistd.h>
int main (void) {
int pid = getpid ();
printf ("hello world (pid:%d)\n", pid);
int rc = fork();
if (rc < 0) {
fprintf (stderr, "fork failed\n");
return 1;
}
pid = getpid();
if (rc == 0) {
printf ("hello, I am child (pid:%d)\n", pid);
} else {
printf("hello, I am parent of %d (pid:%d)\n", rc, pid);
}
return 0;
}
And consider the following behavior that I got when I compiled and ran this program:
$ gcc -O2 -Wall question.c
$ ./a.out # First program run
hello world (pid:20600)
hello, I am parent of 20601 (pid:20600)
hello, I am child (pid:20601)
$ ./a.out | cat # Second program run
hello world (pid:20605)
hello, I am parent of 20607 (pid:20605)
hello world (pid:20605)
hello, I am child (pid:20607)
a) What race could cause the output to look substantially different from either the first or the second run, and what would this output look like?
b) Explain each difference in the outputs of the two program runs.
For part (a) I argued that there is a race between the child process and the parent process, and that the child can print before the parent, but apparently that was wrong. Is there any other race that would cause the output to be different? And why is my answer wrong?
For part (b) I am shaky on multiple things. First, I see that the PIDs are different in the two runs, but I don't have a good explanation for that. Second, is the extra hello world in the second run caused by the way the program is run with a pipe and the cat command?

The problem is that you pipe the output to cat.
By default, when stdout is connected to a terminal or console, then stdout is line buffered which means that the internal buffers are flushed at newline (or when the buffer is full, or when fflush is called).
But when stdout is not connected to a terminal or console, like when it is connected to a pipe, it becomes fully buffered. This means it will only be flushed if the buffer becomes full or fflush is called. Printing newlines doesn't do anything special, the newline is just added to the buffer.
Now because stdout is fully buffered the buffer with the contents of the first printf call will be copied to the child process as part of the fork call, and will be flushed when the child process exits.

Related

A question about the "watch" command in Linux with the "fork()" system call [duplicate]

I have a C programme like this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
printf("hello world (pid:%d)\n", (int)getpid());
int rc = fork();
if (rc < 0)
{ // fork failed; exit
fprintf(stderr, "fork failed\n");
exit(1);
}
else if (rc == 0)
{ // child (new process)
printf("hello, I am child (pid:%d)\n", (int)getpid());
}
else
{ // parent goes down this path (main)
printf("hello, I am parent of %d (pid:%d)\n",
rc, (int)getpid());
}
return 0;
}
After finishing the gcc compilation, I got the "a.out" executable file. And I got the correct programme output in the Linux shell.
hello world (pid:1056088)
hello, I am parent of 1056089 (pid:1056088)
hello, I am child (pid:1056089)
But if I use the watch command to execute it, I got the output like this:
hello world (pid:1056155)
hello, I am parent of 1056156 (pid:1056155)
hello world (pid:1056155)
hello, I am child (pid:1056156)
I don't know why if I use the watch command, there would be an extra line in the output.
If I delete the line "printf("hello world (pid:%d)\n", (int)getpid());", there is no difference between the two outputs. If I instead add the line "fflush(stdout);" under it, there is also no difference between the two outputs.
Now I know that "stdout" is line buffered. But I wonder why the "watch" command produces different output from running "./a.out" directly. Is this a bug, or a feature?
I would appreciate it if anybody could explain this to me.
When printing to a pipe, standard output is fully buffered. When you fork, the whole process is copied to the child, including the contents of the buffer.
If you print more than just one line, you can see that only the final part of it gets copied and printed twice, e.g.
for (int i = 1; i < 64000; ++i) printf("%d\n", i);
On my machine, everything up to 63970 gets printed once, the rest is printed twice:
a.out | grep 6397[01]
63970
63971
63971
See also Why does printf not flush after the call unless a newline is in the format string?

Unexpected output on a fork call in C [duplicate]

I encountered this in my school work and it didn't produce what I thought it should:
int main() {
printf("c");
fork();
printf("d");
}
I know there are several things that aren't good about this code (i.e. no parameters in main, no variable for the return value from fork, no return statement at the end, etc.), but this is how it was presented and it's not relevant to my question anyway.
This code produces the output:
cdcd
It was my understanding that when fork is called, both parent and child would resume/begin on the line after the fork call. Based on that, I would have expected the output to be:
cdd
Assuming, of course, that the fork call is successful. Can anyone explain to me why that "c" is printed a second time even though it's on the line before the fork call?
Thanks!
You forked your program before flushing stdout (i.e.: data was still in the output buffer). Just call fflush(stdout) to fix it:
❯ cat test.c
#include <stdio.h>
#include <unistd.h>
int main() {
printf("c");
fflush(stdout);
fork();
printf("d");
}
[22:14:01]~/devel
❯ clang test.c -o test
[22:14:07]~/devel
❯ ./test
cdd[22:14:09]~/devel
The reason you're seeing c twice is that the fork() duplicates the unprinted buffered output. You could flush the output stream before the fork():
fflush(stdout);
Or you could set stdout to be unbuffered, but you should do this first, before calling printf() the first time:
setvbuf(stdout, NULL, _IONBF, 0);
Here's what I'm guessing is happening. printf writes to the stream stdout. Since you didn't flush stdout after printing "c" nor did that string end in a new line, the character sat there in a user-space buffer. When you called fork, the child process got a copy of the parent's virtual address space including the buffered text. When both programs exited, their buffers were flushed and so "c" showed up twice.
Try adding fflush(stdout); just prior to the call to fork.

Why does the child process behave differently when exec is supplied an invalid command?

pid_t pid;
pid = fork(); //Two processes are made
const char* ptr = secondlinecopy;
if (pid > 0 && runBGflag==0) //Parent process. Waits for child termination and prints exit status
{
int status;
if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
{
printf("Exitstatus [");
for (int i = 0; i < noOfTokens; i++)
{
printf("%s ", commands[i]);
}
printf("\b] = %d\n", WEXITSTATUS(status));
}
}
else if (pid == 0) //Child process. Executes commands and prints error if something unexpected happened
{
printf("why does this only print when an invalid command is supplied?");
if (runBGflag==1) insertElement(getpid(),ptr);
execvp(commands[0], commands);
printf ("exec: %s\n", strerror(errno));
exit(1);
}
In the code excerpt we see a process creation via fork(). When execvp is supplied a real command, such as "ls" for example, we get the output.
/home/kali/CLionProjects/clash/cmake-build-debug: ls
clash CMakeCache.txt cmake_install.cmake Testing
clash.cbp CMakeFiles Makefile
Exitstatus [ls] = 0
However, if we supply an invalid command, the output will be:
/home/kali/CLionProjects/clash/cmake-build-debug: sd
why does this only print when an invalid command is supplied?exec: No such file or directory
Exitstatus [sd] = 1
Why is that the case? Shouldnt the process always call printf("Why does ...") first and then run exec?
Let's say it works like this:
printf(...) --> internal buffer --(fflush? newline? max_buffer_size?)--> output
Usually stdout is line buffered and your printf has no newline, so the data to be printed are stored in an internal buffer. When exec is called, stdout is not flushed, and the whole process image is replaced by the new program, so all data held by the old program, including the internal stdout state, are discarded. When exec fails, stdout is flushed when you call printf("...\n") (or when exit() is called, if stdout is block buffered) and the data show up. For more, read up on stdio buffering modes and the setvbuf() function.
Why is that the case? Shouldnt the process always call printf("Why does ...") first and then run exec?
Yes, it should, and you've not presented any reason to think that it doesn't.
printf directs output to the standard output stream, and that defaults to being line buffered when it is connected to an interactive device, or to being block buffered otherwise. When execvp() succeeds, it replaces the whole program image with that of a new program, including the contents of any I/O buffers. Any data that have been buffered but not flushed to the underlying device are lost.
When execvp() fails and the program thereafter terminates normally (regardless of its exit status), all data still buffered at that point is automatically flushed to the relevant output device.
You would see different behavior if you appended a newline to the message you are printing, or if you called fflush(stdout) between the printf and execvp calls, or if you printed to stderr instead of to stdout, or if you turned off buffering of stdout.

Two children of same parent are not communicating using pipe if parent do not call wait()

Please see the code below:
#include<stdio.h>
main(){
int pid, fds[2], pid1;
char buf[200];
pipe(fds);
pid = fork();
if(pid==0)
{
close(fds[0]);
scanf("%s", &buf);
write(fds[1], buf, sizeof(buf)+1);
}
else
{
pid1 = fork();
if(pid1==0)
{
close(fds[1]);
read(fds[0], buf, sizeof(buf)+1);
printf("%s\n", buf);
}
else
{
Line1: wait();
}
}
}
If I do not comment out Line1, it is working fine. Please see below:
hduser#pc4:~/codes/c/os$ ./a.out
hello //*Entry from keyboard*
hello //Output
hduser#pc4:~/codes/c/os$
But if I comment out Line1, two child processes are not communicating:
hduser#pc4:~/codes/c/os$ ./a.out
hduser#pc4:~/codes/c/os$
hi //*Entry from keyboard*
hi: command not found
hduser#pc4:~/codes/c/os$
I cannot understand significance of wait() here.
What's happening here is that the parent process completes execution before the child processes finish, causing the children to lose access to the terminal.
Let us have a closer look at all this.
What does wait() do ?
The wait() system call suspends execution of the calling process until
one of its children terminates.
Your program is like this
Your main Process forks 2 child processes. The first one writes to a pipe while the other one reads from a pipe. All this happens while the main process continues to execute.
What happens when the main process has executed its code? It terminates. When it terminates, it gives up its control of the terminal, which causes the children to lose access to the terminal.
This explains why you get command not found -- what you have typed is not on the stdin of your program but on the shell prompt itself.
There were a couple of other issues with your code too,
1) In this part of your code,
scanf("%s", &buf);
This is wrong. You were unlucky and didn't get a segmentation fault. Since buf is already an address, this should have been
scanf("%s", buf);
2) Notice this,
read(fds[0], buf, sizeof(buf)+1);
This is undefined behavior, as was pointed out in the comments section. The call lets read() store up to sizeof(buf)+1 bytes, one more than buf can hold. This should have been,
read(fds[0], buf, sizeof(buf));
3) Calling wait(). You have created two child processes, you should wait for both of them to finish, so you should call wait() twice.
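Putting the three points together, a parent that waits for both children could look roughly like the sketch below. It is an illustration, not the original program: the helper name relay_between_children is made up, and a fixed message replaces the scanf() input so that the result can be checked through the children's exit statuses.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: one child writes msg into a pipe, a second child
 * reads it back, and the parent waits for both.  Returns how many
 * children exited with status 0 (2 when everything worked), or -1 if
 * the pipe or a fork could not be created. */
int relay_between_children(const char *msg)
{
    int fds[2];
    size_t len = strlen(msg) + 1;        /* include the terminating NUL */
    char buf[200];

    if (len > sizeof buf || pipe(fds) != 0)
        return -1;

    pid_t writer = fork();
    if (writer < 0)
        return -1;
    if (writer == 0) {
        close(fds[0]);
        _exit(write(fds[1], msg, len) == (ssize_t)len ? 0 : 1);
    }

    pid_t reader = fork();
    if (reader < 0)
        return -1;
    if (reader == 0) {
        close(fds[1]);
        /* Never ask for more than buf can hold. */
        ssize_t n = read(fds[0], buf, sizeof buf);
        _exit(n == (ssize_t)len && strcmp(buf, msg) == 0 ? 0 : 1);
    }

    /* The parent uses neither end of the pipe. */
    close(fds[0]);
    close(fds[1]);

    int ok = 0;
    for (int i = 0; i < 2; i++) {        /* one wait() per child */
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

relay_between_children("hello") should return 2. Because the parent does not return until both children have been reaped, the shell does not get the terminal back while they are still running.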
After fixing some infelicities in the code, I came up with a semi-instrumented version of your program like this:
#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
int pid, fds[2], pid1;
char buf[200];
pipe(fds);
pid = fork();
if (pid == 0)
{
close(fds[0]);
printf("Prompt: "); fflush(0);
if (scanf("%199s", buf) != 1)
fprintf(stderr, "scanf() failed\n");
else
write(fds[1], buf, strlen(buf) + 1);
}
else
{
pid1 = fork();
if (pid1 == 0)
{
close(fds[1]);
if (read(fds[0], buf, sizeof(buf)) > 0)
printf("%s\n", buf);
else
fprintf(stderr, "read() failed\n");
}
else
{
/*Line1: wait();*/
}
}
return 0;
}
That compiles cleanly under stringent options (GCC 5.1.0 on Mac OS X 10.10.5):
gcc -O3 -g -std=c11 -Wall -Wextra -Werror p11.c -o p11
When I run it, the output is:
$ ./p11
Prompt: scanf() failed
read() failed
$
The problem is clear; the scanf() fails. At issue: why?
The wait() version needs an extra header #include <sys/wait.h> and the correct calling sequence. I used the paragraph:
else
{
printf("Kids are %d and %d\n", pid, pid1);
int status;
int corpse = wait(&status);
printf("Parent gets PID %d status 0x%.4X\n", corpse, status);
}
When compiled and run, the output is now:
$ ./p11
Kids are 20461 and 20462
Prompt: Albatross
Albatross
Parent gets PID 20461 status 0x0000
$
So, the question becomes: how or why is the standard input of the child process closed when the parent doesn't wait? It is Bash doing some job control that wreaks havoc.
I upgraded the program once more, using int main(int argc, char **argv) and testing whether the command was passed any arguments:
else if (argc > 1 && argv != 0) // Avoid compilation warning for unused argv
{
printf("Kids are %d and %d\n", pid, pid1);
int status;
int corpse = wait(&status);
printf("Parent gets PID %d status 0x%.4X\n", corpse, status);
}
I've got an Heirloom Shell, which is close to an original Bourne shell. I ran the program under that, and it behaved as I would expect:
$ ./p11
Prompt: $ Albatross
Albatross
$ ./p11 1
Kids are 20483 and 20484
Prompt: Albatross
Albatross
Parent gets PID 20483 status 0x0000
$
Note the $ after the Prompt: in the first run; that's the shell prompt, but when I type Albatross, it is (fortunately) read by the child of the p11 process. That's not guaranteed; it could have been the shell that read the input. In the second run, we get to see the parent's output, then the children at work, then the parent's exiting message.
So, under classic shells, your code would work as expected. Bash is somehow interfering with the normal operation of child processes. Korn shell behaves like Bash. So does C shell (tcsh). Attempting dash, I got interesting behaviour (3 runs):
$ ./p11
Prompt: $ Albatross
scanf() failed
read() failed
dash: 2: Albatross: not found
$ ./p11
Prompt: $ Albatross
scanf() failed
dash: 4: Albatross: not found
$ read() failed
$ ./p11
Prompt: scanf() failed
$ read() failed
$
Note that the first two runs show dash reading the input, but the children did not detect problems until after I hit return after typing Albatross. The last time, the children detected problems before I typed anything.
And, back with Bash, redirecting standard input works 'sanely':
$ ./p11 <<< Albatross
Prompt: Albatross
$ ./p11 1 <<< Albatross
Kids are 20555 and 20556
Prompt: Albatross
Parent gets PID 20555 status 0x0000
$
The output Albatross comes from the second child, of course.
The answer is going to be lurking somewhere in behaviour of job control shells, but it's enough to make me want to go back to life before that.

Why output is different between to shell and to a file

Consider the following program.
main() {
printf("hello\n");
if(fork()==0)
printf("world\n");
exit(0);
}
Running this program with ./a.out gives the following output:
hello
world
Running this program with ./a.out > output puts the output in the file called 'output', which looks like this:
hello
hello
world
Why is this so?
Because when you output to the shell, stdout is usually line buffered, while when you write to a file, it's usually fully buffered.
After fork(), the child process inherits the buffer. When you output to the shell, the buffer is already empty at that point because the newline \n flushed it. When you output to a file, the buffer still holds the content, which ends up in both the parent's and the child's output buffer. That's why hello is seen twice.
You can try it like this:
int main() {
printf("hello"); //Notice here without "\n"
if(fork()==0)
printf("world\n");
exit(0);
}
You will likely see hello twice when output to shell as well.
This answer builds on Yu Hao's answer, which already explains a lot.
main() {
printf("hello\n");
fflush(stdout);//flush the buffer
if(fork()==0)
printf("world\n");
exit(0);
}
Then you can get the right output.
