I'm implementing a simplified shell that supports pipes.
Part of my code, shown below, runs fine, but I'm not sure why it works.
main.cpp
#include <iostream>
#include <string>
#include <queue>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include "include/command.h"

using namespace std;

int main()
{
    string rawCommand;
    IndividualCommand tempCommand = {};
    int pipeFD[2] = {PIPE_IN, PIPE_OUT};
    int firstPipeRead, firstPipeWrite, secondPipeRead, secondPipeWrite;

    while (true)
    {
        cout << "% ";
        getline(cin, rawCommand);
        if (rawCommand == "exit")
            break;

        Command *command = new Command(rawCommand);
        deque<IndividualCommand> commandQueue = command->parse();
        delete command;

        while (!commandQueue.empty())
        {
            tempCommand = commandQueue.front();
            commandQueue.pop_front();

            firstPipeRead = secondPipeRead;
            firstPipeWrite = secondPipeWrite;
            if (tempCommand.outputStream == PIPE_OUT)
            {
                pipe(pipeFD);
                secondPipeRead = pipeFD[0];
                secondPipeWrite = pipeFD[1];
            }

            pid_t child_pid = fork();
            int status;

            // child process
            if (child_pid == 0)
            {
                if (tempCommand.redirectToFile != "")
                {
                    int fd = open(tempCommand.redirectToFile.c_str(), O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
                    dup2(fd, STDOUT_FILENO);
                    close(fd);
                }

                if (tempCommand.inputStream == PIPE_IN)
                {
                    close(firstPipeWrite);
                    dup2(firstPipeRead, STDIN_FILENO);
                    close(firstPipeRead);
                }

                if (tempCommand.outputStream == PIPE_OUT)
                {
                    close(secondPipeRead);
                    dup2(secondPipeWrite, STDOUT_FILENO);
                    close(secondPipeWrite);
                }

                if (tempCommand.argument != "")
                    execl(tempCommand.executable.c_str(), tempCommand.executable.c_str(), tempCommand.argument.c_str(), NULL);
                else
                    execl(tempCommand.executable.c_str(), tempCommand.executable.c_str(), NULL);
            }
            else
            {
                close(secondPipeWrite);
                if (commandQueue.empty())
                    waitpid(child_pid, &status, 0);
            }
        }
    }
    return 0;
}
command.h
#ifndef COMMAND_H
#define COMMAND_H

#include <string>
#include <queue>
#include <deque>
#include <sstream>
#include <unistd.h>

using namespace std;

#define PIPE_IN 0x100000
#define PIPE_OUT 0x100001

struct IndividualCommand
{
    string executable = "";
    string argument = "";
    string redirectToFile = "";
    int inputStream = STDIN_FILENO;
    int outputStream = STDOUT_FILENO;
    int errorStream = STDERR_FILENO;
};

class Command
{
private:
    string rawCommand, tempString;
    queue<string> splittedCommand;
    deque<IndividualCommand> commandQueue;
    stringstream commandStream;
    IndividualCommand tempCommand;
    bool isExecutableName;

public:
    Command(string rawCommand);
    deque<IndividualCommand> parse();
};

#endif
command.cpp
#include "include/command.h"
Command::Command(string rawCommand)
{
this->rawCommand = rawCommand;
isExecutableName = true;
}
deque<IndividualCommand> Command::parse()
{
commandStream << rawCommand;
while (!commandStream.eof())
{
commandStream >> tempString;
splittedCommand.push(tempString);
}
while (!splittedCommand.empty())
{
tempString = splittedCommand.front();
splittedCommand.pop();
if (isExecutableName)
{
tempCommand.executable = tempString;
isExecutableName = false;
if (!commandQueue.empty() && commandQueue.back().outputStream == PIPE_OUT)
tempCommand.inputStream = PIPE_IN;
}
else
{
// normal pipe
if (tempString == "|")
{
tempCommand.outputStream = PIPE_OUT;
isExecutableName = true;
commandQueue.push_back(tempCommand);
tempCommand = {};
}
// redirect to file
else if (tempString == ">")
{
tempCommand.redirectToFile = splittedCommand.front();
splittedCommand.pop();
}
// argv
else
tempCommand.argument = tempString;
}
if (splittedCommand.empty())
{
commandQueue.push_back(tempCommand);
tempCommand = {};
}
}
return commandQueue;
}
So basically the communication is established between two child processes, not between child and parent. (I'm using those first and second pipes to avoid overwriting FDs with consecutive calls to pipe() when facing something like "ls | cat | cat".)
The shell originally got stuck because the write end was not closed, and thus the read end blocked. I've tried closing everything in both child processes, but nothing changed.
My question is: why did close(secondPipeWrite); in the parent process solve everything? Does it mean that it is the write end of the pipe that really matters, and we don't have to care about whether the read end is closed explicitly?
Moreover, why don't I need to close anything in the child processes for it to still work?
Accidents will happen! Things will sometimes seem to work when there is no good reason for them to do so reliably. A multi-stage pipeline is not guaranteed to work if you do not close all the unused pipe descriptors properly, even though it happens to work for you. You aren't closing enough file descriptors in the child processes, in particular. You should close all the unused ends of all the pipes.
Here's a 'Rule of Thumb' I've included in other answers.
Rule of thumb: If you dup2() one end of a pipe to standard input or standard output, close both of the original file descriptors returned by pipe() as soon as possible. In particular, you should close them before using any of the exec*() family of functions.

The rule also applies if you duplicate the descriptors with either dup() or fcntl() with F_DUPFD or F_DUPFD_CLOEXEC.

If the parent process will not communicate with any of its children via the pipe, it must ensure that it closes both ends of the pipe early enough (before waiting, for example) so that its children can receive EOF indications on read (or get SIGPIPE signals or write errors on write), rather than blocking indefinitely.

Even if the parent uses the pipe without using dup2(), it should normally close at least one end of the pipe — it is extremely rare for a program to read and write on both ends of a single pipe.

Note that the O_CLOEXEC option to open(), and the FD_CLOEXEC and F_DUPFD_CLOEXEC options to fcntl(), can also factor into this discussion.

If you use posix_spawn() and its extensive family of support functions (21 functions in total), you will need to review how to close file descriptors in the spawned process (posix_spawn_file_actions_addclose(), etc.).

Note that using dup2(a, b) is safer than using close(b); dup(a); for a variety of reasons. One is that if you want to force the file descriptor to a larger than usual number, dup2() is the only sensible way to do that. Another is that if a is the same as b (e.g. both 0), then dup2() handles it correctly (it doesn't close b before duplicating a), whereas the separate close() and dup() fails horribly. This is an unlikely, but not impossible, circumstance.
Note that if the wrong process keeps a pipe descriptor open, it can prevent processes from detecting EOF. If the last process in a pipeline has the write end of a pipe open where a process (possibly itself) is reading until EOF on the read end of that pipe, the process will never get EOF.
Reviewing the C++ code
On the whole, your code was good. My default compilation options picked up two problems: close(firstPipeWrite) and close(firstPipeRead) operating on uninitialized variables. They were treated as errors because I compile with:
c++ -O3 -g -std=c++11 -Wall -Wextra -Werror -c -o main.o main.cpp
But that was all — which is remarkably good work.
However, those errors also point to where your problem is.
Let's suppose you have a command input which requires two pipes (P1 and P2) and three processes (or commands, C1, C2, C3), such as:
who | grep -v root | sort
You want the commands set up as follows:
C1: who — creates P1; standard input = stdin, standard output = P1[W]
C2: grep — creates P2; standard input = P1[R], standard output = P2[W]
C3: sort — creates no pipe; standard input = P2[R], standard output = stdout
The PN[R] notation means the read descriptor of pipe N, etc.
A more elaborate pipeline, such as who | awk '{print $1}' | sort | uniq -c | sort -n, with 5 commands and 4 pipes is similar: it simply has more processes CN (with N = 2, 3, 4) which create PN and run with standard input coming from P(N-1)[R] and standard output going to PN[W].
A two-command pipeline has just one pipe, of course, and the structure:
C1 — creates P1; standard input = stdin, standard output = P1[W]
C2 — creates no pipe; standard input = P1[R], standard output = stdout
And a one-command (degenerate) pipeline has zero pipes, of course, and the structure:
C1 — creates no pipe; standard input = stdin, standard output = stdout
Note that you need to know whether the command you're processing is first, last, or in the middle of the pipeline — the plumbing work to be done for each is different. Also, if you have a multi-command pipeline (three or more commands), you can close the older pipes after a while; they won't be needed again. So as you're processing C3, both ends of P1 can be closed permanently; they won't be referenced again. You need the input pipe and the output pipe for the current process; any older pipes can be closed by the process coordinating the plumbing.
You need to decide which process is coordinating the plumbing. The easiest way in some respects is to have the original (parent) shell process launch all the sub-processes, left-to-right — which is what you're doing — but it is by no means the only way.
With the shell process launching the child processes, it is crucial that the shell eventually close all the descriptors of all the pipes it opened, so that the child processes can detect EOF. This must be done before waiting for any of the children. Indeed, all the processes in the pipeline must be launched before the parent can afford to wait for any of them — those processes must run concurrently, in general, as otherwise, the pipes in the middle may fill up, blocking the entire pipeline.
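Here is a minimal sketch of that scheme, launching the commands left to right from the parent and closing every unused descriptor (not a drop-in fix for your code: the name run_pipeline and the cmds array of argv vectors are hypothetical, and error checking is omitted):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* Launch cmds[0] | cmds[1] | ... | cmds[n-1], left to right. */
static void run_pipeline(char **cmds[], int n)
{
    int in_fd = STDIN_FILENO;           /* read side feeding the current command */

    for (int i = 0; i < n; i++)
    {
        int fd[2] = { -1, -1 };
        if (i < n - 1)
            pipe(fd);                   /* pipe to the next command, if any */

        if (fork() == 0)
        {
            if (in_fd != STDIN_FILENO)
            {
                dup2(in_fd, STDIN_FILENO);
                close(in_fd);
            }
            if (i < n - 1)
            {
                dup2(fd[1], STDOUT_FILENO);
                close(fd[0]);           /* close both original pipe descriptors */
                close(fd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);         /* exec*() only returns on failure */
            _exit(EXIT_FAILURE);
        }

        /* Parent: the old input pipe and the new write end are no longer needed. */
        if (in_fd != STDIN_FILENO)
            close(in_fd);
        if (i < n - 1)
        {
            close(fd[1]);
            in_fd = fd[0];              /* the next command will read from here */
        }
    }

    while (wait(NULL) > 0)              /* reap all children, after launching them all */
        ;
}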
I'm going to point you at C Minishell — Adding Pipelines as a question with an answer showing how to do it. It is not the only way of doing it, and I'm not convinced it is the best way to do it, but it does work.
Sorting this out in your code is left as an exercise — I need to get some work done now. But this should give you strong pointers in the right direction.
Note that since your parent shell creates all the sub-processes, the waitpid() code is not ideal. You will have zombie processes accumulating. You'll need to think about a loop which collects any dead children, possibly with WNOHANG as part of the third argument so that when there are no zombies, the shell can continue. This becomes even more important when you run processes in background pipelines, etc.
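A hedged sketch of such a reaping loop, assuming the shell calls it once per prompt iteration (the helper name reap_children is hypothetical):
#include <sys/wait.h>

/* Collect any dead children without blocking, so zombies don't accumulate
 * while background pipelines run. */
static void reap_children(void)
{
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;   /* keep reaping until no exited child remains */
}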
Related
I am trying to make a program that takes a command including pipes and then executes it. This is a simplified version of it where I'm trying to pipe the ls and wc commands:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>

int main(){
    char* arglist1[] = {"ls", NULL}; // writing process
    char* arglist2[] = {"wc", NULL}; // reading process
    int pipefd[2];
    pid_t p1, p2;

    if (pipe(pipefd) < 0) {
        printf("\nPipe could not be initialized");
        return 0;
    }
    p1 = fork();
    if (p1 < 0) {
        printf("\nCould not fork");
        return 0;
    }
    if (p1 == 0) { // Child 1 executing it needs to write at the write end
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        close(pipefd[1]);
        if (execvp(arglist1[0], arglist1) < 0) {
            printf("\nCould not execute command 1..");
            exit(0);
        }
    } else { // Parent executing
        p2 = fork();
        if (p2 < 0) {
            printf("\nCould not fork");
            return 0;
        }
        if (p2 == 0) { // Child 2 executing it needs to read at the read end
            close(pipefd[1]);
            dup2(pipefd[0], STDIN_FILENO);
            close(pipefd[0]);
            if (execvp(arglist2[0], arglist2) < 0) {
                printf("\nCould not execute command 2..");
                exit(0);
            }
        } else { // parent executing, waiting for two children
            wait(NULL);
            wait(NULL);
        }
    }
    printf("\n");
    return 0;
}
Although there is error handling in the program, it neither prints anything nor terminates. Where is it blocking?
Your problem is that the parent doesn't close both the pipe's file descriptors, and the wc process won't die until it gets EOF on the pipe, and that won't happen until every process that has the write end of the pipe open has closed it. You need to close both ends of the pipe in the parent before waiting for the children to die.
Rule of thumb: If you dup2() one end of a pipe to standard input or standard output, close both of the original file descriptors returned by pipe() as soon as possible. In particular, you should close them before using any of the exec*() family of functions.

The rule also applies if you duplicate the descriptors with either dup() or fcntl() with F_DUPFD or F_DUPFD_CLOEXEC.

If the parent process will not communicate with any of its children via the pipe, it must ensure that it closes both ends of the pipe early enough (before waiting, for example) so that its children can receive EOF indications on read (or get SIGPIPE signals or write errors on write), rather than blocking indefinitely.

Even if the parent uses the pipe without using dup2(), it should normally close at least one end of the pipe — it is extremely rare for a program to read and write on both ends of a single pipe.

Note that the O_CLOEXEC option to open(), and the FD_CLOEXEC and F_DUPFD_CLOEXEC options to fcntl(), can also factor into this discussion.

If you use posix_spawn() and its extensive family of support functions (21 functions in total), you will need to review how to close file descriptors in the spawned process (posix_spawn_file_actions_addclose(), etc.).

Note that using dup2(a, b) is safer than using close(b); dup(a); for a variety of reasons. One is that if you want to force the file descriptor to a larger than usual number, dup2() is the only sensible way to do that. Another is that if a is the same as b (e.g. both 0), then dup2() handles it correctly (it doesn't close b before duplicating a), whereas the separate close() and dup() fails horribly. This is an unlikely, but not impossible, circumstance.
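Applied to the program above, one way to carry out the fix is to close both descriptors in the final parent branch, before the two waits; a sketch of just that branch:
} else { // parent executing, waiting for two children
    close(pipefd[0]);   // parent uses neither end of the pipe
    close(pipefd[1]);   // without this close, wc never sees EOF
    wait(NULL);
    wait(NULL);
}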
Side notes:
Error messages should be written to stderr, not stdout, and should end with a newline. They don't normally need to start with a newline.
You don't need to test the return value from the exec*() family of functions. If they succeed, they don't return; if they return, they failed. But it is important to have code after the exec*() call to trap the error.
The program should exit with a non-zero status (e.g. EXIT_FAILURE) if the exec*() function fails. Exiting with status zero reports success.
I've been trying to implement shell-like functionality with pipes in an application and I'm following this example. I will reproduce the code here for future reference in case the original is removed:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>

/**
 * Executes the command "cat scores | grep Villanova | cut -b 1-10".
 * This quick-and-dirty version does no error checking.
 *
 * @author Jim Glenn
 * @version 0.1 10/4/2004
 */
int main(int argc, char **argv)
{
    int status;
    int i;

    // arguments for commands; your parser would be responsible for
    // setting up arrays like these
    char *cat_args[] = {"cat", "scores", NULL};
    char *grep_args[] = {"grep", "Villanova", NULL};
    char *cut_args[] = {"cut", "-b", "1-10", NULL};

    // make 2 pipes (cat to grep and grep to cut); each has 2 fds
    int pipes[4];
    pipe(pipes);     // sets up 1st pipe
    pipe(pipes + 2); // sets up 2nd pipe

    // we now have 4 fds:
    // pipes[0] = read end of cat->grep pipe (read by grep)
    // pipes[1] = write end of cat->grep pipe (written by cat)
    // pipes[2] = read end of grep->cut pipe (read by cut)
    // pipes[3] = write end of grep->cut pipe (written by grep)

    // Note that the code in each if is basically identical, so you
    // could set up a loop to handle it. The differences are in the
    // indices into pipes used for the dup2 system call
    // and that the 1st and last only deal with the end of one pipe.

    // fork the first child (to execute cat)
    if (fork() == 0)
    {
        // replace cat's stdout with write part of 1st pipe
        dup2(pipes[1], 1);

        // close all pipes (very important!); end we're using was safely copied
        close(pipes[0]);
        close(pipes[1]);
        close(pipes[2]);
        close(pipes[3]);

        execvp(*cat_args, cat_args);
    }
    else
    {
        // fork second child (to execute grep)
        if (fork() == 0)
        {
            // replace grep's stdin with read end of 1st pipe
            dup2(pipes[0], 0);
            // replace grep's stdout with write end of 2nd pipe
            dup2(pipes[3], 1);

            // close all ends of pipes
            close(pipes[0]);
            close(pipes[1]);
            close(pipes[2]);
            close(pipes[3]);

            execvp(*grep_args, grep_args);
        }
        else
        {
            // fork third child (to execute cut)
            if (fork() == 0)
            {
                // replace cut's stdin with input read of 2nd pipe
                dup2(pipes[2], 0);

                // close all ends of pipes
                close(pipes[0]);
                close(pipes[1]);
                close(pipes[2]);
                close(pipes[3]);

                execvp(*cut_args, cut_args);
            }
        }
    }

    // only the parent gets here and waits for 3 children to finish
    close(pipes[0]);
    close(pipes[1]);
    close(pipes[2]);
    close(pipes[3]);

    for (i = 0; i < 3; i++)
        wait(&status);

    return 0;
}
I have trouble understanding why the pipes are being closed just before calling execvp and reading or writing any data. I believe it has something to do with passing EOF to processes so that they can stop reading/writing; however, I don't see how that helps before any actual data is pushed into the pipe. I'd appreciate a clear explanation. Thanks.
I have trouble understanding why the pipes are being closed just before calling execvp and reading or writing any data.
The pipes are not being closed. Rather, some file descriptors associated with the pipe ends are being closed. Each child process is duping pipe-end file descriptors onto one or both of its standard streams, then closing all pipe-end file descriptors that it is not actually going to use, which is all of the ones stored in the pipes array. Each pipe itself remains open and usable as long as each end is open in at least one process, and each child process holds at least one end of one pipe open. Those are closed when the child processes terminate (or at least under the control of the child processes, post execvp()).
One reason to perform such closures is for tidiness and resource management. There is a limit on how many file descriptors a process may have open at once, so it is wise to avoid leaving unneeded file descriptors open.
But also, functionally, a process reading from one of the pipes will not detect end of file until all open file descriptors associated with the write end of the pipe, in any process, are closed. That's what EOF on a pipe means, and it makes sense because as long as the write end is open anywhere, it is possible that more data will be written to it.
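Here is a tiny self-contained demonstration of that EOF rule (a hypothetical example, not taken from the program above): if you comment out the parent's close(fd[1]), the child blocks in read() forever.
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];
    pipe(fd);
    if (fork() == 0)
    {
        char buf[64];
        ssize_t n;
        close(fd[1]);                   /* child: close unused write end */
        while ((n = read(fd[0], buf, sizeof(buf))) > 0)
            write(STDOUT_FILENO, buf, n);
        _exit(0);                       /* read() returned 0: genuine EOF */
    }
    write(fd[1], "hello\n", 6);
    close(fd[0]);                       /* parent: close unused read end */
    close(fd[1]);                       /* ...and the write end, so the child sees EOF */
    wait(NULL);
    return 0;
}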
I am currently writing my own shell implementation in C. I understood the principle behind piping and redirecting the fds. However, some specific behavior with pipes has attracted my attention:
cat | ls (or any command that does not read from stdin as the final element of the pipeline).
In that case, what happens in the shell is that ls executes and cat asks for a single line before exiting (resulting from a SIGPIPE I guess). I have tried to follow this tutorial to better understand the principle behind multiple pipes: http://web.cse.ohio-state.edu/~mamrak.1/CIS762/pipes_lab_notes.html
Below is some code I have written to try to replicate the behavior I am looking for:
char *cmd1[] = {"/bin/cat", NULL};
char *cmd2[] = {"/bin/ls", NULL};
int pdes[2];
pid_t child;

if (!(child = fork()))
{
    pipe(pdes);
    if (!fork())
    {
        close(pdes[0]);
        dup2(pdes[1], STDOUT_FILENO);
        /* cat command gets executed here */
        execvp(cmd1[0], cmd1);
    }
    else
    {
        close(pdes[1]);
        dup2(pdes[0], STDIN_FILENO);
        /* ls command gets executed here */
        execvp(cmd2[0], cmd2);
    }
}
wait(NULL);
I am aware of the security flaws of that implementation but this is just for testing. The problem with that code as I understand it is that whenever ls gets executed, it just exits and then cat runs in the background somehow (and in my case fail because it tries to read during the prompt of zsh as my program exits). I cannot find a solution to make it work like it should be. Because if I wait for the commands one by one, such commands as cat /dev/random | head -c 10 would run forever...
If anyone has a solution for this issue or at least some guidance it would be greatly appreciated.
After consideration of comments from #thatotherguy here is the solution I found as implemented in my code. Please bear in mind that pipe and fork calls should be checked for errors but this version is meant to be as simple as possible. Extra exit calls are also necessary for some of my built-in commands.
void exec_pipe(t_ast *tree, t_sh *sh)
{
    int pdes[2];
    int status;
    pid_t child_right;
    pid_t child_left;

    pipe(pdes);
    if (!(child_left = fork()))
    {
        close(pdes[READ_END]);
        dup2(pdes[WRITE_END], STDOUT_FILENO);
        /* Execute command to the left of the tree */
        exit(execute_cmd(tree->left, sh));
    }
    if (!(child_right = fork()))
    {
        close(pdes[WRITE_END]);
        dup2(pdes[READ_END], STDIN_FILENO);
        /* Recursive call or execution of last command */
        if (tree->right->type == PIPE_NODE)
            exec_pipe(tree->right, sh);
        else
            exit(execute_cmd(tree->right, sh));
    }
    /* Should not forget to close both ends of the pipe */
    close(pdes[WRITE_END]);
    close(pdes[READ_END]);
    wait(NULL);
    waitpid(child_right, &status, 0);
    exit(get_status(status));
}
I was confused by the original link I posted and the different ways to handle chained pipes. From the link to the POSIX documentation posted below my original question (http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_02), it appears that:
If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.
Both behaviors are therefore acceptable: waiting for the last command, or waiting for all of them. I chose to implement the second behavior to stick to what bash/zsh do.
I am trying to make my own shell in C. It uses one pipe and the input (for now) is static. I execute commands using execvp.
Everything is fine except that when I run the command ls | grep ".c" I get no results. Can anyone show me where the problem is and how to solve it?
The shell so far:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int p[2];
int pid;
int r;

int main()
{
    char *ls[] = {"ls", NULL};
    char *grep[] = {"grep", "\".c\"", NULL};

    pipe(p);
    pid = fork();
    if (pid != 0) {
        // Parent: Output is to child via pipe[1]
        // Change stdout to pipe[1]
        dup2(p[1], 1);
        close(p[0]);
        r = execvp("ls", ls);
    } else {
        // Child: Input is from pipe[0] and output is via stdout.
        dup2(p[0], 0);
        close(p[1]);
        r = execvp("grep", grep);
        close(p[0]);
    }
    return r;
}
Remove the quotes in the argument to grep, i.e., use
char *grep[] = {"grep", ".c", NULL};
If you are calling execvp, the usual shell expansion of arguments (i.e., globbing, removal of quotes, etc.) does not happen, so effectively what you are doing is the same as
ls | grep '".c"'
in a normal shell.
Also be aware that nothing that comes after the call to execvp will execute: execvp replaces the current process and never returns (unless it fails).
You have multiple problems:
One problem is that you have far too few calls to close(). When you use dup2() to replicate a file descriptor from a pipe to standard input or standard output, you should close both file descriptors returned by pipe().
A second problem is that the shell removes double quotes around arguments, but you've added them around yours. You are looking for files whose name contains ".c" (where the double quotes are part of the file name being searched for). Use:
char *grep[] = { "grep", "\\.c$", NULL };
This looks for a dot and a c at the end of the line.
You should report failures after execvp(). If any of the exec*() functions returns, it failed. It can happen when the user mistypes a command name, for example. It is crucial that you report the error and that the child process then exits. If you don't do that, you can end up in a normal iterative shell (rather than this one-shot, non-iterative, non-interactive shell) with multiple shell processes all trying to read from the terminal at the same time, which leads to chaos and confusion.
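Putting those three fixes together, a corrected sketch of this one-shot program might look like the following (pipe() and fork() error checks are still omitted to stay close to the original):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char *ls[] = {"ls", NULL};
    char *grep[] = {"grep", "\\.c$", NULL};   /* no literal double quotes */
    int p[2];

    pipe(p);
    if (fork() != 0) {
        /* Parent: stdout goes to the child via the pipe. */
        dup2(p[1], STDOUT_FILENO);
        close(p[0]);                    /* close both original descriptors */
        close(p[1]);
        execvp("ls", ls);
        perror("ls");                   /* exec*() only returns on failure */
        exit(EXIT_FAILURE);
    } else {
        /* Child: stdin comes from the pipe. */
        dup2(p[0], STDIN_FILENO);
        close(p[0]);
        close(p[1]);
        execvp("grep", grep);
        perror("grep");
        exit(EXIT_FAILURE);
    }
}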
There is a program (Ubuntu 12.04 LTS, a single-core processor):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(){
    mode_t mode = S_IRUSR | S_IWUSR;
    int i = 0, fd, pid;
    unsigned char pi1 = 0x33, pi2 = 0x34;

    if((fd = open("res", O_WRONLY | O_CREAT | O_TRUNC, mode)) < 0){
        perror("open error");
        exit(1);
    }
    if((pid = fork()) < 0){
        perror("fork error");
        exit(1);
    }
    if(pid == 0) {
        if(write(fd, &pi2, 1) != 1){
            perror("write error");
            exit(1);
        }
    } else {
        if(write(fd, &pi1, 1) != 1){
            perror("write error");
            exit(1);
        }
    }
    close(fd);
    return 0;
}
The idea is to open the file for writing and then fork, so the position at which writes occur is shared by both processes. The strange thing is that when I run the program, its output to the file "res" is not consistent: sometimes I get 34, sometimes 4, sometimes 3. The question is why such output occurs. (After all, if the position is shared, then the output must be either 34 or 43.)
My suspicion is that the process is interrupted inside write(), after it has found the position at which to write.
When you spawn multiple processes with fork() there is no way to tell in which order they will be executed. It's up to the operating system's scheduler to decide.
So having multiple processes write to the same file is a recipe for disaster.
Regarding the question of why sometimes one of the two numbers gets omitted: write() first writes the data and then increments the file position. I think it is possible that control switches at exactly that moment, so that the second process writes before the file position has been updated, and thus overwrites the data the other process just wrote.
Here's what I think is happening.
You open the file and you fork
The parent gets to run first (similar stuff can happen if the child runs first)
The parent writes 3 and exits
The parent was the controlling process for the terminal so the kernel sends a SIGHUP to all members of the foreground group
The default action of SIGHUP is to terminate the process so the child silently dies
A simple way to test this is to add sleep:
sleep(5); /* Somewhere in the child, right before you write. */
You'll see the child process dies instantaneously: the write is never performed.
Another way to test this is to ignore the SIGHUP, before you fork:
sigignore(SIGHUP); /* Define _XOPEN_SOURCE 500 before including signal.h. */
/* You can also use signal(SIGHUP, SIG_IGN); */
You'll see the process now writes both digits to the file.
The overwriting hypothesis is unlikely. After fork, both processes share a link to the same file descriptor in a system-wide table which also contains the file offset.
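A short demonstration of that shared offset (a hypothetical example; demo.tmp is a scratch file name):
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd = open("demo.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fork() == 0)
    {
        write(fd, "xyz", 3);            /* child advances the shared offset */
        _exit(0);
    }
    wait(NULL);                          /* make sure the child's write is done */
    printf("offset seen by parent: %ld\n", (long)lseek(fd, 0, SEEK_CUR));  /* prints 3 */
    close(fd);
    return 0;
}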
I ran your program several times, and the result was "34" or "43".
So I wrote a shell script:
#!/bin/bash
for i in {1..500}
do
    ./your_program
    for line in $(cat res)
    do
        echo "$line"
    done
done
and ran your program 500 times. As we can see, it sometimes gets '3' or '4' (about 20 times in 500).
How can we explain this?
The answer is that when we fork() a child process, the child shares the same file descriptor and file state structure (which holds the current file offset).
Normally, one process gets offset=0 first and writes the first byte, after which the offset is 1; the other process then gets offset=1 and writes the second byte.
But sometimes, if the parent gets offset=0 from the file state structure and the child gets offset=0 at the same time, one process writes the first byte and the other overwrites it. The result will be "3" or "4" (depending on whether the parent or the child writes first), because both wrote the first byte of the file.
For fork and offset, see this.
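Under that explanation, one simple way to make the output deterministic is to serialize the writers, for example by having the parent wait for the child before writing. A sketch of just the modified branch of the program above (waitpid() requires <sys/wait.h>); this always produces "43":
if (pid == 0) {
    if (write(fd, &pi2, 1) != 1) {      /* child writes its byte first */
        perror("write error");
        exit(1);
    }
} else {
    waitpid(pid, NULL, 0);              /* parent waits, so the shared offset is already 1 */
    if (write(fd, &pi1, 1) != 1) {
        perror("write error");
        exit(1);
    }
}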
The standard says
Write requests of {PIPE_BUF} bytes or less shall not be interleaved with data from other processes doing writes on the same pipe.
Link - http://pubs.opengroup.org/onlinepubs/009696699/functions/write.html
The atomicity of write() is only guaranteed when writing to a pipe in amounts less than or equal to PIPE_BUF; it is not specified for a regular file, so we cannot assume it here.
So in this case a race condition is happening, and it results in incorrect data on some runs.
(On my system it also happens, after a few thousand runs or so.)
I think you should consider using a mutex/semaphore/some other locking primitive to solve this.
Are you sure you need fork() specifically? fork() creates a different process with a different memory space (file descriptors and so on). Maybe pthreads would suit you? With pthreads you'd share the same fd across all threads. But anyway, you should really think about using mutexes in your project.