Understanding pipe, fork and exec - C programming

I am trying to understand pipe, fork and exec in C, so I tried to write a little program that takes an input string and prints it out with the help of 2 child processes that run simultaneously.
Since the code is too long, I posted it at this link: https://pastebin.com/mNcRWkDg which I will use as a reference. I also posted a short version of my code at the bottom.
Example of what it should do with the input abcd:
> ./echo
abcd
result ->
abcd
I am taking the input through getline() and checking whether input_length is even or can be broken into even parts. If it's just one char, it just prints it out.
If it is, for example, abcd, i.e. has an input_length of 4, it is split into 2 parts, first_half ab and second_half cd, with the help of the struct parts like this:
struct parts p1;
split(input, &p1);
Then I set up the pipes for the first child and fork it, and then do the same for the second child. I redirect the first child's output to be the input of the parent process, and likewise for the second child. Let's assume that part works as it should.
Then I write to the child processes' inputs:
write(pipeEndsFirstChild2[1], p1.first_half, strlen(p1.first_half));
write(pipeEndsSecondChild2[1], p1.second_half, strlen(p1.second_half));
After that I open their outputs with fdopen() and read them with fgets().
At the end I allocate memory and concatenate both results with:
char *result = malloc(strlen(readBufFirstChild) + strlen(readBufSecondChild));
strcat(result, readBufFirstChild);
strcat(result, readBufSecondChild);
I used stderr to see the output since stdout is redirected and what I get is:
>./echo
abcd
result ->
cd
result ->
ab
result ->
����
Question:
How do I get child process 1 to give me ab first and then the second child to give me cd, i.e. how do I ensure the child processes run in the correct order? Since I am only printing so far, how do I save ab and cd between processes and concatenate them in the parent process to output them onto stdout?
If I try:
>./echo
ab
result ->
ab
everything works as expected, so I guess if I have to call child processes multiple times as in abcd input then something gets messed up. Why?
int main(int argc, char *argv[])
{
int status = 0;
char *input;
input = getLine();
int input_length = strlen(input);
if((input_length/2)%2 == 1 && input_length > 2)
{
usage("input must have even length");
}
if (input_length == 1)
{
fprintf(stdout, "%s", input);
}else
{
struct parts p1;
split(input, &p1);
int pipeEndsFirstChild1[2];
int pipeEndsFirstChild2[2];
.
.
.
pid_t pid1 = fork();
redirectPipes(pid1, pipeEndsFirstChild1, pipeEndsFirstChild2);
int pipeEndsSecondChild1[2];
int pipeEndsSecondChild2[2];
.
.
.
pid_t pid2 = fork();
redirectPipes(pid2, pipeEndsSecondChild1, pipeEndsSecondChild2);
// write to 1st and 2nd child input
write(pipeEndsFirstChild2[1], p1.first_half, strlen(p1.first_half));
write(pipeEndsSecondChild2[1], p1.second_half, strlen(p1.second_half));
.
.
.
// open output fd of 1st child
FILE *filePointer1 = fdopen(pipeEndsFirstChild1[0], "r");
// put output into readBufFirstChild
fgets(readBufFirstChild,sizeof(readBufFirstChild),filePointer1);
// open output fd of 2nd child
FILE *filePointer2 = fdopen(pipeEndsSecondChild1[0], "r");
// open output fd of 2st child
fgets(readBufSecondChild,sizeof(readBufSecondChild),filePointer2);
//concat results
char *result = malloc(strlen(readBufFirstChild) +
strlen(readBufSecondChild) + 1);
strcpy(result, readBufFirstChild);
strcat(result, readBufSecondChild);
fprintf(stderr, "result ->\n%s\n", result);
if(wait(&status) == -1){
exit(EXIT_FAILURE);
}
exit(EXIT_SUCCESS);
}
}

There's no way to control the order that child processes run if they both have input available to them.
The way to solve this in your application is that you shouldn't write to the second child until after you've read the response from the first child.
write(pipeEndsFirstChild2[1], p1.first_half, strlen(p1.first_half));
char readBufFirstChild[128];
FILE *filePointer1 = fdopen(pipeEndsFirstChild1[0], "r");
fgets(readBufFirstChild,sizeof(readBufFirstChild),filePointer1);
write(pipeEndsSecondChild2[1], p1.second_half, strlen(p1.second_half));
char readBufSecondChild[128];
FILE *filePointer2 = fdopen(pipeEndsSecondChild1[0], "r");
fgets(readBufSecondChild,sizeof(readBufSecondChild),filePointer2);
I've omitted the error checking and closing of all the unnecessary pipe ends.
You only need to do this because each process prints its portion of the result to stderr, so you care about the order in which they run. Normally you shouldn't care about that order, since the children can contribute their portions of the final result in any order. If only the original parent process displayed the result, your code would be fine.
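A minimal sketch of that sequencing, under some assumptions: child_echo, ask_child and echo_in_order are hypothetical names, the child simply echoes its input instead of exec'ing the real program, and each child is forked only when its turn comes (a simplification of the question's setup, where both are forked up front). The parent finishes the whole exchange with the first child before the second one receives any input:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real child program: copy fd `in` to fd `out`. */
static void child_echo(int in, int out)
{
    char buf[128];
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write(out, buf, (size_t)n) != n)
            _exit(EXIT_FAILURE);
    }
    _exit(EXIT_SUCCESS);
}

/* Send `text` to a freshly forked child and collect its reply in `reply`. */
static int ask_child(const char *text, char *reply, size_t size)
{
    int to_child[2], from_child[2];
    assert(size > 0);
    if (pipe(to_child) == -1 || pipe(from_child) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {                 /* child keeps only the ends it uses */
        close(to_child[1]);
        close(from_child[0]);
        child_echo(to_child[0], from_child[1]);
    }
    close(to_child[0]);
    close(from_child[1]);

    ssize_t w = write(to_child[1], text, strlen(text));
    close(to_child[1]);             /* EOF tells the child its input is complete */

    size_t used = 0;
    ssize_t n;
    while (used < size - 1 &&
           (n = read(from_child[0], reply + used, size - 1 - used)) > 0)
        used += (size_t)n;
    reply[used] = '\0';
    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return w < 0 ? -1 : 0;
}

/* Talk to child 1 completely before child 2 is given any input. */
static char *echo_in_order(const char *first_half, const char *second_half)
{
    char a[128], b[128];
    if (ask_child(first_half, a, sizeof a) == -1 ||
        ask_child(second_half, b, sizeof b) == -1)
        return NULL;
    char *result = malloc(strlen(a) + strlen(b) + 1);   /* +1 for the NUL */
    if (result != NULL) {
        strcpy(result, a);
        strcat(result, b);
    }
    return result;
}
```

Calling echo_in_order("ab", "cd") should give back a heap-allocated "abcd" that the caller frees. Note that the result buffer is sized with + 1 and filled with strcpy() first, so strcat() never runs on uninitialized memory.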

Related

Communication between processes - pipe and fifo

I need to create program with 3 processes:
The first process should repeatedly read /dev/urandom and send 15 chars each cycle to the second process via a pipe.
The second process should convert received data to hex and send the result to the third process via a fifo.
The third process should print the received data.
This is what I wrote so far. Communication using the pipe is working fine, however there is some problem with the fifo - when I change n to a larger number such as 100000 or 1000000, the program doesn't start. When it's smaller, say 500 or 1000, the program works. What could be the reason behind that?
This is how I run it:
cat /dev/urandom | ./a.out
And here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#define FIFO "my_fifo"
int main(void) {
int pdesk[2];
char buf[15];
char buffer[15];
char hex[30];
char f[30];
int len;
int n;
n = 100;
umask(0);
mkfifo(FIFO, 0666);
pipe(pdesk);
if (fork() == 0) {
for (int i = 0; i < n; i++) {
read(STDIN_FILENO, buffer, 15);
write(pdesk[1], buffer, 15);
}
close(pdesk[1]);
} else {
sleep(1);
int fp;
for(int i = 0; i < n; i++) {
read(pdesk[0], buf, 15);
for(int a = 0, b = 0; b < 30; ++a, b+= 2)
sprintf(hex + b, "%02x", buf[a] & 0xff);
fp = open(FIFO, O_WRONLY);
write(fp, hex, 30);
close(fp);
usleep(10000);
}
close(pdesk[0]);
}
if (fork() == 0) {
sleep(2);
int fp;
for (int i = 0; i < n; i++) {
fp = open(FIFO, O_RDONLY);
read(fp, f, 30);
printf("Read: %s\n", f); /* message translated from Polish ("Odczytano") */
close(fp);
usleep(10000);
}
}
}

If I understand your code correctly, it will do the following:
With the first fork you start a child that reads from stdin and writes to the pipe.
Your parent process reads from the pipe and writes to the FIFO.
When your parent process has finished its loop it calls the second fork to create another child which will read from the FIFO and print the data.
When the loop count is too large, a buffer limit is reached and the parent blocks because no process is reading from the FIFO (note that open() with O_WRONLY on a FIFO already blocks until some process has opened it for reading). While the parent is blocked on the FIFO, it never creates the child that is expected to read from it.
I think the main problem is that you should create the second child before starting the loop that reads the data from the pipe and writes to the FIFO.
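A rough sketch of that ordering, with several assumptions: fifo_with_early_reader is a hypothetical name, the FIFO is opened once instead of once per record, the records are dummy data, and the reader reports its byte count through its exit status purely so that the parent can check it (which limits the demo to fewer than 255 bytes). The point is only that the reader is forked before the writer touches the FIFO:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write `rounds` 30-byte records into the FIFO at `path`.
 * Returns the number of bytes the reader child received, or -1 on error. */
static int fifo_with_early_reader(const char *path, int rounds)
{
    assert(rounds > 0 && rounds * 30 < 255);
    if (mkfifo(path, 0600) == -1 && errno != EEXIST)
        return -1;

    pid_t reader = fork();              /* the reader exists before any write */
    if (reader == -1)
        return -1;
    if (reader == 0) {
        char f[30];
        int total = 0;
        ssize_t n;
        int fd = open(path, O_RDONLY);  /* blocks until the writer opens */
        if (fd == -1)
            _exit(255);
        while ((n = read(fd, f, sizeof f)) > 0)
            total += (int)n;
        _exit(total);                   /* demo only: count via exit status */
    }

    int fd = open(path, O_WRONLY);      /* returns once the reader has opened */
    if (fd == -1)
        return -1;
    char hex[30];
    memset(hex, 'a', sizeof hex);       /* dummy record instead of real hex data */
    for (int i = 0; i < rounds; i++) {
        if (write(fd, hex, sizeof hex) != (ssize_t)sizeof hex) {
            close(fd);
            return -1;
        }
    }
    close(fd);                          /* EOF ends the reader's loop */

    int status;
    if (waitpid(reader, &status, 0) == -1)
        return -1;
    unlink(path);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because both sides open the FIFO before any data flows, neither open() can wait forever for a peer that has not been created yet.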
Some additional remarks:
With cat /dev/urandom | ./a.out your program does not read /dev/urandom directly. It reads from a pipe which might behave differently.
You should always check the return value of read. It will tell you how many bytes it has read which may be less than you asked it to read. If you want to have exactly 15 characters you might have to read several times if you get less than 15 characters.
The same applies to write.
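The last two remarks can be sketched as a pair of helpers. The names read_full and write_full are made up for this example; they simply loop until the requested number of bytes has been transferred, EOF is reached, or a real error occurs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `count` bytes unless EOF or an error comes first.
 * Returns the number of bytes read (short only at EOF), or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;
    assert(buf != NULL);
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)
            break;                  /* EOF: return what we have so far */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Write all `count` bytes, retrying after short writes.
 * Returns `count` on success, or -1 on error. */
static ssize_t write_full(int fd, const void *buf, size_t count)
{
    size_t done = 0;
    assert(buf != NULL);
    while (done < count) {
        ssize_t n = write(fd, (const char *)buf + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

In the program above, read(STDIN_FILENO, buffer, 15) would become read_full(STDIN_FILENO, buffer, 15), and a return value below 15 would mean that the input ended early.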

Thank you very much. When the process that displays data is above other child processes, it finally works.
With cat /dev/urandom | ./a.out your program does not read /dev/urandom directly. It reads from a pipe which might behave differently.
How could I change it?
The program also needs to read files the same way it reads from /dev/urandom, for example:
cat file.txt | ./a.out
I took your advice and started checking the return value of read, and now it doesn't read past the end of the file. The problem is that I don't know how to check which argument was passed (and hence I can't check the length of the file): whether it was file.txt, /dev/urandom, none or anything else. I tried with
int main(char argc, char* argv[])
but argv is always ./a.out, no matter what I call. Is there any way to check that?

write() prints in wrong order

I'm trying to output an ordered set of lines created by multiple processes.
I found that printf() and fprintf() are not suitable for such a task. Right now I'm using this set of commands:
sprintf(buff,"%d: some string", (*counter)++); //build string (write can't do that)
write(file, buff, strlen(buff));
memset(buff, 0, strlen(buff)); //clear buffer before next writing
Opening the file and starting the processes are shown below:
int file;
int main(){
pid_t busPID, riderPID;
file = open("outputFile.txt", O_WRONLY | O_CREAT | O_APPEND | O_TRUNC, 0666);
if((busPID = fork()) == 0){
bus();
}else{
if((riderPID = fork()) == 0){
for(int i = 0; i < 10; i++){
rider();
}
}else{
int status;
pid_t waitPID;
while ((waitPID = wait(&status)) > 0); //wait for all riders to finish
}
}
waitpid(busPID, NULL, 0);
return 0;
}
Here are the functions that print the output:
void bus() {
char buff[50];
//writing
do {
//writing
if(*onStop > 0) {
//writing
sem_post(semRider);
sem_wait(semBus);
//writing
*onStop = 0; //internal value, irrelevant for answer
}
//writing
usleep(rand()%busSleep);
//writing
sem_post(semRider);
sem_wait(semBus);
departuredAkt += temp; //internal value, irrelevant for answer
} while(departuredAkt < departuredTotal);
//writing
exit(EXIT_SUCCESS); //exit this process
}
void rider() {
char buff[50];
//writing
int pos = ++(*onStop);
//writing
sem_wait(semRider);
//writing
sem_post(semBus);
sem_wait(semRider);
//writing
sem_post(semBus);
exit(EXIT_SUCCESS);
}
There is only 1 process using the bus() function and N processes using the rider() function (N is specified by an argument). The desired output is:
1: bus string
2: bus string
3: bus string
4: rider 1 string
5: rider 1 string
.
.
.
25: rider N string
26: bus string
My current output looks like this:
1: bus string
2: bus string
3: bus string
4: rider 1 string
6: bus string //here is the problem
5: rider 1 string
The question is: how can I get the lines printed in the correct order?

First of all, a side note: avoid sprintf, because it does no bounds checking on the destination buffer. Use snprintf instead. Example:
snprintf (buff, sizeof (buff), "%d: some string", (*counter)++);
Second: your question is missing information we need in order to understand it. I mean the following:
How exactly did you open the file?
How did you start your processes?
Are these really processes, or are they threads?
Did you open the file in one process and somehow share the open file with the other processes, or did you open the file in each process separately?
These details are critical for understanding your question.
Next time you write a question, please provide a FULL EXAMPLE, i.e. a minimal working example we can compile and run. It should include all relevant details (starting processes, opening files, etc.), and of course you should strip all unnecessary details.
Okay, there are two different notions in POSIX: "file descripTOR" and "file descripTION". Search the web for them. Type "man 2 open" in a UNIX shell and read carefully; that manual page discusses the distinction.
The exact details of how you started your processes and how you opened your file determine whether the file description is shared between the processes, and thus affect the behavior of "write".
I wrote a long text about file descriptors and descriptions. I put it here: https://zerobin.net/?eb2d99ee02f36b92#hQY7vTMCD9ekThAod+bmjlJgnlBxyDSXCYcgmjVSu2w= , because it is not very relevant to this question, but it should still be useful for education.
Okay, what to do?
Well, if for whatever reason you cannot share ONE file DESCRIPTION, then simply open the file with O_APPEND. :) You don't need to open the file every time you write to it. Simply open it with O_APPEND once in each process and all will be OK.
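As an illustration of that advice, here is a small sketch in which every process opens the file itself with O_APPEND and emits each line with one write() call. The helper names open_log, append_line and append_from_child are hypothetical, and the line numbers are passed in by the caller rather than taken from the question's shared counter:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Each process calls this once; O_APPEND makes every write go to the end. */
static int open_log(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0666);
}

/* Format one numbered line and emit it with a single write() call. */
static int append_line(int fd, int number, const char *text)
{
    char buff[64];
    int len = snprintf(buff, sizeof buff, "%d: %s\n", number, text);
    if (len < 0 || (size_t)len >= sizeof buff)
        return -1;                  /* formatting error or line truncated */
    return write(fd, buff, (size_t)len) == len ? 0 : -1;
}

/* Fork a child that opens its own descriptor and appends one line. */
static int append_from_child(const char *path, int number, const char *text)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        int fd = open_log(path);
        _exit(fd >= 0 && append_line(fd, number, text) == 0 ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

O_APPEND only guarantees that each write lands at the current end of the file. If the numbers must also appear in increasing order, incrementing the shared counter and writing the line presumably have to happen under the same lock (for example a semaphore), so that no other process can slip in between the two steps.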

Tee - mimicking program only writes the initial input to file, ignoring all sequential inputs

So I have a mytee program (with much much less functionality). Trying to learn how to work with pipes / children / etc
(1) I do pipe
(2) Create the file(s)
(3) fork
(4) the parent does scanf to get the text
(5) sends the text to the pipe
(6) child receives it and writes it to files
-> #4 should be a loop until the user writes '.'
-> #6 should continue writing new lines, but somewhere there is a breakdown.
Some of the things that I think it might be:
1. Something is wrong with my permissions (but O_APPEND is there, and not sure what else I would need)
2. there may be a problem in parent do while loop, where it should send the msg to the pipe (fd[1])
3. #6, where I strongly think my problem lies. After the initial write it doesn't continue writing. I am not sure if I need to somehow keep track of the number of bytes already written, but if that was the case I would expect the last message to be there, not the first.
I'm pretty much at a loss right now
I run it using
./mytee test1
Code:
ret = pipe (fd);
if (ret == -1)
{
perror ("pipe");
return 1;
}
for (i=0;i<argc-1;i++) {
if ((filefd[i] = open(argv[i+1], O_CREAT|O_TRUNC|O_WRONLY|O_APPEND, 0644)) < 0) {
perror(argv[i]); /* open failed */
return 1;
}
}
pid = fork();
if (pid==0) /* child */
{
int read_data;
do {
read_data = read(fd[0], buffer, sizeof(buffer));
for(i=0;i<argc;i++) {
write(filefd[i], buffer, read_data);
}
} while (read_data > 1);
for (i=0; i<argc; i++)
close(filefd[i]);
return 0;
}
else { /* parent */
char msg[20];
do{
scanf("%s",msg);
write(fd[1],msg,sizeof(msg));
}while (strcmp(msg,".")!=0);
while ((pid = wait(&status)) != -1)
fprintf(stderr, "process %d exits with %d\n", pid, WEXITSTATUS(status));
return 0;
}
Adding Output:
$ ./a.out test1
qwe
asd
zxc
.
^C
It doesn't exit properly. I think the child is stuck in the loop
And the contents of test1:
qwe

Working through this with the OP, reportedly the problem was unconditionally writing all 20 bytes of msg instead of just the NUL-terminated string contained within it. Suggested minimal fix: change
scanf("%s",msg);
write(fd[1],msg,sizeof(msg));
to
scanf("%19s",msg);
write(fd[1],msg,strlen(msg));

I see a couple of issues, which could potentially cause that behaviour.
Firstly, your loop condition doesn't look right. Currently it will terminate if a single byte is read. Change it to this:
while (read_data > 0);
The other issue I see is that you're writing to more files than you opened. Make sure you loop to argc-1, not argc:
for (i=0; i<argc-1; i++)
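Putting both fixes together, the child's copy loop might look like the sketch below. tee_copy is a hypothetical helper; it takes the number of opened files explicitly instead of deriving it from argc, stops only when read() reports EOF or an error, and writes exactly the bytes that were read:

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy everything from in_fd to each of the n_out descriptors.
 * Returns the number of bytes copied, or -1 on a read or write error. */
static ssize_t tee_copy(int in_fd, const int *out_fds, int n_out)
{
    char buffer[256];
    ssize_t n, total = 0;
    assert(n_out >= 0);
    while ((n = read(in_fd, buffer, sizeof buffer)) > 0) {   /* 0 means EOF */
        for (int i = 0; i < n_out; i++) {
            if (write(out_fds[i], buffer, (size_t)n) != n)
                return -1;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

For the loop to end, read() has to see EOF, which only happens once every copy of the pipe's write end is closed: the parent closes fd[1] when the user types '.', and the child closes its own inherited fd[1] right after fork().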

using sort with dup2

I'm experimenting with the dup2 call on Linux. I've written the following code:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main()
{
int pipe1_ends[2];
int pipe2_ends[2];
char string[] = "this \n is \n not \n sorted";
char buffer[100];
pid_t pid;
pipe(pipe1_ends);
pipe(pipe2_ends);
pid = fork();
if(pid > 0) { /* parent */
close(pipe1_ends[0]);
close(pipe2_ends[1]);
write(pipe1_ends[1],string,strlen(string));
read(pipe2_ends[0], buffer, 100);
printf("%s",buffer);
return 0;
}
if(pid == 0) { /* child */
close(pipe1_ends[1]);
close(pipe2_ends[0]);
dup2(pipe1_ends[0], 0);
dup2(pipe2_ends[1],1);
char *args[2];
args[0] = "/usr/bin/sort";
args[1] = NULL;
execv("/usr/bin/sort",args);
}
return 0;
}
I expect this program to behave as follows:
It should fork a child and replace its image with sort process. And since the stdin and stdout are replaced with dup2 command, I expect sort to read input from the pipe and write the output into the other pipe which is printed by the parent. But the sort program doesn't seem to be reading any input. If no commandline argument is given, sort reads it from the stdin right? Can someone help me with this problem, please.
Many thanks!

Hm. What's happening is that you aren't finishing your write: after sending data to the child process, you have to tell it you're done writing by closing pipe1_ends[1] (shutdown(2) is not an option here, since it only works on sockets). You should also call write/read in a loop, since in the general case a single read is quite likely not to give you all the results in one go. Obviously the full code checks all return values, doesn't it?
One final thing: your printf is badly broken. It only accepts null-terminated strings, and the data returned by read is not null-terminated (it's a buffer-with-length, the other common way of knowing where the end is). You want:
int n = read(pipe2_ends[0], buffer, 99);
if (n < 0) { perror("read"); exit(1); }
buffer[n] = 0;
printf("%s",buffer);
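Here is one way the whole exchange could look with those points applied. It is a sketch, not the asker's final code: sort_lines is a hypothetical name, sort is looked up on PATH with execlp() instead of the hard-coded /usr/bin/sort, and the input is assumed to be small enough for a single write():

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Feed `input` to sort(1) and collect its output as a NUL-terminated
 * string in `out`. Returns the number of bytes stored, or -1 on error. */
static ssize_t sort_lines(const char *input, char *out, size_t size)
{
    int to_sort[2], from_sort[2];
    assert(size > 0);
    if (pipe(to_sort) == -1 || pipe(from_sort) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {                          /* child becomes sort */
        dup2(to_sort[0], STDIN_FILENO);
        dup2(from_sort[1], STDOUT_FILENO);
        close(to_sort[0]);
        close(to_sort[1]);
        close(from_sort[0]);
        close(from_sort[1]);
        execlp("sort", "sort", (char *)NULL);
        _exit(127);                          /* only reached if exec failed */
    }
    close(to_sort[0]);
    close(from_sort[1]);

    ssize_t w = write(to_sort[1], input, strlen(input));
    close(to_sort[1]);       /* EOF: sort prints nothing before it sees this */

    size_t used = 0;
    ssize_t n = 0;
    while (w >= 0 && used < size - 1 &&
           (n = read(from_sort[0], out + used, size - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';        /* read() does not terminate the data itself */
    close(from_sort[0]);
    waitpid(pid, NULL, 0);
    return (w < 0 || n < 0) ? -1 : (ssize_t)used;
}
```

With the input "this\nis\nnot\nsorted\n" the buffer should end up holding the four lines in the order is, not, sorted, this.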

Implementing pipelining in C. What would be the best way to do that?

I can't think of any way to implement pipelining in C that would actually work. That's why I've decided to ask here. I should say that I understand how pipe/fork/mkfifo work. I've seen plenty of examples of implementing pipelines of 2-3 commands. That's easy. My problem starts when I have to implement a shell and the number of pipelines is unknown.
What I've got now:
eg.
ls -al | tr a-z A-Z | tr A-Z a-z | tr a-z A-Z
I transform such line into something like that:
array[0] = {"ls", "-al", NULL}
array[1] = {"tr", "a-z", "A-Z", NULL}
array[2] = {"tr", "A-Z", "a-z", NULL}
array[3] = {"tr", "a-z", "A-Z", NULL}
So I can use
execvp(array[0],array)
later on.
Until now, I believe everything is OK. The problem starts when I try to redirect the input/output of those commands to each other.
Here's how I'm doing that:
mkfifo("queue", 0777);
for (i = 0; i<= pipelines_count; i++) // eg. if there's 3 pipelines, there's 4 functions to execvp
{
int b = fork();
if (b == 0) // child
{
int c = fork();
if (c == 0)
// baby (younger than child)
// I use c process, to unblock desc_read and desc_writ for b process only
// nothing executes in here
{
if (i == 0) // 1st pipeline
{
int desc_read = open("queue", O_RDONLY);
// dup2 here, so after closing there's still something that can read from
// from desc_read
dup2(desc_read, 0);
close(desc_read);
}
if (i == pipelines_count) // last pipeline
{
int desc_write = open("queue", O_WRONLY);
dup2(desc_write, 0);
close(desc_write);
}
if (i > 0 && i < pipelines_count) // pipeline somewhere inside
{
int desc_read = open("queue", O_RDONLY);
int desc_write = open("queue", O_WRONLY);
dup2(desc_write, 1);
dup2(desc_read, 0);
close(desc_write);
close(desc_read);
}
exit(0); // closing every connection between process c and pipeline
}
else
// b process here
// in b process, i execvp commands
{
if (i == 0) // 1st pipeline (changing stdout only)
{
int desc_write = open("queue", O_WRONLY);
dup2(desc_write, 1); // changing stdout -> pdesc[1]
close(desc_write);
}
if (i == pipelines_count) // last pipeline (changing stdin only)
{
int desc_read = open("queue", O_RDONLY);
dup2(desc_read, 0); // changing stdin -> pdesc[0]
close(desc_read);
}
if (i > 0 && i < pipelines_count) // pipeline somewhere inside
{
int desc_write = open("queue", O_WRONLY);
dup2(desc_write, 1); // changing stdout -> pdesc[1]
int desc_read = open("queue", O_RDONLY);
dup2(desc_read, 0); // changing stdin -> pdesc[0]
close(desc_write);
close(desc_read);
}
wait(NULL); // it wait's until, process c is death
execvp(array[0],array);
}
}
else // parent (waits for 1 sub command to be finished)
{
wait(NULL);
}
}
Thanks.

Patryk, why are you using a fifo, and moreover the same fifo for each stage of the pipeline?
It seems to me that you need a pipe between each stage. So the flow would be something like:
Shell             ls                tr                tr
-----             ----              ----              ----
pipe(fds);
fork();
close(fds[0]);    close(fds[1]);
                  dup2(fds[0],0);
                  pipe(fds);
                  fork();
                  close(fds[0]);    close(fds[1]);
                  dup2(fds[1],1);   dup2(fds[0],0);
                  exec(...);        pipe(fds);
                                    fork();
                                    close(fds[0]);    etc
                                    dup2(fds[1],1);
                                    exec(...);
The sequence that runs in each forked shell (close, dup2, pipe etc) would seem like a function (taking the name and parameters of the desired process). Note that up until the exec call in each, a forked copy of the shell is running.
Edit:
Patryk:
Also, is my thinking correct? Shall it work like that? (pseudocode):
start_fork(ls) -> end_fork(ls) -> start_fork(tr) -> end_fork(tr) ->
start_fork(tr) -> end_fork(tr)
I'm not sure what you mean by start_fork and end_fork. Are you implying that ls runs to completion before tr starts? This isn't really what is meant by the diagram above. Your shell will not wait for ls to complete before starting tr. It starts all of the processes in the pipe in sequence, setting up stdin and stdout for each one so that the processes are linked together, stdout of ls to stdin of tr; stdout of tr to stdin of the next tr. That is what the dup2 calls are doing.
The order in which the processes run is determined by the operating system (the scheduler), but clearly if tr runs and reads from an empty stdin it has to wait (block) until the preceding process writes something to the pipe. It is quite possible that ls might run to completion before tr even reads from its stdin, but it is equally possible that it won't. For example, if the first command in the chain was something that ran continually and produced output along the way, the second in the pipeline would get scheduled from time to time to process whatever the first sends along the pipe.
Hope that clarifies things a little :-)
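For an unknown number of stages, the same idea can be written as a loop. The sketch below is one possible arrangement rather than the exact flow in the diagram: here the shell itself forks every stage and keeps only the read end of the previous pipe. run_pipeline and its out_fd parameter (where the last stage writes) are hypothetical names introduced for this example:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `n` commands connected by pipes; the last one writes to out_fd.
 * Each cmds[i] is a NULL-terminated argv array. Returns 0, or -1 on error. */
static int run_pipeline(char **cmds[], int n, int out_fd)
{
    int prev_read = -1;             /* read end of the previous stage's pipe */
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        int fds[2] = { -1, -1 };
        int last = (i == n - 1);
        if (!last && pipe(fds) == -1)
            return -1;

        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {             /* forked copy of the shell */
            if (prev_read != -1) {
                dup2(prev_read, STDIN_FILENO);
                close(prev_read);
            }
            if (!last) {
                close(fds[0]);
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
            } else if (out_fd != STDOUT_FILENO) {
                dup2(out_fd, STDOUT_FILENO);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);             /* only reached if exec failed */
        }
        /* Parent: drop the ends it no longer needs, so EOF can propagate. */
        if (prev_read != -1)
            close(prev_read);
        if (!last)
            close(fds[1]);
        prev_read = fds[0];
    }
    /* All stages are now running concurrently; reap them. */
    for (int i = 0; i < n; i++)
        wait(NULL);
    return 0;
}
```

The shell does not wait inside the loop: every stage is started first, and only then are the children reaped. Closing the write end in the parent after each fork matters, because a stage only sees EOF on its stdin once no process holds that write end open.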

It might be worth using libpipeline. It takes care of all the effort on your part and you can even include functions in your pipeline.

The problem is you're trying to do everything at once. Break it into smaller steps instead.
1) Parse your input to get ls -al | out of it.
1a) From this you know you need to create a pipe, move it to stdout, and start ls -al. Then move the pipe to stdin. There's more coming of course, but you don't worry about it in code yet.
2) Parse the next segment to get tr a-z A-Z |. Go back to step 1a as long as your next-to-spawn command's output is being piped somewhere.
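The parsing half of those steps could be sketched as follows. split_pipeline and MAX_ARGS are made-up names, and the sketch deliberately ignores quoting, escapes and empty segments; it only cuts the line at '|' and then at whitespace, producing one NULL-terminated argv per command:

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r */
#include <assert.h>
#include <string.h>

#define MAX_ARGS 16

/* Split `line` in place at '|' and then at whitespace.
 * stages[i] becomes a NULL-terminated argv for command i.
 * Returns the number of commands, or -1 if a limit is exceeded. */
static int split_pipeline(char *line, char *stages[][MAX_ARGS], int max_stages)
{
    int count = 0;
    char *save_cmd = NULL;
    assert(line != NULL);
    for (char *cmd = strtok_r(line, "|", &save_cmd); cmd != NULL;
         cmd = strtok_r(NULL, "|", &save_cmd)) {
        if (count == max_stages)
            return -1;
        int argc = 0;
        char *save_arg = NULL;
        for (char *arg = strtok_r(cmd, " \t\n", &save_arg); arg != NULL;
             arg = strtok_r(NULL, " \t\n", &save_arg)) {
            if (argc == MAX_ARGS - 1)
                return -1;          /* keep room for the NULL terminator */
            stages[count][argc++] = arg;
        }
        stages[count][argc] = NULL;
        count++;
    }
    return count;
}
```

Each stages[i] can then be handed to execvp(stages[i][0], stages[i]), and the number of pipes needed is the returned count minus one.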

Implementing pipelining in C. What would be the best way to do that?
This question is a bit old, but here's an answer that was never provided. Use libpipeline, a pipeline manipulation library. Its use case comes from one of the man page maintainers, who frequently had to use a command like the following (and work around associated OS bugs):
zsoelim < input-file | tbl | nroff -mandoc -Tutf8
Here's the libpipeline way:
pipeline *p;
int status;
p = pipeline_new ();
pipeline_want_infile (p, "input-file");
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
status = pipeline_run (p);
The libpipeline homepage has more examples. The library is also included in many distros, including Arch, Debian, Fedora, Linux from Scratch and Ubuntu.
