I am writing some C code that involves the use of pipes. To make a child process use my pipe instead of STDOUT for output, I used the following lines:
close(STDOUT);
dup2(leftup[1], STDOUT);
However, it seems to go into some sort of infinite loop or hang on those lines. When I get rid of close, it hangs on dup2.
Curiously, the same idea works in the immediately preceding line for STDIN:
close(STDIN);
dup2(leftdown[0], STDIN);
What could be causing this behavior?
Edit: Just to be clear...
#define STDIN 0
#define STDOUT 1
Edit 2: Here is a stripped-down example:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#define STDIN 0
#define STDOUT 1
int main(void){
pid_t child1 = 0;
int leftdown[2];
if (pipe(leftdown) != 0)
printf("ERROR");
int leftup[2];
if (pipe(leftup) != 0)
printf("ERROR");
printf("MADE PIPES");
child1 = fork();
if (child1 == 0){
close(STDOUT);
printf("TEST 1");
dup2(leftup[1], STDOUT);
printf("TEST 2");
exit(0);
}
return(0);
}
The "TEST 1" line is never reached. The only output is "MADE PIPES".
At a minimum, you should check that the dup2 function returns the new file descriptor rather than -1.
There's always a possibility that it will give you an error (for example, if the pipe() call failed previously). In addition, be absolutely certain that you're using the right indexes (0 and 1) - I've been bitten by that before and it depends on whether you're in the parent or child process.
Based on your edit, I'm not the least bit surprised that MADE PIPES is the last thing printed.
When you try to print TEST 1, you have already closed the STDOUT descriptor so that will go nowhere.
When you try to print TEST 2, you have duped the STDOUT descriptor so that will go to the parent but your parent doesn't read it.
If you change your forking code to:
child1 = fork();
if (child1 == 0){
int count;
close(STDOUT);
count = printf("TEST 1\n");
dup2(leftup[1], STDOUT);
printf("TEST 2 (%d)\n", count);
exit(0);
} else {
char buff[80];
read (leftup[0], buff, 80);
printf ("%s\n", buff);
sleep (2);
}
you'll see that the TEST 2 (-1) line is output by the parent because it read it via the pipe. The -1 in there is the return code from the printf you attempted in the child after you closed the STDOUT descriptor (but before you duped it), meaning that it failed.
From ISO C11 7.21.6.3 The printf function:
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred.
A few things to mention.
When you fork, the child receives an almost complete copy of the parent process. That includes the buffer set up for the stdout standard output stream. The stdout stream holds data until its buffer is full or it is explicitly asked to flush. Because of this, "MADE PIPES" is sitting in the buffer. When you close the STDOUT fd and then call printf, nothing is written anywhere: "TEST 1" and "TEST 2" are simply appended to the stdout buffer, which causes no error or crash (there is plenty of buffer space). So even after you duplicate the pipe fd onto STDOUT, the buffered printf output has never touched the pipe's write end. Most importantly, stick to one set of APIs: either the *NIX system calls or the standard C library stream functions. Make sure you understand the libraries well, as they often play tricks for the sake of optimization.
Now, another thing to mention: make sure that you close the appropriate ends of the pipe in the appropriate process. That is, if pipe-1 is used to communicate from parent to child, make sure you close the read end in the parent and the write end in the child. Otherwise, your program may hang, because of the reference counts associated with file descriptors. You may think that closing the read end in the child means the pipe's read end is closed; but as long as you don't also close the read end in the parent, there is an extra reference to it, and the pipe will never be fully closed.
There are many other things to improve about your coding style; the sooner you get a hold on them, the more time they will save you. :)
Error checking is absolutely important; use at least assert to ensure that your assumptions are correct.
If you use printf statements to log errors or as a debugging method while you are changing the terminal FDs (STDOUT / STDIN / STDERR), it is better to open a log file with *NIX open and write your errors and log entries there.
Finally, the strace utility will be a great help. It lets you trace the system calls executed while your code runs. It is straightforward and simple, and you can even attach it to a running process, provided you have the right permissions.
It's been a while since I last programmed in C, and I'm having trouble making pipes work. (For sake of clarity, I'm using Cygwin on Windows 7.) In particular, I need help understanding the behavior of the following example:
/* test.c */
#include <stdio.h>
#include <unistd.h>
int main() {
char c;
//scanf("%c", &c); // this is problematic
int p[2];
pipe(p);
int out = dup(STDOUT_FILENO);
// from now on, implicitly read from and write on pipe
dup2(p[0], STDIN_FILENO);
dup2(p[1], STDOUT_FILENO);
printf("hello");
fflush(stdout);
// restore stdout
dup2(out, STDOUT_FILENO);
// should read from pipe and write on stdout
putchar(getchar());
putchar(getchar());
putchar(getchar());
}
If I invoke:
echo abcde | ./test.exe
I get the following output:
hel
However, if I uncomment the scanf call, I get:
bcd
Which I can't explain. This is actually a very simplified version of a more complex program with a fork/exec structure that started behaving very badly. Despite not having cycles, it somehow began spawning infinite children in an endless loop. So, rules permitting, I'll probably need to extend the question with a more concrete use case. Many thanks.
The stream I/O functions such as scanf generally perform buffering to improve performance. Thus, if you call scanf on the standard input then it will probably read more characters than needed to satisfy the request, and the extra will be waiting, buffered, for the next read.
Swapping out the underlying file descriptor does not affect previously buffered data. When you subsequently read from the stream again, you first get the data buffered earlier, and only once that is exhausted do you get fresh data from the new underlying file.
If you wish, you can turn off buffering of a stream via the setvbuf() function, before any I/O operations have been performed on it:
int result = setvbuf(stdin, NULL, _IONBF, 0);
if (result != 0) {
// handle error ...
}
This is actually a very simplified version of a more complex program with a fork/exec structure that started behaving very badly. Despite not having cycles, it somehow began spawning infinite children in an endless loop.
I don't see how that behavior would be related to what you've asked here.
So, rules permitting, I'll probably need to extend the
question with a more concrete case of use.
That would be a separate question.
There are many questions related to reading and writing of pipes on this forum, but I am unable to resolve my issue.
The code snippet below, does the following things:
The filename, taken from a command-line argument, is passed to the child process through pipe_p.
Child process opens the file specified, and writes its content to pipe_c for parent process to read and display on the screen.
Everything else is working, but the parent process is unable to read data from the pipe (it prints nothing).
I observed that the data is successfully written by the child process, since I am able to print the contents through the pipe in the child process block, but not in the parent process.
NOTE: STEP 4 is not working.
Can anyone please help me with this?
Code:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv){
int pipe_p[2], pipe_c[2];
int childpid, c, k = 0;
char buffer[1000] = {0};
FILE *file;
pipe(pipe_p);
pipe(pipe_c);
childpid = fork();
if(childpid){
//parent process block
//STEP 1 -------
close(pipe_p[0]); //closing reading side of pipe
write(pipe_p[1], argv[1], strlen(argv[1]));
close(pipe_p[1]);
//--------------
wait(NULL);
//--------------
//printf("%s\n", "Its working");
//STEP 4 -------
close(pipe_c[1]);
read(pipe_c[0], buffer, sizeof(buffer));
close(pipe_c[0]);
printf("%s\n", buffer);
//--------------
}
else{
//child process block
//sleep(1);
//STEP 2 -------
close(pipe_p[1]);
read(pipe_p[0], buffer, sizeof(buffer));
close(pipe_p[0]);
//printf("%s\n", buffer);
//--------------
//STEP 3 -------
file = fopen(buffer, "r");
while((c = getc(file)) != EOF){
buffer[k++] = c;
}
buffer[k] = 0;
//printf("%s", buffer);
close(pipe_c[0]);
write(pipe_c[1], buffer, strlen(buffer));
close(pipe_c[1]);
//--------------
}
return 0;
}
I see five bugs in this code. I'm going to list them from least to most important. I haven't tried to fix any of the bugs, so there may be more that are hidden behind these.
You forgot to include sys/wait.h. The compiler should have complained about an implicit declaration of wait. (If your compiler did not make any complaints, turn on all the warnings.)
You are not checking whether any of your system calls are failing. Every system call should be followed by a check for failure. When one does fail, print to stderr a full description of the failure, including the name of the system call that failed, the names of all files involved (if any), and strerror(errno) and then exit the program with a nonzero (unsuccessful) exit code. If you had done this you would have discovered for yourself that, in fact, certain things were not "working fine".
Relatedly, you are not checking whether the child exited unsuccessfully. Instead of wait(NULL), the parent should be doing waitpid(childpid, &status, 0) and then decoding the exit status and printing a message to stderr for anything other than WIFEXITED(status) && WEXITSTATUS(status) == 0 and then exiting unsuccessfully itself.
In the parent, you are calling wait in the wrong place. You need to call wait AFTER you have read and processed all of the data from pipe_c. Otherwise, if the child process completely fills up the pipe buffer, the program will deadlock. (Also, you need to read all of the data from the pipe, not just the first 1000 bytes of it.)
In the child, you have a buffer overrun. You are reading an unlimited amount of data from the file into buffer, but buffer has a fixed size. You should either use malloc and realloc to enlarge it as necessary, or copy from the file to the pipe in chunks no bigger than the size of buffer.
I discovered all of these problems by running the program under the strace utility, in -f mode (so it traces both sides of the fork), with input from a large file. This is a valuable debugging technique which you should try for yourself.
Before stating my question, I have read several related questions on Stack Overflow, such as pipe & dup functions in UNIX, and several others, but they didn't clear up my confusion.
First, the code, which is an example code from 'Beginning Linux Programming', 4th edition, Chapter 13:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main()
{
int data_processed;
int file_pipes[2];
const char some_data[] = "123";
pid_t fork_result;
if (pipe(file_pipes) == 0)
{
fork_result = fork();
if (fork_result == (pid_t)-1)
{
fprintf(stderr, "Fork failure");
exit(EXIT_FAILURE);
}
if (fork_result == (pid_t)0) // Child process
{
close(0);
dup(file_pipes[0]);
close(file_pipes[0]); // LINE A
close(file_pipes[1]); // LINE B
execlp("od", "od", "-c", (char *)0);
exit(EXIT_FAILURE);
}
else // parent process
{
close(file_pipes[0]); // LINE C
data_processed = write(file_pipes[1], some_data,
strlen(some_data));
close(file_pipes[1]); // LINE D
printf("%d - wrote %d bytes\n", (int)getpid(), data_processed);
}
}
exit(EXIT_SUCCESS);
}
The execution result is:
momo@xue5:~/TestCode/IPC_Pipe$ ./a.out
10187 - wrote 3 bytes
momo@xue5:~/TestCode/IPC_Pipe$ 0000000 1 2 3
0000003
momo@xue5:~/TestCode/IPC_Pipe$
If you comment out LINE A, LINE C, and LINE D, the result is the same as above.
I understand the result, the child get the data from its parent through its own stdin which is connected to pipe, and send 'od -c' result to its stdout.
However, if you comment out LINE B, the result is:
momo@xue5:~/TestCode/IPC_Pipe$ ./a.out
10436 - wrote 3 bytes
momo@xue5:~/TestCode/IPC_Pipe$
No 'od -c' result!
Is the 'od -c' started by execlp() not executed, or is its output just not directed to stdout? One possibility is that od's read() blocks, because the child's write file descriptor file_pipes[1] remains open when LINE B is commented out. But commenting out LINE D, which leaves the parent's write descriptor file_pipes[1] open, still produces the 'od -c' output.
And why do we need to close the pipe before execlp()? execlp() replaces the process image, including the stack, .data, .heap, and .text, with the new image from 'od'. Does that mean that even if you don't close file_pipes[0] and file_pipes[1] in the child at LINES A and B, they will still be 'destroyed' by execlp()? Judging from the code's results, they are not. But where am I wrong?
Thanks so much for your time and efforts here~~
Is closing a pipe necessary when followed by execlp()?
It's not strictly necessary because it depends on how the pipe is used. But in general, yes it should be closed if the pipe end is not needed by the process.
why we need to close pipe before execlp()? execlp() will replace the process image
Because file descriptors (by default) remain open across exec calls. From the man page: "By default, file descriptors remain open across an execve(). File descriptors that are marked close-on-exec are closed; see the description of FD_CLOEXEC in fcntl(2)."
However, if you commented LINE B,...No 'od -c' result!
This is because the od process reads from stdin until it gets an EOF. If the process itself does not close file_pipes[1] then it will not see an EOF as the write end of the pipe would not be fully closed by all processes that had it opened.
If you commented LINE A, LINE C, and LINE D, the result is the same as above
This is because the file descriptors at A and C are read ends of the pipe and no one will be blocked waiting for it to be closed (as described above). The file descriptor at D is a write end and not closing it would indeed cause problems. However, even though the code does not explicitly call close on that file descriptor, it will still be closed because the process exits.
And, why we need to close pipe before execlp()? execlp() will replace the process image, including stack, .data, .heap, .text with new image from 'od'.
Yes, the exec-family functions, including execlp(), replace the process image of the calling process with a copy of the specified program. But the process's table of open file descriptors is not part of the process image -- it is maintained by the kernel, and it survives the exec.
Does that mean, even if you don't close file_pipes[0] and file_pipes[1] in child as LINE A and B, file_pipes[0] and file_pipes[1] will still be 'destroyed' by execlp()?
The variable file_pipes is destroyed by execlp(), but that's just the program's internal storage for the file descriptors. The descriptors are just integer indexes into a table maintained for the process by the kernel. Losing track of the file descriptor values does not cause the associated files to be closed. In fact, that's a form of resource leakage.
From the result by code, it is not. But where am I wrong?
As described above.
Additionally, when a process exits, all its open file descriptors are closed, but the underlying open file description in the kernel, to which the file descriptors refer, is closed only when no open file descriptors referring to it remain. Additional open file descriptors may be held by other processes, as a result of inheriting them across a fork().
Now as to the specific question of what happens when the child process does not close file_pipes[1] before execing od, you might get a clue by checking the process list via the ps command. You will see the child od process still running (maybe several, if you have tested several times). Why?
Well, how does od know when to exit? It processes its entire input, so it must exit when it reaches the end of its input(s). But end of input on a pipe doesn't mean that no more data is available right now, because more data might later be written to the write end of the pipe. End of input on a pipe happens only when the write end is closed. And if the child does not close file_pipes[1] before it execs, then that descriptor will likely remain open indefinitely, because after the exec the program no longer knows that it holds it.
The following code is the pipe implementation given in beej's guide:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
int pfds[2];
pipe(pfds);
if (!fork()) {
close(1); /* close normal stdout */
dup(pfds[1]); /* make stdout same as pfds[1] */
close(pfds[0]); /* we don't need this */
execlp("ls", "ls", NULL);
} else {
close(0); /* close normal stdin */
dup(pfds[0]); /* make stdin same as pfds[0] */
close(pfds[1]); /* we don't need this */
execlp("wc", "wc", "-l", NULL);
}
return 0;
}
I wanted to ask:
Is it possible that close(0) is executed before dup(pfds[1])? If yes, then in that case the program will not behave as expected.
What is the use of the following lines of code:
close(pfds[0]); /* we don't need this */
close(pfds[1]); /* we don't need this */
And what would change if these lines were not there?
Is it possible that close(0) is executed before dup(pfds[1])? If yes,
then in that case the program will not behave as expected.
Yes, it is possible for the parent to successfully complete close(0) before the child calls dup(pfds[1]). However, this is not a problem. When you fork a new process, the new process gets an entire copy of the parent's memory address space, including all open file descriptors (even those marked close-on-exec; those are only closed when the process calls an exec function - see FD_CLOEXEC in fcntl(2)). So, essentially, each process has its own private copy of the file descriptors and is isolated and free to do whatever it wants with that copy.
Thus, when the parent calls close(0), it is only closing its copy of file descriptor 0 (stdin); it does not affect the child in any way, which still has a reference to stdin and can use it if needed (even though in this example it won't).
What is the use of the following lines of code:
close(pfds[0]); /* we don't need this */
close(pfds[1]); /* we don't need this */
Best practices mandate that you should close file descriptors that you don't use - this is the case for close(pfds[0]). Unused open file descriptors eat up space and resources, why keep it open if you're not going to use it?
close(pfds[1]) is a little more subtle though. Pipes report end of file only when there is no more data in the pipe buffer and there are no active writers, i.e., no live processes that have the pipe open for writing. If you do not close pfds[1] in the parent, the program will hang forever because wc(1) will never see the end of input, since there is a process (wc(1) itself) that has the pipe opened for writing and as such could (but won't) write more data.
TL;DR: close(pfds[0]) is just good practice but not mandatory; close(pfds[1]) is absolutely necessary to ensure program correctness.
Question 1:
Yes it is entirely possible that "close(0);" (in the parent) is executed before "dup(pfds[1]);" (in the child). But since this happens in different processes, the child will still have fd 0 open.
Question 2:
It is good bookkeeping practice to close the end of the pipe that a process is not going to use. That way, you can avoid bugs further down the road in more complex programs. In the above scenario, the child process should only ever read from the pipe. If you close the write end in the child, any attempt to write to it will cause an error; otherwise you might have a bug that is hard to detect.
I have the following code taken from the “Pipes” section of Beej’s Guide to Unix IPC.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
int pfds[2];
pipe(pfds);
if (!fork()) {
close(1); /* close normal stdout */
dup(pfds[1]); /* make stdout same as pfds[1] */
close(pfds[0]); /* we don't need this */
execlp("ls", "ls", NULL);
} else {
close(0); /* close normal stdin */
dup(pfds[0]); /* make stdin same as pfds[0] */
close(pfds[1]); /* we don't need this */
execlp("wc", "wc", "-l", NULL);
}
return 0;
}
This code lets the user see how many files are in a specific directory. How can I edit this code to implement the longer pipeline cat /etc/passwd | cut -f1 -d: | sort? Does anyone have any idea how to do this? I am completely stuck, and any help would be appreciated.
Feels like homework, so I'll just give you some pointers:
The longer pipeline has two pipes, so you'll need to call pipe() twice. (I'd also check pipe's return value whilst I was at it.)
There are three processes, which means two forks. Again, check fork()'s return value properly: it's tri-state: parent, child or failure, and your program should test all three cases.
If you call pipe() twice up front, think carefully about which file descriptors (i.e. which ends of pipes) are which in each process, and hence which ones to close before invoking execlp(). I'd draw a picture.
I'd prefer dup2() to dup(), since you're explicitly setting the target file descriptor, and so it makes sense to specify it in the call. Also avoids silly bugs.
dup and execlp can fail, so I'd check their return values too...
You need a number of pipes, depending on the length of the command list. But at most, a process in the middle needs two pipe fd pairs, and the first and last processes each need one. Be really sure to close the pipe fds that are not needed; if you don't, the child processes might not get an EOF and will never finish.
And (as user3392484 stated): check all system calls for error conditions and report them to the caller. This will make life much easier.
I implemented something like this during the last days, maybe you want to have a look there: pipexec.c.
Kind regards - Andreas