I have a scenario where two pipes are used for IPC between a child and its parent. The child process uses execvp to execute a remote program, and the parent process takes care of writing data to the pipe. The remote program's stdin is duplicated onto the read end of one pipe; the parent writes data to that same pipe's write end. The remote program has a simple getchar() in a function that is called twice from its main function.
The parent writes data in the following sequence.
Writes data to the pipe, then closes all the required handles (say it wrote "1").
After some time, writes data to the pipe again and closes the handle (say it wrote "2").
The getchar in the remote program reads "1" properly, but the problem comes while reading "2": getchar returns garbage values.
I have debugged using GDB and the program exits normally. No "signals" are raised while debugging.
I have used the fork(), dup2() and pipe() functions and need to stick to it.
Related
How can a C client started by popen and writing to stdout properly detect that the calling process has called pclose?
I am sending binary data from a small client program written in C to Matlab. To this end, Matlab is starting the process by calling popen inside an API written in C. The client is continuously writing binary data to stdout using fwrite. When Matlab is stopping, the API apparently calls pclose on the client's handle, but that does not stop the client process. I guess the fwrite will not raise an error, as the data gets buffered by the OS. So what is the appropriate way to detect the pclose inside the client?
BTW, I will run into the same problem again when trying to write to some C client from within Matlab.
Three ways:
If you are reading from the pipe, and the other end goes away, you should get end-of-file, and you can detect that, and stop.
If you are reading from the pipe, and you control the protocol, the other end can send you a "quit" command over the pipe, and you can read that, and stop.
If you are writing to the pipe, and the other end goes away, you should get a SIGPIPE signal, which by default should kill your process.
Note that #1 only works if the other end is the only process that has the pipe open. If any other process has the writing end of the pipe open, you won't get EOF, and you won't know to stop. If you're calling pipe(), then exec(), it's easy to accidentally leave the writing end of the pipe open in your own process; this is a common mistake. (Since you're calling popen(), though, it's less likely you're having this problem.)
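For the asker's situation (a writer that wants to notice the reader going away), #3 is the relevant one. Here is a minimal sketch, not the asker's code: a hypothetical client that ignores SIGPIPE so that write() reports EPIPE instead of the process being killed.

```c
/* Sketch of point #3: a writer noticing that the reader has gone away.
   SIGPIPE is ignored so write() fails with EPIPE instead of killing us. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    signal(SIGPIPE, SIG_IGN);          /* otherwise the default action kills the process */

    const char msg[] = "some payload\n";
    for (;;) {
        ssize_t n = write(STDOUT_FILENO, msg, sizeof msg - 1);
        if (n == -1 && errno == EPIPE) {
            /* the read end (e.g. the process that called popen/pclose) is gone */
            fprintf(stderr, "reader went away, stopping\n");
            break;
        }
        sleep(1);
    }
    return 0;
}
```

With the default SIGPIPE disposition the loop would instead simply be killed at the first write after the reader goes away, which may also be acceptable.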
I tried to see what happens if I read something from the keyboard while I have multiple processes created with fork() (in my case two children and a parent), and I discovered the following problem: I need to make the parent wait for the children's processes, otherwise the program behaves strangely.
I did some research and found that the problem is with the parent: it needs to wait for the children to finish, because if the parent's process ends first it somehow closes STDIN. Am I right? But I also found that every process has its own copy of STDIN, so my question is:
Why does it work this way, and why does only the parent cause the problem with STDIN and not the children? That is, why does a child's process ending first not affect STDIN, while the parent's process ending first does?
Here are my tests:
I ran the program without wait() and after I typed a number the program stopped, but then I pressed enter two more times and the other two messages from printf() appeared.
When I ran the program with wait() everything worked fine, every process called scanf() separately and read a different number.
Well, a lot of stuff is going on here. I will try to explain it step by step.
When you start your terminal, the terminal creates a special file having path /dev/pts/<some number>. Then it starts your shell (which is bash in this case) and links the STDIN, STDOUT and STDERR of the bash process to this special file. This file is called a special file because it doesn't actually exist on your hard disk. Instead, whatever you write to this file, it goes directly to the terminal and the terminal renders it on the screen. (Similarly, whenever you try to read from this file, the read blocks until someone types something at the terminal).
Now when you launch your program by typing ./main, bash calls the fork function in order to create a new process. The child process execs your executable file, while the parent process waits for the child to terminate. Your program then calls fork twice and we have three processes trying to read their STDINs, i.e. the same file /dev/pts/something. (Remember that fork duplicates the file descriptors and exec preserves them.)
The three processes are in a race condition. When you enter something at the terminal, one of the three processes will receive it (99 times out of 100 it will be the parent process, since the children have to do more work before reaching the scanf statement).
So, the parent process prints the number and exits first. The bash process that was waiting for the parent to finish resumes, puts the STDIN into a so-called "non-canonical" mode, and calls read in order to read the next command. Now again, three processes (Child1, Child2 and bash) are trying to read STDIN.
Since the children have been trying to read STDIN for longer, the next time you enter something it will be received by one of the children rather than bash. So you think of typing, say, 23. But oops! Just after you press the 2 key, you get Your number is: 2. You didn't even press the Enter key! That happens because of this so-called "non-canonical" mode. I won't go into what that is and why it happens here. But for now, to make things easier, you can run your program under sh instead of bash, since sh doesn't put STDIN into non-canonical mode. That will make the picture clearer.
TL;DR
No, a parent process closing its STDIN doesn't mean that its children or other processes won't be able to use it.
The strange behavior you are seeing is because when the parent exits, bash puts the pty (pseudo terminal) into non-canonical mode. If you use sh instead, you won't see that behavior. Read up on pseudo terminals and line discipline if you want a clear understanding.
The shell process will resume as soon as the parent exits.
If you use wait to ensure that the parent exits last, you won't have any problem, since the shell won't be able to run alongside your program (a minimal sketch of this follows below).
Normally, bash makes sure that no two foreground processes read from STDIN simultaneously, so you don't see this strange behavior. It does this by either piping STDOUT of one program to another, or by making one process a background process.
Trivia: When a background process tries to read from its STDIN, it is sent a signal SIGTTIN, which stops the process. Though, that's not really relevant to this scenario.
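For concreteness, here is a hedged reconstruction of the scenario, not the asker's actual code: the parent forks two children, all three processes call scanf() on the shared terminal, and the two wait() calls keep the parent (and therefore the shell) out of the race until the children are done.

```c
/* Hypothetical reconstruction: three processes share the terminal's stdin.
   With the two wait() calls the parent exits last and the shell never
   competes for input; comment them out to see the odd behavior. */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int n;

    pid_t c1 = fork();
    if (c1 == 0) {                      /* first child */
        if (scanf("%d", &n) == 1) printf("child 1 read: %d\n", n);
        return 0;
    }

    pid_t c2 = fork();
    if (c2 == 0) {                      /* second child */
        if (scanf("%d", &n) == 1) printf("child 2 read: %d\n", n);
        return 0;
    }

    /* parent */
    if (scanf("%d", &n) == 1) printf("parent read: %d\n", n);

    /* without these two wait() calls the parent (and then the shell) exits
       first and the shell starts competing with the children for the tty */
    wait(NULL);
    wait(NULL);
    return 0;
}
```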
There are several issues that can happen when multiple processes try to do I/O to the same TTY. Without code, we can't tell which may be happening.
Trying to do I/O from a background process group may deliver a signal: SIGTTIN for input (usually enabled), or SIGTTOU for output (usually disabled)
Buffering: if you do any I/O before the fork, any data that has been buffered will be there for both processes (a small illustration follows this list). Under some conditions, using fflush may help, but it's better to avoid buffering entirely. Remember that, unlike output buffering, it is impossible to buffer input on a line-by-line basis (although you can only buffer what is available, so it may appear to be line-buffered at first).
Race conditions: if more than one process is trying to read the same pipe-like file, it is undefined which one will "win" and actually get the input each time it is available.
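To illustrate the buffering point, here is a small standalone example (an assumption for illustration, not the asker's code):

```c
/* Illustrates the buffering pitfall: text printed before fork() but not yet
   flushed is duplicated into the child's copy of the stdio buffer, so it
   appears twice in the output. fflush(stdout) before fork() avoids that. */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    printf("buffered before fork");   /* no newline, so it may stay buffered */
    /* fflush(stdout); */             /* uncomment to print it exactly once */

    if (fork() == 0) {
        printf(" -- child\n");
        return 0;
    }
    printf(" -- parent\n");
    wait(NULL);
    return 0;
}
```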
When using a pipe for process-process communication, what is the purpose of closing one end of the pipe?
For example: How to send a simple string between two programs using pipes?
Notice that one side of the pipe is closed in the child and parent processes. Why is this required?
If you connect two processes - parent and child - using a pipe, you create the pipe before the fork.
The fork gives both processes access to both ends of the pipe. This is not desirable.
The reading side is supposed to learn that the writer has finished when it notices an EOF condition. That can only happen once all write ends are closed, so it is best if the reader closes its own writing FD as soon as possible.
The writer should close its reading FD, if only to avoid holding more FDs open than necessary and hitting whatever limit on open FDs may exist. Besides, if the then-only reader dies, the writer gets notified of this by a SIGPIPE signal or at least an EPIPE error (depending on how signals are set up). If there are several readers, the writer cannot detect that "the real one" went away; it goes on writing and eventually gets stuck when the writing FD blocks, in the hope that the "unused" reader will read something.
So here in detail what happens:
The parent process calls pipe() and gets 2 file descriptors; let's call them rd and wr.
parent process calls fork(). Now both processes have a rd and a wr.
Suppose the child process is supposed to be the reader.
Then
the parent should close its reading end (so as not to waste FDs and to properly detect a dying reader), and
the child must close its writing end (so that the EOF condition can be detected).
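Put together, the sequence above looks roughly like this; the rd/wr names follow the description, and the payload is just an example:

```c
/* Parent writes, child reads: each side closes the pipe end it doesn't use. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1) { perror("pipe"); exit(1); }
    int rd = fd[0], wr = fd[1];

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); exit(1); }

    if (pid == 0) {                    /* child: the reader */
        close(wr);                     /* must close, or read() never sees EOF */
        char buf[64];
        ssize_t n;
        while ((n = read(rd, buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, (size_t)n);
        close(rd);
        _exit(0);
    }

    close(rd);                         /* parent: the writer drops its read end */
    write(wr, "hello through the pipe\n", 23);
    close(wr);                         /* child now gets EOF and exits */
    wait(NULL);
    return 0;
}
```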
The number of file descriptors that can be open at a given time is limited. If you keep opening pipes without closing them, you'll pretty soon run out of FDs and won't be able to open anything anymore: not pipes, not files, not sockets, ...
Another reason why it can be important to close the pipe is when the closing itself has a meaning to the application. For example, a common use of pipes is to send the errno from a child process to the parent when using fork and exec to launch an external program:
The parent creates the pipe, calls fork to create a child process, closes its writing end, and tries to read from the pipe.
The child process attempts to use exec to run a different program:
If exec fails, for example because the program does not exist, the child writes errno to the pipe, and the parent reads it and knows what went wrong, and can tell the user.
If exec is successful, the pipe is closed without anything being written (typically the child's write end is marked close-on-exec so this happens automatically). The read call in the parent returns 0, indicating the pipe was closed, so the parent knows the program was started successfully.
If the parent did not close its writing end of the pipe before trying to read from the pipe, this would not work, because the read would never return when exec is successful.
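A minimal sketch of this trick, with an assumed (nonexistent) program name to force the failure path; the FD_CLOEXEC flag is what makes the write end close automatically when exec succeeds:

```c
/* Reporting exec failure from child to parent over a pipe.
   The write end is marked close-on-exec, so a successful exec closes it
   and the parent's read() returns 0; on failure the child writes errno. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1) { perror("pipe"); exit(1); }

    pid_t pid = fork();
    if (pid == 0) {                                   /* child */
        close(fd[0]);
        fcntl(fd[1], F_SETFD, FD_CLOEXEC);            /* closed automatically on success */
        execlp("nonexistent-program", "nonexistent-program", (char *)NULL);  /* assumed name */
        int err = errno;                              /* exec failed: report why */
        write(fd[1], &err, sizeof err);
        _exit(127);
    }

    close(fd[1]);                                     /* parent must drop its write end */
    int err;
    ssize_t n = read(fd[0], &err, sizeof err);
    if (n == 0)
        printf("exec succeeded\n");
    else
        printf("exec failed: %s\n", strerror(err));
    close(fd[0]);
    wait(NULL);
    return 0;
}
```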
Closing unused pipe file descriptors is more than a matter of ensuring that a process doesn't exhaust its limited set of file descriptors; it is essential to the correct use of pipes. We now consider why the unused file descriptors for both the read and write ends of the pipe must be closed.
The process reading from the pipe closes its write descriptor for the pipe, so that, when the other process completes its output and closes its write descriptor, the reader sees end-of-file (once it has read any outstanding data in the pipe).
If the reading process doesn't close the write end of the pipe, then after the other process closes its write descriptor, the reader won't see end-of-file, even after it has read all data from the pipe. Instead, a read() would block waiting for data, because the kernel knows that there is still at least one write descriptor open for the pipe. That this descriptor is held open by the reading process itself is irrelevant; in theory, that process could still write to the pipe, even if it is blocked trying to read.
For example, the read() might be interrupted by a signal handler that writes data to the pipe.
The writing process closes its read descriptor for the pipe for a different reason.
When a process tries to write to a pipe for which no process has an open read descriptor, the kernel sends the SIGPIPE signal to the writing process. By default, this signal kills a process. A process can instead arrange to catch or ignore this signal, in which case the write() on the pipe fails with the error EPIPE (broken pipe). Receiving the SIGPIPE signal or getting the EPIPE error is a useful indication of the status of the pipe, and this is why unused read descriptors for the pipe should be closed.
If the writing process doesn't close the read end of the pipe, then even after the other process closes the read end of the pipe, the writing process can keep writing; eventually it will fill the pipe, and a further attempt to write will block indefinitely.
One final reason for closing unused file descriptors is that only after all file descriptors are closed is the pipe destroyed and its resources released for reuse by other processes. At that point, any unread data in the pipe is lost.
~ Michael Kerrisk, The Linux Programming Interface
I'm trying to write to the stdin and read from the stdout (and stderr) of an external program, without changing its code.
I've tried using named pipes, but stdout doesn't show up until the program is terminated, and stdin only works for the first input (then cin is null).
I've tried using /proc/[pid]/fd, but that only writes to and reads from the terminal, not the program.
I've tried writing a character device file for this, and it worked, but only for one program at a time (this needs to work for multiple programs at a time).
At this point, to my knowledge, I could extend the driver that worked so that it multiplexes the I/O across multiple programs, but I don't think that's the "right" solution.
The main purpose of this is to view a feed of a program through a web interface. I'm sure there has to be some way to do this. Is there anything I haven't tried that's been done before?
The typical way of doing this is:
Create anonymous pipes (not named pipes) with the pipe(2) system call for the new process's standard streams
Call fork(2) to spawn the child process
close(2) the appropriate ends of the pipes in both the parent and the child (e.g. for the stdin pipe, close the read end in the parent and close the write end in the child; vice-versa for the stdout and stderr pipes)
Use dup2(2) in the child to copy the pipe file descriptors onto file descriptors 0, 1, and 2, and then close(2) the remaining old descriptors
exec(3) the external application in the child process
In the parent process, simultaneously write to the child's stdin pipe and read from the child's stdout and stderr pipes. However, depending on how the child behaves, this can easily lead to deadlock if you're not careful. One way to avoid deadlock is to spawn separate threads to handle each of the 3 streams; another way is to use the select(2) system call to wait until one of the streams can be read from/written to without blocking, and then process that stream.
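A minimal sketch of those steps for stdin and stdout only (the stderr pipe is omitted, and the command wc -c and the single write/read are assumptions to keep it short; a real program would multiplex with select/poll or threads as noted above):

```c
/* Parent feeds the child's stdin and reads its stdout through two pipes. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int in[2], out[2];                     /* in: parent -> child stdin; out: child stdout -> parent */
    if (pipe(in) == -1 || pipe(out) == -1) { perror("pipe"); exit(1); }

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); exit(1); }

    if (pid == 0) {                        /* child */
        dup2(in[0], STDIN_FILENO);         /* read end of "in" becomes stdin */
        dup2(out[1], STDOUT_FILENO);       /* write end of "out" becomes stdout */
        close(in[0]); close(in[1]);        /* close all the original descriptors */
        close(out[0]); close(out[1]);
        execlp("wc", "wc", "-c", (char *)NULL);   /* assumed example command */
        _exit(127);                        /* only reached if exec fails */
    }

    close(in[0]);                          /* parent keeps in[1] (write) and out[0] (read) */
    close(out[1]);

    write(in[1], "hello\n", 6);
    close(in[1]);                          /* EOF on the child's stdin */

    char buf[128];
    ssize_t n;
    while ((n = read(out[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(out[0]);
    wait(NULL);
    return 0;
}
```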
Even if you do this all correctly, you may still not see your program's output right away. This is typically due to stdout buffering. Normally, when stdout is going to a terminal, it's line-buffered: it gets flushed after every newline is written. But when stdout is a pipe (or anything else that's not a terminal, like a file or a socket), it's fully buffered, and it only gets written when the program has output a full buffer's worth of data (e.g. 4 KB).
Many programs have command line options to change their buffering behavior. For example, grep(1) has the --line-buffered flag to force it to line-buffer its output even when stdout isn't a terminal. If your external program has such an option, you should probably use it. If not, it's still possible to change the buffering behavior, but you have to use some sneaky tricks—see this question and this question for how to do that.
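If you do happen to control the external program's source (not the case in the question above), a simple alternative is to set the buffering mode yourself; this is an aside, not one of the tricks referred to above:

```c
/* If you control the program's source, you can force line buffering on
   stdout yourself, even when stdout is a pipe rather than a terminal. */
#include <stdio.h>

int main(void) {
    setvbuf(stdout, NULL, _IOLBF, 0);   /* line-buffered even when stdout is a pipe */
    printf("this line is flushed as soon as the newline is written\n");
    return 0;
}
```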
The program should fork, then the parent should read user input, send it to the child; the child should deal with it, then send a result to the parent, who prints it (it is required to work this way).
I've done a part of it, but the program locks after reading from the fifo the first time.
I suspect the problem is somewhere between lines 122–199. Making the pipe non-blocking makes the program skip past the scanf at line 185 and loop indefinitely. Closing and reopening the pipe before writing and after reading leads to the same effect.
Here is the source: link.
Later edit (clarification):
The parent blocks before the printf at 184, when it reads the second command (the first time it seems to work just fine).
I haven't implemented the "child sends stuff back to the parent" part. At the moment I just want to make the child output the data it receives through the pipe from the parent and then give control back to the parent to read another command.
The child lives in a paused state (pause()) while the parent reads input and sends it through the pipe, then it wakes up the child and goes in a paused state itself. The child reads data from the pipe and outputs it, then wakes up the parent and goes to sleep.
Did you use a multiplexing system call like select or poll, which can test whether some of a set of file descriptors are ready (for input or for output)?
Learn more about poll or about select and friends.
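As a hedged sketch of the idea, assuming fd_out and fd_err are the read ends of two pipes set up elsewhere with pipe() and fork(), poll() blocks until at least one of them has data or has been closed:

```c
/* Wait on two read ends at once with poll(); whichever becomes readable
   first is drained without blocking on the other. fd_out and fd_err are
   assumed to be the read ends of two pipes created elsewhere. */
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

void relay(int fd_out, int fd_err) {
    struct pollfd fds[2] = {
        { .fd = fd_out, .events = POLLIN },
        { .fd = fd_err, .events = POLLIN },
    };
    char buf[256];
    int open_fds = 2;

    while (open_fds > 0) {
        if (poll(fds, 2, -1) == -1) { perror("poll"); return; }
        for (int i = 0; i < 2; i++) {
            if (fds[i].revents & (POLLIN | POLLHUP)) {
                ssize_t n = read(fds[i].fd, buf, sizeof buf);
                if (n <= 0) {                 /* EOF or error: stop watching this fd */
                    fds[i].fd = -1;           /* poll() ignores negative fds */
                    open_fds--;
                } else {
                    fwrite(buf, 1, (size_t)n, i == 0 ? stdout : stderr);
                }
            }
        }
    }
}
```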
You should clarify in your question which part (parent or child) it is that locks up.
Make sure all output is line feed terminated. It seems (from a quick reading) that you use puts(), which should take care of that.
Try calling fflush() on the output after the client is done, to make sure the output gets written to the pipe, if the child lives on while the parent reads its output. I didn't read your code closely enough to track down the lifetime handling.