How to write large amounts of data into pipes in C

For my systems programming class, I'm trying to communicate between a parent and child process using a pipe. I have a large amount of text (>64kB) that I want to send to a child process using a pipe. The child process will periodically read the text.
While writing, how do I check if the pipe is full? Also, how do I repeatedly check if the buffer has been emptied (by being read by the child process) and write the next chunk to the buffer?
I am aware that I could write the entire string out to a file, but I was just curious if there was a way to achieve this using a pipe.

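Pipes are blocking by default (unless you set them to be non-blocking), which solves both of your issues: write() simply blocks while the pipe is full and resumes once the child has read some data, so you never have to check for either condition yourself.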
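To make that concrete, here is a minimal sketch of a parent pushing more data than the pipe can hold into a child. The names write_all and send_large_text are made up for this example, and the filler text is just a block of 'x' characters; the only real point is the loop around write(), which handles partial writes while the kernel does the waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write all len bytes to fd. On a blocking pipe, write() sleeps while the
   pipe is full and resumes as the reader drains it, so no polling is needed. */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;                   /* partial write: advance and continue */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical demo: send `total` filler bytes to a child through a pipe.
   Returns 0 if the child saw exactly `total` bytes followed by EOF. */
int send_large_text(size_t total)
{
    int fds[2];
    char *text = malloc(total);
    if (text == NULL)
        return -1;
    memset(text, 'x', total);

    if (pipe(fds) < 0) {
        free(text);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(text);
        return -1;
    }
    if (pid == 0) {                 /* child: read until EOF, count bytes */
        char chunk[4096];
        size_t got = 0;
        ssize_t n;
        close(fds[1]);              /* the child never writes */
        while ((n = read(fds[0], chunk, sizeof chunk)) > 0)
            got += (size_t)n;
        _exit(n == 0 && got == total ? 0 : 1);
    }

    close(fds[0]);                  /* the parent never reads */
    int rc = write_all(fds[1], text, total);
    close(fds[1]);                  /* lets the child see EOF */
    free(text);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || rc < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling send_large_text(200000) from main() pushes roughly 200 kB through the pipe even though the pipe itself holds far less at any one time.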

Related

Fork and pipe creation

My book on C applied to Linux says that if a process creates a child with fork(), the pipe created between them follows this principle:
It is important to notice that both the parent process and the child process initially close their unused ends of the pipe
If both processes start with a pipe end closed, how do they know when the other is free to communicate? Is there perhaps an intermediate buffer between the processes?
Pipes on computers work very much like pipes in real life. There are two ends: you put something into one end and it comes out the other end.
When using pipes in a program, you usually want only the input end, where you write data, or only the output end, where data is read from. If the parent process only wants to write to the child process, and the child process only reads from the parent process, then the parent process can close the read end after the fork, and the child process can close the write end.
A pipe is an interprocess communication mechanism provided by the kernel. A process writing to the pipe does not need another process to be reading at that moment, as long as a read end is still open somewhere. The communication is asynchronous: the kernel buffers the data in transit.
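The "intermediate buffer" can be seen directly with a small experiment. In the sketch below (buffered_handoff is a name invented for this example) the parent writes a short message before the child even exists, then forks and closes both of its ends; the child still receives the message because the kernel was holding it. The message has to stay well below the pipe capacity, otherwise the early write() would block.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if a child created after the write still
   received `msg` through the pipe. msg must fit in 256 bytes. */
int buffered_handoff(const char *msg)
{
    int fds[2];
    char buf[256];
    size_t len;

    assert(msg != NULL);
    len = strlen(msg);
    if (len > sizeof buf)
        return -1;
    if (pipe(fds) < 0)
        return -1;

    /* Nobody is reading yet: the kernel buffers these bytes. */
    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: only reads */
        size_t got = 0;
        ssize_t n;
        close(fds[1]);              /* unused write end */
        while (got < sizeof buf &&
               (n = read(fds[0], buf + got, sizeof buf - got)) > 0)
            got += (size_t)n;
        _exit(got == len && memcmp(buf, msg, len) == 0 ? 0 : 1);
    }

    /* Parent is done with the pipe; the data stays queued for the child. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

For example, buffered_handoff("hello, child") returns 0 once the child has confirmed it read the same bytes.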

Is it really necessary to close the unused end of the pipe in a process

I am reading about pipes in UNIX for inter-process communication between 2 processes. I have the following question:
Is it really necessary to close the unused end of the pipe? For example, if my parent process is writing data into the pipe and the child is reading from the pipe, is it really necessary to close the read end of the pipe in the parent process and close the write end in the child process? Are there any side effects if I don't close those ends? Why do we need to close those ends?
Here's the problem if you don't. In your example, the parent creates a pipe for writing to the child. It then forks the child but does not close its own read descriptor. This means that there are still two read descriptors on the pipe.
If the child had the only one and it closed it (for example, by exiting), the parent would get a SIGPIPE signal or, if that was masked, an error on writing to the pipe.
However, there is a second read descriptor on the pipe (the parent's). Now, if the child exits, the pipe will remain open. The parent can continue to write to the pipe until it fills and then the next write will block (or return without writing if non-blocking).
Thus, by not closing the parent's read descriptor, the parent cannot detect that the child has closed its descriptor.
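The difference can be reproduced without even forking. The following is only a sketch: closing the read end in the same process stands in for "the child exited", write_hits_epipe is a made-up name, and SIGPIPE is ignored for the duration of the call so that the failure shows up as an EPIPE error instead of killing the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo. Returns 1 if write() failed with EPIPE, 0 if it
   succeeded, -1 on any other error. With keep_read_end_open == 1 a stray
   read descriptor stays open, as when the parent forgets to close its own. */
int write_hits_epipe(int keep_read_end_open)
{
    int fds[2];
    int result;
    struct sigaction ignore, saved;

    assert(keep_read_end_open == 0 || keep_read_end_open == 1);

    /* Ignore SIGPIPE so the failed write returns an error instead. */
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;
    if (sigaction(SIGPIPE, &ignore, &saved) < 0)
        return -1;

    if (pipe(fds) < 0) {
        sigaction(SIGPIPE, &saved, NULL);
        return -1;
    }

    if (!keep_read_end_open)
        close(fds[0]);              /* the last reader goes away */

    if (write(fds[1], "x", 1) == 1)
        result = 0;                 /* a reader still exists: no error */
    else
        result = (errno == EPIPE) ? 1 : -1;

    if (keep_read_end_open)
        close(fds[0]);
    close(fds[1]);
    sigaction(SIGPIPE, &saved, NULL);   /* restore previous handling */
    return result;
}
```

write_hits_epipe(0) reports the broken pipe, while write_hits_epipe(1) succeeds silently because the leftover read descriptor keeps the pipe alive.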
According to the man page for getdtablesize:
Each process has a fixed size descriptor table, which is guaranteed to
have at least 20 slots.
Each pipe uses two entries in the descriptor table. Closing the unneeded end of the pipe frees up one of those descriptors. So, if you were unfortunate enough to be on a system where each process is limited to 20 descriptors, you would be highly motivated to free up unneeded file descriptors.
Pipes are meant to be used as unidirectional communication channels. Closing the unused ends is good practice and helps avoid mixed-up messages. The writer's descriptor should be closed in the reader, and vice versa.
Refer here
Quoting from above reference:
[...] Each pipe provides one-way communication; information flows from
one process to another.
For this reason, the parent and child process should close unused end
of the pipe.
There is actually another more important reason for closing unused
ends of the pipe.
The process reading from the pipe blocks when making the read system
call unless:
The pipe contains enough data to fill the reader's buffer or,
The end-of-file character is sent. The end-of-file character is sent through the pipe when every file descriptor to write end of the
pipe is closed. Any process reading from the pipe and forgetting to
close the write end of the pipe will never be notified of the
"end-of-file" [...]

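The end-of-file behaviour described in the quote can be checked with a small probe. This is a sketch only: probe_empty_pipe and the READ_* constants are invented for this example, and the read end is put into non-blocking mode so that "no EOF yet" shows up as EAGAIN instead of a read() that hangs forever.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum { READ_EOF = 0, READ_WOULD_BLOCK = 1, READ_ERROR = -1 };

/* Hypothetical demo: read from an empty pipe, with or without a write
   descriptor still open, and report what read() does. */
int probe_empty_pipe(int close_write_end)
{
    int fds[2];
    int result;
    char byte;

    assert(close_write_end == 0 || close_write_end == 1);
    if (pipe(fds) < 0)
        return READ_ERROR;

    /* Non-blocking, so an open-but-idle writer yields EAGAIN, not a hang. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return READ_ERROR;
    }

    if (close_write_end)
        close(fds[1]);              /* no write descriptors remain */

    ssize_t n = read(fds[0], &byte, 1);
    if (n == 0)
        result = READ_EOF;          /* every write end is closed */
    else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = READ_WOULD_BLOCK;  /* a writer could still send data */
    else
        result = READ_ERROR;

    if (!close_write_end)
        close(fds[1]);
    close(fds[0]);
    return result;
}
```

probe_empty_pipe(1) returns READ_EOF, while probe_empty_pipe(0) returns READ_WOULD_BLOCK; with a blocking descriptor the second case is the reader that waits forever because it forgot to close its own copy of the write end.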
How to make C program block until FIFO pipe is empty?

I'm doing IPC using named (FIFO) pipes and I would like to coordinate things so that the writing program can only write into the pipe once the reading program has read the previously written data out of the pipe. So I would like to block the write until the pipe is empty. Is this possible?
One option that I thought of is that the write function blocks when the pipe is full. But I would like to do this with much smaller amounts of data than the pipe size in Linux. E.g. I would like the program to write only 20 bytes and then wait until the other end has read the data. I think you cannot shrink named pipes to be that small. (The minimum size seems to be the page size, 4096 bytes?)
Thanks!
A possible solution is to have the reading process send a signal to the writing process when it has read some data. You can do this using kill() to send SIGTERM to the writer, since SIGTERM can be caught and handled.
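A rough sketch of that handshake follows, with several assumptions of mine: it uses SIGUSR1 (a signal set aside for application use) in place of SIGTERM, an anonymous pipe and fork() instead of a FIFO so that it fits in one file, and sigwait() in the writer instead of a signal handler. lockstep_transfer and CHUNK are names made up for the example. A real program would also have to cope with the reader dying, in which case the writer below would wait forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHUNK 20                    /* bytes per write, as in the question */

/* Hypothetical demo: write `rounds` chunks, waiting for an acknowledgement
   signal from the reader after each one. Returns 0 on success. */
int lockstep_transfer(int rounds)
{
    int fds[2];
    char chunk[CHUNK];
    sigset_t ack, old;
    int ok = 1;

    assert(rounds > 0);

    /* Block SIGUSR1 before forking so an early ack stays pending
       until sigwait() collects it. */
    sigemptyset(&ack);
    sigaddset(&ack, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &ack, &old) < 0)
        return -1;
    if (pipe(fds) < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0) {                 /* reader */
        close(fds[1]);
        for (int i = 0; i < rounds; i++) {
            size_t got = 0;
            while (got < CHUNK) {
                ssize_t n = read(fds[0], chunk + got, CHUNK - got);
                if (n <= 0)
                    _exit(1);
                got += (size_t)n;
            }
            kill(getppid(), SIGUSR1);   /* "pipe is empty again" */
        }
        _exit(0);
    }

    close(fds[0]);                  /* writer */
    memset(chunk, 'a', CHUNK);
    for (int i = 0; i < rounds && ok; i++) {
        int sig;
        if (write(fds[1], chunk, CHUNK) != CHUNK || sigwait(&ack, &sig) != 0)
            ok = 0;
    }
    close(fds[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        ok = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because the writer waits after every chunk, at most one acknowledgement is ever pending, which matters since ordinary signals are not queued.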

Why cant a pipe created using pipe() be used as a bi-directional pipe?

Almost all the pipe examples I've seen advise closing the unused write/read ends. Also, man clearly states that pipe() creates a pipe, a unidirectional data channel. But I've tried reading and writing to both ends of the pipe in both the parent and the child and everything seems to be OK.
So my question is: why do we need 2 pipes if two processes both have to read from and write to each other? Why not do it using a single pipe?
If you use the same pipe, how does the child separate its messages from the parent's messages and vice versa?
For example:
Parent writes to pipe
Parent reads from pipe hoping to get message from child but gets its own message :(
It is much easier to use one pipe for child->parent and another pipe for parent->child.
Even if you have some protocol for reading/writing it is quite easy to deadlock the parent and child process.
You can read and write at both ends of the created pipe, but unidirectional means that data only travels in one direction at any time, from parent to child or vice versa. Two pipes are needed for non-blocking sending and receiving of data: with two pipes you can read and write at the same time, but with one pipe you must finish reading before you can write to the pipe, or finish writing before you can read from it. In layman's terms, with only one pipe you can either read or write at any point in time, not both.
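For comparison, here is roughly what the two-pipe arrangement looks like. It is a sketch with invented names (ask_child_to_double, to_child, to_parent): the parent sends an int down one pipe and the child sends the doubled value back on the other, so neither side can ever read its own message.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the parent sends x to the child on one pipe and
   receives 2*x on a second pipe. Returns 0 on success. */
int ask_child_to_double(int x, int *out)
{
    int to_child[2], to_parent[2];

    assert(out != NULL);
    assert(x >= INT_MIN / 2 && x <= INT_MAX / 2);   /* 2*x must not overflow */

    if (pipe(to_child) < 0)
        return -1;
    if (pipe(to_parent) < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]);
        close(to_child[1]);
        close(to_parent[0]);
        close(to_parent[1]);
        return -1;
    }
    if (pid == 0) {
        int v;
        close(to_child[1]);         /* child reads requests here... */
        close(to_parent[0]);        /* ...and writes replies here   */
        if (read(to_child[0], &v, sizeof v) != (ssize_t)sizeof v)
            _exit(1);
        v *= 2;
        if (write(to_parent[1], &v, sizeof v) != (ssize_t)sizeof v)
            _exit(1);
        _exit(0);
    }

    close(to_child[0]);             /* parent writes requests here... */
    close(to_parent[1]);            /* ...and reads replies here      */
    int ok = write(to_child[1], &x, sizeof x) == (ssize_t)sizeof x
          && read(to_parent[0], out, sizeof *out) == (ssize_t)sizeof *out;
    close(to_child[1]);
    close(to_parent[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Each process closes the two descriptors it does not use, which leaves exactly one writer and one reader per pipe.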

How can a child process return two values to the parent when using pipe()?

I have my child process counting the frequency of words from a text file. I am using pipe() for IPC. How can the child process return both the word name and the word frequency to the parent process? My source code is in C and I am executing it in a UNIX environment.
Write the two values to one end of the pipe in the child, separated by some delimiter. In the parent, read from the other end of the pipe, and separate the content using the delimiter.
Writes to a pipe of up to PIPE_BUF bytes (defined in limits.h) are atomic, so you can easily pack your information into a struct and write that to the pipe in your child process for the parent process to read. For instance, you could set up your struct like this:
struct message
{
    int word_freq;
    char word[256];
};
Then simply read from your pipe into a buffer of sizeof(struct message) bytes. That being said, keep in mind that it is best to have either a single writer and a single reader on the pipe, or multiple writers (because writes of at most PIPE_BUF bytes are atomic) but still only a single reader. While multiple readers can be managed with pipes, reads are not atomic, so you could end up with messages being missed due to the non-deterministic nature of process scheduling, or with garbled messages because a process doesn't complete a read and leaves part of a message in the pipe.
