From my understanding, C pipes are like a special kind of file where, internally, the kernel keeps track of the opens and closes from each process in a table (see the post here).
So in that sense:
1. Is it possible for 1 single pipe to be connected by multiple processes?
2. If it is possible, can multiple processes read the same data?
3. If 2 is possible, will they be reading the same data, or does reading the data "empty" the data?
For example: process 1 writes into pipe, can process 2,3,4 read the data that process 1 wrote?
Yes, multiple processes can read from (or write to) a pipe.
But data isn't duplicated for the processes. Once one process has read data from the pipe, that data is removed from the pipe and is available only to the process that actually read it.
Conversely, there's no way to distinguish data or from which process it originated if you have multiple processes writing to a single pipe.
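The "not duplicated" behaviour can be checked with a small experiment. The following is only a sketch: the helper name total_bytes_read and the second "result" pipe used to collect the counts are assumptions made for illustration, not something from the question. Several readers share one pipe, and the total they receive between them equals what was written once, not once per reader.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks `nreaders` children that all read from one pipe,
 * writes `nbytes` bytes into it, and returns the total number of bytes the
 * readers received between them (or -1 on error). */
long total_bytes_read(int nreaders, long nbytes)
{
    assert(nreaders > 0 && nbytes >= 0);
    int data[2], result[2];
    if (pipe(data) == -1 || pipe(result) == -1)
        return -1;

    for (int i = 0; i < nreaders; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            /* Reader: close our copy of the write end, or EOF never arrives. */
            close(data[1]);
            close(result[0]);
            char buf[256];
            long mine = 0;
            ssize_t n;
            while ((n = read(data[0], buf, sizeof buf)) > 0)
                mine += n;
            /* Report how many bytes this particular reader consumed. */
            if (write(result[1], &mine, sizeof mine) != (ssize_t)sizeof mine)
                _exit(1);
            _exit(0);
        }
    }
    close(data[0]);
    close(result[1]);

    /* Writer: push nbytes into the pipe, then close it to signal EOF. */
    char chunk[256];
    memset(chunk, 'x', sizeof chunk);
    long left = nbytes;
    while (left > 0) {
        size_t want = left < (long)sizeof chunk ? (size_t)left : sizeof chunk;
        ssize_t n = write(data[1], chunk, want);
        if (n <= 0)
            return -1;
        left -= n;
    }
    close(data[1]);

    /* Collect one count per reader and add them up. */
    long total = 0, mine;
    for (int i = 0; i < nreaders; i++)
        if (read(result[0], &mine, sizeof mine) == (ssize_t)sizeof mine)
            total += mine;
    close(result[0]);
    while (wait(NULL) > 0)
        ;
    return total;
}
```

Calling total_bytes_read(3, 10000) should give 10000 rather than 30000: how the bytes are split between the three readers depends on scheduling, but each byte goes to exactly one of them.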
1. Is it possible for 1 single pipe to be connected by multiple processes?
Yes.
2. If it is possible, can multiple processes read the same data?
No!
Unix fifos (pipes) can not be used in "single producer, multiple consumer" (spmc) manner; this also holds for Unix Domain Sockets (for most implementations UDS and fifos are implemented by the very same code, with just a few configuration bits differing on creation). Each byte written into a pipe / SOCK_STREAM UDS (or datagram written into a SOCK_DGRAM unix domain socket) can be read from only one single reading end.
However, a "multiple producer, single consumer" FIFO or UDS is perfectly possible: the consumer holds one reading end open (and also keeps the writing end open, but without using it¹), and multiple producers send data to that single consumer. Stream-oriented pipes have no message boundaries, so bytes from different producers can get interleaved unless each write is at most PIPE_BUF bytes. For SOCK_DGRAM UDS socket pairs, message boundaries are preserved.
¹: There's a particular pitfall: if the creating process does not keep its own instance of the writing end open, the reader sees end-of-file as soon as the last remaining producer closes its writing end, which effectively tears down the connection for every producer that has not connected yet.
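The footnote's advice can be sketched for a named FIFO as follows. This is an illustration under assumptions: the helper name fifo_many_producers, the temporary directory under /tmp and the one-byte messages are all made up for the example. The consumer opens the read end, then opens a write end that it never writes to, so a blocking read cannot report end-of-file while producers come and go.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: one consumer, `nproducers` producer processes, one FIFO.
 * Each producer opens the FIFO, writes one byte and exits.
 * Returns how many bytes the consumer received, or -1 on error. */
int fifo_many_producers(int nproducers)
{
    assert(nproducers > 0);
    char dir[] = "/tmp/fifo_demo_XXXXXX";   /* illustrative location */
    char path[64];
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/events", dir);
    if (mkfifo(path, 0600) == -1)
        return -1;

    /* Open the read end without blocking, then a write end of our own.
     * The extra write end is never written to; it only keeps the FIFO
     * from reporting end-of-file when producers close their ends. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd == -1)
        return -1;
    int keep = open(path, O_WRONLY);
    if (keep == -1)
        return -1;
    /* Switch the read end back to blocking mode. */
    fcntl(rfd, F_SETFL, fcntl(rfd, F_GETFL) & ~O_NONBLOCK);

    for (int i = 0; i < nproducers; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            /* Producer: drop the inherited descriptors, open its own end. */
            close(rfd);
            close(keep);
            int wfd = open(path, O_WRONLY);
            char c = 'e';
            int ok = wfd != -1 && write(wfd, &c, 1) == 1;
            _exit(ok ? 0 : 1);
        }
    }

    /* Consumer: a read returning 0 here would mean a premature EOF. */
    int received = 0;
    char c;
    while (received < nproducers && read(rfd, &c, 1) == 1)
        received++;

    while (wait(NULL) > 0)
        ;
    close(keep);
    close(rfd);
    unlink(path);
    rmdir(dir);
    return received;
}
```

Because the consumer holds a write end itself, it can no longer rely on end-of-file to know when to stop; here it simply stops after the expected number of bytes.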
I am writing a parent process, that needs to count events from a group of child processes.
I am going to use pipe() to achieve this.
Can I open a single pipe on the parent, and then fork 4 child processes that will use that same pipe to communicate with the parent, or must I create 4 different pipes? (1 for each child process)
It is important to state that the parent never communicates with the child processes. All it does is: Count and sum up the rate at which the child processes raise events.
Also: if I can use a shared pipe, what would the atomicity of the messages be? Do I have to keep them one byte long, or can I assume that two 4-byte messages will not get their bytes interleaved?
You can use a single pipe.
You don't need to limit yourself to single-byte events.
man 7 pipe on Linux states:
PIPE_BUF
POSIX.1 says that write(2)s of less than PIPE_BUF bytes must be
atomic: the output data is written to the pipe as a contiguous
sequence. Writes of more than PIPE_BUF bytes may be nonatomic: the
kernel may interleave the data with data written by other processes.
POSIX.1 requires PIPE_BUF to be at least 512 bytes. (On Linux,
PIPE_BUF is 4096 bytes.)
(Related: The description of write in POSIX.)
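As a rough sketch of the single-pipe design, the example below forks several children that each send 4-byte messages to the parent over one pipe. The helper names count_events and read_full, and the choice to fill each message with one repeated byte so that a torn message would be detectable, are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read exactly `len` bytes unless EOF or an error comes first. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n <= 0)
            return n < 0 ? -1 : (ssize_t)got;
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Hypothetical helper: forks `nchildren` writers that share one pipe; each
 * sends `events_each` 4-byte messages. Returns the number of intact messages
 * the parent counted, or -1 on error. */
long count_events(int nchildren, int events_each)
{
    assert(nchildren > 0 && nchildren <= 26 && events_each >= 0);
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            close(fds[0]);
            /* Every byte of a message is this child's letter: 'A', 'B', ... */
            unsigned char msg[4];
            memset(msg, 'A' + i, sizeof msg);
            for (int e = 0; e < events_each; e++)
                if (write(fds[1], msg, sizeof msg) != (ssize_t)sizeof msg)
                    _exit(1);
            _exit(0);
        }
    }
    close(fds[1]);  /* parent only reads; required for EOF to arrive */

    long intact = 0;
    unsigned char msg[4];
    while (read_full(fds[0], msg, sizeof msg) == (ssize_t)sizeof msg) {
        /* A message torn between two children would mix different letters. */
        if (msg[0] == msg[1] && msg[1] == msg[2] && msg[2] == msg[3] &&
            msg[0] >= 'A' && msg[0] < 'A' + nchildren)
            intact++;
    }
    close(fds[0]);
    while (wait(NULL) > 0)
        ;
    return intact;
}
```

With 4 children sending 1000 events each, the parent should count 4000 intact messages, since every write is far below PIPE_BUF.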
Another option is to use one datagram unix socket pair created with socketpair instead of a pipe. In this case each write creates a separate datagram and each read returns one datagram only. This way messages can be larger than PIPE_BUF and still be atomic.
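To illustrate the datagram behaviour, here is a minimal single-process sketch. The helper name dgram_sizes and the two sample payloads are assumptions for the example. Two datagrams of different lengths are sent back to back, and each read returns exactly one of them even though the buffer could hold both.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends two datagrams of different sizes through an
 * AF_UNIX SOCK_DGRAM socket pair and stores the size of each datagram as
 * received in `sizes`. Returns 0 on success, -1 on error. */
int dgram_sizes(ssize_t sizes[2])
{
    assert(sizes != NULL);
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    const char *first = "tick";        /* 4 bytes */
    const char *second = "tick-tock";  /* 9 bytes */
    int ok = write(sv[0], first, strlen(first)) == (ssize_t)strlen(first) &&
             write(sv[0], second, strlen(second)) == (ssize_t)strlen(second);

    char buf[64];  /* large enough for both payloads together */
    if (ok) {
        /* Each read returns one whole datagram, never a merged stream. */
        sizes[0] = read(sv[1], buf, sizeof buf);
        sizes[1] = read(sv[1], buf, sizeof buf);
    }
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

In the multi-process case the children would inherit sv[0] across fork() and the parent would read from sv[1]; the sketch stays in one process only to keep it short.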
Try using a named pipe, as in this example: https://www.geeksforgeeks.org/named-pipe-fifo-example-c-program/ Run the reader and then the writer, and that should do the work. You should also think about a message queue, which is an asynchronous way of communicating between processes.
I am currently measuring performance of named pipe to compare with another library.
I need to simulate a clients (n) / server (1) situation where the server reads messages and performs a simple action for every message written. So the clients are the writers.
My code now works, but if I add a 2nd writer, the reader (server) never sees the data and never receives anything. At the end, the file is still filled with the unread data, and the read method returns 0.
Is it OK for a single named pipe to be written to by multiple processes? Do I need to initialize it with a special flag for multiple processes?
I am not sure I can/should use multiple writers on a single pipe. But I am also not sure it would be a good design to create 1 pipe for each client.
Would it be a more standard design to use 1 named pipe per client connection?
I know about Unix domain sockets, and they will be used later. For now, I need to make the named pipes work.
I know for pthreads, if they're modifying the same variables or files, you can use pthread_mutex_lock to prevent simultaneous writes.
If I'm using fork() to have multiple processes, which are editing the same file, how can I make sure they're not writing simultaneously to that file?
Ideally I'd like to lock the file for one writer at a time, and each process would only need to write once (no loops necessary). Do I need to do this manually or will UNIX do it for me?
Short answer: you have to do it manually. There are certain guarantees on the atomicity of each write, but you'll still need to synchronize the processes to avoid interleaving writes. There are a lot of techniques for synchronizing processes. Since all of your writers are descendants of a common process, probably the easiest thing to do is to pass a token on a common pipe. Before you fork, create a pipe and write a single byte into it. Any time a process wants to write to the file, it will do a blocking read on the pipe. If it gets a byte, then it proceeds to write to the file. When it is done, it writes a byte back into the pipe. If any other process wants to access the file, it will block on the pipe read until the other process is done writing. This is often simpler than using a semaphore, which is another excellent technique.
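The token-passing idea can be sketched like this. The helper name locked_increment, the temporary file under /tmp and the use of a counter as the shared data are assumptions made for the example. Each child performs an unprotected read-modify-write on a shared file, but only while it holds the single byte taken from the pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: `nprocs` children each increment a counter stored in
 * a shared file `iters` times, taking the token from the pipe before every
 * read-modify-write and putting it back afterwards.
 * Returns the final counter value, or -1 on error. */
long locked_increment(int nprocs, int iters)
{
    assert(nprocs > 0 && iters >= 0);
    char path[] = "/tmp/token_demo_XXXXXX";  /* illustrative location */
    int file = mkstemp(path);
    int token[2];
    long value = 0;
    char t = 'T';
    if (file == -1 || pipe(token) == -1)
        return -1;
    unlink(path);  /* the file stays usable until the descriptor is closed */

    /* Initialise the counter and put the single token into the pipe. */
    if (pwrite(file, &value, sizeof value, 0) != (ssize_t)sizeof value ||
        write(token[1], &t, 1) != 1)
        return -1;

    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            for (int k = 0; k < iters; k++) {
                char tok;
                long v;
                /* Acquire: blocks until some process returns the token. */
                if (read(token[0], &tok, 1) != 1)
                    _exit(1);
                if (pread(file, &v, sizeof v, 0) != (ssize_t)sizeof v)
                    _exit(1);
                v++;
                if (pwrite(file, &v, sizeof v, 0) != (ssize_t)sizeof v)
                    _exit(1);
                /* Release: hand the token to whoever is waiting. */
                if (write(token[1], &tok, 1) != 1)
                    _exit(1);
            }
            _exit(0);
        }
    }
    while (wait(NULL) > 0)
        ;
    if (pread(file, &value, sizeof value, 0) != (ssize_t)sizeof value)
        value = -1;
    close(file);
    close(token[0]);
    close(token[1]);
    return value;
}
```

With 4 processes and 200 iterations each the result should be 800; without the token, increments could be lost. Note that a process that dies while holding the token leaves the others blocked, which a real program would have to handle.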
In inter-process communication (IPC), processes need a "pipe" provided by the OS in order to communicate with each other. And to transmit data from an input unit to a program, or from a program to an output unit, they need a "stream" provided by the OS.
Here are my questions.
Are there differences between a pipe and a stream?
If they are different, their functions are still very similar, so wouldn't it be more useful to use only pipes or only streams to transmit data?
A pipe is a communication channel between two processes. It has a writing end and a reading end. When one opens one of these two ends, one gets a (writing or reading) stream. So, to a first approximation, there is a stream at each end of a pipe.
So to set up IPC, you should:
create a pipe using the function pipe. This returns two ints (file descriptors) identifying the two ends of the pipe;
usually fork to get two processes;
open each end of the pipe (usually in a different process after forking), for example with fdopen, and get two corresponding streams.
See http://www.gnu.org/software/libc/manual/html_node/Creating-a-Pipe.html
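The three steps above can be sketched as follows. The helper name send_line is an assumption for the example; pipe, fork and fdopen are the standard POSIX calls. The child wraps the write end in a stdio stream and prints one line, and the parent wraps the read end and reads that line back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child sends `text` plus a newline through a pipe
 * using a stdio stream; the parent reads one line back into `out`.
 * Returns 0 on success, -1 on error. */
int send_line(const char *text, char *out, int outlen)
{
    assert(text != NULL && out != NULL && outlen > 0);
    int fds[2];  /* fds[0]: reading end, fds[1]: writing end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: keep only the writing end and wrap it in a stream. */
        close(fds[0]);
        FILE *w = fdopen(fds[1], "w");
        if (w == NULL)
            _exit(1);
        fprintf(w, "%s\n", text);
        fclose(w);  /* flushes the buffer and closes the writing end */
        _exit(0);
    }

    /* Parent: keep only the reading end and wrap it in a stream. */
    close(fds[1]);
    FILE *r = fdopen(fds[0], "r");
    if (r == NULL)
        return -1;
    char *line = fgets(out, outlen, r);
    fclose(r);
    waitpid(pid, NULL, 0);
    return line != NULL ? 0 : -1;
}
```

So the pipe is the kernel object identified by the two descriptors, while the streams are the buffered stdio layer placed on top of each descriptor.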
I have my child process counting the frequency of words from a text file. I am using pipe() for IPC. How can the child process return both the word name and the word frequency to the parent process? My source code is in C and I am executing it in a UNIX environment.
Write the two values to one end of the pipe in the child, separated by some delimiter. In the parent, read from the other end of the pipe, and separate the content using the delimiter.
Writes to a pipe of up to PIPE_BUF bytes (PIPE_BUF is defined in limits.h) are atomic, so you can easily pack your information into some type of struct and write that to the pipe in your child process for the parent process to read. For instance, you could set up your struct to look like:
struct message
{
    int word_freq;
    char word[256];
};
Then simply do a read from your pipe with a buffer that is equal to sizeof(struct message). That being said, keep in mind that it is best to only have either a single reader/writer to the pipe, or you can have multiple writers (because writes are atomic), but again, only a single reader. While multiple readers can be managed with pipes, the fact that reads are not atomic means that you could end up with scenarios where messages either get missed due to the non-deterministic nature of process scheduling, or you get garbled messages because a process doesn't complete a read and leaves part of a message in the pipe.
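A minimal sketch of the struct approach with a single writer and a single reader is shown below. The helper name send_word_count is an assumption for the example, and the static assertion against _POSIX_PIPE_BUF (the 512-byte minimum that POSIX guarantees) is an added safety check rather than part of the answer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct message {
    int word_freq;
    char word[256];
};

/* The record must fit in the smallest PIPE_BUF POSIX allows to stay atomic. */
_Static_assert(sizeof(struct message) <= _POSIX_PIPE_BUF,
               "struct message is too large for an atomic pipe write");

/* Hypothetical helper: the child sends one (word, frequency) record through
 * a pipe and the parent stores it in `out`. Returns 0 on success, -1 on error. */
int send_word_count(const char *word, int freq, struct message *out)
{
    assert(word != NULL && out != NULL);
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        close(fds[0]);
        struct message m;
        memset(&m, 0, sizeof m);  /* zero padding and string terminator */
        m.word_freq = freq;
        strncpy(m.word, word, sizeof m.word - 1);
        /* One write call for the whole record. */
        _exit(write(fds[1], &m, sizeof m) == (ssize_t)sizeof m ? 0 : 1);
    }

    close(fds[1]);
    /* Loop until the whole record has arrived; read may return less. */
    size_t got = 0;
    while (got < sizeof *out) {
        ssize_t n = read(fds[0], (char *)out + got, sizeof *out - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return got == sizeof *out ? 0 : -1;
}
```

Sending the raw struct works here because both processes come from the same binary and share its layout; between unrelated programs a text format with a delimiter is the safer choice.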