Are pipe reads atomic on Linux (multiple writers, one reader)? - c

I have multiple processes (and multiple threads within some processes) writing to a single named pipe. The pipe is opened with O_WRONLY for each writer.
I have another process reading from this pipe, blocking with select. The pipe is opened with O_RDONLY | O_NONBLOCK in the reader.
When select in the reader wakes up, will read return at most one chunk of data available, or could it return multiple chunks? If the former, then I expect after I read the first chunk, select will immediately wake up until I finish reading the remaining chunks.
Or could read return less than one of the chunks written by a writer?
I'm writing and reading strings, and they're all less than PIPE_BUF, so I know the writes are atomic. I can easily append a delimiter to check for multiple strings, but I'm just curious how it works on Linux.

read returns whatever data is available in the pipe, up to the number of bytes requested; it doesn't matter how many writes were used to put that data there. If more data is available than you asked for, read returns exactly the requested size, and the next select returns immediately, indicating that there is still data to read.
You will have to delimit each chunk you write to the pipe and separate the chunks after reading.
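A minimal sketch of the point above, assuming newline-delimited strings (the delimiter, the buffer size, and the names count_messages and coalesced_read_demo are illustrative choices, not from the question). It issues two separate write() calls on an ordinary pipe() and then a single read(), and counts how many delimited messages that one read delivered:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n'-terminated messages in buf[0..len). */
static int count_messages(const char *buf, size_t len)
{
    int count = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            count++;
    return count;
}

/* Two separate write() calls, then one read().
 * Returns how many messages that single read delivered, or -1 on error. */
int coalesced_read_demo(void)
{
    static const char first[] = "first\n";
    static const char second[] = "second\n";
    char buf[64];
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    /* sizeof - 1 skips the terminating NUL of each literal. */
    ssize_t w1 = write(fds[1], first, sizeof first - 1);
    ssize_t w2 = write(fds[1], second, sizeof second - 1);

    ssize_t n = -1;
    if (w1 == (ssize_t)(sizeof first - 1) && w2 == (ssize_t)(sizeof second - 1))
        n = read(fds[0], buf, sizeof buf);

    close(fds[0]);
    close(fds[1]);
    if (n < 0)
        return -1;
    assert((size_t)n <= sizeof buf);
    return count_messages(buf, (size_t)n);
}
```

On Linux the single read should hand back both strings at once, which is why the reader has to split on the delimiter rather than assume one read equals one message.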

Related

Can single pipe be connected and read by multiple processes

From my understanding, C pipes are like a special kind of file, where internally the kernel keeps track of the opens and closes from each process in a table. See the post here.
So in that sense:
1. Is it possible for 1 single pipe to be connected by multiple processes?
2. If it is possible, can multiple processes read the same data?
3. If 2 is possible, will they be reading the same data, or does reading the data "empty" the data?
For example: process 1 writes into pipe, can process 2,3,4 read the data that process 1 wrote?
Yes, multiple processes can read from (or write to) a pipe.
But data isn't duplicated for the processes. Once data has been read from the pipe by one process, it's lost and available only to the process that actually read it.
Conversely, if you have multiple processes writing to a single pipe, there's no way to tell which process a given piece of data came from.
1. Is it possible for 1 single pipe to be connected by multiple processes?
Yes.
2. If it is possible, can multiple processes read the same data?
No!
Unix fifos (pipes) cannot be used in a "single producer, multiple consumer" (spmc) manner; this also holds for Unix Domain Sockets (on some implementations UDS and fifos share the very same code, with just a few configuration bits differing on creation). Each byte written into a pipe / SOCK_STREAM UDS (or datagram written into a SOCK_DGRAM unix domain socket) can be read from only one single reading end.
However, what's perfectly possible is a "multiple producer, single consumer" fifo or UDS: the consumer holds one reading end open (and also keeps the writing end open, but doesn't use it¹), and multiple producers send data to that single consumer. Stream-oriented pipes preserve no message boundaries: each write of at most PIPE_BUF bytes arrives intact, but larger writes from different producers can interleave, so the consumer has to frame the messages itself. For SOCK_DGRAM UDS socketpairs, message boundaries are preserved.
¹: There's a particular pitfall: if the consumer does not keep its own instance of the writing end open, then as soon as the last remaining producer closes its writing end, the reader sees end-of-file, even though other producers may still want to connect later.
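A small sketch of how end-of-file on a pipe relates to the write ends that are still open, under some assumptions: it uses an anonymous pipe() rather than a named fifo, dup() stands in for a second producer holding the write end, and the function name is made up for illustration. The read end is made non-blocking so that "no data yet" shows up as EAGAIN instead of blocking:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if EOF appears only after the last write end is closed,
 * -1 on any other outcome. */
int eof_after_last_writer_demo(void)
{
    int fds[2];
    char c;

    if (pipe(fds) == -1)
        return -1;

    /* dup() stands in for a second producer holding the write end. */
    int other_writer = dup(fds[1]);
    /* Non-blocking reads, so "no data yet" is reported as EAGAIN. */
    int flags = fcntl(fds[0], F_GETFL);
    if (other_writer == -1 || flags == -1 ||
        fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        if (other_writer != -1)
            close(other_writer);
        return -1;
    }
    assert(other_writer != fds[1]);   /* a distinct descriptor, same write end */

    close(fds[1]);                    /* first producer leaves */
    ssize_t n = read(fds[0], &c, 1);
    int open_after_first =
        (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(other_writer);              /* last producer leaves */
    n = read(fds[0], &c, 1);
    int eof_after_last = (n == 0);

    close(fds[0]);
    return (open_after_first && eof_after_last) ? 0 : -1;
}
```

A consumer that keeps its own write descriptor open never reaches the second state, so it does not see a spurious end-of-file while producers come and go.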

Mutexes with pipes in C

I am sorry if this sounds like I am repeating this question, but I have a couple additions that I am hoping someone can explain for me.
I am trying to implement a 'packet queueing system' with pipes. I have one thread that has a packet of data that it needs to pass to a second thread (let's call the threads A and B respectively). Originally I did this with a queueing structure that I implemented using linked lists. I would lock a mutex, write to the queue, and then unlock the mutex. On the read side, I would do the same thing: lock, read, unlock. Now I have decided to change my implementation and make use of pipes (so that I can block when data is not available). Now for my question:
Do I need to use mutexes to lock the file descriptors of the pipe for read and write operations?
Here is my thinking.
I have a standard message that gets written to the pipe on writes, and it is expected to be read on the read side.
struct pipe_message {
    int stuff;
    short more_stuff;
    char * data;
    int length;
};
// This is where I read from the pipe
num_bytes_read = read(read_descriptor, &buffer, sizeof(struct pipe_message));
if(num_bytes_read != sizeof(struct pipe_message)) // If the message isn't full
{
    printe("Error: Read did not receive a full message\n");
    return NULL;
}
If I do not use mutexes, could I potentially read only half of my message from the pipe?
This could be bad because I would not have a pointer to the data and I could be left with memory leaks.
But if I use mutexes, I would lock the mutex on the read and attempt to read, which would block; then, because the mutex is locked, the write side would not be able to access the pipe.
Do I need to use mutexes to lock the file descriptors of the pipe for read and write operations?
It depends on the circumstances. Normally, no.
Normality
If you have a single thread writing into the pipe's write file descriptor, no. Nor does the reader need to use semaphores or mutexes to control reading from the pipe. That's all taken care of by the OS underneath on your behalf. Just go ahead and call write() and read(); nothing else is required.
Less Usual
If you have multiple threads writing into the pipe's write file descriptor, then the answer is maybe.
Under Linux, calling write() on the pipe's write file descriptor is an atomic operation provided that the size of the data being written is at most PIPE_BUF bytes (this is documented in the pipe(7) man page; on Linux PIPE_BUF is 4096 bytes). This means that for such writes you don't need a mutex or semaphore to control access to the pipe's write file descriptor.
If the size of the data you're writing is larger than that, then the call to write() on the pipe is not atomic. So if you have multiple threads writing to the pipe and the size is too large, then you do need a mutex to control access to the write end of the pipe.
Using a mutex with a blocking pipe is actually dangerous. If the write side takes the mutex, writes to the pipe and blocks because the pipe is full, then a read side that needs the same mutex can't get it to read the data from the pipe, and you have a deadlock.
To be safe, on the write side you'd probably need to do something like take the mutex, check if the pipe has space for what you want to write, if not then release the mutex, yield and then try again.
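As a sketch of the no-mutex case, assuming the struct pipe_message from the question and messages that fit in PIPE_BUF: the writer sends each message with one write(), and the reader loops until it has a whole message instead of treating a short read as an error. The names send_message and recv_message are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <unistd.h>

struct pipe_message {
    int stuff;
    short more_stuff;
    char *data;     /* only meaningful between threads of one process */
    int length;
};

/* Sends one message with a single write(); atomic on a pipe as long as
 * the message is no larger than PIPE_BUF. Returns 0 on success. */
int send_message(int fd, const struct pipe_message *msg)
{
    assert(sizeof *msg <= PIPE_BUF);
    ssize_t n;
    do {
        n = write(fd, msg, sizeof *msg);
    } while (n == -1 && errno == EINTR);
    return n == (ssize_t)sizeof *msg ? 0 : -1;
}

/* Reads exactly one message, looping over short reads.
 * Returns 1 on success, 0 on EOF before any byte arrived, -1 on error. */
int recv_message(int fd, struct pipe_message *msg)
{
    char *p = (char *)msg;
    size_t got = 0;

    while (got < sizeof *msg) {
        ssize_t n = read(fd, p + got, sizeof *msg - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)                      /* EOF, possibly mid-message */
            return got == 0 ? 0 : -1;
        got += (size_t)n;
    }
    return 1;
}
```

With several writer threads this relies only on the PIPE_BUF atomicity described above; messages larger than PIPE_BUF would still need their own locking on the write side.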

How to make C program block until FIFO pipe is empty?

I'm doing IPC using named pipes (FIFOs), and I would like to coordinate things so that the writing program can only write into the pipe once the reading program has read the previously written data out of it. So I would like to block the write until the pipe is empty. Is this possible?
One option that I thought of is that the write function blocks when the pipe is full. But I would like to do this with much smaller amounts of data than the pipe size in Linux. E.g. I would like the program to write only 20 bytes and then wait until the other end has read the data. I think you cannot shrink named pipes to be that small. (The minimum size seems to be the page size, 4096 bytes?)
Thanks!
A possible solution is to have the reading process send a signal to the writing process once it has read some data. You can do this using kill() to send SIGTERM to the writer, since SIGTERM can be caught and handled (SIGUSR1 or SIGUSR2, which are reserved for application-defined purposes, would be a more conventional choice).
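A rough sketch of this signalling idea, with several assumptions that are not in the answer: it uses SIGUSR1 rather than SIGTERM, an anonymous pipe() with fork() instead of a named fifo between unrelated programs, a single-threaded writer, and sigwait() in place of a signal handler. The 20-byte message size comes from the question; ROUNDS and the function name are made up:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MSG_SIZE = 20, ROUNDS = 3 };

/* Writer sends MSG_SIZE bytes, then waits for SIGUSR1 from the reader
 * before sending again. Returns 0 if all rounds completed, -1 otherwise. */
int lockstep_demo(void)
{
    int fds[2];
    sigset_t set, old;

    if (pipe(fds) == -1)
        return -1;

    /* Block SIGUSR1 before fork() so that an early acknowledgement
     * stays pending until sigwait() collects it. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigprocmask(SIG_BLOCK, &set, &old);

    pid_t writer = getpid();
    pid_t child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    if (child == 0) {                       /* reader process */
        char buf[MSG_SIZE];
        close(fds[1]);
        for (int i = 0; i < ROUNDS; i++) {
            size_t got = 0;
            while (got < sizeof buf) {
                ssize_t n = read(fds[0], buf + got, sizeof buf - got);
                if (n <= 0)
                    _exit(1);
                got += (size_t)n;
            }
            kill(writer, SIGUSR1);          /* "I have read it all" */
        }
        _exit(0);
    }

    close(fds[0]);                          /* writer process */
    char msg[MSG_SIZE];
    memset(msg, 'x', sizeof msg);
    int ok = 1;
    for (int i = 0; i < ROUNDS && ok; i++) {
        int sig = 0;
        if (write(fds[1], msg, sizeof msg) != (ssize_t)sizeof msg ||
            sigwait(&set, &sig) != 0)
            ok = 0;
        else
            assert(sig == SIGUSR1);         /* the only signal in the set */
    }
    close(fds[1]);

    int status = 0;
    if (waitpid(child, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        ok = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return ok ? 0 : -1;
}
```

With unrelated programs on a named fifo, the writer would first have to tell the reader its pid (for instance inside the first message), which this sketch sidesteps by using fork().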

How can a child process return two values to the parent when using pipe()?

I have my child process counting the frequency of words from a text file. I am using pipe() for IPC. How can the child process return both the word name and the word frequency to the parent process? My source code is in C and I am executing it in a UNIX environment.
Write the two values to one end of the pipe in the child, separated by some delimiter. In the parent, read from the other end of the pipe, and separate the content using the delimiter.
Writes to a pipe of up to PIPE_BUF bytes (defined in limits.h) are atomic, so you can easily pack your information into some type of struct and write that to the pipe in your child process for the parent process to read. For instance, you could set up your struct to look like:
struct message
{
    int word_freq;
    char word[256];
};
Then simply do a read from your pipe with a buffer whose size is equal to sizeof(struct message). That being said, keep in mind that it is best to have either a single writer and a single reader on the pipe, or multiple writers (because writes are atomic) but still only a single reader. While multiple readers can be managed with pipes, the fact that reads are not atomic means that you could end up with scenarios where messages either get missed due to the non-deterministic nature of process scheduling, or you get garbled messages because a process doesn't complete a read and leaves part of a message in the pipe.
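A minimal sketch of the struct approach, assuming one child that reports a single (word, frequency) pair; the function name report_word and its parameters are illustrative, and a real word counter would send one struct per word in a loop:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct message {
    int word_freq;
    char word[256];
};

/* The child writes one struct message; the parent reads it into *out.
 * Returns 0 on success, -1 on failure. */
int report_word(const char *word, int freq, struct message *out)
{
    int fds[2];

    assert(sizeof(struct message) <= PIPE_BUF);   /* one atomic write */
    if (pipe(fds) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {                             /* child: the counter */
        struct message m;
        memset(&m, 0, sizeof m);                  /* also NUL-terminates word */
        m.word_freq = freq;
        strncpy(m.word, word, sizeof m.word - 1);
        close(fds[0]);
        ssize_t n = write(fds[1], &m, sizeof m);
        _exit(n == (ssize_t)sizeof m ? 0 : 1);
    }

    close(fds[1]);                                /* parent: the reader */
    char *p = (char *)out;
    size_t got = 0;
    while (got < sizeof *out) {
        ssize_t n = read(fds[0], p + got, sizeof *out - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    close(fds[0]);

    int status = 0;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return (got == sizeof *out && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Sending the raw struct works here because both ends are the same program on the same machine; across different builds or machines a delimiter-based text format, as in the first answer, is the more portable choice.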

C prog: After append file, read still return 0

I have a new file, opened for read/write. One thread receives data from the network and appends the binary data to that file; another thread reads from the same file to process the binary data. But read() always returns 0, so I can't read the data. If I use cat on the command line to append data, however, the program can read and process it. I don't know why it doesn't notice the new data coming from the network. I'm using open(), read(), and write() in this program.
Use a pipe instead of an HDD file. Depending on your system (which you didn't tell us), only minor modifications to your code (which you didn't give us) are needed to do that.
File operations are buffered. Try flushing the stream?
Assuming that your read() and write() functions are the POSIX ones, they share the file position, even when used in different threads. So your read after a write was trying to read from the position where the write had just finished, which is the end of the file. Don't use file I/O to communicate between threads. In most contexts I wouldn't even use pipes or sockets for that (one context where I would is when the reading thread is using poll/select with other file descriptors), but simple shared memory and a mutex.
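The shared file position can be seen in a few lines. This is only a sketch, assuming a single descriptor used for both writing and reading as in the question; the temporary file name pattern and the function name are made up. After the write, read() on the same descriptor starts at the end of the file and returns 0, while pread() with an explicit offset sees the data:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if read() at the shared offset sees nothing while pread()
 * at offset 0 sees the data just written, -1 otherwise. */
int shared_offset_demo(void)
{
    char path[] = "/tmp/offset_demo_XXXXXX";   /* hypothetical temp name */
    char buf[16];

    int fd = mkstemp(path);                    /* opened for read/write */
    if (fd == -1)
        return -1;
    unlink(path);                              /* removed once fd is closed */

    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    /* The write moved the one shared offset to the end of the file. */
    assert(lseek(fd, 0, SEEK_CUR) == 5);

    ssize_t at_shared = read(fd, buf, sizeof buf);      /* starts at 5: EOF */
    ssize_t at_zero = pread(fd, buf, sizeof buf, 0);    /* explicit offset  */

    close(fd);
    return (at_shared == 0 && at_zero == 5) ? 0 : -1;
}
```

A reader that tracks its own offset and uses pread(), or that opens the file separately so it has its own file position, avoids the problem, although the shared-memory-and-mutex approach above is still simpler for threads.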

Resources