I have just come across the select() function for linux (or is it Unix?) OS's. And its looking like it can achieve what I need to do.
I have a Linux process (on Debian) that has IPC (Inter-Process Communication) between 3 other processes. 2 of them are Serial Ports Streams and the other is a Named Pipe.
My process needs to read data from each of these streams and react accordingly (its a proxy between these 3 processes). Theres no order to the data coming in from each process (one may talk, then another lay silent for a while).
So I am thinking of having a main loop that simply uses select() to listen on all streams (with a timeout of never). That way select can notify me when/if a stream writes to my process, which stream is talking and then I can react accordingly.
Is this how select works? Is this design ok and how you would handle 3 streams where their behaviour is dynamic and not predictable (in terms of when they will write data to a stream)?
Yes, that's exactly what select is designed to do: multiplex multiple input streams and detect which have data ready to be read from.
Related
From my understanding, C pipes are like a special kind of file, where internally, the kernal keep tracks of the openings and closings from each process in a table. see the post here
So in that sense:
Is it possible for 1 single pipe to be connected by multiple processes?
If it is possible, can multiple processes read the same data?
If 2 is possible, will they be reading the same data, or does reading the data "empty" the data?
For example: process 1 writes into pipe, can process 2,3,4 read the data that process 1 wrote?
Yes, multiple processes can read from (or write to) a pipe.
But data isn't duplicated for the processes. Once data has been read from the pipe by one process, it's lost and available only to the process that actually read it.
Conversely, there's no way to distinguish data or from which process it originated if you have multiple processes writing to a single pipe.
1. Is it possible for 1 single pipe to be connected by multiple processes?
Yes.
2. If it is possible, can multiple processes read the same data?
No!
Unix fifos (pipes) can not be used in "single producer, multiple consumer" (spmc) manner; this also holds for Unix Domain Sockets (for most implementations UDS and fifos are implemented by the very same code, with just a few configuration bits differing on creation). Each byte written into a pipe / SOCK_STREAM UDS (or datagram written into a SOCK_DGRAM unix domain socket) can be read from only one single reading end.
However what's perfectly possible is having a "multiple producer, single consumer" fifo, UDS, that is the consumer having open one reading end (and also keeping open the writing end, but not using it¹), multiple producers can send data to the single consumer. For stream oriented pipes there's no strict ordering, so all the bytes sent will get mixed up. But for SOCK_DGRAM UDS socketpairs message boundaries are preserved.
¹: There's a particular pitfall, that if the creating process does not keep open its instance of the writing end, as soon as any one of the producer processes closes one of their writing end, it will tear down the connection for all other processes.
Some Unix code I am working on depends on being able to poll over a small number of pipes. poll is a POSIX system call that (much like the older select) allows the process to wait until one or more file descriptors is "ready" for reading or writing, which means one can proceed to do so without blocking. This is useful to implement event loops where waiting is clearly separated from the rest of the communication.
Is it possible to do the same for Windows pipe handles - wait for one or more of them to become "ready" for reading/writing?
Existing SO advice on the matter, such as answers to this question, recommend the use of completion ports. However as far as I can tell, completion ports require initiating reading/writing beforehand, and then waiting for (or being notified of) the completion of those operations. This approach does not fit the architecture of the code, which strongly separates the polling code from the reading/writing code, the latter calling into a library that uses the regular ReadFile and WriteFile on the underlying handle.
If there is no direct equivalent to poll, could one abuse completion ports to provide something similar? In other words, is it possible to create IO completion events that announce "you can now call ReadFile (WriteFile) on this handle without it blocking" and wait for them using WaitForMultipleObjects or GetQueuedCompletionStatus?
I am currently measuring performance of named pipe to compare with another library.
I need to simulate a clients (n) / server (1) situation where server read messages and do a simple action for every written messages. So clients are writers.
My code now work, but if I add a 2nd writer, the reader (server) will never see the data and will not receive forever. The file is still filled with the non-read data at the end and read method will return 0.
Is it ok for a single named pipe to be written by multiple-process? Do I need to initialize it with a special flag for multiple-process?
I am not sure I can/should use multiple writers on a single pipe. But, I am not sure also it would be a good design to create 1 pipe for each clients.
Would it be a more standard design to use 1 named pipe per client connection?
I know about Unix Domain Name Socket and it will be use later. I need o make the named pipes work.
What is the best way to have some shared stream with data for all threads?
If i have a threads interacting with each user connection, and then every users input must be available for all threads. We can imagine like a simple chat, where everyone sees everyones messages.
So i though that i can use some kind of "shared stream", which i can use for some kind of select() between this stream and users input socket, to write there when i got an input and read from there when there is something new available. I though about having some shared socket, but it wan't work this way, because when first thread will read data from socket, it wan't be available for othrer threads anymore.
So what it the best and idiomatic way to achieve this?
I think youy might be taking the engineering here a bit too far..
what you are looking for is an SMP like this or this of some sort, not a stream. The "stream" concept could be handled by a different process (a manager process, if you will) that will handle that stream of incoming information. In the chat scenario you described, it's not neccery because each thread can add to the SMP whatever is received on its own input stream.
Can anyone tell me the use and application of select function in socket programming in c?
The select() function allows you to implement an event driven design pattern, when you have to deal with multiple event sources.
Let's say you want to write a program that responds to events coming from several event sources e.g. network (via sockets), user input (via stdin), other programs (via pipes), or any other event source that can be represented by an fd. You could start separate threads to handle each event source, but you would have to manage the threads and deal with concurrency issues. The other option would be to use a mechanism where you can aggregate all the fd into a single entity fdset, and then just call a function to wait on the fdset. This function would return whenever an event occurs on any of the fd. You could check which fd the event occurred on, read that fd, process the event, and respond to it. After you have done that, you would go back and sit in that wait function - till another event on some fd arrives.
select facility is such a mechanism, and the select() function is the wait function. You can find the details on how to use it in any number of books and online resources.
The select function allows you to check on several different sockets or pipes (or any file descriptors at all if you are not on Windows), and do something based on whichever one is ready first. More specifically, the arguments for the select function are split up into three groups:
Reading: When any of the file descriptors in this category are ready for reading, select will return them to you.
Writing: When any of the file descriptors in this category are ready for writing, select will return them to you.
Exceptional: When any of the file descriptors in this category have an exceptional case -- that is, they close uncleanly, a connection breaks or they have some other error -- select will return them to you.
The power of select is that individual file/socket/pipe functions are often blocking. Select allows you to monitor the activity of several different file descriptors without having to have a dedicated thread of your program to each function call.
In order for you to get a more specific answer, you will probably have to mention what language you are programming in. I have tried to give as general an answer as possible on the conceptual level.
select() is the low-tech way of polling sockets for new data to read or for an open TCP window to write. Unless there's some compelling reason not to, you're probably better off using poll(), or epoll_wait() if your platform has it, for better performance.
I like description at gnu.org:
Sometimes a program needs to accept input on multiple input channels whenever input arrives. For example, some workstations may have devices such as a digitizing tablet, function button box, or dial box that are connected via normal asynchronous serial interfaces; good user interface style requires responding immediately to input on any device. [...]
You cannot normally use read for this purpose, because this blocks the program until input is available on one particular file descriptor; input on other channels won’t wake it up. You could set nonblocking mode and poll each file descriptor in turn, but this is very inefficient.
A better solution is to use the select function. This blocks the program until input or output is ready on a specified set of file descriptors, or until a timer expires, whichever comes first.
Per the documentation for Linux manpages and MSDN for Windows,
select() and pselect() allow a program to monitor multiple file
descriptors, waiting until one or more of the file descriptors become
"ready" for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform the
corresponding I/O operation (e.g., read(2)) without blocking.
For simple explanation: often it is required for an application to do multiple things at once. For example you may access multiple sites in a web browser, a web server may want to serve multiple clients simultaneously. One needs a mechanism to monitor each socket so that the application is not busy waiting for one communication to complete.
An example: imagine downloading a large Facebook page on your smart phone whilst traveling on a train. Your connection is intermittent and slow, the web server should be able to process other clients when waiting for your communication to finish.
select(2) - Linux man page
select Function - Winsock Functions