i have the following code:
{
send(dstSocket, rcvBuffer, recvMsgSize, 0);
sndMsgSize = recv(dstSocket, sndBuffer, RCVBUFSIZE, 0);
send(rcvSocket, sndBuffer, sndMsgSize, 0);
recvMsgSize = recv(rcvSocket, rcvBuffer, RCVBUFSIZE, 0);
}
which should eventually become part of a generic TCP proxy. As it stands it doesn't work quite correctly, since recv() blocks waiting for input, so the data only gets transmitted in chunks, depending on where the loop currently is.
From what I've read, I need something like "non-blocking sockets" and a mechanism to monitor them. That mechanism, as I found out, is select, poll or epoll on Linux. Could anyone confirm that I am on the right track here? Or could this exercise also be done with blocking sockets?
Regards
You are on the right track.
"select" and "poll" are system calls to which you can pass one or more sockets and block (optionally for a limited amount of time) until data has been received on one of those sockets (or one is ready for sending).
"non-blocking sockets" is a setting you can apply to a socket (or a recv call flag) such that if you try to call recv, but no data is available, the call will return immediately. Similar semantics exist for "send". You can use non-blocking sockets with or without the select/poll method described above. It's usually not a bad idea to use non-blocking operations just in case you get signaled for data that isn't there.
"epoll" is a highly scalable version of select and poll. A select set is limited to FD_SETSIZE descriptors (commonly 64 on Windows and 1024 on Linux), and select/poll take a performance hit as the number of monitored sockets grows. "epoll" can scale to thousands of simultaneous network connections.
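For illustration, here is a minimal epoll sketch (Linux-only; the helper name and the one-fd-per-call setup are just for demonstration, a real program would create the epoll instance once and register every connection with it):

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Watch a single fd for readability and report how many events
 * epoll_wait() delivers within timeout_ms (0 = check and return). */
int wait_readable(int fd, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }

    struct epoll_event ready[1];
    int n = epoll_wait(ep, ready, 1, timeout_ms);  /* -1, 0, or 1 */
    close(ep);
    return n;
}
```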
Yes, you are on the right track. Use non-blocking sockets, passing their file descriptors to select() (see FD_SET()).
This way select() will monitor them for events (read/write).
When select() returns, you can check which fd has a pending event (see FD_ISSET()) and handle it.
You can also set a timeout on select(), and it will return after that period even if no events have occurred.
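As a sketch of how this fits the proxy, here is one relay step (the function name, chunk size and return convention are my own; error handling is minimal): wait for the source fd to become readable, then forward whatever arrived, looping on write() because it may be partial.

```c
#include <sys/select.h>
#include <unistd.h>

/* Forward one chunk from src_fd to dst_fd.  Returns the number of
 * bytes forwarded, 0 on timeout or EOF, -1 on error. */
ssize_t relay_once(int src_fd, int dst_fd, int timeout_ms)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(src_fd, &rfds);

    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    int r = select(src_fd + 1, &rfds, NULL, NULL, &tv);
    if (r <= 0)
        return r;                    /* 0 = timeout, -1 = error */

    char buf[4096];
    ssize_t n = read(src_fd, buf, sizeof buf);
    if (n <= 0)
        return n;                    /* 0 = peer closed, -1 = error */

    ssize_t off = 0;
    while (off < n) {                /* write() may be partial */
        ssize_t w = write(dst_fd, buf + off, n - off);
        if (w < 0)
            return -1;
        off += w;
    }
    return n;
}
```

A full proxy would FD_SET both sockets in the same set and relay in both directions from a single select call.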
Yes, you'll have to use one of those mechanisms. poll is portable and IMO the easiest one to use. You don't have to turn off blocking in this case, provided you use a small enough value for RCVBUFSIZE (around 2k-10k should be appropriate). Non-blocking sockets are a bit more complicated to handle, since if you get EAGAIN on send you can't just loop to try again (well, you can, but you shouldn't, since it burns CPU unnecessarily).
But I would recommend using a wrapper such as libevent. In this case a struct bufferevent would work particularly well: it makes a callback when new data is available, and you just queue the data up for sending on the other socket.
I tried to find a bufferevent example, but they seem to be in short supply. The documentation is here anyway: http://monkey.org/~provos/libevent/doxygen-2.0.1/index.html
Related
I think the question is not new: I have a thread which should read from an X server (via XCB) and another server connected with TCP, so calling select is needed.
What's confusing me is: when the program returns from select and it turns out there is data on the X server link, what if the data is not enough for a complete XCB event? In that case xcb_poll_for_event() should return NULL, but when the program calls select again it does not block, because there is some data after all, so the program is trapped busy-waiting.
Is this a valid concern? I believe so because each XCB event is composed of many bytes and the server may be interrupted during sending.
How about setting SO_RCVLOWAT for the xcb fd, with the required size of an XCB event, using setsockopt()? Then the socket's file descriptor will only select as readable when there is at least that amount of data ready to read. This is the normal approach we use when dealing with TCP servers; I haven't tried it with XCB fds, though.
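A sketch of that (the helper name is mine; note that whether select() honours SO_RCVLOWAT is platform-dependent, and on Linux it is only respected by select/poll in reasonably recent kernels):

```c
#include <sys/socket.h>

/* Ask the kernel to report the socket readable only once at least
 * min_bytes are queued; returns the value actually stored, or -1. */
int set_min_readable(int fd, int min_bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
                   &min_bytes, sizeof min_bytes) < 0)
        return -1;

    int stored = 0;
    socklen_t len = sizeof stored;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &stored, &len) < 0)
        return -1;
    return stored;
}
```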
I am an experienced network programmer and am faced with a situation where I need some advice.
I am required to distribute some data over several outgoing interfaces (via different TCP socket connections, each corresponding to one interface). The important part is that I should be able to send more of the data, ideally most of it, over the interface with better bandwidth, i.e. the one that can send faster.
My idea was to use the select API (on both Unix and Windows) for this purpose. I have used select, poll and even epoll in the past, but always for READING from multiple sockets whenever data was available.
Here I intend to write successive packets to several interfaces in sequence, then monitor each of them via the write descriptor set (a select parameter); whichever becomes writable first (meaning it was able to send its packet first), I would keep sending more packets via that descriptor.
Will I be able to achieve my intention here? I.e. if I have an interface with a 10 Mbps link and another with a 1 Mbps link, I hope to get most of the packets out via the faster interface.
Update 1: I was wondering what select's behaviour would be in this case. When you call select on read descriptors, the ones on which data is available are returned. But in my scenario, when we are writing to the descriptors and waiting for select to return the one that finished writing first, does select guarantee it returns only when an entire packet has been written? Say I tried writing 1200 bytes in one go: will it only return when the entire 1200 bytes are written, or when there is a permanent error? I am not sure how select behaves here and failed to find any documentation describing it.
I'd adopt the producer/consumer pattern. In this case: one producer and several consumers.
Let the main thread handle your source (be the producer) and spawn off one thread for each connection (the consumers).
The threads pull chunks from the source in parallel and send them over their connections one by one.
The thread holding the fastest connection is expected to send the most chunks in this setup.
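That layout can be sketched with pthreads like this (the queue capacity, the use of plain ints as "chunks" and all the names are illustrative; a real consumer would carry actual data and call send() on its connection):

```c
#include <pthread.h>
#include <stddef.h>

#define QCAP 8

/* Single-producer / multi-consumer queue: the producer pushes chunks,
 * each consumer pops the next one and "sends" it.  A faster connection
 * simply comes back for its next chunk sooner. */
static struct {
    int buf[QCAP];
    int head, tail, count;
    int done;                          /* producer finished */
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
} q;

static void q_push(int v)
{
    pthread_mutex_lock(&q.mu);
    while (q.count == QCAP)
        pthread_cond_wait(&q.not_full, &q.mu);
    q.buf[q.tail] = v;
    q.tail = (q.tail + 1) % QCAP;
    q.count++;
    pthread_cond_signal(&q.not_empty);
    pthread_mutex_unlock(&q.mu);
}

/* Returns 0 if a chunk was popped, -1 once the producer is done. */
static int q_pop(int *v)
{
    pthread_mutex_lock(&q.mu);
    while (q.count == 0 && !q.done)
        pthread_cond_wait(&q.not_empty, &q.mu);
    if (q.count == 0) {                /* done and fully drained */
        pthread_mutex_unlock(&q.mu);
        return -1;
    }
    *v = q.buf[q.head];
    q.head = (q.head + 1) % QCAP;
    q.count--;
    pthread_cond_signal(&q.not_full);
    pthread_mutex_unlock(&q.mu);
    return 0;
}

static void *consumer(void *arg)
{
    long sent = 0;
    int chunk;
    (void)arg;
    while (q_pop(&chunk) == 0)
        sent++;                        /* a real consumer would send() here */
    return (void *)sent;
}

/* Produce nchunks, consume with nconsumers threads, return the total
 * number of chunks consumed (should equal nchunks). */
long run_demo(int nconsumers, int nchunks)
{
    pthread_mutex_init(&q.mu, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);
    q.head = q.tail = q.count = q.done = 0;

    pthread_t tid[16];
    for (int i = 0; i < nconsumers; i++)
        pthread_create(&tid[i], NULL, consumer, NULL);

    for (int i = 0; i < nchunks; i++)
        q_push(i);

    pthread_mutex_lock(&q.mu);
    q.done = 1;
    pthread_cond_broadcast(&q.not_empty);
    pthread_mutex_unlock(&q.mu);

    long total = 0;
    for (int i = 0; i < nconsumers; i++) {
        void *ret;
        pthread_join(tid[i], &ret);
        total += (long)ret;
    }
    return total;
}
```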
Using poll/epoll/select for writing is rather tricky. The reason is that sockets are mostly ready for writing unless their socket send buffer is full. So, polling for 'writable' is apt to just spin without ever waiting.
You need to proceed as follows:
1. When you have something to write to a socket, write it, in a loop that terminates when either all the data has been written or write() returns -1 with errno == EAGAIN/EWOULDBLOCK.
2. At that point you have a full socket send buffer, so you need to register this socket with the selector/poll/epoll for writability.
3. When you have nothing else to do, select/poll/epoll, and repeat the writes that caused the associated sockets to be polled for writability.
4. Do those writes the same way as at (1), but this time, if the write completes, deregister the socket for writability.
In other words: you must only select/poll for writability if you already know the socket's send buffer is full, and you must stop doing so as soon as you know it isn't.
How you fit all this into your application is another question.
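As a sketch of step (1), here is a write loop for a non-blocking fd. For brevity it waits inline with poll() once the buffer fills; a real event loop would instead register the fd for writability and return to its main loop, as described above.

```c
#include <errno.h>
#include <fcntl.h>     /* O_NONBLOCK, for callers setting up the fd */
#include <poll.h>
#include <unistd.h>

/* Write all of buf to a non-blocking fd.  Only when write() reports
 * EAGAIN/EWOULDBLOCK (send buffer full) do we wait for writability. */
ssize_t write_all_nonblocking(int fd, const char *buf, size_t len)
{
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n > 0) {
            off += (size_t)n;
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Send buffer full: this is the moment to poll for POLLOUT. */
            struct pollfd p = { .fd = fd, .events = POLLOUT };
            if (poll(&p, 1, -1) < 0)
                return -1;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        return -1;                     /* real error */
    }
    return (ssize_t)off;
}
```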
This question already has an answer here:
Why is it assumed that send may return with less than requested data transmitted on a blocking socket?
(1 answer)
Closed 9 years ago.
After select returns with the write fd set for a TCP socket, if I try to send data on that socket, what is the minimum guaranteed amount of data that will be sent in one call to the send API? I understand that I have to run a loop to make sure all the data is sent. Still, I want to understand what the minimum guaranteed amount sent is, and why.
This has come up before. I'm still searching for the referenced answer.
Let's start with the function prototype for send()
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
For blocking TCP sockets - all the documentation suggests that send() and write() will return a value in [1..len], unless there was an error. However, in reality no one I know has ever observed send() returning anything other than -1 (error) or just len in the success case, indicating all of buf was sent in one call. I've never felt good about this, so I code defensively and just put my blocking send calls in a loop until the entire buffer is sent.
For non-blocking TCP sockets - you should just code as if the minimum was "1" (or -1 on error). Don't make any assumptions about a minimum data size.
And for recv(), you should always assume recv() will return some random value between 1..len in the success case, or 0 (closed), or -1 (error). Don't EVER assume recv will return a full buffer.
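The defensive blocking-send loop mentioned above might be sketched as:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling send() until the whole buffer has gone out or a real
 * error occurs; retries on EINTR, never assumes a minimum size. */
ssize_t send_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = send(fd, buf + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted: just retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```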
From the official POSIX reference:
A descriptor shall be considered ready for writing when a call to an output function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully.
As you can see, it doesn't actually mention any size, nor that the write will even be successful, just that you will be able to write to the socket without blocking.
So the answer is that there is no minimum guaranteed size.
When select() indicates that a socket is writable it is guaranteed you can transfer at least one byte without incurring EWOULDBLOCK or EAGAIN. Nothing to say you won't incur a different error though :-)
I don't think you have any control over that at the POSIX layer. send() will accept whatever you pass and copy it into the internal buffers of the TCP/IP stack implementation. What happens next is already another story, which you can watch over using the same select() call.
If you are asking about the size that will fit in one packet, that's the MTU (but this assumption should also be used with care, in view of fragmentation and reassembly of packets along the way).
UPD:
I will answer your comment here. No, you shouldn't bother about fragmentation at all; leave it to the TCP/IP stack. There are a lot of reasons why you shouldn't do this; here is one example. Your application works at the application layer (7) of the OSI model (and although I consider the OSI model an evil thing in most cases, it really is applicable to this example). From that layer you would be trying to affect the functionality/properties of logic which lives at much lower layers (session/transport). You shouldn't do this. POSIX calls like send() and recv() are designed to give your application the ability to instruct the layers underneath that you need to pass a certain amount of data, and to give you a way to monitor the execution of that command (select()); that's all you have to do. The lower layers are supposed to do their best to deliver the data you give them in the most optimal way, depending on OS network settings etc.
UPD2: Everything above mostly concerns NON-BLOCKING sockets; sorry, I forgot to mention this, as I haven't used blocking sockets in my projects for ages. In case your socket is blocking, I would still consider passing everything at once and just waiting for the operation's result in another thread, for example, because trying to optimise this can lead to very OS/driver-dependent code.
The basic code sequence I'm interesting for is (pseudocode)
sendto(some host); // host may be unreachable for now which is normal
...
if(select(readfs, timeout)) // there are some data to read
recvfrom();
Since Win2000, the ICMP packet that is sent back after a UDP datagram is sent to an unreachable port triggers select; after that, recvfrom fails with WSAECONNRESET. This behaviour isn't desirable for me, because I want select to finish with a timeout in this case (there is no data to read). On Windows this can be solved with the WSAIoctl SIO_UDP_CONNRESET (http://support.microsoft.com/kb/263823).
My questions are:
Is SIO_UDP_CONNRESET the best way in this situation?
Are there other methods to ignore the ICMP for select, or to filter it for recvfrom (maybe ignoring the WSAECONNRESET error on Windows and treating it like a timeout; can this error be triggered in some other case)?
Are there similar issues on Linux and Unix (Solaris, OpenBSD)?
select()'s readfds set really just reports that a read() on the socket won't block -- it doesn't promise anything about whether or not there is actual data available to read.
I don't know what specifically you're trying to accomplish with the two-second timeout rather than just sleeping forever -- nor why you can't just add an if block to check for WSAECONNRESET from recvfrom() -- but it feels like you've got an overly-complicated design if it doesn't handle this case well.
The select_tut(2) manpage on many Linux systems has some guidelines for properly using select(). Here's several rules that seem most apropos to your situation:
1. You should always try to use select() without a timeout.
Your program should have nothing to do if there is no
data available. Code that depends on timeouts is not
usually portable and is difficult to debug.
...
3. No file descriptor must be added to any set if you do not
intend to check its result after the select() call, and
respond appropriately. See next rule.
4. After select() returns, all file descriptors in all sets
should be checked to see if they are ready.
I'm writing a program in Linux to interface, over serial, with a piece of hardware. The device sends packets of approximately 30-40 bytes at about 10 Hz. This software module will interface with others and communicate via IPC, so it must perform a specific IPC sleep to allow it to receive the messages it's subscribed to when it isn't doing anything useful.
Currently my code looks something like:
while(1){
IPC_sleep(some_time);
read_serial();
process_serial_data();
}
The problem with this is that sometimes the read is performed while only a fraction of the next packet is available at the serial port, which means it isn't all read until the next time around the loop. For this application it is preferable that the data is read as soon as it's available, and that the program doesn't block while reading.
What's the best solution to this problem?
The best solution is not to sleep! What I mean is: a good solution is probably to mix the IPC event and the serial event. select is a good tool to do this. Then you have to find an IPC mechanism that is select-compatible.
socket-based IPC is select()able
pipe-based IPC is select()able
POSIX message queues are also select()able
And then your loop looks like this
while(1) {
    select(serial_fd | ipc_fd); // of course this is pseudo code
    if(FD_ISSET(serial_fd, &fd_set)) {
        parse_serial(serial_fd, serial_context);
        if(complete_serial_message)
            process_serial_data(serial_context);
    }
    if(FD_ISSET(ipc_fd, &fd_set)) {
        do_ipc();
    }
}
read_serial is replaced with parse_serial because, if you spend all your time waiting for a complete serial packet, all the benefit of select is lost. But from your question it seems you are already doing that, since you mention getting the serial data across two different iterations of the loop.
With the proposed architecture you get good reactivity on both the IPC and the serial side. You read serial data as soon as it is available, without stopping to process IPC.
Of course this assumes you can change the IPC mechanism. If you can't, perhaps you can make a "bridge process" that interfaces on one side with whatever IPC you are stuck with, and on the other side uses a select()able IPC to communicate with your serial code.
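A concrete single-shot version of the select() pseudocode above might look like this (the bitmask return convention is my own invention for the sketch; the caller dispatches to parse_serial() when bit 0 is set and to do_ipc() when bit 1 is set):

```c
#include <sys/select.h>
#include <unistd.h>

/* Wait until the serial fd or the IPC fd is readable.  Returns a
 * bitmask (bit 0 = serial ready, bit 1 = IPC ready), 0 on timeout,
 * -1 on error.  A negative timeout_ms means wait forever. */
int wait_serial_or_ipc(int serial_fd, int ipc_fd, int timeout_ms)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(serial_fd, &rfds);
    FD_SET(ipc_fd, &rfds);

    int maxfd = serial_fd > ipc_fd ? serial_fd : ipc_fd;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
    int n = select(maxfd + 1, &rfds, NULL, NULL,
                   timeout_ms < 0 ? NULL : &tv);
    if (n <= 0)
        return n;                      /* 0 = timeout, -1 = error */

    int mask = 0;
    if (FD_ISSET(serial_fd, &rfds)) mask |= 1;
    if (FD_ISSET(ipc_fd, &rfds))    mask |= 2;
    return mask;
}
```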
Store away what you got so far of the message in a buffer of some sort.
If you don't want to block while waiting for new data, use something like select() on the serial port to check that more data is available. If not, you can continue doing some processing or whatever needs to be done instead of blocking until there is data to fetch.
When the rest of the data arrives, add to the buffer and check if there is enough to comprise a complete message. If there is, process it and remove it from the buffer.
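That store-and-check approach can be sketched as follows. I've assumed a newline-terminated framing purely for illustration; the real device's packet format will differ, but the buffering logic is the same:

```c
#include <string.h>

#define BUFCAP 256

/* Accumulator for a (hypothetical) newline-terminated protocol:
 * feed() appends whatever bytes arrived; next_msg() extracts one
 * complete message if the buffer holds one. */
struct accum {
    char buf[BUFCAP];
    size_t len;
};

void feed(struct accum *a, const char *data, size_t n)
{
    if (a->len + n > BUFCAP)
        n = BUFCAP - a->len;           /* drop overflow in this sketch */
    memcpy(a->buf + a->len, data, n);
    a->len += n;
}

/* Returns 1 and copies a NUL-terminated message into out if a complete
 * one is buffered, 0 if we must keep waiting for more bytes. */
int next_msg(struct accum *a, char *out, size_t outcap)
{
    char *nl = memchr(a->buf, '\n', a->len);
    if (!nl)
        return 0;                      /* incomplete: keep buffering */

    size_t mlen = (size_t)(nl - a->buf);
    if (mlen >= outcap)
        mlen = outcap - 1;
    memcpy(out, a->buf, mlen);
    out[mlen] = '\0';

    size_t rest = a->len - (size_t)(nl - a->buf) - 1;
    memmove(a->buf, nl + 1, rest);     /* keep the partial next message */
    a->len = rest;
    return 1;
}
```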
You must buffer enough of a message to know whether or not it is, or will become, a complete valid message.
If it is not valid, or won't be complete within an acceptable timeframe, then you toss it. Otherwise you keep it and process it.
This is typically called implementing a parser for the device's protocol.
This is the algorithm (blocking) that is needed:
while(! complete_packet(p) && time_taken < timeout)
{
p += reading_device.read(); //only blocks for t << 1sec.
time_taken.update();
}
//now you have a complete packet or a timeout.
You can intersperse a callback if you like, or inject relevant portions in your processing loops.