The basic code sequence I'm interested in is (pseudocode):
sendto(some host); // host may be unreachable for now which is normal
...
if (select(readfds, timeout)) // there is data to read
    recvfrom();
Since Windows 2000, the ICMP "port unreachable" packet that comes back after sending a UDP datagram to an unreachable port triggers select; the subsequent recvfrom then fails with WSAECONNRESET. This behaviour isn't desirable for me, because I want select to finish with a timeout in this case (there is no data to read). On Windows this can be solved with the WSAIoctl SIO_UDP_CONNRESET control code ( http://support.microsoft.com/kb/263823 ).
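For reference, a minimal sketch of that ioctl, based on the KB article above (error handling omitted; sock is an already-created UDP socket, and newer SDKs may already define SIO_UDP_CONNRESET in mstcpip.h):

#include <winsock2.h>
#include <mstcpip.h>

#ifndef SIO_UDP_CONNRESET
#define SIO_UDP_CONNRESET _WSAIOW(IOC_VENDOR, 12)
#endif

/* Stop WSAECONNRESET being reported for incoming ICMP
   "port unreachable" messages on a UDP socket. */
static int disable_udp_connreset(SOCKET sock)
{
    BOOL  bNewBehavior    = FALSE;
    DWORD dwBytesReturned = 0;
    return WSAIoctl(sock, SIO_UDP_CONNRESET,
                    &bNewBehavior, sizeof(bNewBehavior),
                    NULL, 0, &dwBytesReturned, NULL, NULL);
}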
My questions are:
Is SIO_UDP_CONNRESET the best way in this situation?
Are there other ways to make select ignore the ICMP, or to filter it out before recvfrom (for example, ignoring the WSAECONNRESET error on Windows and treating it like a timeout; can this error be triggered in any other case)?
Are there similar issues on Linux and Unix (Solaris, OpenBSD)?
select()'s readfds set really just reports that a read() on the socket won't block -- it doesn't promise anything about whether or not there is actual data available to read.
I don't know what specifically you're trying to accomplish with the two-second timeout rather than just sleeping forever -- nor why you can't just add an if block to check for WSAECONNRESET from recvfrom() -- but it feels like you've got an overly-complicated design if it doesn't handle this case well.
The select_tut(2) manpage on many Linux systems has some guidelines for using select() properly. Here are several rules that seem most apropos to your situation:
1. You should always try to use select() without a timeout.
Your program should have nothing to do if there is no
data available. Code that depends on timeouts is not
usually portable and is difficult to debug.
...
3. No file descriptor must be added to any set if you do not
intend to check its result after the select() call, and
respond appropriately. See next rule.
4. After select() returns, all file descriptors in all sets
should be checked to see if they are ready.
I don't think this question is new: I have a thread which should read both from an X server (via XCB) and from another server connected over TCP, so calling select is needed.
What confuses me is this: when the program returns from select and it turns out there is data on the X server link, what if that data is not enough for a complete XCB event? In that case xcb_poll_for_event() returns NULL, but when the program calls select again it does not block, because there is some data after all, so the program is trapped in a busy-waiting state.
Is this a valid concern? I believe so because each XCB event is composed of many bytes and the server may be interrupted during sending.
How about setting SO_RCVLOWAT on the XCB fd to the required size of an XCB event using setsockopt()? The socket's file descriptor will then only select as readable when there is at least that amount of data ready to read. This is a normal approach when dealing with TCP servers, though I haven't tried it with XCB fds.
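A minimal sketch of that setsockopt call (xcb_fd would come from xcb_get_file_descriptor(); the 32-byte low-water mark is an assumption based on the classic fixed X11 event size, so check what your case actually needs):

#include <sys/socket.h>
#include <stdio.h>

/* Ask the kernel not to report the socket as readable until at
   least low_water bytes are available. */
int low_water = 32;
if (setsockopt(xcb_fd, SOL_SOCKET, SO_RCVLOWAT,
               &low_water, sizeof(low_water)) < 0) {
    perror("setsockopt(SO_RCVLOWAT)");
}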
This question already has an answer here:
Why is it assumed that send may return with less than requested data transmitted on a blocking socket?
After select returns with the write fd set for a TCP socket, if I try to send data on that socket, what is the minimum guaranteed amount of data sent at once by the send API? I understand that I have to run a loop to make sure all the data is sent. Still, I want to understand what the guaranteed minimum is, and why.
This has come up before. I'm still searching for the referenced answer.
Let's start with the function prototype for send()
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
For blocking TCP sockets - all the documentation suggests that send() and write() will return a value between [1..len], unless there was an error. In reality, though, no one I know has ever observed send() returning anything other than -1 (error) or len itself in the success case, indicating that all of buf was sent in one call. I've never felt good about relying on this, so I code defensively and just put my blocking send calls in a loop until the entire buffer is sent.
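The defensive loop I mean looks roughly like this (send_all is just a hypothetical helper name, not a standard call):

#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>

/* Keep calling send() until the whole buffer has gone out.
   Returns 0 on success, -1 on error. */
static int send_all(int sockfd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(sockfd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;           /* real error */
        }
        p   += n;
        len -= (size_t)n;
    }
    return 0;
}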
For non-blocking TCP sockets - you should just code as if the minimum was "1" (or -1 on error). Don't make any assumptions about a minimum data size.
And for recv(), you should always assume recv() will return some random value between 1..len in the success case, or 0 (closed), or -1 (error). Don't EVER assume recv will return a full buffer.
From the official POSIX reference:
A descriptor shall be considered ready for writing when a call to an output function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully.
As you can see, it doesn't actually mention any size, nor even that the write will be successful, just that you will be able to write to the socket without blocking.
So the answer is that there is no minimum guaranteed size.
When select() indicates that a socket is writable it is guaranteed you can transfer at least one byte without incurring EWOULDBLOCK or EAGAIN. Nothing to say you won't incur a different error though :-)
I don't think you have any control over that at the POSIX layer. send() will accept whatever you pass into the internal buffers of the TCP/IP stack implementation, and what happens next is another story, which you can watch over using the same select() call.
If you are asking about the size that will fit in one packet, that's the MTU (though this too should be used with care, given fragmentation and reassembly of packets along the way).
UPD:
I will answer your comment here. No, you shouldn't worry about fragmentation at all; leave it to the TCP/IP stack. There are a lot of reasons why you shouldn't try, but here is one as an example. Your application works at the Application (7) layer of the OSI model (although I consider the OSI model an evil thing in most cases, it really is applicable here), and from that layer you would be trying to influence logic that lives at much lower layers (Session/Transport). You shouldn't do this. POSIX calls like send() and recv() are designed to give your application a way to instruct the layers underneath to pass a certain amount of data, plus a way to monitor the execution of that command (select()); that's all you have to do. The lower layers are supposed to do their best to deliver the data you hand them in the most optimal way, depending on OS network settings and so on.
UPD2: everything above mostly concerns non-blocking sockets. Sorry, I forgot to mention this; I haven't used blocking sockets in my projects for ages. If your socket is blocking, I would still consider passing everything at once and simply waiting for the result of the operation in another thread, for example, because trying to optimise this yourself can lead to very OS- and driver-dependent code.
I am learning to use SO_SNDTIMEO and SO_RCVTIMEO to check for timeouts.
They are easy to use on the read side, but when I try to check a write timeout, write always returns successfully. Here is what I did (all in blocking mode):
close the client read socket and exit before the server starts writing
terminate the client before the server starts writing
unplug the server's cable after accept but before write
Well, it seems that in all these cases write just returns successfully.
I think the reason is that the port is a resource managed by the OS: at the client side, after the program is gone, the TCP connection still shows the FIN_WAIT2 state.
So, is there any convenient way to simulate cases in which write receives errors such as EPIPE or EAGAIN?
How to get the error EAGAIN?
To get the error EAGAIN, you need to be using non-blocking sockets. With non-blocking sockets, you need to write a huge amount of data (and stop receiving data on the peer side), so that your internal TCP buffer fills up and send returns this error.
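A rough sketch of forcing EAGAIN this way (sockfd is assumed to be a connected TCP socket whose peer has stopped reading; the chunk size is arbitrary):

#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <stdio.h>

static void fill_until_eagain(int sockfd)
{
    /* switch the socket to non-blocking mode */
    fcntl(sockfd, F_SETFL, fcntl(sockfd, F_GETFL, 0) | O_NONBLOCK);

    char chunk[4096];
    memset(chunk, 'x', sizeof(chunk));

    for (;;) {
        ssize_t n = send(sockfd, chunk, sizeof(chunk), 0);
        if (n >= 0)
            continue;                         /* data queued, keep going */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            printf("kernel send buffer is full: EAGAIN\n");
            break;
        }
        perror("send");                       /* some other failure */
        break;
    }
}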
How to get the error EPIPE?
To get the error EPIPE, you need to send a large amount of data after closing the socket on the peer side. You can get more info about the EPIPE error from this SO link. I had asked a question about the Broken Pipe error in the link provided, and the accepted answer gives a detailed explanation. It is important to note that to get the EPIPE error you should have set the flags parameter of send to MSG_NOSIGNAL. Without that, an abnormal send can generate the SIGPIPE signal.
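A sketch of the idea (the peer is assumed to have already closed its socket; typically the first send after that is still accepted and provokes an RST, and the following send then fails with EPIPE, though the exact timing is OS-dependent):

#include <sys/socket.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>

/* Assumes the remote end has already closed its end of the connection. */
static void provoke_epipe(int sockfd)
{
    const char msg[] = "hello";

    /* Usually accepted: the data is buffered and the peer answers with RST. */
    send(sockfd, msg, sizeof(msg), MSG_NOSIGNAL);
    sleep(1);  /* give the RST time to arrive */

    /* Writing on a connection that has received an RST fails with EPIPE;
       MSG_NOSIGNAL suppresses SIGPIPE so we see the errno instead. */
    if (send(sockfd, msg, sizeof(msg), MSG_NOSIGNAL) < 0 && errno == EPIPE)
        printf("got EPIPE\n");
}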
Additional Note
Please note that it is difficult to simulate a write failure, as TCP generally stores the data that you are trying to write in its internal buffer. So, if the internal buffer has sufficient space, you won't get an error immediately. The best way is to try to write a huge amount of data. You can also try setting a smaller send buffer size using the setsockopt function with the SO_SNDBUF option.
You can simulate errors using fault injection. For example, libfiu is a fault injection library that comes with an example project that allows you to simulate errors from POSIX functions. Basically it uses LD_PRELOAD to inject a wrapper around the regular system calls (including write), and then the wrapper can be configured to either pass through to the real system call, or return whatever error you like.
You could set the receive buffer size to be really small on one side and send a large buffer from the other. Or set the send buffer small on one side and try to send a large message.
Otherwise the most common test (I think) is to let the server and client talk for a while, and then remove a network cable.
i have the following code:
{
send(dstSocket, rcvBuffer, recvMsgSize, 0);
sndMsgSize = recv(dstSocket, sndBuffer, RCVBUFSIZE, 0);
send(rcvSocket, sndBuffer, sndMsgSize, 0);
recvMsgSize = recv(rcvSocket, rcvBuffer, RCVBUFSIZE, 0);
}
which should eventually become part of a generic TCP proxy. As it stands, it doesn't work quite correctly, since recv() waits for input, so the data only gets transmitted in chunks, depending on where it currently is.
What I've read about this is that I need something like "non-blocking sockets" and a mechanism to monitor them. On Linux that mechanism is select, poll or epoll. Could anyone confirm that I am on the right track here? Or could this exercise also be done with blocking sockets?
Regards
You are on the right track.
"select" and "poll" are system calls where you can pass in one or more sockets and block (for a specific amount of time) until data has been received (or ready for sending) on one of those sockets.
"non-blocking sockets" is a setting you can apply to a socket (or a recv call flag) such that if you try to call recv, but no data is available, the call will return immediately. Similar semantics exist for "send". You can use non-blocking sockets with or without the select/poll method described above. It's usually not a bad idea to use non-blocking operations just in case you get signaled for data that isn't there.
"epoll" is a highly scalable version of select and poll. A "select" set is actually limited to something like 64-256 sockets for monitoring at a time and it takes a perf hit as the number of monitored sockets goes up. "epoll" can scale up to thousands of simultaneous network connections.
Yes, you are on the right track. Use non-blocking sockets, passing their file descriptors to select (see FD_SET()).
This way select will monitor them for events (read/write).
When select returns you can check which fd an event occurred on (look at FD_ISSET()) and handle it.
You can also set a timeout on select, and it will return after that period even if no events have occurred.
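A small sketch of that pattern for a single descriptor (the two-second timeout and the fd are just placeholders):

#include <sys/select.h>
#include <stdio.h>

/* Wait up to two seconds for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };

    int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)  { perror("select"); return -1; }
    if (rc == 0) { printf("timeout\n"); return 0; }

    return FD_ISSET(fd, &readfds) ? 1 : 0;
}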
Yes, you'll have to use one of those mechanisms. poll is portable and, in my opinion, the easiest one to use. You don't have to turn off blocking in this case, provided you use a small enough value for RCVBUFSIZE (around 2k-10k should be appropriate). Non-blocking sockets are a bit more complicated to handle, since if you get EAGAIN on send, you can't just loop to try again (well, you can, but you shouldn't, since it uses CPU unnecessarily).
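A rough poll-based shape for the two sockets of the proxy (srcSock/dstSock and the buffer size are placeholders, and short writes are ignored for brevity):

#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

#define RCVBUFSIZE 4096

/* One pass of a poll loop over both ends of the proxy. */
static void proxy_once(int srcSock, int dstSock)
{
    struct pollfd fds[2] = {
        { .fd = srcSock, .events = POLLIN },
        { .fd = dstSock, .events = POLLIN },
    };
    char buf[RCVBUFSIZE];

    if (poll(fds, 2, -1) < 0) { perror("poll"); return; }

    if (fds[0].revents & POLLIN) {            /* client -> server */
        ssize_t n = recv(srcSock, buf, sizeof(buf), 0);
        if (n > 0) send(dstSock, buf, (size_t)n, 0);
    }
    if (fds[1].revents & POLLIN) {            /* server -> client */
        ssize_t n = recv(dstSock, buf, sizeof(buf), 0);
        if (n > 0) send(srcSock, buf, (size_t)n, 0);
    }
}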
But I would recommend using a wrapper such as libevent. In that case a struct bufferevent would work particularly well. It makes a callback when new data is available, and you just queue it up for sending on the other socket.
I tried to find a bufferevent example, but they seem to be in short supply. The documentation is here anyway: http://monkey.org/~provos/libevent/doxygen-2.0.1/index.html
Hi, I'm writing a simple HTTP port forwarder. I read data from port 80 and pass it to my lighttpd server on port 8080.
As long as I write() data to the socket on port 8080 (forwarding the request) there's no problem, but when I read() data from that socket (forwarding the response), the last read() hangs for quite a while (about 1 or 2 seconds) before realizing there's no more data and returning 0.
I tried setting the socket to non-blocking, but that doesn't work, as it sometimes returns EWOULDBLOCK even though there is some data left (lighttpd + CGI can be quite slow).
I tried setting a timeout with select(), but, as above, a slow CGI could make the socket time out when there is actually still some data to transmit.
Update: SOLVED. It was the keepalive after all. After I disabled it in my lighttpd configuration file, the whole thing runs flawlessly.
Well, for the sake of completion, and as per my comment:
It is likely that the HTTP server itself (lighttpd in your case) is maintaining a persistent connection to your proxy because your proxy relayed a header containing “Connection: keep-alive”. This header aids when the client wants to make multiple requests over the same connection. So, because lighttpd received this header, it assumed it was going to receive further requests and kept the socket open, causing read to block in your proxy.
Disabling keep-alive in your lighttpd configuration is one way to fix it, but you could also strip the "Connection: keep-alive" header before you relay the request to your web server.
Using both non-blocking sockets and select is the right way to go. Getting EWOULDBLOCK doesn't mean that the entire stream of data has finished being received; it means that, at this instant, there is nothing to read. That's exactly what you want, because it means that read won't wait even half a second for more data to show up. If the data isn't immediately available, it will simply return.
Now, obviously, this means you will need to call read multiple times to get the complete data. The general format for doing this is a select loop. In pseudocode:
do
select ( my_sockets )
if ( select error )
handle_error
else
for each ( socket in my_sockets ) do
if ( socket is ready ) then
nonblocking read from socket
if ( no data was read ) then
close socket
remove socket from my_sockets
endif
endif
loop
endif
loop
The idea is that select will tell you which sockets have data available for reading right now. If you read one of those sockets, you are guaranteed either to get data or to get a return value of 0, indicating that the remote end closed the socket.
If you use this method, you will never be stuck in a read call that is not reading data, for any length of time. The blocking operation is the select call, and you can also select over writeable sockets if you need to write, and set a timeout if you need to do things periodically.
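In C, the body of that loop for one ready socket might look like this sketch (fd is assumed to already be non-blocking; what you do with the received data is up to you):

#include <unistd.h>
#include <errno.h>

/* Called when select() reported fd as readable.
   Returns 1 to keep the socket, 0 if the peer closed, -1 on error. */
static int drain_socket(int fd)
{
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));

    if (n > 0)  return 1;                     /* got data in buf[0..n) */
    if (n == 0) return 0;                     /* peer closed the connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 1;                             /* spurious wakeup: nothing to read */
    return -1;                                /* real error */
}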
Don't do that!
Keepalives boost performance for other clients. Instead, fix your client: send a Connection: close header in your client and make sure your request doesn't claim HTTP/1.1 compliance. (If for no other reason than that you probably don't handle chunked encoding either.)
I guess I would use non-blocking I/O to its full extent. Instead of setting timeouts, I'd rather wait for events:
while(select(...)) {
switch(...) {
case ...: // Handle accepting new connection
case ...: // Handle reading from socket
...
}
}
A single-threaded, blocking forwarder will cause problems anyway with multiple clients.
Sorry, I don't remember the exact calls. It can also be awkward in some cases (IIRC you need to handle writes as well), but there are libraries which simplify the task.