Non-Blocking socket and poll() quirks in proxy - newbie - c

I am a newbie dabbling in C, and my little project is to write a simple SOCKS4 proxy. Thanks to the help here, I got as far as using non-blocking sockets and poll() for my routine. However, at this point I seem to have two problems:
The outgoing socket dstSocket doesn't get closed if the incoming socket rcvSocket gets closed, and vice versa. I don't check for this in the loop, and I don't know how to. I tried checking for POLLHUP in revents, but that doesn't seem to do the trick. A normal check seems to be whether recv() returns 0, but is that also valid for non-blocking sockets? And if so, how does that work with revents? I can't figure out where I would put this check, since it seems to me that recv() should never return 0 if POLLIN | POLLPRI are set. Also, I don't understand the exact difference between POLLIN and POLLPRI; both seem to be just a check for "data is available for reading".
The proxy seems to work for connections I tested with netcat. However, if I use a browser and target a website, it asks whether I want to save "binary data". I checked the data in Wireshark, and what is received from the server appears to be forwarded to the client correctly, byte by byte. If anyone has an idea why that could happen with this program, it would be nice :)
Attached is the relevant code (beware, programming newbie):
fds[1].fd = dstSocket;
fds[0].fd = rcvSocket;
fds[1].events = POLLIN | POLLPRI | POLLHUP;
fds[0].events = POLLIN | POLLPRI | POLLHUP;
timer = poll(fds, 2, timeout_msecs); /* i dont use this yet */
fcntl(rcvSocket, F_SETFL, O_NONBLOCK);
fcntl(dstSocket, F_SETFL, O_NONBLOCK);
while (1 == 1)
{
    if (fds[0].revents & POLLIN | POLLPRI)
    {
        recvMsgSize = recv(rcvSocket, rcvBuffer, RCVBUFSIZE, 0);
        if (recvMsgSize > 0) {send(dstSocket, rcvBuffer, recvMsgSize, 0);}
    }
    if (fds[1].revents & POLLIN | POLLPRI)
    {
        sndMsgSize = recv(dstSocket, sndBuffer, RCVBUFSIZE, 0);
        if (sndMsgSize > 0) { send(rcvSocket, sndBuffer, sndMsgSize, 0);}
    }
    if ((fds[0].revents & POLLHUP) || (fds[1].revents & POLLHUP))
    {
        close(rcvSocket);
        close(dstSocket);
    }
}

recv() returns 0 on a clean remote shutdown - this is true even for nonblocking sockets. In this case, POLLIN will be returned - the notification that the remote side has closed the socket is considered a "readable" event.
You shouldn't need to use POLLPRI for SOCKS / HTTP connections - it indicates TCP "urgent data", which isn't used by those protocols (or indeed, much used at all).
Apart from your direct questions, you need to do more to implement a reliable proxy:
You need to be calling poll() on every loop, not just once. The way you have it written it is busy-looping, which is generally not considered acceptable practice.
You should be setting the disposition of SIGPIPE to ignored with signal(SIGPIPE, SIG_IGN);. This allows you to gracefully handle write failures.
You should be checking the result of send(). Note that it can write less than the amount you requested - in this case, you will have to keep the unsent data buffered, return to the poll() and try sending the remaining data again if POLLOUT is raised on the socket. You only want to request POLLOUT if there is unsent data remaining, so you need to make sure .events is set correctly before every poll() call.
You should be checking errno if recv() or send() returns a value less than 0. EINTR and EWOULDBLOCK should be ignored; any other error should be treated as a connection failure.
You should not be closing both directions immediately when one socket closes - you must support asymmetric shutdowns. This means that when you see that fds[0] has been closed by the remote end, you should call shutdown(fds[1].fd, SHUT_WR);, and vice-versa; only when both have been shut down (or a connection failure has occurred) should you call close() on both file descriptors and finish up.
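To make the points above concrete, here is a minimal sketch of such a relay loop in C. It is an illustration under assumptions, not a drop-in fix: the names relay, pump, soft_error and struct pipe_dir are made up for this example, each direction holds a single 4096-byte buffer, and error handling is reduced to giving up on the first hard error. It calls poll() on every iteration, requests POLLOUT only while unsent data is pending, ignores SIGPIPE, treats EINTR/EAGAIN/EWOULDBLOCK as non-errors, and passes EOF on with shutdown(..., SHUT_WR) instead of closing both sockets at once.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* One direction of the relay (hypothetical helper type); buf[off..len) is unsent. */
struct pipe_dir {
    char   buf[4096];
    size_t len, off;
    int    eof;   /* recv() on the source returned 0 */
    int    done;  /* EOF was passed on with shutdown(SHUT_WR) */
};

static int soft_error(void)
{
    return errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK;
}

/* Move what we can from `from` to `to`; returns -1 on a hard error. */
static int pump(struct pipe_dir *d, int from, int to)
{
    if (d->len == 0 && !d->eof) {
        ssize_t n = recv(from, d->buf, sizeof d->buf, 0);
        if (n > 0)
            d->len = (size_t)n;
        else if (n == 0)
            d->eof = 1;
        else if (!soft_error())
            return -1;
    }
    if (d->len > 0) {
        ssize_t n = send(to, d->buf + d->off, d->len - d->off, 0);
        if (n > 0)
            d->off += (size_t)n;      /* may be a partial write */
        else if (n < 0 && !soft_error())
            return -1;
        if (d->off == d->len)
            d->off = d->len = 0;      /* buffer fully flushed */
    }
    if (d->eof && d->len == 0 && !d->done) {
        shutdown(to, SHUT_WR);        /* half-close: the other direction stays open */
        d->done = 1;
    }
    return 0;
}

/* Relay between sockets a and b until both directions have seen EOF. */
int relay(int a, int b)
{
    struct pipe_dir ab = { .len = 0 }, ba = { .len = 0 };   /* a->b and b->a */
    struct pollfd fds[2];

    signal(SIGPIPE, SIG_IGN);   /* a failed send() now reports EPIPE instead of killing us */
    fcntl(a, F_SETFL, fcntl(a, F_GETFL) | O_NONBLOCK);
    fcntl(b, F_SETFL, fcntl(b, F_GETFL) | O_NONBLOCK);

    while (!(ab.done && ba.done)) {
        /* POLLIN only while the buffer is free, POLLOUT only while data is pending. */
        short ea = (short)(((!ab.eof && ab.len == 0) ? POLLIN : 0) | (ba.len ? POLLOUT : 0));
        short eb = (short)(((!ba.eof && ba.len == 0) ? POLLIN : 0) | (ab.len ? POLLOUT : 0));

        fds[0].fd = ea ? a : -1;      /* poll() skips negative descriptors */
        fds[0].events = ea;
        fds[1].fd = eb ? b : -1;
        fds[1].events = eb;
        if (poll(fds, 2, -1) < 0 && errno != EINTR)
            return -1;
        if (pump(&ab, a, b) < 0 || pump(&ba, b, a) < 0)
            return -1;
    }
    return 0;
}
```

Calling relay(rcvSocket, dstSocket) returns 0 once both directions have seen EOF and -1 on a connection failure; in either case the caller would then close() both descriptors.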

Related

Is it possible to do epoll on accept event?

Let's suppose I've created a listening socket:
sock = socket(...);
bind(sock,...);
listen(sock, ...);
Is it possible to do epoll_wait on sock to wait for incoming connection? And how do I get client's socket fd after that?
The thing is on the platform I'm writing for sockets cannot be non-blocking, but there is working epoll implementation with timeouts, and I need to accept connection and work with it in a single thread so that it doesn't hang if something goes wrong and connection doesn't come.
Without knowing what this non-standard platform is it's impossible to know exactly what semantics they gave their epoll call. But on the standard epoll on Linux, a listening socket will be reported as "readable" when an incoming connection arrives, and then you can accept the connection by calling accept. If you leave the socket in blocking mode, and always check for readability using epoll's level-triggered mode before each call to accept, then this should work – the only risk is that if you somehow end up calling accept when no connection has arrived, then you'll get stuck. For example, this could happen if there are two processes sharing a listening socket, and they both try to accept the same connection. Or maybe it could happen if an incoming connection arrives, and then is closed again before you call accept. (Pretty sure in this case Linux still lets the accept succeed, but this kind of edge case is exactly where I'd be suspicious of a weird platform doing something weird.) You'd want to check these things.
Non-blocking mode is much more reliable because in the worst case, accept just reports that there's nothing to accept. But if that's not available, then you might be able to get away with something like this...
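A minimal sketch of that approach in C, assuming standard Linux epoll semantics (the asker's platform may behave differently). The function name accept_with_timeout and its return convention are invented for this example. The listening socket stays in blocking mode and is checked with level-triggered epoll right before the single accept() call:

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for a connection on listen_fd.
 * Returns the accepted client fd, -1 on error, or -2 if the timeout elapsed.
 */
int accept_with_timeout(int listen_fd, int timeout_ms)
{
    int client = -1;
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    /* Level-triggered EPOLLIN: a pending connection makes the socket "readable". */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev) == 0) {
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, timeout_ms);
        if (n == 1 && (out.events & EPOLLIN))
            client = accept(listen_fd, NULL, NULL);   /* the new client's fd */
        else if (n == 0)
            client = -2;                              /* nothing arrived in time */
    }
    close(ep);
    return client;
}
```

A long-running server would normally create the epoll instance once and reuse it; creating it per call just keeps the sketch short. The caveat from the answer still applies: if the connection disappears between epoll_wait and accept, a blocking accept may hang.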
Since this answer is the first result on DuckDuckGo, I will just chime in with what I found under GNU/Linux 4.18.0-18-generic (Ubuntu 18.10).
To asynchronously accept an incoming connection, one has to watch for the errno value EWOULDBLOCK (11) and then add the socket to the epoll read set.
Here is a small shot of scheme code that achieves that:
(define (accept fd)
  (let ((out (socket:%accept fd 0 0)))
    (if (= out -1)
        (let ((code (socket:errno)))
          (if (= code EWOULDBLOCK)
              (begin
                (abort-to-prompt fd 'read)
                (accept fd))
              (error 'accept (socket:strerror code))))
        out)))
In the above (abort-to-prompt fd 'read) will pause the coroutine and add fd to epoll read set, done as follow:
(epoll-ctl epoll EPOLL-CTL-ADD fd (make-epoll-event-in fd)))
When the coroutine is unpaused, the code proceeds after the abort and calls itself recursively (in tail-call position).
The actual code I am working on in Scheme is a bit more involved, since I rely on call/cc to avoid callbacks. The full code is at SourceHut.
That is all.

Non Blocking recv() in C Sockets

I am using an infinite loop in sockets in which if it receives some data it should receive it or if it wants to send data it sends. Something like given below. I am using select. I have only one socket sd.
fd_set readsd;
int maxsd = readsd +1;
// all those actions of setting maxsd to the maximum fd +1 and FDSETing the FDs.
while(1)
{
    FD_ZERO(&read_sd);
    FD_SET(sd, &read_sd);
    if(FD_ISSET(sd, &readsd))
    {
        //recv call
    }
    else
    {
        //send call
    }
}
As far as I know, select selects one of the socket descriptors on which data arrives first. But here I have only one socket, and I want to recv if there is some data or I want to send otherwise.
In that case, is the code given above fine? Or there is some other option for me which I don't know about?
In that case, is the code given above fine?
I don't see any call to select. Also, if "maxsd" is meant to be the first argument of select, its value is wrong: it must be the biggest file descriptor + 1. Anyway, you could simply call recv with the flag MSG_DONTWAIT, in which case it will return an error if there is no data to read.
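A minimal sketch of the MSG_DONTWAIT approach, assuming a platform that supports the flag (Linux and the BSDs do). The wrapper name try_recv and the -2 return value are conventions made up for this example:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * One non-blocking receive attempt on an otherwise blocking socket.
 * Returns bytes read (> 0), 0 if the peer closed the connection,
 * -1 on a real error, or -2 if no data is waiting right now.
 */
ssize_t try_recv(int sd, void *buf, size_t len)
{
    ssize_t n = recv(sd, buf, len, MSG_DONTWAIT);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;   /* nothing to read: the caller can send instead */
    return n;
}
```

In the loop from the question, a return of -2 would be the cue to take the send branch.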
It kind of depends... First of all, you actually do have a select call in your real code?
Now about the blocking... If select returns with your socket set in the read-set, then a single call to recv will normally return without blocking. But there are no guarantees about the amount of data available. If you use UDP then there will be at least one (hopefully complete) packet, but if you use TCP you may get only one byte. For some protocols with message boundaries, you might not get full messages, and you have to call recv in a loop to get all of the message, and unfortunately this will sooner or later cause the recv call to block.
So in short, using select helps, but it does not help in all cases. The only way to actually guarantee that a recv call won't block is to make the socket non-blocking.
I'm not sure what you are trying to do, so I can think of two options:
Set the socket to be non-blocking
Since it seems like you have only one socket, you can set it to non-blocking mode using fcntl and then call recv():
fcntl(sock, F_SETFL, fcntl(sock, F_GETFL) | O_NONBLOCK);
// if fcntl returns no error, sock is now non-blocking
Set the select timer
With select you can set a timeout that forces it to return after some time has elapsed, even if no data was received.
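A minimal sketch of the select-with-timeout option for a single socket. The function name recv_with_timeout and the -2 "timed out" return value are assumptions made for this example:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Wait up to timeout_ms for sd to become readable, then receive.
 * Returns bytes read (> 0), 0 if the peer closed, -1 on error,
 * or -2 if the timeout expired with nothing to read.
 */
ssize_t recv_with_timeout(int sd, void *buf, size_t len, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv = { .tv_sec  = timeout_ms / 1000,
                          .tv_usec = (timeout_ms % 1000) * 1000 };

    /* The set and the timeout must be re-initialised before every select(). */
    FD_ZERO(&readfds);
    FD_SET(sd, &readfds);

    /* First argument: highest descriptor in any set, plus one. */
    int rc = select(sd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;                 /* timed out: free to send instead */
    return recv(sd, buf, len, 0);
}
```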
First, I cannot find any select in your code.
However, you may call fcntl(fd, F_SETFL, flags | O_NONBLOCK) first to make your socket non-blocking. Then, when recv returns -1, check whether errno == EWOULDBLOCK, which means there was nothing to read. You do not need to use select in this case.

Errno 35 (EAGAIN) returned on recv call

I have a socket which waits in recv and, after receiving data, sends the data forward for processing. However, it then calls recv again, and this time it receives nothing: recv returns -1, and printing errno gives 35 (which is EAGAIN).
This happens only on Mac OS X Lion; on other operating systems it runs perfectly fine.
do {
    rc = recv(i, buffer, sizeof(buffer), 0);
    if (rc < 0) {
        printf("err code %d", errno);
    }
    if (rc == 0) {
        //Code for processing the data in buffer
        break;
    }
    ....
} while(1);
EDIT: Corrected indentation and errno
You either set the socket to non-blocking mode or enabled the receive timeout. Here's from recv(2) on a mac:
The calls fail if:
[EAGAIN] The socket is marked non-blocking, and the receive operation would block, or a receive timeout had been set, and the timeout expired before data were received.
Edit 0:
Hmm, apologies for quoting again. This time from intro(2):
11 EDEADLK Resource deadlock avoided. An attempt was made to lock a system resource that would have resulted in a deadlock situation.
...
35 EAGAIN Resource temporarily unavailable. This is a temporary condition and later calls to the same routine may complete normally.
Just use strerror(3) to figure out the actual issue.
Your socket is in non-blocking mode. EAGAIN is the normal return from recv() (and other system calls) when there is no data available to read. In that sense it's not really an error.
If you meant for your socket to be nonblocking then you need to monitor it to find out when it has data available and only call recv() when there is data available. Use poll() (or kqueue, which is available on the BSDs and macOS) to monitor it. Usually this is done in your application's main event loop.
If you did not mean for your socket to be nonblocking, then you should set it to blocking mode with fcntl():
flags = fcntl(i, F_GETFL, 0); /* add error checking here, please */
flags &= ~O_NONBLOCK;
fcntl(i, F_SETFL, flags); /* add more error checking here! */
But you should be aware that the default blocking state of sockets (and all file descriptors) is blocking, so if your socket is in nonblocking mode then that means someone or something has manually made it nonblocking.
In blocking mode, the recv call will block and wait for more data instead of returning EAGAIN (or EWOULDBLOCK, which has the same value as EAGAIN on most systems).
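For completeness, here is the fcntl() snippet from above with the error checking filled in. This is only a sketch; the helper name set_blocking and its int flag argument are made up for this example:

```c
#include <fcntl.h>

/*
 * Switch fd between blocking (blocking != 0) and non-blocking mode.
 * Returns 0 on success, -1 if either fcntl() call fails (errno is set).
 */
int set_blocking(int fd, int blocking)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;

    if (blocking)
        flags &= ~O_NONBLOCK;   /* clear the flag, keep all others */
    else
        flags |= O_NONBLOCK;

    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

Calling set_blocking(i, 1) before the recv loop in the question should make recv wait for data instead of failing with EAGAIN, provided no receive timeout has been set on the socket.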

Dealing with data on multiple TCP connections with epoll

I have an application that is going to work like a p2p-software where all peer are going to talk to each other. Since the communication will be TCP i thought that I could use epool(4) so that multiple connections can be handled. Since each peer will send data very often, I thought that I will establish a persistent connection to each peer that will be used under the applications lifetime.
Now, one thing that I don't know how to handle is that since the connection is never closed how do I know when i should stop receiving data with read() and call epool_wait() again to listen after more packages? Or is there a better way of dealing with persistent TCP connections?
You should set the socket to non-blocking, and when epoll indicates there is data to read
you should call read() in a loop until read() returns -1 and errno is EWOULDBLOCK
That is, your read loop could look something like:
for (;;) {
    ssize_t ret;
    ret = read(...);
    if (ret == 0) {
        // client disconnected, handle it, remove the fd from the epoll set
        break;
    } else if (ret == -1) {
        if (errno == EWOULDBLOCK || errno == EAGAIN) {
            // no more data, return to epoll loop
        } else {
            // error occurred, handle it, remove the fd from the epoll set
        }
        break;
    }
    // handle the read data
}
If you're not using edge triggered mode with epoll, you don't really need the loop - you could get away with doing just 1 read and return to the epoll loop. But handle the return values just like the above code.
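As a compilable illustration of the same idea, here is a sketch for Linux that registers a descriptor in edge-triggered mode and drains it when epoll reports it readable. The names watch_edge_triggered and drain, the 2048-byte buffer, and the "count the bytes instead of processing them" body are all assumptions made for this example:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Make fd non-blocking and add it to epoll instance ep in edge-triggered mode. */
int watch_edge_triggered(int ep, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Read from a non-blocking fd until it would block.
 * Returns the total number of bytes read, or -1 on a hard error.
 * *closed is set to 1 if read() returned 0 (peer disconnected).
 */
ssize_t drain(int fd, int *closed)
{
    char buf[2048];
    ssize_t total = 0;

    *closed = 0;
    for (;;) {
        ssize_t ret = read(fd, buf, sizeof buf);
        if (ret > 0) {
            total += ret;        /* a real program would process buf here */
        } else if (ret == 0) {
            *closed = 1;         /* caller should remove fd from the epoll set */
            return total;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;        /* no more data: go back to epoll_wait() */
        } else if (errno != EINTR) {
            return -1;           /* hard error: caller should drop the fd */
        }
    }
}
```

With EPOLLET the loop matters: epoll only reports the transition to readable, so stopping before EAGAIN can leave data sitting in the socket buffer with no further wakeup.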
That should have been 'epoll', not 'epool'... I'm not familiar with epoll, but have a look at Beej's guide to see an example of sockets using 'poll'. Look at section 7.2 there to see how it is done, and also at section 9.17 for the usage of 'poll'.
Hope this helps,
Best regards,
Tom.
read() reads as much data as is immediately available (but no more than you request). Just run read() on the active socket with a big-enough buffer (you probably don't need it bigger than your MTU… 2048 bytes will do) and call epoll_wait() when it completes.

Check Socket File Descriptor is Available?

If I have a file descriptor (a socket fd), how can I check whether this fd is available for read/write?
In my situation, the client has connected to the server and we know the fd.
However, the server may disconnect the socket. Is there any way to check for that?
You want fcntl() to check for read/write settings on the fd:
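#include <unistd.h>
#include <fcntl.h>

int r = fcntl(fd, F_GETFL);
if (r == -1) {
    /* Error: fd is not a valid open descriptor (check errno) */
} else if ((r & O_ACCMODE) == O_RDONLY) {
    /* Read Only (O_RDONLY is 0, so it cannot be tested with a plain &) */
} else if ((r & O_ACCMODE) == O_WRONLY) {
    /* Write Only */
} else if ((r & O_ACCMODE) == O_RDWR) {
    /* Read/Write */
}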
But this is a separate issue from when the socket is no longer connected. If you are already using select() or poll() then you're almost there. poll() always reports POLLERR and POLLHUP in revents when they occur, whether or not you set them in events, so check revents for them.
If you're doing normal blocking I/O then just handle the read/write errors as they come in and recover gracefully.
You can use select() or poll() for this.
In C#, this question is answered here
In general, socket disconnect is asynchronous and needs to be polled for in some manner. An async read on the socket will typically return if it's closed as well, giving you a chance to pick up on the status change quicker. Winsock (Windows) can register to receive notification of a disconnect, but again, this may not happen for a long time after the other side "goes away" unless you use some type of keepalive (SO_KEEPALIVE, which by default may not notice for hours, or an application-level heartbeat).
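I found that recv can check this. When the socket fd is bad, recv returns -1 and errno is set; when the peer has closed the connection, recv returns 0.
ret = recv(socket_fd, buffer, bufferSize, MSG_PEEK);
if (ret == 0) {
    // peer closed the connection
} else if (ret == -1 && errno != EAGAIN && errno != EWOULDBLOCK) {
    // something wrong (errno is only meaningful when recv returns -1)
}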
Well, you could call select(). If the server has disconnected, I believe you'll eventually get an error code returned... If not, you can use select() to tell whether your network stack is ready to send more data (or receive it).
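Putting the suggestions in this thread together, a check could be sketched as below. It assumes a platform with MSG_DONTWAIT (Linux and the BSDs have it), and the function name peer_closed is made up for this example. It can only report disconnects the local network stack already knows about; a peer that vanished silently is not detected without keepalives or a heartbeat.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Peek at the socket without blocking and without consuming data.
 * Returns 1 if the peer has closed or the socket is in error,
 * 0 if it still looks usable.
 */
int peer_closed(int fd)
{
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return 1;   /* orderly shutdown by the peer */
    if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
        return 1;   /* e.g. ECONNRESET or EBADF */
    return 0;       /* data is pending, or nothing to read yet */
}
```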

Resources