Unexpected behaviour of SO_SNDTIMEO and SO_RCVTIMEO - c

I'm trying to set a timeout for a blocking TCP socket on Linux using setsockopt with SO_SNDTIMEO and SO_RCVTIMEO. But for some reason the server hangs while waiting in a recv call.
Consider the minimal example. I've shortened it a bit for readability; the full code is available in this gist. I create a server socket and set the timeouts using setsockopt. Next, the server exchanges messages with the client. After a while, the client stops the data exchange and closes the socket. But the server is still waiting in the blocking recv. I expected it to abort recv when the timeout expires.
Network functions used by the client and server:
int set_timeout(int fd, int sec) {
struct timeval timeout;
timeout.tv_sec = sec;
timeout.tv_usec = 0;
if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeout,
sizeof(timeout)) < 0) {
return -1;
}
if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, (char *)&timeout,
sizeof(timeout)) < 0) {
return -1;
}
return 0;
}
int blocking_recv(int sock, char *data, size_t size) {
size_t remain_data = size;
ssize_t recv_nb;
int offset = 0;
while ((remain_data > 0) &&
((recv_nb = recv(sock, data + offset, remain_data, 0)) >= 0)) {
remain_data -= recv_nb;
offset += recv_nb;
}
if (recv_nb == -1 || remain_data > 0) {
return -1;
}
return 0;
}
int blocking_send(int sock, char *buf, size_t size) {
size_t total = 0;
int remain = size;
int n;
while (total < size) {
n = send(sock, buf + total, remain, 0);
if (n == -1) {
break;
}
total += n;
remain -= n;
}
return (total == size) ? 0 : -1;
}
Server:
while (1) {
struct sockaddr_in caddr;
size_t len = sizeof(caddr);
if ((client_fd = accept(server_fd, (struct sockaddr *)&caddr, (socklen_t *)&len)) < 0) {
continue;
}
// Set 3 seconds timeout using SO_{SND,RCV}TIMEO
if (set_timeout(client_fd, 3) != 0) {
continue;
}
// Send and receive data in the loop. We assume that the client is stuck,
// we'll break the connection using a timeout.
while (1) {
char in_data[data_size];
char *out_data = "test\0";
if (blocking_recv(client_fd, in_data, data_size) != 0)
break;
if (blocking_send(client_fd, out_data, data_size) != 0)
break;
}
}
Client:
int attempts_left = 5;
while (--attempts_left > 0) {
char in_data[6];
char *out_data = "test\0";
if (blocking_send(fd, out_data, 6) != 0)
break;
if (blocking_recv(fd, in_data, 6) != 0)
break;
sleep(1);
}
close(fd);
What could be wrong?
Is it possible to implement the timeout for recv in this way, without using select, poll and signals?
Thanks!

You are looping your reading:
while ((remain_data > 0) &&
((recv_nb = recv(sock, data + offset, remain_data, 0)) >= 0)) {
remain_data -= recv_nb;
offset += recv_nb;
}
yet, according to man recv:
If no messages are available to be received and the peer has
performed an orderly shutdown, recv() shall return 0.
So you are just looping endlessly.
You should handle errno correctly in your program. When an SO_SNDTIMEO or SO_RCVTIMEO timeout expires, socket(7) says: "the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2))".
Still, recv() can also return -1 early when a signal interrupts it (errno is then typically EINTR), so the usual select() or poll() with a 3-second timeout would probably be simpler. If you don't want that, measure time yourself anyway: pick the deadline you want, and before each recv set the socket timeout to the time remaining until that deadline. Something along these lines:
timeout = now + 3 seconds.
// check timeout yourself
while (timeout_not_expired(&timeout)) {
// set timeout for the __next__ recv operation
set_recv_timeout(timeout_to_expire(&timeout));
ret = recv();
if (ret == -1 && errno == EAGAIN) {
continue;
}
if (ret == -1) {
/* handle error */
}
if (ret == 0) {
/* closed */
}
/* actually received stuff */
}
Use tools like strace and a debugger to debug your programs.

Related

linux c socket connect nonblocking poll

I am using the below code for non-blocking connect() on some IOT device with connected Internet via WiFi. Sometimes it happens that the WiFi drops and in 99% of those cases the code exits with rc != 1, meaning connect() failed.
But, for some reason, even if there is no Internet connected, sometimes I get back that poll() succeeded with the POLLOUT event and SO_ERROR says no error. Is this a bug in poll(), or am I checking it wrong? Why would I get connect() success, which is impossible?
sockfd = socket(AF_INET, SOCK_STREAM, 0);
addr = (struct sockaddr*)&address;
addrlen = sizeof(struct sockaddr_in);
int rc = -1;
// Set O_NONBLOCK
int sockfd_flags_before;
if ((sockfd_flags_before = fcntl(sockfd, F_GETFL, 0)) < 0) return -1;
if (fcntl(sockfd, F_SETFL, sockfd_flags_before | O_NONBLOCK) < 0) return -1;
// Start connecting (asynchronously)
struct timespec start;
struct timespec now;
if (clock_gettime(CLOCK_REALTIME, &start) < 0) { LOG_ERROR("Cannot get time"); return false; }
do {
if (connect(sockfd, addr, addrlen) < 0) {
// Did connect return an error? If so, we'll fail.
int err = errno;
if ((err != EWOULDBLOCK) && (err != EINPROGRESS)) {
rc = -1;
}
// Otherwise, we'll wait for it to complete.
else {
// Wait for the connection to complete.
do {
// Calculate how long until the deadline
if (clock_gettime(CLOCK_REALTIME, &now) < 0)
{
rc = -1; break;
}
int elapsed = Util::TimespecDiffMs(&now, &start);
int remaining = timeout_ms - elapsed;
if (remaining <= 0) { rc = 0; break; }
// Wait for connect to complete (or for the timeout deadline)
struct pollfd pfds[] = { {sockfd, POLLOUT,NULL } };
rc = poll(pfds, 1, remaining);
// If poll 'succeeded', make sure it *really* succeeded
if (rc > 0) {
int error = -1; socklen_t len = sizeof(error);
int retval = getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &len);
if (retval == 0) errno = error;
if (error != 0) rc = -1;
}
}
// If poll was interrupted, try again.
while (rc == -1 && errno == EINTR);
// Did poll timeout? If so, fail.
if (rc == 0) {
errno = ETIMEDOUT;
rc = -1;
}
}
}
} while (0);

Only first few multithreaded clients can read response from socket

I am building a client-server program in C using sockets. Both my client and my server use a fixed number of threads. I tested with very few client and server threads at first (5 and 3) and everything seemed to work fine. But when I raised the number of client threads to 500 (while the number of server threads stays at 3), everything breaks. The first hundred or so clients can send their request and receive a response, but the others don't receive anything from the server.
I'm working on Debian under the Windows Subsystem for Linux, if that changes anything.
I have also tried raising the number of server threads to 300, but the problem still happens.
Here is my (very) simplified code.
Client thread
int client_socket= socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(2018);
addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
memset(&addr.sin_zero, 0, sizeof(addr.sin_zero));
connect(client_socket, (struct sockaddr *) &addr, sizeof(addr));
int response;
int cid = 1;
send(client_socket, &cid, sizeof(int), 0);
int len = read_socket(client_socket, &response, sizeof(int), 1000);
if (len == 0) {
printf("No response");
} else {
printf("Response");
}
close(client_socket);
Server thread
int socket_fd, cid, len;
while (1)
{
socket_fd =
accept(socket_fd, (struct sockaddr *)&thread_addr, &socket_len);
if (socket_fd> 0) {
int cid;
int len = read_socket(socket_fd, &cid, sizeof(cid), 1000);
if (len == 0) {
printf("Nothing");
}
send(socket_fd, &cid, sizeof(int),0);
close(socket_fd);
}
}
And here is my helper function read_socket()
ssize_t read_socket(int sockfd, void *buf, size_t obj_sz, int timeout) {
int ret;
int len = 0;
struct pollfd fds[1];
fds->fd = sockfd;
fds->events = POLLIN;
fds->revents = 0;
do {
// wait for data or timeout
ret = poll(fds, 1, timeout);
if (ret > 0) {
if (fds->revents & POLLIN) {
ret = recv(sockfd, (char*)buf + len, obj_sz - len, 0);
if (ret < 0) {
// abort connection
perror("recv()");
return -1;
}
len += ret;
}
} else {
// TCP error or timeout
if (ret < 0) {
perror("poll()");
}
break;
}
} while (ret != 0 && len < obj_sz);
return ret;
}
Like I said, some clients complete their execution with no problem, but a lot of them don't receive a response from the server.
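One detail worth noting in the helper as posted is that it returns ret (the last poll or recv result) rather than the accumulated len, so callers that test for len == 0 are not looking at a byte count. A minimal sketch of a variant that reports the number of bytes actually read is below. The name read_socket_len is made up, and the timeout applies to each poll call rather than to the whole read.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical variant of read_socket(): returns the number of bytes read
 * (fewer than obj_sz if the wait timed out or the peer closed the
 * connection), or -1 on error. timeout_ms is passed to each poll call. */
ssize_t read_socket_len(int sockfd, void *buf, size_t obj_sz, int timeout_ms)
{
    size_t len = 0;
    struct pollfd pfd = { .fd = sockfd, .events = POLLIN, .revents = 0 };

    while (len < obj_sz) {
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: wait again */
            return -1;
        }
        if (rc == 0)
            break;                    /* timeout: report what we have so far */

        ssize_t n = recv(sockfd, (char *)buf + len, obj_sz - len, 0);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
                continue;
            return -1;
        }
        if (n == 0)
            break;                    /* peer closed the connection */
        len += (size_t)n;
    }
    return (ssize_t)len;
}
```

A caller can then compare the result against sizeof(int) to tell a complete response from a short or empty one.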

Can't seem to get a timeout working when connecting to a socket

I'm trying to supply a timeout for connect(). I've searched around and found several articles related to this. I've coded up what I believe should work but unfortunately I get no error reported from getsockopt(). But then when I come to the write() it fails with an errno of 107 - ENOTCONN.
A couple of points. I'm running on Fedora 23. The docs for connect() say it should fail with an errno of EINPROGRESS for a connect that is not complete yet; however, I was seeing EAGAIN, so I added that to my check. Currently my socket server sets the backlog to zero in the listen() call. Many of the calls succeed, but the ones that fail all fail in the write() call with the errno 107 (ENOTCONN) I mentioned.
I'm hoping I'm just missing something but so far can't figure out what.
int domain_socket_send(const char* socket_name, unsigned char* buffer,
unsigned int length, unsigned int timeout)
{
struct sockaddr_un addr;
int fd = -1;
int result = 0;
// Create socket.
fd = socket(AF_UNIX, SOCK_STREAM, 0);
if (fd == -1)
{
result = -1;
goto done;
}
if (timeout != 0)
{
// Enabled non-blocking.
int flags;
flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
// Set socket name.
memset(&addr, 0, sizeof(addr));
addr.sun_family = AF_UNIX;
strncpy(addr.sun_path, socket_name, sizeof(addr.sun_path) - 1);
// Connect.
result = connect(fd, (struct sockaddr*) &addr, sizeof(addr));
if (result == -1)
{
// If some error then we're done.
if ((errno != EINPROGRESS) && (errno != EAGAIN))
goto done;
fd_set write_set;
struct timeval tv;
// Set timeout.
tv.tv_sec = timeout / 1000000;
tv.tv_usec = timeout % 1000000;
unsigned int iterations = 0;
while (1)
{
FD_ZERO(&write_set);
FD_SET(fd, &write_set);
result = select(fd + 1, NULL, &write_set, NULL, &tv);
if (result == -1)
goto done;
else if (result == 0)
{
result = -1;
errno = ETIMEDOUT;
goto done;
}
else
{
if (FD_ISSET(fd, &write_set))
{
socklen_t len;
int socket_error;
len = sizeof(socket_error);
// Get the result of the connect() call.
result = getsockopt(fd, SOL_SOCKET, SO_ERROR,
&socket_error, &len);
if (result == -1)
goto done;
// I think SO_ERROR will be zero for a successful
// result and errno otherwise.
if (socket_error != 0)
{
result = -1;
errno = socket_error;
goto done;
}
// Now that the socket is writable issue another connect.
result = connect(fd, (struct sockaddr*) &addr,
sizeof(addr));
if (result == 0)
{
if (iterations > 1)
{
printf("connect() succeeded on iteration %d\n",
iterations);
}
break;
}
else
{
if ((errno != EAGAIN) && (errno != EINPROGRESS))
{
int err = errno;
printf("second connect() failed, errno = %d\n",
errno);
errno = err;
goto done;
}
iterations++;
}
}
}
}
}
// If we put the socket in non-blocking mode then put it back
// to blocking mode.
if (timeout != 0)
{
// Turn off non-blocking.
int flags;
flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
}
// Write buffer.
result = write(fd, buffer, length);
if (result == -1)
{
int err = errno;
printf("write() failed, errno = %d\n", err);
errno = err;
goto done;
}
done:
if (result == -1)
result = errno;
else
result = 0;
if (fd != -1)
{
shutdown(fd, SHUT_RDWR);
close(fd);
}
return result;
}
UPDATE 04/05/2016:
It dawned on me that maybe I need to call connect() multiple times until successful, after all this is non-blocking io not async io. Just like I have to call read() again when there is data to read after encountering an EAGAIN on a read(). In addition, I found the following SO question:
Using select() for non-blocking sockets to connect always returns 1
in which EJP's answer says you need to issue multiple connect()'s. Also, from the book EJP references:
https://books.google.com/books?id=6H9AxyFd0v0C&pg=PT681&lpg=PT681&dq=stevens+and+wright+tcp/ip+illustrated+non-blocking+connect&source=bl&ots=b6kQar6SdM&sig=kt5xZubPZ2atVxs2VQU4mu7NGUI&hl=en&sa=X&ved=0ahUKEwjmp87rlfbLAhUN1mMKHeBxBi8Q6AEIIzAB#v=onepage&q=stevens%20and%20wright%20tcp%2Fip%20illustrated%20non-blocking%20connect&f=false
it seems to indicate you need to issue multiple connect()'s. I've modified the code snippet in this question to call connect() until it succeeds. I probably still need to make changes around possibly updating the timeout value passed to select(), but that's not my immediate question.
Calling connect() multiple times appears to have fixed my original problem, which was that I was getting ENOTCONN when calling write(), I guess because the socket was not connected. However, you can see from the code that I'm tracking how many times through the select loop until connect() succeeds. I've seen the number go into the thousands. This gets me worried that I'm in a busy wait loop. Why is the socket writable even though it's not in a state that connect() will succeed? Is calling connect() clearing that writable state and it's getting set again by the OS for some reason, or am I really in a busy wait loop?
Thanks,
Nick
From http://lxr.free-electrons.com/source/net/unix/af_unix.c:
static int unix_writable(const struct sock *sk)
{
return sk->sk_state != TCP_LISTEN &&
(atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
}
I'm not sure what these buffers are that are being compared, but it looks obvious that the connected state of the socket is not being checked. So unless these buffers are modified when the socket becomes connected it would appear my unix socket will always be marked as writable and thus I can't use select() to determine when the non-blocking connect() has finished.
and based on this snippet from http://lxr.free-electrons.com/source/net/unix/af_unix.c:
static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
int addr_len, int flags)
.
.
.
timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
.
.
.
if (unix_recvq_full(other)) {
err = -EAGAIN;
if (!timeo)
goto out_unlock;

timeo = unix_wait_for_peer(other, timeo);
.
.
.
it appears setting the send timeout might be capable of timing out the connect. Which also matches the documentation for SO_SNDTIMEO at http://man7.org/linux/man-pages/man7/socket.7.html.
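A minimal sketch of that idea, assuming a blocking Unix-domain stream socket and the microsecond timeout unit used in the question, could look like the following. The function name unix_connect_sndtimeo is made up, and whether connect() actually honours the send timeout in a given situation depends on the kernel code quoted above, so treat it as an experiment rather than a guaranteed behaviour.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: blocking connect to a Unix-domain stream socket with
 * SO_SNDTIMEO set beforehand. timeout_us is in microseconds; 0 leaves the
 * socket without a timeout. Returns the connected descriptor, or -1 with
 * errno set. */
int unix_connect_sndtimeo(const char *socket_name, unsigned int timeout_us)
{
    struct sockaddr_un addr;
    struct timeval tv;
    int saved_errno;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    tv.tv_sec = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, socket_name, sizeof(addr.sun_path) - 1);

    /* Set the send timeout first, then connect in blocking mode. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0 &&
        connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        return fd;

    /* Preserve the errno of the failing call across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return -1;
}
```

Because the socket stays in blocking mode, there is no select() loop and no second connect() call to reason about.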
Thanks,
Nick
Your error handling on select() could use some cleanup. You don't really need to query SO_ERROR unless except_set is set. If select() returns > 0 then either write_set and/or except_set is set, and if except_set is not set then the connection was successful.
Try something more like this instead:
int domain_socket_send(const char* socket_name, unsigned char* buffer,
unsigned int length, unsigned int timeout)
{
struct sockaddr_un addr;
int fd;
int result;
// Create socket.
fd = socket(AF_UNIX, SOCK_STREAM, 0);
if (fd == -1)
return errno;
if (timeout != 0)
{
// Enabled non-blocking.
int flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
// Set socket name.
memset(&addr, 0, sizeof(addr));
addr.sun_family = AF_UNIX;
strncpy(addr.sun_path, socket_name, sizeof(addr.sun_path) - 1);
// Connect.
result = connect(fd, (struct sockaddr*) &addr, sizeof(addr));
if (result == -1)
{
// If some error then we're done.
if ((errno != EINPROGRESS) && (errno != EAGAIN))
goto done;
// Now select() to find out when connect() has finished.
fd_set write_set;
fd_set except_set;
FD_ZERO(&write_set);
FD_ZERO(&except_set);
FD_SET(fd, &write_set);
FD_SET(fd, &except_set);
struct timeval tv;
// Set timeout.
tv.tv_sec = timeout / 1000000;
tv.tv_usec = timeout % 1000000;
result = select(fd + 1, NULL, &write_set, &except_set, &tv);
if (result == -1)
{
goto done;
}
else if (result == 0)
{
result = -1;
errno = ETIMEDOUT;
goto done;
}
else if (FD_ISSET(fd, &except_set))
{
int socket_error;
socklen_t len = sizeof(socket_error);
// Get the result of the connect() call.
result = getsockopt(fd, SOL_SOCKET, SO_ERROR, &socket_error, &len);
if (result != -1)
{
result = -1;
errno = socket_error;
}
goto done;
}
else
{
// connected
}
}
// If we put the socket in non-blocking mode then put it back
// to blocking mode.
if (timeout != 0)
{
int flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
}
// Write buffer.
result = write(fd, buffer, length);
done:
if (result == -1)
result = errno;
else
result = 0;
if (fd != -1)
{
shutdown(fd, SHUT_RDWR);
close(fd);
}
return result;
}

Receiving data from socket using recv not working

I'm trying to create a simple proxy server using BSD sockets, which listens on a port for a request and then passes that request on to another server, before sending the server's response back to the browser.
I am able to receive a REST request from the browser, using the code below:
void *buffer = malloc(512);
long length = 0;
while (1) {
void *tempBuffer = malloc(512);
long response = recv(acceptedDescriptor, tempBuffer, 512, 0);
if (response == 0 || response < 512) {
free(tempBuffer);
printf("Read %lu bytes\n", length);
break;
}
memcpy(buffer + length, tempBuffer, response);
free(tempBuffer);
length += response;
realloc(buffer, length + 512);
}
However, recv() should return 0 when the connection is closed by the peer (in this case the browser), but this is never the case. The only way I am able to detect whether or not the connection has closed is by checking if the response is less than the maximum amount requested from recv(), 512 bytes. This is sometimes problematic as some requests I see are incomplete.
If there is no more data to receive, recv() blocks and never returns, and setting the accepted descriptor to be non-blocking means that the read loop goes on forever, never exiting.
If I:
Set the listening socket descriptor to non-blocking, I get a EAGAIN error (resource temporarily unavailable) when I try to accept() the connection
Set the accepted socket descriptor to non-blocking, recv() never returns 0 and the loop continues on forever
Set them both to non-blocking, I get a 'bad file descriptor' error when trying to accept() the connection
Don't set either of them to non-blocking, the loop never exits because recv() never returns.
The socket itself is created as follows, but since it is able to detect a request, I can't see anything wrong with its initialisation:
int globalDescriptor = -1;
struct sockaddr_in localServerAddress;
...
int initSocket() {
globalDescriptor = socket(AF_INET, SOCK_STREAM, 0);
if (globalDescriptor < 0) {
perror("Socket Creation Error");
return 0;
}
localServerAddress.sin_family = AF_INET;
localServerAddress.sin_addr.s_addr = INADDR_ANY;
localServerAddress.sin_port = htons(8374);
memset(localServerAddress.sin_zero, 0, 8);
int res = 0;
setsockopt(globalDescriptor, SOL_SOCKET, SO_REUSEADDR, &res, sizeof(res));
//fcntl(globalDescriptor, F_SETFL, O_NONBLOCK);
return 1;
}
...
void startListening() {
int bindResult = bind(globalDescriptor, (struct sockaddr *)&localServerAddress, sizeof(localServerAddress));
if (bindResult < 0) {
close(globalDescriptor);
globalDescriptor = 0;
perror("Socket Bind Error");
exit(1);
}
listen(globalDescriptor, 1);
struct sockaddr_in clientAddress;
int clientAddressLength = sizeof(clientAddress);
while (1) {
memset(&clientAddress, 0, sizeof(clientAddress));
clientAddressLength = sizeof(clientAddress);
int acceptedDescriptor = accept(globalDescriptor, (struct sockaddr *)&clientAddress, (socklen_t *)&clientAddressLength);
//fcntl(acceptedDescriptor, F_SETFL, O_NONBLOCK);
if (acceptedDescriptor < 0) {
perror("Incoming Connection Error");
exit(1);
}
void *buffer = malloc(512);
long length = 0;
while (1) {
void *tempBuffer = malloc(512);
long response = recv(acceptedDescriptor, tempBuffer, 512, 0);
if (response == 0) {
free(tempBuffer);
printf("Read %lu bytes\n", length);
break;
}
memcpy(buffer + length, tempBuffer, response);
free(tempBuffer);
length += response;
realloc(buffer, length + 512);
}
executeRequest(buffer, length, acceptedDescriptor);
close(acceptedDescriptor);
free(buffer);
}
}
...
The startListening() function is called only if initSocket() returns 1:
int main(int argc, const char *argv[]) {
if (initSocket() == 1) {
startListening();
}
return 0;
}
I'm probably doing something stupid here, but I'd appreciate any information you may have about this problem and how I could fix it.
Since your REST request is an HTTP request, it has a well-defined HTTP message length, so you just have to recv() until the complete message has arrived.
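As a rough illustration of "recv() until the complete message has arrived", the sketch below computes how long a request is expected to be from the bytes received so far. The function name http_message_length is made up, and it assumes the request either carries a Content-Length header or has no body. It deliberately ignores Transfer-Encoding: chunked, which a real proxy would also have to handle.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: if `buf` already contains a complete header block
 * (terminated by an empty line), return the expected total message length,
 * i.e. header bytes plus the Content-Length value (0 if the header is
 * absent). Returns 0 while the header block is still incomplete. */
size_t http_message_length(const char *buf, size_t len)
{
    static const char name[] = "content-length:";
    const size_t name_len = sizeof(name) - 1;
    size_t header_end = 0;
    size_t body_len = 0;
    size_t line = 0;

    /* Find the "\r\n\r\n" that ends the header block. */
    for (size_t i = 0; i + 3 < len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            header_end = i + 4;
            break;
        }
    }
    if (header_end == 0)
        return 0;

    /* Walk the header lines looking for Content-Length (case-insensitive). */
    while (line < header_end) {
        if (line + name_len <= header_end &&
            strncasecmp(buf + line, name, name_len) == 0) {
            size_t p = line + name_len;
            while (p < header_end && (buf[p] == ' ' || buf[p] == '\t'))
                p++;
            while (p < header_end && buf[p] >= '0' && buf[p] <= '9') {
                body_len = body_len * 10 + (size_t)(buf[p] - '0');
                p++;
            }
            break;
        }
        const char *eol = memchr(buf + line, '\n', header_end - line);
        if (eol == NULL)
            break;
        line = (size_t)(eol - buf) + 1;
    }
    return header_end + body_len;
}
```

The read loop would then keep calling recv() until this function returns a non-zero value and the number of bytes received has reached it, instead of guessing from a short read.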

Receive Timeout Happen with my socket

Hi, I am trying to write my own send and recv functions for my application, which has to handle 144 requests per second under load. Under load, my application occasionally hits a receive timeout (about 5 requests in 1 lakh, i.e. 100,000). I have set the timeout to 20 seconds. Please tell me what the problem with my code is.
recvAll function :
int recvAll(int s, char *buf, int len, int timeout)
{
fd_set fds;
int n;
struct timeval tv;
FD_ZERO(&fds);
FD_SET(s, &fds);
tv.tv_sec = timeout;
tv.tv_usec = 0;
n = select(s+1, &fds, NULL, NULL, &tv);
if (n == 0) return -2;
if (n == -1) return -1;
int retVal =recv(s, buf, len, 0);
printf("received byes %d\n",retVal);
buf[retVal+1]='\0';
return retVal;
}
Function Call :
do
{
if(0 >= (bytesRcvd =recvAll(sockfd, recvBuffer,1024,20)))
{
perror("Receive Timeout Happened");
close(sockfd);
return -1;
}
totalBytesRcvd += bytesRcvd;
}while(totalBytesRcvd < 1024);
The problem with your code is that you're guessing. It could be any error at all. You're telling yourself it's a read timeout, but it could be EOS (bytesRcvd == 0) or some other error.
Actually, the real problem was that a signal interrupted the call (EINTR) under load. So:
do
{
if(0 >= (bytesRcvd =recvAll(sockfd, recvBuffer,1024,20)))
{
if (errno == EINTR) continue;
perror("Receive Timeout Happened");
close(sockfd);
return -1;
}
totalBytesRcvd += bytesRcvd;
}while(totalBytesRcvd < 1024);
This could be the answer for the above question.

Resources