How to set socket timeout in C when making multiple connections? - c

I'm writing a simple program that makes multiple connections to different servers for status check. All these connections are constructed on-demand; up to 10 connections can be created simultaneously. I don't like the idea of one-thread-per-socket, so I made all these client sockets Non-Blocking, and throw them into a select() pool.
It worked great, until my client complained that the waiting time is too long before they can get the error report when target servers stopped responding.
I've checked several topics in the forum. Some had suggested that one can use alarm() signal or set a timeout in the select() function call. But I'm dealing with multiple connections, instead of one. When a process wide timeout signal happens, I've no way to distinguish the timeout connection among all the other connections.
Is there any way to change the system-default timeout duration?

You can use the SO_RCVTIMEO and SO_SNDTIMEO socket options to set timeouts on blocking send and receive operations (a non-blocking socket returns immediately regardless of these timers), like so:
struct timeval timeout;
timeout.tv_sec = 10;
timeout.tv_usec = 0;
if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &timeout,
               sizeof timeout) < 0)
    perror("setsockopt(SO_RCVTIMEO) failed");
if (setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &timeout,
               sizeof timeout) < 0)
    perror("setsockopt(SO_SNDTIMEO) failed");
Edit: from the setsockopt man page:
SO_SNDTIMEO is an option to set a timeout value for output operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for output operations to complete. If a send operation has blocked for this much time, it returns with a partial count or with the error EWOULDBLOCK if no data were sent. In the current implementation, this timer is restarted each time additional data are delivered to the protocol, implying that the limit applies to output portions ranging in size from the low-water mark to the high-water mark for output.
SO_RCVTIMEO is an option to set a timeout value for input operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for input operations to complete. In the current implementation, this timer is restarted each time additional data are received by the protocol, and thus the limit is in effect an inactivity timer. If a receive operation has been blocked for this much time without receiving additional data, it returns with a short count or with the error EWOULDBLOCK if no data were received. The struct timeval parameter must represent a positive time interval; otherwise, setsockopt() returns with the error EDOM.
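To make the man page text concrete, here is a minimal sketch of how the EWOULDBLOCK case might be handled on a blocking socket. The helper names set_recv_timeout_ms() and recv_or_timeout() and the -2 return code are assumptions made for illustration; they are not part of any API.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: set a receive timeout, given in milliseconds, on a
 * blocking socket. Returns 0 on success, -1 on failure (errno is set).
 * Note: on Linux a zero timeval means "no timeout". */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Hypothetical helper: receive, reporting a timeout separately from other
 * errors. Returns the byte count (0 = peer closed), -1 on error, or -2
 * when the timer expired with no data. */
ssize_t recv_or_timeout(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

Because the timeout is stored per socket, each connection can carry its own limit, but the calling thread still blocks for that long inside recv().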

I'm not sure I fully understand the issue, but I guess it's related to one I had. I'm using Qt with TCP socket communication, all non-blocking, on both Windows and Linux.
I wanted a quick notification when an already connected client failed or disappeared completely, rather than waiting the default 900+ seconds until the disconnect signal was raised. The trick to get this working was to set the TCP_USER_TIMEOUT socket option at the SOL_TCP level to the required value, given in milliseconds.
This is a comparatively new option (see https://www.rfc-editor.org/rfc/rfc5482), but it apparently works fine. I tried it with WinXP, Win7/x64 and Kubuntu 12.04/x64; my choice of 10 s turned out to take a bit longer in practice, but that is much better than anything else I had tried before ;-)
The only issue I came across was finding the proper includes, as apparently this hasn't been added to the standard socket includes (yet), so I finally defined the constants myself as follows:
#ifdef WIN32
#include <winsock2.h>
#else
#include <sys/socket.h>
#endif
#ifndef SOL_TCP
#define SOL_TCP 6 // socket options TCP level
#endif
#ifndef TCP_USER_TIMEOUT
#define TCP_USER_TIMEOUT 18 // how long for loss retry before timeout [ms]
#endif
Setting this socket option only works when the client is already connected. The lines of code look like:
int timeout = 10000; // user timeout in milliseconds [ms]
setsockopt (fd, SOL_TCP, TCP_USER_TIMEOUT, (char*) &timeout, sizeof (timeout));
The failure of an initial connect is caught by a timer started when calling connect(), since Qt raises no signal for it: the connect signal will not be raised, as there is no connection, and the disconnect signal will not be raised either, as there hasn't been a connection yet.

Can't you implement your own timeout system?
Keep a sorted list, or better yet a priority heap as Heath suggests, of timeout events. In your select or poll calls, use the timeout value from the top of the timeout list. When that timeout arrives, perform the action attached to it.
That action could be closing a socket that hasn't connected yet.
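A minimal sketch of that idea, assuming each connection stores an absolute deadline in milliseconds. The struct conn type and both function names are made up for illustration, and a linear scan stands in for the heap, which is probably enough for about 10 connections.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical per-connection record: the socket and its absolute deadline
 * in milliseconds on some monotonic clock (0 = no deadline armed). */
struct conn {
    int fd;
    long long deadline_ms;
};

/* Return the index of the connection with the earliest deadline, or -1
 * when no deadline is armed. */
int earliest_deadline(const struct conn *c, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (c[i].deadline_ms == 0)
            continue;
        if (best < 0 || c[i].deadline_ms < c[best].deadline_ms)
            best = (int)i;
    }
    return best;
}

/* Fill *tv with the time left until the earliest deadline, clamped at
 * zero. Returns tv, or NULL when no deadline is armed, so the result can
 * be passed straight to select() as its timeout argument. */
struct timeval *next_timeout(const struct conn *c, size_t n,
                             long long now_ms, struct timeval *tv)
{
    long long left;
    int i = earliest_deadline(c, n);
    if (i < 0)
        return NULL;
    left = c[i].deadline_ms - now_ms;
    if (left < 0)
        left = 0;
    tv->tv_sec = left / 1000;
    tv->tv_usec = (left % 1000) * 1000;
    return tv;
}
```

When select() returns 0, the connection reported by earliest_deadline() is the one that timed out, which answers the "which connection was it" problem that a process-wide alarm() cannot.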

A connect timeout has to be handled with a non-blocking socket (see the GNU LibC documentation on connect). You get connect to return immediately and then use select to wait, with a timeout, for the connection to complete.
This is also explained here: Operation now in progress error on connect() function.
/* TRACES, ERROR and ERROR_SYS are the author's own logging macros.
 * TEMP_FAILURE_RETRY comes from glibc's <unistd.h> (needs _GNU_SOURCE). */
int wait_on_sock(int sock, long timeout, int r, int w)
{
    struct timeval tv = {0, 0};
    fd_set fdset;
    fd_set *rfds, *wfds;
    int n, so_error;
    socklen_t so_len;

    FD_ZERO(&fdset);
    FD_SET(sock, &fdset);
    tv.tv_sec = timeout;
    tv.tv_usec = 0;
    TRACES("wait in progress tv={%ld,%ld} ...\n",
           (long)tv.tv_sec, (long)tv.tv_usec);

    rfds = r ? &fdset : NULL;
    wfds = w ? &fdset : NULL;

    /* repeat the call while it is interrupted by a signal (EINTR) */
    TEMP_FAILURE_RETRY(n = select(sock + 1, rfds, wfds, NULL, &tv));
    switch (n) {
    case 0:
        /* select() does not set errno on timeout, so report it explicitly */
        ERROR("wait timed out\n");
        return -ETIMEDOUT;
    case -1:
        ERROR_SYS("error during wait\n");
        return -errno;
    default:
        /* select tells us that sock is ready; check for a pending error */
        so_len = sizeof(so_error);
        so_error = 0;
        getsockopt(sock, SOL_SOCKET, SO_ERROR, &so_error, &so_len);
        if (so_error == 0)
            return 0;
        errno = so_error;
        ERROR_SYS("wait failed\n");
        return -errno;
    }
}
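For completeness, here is a sketch of the whole sequence (make the socket non-blocking, call connect, wait, check SO_ERROR) in one function. It swaps select() for poll() and takes an IPv4 address string; the name connect_with_timeout() and its signature are assumptions for illustration only.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to an IPv4 address within timeout_ms
 * milliseconds. Returns a connected, still non-blocking socket, or -1
 * with errno set (ETIMEDOUT when the wait timed out). */
int connect_with_timeout(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct pollfd p;
    socklen_t len;
    int fd, flags, n, err = 0, saved;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* A non-blocking connect() normally fails with EINPROGRESS. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0)
        return fd;
    if (errno != EINPROGRESS)
        goto fail;

    /* The socket turns writable once the connect has succeeded or failed. */
    p.fd = fd;
    p.events = POLLOUT;
    p.revents = 0;
    do {
        n = poll(&p, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);  /* simplification: restarts the full timeout */
    if (n == 0)
        errno = ETIMEDOUT;
    if (n <= 0)
        goto fail;

    /* SO_ERROR distinguishes success (0) from a failed connect. */
    len = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        goto fail;
    if (err != 0) {
        errno = err;
        goto fail;
    }
    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

With several servers to check, the same steps can be split up: start every connect first, then wait on all the sockets together, so the slowest server bounds the total time rather than the sum of all timeouts.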

Related

Determining the moment a TCP connection is really been closed in C

I need to perform some operations only after a TCP connection is fully closed, that is to say, after all the data segments and the closing sequence (FIN-ACK or RST) have completed and no more packets will be sent on the wire.
Since closesocket() is not synchronous and could return before a full close of the connection and socket, I've used the SO_LINGER option to get the moment of closing.
According to the instructions in the MSDN for closesocket, and to the fact that my socket is non-blocking (and thus asynchronous), I wrote this code:
int ret;
/* config 2 secs of linger */
struct linger lng = {1, 2};
setsockopt(s, SOL_SOCKET, SO_LINGER, (const char*)&lng, sizeof(lng));
/* graceful close of TCP (FIN-ACK for both sides) */
shutdown(s, SD_BOTH);
/* linger routine for asynchronous sockets according to msdn */
do {
    ret = closesocket(s);
} while (ret == SOCKET_ERROR && WSAGetLastError() == WSAEWOULDBLOCK);
/* my code to be run immediately after all the traffic */
printf("code to run after closing");
However, the closesocket call returns zero (success; instead of getting into the loop) and I see in Wireshark that my final printing is called before all the packets were sent, so - it looks like the linger isn't working.
By the way, the functions I used to open and connect the asynchronous socket were socket(), WSAIoctl() and its lpfnConnectEx() callback.
What's the reason that the lingering closesocket returns before the TCP connection has fully finished? Is there a solution?

select for multiple non-blocking connections

I have a single-threaded program. It sends a message to four destinations every five seconds. I don't want connect() to block, so I am writing my program like this:
int j, rc, non_blocking=1, sockets[4], max_fd=0;
struct sockaddr server=get_server_addr();
fd_set fdset;
/* 2 seconds; not const, because select() may modify its timeout argument */
struct timeval conn_timeout = { 2, 0 };
for (j=0; j<4; ++j)
{
    sockets[j]=socket( AF_INET, SOCK_STREAM, 0 );
    ioctl(sockets[j], FIONBIO, (char *)&non_blocking);
    connect(sockets[j], &server, sizeof (server));
}
/* prepare fd_set */
FD_ZERO ( &fdset );
for (j=0;j<4;++j)
{
    if (sockets[j] != -1 )
    {
        FD_SET ( sockets[j], &fdset );
        if ( sockets[j] > max_fd )
        {
            max_fd = sockets[j];
        }
    }
}
rc=select(max_fd + 1, NULL, &fdset, NULL, &conn_timeout );
if(rc > 0)
{
    for (j=0;j<4;++j)
    {
        if(sockets[j]!=-1 && FD_ISSET(sockets[j],&fdset))
        {
            /* send() */
        }
    }
}
/* close all valid sockets */
However, it seems select() returns as soon as ONE file descriptor is ready instead of blocking for conn_timeout (2 seconds). So in this case, how can I achieve my targets?
1. The program continues if all sockets are ready.
2. The program can block there for 2 seconds if any of the sockets is not ready.
Yeah, select was designed on the assumption that you would want to service each socket as soon as it became ready.
If I understand what you're trying to do, then the simplest way to accomplish it will be to remove each socket from the fdset as it becomes ready. If there are any sockets left in the set, use gettimeofday to adjust the timeout downward, and call select again. When the set is empty, all four sockets are usable and you can proceed.
There are three basic approaches:
If you want to stay strictly portable you need to iterate:
calculate end time from current time and timeout of your choice
Cycle:
-- Create fdset with those fds not yet ready
-- calculate max time to wait
-- select()
-- remember those fds that are now ready
-- break if end time reached or all fds ready
End cycle
Now you have knowledge of the ready fds and the elapsed time
If you want to stay portable, but can use threads:
start n threads
select on one fd per thread
join all threads
If you do not need to be portable: Most OSes have a facility for such a situation, e.g. Windows/.NET has WaitAll (together with async send and an event)
I don't see the connection between your stated targets and your stated problem. You are correct in saying that select() blocks until at least one socket is ready, but according to target #2 above that is exactly what you want. There's nothing in your stated targets about blocking until all four sockets are ready at the same time.
You should also note that sockets are almost always ready for writing, unless the send buffer is full, which means the receiver's receive buffer is full, which means the receiver is slower than the sender. So using select() alone as the underlying write timer isn't a good idea.

Listen to multiple ports from one server

Is it possible to bind and listen to multiple ports in Linux in one application?
For each port that you want to listen to, you:
Create a separate socket with socket.
Bind it to the appropriate port with bind.
Call listen on the socket so that it's set up with a listen queue.
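Those three steps might look like the sketch below for one port. The open_listener() name is made up for the example, and binding to all IPv4 interfaces with SO_REUSEADDR set is an assumption about what a typical server wants.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on `port` on all
 * IPv4 interfaces. Returns the descriptor, or -1 on failure. */
int open_listener(uint16_t port)
{
    struct sockaddr_in addr;
    int fd, on = 1;

    /* Step 1: create the socket. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    /* Lets the server rebind quickly after a restart. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    /* Step 2: bind it to the requested port. */
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Step 3: set up the listen queue. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling it once per port, for example open_listener(8080) and open_listener(8443), fills an array of descriptors to wait on.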
At that point, your program is listening on multiple sockets. In order to accept connections on those sockets, you need to know which socket a client is connecting to. That's where select comes in. As it happens, I have code that does exactly this sitting around, so here's a complete tested example of waiting for connections on multiple sockets and returning the file descriptor of a connection. The remote address is returned in additional parameters (the buffer must be provided by the caller, just like accept).
(socket_type here is a typedef for int on Linux systems, and INVALID_SOCKET is -1. Those are there because this code has been ported to Windows as well.)
socket_type
network_accept_any(socket_type fds[], unsigned int count,
                   struct sockaddr *addr, socklen_t *addrlen)
{
    fd_set readfds;
    socket_type maxfd, fd;
    unsigned int i;
    int status;

    FD_ZERO(&readfds);
    maxfd = -1;
    for (i = 0; i < count; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    status = select(maxfd + 1, &readfds, NULL, NULL, NULL);
    if (status < 0)
        return INVALID_SOCKET;
    fd = INVALID_SOCKET;
    for (i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &readfds)) {
            fd = fds[i];
            break;
        }
    if (fd == INVALID_SOCKET)
        return INVALID_SOCKET;
    else
        return accept(fd, addr, addrlen);
}
This code doesn't tell the caller which port the client connected to, but you could easily add an int * parameter that would get the file descriptor that saw the incoming connection.
You only bind() to a single socket, then listen() and accept() -- the socket for the bind is for the server, the fd from the accept() is for the client. You do your select on the latter looking for any client socket that has data pending on the input.
In such a situation, you may be interested in libevent. It will do the work of the select() for you, probably using a much better interface such as epoll().
The huge drawback with select() is the use of the FD_... macros, which limit the descriptors you can watch to the number of bits in the fd_set type (FD_SETSIZE, commonly 1024 on Linux and 64 by default on Windows). If you have a small server with 2 or 3 connections, you'll be fine. If you intend to work on a much larger server, then the fd_set could easily overflow.
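For comparison, a poll()-based wait has no fd_set and therefore no such bit limit. The sketch below is a hypothetical variant of an accept-any helper that also reports which listening socket saw the connection; the function name, the which parameter and the MAX_LISTENERS bound are assumptions for illustration.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

#define MAX_LISTENERS 64   /* assumption: a small, fixed upper bound */

/* Hypothetical poll()-based variant: block until one of the listening
 * sockets has a pending connection, accept it, and store the listening
 * descriptor that saw it in *which. Returns the new socket, or -1. */
int accept_any_poll(const int fds[], size_t count, int *which,
                    struct sockaddr *addr, socklen_t *addrlen)
{
    struct pollfd p[MAX_LISTENERS];

    if (count == 0 || count > MAX_LISTENERS)
        return -1;
    for (size_t i = 0; i < count; i++) {
        p[i].fd = fds[i];
        p[i].events = POLLIN;    /* a pending connection shows up as readable */
        p[i].revents = 0;
    }
    if (poll(p, count, -1) < 0)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (p[i].revents & POLLIN) {
            *which = fds[i];
            return accept(fds[i], addr, addrlen);
        }
    }
    return -1;
}
```

The array here is still bounded, but only by a constant chosen in the program, and descriptor values above FD_SETSIZE are not a problem.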
Also, the use of the select() or poll() allows you to avoid threads in the server (i.e. you can poll() your socket and know whether you can accept(), read(), or write() to them.)
But if you really want to do it Unix like, then you want to consider fork()-ing before you call accept(). In this case you do not absolutely need the select() or poll() (unless you are listening on many IPs/ports and want all children to be capable of answering any incoming connections, but you have drawbacks with those... the kernel may send you another request while you are already handling a request, whereas, with just an accept(), the kernel knows that you are busy if not in the accept() call itself—well, it does not work exactly like that, but as a user, that's the way it works for you.)
With the fork() you prepare the socket in the main process and then call handle_request() in a child process to call the accept() function. That way you may have any number of ports and one or more children to listen on each. That's the best way to really very quickly respond to any incoming connection under Linux (i.e. as a user and as long as you have child processes wait for a client, this is instantaneous.)
void init_server(int port)
{
    int server_socket = socket();
    bind(server_socket, ...port...);
    listen(server_socket);
    for(int c = 0; c < 10; ++c)
    {
        pid_t child_pid = fork();
        if(child_pid == 0)
        {
            // here we are in a child
            handle_request(server_socket);
        }
    }
    // WARNING: this loop cannot be here, since it is blocking...
    //          you will want to wait and see which child died and
    //          create a new child for the same `server_socket`...
    //          but this loop should get you started
    for(;;)
    {
        // wait on children death (you'll need to do things with SIGCHLD too)
        // and create a new children as they die...
        wait(...);
        pid_t child_pid = fork();
        if(child_pid == 0)
        {
            handle_request(server_socket);
        }
    }
}
void handle_request(int server_socket)
{
    // here child blocks until a connection arrives on 'server_socket'
    int client_socket = accept(server_socket, ...);
    ...handle the request...
    exit(0);
}
int create_servers()
{
    init_server(80); // create a connection on port 80
    init_server(443); // create a connection on port 443
}
Note that the handle_request() function is shown here as handling one request. The advantage of handling a single request is that you can do it the Unix way: allocate resources as required and once the request is answered, exit(0). The exit(0) will call the necessary close(), free(), etc. for you.
In contrast, if you want to handle multiple requests in a row, you want to make sure that resources get deallocated before you loop back to the accept() call. Also, the sbrk() function is pretty much never going to be called to reduce the memory footprint of your child, so the child will tend to grow a little every now and then. This is why a server such as Apache2 is set up to answer a certain number of requests per child before starting a new child (by default it is between 100 and 1,000 these days).

epoll and timeouts

I'm using epoll to manage about 20 to 30 sockets. I figured out that epoll_wait can be used to wait for data to arrive on one of the sockets, but I'm missing how to implement timeouts at the socket level. I can use the timeout on epoll_wait, but it is not very useful in my case. For example, what if I need to close every socket where no activity is recorded for > 500 ms, or maybe send some data to a socket every 200 ms no matter what? How can these socket-level timeouts be implemented using epoll? Any suggestions and ideas would be appreciated!
Thanks,
Shivam Kalra
Try pairing each socket with a timer fd object (timerfd_create). For each socket in your application, create a timer that's initially set to expire after 500ms, and add the timer to the epoll object (same as with a socket—via epoll_ctl and EPOLL_CTL_ADD). Then, whenever data arrives on a socket, reset that socket's associated timer back to a 500ms timeout.
If a timer expires (because a socket has been inactive for 500ms) then the timer will become "read ready" in the epoll object and cause any thread waiting on epoll_wait to wake up. That thread may then handle the timeout for the timer's associated socket.
Sounds like you're trying to write an event loop (if so have a look at libev btw). epoll will not help you there, you have to keep track of socket inactivity yourself (clock_gettime() or gettimeofday() for instance), then wake up several times a second and check everything you need.
Some pseudo code
while (1) {
    n = epoll_wait(..., 5);
    if (n > 0) {
        /* process activity */
    } else {
        /* process inactivity */
    }
}
This will wake you up 200 times a second if all sockets are inactive.
The inactivity check requires a list of the sockets to be examined, along with the timestamp of each socket's last activity:
struct sockstamp_s {
    /* socket descriptor */
    int sockfd;
    /* last active */
    struct timeval tv;
};
/* check which socket has been inactive */
for (struct sockstamp_s *s = socklist; ...; s = next(s)) {
    if (diff(s->tv, now()) > 500) {
        /* socket s->sockfd was inactive for more than 500 ms */
        ...
    }
}
where diff() gives you the difference of 2 struct timevals and now() gives you the current timestamp.

C- Unix Sockets - Non-blocking read

I am trying to make a simple client-server chat program. On the client side I spin off another thread to read any incoming data from the server. The problem is, I want to gracefully terminate that second thread when a person logs out from the main thread. I was trying to use a shared variable 'running' to terminate it, but the socket read() call blocks, so if I do while(running == 1), the server has to send something before the read returns and the while condition can be checked again. I am looking for a method (with common Unix sockets only) to do a non-blocking read; basically some form of peek() would work, so I can continually check the loop condition to see if I'm done.
The reading thread loop is below. Right now it does not have any mutexes for the shared variables, but I plan to add that later, don't worry! ;)
void *serverlisten(void *vargp)
{
    while(running == 1)
    {
        read(socket, readbuffer, sizeof(readbuffer));
        printf("CLIENT RECIEVED: %s\n", readbuffer);
    }
    pthread_exit(NULL);
}
You can make the socket non-blocking, as suggested in another post, and use select to wait for input with a timeout, like this:
fd_set input;
FD_ZERO(&input);
FD_SET(sd, &input);
struct timeval timeout;
timeout.tv_sec = sec;
timeout.tv_usec = msec * 1000;
int n = select(sd + 1, &input, NULL, NULL, &timeout);
if (n == -1) {
    // something went wrong
} else if (n == 0)
    continue; // timeout
if (!FD_ISSET(sd, &input))
    ; // again, something went wrong
// here we can call a non-blocking read
fcntl(socket, F_SETFL, O_NONBLOCK);
or, if you have other flags:
int x;
x=fcntl(socket ,F_GETFL, 0);
fcntl(socket, F_SETFL, x | O_NONBLOCK);
then check the return value of read to see whether there was data available.
note: a bit of googling will yield you lots of full examples.
You can also use blocking sockets, and "peek" with select with a timeout. It seems more appropriate here so you don't do busy wait.
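A small sketch of that "wait, then read" pattern on a blocking socket, using poll() in place of select(). The helper name recv_with_timeout() and the -2 return code for a timeout are assumptions for illustration.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_ms for data, then read it.
 * Returns the byte count (0 = peer closed), -1 on error, -2 on timeout. */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd p;
    int rc;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    rc = poll(&p, 1, timeout_ms);
    if (rc < 0)
        return -1;      /* includes EINTR; the caller may simply retry */
    if (rc == 0)
        return -2;      /* nothing arrived: time to re-check the loop flag */
    return recv(fd, buf, len, 0);
}
```

In the reader thread, the loop would call this with a short timeout and test the shared flag after each -2, so the thread sleeps in poll() between checks and exits within one timeout period of the flag being cleared.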
The best thing is likely to get rid of the extra thread and use select() or poll() to handle everything in one thread.
If you want to keep the thread, one thing you can do is call shutdown() on the socket with SHUT_RDWR, which will shut down the connection, wake up all threads blocked on it but keep the file descriptor valid. After you have joined the reader thread, you can then close the socket. Note that this only works on sockets, not on other types of file descriptor.
Look for function setsockopt with option SO_RCVTIMEO.

Resources