C- Unix Sockets - Non-blocking read - c

I am trying to make a simple client-server chat program. On the client side I spin off another thread to read any incomming data from the server. The problem is, I want to gracefully terminate that second thread when a person logs out from the main thread. I was trying to use a shared variable 'running' to terminate, problem is, the socket read() command is a blocking command, so if I do while(running == 1), the server has to send something before the read returns and the while condition can be checked again. I am looking for a method (with common unix sockets only) to do a non-blocking read, basically some form of peek() would work, for I can continually check the loop to see if I'm done.
The reading thread loop is below, right now it does not have any mutex's for the shared variables, but I plan to add that later don't worry! ;)
void *serverlisten(void *vargp)
{
while(running == 1)
{
read(socket, readbuffer, sizeof(readbuffer));
printf("CLIENT RECIEVED: %s\n", readbuffer);
}
pthread_exit(NULL);
}

You can make socket not blockable, as suggested in another post plus use select to wait input with timeout, like this:
fd_set input;
FD_ZERO(&input);
FD_SET(sd, &input);
struct timeval timeout;
timeout.tv_sec = sec;
timeout.tv_usec = msec * 1000;
int n = select(sd + 1, &input, NULL, NULL, &timeout);
if (n == -1) {
//something wrong
} else if (n == 0)
continue;//timeout
if (!FD_ISSET(sd, &input))
;//again something wrong
//here we can call not blockable read

fcntl(socket, F_SETFL, O_NONBLOCK);
or, if you have other flags:
int x;
x=fcntl(socket ,F_GETFL, 0);
fcntl(socket, F_SETFL, x | O_NONBLOCK);
then check the return value of read to see whether there was data available.
note: a bit of googling will yield you lots of full examples.
You can also use blocking sockets, and "peek" with select with a timeout. It seems more appropriate here so you don't do busy wait.

The best thing is likely to get rid of the extra thread and use select() or poll() to handle everything in one thread.
If you want to keep the thread, one thing you can do is call shutdown() on the socket with SHUT_RDWR, which will shut down the connection, wake up all threads blocked on it but keep the file descriptor valid. After you have joined the reader thread, you can then close the socket. Note that this only works on sockets, not on other types of file descriptor.

Look for function setsockopt with option SO_RCVTIMEO.

Related

Adding new FDs to fd_Set while blocking on select

I have a question regarding adding new socket file descriptors to an FDSET. Lets say we've already connected to a socket s1:
fd_set readfds;
//s1 = socket(...);
//connect(s1, ...)...
FD_ZERO(&readfds);
FD_SET(s1, &readfds);
and we are waiting for data to come down the socket, by calling select in a thread:
socket_reader_thread() {
for (;;)
{
int rv = select(n, &readfds, NULL, NULL, &tv);
if (rv == -1) {
perror("select"); // error occurred in select()
}
else if (rv == 0) {
printf("Timeout occurred! No data after 10.5 seconds.\n");
}
else {
// one the descriptors have data
.....
}
}
}
If I now wanted to add another socket (or may be two more socket etc) to the readfds set, given that select is blocking, how should I proceed? how can I interrupt select
Is the trick to add a zero timeout and use select like poll?
You need to use the "pipe-trick".
This is where an additional socket or pipe is created add it to the fd_set.
Then to interrupt a running or pending select, send a 1 byte message to it via another thread.
The select will then return and if the special pipe FD is one of the ones that are ready in the set, that means you need to say look at a list or something "do work" - like add any new FDs to the fd_set before returning to the select call.
You can interrupt select by sending (and catching) a signal to your process, for example using raise. select will return in this case with -1 and errno set to EINTR. You can then change the events you want to wait for and call select again.
Is the trick to add a zero timeout and use select like poll?
One can simply use a timeout of 0 in which case it will just do a non-blocking check if any of the events got triggered, i.e. polling. But this should only be done in a few cases since busy polling instead of a blocking wait uses lots of resources of machine. And I would even consider the interrupting of a blocking select a questionable design, although probably not as bad as busy polling.

What is meant by select() and poll() are stateless?

I am trying to understand how epoll() is different from select() and poll(). select() and poll() are pretty similar. select() allows you to monitor multiple file descriptors and it checks if any of those file descriptors are available for an operation (e.g. read, write) without blocking. When the timeout expires, select() returns the file descriptors that are ready and the program can perform the operations on those file descriptors without blocking.
...
FD_ZERO(&rfds);
FD_SET(0, &rfds);
/* Wait up to five seconds. */
tv.tv_sec = 5;
tv.tv_usec = 0;
retval = select(1, &rfds, NULL, NULL, &tv);
/* Don’t rely on the value of tv now! */
if (retval == -1)
perror("select()");
else if (retval)
printf("Data is available now.\n");
/* FD_ISSET(0, &rfds) will be true. */
else
printf("No data within five seconds.\n");
...
poll() is a little more flexible in that it does not rely on bitmap, but array of file descriptors. Also, since poll() uses separate fields for requested (events) and result (revents), you don't have to worry to refill the sets that were overwritten by kernel.
...
struct pollfd fds[2];
fds[0].fd = open("/dev/dev0", ...);
fds[1].fd = open("/dev/dev1", ...);
fds[0].events = POLLOUT | POLLWRBAND;
fds[1].events = POLLOUT | POLLWRBAND;
ret = poll(fds, 2, timeout_msecs);
if (ret > 0) {
for (i=0; i<2; i++) {
if (fds[i].revents & POLLWRBAND) {
...
However, I read that there is an issue with poll() too since both select() and poll() are stateless; the kernel does not internally maintain the requested sets. I read this:
Suppose that there are 10,000 concurrent connections. Typically, only
a small number of file descriptors among them, say 10, are ready to
read. The rest 9,990 file descriptors are copied and scanned for no
reason, for every select()/poll() call. As mentioned earlier, this
problem comes from the fact that those select()/poll() interfaces are
stateless.
I don't understand what is meant by the file descripters are "copied" and "scanned". Copied where? And I don't know what is meant by "stateless". Thanks for clarification.
"Stateless" means "Does not retain anything between two calls". So kernel need to rebuild many things for mainly nothing in the mentioned example.

select for multiple non-blocking connections

I have a single threaded program. It sends message to four destinations every five seconds. I don't want connect() to be blocked. So I am writing my program like this:
int j, rc, non_blocking=1, sockets[4], max_fd=0;
struct sockaddr server=get_server_addr();
fd_set fdset;
const struct timeval conn_timeout = { 2, 0 }; /* 2 seconds */
for (j=0; j<4; ++j)
{
sockets[j]=socket( AF_INET, SOCK_STREAM, 0 );
ioctl(sockets[j], FIONBIO, (char *)&non_blocking);
connect(sockets[j], &server, sizeof (server));
}
/* prepare fd_set */
FD_ZERO ( &fdset );
for (j=0;j<4;++j)
{
if (sockets[j] != -1 )
{
FD_SET ( sockets[j], &fdset );
if ( sockets[j] > max_fd )
{
max_fd = sockets[j];
}
}
}
rc=select(max_fd + 1, NULL, &fdset, NULL, &conn_timeout );
if(rc > 0)
{
for (j=0;j<4;++j)
{
if(sockets[j]!=-1 && FD_ISSET(sockets[j],&fdset))
{
/* send() */
}
}
}
/* close all valid sockets */
However, it seems select() returns immediately after ONE file descriptor is ready instead of blocking for conn_timeout (2 seconds). So in this case how can I achieve my targets?
The program continues if all sockets are ready.
The program can block there for 2 seconds if any one of sockets are not ready.
Yeah, select was designed on the assumption that you would want to service each socket as soon as it became ready.
If I understand what you're trying to do, then the simplest way to accomplish it will be to remove each socket from the fdset as it becomes ready. If there are any sockets left in the set, use gettimeofday to adjust the timeout downward, and call select again. When the set is empty, all four sockets are usable and you can proceed.
There are three basic approaches:
If you want to stay strictly portable you need to iterate:
calculate end time from current time and timeout of your choice
Cycle:
-- Create fdset with those fds not yet ready
-- calculate max time to wait
-- select()
-- remeber those fds that are now ready
-- break if end time reached or all fds ready
End cycle
Now you have knowledge of the ready fds and the elapsed time
If you want to stay portable, but can use threads:
start n threads
select on one fd per thread
join all threads
If you do not need to be portable: Most OSes have a facility for such a situation, e.g. Windows/.NET has WaitAll (together with async send and an event)
I don't see the connection between your stated targets and your stated problem. You are correct in saying that select() blocks until at least one socket is ready, but according to target #2 above that is exactly what you want. There's nothing in your stated targets about blocking until all four sockets are ready at the same time.
You should also note that sockets are almost always ready for writing, unless the send buffer is full, which means the receiver's receive buffer is full, which means the receiver is slower than the sender. So using select() alone as the underlying write timer isn't a good idea.

Listen to multiple ports from one server

Is it possible to bind and listen to multiple ports in Linux in one application?
For each port that you want to listen to, you:
Create a separate socket with socket.
Bind it to the appropriate port with bind.
Call listen on the socket so that it's set up with a listen queue.
At that point, your program is listening on multiple sockets. In order to accept connections on those sockets, you need to know which socket a client is connecting to. That's where select comes in. As it happens, I have code that does exactly this sitting around, so here's a complete tested example of waiting for connections on multiple sockets and returning the file descriptor of a connection. The remote address is returned in additional parameters (the buffer must be provided by the caller, just like accept).
(socket_type here is a typedef for int on Linux systems, and INVALID_SOCKET is -1. Those are there because this code has been ported to Windows as well.)
socket_type
network_accept_any(socket_type fds[], unsigned int count,
struct sockaddr *addr, socklen_t *addrlen)
{
fd_set readfds;
socket_type maxfd, fd;
unsigned int i;
int status;
FD_ZERO(&readfds);
maxfd = -1;
for (i = 0; i < count; i++) {
FD_SET(fds[i], &readfds);
if (fds[i] > maxfd)
maxfd = fds[i];
}
status = select(maxfd + 1, &readfds, NULL, NULL, NULL);
if (status < 0)
return INVALID_SOCKET;
fd = INVALID_SOCKET;
for (i = 0; i < count; i++)
if (FD_ISSET(fds[i], &readfds)) {
fd = fds[i];
break;
}
if (fd == INVALID_SOCKET)
return INVALID_SOCKET;
else
return accept(fd, addr, addrlen);
}
This code doesn't tell the caller which port the client connected to, but you could easily add an int * parameter that would get the file descriptor that saw the incoming connection.
You only bind() to a single socket, then listen() and accept() -- the socket for the bind is for the server, the fd from the accept() is for the client. You do your select on the latter looking for any client socket that has data pending on the input.
In such a situation, you may be interested by libevent. It will do the work of the select() for you, probably using a much better interface such as epoll().
The huge drawback with select() is the use of the FD_... macros that limit the socket number to the maximum number of bits in the fd_set variable (from about 100 to 256). If you have a small server with 2 or 3 connections, you'll be fine. If you intend to work on a much larger server, then the fd_set could easily get overflown.
Also, the use of the select() or poll() allows you to avoid threads in the server (i.e. you can poll() your socket and know whether you can accept(), read(), or write() to them.)
But if you really want to do it Unix like, then you want to consider fork()-ing before you call accept(). In this case you do not absolutely need the select() or poll() (unless you are listening on many IPs/ports and want all children to be capable of answering any incoming connections, but you have drawbacks with those... the kernel may send you another request while you are already handling a request, whereas, with just an accept(), the kernel knows that you are busy if not in the accept() call itself—well, it does not work exactly like that, but as a user, that's the way it works for you.)
With the fork() you prepare the socket in the main process and then call handle_request() in a child process to call the accept() function. That way you may have any number of ports and one or more children to listen on each. That's the best way to really very quickly respond to any incoming connection under Linux (i.e. as a user and as long as you have child processes wait for a client, this is instantaneous.)
void init_server(int port)
{
int server_socket = socket();
bind(server_socket, ...port...);
listen(server_socket);
for(int c = 0; c < 10; ++c)
{
pid_t child_pid = fork();
if(child_pid == 0)
{
// here we are in a child
handle_request(server_socket);
}
}
// WARNING: this loop cannot be here, since it is blocking...
// you will want to wait and see which child died and
// create a new child for the same `server_socket`...
// but this loop should get you started
for(;;)
{
// wait on children death (you'll need to do things with SIGCHLD too)
// and create a new children as they die...
wait(...);
pid_t child_pid = fork();
if(child_pid == 0)
{
handle_request(server_socket);
}
}
}
void handle_request(int server_socket)
{
// here child blocks until a connection arrives on 'server_socket'
int client_socket = accept(server_socket, ...);
...handle the request...
exit(0);
}
int create_servers()
{
init_server(80); // create a connection on port 80
init_server(443); // create a connection on port 443
}
Note that the handle_request() function is shown here as handling one request. The advantage of handling a single request is that you can do it the Unix way: allocate resources as required and once the request is answered, exit(0). The exit(0) will call the necessary close(), free(), etc. for you.
In contrast, if you want to handle multiple requests in a row, you want to make sure that resources get deallocated before you loop back to the accept() call. Also, the sbrk() function is pretty much never going to be called to reduce the memory footprint of your child. This means it will tend to grow a little bit every now and then. This is why a server such as Apache2 is setup to answer a certain number of requests per child before starting a new child (by default it is between 100 and 1,000 these days.)

C - How do you use select() with multiple pipes?

I'm having a difficult time figuring out how select() is suppose to work with pipes in UNIX. I've scanned the man pages several times, and I don't completely understand the given definition.
From reading the man pages, I was under the impression that select() would make the system wait until one of the file descriptors given could make a read (in my case) from a pipe without blocking.
Here's some of my outline code[EDITED]:
int size, size2;
fd_set rfds;
struct timeval tv;
char buffer[100];
char buffer2[100];
int retval;
while(1)
{
FD_ZERO(&rfds);
FD_SET(fd[0], &rfds);
FD_SET(fd2[0], &rfds);
tv.tv_sec = 2;
tv.tv_usec = 0;
retval = select(2, &rfds, NULL, NULL, &tv); //2 seconds before timeout
if(retval == -1)
perror("Select failed.\n");
else if(retval)
{
size = read(fd[0], buffer, sizeof(buffer));
if(size > 0)
printf("Parent received from even: %s\n", buffer);
size2 = read(fd2[READ], buffer2, sizeof(buffer2));
if(size2 > 0)
printf("Parent received from odd: %s\n", buffer2);
}
else
printf("No data written to pipe in 2 last seconds.\n");
}
I have two pipes here. Two children processes are writing to their respective pipes and the parent has to read them both in.
As a test, I write a small string to each pipe. I then attempt to read them in and prevent blocking with select. The only thing that gets printed out is the string from the even pipe. It appears to still be blocking. I am becoming frustrated as I feel like I'm missing something on the man pages. Could someone tell me what I'm doing wrong?
After select() returns, 0 or more of your file descriptors will be in a "ready" state where you can read them without blocking. But if you read one that's not ready, it will still block. Right now you are reading all of them, and since select() only waits until one is ready, it's very likely that another will not be.
What you need to do is figure out which ones are ready, and only read() from them. The return value of select() will tell you how many are ready, and you can ask if a specific one is ready with the ISSET() macro.
You need to use FD_ZERO() - see select before the FD_SET
Set the timeout values just before the select. Those values are changed through the select

Resources