Waiting for data via select() not working in C

I'm currently working on a project that involves multiple clients connected to a server, waiting for data. I'm using select() to monitor the connection for incoming data. However, the client just keeps printing empty output, acting as if select() has detected incoming data. Perhaps I'm approaching this wrong?
The first piece of data the server sends is displayed correctly, but then the server disconnects and the client continues to spew blank lines.
FD_ZERO(&readnet);
FD_SET(sockfd, &readnet);

while (1) {
    rv = select(socketdescrip, &readnet, NULL, NULL, &timeout);
    if (rv == -1) {
        perror("select"); // error occurred in select()
    } else if (rv == 0) {
        printf("Connection timeout! No data after 10 seconds.\n");
    } else {
        // one or both of the descriptors have data
        if (FD_ISSET(sockfd, &readnet)) {
            numbytes = recv(sockfd, buf, sizeof buf, 0);
            printf("Data Received\n");
            buf[numbytes] = '\0';
            printf("client: received '%s'\n", buf);
            sleep(10);
        }
    }
}

I think you need to check the result of recv. If it returns zero, I believe it means the server has closed the socket.
Also, the first argument to select() must be one more than the highest descriptor you're watching, so you need to pass socketdescrip + 1.

If I remember correctly, you need to initialise the set of fds before each call to select(), because select() modifies it.
So move FD_ZERO() and FD_SET() inside the loop, just before select().
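Putting all of this together - re-initialising the set each iteration, passing sockfd + 1, and checking the return value of recv() - the loop might look something like this sketch:

while (1) {
    fd_set readnet;
    FD_ZERO(&readnet);                  /* reinitialise the set every iteration */
    FD_SET(sockfd, &readnet);

    struct timeval timeout = {10, 0};   /* select() may modify this, so reset it too */

    int rv = select(sockfd + 1, &readnet, NULL, NULL, &timeout);
    if (rv == -1) {
        perror("select");
        break;
    } else if (rv == 0) {
        printf("Connection timeout! No data after 10 seconds.\n");
        continue;
    }

    if (FD_ISSET(sockfd, &readnet)) {
        int numbytes = recv(sockfd, buf, sizeof buf - 1, 0);  /* leave room for '\0' */
        if (numbytes == 0) {            /* server closed the connection */
            printf("Server disconnected.\n");
            break;
        } else if (numbytes < 0) {
            perror("recv");
            break;
        }
        buf[numbytes] = '\0';
        printf("client: received '%s'\n", buf);
    }
}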

acting as if select has discovered incoming data. Perhaps I'm attacking this wrong?
In addition to what was said before, I'd like to note that select()/poll() don't tell you that "data are there" but rather that the next corresponding system call will not block. That's it. As was said above, recv() doesn't block and properly returns 0, which means EOF - the connection was closed by the other side.
Though on most *nix systems, in this case only the first call of recv() would return 0; subsequent calls would return -1. When using async I/O, rigorous error checking is a must!
And personally I would strongly suggest using poll() instead. Unlike select(), it doesn't destroy its arguments, and it works fine with high-numbered socket descriptors.
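For example, the same client loop with poll() might look like this (a sketch; note the pollfd can be set up once, outside the loop, because poll() reports readiness in the separate revents field):

#include <poll.h>

struct pollfd pfd;
pfd.fd = sockfd;
pfd.events = POLLIN;

while (1) {
    int rv = poll(&pfd, 1, 10000);      /* timeout in milliseconds */
    if (rv == -1) {
        perror("poll");
        break;
    } else if (rv == 0) {
        printf("Connection timeout! No data after 10 seconds.\n");
        continue;
    }
    if (pfd.revents & POLLIN) {
        ssize_t numbytes = recv(sockfd, buf, sizeof buf - 1, 0);
        if (numbytes <= 0)              /* 0 = peer closed, -1 = error */
            break;
        buf[numbytes] = '\0';
        printf("client: received '%s'\n", buf);
    }
}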

When the server closes the connection, it sends a packet carrying the FIN flag to the client side to announce that it will send no more data. The packet is processed by the TCP/IP stack on the client side and carries no data for the application level. The application is notified, which triggers select() because something happened on the monitored file descriptor, and recv() returns 0 bytes because no data was sent by the server.

Is this true when talking about your code?
select(highest_file_descriptor+1, &readnet, NULL, NULL, &timeout);
In your simple example (with FD_ZERO and FD_SET moved inside the while(1) loop as qrdl said) it should look like this:
select(sockfd+1, &readnet, NULL, NULL, &timeout);
Also - please note that when recv() returns 0 bytes read, it means the connection was closed - no more data! Your code is also buggy: when something bad happens in recv() (it returns < 0 in that case) you will have serious trouble, because something like buf[-1] may lead to unpredictable results. Please handle this case properly.
While I respect the fact that you're trying to use the low-level BSD sockets API, I must say that I find it awfully inefficient. That's why I recommend, if possible, using ACE, a very efficient and productive framework that has a lot of things already implemented when it comes to network programming (ACE_Reactor, for example, makes it easier to do what you're trying to achieve here).

connect(), accept() and select() happening sequence order

I am a newbie in C. I just noticed that the connect() function on the client side can return as soon as the TCP three-way handshake is finished. I mean, connect() can even return before accept() is called on the server side (correct me if I am wrong). Based on this, my question is: when I then call select() on the client side and watch the file descriptor for writability, does a successful return from select() mean that the server side has already called accept(), so I can now safely write to the server? Many thanks for your time.
int flags = fcntl(fd, F_GETFL);
flags |= O_NONBLOCK;
fcntl(fd, F_SETFL, flags);

if (connect(fd, (struct sockaddr *)saptr, salen) < 0)
{
    if (errno != EINPROGRESS)
        /* error_return */
}

fd_set set;
FD_ZERO(&set);
FD_SET(fd, &set);
select(FD_SETSIZE, NULL, &set, NULL, &timeout);
/* Here, if select returns 1, that means accept() is already called
   on the server side, and now I can safely write to the server, right? */
when select() successfully returns, that means the server side has already called accept()
No, not necessarily. connect() returns when the connection attempt is complete, having either succeeded or failed. On the remote side, this is handled by the network stack, outside the context of any application. The subsequent accept() itself produces no additional communication.
and now I can safely write to the server side, right?
There are all kinds of things you could mean by "safely", but if you mean that the local side can write at least one byte without blocking then yes, select() promises you that. Whatever you successfully write will be sent over the wire to the remote side. It may be buffered there for a time, depending on the behavior of the software on the remote end. Whether that software has yet accept()ed the connection or not is not directly relevant to that question.
Update: note also that the network stack maintains a per-socket queue of established connections that have not yet been accept()ed (its backlog). This queuing behavior is one reason why a server might not accept() connections immediately after they are established, especially under heavy load.
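As an aside: once select() reports the descriptor writable, the conventional way to find out whether the non-blocking connect() actually succeeded (rather than failed with, say, ECONNREFUSED) is getsockopt() with SO_ERROR. A sketch, reusing fd from the question's code:

int err = 0;
socklen_t errlen = sizeof err;
if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0) {
    /* getsockopt itself failed */
} else if (err != 0) {
    /* the connect failed; err holds the errno, e.g. ECONNREFUSED */
} else {
    /* the connection is established; safe to write */
}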
'I mean connect() can even return before the accept() on the server side is called'
Yes, it can, and does.
when I call select() afterwards on the client side, and watch the file descriptor to wait for it to be writeable, when select() successfully returns, that means the server side has already called accept() and now I can safely write to the server side, right?
Sure. Write away:)

Use semaphores for handling sockets in C

I have the following piece of code:
SOCKET sock = open_socket(szListenHost, iListenPort);
if (sock > 0) {
    SOCKET client;
    struct sockaddr_in peeraddr;
    T_socklen len = sizeof (struct sockaddr_in);
    char buf[1024];

    sin.dwFlags = STARTF_USESTDHANDLES | STARTF_USESHOWWINDOW;
    sin.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
    sin.hStdOutput = GetStdHandle(STD_OUTPUT_HANDLE);
    sin.hStdError = GetStdHandle(STD_ERROR_HANDLE);
    sin.wShowWindow = SW_HIDE;
    dwCreationFlags = CREATE_NO_WINDOW;
    CreateProcess(NULL, buf, NULL, NULL, FALSE, dwCreationFlags,
                  NULL, NULL, &sin, &pin);

    memset(&peeraddr, 0, sizeof (struct sockaddr_in));
    client = accept(sock, (sockaddr*)&peeraddr, &len);
    if (client > 0) {
        rv = message_loop(client);
    }
    closesocket(sock);
}
As you can see, this is opening a TCP socket for interrogation purposes.
The situation is the following: my client application (which is opening those sockets) might need to open several TCP sockets simultaneously, which might cause problems.
In order to avoid those problems, I would like to ask whether the socket is already opened. If yes, then wait until the socket is freed again and then try again to open the socket.
I have understood that semaphores can be used for this intention, but I have no idea how to do this.
Can anybody help me?
Thanks
First I'd like to thank John Bollinger for the fast response. Unfortunately there seems to be a misunderstanding: I am not looking for a way to open one socket several times simultaneously, but for a way to be notified when a socket becomes available. In fact, I would like to do the following. Instead of:
SOCKET sock = open_socket(szListenHost, iListenPort);
I could do this (very basically):
while (open_socket(szListenHost, iListenPort) <= 0) { sleep(1); } /* retry until the port is free */
This, however, means that I would need to poll the socket constantly, creating quite some overhead. I have heard that semaphores could solve this issue, something like:
SOCKET sock = handle_semaphore(open_socket(szListenHost, iListenPort));
where "handle_semaphore" would be a mechanism that automatically waits for the socket to be released, so that my client process can open the socket immediately, without the risk of being pushed behind. As you can see, it's all based on rumours, and I have no idea how to realise it. Does anybody know whether semaphores can indeed be used for this purpose and, if possible, give me some guidance on how to do it?
Thanks
Once opened, a socket cannot be reopened, even if it is closed. You can create a similar, new socket, though. Either way, it is difficult to reliably determine whether a previously-opened socket has been closed, except by closing it.
In any case, the usual paradigm does not require the kind of coordinating mechanism you ask about. Normally, one thread of one process would open the socket and have responsibility for accepting connections on it. If it is desired that the program be able to handle more than one connection at a time, then each time that thread accepts a new connection, it assigns that connection to be handled by another thread or process -- typically, but not necessarily, a newly-created one.
It is not usually necessary or desirable to open a new socket to receive additional connections at the same address and port. Usually you just use the same socket, without caring about the state of any connections already established via that socket. You could, perhaps, use a semaphore to coordinate multiple threads of the same process receiving connections from the same socket, but I would avoid that if I were you.
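For illustration, the usual paradigm described above might look roughly like this (a sketch using POSIX threads and the SOCKET/T_socklen/message_loop names from your code; error handling kept minimal):

#include <pthread.h>
#include <stdlib.h>

static void *connection_thread(void *arg)
{
    SOCKET client = *(SOCKET *)arg;
    free(arg);
    message_loop(client);       /* serve this one connection */
    closesocket(client);
    return NULL;
}

/* the single accepting thread: one listening socket, many connections */
for (;;) {
    struct sockaddr_in peeraddr;
    T_socklen len = sizeof peeraddr;
    SOCKET client = accept(sock, (struct sockaddr *)&peeraddr, &len);
    if (client <= 0)
        continue;               /* inspect errno, log, retry */

    SOCKET *arg = malloc(sizeof *arg);
    *arg = client;
    pthread_t tid;
    pthread_create(&tid, NULL, connection_thread, arg);
    pthread_detach(tid);        /* the thread cleans up after itself */
}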
Meanwhile the situation has changed: I have now been able to add semaphores to my socket-related application. Generally this works, but sometimes the application hangs.
After some debugging I have understood that the application hangs at the moment it executes the following C code:
printf("Will it be accepted?\n");
fflush(stdout);
memset(&peeraddr, 0, sizeof (struct sockaddr_in));
client = accept(sock, (sockaddr*)&peeraddr, &len);
printf("It is accepted, the client is %d.\n",client);
=> I can see in my debug log "Will it be accepted?", but I don't see "It is accepted, ...".
I admit that I am quite violent while testing (sometimes I stop debugging sessions without giving the application a chance to close the socket, ...), but you can imagine customers behaving the same way, so the application needs to be sufficiently robust.
Does anybody know how I can prevent the accept() call from blocking forever like this?
Thanks

select() returns invalid argument

I am successfully reading from a pipe from another thread, and printing the output (in an ncurses window as it happens).
I need to do this one character at a time, for various reasons, and I'm using a select() on the FD for the read end of the pipe, along with a few other FDs (like stdin).
My idea is to attempt to read from the pipe only when it is imminently ready to be read, in preference to handling any input. This seems to be working - at least at first. select() fills in the fd_set, and if FD_ISSET I do my read() of 1 byte from the FD. But select() says yes one time too many, and the read() blocks.
So my question is this - why would select() report that a fd is ready for reading, if a subsequent read() blocks?
(Approximately) this same code worked fine when the other end of the pipe was connected to a forked process, if that helps.
I can post the code on request, but it's bog standard: set up an fd_set, copy it, select() on the copy, call a function that reads a byte from the FD if it is set... otherwise restore the fd_set copy.
EDIT: by request, here's the code:
Setting up my fd_set:
fd_set fds;
FD_ZERO(&fds);
FD_SET(interp_output[0], &fds);
FD_SET(STDIN_FILENO, &fds);

struct timeval timeout, tvcopy;
timeout.tv_sec = 1;

int maxfd = interp_output[0] + 1; // always > stdin+1

fd_set read_fds;
FD_COPY(&fds, &read_fds);
In a loop:
if (select(maxfd, &read_fds, NULL, NULL, &timeout) == -1) {
    perror("couldn't select");
    return;
}
if (FD_ISSET(interp_output[0], &read_fds)) {
    handle_interp_out();
} else if (FD_ISSET(STDIN_FILENO, &read_fds)) {
    //waddstr(cmdwin, "stdin!"); wrefresh(cmdwin);
    handle_input();
}
FD_COPY(&fds, &read_fds);
handle_interp_out():
void handle_interp_out() {
    int ch;
    read(interp_output[0], &ch, 1);
    if (ch > 0) {
        if (ch == '\n') {
            if (cmd_curline >= cmdheight)
                cmdscroll();
            wmove(cmdwin, ++cmd_curline, 1);
        } else {
            waddch(cmdwin, ch);
        }
        wrefresh(cmdwin);
    }
}
EDIT 2: The write code is just an fprintf() on a FILE* opened with fdopen(interp_output[1], "w") - this is in a different thread. All I'm getting is my "prompt> " - it prints all that properly, but does one more iteration than it should. I've turned off buffering, which was giving me other problems.
EDIT 3: This has become a problem with my invocation of select(). It appears that, right away, it returns -1 with errno set to 'invalid argument'. The read() doesn't know that and just keeps going. What could be wrong with my select()? I've updated the code and changed the title to more accurately reflect the problem...
EDIT 4: So now I'm thoroughly confused. The timeout value of .tv_sec=1 was no good, somehow. By getting rid of it, the code works just fine. If anybody has any theories, I'm all ears. I would just leave it at NULL, except this thread needs to do periodic updates.
To absolutely guarantee that the read will never block, you must set O_NONBLOCK on the fd.
Your select error is almost certainly caused by not setting the entire time struct. You're only setting the seconds; the other field will contain garbage data picked up from the stack.
Use struct initialization. That will guarantee the other fields are set to 0.
It would look like this:
struct timeval timeout = {1, 0};
Also, in your select loop you should be aware that Linux will write the time remaining into the timeout value. That means that it will not be 1 second the next time through the loop unless you reset the value to 1 second.
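So inside the loop you'd reset both the timeout and the descriptor set before every call, something like this (a sketch reusing fds and maxfd from your code; fd_set is a plain struct, so assignment copies it):

for (;;) {
    struct timeval timeout = {1, 0};   /* fully initialised, reset each pass */
    fd_set read_fds = fds;             /* restore the set that select() modified */

    if (select(maxfd, &read_fds, NULL, NULL, &timeout) == -1) {
        perror("couldn't select");
        return;
    }
    /* ... FD_ISSET checks as before ... */
}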
According to the manpage:
On error, -1 is returned, and errno is set appropriately; the sets and timeout become undefined, so do not rely on their contents after an error.
You are not checking the return code from select().
The most likely explanation is that select() is being interrupted (errno=EINTR) and so returning an error, and the FD bit is still set in the "read" fd_set giving you the behaviour you are seeing.
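In other words, check for EINTR explicitly and restart the call, for instance (a fragment for the loop above; needs <errno.h>):

int rv = select(maxfd, &read_fds, NULL, NULL, &timeout);
if (rv == -1) {
    if (errno == EINTR)
        continue;              /* interrupted by a signal: rebuild the sets and retry */
    perror("couldn't select"); /* a real error */
    return;
}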
Incidentally, it is a very bad idea to name variables after standard/system/common functions. "read_fds" is a MUCH better name than "read".
It's correct behaviour. See the select() manpage on Linux, for example: http://linux.die.net/man/2/select
"Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks"
The only solution is to use a non-blocking socket.
The answer is "it wouldn't". The behaviour you describe should never happen. Something else is going wrong.
I have been in a similar situation many times, and usually it turned out that there was a silly typo or cut and paste error elsewhere that led to behaviour that I mis-diagnosed.
If you post your code maybe we can help better - post the write code as well as the read please.
Also consider using asynchronous IO, even if only for debug purposes. If what you suspect is happening really is happening, then read() will fail with EWOULDBLOCK.
Also, you say you are copying the fd_set. How? Can you post your code for that?

Dealing with data on multiple TCP connections with epoll

I have an application that is going to work like p2p software, where all peers talk to each other. Since the communication will be TCP, I thought that I could use epool(4) so that multiple connections can be handled. Since each peer will send data very often, I thought I would establish a persistent connection to each peer that is used for the application's lifetime.
Now, one thing that I don't know how to handle: since the connection is never closed, how do I know when I should stop receiving data with read() and call epool_wait() again to listen for more packets? Or is there a better way of dealing with persistent TCP connections?
You should set the socket to non-blocking, and when epoll indicates there is data to read, you should call read() in a loop until read() returns -1 and errno is EWOULDBLOCK.
That is, your read loop could look something like:
for (;;) {
    ssize_t ret;

    ret = read(...);
    if (ret == 0) {
        // client disconnected; handle it and remove the fd from the epoll set
        break;
    } else if (ret == -1) {
        if (errno == EWOULDBLOCK) {
            // no more data for now - return to the epoll loop
        } else {
            // an error occurred; handle it and remove the fd from the epoll set
        }
        break;
    }

    // handle the read data
}
If you're not using edge-triggered mode with epoll, you don't really need the loop - you could get away with doing just one read and returning to the epoll loop. But handle the return values just like the above code.
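For context, the surrounding setup and event loop might look roughly like this (a sketch; peer_fd stands for one already-connected peer socket, and EPOLLET selects the edge-triggered mode that makes the drain-until-EWOULDBLOCK loop above necessary):

#include <sys/epoll.h>
#include <fcntl.h>

int epfd = epoll_create1(0);

/* make the peer socket non-blocking and register it */
fcntl(peer_fd, F_SETFL, fcntl(peer_fd, F_GETFL) | O_NONBLOCK);

struct epoll_event ev = {0};
ev.events = EPOLLIN | EPOLLET;      /* edge-triggered read readiness */
ev.data.fd = peer_fd;
epoll_ctl(epfd, EPOLL_CTL_ADD, peer_fd, &ev);

for (;;) {
    struct epoll_event events[64];
    int n = epoll_wait(epfd, events, 64, -1);
    for (int i = 0; i < n; i++) {
        if (events[i].events & EPOLLIN) {
            /* drain events[i].data.fd with the read loop shown above */
        }
    }
}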
That should have been 'epoll', not 'epool'... I'm not familiar with epoll, but have a look at Beej's guide for an example of sockets using poll(): section 7.2 shows how it is done, and section 9.17 covers the usage of poll().
Hope this helps,
Best regards,
Tom.
read() reads as much data as is immediately available (but no more than you request). Just run read() on the active socket with a big-enough buffer (you probably don't need it bigger than your MTU… 2048 bytes will do) and call epoll_wait() again when it completes.

select on UDP socket doesn't end when socket is closed - what am I doing wrong?

I'm working on Linux system (Ubuntu 7.04 server with a 2.6.20 kernel).
I've got a program that has a thread (thread1) waiting on a select for a UDP socket to become readable.
I'm using the select (with my socket as the single readfd and the single exceptfd) instead of just calling recvfrom because I want a timeout.
From another thread, I shutdown and close the socket.
If I do this while thread1 is blocked in a recvfrom, then the recvfrom will terminate immediately.
If I do this while thread1 is blocked in a select with a timeout, then the select will NOT terminate immediately, but will eventually timeout properly.
Can anyone tell me why the select doesn't exit as soon as the socket is closed? Isn't that an exception? I can see why it isn't readable (obviously), but it's closed, which seems to me to be exceptional.
Here's the opening of the socket (all error handling removed to keep things simple):
m_sockfd = socket(PF_INET, SOCK_DGRAM, 0);

struct sockaddr_in si_me;
memset((char *) &si_me, 0, sizeof(si_me));
si_me.sin_family = AF_INET;
si_me.sin_port = htons(port);
si_me.sin_addr.s_addr = htonl(INADDR_ANY);

if (bind(m_sockfd, (struct sockaddr *)(&si_me), sizeof(si_me)) < 0)
{
    // deal with error
}
Here's the select statement that thread1 executes:
struct timeval to;
to.tv_sec = timeout_ms / 1000;            // just the seconds portion
to.tv_usec = (timeout_ms % 1000) * 1000;  // just the milliseconds,
                                          // converted to microseconds

// watch our one fd for readability or exceptions
fd_set readfds, exceptfds;
FD_ZERO(&readfds);
FD_SET(m_sockfd, &readfds);
FD_ZERO(&exceptfds);
FD_SET(m_sockfd, &exceptfds);

int nsel = select(m_sockfd + 1, &readfds, NULL, &exceptfds, &to);
UPDATE: Obviously (as stated below), closing the socket isn't an exceptional condition from select's point of view. I think what I need to know is: why? And is that intentional?
I REALLY want to understand the thinking behind this select behaviour, because it runs counter to my expectations, so I obviously need to adjust my mental model of how the TCP stack works. Please explain it to me.
Maybe you should use something else to wake up the select. Maybe a pipe or something like that.
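That's the classic self-pipe trick: add the read end of a pipe to the select() set, and have the other thread write a byte to wake thread1 up instead of closing the socket out from under it. A sketch, reusing m_sockfd and to from your code:

int wakeup[2];
pipe(wakeup);   /* wakeup[0]: read end for thread1, wakeup[1]: write end */

/* thread1's wait */
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(m_sockfd, &readfds);
FD_SET(wakeup[0], &readfds);
int maxfd = (m_sockfd > wakeup[0] ? m_sockfd : wakeup[0]) + 1;

int nsel = select(maxfd, &readfds, NULL, NULL, &to);
if (nsel > 0 && FD_ISSET(wakeup[0], &readfds)) {
    char c;
    read(wakeup[0], &c, 1);   /* drain the wake-up byte */
    /* shut down cleanly; thread1 closes the socket itself */
}

/* the other thread, instead of closing m_sockfd: */
write(wakeup[1], "x", 1);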
UDP is a connectionless protocol. Since there is no connection, none can be broken, so the consumer doesn't know that the producer will never send again.
You could make the producer send an "end of stream" message, and have the consumer terminate upon receiving it.
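For instance (a sketch; kEndMsg and the peer address are illustrative, and memcmp needs <string.h>):

/* producer: announce end of stream */
static const char kEndMsg[] = "END";
sendto(sockfd, kEndMsg, sizeof kEndMsg, 0,
       (struct sockaddr *)&peer, sizeof peer);

/* consumer: terminate on the marker */
ssize_t n = recvfrom(m_sockfd, buf, sizeof buf, 0, NULL, NULL);
if (n == (ssize_t)sizeof kEndMsg && memcmp(buf, kEndMsg, n) == 0) {
    /* producer is done; stop selecting on this socket */
}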
I think the most obvious explanation is that being closed isn't considered an exceptional condition. I think the root of the problem is that you're not really embracing the philosophy of select. Why on earth are you fiddling around with the socket in another thread? That sounds like a recipe for disaster.
Could you not send a signal (e.g. USR2) to the thread which would cause select() to return with EINTR?
Then in the signal handler set a flag telling it not to restart the select()?
That would remove the need for waiting on multiple file descriptors, and seems a lot cleaner than using a pipe to kill it.
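A sketch of that approach (assuming thread1_id is the pthread_t of the waiting thread, and the handler is installed without SA_RESTART so that select() really is interrupted):

#include <signal.h>
#include <errno.h>
#include <pthread.h>

static volatile sig_atomic_t stop = 0;
static void on_usr2(int sig) { (void)sig; stop = 1; }

/* setup, once */
struct sigaction sa = {0};
sa.sa_handler = on_usr2;    /* note: no SA_RESTART flag */
sigaction(SIGUSR2, &sa, NULL);

/* thread1's loop */
int nsel = select(m_sockfd + 1, &readfds, NULL, &exceptfds, &to);
if (nsel == -1 && errno == EINTR && stop)
    return;                 /* woken deliberately: don't restart the select */

/* from the other thread */
pthread_kill(thread1_id, SIGUSR2);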
I would say the difference is that recvfrom is actively trying to read a message from a single socket, whereas select is waiting for a message to arrive, possibly on multiple handles, and not necessarily socket handles.
Your code is fundamentally broken. Variations on this mistake are common and have caused serious bugs with massive security implications in the past. Here's what you're missing:
When you go to close the socket, there is simply no possible way you can know whether the other thread is blocked in select or about to block in select. For example, consider the following:
The thread goes to call select, but doesn't get scheduled.
You close the socket.
In a thread your code is unaware of (maybe it's part of the platform's internal memory management or logging internals), a library opens a socket and gets the same identifier as the socket you closed.
The thread now goes into select, but it's selecting on the socket opened by the library.
Disaster strikes.
You must not attempt to release a resource while another thread is, or might be, using it.
