Here is my situation: I am using UDP, and a server is sending me data. The server's packet could be lost, in which case I'd like it to resend, and my client might also fail to receive the data, in which case I'd like it to re-read. For now I have put a timeout on my client, which waits a certain amount of time and then reads again. My question: will simply adding a goto that jumps back to the select solve this, or will select be broken after the first timeout? Do I have to make the socket non-blocking? I read that somewhere on the net. Essentially, my goal is that if the read doesn't happen, I try again after a certain period, because I know the sender is trying to send. I want to know whether my logic is correct, since I have no way of testing it.
fd_set set;
struct timeval timeout;
int rv;
char buff[100];
int len = 100;
FD_ZERO(&set); /* clear the set */
FD_SET(sockfd, &set); /* add our file descriptor to the set */
timeout.tv_sec = 0;
timeout.tv_usec = 10000;
retry:
write(sockfd, "hi", 3); /* request sent, now waiting for an ack */
rv = select(sockfd + 1, &set, NULL, NULL, &timeout);
if (rv == -1) {
    perror("select"); /* an error occurred */
} else if (rv == 0) {
    printf("timeout\n");
    goto retry;
} else {
    read(sockfd, buff, len); /* there was data to read */
}
I would expect that select modifies your fd_set so that it no longer contains your sockfd after a timeout (because the socket wasn't ready to read). So, to be sure, you should reinitialize the set before the retry. Or you could check whether the fd is still there after a timeout, but I think the normal practice is to reinitialize all FDs before calling select.
If you use poll or epoll you don't have to do that.
Apart from that the code looks ok.
Whether you use blocking or non-blocking IO doesn't matter for the read in your case. If you use non-blocking, your write could fail, so staying with blocking is easier.
Related
I have a client-server connection where the server reads the message sent by the client every 1 second, but I do not want the server to keep waiting for a message for too long. I tried using the select() function, but the server continues waiting for some message to read. Could anyone tell me what I am doing wrong please?
fd_set master;
fd_set read_fds;
FD_ZERO(&master);
FD_ZERO(&read_fds);
FD_SET(sock, &master);
while (1) {
bzero(message, 256);
sleep(1);
read_fds = master;
if(select(sock+2, &read_fds, NULL, NULL, NULL) < 0)
error("ERROR reading");
//if there is any data to read from the socket
else if(FD_ISSET(sock, &read_fds)){
n = read(sock, buffer, 256);
c = buffer[0];
printf("1st char is %c",c);
}//close else if statement
else printf("Nothing was read");
}//close while loop
A few comments that are too long to fit in the comments...
The first parameter to select really only needs to be sock+1.
By passing NULL for the timeout, select will block indefinitely (so you might as well have just called read).
When select does tell you that sock is ready for reading, there may only be one byte present, even if the other end wrote more than that. You will have to loop, reading the socket until you get the number of bytes you want. You will have to decide if you want the timeout only while waiting for the first byte, or if you want to time out in the middle (i.e. is the select inside or outside the loop).
Normally, when using select, you only want to block in select, and not in read, so the socket should be put in non-blocking mode, and you should be prepared for the read to fail with EWOULDBLOCK.
If you do time out, do you want to close the connection? If you got half way through a message, you can't just throw away the half and keep going because you would now be expecting the next byte to be another start of message when it will now be a middle.
Hello I have a server program and a client program. The server program is working fine, as in I can telnet to the server and I can read and write in any order (like a chat room) without any issue. However I am now working on my client program and when I use 'select' and check if the socket descriptor is set to read or write, it always goes to write and then is blocked. As in messages do not get through until the client sends some data.
How can I fix this on my client end so I can read and write in any order?
while (quit != 1)
{
FD_ZERO(&read_fds);
FD_ZERO(&write_fds);
FD_SET(client_fd, &read_fds);
FD_SET(client_fd, &write_fds);
if (select(client_fd+1, &read_fds, &write_fds, NULL, NULL) == -1)
{
perror("Error on Select");
exit(2);
}
if (FD_ISSET(client_fd, &read_fds))
{
char newBuffer[100] = {'\0'};
int bytesRead = read(client_fd, &newBuffer, sizeof(newBuffer));
printf("%s",newBuffer);
}
if(FD_ISSET(client_fd, &write_fds))
{
quit = transmit(handle, buffer, client_fd);
}
}
Here is code to transmit function
int transmit(char* handle, char* buffer, int client_fd)
{
int n;
printf("%s", handle);
fgets(buffer, 500, stdin);
if (!strchr(buffer, '\n'))
{
while (fgetc(stdin) != '\n');
}
if (strcmp (buffer, "\\quit\n") == 0)
{
close(client_fd);
return 1;
}
n = write(client_fd, buffer, strlen(buffer));
if (n < 0)
{
error("ERROR writing to socket");
}
memset(buffer, 0, 501);
}
I think you are misinterpreting the use of the writefds parameter of select(): only set the bit when you want to write data to the socket. In other words, if there is no data, do not set the bit.
Setting the bit will check if there is room for writing, and if yes, the bit will remain on. Assuming you are not pumping megabytes of data, there will always be room, so right now you will always call transmit() which waits for input from the command line with fgets(), thus blocking the rest of the program. You have to monitor both the client socket and stdin to keep the program running.
So, check for READ action on stdin (use STDIN_FILENO to get the file descriptor for that), READ on client_fd always and just write() your data to the client_fd if the amount of data is small (if you need to write larger data chunks consider non-blocking sockets).
BTW, you forget to return a proper value at the end of transmit().
Sockets are almost always writable, except when the socket send buffer is full, which indicates that you are sending faster than the receiver is receiving.
So your transmit() function will be entered every time around the loop, so it will read some data from stdin, which blocks until you type something, so nothing happens.
You should only select on writability when a prior send() has returned EWOULDBLOCK/EAGAIN. Otherwise you should just send, when you have something to send.
I would throw this code away and use two or three threads in blocking mode.
select is used to check whether a socket has become ready to read or write. If it blocks for read, there is no data to read. If it blocks for write, the TCP buffer is likely full and the remote end has to read some data before the socket will allow more data to be written. Since select blocks until one of the socket descriptors is ready, you also need to pass a timeout to select to avoid waiting for a long time.
In your specific case, if your remote/receiving end keep reading data from the socket then the select will not block for the write on the other end. Otherwise the tcp buffer will become full on the sender side and select will block. Answers posted also indicate the importance of handling EAGAIN or EWOULDBLOCK.
Sample flow:
while (bytesleft > 0)
    nbytes = write data
    if (nbytes > 0)
        bytesleft -= nbytes
    else if write failed with EAGAIN or EWOULDBLOCK
        call poll or select to wait for the socket to become ready
        if poll or select times out
            handle the timeout error (e.g. the remote end did not read the
            data within the expected time interval)
        endif
    endif
end while
The code should also handle error conditions and unusual return values (for example, write/read returning 0). Also note that read/recv returning 0 indicates that the remote end closed the socket.
I have a client and server, and the client runs a select loop to multiplex between a TCP and a UDP connection. I'm trying to add my TCP connection file descriptor to both the read and the write set and then initiate one message exchange using write set and one using read set. My message communication with the write set works fine but with the read set I'm unable to do so.
Client Code:
char buf[256] = {};
char buf_to_send[256] = {};
int nfds, sd, r;
fd_set rd, wr;
int connect_init = 1;
/* I do the Connect Command here */
FD_ZERO(&rd);
FD_ZERO(&wr);
FD_SET(sd, &rd);
FD_SET(sd, &wr);
nfds = sd;
for(; ;){
r = select(nfds + 1, &rd, &wr, NULL, NULL);
if(connect_init == 0){
if(FD_ISSET(sd, &rd)){ // this is not working, if I change rd to wr, it works!
r = recv(sd, buf, sizeof(buf),0);
printf("received buf = %s", buf);
sprintf(buf, "%s", "client_reply\n");
send(sd, buf, strlen(buf), 0);
}
}
/* Everything below this works correctly */
if (connect_init){
if(FD_ISSET(sd, &wr)){
sprintf(buf_to_send, "%s", "Client connect request");
write(sd, buf_to_send, strlen(buf_to_send));
recv(sd, buf, sizeof(buf), 0);
printf("Server said = %s", buf);
sprintf(buf_to_send, "Hello!\n"); // client Hellos back
send(sd, buf_to_send, strlen(buf_to_send), 0);
}
connect_init = 0;
}
} // for loops ends
You need to initialize the sets in the loop, every time before calling select. This is needed because select modifies them. Beej's Guide to Network Programming has a comprehensive example on one way to use select.
So in your code, it seems select returns first with writing allowed, but reading not, which has the read bit reset to 0, and then there's nothing to set it back to 1, because from then on select will not touch it, because it is already 0.
If select API bothers you, look at poll, it avoids this (note that there's probably no practical/efficiency difference, it basically boils down to personal preference). On a "real" code with many descriptors (such as a network server with many clients), where performance matters, you should use some other mechanism though, probably some higher level event library, which then uses the OS specific system API, such as Linux's epoll facility. But checking just a few descriptors, select is the tried and true and relatively portable choice.
I have a multi-threaded server (thread pool) that is handling a large number of requests (up to 500/sec for one node), using 20 threads. There's a listener thread that accepts incoming connections and queues them for the handler threads to process. Once the response is ready, the threads then write out to the client and close the socket. All seemed to be fine until recently, a test client program started hanging randomly after reading the response. After a lot of digging, it seems that the close() from the server is not actually disconnecting the socket. I've added some debugging prints to the code with the file descriptor number and I get this type of output.
Processing request for 21
Writing to 21
Closing 21
The return value of close() is 0, or there would be another debug statement printed. After this output with a client that hangs, lsof is showing an established connection.
SERVER 8160 root 21u IPv4 32754237 TCP localhost:9980->localhost:47530 (ESTABLISHED)
CLIENT 17747 root 12u IPv4 32754228 TCP localhost:47530->localhost:9980 (ESTABLISHED)
It's as if the server never sends the shutdown sequence to the client, and this state hangs until the client is killed, leaving the server side in a close wait state
SERVER 8160 root 21u IPv4 32754237 TCP localhost:9980->localhost:47530 (CLOSE_WAIT)
Also if the client has a timeout specified, it will timeout instead of hanging. I can also manually run
call close(21)
in the server from gdb, and the client will then disconnect. This happens maybe once in 50,000 requests, but might not happen for extended periods.
Linux version: 2.6.21.7-2.fc8xen
Centos version: 5.4 (Final)
socket actions are as follows
SERVER:
int client_socket;
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);
while(true) {
client_socket = accept(incoming_socket, (struct sockaddr *)&client_addr, &client_len);
if (client_socket == -1)
continue;
/* insert into queue here for threads to process */
}
Then the thread picks up the socket and builds the response.
/* get client_socket from queue */
/* processing request here */
/* now set to blocking for write; was previously set to non-blocking for reading */
int flags = fcntl(client_socket, F_GETFL);
if (flags < 0)
abort();
if (fcntl(client_socket, F_SETFL, flags & ~O_NONBLOCK) < 0) /* clear O_NONBLOCK to get blocking writes */
abort();
server_write(client_socket, response_buf, response_length);
server_close(client_socket);
server_write and server_close.
void server_write( int fd, char const *buf, ssize_t len ) {
printf("Writing to %d\n", fd);
while(len > 0) {
ssize_t n = write(fd, buf, len);
if(n <= 0)
return;// I don't really care what error happened, we'll just drop the connection
len -= n;
buf += n;
}
}
void server_close( int fd ) {
for(uint32_t i=0; i<10; i++) {
int n = close(fd);
if(!n) {//closed successfully
return;
}
usleep(100);
}
printf("Close failed for %d\n", fd);
}
CLIENT:
Client side is using libcurl v 7.27.0
CURL *curl = curl_easy_init();
CURLcode res;
curl_easy_setopt( curl, CURLOPT_URL, url);
curl_easy_setopt( curl, CURLOPT_WRITEFUNCTION, write_callback );
curl_easy_setopt( curl, CURLOPT_WRITEDATA, write_tag );
res = curl_easy_perform(curl);
Nothing fancy, just a basic curl connection. The client hangs in transfer.c (in libcurl) because the socket is not perceived as being closed. It's waiting for more data from the server.
Things I've tried so far:
Shutdown before close
shutdown(fd, SHUT_WR);
char buf[64];
while(read(fd, buf, 64) > 0);
/* then close */
Setting SO_LINGER to close forcibly in 1 second
struct linger l;
l.l_onoff = 1;
l.l_linger = 1;
if (setsockopt(client_socket, SOL_SOCKET, SO_LINGER, &l, sizeof(l)) == -1)
abort();
These have made no difference. Any ideas would be greatly appreciated.
EDIT -- This ended up being a thread-safety issue inside a queue library causing the socket to be handled inappropriately by multiple threads.
Here is some code I've used on many Unix-like systems (e.g SunOS 4, SGI IRIX, HPUX 10.20, CentOS 5, Cygwin) to close a socket:
int getSO_ERROR(int fd) {
int err = 1;
socklen_t len = sizeof err;
if (-1 == getsockopt(fd, SOL_SOCKET, SO_ERROR, (char *)&err, &len))
FatalError("getSO_ERROR");
if (err)
errno = err; // set errno to the socket SO_ERROR
return err;
}
void closeSocket(int fd) { // *not* the Windows closesocket()
if (fd >= 0) {
getSO_ERROR(fd); // first clear any errors, which can cause close to fail
if (shutdown(fd, SHUT_RDWR) < 0) // secondly, terminate the 'reliable' delivery
if (errno != ENOTCONN && errno != EINVAL) // SGI causes EINVAL
Perror("shutdown");
if (close(fd) < 0) // finally call close()
Perror("close");
}
}
But the above does not guarantee that any buffered writes are sent.
Graceful close: It took me about 10 years to figure out how to close a socket. But for another 10 years I just lazily called usleep(20000) for a slight delay to 'ensure' that the write buffer was flushed before the close. This obviously is not very clever, because:
The delay was too long most of the time.
The delay was too short some of the time--maybe!
A signal such as SIGCHLD could occur to end usleep() (but I usually called usleep() twice to handle this case--a hack).
There was no indication whether this works. But this is perhaps not important if a) hard resets are perfectly ok, and/or b) you have control over both sides of the link.
But doing a proper flush is surprisingly hard. Using SO_LINGER is apparently not the way to go; see for example:
http://msdn.microsoft.com/en-us/library/ms740481%28v=vs.85%29.aspx
https://www.google.ca/#q=the-ultimate-so_linger-page
And SIOCOUTQ appears to be Linux-specific.
Note shutdown(fd, SHUT_WR) doesn't stop writing, contrary to its name, and maybe contrary to man 2 shutdown.
This code flushSocketBeforeClose() waits until a read of zero bytes, or until the timer expires. The function haveInput() is a simple wrapper for select(2), and is set to block for up to 1/100th of a second.
bool haveInput(int fd, double timeout) {
int status;
fd_set fds;
struct timeval tv;
FD_ZERO(&fds);
FD_SET(fd, &fds);
tv.tv_sec = (long)timeout; // cast needed for C++
tv.tv_usec = (long)((timeout - tv.tv_sec) * 1000000); // 'suseconds_t'
while (1) {
if (!(status = select(fd + 1, &fds, 0, 0, &tv)))
return FALSE;
else if (status > 0 && FD_ISSET(fd, &fds))
return TRUE;
else if (status > 0)
FatalError("I am confused");
else if (errno != EINTR)
FatalError("select"); // tbd EBADF: man page "an error has occurred"
}
}
bool flushSocketBeforeClose(int fd, double timeout) {
const double start = getWallTimeEpoch();
char discard[99];
ASSERT(SHUT_WR == 1);
if (shutdown(fd, 1) != -1)
while (getWallTimeEpoch() < start + timeout)
while (haveInput(fd, 0.01)) // can block for 0.01 secs
if (!read(fd, discard, sizeof discard))
return TRUE; // success!
return FALSE;
}
Example of use:
if (!flushSocketBeforeClose(fd, 2.0)) // can block for 2s
printf("Warning: Cannot gracefully close socket\n");
closeSocket(fd);
In the above, my getWallTimeEpoch() is similar to time(), and Perror() is a wrapper for perror().
Edit: Some comments:
My first admission is a bit embarrassing. The OP and Nemo challenged the need to clear the internal so_error before close, but I cannot now find any reference for this. The system in question was HPUX 10.20. After a failed connect(), just calling close() did not release the file descriptor, because the system wished to deliver an outstanding error to me. But I, like most people, never bothered to check the return value of close. So I eventually ran out of file descriptors (ulimit -n), which finally got my attention.
(very minor point) One commentator objected to the hard-coded numerical arguments to shutdown(), rather than e.g. SHUT_WR for 1. The simplest answer is that Windows uses different #defines/enums e.g. SD_SEND. And many other writers (e.g. Beej) use constants, as do many legacy systems.
Also, I always, always, set FD_CLOEXEC on all my sockets, since in my applications I never want them passed to a child and, more importantly, I don't want a hung child to impact me.
Sample code to set CLOEXEC:
static void setFD_CLOEXEC(int fd) {
int status = fcntl(fd, F_GETFD, 0);
if (status >= 0)
status = fcntl(fd, F_SETFD, status | FD_CLOEXEC);
if (status < 0)
Perror("Error getting/setting socket FD_CLOEXEC flags");
}
Great answer from Joseph Quinsey. I have comments on the haveInput function. Wondering how likely it is that select returns an fd you did not include in your set. This would be a major OS bug IMHO. That's the kind of thing I would check if I wrote unit tests for the select function, not in an ordinary app.
if (!(status = select(fd + 1, &fds, 0, 0, &tv)))
return FALSE;
else if (status > 0 && FD_ISSET(fd, &fds))
return TRUE;
else if (status > 0)
FatalError("I am confused"); // <--- fd unknown to function
My other comment pertains to the handling of EINTR. In theory, you could get stuck in an infinite loop if select kept returning EINTR, as this error lets the loop start over. Given the very short timeout (0.01), it appears highly unlikely to happen. However, I think the appropriate way of dealing with this would be to return errors to the caller (flushSocketBeforeClose). The caller can keep calling haveInput as long as its timeout hasn't expired, and declare failure for other errors.
ADDITION #1
flushSocketBeforeClose will not exit quickly in case of read returning an error. It will keep looping until the timeout expires. You can't rely on the select inside haveInput to anticipate all errors. read has errors of its own (ex: EIO).
while (haveInput(fd, 0.01))
if (!read(fd, discard, sizeof discard)) <-- -1 does not end loop
return TRUE;
This sounds to me like a bug in your Linux distribution.
The GNU C library documentation says:
When you have finished using a socket, you can simply close its file
descriptor with close
Nothing about clearing any error flags or waiting for the data to be flushed or any such thing.
Your code is fine; your O/S has a bug.
include:
#include <unistd.h>
this should help solve the close(); problem
I have a server that sends data to a client every 5 seconds. I want the client to block on read() until the server sends some data and then print it. I know read () is blocking by default. My problem is that my client is not blocking on read(). This is very odd and this does not seem to be a normal issue.
My code prints "Nothing came back" in an infinite loop. I am on a Linux machine, programming in C. My code snippet is below. Please advise.
while(1)
{
n = read(sockfd, recvline, MAXLINE);
if ( n > 0)
{
recvline[n] = 0;
if (fputs(recvline, stdout) == EOF)
printf("fputs error");
}
else if(n == 0)
printf("Nothing came back");
else if (n < 0)
printf("read error");
}
return;
There may be several causes, and errors are possible in several places:
Check the socket where you create it:
sockfd=socket(AF_INET,SOCK_STREAM,0);
if (sockfd==-1) {
perror("Create socket");
}
You can also enable blocking mode explicitly before using it:
// Set the socket I/O mode: In this case FIONBIO
// enables or disables the blocking mode for the
// socket based on the numerical value of iMode.
// If iMode = 0, blocking is enabled;
// If iMode != 0, non-blocking mode is enabled.
int iMode = 0;
ioctl(sockfd, FIONBIO, &iMode);
or you can use setsockopt as below:
struct timeval t;
t.tv_sec = 0;
t.tv_usec = 0;
setsockopt(
sockfd, // Socket descriptor
SOL_SOCKET, // To manipulate options at the sockets API level
SO_RCVTIMEO,// Specify the receive timeout (zero means no timeout)
(const void *)&t, // option values
sizeof(t)
);
Check Read function call (Reason of bug)
n = read(sockfd, recvline, MAXLINE);
if(n < 0){
perror("Read Error:");
}
Also check the server code!
Your server may be sending some blank (non-printable, null, newline) character(s) without your being aware of it. Debug your server code too.
Or your server terminated before your client could read.
One more interesting thing to understand:
When you call write() N times on the server, there need not be N matching read() calls on the other side.
What Greg Hewgill already wrote as a comment: An EOF (that is, an explicit stop of writing, be it via close() or via shutdown()) will be communicated to the receiving side by having recv() return 0. So if you get 0, you know that there won't be any data and you can terminate the reading loop.
If you had non-blocking enabled and there are no data, you will get -1 and errno will be set to EAGAIN or EWOULDBLOCK.
What is the value of MAXLINE?
If the value is 0, then it will return 0 as well.
Otherwise, as Grijesh Chauhan mentioned, set it explicitly to blocking.
Or, you may also consider using recv() where blocking and non-blocking can be specified.
It has the option, MSG_WAITALL, where it will block until all bytes arrived.
n = recv(sockfd, recvline, MAXLINE, MSG_WAITALL);